
Defining functions (properly)

In Yazoo, functions are objects. This is both a source of strength and one of Yazoo's greatest weaknesses. On the one hand, when something goes wrong we can go into a function as if it were a variable, peek at all its internal members, and fiddle around with it for the next run. The author, who has a weak sense of propriety, considers this a plus. The big drawback is that it is possible for the return value of a function to be overwritten before it gets used. There are several ways to avoid this, but at the cost of some manual labor -- about one line or so per function call.

Let's demonstrate some of these unusual properties of Yazoo functions by actually writing one. As we said, functions are objects, so, unsurprisingly, we define them using the same operators we have been using for all other objects. There are two new ingredients: an args variable, containing the function arguments; and the `code' marker (semicolon for shorthand), which marks the end of the function's variable definitions and the beginning of its executable code. The function exits with the classic return statement, which may or may not be followed by some object or value to pass back to the caller. It is usually easiest to define functions from scripts, not from the command line, since that allows us to avoid trying to fit the whole definition onto one line.

SwapDigits :: {
    lh_dig :: rh_dig :: ubyte
    code
    lh_dig = round_down(args[1] / 10)
    rh_dig = args[1] mod 10
    return rh_dig*10 + lh_dig
}

Now that we have written the function, we can run it in the normal way.

print( SwapDigits(27) )

We get 72. So far so good. If for some reason we were dubious as to whether the function had worked properly, we might reassure ourselves by checking its two members as if the function were a composite variable.

print( SwapDigits.lh_dig, " ", SwapDigits.rh_dig, "\n" )

No surprises: the result is "2 7".

It may be reassuring to know that the two variables in the constructor (the part before the code marker) will definitely be defined inside the function. One might suspect that if there had already existed, say, a global member named lh_dig, then SwapDigits() would simply have tried to redefine that member. No -- the define operators always create members inside the immediate function, so we don't have to worry about accidentally reusing common member names like counter variables. (Mind that this only applies to the left-hand arguments of the define operators; on the right-hand side Yazoo searches all the way back to the workspace. And other operators such as equate don't restrict either argument: lh_dig = 5 does its job whether or not lh_dig is in SwapDigits().)

As advertised, we can modify the members of a function rather arbitrarily after the function has been created. For example, we could remove one of its members, though SwapDigits would obviously stop working until we redefined that member. We can't change the function code itself, at least not with the tools we have so far, but we can introduce auxiliary codes within the function. So we can, for example, define a sub-function that prints the members of SwapDigits, as we did above.

SwapDigits.printout :: { code print(SwapDigits.lh_dig, " ", SwapDigits.rh_dig, "\n") }

Two features of printout() are noteworthy in their own right. First, there is no return statement. This is fine; when Yazoo hits the end of a function it behaves as if it had encountered a return with no argument. (Technically, in both cases it returns nothing, literally, as in: it returns the void.) The second point is that since printout() was defined from a path that did not pass through SwapDigits(), it does not have automatic access to SwapDigits's variables and thus has to name SwapDigits.lh_dig and SwapDigits.rh_dig explicitly.

Because they are objects, functions can be used as templates for defining other functions. For example, we could write:

SD2 :: SwapDigits

Now SD2 will contain the two variables lh_dig and rh_dig, and it will have all the code that followed the code marker in SwapDigits's original definition. In hindsight we really should have defined printout() in the original definition of SwapDigits(). The reason is that printout() will not be present in SD2, since we added printout() separately and therefore it is not part of the definition of SwapDigits. For this reason (and only this reason) a command like

SD2 = SwapDigits

will give a type-mismatch error. Perhaps a slight misnomer, since their types are actually the same (those depend only on the original definition) -- but their data structures do not match. Likewise, we could not have defined SD2 using

SD2 := SwapDigits | define-equate

(note the colon) since that would have caused the same error for the same reason.

Usually, only one copy of a stand-alone function is ever needed. The exception is the case of recursive functions. Owing to the fact that a Yazoo function is an object, a new copy of that object needs to be defined if the function wishes to run itself in mid-execution. The memory requirement is then the same as for stack-based functions (N functions for a depth-N recursion). The basic procedure is to do something like the following:

factorial :: {
    fval :: ulong
    code
    if args[1] == 1
        fval = 1
    else
        new_fact :: this
        fval = args[1]*new_fact(args[1]-1)
    endif
    return fval
}

Importantly, the definition of new_fact could not have been put in the constructor: that would have caused Yazoo to try to construct an infinity of nested factorial() functions, and eventually throw in the towel with a recursion-depth error.

The same rules apply to indirect as to direct recursion. In other words, if a() calls b() calls a(), then function b() needs to make a new copy of a() in order to avoid overwriting the data of the outer call to a() that is still running, and a() needs to make a new b().
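For readers more at home in Python, the need for a fresh copy can be mimicked with a small sketch. This is an analogy only, not Yazoo code: the classes SharedFactorial and FreshFactorial and their members arg and fval are hypothetical, and the sketch simply assumes that a function object keeps its argument and its result as members, so that an inner call on the same object overwrites what the outer call still needs.

```python
class SharedFactorial:
    """Hypothetical function object that calls itself without making a copy."""

    def __init__(self):
        self.arg = None  # stands in for args[1]
        self.fval = 0

    def run(self, n):
        self.arg = n
        if self.arg == 1:
            self.fval = 1
        else:
            # The inner call reuses this same object and overwrites self.arg
            # before the outer call gets to multiply by it.
            self.fval = self.run(n - 1) * self.arg
        return self.fval


class FreshFactorial:
    """Same logic, but every level of recursion runs in its own object."""

    def __init__(self):
        self.arg = None
        self.fval = 0

    def run(self, n):
        self.arg = n
        if self.arg == 1:
            self.fval = 1
        else:
            new_fact = FreshFactorial()  # rough analogue of: new_fact :: this
            self.fval = new_fact.run(n - 1) * self.arg
        return self.fval


shared_result = SharedFactorial().run(4)
fresh_result = FreshFactorial().run(4)
print(shared_result, fresh_result)  # prints "1 24"
```

The shared version collapses to 1 because every level ends up multiplying by the innermost argument. Which members a real Yazoo call would actually clobber is not modeled here.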

Here's one unusual feature of Yazoo functions:

f :: {
    code
    return "a"
    code
    return "b"
}

Two coding blocks -- so there should be a way to access them both. If we call f() we get back "a", so the first block runs by default. The way to run the second coding block is to call f#2(), after which we get back "b". We can infer that f#1() is simply longhand for f(), and f#0() runs the code before the first code marker (the constructor, when we talk about classes). We'll come back to this a bit later.

As we might suspect from earlier warnings, there was a rather serious defect in the functions we have been writing, which shows up when we try to do the following:

print( factorial(3), " ", factorial(5), "\n" )

What we get is

120 120

Well, we got back two copies of the second result -- not at all what we wanted. The problem is due to an unfortunate coincidence of two facts: 1) unless instructed otherwise, functions return their values by reference, and 2) print() evaluates both of its arguments before either one is printed. Indeed, all functions evaluate all their arguments before they begin executing their own code. In our example above, by the time print() got around to printing, factorial() had already run twice, its return variable had been set to 6 and then reset to 120, and when all was said and done, both tokens in print()'s arguments pointed to this value.
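The same aliasing can be mimicked in Python, which may help anyone who finds it easier to read. This is an analogy and not Yazoo code: Box and Factorial are hypothetical stand-ins for a variable and a function object, and the sketch assumes that a token behaves like a Python reference to a single shared cell.

```python
class Box:
    """A mutable cell standing in for a Yazoo variable."""

    def __init__(self, value=0):
        self.value = value


class Factorial:
    """Hypothetical function object with one return variable for all calls."""

    def __init__(self):
        self.fval = Box()  # created once, when the function is defined

    def __call__(self, n):
        result = 1
        for k in range(2, n + 1):
            result *= k
        self.fval.value = result  # overwrite the single return variable
        return self.fval          # hand back a reference, not a copy


factorial = Factorial()
first = factorial(3)   # both calls finish before anything is read
second = factorial(5)
values = (first.value, second.value)
print(values)  # prints "(120, 120)"
```

Both names end up bound to the same cell, so reading either one after the second call gives 120.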

We encounter the same problem when we define sets, since they also work with tokens.

swapped_dig :: { factorial(3), " ", factorial(5) }

Again, we end up with two tokens both pointing to the same return variable, which now stores the value 120.

One obvious way to get around the problem is to do away with the troublesome tokens, and straightaway copy the return value of each function call into a new variable (which we can do) before the function runs again.

print( arg1 := factorial(3), " ", arg2 := factorial(5), "\n" )
facts :: { el1 := factorial(3), el2 := factorial(5) }

This is quite sloppy, and we're likely to forget to do it at some point. The better, fail-safe method is to have the function itself store each separate return value in a new variable, by unlinking and re-assigning the member with each function call. That way the first token will still cling to the old return variable after the new variable has been created for the second token.

factorial :: {
    rtrn :: ulong
    fval :: ulong
    code
    ...
    ((rtrn =@ *) :: ulong) = fval
    return rtrn
}

If Yazoo is being run interactively, there are a number of ways that we can use user.zoo's new() function to accomplish this with somewhat less trouble:

( rtrn = @new(ulong) ) = fval
return rtrn

rtrn = @new(fval)
return rtrn

or even just

return new(fval)

at the end. Reallocating the return variables is admittedly a nuisance, even with the new() function, but you take your chances otherwise.
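A rough Python analogy of the reallocation trick is sketched below. It is not Yazoo code: Box and Factorial are hypothetical names, and the only assumption carried over is that each call rebinds the return member to a brand-new variable, so references handed out by earlier calls keep their old values.

```python
class Box:
    """A mutable cell standing in for a Yazoo variable."""

    def __init__(self, value=0):
        self.value = value


class Factorial:
    """Hypothetical function object that allocates a new return variable per call."""

    def __init__(self):
        self.rtrn = Box()

    def __call__(self, n):
        result = 1
        for k in range(2, n + 1):
            result *= k
        # Unlink the member from the old cell and point it at a fresh one,
        # instead of overwriting the cell that earlier callers still hold.
        self.rtrn = Box(result)
        return self.rtrn


factorial = Factorial()
first = factorial(3)
second = factorial(5)
values = (first.value, second.value)
print(values)  # prints "(6, 120)"
```

The cost is one allocation per call, which mirrors the extra line per function that the Yazoo versions above require.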


Last update: July 28, 2013