Aside from a select few shared variables, start.zoo is careful to isolate the workspace from its own data structures, and it compiles the user's scripts in a namespace that is entirely separate from its own. Thus it is natural for there to be two script files provided with Yazoo: start.zoo which defines the outer world, and user.zoo which inhabits the user's workspace and uses its namespace. user.zoo is just an ordinary script which is run before the command prompt appears -- any sort of code can appear in it. The user.zoo file that comes with the program defines a number of constants and functions that the author has found useful (although as its name suggests the user should feel welcome to modify it to suit his own purposes). This section describes each of the pre-programmed members that are defined by the author's script.
The following are the pre-loaded variables and types defined in user.zoo.
Boolean:
a data type, equivalent to an unsigned byte but intended for Boolean variables.
true, false:
1 and 0 respectively, intended to be the two values of Boolean variables.
on, off:
1 and 0 respectively; for the disasm and calculator variables (from start.zoo).
passed:
0, denoting the error code of a function that did not cause any error.
e:
the exponential constant. Rather than provide an exp() function, Yazoo defines e so that the user can evaluate exponentials by writing e^x, e^-2, etc.
pi:
the famous constant pi.
root:
an alias to the user's workspace.
DirectoryPaths[]:
a string array of pathnames to folders. The user.zoo routines (but not the built-in Yazoo functions!) will search each of these paths when looking for a file. user.zoo preloads the empty path, which is C's default directory (which in the author's experience has always just been the current shell directory when Yazoo was run). One adds elements to this array in the normal way, using the [^...], [+..] operators, etc.
pwd:
the current path to the workspace, stored as a string.
The first four routines in user.zoo assist Yazoo in some of the more awkward aspects of the way it handles variables.
new()
syntax: (same type) var2 = @new((any type) var1 [, (compatible type or *) data_to_copy])
The new() function returns a new instance of a variable, of the same type as the old and storing the same data. This is particularly useful within functions, where, as emphasized earlier, it is usually wise to create a new return variable with each function call. Instead of writing
f1 :: {
...
((rtrn =@ *) @:: double) = x + y
return rtrn }
we accomplish this with
f1 :: {
...
return ( new(rtrn) = x + y ) }
or even just
f1 :: {
...
return new(x + y) }
new() is certainly helpful, but it is not necessarily the panacea that it may appear at first. The data contained in the arguments must be copied to the new variable, but in some cases the types may not match. For example, suppose we created a variable and later modified it:
comp1 :: { a :: ulong }
comp1.b :: sshort
new(comp1) | cannot copy the data
In this case new() will create a new variable using comp1's constructor, but since this does not match its current composition it will not be able to copy the data over, and will print a warning. The most common case where type does not match actual data is the case of an array:
arr1[4] :: string
new(arr1) | cannot copy the data
As explained in the section on arrays, arr1 is actually a composite variable with no explicit definition; the string data type defines that variable's single, 4-index member. Thus arr1 itself has a null type, and it follows that new() will create a null variable with no members which can't be copied to. However, new() can copy arrays created using the array() function (also in user.zoo), so these two functions are synergistic.
By using the optional second argument to new(), one can decouple the variable that donates the type specification (first argument) from the variable that provides the data (second argument). If this second argument is void (as in new(v, *)) then no data is copied.
Note that new({a, b}) doesn't make new instances of a and b, but rather just returns a new set of aliases to those variables.
array()
syntax: (numeric) new_array := @array((numeric) size1, size2, ..., (string) type)
Generates an N-dimensional array having the specified number of indices in each dimension, of the specified type (written as a string). The advantage of array()-generated arrays over those defined in the ordinary way is that these arrays have their full type definitions embedded in them, so they can be used as templates for defining copies of themselves.
a := @array(5, 2, "{ x :: ubyte, s :: string }")
a2 := a
The type definition explicitly encodes the given size in each dimension, so each copy of the array will be born with the original size even if the template array had been subsequently resized. If we had written
a := @array(5, ...)
a[^8]
a2 := a
then a would have had dimensions 8 x 2, a2 would be the original 5 x 2, and there would be a type mismatch error when we tried to copy the data of one to the other.
The new() function does not work with standard arrays, but does work with arrays defined by the array() function.
resize()
syntax: (numeric) resize((composite) array_var, (numeric) size [, (any type) template_var])
Some data types cannot be conveniently packaged into arrays. For example, a single function running as part of an array can run into difficulties that it would not have running solo:
f :: { storage[*] :: ubyte ; storage[^args[1]], .... }
f_array[5] :: f
f_array[2](4) | won't work
Here, the third line will throw an incomplete-variable error, because it tries to perform an operation that would disrupt the uniform structure of the f_array. To get around this problem, we can bundle the five functions into a set, rather than a true array -- and then create, add and delete elements from that set by use of the resize() function instead of the usual array-resize operator [^...].
f_array :: {}
resize(f_array, 5, f)
...
resize(f_array, 2)
...
resize(f_array, 10)
...
The first argument of resize() is the set, and the second is the number of elements to rescale to. Like `[^...]', the resize() function either deletes elements from the end or adds new elements to the end, without affecting the other elements at the beginning of the set. The optional third argument gives the type of new array elements that are to be created. Unlike an ordinary array, this type can differ from that of existing elements in the set, but that will only affect new elements that are created. If no type is provided then the first element in the set is used as the template for any new elements that are created; if there is no existing first index then the new elements will be of the void type.
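The resizing rules above can be modeled in a few lines of Python. This is only a sketch of the described behavior, not the user.zoo code: the name resize_set, the use of a Python list for the set, and None standing in for the void type are all assumptions made for illustration.

```python
import copy

def resize_set(items, size, template=None):
    """Model of resize(): grow or shrink `items` in place to `size` elements.

    Assumptions (not from user.zoo): the set is a Python list, and None
    stands in for Yazoo's void type.
    """
    if size < len(items):
        del items[size:]                       # drop elements from the end
        return items
    if template is None and items:
        template = items[0]                    # default template: first element
    while len(items) < size:
        items.append(copy.deepcopy(template))  # new elements go at the end
    return items

f_array = []
resize_set(f_array, 5, {"storage": []})  # five independent copies of the template
resize_set(f_array, 2)                   # keeps the first two elements
resize_set(f_array, 4)                   # new elements copy the first element
```

Each new element is a separate copy, so changing one element's storage does not disturb the others, which is the point of using a set rather than a true array.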
copy()
syntax: copy((variable) var1, { "from" or "to" }, (variable) var2)
This routine copies data member-by-member from one variable to another. This is useful for cases when otherwise equivalent variables cannot be copied directly using ordinary equate `=', because their indices are packaged in different groupings of members. For example, the following causes an error:
num3_a[3] :: double
num3_b :: { a :: b :: c :: double }
num3_a = num3_b | error on this step
A type-mismatch error results because Yazoo will not copy three separate members' data into the three indices spanned by a single member. copy() circumvents that problem by manually copying each num3_b[i] to num3_a[i].
The direction of data flow is determined by the second argument. If the second argument is "from", data is copied from the third argument into the first; if it is "to" data is copied from the first argument into the third. There is no return value, but the error code is stored in the variable copy.err_code, with 0 indicating no error.
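As a rough illustration of member-by-member copying, the following Python sketch flattens both variables into an ordered list of leaf slots and copies slot by slot. It is a model only: copy_members, flatten, the dict/list stand-ins for Yazoo variables, and the error value 1 are assumptions (the text above only says that 0 means no error).

```python
def flatten(var):
    """Yield (container, key) for every leaf slot of a nested dict/list, in order."""
    keys = var.keys() if isinstance(var, dict) else range(len(var))
    for key in keys:
        if isinstance(var[key], (dict, list)):
            yield from flatten(var[key])
        else:
            yield var, key

def copy_members(var1, direction, var2):
    """Model of copy(): returns 0 on success, 1 (assumed) on a size mismatch."""
    if direction not in ("from", "to"):
        raise ValueError("direction must be 'from' or 'to'")
    # "from": var2 -> var1;  "to": var1 -> var2
    dst, src = (var1, var2) if direction == "from" else (var2, var1)
    dst_slots, src_slots = list(flatten(dst)), list(flatten(src))
    if len(dst_slots) != len(src_slots):
        return 1
    for (d, dk), (s, sk) in zip(dst_slots, src_slots):
        d[dk] = s[sk]
    return 0

num3_a = [0.0, 0.0, 0.0]                  # one member spanning three indices
num3_b = {"a": 1.0, "b": 2.0, "c": 3.0}   # three separate members
copy_members(num3_a, "from", num3_b)      # num3_a becomes [1.0, 2.0, 3.0]
```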
Next, user.zoo provides five routines that manipulate strings.
cat()
syntax: (string) concatenated string = cat((variables) var1, var2, ...)
Returns a string which is the concatenation of the arguments. This is just a convenient implementation of the print_string() function: s = cat(v1, v2) is equivalent to print_string(s, v1, v2).
char()
syntax: (ubyte) ASCII_code = char((string) single_character_string)
Returns the ASCII code for a single-character string.
C_string()
syntax: (block) string bytes = C_string((string) my_string)
Strings in Yazoo are normally stored internally as linked lists. C_string() converts a length-N resizable Yazoo string to an (N+1)-byte C-style string containing a terminating 0 character. The C-string is returned as a block type.
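The N to N+1 byte layout can be pictured with a short Python sketch. The function name c_string and the choice of ASCII encoding are assumptions for illustration; the sketch says nothing about how Yazoo stores the block internally.

```python
def c_string(my_string):
    """Return the bytes of `my_string` followed by a terminating 0 byte."""
    return my_string.encode("ascii") + b"\x00"   # ASCII is an assumption here

block = c_string("Hi")   # b'Hi\x00': 2 characters become 3 bytes
```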
lowercase()
syntax: (string) lowercase_string = lowercase((string) my_string)
Converts a mixed-case string to lowercase.
uppercase()
syntax: (string) uppercase_string = uppercase((string) my_string)
Converts a mixed-case string to uppercase.
There are three routines in user.zoo for printing to the screen in different ways.
printl()
syntax: printl([data to print])
This function is the same as print() except that it adds an end-of-line character at the end.
sprint()
syntax: sprint([data to print])
sprint() is used for printing composite objects such as variables and functions; the `s' probably originally stood for `spaced', `set', or `structure'. This is one of the most useful functions. Each member of the object is separated by a comma; the contents of composite members are enclosed by braces. Void members are represented by asterisks. The output is in exactly the format that Yazoo uses for constructing sets.
> my_object :: { a := 5, b :: { 4, 10, "Hi" }, nothing }
> sprint("object: ", my_object)
object, { 5, { 4, 10, "Hi" }, * }
sprint() should not be used to print any of the error or data registers, since it calls functions that can overwrite those registers. A known bug is that if sprint() crashes due to a recursion limit error, then the next (but only the next) sprint() call will fail as well.
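The output format described above (commas between members, braces around composite members, asterisks for void members) can be sketched in Python as follows. This models only the formatting rules; sprint_format is a hypothetical name, and Python lists and None stand in for Yazoo sets and void members.

```python
def sprint_format(value):
    """Format nested data according to the sprint() layout rules."""
    if value is None:
        return "*"                                   # void member
    if isinstance(value, (list, tuple)):
        inner = ", ".join(sprint_format(v) for v in value)
        return "{ " + inner + " }"                   # composite member
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)

my_object = [5, [4, 10, "Hi"], None]
print(sprint_format(my_object))   # { 5, { 4, 10, "Hi" }, * }
```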
mprint()
syntax: mprint([data to print])
This `matrix' print function prints tables of numbers. Each index of the argument is printed on a separate line; each index of a row prints separately with a number of spaces in between. For example:
> table :: { 2, { 3, nothing, 5 }, { 5/2, "Hello" } }
> mprint(table)
2
3 * 5
2.5 Hello
mprint() has two user-adjustable parameters. mprint.field_width controls the number of spaces in each row; it defaults to 11. mprint.max_digits controls the precision of numbers that are printed out; it defaults to 6. A max_digits of zero means `no limit'. mprint() should not be used to print registers.
The following six routines are applied to the running of scripts or pre-compiled bytecode, or pertain to the execution of commands from the prompt.
run()
syntax: (numeric) script_return_value = run((string) filename [, (composite) target])
The essential run() function runs a script stored in a file. It is quite possibly the most useful routine in user.zoo, since running code is otherwise rather a formal and tedious procedure. run() performs a compilation incorporating the user's namespace (unless the source file ends in ".hob", in which case the script is presumed precompiled); it then transforms the bytecode and runs it. Any errors in the process are flagged along with the offending text. run() searches all directories in the DirectoryPaths[] array. If there is a direct return from the lowest level of a script (i.e. not within a function or type definition) then the return variable will be handed back to the calling script.
Normally the specified script is run in the user's workspace. Optionally, we can pass some other variable or function as a second argument to run(), in which case the script runs inside that object instead.
A given script is often run multiple times. By default, when executing a script run() first checks to see whether it has seen that script before, and if so removes any root-level objects that the script defined when it was last run. This is to avoid type-mismatch errors when the script tries redefining those objects. Occasionally this can be damaging. The user can tell run() not to pre-delete prior definitions by setting run.CleanUp = false.
do_in()
syntax: do_in((composite) target [, search path [, code_args [, bytecode_mod_args]]] , code, base script [, code, code modifying bytecode[]])
The do_in() tool allows one to run code in a specified location and with a specified search path, and gives the option of manually modifying the bytecode before it is run. The idea is that it is easier to write bytecode by perturbing a compiled script than to write everything from scratch.
The first argument to do_in() is the variable to run the code inside. The optional second argument gives a customizable search path, and it exactly mirrors the optional third argument to transform() (see the reference on transform() for how to specify a path). The third and fourth arguments, if given, are passed as args[1] for the script to be run and the bytecode-modifying script respectively.
Following the first code marker we give the text of the script that we want to run, or the closest that the Yazoo compiler can achieve. Often this is all we need. On occasion we may wish to modify the compiled bytecode of the baseline script before it executes, perhaps to achieve something that is unscriptable. do_in() accommodates this need by running, in unusual fashion, the code following an optional second code marker/semicolon in its argument list (if that exists) after compilation but before execution. At that time the compiled baseline script will be stored in an array of signed longs named bytecode, and we may alter it in any way whatsoever provided the bytecode comes out legitimate. In the extreme case we can give no baseline script and simply alias bytecode[] to an existing slong-typed array that was already filled with bytecode.
Here we show how to use do_in() to create an unjammable alias, which cannot be done using ordinary Yazoo scripting.
do_in(
root
code
al := @var1
code
bytecode[3] = that + 128 | add an unjammable flag
)
compile_and_do_in()
syntax: compile_and_do_in((composite) target [, search path [, code_args [, bytecode_mod_args]]] , code, (string) base script string [, code, code modifying bytecode[]])
Compiles a script, optionally modifies it, and then executes the script in the provided directory. This is equivalent to do_in() except that the script is stored as an uncompiled string rather than compiled code. Even though the script string appears in the second coding block of the function arguments, it is passed in the same way as parameters in the first coding block (the constructor); for example:
compile_and_do_in(target_variable; "addendum :: string")
go()
syntax: go([ code, ] path)
Yazoo's go() function is similar to UNIX's cd command: it changes the working variable for commands entered from the prompt. A search path is dragged along behind that leads eventually back to root (the original workspace). To see how this works, type:
> a :: { b := 2 }
> go(a)
> sprint(b) | we are in 'a', so this is legal
2
> sprint(a) | search path extends back to root, so we can see 'a'
{ 2 }
The search path exactly backtracks the given path. If one types go(a[b].c().d), then the working variable is `d', and the search path goes backwards through (in order): the return variable of `c', then `c' itself, then the b'th element of `a', then `a' itself and finally root. Typing just go() sends one back to the root; typing go(root) is actually not quite as good because it puts root on the path list twice. To see the path, look at the global pwd variable.
go() works by updating the go_paths[] array defined by start.zoo. Each command entered from the prompt is transformed and run according to the current state of go_paths, so invoking go() does not take effect until the next entry from the prompt. Thus it was necessary in our example to separate the second and third lines: go(a), sprint(b) would have thrown a member-not-found error. For the same reason, while running a script (via run()), go() will do nothing until the script finishes -- use do_in() instead.
When the user calls go(...), Yazoo constructs the argument list before go() itself has a chance to run. Owing to this fact, certain sorts of go-paths will cause an error that go() can do nothing about. For example, go(this[3]) will never work because `this' is construed as the argument variable, not the working variable. To get around this problem, go() gives us the option of writing the path after a code marker or semicolon, as in go(code, this[3]), as those paths are not automatically evaluated. A code marker is also useful if we need to step to a function's return variable but don't want the function to run more than once. go(code, a.f().x) will evaluate f() just a single time in the course of go-processing, whereas for technical reasons f() would have run twice had we not included the code marker.
go() at present has many limitations. Each path must begin with a member or register name or this, and all subsequent steps must consist of step-to-member (a.b) and step-to-index (a[b] and related) operations and function calls (a()). No [+..] or +[..] operators are allowed. The step-to-index operations are particularly dicey because of two nearly contradictory requirements: the path can only step through single indices, and for practical use the path must nearly always span complete members (i.e. all of the indices of most arrays). Although the latter is not a hard requirement, it is really hard to do anything meaningful within a single element of an array, because so many common operations involve creating tokens and hidden variables which can only be done for all elements of the array simultaneously. Even go() will not work at that point, so in this sticky situation start.zoo will eventually take pity and bail the user out. The upshot of all this is that go() is not very good inside of arrays.
jump() is a similar operation to go(), except that go() can shorten a path whereas successive jumps keep appending to the current search path.
jump()
syntax: jump([ code, ] path)
jump() is basically identical to go() except in the way that it handles the first step in a search path. For most details, see the explanation of go() above. The difference between the two functions can be seen by example.
> a :: { b :: { ... } }
> go(a.b), print(pwd)
root.a.b
> go(a), print(pwd) | starting from a.b
root.a
> go(b), print(pwd)
root.a.b
> jump(a), print(pwd) | again, starting from a.b
root.a.b-->a
jump() takes advantage of the fact that search paths in Yazoo can twine arbitrarily through memory space; we don't have to restrict ourselves to paths where each variable is `contained in' the last. A more useful path would be something like root.a.b-->c.d: that would allow us to work inside of `d' while retaining access to `a' and `b', even if those latter lie along a different branch.
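The difference between go() and jump() in the example above can be reproduced with a toy Python model. Everything here is an inference from that example rather than a description of start.zoo: the Prompt class, the nested-dict workspace, and the rule that go() backtracks to the point on the search path where the first step was found are all assumptions.

```python
class Prompt:
    """Toy model of the prompt's search path; each entry is (label, node)."""

    def __init__(self, root):
        self.path = [("root", root)]

    def _find(self, name):
        # Search backwards along the path for a variable that contains `name`.
        for depth in range(len(self.path) - 1, -1, -1):
            if name in self.path[depth][1]:
                return depth
        raise KeyError(name + " not found on the search path")

    def _walk(self, steps, keep_whole_path, joiner):
        depth = self._find(steps[0])
        node = self.path[depth][1]
        if not keep_whole_path:
            del self.path[depth + 1:]      # go(): backtrack to where it was found
        for i, name in enumerate(steps):
            node = node[name]
            self.path.append(((joiner if i == 0 else ".") + name, node))

    def go(self, *steps):
        if not steps:
            del self.path[1:]              # go() with no path returns to root
        else:
            self._walk(steps, keep_whole_path=False, joiner=".")

    def jump(self, *steps):
        # jump(): always append to the current search path
        self._walk(steps, keep_whole_path=True, joiner="-->")

    @property
    def pwd(self):
        return "".join(label for label, _ in self.path)

p = Prompt({"a": {"b": {}}})
p.go("a", "b")
print(p.pwd)      # root.a.b
p.go("a")
print(p.pwd)      # root.a
p.go("b")
print(p.pwd)      # root.a.b
p.jump("a")
print(p.pwd)      # root.a.b-->a
```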
ls()
syntax: (string) var_names = ls([(variable) var])
Returns the names of the variables in the current directory, which is usually root (see go() and jump()). If an argument is provided then ls() returns the names of the variables inside that argument variable. Remember that ls() requires the parentheses! Just typing `ls' (no parentheses) at the command prompt will print out the internal variables of the ls() function.
The next four functions in user.zoo perform various file I/O and list- or table-related operations.
Load()
syntax: (string) filedata = Load((string) filename)
Load() (capital `L') extends the built-in load() function by searching all paths in the DirectoryPaths[] array.
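A path-searching load of this kind might look like the Python sketch below. It is a sketch under assumptions, not the user.zoo routine: load_with_paths is a hypothetical name, and both the search order (array order) and the first-match-wins rule are guesses.

```python
import os
import tempfile

def load_with_paths(filename, directory_paths):
    """Return the contents of the first matching file, or None if none exists."""
    for folder in directory_paths:
        candidate = os.path.join(folder, filename)   # "" means the current directory
        if os.path.isfile(candidate):
            with open(candidate, "r") as handle:
                return handle.read()
    return None

# Usage: the empty path is tried first, then a second (temporary) folder.
with tempfile.TemporaryDirectory() as extra_dir:
    with open(os.path.join(extra_dir, "yazoo_load_example.txt"), "w") as handle:
        handle.write("hello")
    data = load_with_paths("yazoo_load_example.txt", ["", extra_dir])
```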
Save()
syntax: Save((string) filename, (string) filedata)
Save() (capital `S') extends the built-in save() function by searching all paths in the DirectoryPaths[] array. This is important when a filename such as archive/mail.txt is provided, since the archive/ folder may not be in the default (./) directory.
SaveTable()
syntax: SaveTable((string) filename, (table) data [, (string) header])
The SaveTable() routine exports data stored in a set or array in table format to a file. In some ways it is similar to mprint(): successive indices of the table are written to successive lines, although fields within each index are separated by tabs (not spaces, as in mprint()). If the optional header is specified, that is printed verbatim at the top of the table, whether or not the header rows correspond to the rows of the table.
ReadTable()
syntax: ReadTable((table) to_read, (string) raw_text [, code, (Booleans) IfHeader, ResizeFirstIndex, ResizeSecondIndex = values])
The counterpart to SaveTable() is ReadTable(), which loads data into an array. It reads the data from a string, not a file, and tries to parse the data into the provided table (and if it fails it will print an error message to the screen). If the IfHeader variable is set to true, then the first line of text is skipped. Setting the Resize...Index arguments gives ReadTable() permission to adjust the size of the table to fit the data; in order for this to work the table must be a rectangular array (i.e. not a set of members that can be resized independently). The default values of the optional arguments are false for IfHeader, and true for ResizeFirstIndex and ResizeSecondIndex. An error results in a non-zero value for ReadTable.err_code.
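The table text format shared by SaveTable() and ReadTable() (one table index per line, tab-separated fields, optional header line) can be sketched in Python. The function names are hypothetical, and the sketch assumes purely numeric fields and returns a new list instead of filling a provided table and setting an error code.

```python
def table_to_text(rows, header=None):
    """Tab-separated fields, one table index per line."""
    lines = [] if header is None else [header]
    lines += ["\t".join(str(field) for field in row) for row in rows]
    return "\n".join(lines) + "\n"

def text_to_table(raw_text, if_header=False):
    """Parse tab-separated text into rows of floats (numeric fields assumed)."""
    lines = raw_text.splitlines()
    if if_header:
        lines = lines[1:]            # skip the first line of text
    return [[float(field) for field in line.split("\t")] for line in lines if line]

text = table_to_text([[1, 2.5], [3, 4]], header="x\ty")
rows = text_to_table(text, if_header=True)   # [[1.0, 2.5], [3.0, 4.0]]
```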
Finally, user.zoo provides six mathematical operations.
round()
syntax: (numeric) rounded_integer = round((numeric) real_number)
Rounds a real number to the nearest integer. For example, 1.499 rounds to 1, 1.5 rounds up to 2, and -1.5 rounds `up' to -1.
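The three examples above are consistent with a "round half up" rule, floor(x + 0.5). Whether user.zoo computes it this way is an assumption; the Python sketch below (with the hypothetical name round_half_up) simply reproduces the stated results. Note that Python's own built-in round() behaves differently, rounding exact halves to the nearest even integer.

```python
import math

def round_half_up(real_number):
    """Round to the nearest integer, with exact halves going toward +infinity."""
    return math.floor(real_number + 0.5)

print(round_half_up(1.499), round_half_up(1.5), round_half_up(-1.5))   # 1 2 -1
```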
min()
syntax: (numeric) result = min((numeric list) the_list [, code, rtrn = { index / value / both }])
Returns the minimum element of a list: its index, value (the default), or the combination { index, value }.
max()
syntax: (numeric) result = max((numeric list) the_list [, code, rtrn = { index / value / both }])
Returns the maximum element of a list: its index, value (the default), or both { index, value }.
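The three return modes of min() and max() can be illustrated with a small Python sketch. The helper name extreme is hypothetical, and two details are assumptions rather than documented behavior: indices are taken to be 1-based, and ties resolve to the first occurrence.

```python
def extreme(the_list, pick, rtrn="value"):
    """pick is Python's min or max; rtrn is "index", "value", or "both"."""
    if not the_list:
        raise ValueError("the list is empty")
    value = pick(the_list)
    index = the_list.index(value) + 1     # 1-based, first occurrence (assumed)
    if rtrn == "index":
        return index
    if rtrn == "both":
        return (index, value)
    if rtrn == "value":
        return value
    raise ValueError("rtrn must be 'index', 'value', or 'both'")

data = [4.0, -2.5, 9.0, -2.5]
print(extreme(data, min), extreme(data, min, "index"), extreme(data, max, "both"))
# -2.5 2 (3, 9.0)
```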
sum()
syntax: (numeric) result = sum((numeric list) the_list)
Returns the sum of elements of a numeric list.
mean()
syntax: (numeric) result = mean((numeric list) the_list)
Returns the average (arithmetic mean) of the elements of a numeric list.
sort()
syntax: sort((table) table_to_sort, { (list) sort_by_list or (numeric) sorting_index } [, code, direction = { increasing / decreasing }])
Sorts a list or table, which is passed as the first argument. If it is a table then a second argument is required: either the column number to sort by, or a separate list to sort against. So the following two sorts are equivalent:
MyTable[10] :: { a :: b :: double }
sort(MyTable, 1) | sort by first column
sort(MyTable, MyTable[*].a)
The sort-by list will be unaffected.
Whether to sort in increasing or decreasing order can be specified after the semicolon/code marker; the default is `increasing'. The column to sort by, whether it is in the same table or in a separate list, must be numeric; sort() will not alphabetize strings.
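The two ways of specifying the sort key can be modeled in Python as follows. This is a sketch of the described behavior only: sort_table is a hypothetical name, rows are Python lists, and column numbers are assumed to be 1-based as in the example above.

```python
def sort_table(table, sort_by, direction="increasing"):
    """Sort `table` in place by a 1-based column number or a separate key list."""
    if isinstance(sort_by, int):
        keys = [row[sort_by - 1] for row in table]   # column number, 1-based
    else:
        keys = list(sort_by)                         # copy: caller's list is untouched
    if len(keys) != len(table):
        raise ValueError("sort-by list must have one key per row")
    order = sorted(range(len(table)), key=keys.__getitem__,
                   reverse=(direction == "decreasing"))
    table[:] = [table[i] for i in order]
    return table

my_table = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
sort_table(my_table, 1)    # by first column
# my_table is now [[1.0, 30.0], [2.0, 20.0], [3.0, 10.0]]
```

Sorting an index permutation rather than the rows themselves is what lets a separate sort-by list drive the ordering without being modified.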
Last update: July 28, 2013