Co Tutorial

Contents

Introduction

Co is an interpreted language with dynamic typing which supports prototyping and in which coroutines are first-class objects. It has basic branching and looping constructs, and a minimal set of built in functions for I/O, math, and string manipulation.

Basics

# Comment print("hello world\n"); # string literal expression print("hello", "world\n"); # print is varadic print(2 + 3 * 5, "\n"); # an arithmetic expression

print is a built in function which takes any number of expressions as arguments, evaluates them, and prints them to stdout with a space between each one.

In this example, we can see string literal, and arithmetic expressions.

Variables

var x = "hello"; var y = 2 * (3 + 5); print(x, "world", y, "\n");

The keyword var introduces a declaration statement, which adds a named value to the current scope. In this example the current scope is the global scope (more about scopes later). On the right hand side of the declaration is an expression, which is evaluated when the declaration is executed, and whose result ends up as the value of the variable.

You can also write multiple declarations, just as you can in C:

var i, j; var x = 10, y, z = 100;

Types

Co variables don't obviously have a type the way they do in a language like C-- you declare everything with var, rather than using int for integers and float for floating point numbers. But internally, Co has the following six types:

null
The null type, which has only one value. The only thing that is equal to null is null. Usually used for termination codes and that kind of thing.
integer
32-bit signed integers.
real
IEEE double precision reals.
string
strings-- things like "hello".
object
More on these later
handle
Opaque pointer for passing back to built-in functions. For example, if you use the built in function open to open a file, it returns a variable of the handle type which you pass back to subsequent calls to read. Otherwise there would be no way of telling read which out of potentially many open files you wanted to read from. You can do very little with variables of the handle type except pass them back into the built in functions that are expecting them.
Instead of using the handle type, we could have provided a mechanism for built in objects. This is what many languages (Python, for example) do. Generally this seems to make life easier for the Python programmer at the expense of the Python extender, which is probably the right way around.

Assignment

var x = 4; var y = 5; x = x + y; x += x; print(x, "\n");

Assignment changes the value of a variable that's already been declared. If you assign to a variable that hasn't been declared, it gets automatically declared for you. Note also that you can declare variables without initializer expressions, in which case they are initialized to the null value:

var x; # no initializer expression, x gets initialized to null y = 5; # y gets declared for you

Autodeclarations are handy, especially for global variables in short scripts, but be careful-- remember that an assignment statement looks for the variable in ancestor scopes, and the variable is only autodeclared if it wasn't found in an ancestor scope. This is best illustrated by an example.

define f() { # # Uncommenting this line is much better-- it guarantees f has its own # variable i # # var i; # # Simply print out the contents of f. We're autodeclaring i. This is # somewhat risky, because if there's already an i in a parent scope (in # this case there is, in the global scope), then we use that one! # i = 0; while (i < length()) { print(get(i), "\n"); i += 1; } } # A global variable i i = 100; push(f, "hello"); push(f, "world"); f(); # # The global variable now has the value 2, because f has changed it-- not # usually what one intended. # print("The global i now has value:", i, "\n");

You can also assign to member expressions, which we'll explain later.

Literal Expressions

An integer literal is a string of digits 0-9 representing a decimal number. You can't do hex or octal at present.

A real literal is a string of digits 0-9 followed by a dot followed by some more digits. You can't do standard form at present, and you must have all the digits (i.e. you have to write 0.0, not .0 or 0.).

No particular reason for these restrictions other than laziness. It's very easy to add support for hex, octal, standard form etc. in the lexer.

A string literal is some ascii characters in single or double quotes. You can use \t and \n for tabs and newlines. If you want double quotes in the string, use single quotes to delimit it (and vice versa).

You can't do object literals at the moment, although those would be good to add one day. JavaScript does them in a nice way.

Control Flow

Conditions

You get if and else, with a syntax pretty much like C, JavaScript, Java, or anything else:
var a = 10; if (a == 10) print("ok\n"); else { print("a is not 10\n"); a = 100; }

if works by determining the truth of the condition you give it. This can be any expression, and there are rules for what values of each of the six types count as true:

null
Always false.
integer
False if zero, otherwise true.
real
False if 0.0, otherwise true.
string
The empty string is false, all other strings are true.
object
All objects are true.
Handle
False if it's a NULL pointer, otherwise true.

Boolean expressions (ones with == or != in them) evaluate to integers, with the value 1 for true, and 0 for false.

There's also a unary negation operator, !, just like C.

Loops

You get while, which works just like if, except after the end of the taken block, the condition is evaluated again and the block repeated if the condition is still true. You cannot have an else clause after a while.
a = 0; while (a != 4) { print("a:", a, "\n"); a += 1; }

You can use the break statement within a loop to break out of the innermost loop.

Objects

So far we've seen how to write top-level code and we could write little scripts similar to bc scripts with no functions.

Functions and Subroutines

Some languages (BBC Basic for example) make a distinction between subroutines (also called procedures) and functions. The technical difference is that subroutines can have local variables, and you can think of a function as a special case of a subroutine that has no locals. In languages like C, using this terminology, what we call functions are actually in many cases subroutines.

Thinking more about what the words function and procedure mean in other contexts, functions are often things that we define implicitly-- for example by declaring that the value of a function f(x) is the value such that some condition(s) are true of the input and output. Many functions that can be easily defined in this way are extremely difficult to evaluate on a computer or at all.

Procedures on the other hand are more explicit and more imperative, taking the form of instructions to machines to move and combine data. The fact that some classes of function can be evaluated with procedures probably has something to do with the Church-Turing thesis, and is fundamental to computing.

Much programming involves a mixture of these two things: defining procedures that evaluate functions explicitly; and defining functions implicitly in terms of each other (and sometimes in terms of themselves). The balance between these two kinds of definition is different in different programs and in different programming languages. Procedure definitions are generally more likely to involve local variables than implicit definitions, which is where the idea that a subroutine without locals should be called a function seems to come from.

As soon as a subroutine contains local variables, it has its own scope, its own place to store and retrieve intermediate results. The code in the body of the subroutine can take the form of commands to a machine to manipulate the contents of this data area, as part of the process of computing the value of the function. It's natural to call a function that's defined this way a procedure.

An implicit function definition that doesn't straightforwardly correspond to code is sometimes called a program specification. No known machine can execute those directly.

Objects

In languages like C, the local variables of a function exist on a stack. When the function is called, an area of memory (called a stack frame) is allocated on this stack for the local variables, and the variables are constructed using the initializer expressions in the function definition. When the function returns, the stack frame is destroyed. If the function is ever called again, it's with a new stack frame, in which all the locals are constructed again.

One can think of the stack frame as an object in the sense that it's an area of storage containing named variables. The lifetime of stack frame objects cannot be controlled independently by a C programmer-- it's something that happens automatically as part of the mechanism of function calls.

If a C programmer does need objects that persist across function calls, he usually defines structures, and constructs his own structure instances in heap memory. Objects of this kind have evolved into the objects of object-oriented programming.

The fact that in many languages (Python and C++, for example), constructors have function-call syntax reflects this. The difference between a function call and a constructor call is that a function call constructs an automatically allocated and de-allocated stack frame object, and a constructor call constructs an object that persists for as long as it is needed.

Don't allow yourself to be confused by the fact that in C and C++, objects of user-defined types can be constructed on the stack.

In Co, functions and objects are indistinguishable. This is best illustrated by example:

define f(x) { var y = x; return x + y; } print(f(2), f(3), "\n");

This program prints out 4 5, not 4 6 as you might expect. The definition statement defines an object f. The object is constructed by the first call to it, which initializes its variable y with the value 2. This value remains 2 the next time the function is called. The variable y behaves rather like a static variable in C, but with an important difference:

print((clone f)(2), (clone f)(3), "\n");

prints out 4 6. In this example, each call is actually a call to a newly constructed clone of the object f. Each clone has its own storage area, and its own copy of the variable y. This is unlike static objects in C, of which there is only one instance per process.

The object in the last example looks rather like a C function and can be used more or less like one. But consider this object:

define Rectangle(w, h) { var width = w; var height = h; define computeArea() { return width * height; } define setWidth(w) { width = w; } define setHeight(h) { height = h; } } var rect = clone Rectangle; var rect2 = clone Rectangle; rect(10, 20); rect2(100, 200); print("Width is", rect.width, "\n"); print("Area is", rect.computeArea(), "\n"); rect.setWidth(20); print("Width is", rect.width, "\n"); print("Area is", rect.computeArea(), "\n"); print("Area of rect2 is", rect2.computeArea(), "\n");

In this example the Rectangle object is being used like a class, or, more strictly speaking, as a prototype.

There are a couple more language features of Co this example introduces which should be explained:

inner objects or methods
The body of an object may contain definition statements. Any variable referred to in the body of an inner object that has not been declared there is searched for in the parent scope. This is how the variables width and height are visible inside the body of the computeArea object.
member expressions
Variables are available in unrelated scopes using member expressions in which the object name is followed by a dot followed by the variable name. An alternative syntax for member expressions is rect["width"] (just like in JavaScript), which has the advantage that the variable name is a string expression. You can use a member expression in most places, including on the left-hand-side of an assignment.

Every object in Co has an associated scope which is basically a dictionary of named variables. Clone an object, and you clone two things: its scope, and the code in it (which you might call the body of the object). Any of the variables inside an object may have the object type, with their own scopes. All this is cloned recursively.

If you assign something to a member that isn't in the object's own scope, then the object is extended with the new member. Ancestor scopes are never updated as a result of assignments to member expressions.

Using this feature, any Co object can be used as a dictionary of name-value pairs.

define table() {} define indices() {} f = open("/usr/dict/words"); i = 0; while (i < 10000) { word = strip(read(f)); # Put each word into the dictionary, with its linenumber as a value table[word] = i; # Also store the word in an array. push(indices, word); print(word, "\n"); i += 1; } # Now match up the array to the dictionary to check everything worked properly. # Note: we do require that all the words are unique i = 0; while (indices[i] != null) { value = table[indices[i]]; assert(value == i); print(table[indices[i]], "\n"); i += 1; } close(f);

Many objects contain only a handful of members, but objects used as dictionaries like this may contain many thousands. The Co interpreter builds an efficient index to make lookup and insertion into such large objects a relatively efficient operation.

The parameters of an object become variables in its scope when it is called (or constructed-- they're the same thing in Co). This means we could have written our Rectangle example more succinctly like this:

define Rectangle(width, height) { define computeArea() { return width * height; } define setWidth(w) { width = w; } define setHeight(h) { height = h; } }

In other words, the actual parameters passed in width and height persist inside the object along with everything else.

In many traditional object-oriented languages one often finds oneself writing constructors that simply copy their parameters into this or self. This workaround reflects the fact that such languages are stack-oriented with objects bolted on-- in effect, when you write self.width = width you are manually copying data from the stack object to the heap object.

Objects can be cloned at any time in their lifecycle. In this example, we cloned the Rectangle object before we had constructed it. In fact, we never constructed it, we only constructed the clones.

Definition statements are like declarations with initializers, in the sense that they are executed once. When a definition statement is executed, it creates a new object variable referred to by name in the scope in which it is executed. In this way, the definition of computeArea is no different from the initialization of a variable computeArea of object type, in which the initializer expression would correspond to something like a Lisp lambda expression or a JavaScript object literal.

If you clone before constructing, you get clones of the declaration and definition statements ready to be executed. If you clone afterwards, you get clones of the variables and their names. As we will see in the next section, you can also clone an object half-way through construction, and this presents no problems.

Coroutines

Objects in Co can be used to implement subroutines (or procedures), as we have seen; but in fact they belong to a more general class called coroutines (or generators).

A generator is just like a procedure, except the next time you call it, it carries on where it left off instead of starting again from the beginning.

This example of a generator prints out the series of numbers 0 to 3:

define f(x) { var i = 0; while (i != x) { yield i; i += 1; } yield null; } var x = f(4); while (x != null) { print(x, "\n"); x = f(4); }

yield is just like return, except for one important difference: each time f is called, execution continues from after the last yield statement, with the state of all local variables the same as they were.

In this example, we have a yield statement last thing in the object's body, and we yield the null value to signify termination. This is a common technique in Co. When a generator has yielded for the last time in Co, it starts again from the beginning (although initializations are not executed again, of course).

A slightly quirky feature of Co is that the yield value of a function is a variable in its scope just like any other, and the yield statement does two things: it sets the value of that variable, and it relinquishes control. Control is relinquished at the end of the body anyway, so the yield value, if not set at that point, retains the value it was last set to.

You can access the yield value by name if you like:

define f(x) { y = x; yield x + y; } print(f(2), f(3), f["yield"], "\n");

Note that you have to use the [] syntax for the member expression (as shown in this example), rather than the dot syntax as that would confuse the parser (since the unquoted word yield is a reserved word).

This example prints out the sequence 0 1 2 3 3 0 1:

define f(x) { i = 0; while (i != x) { yield i; i += 1; } } var x = f(4); var j = 0; while (j != 7) { print(x, " "); x = f(4); j += 1; } print("\n");

What's happening here is that after the while loop in the body of f has finished, there is an implicit yield with the previous yield value at the body's closing brace. Note too that the declaration of i has been changed to an assignment (with implied automatic declaration). This way, when the generator starts again, the value of i becomes 0 again. Had we left that line as var i = 0;, the generator would have continued to yield the value 3 indefinitely for all future calls (since in this case the loop's entry condition would not have become true again).

Note that return is really just the same as yield, except for the one fact that when you call an object again after it returned, it starts again from the beginning, instead of resuming from the statement following yield. The return value is still put into the object's scope under the name yield just as if you had used yield.

You can use either yield or return on their own, without an expression. In this case the null value is yielded.

As mentioned briefly earlier, you can clone an object at any point; so you can call a generator a few times, and then clone it, and then start calling the clone. The clone will continue where the original object left off. You can call the original object again later, and it will carry on from where it was the last time it was called. This is somewhat similar behaviour to the fork system call in UNIX/POSIX C programming.

In fact, coroutines will be familiar to C programmers in the form of threads and processes, although these are a special kind of coroutine called asynchronous coroutines.

One of the motivations behind the kind of generator-oriented programming that Co provides is to make writing programs for multiple CPU systems more natural. For example, in conventional languages it's common to write a function that assembles a list of results of some kind. The list is then passed to another function which does the next operation on each element. The two functions could easily be designed to run in parallel since they effectively form a pipeline.

The current implementation of the Co interpreter is single-threaded, but it would be straightforward for objects to execute in parallel routinely, provided they were synchronized at yield points. This would make turning a serial program into a parallel program a simple switch in the interpreter, without requiring enormous redesign and refactoring of the program.

More tricks with coroutines

We've seen that the parameters you call an object with just become variables in that object's scope. If you call the same object again with new arguments, the values of those same variables are overwritten with the new ones.

If you call an object with fewer arguments than it has formal parameters, the null value is passed in all remaining parameters.

It's possible to capture the values of the parameters to the first call to an object, using the following simple trick:

define primes(max_) { var max = max_; var candidate = 3; while (candidate < max) { limit = int(sqrt(candidate)); ok = 1; i = 2; while (i <= limit and ok) { if (candidate % i == 0) ok = 0; i += 1; } if (ok) yield candidate; candidate += 1; } yield null; } var prime = primes(100); while (prime != null) { print(prime, "\n"); prime = primes(); }

All the subsequent calls to primes(); in this example pass no parameters, which means in fact that they set the max_ parameter in the called object's scope to the null value. But the body of primes isn't using max_, it's using max, which is initialized to the value that max_ had the first time the object was called (recall how var declarations work).

How to use Co objects

Many widely used languages provide some support for functions and objects, and it's common to simulate generators with objects-- you give the object a method called something like next() or step(), and call that repeatedly.

Of course generators aren't the only thing you need objects for: in stack-based languages you need to construct objects of some description any time you require state that persists across returning from function calls; and this is the style of programming we've all become used to. It works well in fact, because, due to the naturally recursive nature of many programs, the procedure instances required by many programs fit naturally into a stack pattern.

Some languages support generators too-- Python, for example, although interestingly there, although the generator you define looks quite a lot like a Co generator, calling it results in the construction of a generator object with a next() method. This is a clever way to introduce generators into what is basically a fairly normal call-stack and class based language.

The unusual thing about Co is that you don't necessarily make the decision about whether a definition is of a generator, function or constructor at the time you define it, but at the time you use it. JavaScript is somewhat similar in this regard, in that functions and constructors are the same. The difference is that in the body of a JavaScript function/constructor, you always have access to a persistent object (accessible from this), and a stack frame (which is just there in the normal way). Co has no stack, and always constructs a persistent object, which may of course be a clone. Co requires no this keyword.

Is this indistinguishability a good thing? Possibly not. I suspect that with most objects, how they have to be used is effectively established at definition time, in which case the language may as well reinforce this by providing separate keywords like class, function etc., as opposed to Co's general purpose define... But only time and the experience of writing a few Co programs will tell.

Built in functions

Co provides a fairly minimal set of built in functions intended to make it possible to write real-world programs to see what it's like. Adding new builtins is very easy, so one could use Co as a glue language etc.

print(...)
Interprets its arguments as strings (whatever type they are) and prints them out with a space between them. If there are more than one argument, and the first argument starts with a % character, then it is interpreted as a format specifier that applies to all subsequent arguments of numeric types (integer and real) and strings. The format specifier is of the form width, followed by a dot, followed by precision. Either width or precision may be omitted. For strings, precision is ignored.
sqrt(n)
The square root of n.
sin(x)
The sine of x (which is taken to be in radians).
cos(x)
The cosine of x (which is taken to be in radians).
tan(x)
The tangent of x (which is taken to be in radians).
asin(x)
The arc sine of x in radians.
acos(x)
The arc cosine of x in radians.
atan(x)
The arc tangent of x in radians.
open(filename, [mode])
Open a file, mode is the same as you'd pass to fopen. Returns a file handle. If mode is not supplied, r is assumed.
popen(shell command, [mode])
Open a pipe to or from a shell process. Returns a handle.
read([handle])
Read a line from file or pipe (handle came from open or popen). If handle is omitted, reads from stdin.
load(handle, [length])
Read length bytes from file into a string. If length is omitted, read the whole file.
flush(handle)
Flush the file referred to by handle.
close(handle)
Close something you opened with open or popen. Open file handles are also automatically closed by the garbage-collector if there are no references to them. Unlike in languages like C, closing a file or pipe more than once is harmless.
write(handle, ...)
Same as print, only write to the file or pipe specified by handle. First argument after handle may be a format specifier (see print).
rand()
Return a random integer
real(x)
Convert x to a real.
int(x)
Convert x to an integer.
string(x)
Convert x to a string.
substring(s, start, count)
Return a substring of s of count characters from position start (0 at the beginning).
split(s, [splitchars])
Return an object containing in its intrinsic array (more about that later) the substrings of s that result by splitting it at any of the characters in splitchars (which is a string). If no splitchars are given, split splits at whitespace. If null is passed in splitchars, the string is split at every character (rather like using / */ in perl).
Co objects are passed null in any parameters you leave out. Builtin functions don't work like that though-- split does one thing if you pass null, and another thing if you only pass one parameter.
strip(s, [stripchars])
Remove leading and trailing whitespace from s. If you provide a string in the stripchars parameter, removes any leading and trailing characters that are in that string instead of whitespace.
system(shell command)
Execute the shell command in a subshell.
push([object], expression)
Evaluate expression and push the result onto object's scope's intrinsic array. If object is not given, push onto the current scope's array.
pop([object])
Remove and return the last element of object's intrinsic array. See push for what it means if you don't provide an object. Returns null if array is empty.
shift([object])
Same as pop, but with the first element rather than the last.
top([object])
Returns the last element, like pop but doesn't remove it.
head([object])
head is to shift as top is to pop.
get([object], index)
Return the index'th element of object's array.
set([object], index, expression)
Evaluate expression and set the index'th element of object's array to the result.
length([object])
Return the length of object's array, or the string length of a string.
input([prompt ...])
Invite the user to type in a string interactively, which is returned.
assert(expression, [...])
If expression is false, the interpreter aborts with a message that an assertion fails. Any other parameters are printed out to stderr along with the failure message. If expression is true, nothing happens.
compile(pattern, [flags])
Compile a regular expression object for subsequent use with match.
match(pattern, string, [flags])
If pattern (which is either a string defining a regular expression, or a regular expression object returned from a previous call to compile) matches string, return an object whose intrinsic array contains the groups of the match. See below.
upper(s, [...]);
Convert all the arguments in place to upper case. All arguments must be strings. Returns null.
lower(s, [...]);
Convert all the arguments in place to lower case. All arguments must be strings. Returns null.
env(string)
Return the value of the environment variable string (just like POSIX's getenv).
keys([object])
Returns an object whose intrinsic array contains all the member names of object's scope (or the current scope if object is not passed).

In addition to these builtin functions, there are four builtin objects, stdin, stdout, stderr and argv.

Regular Expressions

Co regular expressions use the PCRE library. Regular expression syntax is pretty much like Perl or Python (as opposed to POSIX). match and compile both have an optional flags parameter, which is a string of single-character flags. The following flags are supported:

a
Patter is anchored, which means it only matches at the start of the string.
i
Case-insensitive.
s
Dot-all.
x
Extended-- you can have whitespace in your patterns.
m
Multiline.
u
Ungreedy. Makes all quantifiers ungreedy by default.

For more information, see man(3) pcreapi and the PCRE documentation.

Formatting

If the first argument to print, or the second argument to write, starts with the % character, then it is interpreted as a format specifier that applies to all subsequent arguments of numeric and string types, although note that a precision, if specified, is ignored for arguments of the string type.

So, to print out a series of numbers with 3 decimal places of precision, you can use:

x = 1.1 / 4.5; y = x + 1.0; z = y / 2.0; print("%.3", "x is", x, "y is", y, "z is", z, "\n");

Outputs:

x is 0.244 y is 1.24 z is 0.622

You can specify a width as well as, or instead of, a precision, by using, for example, %12.6 or %12 respectively.

Note that the \ character can be used to escape %, so use \% if you really need to print the % character as the first argument to print. \n and \t can also be used to print newlines and tabs (we've already used this in many of the examples).

Arrays

Scopes in Co contain dictionaries of name/value pairs as we have seen. Every scope, including the global scope, also provides an array, accessible using the built in functions get, set, length, push, pop, shift, and head.

An object's intrinsic array can also be read and assigned to using [] expressions.

define List() {} list = clone List; push(list, 10); push(list, "hello"); list[1] += " world"; print(list[0], list[1], "\n");

The program arguments (argv) are passed in the global array. So here is a sample implementation of echo in Co:

shift(); # the first arg is the program name itself var arg = shift(); while (arg != null) { print(arg); print(" "); arg = shift(); } print("\n");

For convenience, the program args are also available as array members of the predefined global object argv.

If you try to read or write an array past its bounds, the interpreter aborts with an error message.

One thing we haven't mentioned yet is the parent keyword. Every scope contains a reference to the object containing the object whose scope it is, named with the special name parent. This is necessary if, for example, you want to write intrinsic array mutators:

define Container() { define addElement(x) { # Push into the parent's array, (not into addElement's own array) push(parent, x); } } var container = clone Container; container(); container.addElement("hello"); container.addElement("world"); i = 0; while (i < length(container)) { print(container[i], "\n"); i += 1; }
In some ways, parent is the closest thing in Co to the this keyword provided by many other languages.

Other source files

Co provides an include statement that works just like C's #include directive. That is to say, it's as though you'd pasted in the other file at the point you include it.

The syntax is pretty straightforward:

include "Matrix.co"; include "../Solver.co";

You can provide any path, absolute or relative. If the file is not found at first, the interpreter tries prefixing the path with a list of paths defined by the enviroment variable COPATH, which is either a single directory, or a : separated list.

Why the bulldozer, and not a more intelligent import statement? To keep things simple. Co provides general-purpose objects that can be nested, and this should be adequate for maintaing modularity without the need for another system of namespaces etc. on top of it.

Code inside objects is only executed when the objects are called. This is another property that should make it possible to get away with nothing more fancy than include-- the Co programmer should have sufficient control over any of the potential side-effects of including.

It's probably good advice to avoid putting any code in the global scope other than definition statements in Co files intended to be used as re-usable modules.

Using Co

What we've explained so far is about it as far as the actual language is concerned. But this small but general set of features (objects, cloning and scope searching) should be quite powerful.

Inheritance

Here's an example of how to do something closely resembling the inheritance of class-based languages:
define Shape() { var area = null; define printArea() { print("area is", string(area) + "\n"); } define Rectangle(width, height) { define computeArea() { area = width * height; } } define Circle(radius) { define computeArea() { area = 3.141 * radius * radius; } } } # # Construct the base prototype once and for all. Note however that we haven't # constructed either of the specializations Circle or Rectangle yet. # Shape(); # # An empty object, we're just going to use it as an array prototype # define List() {} # # shapes is our list of generic shapes # shapes = clone List; # # clone the base class, and take a reference to the Circle object within the # new clone. # base = clone Shape; circle = base.Circle; # # circle now refers to a circle, and can be "constructed" with a radius, and so # on. # circle(5.0); push(shapes, circle); # # And another circle for good measure # base = clone Shape; circle2 = base.Circle; circle2(10.0); push(shapes, circle2); # # clone the base class again, this time take a reference to the Rectangle # object ("subclass prototype") within the new clone. # base = clone Shape; rectangle = base.Rectangle; rectangle(10.0, 20.0); push(shapes, rectangle); # # Loop through our list of generic shapes # i = 0; while (i < length(shapes)) { shape = get(shapes, i); shape.computeArea(); # "virtual" method shape.printArea(); # method from base class i += 1; }

The way to think about this is to ask, when we clone the object Shape, what exactly do we get? This depends on when we clone it.

When we execute the code in an object (by calling it), any declaration or definition statements in there will add named values to the object's scope. Those will all get cloned when the object is cloned. The code in the object, including where to continue from, is also cloned.

In this example, we deliberately construct Shape before cloning it to make our Rectangle and two Circles, but we don't call either of its methods Rectangle or Circle until after cloning.

So, when we clone the Shape prototype we get an object containing a variable area with the null value, and a block of code containing three definition statements.

Then, in the case of rectangle, we execute only one of those definitions, to add to that clone the Rectangle object. After executing the definition, the rectangle object contains variables width and height, and the definition of computeArea, and also has access to the variables area and printArea since they are in its parent scope.

For circle and circle2, we simply instantiate the alternative specialization in just the same way, and use that.

It's interesting to compare this with the way languages like C++ implement inheritance, which they basically do with composition. In other words, in C++, Rectangle objects would contain Shape objects. Any method not found in the Rectangle object would get looked for in the Shape object inside it. But in Co, the Rectangle is inside the Shape, and looks upwards, using exactly the same mechanism that's used to find variables in wider scopes.

Note that the interpreter copies code just by incrementing reference-counts, since all code in Co is read-only, there's no need to copy it. This keeps cloning fairly efficient.

Recursion

The word recursion in the context of programming brings to mind two things:

  1. Functions that are defined in terms of themselves.
  2. Procedures that call themselves.

These are two different angles on the same thing really (see thoughts above about functions and procedures).

Attention is often drawn to tail recursion, which is a special case of recursion. You can spot a tail-recursive procedure because the recursive call is last thing in the body of the procedure.

It is well-known that tail recursion is equivalent to iteration, and much is made of languages like Scheme that can implement tail recursion as efficiently as iteration. So too does the Logo interpreter on these pages. Some C compilers are able to achieve this optimization for leaf functions I believe as well.

The reason C compilers generally have a problem implementing tail recursion efficiently is that every time you make a function call in C, you construct a new stack object, even if you don't really need one.

Illuminatingly, in Co, an object that calls itself just is a loop-- remember, objects are generators, which means you carry on from where you left off the last time. When you get to the end you go round again. If you need full recursion, then you don't call yourself, but a clone of yourself. This is just like pushing a new stack frame in C.

# # Simple tail recursion, loops up to 4. # define f(n) { if (n >= 4) yield null; else { print(n, "\n"); f(n + 1); # call to self is last-thing, no need for a clone } } # # Full recursion, we generate a tree. # define g(n) { if (n >= 4) yield null; else { # Print some indentation to indicate depth i = 0; while (i < n) { print(" "); i += 1; } print(n, "\n"); (clone g)(n + 1); (clone g)(n + 1); } } f(0); print("\n"); g(0);

Grammar

Overview

Freeform style where all whitespace is treated the same. Semicolons between statements, and braces around blocks. Any expression may be used as a statement (just like C), but note that assignment is a statement, not an expression (so no embedded assignments). None of the the pre- and post- increment and decrement operators, but you get += etc.

Lexical Tokens

#.*$ /* eat comments-- return nothing */ include return INCLUDE; var return VAR; yield return YIELD; define return DEFINE; if return IF; else return ELSE; while return WHILE; clone return CLONE; null return NULL_TOKEN; and return AND; or return OR; == return EQUALS; != return NOT_EQUALS; \<= return LTE; \>= return GTE; \+= return PLUS_EQUALS; -= return MINUS_EQUALS; \/= return DIV_EQUALS; \*= return MUL_EQUALS; [a-zA-Z_]+[a-zA-Z_0-9]* yylval.string = strdup(yytext); return IDENTIFIER; [0-9]+ yylval.integer = atoi(yytext); return INTEGER; [0-9]+\.[0-9]+ yylval.real = atof(yytext); return REAL; \"[^\"]*\" yylval.string = strdup(yytext); return STRING; \'[^\']*\' yylval.string = strdup(yytext); return STRING; \n current_line++; [ \t]+ /* eat whitespace-- return nothing */ . return yylval.integer = yytext[0];

Parser

0 $accept: module $end 1 module: program 2 program: statement 3 | program statement 4 block: '{' program '}' 5 | statement 6 | '{' '}' 7 arglist: expression 8 | arglist ',' expression 9 args: '(' ')' 10 | '(' arglist ')' 11 definition: DEFINE IDENTIFIER args block 12 statement: declaration 13 | definition 14 | evaluation 15 | yield 16 | ret 17 | brk 18 | assignment 19 | simple_condition 20 | condition 21 | loop 22 | include 23 yield: YIELD ';' 24 | YIELD expression ';' 25 ret: RETURN ';' 26 | RETURN expression ';' 27 brk: BREAK ';' 28 declaration_item: expression '=' expression 29 | expression 30 declaration_list: declaration_item 31 | declaration_list ',' declaration_item 32 declaration: VAR declaration_list ';' 33 assignment: expression '=' expression ';' 34 | expression PLUS_EQUALS expression ';' 35 | expression MINUS_EQUALS expression ';' 36 | expression DIV_EQUALS expression ';' 37 | expression MUL_EQUALS expression ';' 38 function: expression args 39 expression: value 40 | function 41 | member 42 | clone 43 | expression AND expression 44 | expression OR expression 45 | expression '+' expression 46 | expression '-' expression 47 | expression '*' expression 48 | expression '/' expression 49 | expression '%' expression 50 | expression POWER expression 51 | '-' expression 52 | '!' expression 53 | expression EQUALS expression 54 | expression NOT_EQUALS expression 55 | expression '<' expression 56 | expression LTE expression 57 | expression '>' expression 58 | expression GTE expression 59 | '(' expression ')' 60 evaluation: expression ';' 61 value: REAL 62 | INTEGER 63 | STRING 64 | IDENTIFIER 65 | NULL_TOKEN 66 member: expression '[' expression ']' 67 | expression '.' IDENTIFIER 68 simple_condition: IF '(' expression ')' block 69 condition: IF '(' expression ')' block ELSE block 70 loop: WHILE '(' expression ')' block 71 clone: CLONE expression 72 include: INCLUDE STRING ';'