The conser programming language is a language in which the only data type is sets of pairs (of sets). It is possible to represent many things with sets of pairs--lists, numbers, not to mention sets. conser is a declarative, lazily evaluated, esoteric programming language. conser programs tend to require immense amounts of memory and processing time to perform very simple calculations.
Creating a pair is called consing, after the lisp function, cons. Since a program that uses any data (other than the empty set) needs to cons that data, it's a conser. Also, conser is sort of a combination of Conway and surreal, which is significant, since pairs of sets (of pairs) can represent surreal numbers.
Surreal numbers were invented by John H Conway, and named by Donald E Knuth, who wrote a novelette about two characters making discoveries about surreal numbers. A good introduction to surreal numbers on the web is at http://www.tondering.dk/claus/surreal.html.
A conser program consists of a set of function definitions, which may
be mutually recursive. The execution begins with the evaluation of
the main
function, which must be defined to take zero or
one argument. If main
takes one argument, the argument
contains the program's input. The result of evaluating
main
will be the program's output. Evaluation is done
lazily.
The canonical example:
An example with input:
"hello: prints Hello world!" main = "01001000" ((,(((,((( "01100101" ((,(,(((,((, "01101100" ((,(,((,(,(( "01101100" ((,(,((,(,(( "01101111" ((,(,((,(,(,(, "00100000" (((,((((( "01110111" ((,(,(,((,(,(, "01101111" ((,(,((,(,(,(, "01110010" ((,(,(,(((,( "01101100" ((,(,((,(,(( "01100100" ((,(,(((,(( "00100001" (((,(((((, "00001101" (((((,(,((, "00001010" (((((,((,( (,) ,)),)),),),),) ),))),),),),) ),),),),)),),) ,),)),),))),) ,),))),))),) ,)),),)))),) )))),))),) ))),)))),) ,),),),),)),),) )))),))),) ,),))),))),) ,),))),))),) ),)),),))),) ,),),)),),)),).
Implementation of basic definitions of surreal numbers:
"cat: copies stdin to stdout." main a = a.
"Evaluates to () for valid surreal numbers. Doesn't evalute to () otherwise." number? a = (gt>a<a number?<a number?>a). "Evaluates to () if a > b, (,) otherwise." gt a b = ((,)!gt b<a gt>b a). "Addition." + a b = (+<a b + a<b,+>a b + a>b). "Inverse under addition." - a = (->a,-<a). "Multiplication." * a b = (+ + *<a b * a<b - *<a<b + + *>a b * a>b - *>a>b, + + *<a b * a>b - *<a>b + + *>a b * a<b - *>a<b).
Whitespace separates tokens. It is otherwise ignored.
Comments begin and end with double quotes. Comments cannot contain double quotes. Comments are treated as whitespace.
The characters =.<>(),&!
are special tokens.
Unbroken strings of characters other than whitespace, double quotes, or any of the special token characters form identifier tokens.
program ::= definition { definition } definition ::= function-name parameters "=" function-body "." function-name ::= identifier parameters ::= { identifier } function-body ::= expression expression ::= "<" expression | ">" expression | "(" { expression } ")" | "(" { expression } "," { expression } ")" | "(" { expression } "&" { expression } ")" | "(" { expression } "!" { expression } ")" | identifier { expression }
<
expr>
expr(
expr-list)
(
expr-list1,
expr-list2)
(
expr-list1&
expr-list2)
(
expr-list1!
expr-list2)
<((a,b) (c,d)) => (a c) >((a,b) (c,d)) => (b d) (a b b) => (a b) (a b & b) => b (a b ! b) => (a!b)
Function parameters in expressions have the usual meaning.
identity a = a.
A function name followed as many expressions as number of parameters the function invokes that function. Since all the arguments are sets, the meaning of invoking a function is subject to these rules:
()
, the result is ()
,
which is just a special case of the second rule.
f () => () g a () => () g () a => ()
f (a b) => (f a f b) g (a b) c => (g a c g b c) g a (b c) => (g a b g a c) g (a b) (c d) => (g a c g a d g b c g b d)
()
, the function evaluates
to ()
without having to complete the calculations for the
rest of the arguments. Additionally, <
,
>
, (&)
, and (!)
can
return results without completely evaluating their arguments.
Example:
"List of prime numbers." primes = primes' (((,),),) n* (((,),),). primes' a nonprimes = ?:(notin a nonprimes,((a,primes' (a,) (nonprimes n* a)),primes' (a,) nonprimes)). n* a = n*' a + a a. n*' a b = (b,n*' a + a b). + a b = ?:(<a,(b,+<a(b,))). "[->+<]" ?: a = (?(<a,<>a)2<a>>a). ? a = (>a!2<a>a). 2 a b = b. gt a b = ?:(<a,(b,gt<a<b)). notin a l = ?:((a!<l),(a,?(gt a<l,in a>l))).
The input, the contents of stdin, is the argument to
the main
function, and the result is the output,
which will be written to stdout.
The input and output are nested pair structures suggested by Ben Rudiak-Gould. If the car of the pair is empty, then the first bit is 1, and the rest of the bits are in the cdr, otherwise, if the cdr if empty, the first bit is 0, and the rest of the bits are in the car. Since they are lazily evaluated, interactive programs are possible. Any implementation that buffers its output should flush its buffer before each time it reads input, but is not required to, since, for example, if the native output is in bytes, then buffered partial bytes need not be flushed.
If the result of main
is an invalid structure, i.e. a
pair with both the car and cdr nonempty, or with a car or cdr with
more than one element, then what happens is undefined. The current
implementation chooses whichever pair it happens upon first as
defining the output. An alternative could have the output splitting
into multiple output streams, one for each pair, though, due to lazy
evaluation, there could be extraneous splitting if unresolved pairs
later resolve to be identical.
(,)
, (&)
,
(!)
often return incorrect results.-O cond
option).
There are a number of things that it doesn't do quite correctly:
(1 & (1 2 3 ... & 1 3 5 ...))
.
Fixing function calls to incrementally return results for
potentially infinite sets would be too expensive in terms
of memory and stack usage to be worth it. Fixing
intersection to incrementally return results for potentially
infinite sets is feasible, though.Infix syntax makes some functions look syntactically nicer. However, they make parsing more complicated. They also provide opportunities for obfuscation. Infix functions are purely syntactic, and are semantically identical to regular functions.
To define an infix function, enclose its name with parentheses in its definition with its left arguments to its left, and its right arguments to its right. If there are no left arguments, it is the same as regular function. An infix function invocation is the same as a regular function invocation except that its left arguments are precede the function name.
"Surreal addition using infix notation." a (+) b = (<a + b a + <b, >a + b a + >b).
Combining infix functions with other functions can cause confusion as
to which takes precedence. The rule will be left associativity. If
in doubt, avoid infix functions entirely. For example, if
+
takes one left argument and one right argument,
a b + + c
is equivalent to b + (a + c)
.
For Forth fans:
"Surreal addition using postfix notation." a b (+) = (<a b + a <b +, >a b + a >b +).
If the main
function is defined with more than one
argument, the first argument has the contents of stdin as
usual, and the additional arguments have contents of additional input
streams, determined in an implementation-dependent way. For example,
on Unix and Unix-like systems, the second argument could be read from
file descriptor 3, the third from file descriptor 4, etc.