Monk: "If I have nothing in my mind, what shall I do?"
Joshu: "Throw it out."
Monk: "But if there is nothing, how can I throw it out?"
Joshu: "Well, then carry it out."
(Zen koan)
(c) Software Lab. Alexander Burger
Because other Lisps are not the way I'd like them to be. They concentrate on efficient compilation, and lost the one-to-one relationship of language and virtual machine of an interpreted system, gave up power and flexibility, and impose unnecessary limitations on the freedom of the programmer. Other reasons are the case-insensitivity and complexity of current Lisp systems.
PicoLisp is for programmers who want to control their programming environment, at all levels, from the application domain down to the bare metal, who want to use a transparent and simple - yet universal - programming model, and who want to know exactly what is going on.
It does not pretend to be easy to learn. There are already plenty of languages that do so. It is not for people who don't care what's under the hood, who just want to get their application running. They are better served with some standard, "safe" black-box, which may be easier to learn, and which allegedly better protects them from their own mistakes.
PicoLisp is easy to understand and adapt. There is no compiler enforcing special rules, and the interpreter is simple and straightforward. There are only three data types: Numbers, symbols and lists ("LISP" means "List-, Integer- and Symbol Processing" after all ;-). The memory footprint is minimal, and the tarball size of the whole system is just a few hundred kilobytes.
Most other systems define the language, and leave it up to the implementation to follow the specifications. Therefore, language designers try to be as abstract and general as possible, leaving many questions and ambiguities to the users of the language.
PicoLisp does the opposite. Initially, only the single-cell data structure was defined, and then the structure of numbers, symbols and lists as they are composed of these cells. Everything else in the whole system follows from these axioms. This is documented in the chapter about The PicoLisp Machine in the reference manual.
There is only one symbolic data type, no distinction (confusion) between symbols, strings, variables, special variables and identifiers.
Most data-manipulation functions operate on the values of symbols as well as the CARs of cons pairs:
: (let (N 7 L (7 7 7)) (inc 'N) (inc (cdr L)) (cons N L)) -> (8 7 8 7)
There is only a single functional type, no "special forms". As there is no compiler, functions can be used instead of macros. No special "syntax" constructs are needed. This allows a completely orthogonal use of functions. For example, most other Lisps do not allow calls like
: (mapcar if '(T NIL T NIL) (1 2 3 4) (5 6 7 8)) -> (1 6 3 8)
PicoLisp has no such restrictions. It favors the principle of "Least Astonishment".
The OOP system is very powerful, because it is fully dynamic, yet extremely simple:
super
and extra
.
PicoLisp has many practical features not found in other Lisp dialects. Among them are:
'(1 2
3)
you can just write (1 2 3)
. This is possible because a
number never makes sense as a function name, and has to be checked at runtime
anyway.
quote
function returns all
unevaluated arguments, instead of just the first one. This is both faster
(quote
does not have to take the CAR of its argument list) and
smaller (a single cell instead of two). For example, 'A
expands to
(quote . A)
and '(A B C)
expands to (quote A B
C)
.
@
is automatically
maintained as a local variable, and set implicitly in certain flow- and
logic-functions. This makes it often unnecessary to allocate and assign local
variables.
var
locations (i.e. values of symbols and
CARs of cons pairs).
This
in combination with the access functions
=:
, :
and ::
.
link
, yoke
, chain
and made
functions in the make
environment.
(let ((A 1) (B 2) C 3) ..)
you write (let (A 1 B 2 C 3)
..)
, or just (let A 1 ..)
if there is only a single
variable.
#
) as a comment character is more appropriate
today, and allows a clean hash-bang (#!
) syntax for stand-alone
scripts.
select
can be used in database
queries.
Database objects ("external" symbols) are a primary data type in PicoLisp.
They look like normal symbols to the programmer, but are managed in the database
(fetched from, and stored to) automatically by the system. Symbol manipulation
functions like set
, put
or get
, the
garbage collector, and other parts of the interpreter know about them.
It is a stand-alone system (it does not depend on external programs like Apache or MySQL) and it provides a "live" user interface on the client side, with an application server session for each connected client. The GUI layout and behavior are described with S-expressions, generated dynamically at runtime, and interact directly with the database structures.
Internal exclusive and full use of UTF-8 encoding, and self-translating transient symbols (strings), make it easy to write country- and language-independent applications.
Despite the fact that PicoLisp is an interpreted-only system, the performance is quite good. Typical Lisp programs operating on list data structures are executed in (interpreted) PicoLisp at about the same speed as in (compiled) CMUCL, and about two or three times faster than in CLisp or Scheme48.
But in practice, speed was never a problem, even with the first versions of
PicoLisp in 1988 on a Mac II with a 12 MHz CPU. And certain things are cleaner
and easier to do in C
(or other low-level languages) anyway. It is
very easy to write C
functions in PicoLisp, either in the kernel,
as shared object libraries, or even inline in the Lisp code.
PicoLisp is very space-efficient. Other Lisp systems reserve heap space twice as much as needed, or use rather large internal structures to store cells and symbols. Each cell or minimal symbol in PicoLisp consists of only two pointers. No additional tags are stored, because they are implied in the pointer encodings. No gaps remain in the heap during allocation, as there are only objects of a single size. As a result, consing and garbage collection are very fast, and overall performance benefits from a better cache efficiency. Heap and stack grow automatically, and are limited only by hardware and operating system constraints.
It means to directly execute Lisp data as program code. No transformation to another representation of the code (e.g. compilation), and no structural modifications of these data, takes place.
Lisp data are the "real" things, like numbers, symbols and lists, which can
be directly handled by the system. They are not the textual
representation of these structures (which is outside the Lisp realm and taken
care of by the read
ing and print
ing functions).
The following example builds a function and immediately calls it with two arguments:
: ((list (list 'X 'Y) (list '* 'X 'Y)) 3 4) -> 12
Note that no time is wasted to build up a lexical environment. Variable bindings take place dynamically during interpretation.
A PicoLisp function is able to inspect or modify itself while it is running (though this is rarely done in application programming). The following function modifies itself by incrementing the '0' in its body:
(de incMe () (do 8 (printsp 0) (inc (cdadr (cdadr incMe))) ) ) : (incMe) 0 1 2 3 4 5 6 7 -> 8 : (incMe) 8 9 10 11 12 13 14 15 -> 16
Only an interpreted Lisp can fully support such "Equivalence of Code and Data". If executable pieces of data are used frequently, like in PicoLisp's dynamically generated GUI, a fast interpreter is preferable over any compiler.
No. That would contradict the idea of PicoLisp's simple virtual machine structure. A compiler transforms it to another (physical) machine, with the result that many assumptions about the machine's behavior won't hold any more. Besides that, PicoLisp primitive functions evaluate their arguments independently and are not suited for being called from compiled code. Finally, the gain in execution speed would probably not be worth the effort. Typical PicoLisp applications often use single-pass code which is loaded, executed and thrown away; a process that would be considerably slowed down by compilation.
Yes and No. Though we wrote and tested PicoLisp originally only on Linux, it now also runs on many other POSIX systems. The first versions were even fully portable between DOS, SCO-Unix and Macintosh systems. But today we have Linux. Linux itself is very portable, and you can get access to a Linux system almost everywhere. So why bother?
The GUI is completely platform independent (Browser), and in the age of the Internet an application server does not really need to be portable.
Not really, but it evolved a great deal into that direction.
Historically it was the other way round: We had a plain X11 GUI for our applications, and needed something platform independent. The solution was obvious: Browsers are installed virtually everywhere. So we developed a protocol which persuades a browser to function as a GUI front-end to our applications. This is much simpler than to develop a full-blown web server.
Because it isn't there. The reason is that it is redundant; it is equivalent
to the quote
function in any aspect, because there's no distinction
between code and data in PicoLisp, and quote
returns the whole
(unevaluated) argument list. If you insist on it, you can define your own
lambda
:
: (def 'lambda quote) -> lambda : ((lambda (X Y) (+ X Y)) 3 4) -> 7 : (mapcar (lambda (X) (+ 1 X)) (1 2 3 4 5)) -> (2 3 4 5 6)
Dynamic binding is very powerful, because there is only one single, dynamically changing environment active all the time. This makes it possible (e.g. for program snippets, interspersed with application data and/or passed over the network) to access the whole application context, freely, yet in a dynamically controlled manner. And (shallow) dynamic binding is the fastest method for a Lisp interpreter.
Lexical binding is more limited by definition, because each environment is
deliberately restricted to the visible (textual) static scope within its
establishing form. Therefore, most Lisps with lexical binding introduce "special
variables" to support dynamic binding as well, and constructs like
labels
to extend the scope of variables beyond a single function.
In PicoLisp, function definitions are normal symbol values. They can be dynamically rebound like other variables. As a useful real-world example, take this little gem:
(de recur recurse (run (cdr recurse)) )
It implements anonymous recursion, by defining recur
statically
and recurse
dynamically. Usually it is very cumbersome to think up
a name for a function (like the following one) which is used only in a single
place. But with recur
and recurse
you can simply
write:
: (mapcar '((N) (recur (N) (if (=0 N) 1 (* N (recurse (- N 1))) ) ) ) (1 2 3 4 5 6 7 8) ) -> (1 2 6 24 120 720 5040 40320)
Needless to say, the call to recurse
does not have to reside in
the same function as the corresponding recur
. Can you implement
anonymous recursion so elegantly with lexical binding?
You mean the funarg problem, or problems that arise when a variable might be bound to itself? For that reason we have a convention in PicoLisp to use transient symbols (instead of internal symbols) or private internal symbols
This is a form of lexical scoping - though we still have dynamic
binding - of symbols, similar to the static
keyword in
C
.
In fact, these problems are a real threat, and may lead to mysterious bugs (other Lisps have similar problems, e.g. with symbol capture in macros). They can be avoided, however, when the above conventions are observed. As an example, consider a function which doubles the value in a variable:
(de double (Var) (set Var (* 2 (val Var))) )
This works fine, as long as we call it as (double 'X)
, but will
break if we call it as (double 'Var)
. Therefore, the correct
implementation of double
should be:
(de double ("Var") (set "Var" (* 2 (val "Var"))) )
If double
is defined that way in a separate source file, then
the symbol Var
is locked into a private lexical context and
cannot conflict with other symbols.
Admittedly, there are two disadvantages with this solution:
This is not true. Closures are a matter of scope, not of binding.
For a closure it is necessary to build and maintain a separate environment. In a system with lexical bindings, this has to be done at each function call, and for compiled code it is the most efficient strategy anyway, because it is done once by the compiler, and can then be accessed as stack frames at runtime.
For an interpreter, however, this is quite an overhead. So it should not be done automatically at each and every function invocation, but only if needed.
You have several options in PicoLisp. For simple cases, you can take
advantage of the static scope of transient
or private symbols. For the general case, PicoLisp has built-in functions like
bind
or job
, which dynamically manage statically scoped
environments.
Environments are first-class objects in PicoLisp, more flexible than hard-coded closures, because they can be created and manipulated independently from the code.
As an example, consider a currying function:
(de curry Args (list (car Args) (list 'list (lit (cadr Args)) (list 'cons ''job (list 'cons (list 'lit (list 'env (lit (car Args)))) (lit (cddr Args)) ) ) ) ) )
When called, it returns a function-building function which may be applied to some argument:
: ((curry (X) (N) (* X N)) 3) -> ((N) (job '((X . 3)) (* X N)))
or used as:
: (((curry (X) (N) (* X N)) 3) 4) -> 12
In other cases, you are free to choose a shorter and faster solution. If (as in the example above) the curried argument is known to be immutable:
(de curry Args (list (cadr Args) (list 'fill (lit (cons (car Args) (cddr Args))) (lit (cadr Args)) ) ) )
Then the function built above will just be:
: ((curry (X) (N) (* X N)) 3) -> ((X) (* X 3))
In that case, the "environment build-up" is reduced by a simple (lexical) constant substitution with zero runtime overhead.
Note that the actual curry
function is simpler and more pragmatic. It combines both strategies (to use
job
, or to substitute), deciding at runtime what kind of function
to build.
Yes, there is a macro mechanism in PicoLisp, to build and immediately execute a list of expressions. But it is seldom used. Macros are a kludge. Most things where you need macros in other Lisps are directly expressible as functions in PicoLisp, which (as opposed to macros) can be applied, passed around, and debugged.
For example, Common Lisp's DO*
macro, written as a function:
(de do* "Args" (bind (mapcar car (car "Args")) (for "A" (car "Args") (set (car "A") (eval (cadr "A"))) ) (until (eval (caadr "Args")) (run (cddr "Args")) (for "A" (car "Args") (and (cddr "A") (set (car "A") (run @))) ) ) (run (cdadr "Args")) ) )
This is not possible. Threads share memory and other resources (as opposed to processes, which are better isolated from each other). Each thread has its own stack for private data, but PicoLisp uses dynamic binding, where the stack holds the saved values instead of the current values of symbols. As a result, each running thread would overwrite the symbol bindings of other threads.
Instead, PicoLisp uses separate processes - and interprocess communication - for parallel execution, or coroutines as a kind of cooperative threads running a controlled way and doing all necessary housekeeping.
Another advantage of separate processes over threads: They can be distributed across multiple machines, and therefore scale better.
Because PicoLisp has something better: Transient symbols. They look and behave like strings in any respect, but are nevertheless true symbols, with a value and a property list.
This leads to interesting opportunities. The value, for example, can point to other data that represent the string's translation. This is used extensively for localization. When a program calls
(prinl "Good morning!")
then changing the value of the symbol "Good morning!"
to its
translation will change the program's output at runtime.
Transient symbols are also quite memory-conservative. As they are stored in normal heap cells, no additional overhead for memory management is induced. The cell holds the symbol's value in its CDR, and the tail in its CAR. If the string is not longer than 7 bytes, it fits completely into the tail, and a single cell suffices. Up to 15 bytes take up two cells, 23 bytes three etc., so that long strings are not very efficient (needing twice the memory on the average), but this disadvantage is made up by simplicity and uniformity. And lots of extremely long strings are not the common case, as they are split up anyway during processing, and stored as plain byte sequences in external files and databases.
Because transient symbols are temporarily interned (while load
ing the current source file), they are
shared within the same source and occupy that space only once, even if they
occur multiple times within the same file.
PicoLisp has no array or vector data type. Instead, lists must be used for any type of sequentially arranged data.
We believe that arrays are usually overrated. Textbook wisdom tells that they have a constant access time O(1) when the index is known. Many other operations like splits or insertions are expensive. Access with a known (numeric) index is not typical for Lisp, and even then the advantage of an array is significant only if it is relatively long. Holding lots of data in long arrays, however, smells quite like a program design error, and we suspect that often more structured representations like trees or interconnected objects would be better.
In practice, most arrays are rather short, or the program can be designed in such a way that long arrays (or at least an indexed access) are avoided.
Using lists, on the other hand, has advantages. We have so many concerted functions that uniformly operate on lists. There is no separate data type that has to be handled by the interpreter, garbage collector, I/O, database and so on. Lists can be made circular. And lists don't cause memory fragmentation.
Still, if there is really a need to access large amounts of data with a
numeric index, enum
can be used. It
emulates a multidimensional - possibly sparse - array. It takes roughly 1.5 the
space a linear list would require, and is very fast.
PicoLisp does not support real floating point numbers. You can do all kinds
of floating point calculations by calling existing library functions via
native
, inline-C code, and/or by
loading the "@lib/math.l" library.
But PicoLisp has something even (arguably) better: Scaled fixpoint numbers, with unlimited precision.
The reasons for this design decision are manifold. Floating point numbers smack of imperfection, they don't give "exact" results, have limited precision and range, and require an extra data type. It is hard to understand what really goes on (How many digits of precision do we have today? Are perhaps 10-byte floats used for intermediate results? How does rounding behave?).
For fixpoint support, the system must handle just integer arithmetic, I/O and string conversions. The rest is under programmer's control and responsibility (the essence of PicoLisp).
Carefully scaled fixpoint calculations can do anything floating point can do.
That's not a good idea. The next time that function gets executed within the dynamic context the program may crash. Therefore we have a convention to use an upper case first letter for locally bound symbols:
(de findCar (Car List) (when (member Car (cdr List)) (list Car (car List)) ) );-)
At least it should be interesting. It would be a machine executing list (tree) structures instead of linear instruction sequences. "Instruction prefetch" would look down the CAR- and CDR-chains, and perhaps need only a single cache for both data and instructions.
Primitive functions like set
, val
, if
and while
, which are written in C
or assembly language
now, would be implemented in microcode. Plus a few I/O functions for hardware
access. EVAL
itself would be a microcode subroutine.
Only a single heap and a single stack is needed. They grow towards each other, and cause garbage collection if they get too close. Heap compaction is trivial due to the single cell size.
There would be no assembly-language. The lowest level (above the hardware and microcode levels) are s-expressions: The machine language is Lisp.
It is easy to produce a segfault in PicoLisp. Just set a symbol to a value which is not a function, and call it:
: (setq foo 1) -> 1 : (foo) Segmentation faultThere is another example in the Evaluation section of the reference manual.
PicoLisp is a pragmatic language. It doesn't check at runtime for all possible error conditions which won't occur during normal usage. Such errors are usually detected quickly at the first test run, and checking for them after that would just produce runtime overhead.
Catching the segmentation violation and bus fault signals is also not a good idea, because the Lisp heap is most probably be damaged afterwards, possibly creating further havoc if execution continues.
It is recommended to inspect the code periodically with lint
. It will detect many potential errors.
And, most of these errors are avoided by following the PicoLisp naming conventions.
The best place is the PicoLisp Mailing List (see also The Mail Archive), or the IRC #picolisp channel on FreeNode.net.