OpenSelf project is ...
A new implementation of the Self language.
Self is a prototype-based (class-less) language originally developed at Sun Microsystems.
Sun's Self 4.0 is targeted specifically at SPARC machines
and is very difficult to port to other architectures.
A port of the original Self 4.0 system
has been started by Gordon E. Cichon. OpenSelf is a rewrite from scratch
of the Self VM which aims to be smaller and to integrate
better with the OS and other utilities.
More information on the original Self 4.0 ...
More information about OpenSelf
What is Self?
Self is a prototype-based language (also called object-centered or classless). It takes a more radical approach to the OO paradigm. Self can be viewed as an evolution of Smalltalk. In Smalltalk everything is an object; Self extends this by merging data and behavior together and by providing a uniform way of accessing and manipulating objects: messages. Self uses messages, and only messages, to access the state ('instance variables') of an object.
Every Self object has a number of 'slots', each of which has a name, a content and a type. Slots are basically of type 'data' or 'method'.
Data slots contain a reference (pointer) to an object. Data slots can be read-only or read-write. For a read-write data slot, another slot with the same name but with a ':' appended is defined automatically. This slot contains the assignment primitive, so you change the value of a data slot by sending a message (note that the assignment primitive slot is only fictitious: no real memory is spent on such a slot in the implementation).
Method slots contain (point to) an object with code. An object with code is a normal object plus some executable code. When the code is executed, the slots of this object are its 'local variables', and some of them are its argument slots.
Parent slots are basically data slots (yes, they can also be
assignable, i.e. read-write, allowing dynamic inheritance) but they have different
semantics during message lookup. When someone sends a message to a receiver,
the VM scans the receiver's slots. If it finds a slot matching the selector (the name
of the message), it runs the action corresponding to the type
of the slot: for a data slot, its value is simply returned; for a method slot,
the referenced method is executed in the context of the receiver.
If the slot is not found, the search is extended through the receiver's parents. A method is always executed in the context of the receiver, regardless of the parent in which the slot was found. Message sends of this kind are said to be polymorphic, as opposed to C++ method calls when no virtual functions are used.
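The lookup rule described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions, not the OpenSelf code: the names Obj, Slot, lookup and send are hypothetical, object references are replaced by plain strings, and the search is a naive depth-first walk that ignores ambiguous matches and parent cycles.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical model for illustration; not the OpenSelf data structures.
struct Obj;
using Value = std::string;                       // stand-in for an object reference
using Method = std::function<Value(Obj& self)>;  // code that runs with the receiver as self

struct Slot {
    enum Kind { Data, MethodSlot } kind;
    Value data;      // used when kind == Data
    Method method;   // used when kind == MethodSlot
};

struct Obj {
    std::map<std::string, Slot> slots;
    std::vector<Obj*> parents;   // contents of the parent slots
};

// Search the object's own slots first, then its parents (depth-first).
const Slot* lookup(const Obj& o, const std::string& selector) {
    auto it = o.slots.find(selector);
    if (it != o.slots.end()) return &it->second;
    for (const Obj* p : o.parents) {
        if (const Slot* s = lookup(*p, selector)) return s;
    }
    return nullptr;
}

// Data slots answer their value; methods run in the context of the receiver,
// no matter which parent actually holds the slot.
Value send(Obj& receiver, const std::string& selector) {
    const Slot* s = lookup(receiver, selector);
    assert(s != nullptr && "message not understood");
    if (s->kind == Slot::Data) return s->data;
    return s->method(receiver);
}
```

With a parent that holds a method sending "mydata" to self and a child that holds only the data slot "mydata", sending the method's selector to the child finds the method in the parent but reads "mydata" from the child.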
OpenSelf is still heavily under development (as is this site) and is not functional for normal end-user use. This site, for now, is intended for developers who want to make a better tool for better programming, faster, with fewer errors...
If you want more information, please see the Sun Labs documentation.
I decided to start from scratch for many reasons. First, I wanted to build a small demo of the language, and I began to implement it in Smalltalk. I started to really like the language, and after reading some of the original papers I had an idea for an implementation design and began a quick & dirty implementation of it. So, here it is: it's quick and it's very dirty, and for now it doesn't even support all the features of the Self language (like reflection).
Another reason to start from scratch is that I wanted to try to build a lightweight VM that steps away from the concept of image files (or 'world', as it is named in the Self jargon) and provides interprocess sharing of the code for base objects (read-only, copy-on-write). This way there could be many lightweight Self processes (yes, separate Unix processes) protected from each other's errors, mostly errors in external subroutine calls (C or other languages), which Self's built-in exception handling system and debugger cannot handle. Interprocess communication will still be easy and fast using shared memory.
The VM is written in C++.
The code is still very dependent not only on the processor but also
on the compiler used.
This is because the method dispatch code uses inline assembler and bypasses the compiled code's epilogue when performing a call to a method. This way the copying of arguments on the stack between the dispatch routine and the method being called is avoided, but every compiled method (machine code) generated by the VM must include no preamble and must include the same epilogue (or an equivalent one) as the VM function which did the dispatch (send()).
Changing the compiler or the compiler version may change this epilogue. A simple way of fixing this is to look at the asm code generated by the C++ compiler.
It can perform 4000000 sends/s
It performs the following method 2000000 times per second on my machine, which is effectively 4M sends/s (two sends per execution) with a parent lookup. All this without any inlining.
(| parent* = (|lobby = (). mymethod = (|| mydata)|). mydata = 'aString' | mymethod)
Compare this with the Squeak message dispatch speed:
400Mhz P-II, 1Mb L2, Linux: 31250000 bytecodes/sec; 2245539 sends/sec
as found recently on the Squeak mailing list. Note that this machine is a "bit" faster than mine (
NOTE: these are not benchmarks! Many other factors influence performance, and I hope that dealing with them will not turn out to be mutually exclusive with the current VM's speedup.
Blocks, Vectors, Mirrors, Maps (Very soon)
Garbage collector, and overall memory management
Write a more flexible assembler
Rewrite of the parser (maybe in Self itself)
Image ('world') file dumper
Clean up the VM sources (especially the Italian comments :-) )
Adaptive optimizations, customization
Porting to other architectures
Shared object repository, interprocess communication (shmem), distributed computing (sockets)
This is a quick overview of the current OpenSelf implementation. This can help you read the source code.
Lookup and send
There is a C++ object for every Self object. It contains a header and the slots. Slot names are referenced only by the address of a unique string. There are currently 6 types:
smallInts (tagged pointer), data, code (method), string, vector, block.
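Two of the ideas above, small integers stored inside the pointer word and slot names compared by the address of a unique string, can be illustrated as follows. This is a sketch under assumptions: the tag layout (low bit set for integers) and the names Ref, fromInt, toInt and intern are made up for the example and are not taken from the OpenSelf sources.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

// Assumed encoding: low bit 1 marks a small integer, low bit 0 an aligned pointer.
using Ref = std::uintptr_t;

inline bool isSmallInt(Ref r) { return (r & 1u) != 0; }

inline Ref fromInt(std::intptr_t v) {
    return (static_cast<Ref>(v) << 1) | 1u;
}

inline std::intptr_t toInt(Ref r) {
    assert(isSmallInt(r));
    // Relies on an arithmetic right shift for negative values, which is what
    // mainstream compilers do (and what C++20 guarantees).
    return static_cast<std::intptr_t>(r) >> 1;
}

// Returns one canonical address per distinct spelling, so two selectors are
// equal exactly when their pointers are equal.
const std::string* intern(const std::string& name) {
    static std::unordered_set<std::string> table;
    // Element addresses in an unordered_set stay valid across rehashing.
    return &*table.insert(name).first;
}
```

With interned names, matching a selector against a slot name during lookup is a single pointer comparison instead of a string comparison.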
There is a function called "send(String *sel,Object *self,Object *arg1,...)" which performs a dynamic send. It receives its parameters on the stack, like every other C function. After performing the lookup it knows the destination slot, the one referenced by the selector in the inheritance tree. Different actions are then taken according to the type of this slot. If it is a data slot, send() first checks whether the selector is an assignment selector (characterized by the ":" appended to it; it is in fact a single-keyword message). During the lookup, a selector of "assignment" type also matches assignable slots with the same base name. In the case of an assignment operation, a read operation is performed as well.
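The handling of assignment selectors can be sketched like this. It is a simplified illustration, not the real send(): parent lookup is left out, slot values are plain ints, the names DataSlot, Object and isAssignmentSelector are hypothetical, and answering the slot's value after an assignment is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical slot model; names are not taken from the OpenSelf sources.
struct DataSlot {
    int value;
    bool assignable;   // read-write slots also answer "name:"
};

struct Object {
    std::map<std::string, DataSlot> slots;
};

bool isAssignmentSelector(const std::string& sel) {
    return sel.size() > 1 && sel.back() == ':';
}

// Reads through "name" or assigns through "name:". No slot called "name:" is
// ever stored: the assignment selector matches the assignable slot "name".
int send(Object& receiver, const std::string& sel, int arg = 0) {
    assert(!sel.empty());
    const bool assign = isAssignmentSelector(sel);
    const std::string base = assign ? sel.substr(0, sel.size() - 1) : sel;
    auto it = receiver.slots.find(base);
    if (it == receiver.slots.end() || (assign && !it->second.assignable))
        throw std::runtime_error("message not understood: " + sel);
    if (assign) it->second.value = arg;   // plays the role of the assignment primitive
    return it->second.value;              // assumption: answer the slot's current value
}
```

A read-only slot simply never matches its "name:" selector, so an attempted assignment fails in the same way as any other message that is not understood.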
If a method slot was found, the method's code is retrieved. If customization is enabled, a customized version of the method, specially targeted at the receiver, is used. This version inlines all the message sends to self, as well as some parent sends if a parent slot is read-only. Customization already has a hook in the code, but currently no inlined code is generated by the compiler. After the code is retrieved, a simple inline assembler instruction jumps to it.
When the method is called, its stack frame looks like this:
|32||first arg, if any|
|EBP||prev. stack frame|
The method needs to include the epilogue of send() (reset the stack frame).
The parser is very ugly and not yet 100% finished. It is not linked into the basic system because it is too ugly and full of memory leaks. The VM starts the parser as a separate process, feeds it the sources on stdin and waits for the bytecodes and object definitions to come back on stdout in a flat format.
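Running a helper program as a separate process and talking to it over stdin and stdout can be done on a POSIX system roughly as shown here. This is a generic sketch, not the VM's code: runFilter is a hypothetical name, the flat output format is not modelled, and any program that reads stdin and writes stdout can stand in for the parser.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs argv[0] with `input` on its stdin and returns everything it wrote to
// stdout. Returns an empty string if the pipes or the fork fail.
std::string runFilter(char* const argv[], const std::string& input) {
    assert(argv != nullptr && argv[0] != nullptr);
    int toChild[2];
    int fromChild[2];
    if (pipe(toChild) != 0) return "";
    if (pipe(fromChild) != 0) { close(toChild[0]); close(toChild[1]); return ""; }

    pid_t pid = fork();
    if (pid < 0) {
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        return "";
    }
    if (pid == 0) {                       // child: becomes the helper program
        dup2(toChild[0], STDIN_FILENO);
        dup2(fromChild[1], STDOUT_FILENO);
        close(toChild[0]); close(toChild[1]);
        close(fromChild[0]); close(fromChild[1]);
        execvp(argv[0], argv);
        _exit(127);                       // only reached if exec failed
    }

    close(toChild[0]);                    // parent keeps only its own pipe ends
    close(fromChild[1]);

    // Send all the input first, then close the pipe to signal end of input.
    // Fine for small inputs; large ones would need interleaved reads and writes.
    size_t off = 0;
    while (off < input.size()) {
        ssize_t n = write(toChild[1], input.data() + off, input.size() - off);
        if (n <= 0) break;
        off += static_cast<size_t>(n);
    }
    close(toChild[1]);

    std::string output;
    char buf[4096];
    ssize_t n;
    while ((n = read(fromChild[0], buf, sizeof buf)) > 0)
        output.append(buf, static_cast<size_t>(n));
    close(fromChild[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    return output;
}
```

Keeping the leaky parser in its own short-lived process means its memory is reclaimed by the OS when it exits, without touching the VM's address space.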
A primitive is a hard-coded function bound to a system-wide special selector. This selector must begin with "_". There is no dynamic dispatch on primitives; they permit low-level manipulation of objects and the setup of objects that do not yet have the functionality to be Self objects.
Example primitives are _IntAdd:, _IntMul, _Clone, _Credits, ...
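One simple way to bind primitives to their selectors is a fixed table that is consulted directly, with no object lookup involved. The sketch below only illustrates that idea: the selector spellings come from the list above, while the Primitive signature, the argument convention (receiver first) and the function names are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed calling convention: the receiver and the arguments arrive in one vector.
using Primitive = long (*)(const std::vector<long>& args);

long intAdd(const std::vector<long>& a) { assert(a.size() == 2); return a[0] + a[1]; }
long intMul(const std::vector<long>& a) { assert(a.size() == 2); return a[0] * a[1]; }

// System-wide table: one hard-coded function per special selector.
const std::map<std::string, Primitive>& primitiveTable() {
    static const std::map<std::string, Primitive> table = {
        {"_IntAdd:", intAdd},
        {"_IntMul", intMul},
    };
    return table;
}

bool isPrimitiveSelector(const std::string& sel) {
    return !sel.empty() && sel[0] == '_';
}

// Returns the bound function, or nullptr if the selector is not a known primitive.
// The receiver's slots and parents are never consulted.
Primitive findPrimitive(const std::string& sel) {
    if (!isPrimitiveSelector(sel)) return nullptr;
    auto it = primitiveTable().find(sel);
    return it == primitiveTable().end() ? nullptr : it->second;
}
```

Because the leading "_" identifies a primitive from the selector alone, the check can be made before any slot lookup takes place.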