Simple Computers Passing Messages
In 1969, Alan Kay and others at Xerox PARC proposed a mechanism for decomposing a programming problem into smaller chunks that could be solved independently. These small programs would then run on simulated small computers, which communicated via an abstract message-passing interface.
The simple computers were termed "objects" in the system, and the combination of dynamic typing and message passing was dubbed "Object Oriented Programming" by Alan Kay, as embodied by the Smalltalk language.
Smalltalk itself was based loosely on Simula, as part of a bet that it was possible to create a language embodying the message-passing idea from Simula in a single page of code.
Just as in Haskell, in which everything is a function, in Smalltalk everything is an object. At the simplest level, things like integers are objects. You can send messages to integers for simple things such as addition, but you also get more complex ones used for flow control.
Since blocks of code are also objects, they can be passed as arguments with messages, so conditional execution is performed by sending a message to (for example) an integer, containing another integer and a block of code, which is executed if the two integers have the same value.
Smalltalk used a method of refinement known as inheritance. Objects were defined by sending messages to a special kind of factory object, called a class. Classes (themselves instances of metaclasses) created instances of objects according to a recipe.
When you wanted to create a new kind of object, you would copy an existing class, send it messages that added new methods and instance variables to the objects it would create, and then use this template to create new objects.
This seems a long way away from the original machine code, so we’ll take a moment to see how it is implemented in a language very similar to Smalltalk: Objective-C.
Objective-C adds Smalltalk-like extensions to C, which is about as low-level as a high-level language can be—little more than a cross-platform assembler.
C provides two kinds of compound data types: arrays and structures. Arrays are sequences of variables of a single type (in C, a text string is an array of characters, which are 8-bit integers). Structures are heterogeneous collections with a fixed layout.
When C is compiled, an element in an array is found by multiplying the size of an individual element by the index and adding the result to the address of the start of the array. An element in a structure is located by adding a fixed offset to the address of the start of the structure.
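The address arithmetic described above can be written out by hand. The following is a minimal sketch; struct point and both function names are hypothetical, chosen only for illustration, and a compiler would normally do this work for you when you write a[i] or p->y.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical structure used only for illustration. */
struct point {
    char tag;
    int  x;
    int  y;
};

/* Array element: base address + index * size of one element. */
int *element_address(int *base, size_t index)
{
    return (int *)((char *)base + index * sizeof(int));
}

/* Structure field: start of the structure + a fixed offset
   that is known at compile time. */
int *field_y_address(struct point *p)
{
    return (int *)((char *)p + offsetof(struct point, y));
}
```

For an int array a and a struct point p, element_address(a, 2) yields the same address as &a[2], and field_y_address(&p) the same address as &p.y.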
These simple compound types are used to implement Objective-C. The first implementations of the language comprised two components: a preprocessor that would emit C code and a runtime library that handled the dynamic features.
In more modern implementations the preprocessor is omitted, and the code is generated directly without a C intermediate form.
Classes in Objective-C are represented by a simple structure that contains a list of mappings from instance variables to offsets and a list of mappings from selectors to functions. A selector is simply an abstract representation of the method name.
In Smalltalk, a message passing statement would look something like this:
anArray insertObject:anObject atIndex:anIndex.
In Objective-C, the syntax is very similar:
[anArray insertObject:anObject atIndex:anIndex];
In both of these cases, the selector would be uniquely identified by the string insertObject:atIndex:. The runtime library will typically maintain a mapping from these strings to integers, or to some other abstract representation that allows fast searching.
When the code is run, this mapping will be looked up and used to find the implementation.
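One way to picture the string-to-integer mapping is a small table that hands out the same integer every time it sees the same selector name. This is only a sketch under stated assumptions: register_selector, selector_names, and MAX_SELECTORS are hypothetical names, and the linear search is used for brevity where a real runtime would want something faster, such as a hash table.

```c
#include <string.h>  /* strcmp */

#define MAX_SELECTORS 64   /* arbitrary limit for this sketch */

/* Table of selector names seen so far. The index of a name in this
   table is the integer that stands for that selector. */
static const char *selector_names[MAX_SELECTORS];
static int selector_count = 0;

/* Return the integer for a selector name, registering the name on
   first use. Returns -1 if the table is full. The table stores the
   pointer it is given, so the string must outlive the table
   (string literals are fine). */
int register_selector(const char *name)
{
    for (int i = 0; i < selector_count; i++) {
        if (strcmp(selector_names[i], name) == 0)
            return i;   /* already known: same integer as before */
    }
    if (selector_count == MAX_SELECTORS)
        return -1;
    selector_names[selector_count] = name;
    return selector_count++;
}
```

Once selectors are integers, comparing two of them is a single integer comparison rather than a string comparison.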
I said each class is a structure, and the same is true of each object. Objects are much simpler structures. The first element is a pointer to the definition of the class and the rest are the instance variables used by the object.
Given an object, it is possible to get the class by inspecting the first element in the structure and then look up the methods.
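The layout just described can be sketched in plain C. All of the names here (class_def, ivar_entry, point_object, class_of, ivar_address) are hypothetical and are not the definitions used by any real Objective-C runtime; the sketch shows only the class pointer in the first slot and the instance-variable-to-offset list.

```c
#include <stddef.h>  /* size_t, NULL */
#include <string.h>  /* strcmp */

/* One entry in a class's instance-variable-to-offset list. */
struct ivar_entry {
    const char *name;
    size_t      offset;
};

/* Simplified class structure (method list omitted here). */
struct class_def {
    const char              *name;
    const struct ivar_entry *ivars;
    size_t                   ivar_count;
};

/* Hypothetical object: class pointer first, then instance variables. */
struct point_object {
    struct class_def *isa;
    int x;
    int y;
};

/* The class is found by reading the first element of the object. */
struct class_def *class_of(void *object)
{
    return *(struct class_def **)object;
}

/* Look up an instance variable by name through the class and return
   its address inside the object, or NULL if the class has no such
   variable. */
void *ivar_address(void *object, const char *name)
{
    struct class_def *cls = class_of(object);
    for (size_t i = 0; i < cls->ivar_count; i++) {
        if (strcmp(cls->ivars[i].name, name) == 0)
            return (char *)object + cls->ivars[i].offset;
    }
    return NULL;
}
```

Because every object starts with the same kind of pointer, class_of works on any object without knowing its concrete type.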
How are methods implemented? As plain C functions with a specific signature. The first argument of a method is a pointer to the object on which the method is being called, the second is the selector, and the remainder are the arguments passed to the method.
For the earlier example, the function would look something like this:
void method(id self, SEL cmd, id anObject, int anIndex) ...
The id and SEL types represent Objective-C objects and selectors, respectively.
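Putting the pieces together, a message send amounts to finding the function for a selector in the receiver's class and calling it with the receiver and selector as the first two arguments. The sketch below is an illustration in plain C, not the real runtime: id and SEL are simplified stand-ins (selectors are plain strings here), and method_fn, method_entry, class_def, lookup, and send_insert are hypothetical names. It handles only methods with the signature of the earlier example.

```c
#include <stddef.h>  /* size_t, NULL */
#include <string.h>  /* strcmp */

#define CAPACITY 4   /* arbitrary size for this sketch */

typedef struct object *id;   /* stand-in for Objective-C's id */
typedef const char *SEL;     /* stand-in: a selector is just its name */

/* Every method takes the receiver, the selector, then its arguments. */
typedef void (*method_fn)(id self, SEL cmd, id anObject, int anIndex);

struct method_entry { SEL selector; method_fn function; };

struct class_def {
    const struct method_entry *methods;
    size_t                     method_count;
};

/* A tiny fixed-capacity array object: class pointer first. */
struct object {
    struct class_def *isa;
    id  items[CAPACITY];
    int count;
};

/* The C function behind insertObject:atIndex:. */
void insert_object_at_index(id self, SEL cmd, id anObject, int anIndex)
{
    (void)cmd;  /* unused here, but always passed */
    if (self->count >= CAPACITY || anIndex < 0 || anIndex > self->count)
        return;
    for (int i = self->count; i > anIndex; i--)
        self->items[i] = self->items[i - 1];
    self->items[anIndex] = anObject;
    self->count++;
}

/* Search the class's selector-to-function list. */
method_fn lookup(struct class_def *cls, SEL cmd)
{
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].selector, cmd) == 0)
            return cls->methods[i].function;
    }
    return NULL;
}

/* Send a message: returns 1 if a method was found and called,
   0 if the receiver's class does not respond to the selector. */
int send_insert(id receiver, SEL cmd, id anObject, int anIndex)
{
    method_fn fn = lookup(receiver->isa, cmd);
    if (fn == NULL)
        return 0;
    fn(receiver, cmd, anObject, anIndex);
    return 1;
}
```

The lookup happens at run time on every send, which is what lets the same call site work with receivers of different classes.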
Objective-C makes some compromises for ease of implementation. Unlike Smalltalk, it also supports "primitive types" that are not objects. In Objective-C, a 16-bit integer is an intrinsic type that does not respond to messages and needs to be manipulated directly with C integer expressions.
In Smalltalk, the lowest two bits of a pointer are usually used to indicate the type. If they have a specific value, the value is treated as an integer (after right-shifting by two) and has some special handling.
This adds a small amount of overhead since it’s necessary to check the type of a value before sending it a message.
It also means that integers can be compared for equivalence in exactly the same way as objects in the virtual machine (by pointer comparison), without needing a special case.
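The tagging scheme can be sketched with a pointer-sized integer type. The specific tag value (binary 01) and the names value_t, make_int, is_int, and int_value are assumptions made for this illustration; actual Smalltalk virtual machines differ in the details.

```c
#include <stdint.h>  /* uintptr_t, intptr_t */

#define TAG_MASK ((uintptr_t)3)   /* the lowest two bits */
#define INT_TAG  ((uintptr_t)1)   /* assumed tag value meaning "integer" */

/* A value is either a real pointer or an integer packed into the
   same pointer-sized word. */
typedef uintptr_t value_t;

/* Pack an integer: shift left by two and set the tag bits. */
value_t make_int(intptr_t n)
{
    return ((uintptr_t)n << 2) | INT_TAG;
}

/* The check that must happen before a message is sent. */
int is_int(value_t v)
{
    return (v & TAG_MASK) == INT_TAG;
}

/* Unpack an integer by right-shifting by two. Right-shifting a
   negative signed value is implementation-defined in C; common
   compilers perform an arithmetic shift. */
intptr_t int_value(value_t v)
{
    return (intptr_t)v >> 2;
}
```

Two tagged integers with the same value have the same bit pattern, so make_int(42) == make_int(42) holds under an ordinary word comparison, the same comparison used for object pointers. Pointers to suitably aligned objects have their lowest two bits clear, so they never carry the integer tag.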