Threading Model

I have to admit that I am not great on threading. Two highly publicized systems I wrote a while ago generated native code. At least one RPM system I have recently used in a product did not even allow compiling thus had no threading model as such — after a dictionary search the interpreter just called the corresponding C function.

The Embeddable Forth Command Interpreter uses a kind of token threading that is similar to an indirect threading model on native Forth systems.

Execution Token Format

An execution token (XT) is what identifies a Forth work on the stack. If is the data type that is left on the stack by the Forth words and [‘] and what is consumed by EXECUTE and CATCH. The Embeddable Forth Command Interpreter uses two kinds of execution tokens: Low level tokens (the ones implemented in C — in traditionalt Forth parlance these are called primitives) and the indirect token that points to a different location inside the dictionary space (it is implemented as a cell index, not an address to allow relocatable dictionaries).

Tokens implemented in C were not chosen as a minimal set, rather everything that needed to be implemented during system bring up was left in C. Also some words that have a large amount of dependence on various facilities of the underlying system are implemented in C.

Execution token format in the Embeddable Forth Command Interpreter. There are two kinds of executions tokens: The first kind has the highest bit set to 1. The next 15 bits identify a token whose semantics is implemented in C. The low 16 bits form a parameter that is used by some token (e.g. branches, string literals, etc.).
The second type of XT is the indirect token. Its highest bit is zero and the rest of the bits form an index (cell displacement inside the dictionary area). See the text for more explanation.

The Forth Virtual Machine

The virtual machine used by the Embeddable Forth Command Interpreter is similar to a traditional Forth virtual machine. It has two stacks — for Forth’s data stack and Forth’s return stack as one would expect. Both stacks grow downwards. The location of the stacks is in the runtime context to allow separate stack space per thread. So the registers of the virtual machine would be sp for the data stack pointer, rp for the return stack pointer ip for the instruction pointer (index), w (a usual name in indirect threaded systems) is used for instruction fetch and we keep referring to the current execution token, let’s call it xt. The internal interpreter’s main loop is shown below, this is in effect the Forth engine that executes threaded code.

while (1)
{
    if (istoken(xt))
    {
         switch(extract_token(xt)
        {              
             // A gigantic switch statement implementing
             // the semantics of each token.
             case TOKEN_DROP: // This is how a token is implemented.
                 sp++;
             break;
        }
    } else {
        w = extract_index(xt);
        xt = dictionary[w];
        continue;
    }
 
    w = ip++;
    xt = dictionary[w];
}

The Usual Elements of a Threaded Interpreter

After the operation of the inner interpreter has been decided a lot of pieces easily fall into place.

Nest

Nest into a definition, used by words created by a colon definition:

 
*(--rp) = ip;    // Save the current ip on the return stack.
ip = w + 1;      // The new threaded code starts one cell after w.

Undo what Nest did:

ip = *(rp++);    // Pull the saved ip from the return stack.

DoVar

The execution semantics of variables.

*(--sp) = (cell)(&dictionary[w + 1]);  // The address of the cell after w.

DoConst

The execution semantics of Forth constants.

*(--sp) = dictionary[w + 1];           // The contents of the cell after w.

DoCreat

The execution semantics of words defined by CREATE DOES>.
In this model the DOES> part is a valid execution token (which might nest).

 // The address of the 2nd cell after w.
*(--sp) = (cell)(&dictionary[w + 2]);
 
// Fetch the XT stored in the CREATEd word
// that corresponds to the DOES> part of the definition.
xt = dictionary[w + 1];
 
// We are going back to the top of the while(1) loop.
continue;       

Using the Parameter Bits

In order to keep the implementation simple execution tokens are cell sized, which currently means always 32 bits. Certain tokens do in fact need extra data which is traditionally stored in the next cell in the dictionary. Because of the large token size I have decided to make to use of some of the bits inside them instead of using a second cell for it.

Branch and 0Branch

Branch tokens that are used for implementing forth control structures such as IF ELSE THEN BEGIN WHILE AGAIN UNTIL REPEAT CASE OF ENDOF ENDCASE use the parameter bits as an adjustment to ip.

    ip += sign_extend(extract_parameter(xt));

Short Literals

Short literals are supported that take the parameter bits and either sign extend or zero extend them and push them on the data stack, e.g.:

    *(--sp) = sign_extend(extract_parameter(xt));

Add Immediate

For implementing words such as 1+, 1- and CELL+.

    *sp += sign_extend(extract_parameter(xt));

DO LOOP

Words that implement the Forth loop construct (DO ?DO LOOP +LOOP) also use the parameter in a similar fashion as branches do.

DoUser

The token behind per execution thread variables (in Forth parlance USER variables) also keeps the index of the user variable in the parameter bits of the token. i.e. It pushes the start of the user area + parameter on the stack.

The Forth Runtime Context

The runtime context is a data structure that contains all sorts of variables that need to be separate if two copies of the Embeddable Forth Interpreter are run in the same address space (i.e. in threads). In theory the entire dictionary could be cloned and have separate copies of everything but that is the heavyweight way of doing it. The lightweight multi-threading support is to just have separate stacks for each instance that is running, separate search order, terminal buffer, exception handler, etc. Different threads also cannot write to anything in the dictionary space, thus normal variables cannot be used thus USER variable support is provided to allow per thread variables if the application chooses to run multiple threads that share the dictionary.

The function pointers that tell Forth how to interact with the terminal are also part of the runtime context (for example one thread might talk to a serial console while the other might talk to a TCP socket — the function pointers that implement what ACCEPT and TYPE and their kin does take care of dealing with that.