The Forth Programming Language



Forth is a stack-based language. Most of the standard words (commands) directly manipulate data on the stack.

Stack manipulation
At the heart of the Forth environment is the stack. The stack is used for value storage, parameter passing, return values, message passing, etc. At the core of the Forth language are words that perform basic operations on the stack, such as adding values to the stack, removing values, duplicating values, swapping, rotating, "leapfrogging", and clearing the stack. Programming using these basic stack-manipulation words is much like programming in assembly code, with one major difference: instead of performing operations on values stored in registers, operations are performed on values on the stack. In this way, Forth can simulate a stack-based architecture.
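As a rough illustration of these stack-manipulation words, the following sketch assumes an ANS-style Forth (gforth or pForth, for example) and uses the standard words DUP, DROP, SWAP, OVER, and ROT. The comments show the stack after each line, with the top of the stack on the right.

```forth
1 2 3      \ 1 2 3
DUP        \ 1 2 3 3    duplicate the top value
DROP       \ 1 2 3      remove the top value
SWAP       \ 1 3 2      exchange the top two values
OVER       \ 1 3 2 3    copy the second value to the top
ROT        \ 1 2 3 3    rotate the third value to the top
2DROP 2DROP             \ remove the four remaining values, leaving the stack empty
```

Each word consumes its operands from the stack and leaves its results there, so no operand names ever appear in the code.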

The stack-based design also facilitates a postfix notation for all operations. This allows easy parameter passing and nesting functions. For example, a typical C expression might look like:
sqrt(abs(difference(areaOfSquare(25), areaOfCircle(MAX(12, x)))));
Not necessarily a useful expression, but in Forth it looks like:
25 areaOfSquare 12 x @ MAX areaOfCircle difference abs sqrt
Still complex, perhaps, but the Forth program is simply a sequence of commands, each of which performs some operation on the data on the stack. Written this way, it is arguably more readable and easier to generate, and it avoids the confusion of so many parentheses.
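To make the postfix idea concrete, here is a smaller integer-only sketch. The definitions of areaOfSquare and difference are assumptions made up for this illustration (the text above does not define them); MAX and ABS are standard words.

```forth
: areaOfSquare ( side -- area )  DUP * ;   \ assumed definition: side * side
: difference   ( a b -- a-b )    - ;       \ assumed definition: plain subtraction

\ C equivalent: abs(difference(areaOfSquare(3), areaOfSquare(MAX(4, 5))))
3 areaOfSquare  4 5 MAX areaOfSquare  difference ABS .   \ prints 16
```

Reading left to right, each argument is computed and left on the stack just before the word that consumes it, so the nesting of the C expression turns into a flat sequence.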

Words, dictionaries
In Forth, all commands are called words. The collection of defined words is called the dictionary. A legal word name is any sequence of printable, non-space characters. Is.Stack.Empty? and $,,#Ard@--+}~! are both valid words, although neither is defined in the default dictionary. A user may add words to the dictionary using the : word. Proper usage is
: word.name word.definition ;
An example definition:
: Say.Hello ." Hello world!" ;
This is a simple Forth program. Now a user can enter Say.Hello and the hello message will be printed.
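New words can be used inside later definitions exactly like built-in ones. The sketch below assumes an ANS-style Forth; Say.Twice and SQUARE are hypothetical names chosen for the example.

```forth
: Say.Hello  ." Hello world!" ;
: Say.Twice  Say.Hello CR Say.Hello ;   \ a new word built from an existing one
: SQUARE ( n -- n*n )  DUP * ;          \ the comment in parentheses documents the stack effect

Say.Twice       \ prints the greeting on two lines
7 SQUARE .      \ prints 49
```

Programs are typically built this way, as layers of short words that each do one small job.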

A few notes on words: Forth is very sensitive about white space. All intended spaces must be explicitly indicated. For example, consider the two following code fragments,
10 2 *
10 2*
Both expressions leave the result of 10 * 2 on the stack, but in quite different ways. The first pushes 10, pushes 2, and multiplies; the second pushes 10 and then executes 2*. 2* is a single word (not a number followed by an operation, as in the first example) that executes a machine language procedure that multiplies the top of the stack by two faster than 2 * would. There are several such words, including 2*, 2/, 1+, and 1-. The point is that if spaces are left out, the Forth interpreter will try to execute the unintentionally joined words as a single word. If you are lucky, this new word will not be in the dictionary and will generate an error. If the word is in the dictionary, there will be an unexpected result which might be very difficult to debug.

Conditionals
The IF...ELSE...THEN clause examines (and removes) the top of the stack: it executes the code following IF if the top is true, executes the code following ELSE if it is false (ELSE is optional), and ends at THEN.
: verbose.equals?
	= IF ." Yes, they are equal"
	  ELSE ." No, they are not equal"
	  THEN
;
This will print the yes message if the two numbers on the stack are equal, no if not.

= and other such operators (>, <, and, or, not...) put a boolean value on the stack. A boolean is simply a normal 32-bit number, with 0 representing FALSE and anything else representing TRUE.
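The sketch below shows flags in action. It assumes an ANS-style Forth, where the comparison words return -1 (all bits set) for TRUE; some older systems return 1 instead. The word nonzero? is a hypothetical name used only for this example.

```forth
3 3 = .        \ prints -1 on an ANS Forth (TRUE)
3 4 = .        \ prints 0 (FALSE)

: nonzero? ( n -- )
    IF ." true" ELSE ." false" THEN ;

42 nonzero?    \ prints true: IF treats any nonzero value as true
0  nonzero?    \ prints false
```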



Variables
Although clever use of the stack can often eliminate the need for variables, variables can sometimes simplify a program. Not only do they offer a named, non-stack storage location, they can provide persistent storage of information. A variable is declared by:
VARIABLE varName
varName now names a memory address where information can be stored. Entering "varName" pushes this address onto the stack. To assign a value to the location, push the value, then the address, and use !:
10 varName !
To get the value in varName,
varName @
Because '!' and '@' operate on any 32-bit number, the programmer is granted a dangerous freedom: the ability to assign values to any given memory location. This is a major drawback to the language. Since every memory location is fair game, the programmer must be very cautious with '!' operations. Pointer arithmetic is possible, but can be messy and error-prone. The C-like statement
*p = 10;
can be written in Forth as
10 p @ ! 
where p is the address of a memory location that contains the address of a memory location.
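A complete version of this indirection might look like the following sketch, which assumes an ANS-style Forth. The variable name target is hypothetical; it simply gives p something valid to point at.

```forth
VARIABLE target        \ the cell that will receive the value
VARIABLE p             \ a cell that holds the address of another cell

target p !             \ like p = &target in C
10 p @ !               \ like *p = 10: fetch the address stored in p, then store 10 there
target @ .             \ prints 10
```

If p had not been set to a valid address first, the same 10 p @ ! would write to whatever location p happened to contain, which is exactly the danger described above.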

Passing variables between words is trivial. Since all variables declared with the VARIABLE word are global, all words can have direct access to variables. For passing variables by value, the value of the variable is pushed onto the stack. For passing variables by reference, the value of the variable is pushed onto the stack and the return value is stored back into the variable. The address in the variable could be passed, but this would add unnecessary complexity (except when dealing with strings and other data structures).
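The three styles of passing might be sketched as follows, assuming an ANS-style Forth. The names total, add-tax, and increment are hypothetical and exist only for this example.

```forth
VARIABLE total
: add-tax   ( n -- n' )  DUP 10 / + ;    \ adds 10% using integer division
: increment ( addr -- )  1 SWAP +! ;     \ adds 1 to the cell at addr

200 total !
total @ add-tax .          \ by value: prints 220, total itself is unchanged
total @ add-tax total !    \ "by reference": the result is stored back into total
total increment            \ passing the address itself
total @ .                  \ prints 221
```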

For clarity, here are some Forth variable operations and their C equivalents.
          VARIABLE MyVar     =    int MyVar
          MyVar              =    &MyVar
          MyVar @            =    MyVar
          10 MyVar !         =    MyVar = 10

Local variables
Variables can also be created inside word definitions. These variables can only be accessed from inside the word definition. Local variables are used as follows:
: switch { a b | t -- }
	a -> t
	b -> a
	t -> b
	a b
;
There are several things to notice here. First, in the variable declaration, if a variable name follows '|', the variable is initialized as 0. Otherwise, the variable is assigned a value taken from the stack, with the last-named variable receiving the top item. Second, there is no dereferencing necessary for placing local variable values on the stack. Third, the '->' operator is used to store a value in a variable.

Looping
Because Forth has such direct, low-level stack access, recursion has little overhead and can be implemented in a wide variety of styles. See the example page for a recursive bubble sort. The word RECURSE indicates where a recursive call is to be made.
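As a minimal sketch of RECURSE (not the bubble sort from the example page), here is a recursive factorial in ANS-style Forth. The word name factorial is chosen for this illustration.

```forth
: factorial ( n -- n! )
    DUP 1 > IF
        DUP 1- RECURSE *    \ n * (n-1)!
    ELSE
        DROP 1              \ 0! and 1! are both 1
    THEN ;

5 factorial .    \ prints 120
```

RECURSE is needed because the name being defined is not yet findable in the dictionary while its own definition is being compiled.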

Other looping: BEGIN...UNTIL - executes the code between BEGIN and UNTIL repeatedly. UNTIL removes the top of the stack and ends the loop when that value is true.
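A small BEGIN...UNTIL sketch, assuming an ANS-style Forth. The word countdown is a hypothetical example and expects n to be at least 1, because the loop body always runs once before UNTIL tests the flag.

```forth
: countdown ( n -- )
    BEGIN
        DUP .       \ print the current value
        1-          \ decrement it
        DUP 0=      \ flag for UNTIL: true once the value reaches zero
    UNTIL
    DROP ;          \ discard the leftover 0

3 countdown    \ prints 3 2 1
```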

x y DO...LOOP - similar to a for loop. The loop index starts at y and counts up by one, stopping before it reaches x -- for LoopIndex := y to x-1 do...
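In standard Forth, DO takes the limit and then the starting index from the stack, and the word I pushes the current index inside the loop. The sketch below assumes an ANS-style Forth; squares and sum-below are hypothetical words, and both expect n to be greater than 0.

```forth
: squares ( n -- )           \ print the squares of 0 .. n-1
    0 DO  I DUP * .  LOOP ;

: sum-below ( n -- sum )     \ add up the integers 0 .. n-1
    0 SWAP 0 DO  I +  LOOP ;

5 squares        \ prints 0 1 4 9 16
5 sum-below .    \ prints 10
```

The limit itself is never used as an index, so 5 0 DO runs the body for 0, 1, 2, 3, and 4.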

Some Forth examples
Forth links:
Forth Interest Group: Excellent source for Forth interpreters, tutorials, and other useful links


stoneda@cs.earlham.edu