Writing an Interpreter in Object Pascal: Part 1

How to Purchase

Paperback copy (with the option to get 50% off the PDF version): $16.95

eBook (PDF): $15.95

For more information contact: hsauro@gmail.com

This is Part 1 of a series that shows you how to write an interactive interpreter in Object Pascal. Part 1 covers introductory material: a description of the language we’ll create, a full lexical analyzer for the language, how to use DUnitX for unit testing, and an introduction to the essential concepts of syntax analysis, including recursive descent, grammars, and EBNF. Along the way, we’ll create a simple REPL, discuss in detail how to parse expressions, and build a simple interactive calculator to illustrate the theory. The book provides fully working code and explains in plain English how the code works and why certain decisions were made, including alternative designs. Code is used liberally throughout the chapters. Everything is done without the help of third-party tools such as Yacc, ANTLR or Flex. All you need is a standard installation of Free Pascal or Embarcadero’s Delphi (including the free Community Edition).

The text is geared to hobbyists and mid-level developers who need an accessible introduction to lexical analysis and parsing. It’s also for students who are starting out in compiler and interpreter design and need something more digestible.

All source code is open source under the Apache 2.0 license and available from GitHub.

Available in paperback form in the first week of January 2019 from Amazon.
Price: $16.95 (paperback), $15.95 (eBook); 170 pages
ISBN: 978-1-7325486-0-2

A brief description of the Rhodus language Version One

Contents:

  1. Introduction
    a) Why Object Pascal
    b) What is an interpreter
    c) Parts of an interpreter
  2. The Rhodus Language
  3. Lexical Analysis
    1. Initial API
    2. Input streams
    3. Retrieving tokens
    4. First run
    5. Adding more tokens
  4. Testing
    1. Introduction to testing
    2. Using DUnitX
  5. An Interactive Console
  6. Introduction to Syntax Analysis
    1. Grammar
    2. Production rules
    3. LL(k)
    4. Recursive descent
    5. Factoring, the dangling else
    6. Left recursion
    7. Ambiguous Grammars
    8. A simple calculator
    9. Syntax trees
    10. Adding exponentiation and the unary minus
  7. Testing the Calculator
  8. Adding Assignments and Variables
    1. Using a queue for token lookahead
    2. Updating the syntax analyzer
    3. Updating the lexical analyzer
  9. Building a Recognizer
  10. Appendix EBNF
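
To give a flavor of the recursive descent technique that chapter 6 builds up to, here is a minimal calculator sketch. It is written in Python rather than Object Pascal purely to keep it short, and it is not the book's code: the grammar, the class and function names, and the choice that unary minus binds less tightly than `^` (so `-2 ^ 2` is `-4`) are all assumptions made for illustration. Each grammar rule becomes one method.

```python
import re

# Assumed grammar (EBNF), for illustration only:
#   expression = term   { ("+" | "-") term } ;
#   term       = factor { ("*" | "/") factor } ;
#   factor     = "-" factor | power ;
#   power      = primary [ "^" factor ] ;        (right associative)
#   primary    = number | "(" expression ")" ;

class Parser:
    def __init__(self, text):
        # A number, or any single non-space character, is one token.
        self.tokens = re.findall(r"\d+(?:\.\d+)?|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expression(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        if self.peek() == "-":          # unary minus
            self.advance()
            return -self.factor()
        return self.power()

    def power(self):
        base = self.primary()
        if self.peek() == "^":          # recursing makes "^" right associative
            self.advance()
            return base ** self.factor()
        return base

    def primary(self):
        tok = self.advance()
        if tok == "(":
            value = self.expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok[0].isdigit():
            return float(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = Parser(text)
    value = parser.expression()
    if parser.peek() is not None:       # leftover input is an error
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Calling `evaluate("2 + 3 * 4")` walks `expression`, `term`, `factor`, `power` and `primary` in turn, which is how operator precedence falls out of the grammar without any precedence table.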

The language for the interpreter is a mix of ideas from Pascal, Basic and Python. This is a toy language in the sense that it is used to illustrate how one might go about building an interpreter. This doesn’t mean it couldn’t be used in a serious application, but given the availability of mature scripting languages, that seems unlikely. Its primary purpose is therefore educational. The easiest way to describe the language is via some examples. A couple of initial points worth noting:

  1. Newlines are not part of the syntax
  2. Semicolons separate statements

// Data Types
a = 3;
b = 3.1415; b = -1.234E-12;
c = True; c = False;
d = "String";
myList = {1,2,3,{5,6},{"xyz", True, {False}}};

/* Operators:
   +, -, *, /, ^, (), mod, div, and, or, not, xor, ==, >, >=, <, <=, !=
*/
println ("Hello World")

// Conditionals
if a > 6 then
   b = 9
else
   if b < 3 then
      x = 7
   else
      x = 8
   end
end

// Loops
a = 10
while a > 5 do
     a = a - 1
end;

a = 10
repeat
   a = a - 1
until a < 5;

for i = 1 to 10 do
    println (i)
    if i == 7 then
       break
    end
end;

for i = 0 to 10 do
    a = 5;
    for j = 0 to 20 do
        b = 7;
        for k = 0 to 30 do
            c = 9
        end    
    end 
end

// User defined functions
function add (a, b)
   return a + b
end;

x = 6;
function sub (a)
  global x
  return a - x;
end;

println ("Add two numbers: ", add (4,4))

// Functions can be passed to other functions
function callfunc (func)
    return func (5,6)
end;

println ("Add two numbers: ", callfunc (add))

// Lists
function bubbleSort (a, length)
   for i = 0 to length - 1 do
       for j = i+1 to length - 1 do
           if (a[i] > a[j]) then
               temp = a[i]; a[i] = a[j]; a[j] = temp;
           end
       end
   end
end;

// Sort the following array
alist = {5,2,6,7,3,2,5,1,4,9};
lengthOfList = 10;
bubbleSort (alist, lengthOfList);
println (alist);

// Larger example
function hamming (limit)
   h[0] = 1;
   x2 = 2; x3 = 3; x5 = 5;
   i  = 0; j  = 0; k = 0;
   for n = 1 to limit do
       h[n] = min(x2, min(x3, x5));
       if x2 == h[n] then i = i + 1; x2 = 2*h[i] end;
       if x3 == h[n] then j = j + 1; x3 = 3*h[j] end;
       if x5 == h[n] then k = k + 1; x5 = 5*h[k] end;
   end;
   return h[limit - 1];
end;

// Currently no library support for lists until Part 2, so
// we have to declare a list of a specific size like this.
// hamming (20) writes h[0] to h[20], so 21 elements are needed.
h = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};

for i = 1 to 20 do
     println (hamming (i));
end
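
For readers who want to check what the larger example should print, here is the same algorithm transcribed into Python. This is a cross-check sketch, not output captured from the Rhodus interpreter. It assumes that `for n = 1 to limit` includes both endpoints, as in Pascal, and it uses a local list where the Rhodus version uses the global `h`.

```python
def hamming(limit):
    # The loop writes indices 0..limit, so allocate limit + 1 slots.
    h = [0] * (limit + 1)
    h[0] = 1
    x2, x3, x5 = 2, 3, 5
    i = j = k = 0
    for n in range(1, limit + 1):      # inclusive upper bound, as assumed for Rhodus
        h[n] = min(x2, x3, x5)
        if x2 == h[n]:
            i += 1
            x2 = 2 * h[i]
        if x3 == h[n]:
            j += 1
            x3 = 3 * h[j]
        if x5 == h[n]:
            k += 1
            x5 = 5 * h[k]
    return h[limit - 1]


first_twenty = [hamming(i) for i in range(1, 21)]
print(first_twenty)
# [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36]
```

These are the first twenty Hamming numbers, that is, numbers whose only prime factors are 2, 3 and 5.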

What’s coming up next?

Given the size of the project, there may be two additional parts, Parts 2 and 3. The topics for these parts will include:

  1. Symbol Table: (1)
  2. Building the AST
  3. Symbol Table: (2)
  4. Error Handling
  5. The Virtual Machine: (1)
  6. Code Generation: (1)
  7. Version Two of the Rhodus Language
  8. Adding Module and Array Support
  9. Math and IO Libraries
  10. Adding Exception Handling
  11. List and Array Libraries
  12. The Virtual Machine: (2)
  13. Code Generation: (2)
  14. Semantic Analysis
  15. The Virtual Machine: (3)
  16. Code Generation: (3)
  17. A Better REPL