Scheme in Python

The Scheme in Python series has been completed! Since there are still a few readers of this blog and I haven’t yet setup domain forwarding, I’ll post the links page here.


The Scheme in Python project is a port of the Scheme in Scheme project. The goal is to implement a small subset of Scheme as an interpreter written in Python.

There are a number of goals for this project. First, implementing Scheme in Scheme allowed us to “cheat” a bit by having access to the Scheme reader and data structures. Using Python as the implementation language will force us to code the reader by hand and create new data structures where there isn’t a one-to-one mapping from Scheme to Python.

There are also two auxiliary goals to this project. Using Python should make this more accessible to programmers who are interested in language development, but are unfamiliar with Scheme. Also I’m using this project as a way to familiarize myself with branching and merging in git, so each post will correspond to a branch in the repository.

All the code for this project will be hosted on GitHub. The code is licensed under a BSD license if you are interested in forking it for any reason.

This series will focus on building a very simple interpreter for the purpose of learning the steps involved in building one. For this reason there will be no/very little error checking or optimization. This port will be slightly more complicated than Scheme in Scheme so if you are interested in an even simpler interpreter look here.

Part 0 | Reading from stdin

Part 1 | Numbers

Part 2 | Extending the read layer

Part 3 | Pairs and linked lists

Part 4 | Self-evaluating values

Part 5 | Assignment and define

Part 6 | scheme-syntax macro

Part 7 | Refactor and load

Part 8 | Primitive procedures and apply

Part 9 | Environments

Part 10 | lambda

* Most implementations of Scheme in Python use regular expressions in the reader. I chose to write a parser by hand so I could explain some of the details of parsing. As this is an educational exercise I think this is appropriate.

Other resources for writing a Scheme in Python

lis.py and lispy.py
Simple Schemes written by Peter Norvig.

Psyche
pyscheme
I haven’t got the chance to look at Psyche or pyscheme, but you may be interested in them as well.

Other resources for writing a Scheme in Scheme or other languages

Structure and Interpretation of Computer Programs
Chapter 4 onward covers designing and implementing a number of interpreters and was the inspiration for this interpreter.

An Incremental Approach to Compiler Construction (PDF)
Great paper on building a Scheme compiler from the ground up. Each step is simple and results in a fully working compiler.

Scheme from Scratch
The blog series that inspired and guided the development of the original Lispy. Even if you don’t know C (I didn’t at the time) you will still be able to follow along and construct your own Scheme. Peter’s coding style is easy and pleasant to read and he mentions tips for going in different directions for your own implementations.

A Self-Hosting Evaluator using HOAS (PDF)
An interesting implementation of Scheme using a Higher-Order Abstract Syntax representation. This paper, An Incremental Approach to Compiler Construction and SICP were the primary motivating forces behind my interest in PL design and implementation. The author, Eli Barzilay, has many other interesting papers at his site.

Chai – Math ∩ Programming
A series detailing the development of Chai (what appears to be a Scheme-like language). It is well written and currently in development. I’ll post more information when it’s available.

Scheme in Scheme
Another series that is just beginning about writing a bytecode interpreter. It appears to be put on hold as of April 2011.

Lisp in Scheme
An implementation of McCarthy’s original Lisp in Scheme.

Offline

Lisp in Small Pieces
Great book. Contains code for 11 interpreters and 2 compilers. Source code from the book available here.

Advertisements

What In The Hell Are Race Conditions and Locks

In the context of parallel or concurrent programs, you will often hear the term race condition. This describes a problem when two pieces of code try to modify the same variable (or other location in memory).

Race conditions can be notoriously difficult to find, reproduce and debug.

To start with we have a variable a that is set to 7. I’ll show you what can happen, step-by-step, when two pieces of code try to use and modify that variable.

Both a = a + 35 and a = a – 10 will execute “at the same time”. I have “at the same time” in scare quotes because we don’t really know what “at the same time” means. Computers process instructions sequentially, this means that one operation has to come before the other. Let’s see how that can play out…

From looking at the code in the beginning, you may have guessed that the value of a after these operations would be 42, and that was indeed the intent. However, because of the race condition the value of a ended up being -3. Not at all what we wanted!

Probably the most common way to handle race conditions is to use locks. Here’s how the above process would look using locks…

Locks prevent race conditions by making sure that only one piece of code can access a memory location at any time. Locks can become very complicated to manage and are themselves the source of many bugs. Consider what would happen if some piece of code did not unlock the memory it was modifying! Or worse, did not lock it in the first place, in which case we fall back to the race condition.

Locks are not the only way to handle race conditions (though they are the most common). Another way to handle race conditions is to use immutable variables. This means that no code is allowed to change the value of a. Immutable variables will be the subject of another WITH article, so I won’t explain them here.

GOTO Table of Contents

Learn to Program with Python

Learn the basics of programming with Python. This tutorial will eventually become part of the What In The Hell series.

There are two schools of thought when it comes to writing code. The first is to sit down and throughly plan out your code and what it should do. The second is to sit down and start writing code, no matter how ugly and bug prone it may be, then hammer at the bugs until you have a beautifully exquisite piece of software. It is the latter approach that I recommend that you use, because it not only teaches you what works, but more importantly what doesn’t so that you can avoid that in the future.

This is a relatively short intro to the programming language Python. Python is a great beginner language as it hides many of the more advanced aspects of programming from the programmer. This doesn’t mean though, that Python is not a ‘real’ programming language, rather it takes care of things like memory management and object typing behind the scenes, making your code less buggy and error prone.

There are some terms you should know before setting out to learn any programming language. To those power-users some of these terms might be redundant.

Code: Code is a command, or commands, that are executed by the interpreter. Code can be saved in a file and run or typed into the interpreter.

Script: A script is simply a text file that contains code to be executed by an interpreter. You create a python script by putting .py at the end of the filename.

Program: A program can be a script or a collection of scripts that interact with each other.

Interpreter: An interpreter is a program that reads code and produces the output specified by the code. The Python interpreter will accept single lines of code, scripts or programs.

 

0) Using Python

Why does this tutorial start at zero? Computers usually start counting from zero. The simplest way to look at this is that zero is the first number in our number system, though there are some other reasons that may become clear to you later.

There are a number of ways to use Python, we will concentrate on the easiest method first (later on we will discuss some of the more advanced methods). First you must download the python executable from the Python website (www.python.org at the time of this writing). Once you have the program downloaded go ahead and set it up by double clicking on the icon and following along with the prompts (you will notice that it is no different than any windows program installation). Click ‘finished’ when you are done and you are now ready to start using Python. The program that we will be doing all our work in is called ‘Idle’ and it is Python’s default interactive interpreter (there are others, which we will get to toward the end of this tutorial). To start idle go to the Start menu, then All Programs, then Python. Click on Idle and the the Idle interpreter will pop up.

Python 2.4.3 (#1, Jun 13 2006, 14:23:21) 
[GCC 4.1.1 20060612 (Red Hat 4.1.1-2)] on linux2
Type "copyright", "credits" or "license()" for more information.

 ****************************************************************
 Personal firewall software may warn about the connection IDLE
 makes to its subprocess using this computer's internal loopback
 interface. This connection is not visible on any external
 interface and no data is sent to or received from the Internet.
 ****************************************************************

IDLE 1.1.3 
>>>

You should see something like the above when you open Idle. For now the only significant thing here is the >>>. This is the command prompt where you will enter all your commands. Just to make sure it works type 2+2 and then hit return. Did it print 4 on the line below 2+2? As you can see Python can do math too, but we’ll save that for a little bit later. Now you can move on to writing your first program.

 

1) Hello World

It is customary to start each introductory text with a hello world program. This is typically the simplest program you can write in any programming language. It is a very simple program which only does one thing, it prints “hello world” to the screen on the interpreter.

>>> print 'Hello World!'
'Hello World!'

Great, you have just created your very first program. Don’t feel special yet? Don’t worry there’s much more to come. Let’s break this simple program down to see what Python does with it.

print – print is the command used to display results on the screen. It doesn’t print anything to your printer so don’t go putting paper in.

‘Hello World!’ – This is a string. A string is any combination of letters, numbers and special symbols encased in either single quotes ‘ ‘, double quotes ” “, or triple quotes “”” “”” or ”’ ”’.

Let’s try something else:

>>> print "hello", 'world'
'hello world'

The comma separates objects to be printed and automatically inserts a space between them. You can separate letters, numbers and symbols with a comma. Note also that we can use different types of quoting in the same sentence and Python doesn’t care.

>>> print "I can't", 'run today'
I can't run today

This illustrates why we would want to use different quotes around different strings. If we used single quotes around ‘I can’t’ you see that the computer could get confused as to where to end the string (after n or t?).

>>> print 'Go see "The Rock"'
Go see "The Rock"

One of the best reasons to use single quotes is to preserve quotation within a string.

>>> print 'hello'+'world'
helloworld

Using the + sign between words performs an operation called concatenation, which is basically a big word for joining together strings without a space. Below you can see how to use concatenation and enter spaces between the words.

>>> print 'hello '+'world'
hello world

This prints a space between the words because there is a space after hello and before the ending quotation.

Print also can print numbers and the results of mathematical operators.

>>> print 5
5

>>> print 2435656
2435656

>>> print 5+5
10

Numbers with quotes around them are strings so the ‘+’ operator will concatenate instead of adding the two numbers.

>>> print '5'+'5'
'55'

 

2) Basic Math

Python as well as every other programming language is simply an overpowered calculator. Math in Python is very simple and the rules that you learned in math class will apply directly here. Here are a few math problems and their results.

>>> 5+5
10
>>> 5-4
1
>>> 5*5
25
>>> 5.0/2.0
2.5
>>> 5/2
2

The last problem is the only one here that seems strange. All whole numbers are automatically considered to be integers. A decimal is a special type of number called a floating point number, or float for short. You can declare a float by putting a .0 on the end of a whole number, so that 5 becomes 5.0.

To get the remainder of an expression you simply use the ‘%’ operator. The remainder of 5/2 is 1.

>>> 5 % 2
1

Python can also return a float.

>>> 5/2.0
2.5
>>> 5.0/2
2.5

Python will convert the number to float if one number in the expression is a float.

>>> 2**3
8
>>> 3**3
27

The ‘**’ operator represents raising the number to the x power, so that 2**x raises 2 to the x power.

Introducing Parenthesis

>>> (5*5)+3
28

The parenthesis in Python are equivalent to the parenthesis in math. Anything inside a parenthesis gets done before the things outside the parenthesis. So the expression above reads (25)+3.

>>> ((2*5)*5)+10
510

Parenthesis can be nested also. In this case the innermost parenthesis get evaluated first. So ((2*5)*5)+10 == (10*5)+10 == 500+10 == 510.

You might wonder why I used two equals signs (==). Python uses one equal sign to assign a variable. If you want to describe an equal to relationship you must use two equal signs ==.

Python has other comparison operators that you probably remember from grade school math class.

== equal to
!= not equal to
>= greater than or equal to
<= less than or equal to
> greater than
< less than

There are a few common ways to write Python code but I suggest that you stick with the method described in this tutorial. Other ways are as follows:

>>> 5+5
10
>>> 5 + 5
10
>>> ( 5+5 ) + 3
13
>>> (5 + 5) + 3
13
>>> (5+5)+ 3
13

Whitespace in Python is very important, however whitespace within expressions can be used however you want. It is best though, to pick a style and stick with it so that other developers don’t get confused when they read your code.

 

3) Variables

Variables come in very handy when programming. A variable is an identifier that holds a value. The best way to explain this is to show you.

>>> a = 10
>>> print a
10

The above example demonstrates how to create a variable (a = 10) and how to access that variable (print a). From now on when we enter the letter ‘a’ the interpreter will substitute the letter for the number 10. Let’s try some more.

>>> b = 5
>>> print b
5
>>> print a+b
15

As you can see we can perform mathematical operations on the variables.

>>> c = 5+15
>>> print c
20
>>> d = a+b
>>> print d
15

We can also assign the value of a mathematical expression to a variable.

>>> print d
15
>>> d = d+1
>>> print d
16

We can also change the variable by referring to itself.

>>> print d
16
>>> d = d/2
>>> print d
8

We can also assign multiple variables at once by using tuples (tuples are a type of list that we will discuss a little later)

>>> (i, j, k) = (1, 2, 3)
>>> print i
1
>>> print j
2
>>> print k
3

Tuples are always enclosed in parenthesis and each value is separated by a comma. The first variable (i) is assigned to the first value (1), the second variable (j) is assigned to the second value (2) and so on. You can have as many variables/values as you like, just make sure that you have equal numbers of each. Below you can see what happens if you don’t.

>>> (i, j, k) = (1, 2)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
ValueError: need more than 2 values to unpack

>>> (i, j, k) = (1, 2, 3, 4)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
ValueError: too many values to unpack 

In both instances above the Python interpreter spits out a nasty error message. We’ll take this opportunity to talk a little about error messages. Let’s break down the error message into its three lines.

Traceback (most recent call last):

Line 1 of the error message simply tells you that the last instruction that the interpreter tried to do failed, and that the following messages are a result of this instruction failing.

File "<stdin>", line 1, in <module>

This line tells us exactly what instruction failed.

First it tells us what file the error was in. Since we aren’t using a file (we are typing directly into the interpreter) it tells us that the error was in “<stdin>” which is a cryptic way of saying standard input, or input from the keyboard.

Then it tells us what line the program failed on. Here we only have one line so it tells us the error occurred on line 1.

The final hint, in <module>, tells us what function or class the instruction was in. Since we haven’t covered functions or classes yet, just ignore this part.

ValueError: too many values to unpack

Finally this line tells us what type of error occurred. This time it tells us that we gave it too many values (before it told us that we gave it too few values). This line will give you a good hint as to what you need to change to make the instruction work, here we need to take away one value.

For more information on errors, check out What In The Hell Are Errors.

You can also store your values in a variable like the example below.

>>> w = (4, 5, 6)
>>> (x, y, z) = w
>>> print x
4
>>> print y
5
>>> print z
6

Remember here that a variable is simply a placeholder, so when the Python interpreter sees the command (x, y, z) = w it changes it to (x, y, z) = (4, 5, 6).

Variables aren’t just for numbers, you can store any data type in a variable. You can even store a variable in a variable.

>>> m = 'Hello'
>>> o = 'World'
>>> print m, o
Hello World

>>> p = m
>>> q = o
>>> print p, q
Hello World

One of the main reasons to store a variable in a variable is if you want to change the initial variable.

>>> s = 50
>>> t = s
>>> s = s/2
>>> print s
25
>>> print t
50

Notice that t keeps the initial value of s even after we change s by dividing it. This feature will come in very handy when you begin to learn about loops.

 

4) Functions

A function is a set of instructions.

def HelloWorld():
 print 'Hello World!'

Above is probably the simplest function that you can create. Let’s break it down.

def HelloWorld():

This is where we define the function. We do this by giving the def command followed by a name (like a variable name). We follow the name with a set of parenthesis and after the parenthesis we put a colon. The colon tells the interpreter that we are done defining the function.

 print 'Hello World'

You should remember this from chapter 1. Under the definition you can list any number of commands. These commands should be indented one level from the definition and there should only be one command per line.

When you want to use a function you simply type the name followed by parenthesis.

>>> HelloWorld()
Hello World!

Here’s some more functions and their output when you use them.

>>> def add():
 print 2 + 2

>>> add()
4

>>> def multiply():
 print 2 * 3

>>> multiply()
6

>>> def mu():
 print 2 * 3

>>> mu()
6

You can think of a function as a variable that stores multiple commands. Below we’ll see some more advanced functions that use some of the things we learned in previous chapters.

>>> def commands():
 print 'Hello. I will show you the multiples of 2'
 print 2**0
 print 2**1
 print 2**2
 print 2**3
 print 'There are ' , (2**4)*2 , 'possible combinations in a binary byte'

>>> commands()
Hello. I will show you the multiples of 2
0
2
4
8
There are 16 possible combinations in a binary byte

As you can see above each line in the function commands() is executed one by one until it reaches the end. So far we have only used the print statement in our functions. Let’s see what happens when we use a variable to store data.

>>> def twoPlusTwo():
 add2 = 2+2

>>> twoPlusTwo()

Notice that the function twoPlusTwo() did not print anything when we called it? You might have noticed that add2 = 2+2 will not print anything either when we type it on the command line. Let’s try to use the variable add2 by printing it.

>>> print add2
Traceback (most recent call last):
 File "", line 1, in ?
NameError: name 'add2' is not defined

The interpreter spits out a error message telling us that add2 has not been defined. How can that be? Any variable that you define in a function stays in that function, and is not accessible from outside of the function. The technical term for this is a local variable. Why does Python do this? There are many reasons which you will come to find are very useful, but the main reason is so you don’t have to keep track of all the variables that you used in your program, because redefining a global variable (or a variable accessible from anywhere, more on this below) will erase it’s current value and replace it with the new one. Yikes, that means that if we forget that we used a variable name we could crash the program, or make it spit out completely wrong data!

Functions do not return values by default, instead we must specify a return value if we want the function to produce something. In the above example of twoPlusTwo() when we called it it just gave us a new command line. The function is not broken (Python really did calculate 2+2 and assign it to add2) it merely has no return value. Let’s redefine twoPlusTwo() and give it a return value.

>>> def twoPlusTwo():
 add2 = 2+2
 return add2

>>> twoPlusTwo()
4

Notice that we added one more line, return add2. This line tells the interpreter to return (or give back when called) a value. Note that return and print are two completely different commands (even though they produce the same output here) print simply prints a message to the screen, while return gives us back a value that we can use. Let’s look at an example of this.

>>> def r():
 add2 = 2+2
 return add2

>>> r() + 2
6

Here we can see that we can use the function just like a variable. In this example r() + 2 the interpreter executes the function r() and get a return value of 4. Then it takes that value and adds 2 to it and returns the value of 6. Let’s see what happens if we try this using print.

>>> def p():
 add2 = 2+2
 print add2

>>> p() + 2
Traceback (most recent call last):
 File "", line 1, in ?
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

Python spits out another error message that tells us that we cannot add NoneType and int. This tells us that the function p() returned NoneType, which is a way of saying that p() returned nothing, and that we cannot add 2 to nothing (note here that NoneType is not equivalent to 0, we could add 0+2)

Earlier we talked about local variables. Let’s try a little experiment with the return statement.

>>> def ret():
 sub2 = 4 - 2
 return sub2

>>> ret()
2
>>> print sub2
Traceback (most recent call last):
 File "", line 1, in ?
NameError: name 'sub2' is not defined

Note here that the return statement did not give us a global variable of sub2. Instead it just returned the value of sub2.

 

4.1) Global and Local Variables

Close your Python interpreter and reopen it (this will get rid of all the variables and commands we entered already, and give us a clean slate to work with). Below is a session that shows us the difference between local and global variables.

>>> a = 10 ## a is a global variable because it is not inside a function
>>> b = 15 ## b is a global variable because it is not inside a function
>>> def add():
 c = 20 ## c is a local variable
 return a+b+c

Global variables are variables defined outside of any function (or class, but we haven’t learned about classes yet). Another way to think about it is that any variable that is declared and is not indented is a global variable (this will be a little more clear below)

We can use global variables anywhere, but we can only use local variables inside the function that declared them. When we are using more than a few lines to tell the interpreter what to do it is often easier to create a script, or a file containing commands for the interpreter. Every Python script must end in the ‘.py’ extension, you can do this simply by typing ‘.py’ at the end of the filename when you save it. Let’s rewrite the above code in a script.

First go to File then click New Window in the Python interpreter. This will give you a blank window. The first thing we’ll do is save the file (even thought it’s blank). Go to File –> Save. A window will pop up asking you where you would like to save the file and what you would like to name it. For now we’ll choose the default location and we’ll name the file ‘test.py’. Saving the file right away does a few things for us that make writing a script easier. First, and most important, it lets us save the document anytime we want by simply using the keyboard shortcut Ctrl – s. Second saving the file as a Python script enables a feature of the text editor called syntax highlighting that changes certain keywords to a different color, while this might seem frivolous or even confusing at first, when you get used to the colors it will help you identify your code.

OK now that we’ve got that out of the way let’s get down to writing our script.

#############
## test.py ##
#############

# This is a comment. Anything after the '#' until the next line will be ignored by the interpreter
# Use comments to explain your code so that it is easier for you to remember what does what

a = 10
b = 15

def add():
 c = 20
 return a+b+c

The code above is a script. Strictly speaking any comments are not necessary, but to help us remember or to help someone else understand the code they are useful. I use a comment box, which is simply text surrounded by comment symbols, to identify things like filename, date created, date revised, changes to the script, TODOs and special sections of code. You do not have to use the comment box, instead you can use just a single comment (#) to denote the same thing. I also often use two comments to signify a comment (##) instead of just one. Again this is just personal preference and you can choose to comment however you like.

Like I stated earlier, one of the easiest ways to identify a global variable is to look at the level of indentation. Global variables will have no spaces before their declaration. This means that a and b are both global variables, while c is not (because it has spaces before it).

This script right now will run, but it won’t output anything. This is because we haven’t yet used any of the variables/functions that we defined. Let’s add a little bit to the script so that it does something.

#############
## test.py ##
#############

a = 10
b = 15

def add():
 c = 20
 return a+b+c

add() ## This line uses the function add() just like we've done on the command line

One of the nice features of the IDLE text editor is the F5 key. To run your script you simply press the F5 key and a window will pop up that tells you that you must save the script before you can run it (if you haven’t saved it already). Hit OK and IDLE will save the script for you and then run it on the command line. You will see the interpreter pop up and then you will see it spit out the number 45 (10 + 15 + 20). Congratulations, you have written your first script.

Python executes a script one line at a time (like we did manually at the command line). The script above is executed as follows.

1) set a = 10
2) set b = 15
3) define the function add()
4) execute add() and return the value

If we were to change the script and put the definition and the command add() before we declared a and b we would get an error telling us that a and b haven’t been defined yet. The code below is an example of trying to define and use a function before declaring the variables that are used in the function.

#############
## test.py ##
#############

def add():
 c = 20
 return a+b+c

add()

a = 10
b = 15

One of the properties of functions is that they are not executed in order. Instead a function is only executed when it is called. The script above breaks when we try to run it. But is that because we defined the function before we defined the variables? Or because we tried to use the function before we defined the variables. Look at the code below for the answer.

#############
## test.py ##
#############

def add():
 c = 20
 return a+b+c

a = 10
b = 15

add()

This works! The only thing we’ve changed here is where we execute the function (after we declare the variables). This shows us that a function is not evaluated until it is used. So even though we used variables in the function that don’t yet exist (from a line-by-line execution viewpoint) we define the variables before we use the function, making the script work.

To learn more about functions check out What In The Hell Are Functions.

 

5) Conditionals

A conditional is an if/then statement. We use these all the time in everyday life. For instance we might say; if the temperature is over 80 degrees, then turn on the air conditioning. The code for this statement would look something like this:

if temperature over 80:
 turn on air conditioning

The above code is what we programmers call psudocode. Psudocode won’t actually run, its purpose is instead to give us an idea of how the code should look. It is also very easy to turn psudocode into real code.

if temp > 80:
 runAC()

This is the real code for the psudocode above. Notice the similarities? One of the new features here is the > sign. > is called an operator and is part of a group of operators.

== equal to
!= not equal to
>= greater than or equal to
<= less than or equal to
> greater than
< less than

Each of these operators returns a boolean value. A boolean value is either True or False so if we say a > b the boolean response would be True if a is greater than b, or False if a was less than or equal to b.

Operators are used extensively in conditional statements so it is best to fully understand them and their use. Operators return only a boolean value, you cannot for instance, get an integer or string from an operator.

You can execute as many commands as you would like in an if statement. The following code demonstrates this:

if number > 60:
 hours = number/60
 minutes = number % 60
 string = hours + ':' + minutes
 print string

This code takes a number of minutes and transforms it into hours:minutes, then prints the output.

To learn more about conditionals check out What In The Hell Are Conditionals.

 

Links

Pointless Programming – What In The Hell Series

Official Python Tutorial

How to Think Like a Computer Scientist

A Byte of Python

More books and tutorials for beginner programmers.