Scheme in Python

The Scheme in Python series has been completed! Since there are still a few readers of this blog and I haven’t yet set up domain forwarding, I’ll post the links page here.


The Scheme in Python project is a port of the Scheme in Scheme project. The goal is to implement a small subset of Scheme as an interpreter written in Python.

There are a number of goals for this project. First, implementing Scheme in Scheme allowed us to “cheat” a bit by having access to the Scheme reader and data structures. Using Python as the implementation language will force us to code the reader by hand and create new data structures where there isn’t a one-to-one mapping from Scheme to Python.

There are also two auxiliary goals for this project. Using Python should make it more accessible to programmers who are interested in language development but are unfamiliar with Scheme. I’m also using this project as a way to familiarize myself with branching and merging in git, so each post will correspond to a branch in the repository.

All the code for this project will be hosted on GitHub. The code is licensed under a BSD license if you are interested in forking it for any reason.

This series will focus on building a very simple interpreter for the purpose of learning the steps involved in building one. For this reason there will be little or no error checking or optimization. This port will be slightly more complicated than Scheme in Scheme, so if you are interested in an even simpler interpreter look here.

Part 0 | Reading from stdin

Part 1 | Numbers

Part 2 | Extending the read layer

Part 3 | Pairs and linked lists

Part 4 | Self-evaluating values

Part 5 | Assignment and define

Part 6 | scheme-syntax macro

Part 7 | Refactor and load

Part 8 | Primitive procedures and apply

Part 9 | Environments

Part 10 | lambda

* Most implementations of Scheme in Python use regular expressions in the reader. I chose to write a parser by hand so I could explain some of the details of parsing. As this is an educational exercise I think this is appropriate.
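
To give a feel for what reading by hand means, here is a minimal sketch of a regex-free reader. It is not the code from the series: the function name read, the use of plain Python lists for Scheme lists (the series builds its own pairs), and the restriction to integers and symbols are all assumptions made to keep the example short.

```python
def read(text, pos=0):
    """Read one expression from text starting at pos; return (value, next_pos)."""
    # Skip leading whitespace.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        raise SyntaxError("unexpected end of input")
    ch = text[pos]
    if ch == "(":
        # Read expressions recursively until the matching ")".
        items = []
        pos += 1
        while True:
            while pos < len(text) and text[pos].isspace():
                pos += 1
            if pos >= len(text):
                raise SyntaxError("missing closing parenthesis")
            if text[pos] == ")":
                return items, pos + 1
            item, pos = read(text, pos)
            items.append(item)
    if ch == ")":
        raise SyntaxError("unexpected closing parenthesis")
    # Atom: consume characters up to whitespace or a parenthesis.
    start = pos
    while pos < len(text) and not text[pos].isspace() and text[pos] not in "()":
        pos += 1
    token = text[start:pos]
    try:
        return int(token), pos  # integers become Python ints
    except ValueError:
        return token, pos       # everything else is treated as a symbol


expression, _ = read("(+ 1 (* 2 3))")
print(expression)
```

Returning the next position alongside the value lets the list case call read recursively without any shared state, which is one simple way to structure a hand-written reader.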

Other resources for writing a Scheme in Python

lis.py and lispy.py
Simple Schemes written by Peter Norvig.

Psyche
pyscheme
I haven’t had the chance to look at Psyche or pyscheme, but you may be interested in them as well.

Other resources for writing a Scheme in Scheme or other languages

Structure and Interpretation of Computer Programs
Chapter 4 onward covers designing and implementing a number of interpreters and was the inspiration for this interpreter.

An Incremental Approach to Compiler Construction (PDF)
Great paper on building a Scheme compiler from the ground up. Each step is simple and results in a fully working compiler.

Scheme from Scratch
The blog series that inspired and guided the development of the original Lispy. Even if you don’t know C (I didn’t at the time) you will still be able to follow along and construct your own Scheme. Peter’s coding style is easy and pleasant to read and he mentions tips for going in different directions for your own implementations.

A Self-Hosting Evaluator using HOAS (PDF)
An interesting implementation of Scheme using a Higher-Order Abstract Syntax representation. This paper, An Incremental Approach to Compiler Construction and SICP were the primary motivating forces behind my interest in PL design and implementation. The author, Eli Barzilay, has many other interesting papers at his site.

Chai – Math ∩ Programming
A series detailing the development of Chai (what appears to be a Scheme-like language). It is well written and currently in development. I’ll post more information when it’s available.

Scheme in Scheme
Another series that is just beginning about writing a bytecode interpreter. It appears to be put on hold as of April 2011.

Lisp in Scheme
An implementation of McCarthy’s original Lisp in Scheme.

Offline

Lisp in Small Pieces
Great book. Contains code for 11 interpreters and 2 compilers. Source code from the book is available here.

Scheme Macros, syntax-rules, Explicit Renaming and Links to More Insanity

This is a response to a question on Yahoo Answers about macros in Scheme. Apparently this was too long for YA, so I decided to post it here. Eventually I’ll rewrite this and make it more coherent; for now please bear with me. Note that there may be some errors here, so please point them out if you find them. I am not an expert macrologist.

Here’s the original question

I’m new to scheme/racket macros and here is a simplified version of the problem I’m having:

> (define-syntax let-five
    (syntax-rules ()
      [(let-five expr ...)
       (let ((five 5)) expr ...)]))
> (let-five (+ five 2))

This should return 7 but instead I get an error saying that the identifier “five” is unbound.
All I want let-five to do is to bind the identifier “five” to the number 5 so that I can use “five” within let-five. The reason I want to do this is more complicated than this example so don’t tell me that this macro doesn’t do anything useful.
I am fairly sure that I know what is wrong, but my question is what would be the correct way to write this code so that it does what I intend?
Additional Details
Jack Trades, I don’t understand what x, r, and c are and what the % and @ are. It would be nice if you could explain that.

And here’s my response…

I don’t use Racket, but Chicken gives the same error (but probably a different error message).

The problem here is that syntax-rules is a hygienic macro transformer, which basically means that the symbol five introduced by the macro is renamed by syntax-rules to avoid name clashes during expansion. You can see this clearly by looking at the error message (like I said, it’s probably different from yours)…

    <syntax>          (##core#let ((five119 5)) (+ five 2))

As you can see syntax-rules renames five to five119 (or something completely different). So when you try to use five it is unbound.

In Chicken I would use explicit renaming macros which preserve hygiene but allow you to break it where necessary. I’m not sure if Racket has this option (check into an implementation of syntax-case, that will allow you to do something similar).

(define-syntax (let-five x r c)
  (let ((%let (r 'let)))
    `(,%let ((five 5)) ,@(cdr x))))

(let-five (+ five 2))
;===> 7

***** EDIT *****

As I already said I don’t think Racket has explicit renaming macros so this probably won’t work for you in that case. You need to look into an implementation of syntax-case for Racket. This will provide you with the ability to do what you want. To the best of my knowledge this is not possible with syntax-rules.

That being said here’s the explanation you asked for…

x, r and c are parameter names the same as if you defined a function (lambda (x r c) …).

The x parameter is the expression, which would be (let-five (+ five 2)) in your example.

The r parameter is a renaming procedure. This is the “explicit renaming” part of the explicit renaming macro. To maintain hygiene you must use this procedure to rename every symbol that you use (except for the ones you don’t want renamed). You do this by calling (r 'let), for example.

The c parameter is a comparison function that can compare renamed symbols for equality.

You do not provide any of those parameters. Rather the implementation of ER macros does that for you. All you need to do is provide a function that accepts 3 parameters. Another equivalent way to write the ER macro above would be this…

(define-syntax let-five
  (lambda (expression rename compare)
    (let ((%let (rename 'let)))
      `(,%let ((five 5)) ,@(cdr expression)))))

The % sign is just an ordinary character no different from the #\l #\e or #\t. The reason I use that is to remind the reader that “%let” is the renamed symbol “let”. You could use (renamed-let (r 'let)) if you want instead of (%let (r 'let)).

The @ is the syntax for unquote-splicing. If you look at the last line in the ER macro above you’ll notice three forms of symbols (` , @).

` the backquote is syntax for (quasiquote expression). quasiquote is similar to the regular quote, except that instead of simply returning the whole expression unevaluated, it allows you to evaluate some things using the (,) syntax.

, the comma is syntax for (unquote expression). This tells Scheme to evaluate the expression that immediately follows it. So if we had the list `(1 (+ 1 1) ,(+ 1 2)), when it is evaluated it would return the list (1 (+ 1 1) 3). The expression (+ 1 1) was not evaluated because it was never unquoted; however, the expression (+ 1 2) was unquoted and so it was evaluated and returned 3.

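@ is always written after a comma, as ,@, and together they are syntax for (unquote-splicing expression). Splicing is a way to evaluate an expression that returns a list and then put the elements of that list into the quasiquoted list. In other words it weaves a list into the quasiquoted list.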
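
Since this blog also hosts the Scheme in Python series, here is a rough Python sketch of how the three pieces fit together. It is only an illustration, not how any Scheme implements quasiquote: Scheme lists are modelled as Python lists, symbols as strings, and the names quasiquote and evaluate, the toy evaluator, and the env dictionary are all assumptions. Nested quasiquotes are ignored.

```python
def evaluate(expr, env):
    """Toy evaluator (hypothetical): numbers, variable lookup, function calls."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]  # a symbol is looked up in the environment
    fn = evaluate(expr[0], env)
    args = [evaluate(arg, env) for arg in expr[1:]]
    return fn(*args)


def quasiquote(template, env):
    """Return template unevaluated, except for unquoted and spliced parts."""
    if not isinstance(template, list):
        return template  # atoms are returned as-is, like a regular quote
    if template and template[0] == "unquote":
        return evaluate(template[1], env)  # ,expr is evaluated
    result = []
    for item in template:
        if isinstance(item, list) and item and item[0] == "unquote-splicing":
            # ,@expr is evaluated and its elements are woven into the result
            result.extend(evaluate(item[1], env))
        else:
            result.append(quasiquote(item, env))
    return result


env = {"+": lambda a, b: a + b, "body": [["+", "five", 2]]}

# `(1 (+ 1 1) ,(+ 1 2))
print(quasiquote([1, ["+", 1, 1], ["unquote", ["+", 1, 2]]], env))
# → [1, ['+', 1, 1], 3]

# `(let ((five 5)) ,@body)
print(quasiquote(["let", [["five", 5]], ["unquote-splicing", "body"]], env))
# → ['let', [['five', 5]], ['+', 'five', 2]]
```

The second call mirrors the shape of the let-five expansion: the spliced list plays the role of (cdr x), the expressions that followed the macro name.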

Here is some more information on Scheme macro systems. I read each of these articles at least a dozen times and eventually it started to come together. Even now though, I’d say that I only really understand about 10-20% of it.

Scheme macros are difficult to understand, but coming to that understanding (even if it is limited) is incredibly rewarding. It will definitely make you a better Scheme programmer and will probably help you to realize why the vast majority of other languages are simply inadequate.

This is the easiest read and the highest level overview.
A Scheme Syntax Rules Primer

This is probably the most complete approach to macros from the relatively simple to the complex. Read this until you don’t understand what it’s talking about, then play around with some of the things you learned. Then read it again from the beginning until you don’t know what it’s talking about, then play around. Then read it again… Eventually you’ll make it at least half way through and by then you’ll know more about macros than you ever thought possible.
define-syntax Primer

Here’s an interesting (and long) post on Chicken’s macro system. This covers ER macros.
Macro Systems and Chicken (long)

And finally here’s an advanced discussion of macros that may just leave you without hair.
An Advanced Syntax-Rules Primer for the Mildly Insane

Introducing Evo: The Original Purpose of Lispy

When I first envisioned the Lispy project I had only one goal in mind, which I’ll get to soon. As I started writing code, first in Scheme, then C then again in Scheme, the Lispy project evolved into something else. It changed into a platform for me to learn more about programming and the implementation of language features. Writing Lispy, in all its variations, has been a very educational and rewarding experience, so much so that I consider the project an overwhelming success, even though I didn’t realize the original purpose.

So what was the original purpose of Lispy? I’ll give you the tag line that I wrote when Lispy was nothing but an idea… Lispy is a distributed, self-optimizing, program-by-example language.

What does that mean? When I first became interested in programming I had 2 goals: to create a program like MetaStock (I’ve since written pyTrade) and to create Artificial Intelligence. Modest goals, I know. In my AI research I came across genetic programming and immediately took a liking to it, probably because I was still new to writing code and the idea that I could write code that wrote code fascinated me.

The main problem with genetic programming is that it is often difficult to write fitness tests for your problems. Somewhere along the way I noticed that programmers were writing fitness tests all the time in the form of unit tests for their code. In addition, the fact that the CPU sits idle 99% of the time while we browse reddit kept rattling around in my head. Why not take advantage of those wasted cycles by using a programming language that used those cycles to optimize the code you just wrote?

Then I had another idea. Why write code at all? Why not just write the unit tests and let the code evolve on its own? It took me about an hour to realize that this wasn’t going to work on anything but the smallest applications. However, after another few weeks I realized that maybe it could work, given enough computers running Lispy.

Anyway I have about 25 half-finished papers on Evo which I might get around to finishing and posting here. I’ll give the highlights and post a link to the github account where you can find more info. Evo is meant to be integrated with Lispy (though I don’t know when I’ll get around to that).

Evo is an evolutionary-search-based function optimizer. It provides an interface for defining new modules and functions as well as an evolutionary-programming-based method for optimizing those functions. Evo does not use the standard generational approach to GP; rather, it uses “gene pools” that contain functions of indefinite life, from which new functions can be evolved and tested one at a time.

When defining an Evo function you can choose to optimize it for speed, space or code length. Each function can have multiple gene pools that are each optimized for a different purpose. Using gene pools instead of a single-population generational approach helps to avoid stagnation within the population, which is a common occurrence with the standard generational model.
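
To make the gene pool idea concrete, here is a small Python sketch. It is not Evo’s code and it is far simpler than evolving real functions: each candidate is just a pair of coefficients (a, b) standing for f(x) = a*x + b, and the names TESTS, error, mutate and evolve_one are invented for this example. What it does try to show is the shape described above: fitness tests that look like unit tests, pool members with no fixed lifetime, and new candidates evolved and tested one at a time instead of in generations.

```python
import random

# Fitness tests written like unit tests: (input, expected output) pairs.
# The target f(x) = 2*x + 1 is chosen only for illustration.
TESTS = [(0, 1), (1, 3), (2, 5), (3, 7)]


def error(candidate):
    """Total absolute error of a candidate (a, b), read as f(x) = a*x + b."""
    a, b = candidate
    return sum(abs(a * x + b - expected) for x, expected in TESTS)


def mutate(candidate, rng):
    """Return a copy of candidate with one coefficient nudged by -1 or +1."""
    a, b = candidate
    step = rng.choice([-1, 1])
    return (a + step, b) if rng.random() < 0.5 else (a, b + step)


def evolve_one(pool, rng):
    """Evolve and test a single new candidate, updating the pool in place."""
    parent = rng.choice(pool)
    child = mutate(parent, rng)
    worst = max(pool, key=error)
    # Members live indefinitely: one is replaced only when a better child appears.
    if error(child) < error(worst):
        pool[pool.index(worst)] = child


rng = random.Random(0)  # fixed seed so runs are repeatable
pool = [(rng.randint(-5, 5), rng.randint(-5, 5)) for _ in range(8)]
initial_best = min(error(c) for c in pool)

for _ in range(500):
    evolve_one(pool, rng)

best = min(pool, key=error)
print(best, error(best))
```

Because only the worst member is ever replaced, and only by something strictly better, the best error in the pool can never get worse from one step to the next. Optimizing for speed, space or code length would amount to swapping in a different error function for each pool.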

Evo is very much unfinished; however, it can successfully run on its own and evolve solutions to problems. Because of this I am going to put the code out there in case anyone wishes to contribute. I used the implementation of Evo as an excuse to firm up my understanding of closures, and as a result the majority of the program is written in a slightly OOP fashion. There is also a Tk-based GUI for browsing modules and adding new functions and fitness tests, though its use is completely optional.

Without further ado, here’s the link to the Evo repo.