Tail Call Optimization in Marco
One of the main goals of the Marco language is that the interpreter code should be very easy to understand. It should be possible for almost any programmer without experience developing programming languages to read the code and understand what’s going on at a high level.
Even though the current state of the code requires lots of refactoring (since I tend to experiment a lot with it), I’m proud to say that I’m still walking towards that goal.
I have recently added TCO to Marco, in a similar way to the previous trampoline post. Let me show you the two main consequences to code quality:
Interpreter Changes
Here is part of the code for the if
special form:
It should not be difficult to read:
condition
,thenClause
andelseClause
are positional arguments.condition
is always evaluated.- If the result of the condition is
true
, return a continuation for thethenClause
, otherwise return a continuation for theelseClause
.
Compare this to the Racket documentation for if
:
Evaluates test-expr. If it produces any value other than #f, then then-expr is evaluated, and its results are the result for the if form. Otherwise, else-expr is evaluated, and its results are the result for the if form. The then-expr and else-expr are in tail position with respect to the if form.
I like to see these concepts (and some more) directly mapped to the interpreter code.
Catch: You need to know that continuations are being used to implement tail calls. I could just make a class MarcoTailCall that inherits from MarcoContinuation, but I have doubts if that actually makes it clearer.
The New Collatz Implementation
This is the new Marco code for finding the max collatz sequence up to some number n
:
It doesn’t require any hacks or trampolines since TCO is now part of Marco. Much more readable than before.
Performance Comparison
These are the previous values:
100000: 58.31s user 0.48s system 102% cpu 57.191 total
500000: 336.71s user 1.01s system 104% cpu 5:24.38 total
This is now using TCO:
100000: 28.90s user 0.26s system 102% cpu 28.554 total
500000: 158.40s user 0.91s system 100% cpu 2:38.02 total
Its about twice as fast, slightly more as the number increases. Better performance with better code.
Comments