<p>Juan Ibiapina's personal website (<a href="https://juanibiapina.github.io/">https://juanibiapina.github.io/</a>), by Juan Ibiapina (juanibiapina@gmail.com). Generated with Jekyll; last updated 2018-10-12.</p>
<h1>Installing shell scripts from Github with Basher</h1>
<p>Juan Ibiapina, 2014-11-05. <a href="https://juanibiapina.github.io/articles/installing-scripts-from-github-with-basher">https://juanibiapina.github.io/articles/installing-scripts-from-github-with-basher</a></p>
<h3 id="the-problem">The problem</h3>
<p>In my <a href="http://juanibiapina.com/articles/2014-08-20-basher-a-package-manager-for-shell-scripts/">previous post about Basher</a> I mentioned that one of its goals was to install scripts directly from github with minimal manual intervention. Over the last month, I had to make some changes to get closer to that goal.</p>
<p>The <code class="highlighter-rouge">package.sh</code> file contains package information like binaries, completions, dependencies etc. The problem with having a package descriptor is that I need to fork every repo I want to install and add the file (and maybe send a pull request so the maintainer can add it). There is nothing wrong with this approach, and it is in fact what <a href="https://github.com/bpkg/bpkg">bpkg</a> does. This is partly what allows them to have custom install scripts per package.</p>
<p>On the other hand, what I really wanted was to see a package on github, run a one-liner and keep working.</p>
<p>Besides, sourcing the package.sh file was a silly security problem, since sourcing runs whatever code the file contains. In addition, I never felt really comfortable with having two different formats for packages (basher and bpkg). The discussion evolved but not fast enough for me.</p>
<p>So I got rid of it.</p>
<h3 id="conventions-to-the-rescue">Conventions to the rescue</h3>
<p>Now, basher looks for a <code class="highlighter-rouge">bin</code> folder and links any files in there. If there is no <code class="highlighter-rouge">bin</code> folder, it links all executable files in the package root.</p>
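<p>The convention can be sketched in a few lines of shell. This is only an illustration under assumptions, not basher’s actual code: the function name <code class="highlighter-rouge">link_package</code> and its two directory arguments are made up for the example.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh">#!/bin/sh
# Hypothetical sketch of the linking convention; not basher's real code.
# Usage: link_package PACKAGE_DIR TARGET_DIR (pass absolute paths, so the
# symlinks keep working from any directory).
link_package() {
  package_dir=$1
  target_dir=$2
  mkdir -p "$target_dir"

  if [ -d "$package_dir/bin" ]; then
    # A bin folder exists: link every regular file inside it.
    for file in "$package_dir"/bin/*; do
      if [ -f "$file" ]; then
        ln -sf "$file" "$target_dir/$(basename "$file")"
      fi
    done
  else
    # No bin folder: link only the executable files in the package root.
    for file in "$package_dir"/*; do
      if [ -f "$file" ]; then
        if [ -x "$file" ]; then
          ln -sf "$file" "$target_dir/$(basename "$file")"
        fi
      fi
    done
  fi
}</code></pre></figure>
<p>A rule like this is also why a package with no <code class="highlighter-rouge">bin</code> folder can end up with extra links: every executable in the root gets linked, wanted or not.</p>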
<p>With this change, I’ve been very happy installing packages with one-liners.</p>
<h3 id="working-packages">Working packages</h3>
<ol>
<li><strong>sstephenson/bats</strong> Bash Automated Testing System</li>
<li><strong>pote/gpm</strong> Go package manager</li>
<li><strong>pote/gvp</strong> Go versioning packager</li>
<li><strong>treyhunner/tmuxstart</strong> named tmux sessions</li>
<li><strong>zsh-users/antigen</strong> plugin manager for zsh</li>
<li><strong>bripkens/dock</strong> easy bootstrapper using docker</li>
</ol>
<h3 id="kind-of-working-packages">Kind of working packages</h3>
<ol>
<li><strong>holman/spark</strong> it also links an extra <code class="highlighter-rouge">test</code> executable</li>
<li><strong>hecticjeff/shoreman</strong> binary ends with ‘.sh’</li>
</ol>
<h3 id="future-work">Future work</h3>
<p>I want to automatically remove the ‘.sh’ from binaries. Also figure out how to install completions for packages.</p>
<h1>Writing a Language</h1>
<p>Juan Ibiapina, 2014-10-03. <a href="https://juanibiapina.github.io/articles/writing-a-language">https://juanibiapina.github.io/articles/writing-a-language</a></p>
<p>In this tutorial I’m gonna show you how to write a very simple programming language called Ygor. The language itself is just a placeholder for what I really want to show, which is how to get started with language development in Racket.</p>
<h2 id="the-ygor-language-definition">The Ygor language definition</h2>
<p>Ygor has only one type: integers. There is only one function <code class="highlighter-rouge">sum</code>. The language won’t have any syntax. You will input the abstract syntax tree directly.</p>
<p>Here is what Ygor will look like:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span>
<span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">5</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">6</span><span class="p">))</span></code></pre></figure>
<h2 id="racket-introduction">Racket introduction</h2>
<p>To develop Ygor, we’re gonna use <a href="http://racket-lang.org/">racket</a>. Racket has a very interesting framework for developing custom languages that builds on the power of lisp macros.</p>
<p>Download racket and add the <code class="highlighter-rouge">bin</code> folder to your <code class="highlighter-rouge">PATH</code>. That should give you access to <code class="highlighter-rouge">racket</code>, <code class="highlighter-rouge">drracket</code>, <code class="highlighter-rouge">raco</code> and other useful stuff.</p>
<p>Even though you can use any editor, I seriously recommend DrRacket, which comes with the racket distribution.</p>
<p>In DrRacket you’ll see an editor on top, and the REPL on the bottom. Let’s try a simple program:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">racket</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">square</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">*</span> <span class="nv">x</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="k">define</span> <span class="nv">value</span> <span class="mi">4</span><span class="p">)</span>
<span class="p">(</span><span class="nf">square</span> <span class="nv">value</span><span class="p">)</span></code></pre></figure>
<p>Racket is a lisp dialect based on scheme. There, end of introduction. The racket website has loads of very good documentation. Have fun.</p>
<h2 id="creating-the-initial-project">Creating the initial project</h2>
<p>We’ll start by creating a directory for the language:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span><span class="nb">cd</span> ~/development
<span class="nv">$ </span>mkdir ygor
<span class="nv">$ </span>raco link ygor</code></pre></figure>
<p>The last command will link the <code class="highlighter-rouge">ygor</code> directory to the racket collections, making it perfect for development. To test it, let’s try requiring a sample file from that module. Create a file <code class="highlighter-rouge">ygor/hello.rkt</code> with the following content:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">racket</span>
<span class="s">"Polka will never die."</span></code></pre></figure>
<p>And test it like this (in DrRacket):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">require</span> <span class="nv">ygor/hello</span><span class="p">)</span></code></pre></figure>
<p>We’re ready to start.</p>
<h2 id="the-hard-way">The hard way</h2>
<p>This tutorial will be mostly backwards compared to other racket language tutorials you’ll find. Instead of beginning by defining the lexer and parser (or even grammar), we’ll start with how to tell racket to treat ygor as a language. I find this approach much more practical. Later you can choose to focus on any of the steps with the appropriate depth. There is a great tutorial on <a href="http://www.hashcollision.org/brainfudge/">how to implement brainf*ck with racket</a>.</p>
<p>In DrRacket, try the following code:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span></code></pre></figure>
<p>And hit run. You should get something like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Module Language: invalid module text
standard-module-name-resolver: collection not found
collection: "ygor/lang"
in collection directories:
/Users/juanibiapina/Library/Racket/5.3.6/collects
/Applications/Racket v5.3.6/collects
sub-collection: "lang"
in parent directories:
/Volumes/development/ygor
</code></pre></div></div>
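<p>That means racket is looking for a <code class="highlighter-rouge">lang</code> collection inside ygor. Let’s make one (run this from inside the <code class="highlighter-rouge">ygor</code> directory):</p>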
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>mkdir lang</code></pre></figure>
<p>And running again:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Module Language: invalid module text
. . ../../Applications/Racket v5.3.6/collects/mred/private/snipfile.rkt:324:2: open-input-file: cannot open input file
path: /Volumes/development/ygor/lang/reader.rkt
system error: No such file or directory; errno=2
</code></pre></div></div>
<p>So racket is looking for a reader.rkt file. Go ahead and create one. We’ll use a <a href="http://docs.racket-lang.org/syntax/reader-helpers.html#%28mod-path._syntax%2Fmodule-reader%29">module reader</a>, so add the following lines to <code class="highlighter-rouge">reader.rkt</code>:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">s-exp</span> <span class="nv">syntax/module-reader</span>
<span class="nv">ygor</span></code></pre></figure>
<p>The second line there tells racket to look for a file <code class="highlighter-rouge">main.rkt</code> inside the ygor collection. This file contains a module that provides all the top level bindings that will build the language. Let’s provide some initial content in main.rkt:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">racket</span></code></pre></figure>
<p>Let’s go back to our example and try to run a fake ygor program:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span>
<span class="p">(</span><span class="nf">hello?</span><span class="p">)</span></code></pre></figure>
<p>You should get something like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module: no #%module-begin binding in the module's language
</code></pre></div></div>
<p>Let’s just provide #%module-begin for now, we’ll get back to it later (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="o">#</span><span class="nv">%module-begin</span><span class="p">)</span></code></pre></figure>
<p>Trying again:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hello?: unbound identifier;
also, no #%app syntax transformer is bound in: hello?
Interactions disabled: ygor does not support a REPL (no #%top-interaction)
</code></pre></div></div>
<p>It tells you <code class="highlighter-rouge">hello?</code> is not defined. Let’s ignore the other errors for now.</p>
<p>In order to define what <code class="highlighter-rouge">hello?</code> is, we need to provide this definition. Add these two lines (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="nv">hello?</span><span class="p">)</span>
<span class="p">(</span><span class="k">define-syntax-rule</span> <span class="p">(</span><span class="nf">hello?</span><span class="p">)</span>
<span class="p">(</span><span class="nb">print</span> <span class="s">"hello to you too!"</span><span class="p">))</span></code></pre></figure>
<p>And try running again. It should print “hello to you too!” to standard output. This is your first working version of a language that says hello. No kidding.</p>
<h2 id="the-repl">The REPL</h2>
<p>The previous error message said something about ygor not supporting a REPL. A simple way to get it going is to just provide <code class="highlighter-rouge">#%top-interaction</code> straight from racket. Add this line to <code class="highlighter-rouge">main.rkt</code>:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="o">#</span><span class="nv">%top-interaction</span><span class="p">)</span></code></pre></figure>
<p>Now if you run an ygor program from DrRacket, you get a REPL. From now on you can test all the examples directly there.</p>
<h2 id="const">const</h2>
<p>The next step is to allow the user to write ygor programs in the form of an abstract syntax tree. That means there won’t be any program “text” to parse. The user directly inputs the syntax tree that will be evaluated. So let’s write a simple program:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span>
<span class="p">(</span><span class="nf">const</span> <span class="mi">5</span><span class="p">)</span></code></pre></figure>
<p>If you run this, you’ll get “const: unbound identifier;”. Let’s define <code class="highlighter-rouge">const</code>. We’ll create syntactic forms as structs in racket: (in main.rkt)</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="nv">const</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">const</span> <span class="p">(</span><span class="nf">v</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span></code></pre></figure>
<p>Constants will be represented as structs that hold a value <code class="highlighter-rouge">v</code>. We also export this struct with the provide clause. Running again:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const1: function application is not allowed;
no #%app syntax transformer is bound in: (const1 5)
</code></pre></div></div>
<p>For racket to understand function applications (in this case <code class="highlighter-rouge">const</code> is a function that takes one argument and returns a struct), the <code class="highlighter-rouge">#%app</code> form must be bound in the language. Let’s bring it from racket (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="o">#</span><span class="nv">%app</span><span class="p">)</span></code></pre></figure>
<p>Running again:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>?: literal data is not allowed;
no #%datum syntax transformer is bound in: 5
</code></pre></div></div>
<p>Same deal for literal data. Racket needs the <code class="highlighter-rouge">#%datum</code> form in order to understand literal data. Let’s provide it (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="o">#</span><span class="nv">%datum</span><span class="p">)</span></code></pre></figure>
<p>And running again, you can see it returns itself.</p>
<h2 id="sum">sum</h2>
<p>Let’s add a <code class="highlighter-rouge">sum</code> function. First let’s sketch the syntax tree for a sum:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span>
<span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">42</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">1</span><span class="p">))</span></code></pre></figure>
<p>We’ll need to define what <code class="highlighter-rouge">sum</code> is (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="nv">sum</span><span class="p">)</span>
<span class="p">(</span><span class="nf">struct</span> <span class="nv">sum</span> <span class="p">(</span><span class="nf">e1</span> <span class="nv">e2</span><span class="p">)</span> <span class="nt">#:transparent</span><span class="p">)</span></code></pre></figure>
<p><code class="highlighter-rouge">sum</code> is a struct that holds two other expressions. Run again (or try in the REPL):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">2</span><span class="p">))</span></code></pre></figure>
<p>Which returns itself, of course.</p>
<p>So at this point, we can type the AST of an Ygor program, and it will evaluate to itself. How can we make Ygor programs runnable?</p>
<h2 id="eval">eval</h2>
<p>Let’s define a function to evaluate an Ygor program. We’ll call it <code class="highlighter-rouge">ygor-eval</code> (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="nv">ygor-eval</span><span class="p">)</span>
<span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="nf">ygor-eval</span> <span class="nv">e</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">e</span>
<span class="p">[(</span><span class="nf">const</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="nv">x</span><span class="p">)]</span>
<span class="p">[(</span><span class="nf">sum</span> <span class="nv">e1</span> <span class="nv">e2</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nf">const-v</span> <span class="p">(</span><span class="nf">ygor-eval</span> <span class="nv">e1</span><span class="p">))</span> <span class="p">(</span><span class="nf">const-v</span> <span class="p">(</span><span class="nf">ygor-eval</span> <span class="nv">e2</span><span class="p">))))]))</span></code></pre></figure>
<p>In order to evaluate an expression <code class="highlighter-rouge">e</code>, we match this expression against the two possible cases in Ygor:</p>
<ol>
<li><code class="highlighter-rouge">e</code> is a <code class="highlighter-rouge">const</code> with value <code class="highlighter-rouge">x</code>: returns itself</li>
<li><code class="highlighter-rouge">e</code> is a <code class="highlighter-rouge">sum</code> of two other expressions: returns a <code class="highlighter-rouge">const</code> holding the sum (racket <code class="highlighter-rouge">+</code>) of the results of recursively evaluating both expressions (assuming they evaluate to <code class="highlighter-rouge">const</code>).</li>
</ol>
<p>Try running this code now:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="o">#</span><span class="nv">lang</span> <span class="nv">ygor</span>
<span class="p">(</span><span class="nf">ygor-eval</span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">42</span><span class="p">)</span> <span class="p">(</span><span class="nf">const</span> <span class="mi">1</span><span class="p">)))</span></code></pre></figure>
<p>And you should get <code class="highlighter-rouge">(const 43)</code>.</p>
<h2 id="hooking-up-eval">Hooking up eval</h2>
<p>We wouldn’t like to write every line in Ygor prefixed with <code class="highlighter-rouge">ygor-eval</code>. Let’s add a hook to automatically wrap every expression with <code class="highlighter-rouge">ygor-eval</code>. To do that, we’ll override <code class="highlighter-rouge">#%module-begin</code>, a form that racket automatically wraps around the body of a module, which is very convenient (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">define-syntax</span> <span class="p">(</span><span class="nf">ygor-module-begin</span> <span class="nv">stx</span><span class="p">)</span>
<span class="p">(</span><span class="nf">datum->syntax</span>
<span class="nv">stx</span>
<span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="k">quote-syntax</span> <span class="o">#</span><span class="nv">%module-begin</span><span class="p">)</span>
<span class="p">(</span><span class="nb">map</span> <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nf">e</span><span class="p">)</span>
<span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="k">quote-syntax</span> <span class="nv">ygor-eval</span><span class="p">)</span>
<span class="nv">e</span><span class="p">))</span>
<span class="p">(</span><span class="nb">cdr</span> <span class="p">(</span><span class="nb">syntax-e</span> <span class="nv">stx</span><span class="p">))))</span>
<span class="nv">stx</span>
<span class="nv">stx</span><span class="p">))</span></code></pre></figure>
<p>Remember how before we just provided <code class="highlighter-rouge">#%module-begin</code> from racket? Let’s replace that provide with one that exports our own version, defined above, under the name <code class="highlighter-rouge">#%module-begin</code> (in main.rkt):</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">provide</span> <span class="p">(</span><span class="nf">rename-out</span> <span class="p">[</span><span class="nf">ygor-module-begin</span> <span class="o">#</span><span class="nv">%module-begin</span><span class="p">]))</span></code></pre></figure>
<p>The workings of <code class="highlighter-rouge">ygor-module-begin</code> are not very interesting for our purposes right now, but the idea is basically this: wrap every statement in the module body with <code class="highlighter-rouge">ygor-eval</code>. You can test now that any Ygor program you run will automatically be evaluated (unless you type it in the REPL, in which case it will still just print the AST, because we haven’t changed how the REPL works).</p>
<h2 id="becoming-useful">Becoming useful</h2>
<p>There are a few things I’ve done in this tutorial that you wouldn’t actually do when writing your own language. On the other hand, this setup is the simplest one I could find that easily integrates with MUPL, the language you write for the <a href="https://www.coursera.org/course/proglang">Programming Languages course on coursera</a>, which I seriously recommend to every programmer.</p>
<p>From this setup, you can replace the struct definitions I have given with the ones from the course and replace <code class="highlighter-rouge">ygor-eval</code> with <code class="highlighter-rouge">eval-exp</code>, from one of the course exercises.</p>
<p>The full code can be found <a href="https://github.com/juanibiapina/ygor">on github</a>. Have fun.</p>
<h1>Postgres command line</h1>
<p>Juan Ibiapina, 2014-09-13. <a href="https://juanibiapina.github.io/articles/postgres-command-line">https://juanibiapina.github.io/articles/postgres-command-line</a></p>
<p>I find the postgres command line tools slightly annoying to use, so I wrote <a href="https://github.com/juanibiapina/pg">pg</a>.</p>
<p>With it, I can run commands in a more natural way. For instance:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>pg list</code></pre></figure>
<p>Lists databases (same as <code class="highlighter-rouge">psql -l</code>), or, with <code class="highlighter-rouge">--short</code>, prints only their names.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>pg drop db_name</code></pre></figure>
<p>Drops a database. The only difference between this and <code class="highlighter-rouge">dropdb</code> is that it always uses <code class="highlighter-rouge">--if-exists</code>.</p>
<p>Or more interesting things:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>pg mv origin target</code></pre></figure>
<p>Renames a database after killing all connections to <code class="highlighter-rouge">origin</code>.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>pg copy origin target</code></pre></figure>
<p>Creates a “copy” of a database after killing all connections to <code class="highlighter-rouge">origin</code>. I use this to create fast snapshots that I can go back to later, much like version control.</p>
<p>These are very simple wrappers around <code class="highlighter-rouge">createdb</code>, <code class="highlighter-rouge">psql</code> etc., but since I was starting to repeat myself a lot, I decided it was worth it.</p>
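<p>As a rough idea of what a rename or copy involves, here is a sketch that only prints SQL. It is not taken from the pg source: the function names are made up, and it assumes PostgreSQL 9.2 or later, where <code class="highlighter-rouge">pg_stat_activity</code> has a <code class="highlighter-rouge">pid</code> column.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh">#!/bin/sh
# Hypothetical sketch, not the real pg source: print the SQL that a rename
# or a copy would need. Database names are not escaped; illustration only.

# SQL that terminates every other connection to database $1.
kill_connections_sql() {
  printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s' AND pid != pg_backend_pid();\n" "$1"
}

# SQL for renaming database $1 to $2.
mv_sql() {
  kill_connections_sql "$1"
  printf 'ALTER DATABASE "%s" RENAME TO "%s";\n' "$1" "$2"
}

# SQL for creating database $2 as a copy of database $1.
copy_sql() {
  kill_connections_sql "$1"
  printf 'CREATE DATABASE "%s" TEMPLATE "%s";\n' "$2" "$1"
}</code></pre></figure>
<p>The output could be piped to <code class="highlighter-rouge">psql</code> while connected to some other database, since postgres refuses to rename the current database and refuses to use a template that still has other sessions connected. That second restriction is the reason for killing connections first.</p>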
<p>You can find the code on <a href="https://github.com/juanibiapina/pg">github</a>, along with tests, yes.</p>
<h1>A Package Manager For Shell Scripts</h1>
<p>Juan Ibiapina, 2014-08-20. <a href="https://juanibiapina.github.io/articles/basher-a-package-manager-for-shell-scripts">https://juanibiapina.github.io/articles/basher-a-package-manager-for-shell-scripts</a></p>
<p>In this post I’ll introduce <a href="https://github.com/basherpm/basher">basher</a>, a package manager for shell scripts.</p>
<h2 id="motivation">Motivation</h2>
<p>I do some development using bash. Not a <a href="https://github.com/avleen/bashttpd">web server</a>, or a <a href="https://github.com/dominictarr/JSON.sh">json parser</a>, because that’s not the point of bash (although fun!), but there are many tasks that are actually faster to do in bash.</p>
<p>When you write a bash script, it is usually a single file. All you have to do is put it somewhere in your PATH and you’re done. You can just keep using it and forget about it. Unless you have multiple machines.</p>
<p>For me, the main problem with this approach is the lack of version control. I tend to get lost easily during development if I don’t have the checkpoints a VCS provides. I need to be able to go back and forth in time and experiment. With the script in a version controlled directory, I only need to link the bin somewhere in the PATH. I can also publish to github.</p>
<p>It is very difficult to write bash scripts and get it right the first time. You might forget a space before a <code class="highlighter-rouge">]</code>, or you might put an extra space after an <code class="highlighter-rouge">=</code>. Or you might forget some quotes. The solution I found to this problem is TDD, of course. There is a testing framework for bash called <a href="https://github.com/sstephenson/bats">bats</a>, and it makes testing bash scripts as easy as it gets.</p>
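<p>To make those pitfalls concrete, here is a tiny example. The variable name and value are made up; the working form runs, and the broken variants are left as comments.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh">#!/bin/sh
# Illustration only; the variable name and its value are made up.
name="two words"    # no spaces around = in an assignment

# Spaces are required after [ and before ], and "$name" must be quoted.
if [ "$name" = "two words" ]; then
  echo "match"
fi

# Broken variants, left as comments:
#   name= "two words"                  tries to run a command called "two words"
#   if [ "$name" = "two words"]; ...   [ never finds its closing ]
#   if [ $name = "two words" ]; ...    $name splits into two arguments</code></pre></figure>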
<p>On the other hand, now you have to install a third party bash script (bats itself is written in bash). It’s not just a single file you can copy, but the creators of bats are cool enough to provide a simple, reliable and well made install.sh script that copies the binaries to the right place. Still, you have to clone the project and find out how to install it. Remember you might need sudo.</p>
<p>Bats is also on brew, so that’s simpler if you’re on osx. There is probably something packaged for linux too, but which distros? Older install instructions for bats told you to clone the repo to <code class="highlighter-rouge">~/.bats</code> and add its <code class="highlighter-rouge">bin</code> directory to your path. That’s what I had to do for a while. In the end, that was the most reliable approach that I could reproduce across all my machines.</p>
<p>I want to publish a couple of bash scripts myself, but I don’t really want to maintain installation instructions for osx and a bunch of linux distros. I want to think of a bash script as a package for bash, just like ruby has gems.</p>
<p>I also want to go to github, find a script I like, install it with one line and use it. If one of my machines doesn’t have that script, all I need to do is run that one line and get it. No messing with my PATH either.</p>
<h2 id="basher">Basher</h2>
<p>So I wrote <a href="https://github.com/basherpm/basher">basher</a>. With it installed, you can do this:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>basher install basherpm/bats</code></pre></figure>
<p>There, <code class="highlighter-rouge">bats</code> is in your PATH ready to be used. Whatever OS.</p>
<p>The install command looks for a repository on github, clones it to a known location and links the binaries to a place in your PATH.</p>
<p>I can also list installed packages easily:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>basher list</code></pre></figure>
<p>And uninstall (with completion support):</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>basher uninstall basherpm/bats</code></pre></figure>
<p>Or check for outdated packages:</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh"><span class="nv">$ </span>basher outdated</code></pre></figure>
<p>Check the <code class="highlighter-rouge">commands</code> command for a full list. Try also <code class="highlighter-rouge">basher help &lt;command&gt;</code>.</p>
<h2 id="shell-support">Shell support</h2>
<p>Basher needs to be available in your PATH, of course, but it also needs to add one entry to the PATH, where all binaries will be linked. It does this by hooking into the shell, similarly to what <a href="https://github.com/sstephenson/rbenv">rbenv</a> does. The biggest difference is that <a href="https://github.com/sstephenson/rbenv">rbenv</a> modifies the PATH on the fly, according to what ruby version should be used. Basher is simpler, because it needs only one location added to the PATH.</p>
<p>If you try running <code class="highlighter-rouge">basher init -</code>, you’ll see what gets added to the shell. This code is generated on the fly, according to what shell you use. It supports bash, zsh (might support any POSIX compliant) and even fish (thanks to <a href="https://github.com/jvortmann">João Vortmann</a>).</p>
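<p>The general shape of such a hook, for a POSIX shell, might look like the sketch below. This is not the real output of <code class="highlighter-rouge">basher init -</code>: the helper name <code class="highlighter-rouge">add_to_path</code> and the directory <code class="highlighter-rouge">~/.basher/bin</code> are assumptions made for illustration.</p>
<figure class="highlight"><pre><code class="language-sh" data-lang="sh">#!/bin/sh
# Hypothetical sketch of a shell hook; NOT the real output of `basher init -`.

# Prepend directory $1 to PATH unless it is already listed there.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present: leave PATH alone
    *) PATH="$1:$PATH" ;;     # otherwise put it in front
  esac
}

# Assumed link directory, chosen only for this example.
add_to_path "$HOME/.basher/bin"
export PATH</code></pre></figure>
<p>Because the check makes the hook idempotent, it is safe to load it from a startup file that may be sourced more than once. rbenv loads its own hook in a similar spirit with <code class="highlighter-rouge">eval "$(rbenv init -)"</code>.</p>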
<h2 id="packages">Packages</h2>
<p>Basher packages are simply github repos that have a <code class="highlighter-rouge">package.sh</code> file. This file defines what binaries need to be linked when installing and any package dependencies.</p>
<p>There is also experimental support for package runtimes. It allows you to <code class="highlighter-rouge">require juanibiapina/gg</code> (given this package is installed) in order to make some functions available to your shell. This might change in the future, but I find it very promising.</p>
<h2 id="why-not-bpkg-">Why not bpkg?</h2>
<p>I was using basher for a while, but did not plan on releasing it, when <a href="http://www.bpkg.io/">bpkg</a> came out. At that point basher was very simple, and bpkg seemed to offer much more. I tried using it, but decided to go back and release basher instead.</p>
<p>Bpkg uses a <code class="highlighter-rouge">package.json</code> file to define packages. I assumed it was the same format as npm packages, so I got really confused when I checked the documentation for <code class="highlighter-rouge">package.json</code> and found out bpkg implements it incorrectly. It turns out it is a different format, just with the same name. There is some unfinished discussion going on in <a href="https://github.com/bpkg/bpkg/issues/17">this issue</a>. Overall, I find the choice of name and format confusing.</p>
<p>Bpkg also doesn’t keep the package repos on your local machine. Instead, it clones the repo, installs the binaries then removes the repo. I wanted to have local copies of each repo.</p>
<p>Bpkg uses whatever install script is provided by the package. This allows a package to install itself however it wants, including man pages or whatever is needed. On the other hand, it makes it difficult for me to track what has been installed and where. I don’t like installs without the corresponding uninstall, so basher does not provide support for custom install scripts. It would be interesting to add mechanisms for installing man pages, completions etc. That way, any package with a properly defined package.sh would be automatically installable and uninstallable.</p>
<p>Instead, basher keeps everything under <code class="highlighter-rouge">~/.basher</code>. You can remove this directory and get rid of everything easily.</p>
<p>Bpkg has support for github releases. You can choose to install a specific version. It also has support for local or global installs, like npm. Basher lacks these features, but they can be added if the need comes. Pull requests are welcome.</p>
<p>Basher is <a href="https://travis-ci.org/basherpm/basher">thoroughly tested</a>. I failed to make a pull request to bpkg because I lacked confidence.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This has solved many of my shell scripting problems. I can have version controlled scripts automatically linked to the PATH without any changes, and I can install scripts from github. It works the same way on my osx, ubuntu and debian machines (it might work anywhere, I just need to test). I have one central location where I install scripts. I can easily check for new versions of scripts without checking one by one. I can see the changelog for each script (since I have the full repo cloned locally).</p>
<p>It also helps me develop scripts. Check <code class="highlighter-rouge">basher new</code> or <code class="highlighter-rouge">basher new-command</code>.</p>
<p>There is still much that can be done. Pull requests and feedback are very welcome.</p>Juan Ibiapinajuanibiapina@gmail.comIn this post I’ll introduce basher, a package manager for shell scripts.Project Euler - Problem 12014-08-07T00:00:00+00:002014-08-07T00:00:00+00:00https://juanibiapina.github.io/articles/project-euler-problem-1<p>In this post I’ll show how to solve the first problem on Project Euler using Marco, while at the same time introducing some new features of the language.</p>
<p>Disclaimer: Although I’m about to provide one way to solve problem 1 (maybe problem 2 in the future), my intention is to demonstrate how to write a simple algorithm in Marco while at the same time trying to get you interested in programming challenges. If you think I’m about to spoil the first problem, try one of the following:</p>
<ol>
<li>Solve it in your favorite language before reading this.</li>
<li>Read this solution, but try to solve it using infinite streams.</li>
<li>Try one of the other 475 problems.</li>
</ol>
<h2 id="the-problem">The problem</h2>
<blockquote>
<p>If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.</p>
<p>Find the sum of all the multiples of 3 or 5 below 1000.</p>
</blockquote>
<p>I’ll try to break the code down in terms of some features of Marco.</p>
<h2 id="modules">Modules</h2>
<p>Modules are the main blocks for code organization in Marco. Let’s require the necessary modules for our solution:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:io</span> <span class="p">(</span><span class="k">require</span> <span class="s">"io"</span><span class="p">))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:integer</span> <span class="p">(</span><span class="k">require</span> <span class="s">"integer"</span><span class="p">))</span></code></pre></figure>
<p>The <code class="highlighter-rouge">io</code> module has some input and output functions. The <code class="highlighter-rouge">integer</code> module has functions for parsing and generating integers.</p>
<p>Notice how <code class="highlighter-rouge">require</code> doesn’t actually do anything to the environment; it has no side effects. The result of calling it is an anonymous module that we store in a binding. Also notice how <code class="highlighter-rouge">require</code> is a regular function with no special properties; it can be used anywhere a function can be used.</p>
<h2 id="member-access">Member access</h2>
<p>In order to access members of modules, we use the dot notation:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:n</span> <span class="p">(</span><span class="nf">integer</span><span class="o">.</span><span class="nv">parse</span> <span class="p">(</span><span class="nf">io</span><span class="o">.</span><span class="nv">read-line</span> <span class="nv">io</span><span class="o">.</span><span class="nv">stdin</span><span class="p">)))</span></code></pre></figure>
<p>This will bind <code class="highlighter-rouge">n</code> to the result of parsing an integer from a line of input.</p>
<h2 id="conditional-and-recursion">Conditional and Recursion</h2>
<p>Let’s define a function to sum all numbers in a list:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:sum</span> <span class="p">(</span><span class="nf">function</span> <span class="p">[</span><span class="nf">:list</span><span class="p">]</span> <span class="p">{</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">list</span><span class="p">)</span> <span class="p">{</span> <span class="mi">0</span> <span class="p">}</span>
<span class="p">{</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">list</span><span class="p">)</span> <span class="p">(</span><span class="nf">recurse</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">list</span><span class="p">)))</span> <span class="p">})</span>
<span class="p">}))</span></code></pre></figure>
<p><code class="highlighter-rouge">sum</code> is a function that takes a list. In case the list is nil (the empty list), we say the sum is zero. Otherwise, add the head of the list to the sum of its tail.</p>
<p>Notice the <code class="highlighter-rouge">recurse</code> binding being used. Since all functions are anonymous in Marco, there is currently no way to make a function call itself recursively by name (because it doesn’t have one!). The <code class="highlighter-rouge">recurse</code> binding is one way to do it, although I dislike how implicit it is (among other problems).</p>
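<p>For readers who think more easily in Java, the same recursion can be sketched roughly as follows. This is only an illustration: the <code class="highlighter-rouge">SumSketch</code> and <code class="highlighter-rouge">Cell</code> names are hypothetical, <code class="highlighter-rouge">null</code> stands in for nil, and none of it is taken from the Marco interpreter.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java">// Hypothetical sketch: a minimal immutable list and a recursive sum over it.
// The names are illustrative and are not part of the Marco interpreter.
final class SumSketch {
    // A cons cell; null plays the role of nil (the empty list).
    static final class Cell {
        final int head;
        final Cell tail;

        Cell(int head, Cell tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // Empty list sums to zero; otherwise add the head to the sum of the tail.
    static int sum(Cell list) {
        if (list == null) {
            return 0;
        }
        return list.head + sum(list.tail);
    }

    public static void main(String[] args) {
        Cell list = new Cell(3, new Cell(5, new Cell(6, new Cell(9, null))));
        System.out.println(sum(list)); // prints 23
    }
}</code></pre></figure>
<p>Unlike the Marco version, the Java method can call itself by name, which is exactly the convenience that <code class="highlighter-rouge">recurse</code> has to replace.</p>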
<h2 id="blocks-and-lazy-evaluation">Blocks and Lazy Evaluation</h2>
<p>Let’s define a function to check whether a number should be included in the final sum:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:include?</span> <span class="p">(</span><span class="nf">function</span> <span class="p">[</span><span class="nf">:n</span><span class="p">]</span> <span class="p">{</span>
<span class="p">(</span><span class="k">or</span> <span class="p">{</span> <span class="p">(</span><span class="nb">=</span> <span class="p">(</span><span class="nf">%</span> <span class="nv">n</span> <span class="mi">3</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span> <span class="p">}</span> <span class="p">{</span> <span class="p">(</span><span class="nb">=</span> <span class="p">(</span><span class="nf">%</span> <span class="nv">n</span> <span class="mi">5</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span> <span class="p">})</span>
<span class="p">}))</span></code></pre></figure>
<p>Notice the <code class="highlighter-rouge">or</code> function. It takes two arguments, both being blocks. It invokes the first one in a lexical scope; if it returns true, it short circuits and never really invokes the second.</p>
<p>Blocks are how you perform delayed evaluation in Marco. Any code that needs to be passed around or might not run at all must be in a block. You can be sure that any code that is not inside a block won’t have any unexpected magic in it.</p>
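<p>A rough Java analogy, assuming blocks behave like zero-argument lambdas: the sketch below models <code class="highlighter-rouge">or</code> as a plain method that receives two <code class="highlighter-rouge">BooleanSupplier</code> values. The class name and the <code class="highlighter-rouge">secondBlockRuns</code> counter are hypothetical and exist only to make the short circuit observable.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java">import java.util.function.BooleanSupplier;

// Hypothetical sketch of delayed evaluation; not Marco's implementation.
final class LazyOrSketch {
    // Counts how often the second block was actually invoked.
    static int secondBlockRuns = 0;

    // Both arguments are blocks; the second one runs only if the first is false.
    static boolean or(BooleanSupplier first, BooleanSupplier second) {
        if (first.getAsBoolean()) {
            return true;
        }
        return second.getAsBoolean();
    }

    static boolean include(int n) {
        return or(
                () -> n % 3 == 0,
                () -> {
                    secondBlockRuns++;
                    return n % 5 == 0;
                });
    }

    public static void main(String[] args) {
        System.out.println(include(9));       // prints true
        System.out.println(secondBlockRuns);  // prints 0, the second block never ran
    }
}</code></pre></figure>
<p>As in Marco, the delayed parts are visible at the call site: anything written as a lambda may run later or not at all, and everything else is evaluated right away.</p>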
<h2 id="putting-it-all-together">Putting it all together</h2>
<p>This is the final solution including generating the final result using <code class="highlighter-rouge">filter</code> and printing it:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:io</span> <span class="p">(</span><span class="k">require</span> <span class="s">"io"</span><span class="p">))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:integer</span> <span class="p">(</span><span class="k">require</span> <span class="s">"integer"</span><span class="p">))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:n</span> <span class="p">(</span><span class="nf">integer</span><span class="o">.</span><span class="nv">parse</span> <span class="p">(</span><span class="nf">io</span><span class="o">.</span><span class="nv">read-line</span> <span class="nv">io</span><span class="o">.</span><span class="nv">stdin</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:sum</span> <span class="p">(</span><span class="nf">function</span> <span class="p">[</span><span class="nf">:list</span><span class="p">]</span> <span class="p">{</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">list</span><span class="p">)</span> <span class="p">{</span> <span class="mi">0</span> <span class="p">}</span>
<span class="p">{</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">list</span><span class="p">)</span> <span class="p">(</span><span class="nf">recurse</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">list</span><span class="p">)))</span> <span class="p">})</span>
<span class="p">}))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:include?</span> <span class="p">(</span><span class="nf">function</span> <span class="p">[</span><span class="nf">:n</span><span class="p">]</span> <span class="p">{</span>
<span class="p">(</span><span class="k">or</span> <span class="p">{</span> <span class="p">(</span><span class="nb">=</span> <span class="p">(</span><span class="nf">%</span> <span class="nv">n</span> <span class="mi">3</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span> <span class="p">}</span> <span class="p">{</span> <span class="p">(</span><span class="nb">=</span> <span class="p">(</span><span class="nf">%</span> <span class="nv">n</span> <span class="mi">5</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span> <span class="p">})</span>
<span class="p">}))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:result</span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nf">filter</span> <span class="p">(</span><span class="nf">integer</span><span class="o">.</span><span class="nv">range</span> <span class="mi">1</span> <span class="nv">n</span><span class="p">)</span> <span class="nv">include?</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">print</span> <span class="nv">result</span><span class="p">)</span></code></pre></figure>
<p>Let me know of any thoughts on any of this. Feedback is always appreciated.</p>Juan Ibiapinajuanibiapina@gmail.comIn this post I’ll show how to solve the first problem on Project Euler using Marco, while at the same time introducing some new features of the language.Marco Revamp2014-05-19T00:00:00+00:002014-05-19T00:00:00+00:00https://juanibiapina.github.io/articles/marco-revamp<p>Ever since Marco’s fourth rewrite (when things finally started to get serious), it has been heavily inspired by Lisp. One of my initial goals was to have a macro system similar to Clojure’s.</p>
<p>Now that the language has evolved, I have started making decisions based on some principles, instead of just playing around with language development concepts.</p>
<p>This is the result so far:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">:collatz</span> <span class="p">(</span><span class="nf">function</span> <span class="p">[</span><span class="nf">:n</span><span class="p">]</span> <span class="p">{</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">{</span> <span class="p">(</span><span class="nb">cons</span> <span class="mi">1</span> <span class="nv">nil</span><span class="p">)</span> <span class="p">}</span>
<span class="p">{</span> <span class="p">(</span><span class="nb">cons</span> <span class="nv">n</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">even?</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">{</span> <span class="p">(</span><span class="nf">recurse</span> <span class="p">(</span><span class="nb">/</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">))</span> <span class="p">}</span>
<span class="p">{</span> <span class="p">(</span><span class="nf">recurse</span> <span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">3</span> <span class="nv">n</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span> <span class="p">}))</span> <span class="p">})</span> <span class="p">}))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">:max-n</span> <span class="mi">100</span><span class="p">)</span>
<span class="p">(</span><span class="nb">print</span> <span class="p">(</span><span class="nf">list-max</span> <span class="p">(</span><span class="nb">map</span> <span class="nv">length</span> <span class="p">(</span><span class="nb">map</span> <span class="nv">collatz</span> <span class="p">(</span><span class="nf">range</span> <span class="mi">1</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">max-n</span> <span class="mi">1</span><span class="p">))))))</span></code></pre></figure>
<p>But I’ll explain.</p>
<h1 id="semantic-ambiguities">Semantic Ambiguities</h1>
<p>I have changed most of the language’s syntax and semantics to adopt a new principle: No semantic ambiguities.</p>
<p>Let me give an example:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">class</span> <span class="nc">Ball</span>
<span class="k">def</span> <span class="nf">roll</span><span class="p">(</span><span class="n">roll</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">"Rolling at speed </span><span class="si">#{</span><span class="n">roll</span><span class="si">}</span><span class="s2">"</span>
<span class="n">roll</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="no">Ball</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">roll</span></code></pre></figure>
<p>Is <code class="highlighter-rouge">roll</code> a method or a variable?</p>
<p>This is a simple Ruby program that rolls a ball at a certain speed and returns that speed. Of course the parameter name should be <code class="highlighter-rouge">speed</code>, not <code class="highlighter-rouge">roll</code>. Under the right circumstances, code like that might happen in production (I have seen it).</p>
<p>Just ignore the bad name for a second. Why is code like that even allowed in Ruby?</p>
<p>There are perfectly valid explanations to why that is allowed (which I’ll not explain here), but probably no strong enough reason why you should ever do this on purpose.</p>
<p>That’s what I call a “semantic ambiguity”. This one is specific to Ruby, but similar things happen in most programming languages. These are small things that add to the subconscious burden you have to go through when reading code. It is very small, but why do I dislike it?</p>
<p>Consider this situation: Suppose you are reading legacy code, following a complicated flow of calls in order to understand some logic that was written four years ago, in a completely different context. You already jumped through several files. As you encounter something like the previous ambiguity, what do you do?</p>
<ol>
<li>Stop what you’re doing.</li>
<li>Identify the origin of the <code class="highlighter-rouge">roll</code>. In this case you need to go to the method signature and figure out that there is a method and a variable with the same name.</li>
<li>Possible WTF moment when you google for this crazy stuff.</li>
<li>Return to the <code class="highlighter-rouge">roll</code> you were analyzing and identify it as a method or a variable based on the context. Remember to do this to any <code class="highlighter-rouge">roll</code>s you find along the way.</li>
<li>Return to following the original flow.</li>
</ol>
<p>Now imagine the code is not as easy as my example. The variable and method definitions could be in completely different methods, modules or files. You could maybe see the method first, miss the variable and later get really confused.</p>
<p>This requires a lot of branching and stacking, which computers do very well, but people mostly don’t. And we shouldn’t need to.</p>
<h1 id="the-path-of-no-ambiguity">The Path of No Ambiguity</h1>
<p>I tried to remove any semantic ambiguities from Marco. In order to achieve that, I made everything in the language mean one thing, and one thing only.</p>
<p>Of course there were compromises I had to make. Marco is no longer what I would consider a Lisp dialect. It is no longer homoiconic and has no support for macros (and probably cannot have).</p>
<p>Let’s compare the <code class="highlighter-rouge">if</code> function:</p>
<p>Before:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="p">(</span><span class="nb">error</span><span class="p">))</span></code></pre></figure>
<p><code class="highlighter-rouge">if</code> used to be a macro that took three parameters: a condition, a then clause and an else clause. It would evaluate the condition in lexical scope and then evaluate either the then or else clause accordingly. Where is the semantic ambiguity?</p>
<p>Why isn’t <code class="highlighter-rouge">(error)</code> evaluated to a function call that causes an error? Because of the inherent semantics of the <code class="highlighter-rouge">if</code> macro. If <code class="highlighter-rouge">if</code> were a user defined function, this evaluation would be different. For instance:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">do-stuff</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="p">(</span><span class="nb">error</span><span class="p">))</span></code></pre></figure>
<p>This would evaluate all the arguments normally if <code class="highlighter-rouge">do-stuff</code> were a function. If this were Racket or Common Lisp, <code class="highlighter-rouge">do-stuff</code> might even be a macro. Now you need to read that macro to understand what is evaluated and when.</p>
<p>After:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span> <span class="p">(</span><span class="nb">cons</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="p">}</span> <span class="p">{</span> <span class="p">(</span><span class="nb">error</span><span class="p">)</span> <span class="p">})</span></code></pre></figure>
<p>In the new version, <code class="highlighter-rouge">if</code> is a function. It takes three arguments: a boolean, a then block and an else block. If the boolean is true, it invokes the then block. Otherwise it invokes the else block.</p>
<p>This is evaluated exactly like any other function. Note how the expression <code class="highlighter-rouge">(= n 1)</code> is not actually passed to the <code class="highlighter-rouge">if</code> function. It is evaluated before the function call, like any other argument evaluation, and only <code class="highlighter-rouge">true</code> or <code class="highlighter-rouge">false</code> is passed in. Blocks evaluate to themselves, so they are passed in, ready to be invoked if needed.</p>
<p>Notice how you can spot the delayed evaluation by the syntactic construct (curly braces). Notice how you can safely evaluate any part of this code in your head without looking at any other parts.</p>
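<p>The same contract can be sketched in Java, treating blocks as lambdas. This is an analogy under that assumption rather than Marco’s implementation, and the <code class="highlighter-rouge">ifFn</code> and <code class="highlighter-rouge">describe</code> names are hypothetical.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java">import java.util.function.IntSupplier;

// Hypothetical sketch: "if" as an ordinary function that takes a boolean
// and two blocks. Not taken from the Marco interpreter.
final class IfFunctionSketch {
    // The caller has already evaluated the condition; exactly one block is invoked.
    static int ifFn(boolean condition, IntSupplier thenBlock, IntSupplier elseBlock) {
        return condition ? thenBlock.getAsInt() : elseBlock.getAsInt();
    }

    static int describe(int n) {
        return ifFn(n == 1,
                () -> 42,
                () -> { throw new IllegalStateException("error"); });
    }

    public static void main(String[] args) {
        // The else block is passed in but never invoked, so no exception is raised.
        System.out.println(describe(1)); // prints 42
    }
}</code></pre></figure>
<p>Nothing about <code class="highlighter-rouge">ifFn</code> is special: the argument <code class="highlighter-rouge">n == 1</code> is evaluated eagerly like any other argument, and only the code wrapped in a lambda is delayed.</p>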
<p>This is what I hope to accomplish. There are also many other cases that are normally ambiguous and that I have removed, for instance: function definitions, variable bindings and recursion among others. I won’t go into further details about each, but I would love to hear any thoughts about this whole approach.</p>Juan Ibiapinajuanibiapina@gmail.comEver since Marco’s fourth rewrite (when things finally started to get serious), it has been heavily inspired by Lisp. One of my initial goals was to have a macro system similar to Clojure’s.Tail Call Optimization in Marco2014-03-14T00:00:00+00:002014-03-14T00:00:00+00:00https://juanibiapina.github.io/articles/tail-call-optimization-in-marco<p>One of the main goals of the Marco language is that the interpreter code should be very easy to understand. It should be possible for almost any programmer without experience developing programming languages to read the code and understand what’s going on at a high level.</p>
<p>Even though the current state of the code requires lots of refactoring (since I tend to experiment a lot with it), I’m proud to say that I’m still walking towards that goal.</p>
<p>I have recently added TCO to Marco, in a similar way to the previous <a href="http://juanibiapina.com/articles/2013-12-16-trampolining-in-marco/">trampoline post</a>. Let me show you the two main consequences to code quality:</p>
<h2 id="interpreter-changes">Interpreter Changes</h2>
<p>Here is part of the code for the <code class="highlighter-rouge">if</code> special form:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="nd">@Override</span>
<span class="kd">public</span> <span class="n">MarcoObject</span> <span class="nf">performInvoke</span><span class="o">(</span><span class="n">Environment</span> <span class="n">environment</span><span class="o">,</span> <span class="n">MarcoList</span> <span class="n">arguments</span><span class="o">)</span> <span class="o">{</span>
<span class="n">MarcoObject</span> <span class="n">condition</span> <span class="o">=</span> <span class="n">arguments</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">0</span><span class="o">);</span>
<span class="n">MarcoObject</span> <span class="n">thenClause</span> <span class="o">=</span> <span class="n">arguments</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span>
<span class="n">MarcoObject</span> <span class="n">elseClause</span> <span class="o">=</span> <span class="n">arguments</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">2</span><span class="o">);</span>
<span class="n">MarcoObject</span> <span class="n">v1</span> <span class="o">=</span> <span class="n">condition</span><span class="o">.</span><span class="na">eval</span><span class="o">(</span><span class="n">environment</span><span class="o">);</span>
<span class="k">if</span> <span class="o">(</span><span class="n">Cast</span><span class="o">.</span><span class="na">toBoolean</span><span class="o">(</span><span class="n">v1</span><span class="o">)</span> <span class="o">==</span> <span class="n">MarcoBoolean</span><span class="o">.</span><span class="na">TRUE</span><span class="o">)</span> <span class="o">{</span>
<span class="k">return</span> <span class="k">new</span> <span class="nf">MarcoContinuation</span><span class="o">(</span><span class="n">thenClause</span><span class="o">,</span> <span class="n">environment</span><span class="o">);</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
<span class="k">return</span> <span class="k">new</span> <span class="nf">MarcoContinuation</span><span class="o">(</span><span class="n">elseClause</span><span class="o">,</span> <span class="n">environment</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span></code></pre></figure>
<p>It should not be difficult to read:</p>
<ol>
<li><code class="highlighter-rouge">condition</code>, <code class="highlighter-rouge">thenClause</code> and <code class="highlighter-rouge">elseClause</code> are positional arguments.</li>
<li><code class="highlighter-rouge">condition</code> is always evaluated.</li>
<li>If the result of the condition is <code class="highlighter-rouge">true</code>, return a continuation for the <code class="highlighter-rouge">thenClause</code>, otherwise return a continuation for the <code class="highlighter-rouge">elseClause</code>.</li>
</ol>
<p>Compare this to the Racket documentation for <code class="highlighter-rouge">if</code>:</p>
<blockquote>
<p>Evaluates test-expr. If it produces any value other than #f, then then-expr is evaluated, and its results are the result for the if form. Otherwise, else-expr is evaluated, and its results are the result for the if form. The then-expr and else-expr are in tail position with respect to the if form.</p>
</blockquote>
<p>I like to see these concepts (and some more) directly mapped to the interpreter code.</p>
<p>Catch: You need to know that continuations are being used to implement tail calls. I could just make a class MarcoTailCall that inherits from MarcoContinuation, but I have doubts if that actually makes it clearer.</p>
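<p>To make that catch more concrete, here is a minimal, self-contained sketch of the general idea: a function returns a continuation instead of making its tail call, and a driver loop keeps invoking continuations until a plain value comes back. All names here (<code class="highlighter-rouge">Step</code>, <code class="highlighter-rouge">Thunk</code>, <code class="highlighter-rouge">trampoline</code>, <code class="highlighter-rouge">sumTo</code>) are hypothetical; the real <code class="highlighter-rouge">MarcoContinuation</code> may work differently.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java">// Hypothetical sketch of tail calls via continuations; these are not the
// actual Marco interpreter classes.
final class TrampolineSketch {
    // Either a finished value or a continuation that produces the next step.
    interface Step {
        boolean isDone();
        long value();
        Step next();
    }

    // A delayed computation that yields the next step when run.
    interface Thunk {
        Step run();
    }

    static Step done(long result) {
        return new Step() {
            public boolean isDone() { return true; }
            public long value() { return result; }
            public Step next() { throw new IllegalStateException("already done"); }
        };
    }

    static Step more(Thunk thunk) {
        return new Step() {
            public boolean isDone() { return false; }
            public long value() { throw new IllegalStateException("not done yet"); }
            public Step next() { return thunk.run(); }
        };
    }

    // The driver loop: the Java stack never grows with the number of tail calls.
    static long trampoline(Step step) {
        while (!step.isDone()) {
            step = step.next();
        }
        return step.value();
    }

    // Tail-recursive sum of 1..n that returns its tail call instead of making it.
    static Step sumTo(long n, long acc) {
        if (n == 0) {
            return done(acc);
        }
        return more(() -> sumTo(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // One million tail calls without deepening the Java stack.
        System.out.println(trampoline(sumTo(1000000, 0))); // prints 500000500000
    }
}</code></pre></figure>
<p>In this sketch, returning <code class="highlighter-rouge">more(...)</code> plays the role that returning a continuation for <code class="highlighter-rouge">thenClause</code> or <code class="highlighter-rouge">elseClause</code> plays in the <code class="highlighter-rouge">if</code> code above.</p>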
<h2 id="the-new-collatz-implementation">The New Collatz Implementation</h2>
<p>This is the new Marco code for finding the max collatz sequence up to some number <code class="highlighter-rouge">n</code>:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">collatz-size</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">even?</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">/</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">3</span> <span class="nv">n</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+</span> <span class="nv">size</span> <span class="mi">1</span><span class="p">)))))</span>
<span class="p">(</span><span class="nf">helper</span> <span class="nv">n</span> <span class="mi">0</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">collatz-max</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span> <span class="nv">current-max</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">max</span> <span class="mi">1</span> <span class="nv">current-max</span><span class="p">)</span>
<span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">max</span> <span class="nv">current-max</span> <span class="p">(</span><span class="nf">collatz-size</span> <span class="nv">n</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">helper</span> <span class="nv">n</span> <span class="mi">0</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">print</span> <span class="p">(</span><span class="nf">collatz-max</span> <span class="mi">100000</span><span class="p">))</span></code></pre></figure>
<p>It doesn’t require any hacks or trampolines since TCO is now part of Marco. Much more readable than <a href="http://juanibiapina.com/articles/2013-12-16-trampolining-in-marco/">before</a>.</p>
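<p>One way to see why these helpers can run in constant stack space: each tail-recursive helper corresponds to a loop that updates its arguments in place. The Java sketch below is that loop form, written as an illustration only; the class and method names are hypothetical, and it uses <code class="highlighter-rouge">long</code> because intermediate Collatz values can exceed the <code class="highlighter-rouge">int</code> range.</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java">// Hypothetical sketch: the loops that the tail-recursive helpers correspond to.
final class CollatzLoopSketch {
    // Number of elements in the Collatz sequence starting at n, including the final 1.
    static int collatzSize(long n) {
        int size = 0;
        while (n != 1) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            size = size + 1;
        }
        return size + 1;
    }

    // Longest sequence length among all starting points from 1 to n.
    static int collatzMax(long n) {
        int currentMax = 0;
        for (long i = n; i >= 1; i--) {
            currentMax = Math.max(currentMax, collatzSize(i));
        }
        return currentMax;
    }

    public static void main(String[] args) {
        System.out.println(collatzSize(6)); // prints 9
        System.out.println(collatzMax(10)); // prints 20
    }
}</code></pre></figure>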
<h1 id="performance-comparison">Performance Comparison</h1>
<p>These are the previous values:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100000: 58.31s user 0.48s system 102% cpu 57.191 total
500000: 336.71s user 1.01s system 104% cpu 5:24.38 total
</code></pre></div></div>
<p>This is now using TCO:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100000: 28.90s user 0.26s system 102% cpu 28.554 total
500000: 158.40s user 0.91s system 100% cpu 2:38.02 total
</code></pre></div></div>
<p>It’s about twice as fast, slightly more as the number increases. Better performance with better code.</p>Juan Ibiapinajuanibiapina@gmail.comOne of the main goals of the Marco language is that the interpreter code should be very easy to understand. It should be possible for almost any programmer without experience developing programming languages to read the code and understand what’s going on at a high level.Constant Time List Length2014-01-19T00:00:00+00:002014-01-19T00:00:00+00:00https://juanibiapina.github.io/articles/constant-time-list-length<h2 id="the-current-length">The current length</h2>
<p>After I decided to implement Tail Call Optimization in Marco, I realized there were many other possible optimizations that were much simpler and could possibly make a huge difference.</p>
<p>A simple one is list length. Here is the previous non-optimized version:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">length</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">l</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">l</span><span class="p">)</span>
<span class="mi">0</span>
<span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="p">(</span><span class="nb">length</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">l</span><span class="p">))))))</span></code></pre></figure>
<p>The cool thing about it is that it is implemented in Marco itself. That was very rewarding for me. On the other hand, it performs horribly: it has to walk the whole list every time, so each call takes time proportional to the length of the list. How can we solve this?</p>
<h2 id="dynamic-dispatch-and-cached-length">Dynamic dispatch and cached length</h2>
<p>Lists are immutable in Marco, so the size of a list will never change. That means we can cache its length. Combine that with dynamic dispatch, and we get this solution:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">MarcoNil</span> <span class="kd">implements</span> <span class="n">MarcoList</span> <span class="o">{</span>
<span class="kd">public</span> <span class="kt">int</span> <span class="nf">length</span><span class="o">()</span> <span class="o">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="o">;</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">MarcoPair</span> <span class="kd">implements</span> <span class="n">MarcoList</span> <span class="o">{</span>
<span class="kd">private</span> <span class="n">MarcoObject</span> <span class="n">first</span><span class="o">;</span>
<span class="kd">private</span> <span class="n">MarcoObject</span> <span class="n">second</span><span class="o">;</span>
<span class="kd">private</span> <span class="kt">boolean</span> <span class="n">isList</span><span class="o">;</span>
<span class="kd">private</span> <span class="kt">int</span> <span class="n">length</span><span class="o">;</span>
<span class="kd">public</span> <span class="nf">MarcoPair</span><span class="o">(</span><span class="n">MarcoObject</span> <span class="n">first</span><span class="o">,</span> <span class="n">MarcoObject</span> <span class="n">second</span><span class="o">)</span> <span class="o">{</span>
<span class="k">this</span><span class="o">.</span><span class="na">first</span> <span class="o">=</span> <span class="n">first</span><span class="o">;</span>
<span class="k">this</span><span class="o">.</span><span class="na">second</span> <span class="o">=</span> <span class="n">second</span><span class="o">;</span>
<span class="k">this</span><span class="o">.</span><span class="na">isList</span> <span class="o">=</span> <span class="n">second</span><span class="o">.</span><span class="na">isList</span><span class="o">();</span>
<span class="k">if</span> <span class="o">(</span><span class="n">isList</span><span class="o">())</span> <span class="o">{</span>
<span class="k">this</span><span class="o">.</span><span class="na">length</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">Cast</span><span class="o">.</span><span class="na">toList</span><span class="o">(</span><span class="n">second</span><span class="o">).</span><span class="na">length</span><span class="o">();</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="nd">@Override</span>
<span class="kd">public</span> <span class="kt">int</span> <span class="nf">length</span><span class="o">()</span> <span class="o">{</span>
<span class="k">return</span> <span class="n">length</span><span class="o">;</span>
<span class="o">}</span>
<span class="o">}</span></code></pre></figure>
<p>Nil is a list with length zero. A pair is a list (if its second element is a list) whose size is always one plus the length of the tail list. The logic here is the same as before, but asking for the length now takes constant time: the only cost is one addition when consing onto a list.</p>
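<p>To see the idea outside the interpreter, here is a minimal, self-contained sketch of the same technique. The names <code class="highlighter-rouge">Cell</code>, <code class="highlighter-rouge">NIL</code> and <code class="highlighter-rouge">cons</code> are made up for this example and are not Marco’s actual classes; the only assumption is an immutable, singly linked list:</p>

```java
// Hypothetical names for illustration; not the actual Marco classes.
final class Cell {
    // The single empty list, with length zero.
    static final Cell NIL = new Cell();

    final Object head;
    final Cell tail;
    private final int length;

    // Private constructor used only for NIL.
    private Cell() {
        this.head = null;
        this.tail = null;
        this.length = 0;
    }

    private Cell(Object head, Cell tail) {
        this.head = head;
        this.tail = tail;
        // The tail is immutable, so its length can never change:
        // one addition here replaces a full traversal later.
        this.length = 1 + tail.length;
    }

    static Cell cons(Object head, Cell tail) {
        return new Cell(head, tail);
    }

    // Constant time: just returns the cached value.
    int length() {
        return length;
    }

    public static void main(String[] args) {
        Cell list = cons(1, cons(2, cons(3, NIL)));
        System.out.println(list.length()); // prints 3
    }
}
```

<p>The trade-off is one extra <code class="highlighter-rouge">int</code> per cell in exchange for never walking the list to count it.</p>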
<p>How big is the difference?</p>
<h2 id="results">Results</h2>
<p>If we take the code <a href="http://juanibiapina.com/articles/2013-12-16-trampolining-in-marco/">from this previous post with trampolines</a> and replace the <code class="highlighter-rouge">my-length</code> function with our new optimized length (and also optimized closures), we get these new results:</p>
<p>Previous results with optimized closures:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 : 0.81s user 0.05s system 151% cpu 0.570 total
500 : 1.64s user 0.12s system 142% cpu 1.229 total
1000 : 2.95s user 0.20s system 117% cpu 2.691 total
5000 : 38.57s user 0.40s system 102% cpu 38.204 total
10000: 149.57s user 1.09s system 101% cpu 2:29.08 total
</code></pre></div></div>
<p>New results:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 0.70s user 0.05s system 148% cpu 0.505 total
500 1.29s user 0.07s system 165% cpu 0.823 total
1000 1.53s user 0.11s system 150% cpu 1.085 total
5000 3.66s user 0.18s system 119% cpu 3.225 total
10000 6.56s user 0.19s system 112% cpu 5.997 total
100000 65.42s user 0.56s system 102% cpu 1:04.30 total
</code></pre></div></div>
<p>That’s very remarkable. We got to a hundred thousand. Only nine hundred thousand to go.</p>Juan Ibiapinajuanibiapina@gmail.comThe current lengthOptimizing Function Closures2013-12-27T00:00:00+00:002013-12-27T00:00:00+00:00https://juanibiapina.github.io/articles/optimizing-function-closures<p>In this post I want to talk about optimizations for function closures.</p>
<h2 id="evaluating-function-closures">Evaluating Function Closures</h2>
<p>One of the biggest revelations for me when writing Marco was that functions are not actually first class citizens. Closures are.</p>
<p>When you define a function, you’re actually creating an object which has three pieces of data: an executable body, an environment, and a list of parameters.</p>
<p>This closure object is what can be passed around and eventually called. It will take care of evaluating the function body (which it holds) in the environment where the function was originally defined (which it also holds) extended with the parameters (which it also also holds) bound to the actual arguments.</p>
<p>This is the definition of evaluating a closure (hence, a function with lexical scope). The cool thing about this definition is that it maps exactly to the code.</p>
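<p>As a rough illustration of that mapping, here is a small Java sketch. It is not the Marco source: <code class="highlighter-rouge">ClosureSketch</code> is a hypothetical name, the environment is assumed to be a plain <code class="highlighter-rouge">Map</code>, and the body is stood in for by a Java function that takes the environment it should be evaluated in:</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; the real Marco classes are not shown in this post.
final class ClosureSketch {
    private final List<String> parameters;
    // Stand-in for a parsed body: something that can be evaluated in an environment.
    private final Function<Map<String, Object>, Object> body;
    // The environment where the function was defined.
    private final Map<String, Object> definitionEnv;

    ClosureSketch(List<String> parameters,
                  Function<Map<String, Object>, Object> body,
                  Map<String, Object> definitionEnv) {
        this.parameters = parameters;
        this.body = body;
        this.definitionEnv = definitionEnv;
    }

    Object call(List<Object> arguments) {
        if (arguments.size() != parameters.size()) {
            throw new IllegalArgumentException("wrong number of arguments");
        }
        // Extend the definition environment (not the caller's) with the parameters.
        Map<String, Object> callEnv = new HashMap<>(definitionEnv);
        for (int i = 0; i < parameters.size(); i++) {
            callEnv.put(parameters.get(i), arguments.get(i));
        }
        // Evaluate the body in the extended environment.
        return body.apply(callEnv);
    }

    public static void main(String[] args) {
        Map<String, Object> global = new HashMap<>();
        global.put("y", 1);
        // Roughly (function (x) (+ x y))
        ClosureSketch f = new ClosureSketch(
            Arrays.asList("x"),
            env -> ((Integer) env.get("x")) + ((Integer) env.get("y")),
            global);
        System.out.println(f.call(Arrays.<Object>asList(41))); // prints 42
    }
}
```

<p>The three fields are the three pieces of data listed above, and <code class="highlighter-rouge">call</code> is the evaluation rule: bind, extend, evaluate.</p>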
<h2 id="the-insight">The Insight</h2>
<p>Consider the evaluation of the following code:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">y</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">f</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">z</span> <span class="mi">42</span><span class="p">)</span></code></pre></figure>
<p><code class="highlighter-rouge">f</code> is bound to a closure object that holds the following data:</p>
<ul>
<li>The body: <code class="highlighter-rouge">'(+ x y)</code></li>
<li>The parameters: <code class="highlighter-rouge">'x</code></li>
<li>The environment at the moment that <code class="highlighter-rouge">f</code> is defined</li>
</ul>
<p>So what is currently in that environment?</p>
<p>Obviously, <code class="highlighter-rouge">+</code> and <code class="highlighter-rouge">y</code> have to be, because they are used in the body. With a bit of cleverness that I might eventually talk about, <code class="highlighter-rouge">f</code> also is (to allow recursion with anonymous functions). What about <code class="highlighter-rouge">z</code>?</p>
<p>I wrote in <a href="http://juanibiapina.com/articles/2013-11-29-functions-in-marco/">this other post</a> that <code class="highlighter-rouge">z</code> would be available. I have changed this now, and things declared after the declaration of the function are not available, by definition. With that I might have killed mutual recursion for now. I’ll figure that out later.</p>
<p>The interesting question here is: What about <code class="highlighter-rouge">-</code>? What about <code class="highlighter-rouge">*</code>, <code class="highlighter-rouge">def</code>, <code class="highlighter-rouge">set!</code> etc?</p>
<p>None of these bindings are used in the function body, so one could argue that they should not be there. This is a known optimization for closure environments and I have added it to Marco.</p>
<h2 id="the-trick">The Trick</h2>
<p>But how do you know what needs to be available in the closure environment? I won’t make this post long and just say that there are formal ways to do it.</p>
<p>What I do is: During the function definition, I copy to the closure environment all symbols referenced in the body that are available at that point. The symbols I cannot find, I just assume they will be available later.</p>
<p>What happens later?</p>
<p>Currently there are three situations where symbols can be referenced in a function body and not be available at the time of the function definition:</p>
<ol>
<li>
<p>Parameters: They are available only when the function is about to be called. So I just ignore them because I know they will be bound during the call.</p>
</li>
<li>
<p>Special forms: Symbols that are used inside special forms and won’t actually be used to lookup values. More on this later.</p>
</li>
<li>
<p>Actual errors. Symbols that are not defined anywhere.</p>
</li>
</ol>
<p>The way it is currently done, if you define a function that uses a variable that is not defined, it will only cause an error when you call that function, even though the function definition itself is invalid, by definition. In a way I’m apologizing instead of asking for permission, but there is a reason for that.</p>
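<p>The copying step described above can be sketched roughly like this. It is an illustration under simplifying assumptions, not the interpreter’s code: the body is assumed to be a nested <code class="highlighter-rouge">List</code>, symbols are plain strings, and <code class="highlighter-rouge">CaptureSketch</code> and <code class="highlighter-rouge">capture</code> are hypothetical names:</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a body is a nested List, and symbols are plain Strings.
final class CaptureSketch {
    // Copies into a fresh environment only the symbols that the body references
    // and that are already defined. Anything missing is skipped without an error,
    // since it may be a parameter or a name introduced by a form such as let.
    static Map<String, Object> capture(Object body, Map<String, Object> definitionEnv) {
        Map<String, Object> captured = new HashMap<>();
        collect(body, definitionEnv, captured);
        return captured;
    }

    private static void collect(Object form, Map<String, Object> from, Map<String, Object> into) {
        if (form instanceof String) {
            String symbol = (String) form;
            if (from.containsKey(symbol)) {
                into.put(symbol, from.get(symbol));
            }
        } else if (form instanceof List) {
            for (Object child : (List<?>) form) {
                collect(child, from, into);
            }
        }
        // Other atoms (numbers, for example) reference nothing.
    }

    public static void main(String[] args) {
        Map<String, Object> global = new HashMap<>();
        global.put("+", "builtin-plus");
        global.put("-", "builtin-minus");
        global.put("y", 1);
        // Body of (function (x) (+ x y)): only + and y are captured.
        List<Object> body = Arrays.<Object>asList("+", "x", "y");
        System.out.println(capture(body, global).keySet());
    }
}
```

<p>Here <code class="highlighter-rouge">x</code> is silently skipped at definition time, which is exactly why a truly undefined symbol only fails when the function is called.</p>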
<h2 id="back-to-number-2">Back to Number 2</h2>
<p>Consider the following code:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">f</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">x</span><span class="p">)</span> <span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">a</span> <span class="nv">x</span><span class="p">)</span> <span class="nv">a</span><span class="p">)))</span></code></pre></figure>
<p><code class="highlighter-rouge">f</code> is a function that takes one argument and returns it. When evaluating this function definition, I have no way to know at that point that the first <code class="highlighter-rouge">a</code> is not actually a symbol lookup (it is actually defining <code class="highlighter-rouge">a</code> to be <code class="highlighter-rouge">x</code>). It has special meaning because of the evaluation semantics of <code class="highlighter-rouge">let</code>.</p>
<p>That is the whole reason for the delayed check.</p>
<p>In the current implementation, I will assume that <code class="highlighter-rouge">a</code> is not available in the environment for a reason. When the function is later called, <code class="highlighter-rouge">a</code> is never actually used for a lookup, so there won’t be an error. I believe this works for any special form.</p>
<p>The “better” approach would be to peek into the body of the function, find the <code class="highlighter-rouge">let</code> and recognize it as a special form. Then each special form would have well defined semantics for free variables. Marco doesn’t have special forms yet, but I’m considering adding them.</p>
<h2 id="results">Results</h2>
<p>Running the same tests as the <a href="http://juanibiapina.com/articles/2013-12-16-trampolining-in-marco/">previous post</a>:</p>
<p>Previous:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 : 0.72s user 0.06s system 139% cpu 0.562 total
500 : 2.35s user 0.22s system 117% cpu 2.196 total
1000 : 5.77s user 0.25s system 107% cpu 5.618 total
5000 : 111.44s user 0.82s system 101% cpu 1:51.07 total
10000: too long to wait
</code></pre></div></div>
<p>Now with the optimization:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 : 0.81s user 0.05s system 151% cpu 0.570 total
500 : 1.64s user 0.12s system 142% cpu 1.229 total
1000 : 2.95s user 0.20s system 117% cpu 2.691 total
5000 : 38.57s user 0.40s system 102% cpu 38.204 total
10000: 149.57s user 1.09s system 101% cpu 2:29.08 total
</code></pre></div></div>
<p>And remember this is a very simple test with a small environment. Imagine the gains when you have a big language.</p>
<p>If you are ever designing your own language, I suggest you do this as soon as you can. Do not focus on this in the very beginning, but if you ever feel you can do it, get to it. I had to make major changes to the implementation and even some semantics in order to get this working properly.</p>Juan Ibiapinajuanibiapina@gmail.comIn this post I want to talk about optimizations for function closures.Trampolining in Marco2013-12-16T00:00:00+00:002013-12-16T00:00:00+00:00https://juanibiapina.github.io/articles/trampolining-in-marco<p>In this post I’ll show how to “better” solve the collatz challenge from the <a href="http://juanibiapina.com/articles/2013-12-13-the-collatz-conjecture/">previous post</a> by escaping the limitations of the Java stack.</p>
<p>This is, in fact, not better at all. It’s just much more complicated, but it helped me learn and think about some concepts I had never worked with.</p>
<h2 id="accumulators">Accumulators</h2>
<p>The first lesson from the previous problem is that we don’t actually need to generate and store each sequence of numbers. All we need is the size of each sequence. We can then write a new function:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">collatz-size</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">size</span>
<span class="p">(</span><span class="nf">collatz-size</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">even?</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">/</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">3</span> <span class="nv">n</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="nv">size</span><span class="p">)))))</span></code></pre></figure>
<p>The function now takes an <code class="highlighter-rouge">accumulator</code> called <code class="highlighter-rouge">size</code>. The accumulator will have an initial value of 1 and will be incremented for each recursive call. That way, the final call only needs to return the accumulator value, and the list is never stored.</p>
<p>This still does not solve the stack problem, but allows us to use much less memory.</p>
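<p>For comparison, here is the same accumulator idea written directly in Java. This is only a sketch with a hypothetical class name, and it shows the same limitation: the recursive call is in tail position, but <code class="highlighter-rouge">javac</code> and the standard JVM do not eliminate tail calls, so every step still takes a stack frame:</p>

```java
// Hypothetical Java rendering of the accumulator version, for comparison only.
final class CollatzAccumulator {
    // size counts the numbers seen so far, starting at 1 for n itself.
    static int collatzSize(long n, int size) {
        if (n == 1) {
            return size;
        }
        long next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        // Tail call: nothing is left to do after it returns,
        // but each call still uses a new stack frame in Java.
        return collatzSize(next, size + 1);
    }

    public static void main(String[] args) {
        // 6, 3, 10, 5, 16, 8, 4, 2, 1
        System.out.println(collatzSize(6, 1)); // prints 9
    }
}
```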
<h2 id="continuation-passing-style">Continuation Passing Style</h2>
<p>The previous function still relies heavily on the stack. One way to avoid this is to use a technique called <a href="http://en.wikipedia.org/wiki/Continuation-passing_style">Continuation Passing Style</a>. I’ll make a few simplifications, but the concept is still valid.</p>
<p>A <a href="http://en.wikipedia.org/wiki/Continuation">continuation</a> is a representation of control state. In our case, a continuation will be just a function. This function will take no arguments, and its sole objective is to be called in order to continue the execution of the program.</p>
<p>In Continuation Passing Style (CPS), we’ll have functions return the next piece of code that should execute. That means: Instead of calling itself recursively, the function will return a continuation.</p>
<p>Let’s rewrite <code class="highlighter-rouge">collatz-size</code> using our simplified CPS:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">collatz-size</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">size</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">collatz-size</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">even?</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">/</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">3</span> <span class="nv">n</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="nv">size</span><span class="p">))))))</span></code></pre></figure>
<p>This new version returns the <code class="highlighter-rouge">size</code> when it finishes the calculation (the first part of the <code class="highlighter-rouge">if</code>). But when it knows it has to recurse, it instead creates a <code class="highlighter-rouge">continuation</code> (a function that takes no arguments) and returns it. That means this function will return a function that returns a function and eventually might return the result. How do we run this?</p>
<h2 id="trampolines">Trampolines</h2>
<p>A <code class="highlighter-rouge">trampoline</code> is a function that we can use to get the result of the previous <code class="highlighter-rouge">collatz-size</code>. It will take a function, run it and check the results. It will keep doing this until the result is not a function:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">f</span><span class="p">)</span>
<span class="p">(</span><span class="k">do</span> <span class="p">(</span>
<span class="p">(</span><span class="nf">var</span> <span class="nv">result</span> <span class="nv">f</span><span class="p">)</span>
<span class="p">(</span><span class="nf">while</span> <span class="p">(</span><span class="nf">function?</span> <span class="nv">result</span><span class="p">)</span> <span class="p">(</span><span class="k">set!</span> <span class="nv">result</span> <span class="p">(</span><span class="nf">result</span><span class="p">)))</span>
<span class="nv">result</span>
<span class="p">))))</span></code></pre></figure>
<p>So we can invoke like this:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nb">print</span> <span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">collatz-size</span> <span class="mi">6</span> <span class="mi">1</span><span class="p">))))</span></code></pre></figure>
<p>Note this is imperative style, which I don’t fully support, but it translates the stack usage into a while loop. I find this incredibly creative.</p>
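<p>The same pattern can be sketched in Java, which makes the loop explicit. This is an illustration rather than anything from the Marco code base: <code class="highlighter-rouge">TrampolineSketch</code> is a hypothetical name, and a continuation is assumed to be a <code class="highlighter-rouge">Supplier</code> that takes no arguments:</p>

```java
import java.util.function.Supplier;

// Hypothetical Java sketch of the same idea; a continuation is a Supplier with no arguments.
final class TrampolineSketch {
    // Returns either the final Integer or a Supplier that performs the next step.
    static Object collatzSize(long n, int size) {
        if (n == 1) {
            return size;
        }
        long next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        // Instead of recursing, hand back the next piece of work.
        Supplier<Object> continuation = () -> collatzSize(next, size + 1);
        return continuation;
    }

    // Keeps calling the result until it is no longer a continuation.
    static Object trampoline(Supplier<Object> f) {
        Object result = f;
        while (result instanceof Supplier) {
            result = ((Supplier<?>) result).get();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(trampoline(() -> collatzSize(6, 1))); // prints 9
    }
}
```

<p>Each call to <code class="highlighter-rouge">get</code> runs one step and returns, so the stack depth stays constant no matter how long the sequence is. The price is one allocated continuation per step.</p>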
<h2 id="changing-everything">Changing Everything</h2>
<p>Given that we have a <code class="highlighter-rouge">trampoline</code> function available, we can rewrite all of our recursive functions in terms of it. Here is the complete solution:</p>
<figure class="highlight"><pre><code class="language-racket" data-lang="racket"><span class="p">(</span><span class="nf">def</span> <span class="nv">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">f</span><span class="p">)</span>
<span class="p">(</span><span class="k">do</span> <span class="p">(</span>
<span class="p">(</span><span class="nf">var</span> <span class="nv">result</span> <span class="nv">f</span><span class="p">)</span>
<span class="p">(</span><span class="nf">while</span> <span class="p">(</span><span class="nf">function?</span> <span class="nv">result</span><span class="p">)</span> <span class="p">(</span><span class="k">set!</span> <span class="nv">result</span> <span class="p">(</span><span class="nf">result</span><span class="p">)))</span>
<span class="nv">result</span>
<span class="p">))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">collatz-size</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">n</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">size</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">even?</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="nb">/</span> <span class="nv">n</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">+</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">3</span> <span class="nv">n</span><span class="p">)</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+</span> <span class="nv">size</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">n</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">my-range</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">v1</span> <span class="nv">v2</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">v1</span> <span class="nv">v2</span> <span class="nv">l</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb"><</span> <span class="nv">v2</span> <span class="nv">v1</span><span class="p">)</span>
<span class="nv">l</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">v1</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">v2</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="nv">v2</span> <span class="nv">l</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">v1</span> <span class="nv">v2</span> <span class="nv">nil</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">my-length</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">l</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">l</span> <span class="nv">size</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">l</span><span class="p">)</span>
<span class="nv">size</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">l</span><span class="p">)</span> <span class="p">(</span><span class="nb">+</span> <span class="nv">size</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">l</span> <span class="mi">0</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">my-list-max</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">xs</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">xs</span> <span class="nv">m</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">=</span> <span class="p">(</span><span class="nf">my-length</span> <span class="nv">xs</span><span class="p">)</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">max</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">xs</span><span class="p">)</span> <span class="nv">m</span><span class="p">)</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">xs</span><span class="p">)</span> <span class="p">(</span><span class="nb">max</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">xs</span><span class="p">)</span> <span class="nv">m</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">xs</span><span class="p">)</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">xs</span><span class="p">)))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">my-reverse</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">xs</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">xs</span> <span class="nv">acc</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">xs</span><span class="p">)</span>
<span class="nv">acc</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">xs</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">xs</span><span class="p">)</span> <span class="nv">acc</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">xs</span> <span class="nv">nil</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">my-map</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nf">f</span> <span class="nv">l</span><span class="p">)</span>
<span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">function</span> <span class="p">(</span><span class="nb">list</span> <span class="nv">acc</span><span class="p">)</span>
<span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nf">nil?</span> <span class="nv">list</span><span class="p">)</span>
<span class="nv">acc</span>
<span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="p">(</span><span class="nf">tail</span> <span class="nv">list</span><span class="p">)</span> <span class="p">(</span><span class="nb">cons</span> <span class="p">(</span><span class="nf">f</span> <span class="p">(</span><span class="nf">head</span> <span class="nv">list</span><span class="p">))</span> <span class="nv">acc</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">my-reverse</span> <span class="p">(</span><span class="nf">trampoline</span> <span class="p">(</span><span class="nf">function</span> <span class="p">()</span> <span class="p">(</span><span class="nf">helper</span> <span class="nv">l</span> <span class="nv">nil</span><span class="p">)))))))</span>
<span class="p">(</span><span class="nf">def</span> <span class="nv">max-n</span> <span class="mi">5000</span><span class="p">)</span>
<span class="p">(</span><span class="nb">print</span> <span class="p">(</span><span class="nf">my-list-max</span> <span class="p">(</span><span class="nf">my-map</span> <span class="nv">collatz-size</span> <span class="p">(</span><span class="nf">my-range</span> <span class="mi">1</span> <span class="nv">max-n</span><span class="p">))))</span></code></pre></figure>
<p>Note how even map and reverse need to be rewritten in this style.</p>
<h2 id="results">Results</h2>
<p>This version works “well” up to 5000. I did not wait for it to finish for 10000, although it probably would eventually. Here are the timings:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 : 0.72s user 0.06s system 139% cpu 0.562 total
500 : 2.35s user 0.22s system 117% cpu 2.196 total
1000 : 5.77s user 0.25s system 107% cpu 5.618 total
5000 : 111.44s user 0.82s system 101% cpu 1:51.07 total
</code></pre></div></div>
<h2 id="future">Future</h2>
<p>Tail call optimization is a much better solution to this problem. In the future I would like to implement TCO in the Marco interpreter. I might even use some sort of internal trampolining, transparent to the language.</p>
<p>The interpreter also needs several internal optimizations, especially regarding memory use when defining function closures. That’s a lot of future work.</p>Juan Ibiapinajuanibiapina@gmail.comIn this post I’ll show how to “better” solve the collatz challenge from the previous post by escaping the limitations of the Java stack.