Ir al contenido principal

This is my blog, more about me at marianoguerra.github.io

🦋 @marianoguerra.org 🐘 @marianoguerra@hachyderm.io 🐦 @warianoguerra

Make Your Self

In Search of Maxwell's equations of Object Oriented Software

Motivation

Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.

-- A Conversation with Alan Kay

Which are Maxwell's equations of Object Oriented software?

This is an attempt at answering that question based on the following:

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

-- Dr. Alan Kay on the Meaning of “Object-Oriented Programming”

I think our confusion with objects is the problem that in our Western culture, we have a language that has very hard nouns and verbs in it. Our process words stink. It's much easier for us when we think of an object—and I have apologized profusely over the last twenty years for making up the term object-oriented, because as soon as it started to be misapplied, I realized that I should have used a much more process-oriented term for it.—The Japanese have an interesting word, which is called ma. Spelled in English, just ma. Ma is the stuff in-between what we call objects. It's the stuff we don't see, because we're focused on the nounness of things rather than the processness of things. Japanese has a more process-feel oriented way of looking at how things relate to each other. You can always tell that by looking at the size of [the] word it takes to express something that is important. Ma is very short. We have to use words like interstitial or worse to approximate what the Japanese are talking about.

-- Alan Kay at OOPSLA 1997: The Computer Revolution has not Happened Yet

Don't Bury the Lede

At the end of this post you will understand the design and code that allows you to write and understand why this:

{
  @MyType is ("my-type" as-type ()),
  @name is "Joe",
  MyType apply-to name,
  @(name say-hello ()) replies @("Well, hello " + it),
  (name say-hello ()) : ("Mike" say-hello ())
}

Evaluates to:

"Well, hello Joe" : "Hello, Mike!"

With a runtime that consists of a Frame class that allows to bind values to names and find the value for a given name plus this two methods:

class Frame {
    // ...
    eval(v) {
      return this.send(v, new Msg("eval", this));
    }
    send(s, m) {
      return this.find(getType(s))
        .find(m.verb)
        .call(null, s, m, this);
    }
    // ...
}

And the following data types as the smallest set to make it all work:

  • Nil: similar to nil, null or Unit, the only falsy value
  • Int: Javascript's BigInt
  • Name: a variable name
  • Pair: a pair of values, like cons
  • Msg: similar to a method, a name and an argument
  • Send: like a method call, a method sent to a value
  • Later: like Lisp's quote
  • Block: like Lisp's progn, a sequence of instructions that returns the result of the last one
  • Symbol: Javascript's Symbol

A runnable version of this post can be found here: fatt.test.js (fatt: frame all the things)

A full implementation of the language with extra support for Array and Map in 100 lines for the runtime and 73 lines for the parser can be found here fatt.js.

A CLI to evaluate expressions can be found here fatt.cli.js, here are some expressions you can evaluate and their results:

./fatt.cli.js '42' '10 + 2' '10 + 2 - 5 * 2' '1 : 2.5 : () : "hi"' '() ? @ "true" : "false"' '\ + 2' '@(1 + 2)' '[]' '[1, 1.4, ["hi", ()]]' '"my-type" as-type ()' '@{42, ()}' '#{1: "one", "two": 2}' '#{} . "foo"' '#{"foo": "bar"} . "foo"' '{@foo is 42, foo}' '{@(0 add+1 0) replies @(it + that + 1), 1 add+1 2}'
>  42
42

>  10 + 2
12

>  10 + 2 - 5 * 2
14

>  1 : 2.5 : () : "hi"
1 : 2.5 : () : "hi"

>  () ? @ "true" : "false"
"false"

>  \ + 2
\ + 2

>  @(1 + 2)
1 + 2

>  []
[]

>  [1, 1.4, ["hi", ()]]
[1, 1.4, ["hi", ()]]

>  "my-type" as-type ()
("my-type" as-type ())

>  @{42, ()}
{42, ()}

>  #{1: "one", "two": 2}
#{1: "one", "two": 2}

>  #{} . "foo"
()

>  #{"foo": "bar"} . "foo"
"bar"

>  {@foo is 42, foo}
42

>  {@(0 add+1 0) replies @(it + that + 1), 1 add+1 2}
4

Bindings

If we had a calculator language and wanted to upgrade it to make it closer to a programming language probably the first feature would be the ability to give names to values and look them up later.

In programming languages bindings are stored in Frames in a call stack, let's try a simple example in JavaScript:

{
  let a = 10;
  let b = 20;

  test("single scope bindings", () => {
    expect(a).toBe(10);
    expect(b).toBe(20);
  });
}

Frame 1

Let's start with a simple Frame class that holds bindings in a Map and has two operations:

  • bind(name, value): store a value associated to a name
  • find(name): return the value associated with a name or undefined if not found
{
  class Frame {
    constructor() {
      this.binds = new Map();
    }

    bind(name, value) {
      this.binds.set(name, value);
      return this;
    }

    find(name) {
      return this.binds.get(name);
    }
  }

  test("single scope bindings implementation", () => {
    const env = new Frame().bind("a", 10).bind("b", 20);
    expect(env.find("a")).toBe(10);
    expect(env.find("b")).toBe(20);
  });
}

But in most languages bindings are not all in a single global namespace, in languages like JavaScript binding lookup starts in the current scope and continues in outer scopes:

{
  let a = 10;
  let b = 20;

  {
    let b = 30;

    test("nested scopes", () => {
      expect(a).toBe(10);
      expect(b).toBe(30);
    });
  }
}

Frame 2

To replicate this our new implementation of Frame gains a new attribute: up.

find starts in the current scope and if the binding is not found it continues in the scope referenced by up until the binding is found or up is null.

The method down enters a new Frame with its up attribute set to the current instance.

{
  class Frame {
    constructor(up = null) {
      this.up = up;
      this.binds = new Map();
    }

    bind(name, value) {
      this.binds.set(name, value);
      return this;
    }

    find(name) {
      const v = this.binds.get(name);
      if (v === undefined && this.up !== null) {
        return this.up.find(name);
      } else {
        return v;
      }
    }

    down() {
      return new Frame(this);
    }
  }

  test("nested scopes implementation", () => {
    const env = new Frame()
                  .bind("a", 10).bind("b", 20)
                  .down().bind("b", 30);

    expect(env.find("a")).toBe(10);
    expect(env.find("b")).toBe(30);
  });
}

But binding lookup stops for a second reason other than up being null, let's see it with an example:

{
  function f1() {
    let a = 10;
    let b = 20;
    f2();
  }

  function f2() {
    let b = 30;
    return a + b;
  }

  test("call frames", () => {
    expect(f1).toThrow();
  });
}

Frame 3

a is not available in f2 even if it was called from f1 where it was defined, this is because binding lookup stops at call frames.

We can implement this by adding a marker attribute upLimit that makes the lookup stop:

{
  class Frame {
    constructor(up = null) {
      this.up = up;
      this.upLimit = false;
      this.binds = new Map();
    }

    bind(name, value) {
      this.binds.set(name, value);
      return this;
    }

    find(name) {
      const v = this.binds.get(name);
      if (v === undefined) {
        if (this.upLimit || this.up === null) {
          return v;
        } else {
          return this.up.find(name);
        }
      } else {
        return v;
      }
    }

    down() {
      return new Frame(this.left, this);
    }

    setUpLimit() {
      this.upLimit = true;
      return this;
    }
  }

And test it:

  test("call frames implementation", () => {
    const env = new Frame()
      .bind("a", 10)
      .bind("b", 20)
      .down()
      .setUpLimit()
      .bind("b", 30);

    expect(env.find("a")).toBe(undefined);
    expect(env.find("b")).toBe(30);
  });
}

Even when binding lookup stops at the first call frame boundary there are two simple examples showing that the lookup continues "somewhere else":

{
  let a = 10;

  function f() {
    return a;
  }

  test("top level and prelude bindings", () => {
    expect(f()).toBe(10);
    expect(parseInt("42", 10)).toBe(42);
  });
}

In the first case after stopping at the first call frame it "continues" the lookup with bindings available at the top level (module) scope.

It the second case it finds a value that is not bound in our program: parseInt.

This is one of the bindings that are available everywhere without the need to include them, in JavaScript you may call it the window object, in other languages it is described as a set of bindings that are automatically imported on every module, or prelude for short.

If the "look up" stops at the call frame then after reaching that point it has to go somewhere else. We could say that module and "prelude" bindings are bound "before" the bindings in the call stack. In many cultures the past is to the left, so let's continue there.

Let's add a left attribute to our Frame class and make it work in a similar way to up, start the lookup in the current Frame and continue up until upLimit or until up is null, then continue left until leftLimit or until left is null.

The right method is similar to the down method but it returns a new Frame instance that has the current frame as its left and up set to null.

We redefine down to return a new Frame instance where left is the same as the left of the current frame and up is the current frame itself.

Frame 4

{
  class Frame {
    constructor(left = null, up = null) {
      this.up = up;
      this.left = left;
      this.upLimit = false;
      this.leftLimit = false;
      this.binds = new Map();
    }

    bind(name, value) {
      this.binds.set(name, value);
      return this;
    }

    find(name) {
      const v = this.binds.get(name);
      if (v === undefined) {
        if (this.upLimit || this.up === null) {
          if (this.leftLimit || this.left === null) {
            return v;
          } else {
            return this.left.find(name);
          }
        } else {
          return this.up.find(name);
        }
      } else {
        return v;
      }
    }

    down() {
      return new Frame(this.left, this);
    }

    right() {
      return new Frame(this, null);
    }

    setUpLimit() {
      this.upLimit = true;
      return this;
    }

    setLeftLimit() {
      this.leftLimit = true;
      return this;
    }
  }

  {
    test("prelude implementation", () => {
      const env = new Frame()
        .bind("parseInt", parseInt)
        .right()
        .bind("a", 10)
        .right()
        .down()
        .setUpLimit();

      expect(env.find("parseInt")("42", 10)).toBe(42);
      expect(env.find("a")).toBe(10);
    });
  }
}

Prototype Chains

Another thing in object oriented languages that can be described as looking up bindings is "message dispatch", let's see some examples.

If we define an empty class A in JavaScript it "inherits by default" from the Object class:

{
  class A {}

  test("Object default inheritance", () => {
    expect(new A().toString()).toBe("[object Object]");
  });
}

We can emulate the lookup of toString with the Frame class as it is:

{
  test("Object default inheritance implementation", () => {
    const a = new Frame().bind("toString",
                 () => "[object Object]").right();

    expect(a.find("toString")()).toBe("[object Object]");
  });
}

We can declare a class B that defines its own toString method:

class B {
  toString() {
    return "B!";
  }
}

test("method", () => {
  expect(new B().toString()).toBe("B!");
});

We can again emulate it with the Frame class:

test("method implementation", () => {
  const b = new Frame()
    .bind("toString", () => "[object Object]")
    .right()
    .bind("toString", () => "B!");

  expect(b.find("toString")()).toBe("B!");
});

A more complicated prototype chain:

class C extends B {
  toString() {
    return "C!";
  }
}

test("method override", () => {
  expect(new C().toString()).toBe("C!");
});

test("method override implementation", () => {
  const c = new Frame()
    .bind("toString", () => "[object Object]")
    .right()
    .bind("toString", () => "B!")
    .down()
    .bind("toString", () => "C!");

  expect(c.find("toString")()).toBe("C!");
});

A class can have instance attributes, each instance binds it's own attributes but looks up methods in the prototype chain:

class D extends C {
  constructor(count) {
    super();
    this.count = count;
  }
}

test("instance attributes", () => {
  const d1 = new D(10);
  const d2 = new D(20);

  expect(d1.toString()).toBe("C!");
  expect(d1.count).toBe(10);
  expect(d2.toString()).toBe("C!");
  expect(d2.count).toBe(20);
});

We can emulate this by having the prototype chain "to the left" and the instance attributes in its own scope.

test("method override implementation", () => {
  const D = new Frame()
    .bind("toString", () => "[object Object]")
    .right()
    .bind("toString", () => "B!")
    .down()
    .bind("toString", () => "C!")
    .down();
  const d1 = D.down().bind("count", 10);
  const d2 = D.down().bind("count", 20);

  expect(d1.find("toString")()).toBe("C!");
  expect(d1.find("count")).toBe(10);
  expect(d2.find("toString")()).toBe("C!");
  expect(d2.find("count")).toBe(20);
});

We can do an analogy and say that in OOP Object is a kind of "class prelude", the class hierarchy is an analog to nested modules and the instance is the call stack :)

Growing a Language

But manipulating frames directly doesn't feel like a programming language, if we want to create a really simple language on top we should be able to at least bind and lookup names and do some operations on those values, like arithmetic operations on numbers.

This is the point where most articles pull the "small Lisp or Forth interpreter" trick, but the initial motivation for this exploration was to find a small object oriented language that could be grown and expressed from a small set of primitives.

We are going to start with numbers, specifically integers, since we are implementing our language on top of JavaScript let's use BigInts.

To express a variable we can define a Name class that holds the variable name in its value attribute:

{
  class Name {
    constructor(value) {
      this.value = value;
    }

    getType() {
      return "Name";
    }
  }

The OOP way to eval a Name would be to send it the eval message.

To do that we need a Msg class that can hold eval as the verb.

Following the vocabulary of message, name and verb, the message is sent to the subject and has an object, in case of eval the object is the current scope:

Some verbs (called transitive verbs) take direct objects; some also take indirect objects. A direct object names the person or thing directly affected by the action of an active sentence. An indirect object names the entity indirectly affected

-- Wikipedia: Traditional Grammar

  class Msg {
    constructor(verb, obj) {
      this.verb = verb;
      this.obj = obj;
    }

    getType() {
      return "Msg";
    }
  }

Let's redefine Frame with just two extra methods:

  • eval(value): sends the message eval to value and return the result
  • send(subject, message): sends the message to the subject
  class Frame {
    constructor(left = null, up = null) {
      this.up = up;
      this.left = left;
      this.upLimit = false;
      this.leftLimit = false;
      this.binds = new Map();
    }

    eval(v) {
      return this.send(v, new Msg("eval", this));
    }

    send(s, m) {
      return this
        .find(s.getType())
        .find(m.verb)
        .call(null, s, m, this);
    }

    bind(name, value) {
      this.binds.set(name, value);
      return this;
    }

    find(name) {
      const v = this.binds.get(name);
      if (v === undefined) {
        if (this.upLimit || this.up === null) {
          if (this.leftLimit || this.left === null) {
            return v;
          } else {
            return this.left.find(name);
          }
        } else {
          return this.up.find(name);
        }
      } else {
        return v;
      }
    }

    down() {
      return new Frame(this.left, this);
    }

    right() {
      return new Frame(this, null);
    }

    setUpLimit() {
      this.upLimit = true;
      return this;
    }

    setLeftLimit() {
      this.leftLimit = true;
      return this;
    }
  }

The implementation of send:

  • gets the subject type
  • looks up the type in the environment
    • the result should be a Frame instance with the "prototype" for that type
  • it does a lookup for the Msg verb in the prototype
  • calls the handler passing the subject, message and environment as arguments

We can try it by:

  • creating an instance of Name for the name "a"
  • creating a Frame that works as the prototype for Name, it holds a binding for eval that when called does a variable lookup in the environment
  • creating a Frame for the call stack, binding "a" to 42, nameEnv to the type of Name (returned by a.getType())
  • evaluating a in env and checking that it returns 42
  test("Name resolution with eval message", () => {
    const a = new Name("a");
    const nameEnv = new Frame()
                      .bind("eval", (s, _m, e) => e.find(s.value));
    const env = new Frame()
                      .bind("a", 42)
                      .bind(a.getType(), nameEnv);

    expect(env.eval(a)).toBe(42);
  });

We now have a language that supports bindings but we still can't express message sends in it.

Let's fix this by defining a Send class that has a subject and a message as attributes:

  class Send {
    constructor(subj, msg) {
      this.subj = subj;
      this.msg = msg;
    }

    getType() {
      return "Send";
    }
  }

Since we are going to be using BigInt as our language's Int type we will need to monkey patch BigInt's prototype with our getType method to be able to lookup handlers for Ints:

  BigInt.prototype.getType = () => "Int";

Note: In the next version we are going to use Symbols to avoid monkey patching.

We can now implement message sends in our language by defining eval message handlers for:

  • Name: does a lookup for the name in the environment
  • BigInt: returns itself
  • Msg: returns a new Msg instance where the verb is the same but obj is evaluated
  • Send:
    • evaluates subject and message
    • enters a call frame
    • binds it to the subject
      • I use it instead of this to differenciate from this and self in other OOP languages
    • binds msg to the message
    • binds that for the message's object and
    • sends the evaluated msg to the evaluated subject

To have some message to send that we can test we define a handler for the + message for Ints which does a lookup for it and adds it to the value bound to that.

There's an alternative implementation commented that directly uses s and m.obj that contain the same values.

Finally we test it by building an object that represents the expression 10 + a and check that it results in 42n since a was bounds to 32n in the environment.

  test("Msg Send eval", () => {
    const nameEnv = new Frame()
                      .bind("eval", (s, _m, e) => e.find(s.value));
    const intEnv = new Frame()
      .bind("eval", (s, _m, _e) => s)
      .bind("+", (_s, _m, e) => e.find("it") + e.find("that"));
    //.bind("+", (s, m, _e) => s + m.obj);
    const msgEnv = new Frame().bind(
      "eval",
      (s, _m, e) => new Msg(s.verb, e.eval(s.obj)),
    );
    const sendEnv = new Frame().bind("eval", (s, _m, e) => {
      const subj = e.eval(s.subj),
        msg = e.eval(s.msg);
      return e
        .down()
        .setUpLimit()
        .bind("it", subj)
        .bind("msg", msg)
        .bind("that", msg.obj)
        .send(subj, msg);
    });
    const env = new Frame()
      .bind("Name", nameEnv)
      .bind("Int", intEnv)
      .bind("Msg", msgEnv)
      .bind("Send", sendEnv)
      .right()
      .bind("a", 32n);

    // 10 + a
    const expr = new Send(10n, new Msg("+", new Name("a")));
    expect(env.eval(expr)).toBe(42n);
  });
}

A Language with Syntax

Let's write a parser for our language to make it easier to test, we are going to use ohmjs

import * as ohm from "./node_modules/ohm-js/index.mjs";

Since we are going to be Growing a Language let's create an utility function to define new languages:

function mkLang(g, s) {
  const grammar = ohm.grammar(g),
    semantics = grammar.createSemantics().addOperation("toAst", s),
    parse = (code) => {
      const matchResult = grammar.match(code);

      if (matchResult.failed()) {
        console.warn("parse failed", matchResult.message);
        return null;
      }

      return semantics(matchResult).toAst();
    },
    run = (code, e) => {
      const ast = parse(code);
      return ast ? e.eval(ast) : null;
    };

  return { run, parse };
}

Let's define our types again to use Symbols instead of monkey patching and to add a base class that allows any type to be used as a reply handler for a message by implementing the call method:

class Base {
  call(_, s, m, e) {
    return e.eval(this);
  }
}

Name, Msg and Send are almost the same as before:

class Name extends Base {
  constructor(value) {
    super();
    this.value = value;
  }
}

class Msg extends Base {
  constructor(verb, obj) {
    super();
    this.verb = verb;
    this.obj = obj;
  }
}

class Send extends Base {
  constructor(subj, msg) {
    super();
    this.subj = subj;
    this.msg = msg;
  }
}

But now instead of implementing getType as a method each type is going to have a unique Symbol used to lookup its "prototype" in the call stack when looking for a message handler.

We are going to create a Symbol called typeSym to get and set the type for each object and 3 utility functions to get, set and make a type, which creates the type sets it on a class and returns the type Symbol:

const typeSym = Symbol("TypeSym"),
  getType = (v) => v[typeSym],
  setType = (Cls, type) => ((Cls.prototype[typeSym] = type), type),
  mkType = (name, Cls) => setType(Cls, Symbol(name));

Let's define types for the classes we already have:

const TYPE_NAME = mkType("Name", Name),
  TYPE_MSG = mkType("Msg", Msg),
  TYPE_SEND = mkType("Send", Send),
  TYPE_INT = mkType("Int", BigInt);

Let's redefine Frame for the last time to use getType to get the type associated with a value:

class Frame {
  constructor(left = null, up = null) {
    this.up = up;
    this.left = left;
    this.upLimit = false;
    this.leftLimit = false;
    this.binds = new Map();
  }

  eval(v) {
    return this.send(v, new Msg("eval", this));
  }

  send(s, m) {
    return this
            .find(getType(s))
            .find(m.verb)
            .call(null, s, m, this);
  }

  bind(name, value) {
    this.binds.set(name, value);
    return this;
  }

  find(name) {
    const v = this.binds.get(name);
    if (v === undefined) {
      if (this.upLimit || this.up === null) {
        if (this.leftLimit || this.left === null) {
          return v;
        } else {
          return this.left.find(name);
        }
      } else {
        return this.up.find(name);
      }
    } else {
      return v;
    }
  }

  down() {
    return new Frame(this.left, this);
  }

  right() {
    return new Frame(this, null);
  }

  setUpLimit() {
    this.upLimit = true;
    return this;
  }

  setLeftLimit() {
    this.leftLimit = true;
    return this;
  }
}

We have everything in place to create the first version of our language:

const { run: run1 } = mkLang(
  `Lang {
    Main = Send
    name = (letter | "_") (letter | "_" | digit)*
    Msg = verb Value
    verb = verbStart verbPart*
    verbStart = "+" | "-" | "*" | "/" | "-" | "%" | "&" | "<" | ">" | "!" | "?" | "." | letter
    verbPart = verbStart | digit
    Send = Value Msg*
    Value = int | name
    int = digit+
  }`,
  {
    name(_1, _2) {
      return new Name(this.sourceString);
    },
    Msg: (verb, obj) => new Msg(verb.toAst(), obj.toAst()),
    verb(_1, _2) {
      return this.sourceString;
    },
    Send: (v, msgs) =>
      msgs.children.reduce(
        (acc, msg) => new Send(acc, msg.toAst()), v.toAst()
      ),
    int(_) {
      return BigInt(this.sourceString);
    },
  },
);

Let's create another utility function to make type prototype definitions more readable:

function mkProto(obj) {
  const frame = new Frame();

  for (const name in obj) {
    frame.bind(name, obj[name]);
  }

  return frame;
}

And yet another function that creates a basic environment with eval handlers for Name, BigInt, Msg and Send that will be reused from now on to test our languages:

function mkEnv1() {
  return new Frame()
    .bind(TYPE_NAME,
       mkProto({ eval: (s, _m, e) => e.find(s.value) }))
    .bind(
      TYPE_INT,
      mkProto({
        eval: (s, _m, _e) => s,
        "+": (_s, _m, e) => e.find("it") + e.find("that"),
      }),
    )
    .bind(
      TYPE_MSG,
      mkProto({
       eval: (s, _m, e) => new Msg(s.verb, e.eval(s.obj))
      }),
    )
    .bind(
      TYPE_SEND,
      mkProto({
        eval(s, _m, e) {
          const subj = e.eval(s.subj),
            msg = e.eval(s.msg);
          return e
            .down()
            .setUpLimit()
            .bind("it", subj)
            .bind("msg", msg)
            .bind("that", msg.obj)
            .send(subj, msg);
        },
      }),
    );
}

Let's test some basic expressions in our first language:

test("Msg Send eval with parser", () => {
  const env = mkEnv1().right().bind("a", 32n);

  expect(run1("10 + 4", env)).toBe(14n);
  expect(run1("10 + a", env)).toBe(42n);
  expect(run1("10 + a + 4", env)).toBe(46n);
});

Conditionals

After arithmetic operations the next feature that sets a language apart from an advanced calculator are conditional expressions, to support them we need some new types, one to express false when evaluating conditions, we can also use it to express the lack of a value, a useful type for this is usually called null, nil or Unit, in our language it will be called Nil and its syntax will be ():

Let's create the class and a singleton instance:

class Nil extends Base {}
const NIL = new Nil();

For conditionals we need a way to express two branches and pick one of them, for that and, as Lisp as taught us, many other reasons we are going to create the Pair type that has two fields, not car/cdr, not first/rest, not head/tail but a and b:

class Pair extends Base {
  constructor(a, b) {
    super();
    this.a = a;
    this.b = b;
  }
}

The final ingredient for conditionals is the Later type, which I will describe... later ;)

class Later extends Base {
  constructor(value) {
    super();
    this.value = value;
  }
}

Let's not forget to create the Symbols for the new types:

const TYPE_NIL = mkType("Nil", Nil),
  TYPE_PAIR = mkType("Pair", Pair),
  TYPE_LATER = mkType("Later", Later);

The second version of our language adds support for the new types:

const { run: run2 } = mkLang(
  `Lang {
    Main = Send
    nil = "(" ")"
    Pair = PairHead ":" Value
    PairHead = Scalar | Later | ParSend
    name = (letter | "_") (letter | "_" | digit)*
    Msg = verb Value
    verb = verbStart verbPart*
    verbStart = "+" | "-" | "*" | "/" | "-" | "%" | "&" | "<" | ">" | "!" | "?" | "." | letter
    verbPart = verbStart | digit
    Send = Value Msg*
    ParSend = "(" Send ")"
    Later = "@" Value
    Value = Pair | PairHead
    Scalar = int | nil | name
    int = digit+
  }`,
  {
    nil: (_o, _c) => NIL,
    Pair: (a, _, b) => new Pair(a.toAst(), b.toAst()),
    name(_1, _2) {
      return new Name(this.sourceString);
    },
    Msg: (verb, obj) => new Msg(verb.toAst(), obj.toAst()),
    verb(_1, _2) {
      return this.sourceString;
    },
    Send: (v, msgs) =>
      msgs.children.reduce(
        (acc, msg) => new Send(acc, msg.toAst()), v.toAst()
      ),
    ParSend: (_o, v, _c) => v.toAst(),
    Later: (_, v) => new Later(v.toAst()),
    int(_) {
      return BigInt(this.sourceString);
    },
  },
);

The simplest implementation for conditionals in a language with no side effects and free CPU time could be a message with the format condition ? when-true : when-false where:

  • ? is a message sent to a condition expression
  • when-true : when-false is the message object as a pair of expressions
  • a message reply handler on Nil that picks the second item of the Pair
  • an implementation for the remaining types that picks the Pair's first item
test("eager conditional", () => {
  const env = mkEnv1()
    .bind(
      TYPE_NIL,
      mkProto({
        eval: (s, _m, e) => s,
        "?": (s, m, e) => m.obj.b
      }),
    )
    .bind(
      TYPE_PAIR,
      mkProto({
        eval: (s, _m, e) => new Pair(e.eval(s.a), e.eval(s.b))
      }),
    )
    .bind(
      TYPE_INT,
      mkProto({
       eval: (s, _m, e) => s,
       "?": (s, m, e) => m.obj.a
      }),
    )
    .right();

  expect(run2("0 ? 1 : 2", env)).toBe(1n);
  expect(run2("() ? 1 : 2", env)).toBe(2n);
  expect(run2("() ? 1 : () ? 2 : 3", env)).toBe(3n);
  expect(() => run2("0 ? 1 : (1 * 2)", env)).toThrow();
});

We can see that the last test throws, this is because there's no reply handler for * defined for Ints. From this we can tell that this implementation evaluates both sides of the pair, something we probably don't want.

Let's fix this by implementing eval for Later which wraps any other value and returns the wrapped value unevaluated on eval:

test("lazy conditional", () => {
  const env = mkEnv1()
    .bind(TYPE_LATER,
       mkProto({ eval: (s, _m, e) => s.value }))
    .bind(
      TYPE_NIL,
      mkProto({
        eval: (s, _m, e) => s,
        "?": (s, m, e) => e.eval(m.obj.b)
      }),
    )
    .bind(
      TYPE_PAIR,
      mkProto({
        eval: (s, _m, e) => new Pair(e.eval(s.a), e.eval(s.b))
      }),
    )
    .bind(
      TYPE_INT,
      mkProto({
        eval: (s, _m, e) => s,
        "?": (s, m, e) => e.eval(m.obj.a)
      }),
    )
    .right();

  expect(run2("0 ? 1 : 2", env)).toBe(1n);
  expect(run2("() ? 1 : 2", env)).toBe(2n);
  expect(run2("() ? 1 : () ? 2 : 3", env)).toBe(3n);
  expect(run2("0 ? @ 1 : (1 * 2)", env)).toBe(1n);
});

With Later we can "delay" the evaluation of the pair until we know which branch we want to take.

Notice that the implementations of ? for Nil and Int now have to explicitly evaluate the branch they take.

Blocks

The next feature we probably want is the ability to define reply handlers in our language instead of "native" JavaScript functions.

To test this we need to be able to have more than one expression in our language.

We could do it with pairs but let's create a Block type which contains a sequence of expressions that on eval it evaluates each in turn and returns the result of the last one:

class Block extends Base {
  constructor(value) {
    super();
    this.value = value;
  }
}

Let's add the type for Block and to avoid repeating a lot of code for small changes let's also introduce Float and Str to our language by adding their type tags and adding them to the parser:

// NOTE: we wrap string to be able to attach a Symbol at runtime further down
class Str extends String {}

const TYPE_BLOCK = mkType("Block", Block),
  TYPE_FLOAT = mkType("Float", Number),
  TYPE_STR = mkType("Str", Str);

const { run: run3 } = mkLang(
  `Lang {
    Main = Send
    nil = "(" ")"
    Pair = PairHead ":" Value
    PairHead = Block | Scalar | Later | ParSend
    name = (letter | "_") (letter | "_" | digit)*
    Block = "{" Exprs "}"
    Exprs = Send ("," Send )*
    Msg = verb Value
    MsgQuote = "\\\\" Msg
    verb = verbStart verbPart*
    verbStart = "+" | "-" | "*" | "/" | "-" | "%" | "&" | "<" | ">" | "!" | "?" | "." | letter
    verbPart = verbStart | digit
    Send = Value Msg*
    ParSend = "(" Send ")"
    Later = "@" Value
    Value = Pair | PairHead
    Scalar = float | int | str | nil | name | MsgQuote
    float = digit+ "." digit+
    int = digit+
    str = "\\\"" (~"\\\"" any)* "\\\""
  }`,
  {
    nil: (_o, _c) => NIL,
    Pair: (a, _, b) => new Pair(a.toAst(), b.toAst()),
    name(_1, _2) {
      return new Name(this.sourceString);
    },
    Block: (_o, exprs, _c) => new Block(exprs.toAst()),
    Exprs: (first, _, rest) =>
      [first.toAst()]
        .concat(rest.children.map((v) => v.toAst())),
    Msg: (verb, obj) => new Msg(verb.toAst(), obj.toAst()),
    verb(_1, _2) {
      return this.sourceString;
    },
    MsgQuote: (_, msg) => msg.toAst(),
    Send: (v, msgs) =>
      msgs.children.reduce(
        (acc, msg) => new Send(acc, msg.toAst()), v.toAst()
      ),
    ParSend: (_o, v, _c) => v.toAst(),
    Later: (_, v) => new Later(v.toAst()),
    int(_) {
      return BigInt(this.sourceString);
    },
    float(_a, _d, _b) {
      return parseFloat(this.sourceString);
    },
    str: (_1, s, _3) => new Str(s.sourceString),
  },
);

Message Reply Definition

With block support let's implement "message reply definition".

Since we are going to be using this message handlers in subsequent tests let's define a function to create an environment that supports reply definitions:

function mkEnv2() {
  return mkEnv1()
    .bind(TYPE_LATER, mkProto({ eval: (s, _m, e) => s.value }))
    .bind(
      TYPE_BLOCK,
      mkProto({
        eval: (s, _m, e) => {
          let r = NIL;
          for (const item of s.value) {
            r = e.eval(item);
          }
          return r;
        },
      }),
    )
    .bind(
      TYPE_SEND,
      mkProto({
        eval(s, _m, e) {
          const subj = e.eval(s.subj),
            msg = e.eval(s.msg);
          return e
            .down()
            .setUpLimit()
            .bind("it", subj)
            .bind("msg", msg)
            .bind("that", msg.obj)
            .send(subj, msg);
        },
        replies(s, m, e) {
          const target = e.up.eval(s.subj),
            targetType = getType(target),
            msgVerb = s.msg.verb,
            impl = m.obj,
            proto = e.up.find(targetType);

          proto.bind(msgVerb, impl);
          return NIL;
        },
      }),
    );
}

And test it:

test("Msg Send reply definition", () => {
  const env = mkEnv2().right();

  const code = "{@(0 add+1 0) replies @(it + that + 1), 1 add+1 2}";
  expect(run3(code, env)).toBe(4n);
});

We support reply definitions by adding a handler for the reply message on the Send type, without Later there's no way to send a message to a Send but with it we can "quote" a Send and send a message to it, yes, we send a message to a message send.

replies implementation:

  • takes Send's subject
  • gets its type
  • finds the current prototype for it in the environment
  • binds a handler for Send's verb using replies' object

A little convoluted, let's try again, this is the shape of an expression to define a reply to a message: @SampleSend replies @ReplyImplementation.

SampleSend is a Send expression, which we get by using Later on a Send to delay its evaluation, it's an example of the kind of expression that we want to handle.

As a reminder Send has the shape Subject Verb Object.

We take SampleSend's subject to get the type associated with the new reply.

From SampleSend we also get the verb that we want to reply to.

Finally ReplyImplementation is used as the handler for the message, which you have to quote to delay its evaluation until the message is handled.

Walk and Map

We still don't have iteration, there are many ways to implement it but here's a fun set of "primitives" i've been playing with: walk and map.

  • map forwards the quoted message to the subject's items
  • walk also forwards a quoted message but it forwards the walk message itself, not the quoted one, this makes it recursive
  • scalar values implement handlers for both by sending the quoted message to themselves
test("walk and map", () => {
  function esend(s, m, e) {
    return e.eval(new Send(s, m));
  }

  function pair(a, b) {
    return new Pair(a, b);
  }

  const env = mkEnv1()
    .bind(
      TYPE_PAIR,
      mkProto({
        eval: (s, _m, e) =>
                 pair(e.eval(s.a), e.eval(s.b)),
        walk: (s, m, e) =>
                 pair(esend(s.a, m, e), esend(s.b, m, e)),
        map: (s, m, e) =>
                 pair(esend(s.a, m.obj, e), esend(s.b, m.obj, e)),
      }),
    )
    .bind(
      TYPE_INT,
      mkProto({
        eval: (s, _m, _e) => s,
        "+": (s, m, e) => e.find("it") + e.find("that"),
        walk: (s, m, e) => esend(s, m.obj, e),
        map: (s, m, e) => esend(s, m.obj, e),
      }),
    )
    .right();

  expect(run3("1 walk \\ + 2", env)).toBe(3n);
  expect(run3("1 map \\ + 2", env)).toBe(3n);

  const p1 = run3("1 : 2 map \\ + 2", env);
  expect(p1.a).toBe(3n);
  expect(p1.b).toBe(4n);

  const p2 = run3("1 : 2 : 3 walk \\ + 2", env);
  expect(p2.a).toBe(3n);
  expect(p2.b.a).toBe(4n);
  expect(p2.b.b).toBe(5n);
});

Custom Types

You may be asking "but what about user defined types?", well, glad you asked because I was planning on explaining that just about now.

We first need to bring the Symbol type to our language:

const TYPE_SYM = mkType("Symbol", Symbol);

Then we need a way to create new Symbols, instead of adding syntax for it we are going to add a reply to the as-type message for strings.

And an apply-to handler to the Symbol type to apply itself as the type to the message object.

And... that it?

test("custom type definition", () => {
  const env = mkEnv2()
    .bind(TYPE_NIL, mkProto({ eval: (s) => s }))
    .bind(
      TYPE_PAIR,
      mkProto({
        eval: (s, _m, e) => new Pair(e.eval(s.a), e.eval(s.b)),
      }),
    )
    .bind(
      TYPE_STR,
      mkProto({
        eval: (s) => s,
        "as-type"(s, _m, e) {
          const type = Symbol(s);
          e.left.bind(
            type,
            new Frame().bind("eval", (s) => s),
          );
          return type;
        },
        "+": (s, m) => new Str(s + ("" + m.obj)),
        "say-hello": (s) => new Str(`Hello, ${s}!`),
      }),
    )
    .bind(
      TYPE_NAME,
      mkProto({
        eval: (s, _m, e) => e.find(s.value),
        is(s, m, e) {
          e.up.bind(s.value, m.obj);
          return m.obj;
        },
      }),
    )
    .bind(
      TYPE_SYM,
      mkProto({
        eval: (s) => s,
        "apply-to"(s, m) {
          m.obj[typeSym] = s;
          return s;
        },
      }),
    )
    .right();

  const pair = run3(
    `{
    @MyType is ("my-type" as-type ()),
    @name is "Joe",
     MyType apply-to name,
     @(name say-hello ()) replies @("Well, hello " + it),
     (name say-hello ()) : ("Mike" say-hello ())
   }`,
    env,
  );
  // coerse to String from Str to test
  expect("" + pair.a).toBe("Well, hello Joe");
  expect("" + pair.b).toBe("Hello, Mike!");
});

Let's go line by line:

@MyType is ("my-type" as-type ())

Define a new type with label "my-type" and bind it to the name MyType.

Notice that until now we had no way to bind new values in the environment, we defined a handler for is in the Name type that binds the object in the environment for the current name. Since each message handler enters its own call frame we bind it in the parent frame.

@name is "Joe"

Bind the String "joe" to the name name.

MyType apply-to name

Apply the type bound in MyType to the value bound in name.

@(name say-hello ()) replies @("Well, hello " + it)

Define a reply for the say-hello message for the type in name (notice that replies evaluates the subject in the current environment before getting the type so the type is not Name but our new type.

(name say-hello ()) : ("Mike" say-hello ())

Return a pair with a being the result of say-hello in our type and b the same message but on a String.

Future Work

That's all for now, here's a list of things I want to improve, explore, expand on this topic:

  • how to support closures in Frame
  • somehow support reply handlers that don't enter a call frame for lightweight things like arithmetic operators
  • since the language semantics are "late bound" in the environment we can do multiple passes on the same program by having different environments with different semantics and having the result of evaluating a program in one environment be the input of the next "phase", why stop at "compile time" and "runtime"?
  • explore the "emergent behavior" of the current design:
    • since a type message handler is in the stack we can "shadow" type methods per module or even in the call stack
    • alternative where prototype lookup is dynamically scoped and not lexically scoped, either by default or with a different syntax
  • have Frame as a first class type, and base objects on it?
  • make bind override the value in the right scope if the binding already exists
  • make send dispatch not only on subject's type but also on the message object type too