Ir al contenido principal

Hi, I'm Mariano Guerra, below is my blog, if you want to learn more about me and what I do check a summary here: marianoguerra.github.io or find me on twitter @warianoguerra or Mastodon @marianoguerra@hachyderm.io

Elixir Flavoured Erlang: Who Transpiles the Transpiler?

When a compiler is implemented in the language it compiles it's called bootstrapping.

What's the name for a transpiler transpiled to the target language?

Note: The recomended reading setup is to open the Inception Button and click it when appropiate


I've been working on an off on Elixir Flavoured Erlang: an Erlang to Elixir Transpiler for a while, in recent months a nice netizen called eksperimental started reporting cases where the transpiler was generating semantically incorrect Elixir code, that means that the code compiled but didn't do the same thing as the Erlang version.

I started noticing a pattern in the reports and thinking: Is he/she trying to do what I wanted to do since the beginning of this project?

As the name suggests, Elixir Flavoured Erlang is an Erlang to Elixir transpiler... written in Erlang.

The joke that started this project was that I could write Elixir without ever actually writing Elixir and the final step would be to transpile the transpiler with itself and make it work.

Since I closed all open issues (except keeping comments) I decided to give it a go.

What I needed to do to try this is:

  • Create an escript mix project and configure it accordingly

  • Transpile the project from Erlang to Elixir into the lib folder

  • Build the project

  • Transpile something with both versions

  • Diff both outputs to check they are equal

The testing strategy during development was to transpile Erlang/OTP to Elixir (otp.ex).

Transpiling it in this case should exercise most of the code paths.

All the steps are automated in the make inception target

1st Attempt, almost...

The first attempt failed by giving me the usage as if I was passing the wrong command line options, but I wasn't.

The problem was caused by Elixir passing binary strings to the escript entry point while Erlang passes list strings, it was fixed by catching the Elixir case, converting the arguments to list strings and calling main again with them.

2nd attempt, is absence of evidence evidence of absence?

After fixing that I ran it again and the diff didn't generate any output.

I wasn't sure if it worked or not, I introduced a change in the output manually and ran the diff line again, this time it displayed the difference.

That meant the transpiled transpiler is identical to the original at least when transpiling Erlang/OTP.

Some of the recent changes

In the previous post about the project I listed the special cases and tricks I had to do to transpile OTP, here are the main changes introduced after that.

Quoting all reserved keywords when used as identifiers

Elixir reserved keywords

Just one example since all are the same except for the identifier:

'true'() -> ok.

Translates to

def unquote(:true)() do
  :ok
end

Fixed improper cons list conversion

cons([]) -> ok;
cons([1]) -> ok;
cons([1, 2]) -> ok;
cons([1 | 2]) -> ok;
cons([[1, 2], 3]) -> ok;
cons([[1, 2] | 3]) -> ok;
cons([[1 | 2] | 3]) -> ok;
cons([0, [1, 2]]) -> ok;
cons([0, [1 | 2]]) -> ok;
% equivalent to [1, 2, 3]
cons([0 | [1, 2]]) -> ok;
cons([[1, [2, [3]]]]) -> ok;
cons([[-1, 0] | [1, 2]]) -> ok.

Made single line to save vertical space

def cons([]) do # ...
def cons([1]) do # ...
def cons([1, 2]) do # ...
def cons([1 | 2]) do # ...
def cons([[1, 2], 3]) do # ...
def cons([[1, 2] | 3]) do # ...
def cons([[1 | 2] | 3]) do # ...
def cons([0, [1, 2]]) do # ...
def cons([0, [1 | 2]]) do # ...
def cons([0, 1, 2]) do # ...
def cons([[1, [2, [3]]]]) do # ...
def cons([[- 1, 0], 1, 2]) do # ...

Transpile map exact and assoc updates into map syntax, Map.put and Map.merge

In Erlang there are two ways to set a map key: exact and assoc.

Exact uses the := operator and will only work if the key already exists in the map:

1> M = #{a => 1}.
#{a => 1}

2> M#{a := 2}.
#{a => 2}

3> M#{b := 2}.
** exception error: {badkey,b}

Assoc uses the => operator and works if the key exists or if it doesn't:

1> M = #{a => 1}.
#{a => 1}

2> M#{a => 2}.
#{a => 2}

3> M#{b => 2}.
#{a => 1,b => 2}

In Elixir only exact has syntax using the => operator (: as syntactic sugar for atom keys):

iex(1)> m = %{a: 1}
%{a: 1}

iex(2)> m = %{:a => 1} # equivalent
%{a: 1}

iex(3)> %{m | a: 2}
%{a: 2}

iex(4)> %{m | b: 2}
** (KeyError) key :b not found in: %{a: 1}

iex(4)> %{m | :b => 2}
** (KeyError) key :b not found in: %{a: 1}

The easiest solution would be to transpile all cases to Map.merge/2

But the right solution would be to use the most ideomatic for each case:

  • If Erlang uses exact, transpile to Elixir map update syntax

  • If Erlang uses assoc

  • If Erlang uses both split it and use the right syntax for each

Let's see it with examples:

put_atom() ->
        M = #{},
        M0 = M#{},
        M1 = M#{a => 1},
        M2 = M1#{a := 1},
        M3 = M#{a => 1, b => 2},
        M4 = M3#{a := 1, b => 2},
        M5 = M1#{a := 1, b := 2},
        M6 = M1#{a := 1, b := 2, c => 3, d => 4},
        {M0, M1, M2, M3, M4, M5, M6}.

put_key() ->
        M = #{},
        M1 = M#{<<"a">> => 1},
        M2 = M1#{<<"a">> := 1},
        M3 = M#{<<"a">> => 1, <<"b">> => 2},
        M4 = M3#{<<"a">> := 1, <<"b">> => 2},
        M5 = M1#{<<"a">> := 1, <<"b">> := 2},
        M6 = M1#{<<"a">> := 1, <<"b">> := 2, <<"c">> => 3, <<"d">> => 4},
        {M1, M2, M3, M4, M5, M6}.

quoted_atom_key(M) ->
        M#{'a-b' := 1}.

Compiles to:

def put_atom() do
  m = %{}
  m0 = m
  m1 = Map.put(m, :a, 1)
  m2 = %{m1 | a: 1}
  m3 = Map.merge(m, %{a: 1, b: 2})
  m4 = Map.put(%{m3 | a: 1}, :b, 2)
  m5 = %{m1 | a: 1, b: 2}
  m6 = Map.merge(%{m1 | a: 1, b: 2}, %{c: 3, d: 4})
  {m0, m1, m2, m3, m4, m5, m6}
end

def put_key() do
  m = %{}
  m1 = Map.put(m, "a", 1)
  m2 = %{m1 | "a" => 1}
  m3 = Map.merge(m, %{"a" => 1, "b" => 2})
  m4 = Map.put(%{m3 | "a" => 1}, "b", 2)
  m5 = %{m1 | "a" => 1, "b" => 2}
  m6 = Map.merge(%{m1 | "a" => 1, "b" => 2}, %{"c" => 3, "d" => 4})
  {m1, m2, m3, m4, m5, m6}
end

def quoted_atom_key(m) do
  %{m | "a-b": 1}
end

What's next

To be sure it works in all cases I would like to make it possible to translate Erlang projects and run the project tests after transpiling, if you are interested in helping contact me on twitter @warianoguerra