Multi-Paxos with riak_ensemble Part 3

In the previous post I showed how to use riak_ensemble in a rebar3 project; now I will show how to create an HTTP API for the key/value store using Cowboy and jsone.

This post assumes that you have Erlang and rebar3 installed; I'm using Erlang 19.3 and rebar3 3.4.3.

The source code for this post is at https://github.com/marianoguerra/cadena; check the commits for the steps.

Dependency Setup

To have an HTTP API we will need an HTTP server, in our case we will use Cowboy 2.0 RC 3, for that we need to:

  1. Add it as a dependency (we will load it from git since it's still a release candidate)
  2. Add it to our list of applications to start when our application starts
  3. Add it to the list of dependencies to include in our release
  4. Set up the HTTP listener and routes when our application starts (sketched below)
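
As a reference, here is a minimal sketch of what the listener setup might look like, assuming the Cowboy 2.0 API; the handler module name cadena_h_keys comes from this post, while the helper name start_http and the application env keys are illustrative, not the exact code from the repo:

start_http() ->
    % one route: /keys/:ensemble/:key, handled by cadena_h_keys
    Dispatch = cowboy_router:compile([
        {'_', [{"/keys/:ensemble/:key", cadena_h_keys, []}]}
    ]),
    % port and acceptor count come from the configuration described below
    Port = application:get_env(cadena, http_port, 8080),
    Acceptors = application:get_env(cadena, http_acceptors, 100),
    cowboy:start_clear(cadena_http,
                       [{port, Port}, {num_acceptors, Acceptors}],
                       #{env => #{dispatch => Dispatch}}).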

We set up just one route, handled by the cadena_h_keys module. It's a plain HTTP handler, no fancy REST stuff for now: we handle the request in the init/2 function itself, pattern matching against the method field of the request object, and handle:

POST
set a key in a given ensemble to the value sent in the JSON request body
GET
get a key in a given ensemble; if not found, null is returned in the value field of the response
DELETE
delete a key in a given ensemble; returns null both if the key existed and if it didn't

Any other method gets a 405 Method Not Allowed response.

The route has the format /keys/<ensemble>/<key>, for now we only allow the root ensemble to be set in the <ensemble> part of the path.
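Putting it together, here is a condensed sketch of such a handler; cadena_h_keys is the module name from this post, but the helper names (handle_req, reply) are illustrative, and the kdelete call is assumed to mirror the kget/kover calls from part 1, so check the repo for the exact code:

init(Req0=#{method := Method}, State) ->
    % path bindings come from the /keys/:ensemble/:key route
    % (the real handler only accepts the root ensemble)
    Ensemble = binary_to_atom(cowboy_req:binding(ensemble, Req0), utf8),
    Key = cowboy_req:binding(key, Req0),
    Req = handle_req(Method, Ensemble, Key, Req0),
    {ok, Req, State}.

handle_req(<<"POST">>, Ensemble, Key, Req0) ->
    {ok, Body, Req} = cowboy_req:read_body(Req0),
    Value = jsone:decode(Body),
    reply(riak_ensemble_client:kover(node(), Ensemble, Key, Value, 5000), Req);
handle_req(<<"GET">>, Ensemble, Key, Req) ->
    reply(riak_ensemble_client:kget(node(), Ensemble, Key, 5000), Req);
handle_req(<<"DELETE">>, Ensemble, Key, Req) ->
    reply(riak_ensemble_client:kdelete(node(), Ensemble, Key, 5000), Req);
handle_req(_Method, _Ensemble, _Key, Req) ->
    % anything else: 405 Method Not Allowed
    cowboy_req:reply(405, #{}, <<>>, Req).

% turn a riak_ensemble obj tuple into the JSON responses shown below
reply({ok, {obj, Epoch, Seq, Key, Value0}}, Req) ->
    Value = case Value0 of notfound -> null; _ -> Value0 end,
    Data = #{epoch => Epoch, seq => Seq, key => Key, value => Value},
    Json = jsone:encode(#{ok => true, data => Data}),
    cowboy_req:reply(200, #{<<"content-type">> => <<"application/json">>},
                     Json, Req);
reply({error, _Reason}, Req) ->
    cowboy_req:reply(500, #{}, <<>>, Req).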

We also add the jsone library to encode/decode JSON and the lager library to log messages.

We add both to the list of dependencies to include in the release.
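As a quick reference, a jsone round trip in the shell looks like this:

1> jsone:encode(#{<<"number">> => 42}).
<<"{\"number\":42}">>
2> jsone:decode(<<"{\"number\":42}">>).
#{<<"number">> => 42}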

We will also need a way to override the HTTP port each instance listens on, so that we can run a cluster on one computer with each node serving HTTP requests on a different port.

The dev and prod releases will listen on 8080 as specified in vars.config.

node1 will listen on port 8081 (override in vars_node1.config)

node2 will listen on port 8082 (override in vars_node2.config)

node3 will listen on port 8083 (override in vars_node3.config)
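Each override file is just a list of Erlang terms consumed as relx overlay variables; as a sketch, vars_node1.config would contain something like this (the variable name http_port is illustrative, check the repo for the real one):

% config/vars_node1.config (sketch)
{http_port, 8081}.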

To avoid having to configure this in sys.config we will define a cuttlefish schema in config.schema that cuttlefish will use to generate a default config file and validation code for us.

We have to replace the variables from the overrides in our config.schema file for each release before cuttlefish processes it; for that we use the template directive in an overlay section of the release config.
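As a sketch, the schema entry for the port could look like the following; the {mapping, ...} syntax is cuttlefish's, but the exact entries live in config.schema in the repo. The {{http_port}} placeholder is what the overlay template step replaces per release:

%% @doc port to listen to for HTTP API
{mapping, "http.port", "cadena.http_port", [
  {default, {{http_port}}},
  {datatype, integer}
]}.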

Build devrel:

make devrel

Check the configuration file generated for each node at:

_build/node1/rel/cadena/etc/cadena.conf
_build/node2/rel/cadena/etc/cadena.conf
_build/node3/rel/cadena/etc/cadena.conf

The first part is of interest to us; it looks like this for node1 (the port number differs in node2 and node3):

## port to listen to for HTTP API
##
## Default: 8081
##
## Acceptable values:
##   - an integer
http.port = 8081

## number of acceptors to use for HTTP API
##
## Default: 100
##
## Acceptable values:
##   - an integer
http.acceptors = 100

## folder where ensemble data is stored
##
## Default: ./cadena_data
##
## Acceptable values:
##   - text
data.dir = ./cadena_data

Start 3 nodes in 3 different shells:

make node1-console
make node2-console
make node3-console

Start the ensemble and join the nodes; I created a target called devrel-setup in the Makefile to make it easier:

make devrel-setup

Let's set key1 in ensemble root to 42 on node1 (port 8081):

curl -X POST http://localhost:8081/keys/root/key1 -d 42

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Let's get key1 in ensemble root on node2 (port 8082):

curl -X GET http://localhost:8082/keys/root/key1

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Same on node3:

curl -X GET http://localhost:8083/keys/root/key1

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Overwrite key1 on node1:

curl -X POST http://localhost:8081/keys/root/key1 -d '{"number": 42}'

Response:

{"data":{"epoch":2,"key":"key1","seq":400,"value":{"number":42}},"ok":true}

Get key2, which we haven't set yet, on node2; note the null value:

curl -X GET http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":11,"value":null},"ok":true}

Let's set key2 in ensemble root to {"number": 42} on node1 (port 8081):

curl -X POST http://localhost:8081/keys/root/key2 -d '{"number": 42}'

Response:

{"data":{"epoch":3,"key":"key2","seq":67,"value":{"number":42}},"ok":true}

Get it on node2:

curl -X GET http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":67,"value":{"number":42}},"ok":true}

Delete key2 in ensemble root on node2:

curl -X DELETE http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":137,"value":null},"ok":true}

Check that it was removed by trying to get it again on node2:

curl -X GET http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":137,"value":null},"ok":true}

There you go, now you have a consistent key/value store with an HTTP API.

Multi-Paxos with riak_ensemble Part 2

In the previous post I showed how to use riak_ensemble from the interactive shell; now I will show how to use it from a real project with rebar3.

This post assumes that you have Erlang and rebar3 installed; I'm using Erlang 19.3 and rebar3 3.4.3.

The source code for this post is at https://github.com/marianoguerra/cadena; check the commits for the steps.

Create Project

rebar3 new app name=cadena
cd cadena

The project structure should look like this:

.
├── LICENSE
├── README.md
├── rebar.config
└── src
    ├── cadena_app.erl
    ├── cadena.app.src
    └── cadena_sup.erl

1 directory, 6 files

Configuring Dev Release

We do the following steps; check the links for comments on what's going on in each step:

  1. Add Dependencies
  2. Configure relx section
    1. Add overlay variables file vars.config
    2. Add sys.config
    3. Add vm.args

Build a release to test that everything is setup correctly:

$ rebar3 release

Run the release interactively with a console:

$ _build/default/rel/cadena/bin/cadena console

Output (edited and paths redacted for clarity):

Exec: erlexec
        -boot _build/default/rel/cadena/releases/0.1.0/cadena
        -boot_var ERTS_LIB_DIR erts-8.3/../lib
        -mode embedded
        -config    _build/default/rel/cadena/generated.conf/app.1.config
        -args_file _build/default/rel/cadena/generated.conf/vm.1.args
        -vm_args   _build/default/rel/cadena/generated.conf/vm.1.args
        -- console

Root: _build/default/rel/cadena
Erlang/OTP 19 [erts-8.3] [source] [64-bit] [smp:4:4] [async-threads:64]
                      [kernel-poll:true]

18:31:12.150 [info] Application lager started on node 'cadena@127.0.0.1'
18:31:12.151 [info] Application cadena started on node 'cadena@127.0.0.1'
Eshell V8.3  (abort with ^G)
(cadena@127.0.0.1)1>

Quit:

(cadena@127.0.0.1)1> q().
ok

Non-interactive start:

$ _build/default/rel/cadena/bin/cadena start

No output is generated if it started correctly; we can check if it's running by pinging the application:

$ _build/default/rel/cadena/bin/cadena ping

We should get:

pong

If we want we can attach a console to the running system:

$ _build/default/rel/cadena/bin/cadena attach

Output:

Attaching to /tmp/erl_pipes/cadena@127.0.0.1/erlang.pipe.1 (^D to exit)

(cadena@127.0.0.1)1>

If we press Ctrl+d we can detach the console without stopping the system:

(cadena@127.0.0.1)1> [Quit]

We can stop the system whenever we want issuing the stop command:

$ _build/default/rel/cadena/bin/cadena stop

Output:

ok

Note

Use Ctrl+d to exit; if we write q(). we not only detach the console but also stop the system!

Let's try it.

Non-interactive start:

$ _build/default/rel/cadena/bin/cadena start

No output is generated if it started correctly; we can check if it's running by pinging the application:

$ _build/default/rel/cadena/bin/cadena ping

We should get:

pong

If we want we can attach a console to the running system:

$ _build/default/rel/cadena/bin/cadena attach

Output:

Attaching to /tmp/erl_pipes/cadena@127.0.0.1/erlang.pipe.1 (^D to exit)

(cadena@127.0.0.1)1>

Now let's quit with q():

(cadena@127.0.0.1)1> q().

Output:

ok

Now let's see if it's alive:

$ _build/default/rel/cadena/bin/cadena ping

Node 'cadena@127.0.0.1' not responding to pings.

Be careful with how you quit attached consoles in production systems :)

Configure Prod and Dev Cluster Releases

Building Prod Release

We start by adding a new section called profiles to rebar.config, defining 4 profiles that override the default release config with specific values. Let's start by trying the prod profile, which we will use to create production releases of the project:

rebar3 as prod release

Output:

===> Verifying dependencies...
...
===> Compiling cadena
===> Running cuttlefish schema generator
===> Starting relx build process ...
===> Resolving OTP Applications from directories:
          _build/prod/lib
          erl-19.3/lib
===> Resolved cadena-0.1.0
===> Including Erts from erl-19.3
===> release successfully created!
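For reference, the profiles section we just used might look roughly like this; a sketch containing only the options discussed in this post, the exact contents are in rebar.config in the repo:

{profiles, [
    {prod,  [{relx, [{dev_mode, false},
                     {include_erts, true}]}]},
    {node1, [{relx, [{overlay_vars, "config/vars_node1.config"}]}]},
    {node2, [{relx, [{overlay_vars, "config/vars_node2.config"}]}]},
    {node3, [{relx, [{overlay_vars, "config/vars_node3.config"}]}]}
]}.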

Notice now that we have a new folder in the _build directory:

$ ls -1 _build

Output:

default
prod

The results of the commands run "as prod" are stored in the prod folder.

You will notice, if you explore the prod/rel/cadena folder, that there's a folder called erts-8.3 (the version may differ if you are using a different Erlang version); that folder is there because of the include_erts option we overrode in the prod profile.

This means you can zip the _build/prod/rel/cadena folder, upload it to a server that doesn't have Erlang installed and still run your release there.

This is a good way to be sure that the version running in production is the same you use in development or at build time in your build server.

Just be careful with deploying to an operating system too different from the one you used to create the release, because you may run into problems with native libraries such as libc or OpenSSL.

Running it is done as usual, only the path changes:

_build/prod/rel/cadena/bin/cadena console

_build/prod/rel/cadena/bin/cadena start
_build/prod/rel/cadena/bin/cadena ping
_build/prod/rel/cadena/bin/cadena attach
_build/prod/rel/cadena/bin/cadena stop

Building Dev Cluster Releases

To build a cluster we need at least 3 nodes; that's why the last 3 profiles are node1, node2 and node3. They need different node names, so we use the overlay var files to override the name of each: config/vars_node1.config for node1, config/vars_node2.config for node2 and config/vars_node3.config for node3.
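As a sketch, and assuming the overlay variable is simply called node (check config/vars_node1.config in the repo for the real contents), the override for node1 would look like:

{node, "node1@127.0.0.1"}.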

Now let's build them:

rebar3 as node1 release
rebar3 as node2 release
rebar3 as node3 release

The output for each should be similar to the one for the prod release.

Now on three different shells start each node:

./_build/node1/rel/cadena/bin/cadena console

Check the name of the node in the shell:

(node1@127.0.0.1)1>

Do the same for node2 and node3 on different shells:

./_build/node2/rel/cadena/bin/cadena console
./_build/node3/rel/cadena/bin/cadena console

You should get respectively:

(node2@127.0.0.1)1>

And:

(node3@127.0.0.1)1>

In case you don't remember, you can quit with q().

Joining the Cluster Together

Up to here we built 3 releases of the same code with slight modifications to allow running a cluster on one computer, but 3 running nodes don't make a cluster; for that we need to apply what we learned in Multi-Paxos with riak_ensemble Part 1, but now in code instead of interactively.

For that we will create a cadena_console module that we will use to make calls from the outside and trigger actions on each node; the code is similar to the one presented in Multi-Paxos with riak_ensemble Part 1.

join([NodeStr]) ->
    % node name comes as a list string, we need it as an atom
    Node = list_to_atom(NodeStr),
    % check that the node exists and is alive
    case net_adm:ping(Node) of
        % if not, return an error
        pang ->
            {error, not_reachable};
        % if it replies, join it, passing our node reference
        pong ->
            riak_ensemble_manager:join(Node, node())
    end.

create([]) ->
    % enable riak_ensemble_manager
    riak_ensemble_manager:enable(),
    % wait until it stabilizes
    wait_stable().

% print cluster members and current leader; called via rpc below,
% so it receives an (empty) argument list like join/1 and create/1
ensemble_status([]) ->
    case riak_ensemble_manager:enabled() of
        false ->
            {error, not_enabled};
        true ->
            Nodes = lists:sort(riak_ensemble_manager:cluster()),
            io:format("Nodes in cluster: ~p~n", [Nodes]),
            LeaderNode = node(riak_ensemble_manager:get_leader_pid(root)),
            io:format("Leader: ~p~n", [LeaderNode])
    end.

We also need to add the riak_ensemble supervisor to our supervisor tree in cadena_sup:

init([]) ->
    % get the configuration from sys.config
    DataRoot = application:get_env(riak_ensemble, data_root, "./data"),
    % create a unique path for each node to avoid clashes if running more
    % than one node in the same computer
    NodeDataDir = filename:join(DataRoot, atom_to_list(node())),

    Ensemble = {riak_ensemble_sup,
                {riak_ensemble_sup, start_link,
                 [NodeDataDir]},
                permanent, 20000, supervisor, [riak_ensemble_sup]},

    {ok, { {one_for_all, 0, 1}, [Ensemble]} }.

Before building the dev cluster we need to add the crypto app to cadena.app.src, since riak_ensemble needs it to create the cluster.
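That is, the applications list in cadena.app.src grows to include crypto, roughly like this (a sketch; the real file has more entries):

{applications, [
    kernel,
    stdlib,
    crypto
]}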

Now let's build the dev cluster; I created a Makefile to make it simpler:

make devrel

On three different shells run one command on each:

make node1-console
make node2-console
make node3-console

Let's make an rpc call to enable the riak_ensemble cluster on node1:

./_build/node1/rel/cadena/bin/cadena rpc cadena_console create

On node1 you should see something like:

[info] {root,'node1@127.0.0.1'}: Leading

Let's join node2 to node1:

./_build/node2/rel/cadena/bin/cadena rpc cadena_console join node1@127.0.0.1

On node1 you should see:

[info] join(Vsn): {1,152} :: 'node2@127.0.0.1' :: ['node1@127.0.0.1']

On node2:

[info] JOIN: success

Finally let's join node3:

./_build/node3/rel/cadena/bin/cadena rpc cadena_console join node1@127.0.0.1

Output on node1:

[info] join(Vsn): {1,453} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1']

On node3:

[info] JOIN: success

Let's check that the 3 nodes have the same view of the cluster by asking node1 for the ensemble status:

./_build/node1/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

node2:

$ ./_build/node2/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

node3:

$ ./_build/node3/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

Everything looks right. Stop the 3 nodes (q().) and start them again; you will see that after starting up, node1 logs:

[info] {root,'node1@127.0.0.1'}: Leading

And if you call ensemble_status on any node you get the same output as before; this means they remember the cluster topology even after restarts.

Public/Private Key Encryption, Signing and Verification in Erlang

You want to encrypt/decrypt some content?

You want to generate a signature and let others verify it?

At least that's what I wanted to do, so here it is.

First generate a key pair if you don't already have one:

openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -out public.pem -outform PEM -pubout

Load the raw keys:

{ok, RawSKey} = file:read_file("private.pem").
{ok, RawPKey} = file:read_file("public.pem").

% decode the PEM container, then the key entry itself
[EncSKey] = public_key:pem_decode(RawSKey).
SKey = public_key:pem_entry_decode(EncSKey).

[EncPKey] = public_key:pem_decode(RawPKey).
PKey = public_key:pem_entry_decode(EncPKey).

Let's encrypt a message with the private key and decrypt with the public key:

Msg = <<"hello crypto world">>.
CMsg = public_key:encrypt_private(Msg, SKey).
Msg = public_key:decrypt_public(CMsg, PKey).

We can do it the other way around, encrypting with the public key and decrypting with the private key:

CPMsg = public_key:encrypt_public(Msg, PKey).
Msg = public_key:decrypt_private(CPMsg, SKey).

Let's generate a signature for the message that others can verify with our public key:

Signature = public_key:sign(Msg, sha256, SKey).
public_key:verify(Msg, sha256, Signature, PKey).
% => true

% let's see if it fails with another message
public_key:verify(<<"not the original message">>, sha256, Signature, PKey).
% => false

Papers (and other things) of the LargeSpanOfTime II

OK, the title is getting fuzzier and fuzzier, but I decided to condense some things I've been reading here.

Papers:

Bringing the Web up to Speed with WebAssembly:

I like compilers and their implementations, so I've been following WebAssembly; this is a good place to look.

Spanner, TrueTime & The CAP Theorem:

A blog post by Google made the rounds lately, with people saying that Google claimed to have beaten the CAP Theorem, so I went to the source. The conclusion is interesting:

Spanner reasonably claims to be an “effectively CA” system despite operating over a wide area, as it is
always consistent and achieves greater than 5 9s availability. As with Chubby, this combination is possible
in practice if you control the whole network, which is rare over the wide area. Even then, it requires
significant redundancy of network paths, architectural planning to manage correlated failures, and very
careful operations, especially for upgrades. Even then outages will occur, in which case Spanner chooses
consistency over availability.
Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency,
consistent reads without locking, and consistent snapshots.

Bitcoin: A Peer-to-Peer Electronic Cash System:

Again, many people ranting and raving about Bitcoin, blockchain and cryptocurrencies; what's better than going to the source? A really readable paper.

CAP Twelve Years Later: How the “Rules” Have Changed:

I have a deja vu that I already read this paper, but just to be sure I read it again; an interesting summary of the concepts and how they evolved over time.

LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data:

I wanted to read the LSM-tree paper but it seems I didn't look at what I was clicking, so I ended up reading the LSM-trie paper instead; it's really interesting and includes an overview of the LSM-tree one, so now I have to go and read that one too.

A prettier printer (Philip Wadler):

In a previous post I mentioned that I read "The Design of a Pretty-printing Library" expecting something else; well, this paper is that something else, and I liked it more.

Metaobject protocols: Why we want them and what else they can do:

Being an aspiring Smug Lisp Weenie I had to read this one; it's a nice paper that puts a name on some "patterns" I've observed but couldn't describe clearly.

The Cube Data Model: A Conceptual Model and Algebra for On-Line Analytical Processing in Data Warehouses:

I've been thinking lately about the relation between pivot tables, data cubes and the things mentioned in the paper A Layered Grammar of Graphics, so I started reading more about data cubes; I skimmed a couple of papers that I forgot to register somewhere, but this one I actually registered.

End-to-End Arguments in System Design:

Someone somewhere mentioned this paper so I went to look; it's a really good one. Like the Metaobject protocols paper and others I've read, it condenses years of knowledge and experience, which makes it really interesting to read.

Books:

Object-Oriented Programming in the Beta Programming Language:

An interesting book about a really interesting (and different) object-oriented programming language by the creators of Simula (that is, the creators of object orientation); it explains an abstraction called "patterns" in which all other abstractions are expressed.

Project Oberon The Design of an Operating System and Compiler:

Another interesting book by Niklaus Wirth, creator of, among others, Pascal, Modula and Oberon, describing how to basically create computing from scratch.

I will note that I skimmed the dense specification parts of those books since I wasn't trying to implement or use them.

Reading:

Papers this looong week: 11 (count books as papers because why not)

Papers so far: 54

Papers in queue: don't know

Multi-Paxos with riak_ensemble Part 1

In this post I will go through the initial steps of setting up a project using riak_ensemble and using its core APIs. We will do it manually in the shell on purpose; later (I hope) I will post how to build it properly in code.

First we create a new project; I'm using Erlang 19.3 and rebar3 3.4.3:

rebar3 new app name=cadena

Then add the riak_ensemble dependency to rebar.config; it should look like this:

{erl_opts, [debug_info]}.
{deps, [{riak_ensemble_ng, "2.4.0"}]}.

Now on 3 different terminals start 3 Erlang nodes:

rebar3 shell --name node1@127.0.0.1
rebar3 shell --name node2@127.0.0.1
rebar3 shell --name node3@127.0.0.1

Run the following in every node:

Timeout = 1000.
Ensemble = root.
K1 = <<"k1">>.

application:set_env(riak_ensemble, data_root, "data/" ++ atom_to_list(node())).
application:ensure_all_started(riak_ensemble).

We are setting a variable telling riak_ensemble where to store the data for each node: node1 will store it under data/node1@127.0.0.1, node2 under data/node2@127.0.0.1 and node3 under data/node3@127.0.0.1.

After that we ensure all apps that riak_ensemble requires to run are started.

You should see something like this:

ok

18:05:50.548 [info] Application lager started on node 'node1@127.0.0.1'
18:05:50.558 [info] Application riak_ensemble started on node 'node1@127.0.0.1'
{ok,[syntax_tools,compiler,goldrush,lager,riak_ensemble]}

Now on node1 run:

riak_ensemble_manager:enable().

Output:

ok

We start the riak_ensemble_manager on one node only.

Then on node2 we join node1 and node3:

riak_ensemble_manager:join('node1@127.0.0.1' ,node()).
riak_ensemble_manager:join('node3@127.0.0.1' ,node()).

Output on node2:

18:06:39.285 [info] JOIN: success
ok
remote_not_enabled

This command also generates output on node1:

18:06:24.008 [info] {root,'node1@127.0.0.1'}: Leading
18:06:39.281 [info] join(Vsn): {1,64} :: 'node2@127.0.0.1' :: ['node1@127.0.0.1']

On node3 we join node1 and node2:

riak_ensemble_manager:join('node1@127.0.0.1' ,node()).
riak_ensemble_manager:join('node2@127.0.0.1' ,node()).

Output on node 3:

18:07:36.078 [info] JOIN: success
ok

Output on node 1:

18:07:36.069 [info] join(Vsn): {1,291} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1']
18:07:36.074 [info] join(Vsn): {1,292} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

Run this on all nodes:

riak_ensemble_manager:check_quorum(Ensemble, Timeout).
riak_ensemble_peer:stable_views(Ensemble, Timeout).
riak_ensemble_manager:cluster().

Output:

true
{ok,true}
['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

Everything seems to be ok, we have a cluster!

Now we can write something; let's set key "k1" to value "v1" on all nodes using Paxos for consensus.

On node1 run:

V1 = <<"v1">>.
riak_ensemble_client:kover(node(), Ensemble, K1, V1, Timeout).

Output:

{ok,{obj,1,729,<<"k1">>,<<"v1">>}}

We can check on node2 that the value is available:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output:

{ok,{obj,1,729,<<"k1">>,<<"v1">>}}

Now we can try a different way to update a value. Let's say we want to set a new value that depends on the current one, or only if the current value is something specific; for that we use kmodify, which receives a function, calls it with the current value and sets the key to the value we return.

On node3 run:

V2 = <<"v2">>.
DefaultVal = <<"v0">>.
ModifyTimeout = 5000.

riak_ensemble_peer:kmodify(node(), Ensemble, K1,
    fun({Epoch, Seq}, CurVal) ->
        io:format("CurVal: ~p ~p ~p to ~p~n", [Epoch, Seq, CurVal, V2]),
        V2
    end,
    DefaultVal, ModifyTimeout).

Output on node 3:

{ok,{obj,1,914,<<"k1">>,<<"v2">>}}

Output on node 1:

CurVal: 1 914 <<"v1">> to <<"v2">>

The call with a function as parameter was made on node3 but it ran on node1; that's the advantage of using the Erlang virtual machine to build distributed systems.

Now let's confirm the value was set on all nodes by reading it on node2:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output:

{ok,{obj,1,914,<<"k1">>,<<"v2">>}}

Now let's quit on all nodes:

q().

Let's start the cluster again to see if riak_ensemble remembers things; in 3 different terminals run:

rebar3 shell --name node1@127.0.0.1
rebar3 shell --name node2@127.0.0.1
rebar3 shell --name node3@127.0.0.1

On every node:

Timeout = 1000.
Ensemble = root.
K1 = <<"k1">>.

application:set_env(riak_ensemble, data_root, "data/" ++ atom_to_list(node())).
application:ensure_all_started(riak_ensemble).

We set data_root again and start riak_ensemble and its dependencies; after that, on node1 we should see:

18:11:55.286 [info] {root,'node1@127.0.0.1'}: Leading

Now let's check that the cluster was initialized correctly:

riak_ensemble_manager:check_quorum(Ensemble, Timeout).
riak_ensemble_peer:stable_views(Ensemble, Timeout).
riak_ensemble_manager:cluster().

Output:

true
{ok,true}
['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

You can now check on any node you want if the key is still set:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output should be:

{ok,{obj,2,275,<<"k1">>,<<"v2">>}}

Check the generated files under the data folder:

$ tree data

data
├── node1@127.0.0.1
│   └── ensembles
│       ├── 1394851733385875569783788015140658786474476408261_kv
│       ├── ensemble_facts
│       └── ensemble_facts.backup
├── node2@127.0.0.1
│   └── ensembles
│       ├── ensemble_facts
│       └── ensemble_facts.backup
└── node3@127.0.0.1
    └── ensembles
        ├── ensemble_facts
        └── ensemble_facts.backup

6 directories, 7 files

To sum up, we created a project, added riak_ensemble as a dependency, started a 3 node cluster, joined all the nodes, wrote a key with a value, checked that it was available on all nodes, updated the value with a "compare and swap" operation, stopped the cluster, started it again and checked that the cluster was restarted as it was and the value was still there.

Papers of the LargeSpanOfTime I

Welp, some day the experiment had to end. I stopped reading 5 papers a week because some books arrived and I read those instead, and also because I was busy at work.

But that doesn't mean I didn't read papers at all, so here's a list of the ones I did read.

Note

Since I read some of them a while ago, the reviews may not be really detailed

Cuneiform: A Functional Language for Large Scale Scientific Data Analysis

Seems useful in practice; I was expecting something else from the title.

The Stratosphere platform for big data analytics

I remember reading a paper from what later became Apache Flink that I liked a lot; I was looking for that one and found this one instead (Stratosphere became Flink). It was an interesting overview; I would like to know how much of it is still in Flink.

Orleans: Distributed Virtual Actors for Programmability and Scalability

Really good paper; I like how it's written, the idea and the implementation.

HyParView: a membership protocol for reliable gossip-based broadcast

Epidemic Broadcast Trees

These two are reviewed together because they are like bread and butter; I love both of them, highly recommended.

Large-Scale Peer-to-Peer Autonomic Monitoring

I won't lie to you, I don't remember much about this one, but given the authors it must be good :)

Stream Processing with a Spreadsheet

Object Spreadsheets: A New Computational Model for End-User Development of Data-Centric Web Applications

I was looking for ideas and inspiration when I read these two; I liked both, with Object Spreadsheets being the more interesting approach.

A Layered Grammar of Graphics

Great paper, on my top list, maybe because I love the topic :)

Virtual Time and Global States of Distributed Systems

A must read if you are interested in vector clocks; the non-math parts are good, I don't enjoy reading theorems a lot (not their fault).

Papers this looong week: 10

Papers so far: 43

Papers in queue: don't want to count anymore

Improving Official Erlang Documentation

Many times I've heard people complain about different aspects of the official Erlang documentation. One thing I find interesting is that the Erlang documentation is actually really complete and detailed, so I decided to dedicate some time to its other aspects; to get familiar with it I decided to start with an "easy" one: its presentation.

So I downloaded erlang/otp:

git clone https://github.com/erlang/otp.git

And did a build:

# to avoid having dates formatted in your local format
export LC_ALL="en_US.utf-8"
cd otp
./otp_build setup
make docs

Then I installed the build output in another folder to see the result:

mkdir ../erl-docs
make release_docs RELEASE_ROOT=../erl-docs

And served them to be able to navigate them:

cd ../erl-docs
python3 -m http.server

If you want to give it a try you need to install the following deps on Debian-based systems:

sudo apt install build-essential fop xsltproc autoconf libncurses5-dev

With the docs available I started looking around; the main files to modify are:

lib/erl_docgen/priv/css/otp_doc.css
The stylesheet for the docs
lib/erl_docgen/priv/xsl/db_html.xsl
An XSLT file to transform xml docs into html

The problem I found at first was that to see the results of my changes to db_html.xsl I had to do a clean build from scratch, which involved recompiling Erlang itself and took a lot of time.

Later I found a way to only build the docs again by forcing a rebuild:

make -B docs

But this still involves building the pdf files, which is the part that takes the most time. I haven't found a target that only builds the html files; if you know how, or want to try adding one to the makefile, that would be great.

With this knowledge I started improving the docs; I will cover the main things I changed.

You can see all my changes in the improve-docs-style branch.

Small styling changes

  • Don't use full black and white
  • Set font to sans-serif
  • Use mono as code font
  • Improve link colors
  • Improve title and description markup on landing page
  • Update menu icons (the folder and document icons)
  • Improve panel and horizontal separator styles
  • Align left panel's links to the left

Improve code box color, border and spacing

/galleries/misc/otp-old-2.png

Old Code Examples

/galleries/misc/otp-new-2.png

New Code Examples

Improve warning and info boxes' color, border and spacing

/galleries/misc/otp-old-3.png

Old Warning Dialog

/galleries/misc/otp-new-3.png

New Warning Dialog

/galleries/misc/otp-old-4.png

Old Info Dialog

/galleries/misc/otp-new-4.png

New Info Dialog

Logo Improvements

  • Remove drop shadows from logo
  • Center Erlang logo on left panel
  • Erlang logo is a link to the docs' main page
  • Put section description after logo and before links in left panel
/galleries/misc/otp-old-1.png

Old Landing Page

/galleries/misc/otp-new-1.png

New Landing Page

Semantic Improvements

  • Use title tags for titles
  • Remove usage of <br/> and empty <p></p> to add vertical spacing
  • Use lists for link lists
  • Title case section titles instead of uppercase
  • Add semantic markup and classes to section titles and bodies
  • Add classes to all generated markup
    • For the ones where I couldn't figure out a semantic class, I added a generic one to help people spot them in the xsl document by inspecting the generated files
  • Clickable titles for standard sections, with anchors for better linking

Improve table styling

/galleries/misc/otp-old-5.png

Old Tables

/galleries/misc/otp-new-5.png

New Tables

Improve applications page

/galleries/misc/otp-old-7.png

Old Applications List

/galleries/misc/otp-new-7.png

New Applications List

Improve modules page

/galleries/misc/otp-old-8.png

Old Modules List

/galleries/misc/otp-new-8.png

New Modules List

Add "progressive enhanced" syntax highlighting

At the bottom of the page a javascript file is loaded; if it loads successfully it will load the syntax highlighter module and css and style all the code blocks in the page. If it fails to load, is blocked, or js is disabled, the code blocks keep a default styling provided by CSS.

The markup was not modified in any way to add this feature.

Make code tokens easier to differentiate from standard text

The previous style for inline code was a really light italic font. I changed it to monospace, but it was still hard to distinguish, so I took some inspiration from Slack and surrounded inline code in a light box to make it stand out.

Indent Exports and Data Types' section bodies

/galleries/misc/otp-old-6.png

Old Data Types and Exports Sections

/galleries/misc/otp-new-6.png

New Data Types and Exports Sections

This is all for now; I have some other ideas for future improvements, but they involve changes to the documentation content, so I will submit them separately.

If you have any feedback please let me know!

Software que no falla (Software That Doesn't Fail)

Here I reproduce a post I made on Facebook after seeing the following transcript:

/galleries/misc/software-no-falla.jpg

Someone tell Mr. Tonelli that the very same day he said that, the European Space Agency lost contact with a probe it sent to Mars and had been developing for the last 7 years; the project cost 870 million euros and has the highest quality control standards of any industry.

One day after that, for more than two hours, services like Twitter, Netflix, GitHub and PayPal were out of service because someone hacked webcams and other "smart" devices and used them to run a denial of service attack against a service that translates what you type into your browser's address bar into addresses that computers can understand.

Whoever says that software is not going to fail is irresponsible and should not be allowed any responsibility legislating over even a single line of code.

I then started adding the following comments:

1) More news of the day: today a bug was found in the operating system that the electronic voting machines will use, one that allows anyone to gain full control over the system. I know they won't read it, but here it is:

“Most serious” Linux privilege-escalation bug ever is under active exploit

2) Today it was reported that a company that issues SSL certificates (the thing that puts the little green padlock on your bank's address and makes the connection secure, and which is also used to transmit the results from the voting machines to the central server) allowed people to obtain certificates for domains that didn't belong to the people requesting them.

Incident Report - OCR

3) Some "fun" ones from history: Stanislav Yevgrafovich Petrov (Станислав Евграфович Петров in Russian, born September 9, 1939) is a retired lieutenant colonel of the Soviet army during the Cold War. He is remembered for correctly identifying a missile attack warning as a false alarm in 1983, preventing what could have escalated into a nuclear war between the Soviet Union and the United States.

4) One from 1998: the Mars Climate Orbiter was destroyed due to a navigation error: the ground control team used imperial units to calculate the insertion parameters and sent the data to the spacecraft, which performed its calculations in the metric system. Thus each engine burn changed the probe's velocity in an unforeseen way, and after months of flight the error had accumulated.

5) In 2003, 50 million people were left without electricity in the United States and Canada because of a software bug: https://en.wikipedia.org/wiki/Northeast_blackout_of_2003

6) The Therac-25 was a radiation therapy machine produced by AECL, successor to the Therac-6 and Therac-20 models (the earlier units were produced in partnership with CGR). The machine was involved in at least six accidents between 1985 and 1987, in which several patients received radiation overdoses. Three of the patients died as a direct consequence. These accidents called into question the reliability of software control of safety-critical systems, and became a case study in medical informatics and software engineering.

7) In 1996 a rocket (Ariane 5) that cost 7 billion dollars to develop and carried a payload valued at 500 million dollars exploded because a number type "too small" was used to hold the horizontal velocity.

8) Knight Capital lost 440 million dollars in 45 minutes and went bankrupt because of a software bug that sold shares at the wrong price.

9) In 2004 the Los Angeles air traffic system stopped working because it used a counter that was "too small"; the fun part is that the backup system stopped working minutes after being turned on.

10) In 1979 a nuclear plant in the United States "suffered a partial meltdown of the reactor core". The cause: "The valve should have closed as the pressure decreased, but due to a failure it did not. The signals reaching the operator did not indicate that the valve was still open, although they should have shown it."

https://es.wikipedia.org/wiki/Accidente_de_Three_Mile_Island

11) Other times the causes are political: "...failures in communication... resulted in a decision to launch 51-L based on incomplete and sometimes misleading information, a conflict between engineering data and management judgments, and a NASA management structure that allowed internal flight safety problems to bypass key Shuttle managers."

https://es.wikipedia.org/wiki/Siniestro_del_transbordador_espacial_Challenger

This Week in WebAssembly III

Spec

spec repository

The most important "change" is that a PR for the stack machine semantics was opened (PR #323), but it has still not been merged.

This Week in WebAssembly II