Riak Core Tutorial Part 6: Handoff

The content of this chapter is in the 04-handoff branch.

https://gitlab.com/marianoguerra/tanodb/tree/04-handoff

How it Works

With quorum requests we are halfway in our way to tolerate failures in cluster nodes, our values are written to more than one vnode but if a node dies and another takes his work or if we add a new node and the vnodes must be rebalanced we need to handle handoff.

The reasons to start a handoff are:

  • A ring update event for a ring that all other nodes have already seen.
  • A secondary vnode is idle for a period of time and the primary, original owner of the partition is up again.

When this happens riak_core will inform the vnode that handoff is starting, calling handoff_starting, if it returns false it's cancelled, if it returns true it calls is_empty, that must return false to inform that the vnode has something to handoff (it's not empty) or true to inform that the vnode is empty, in our case we ask for the first element of the ets table and if it's the special value '$end_of_table' we know it's empty, if it returns true the handoff is considered finished, if false then a call is done to handle_handoff_command passing as first parameter an opaque structure that contains two fields we are insterested in, foldfun and acc0, they can be unpacked with a macro like this:

handle_handoff_command(?FOLD_REQ{foldfun=Fun, acc0=Acc0}, _Sender, State) ->

The FOLD_REQ macro is defined in the riak_core_vnode.hrl header file which we include.

This function must iterate through all the keys it stores and for each of them call foldfun with the key as first argument, the value as second argument and the latest acc0 value as third.

The result of the function call is the new Acc0 you must pass to the next call to foldfun, the last Acc0 must be returned by the handle_handoff_command.

For each call to Fun(Key, Entry, AccIn0) riak_core will send it to the new vnode, to do that it must encode the data before sending, it does this by calling encode_handoff_item(Key, Value), where you must encode the data before sending it.

When the value is received by the new vnode it must decode it and do something with it, this is done by the function handle_handoff_data, where we decode the received data and do the appropriate thing with it.

When we sent all the key/values handoff_finished will be called and then delete so we cleanup the data on the old vnode .

You can decide to handle other commands sent to the vnode while the handoff is running, you can choose to do one of the followings:

  • Handle it in the current vnode
  • Forward it to the vnode we are handing off
  • Drop it

What to do depends on the design of you app, all of them have tradeoffs.

The signature of all the responses is:

-callback handle_handoff_command(Request::term(), Sender::sender(), ModState::term()) ->
{reply, Reply::term(), NewModState::term()} |
{noreply, NewModState::term()} |
{async, Work::function(), From::sender(), NewModState::term()} |
{forward, NewModState::term()} |
{drop, NewModState::term()} |
{stop, Reason::term(), NewModState::term()}.

A diagram of the flow is as follows:

+-----------+      +----------+        +----------+
|           | true |          | false  |          |
| Starting  +------> is_empty +--------> fold_req |
|           |      |          |        |          |
+-----+-----+      +----+-----+        +----+-----+
      |                 |                   |
      | false           | true              | ok
      |                 |                   |
+-----v-----+           |              +----v-----+     +--------+
|           |           |              |          |     |        |
| Cancelled |           +--------------> finished +-----> delete |
|           |                          |          |     |        |
+-----------+                          +----------+     +--------+

Implementing it

We need to add logic to all the empty callbacks related to handoff:

handle_handoff_command(?FOLD_REQ{foldfun=FoldFun, acc0=Acc0}, _Sender,
                       State=#state{partition=Partition, kv_state=KvState}) ->
    lager:info("fold req ~p", [Partition]),
    KvFoldFun = fun ({Key, Val}, AccIn) ->
                        lager:info("fold fun ~p: ~p", [Key, Val]),
                        FoldFun(Key, Val, AccIn)
                end,
    {AccFinal, KvState1} = tanodb_kv_ets:foldl(KvFoldFun, Acc0, KvState),
    {reply, AccFinal, State#state{kv_state=KvState1}};

handle_handoff_command(Message, _Sender, State) ->
    lager:warning("handoff command ~p, ignoring", [Message]),
    {noreply, State}.

handoff_starting(TargetNode, State=#state{partition=Partition}) ->
    lager:info("handoff starting ~p: ~p", [Partition, TargetNode]),
    {true, State}.

handoff_cancelled(State=#state{partition=Partition}) ->
    lager:info("handoff cancelled ~p", [Partition]),
    {ok, State}.

handoff_finished(TargetNode, State=#state{partition=Partition}) ->
    lager:info("handoff finished ~p: ~p", [Partition, TargetNode]),
    {ok, State}.

handle_handoff_data(BinData, State=#state{kv_state=KvState}) ->
    TermData = binary_to_term(BinData),
    lager:info("handoff data received ~p", [TermData]),
    {{Bucket, Key}, Value} = TermData,
    {ok, KvState1} = tanodb_kv_ets:put(KvState, Bucket, Key, Value),
    {reply, ok, State#state{kv_state=KvState1}}.

encode_handoff_item(Key, Value) ->
    term_to_binary({Key, Value}).

is_empty(State=#state{kv_state=KvState, partition=Partition}) ->
    {IsEmpty, KvState1} = tanodb_kv_ets:is_empty(KvState),
    lager:info("is_empty ~p: ~p", [Partition, IsEmpty]),
    {IsEmpty, State#state{kv_state=KvState1}}.

delete(State=#state{kv_state=KvState, partition=Partition}) ->
    lager:info("delete ~p", [Partition]),
    {ok, KvState1} = tanodb_kv_ets:dispose(KvState),
    {ok, KvState2} = tanodb_kv_ets:delete(KvState1),
    {ok, State#state{kv_state=KvState2}}.

Trying it

To test it we will first start a devrel node, put some values and then join two other nodes and see on the console the handoff happening.

To make sure the nodes don't know about each other in case you played with clustering already we will start by removing the devrel builds:

rm -rf _build/dev*

And build the nodes again:

make devrel

Now we will start the first node and connect to its console:

make dev1-console

We generate a list of some numbers:

(tanodb1@127.0.0.1)1> Nums = lists:seq(1, 10).

[1,2,3,4,5,6,7,8,9,10]

And with it create some bucket names:

(tanodb1@127.0.0.1)2>
Buckets = lists:map(fun (N) ->
    list_to_binary("bucket-" ++ integer_to_list(N))
end, Nums).

[<<"bucket-1">>,<<"bucket-2">>,<<"bucket-3">>,
 <<"bucket-4">>,<<"bucket-5">>,<<"bucket-6">>,<<"bucket-7">>,
 <<"bucket-8">>,<<"bucket-9">>,<<"bucket-10">>]

And some key names:

(tanodb1@127.0.0.1)3>
Keys = lists:map(fun (N) ->
    list_to_binary("key-" ++ integer_to_list(N))
end, Nums).

[<<"key-1">>,<<"key-2">>,<<"key-3">>,<<"key-4">>,
 <<"key-5">>,<<"key-6">>,<<"key-7">>,<<"key-8">>,<<"key-9">>,
 <<"key-10">>]

We create a function to generate a value from a bucket and a key:

(tanodb1@127.0.0.1)4>
GenValue = fun (Bucket, Key) -> [{bucket, Bucket}, {key, Key}] end.

And then put some values to the buckets and keys we created:

(tanodb1@127.0.0.1)5>
lists:foreach(fun (Bucket) ->
    lists:foreach(fun (Key) ->
        Val = GenValue(Bucket, Key),
        tanodb:put(Bucket, Key, Val)
    end, Keys)
end, Buckets).

ok

Now that we have some data let's start the other two nodes:

make dev2-console

In yet another shell:

make dev3-console

This part should remind you of the first chapter:

make devrel-join
Success: staged join request for 'tanodb2@127.0.0.1' to 'tanodb1@127.0.0.1'
Success: staged join request for 'tanodb3@127.0.0.1' to 'tanodb1@127.0.0.1'
make devrel-cluster-plan
=============================== Staged Changes =========================
Action         Details(s)
------------------------------------------------------------------------
join           'tanodb2@127.0.0.1'
join           'tanodb3@127.0.0.1'
------------------------------------------------------------------------


NOTE: Applying these changes will result in 1 cluster transition

########################################################################
                         After cluster transition 1/1
########################################################################

================================= Membership ===========================
Status     Ring    Pending    Node
------------------------------------------------------------------------
valid     100.0%     34.4%    'tanodb1@127.0.0.1'
valid       0.0%     32.8%    'tanodb2@127.0.0.1'
valid       0.0%     32.8%    'tanodb3@127.0.0.1'
------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

WARNING: Not all replicas will be on distinct nodes

Transfers resulting from cluster changes: 42
  21 transfers from 'tanodb1@127.0.0.1' to 'tanodb3@127.0.0.1'
  21 transfers from 'tanodb1@127.0.0.1' to 'tanodb2@127.0.0.1'
make devrel-cluster-commit
Cluster changes committed

On the consoles from the nodes you should see some logs like the following, I will just paste some as example.

On the sending side:

00:17:24.240 [info] Starting ownership transfer of tanodb_vnode from
'tanodb1@127.0.0.1' 1118962191081472546749696200048404186924073353216 to
'tanodb2@127.0.0.1' 1118962191081472546749696200048404186924073353216

00:17:24.240 [info] fold req 1118962191081472546749696200048404186924073353216
00:17:24.240 [info] fold fun {<<"bucket-1">>,<<"key-1">>}:
    [{bucket,<<"bucket-1">>},{key,<<"key-1">>}]

...

00:17:24.241 [info] fold fun {<<"bucket-7">>,<<"key-8">>}:
    [{bucket,<<"bucket-7">>},{key,<<"key-8">>}]

00:17:24.281 [info] ownership transfer of tanodb_vnode from
'tanodb1@127.0.0.1' 1118962191081472546749696200048404186924073353216 to
'tanodb2@127.0.0.1' 1118962191081472546749696200048404186924073353216
    completed: sent 575.00 B bytes in 7 of 7 objects in 0.04 seconds
    (13.67 KB/second)

00:17:24.280 [info] handoff finished
    1141798154164767904846628775559596109106197299200:
    {1141798154164767904846628775559596109106197299200,
        'tanodb3@127.0.0.1'}

00:17:24.285 [info] delete
    1141798154164767904846628775559596109106197299200

On the receiving side:

00:13:59.641 [info] handoff starting
    1050454301831586472458898473514828420377701515264:
    {hinted,{1050454301831586472458898473514828420377701515264,
        'tanodb1@127.0.0.1'}}

00:13:59.641 [info] is_empty
    182687704666362864775460604089535377456991567872: true

00:14:34.259 [info] Receiving handoff data for partition
    tanodb_vnode:68507889249886074290797726533575766546371837952 from
    {"127.0.0.1",47440}

00:14:34.296 [info] handoff data received
    {{<<"bucket-8">>,<<"key-1">>},
        [{bucket,<<"bucket-8">>},{key,<<"key-1">>}]}

...

00:14:34.297 [info] handoff data received
    {{<<"bucket-3">>,<<"key-7">>},
        [{bucket,<<"bucket-3">>},{key,<<"key-7">>}]}

00:14:34.298 [info] Handoff receiver for partition
    68507889249886074290797726533575766546371837952 exited after
    processing 5 objects from {"127.0.0.1",47440}

Riak Core Tutorial Part 5: Quorum Requests

The content of this chapter is in the 03-quorum branch.

https://gitlab.com/marianoguerra/tanodb/tree/03-quorum

How it Works

Quorum requests allows sending a command to more than one vnode and wait until a number of responses are received before considering the request succesful.

To implement this we need a process that distributed the requests to the first N vnodes in the preference list and waits for at least W response to arrive before returning to the requester.

We use a gen_fsm to implement this process, which looks like this:

+------+    +---------+    +---------+    +---------+              +------+
|      |    |         |    |         |    |         |remaining = 0 |      |
| Init +--->| Prepare +--->| Execute +--->| Waiting +------------->| Stop |
|      |    |         |    |         |    |         |              |      |
+------+    +---------+    +---------+    +-------+-+              +------+
                                              ^   | |
                                              |   | |        +---------+
                                              +---+ +------->|         |
                                                             | Timeout |
                                      remaining > 0  timeout |         |
                                                             +---------+

Implementing it

To implement it we need to change the code in tanodb.erl instantiate a FSM to handle the request instead of sending the command directly to one vnode.

get(Bucket, Key, Opts) ->
    K = {Bucket, Key},
    Params = K,
    run_quorum(get, K, Params, Opts).

put(Bucket, Key, Value, Opts) ->
    K = {Bucket, Key},
    Params = {Bucket, Key, Value},
    run_quorum(put, K, Params, Opts).

delete(Bucket, Key, Opts) ->
    K = {Bucket, Key},
    Params = K,
    run_quorum(delete, K, Params, Opts).

We are going to generalize that logic in a function called run_quorum, where we can pass options for N, W and Timeout to play with different values:

run_quorum(Action, K, Params, Opts) ->
    N = maps:get(n, Opts, ?N),
    W = maps:get(w, Opts, ?W),
    Timeout = maps:get(timeout, Opts, ?TIMEOUT),
    ReqId = make_ref(),
    tanodb_write_fsm:run(Action, K, Params, N, W, self(), ReqId),
    wait_for_reqid(ReqId, Timeout).

wait_for_reqid(ReqId, Timeout) ->
    receive
        {ReqId, Val} -> Val
    after
        Timeout -> {error, timeout}
    end.

To wait for the right answer we need to generate a unique identifier for each request and send it with the request itself. The identifier will come back in the message sent by the FSM once the request finishes.

If too much time passed waiting for the response we consider it an error and return before receiving it.

wait_for_reqid(ReqId, Timeout) ->
        receive
                {ReqId, {error, Reason}} -> {error, Reason};
                {ReqId, Val} -> Val
        after
                Timeout -> {error, timeout}
        end.

There are two new files:

tanodb_write_fsm.erl
The FSM logic
tanodb_write_fsm_sup.erl
The supervisor for the FSMs

Finally we need to add tanodb_write_fsm_sup to our top level supervisor in tanodb_sup.

Trying it

To test it we are going to run some calls to the API and observe that now the response contains more than one response:

(tanodb@127.0.0.1)1> B1 = b1.
(tanodb@127.0.0.1)2> K1 = k1.
(tanodb@127.0.0.1)3> V1 = v1.

First let's try to get a key that doesn't exist:

(tanodb@127.0.0.1)4> tanodb:get(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}},

         {[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}},

         {[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}}]}

Let's do the same call but passing options, we want to run the command in 5 vnodes and wait for the response of the 5, the request should finish under a second:

(tanodb@127.0.0.1)5> tanodb:get(k1, v1, #{n => 5, w => 5, timeout => 1000}).
{ok,[{[456719261665907161938651510223838443642478919680,
           'tanodb@127.0.0.1'],
          {not_found,{k1,v1}}},

         {[433883298582611803841718934712646521460354973696,
           'tanodb@127.0.0.1'],
          {not_found,{k1,v1}}},

         {[411047335499316445744786359201454599278231027712,
           'tanodb@127.0.0.1'],
          {not_found,{k1,v1}}},

         {[388211372416021087647853783690262677096107081728,
           'tanodb@127.0.0.1'],
          {not_found,{k1,v1}}},

         {[365375409332725729550921208179070754913983135744,
           'tanodb@127.0.0.1'],
          {not_found,{k1,v1}}}]}

Let's try deleting a key that doesn't exist:

(tanodb@127.0.0.1)6> tanodb:delete(B1, K1).
{ok,[{[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          ok},

         {[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          ok},

         {[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          ok}]}

Let's put a value:

(tanodb@127.0.0.1)7> tanodb:put(B1, K1, V1).
{ok,[{[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          ok},

         {[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          ok},

         {[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          ok}]}

Now let's get the value:

(tanodb@127.0.0.1)8> tanodb:get(B1, K1).
{ok,[{[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          {found,{{b1,k1},v1}}},

         {[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          {found,{{b1,k1},v1}}},

         {[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          {found,{{b1,k1},v1}}}]}

Let's delete it:

(tanodb@127.0.0.1)9> tanodb:delete(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          ok},

         {[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          ok},

         {[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          ok}]}

And try to get it back:

(tanodb@127.0.0.1)10> tanodb:get(B1, K1).
{ok,[{[1073290264914881830555831049026020342559825461248,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}},

         {[1096126227998177188652763624537212264741949407232,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}},

         {[1050454301831586472458898473514828420377701515264,
           'tanodb@127.0.0.1'],
          {not_found,{b1,k1}}}]}

Riak Core Tutorial Part 4: First Commands

The content of this chapter is in the 02-commands branch.

https://gitlab.com/marianoguerra/tanodb/tree/02-commands

This is part of a series, see the previous one at Riak Core Tutorial Part 3: Ping Command

Implementing Get, Put and Delete

For our first commands we will copy the general structure of the ping command.

We will start by adding three new functions to the tanodb.erl file:

get(Bucket, Key) ->
    ReqId = make_ref(),
    send_to_one(Bucket, Key, {get, ReqId, {Bucket, Key}}).

put(Bucket, Key, Value) ->
    ReqId = make_ref(),
    send_to_one(Bucket, Key, {put, ReqId, {Bucket, Key, Value}}).

delete(Bucket, Key) ->
    ReqId = make_ref(),
    send_to_one(Bucket, Key, {delete, ReqId, {Bucket, Key}}).

And generalizing the code used by ping to send a command to one vnode:

send_to_one(Bucket, Key, Cmd) ->
    DocIdx = riak_core_util:chash_key({Bucket, Key}),
    PrefList = riak_core_apl:get_primary_apl(DocIdx, 1, tanodb),
    [{IndexNode, _Type}] = PrefList,
    riak_core_vnode_master:sync_spawn_command(IndexNode, Cmd, tanodb_vnode_master).

In tanodb_vnode.erl we will need to first create an instance of the key-value store per vnode at initialization and keep a reference to its state in the vnode state record:

-record(state, {partition, kv_state}).

init([Partition]) ->
    {ok, KvState} = tanodb_kv_ets:new(#{partition => Partition}),
    {ok, #state { partition=Partition, kv_state=KvState }}.

We then need to add three new clauses to the handle_command callback to handle our two new commands, which translate almost directly to calls in the kv module:

handle_command({put, ReqId, {Bucket, Key, Value}}, _Sender,
               State=#state{kv_state=KvState, partition=Partition}) ->
    Location = [Partition, node()],
    {Res, KvState1} = tanodb_kv_ets:put(KvState, Bucket, Key, Value),
    {reply, {ReqId, {Location, Res}}, State#state{kv_state=KvState1}};

handle_command({get, ReqId, {Bucket, Key}}, _Sender,
               State=#state{kv_state=KvState, partition=Partition}) ->
    Location = [Partition, node()],
    {Res, KvState1} = tanodb_kv_ets:get(KvState, Bucket, Key),
    {reply, {ReqId, {Location, Res}}, State#state{kv_state=KvState1}};

handle_command({delete, ReqId, {Bucket, Key}}, _Sender,
               State=#state{kv_state=KvState, partition=Partition}) ->
    Location = [Partition, node()],
    {Res, KvState1} = tanodb_kv_ets:delete(KvState, Bucket, Key),
    {reply, {ReqId, {Location, Res}}, State#state{kv_state=KvState1}};

Trying it

First let's try to get a key that doesn't exist:

(tanodb@127.0.0.1)1> B1 = b1.
(tanodb@127.0.0.1)2> K1 = k1.
(tanodb@127.0.0.1)3> V1 = v1.
(tanodb@127.0.0.1)4> tanodb:get(B1, K1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  {not_found,{b1,k1}}}}

The structure of the response is:

{UniqueRequestReference, {[PartitionId, NodeId], CommandResponse}}.

Let's try deleting a key that doesn't exist:

(tanodb@127.0.0.1)5> tanodb:delete(B1, K1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  ok}}

Let's put a value:

(tanodb@127.0.0.1)6> tanodb:put(B1, K1, V1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  ok}}

Now let's get the value:

(tanodb@127.0.0.1)7> tanodb:get(B1, K1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  {found,{{b1,k1},v1}}}}

Let's delete it:

(tanodb@127.0.0.1)8> tanodb:delete(B1, K1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  ok}}

And try to get it back:

(tanodb@127.0.0.1)9> tanodb:get(B1, K1).
{Ref, {[1050454301831586472458898473514828420377701515264,
        'tanodb@127.0.0.1'],
  {not_found,{b1,k1}}}}

Riak Core Tutorial Part 3: Ping Command

The content of this chapter is in the `01-template` branch.

https://gitlab.com/marianoguerra/tanodb/tree/01-template

This is part of a series, see the previous one at Riak Core Tutorial Part 2: Starting

How it Works

Let's see how ping works under the covers.

Its entry point and public API is the tanodb module, that means we have to look into tanodb.erl:

-module(tanodb).
-export([ping/0]).
-ignore_xref([ping/0]).

%% @doc Pings a random vnode to make sure communication is functional
ping() ->
    % argument to chash_key has to be a two item tuple, since it comes from riak
    % and the full key has a bucket, we use a contant in the bucket position
    % and a timestamp as key so we hit different vnodes on each call
    DocIdx = riak_core_util:chash_key({<<"ping">>, term_to_binary(os:timestamp())}),
    % ask for 1 vnode index to send this request to, change N to get more
    % vnodes, for example for replication
    N = 1,
    PrefList = riak_core_apl:get_primary_apl(DocIdx, N, tanodb),
    [{IndexNode, _Type}] = PrefList,
    riak_core_vnode_master:sync_spawn_command(IndexNode, ping, tanodb_vnode_master).
DocIdx = riak_core_util:chash_key({<<"ping">>, term_to_binary(os:timestamp())}),

The line above hashes a key to decide to which vnode the call should go, a riak_core app has a fixed number of vnodes that are distributed across all the instances of your app's physical nodes, vnodes move from instance to instance when the number of instances change to balance the load and provide fault tolerance and scalability.

The call above will allow us to ask for vnodes that can handle that hashed key, let's run it in the app console to see what it does:

(tanodb@127.0.0.1)1> DocIdx = riak_core_util:chash_key({<<"ping">>, term_to_binary(os:timestamp())}).

<<126,9,218,77,97,108,38,92,0,155,160,26,161,3,200,87,134,213,167,168>>

We seem to get a binary back, in the next line we ask for a list of vnodes that can handle that hashed key:

PrefList = riak_core_apl:get_primary_apl(DocIdx, N, tanodb),

Let's run it to see what it does:

(tanodb@127.0.0.1)2> PrefList = riak_core_apl:get_primary_apl(DocIdx, 1, tanodb).

[{{730750818665451459101842416358141509827966271488, 'tanodb@127.0.0.1'},
     primary}]

We get a list with one tuple that has 3 items, a long number, something that looks like a host and an atom, let's try changing the number 1:

(tanodb@127.0.0.1)3> PrefList2 = riak_core_apl:get_primary_apl(DocIdx, 2, tanodb).

[{{730750818665451459101842416358141509827966271488,
   'tanodb@127.0.0.1'}, primary},
 {{753586781748746817198774991869333432010090217472,
   'tanodb@127.0.0.1'}, primary}]

Now we get two tuples, the first one is the same, so what this does is to return the number of vnodes that can handle the request from the hashed key by priority.

The first number is the vnode id, it's what we get on the ping response.

Next line just unpacks the pref list to get the vnode id and ignore the other part:

[{IndexNode, _Type}] = PrefList,

Finally we ask riak_core to call the ping command on the IndexNode we got back:

riak_core_vnode_master:sync_spawn_command(IndexNode, ping, tanodb_vnode_master).

Let's try it on the console:

(tanodb@127.0.0.1)5> [{IndexNode, _Type}] = PrefList.

[{{730750818665451459101842416358141509827966271488,
   'tanodb@127.0.0.1'}, primary}]

(tanodb@127.0.0.1)6> riak_core_vnode_master:sync_spawn_command(IndexNode, ping, tanodb_vnode_master).

{pong,730750818665451459101842416358141509827966271488}

You can see we get IndexNode back in the pong response, now let's try passing the second IndexNode:

(tanodb@127.0.0.1)7> [{IndexNode1, _Type1}, {IndexNode2, _Type2}] = PrefList2.

[{{730750818665451459101842416358141509827966271488,
   'tanodb@127.0.0.1'}, primary},
 {{753586781748746817198774991869333432010090217472,
   'tanodb@127.0.0.1'}, primary}]


(tanodb@127.0.0.1)9> riak_core_vnode_master:sync_spawn_command(IndexNode2, ping, tanodb_vnode_master).

{pong,753586781748746817198774991869333432010090217472}

We get the IndexNode2 back, that means that the request was sent to the second vnode instead of the first one.

But where does the command go?

Let's see the content of tanodb_vnode.erl (just the useful parts):

-module(tanodb_vnode).
-behaviour(riak_core_vnode).

-export([start_vnode/1,
         init/1,
         terminate/2,
         handle_command/3,
         is_empty/1,
         delete/1,
         handle_handoff_command/3,
         handoff_starting/2,
         handoff_cancelled/1,
         handoff_finished/2,
         handle_handoff_data/2,
         encode_handoff_item/2,
         handle_overload_command/3,
         handle_overload_info/2,
         handle_coverage/4,
         handle_exit/3]).

-record(state, {partition}).

%% API
start_vnode(I) ->
    riak_core_vnode_master:get_vnode_pid(I, ?MODULE).

init([Partition]) ->
    {ok, #state { partition=Partition }}.

%% Sample command: respond to a ping
handle_command(ping, _Sender, State) ->
    {reply, {pong, State#state.partition}, State};
handle_command(Message, _Sender, State) ->
    lager:warning("unhandled_command ~p", [Message]),
    {noreply, State}.

Let's go by parts, first we declare our module:

-module(tanodb_vnode).

We specify that we want to implement the riak_core_vnode behavior:

-behaviour(riak_core_vnode).

Behaviors in Erlang are like interfaces, a set of functions that a module must implement to satisfy the behaviour specification, you can read more in the Erlang documentation.

In this case riak_core defines a behavior with a set of functions we must implement to be a valid riak_core vnode, you can get an idea of the kind of functionality we need by looking at the exported functions:

-export([start_vnode/1,
         init/1,
         terminate/2,
         handle_command/3,
         is_empty/1,
         delete/1,
         handle_handoff_command/3,
         handoff_starting/2,
         handoff_cancelled/1,
         handoff_finished/2,
         handle_handoff_data/2,
         encode_handoff_item/2,
         handle_overload_command/3,
         handle_overload_info/2,
         handle_coverage/4,
         handle_exit/3]).

For the moment most of them have a "dummy" implementation where they just do the minimal amount of work to satisfy the behavior and not more, it's our job to change the default implementation to fit our needs.

We will have a record called state to keep info between callbacks, this is typical Erlang way of managing state so I won't cover it here:

-record(state, {partition}).

We implement the api to start the vnode:

%% API
start_vnode(I) ->
    riak_core_vnode_master:get_vnode_pid(I, ?MODULE).

Note that on init we store the Partition value on state so we can use it later, this is what I referred above as vnode id, it's the big number you saw before:

init([Partition]) ->
    {ok, #state { partition=Partition }}.

Now for the interesting part, here we have our ping command implementation, we match for ping in the Message position (the first argument):

handle_command(ping, _Sender, State) ->

Return a response with the second item in the tuple being the actual response that the caller will get where we reply with the atom pong and the partition number of this vnode, the last item in the tuple is the new state we want to have for this vnode, since we didn't change anything we pass the current value:

{reply, {pong, State#state.partition}, State};

We implement a catch all that will just log the unknown command and give no reply back:

handle_command(Message, _Sender, State) ->
    lager:warning("unhandled_command ~p", [Message]),
    {noreply, State}.

This is the roundtrip of a ping call, our task to add more commands will be:

  • Add a function on tanodb.erl that hides the internal work done to distribute the work
  • Add a new match on handle_command to match the command we added on tanodb.erl and provide a reply

Creemos en la Web: Recursos online

Si venís siguiente todas las secciones de esta serie habrás notado un patrón:

  1. Esto parece bastante repetitivo
  2. Abra alguien que haya hecho algo para facilitar esto?
  3. Si!

En esta sección vamos a ver algunos recursos que nos van a hacer mas fácil empezar y adaptar los recursos disponibles a lo que necesitemos.

Adaptando boostrap a nuestros gustos

Whoostrap cuenta con una lista de temas para aplicar a bootstrap y cambiar su aspecto básico.

Acá hay un ejemplo de como usarlo:

Guardamos el texto del CSS en un archivo y lo incluimos en nuestro proyecto, aca hay un ejemplo: https://thimbleprojects.org/marianoguerra/512478/

La pagina tambien provee algunos themes predefinidos themes.guide

Otras paginas quen nos brinda themes gratuitos que podemos descargar y usar: hackerthemes y Now UI Kit

Copiando fragmentos de HTML

Muchas de las partes de una pagina son generales y se repiten, por ejemplo la barra de navegación superior, el pie de pagina, una lista de productos o características, como esas cosas son repetitivas pero no hay una forma simple de "abstraerlas" sin tener que aprender javascript, hay paginas que nos muestran distintos fragmentos de HTML para componentes comunes. En ingles le llaman cheatsheets, acá hay una de bootstrap que es muy útil:

Bootstrap Cheatsheet

Hace click en el componente que querés ver y te va a mostrar el HTML a la izquierda y como se ve a la derecha.