Papers of The Week I

This is an attempt to treat what I would call Acolyer's syndrome: the guilt felt by people who would like to read papers as often as Adrian Colyer but never do.

So I will blog the ones I read here, to try to follow Jerry Seinfeld's Productivity Secret.

I will blog weekly because I won't read a paper a day, but I will try to read around 4 or 5 papers a week if they are around 12 to 15 pages each. If they are longer I will read fewer.

The initial topics are stream processing systems and distributed systems. I will follow the references that I find interesting to choose future papers.

I will also read papers that I find interesting as I go.

OK, without further ado, here are the ones I read this week.

Related to stream processing:

Classics I wanted to read:

In queue: 33

LoRaWAN Overview


  • Network Protocol Candidate Specification
  • Optimized for battery powered end-devices
    • Fixed
    • Mobile (as in, they move, not phones)
  • Network Topology typically is star-of-stars
  • Network Operators can't secretly listen on application data


All Communication is bidirectional, uplink traffic should dominate:

           LoRa or FSK   +---------+   IP   +---------+
    +---+    (radio)     |         |        |         |
    |   |  <---------->  |         <-------->         |
    +---+                |         |        |         |
                         +---------+        +---------+
  End Device               Gateway         Network Server

  • Communication spread out on different frequency channels and data rates
  • Data Rates between 0.3 kbps and 50 kbps
    • Max ~ 45 tweets/s (extended ASCII only ;)
    • Just the text w/o protocol overhead
    • Don't expect audio, video or any kind of streaming
  • Encryption of payload
    • AES 128 bit key length
    • One key for each FPort
  • MAC Commands
    • For Network Management
    • Invisible to Application Layer
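As a rough check of the "45 tweets/s" figure above, here is a small Python sketch. It assumes a tweet is 140 one-byte characters and ignores protocol overhead and any duty-cycle limits; the function name is made up for this example.

```python
# Back-of-the-envelope check: how many 140-character tweets fit in one
# second at a given raw bit rate, ignoring every kind of protocol overhead.

TWEET_CHARS = 140   # assumption: classic tweet length
BITS_PER_CHAR = 8   # assumption: extended ASCII, one byte per character


def tweets_per_second(bits_per_second: float) -> float:
    """Return how many text-only tweets per second the raw rate allows."""
    return bits_per_second / (TWEET_CHARS * BITS_PER_CHAR)


fastest = tweets_per_second(50_000)  # 50 kbps, the top of the range
slowest = tweets_per_second(300)     # 0.3 kbps, the bottom of the range

print(f"50 kbps:  {fastest:.1f} tweets/s")
print(f"0.3 kbps: {1 / slowest:.1f} s per tweet")
```

At the slowest data rate a single tweet already takes several seconds of air time, which is why streaming is out of the question.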

Device Classes

  • Class A: Baseline
    • Uplink transmission
    • Followed by two short downlink receive windows (RX1, RX2)
  • Class B: Beacon
    • Allow more receive slots at scheduled times
    • Synchronize by a beacon from the gateway
  • Class C: Continuous
    • Nearly continuously open receive windows
    • Only closed when transmitting
    • Lower latency, but more energy usage
  • All devices implement at least class A

Receive Windows

  • After uplink at configured periods
  • If msg received for current device on RX1, RX2 doesn't happen
    • Max one downlink per uplink on Class A
  • Can't transmit from last transmit until after RX2 window
+------------------+                  +-----------+              +------------+
|                  |                  |           |              |            |
| Transmit         |                  | RX1       |              | RX2        |
|                  |                  |           |              |            |
+------------------+                  +-----------+              +------------+

   Transmit Time   |<-Receive Delay 1>|
      on Air       |<------------- Receive Delay 2 ------------->|
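The timing above can be sketched in a few lines of Python. This is only a toy model: the 1 s and 2 s delays are assumed defaults (a network can configure other values) and the names are mine, not from any LoRaWAN library.

```python
from dataclasses import dataclass

# Assumed default delays in seconds; real networks may configure others.
RECEIVE_DELAY1 = 1.0
RECEIVE_DELAY2 = 2.0


@dataclass
class ReceiveWindows:
    rx1_opens_at: float
    rx2_opens_at: float


def class_a_windows(uplink_start: float, time_on_air: float) -> ReceiveWindows:
    """Both delays are measured from the end of the uplink transmission."""
    uplink_end = uplink_start + time_on_air
    return ReceiveWindows(
        rx1_opens_at=uplink_end + RECEIVE_DELAY1,
        rx2_opens_at=uplink_end + RECEIVE_DELAY2,
    )


def rx2_needed(downlink_received_in_rx1: bool) -> bool:
    """RX2 is skipped when a downlink for this device arrived during RX1."""
    return not downlink_received_in_rx1


windows = class_a_windows(uplink_start=10.0, time_on_air=0.5)
print(windows)
```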

MAC Message Types

  • Join Request/Accept
    • For Over the air Activation
  • Unconfirmed Data Up/Down
    • No ACK required
  • Confirmed Data Up/Down
    • ACK required

MAC Messages

  • Can be standalone messages
    • Always encrypted
  • Or "Piggyback" on next message
    • No encryption
  • Unknown messages ignored

ACK Messages

  • Can be standalone messages
  • Or "Piggyback" on next message

End Device Activation

To participate in a LoRaWAN network, an end device has to be activated first.

Over the Air Activation

  • Needs join procedure
  • Requires fields set on device
    • DevEUI
    • AppEUI
    • AppKey (AES 128, derived from root AppKey)
  • Network Key provided
    • Allows network roaming

Activation by Personalization

All info stored on device on setup

Information Stored after Activation

  • Device Address
    • Two parts: Network Id and Network Address
  • Application Identifier
    • Global ID, uniquely identifies owner
  • Network Session Key
    • Used for MIC generation
    • Used for MAC only message encryption/decryption
  • Application Session Key
    • Used to encrypt/decrypt payload and for MIC
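To make the "two parts" of the device address concrete, here is a small Python sketch. It assumes the LoRaWAN 1.0.x layout (a 32 bit address with a 7 bit network id in the top bits and a 25 bit network address below it); the function name and the example address are made up.

```python
from typing import Tuple

# Assumption: LoRaWAN 1.0.x layout, 7-bit NwkID followed by 25-bit NwkAddr.
NWK_ADDR_BITS = 25


def split_dev_addr(dev_addr: int) -> Tuple[int, int]:
    """Split a 32-bit device address into (network id, network address)."""
    if not 0 <= dev_addr <= 0xFFFFFFFF:
        raise ValueError("device address must fit in 32 bits")
    nwk_id = dev_addr >> NWK_ADDR_BITS
    nwk_addr = dev_addr & ((1 << NWK_ADDR_BITS) - 1)
    return nwk_id, nwk_addr


nwk_id, nwk_addr = split_dev_addr(0x26011234)  # made-up example address
print(hex(nwk_id), hex(nwk_addr))
```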

Class B Devices

  • Mobile or fixed devices that need to open receive windows
    • At fixed time intervals (ping slots)
  • Class B implements Class A
  • All gateways must synchronously broadcast a beacon
  • Provides timing reference to devices
  • Devices start as Class A and can switch to B when they detect a beacon
  • If no beacon is detected for 120 minutes, the device switches back to Class A
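The switch between Class A and Class B can be pictured as a tiny state machine. The Python sketch below is a simplification I made up for illustration: it moves to Class B as soon as a beacon is seen (a real device decides when to switch) and only models the 120 minute fallback.

```python
BEACON_TIMEOUT_S = 120 * 60  # the 120 minutes mentioned above


class DeviceClassTracker:
    """Toy model: start in Class A, go to B on a beacon, fall back on timeout."""

    def __init__(self) -> None:
        self.device_class = "A"
        self.last_beacon_at = None  # time of the most recent beacon, if any

    def on_beacon(self, now: float) -> None:
        """Record a received beacon and (in this toy model) switch to Class B."""
        self.last_beacon_at = now
        self.device_class = "B"

    def tick(self, now: float) -> str:
        """Call periodically; returns the class the device should be in."""
        if (
            self.device_class == "B"
            and now - self.last_beacon_at >= BEACON_TIMEOUT_S
        ):
            self.device_class = "A"
        return self.device_class


tracker = DeviceClassTracker()
tracker.on_beacon(now=0.0)
print(tracker.tick(now=3600.0))  # one hour after the last beacon
print(tracker.tick(now=7200.0))  # two hours after the last beacon
```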

Class C Devices

  • Used for applications that have sufficient power available
    • cannot implement Class B
  • Will listen with RX2 window parameters as often as possible
  • No message to tell the server that it is a class C node
    • App must know
  • Like Class B, can receive multicast downlink frames

end to end - Part II: Frontend

This is the second and final part; the previous part is here: end to end - Part I: Backend. This part will be a little more complicated than necessary, since I made a mistake in the first part and carried it into the first implementation of the frontend. For a clean picture of the final result, without any cruft, read the current code in the repository marianoguerra-atik/om-next-e2e.

Without further ado, here we go:

In the previous part I created one endpoint for queries and one for actions (or transactions). This was a confusion on my side and is not needed: the om parser will call mutators or readers depending on what is passed. Let's review the changes needed in the backend to make this a single endpoint:

If we run these changes and try the increment mutation like before, but sending it to the query endpoint, we will get an error:

$ echo '(ui/increment {:value 1})' | transito http post http://localhost:8080/query e2t -

Status: 500
Connection: keep-alive
Content-Type: application/transit+json
Content-Length: 33

{:error "Internal Error"}

To make it work we have to send it inside a vector:

$ echo '[(ui/increment {:value 1})]' | transito http post http://localhost:8080/query e2t -

Status: 200
Connection: keep-alive
Content-Type: application/transit+json
Content-Length: 6

{}

Like in the frontend, we can send a list of places to re-read after the transaction:

$ echo '[(ui/increment {:value 1}) :count]' | transito http post http://localhost:8080/query e2t -

Status: 200
Connection: keep-alive
Content-Type: application/transit+json
Content-Length: 18

{:count 2}
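To picture why a single endpoint is enough, here is a toy Python sketch of the idea: one function walks the query vector, runs mutations and collects reads. It is not how the om parser is implemented; tuples stand in for mutation forms, strings for keywords, and all the names are hypothetical.

```python
state = {"count": 0}


def mutate_increment(params):
    """Mutation: add the given value to the counter."""
    state["count"] += params["value"]


MUTATIONS = {"ui/increment": mutate_increment}  # hypothetical registry
READS = {"count": lambda: state["count"]}       # hypothetical registry


def parse(query):
    """Walk a query vector: tuples are mutations, strings are reads."""
    result = {}
    for item in query:
        if isinstance(item, tuple):
            name, params = item
            MUTATIONS[name](params)
        elif item in READS:
            result[item] = READS[item]()
    return result


print(parse([("ui/increment", {"value": 1})]))
print(parse([("ui/increment", {"value": 1}), "count"]))
```

The second call mirrors the request above: a mutation followed by a key to re-read, all in one vector.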

Now that we have all the changes in the backend let's review the frontend.

This UI just displays hello world; it is only there to test that the figwheel and cljsbuild setup works.

You can try it by running:

lein figwheel

And opening http://localhost:3449/index.html

Then we implement a counter component that only works in the frontend. If you read the documentation it shouldn't require much explanation.

Then we add the cljs-http dependency, which we will use to talk to the server from the frontend, and we make some changes on the backend to serve static files from resources/public.

In the next commit we rename the increment mutation to ui/increment (ui isn't a good name for this; I should have picked a better one).

We also require some modules and macros to use cljs-http and implement the :send function, which the reconciler requires if we want to talk to remotes. This is explained in the documentation, in the Remote Synchronization Tutorial and the FAQ.

In this commit I did the increment transaction by hand because I couldn't get it to work: I was trying to pass ":remote true" to the mutator but not the query ast. You will see that in the next commit.

Then, when Increment is clicked, I make a transaction that increments the counter locally and also sends it to the backend. The transaction is made on click and is handled at defmethod mutate 'ui/increment. Notice the ":remote true" and ":api ast"; :api is the identifier for a remote that I specified when creating the reconciler.

Now you can start the server with:

lein run

And open http://localhost:8080/index.html.

Click increment, then open the page in another browser and click increment in one and then in the other. See how each one first increments by one locally and, after a short time, reflects the actual value.

You can see a short screencast of this demo here:

end to end - Part I: Backend

Here I will build an example of an end to end app, with the frontend communicating with the backend, both using Clojure.

The repository is here: gh:marianoguerra-atik/om-next-e2e. Each commit is one step here; some commits are simple changes that I don't cover.

Click on the links to go to the diff of that specific part.


Start by creating a new clojure project with leiningen:

lein new om-next-e2e

Basic Logging and HTTP Server

Jump to this commit with:

git checkout 32842e95abc4960b32488a51110fe7d7e385be88

To test run:

lein run

You should see:

14:55:22.179 [main] INFO  om-next-e2e.core - Starting Server at
14:55:22.778 INFO  [org.projectodd.wunderboss.web.Web] (main) Registered
web context /

On another terminal, using httpie:

$ http get localhost:8080/

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 12
Date: Thu, 26 Nov 2015 13:55:24 GMT
Server: undertow

Hello world!

Basic Routing with Bidi

These handlers (action and query) just return 200 and the body with some extra content.

Jump to this commit with:

git checkout 03b95c397b1c7d21cafe7a9a21efebc7df5b6b41

Let's try it, first let's try the not found handler:

$ http get localhost:8080/lala
HTTP/1.1 404 Not Found
Content-Length: 9
Server: undertow

Not Found

Let's check that doing get on a route that handles only post returns 404 (for REST purists it should be 405, I know):

$ http get localhost:8080/action
HTTP/1.1 404 Not Found
Content-Length: 9
Server: undertow

Not Found

Let's send some content to action as json for now:

$ http post localhost:8080/action name=lala
HTTP/1.1 200 OK
Content-Length: 24
Server: undertow

action: {"name": "lala"}

And query:

$ http post localhost:8080/query name=lala
HTTP/1.1 200 OK
Content-Length: 30
Server: undertow

query action: {"name": "lala"}

Use Transit for Requests and Responses

Jump to this commit with:

git checkout 56d8d2e615e7f499c9dbeaa1d1479a0f39dc1950

From here on I will use transito, a tool I wrote in Python. Since writing and reading transit by hand is not fun, it translates to and from JSON, transit and edn. Here I use edn since it's more readable and is what we will use in our frontend. You can install it with:

sudo pip install transito

Send an action:

$ echo '(start {:id "id2"})' | transito http post http://localhost:8080/action e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 60
Server: undertow

{:action (start {:id "id2"})}

The response is translated from transit to edn, the actual response can be seen using something like curl:

curl -X POST http://localhost:8080/action -d '["~#list",["~$start",["^ ","~:id","id2"]]]'

["^ ","~:action",["~#list",["~$start",["^ ","~:id","id2"]]]]

You can get the body you want translated to transit like this:

echo '(start {:id "id2"})' | transito e2t -
["~#list",["~$start",["^ ","~:id","id2"]]]
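If you are curious about what the translation does, here is a hand-rolled Python sketch that covers only the pieces seen above: symbols, keywords, maps and lists. It is not transito and not a real transit library (for instance, it skips the escaping that real transit applies to strings starting with "~" or "^"); the Symbol and Keyword classes are helpers I made up.

```python
import json


class Symbol(str):
    """Marks a string as an edn symbol, e.g. start."""


class Keyword(str):
    """Marks a string as an edn keyword, e.g. :id (stored without the colon)."""


def to_transit(value):
    """Translate a small subset of edn-like values to transit+json structures."""
    if isinstance(value, Symbol):
        return "~$" + value
    if isinstance(value, Keyword):
        return "~:" + value
    if isinstance(value, dict):
        flat = ["^ "]  # "^ " marks a map written as a flat array
        for key, val in value.items():
            flat += [to_transit(key), to_transit(val)]
        return flat
    if isinstance(value, tuple):  # tuples stand in for edn lists
        return ["~#list", [to_transit(v) for v in value]]
    return value  # plain strings and numbers pass through unchanged


form = (Symbol("start"), {Keyword("id"): "id2"})
encoded = json.dumps(to_transit(form), separators=(",", ":"))
print(encoded)
```

Wrapping the same form in a map with an :action key gives a 60 character string, which matches the Content-Length of the response shown earlier.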

Let's try the not found handler (notice we are sending to actiona instead of action):

$ echo '(start {:id "id2"})' | transito http post http://localhost:8080/actiona e2t -
Status: 404
Content-Type: application/transit+json
Content-Length: 28
Server: undertow

{:error "Not Found"}

Now let's test the query endpoint:

$ echo '(tasks {:id "id2"})' | transito http post http://localhost:8080/query e2t -
Status: 200
Content-Type: application/transit+json
Content-Length: 59
Server: undertow

{:query (tasks {:id "id2"})}

Supporting Actions and Queries

At this point we need to support the same mutations and reads as the frontend. To do this we need to add the dependency; I'm using om next alpha25 SNAPSHOT. Here is the way to install the exact version I'm using:

git clone
cd om
git checkout 34b9a614764f47a022ddfaf2e469d298d7605d44
lein install


Jump to this commit with:

git checkout f9ac70c18c89ecbe336c736ef266c17ee1ef8eab

Now let's test it.

Increment by 20:

$ echo '(increment {:value 20})' | transito http post http://localhost:8080/action e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 44
Server: undertow

{:value {:keys [:count]}}

Get current count:

$ echo '[:count]' | transito http post http://localhost:8080/query e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 19
Server: undertow

{:count 20}

Increment by 1:

$ echo '(increment {:value 1})' | transito http post http://localhost:8080/action e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 44
Server: undertow

{:value {:keys [:count]}}

Get current count:

$ echo '[:count]' | transito http post http://localhost:8080/query e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 19
Server: undertow

{:count 21}

Try getting something else to try the :default handler:

$ echo '[:otherthing]' | transito http post http://localhost:8080/query e2t -

Status: 200
Content-Type: application/transit+json
Content-Length: 6
Server: undertow

{}

Try a nonexistent action to try the :default handler:

$ echo '(somethingelse {:value 1})' | transito http post http://localhost:8080/action e2t -

Status: 404
Content-Type: application/transit+json
Content-Length: 84
Server: undertow

{:params {:value 1}, :key somethingelse, :error "Not Found"}

with devcards how to

Simple step by step guide to try with devcards.

This assumes you have leiningen installed; if not, go to the leiningen site and follow the instructions there.

Let's start by creating the basic devcards environment using the devcards template:

lein new devcards omnom
cd omnom
lein figwheel

The output should look something like this:

Figwheel: Starting server at http://localhost:3449
Focusing on build ids: devcards
Compiling "resources/public/js/compiled/omnom_devcards.js" from ["src"]...
Successfully compiled "resources/public/js/compiled/omnom_devcards.js" in 15.476 seconds.
Started Figwheel autobuilder

Launching ClojureScript REPL for build: devcards
Figwheel Controls:


  Switch REPL build focus:
          :cljs/quit                      ;; allows you to switch REPL to another build
    Docs: (doc function-name-here)
    Exit: Control+C or :cljs/quit
 Results: Stored in vars *1, *2, *3, *e holds last exception object
Prompt will show when figwheel connects to your application
To quit, type: :cljs/quit

Then, after it does its thing, open http://localhost:3449/cards.html

It should look something like this:


Click the omnom.core link, you should see this:


Now we have to install the latest development snapshot of om. In some folder outside your project run:

git clone
cd om
lein install

Now let's add the dependencies to our project, open project.clj and make the :dependencies section look like this:

:dependencies [[org.clojure/clojure "1.7.0"]
               [org.clojure/clojurescript "1.7.122"]
               [devcards "0.2.0-3"]
               [sablono "0.3.4"]
               [org.omcljs/om "0.9.0-SNAPSHOT"]
               [datascript "0.13.1"]]

Now stop figwheel (press Ctrl + d) and run it again:

lein figwheel

Reload the page.

Open the file src/omnom/core.cljs and replace its content with this:

(ns omnom.core
  (:require
   [cljs.test :refer-macros [is async]]
   [goog.dom :as gdom]
   [om.next :as om :refer-macros [defui]]
   [om.dom :as dom]
   [datascript.core :as d]
   [sablono.core :as sab :include-macros true])
  (:require-macros
   [devcards.core :as dc :refer [defcard deftest]]))


(defcard first-card
  (sab/html [:div
             [:h1 "This is your first devcard!"]]))

(defui Hello
  (render [this]
    (dom/p nil (-> this om/props :text))))

(def hello (om/factory Hello))

(defcard simple-component
  "Test that Om Next component work as regular React components."
  (hello {:text "Hello, world!"}))

(def p
  (om/parser
    {:read   (fn [_ _ _] {:quote true})
     :mutate (fn [_ _ _] {:quote true})}))

(def r
  (om/reconciler
    {:parser p
     :ui->ref (fn [c] (-> c om/props :id))}))

(defui Binder
  (componentDidMount [this]
    (let [indexes @(get-in (-> this om/props :reconciler) [:config :indexer])]
      (om/update-state! this assoc :indexes indexes)))
  (render [this]
    (binding [om/*reconciler* (-> this om/props :reconciler)]
      (apply dom/div nil
        (hello {:id 0 :text "Goodbye, world!"})
        (when-let [indexes (get-in (om/get-state this)
                             [:indexes :ref->components])]
          [(dom/p nil (pr-str indexes))])))))

(def binder (om/factory Binder))

(defcard basic-nested-component
  "Test that component nesting works"
  (binder {:reconciler r}))

(deftest test-indexer
  "Test indexer"
  (let [idxr (get-in r [:config :indexer])]
    (is (not (nil? idxr)) "Indexer is not nil in the reconciler")
    (is (not (nil? @idxr)) "Indexer is IDeref")))

(defn main []
  ;; conditionally start the app based on whether the #main-app-area
  ;; node is on the page
  (if-let [node (.getElementById js/document "main-app-area")]
    (js/React.render (sab/html [:div "This is working"]) node)))


;; remember to run lein figwheel and then browse to
;; http://localhost:3449/cards.html

It should display the om cards; if not, try reloading the page.

Now just keep adding cards!

Quotes N+1

Some Quotes I had around and wanted to put somewhere.

Navigation implies state. Software that can be navigated is software in which the user can get lost. The more navigation, the more corners to get stuck in. The more manipulable state, the more ways to wander into a “bad mode.” State is the primary reason people fear computers—stateful things can be broken
Most of the time, a person sits down at her personal computer not to create, but to read, observe, study, explore, make cognitive connections, and ultimately come to an understanding. This person is not seeking to make her mark upon the world, but to rearrange her own neurons. The computer becomes a medium for asking questions, making comparisons, and drawing conclusions—that is, for learning.
Many types of context can be naturally expressed in some informative graphical domain, relieving the user from manipulating information-free general-purpose controls.
If the software properly infers as much as possible from history and the environment, it should be able to produce at least a reasonable starting point for the context model. Most of the user’s interaction will then consist of correcting (or confirming) the software’s predictions. This is generally less stressful than constructing the entire context from scratch.
Simplicity. “I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is make it so complicated that there are no obvious deficiencies.” C.A.R. Hoare, The Emperor’s Old Clothes Turing Award lecture (1980), p81. From a practical (and historical) standpoint, we can assume that no complex specification will be implemented exactly. This, in itself, is not a problem. However, multiple, decentralized implementations of a complex specification will be incorrect in different ways. A platform consisting of the union of all possible implementations is thus arbitrarily unreliable—the designer can have no assurance of what a recipient actually receives. For a platform to be reliable, it must either have a single implementation, or be so utterly simple that it can be implemented uniformly. If we assume a practical need for open, freely implementable standards, the only option is simplicity.*

All of our days are numbered, we cannot afford to be idle. To act on a bad idea is better than to not act at all. Because the worth of an idea never becomes apparent until you do it.

—nick cave, 20000 days on earth

Forward syslog messages to flume with rsyslog

As usual, brain dump, just instructions, not much content.

download flume from here:

I'm using this one:

unpack and put it somewhere.

create a file with the following content, I will name it flume-syslog.conf and place it in ~/tmp/, you should too if you are lazy and don't want to change the commands:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# I'll be using TCP based Syslog source
a1.sources.r1.type = syslogtcp
# the port that Flume Syslog source will listen on
a1.sources.r1.port = 7077
# the hostname that Flume Syslog source will be running on
a1.sources.r1.host = localhost

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Install rsyslog if you don't have it and start it. I'm using Fedora 22; change the commands for your distro:

sudo dnf install rsyslog
sudo service rsyslog start


For Fedora Users

I had to disable selinux since it was blocking some ports, YMMV

Configure rsyslog with your rule, you can do it directly on /etc/rsyslog.conf or better, check that the following line is uncommented:

$IncludeConfig /etc/rsyslog.d/*.conf

And put your config under /etc/rsyslog.d/50-default.conf (create it if it doesn't exist)

We are going to forward only messages with a given tag, since we are interested in a subset of the logs. In this case we only want log lines with the tag "test"; add this to the rsyslog config file:

:syslogtag, isequal, "test:" @@

Save and restart rsyslog:

sudo service rsyslog restart

Start flume with your configuration:

./bin/flume-ng agent --conf conf --conf-file ~/tmp/flume-syslog.conf --name a1 -Dflume.root.logger=INFO,console -Dorg.apache.flume.lifecycle.LifecycleSupervisor=INFO,console


You should run the flume-ng command from the flume folder; otherwise a log4j warning will appear and you won't see the output of the sink.

Now generate a log line with our tag:

logger -t test 'Testing Flume with Syslog!'

You should see a line like this:

2015-08-27 18:06:25,096 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(] Event: { headers:{host=ganesha, Severity=5, Facility=1, priority=13, timestamp=1440695180000} body: 74 65 73 74 3A 20 54 65 73 74 69 6E 67 20 46 6C test: Testing Fl }
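If you want to double check what the event contains, this little Python sketch decodes the hex body from the line above and splits the priority header into facility and severity, using the standard syslog rule priority = facility * 8 + severity. The function name is just for this example.

```python
# Hex dump of the event body, copied from the Flume log line above.
body_hex = "74 65 73 74 3A 20 54 65 73 74 69 6E 67 20 46 6C"
body_text = bytes.fromhex(body_hex).decode("ascii")


def split_priority(priority: int):
    """Split a syslog priority value into (facility, severity)."""
    return divmod(priority, 8)


facility, severity = split_priority(13)  # priority=13 from the headers above

print(repr(body_text))
print("facility:", facility, "severity:", severity)
```

The logger sink only prints the first bytes of the body, which is why the text is cut at "Testing Fl".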

If you don't see the line check /var/log/messages to see if your message is there:

sudo vim /var/log/messages

Bonus track! sending apache logs to syslog and from there to flume.

For this, install Apache 2. On Fedora:

sudo dnf install httpd
sudo service httpd start
sudo bash -c "echo 'welcome!' > /var/www/html/index.html"

curl localhost

The output should be:

welcome!

Now configure apache to forward logs to syslog, open /etc/httpd/conf.d/welcome.conf and add at the bottom:

CustomLog "|/usr/bin/logger -t test" combined

Restart Apache:

sudo service httpd restart

Now open the page or use curl to get a page:

curl localhost

You should see a new log on flume.

Where to go from here?

  • Put flume on another machine, change the ip address to that address
  • change the tag (test) on rsyslog and on welcome.conf to something else
  • Buy me a beer

Enabling CORS in Solr in a Cloudera environment

This is a continuation of this post: Enable CORS in Apache Solr but this time for an instance that is running in cloudera.

No idea how it was installed since it was already there, but doing some investigation and avoiding reading the docs at all costs I arrived at this solution.

The idea of this post is to make you avoid reading the docs too!

First I will give names to some things that may be different for you:


Now do:

cd $CDH/jars/

cd $CDH/lib/bigtop-tomcat/lib/
ln -s $CDH/jars/jetty-servlets-9.1.5.v20140505.jar
ln -s $CDH/jars/jetty-util-9.1.5.v20140505.jar

chown $CDH_USER.$CDH_GROUP jetty-servlets-9.1.5.v20140505.jar
chown -h $CDH_USER.$CDH_GROUP jetty-servlets-9.1.5.v20140505.jar

chown $CDH_USER.$CDH_GROUP jetty-util-9.1.5.v20140505.jar
chown -h $CDH_USER.$CDH_GROUP jetty-util-9.1.5.v20140505.jar

Then create $CDH/lib/bigtop-tomcat/bin/ with your favorite text editor and put in it the following:


Open $CDH/etc/solr/tomcat-conf.dist/WEB-INF/web.xml with your text editor and follow the instructions at Enable CORS in Apache Solr

The way to know if it worked is to open the Solr admin panel: if it loads, it works; if it doesn't, look at the logs (mine are at /var/log/solr/). To be sure that the classpath was set correctly, look in the Solr admin page, in the "Java Properties" section, for the java.class.path variable. It should have the class path you set in setenv.sh plus some extra stuff (mainly bootstrap.jar).

If the admin page doesn't load (tomcat 404) look at the logs, some class loading error may be happening. Comment out the config you added in web.xml and restart.

I'm using this version of the jetty jars because newer versions are compiled for Java 1.8 and I have 1.7; use an older or newer one depending on your Java version.

Enable CORS in Apache Solr

Quick post since there's no easy googlable (?) resource to do this.

open the file server/solr-webapp/webapp/WEB-INF/web.xml and add the following XML before the existing filter section:

         <param-value>origin, content-type, cache-control, accept, options, authorization, x-requested-with</param-value>


taken from this Stack Overflow response

Making a GIF out of a folder of PNGs (plus resizing)

I need to update the gif from the Event Fabric landing page and I forgot how I did it last time.

So this time I will write it here as a reminder.

First take the screenshots. I do it the good ol' way, using the browser fullscreen and hitting the screenshot key at almost regular intervals.

That leaves me with a set of screenshots I want to resize, so I run:

mogrify -path small -resize 800x450 *.png


This requires imagemagick to be installed.

This will resize all the *.png files in the current folder to 800x450 and write the results into a folder called small.

Now we go to the small folder and generate the gif:

cd small
convert -delay 100 -loop 0 *.png animation.gif

This will create a gif that transitions every second from the png images and save it in the animation.gif file.
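For the curious, the numbers in the two commands work like this: -resize 800x450 scales each image to fit inside that box while keeping its aspect ratio, and -delay is expressed in hundredths of a second. The Python sketch below only redoes the arithmetic; it doesn't call imagemagick, the function names are mine, and the exact rounding imagemagick uses may differ by a pixel.

```python
def fit_within(width, height, max_width=800, max_height=450):
    """Scale (width, height) to fit inside the box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)


def gif_duration_seconds(frame_count, delay_ticks=100):
    """Total length of one loop; delay_ticks is in 1/100ths of a second."""
    return frame_count * delay_ticks / 100


print(fit_within(1920, 1080))    # a 16:9 screenshot fills the box exactly
print(fit_within(1600, 1200))    # a 4:3 screenshot is limited by the height
print(gif_duration_seconds(12))  # twelve screenshots, one second each
```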