Ricardo Forth: a Forth implemented in C, JS, WebAssembly and compiled from C to asm.js and WebAssembly

There comes a time in everyone's life when you implement a Forth.

The time has come for me.

Presenting Ricardo Forth:

A Forth dialect implemented in C, JavaScript, WebAssembly and compiled from C to asm.js and WebAssembly.

This project is based on the 1992 IOCCC entry buzzard.2 (design notes: buzzard.2.design), prettified and then compiled to asm.js and WebAssembly.

It was also reimplemented by translating the C code into JavaScript and WebAssembly.

Go check it out if you are curious about asm.js, WebAssembly, Forth or Emscripten/Binaryen.

Papers of the Week VII

Because nothing lasts forever, and after one week spent half traveling and another busy one, I managed to read 4 papers this week.

The first one was interesting, but it comes from an area I would describe as "let's bend relational databases to fit Event Stream Processing". That is not bad per se, but features like joins and remembering past events make scalability (at least in terms of memory) quite hard. It also never discusses distribution, which is OK for the field but not what I'm looking for.

The interesting part of this one is where it introduces Visibly Pushdown Languages, which look really interesting, but I couldn't find an introduction for mere mortals. The descriptions are really dense and mathematical, which is OK but hard to learn from for outsiders like me.

Another interesting point is that it uses the XML Schema to optimize the generated VPA (Visibly Pushdown Automata), and that the implementation applies not only to XML but to any nested semi-structured data.

The review of the next one will seem to conflict with my previous reviews, but this one had too much emphasis on low-level implementation details: not novel things or optimizations, just a lot of details, as if the authors found the implementation really cool and wanted to share it with the world. Not a bad thing per se, but in this batch I was looking for abstractions, optimizations and distribution characteristics of stream processing, ideally focused on distributed systems, and this one talked mainly about the DSL they built that compiles to C. It also sorts the streams, does multiple passes over the data, does lookahead in the stream and does a kind of "micro batches", which isn't what I was looking for.

For the last one, I found the approach interesting. They seemed to push the purity of the approach (everything is a regular expression), which may have ended up as a nice model (a thing I like), but from reading the code it doesn't seem really clear, at least for someone with an OO/functional background, and I think even less so for non-programmers. Maybe the syntax doesn't help and some other syntax would make things clearer, I don't know.

Other than that, the approach is interesting and it made me think of some ways to define a stream processing language using mainly pattern matching.

Papers this week: 4

Papers so far: 33

Papers in queue: 76

How to build Riak TS (Time Series Database) from Source

To build Riak TS we need some basic build tools installed, like compilers and libraries.

On Ubuntu/Debian and derivatives:

sudo apt-get update
sudo apt-get install build-essential autoconf git libncurses5-dev libssl-dev libpam0g-dev

On RHEL, CentOS, Oracle Linux and derivatives:

sudo yum update -y
sudo yum groupinstall "Development Tools" -y
sudo yum install openssl-devel ncurses-devel git autoconf pam-devel -y

A quick description of each so you can map to your OS:

  • build-essential: a group of tools to build stuff (duh!)
  • autoconf: needed to build Basho's Erlang/OTP version
  • git: to fetch repos
  • libncurses5-dev and libssl-dev: to have curses and SSL support in Erlang
  • libpam0g-dev: required to compile a Riak module (canola)
    • not sure about the RHEL equivalent, try pam-devel
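
Before going further it can help to confirm the tools are actually on your PATH. The following is a minimal sketch, not part of the official build: `missing_tools` is a hypothetical helper name, and the list of commands checked (gcc, make, autoconf, git) is my assumption of what the packages above provide.

```shell
# Hypothetical helper: print the names of the given commands that are not on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Trim the leading space before printing.
    echo "${missing# }"
}

# Assumption: gcc and make come from build-essential / "Development Tools".
result=$(missing_tools gcc make autoconf git)
if [ -n "$result" ]; then
    echo "missing: $result"
else
    echo "all build tools found"
fi
```

This only checks executables, so the development headers (ncurses, ssl, pam) still need to be verified through your package manager.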

Now clone the riak repo:

git clone https://github.com/basho/riak.git
cd riak

Check out the Riak TS tag:

git checkout riak_ts-1.3.0

Download and install kerl to build the correct Erlang/OTP version:

mkdir -p ~/bin
wget https://raw.githubusercontent.com/kerl/kerl/master/kerl -O ~/bin/kerl
chmod u+x ~/bin/kerl
export PATH=$PATH:$HOME/bin

Build the OTP_R16B02_basho10 Erlang version (notice that this won't interfere with your local Erlang installation, see the kerl readme for details):

kerl build git https://github.com/basho/otp.git OTP_R16B02_basho10 R16B02-basho10
mkdir -p ~/soft/erlang-releases/R16B02-basho10
kerl install R16B02-basho10 ~/soft/erlang-releases/R16B02-basho10
. ~/soft/erlang-releases/R16B02-basho10/activate
export PATH=$HOME/soft/erlang-releases/R16B02-basho10/bin:$PATH
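
To double check that the kerl-installed Erlang is the one the build will pick up, something like the sketch below can be run in the same shell. `erl_under` is a hypothetical helper name of mine; the directory is the install path used above, and the `erl -noshell -eval` call just prints whatever release the active runtime reports.

```shell
# Hypothetical helper: succeed only if the first erl on PATH lives under the given directory.
erl_under() {
    erl_path=$(command -v erl) || return 1
    case "$erl_path" in
        "$1"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if erl_under "$HOME/soft/erlang-releases/R16B02-basho10"; then
    # Print the OTP release reported by the active runtime.
    erl -noshell -eval 'io:format("~s~n", [erlang:system_info(otp_release)]), halt().'
else
    echo "kerl-installed erlang is not first on PATH"
fi
```

If the second message shows up, source the activate script again in the current shell before running make.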

Now build Riak TS:

make locked-deps
make rel

And run it:

cd rel/riak
./bin/riak console

Papers of the Week VI

Better late than never (even though I read all the papers last week), here is the sixth installment of Papers of the Week.

Starting next week I will try to write the reviews right after I read the papers and not almost a week later, when my memories are fuzzy :)

The first one describes an implementation of out-of-order processing using punctuation. It is interesting in that it "applies" the concept of punctuation to building a streaming system and analyzes the result.

This one describes an implementation of a storage engine using LSM Trees and a compression technique.

You can read an overview of the next paper and find the link to it at acolyer's paper of the day: Holistic Configuration Management at Facebook. I copy the first paragraph here:

This paper gives a comprehensive description of the use cases, design,
implementation, and usage statistics of a suite of tools that manage
Facebook’s configuration end-to-end, including the frontend products,
backend systems, and mobile apps.

It's a good overview of tools and techniques used to scale and standardize configuration management and how to avoid problems introduced by sloppy configuration management.

The next one is my favorite of the week. It defines a baseline by taking solutions from other papers that introduce some parallelization strategy, implementing them in a simple single-threaded way and benchmarking that against the other solutions. It then defines a "metric" that describes how many cores are required to match the single-threaded implementation. As many sites would tell you, "the result will amaze you".
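
To make the "metric" concrete, here is a tiny sketch of how I understood it: given the single-threaded runtime and a list of runtimes at different core counts, find the smallest core count that beats the single thread. The `cost` function name, the input format and the numbers are all made up by me for illustration, they are not taken from the paper.

```shell
# Hypothetical helper: print the smallest core count whose runtime beats the
# single-threaded baseline, or "unbounded" if no configuration does.
# Usage: cost BASELINE_SECONDS "CORES:SECONDS" ...
cost() {
    baseline=$1
    shift
    best=""
    for entry in "$@"; do
        cores=${entry%%:*}
        seconds=${entry##*:}
        if [ "$seconds" -lt "$baseline" ]; then
            if [ -z "$best" ] || [ "$cores" -lt "$best" ]; then
                best=$cores
            fi
        fi
    done
    echo "${best:-unbounded}"
}

# Made-up measurements, for illustration only: the single thread takes 300 seconds.
cost 300 "1:900" "16:420" "64:280" "128:150"
```

With these invented numbers the answer is 64 cores, and a system that never beats the single thread at any core count gets "unbounded".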

The last one for this week surprisingly brought me to CRDT/Lasp/@cmeik land, even though the title didn't seem to imply that. The crazy fact is that I saw a talk about this paper at RICON 2015 and I didn't remember the title :)

Some parts were hard for me since it's the first paper I have read about CRDTs, so I don't have the vocabulary and basic theory in place, but it made me think of some interesting applications in the IoT and monitoring spaces.

Papers this week: 5

Papers so far: 29

Papers in queue: 82