This is my blog, more about me at marianoguerra.github.io

Papers of the LargeSpanOfTime I

Welp, the experiment had to end some day. I stopped reading 5 papers a week because some books arrived and I read those instead, and also because I was busy at work.

But that doesn't mean I didn't read papers at all, so here's a list of the ones I did read.

Cuneiform: A Functional Language for Large Scale Scientific Data Analysis

Seems useful in practice, though I was expecting something else from the title.

The Stratosphere platform for big data analytics

I remember reading a paper from what later became Apache Flink that I liked a lot. I was looking for that one and found this one instead (Stratosphere became Flink). It was an interesting overview, and I would like to know how much of it is still in Flink.

Orleans: Distributed Virtual Actors for Programmability and Scalability

Really good paper. I like how it's written, as well as the idea and the implementation.

HyParView: a membership protocol for reliable gossip-based broadcast

Epidemic Broadcast Trees

These two are reviewed together because they go together like bread and butter. I love both of them, highly recommended.

Large-Scale Peer-to-Peer Autonomic Monitoring

I won't lie to you, I don't remember much about this one, but given the authors it must be good :)

Stream Processing with a Spreadsheet

Object Spreadsheets: A New Computational Model for End-User Development of Data-Centric Web Applications

I was looking for ideas and inspiration when I read these two. I liked both, with Object Spreadsheets being the more interesting approach.

A Layered Grammar of Graphics

Great paper, on my top list, maybe because I love the topic :)

Virtual Time and Global States of Distributed Systems

A must-read if you are interested in vector clocks. The non-math parts are good; I don't enjoy reading theorems much (not their fault).

Papers this looong week: 10

Papers so far: 43

Papers in queue: don't want to count anymore

Improving Official Erlang Documentation

2016-10-23 21:55

Many times I've heard people complain about different aspects of the official Erlang documentation. What I find interesting is that the content itself is really complete and detailed, so I decided to dedicate some time to the other parts. To get familiar with it I decided to start with an "easy" one: its presentation.

So I downloaded erlang/otp:

git clone https://github.com/erlang/otp.git

And did a build:

# to avoid having dates formatted in your local format
export LC_ALL="en_US.utf-8"
cd otp
./otp_build setup
make docs

Then I installed the output into another folder to inspect it:

mkdir ../erl-docs
make release_docs RELEASE_ROOT=../erl-docs

And served them so I could navigate them:

cd ../erl-docs
python3 -m http.server

If you want to give it a try, you need to install the following dependencies on Debian-based systems:

sudo apt install build-essential fop xsltproc autoconf libncurses5-dev

With the docs available I started looking around, the main files to modify are:

lib/erl_docgen/priv/css/otp_doc.css: The stylesheet for the docs
lib/erl_docgen/priv/xsl/db_html.xsl: An XSLT file to transform xml docs into html

The problem I found at first was that to see the results of my changes to db_html.xsl I had to do a clean and build from scratch, which involved recompiling Erlang itself and took a lot of time.

Later I found a way to only build the docs again by forcing a rebuild:

make -B docs

But this still builds the PDF files, which is the part that takes the most time. I haven't found a target that builds only the HTML files; if you know how, or want to try adding one to the makefile, that would be great.

With this knowledge I started improving the docs. I will cover the main things I changed.

You can see all my changes in the improve-docs-style branch.

Small styling changes

Don't use full black and white
Set font to sans-serif
Use mono as code font
Improve link colors
Improve title and description markup on landing page
Update menu icons (the folder and document icons)
Improve panel and horizontal separator styles
Align left panel's links to the left

Improve code box color, border and spacing

/galleries/misc/otp-old-2.png — Old Code Examples

/galleries/misc/otp-new-2.png — New Code Examples

Improve warning and info boxes' color, border and spacing

/galleries/misc/otp-old-3.png — Old Warning Dialog

/galleries/misc/otp-new-3.png — New Warning Dialog

/galleries/misc/otp-old-4.png — Old Info Dialog

/galleries/misc/otp-new-4.png — New Info Dialog

Logo Improvements

Remove drop shadows from logo
Center Erlang logo on left panel
Erlang logo is a link to the docs' main page
Put section description after logo and before links in left panel

Old Landing Page

New Landing Page

Semantic Improvements

Use title tags for titles
Remove usage of <br/> and empty <p></p> to add vertical spacing
Use lists for link lists
Title case section titles instead of uppercase
Add semantic markup and classes to section titles and bodies
Add classes to all generated markup
- Where I couldn't figure out a semantic class, I added a generic one to help people spot the element in the XSL document by inspecting the generated files
Clickable titles for standard sections with anchors for better linking

Improve table styling

/galleries/misc/otp-old-5.png — Old Tables

/galleries/misc/otp-new-5.png — New Tables

Improve applications page

/galleries/misc/otp-old-7.png — Old Applications List

/galleries/misc/otp-new-7.png — New Applications List

Improve modules page

/galleries/misc/otp-old-8.png — Old Modules List

/galleries/misc/otp-new-8.png — New Modules List

Add "progressive enhanced" syntax highlighting

At the bottom of the page a JavaScript file is loaded. If it loads successfully, it loads the syntax highlighter module and CSS and then styles all the code blocks in the page. If it fails to load, is blocked, or JavaScript is disabled, the code blocks keep a default styling provided by CSS.

The markup was not modified in any way to add this feature.

Make code tokens easier to differentiate from standard text

The previous style for inline code was a really light italic font. I changed it to monospace, but it was still hard to distinguish, so I took some inspiration from Slack and surrounded inline code words with a light box to make them stand out.

Indent Exports and Data Types' section bodies

/galleries/misc/otp-old-6.png — Old Data Types and Exports Sections

/galleries/misc/otp-new-6.png — New Data Types and Exports Sections

This is all for now. I have some other ideas for future improvements, but they involve changes to the documentation itself, so I will submit them separately.

If you have any feedback please let me know!

Software That Doesn't Fail

2016-10-21 22:29

I'm reproducing here a post I made on Facebook after seeing the following transcript:

Someone tell Mr. Tonelli that on the same day he said that, the European Space Agency lost contact with a probe it sent to Mars, which it had been developing for the last 7 years. The project cost 870 million euros and has the highest quality control standards of any industry.

A day after that, services like Twitter, Netflix, GitHub and PayPal were down for more than two hours because someone hacked webcams and other "smart" devices and used them to carry out a denial-of-service attack against a service that translates what you type in your browser's address bar into addresses that computers can understand.

Anyone who says software is not going to fail is irresponsible and should have no responsibility legislating on even a single line of code.

Then I started adding the following comments:

1) More news of the day: a bug was found today in the operating system that the electronic voting machines are going to use, and it allows anyone to gain full control over the system. I know you are not going to read it, but here it is:

“Most serious” Linux privilege-escalation bug ever is under active exploit

2) Today it was reported that a company that issues SSL certificates (what puts the little green padlock next to your bank's address and makes the connection secure, and which is also used to transmit the results from the voting machines to the central server) allowed people to obtain certificates for domains that did not belong to the people requesting them.

Incident Report - OCR

3) Some "fun" ones from history: Stanislav Yevgrafovich Petrov (Станислав Евграфович Петров in Russian, born September 9, 1939) is a retired lieutenant colonel who served in the Soviet military during the Cold War. He is remembered for correctly identifying a missile attack alert as a false alarm in 1983, thereby preventing what could have escalated into a nuclear war between the Soviet Union and the United States.

4) One launched in 1998: the Mars Climate Orbiter was destroyed by a navigation error. The control team on Earth used the imperial system of units to calculate the insertion parameters and sent the data to the spacecraft, which performed its calculations in the metric system. As a result, each engine burn changed the probe's velocity in an unplanned way, and after months of flight the error had accumulated.

5) In 2003, 50 million people were left without electricity in the United States and Canada because of a software bug: https://en.wikipedia.org/wiki/Northeast_blackout_of_2003

6) The Therac-25 was a radiation therapy machine produced by AECL, successor to the Therac-6 and Therac-20 models (the earlier units were produced in partnership with CGR). The machine was involved in at least six accidents between 1985 and 1987, in which several patients received radiation overdoses. Three of the patients died as a direct consequence. These accidents called into question the reliability of software control in safety-critical systems and became a case study in medical informatics and software engineering.

7) In 1996 a rocket (Ariane 5) that cost 7 billion dollars to develop and carried a payload valued at 500 million dollars exploded because a number that was "too small" was used to hold the horizontal velocity.

8) Knight Capital lost 440 million dollars in 45 minutes and was pushed to the brink of bankruptcy by a software bug that traded shares at the wrong prices.

9) In 2004 the Los Angeles air traffic control system stopped working because it used a counter that was "too small". The funny part is that the backup system stopped working minutes after being turned on.

10) In 1979 a nuclear plant in the United States "suffered a partial meltdown of the reactor core". The cause: "The valve should have closed when the pressure dropped, but due to a fault it did not. The signals reaching the operator did not indicate that the valve was still open, although they should have shown it."

https://es.wikipedia.org/wiki/Accidente_de_Three_Mile_Island

11) Other times the causes are political: "...failures in communication... resulted in a decision to launch 51-L based on incomplete and sometimes misleading information, a conflict between engineering data and management judgments, and a NASA management structure that permitted internal flight safety problems to bypass key Shuttle managers."

https://es.wikipedia.org/wiki/Siniestro_del_transbordador_espacial_Challenger
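
Items 7 and 9 both come down to a value that no longer fits in the fixed-width number chosen to hold it. The following is a minimal Python sketch of that failure mode, not a reconstruction of either system: the function names and sample values are made up for illustration, and the 32-bit millisecond countdown is an assumption based on how the Los Angeles incident is commonly described. Python integers never overflow, so the 16-bit behaviour is emulated by hand.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Convert a float to a signed 16-bit integer, raising if it does not fit."""
    as_int = int(value)  # truncates toward zero
    if not INT16_MIN <= as_int <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return as_int


def to_int16_wrapping(value):
    """Emulate an unchecked conversion: keep only the low 16 bits of the value."""
    low_bits = int(value) & 0xFFFF
    # Reinterpret the same 16 bits as a signed number.
    return struct.unpack("<h", struct.pack("<H", low_bits))[0]


# A hypothetical reading that is larger than the designers expected.
reading = 40000.0
wrapped = to_int16_wrapping(reading)  # silently becomes a negative number

try:
    to_int16_checked(reading)
    checked_raised = False
except OverflowError:
    checked_raised = True  # the error is at least visible, if someone handles it

# A 32-bit counter of milliseconds can only count 2**32 ticks before it runs out.
MS_PER_DAY = 24 * 60 * 60 * 1000
days_until_zero = 2**32 / MS_PER_DAY  # a bit under 50 days

print(wrapped, checked_raised, round(days_until_zero, 1))
```

Either outcome is bad if nobody planned for it: the wrapping version produces a plausible-looking wrong number, and the checked version raises an error that still has to be handled somewhere.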

This Week in WebAssembly III

2016-08-30 15:55

The most important "change" is that a PR for the stack machine semantics was opened (PR #323), but it has not been merged yet.

[2016-08-23 19:22:20 +0200] Andreas Rossberg: Tweak S-expr grammar
[2016-08-24 14:12:15 +0200] Andreas Rossberg: Eliminate second loop label
[2016-08-24 15:15:10 +0200] Andreas Rossberg: Simplify memop
[2016-08-24 16:54:46 +0200] Andreas Rossberg: Rename expressions to instructions
[2016-08-24 17:10:52 +0200] Andreas Rossberg: Added todos
[2016-08-30 11:57:55 +0200] rossberg-chromium: Symmetrise import/export syntax

Website

webassembly.github.io repository

[2016-08-22 16:18:15 -0700] Rebecca Bettencourt: Add note about which system the instructions are for, instructions for cmake on OS X, and remove leading whitespace on pre tags.

Resources

How to start using WebAssembly today

This Week in WebAssembly II

2016-08-22 17:38

Second update on #webassembly

Binaryen

binaryen repository

[2016-08-18 16:30:07 -0700] Seth Samuel: Update README.md with full hello_world.asm.js source

Design

design repository

[2016-08-17 11:49:56 +0200] titzer: Limit varint sizes in Binary Encoding. (#764)
[2016-08-17 09:19:56 -0700] gahaas: Clarify TypeError with a link to the JS spec. (#767)
[2016-08-19 02:21:20 -0700] gahaas: Evaluation order of parameters of exports. (#770)

Spec

spec repository

No Changes

Website

webassembly.github.io repository

[2016-08-18 11:43:34 -0700] Seth Thompson: add pre/code styling

Resources

This Week in WebAssembly I

2016-08-17 13:16

(Hopefully) weekly update on WebAssembly and WebAssembly related projects

Binaryen

binaryen repository

[2016-08-11 21:02:33 -0700] Heejin Ahn: Implement asm.js style exception handling for Wasm (#664)
[2016-08-12 11:57:12 -0700] Alon Zakai: support expressions in segment offsets
[2016-08-12 14:40:26 -0700] Alon Zakai: support function table initial and max sizes, and new printing format
[2016-08-15 14:29:57 -0700] Alon Zakai: offset support in table

Design

design repository

[2016-08-09 19:20:36 -0500] Luke Wagner: Change WebAssembly.compile to throw on bad args
[2016-08-11 16:46:10 +0200] rossberg-chromium: Describe operand order of call_indirect (#758)
[2016-08-16 01:02:27 -0700] gahaas: Define realm for ToWebAssemblyValue (#754)
[2016-08-16 15:37:16 +0200] titzer: Limit varint sizes in Binary Encoding.
[2016-08-09 12:15:38 -0700] Dan Gohman: Replace branch arities with block and if signatures.

Spec

spec repository

[2016-08-12 13:11:38 +0200] rossberg-chromium: Implement globals (#313)
[2016-08-12 14:40:09 +0200] rossberg-chromium: Implement tables & multiple memories (#316)
[2016-08-16 18:15:51 +0200] rossberg-chromium: Implement non-trapping grow_memory semantics (#308)

Website

webassembly.github.io repository

No changes

Resources

Talk by @callahad at MidWest: What the Heck is WebAssembly, and do I Have to Learn C Now?
Talk by @warianoguerra at StuttgartJS: WebAssembly @ StuttgartJS
Ricardo Forth: A Forth dialect implemented in C, Javascript, WebAssembly and compiled from C to asm.js and WebAssembly.

Ricardo Forth: a Forth implemented in C, JS, WebAssembly and compiled from C to asm.js and WebAssembly

2016-08-11 08:31

There comes a time in everyone's life when you implement a Forth.

The time has come for me.

Presenting Ricardo Forth:

A Forth dialect implemented in C, Javascript, WebAssembly and compiled from C to asm.js and WebAssembly.

This project is based on the 1992 IOCCC entry buzzard.2 (design notes: buzzard.2.design), prettified and then compiled to:

asmjs using emscripten
WebAssembly using Binaryen

Also reimplemented by translating the C code into Javascript and WebAssembly.

Go check it out if you are curious about asmjs, WebAssembly, Forth or Emscripten/Binaryen.

Papers of the Week VII

2016-06-14 09:24

Because nothing lasts forever, after one week spent half traveling and another busy one, I managed to read only 4 papers this week.

The first one was interesting but comes from an area I would describe as "let's bend relational databases to fit Event Stream Processing". That is not bad per se, but features like joins and the ability to remember past events make scaling it (at least in terms of memory) quite hard. It also never discusses distribution, which is OK for the field but not what I'm looking for.

Cayuga: A General Purpose Event Monitoring System

The interesting part of this one is where it introduces Visibly Pushdown Languages, something that looks really interesting but for which I couldn't find an introduction for mere mortals. The descriptions are really dense and mathematical, which is OK but hard to learn from for outsiders like me.

Another interesting point is the fact that it uses the XML Schema to optimize the generated VPA (Visibly Pushdown Automata) and that the implementation not only applies to XML but to any nested semistructured data.

High-Performance Complex Event Processing over XML Streams

The review of the next one will seem to conflict with my previous reviews, but this one put too much emphasis on low-level implementation details. Not novel things or optimizations, just a lot of details, as if the authors found the implementation really cool and wanted to share it with the world. Not a bad thing per se, but in this batch I was looking for abstractions, optimizations and distribution characteristics of stream processing, preferably focused on distributed systems, and this one talked mainly about the DSL they built that compiles to C. It also sorts the streams, does multiple passes over the data, does lookahead in the stream and does a kind of "micro batches", which isn't what I was looking for.

Hancock: A Language for Extracting Signatures from Data Streams

The last one takes an approach I found interesting. The authors seem to push the purity of the approach (everything is a regular expression), which may have ended up with a nice model (a thing I like), but from reading the code it doesn't seem to be really clear, at least for someone with an OO/functional background, and I think even less so for non-programmers. Maybe the syntax doesn't help and some other syntax would make things clearer, I don't know.

Other than that, the approach is interesting and it made me think of some ways to define a stream processing language using mainly pattern matching.

EventScript: An Event-Processing Language Based on Regular Expressions with Actions

Papers this week: 4

Papers so far: 33

Papers in queue: 76

How to build Riak TS (Time Series Database) from Source

2016-06-03 15:31

To build Riak TS we need some basic build tools installed, such as compilers and development libraries.

On Ubuntu/Debian and derivatives:

sudo apt-get update
sudo apt-get install build-essential autoconf git libncurses5-dev libssl-dev libpam0g-dev

On RHEL, CentOS, Oracle Linux and derivatives:

sudo yum update -y
sudo yum groupinstall "Development Tools" -y
sudo yum install openssl-devel ncurses-devel git autoconf pam-devel -y

A quick description of each so you can map to your OS:

build-essential: a group of tools to build stuff (duh!)
autoconf: needed to build Basho's Erlang/OTP version
git: to fetch repos
libncurses5-dev and libssl-dev: to have curses and SSL support in Erlang
libpam0g-dev: required to compile a Riak module (canola)
- not sure about the RHEL equivalent, try pam-devel

Now clone the riak repo:

git clone https://github.com/basho/riak.git
cd riak

Checkout the Riak TS tag:

git checkout riak_ts-1.3.0

Download and install kerl to build the correct Erlang/OTP version:

mkdir -p ~/bin
wget https://raw.githubusercontent.com/kerl/kerl/master/kerl -O ~/bin/kerl
chmod u+x ~/bin/kerl
export PATH=$PATH:$HOME/bin

Build the OTP_R16B02_basho10 Erlang version (note that this won't interfere with your local Erlang installation, see the kerl readme for details):

kerl build git https://github.com/basho/otp.git OTP_R16B02_basho10 R16B02-basho10
mkdir -p ~/soft/erlang-releases/R16B02-basho10
kerl install R16B02-basho10 ~/soft/erlang-releases/R16B02-basho10
. ~/soft/erlang-releases/R16B02-basho10/activate
export PATH=$HOME/soft/erlang-releases/R16B02-basho10/bin:$PATH

Now build Riak TS:

make locked-deps
make rel

And run it:

cd rel/riak
./bin/riak console

Papers of the Week VI

2016-06-01 08:45

Better late than never (even though I read all the papers last week), here is the sixth installment of Papers of the Week.

Starting next week I will try to write the reviews right after I read the papers, and not almost a week later when my memories are fuzzy :)

The first one describes an implementation of out-of-order processing using punctuation. It is interesting in that it "applies" the concept of punctuation to building a streaming system and analyzes the result.

Out-of-Order Processing: A New Architecture for High-Performance Stream Systems
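
To make the idea of punctuation concrete, here is a minimal Python sketch of a windowed counter that accepts events in any order and only emits a window once a punctuation promises that no earlier timestamps will arrive. It is a toy under my own assumptions, not the paper's architecture: the class name, the fixed-size tumbling windows and the integer timestamps are all made up for illustration.

```python
from collections import defaultdict


class PunctuatedWindowCounter:
    """Counts events per fixed-size window; windows close on punctuation."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.open_windows = defaultdict(int)  # window start -> event count

    def on_event(self, timestamp):
        """Record an event; events may arrive in any timestamp order."""
        start = (timestamp // self.window_size) * self.window_size
        self.open_windows[start] += 1

    def on_punctuation(self, timestamp):
        """Punctuation: no future event will have a timestamp below `timestamp`.

        Every window that ends at or before that point is complete, so it is
        removed from the open set and returned as (window_start, count).
        """
        closed = []
        for start in sorted(self.open_windows):
            if start + self.window_size <= timestamp:
                closed.append((start, self.open_windows.pop(start)))
        return closed


counter = PunctuatedWindowCounter(window_size=10)
for ts in [3, 12, 7, 15, 1]:  # deliberately out of order
    counter.on_event(ts)

first = counter.on_punctuation(10)   # window [0, 10) is now complete
second = counter.on_punctuation(20)  # window [10, 20) is now complete
print(first, second)  # [(0, 3)] [(10, 2)]
```

The operator never sorts or buffers the events themselves; it only keeps per-window state and relies on the punctuation to know when that state is final.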

This one describes an implementation of a storage engine using LSM Trees and a compression technique.

Rose: Compressed, log-structured replication

You can read an overview of the next paper and find the link to it at acolyer's paper of the day: Holistic Configuration Management at Facebook, I copy the first paragraph here:

This paper gives a comprehensive description of the use cases, design,
implementation, and usage statistics of a suite of tools that manage
Facebook’s configuration end-to-end, including the frontend products,
backend systems, and mobile apps.

It's a good overview of tools and techniques used to scale and standardize configuration management and how to avoid problems introduced by sloppy configuration management.

Holistic Configuration Management at Facebook

The next one is my favorite of the week. It establishes a baseline by taking problems from other papers that introduce some parallelization strategy, implementing them in a simple single-threaded way, and benchmarking that against the published solutions. It then defines a "metric" that describes how many cores a system needs to match the single-threaded implementation. As many sites would tell you, "the result will amaze you".

Scalability! But at what COST?
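
As a rough illustration of the metric (COST stands for "Configuration that Outperforms a Single Thread"), here is a small Python sketch that takes a single-threaded baseline and a table of runtimes per core count and returns the smallest configuration that beats the baseline. The function name and all the timings are made up for illustration; they are not numbers from the paper.

```python
def cost(single_thread_seconds, scaling_runs):
    """Return the smallest core count that beats the single-threaded baseline.

    `scaling_runs` maps core count -> runtime in seconds for the parallel
    system. Returns None when no measured configuration is faster than the
    baseline, i.e. the COST is unbounded for the data at hand.
    """
    winners = [
        cores
        for cores, seconds in scaling_runs.items()
        if seconds < single_thread_seconds
    ]
    return min(winners) if winners else None


# Hypothetical measurements for the same problem on the same data.
baseline = 300.0
system_a = {1: 1200.0, 16: 450.0, 64: 280.0, 128: 150.0}
system_b = {16: 900.0, 128: 400.0}

print(cost(baseline, system_a))  # 64
print(cost(baseline, system_b))  # None
```

Note that system_a scales nicely (it gets faster with every step up in cores) and still needs 64 cores to match one thread, which is exactly the kind of result the metric is designed to expose.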

The last one for this week surprisingly brought me to CRDT/Lasp/@cmeik land, even though the title didn't seem to imply that. The crazy fact is that I saw a talk about this paper at RICON 2015 and I didn't remember the title :)

Some parts were hard for me since it's the first paper I have read about CRDTs, so I don't have the vocabulary and basic theory in place, but it made me think of some interesting applications in the IoT and monitoring spaces.

Declarative, Sliding Window Aggregations for Computations at the Edge

Papers this week: 5

Papers so far: 29

Papers in queue: 82
