Welp, some day the experiment had to end. I stopped reading 5 papers a week
because some books arrived and I read those instead, and also because I was
busy at work. But that doesn't mean I didn't read papers at all, so here's a
list of the ones I did read.
I remember reading a paper from what later became Apache Flink that I liked a
lot. I was looking for that one and found this one instead (Stratosphere
became Flink). It was an interesting overview, and I would like to know how
much of it is still in Flink.
Many times I've heard people complaining about different aspects of the
official Erlang documentation. What I find interesting is that the content of
the Erlang documentation is really complete and detailed, so I decided to
dedicate some time to its other aspects. To get familiar with it I decided to
start with an "easy" one: its presentation.
So I downloaded erlang/otp:
git clone https://github.com/erlang/otp.git
And did a build:
# to avoid having dates formatted in your local format
export LC_ALL="en_US.utf-8"
cd otp
./otp_build setup
make docs
Then I installed the output in another folder to look at it:
mkdir ../erl-docs
make release_docs RELEASE_ROOT=../erl-docs
And served them to be able to navigate them:
cd ../erl-docs
python3 -m http.server
If you want to give it a try you need to install the following deps on
Debian-based systems:
The problem I found at first was that to see the results of my changes to
db_html.xsl I had to do a clean build from scratch, which involved
recompiling Erlang itself and took a lot of time.
Later I found a way to only build the docs again by forcing a rebuild:
make -B docs
But this still involves building the PDF files, which is the part that takes
the most time. I haven't found a target that builds only the HTML files; if you
know how, or want to try adding one to the makefile, that would be great.
With this knowledge I started improving the docs. I will cover the main things
I changed.
Improve title and description markup on landing page
Update menu icons (the folder and document icons)
Improve panel and horizontal separator styles
Align left panel's links to the left
Improve code box color, border and spacing
Improve warning and info boxes' color, border and spacing
Logo Improvements
Remove drop shadows from logo
Center Erlang logo on left panel
Erlang logo is a link to the docs' main page
Put section description after logo and before links in left panel
Semantic Improvements
Use title tags for titles
Remove usage of <br/> and empty <p></p> to add vertical spacing
Use lists for link lists
Title case section titles instead of uppercase
Add semantic markup and classes to section titles and bodies
Add classes to all generated markup
Where I couldn't figure out a semantic class I added a generic one, to help
people spot the element in the XSL document by inspecting the generated files
Clickable titles for standard sections with anchors for better linking
Improve table styling
Improve applications page
Improve modules page
Add "progressively enhanced" syntax highlighting
A JavaScript file is loaded at the bottom of the page. If it loads
successfully, it loads the syntax highlighter module and its CSS and then
styles all the code blocks in the page. If it fails to load, is blocked, or
JavaScript is disabled, the code blocks keep a default styling provided by CSS.
The markup was not modified in any way to add this feature.
Make code tokens easier to differentiate from standard text
The previous style for inline code was a really light italic font. I changed it
to monospace, but it was still hard to distinguish, so I got some inspiration
from Slack and surrounded inline code words with a light box to make them stand
out.
Indent Exports and Data Types' section bodies
This is all for now. I have some other ideas for future improvements, but they
involve changes to the documentation itself, so I will submit them separately.
I'm reproducing here a post I made on Facebook after seeing the following
transcript:
Let Mr. Tonelli know that on the same day he said that, the European Space
Agency lost contact with a probe it sent to Mars, which it had been developing
for the last 7 years. The project cost 870 million euros and has the highest
quality control standards of any industry.
One day after that, for more than two hours, services like Twitter, Netflix,
GitHub and PayPal were down because someone hacked webcams and other "smart"
devices and used them to carry out a denial of service attack against a
service that translates what you type in your browser's address bar into
addresses that computers can understand.
Whoever says that software is not going to fail is irresponsible and should not
have any responsibility legislating over even a single line of code.
Then I started adding the following comments:
1) More news of the day: a bug was found today in the operating system that the
electronic voting machines are going to use, and it lets anyone gain total
control over the system. I know you won't read it, but here it is:
2) Today it was reported that a company that issues SSL certificates (the thing
that puts the little green padlock on your bank's address and makes it a secure
connection, and that is also used to transmit the results from the voting
machines to the central server) allowed people to obtain certificates for
domains that did not belong to the people requesting them.
3) Some "fun" ones from history: Stanislav Yevgrafovich Petrov
(Станислав Евграфович Петров in Russian, born on September 9, 1939) is a
retired lieutenant colonel who served in the Soviet army during the Cold War.
He is remembered for having correctly identified a missile attack alert as a
false alarm in 1983, thereby averting what could have escalated into a nuclear
war between the Soviet Union and the United States.
4) One launched in 1998: the Mars Climate Orbiter was destroyed due to a
navigation error. The control team on Earth used the imperial system of units
to calculate the insertion parameters and sent the data to the spacecraft,
which did its calculations in the metric system. As a result, each engine burn
changed the probe's velocity in an unexpected way, and after months of flight
the error had accumulated.
6) The Therac-25 was a radiotherapy machine produced by AECL, successor to the
Therac-6 and Therac-20 models (the earlier units were produced in partnership
with CGR). The device was involved in at least six accidents between 1985 and
1987, in which several patients received radiation overdoses. Three of the
patients died as a direct consequence. These accidents cast doubt on the
reliability of software control of safety-critical systems, and became a case
study in medical informatics and software engineering.
7) In 1996 a rocket (Ariane 5) that cost 7 billion dollars to develop and was
carrying a payload valued at 500 million dollars exploded because a number type
that was "too small" was used to hold the horizontal velocity (a 64-bit value
was converted to a 16-bit integer and overflowed).
8) Knight Capital lost 440 million dollars in 45 minutes and was pushed to the
brink of bankruptcy by a software error that traded shares at the wrong prices.
9) In 2004 the Los Angeles air traffic control system stopped working because
it used a counter that was "too small". The fun part is that the backup system
stopped working minutes after being turned on.
10) In 1979 a nuclear plant in the United States "suffered a partial meltdown
of the reactor core". Cause: "The valve should have closed when the pressure
dropped, but due to a fault it did not. The signals reaching the operator did
not indicate that the valve was still open, although they should have shown
it."
11) Other times the causes are political: "...failures in communication...
resulted in a decision to launch 51-L based on incomplete and sometimes
misleading information, a conflict between engineering data and management
judgments, and a NASA management structure that permitted internal flight
safety problems to bypass key Shuttle managers."
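To get a feel for how a counter can be "too small" (item 9), here is a bit of
arithmetic in Python. It assumes, as I understand was reported about that
incident, a 32-bit counter that ticks once per millisecond; the post above only
says the counter was too small, so treat the exact width and tick as
assumptions.

```python
# Assumption: a 32-bit counter ticking once per millisecond.
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_until_exhausted(bits: int, tick_ms: int = 1) -> float:
    """Days until a counter with `bits` bits runs through all its values."""
    return (2 ** bits) * tick_ms / MS_PER_DAY


def wrapping_increment(value: int, bits: int) -> int:
    """Increment a fixed-width unsigned counter, wrapping to 0 on overflow."""
    return (value + 1) % (2 ** bits)


if __name__ == "__main__":
    print(f"{days_until_exhausted(32):.1f} days")
    print(wrapping_increment(2 ** 32 - 1, 32))
```

Under those assumptions the counter runs out in a bit under 50 days, so a
system that is not restarted before then hits the limit.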
Because nothing lasts forever, and after one week half spent traveling and
another busy one, I managed to read 4 papers this week.
The first one was interesting but comes from an area I will describe as "let's
bend relational databases to fit Event Stream Processing". That is not bad per
se, but features like joins and being able to remember past events make
scalability (at least in terms of memory) quite hard. It also never discusses
distribution, which is ok for the field but not what I'm looking for.
The interesting part about this one is the part where it introduces
Visibly Pushdown Languages
something that looks really interesting, but I couldn't find an introduction
for mere mortals. The descriptions are really dense and mathematical, which is
ok but hard to learn from for outsiders like me.
Another interesting point is the fact that it uses the XML Schema to optimize
the generated VPA (Visibly Pushdown Automata) and that the implementation not
only applies to XML but to any nested semistructured data.
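As far as I understand it, the "visibly" part means that the input symbol
alone decides what happens to the stack: some symbols always push, some always
pop, and the rest leave the stack alone. Here is a minimal sketch of that idea
for nested data. It is my own illustration, not code from the paper, and the
event kinds ("open", "close", "text") are names I made up.

```python
from typing import Iterable, List, Tuple

# Each input symbol is (kind, name). The kind alone picks the stack action:
# "open" always pushes, "close" always pops, "text" never touches the stack.
Event = Tuple[str, str]


def well_nested(events: Iterable[Event]) -> bool:
    """Accept a stream only if every open has a matching close."""
    stack: List[str] = []
    for kind, name in events:
        if kind == "open":        # call symbol: push
            stack.append(name)
        elif kind == "close":     # return symbol: pop and compare
            if not stack or stack.pop() != name:
                return False
        elif kind == "text":      # internal symbol: stack untouched
            continue
        else:
            raise ValueError(f"unknown symbol kind: {kind}")
    return not stack              # nothing may be left open at the end
```

Because the stack action never depends on the automaton's state, the nesting
structure of the input is visible just by looking at the symbols.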
The review of the next one will seem to conflict with my previous reviews, but
this one had too much emphasis on low level implementation details. Not novel
things or optimizations, just a lot of details, as if the authors found the
implementation really cool and wanted to share it with the world. Not a bad
thing per se, but in this batch I was looking for abstractions, optimizations
and distribution characteristics of stream processing, better if focused on
distributed systems, and this one talked mainly about the DSL they built that
compiles to C. It also sorts the streams, does multiple passes over the data,
does lookahead in the stream and does a kind of "micro batches", which isn't
what I was looking for.
For the last one, I found the approach interesting. They seemed to push the
purity of the approach (everything is a regular expression), which may have
ended up with a nice model (a thing I like), but reading the code it doesn't
seem to be really clear, at least coming from an OO/functional background, and
I think even less so for non programmers. Maybe the syntax doesn't help and
some other syntax would make things clearer, I don't know.
Other than that the approach is interesting, and it made me think of some ways
to define a stream processing language using mainly pattern matching.
Better late than never (even though I read all the papers last week), here
is the sixth installment of Papers of the Week.
Starting next week I will try to write the reviews right after I read the
papers and not almost one week later, when my memories are fuzzy :)
The first one describes an implementation of out of order processing using
punctuation, interesting in that it "applies" the concept of punctuation to
building a streaming system and analyzes the result.
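To make the idea of punctuation concrete, here is a small sketch of how I
picture it, not the paper's implementation. It assumes a stream that mixes
events with punctuation marks, where a mark ("punct", t) is a promise that no
later event has a timestamp below t. The tuple shapes and the function name are
made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def windowed_sums(stream: Iterable[tuple], size: int) -> Iterator[Tuple[int, int]]:
    """Sum values per tumbling window, emitting a window once it is complete.

    Items are ("event", timestamp, value) or ("punct", timestamp). Events may
    arrive out of order; only punctuation tells us a window can be closed.
    """
    open_windows: Dict[int, int] = defaultdict(int)
    for item in stream:
        if item[0] == "event":
            _, ts, value = item
            open_windows[ts // size * size] += value
        else:
            _, upto = item
            # Window [start, start + size) is complete once start + size <= upto.
            done = sorted(w for w in open_windows if w + size <= upto)
            for start in done:
                yield start, open_windows.pop(start)
```

For example, with windows of size 10 the late event at timestamp 3 below still
lands in the first window, because no punctuation has closed it yet:

```python
stream = [("event", 1, 10), ("event", 12, 5), ("event", 3, 1),
          ("punct", 10), ("event", 15, 2), ("punct", 20)]
print(list(windowed_sums(stream, 10)))
```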
This paper gives a comprehensive description of the use cases, design,
implementation, and usage statistics of a suite of tools that manage
Facebook’s configuration end-to-end, including the frontend products,
backend systems, and mobile apps.
It's a good overview of tools and techniques used to scale and standardize
configuration management and how to avoid problems introduced by sloppy
configuration management.
The next one is my favorite of the week. It defines a baseline by taking
problems from other papers that introduce some parallelization strategy,
implementing them in a simple single threaded way and benchmarking that against
the other solutions. It then defines a "metric" that describes how many cores
are required to match the single threaded implementation. As many sites would
tell you, "the result will amaze you".
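A minimal sketch of how such a metric could be computed from benchmark
results. The function name and the numbers are hypothetical, made up for
illustration, and are not measurements from the paper.

```python
from typing import Dict, Optional


def cores_to_match(parallel_runtimes: Dict[int, float],
                   single_thread_runtime: float) -> Optional[int]:
    """Smallest core count at which the parallel system is at least as fast
    as the single threaded baseline, or None if it never gets there."""
    for cores in sorted(parallel_runtimes):
        if parallel_runtimes[cores] <= single_thread_runtime:
            return cores
    return None


if __name__ == "__main__":
    # Hypothetical runtimes in seconds, keyed by number of cores.
    parallel = {1: 900.0, 2: 600.0, 4: 420.0, 8: 310.0, 16: 250.0}
    print(cores_to_match(parallel, single_thread_runtime=300.0))
```

A result of None is the uncomfortable case: no configuration that was measured
catches up with the single thread.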
The last one for this week surprisingly brought me to CRDT/Lasp/@cmeik land,
even though the title didn't seem to imply that. The crazy fact is that
I saw a talk about this paper at RICON 2015 and I didn't remember the title :)
Some parts were hard for me since it's the first paper I read about CRDTs, so I
don't have the vocabulary and basic theory in place, but it made me think of
some interesting applications in the IoT and monitoring spaces.
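For readers as new to CRDTs as I am, here is a sketch of what I understand to
be one of the simplest ones, a grow-only counter. It is not taken from the
paper or from Lasp; the class and method names are my own.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each node only increments its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Taking the per-node maximum makes merging commutative, associative
        # and idempotent, so replicas converge in whatever order they sync.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)
```

Two sensors could each count events locally and exchange state whenever they
happen to be connected; after merging in both directions they agree on the
total, which is the kind of IoT use I had in mind.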