State of the BEAM 2017: Survey Results

Intro

You can't improve what you don't measure, and since I think there are areas in the BEAM community (Erlang, Elixir, LFE, Efene, Alpaca, Clojerl et al.) to improve, we need a better picture of it.

That's why some months ago I decided to create this survey. I told some people about it and started researching other "State of the X Community" yearly surveys, wrote some draft questions, and shared them with a few people for feedback. After a couple of rounds I built a form and ran a test survey to gather more feedback; after a couple dozen answers I cleared the results and announced the survey publicly, with weekly reminders on multiple channels.

Result Analysis

We got 423 responses up to this point.

I present the results of the State of the BEAM Survey 2017 here in two ways:

  • Bar charts sorted from most answers to fewest
    • On questions with many answers I cut the chart off at some point
  • Raw data tables sorted from most answers to fewest
    • Here I consolidated some answers to avoid making the tables too large

I was thinking of doing a deep analysis of the answers, but then I realized that if I did, many people would read mine and avoid analyzing the data themselves in detail.

Instead I decided to open an analysis thread in some forum and later maybe summarize the most interesting comments.

To ease the discussion I will make some light observations where I see it makes sense and ask some questions to open the discussion.

Before diving into the results I want to make explicit two things that may make them less representative than they should be:

1. The "Elixir Effect"

I think the Elixir community is bigger, or at least more active, than the rest of the BEAM community. Because of that, and the fact that Elixir already has its own survey, I decided not to promote this survey there, to avoid Elixir-specific answers skewing the results and making this survey just yet another Elixir survey with some BEAMers also replying.

With this clarification in mind, and looking at the answers, I can identify some that are from Elixir-only developers; you can see it when Elixir-specific tools appear in the answers (Mix, ExUnit, Distillery, deploy to Heroku, etc.). Just keep that in mind when analyzing the results.

2. The "Survivorship Bias Effect"

From the Wikipedia article on survivorship bias:

Survivorship bias or survival bias is the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not, typically because of their lack of visibility. This can lead to false conclusions in several different ways. It is a form of selection bias.

Survivorship bias can lead to overly optimistic beliefs because failures are ignored, such as when companies that no longer exist are excluded from analyses of financial performance.

/galleries/state-of-beam-2017/Survivorship-bias.png

The damaged portions of returning planes show locations where they can take a hit and still return home safely; those hit in other places do not survive.

This survey was answered by people who wanted to learn Erlang, learned it, and are still active enough in the community to see the survey announcement.

This means that the answers are from the ones that "survived", which makes it really hard to get good feedback on the bad parts of the language, tooling and community, since the people most affected by them aren't going to stay around to fill out this survey.

How do we reach them? I don't know; propose solutions in the discussion.

I forgot to ask if I could make the company names public, so I won't, but I can say that I got 202 responses and most of them are not duplicates.

Things to improve for next year

  • Ask users if they want their answers available to be distributed in raw form for others to analyze
  • Ask users if I can share publicly the name of the company where they use Erlang
  • Decide what to do about Elixir-only replies, maybe make a question about it
  • Make specific questions regarding better tooling
  • I forgot Russia and Central America options, maybe next time do Latin America?

Let's see the results!

Which languages of the BEAM do you use?

/galleries/state-of-beam-2017/lang.png

Clearly Erlang is the most used language. Ignoring the Elixir Effect, I'm kind of disappointed by the lack of users trying the alternative languages, more so given that many of the complaints or requests in other questions are already solved by other languages in the ecosystem: for example, "better macros" or Lisp-inspired features are solved by LFE, static/stronger typing or better static analysis by Alpaca, and Elixir's pipe operator and a more mainstream syntax by Efene.

My advice to the community: try the other languages, blog/tweet about them and share feedback with their creators; there's a language for every taste!

Erlang 326 54.42%
Elixir 231 38.56%
LFE 14 2.34%
Luerl 12 2.00%
Alpaca 9 1.50%
Clojerl 4 0.67%
Erlog 1 0.17%
Efene 1 0.17%
PHP 1 0.17%

How would you characterize your use of BEAM Languages today?

/galleries/state-of-beam-2017/use.png

Many people are using it for serious stuff. The Open Source answer is really low here, but it's contradicted by another answer below.

I think I should add another option for something like "experiments" or "trying new ideas".

I use it at work 327 48.66%
I use it for serious "hobby" projects 245 36.46%
I'm just tinkering 62 9.23%
I use it for my studies 35 5.21%
Learning 1 0.15%
katas 1 0.15%
Open Source Software 1 0.15%

In which domains are you applying it?

/galleries/state-of-beam-2017/domains.png
Distributed Systems 225 15.20%
Web development 214 14.46%
Building and delivering commercial services 172 11.62%
Open source projects 149 10.07%
Network programming 136 9.19%
Enterprise apps 92 6.22%
Databases 80 5.41%
IoT / home automation / physical computing 75 5.07%
System administration / dev ops 60 4.05%
Big Data 51 3.45%
Mobile app development (non-web) 46 3.11%
Research 33 2.23%
AI / NLP / machine learning 28 1.89%
Games 28 1.89%
Math / data analysis 23 1.55%
Scientific computing / simulations / data visualization 21 1.42%
Desktop apps 14 0.95%
Graphics / Art 4 0.27%
Music 3 0.20%
Industrial Automation 2 0.14%
log system 1 0.07%
videostreaming 1 0.07%
soft real time analytics 1 0.07%
Security Event Processing 1 0.07%
Media encoding and distribution 1 0.07%
Ad delivery 1 0.07%
Telecom Apps 1 0.07%
telecom and chat 1 0.07%
video 1 0.07%
Developer Tooling 1 0.07%
Telecommunications 1 0.07%
embedded systems 1 0.07%
Advertising/RTB 1 0.07%
Prototyping network apps 1 0.07%
Real time systems 1 0.07%
Real-Time Bidding 1 0.07%
Instant messaging / VoIP / Communications 1 0.07%
ad traffic management 1 0.07%
REST/GraphQL API 1 0.07%
Test systems 1 0.07%
Learning 1 0.07%
telecommunications 1 0.07%
VoIP 1 0.07%
Code static analysis 1 0.07%

What industry or industries do you develop for?

/galleries/state-of-beam-2017/industries.png
Enterprise software 117 15.04%
Communications / Networking 103 13.24%
Consumer software 85 10.93%
IT / Cloud Provider 83 10.67%
Financial services / FinTech 69 8.87%
Telecom 67 8.61%
Media / Advertising 46 5.91%
Retail / ecommerce 41 5.27%
Academic 29 3.73%
Healthcare 28 3.60%
Education 26 3.34%
Government / Military 22 2.83%
Scientific 16 2.06%
Legal Tech 6 0.77%
Energy 5 0.64%
Gaming 2 0.26%
HR 2 0.26%
Security 2 0.26%
Logistics 2 0.26%
sports/fitness 1 0.13%
Retired 1 0.13%
Sport 1 0.13%
Business Intelligence 1 0.13%
Telematics / Car industry 1 0.13%
Manufacturing / Automotive 1 0.13%
Cultural/Museum 1 0.13%
Utilities 1 0.13%
Open source 1 0.13%
Travel 1 0.13%
Sport analysis 1 0.13%
Fitness 1 0.13%
Online Games 1 0.13%
Automotive 1 0.13%
Marketing 1 0.13%
Real estate 1 0.13%
Consumer electronics 1 0.13%
Non profit 1 0.13%
Client driven 1 0.13%
Industrial IoT 1 0.13%
Electric utility 1 0.13%
SaaS 1 0.13%
Automobile 1 0.13%
energy sector 1 0.13%
utilities 1 0.13%
Recruitment 1 0.13%
Energetics 1 0.13%

How long have you been using Erlang?

/galleries/state-of-beam-2017/howlong.png

The number of entrants (1 year or less) being lower than the 2- and 3-year groups may be discouraging, or maybe it's a sign that this survey didn't reach as many newcomers as it should.

> 6 Years 116 27.62%
2 Years 76 18.10%
3 Years 58 13.81%
1 Year 52 12.38%
Less than a year 45 10.71%
5 Years 36 8.57%
4 Years 34 8.10%
I've stopped using it 3 0.71%

What's your age?

/galleries/state-of-beam-2017/age.png

Similar to the previous one, the survey shows that we are not interesting to young programmers (or this survey is not interesting to them :)

30-40 179 42.42%
20-30 112 26.54%
40-50 93 22.04%
> 50 31 7.35%
< 20 7 1.66%

What's your gender?

/galleries/state-of-beam-2017/gender.png

One I was expecting, but bad nonetheless.

Male 401 95.02%
Prefer not to say 15 3.55%
Female 5 1.18%
attack helicopter 1 0.24%

Where are you located?

/galleries/state-of-beam-2017/location.png
North America 127 30.09%
Western Europe 117 27.73%
Eastern Europe 42 9.95%
Northern Europe 39 9.24%
South America 30 7.11%
Asia 25 5.92%
Oceania 11 2.61%
Russia 7 1.66%
India 6 1.42%
China 6 1.42%
South Saharan Africa 3 0.71%
Middle East 2 0.47%
Europe 1 0.24%
Iran 1 0.24%
Central America 1 0.24%
Australia 1 0.24%
Thailand 1 0.24%
East Africa 1 0.24%
Central Europe 1 0.24%

What is your level of experience with functional programming?

/galleries/state-of-beam-2017/fpexp.png

7 respondents got the joke or are really awesome programmers :)

/galleries/state-of-beam-2017/profunctor.jpg
Intermediate 202 48.44%
Advanced 148 35.49%
Beginner 57 13.67%
Profunctor Optics Level 7 1.68%
None 3 0.72%

Prior to using Erlang, which were your primary development languages?

/galleries/state-of-beam-2017/prevlang.png
C or C++ 163 14.75%
Python 145 13.12%
Javascript 144 13.03%
Ruby 138 12.49%
Java 135 12.22%
PHP 72 6.52%
C# 56 5.07%
Perl 46 4.16%
Go 26 2.35%
Haskell 25 2.26%
Swift or Objective-C 24 2.17%
Common Lisp 20 1.81%
Scala 20 1.81%
Scheme or Racket 14 1.27%
Visual Basic 11 1.00%
Clojure 8 0.72%
R 8 0.72%
Rust 7 0.63%
None 6 0.54%
OCaml 3 0.27%
F# 3 0.27%
Kotlin 2 0.18%
Standard ML 2 0.18%
Fortran 2 0.18%
Pascal 1 0.09%
Ocaml 1 0.09%
KDB 1 0.09%
so "primary" here for me is "what was most used at work" 1 0.09%
TypeScript 1 0.09%
Microsoft Access 1 0.09%
Groovy 1 0.09%
but I am a self-proclaimed polyglot 1 0.09%
Shell 1 0.09%
Tcl/Tk 1 0.09%
Limbo 1 0.09%
Smalltalk 1 0.09%
clojure 1 0.09%
ActionScript 1 0.09%
Actionscript 1 0.09%
Prolog 1 0.09%
Racket 1 0.09%
Bash 1 0.09%
ML 1 0.09%
TCL 1 0.09%
Elixir 1 0.09%
C ANSI POSIX 1 0.09%
D 1 0.09%
ocaml 1 0.09%
Assembly 1 0.09%

Which client-side language are you using with Erlang?

/galleries/state-of-beam-2017/clientlang.png
Javascript 257 44.93%
None 90 15.73%
Elm 69 12.06%
Java 36 6.29%
Swift/Objective-C 36 6.29%
Clojurescript 13 2.27%
ReasonML/Ocaml 10 1.75%
Kotlin 8 1.40%
Typescript 7 1.22%
Scala 7 1.22%
Purescript 6 1.05%
C++ 4 0.70%
TypeScript 3 0.52%
Go 2 0.35%
typescript 2 0.35%
Python 2 0.35%
Erlang 2 0.35%
Flow + Javascript 1 0.17%
HTML-CSS 1 0.17%
Haskell 1 0.17%
What do you mean by "client-side language"? 1 0.17%
other 1 0.17%
Action Script 3 1 0.17%
Coffeescript 1 0.17%
d3.js 1 0.17%
lua 1 0.17%
Python/PyQt 1 0.17%
Dart 1 0.17%
Golang 1 0.17%
Ruby 1 0.17%
M$ C# 1 0.17%
Python (interface to legacy system - not web based) 1 0.17%
clojure 1 0.17%
C# 1 0.17%
Tcl/Tk 1 0.17%

In your Erlang projects, do you interoperate with other languages? if so, which ones?

/galleries/state-of-beam-2017/interop.png
C or C++ 156 24.19%
None 92 14.26%
Python 87 13.49%
Javascript 72 11.16%
Java 51 7.91%
Ruby 37 5.74%
Rust 27 4.19%
Go 27 4.19%
Swift or Objective-C 14 2.17%
C# 12 1.86%
Scala 11 1.71%
PHP 9 1.40%
Perl 8 1.24%
R 8 1.24%
Haskell 6 0.93%
Common Lisp 4 0.62%
Clojure 3 0.47%
OCaml 3 0.47%
Elixir 2 0.31%
Scheme or Racket 2 0.31%
Bash 2 0.31%
Kotlin 1 0.16%
KDB 1 0.16%
I use Erlang from Elixir 1 0.16%
lua 1 0.16%
SQL 1 0.16%
java 1 0.16%
Ocaml 1 0.16%
go 1 0.16%
Not directly via NIFs/ports but via HTTP/rabbit with ruby 1 0.16%
Tcl/Tk 1 0.16%
Lua 1 0.16%
python 1 0.16%

Which is your primary development environment?

/galleries/state-of-beam-2017/editor.png

I thought Emacs would win here, given that the Erlang creators use Emacs.

Vim 116 27.49%
Emacs 114 27.01%
IntelliJ 47 11.14%
Visual Studio Code 47 11.14%
Sublime Text 39 9.24%
Atom 32 7.58%
Eclipse 6 1.42%
spacemacs 2 0.47%
nano 2 0.47%
linux with vim as my text editor 1 0.24%
Kate 1 0.24%
textmate 1 0.24%
TextPad 1 0.24%
Simple text editor 1 0.24%
Notepad++ 1 0.24%
Also nvi. 1 0.24%
mcedit 1 0.24%
PSPad 1 0.24%
geany with erlang syntax support 1 0.24%
Kakoune 1 0.24%
Neovim 1 0.24%
Acme 1 0.24%
Spacemacs 1 0.24%
Atom and Emacs both very equally 1 0.24%
ed 1 0.24%
elvis 1 0.24%

Where do you go for Erlang news and discussions?

/galleries/state-of-beam-2017/news.png
Twitter 204 20.20%
Mailing List 188 18.61%
Slack 116 11.49%
Reddit 111 10.99%
Stack Overflow 103 10.20%
IRC 62 6.14%
Erlang Central 60 5.94%
Newsletters 50 4.95%
Podcasts 33 3.27%
Planet Erlang 31 3.07%
ElixirForum 31 3.07%
Elixir Community 1 0.10%
lobste.rs 1 0.10%
EUC 1 0.10%
Reddit and ElixirForum 1 0.10%
This week in Erlang 1 0.10%
Gitter 1 0.10%
Awesome Elixir 1 0.10%
OTP's github for PRs and commit log 1 0.10%
elixirstatus.com 1 0.10%
Google Plus 1 0.10%
youtube 1 0.10%
Search on github 1 0.10%
Erlang solutions 1 0.10%
not interesting 1 0.10%
Medium 1 0.10%
None 1 0.10%
Watch talks 1 0.10%
Conference Videos 1 0.10%
Elixir Sips 1 0.10%
https://medium.com/@gootik 1 0.10%
https://lobste.rs/t/erlang 1 0.10%

Which versions of the Erlang VM do you currently use for development?

/galleries/state-of-beam-2017/versiondev.png

We are really up to date; we are near a point where we can assume maps in our libraries :)

20 305 46.71%
19 213 32.62%
18 84 12.86%
17 30 4.59%
16 16 2.45%
<= 15 5 0.77%

Which versions of the Erlang VM do you currently use in production?

/galleries/state-of-beam-2017/versiondeploy.png

Of course, production is a little more conservative.

19 215 37.65%
20 183 32.05%
18 94 16.46%
17 43 7.53%
16 26 4.55%
<= 15 10 1.75%

Which build tool do you use?

/galleries/state-of-beam-2017/buildtool.png

Nice to see Rebar3 picking up momentum. Mix is mainly the Elixir Effect; next year I should add an option for "Mix for Erlang or mixed projects".

Rebar3 220 32.59%
Mix 177 26.22%
Makefile 111 16.44%
Rebar 71 10.52%
erlang.mk 46 6.81%
Custom build scripts 44 6.52%
Distillery 1 0.15%
maven 1 0.15%
redo 1 0.15%
mix 1 0.15%
synrc/mad 1 0.15%
MBU 1 0.15%

How do you test your code?

/galleries/state-of-beam-2017/howtest.png

I'm surprised by EUnit being on top; why do people prefer it over Common Test?

EUnit 216 34.67%
Common Test 158 25.36%
ExUnit 74 11.88%
PropEr 69 11.08%
I don't write tests 45 7.22%
QuickCheck 33 5.30%
Custom 4 0.64%
Triq 4 0.64%
ESpec 3 0.48%
CutEr 3 0.48%
StreamData 2 0.32%
Lux 2 0.32%
py.test 2 0.32%
Functional tests 1 0.16%
Don't have time to write tests 1 0.16%
katana-test 1 0.16%
riak_test 1 0.16%
Dialyzer 1 0.16%
Integration tests 1 0.16%
Elixir tests. 1 0.16%
Concuerror 1 0.16%

How do you deploy your application?

/galleries/state-of-beam-2017/howdeploy.png

Lots of custom deploy scripts and Docker/Kubernetes here; maybe we should have better deploy support in our tools?

Custom deploy scripts 186 32.75%
Docker 128 22.54%
Kubernetes 50 8.80%
Ansible 44 7.75%
I don't deploy in other servers 40 7.04%
Chef 39 6.87%
Puppet 11 1.94%
SaltStack 11 1.94%
Heroku 11 1.94%
Edeliver 8 1.41%
deb 7 1.23%
Distillery 7 1.23%
Zones 5 0.88%
AWS CodeDeploy 3 0.53%
Rancher 1 0.18%
VM image 1 0.18%
boot from flash 1 0.18%
https://github.com/labzero/bootleg2 1 0.18%
Not my job 1 0.18%
copy paste 1 0.18%
mad 1 0.18%
CD 1 0.18%
Exrm 1 0.18%
rpm 1 0.18%
Nomad 1 0.18%
AWS ECS 1 0.18%
FreeBSD Jails 1 0.18%
lxc 1 0.18%
WIX 1 0.18%
os packages 1 0.18%
nanobox 1 0.18%
cloudfoundry 1 0.18%

What is your organization's size?

/galleries/state-of-beam-2017/orgsize.png

Can we say that Erlang works in organizations of any size?

11-50 109 26.20%
2-10 93 22.36%
Just me 75 18.03%
500+ 65 15.62%
101-500 45 10.82%
51-100 29 6.97%

Which operating system(s) do you use for development?

/galleries/state-of-beam-2017/osdev.png

Almost the same amount of Windows and FreeBSD; is it because Windows support is bad, or is this a reflection of the usual developer OS choice in any programming language?

Linux 307 47.01%
MacOS 253 38.74%
Windows 38 5.82%
FreeBSD 34 5.21%
Illumos 8 1.23%
OpenBSD 7 1.07%
Solaris 3 0.46%
NetBSD 1 0.15%
GRiSP 1 0.15%
ChromeOS 1 0.15%

Which operating system(s) do you use for deployment?

/galleries/state-of-beam-2017/osdeploy.png
Linux 378 75.15%
FreeBSD 43 8.55%
MacOS 25 4.97%
Windows 22 4.37%
I don't deploy in other servers 11 2.19%
Solaris 9 1.79%
Illumos 8 1.59%
OpenBSD 3 0.60%
RTEMS 1 0.20%
GRiSP 1 0.20%
OSv 1 0.20%
NetBSD 1 0.20%

Where do you deploy your applications?

/galleries/state-of-beam-2017/wheredeploy.png

I don't think this question provides useful information; maybe I should add more options?

Public Cloud 188 27.85%
Use on local machine(s) 162 24.00%
Traditional Infrastructure 157 23.26%
Private Cloud (or hybrid) 156 23.11%
Distillery 1 0.15%
VMs on ESXi 1 0.15%
embedded systems 1 0.15%
Rented physical server 1 0.15%
Vagrant vms 1 0.15%
Embedded appliances 1 0.15%
Heroku 1 0.15%
VPS 1 0.15%
hyper.sh 1 0.15%
Containers (Docker) 1 0.15%
Edeliver 1 0.15%
GitHub 1 0.15%

Which events have you attended in the last year?

/galleries/state-of-beam-2017/events.png

Local Meetups at the top is a nice result; we can work to promote more of these.

Local Meetup 124 46.97%
Erlang Factory 39 14.77%
Erlang User Conference 38 14.39%
Erlang Factory Light 20 7.58%
ElixirConf 12 4.55%
ElixirConfEU 9 3.41%
None 6 2.27%
Lonestar Elixir 3 1.14%
Code Mesh 2 0.76%
Lambda Days 1 0.38%
ElixirConfEU + ElixirLDN 1 0.38%
ElixirConf USA 2017 1 0.38%
Elixir London 1 0.38%
ElixirConf 2017 1 0.38%
Elixir meetup in Leeds UK 1 0.38%
ElixirLive Conference in Warsaw 1 0.38%
J on the Beach 1 0.38%
Peer gatherings in region 1 0.38%
Empex 1 0.38%
Elixir Camp 1 0.38%

Do you use HiPE?

No 301 74.32%
Yes 104 25.68%

Do you use dialyzer?

Yes 280 66.99%
No 138 33.01%

How important have each of these aspects of Erlang been to you and your projects?

Remember that the charts and tables are sorted from most to fewest answers, so pay attention to the row order to compare correctly across the following set of questions.

Community

/galleries/state-of-beam-2017/ocommunity.png
Very Important 157 37.74%
Fairly Important 112 26.92%
Important 79 18.99%
Slightly Important 52 12.50%
No Opinion 8 1.92%
Not Important at All 8 1.92%

Concurrency facilities

/galleries/state-of-beam-2017/oconcurrency.png
Very Important 306 73.73%
Fairly Important 58 13.98%
Important 36 8.67%
No Opinion 7 1.69%
Slightly Important 7 1.69%
Not Important at All 1 0.24%

Ease of development

/galleries/state-of-beam-2017/oeasedev.png
Very Important 205 49.52%
Fairly Important 98 23.67%
Important 72 17.39%
Slightly Important 27 6.52%
No Opinion 10 2.42%
Not Important at All 2 0.48%

Functional Programming

/galleries/state-of-beam-2017/ofp.png
Very Important 207 49.88%
Fairly Important 105 25.30%
Important 53 12.77%
Slightly Important 33 7.95%
No Opinion 9 2.17%
Not Important at All 8 1.93%

Immutability

/galleries/state-of-beam-2017/oimmutability.png
Very Important 222 53.62%
Fairly Important 90 21.74%
Important 60 14.49%
Slightly Important 30 7.25%
No Opinion 8 1.93%
Not Important at All 4 0.97%

Runtime performance

/galleries/state-of-beam-2017/operf.png
Very Important 148 35.75%
Fairly Important 122 29.47%
Important 95 22.95%
Slightly Important 36 8.70%
Not Important at All 7 1.69%
No Opinion 6 1.45%

The REPL

/galleries/state-of-beam-2017/orepl.png
Very Important 145 35.02%
Fairly Important 106 25.60%
Important 74 17.87%
Slightly Important 61 14.73%
No Opinion 19 4.59%
Not Important at All 9 2.17%

Tracing

/galleries/state-of-beam-2017/otracing.png
Slightly Important 96 23.02%
Very Important 95 22.78%
Fairly Important 90 21.58%
Important 82 19.66%
Not Important at All 29 6.95%
No Opinion 25 6.00%

What has been most frustrating or has prevented you from using Erlang more than you do now?

App deployment

/galleries/state-of-beam-2017/fappdev.png
Not Frustrating at All 120 30.08%
Slightly Frustrating 93 23.31%
Frustrating 54 13.53%
Fairly Frustrating 41 10.28%
Quite the contrary: I love this feature 38 9.52%
No Opinion 29 7.27%
Very Frustrating 24 6.02%

Error messages

/galleries/state-of-beam-2017/ferrormsgs.png
Slightly Frustrating 119 29.82%
Not Frustrating at All 89 22.31%
Frustrating 48 12.03%
Fairly Frustrating 43 10.78%
Quite the contrary: I love this feature 39 9.77%
Very Frustrating 38 9.52%
No Opinion 23 5.76%

Finding libraries

/galleries/state-of-beam-2017/flibs.png
Slightly Frustrating 137 34.08%
Not Frustrating at All 121 30.10%
Frustrating 64 15.92%
Fairly Frustrating 31 7.71%
Quite the contrary: I love this feature 21 5.22%
Very Frustrating 15 3.73%
No Opinion 13 3.23%

Hard to Learn it

/galleries/state-of-beam-2017/fhardlearn.png
Not Frustrating at All 204 51.13%
Slightly Frustrating 77 19.30%
Quite the contrary: I love this feature 48 12.03%
Frustrating 29 7.27%
No Opinion 21 5.26%
Fairly Frustrating 14 3.51%
Very Frustrating 6 1.50%

Hiring and staffing

/galleries/state-of-beam-2017/fhiring.png
No Opinion 124 31.47%
Slightly Frustrating 78 19.80%
Not Frustrating at All 71 18.02%
Fairly Frustrating 43 10.91%
Frustrating 41 10.41%
Very Frustrating 25 6.35%
Quite the contrary: I love this feature 12 3.05%

Installation process

/galleries/state-of-beam-2017/finstallation.png
Not Frustrating at All 218 55.05%
Slightly Frustrating 67 16.92%
Quite the contrary: I love this feature 54 13.64%
Frustrating 23 5.81%
Fairly Frustrating 16 4.04%
No Opinion 14 3.54%
Very Frustrating 4 1.01%

Long term viability

/galleries/state-of-beam-2017/fviability.png
Not Frustrating at All 194 48.87%
Quite the contrary: I love this feature 74 18.64%
Slightly Frustrating 46 11.59%
No Opinion 41 10.33%
Frustrating 28 7.05%
Fairly Frustrating 9 2.27%
Very Frustrating 5 1.26%

Need more docs/tutorials

/galleries/state-of-beam-2017/fdocs.png
Not Frustrating at All 127 32.32%
Slightly Frustrating 124 31.55%
Frustrating 44 11.20%
Fairly Frustrating 37 9.41%
Quite the contrary: I love this feature 22 5.60%
No Opinion 22 5.60%
Very Frustrating 17 4.33%

Need more text editor support/IDEs

/galleries/state-of-beam-2017/fides.png
Not Frustrating at All 168 42.00%
Slightly Frustrating 93 23.25%
Frustrating 39 9.75%
Quite the contrary: I love this feature 32 8.00%
Fairly Frustrating 28 7.00%
Very Frustrating 22 5.50%
No Opinion 18 4.50%

Need more tools

/galleries/state-of-beam-2017/ftools.png
Slightly Frustrating 128 32.16%
Not Frustrating at All 99 24.87%
Frustrating 58 14.57%
Fairly Frustrating 40 10.05%
Very Frustrating 34 8.54%
No Opinion 26 6.53%
Quite the contrary: I love this feature 13 3.27%

No static typing

/galleries/state-of-beam-2017/ftyping.png
Not Frustrating at All 113 28.18%
Slightly Frustrating 105 26.18%
Quite the contrary: I love this feature 63 15.71%
Frustrating 40 9.98%
Fairly Frustrating 34 8.48%
Very Frustrating 25 6.23%
No Opinion 21 5.24%

Release schedule

/galleries/state-of-beam-2017/freleasesched.png
Not Frustrating at All 258 64.99%
Quite the contrary: I love this feature 57 14.36%
No Opinion 43 10.83%
Slightly Frustrating 26 6.55%
Frustrating 9 2.27%
Very Frustrating 2 0.50%
Fairly Frustrating 2 0.50%

Runtime performance

/galleries/state-of-beam-2017/fperformance.png
Not Frustrating at All 185 46.25%
Slightly Frustrating 72 18.00%
Quite the contrary: I love this feature 57 14.25%
Frustrating 32 8.00%
No Opinion 25 6.25%
Fairly Frustrating 17 4.25%
Very Frustrating 12 3.00%

Unpleasant community

/galleries/state-of-beam-2017/fcommunity.png
Not Frustrating at All 224 56.14%
Quite the contrary: I love this feature 79 19.80%
No Opinion 45 11.28%
Slightly Frustrating 26 6.52%
Frustrating 14 3.51%
Very Frustrating 7 1.75%
Fairly Frustrating 4 1.00%

Version incompatibility

/galleries/state-of-beam-2017/fversioncompat.png
Not Frustrating at All 212 53.13%
Slightly Frustrating 84 21.05%
No Opinion 40 10.03%
Quite the contrary: I love this feature 29 7.27%
Frustrating 19 4.76%
Fairly Frustrating 11 2.76%
Very Frustrating 4 1.00%

Any feature you would like to see added to the language?

This was an open-ended question; I'm summarizing similar answers here in groups.

Static Typing 20
Performance 7
Pipe operator 7
JIT 6
Currying 6
Better GUI lib 5
Better macros 4
Docs in shell 4
Better language interop 4
JSON in stdlib 3
Compile to single binary 3
Namespaces 3
Rebind variables 2
Numeric performance 2
Elixir protocols 2
Language server protocol 2
Non full mesh disterl 2
Consensus implementations in stdlib 2
More than 2 versions of the same module 2
Atom GC 2

Other answers with one vote:

  • Backward compatibility
  • BEAM on browsers
  • Better binary syntax
  • Better container support
  • Better datetime support
  • Better documentation
  • Better errors
  • Better global registry
  • Better if expression
  • Better map support in mnesia
  • Better ML integration
  • Better module system
  • Better proc_lib
  • Better profiler
  • Better site
  • Better string module
  • Better unicode support
  • Bigger standard library
  • Bring back parameterized modules
  • Cleanup standard library
  • Code change watcher and loader
  • Consistent error return
  • CRDTs
  • Curses version of observer
  • Database drivers/support
  • Early return statement
  • Encrypted inter node communication
  • Erlang leveldb/rocksdb (better DETS)
  • First class records (not as tuples)
  • Function composition
  • IPv6
  • Laziness
  • LLVM based Hipe
  • Map comprehensions
  • Monads
  • More behaviors
  • More Lispy
  • More robust on_load
  • Multi-poll sets
  • Native compilation
  • New logo
  • Numerical/GPU support
  • Orleans
  • Package manager
  • Rational numbers
  • Remove stuff
  • Short circuit folding a list
  • Single file distribution
  • String performance
  • Top-like tool
  • Type checking as you type
  • Type inference
  • WebRTC
  • With expression like Elixir

Any advice on how we can make Erlang more welcoming and easy to use?

This was an open-ended question; I'm summarizing similar answers here in groups.

Better guides 20
Better documentation 18
Better error messages 13
Better tooling 9
Central/better landing page 4
Better REPL 3
Learn from Elixir community 3
IDE support 3
Translated documentation 2
Better libraries 2
Friendlier community 2
Better learning curve 2
Marketing 2
Searchable docs 2
Throw away legacy 2
Simpler release process 2

Other answers with one vote:

  • A killer app
  • Better tracing tools
  • Better Windows experience
  • Embrace BEAM languages
  • Erlang forum
  • Introductory workshops
  • More conferences
  • More welcoming mailing list
  • Nicer syntax
  • Performance
  • Templates

Would you like to share any frustrating experience?

This was an open-ended question; I'm summarizing similar answers here in groups.

Bad tooling 6
Lack of libraries, immaturity, not maintained 5
Lack of guides 5
Unwelcoming community 4
Lack of strong/static typing 4
Syntax 4
No examples on documentation/in general 3
Bad documentation 3
Mnesia 3
Bad debugging experience 3
Ignoring other languages/communities/feedback 3
Niche language 2
Bad shell 2
No jobs 2
Learning curve 2
Difficulty contributing to the BEAM and OTP 2
Confusing errors 2
Performance 2
Package management 2

Other answers with one vote:

  • Atoms not garbage collected
  • Elitism
  • Flat namespace
  • Hard to hire
  • Lack of features in standard library
  • Lack of language features
  • Not a clear overview of the ecosystem
  • People complaining about syntax
  • String vs binary
  • Upgrades break code

Papers (and other things) of the LargeSpanOfTime III

I got to these papers by following the references from another paper; they were interesting but not as much as I was expecting:

This was on my queue:

A good one; I found it quite hard to find good overviews of Datalog before this one.

Finished reading the book (or collection of 19 papers) Your Wish is My Command: Giving Users the Power to Instruct their Software; there are really good chapters/papers in it.

Reading:

Papers this not-so-long span of time: 5 (counting books as 1)

Papers+Books so far: 59

Multi-Paxos with riak_ensemble Part 3

In the previous post I showed how to use riak_ensemble in a rebar3 project; now I will show how to create an HTTP API for the key/value store using Cowboy and jsone.

This post assumes that you have Erlang and rebar3 installed; I'm using Erlang 19.3 and rebar3 3.4.3.

The source code for this post is at https://github.com/marianoguerra/cadena; check the commits for the steps.

Dependency Setup

To have an HTTP API we will need an HTTP server; in our case we will use Cowboy 2.0 RC 3. For that we need to:

  1. Add it as a dependency (we will load it from git since it's still a release candidate)
  2. Add it to our list of applications to start when our application starts
  3. Add it to the list of dependencies to include in our release
  4. Set up the HTTP listener and routes when our application starts (see the sketch below)
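A minimal sketch of step 4, assuming Cowboy 2.0's cowboy:start_clear/3 API, a hypothetical cadena_http_listener name and an http_port application environment key (the real setup lives in the repo's application start code):

% illustrative application start callback, not the repo's exact code
start(_StartType, _StartArgs) ->
    % compile the single route, the path bindings are :ensemble and :key
    Dispatch = cowboy_router:compile(
                 [{'_', [{"/keys/:ensemble/:key", cadena_h_keys, []}]}]),
    Port = application:get_env(cadena, http_port, 8080),
    % start a plain (non-TLS) HTTP listener on the configured port
    {ok, _} = cowboy:start_clear(cadena_http_listener, [{port, Port}],
                                 #{env => #{dispatch => Dispatch}}),
    cadena_sup:start_link().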

We set up just one route, handled by the cadena_h_keys module. It's a plain HTTP handler, no fancy REST stuff for now; there we handle the request in the init/2 function itself, pattern matching against the method field of the request object and handling:

POST
set a key in a given ensemble to the value sent in the JSON request body
GET
get a key in a given ensemble, if not found null will be returned in the value field in the response
DELETE
delete a key in a given ensemble; returns null both if the key existed and if it didn't

Any other method would get a 405 Method Not Allowed response.

The route has the format /keys/<ensemble>/<key>; for now we only allow the root ensemble in the <ensemble> part of the path.
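A minimal sketch of what such a handler can look like, assuming the jsone encode/decode API and the riak_ensemble_client kover/kget/kdelete calls from the previous posts; the real cadena_h_keys module in the repo differs in details such as the response format:

% illustrative sketch of the handler, not the repo's exact module
-module(cadena_h_keys).
-export([init/2]).

init(Req0, State) ->
    % for now only the root ensemble is allowed in the path
    <<"root">> = cowboy_req:binding(ensemble, Req0),
    Key = cowboy_req:binding(key, Req0),
    Req = handle(cowboy_req:method(Req0), root, Key, Req0),
    {ok, Req, State}.

handle(<<"POST">>, Ensemble, Key, Req0) ->
    {ok, Body, Req} = cowboy_req:read_body(Req0),
    Value = jsone:decode(Body),
    reply(riak_ensemble_client:kover(node(), Ensemble, Key, Value, 5000), Req);
handle(<<"GET">>, Ensemble, Key, Req) ->
    reply(riak_ensemble_client:kget(node(), Ensemble, Key, 5000), Req);
handle(<<"DELETE">>, Ensemble, Key, Req) ->
    reply(riak_ensemble_client:kdelete(node(), Ensemble, Key, 5000), Req);
handle(_Method, _Ensemble, _Key, Req) ->
    cowboy_req:reply(405, #{}, <<>>, Req).

% simplified reply: the real handler also returns the object under "data"
reply({ok, _Obj}, Req) ->
    cowboy_req:reply(200, #{<<"content-type">> => <<"application/json">>},
                     jsone:encode(#{<<"ok">> => true}), Req);
reply(_Error, Req) ->
    cowboy_req:reply(500, #{}, <<>>, Req).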

We also add the jsone library to encode/decode JSON and the lager library to log messages.

We add both to the list of dependencies to include in the release.

We will also need a way to override the HTTP port each instance listens on, so we can run a cluster on one computer with each node listening for HTTP requests on a different port.

The dev and prod releases will listen on 8080 as specified in vars.config.

  • node1 will listen on port 8081 (override in vars_node1.config)
  • node2 will listen on port 8082 (override in vars_node2.config)
  • node3 will listen on port 8083 (override in vars_node3.config)

To avoid having to configure this in sys.config we will define a cuttlefish schema in config.schema that cuttlefish will use to generate a default config file and validation code for us.

We have to replace the variables from the overlay var overrides in our config.schema file for each release before it's processed by cuttlefish itself; for that we use the template directive in an overlay section of the release config.
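As a rough idea, the http.port entry of such a schema can look like the following sketch; the cadena.http_port application key and the {{http_port}} overlay variable name are assumptions here (http.acceptors and data.dir follow the same pattern):

%% @doc port to listen to for HTTP API
{mapping, "http.port", "cadena.http_port", [
  {default, {{http_port}} },
  {datatype, integer}
]}.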

Build devrel:

make devrel

Check the configuration file generated for each node at:

_build/node1/rel/cadena/etc/cadena.conf
_build/node2/rel/cadena/etc/cadena.conf
_build/node3/rel/cadena/etc/cadena.conf

The first part is the one of interest to us; it looks like this for node1 (the port number is different for node2 and node3):

## port to listen to for HTTP API
##
## Default: 8081
##
## Acceptable values:
##   - an integer
http.port = 8081

## number of acceptors to use for HTTP API
##
## Default: 100
##
## Acceptable values:
##   - an integer
http.acceptors = 100

## folder where ensemble data is stored
##
## Default: ./cadena_data
##
## Acceptable values:
##   - text
data.dir = ./cadena_data

Start 3 nodes in 3 different shells:

make node1-console
make node2-console
make node3-console

Start the ensemble and join the nodes; I created a target called devrel-setup in the Makefile to make it easier:

make devrel-setup

Let's set key1 in ensemble root to 42 on node1 (port 8081):

curl -X POST http://localhost:8081/keys/root/key1 -d 42

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Let's get key1 in ensemble root on node2 (port 8082):

curl -X GET http://localhost:8082/keys/root/key1

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Same on node3:

curl -X GET http://localhost:8083/keys/root/key1

Response:

{"data":{"epoch":2,"key":"key1","seq":10,"value":42},"ok":true}

Overwrite on node1:

curl -X POST http://localhost:8081/keys/root/key1 -d '{"number": 42}'

Response:

{"data":{"epoch":2,"key":"key1","seq":400,"value":{"number":42}},"ok":true}

Get key2 on node2 (it's not set yet, so its value is null):

curl -X GET http://localhost:8082/keys/root/key2
{"data":{"epoch":3,"key":"key2","seq":11,"value":null},"ok":true}

Let's set key2 in ensemble root to {"number": 42} on node1 (port 8081):

curl -X POST http://localhost:8081/keys/root/key2 -d '{"number": 42}'

Response:

{"data":{"epoch":3,"key":"key2","seq":67,"value":{"number":42}},"ok":true}

Get it on node2:

curl -X GET http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":67,"value":{"number":42}},"ok":true}

Delete key2 in ensemble root on node2:

curl -X DELETE http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":137,"value":null},"ok":true}

Check that it was removed by trying to get it again on node2:

curl -X GET http://localhost:8082/keys/root/key2

Response:

{"data":{"epoch":3,"key":"key2","seq":137,"value":null},"ok":true}

There you go, now you have a Consistent Key Value Store with an HTTP API.

Multi-Paxos with riak_ensemble Part 2

In the previous post I showed how to use riak_ensemble from the interactive shell; now I will show how to use rebar3 to set up riak_ensemble in a real project.

This post assumes that you have Erlang and rebar3 installed; I'm using Erlang 19.3 and rebar3 3.4.3.

The source code for this post is at https://github.com/marianoguerra/cadena; check the commits for the steps.

Create Project

rebar3 new app name=cadena
cd cadena

The project structure should look like this:

.
├── LICENSE
├── README.md
├── rebar.config
└── src
    ├── cadena_app.erl
    ├── cadena.app.src
    └── cadena_sup.erl

1 directory, 6 files

Configuring Dev Release

We do the following steps, check the links for comments on what's going on for each step:

  1. Add Dependencies
  2. Configure relx section
    1. Add overlay variables file vars.config
    2. Add sys.config
    3. Add vm.args

Build a release to test that everything is setup correctly:

$ rebar3 release

Run the release interactively with a console:

$ _build/default/rel/cadena/bin/cadena console

Output (edited and paths redacted for clarity):

Exec: erlexec
        -boot _build/default/rel/cadena/releases/0.1.0/cadena
        -boot_var ERTS_LIB_DIR erts-8.3/../lib
        -mode embedded
        -config    _build/default/rel/cadena/generated.conf/app.1.config
        -args_file _build/default/rel/cadena/generated.conf/vm.1.args
        -vm_args   _build/default/rel/cadena/generated.conf/vm.1.args
        -- console

Root: _build/default/rel/cadena
Erlang/OTP 19 [erts-8.3] [source] [64-bit] [smp:4:4] [async-threads:64]
                      [kernel-poll:true]

18:31:12.150 [info] Application lager started on node 'cadena@127.0.0.1'
18:31:12.151 [info] Application cadena started on node 'cadena@127.0.0.1'
Eshell V8.3  (abort with ^G)
(cadena@127.0.0.1)1>

Quit:

(cadena@127.0.0.1)1> q().
ok

Non interactive start:

$ _build/default/rel/cadena/bin/cadena start

No output is generated if it's started, we can check if it's running by pinging the application:

$ _build/default/rel/cadena/bin/cadena ping

We should get:

pong

If we want we can attach a console to the running system:

$ _build/default/rel/cadena/bin/cadena attach

Output:

Attaching to /tmp/erl_pipes/cadena@127.0.0.1/erlang.pipe.1 (^D to exit)

(cadena@127.0.0.1)1>

If we press Ctrl+d we can detach the console without stopping the system:

(cadena@127.0.0.1)1> [Quit]

We can stop the system whenever we want issuing the stop command:

$ _build/default/rel/cadena/bin/cadena stop

Output:

ok

Note

Use Ctrl+d to exit; if we write q(). we not only detach the console but also stop the system!

Let's try it.

Non interactive start:

$ _build/default/rel/cadena/bin/cadena start

No output is generated if it's started, we can check if it's running by pinging the application:

$ _build/default/rel/cadena/bin/cadena ping

We should get:

pong

If we want we can attach a console to the running system:

$ _build/default/rel/cadena/bin/cadena attach

Output:

Attaching to /tmp/erl_pipes/cadena@127.0.0.1/erlang.pipe.1 (^D to exit)

(cadena@127.0.0.1)1>

Now let's quit with q():

(cadena@127.0.0.1)1> q().

Output:

ok

Now let's see if it's alive:

$ _build/default/rel/cadena/bin/cadena ping

Node 'cadena@127.0.0.1' not responding to pings.

Be careful with how you quit attached consoles in production systems :)

Configure Prod and Dev Cluster Releases

Building Prod Release

We start by adding a new section to rebar.config called profiles, and define 4 profiles that override the default release config with specific values.
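As a rough sketch, and assuming a single overlay vars file per node profile (the exact options live in the repo's rebar.config), the profiles section can look something like this:

{profiles,
 [{prod,  [{relx, [{dev_mode, false},
                   {include_erts, true}]}]},
  {node1, [{relx, [{overlay_vars, "config/vars_node1.config"}]}]},
  {node2, [{relx, [{overlay_vars, "config/vars_node2.config"}]}]},
  {node3, [{relx, [{overlay_vars, "config/vars_node3.config"}]}]}]}.

Let's start by trying the prod profile, which we will use to create production releases of the project: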

rebar3 as prod release

Output:

===> Verifying dependencies...
...
===> Compiling cadena
===> Running cuttlefish schema generator
===> Starting relx build process ...
===> Resolving OTP Applications from directories:
          _build/prod/lib
          erl-19.3/lib
===> Resolved cadena-0.1.0
===> Including Erts from erl-19.3
===> release successfully created!

Notice now that we have a new folder in the _build directory:

$ ls -1 _build

Output:

default
prod

The results of the commands run "as prod" are stored in the prod folder.

If you explore the prod/rel/cadena folder you will notice there's a folder called erts-8.3 (the version may differ if you are using a different Erlang version); that folder is there because of the include_erts option we overrode in the prod profile.

This means you can zip the _build/prod/rel/cadena folder, upload it to a server that doesn't have Erlang installed on it and still run your release there.

This is a good way to be sure that the version running in production is the same you use in development or at build time in your build server.

Just be careful with deploying to an operating system too different from the one you used to create the release, because you may have problems with native libraries like libc or openssl.

Running it is done as usual, only the path changes:

_build/prod/rel/cadena/bin/cadena console

_build/prod/rel/cadena/bin/cadena start
_build/prod/rel/cadena/bin/cadena ping
_build/prod/rel/cadena/bin/cadena attach
_build/prod/rel/cadena/bin/cadena stop

Building Dev Cluster Releases

To build a cluster we need at least 3 nodes; that's why the last 3 profiles are node1, node2 and node3. They need different node names, so we use the overlay var files to override the name of each one: config/vars_node1.config for node1, config/vars_node2.config for node2 and config/vars_node3.config for node3.
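For example, vars_node1.config can be as small as the following sketch; the node variable name is an assumption and depends on what the vm.args template references:

%% config/vars_node1.config (illustrative)
{node, "node1@127.0.0.1"}.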

Now let's build them:

rebar3 as node1 release
rebar3 as node2 release
rebar3 as node3 release

The output for each should be similar to the one for the prod release.

Now on three different shells start each node:

./_build/node1/rel/cadena/bin/cadena console

Check the name of the node in the shell:

(node1@127.0.0.1)1>

Do the same for node2 and node3 on different shells:

./_build/node2/rel/cadena/bin/cadena console
./_build/node3/rel/cadena/bin/cadena console

You should get respectively:

(node2@127.0.0.1)1>

And:

(node3@127.0.0.1)1>

In case you don't remember, you can quit with q().

Joining the Cluster Together

So far we built 3 releases of the same code with slight modifications to allow running a cluster on one computer, but 3 nodes running doesn't mean we have a cluster; for that we need to use what we learned in Multi-Paxos with riak_ensemble Part 1, but now in code and not interactively.

For that we will create a cadena_console module that we will use to make calls from the outside and trigger actions on each node; the code is similar to the one presented in Multi-Paxos with riak_ensemble Part 1.

join([NodeStr]) ->
    % node name comes as a list string, we need it as an atom
    Node = list_to_atom(NodeStr),
    % check that the node exists and is alive
    case net_adm:ping(Node) of
        % if not, return an error
        pang ->
            {error, not_reachable};
        % if it replies, let's join it, passing our node reference
        pong ->
            riak_ensemble_manager:join(Node, node())
    end.

create([]) ->
    % enable riak_ensemble_manager
    riak_ensemble_manager:enable(),
    % wait until it stabilizes
    wait_stable().

ensemble_status([]) ->
    case riak_ensemble_manager:enabled() of
        false ->
            {error, not_enabled};
        true ->
            Nodes = lists:sort(riak_ensemble_manager:cluster()),
            io:format("Nodes in cluster: ~p~n",[Nodes]),
            LeaderNode = node(riak_ensemble_manager:get_leader_pid(root)),
            io:format("Leader: ~p~n",[LeaderNode])
    end.

We also need to add the riak_ensemble supervisor to our supervisor tree in cadena_sup:

init([]) ->
    % get the configuration from sys.config
    DataRoot = application:get_env(riak_ensemble, data_root, "./data"),
    % create a unique path for each node to avoid clashes if running more
    % than one node in the same computer
    NodeDataDir = filename:join(DataRoot, atom_to_list(node())),

    Ensemble = {riak_ensemble_sup,
                {riak_ensemble_sup, start_link,
                 [NodeDataDir]},
                permanent, 20000, supervisor, [riak_ensemble_sup]},

    {ok, { {one_for_all, 0, 1}, [Ensemble]} }.

Before building the dev cluster we need to add the crypto app to cadena.app.src since it's needed by riak_ensemble to create the cluster.
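In practice that means the applications list in cadena.app.src includes crypto; something along these lines, where everything except the crypto entry is just the default rebar3 template and may not match the repo exactly:

{application, cadena,
 [{description, "An OTP application"},
  {vsn, "0.1.0"},
  {registered, []},
  {mod, {cadena_app, []}},
  {applications, [kernel, stdlib, crypto]},
  {env, []},
  {modules, []}]}.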

Now let's build the dev cluster; I created a Makefile to make it simpler:

make devrel

On three different shells run one command on each:

make node1-console
make node2-console
make node3-console

Let's make an rpc call to enable the riak_ensemble cluster on node1:

./_build/node1/rel/cadena/bin/cadena rpc cadena_console create

On node1 you should see something like:

[info] {root,'node1@127.0.0.1'}: Leading

Let's join node2 to node1:

./_build/node2/rel/cadena/bin/cadena rpc cadena_console join node1@127.0.0.1

On node1 you should see:

[info] join(Vsn): {1,152} :: 'node2@127.0.0.1' :: ['node1@127.0.0.1']

On node2:

[info] JOIN: success

Finally let's join node3:

./_build/node3/rel/cadena/bin/cadena rpc cadena_console join node1@127.0.0.1

Output on node1:

[info] join(Vsn): {1,453} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1']

On node3:

[info] JOIN: success

Let's check that the 3 nodes have the same view of the cluster; first let's ask node1 what the ensemble status is:

./_build/node1/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

node2:

$ ./_build/node2/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

node3:

$ ./_build/node3/rel/cadena/bin/cadena rpc cadena_console ensemble_status
Nodes in cluster: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']
Leader: 'node1@127.0.0.1'

Everything looks right. Stop the 3 nodes (q().) and start them again; you will see that after starting up, node1 logs:

[info] {root,'node1@127.0.0.1'}: Leading

And if you call ensemble_status on any node you get the same outputs as before; this means they remember the cluster topology even after restarts.

Public/Private Key Encryption, Sign and Verification in Erlang

You want to encrypt/decrypt some content?

You want to generate a signature and let others verify it?

At least that's what I wanted to do, so here it is.

First generate a key pair if you don't have one available:

openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -out public.pem -outform PEM -pubout

Load the raw keys:

{ok, RawSKey} = file:read_file("private.pem").
{ok, RawPKey} = file:read_file("public.pem").

[EncSKey] = public_key:pem_decode(RawSKey).
SKey = public_key:pem_entry_decode(EncSKey).

[EncPKey] = public_key:pem_decode(RawPKey).
PKey = public_key:pem_entry_decode(EncPKey).

Let's encrypt a message with the private key and decrypt with the public key:

Msg = <<"hello crypto world">>.
CMsg = public_key:encrypt_private(Msg, SKey).
Msg = public_key:decrypt_public(CMsg, PKey).

We can do it the other way around: encrypt with the public key and decrypt with the private key:

CPMsg = public_key:encrypt_public(Msg, PKey).
Msg = public_key:decrypt_private(CPMsg, SKey).

Let's generate a signature for the message that others can verify with our public key:

Signature = public_key:sign(Msg, sha256, SKey).
public_key:verify(Msg, sha256, Signature, PKey).

% let's see if it works with another message
public_key:verify(<<"not the original message">>, sha256, Signature, PKey).
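If everything is in order, the first verify call returns true and the check against the tampered message returns false:

true
false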

Papers (and other things) of the LargeSpanOfTime II

OK, the title is getting fuzzier and fuzzier, but I decided to condense some things I've been reading here.

Papers:

Bringing the Web up to Speed with WebAssembly:

I like compilers, and their implementations, so I've been following WebAssembly, this is a good place to look at.

Spanner, TrueTime & The CAP Theorem:

A blog post by Google made the rounds lately, with people saying that Google claimed to have beaten the CAP Theorem, so I went to the source. The conclusion is interesting:

Spanner reasonably claims to be an “effectively CA” system despite operating over a wide area, as it is
always consistent and achieves greater than 5 9s availability. As with Chubby, this combination is possible
in practice if you control the whole network, which is rare over the wide area. Even then, it requires
significant redundancy of network paths, architectural planning to manage correlated failures, and very
careful operations, especially for upgrades. Even then outages will occur, in which case Spanner chooses
consistency over availability.
Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency,
consistent reads without locking, and consistent snapshots.

Bitcoin: A Peer-to-Peer Electronic Cash System:

Again, many people are ranting and raving about Bitcoin, blockchain and cryptocurrencies; what's better than going to the source? A really readable paper.

CAP Twelve Years Later: How the “Rules” Have Changed:

I have a feeling of déjà vu that I already read this paper, but just to be sure I read it again; an interesting summary of the concepts and how they evolved over time.

LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data:

I wanted to read the LSM-tree paper and it seems I didn't look at what I was clicking, so instead I ended up reading the LSM-trie paper, which is really interesting and has an overview of the LSM-tree one; now I have to go and read that one too.

A prettier printer, by Philip Wadler:

In a previous post I mentioned that I read "The Design of a Pretty-printing Library" and that I was expecting something else; well, this paper is that something else, and I liked it more.

Metaobject protocols: Why we want them and what else they can do:

Being an aspiring Smug Lisp Weenie I had to read this one; it's a nice paper and puts a name on some "patterns" that I've observed but couldn't describe clearly.

The Cube Data Model: A Conceptual Model and Algebra for On-Line Analytical Processing in Data Warehouses:

I've been thinking lately about the relation between Pivot Tables, Data Cubes and the things mentioned in the paper A Layered Grammar of Graphics, so I started reading more about Data Cubes; I skimmed a couple of papers that I forgot to register somewhere, but this one I actually registered.

End-to-End Arguments in System Design:

Someone somewhere mentioned this paper so I went to look; it's a really good one. Like the Metaobject protocols paper and others I've read, this one is a condensation of years of knowledge and experience that is really interesting to read.

Books:

Object-Oriented Programming in the Beta Programming Language:

Interesting book about a really interesting (and different) object-oriented programming language by the creators of Simula (aka the creators of object orientation); it explains an abstraction called "patterns" in which all other abstractions are expressed.

Project Oberon The Design of an Operating System and Compiler:

Another interesting book by Niklaus Wirth, creator of, among others, Pascal, Modula and Oberon, describing how to basically create computing from scratch.

I will note that I skimmed over the dense specification parts of those books since I wasn't trying to implement or use them.

Reading:

Papers this looong week: 11 (counting books as papers because why not)

Papers so far: 54

Papers in queue: don't know

Multi-Paxos with riak_ensemble Part 1

In this post I will go through the initial steps to set up a project using riak_ensemble and use its core APIs. We will do it manually in the shell on purpose; later (I hope) I will post how to build it properly in code.

First we create a new project; I'm using Erlang 19.3 and rebar3 3.4.3:

rebar3 new app name=cadena

Then add the riak_ensemble dependency to rebar.config; it should look like this:

{erl_opts, [debug_info]}.
{deps, [{riak_ensemble_ng, "2.4.0"}]}.

Now on 3 different terminals start 3 erlang nodes:

rebar3 shell --name node1@127.0.0.1
rebar3 shell --name node2@127.0.0.1
rebar3 shell --name node3@127.0.0.1

Run the following in every node:

Timeout = 1000.
Ensemble = root.
K1 = <<"k1">>.

application:set_env(riak_ensemble, data_root, "data/" ++ atom_to_list(node())).
application:ensure_all_started(riak_ensemble).

We are setting a variable telling riak_ensemble where to store the data for each node: node1 will store it under data/node1@127.0.0.1, node2 under data/node2@127.0.0.1 and node3 under data/node3@127.0.0.1.

After that we ensure all apps that riak_ensemble requires to run are started.

You should see something like this:

ok

18:05:50.548 [info] Application lager started on node 'node1@127.0.0.1'
18:05:50.558 [info] Application riak_ensemble started on node 'node1@127.0.0.1'
{ok,[syntax_tools,compiler,goldrush,lager,riak_ensemble]}

Now on node1 run:

riak_ensemble_manager:enable().

Output:

ok

We start the riak_ensemble_manager in one node only.

Then on node2 we join node1 and node3:

riak_ensemble_manager:join('node1@127.0.0.1' ,node()).
riak_ensemble_manager:join('node3@127.0.0.1' ,node()).

Output on node2:

18:06:39.285 [info] JOIN: success
ok
remote_not_enabled

This command also generates output on node1:

18:06:24.008 [info] {root,'node1@127.0.0.1'}: Leading
18:06:39.281 [info] join(Vsn): {1,64} :: 'node2@127.0.0.1' :: ['node1@127.0.0.1']

On node3 we join node1 and node2:

riak_ensemble_manager:join('node1@127.0.0.1' ,node()).
riak_ensemble_manager:join('node2@127.0.0.1' ,node()).

Output on node 3:

18:07:36.078 [info] JOIN: success
ok

Output on node 1:

18:07:36.069 [info] join(Vsn): {1,291} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1']
18:07:36.074 [info] join(Vsn): {1,292} :: 'node3@127.0.0.1' :: ['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

Run this on all nodes:

riak_ensemble_manager:check_quorum(Ensemble, Timeout).
riak_ensemble_peer:stable_views(Ensemble, Timeout).
riak_ensemble_manager:cluster().

Output:

true
{ok,true}
['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

Everything seems to be ok, we have a cluster!

Now we can write something; let's set key "k1" to value "v1" on all nodes using Paxos for consensus.

On node1 run:

V1 = <<"v1">>.
riak_ensemble_client:kover(node(), Ensemble, K1, V1, Timeout).

Output:

{ok,{obj,1,729,<<"k1">>,<<"v1">>}}

We can check on node2 that the value is available:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output:

{ok,{obj,1,729,<<"k1">>,<<"v1">>}}

Now we can try a different way to update a value. Let's say we want to set a new value depending on the current one, or only if the current value is something specific; for that we use kmodify, which receives a function, calls it with the current value, and sets the key to the value we return.

On node3 run:

V2 = <<"v2">>.
DefaultVal = <<"v0">>.
ModifyTimeout = 5000.

riak_ensemble_peer:kmodify(node(), Ensemble, K1,
    fun({Epoch, Seq}, CurVal) ->
        io:format("CurVal: ~p ~p ~p to ~p~n", [Epoch, Seq, CurVal, V2]),
        V2
    end,
    DefaultVal, ModifyTimeout).

Output on node 3:

{ok,{obj,1,914,<<"k1">>,<<"v2">>}}

Output on node 1:

CurVal: 1 914 <<"v1">> to <<"v2">>

The call with the function as a parameter was made on node3 but it ran on node1; that's the advantage of using the Erlang virtual machine to build distributed systems.

Now let's check if the value was set on all nodes by checking it on node2:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output:

{ok,{obj,1,914,<<"k1">>,<<"v2">>}}

Now let's quit on all nodes:

q().

Let's start the cluster again to see if riak_ensemble remembers things; in 3 different terminals run:

rebar3 shell --name node1@127.0.0.1
rebar3 shell --name node2@127.0.0.1
rebar3 shell --name node3@127.0.0.1

On every node:

Timeout = 1000.
Ensemble = root.
K1 = <<"k1">>.

application:set_env(riak_ensemble, data_root, "data/" ++ atom_to_list(node())).
application:ensure_all_started(riak_ensemble).

We set the data_root again and start riak_ensemble and its dependencies; after that, on node1 we should see:

18:11:55.286 [info] {root,'node1@127.0.0.1'}: Leading

Now let's check that the cluster was initialized correctly:

riak_ensemble_manager:check_quorum(Ensemble, Timeout).
riak_ensemble_peer:stable_views(Ensemble, Timeout).
riak_ensemble_manager:cluster().

Output:

true
{ok,true}
['node1@127.0.0.1','node2@127.0.0.1','node3@127.0.0.1']

You can now check on any node you want if the key is still set:

riak_ensemble_client:kget(node(), Ensemble, K1, Timeout).

Output should be:

{ok,{obj,2,275,<<"k1">>,<<"v2">>}}

Check the generated files under the data folder:

$ tree data

data
├── node1@127.0.0.1
│   └── ensembles
│       ├── 1394851733385875569783788015140658786474476408261_kv
│       ├── ensemble_facts
│       └── ensemble_facts.backup
├── node2@127.0.0.1
│   └── ensembles
│       ├── ensemble_facts
│       └── ensemble_facts.backup
└── node3@127.0.0.1
    └── ensembles
        ├── ensemble_facts
        └── ensemble_facts.backup

6 directories, 7 files

To sum up, we created a project, added riak_ensemble as a dependency, started a 3 node cluster, joined all the nodes, wrote a key with a value, checked that it was available on all nodes, updated the value with a "compare and swap" operation, stopped the cluster, started it again and checked that the cluster was restarted as it was and the value was still there.

Papers of the LargeSpanOfTime I

Welp, some day the experiment had to end: I stopped reading 5 papers a week because some books arrived and I read those instead, and also because I was busy at work.

But that doesn't mean I didn't read papers at all, so here's a list of the ones I did read.

Note

Since I read some of them a while ago, the reviews may not be very detailed.

Cuneiform: A Functional Language for Large Scale Scientific Data Analysis

Seems useful in practice; I was expecting something else from the title.

The Stratosphere platform for big data analytics

I remember reading a paper from what later became Apache Flink that I liked a lot; I was looking for that one and found this one instead (Stratosphere became Flink). It was an interesting overview; I would like to know how much of it is still in Flink.

Orleans: Distributed Virtual Actors for Programmability and Scalability

Really good paper, I like how it's written and the idea and implementation.

HyParView: a membership protocol for reliable gossip-based broadcast

Epidemic Broadcast Trees

These two are reviewed together because they are like bread and butter; I love both of them, highly recommended.

Large-Scale Peer-to-Peer Autonomic Monitoring

I won't lie to you, I don't remember much about this one, but given the authors it must be good :)

Stream Processing with a Spreadsheet

Object Spreadsheets: A New Computational Model for End-User Development of Data-Centric Web Applications

I was looking for ideas and inspiration when I read these two; I liked both, with Object Spreadsheets being the more interesting approach.

A Layered Grammar of Graphics

Great paper, on my top list, maybe because I love the topic :)

Virtual Time and Global States of Distributed Systems

A must read if you are interested in vector clocks. The non-math parts are good; I don't enjoy reading theorems a lot (not the paper's fault).

Papers this looong week: 10

Papers so far: 43

Papers in queue: don't want to count anymore

Improving Official Erlang Documentation

Many times I've heard people complaining about different aspects of the official Erlang documentation. One thing I find interesting is that the content of the Erlang documentation is really complete and detailed, so I decided to dedicate some time to the other aspects. To get familiar with it I decided to start with an "easy" one: its presentation.

So I downloaded erlang/otp:

git clone https://github.com/erlang/otp.git

And did a build:

# to avoid having dates formatted in your local format
export LC_ALL="en_US.utf-8"
cd otp
./otp_build setup
make docs

Then I installed the output in another folder to see the result:

mkdir ../erl-docs
make release_docs RELEASE_ROOT=../erl-docs

And served them to be able to navigate them:

cd ../erl-docs
python3 -m http.server

If you want to give it a try, you need to install the following dependencies on Debian-based systems:

sudo apt install build-essential fop xsltproc autoconf libncurses5-dev

With the docs available I started looking around. The main files to modify are:

lib/erl_docgen/priv/css/otp_doc.css
    The stylesheet for the docs
lib/erl_docgen/priv/xsl/db_html.xsl
    An XSLT file to transform the XML docs into HTML

The problem I found at first was that to see the results of my changes to db_html.xsl I had to clean and build from scratch, which involved recompiling Erlang itself and took a lot of time.

Later I found a way to only build the docs again by forcing a rebuild:

make -B docs

But this still involves building the PDF files, which is the part that takes the most time. I haven't found a target that only builds the HTML files; if you know how, or want to try adding one to the makefiles, that would be great.

With this knowledge I started improving the docs. I will cover the main things I changed.

You can see all my changes in the improve-docs-style branch.

Small styling changes

  • Don't use full black and white
  • Set font to sans-serif
  • Use mono as code font
  • Improve link colors
  • Improve title and description markup on landing page
  • Update menu icons (the folder and document icons)
  • Improve panel and horizontal separator styles
  • Align left panel's links to the left

Improve code box color, border and spacing

/galleries/misc/otp-old-2.png

Old Code Examples

/galleries/misc/otp-new-2.png

New Code Examples

Improve warning and info boxes' color, border and spacing

/galleries/misc/otp-old-3.png

Old Warning Dialog

/galleries/misc/otp-new-3.png

New Warning Dialog

/galleries/misc/otp-old-4.png

Old Info Dialog

/galleries/misc/otp-new-4.png

New Info Dialog

Logo Improvements

  • Remove drop shadows from logo
  • Center Erlang logo on left panel
  • Make the Erlang logo a link to the docs' main page
  • Put section description after logo and before links in left panel

/galleries/misc/otp-old-1.png

Old Landing Page

/galleries/misc/otp-new-1.png

New Landing Page

Semantic Improvements

  • Use title tags for titles
  • Remove usage of <br/> and empty <p></p> to add vertical spacing
  • Use lists for link lists
  • Title case section titles instead of uppercase
  • Add semantic markup and classes to section titles and bodies
  • Add classes to all generated markup
    • For the ones where I couldn't figure out a semantic class I added a generic one, to help people spot them in the XSL document by inspecting the generated files
  • Clickable titles for standard sections, with anchors for better linking

Improve table styling

/galleries/misc/otp-old-5.png

Old Tables

/galleries/misc/otp-new-5.png

New Tables

Improve applications page

/galleries/misc/otp-old-7.png

Old Applications List

/galleries/misc/otp-new-7.png

New Applications List

Improve modules page

/galleries/misc/otp-old-8.png

Old Modules List

/galleries/misc/otp-new-8.png

New Modules List

Add "progressive enhanced" syntax highlighting

At the bottom of the page a JavaScript file is loaded. If it loads successfully it will load the syntax highlighter module and CSS and then style all the code blocks on the page; if it fails to load, is blocked, or JS is disabled, the code blocks keep a default styling provided by CSS.

The markup was not modified in any way to add this feature.

Make code tokens easier to differentiate from standard text

The previous style for inline code was a really light italic font. I changed it to monospace, but it was still hard to distinguish, so I took some inspiration from Slack and surrounded inline code words with a light box to make them stand out.

Indent Exports and Data Types' section bodies

/galleries/misc/otp-old-6.png

Old Data Types and Exports Sections

/galleries/misc/otp-new-6.png

New Data Types and Exports Sections

This is all for now. I have some other ideas for future improvements, but they involve changes to the documentation itself, so I will submit them separately.

If you have any feedback please let me know!

Software que no falla (Software That Doesn't Fail)

I'm reproducing here a post I made on Facebook after seeing the following transcript:

/galleries/misc/software-no-falla.jpg

Someone tell Mr. Tonelli that on the very day he was saying that, the European Space Agency lost contact with a probe it sent to Mars, one it had spent the last 7 years developing; the project cost 870 million euros and has the highest quality control standards of any industry.

One day after that, for more than two hours services like Twitter, Netflix, GitHub and PayPal were out of service because someone hacked webcams and other "smart" devices and used them to run a denial-of-service attack against a service that translates what you type into your browser's address bar into addresses that computers can understand.

Whoever says that software is not going to fail is irresponsible and should not have any responsibility legislating over even a single line of code.

Then I started adding the following comments:

1) More news of the day: a bug was found today in the operating system that the electronic voting machines are going to use which allows anyone to gain full control over the system. I know you won't read it, but here it is:

“Most serious” Linux privilege-escalation bug ever is under active exploit

2) Today it was reported that a company that issues SSL certificates (the thing that puts the little green padlock next to your bank's address and makes the connection secure, and which is also used to transmit the voting machines' results to the central server) was letting people obtain certificates for domains that did not belong to the people requesting them.

Incident Report - OCR

3) Some "fun" ones from history: Stanislav Yevgrafovich Petrov (Станислав Евграфович Петров in Russian, born September 9, 1939) is a retired lieutenant colonel of the Soviet army during the Cold War. He is remembered for correctly identifying a missile attack warning as a false alarm in 1983, thereby avoiding what could have escalated into a nuclear war between the Soviet Union and the United States.

4) One from 1998: the Mars Climate Orbiter was destroyed due to a navigation error: the control team on Earth used imperial units to calculate the insertion parameters and sent the data to the spacecraft, which did its calculations in the metric system. So every engine firing changed the probe's velocity in a way that wasn't foreseen, and after months of flight the error had accumulated.

5) In 2003, 50 million people were left without electricity in the United States and Canada because of a software bug: https://en.wikipedia.org/wiki/Northeast_blackout_of_2003

6) The Therac-25 was a radiation therapy machine produced by AECL, successor to the Therac-6 and Therac-20 models (the earlier units were produced in partnership with CGR). The machine was involved in at least six accidents between 1985 and 1987 in which several patients received radiation overdoses. Three of the patients died as a direct consequence. These accidents cast doubt on the reliability of software control of safety-critical systems and became a case study in medical informatics and software engineering.

7) In 1996 a rocket (Ariane 5) that cost 7 billion dollars to develop and carried a payload valued at 500 million dollars exploded because a number that was "too small" was used to hold the horizontal velocity.

8) Knight Capital lost 440 million dollars in 45 minutes and went bankrupt because of a software bug that sold shares at the wrong price.

9) In 2004 the Los Angeles air traffic control system stopped working because it used a counter that was "too small"; the fun part is that the backup system stopped working minutes after being turned on.

10) In 1979 a nuclear plant in the United States "suffered a partial meltdown of the reactor core". The cause: "The valve was supposed to close when the pressure dropped, but due to a failure it did not. The signals reaching the operator did not indicate that the valve was still open, although they should have shown it."

https://es.wikipedia.org/wiki/Accidente_de_Three_Mile_Island

11) Other times the causes are political: "...failures in communication... resulted in a decision to launch 51-L based on incomplete and sometimes misleading information, a conflict between engineering data and management judgments, and a NASA management structure that permitted internal flight safety problems to bypass key Shuttle managers."

https://es.wikipedia.org/wiki/Siniestro_del_transbordador_espacial_Challenger