Don't strip the meaning away from your runtime

Motivation

How many times have you committed a console.log() "by accident"?

How many times have you seen a comment saying that a function is deprecated or unused but you can't tell if it has been executed recently?

Same for a branch in an if/switch statement. Is it even possible for that condition to ever be true?

How much time have you spent setting up breakpoints and logpoints and reading a mess of prints to find the cause of a bug, only to throw away all that effort and hard-earned contextual knowledge?

How much time have you spent scanning a stack trace to discard stack frames and functions that are not relevant to the problem, and that just reflect code structure/practices/libraries/frameworks/transpilers/macros? Wouldn't you like a semantic stack trace, one for us humans, with custom metadata attached to it by us?

Wouldn't it be nice to encode assumptions about some logic and get notified if they are broken in production? For example, "this algorithm is log(N), I assume this array length is always smaller than 1000 items".

I could come up with more examples like this, but I will stop here.

Introduction

It is said that code is read more than it's written.

I would add that code is read mostly to rebuild the dynamic runtime behavior out of its static representation. Let a comic explain it for me:

A programmer slowly building an idea of a program, being interrupted and losing all their progress

We should ask

Homer Simpson asking: Can't someone else do it? — Someone or something...

Well, computers can rebuild the dynamic representation quickly and deterministically, and they don't get tired or demotivated while doing it.

The problem is that given the low level representation of our runtimes, even if we manage to get it back out, extracting the meaning out of it is a task similar in effort to manually rebuilding it in our heads from the code.

This, I think, comes from the fact that we still see code as something we write for computers to execute. Everything else gets "optimized out" of the executable or lowered to a point where all the high level meaning is lost.

After this lossy process we create tools and conventions to get some of the high level ideas back from the code and whatever the runtime can provide us.

Some examples of this are:

Tracing (eBPF, DTrace, distributed tracing etc.)
Debuggers
Monitoring
Telemetry
Crash reporting
Assertions
Logging

Developing a programming language 20 years ago involved the reference implementation, maybe a standard library and hopefully documentation.

Now a programming language project includes the language, syntax highlighting, formatter, linter, language server, test runner, package manager, official documentation, documentation generation and more. The definition of a programming language expanded to incorporate surrounding activities.

I think we should expand the definition of code in a language to include some of the tools and conventions that currently surround it.

A symptom of this lack of language evolution to express new practices is the growth of dotfiles in any project. Here are some you may see in a JavaScript/TypeScript project:

.dockerignore .editorconfig .eslintrc.js .npmrc example.env package.json pnpm-lock.yaml postcss.config.js prettierrc.js tailwind.config.ts tsconfig.json vite.config.mts .eslintrc.json next.config.js package-lock.json tsconfig.json .dockerignore .editorconfig .eslintignore .eslintrc.json .gitpod.Dockerfile .gitpod.yml .lintstagedrc.js .npmrc .prettierignore .prettierrc babel.config.js cypress.config.js docker-compose.yml jest.config.js jest.setup.js jest.transform.js knip.jsonc pnpm-workspace.yaml sample.env tsconfig-base.json tsconfig.json buildspec.yml codecov.yml .yarnrc .yarnrc.yml lerna.json lighthouserc.js

Daniel Ingalls said in Design Principles Behind Smalltalk:

An operating system is a collection of things that don't fit into a language. There shouldn't be one.

I would say:

Dotfiles are a collection of things that currently don't fit into a language. They should eventually be incorporated into them.

But talk is cheap, here's a live demo of some of these ideas.

💡 The example is interactive, read on after it for some things you may want to try.

💭 Some things you can do in the playground

Click to get a runtime trace
Edit the code and click run again
Click Restore to get the default source code back in the editor
Use the slider to see the trace across time
Slide to the marks in the slider to get to "interesting" points in the trace
Mouse over different places in the trace to highlight the relevant code in the editor
Mouse over @log() outputs to see the code that generated them in a tooltip
Check how many times a section was hit on the right side of each section
- "- hits: 1" means the section / function / branch was run once
Collapse / Expand sections by clicking on the headline

How is it possible?

Simply put, the compiler has access to the required information at compile time and keeps enough information at runtime to reconstruct all the playground features.
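A minimal sketch of that split, written by hand instead of generated: a static table of annotated sites that a compiler could emit, plus a tiny runtime hook that only records which site fired. The names `Site`, `TraceEvent`, `hit` and `hitCounts` are made up for this illustration and are not the playground's actual implementation.

```typescript
// Static metadata a compiler could emit per annotated site (hypothetical shape).
interface Site {
  id: number;
  kind: "function" | "branch" | "log";
  label: string;
  line: number; // source line of the annotated code
}

// One runtime event: which site fired, in which order, with which values.
interface TraceEvent {
  siteId: number;
  seq: number;
  values: unknown[];
}

// Known at compile time; hand-written here for illustration.
const sites: Site[] = [
  { id: 0, kind: "function", label: "checkout", line: 1 },
  { id: 1, kind: "branch", label: "cart is empty", line: 2 },
  { id: 2, kind: "log", label: "total", line: 5 },
];

const events: TraceEvent[] = [];

// The only work done at runtime: append a small event.
function hit(siteId: number, ...values: unknown[]): void {
  events.push({ siteId, seq: events.length, values });
}

// What the compiler output for an annotated function might look like.
function checkout(prices: number[]): number {
  hit(0);
  if (prices.length === 0) {
    hit(1);
    return 0;
  }
  const total = prices.reduce((sum, price) => sum + price, 0);
  hit(2, total);
  return total;
}

// Join runtime events with the static table to get a human-level view.
function hitCounts(): Map<string, number> {
  const counts = new Map<string, number>();
  for (const site of sites) counts.set(site.label, 0);
  for (const event of events) {
    const site = sites.find((s) => s.id === event.siteId);
    if (!site) continue;
    counts.set(site.label, (counts.get(site.label) ?? 0) + 1);
  }
  return counts;
}

checkout([10, 5]);
checkout([3]);
```

After these two calls, `hitCounts()` reports two hits for "checkout" and "total" and zero for "cart is empty", which is the kind of answer the "is this branch ever executed?" question asks for.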

There are examples of this effect in practice when a language runtime consolidates tasks dispersed in upper layers of the stack. In The Soul of Erlang and Elixir, Sasa Juric shared a table that compares the solutions to similar problems in two systems he was developing side by side with different programming systems:

Technical requirement	Server A	Server B
HTTP server	Nginx and Phusion Passenger	Erlang
Request processing	Ruby on Rails	Erlang
Long-running requests	Go	Erlang
Server-wide state	Redis	Erlang
Persistable data	Redis and MongoDB	Erlang
Background jobs	Cron, Bash scripts, and Ruby	Erlang
Service crash recovery	Upstart	Erlang

Rich Hickey in his talk The Language of the System talks about something similar. At 9:01 he provides different meanings for the word language:

'Tongue'- communication
- programmer → programmer
Programming language
- programmer → computer
System language
- program → program

Let me pull a Žižek on that idea and notice that the fourth term is missing. What is the name for the missing combination: computer → programmer?

At 26:00 Rich Hickey asks "Where do the semantics of the system live?" I ask a similar question: "Where does the human representation of the running system live?"

I think we need to incorporate the syntax, semantics and runtime representation of the problem/solution into our programming languages, along with the ability to get that representation back from a running program to a place where a programmer can inspect it.

What else is possible?

Watch mutations on a variable
Express properties, as in property based testing
Define Eiffel-style contracts like preconditions / postconditions / invariants
Infer types and value ranges for a variable from traces over time
Capture values to reproduce errors
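Two of the items above can be roughly approximated today with library code. This is only a sketch: `assume` and `watch` are hypothetical helpers invented for this example, not syntax from the playground. The first records a broken assumption instead of throwing, the second records every assignment to an object's properties.

```typescript
interface Violation {
  label: string;
  value: unknown;
}

const violations: Violation[] = [];

// Hypothetical helper: check an assumption and report it instead of crashing.
function assume<T>(label: string, value: T, holds: (v: T) => boolean): T {
  if (!holds(value)) violations.push({ label, value });
  return value;
}

// Hypothetical helper: record every property assignment on `target`.
function watch<T extends object>(name: string, target: T, log: string[]): T {
  return new Proxy(target, {
    set(obj, prop, value, receiver) {
      log.push(`${name}.${String(prop)} = ${JSON.stringify(value)}`);
      return Reflect.set(obj, prop, value, receiver);
    },
  });
}

// A function that encodes an assumption about its input size.
function hasDuplicates(items: number[]): boolean {
  assume("items.length < 1000", items.length, (n) => n < 1000);
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (items[i] === items[j]) return true;
    }
  }
  return false;
}

const mutations: string[] = [];
const cart = watch("cart", { total: 0 }, mutations);
cart.total = 15; // recorded as "cart.total = 15"
```

With first-class syntax the label and the source location would come from the compiler instead of being typed out by hand as strings.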

Why not something like Eiffel then?

The idea here is that the runtime behavior of the syntax can be switched depending on the environment or objective. During development it could stop and open a debugger, during testing we could get extra information in our test output, we could also save it to a local circular buffer to add context to our IDE. In production it could be sent to a collection server, have an inspector, fire alerts and be queryable after the fact. In case of an error it could attach recent information to the crash report.
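To make the switching concrete, here is a small sketch where the same annotated function sends its notes to a different sink depending on the mode. Everything here (`Mode`, `Sinks`, `makeAnnotate`, `RingBuffer`) is an assumption made for the example. A real implementation would open a debugger or talk to a collection server instead of pushing to arrays.

```typescript
type Mode = "development" | "test" | "production";

interface Note {
  label: string;
  value: unknown;
}

// Fixed-size circular buffer: keeps only the most recent `capacity` items.
class RingBuffer<T> {
  private items: T[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift();
  }

  toArray(): T[] {
    return [...this.items];
  }
}

interface Sinks {
  recent: RingBuffer<Note>; // development: recent context for the IDE
  testOutput: string[]; // test: extra lines for the test report
  outbox: Note[]; // production: queued for a collection server
}

type Annotate = <T>(label: string, value: T) => T;

// Pick the behavior once, based on the environment or objective.
function makeAnnotate(mode: Mode, sinks: Sinks): Annotate {
  return function annotate<T>(label: string, value: T): T {
    const note: Note = { label, value };
    switch (mode) {
      case "development":
        sinks.recent.push(note);
        break;
      case "test":
        sinks.testOutput.push(`${label}: ${JSON.stringify(value)}`);
        break;
      case "production":
        sinks.outbox.push(note);
        break;
    }
    return value; // annotations never change the computed value
  };
}

// The annotated code is identical in every mode.
function applyDiscount(total: number, annotate: Annotate): number {
  const percent = annotate("percent", total > 100 ? 10 : 0);
  return annotate("discounted", total - (total * percent) / 100);
}

const sinks: Sinks = { recent: new RingBuffer<Note>(1), testOutput: [], outbox: [] };
applyDiscount(200, makeAnnotate("test", sinks));
applyDiscount(200, makeAnnotate("development", sinks));
```

In test mode both notes end up in `sinks.testOutput`. In development mode the buffer of capacity 1 keeps only the latest note, which is the circular buffer behavior described above.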

Does it scale to large codebases?

In the real world ™️ all behavior would be opt-in with annotations. To avoid having to remove annotations before deploying, I think a good approach would be to store profiles in separate files, with CSS-style selectors that say which annotations are enabled for which modules and functions. Paired with annotations for feature flagging, this would let us leave most annotations in and turn them on or off at deploy time or even at runtime. An additional benefit of storing profiles in their own files is that debugging sessions can be versioned, maintained and reused over time, the same way the rest of the code is managed.
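A rough sketch of how such a profile could be evaluated. The selector grammar here, `module function annotation` with `*` as a wildcard and the last matching rule winning, is an assumption chosen to keep the example short. It is not a proposal for the real syntax, and `Rule`, `AnnotationSite` and `isEnabled` are hypothetical names.

```typescript
interface Rule {
  selector: string; // "module function annotation", "*" matches anything
  enabled: boolean;
}

interface AnnotationSite {
  module: string;
  fn: string;
  annotation: string;
}

// Match one selector part against a value.
function partMatches(pattern: string, value: string): boolean {
  return pattern === "*" || pattern === value;
}

// Missing trailing parts default to "*", so "checkout" means "checkout * *".
function matches(selector: string, site: AnnotationSite): boolean {
  const [mod = "*", fn = "*", ann = "*"] = selector.trim().split(/\s+/);
  return (
    partMatches(mod, site.module) &&
    partMatches(fn, site.fn) &&
    partMatches(ann, site.annotation)
  );
}

// Opt-in: everything is off unless a rule enables it. Later rules override
// earlier ones, loosely like the CSS cascade.
function isEnabled(profile: Rule[], site: AnnotationSite): boolean {
  let enabled = false;
  for (const rule of profile) {
    if (matches(rule.selector, site)) enabled = rule.enabled;
  }
  return enabled;
}

// A profile that could live in its own versioned file.
const debugCheckout: Rule[] = [
  { selector: "checkout", enabled: true },
  { selector: "checkout formatPrice @log", enabled: false },
];
```

With this profile, every annotation in the `checkout` module is on except `@log` inside `formatPrice`, and annotations in any other module stay off.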

An idea: Human ← Server Protocol

One of the big improvements in developer experience over the last few years was the introduction of the Language Server Protocol:

The LSP was created by Microsoft to define a common language for programming language analyzers to speak.

LSP creates the opportunity to reduce the m-times-n complexity problem of providing a high level of support for any programming language in any editor, IDE, or client endpoint to a simpler m-plus-n problem.

By defining and standardizing the meaning and serialization of human-level annotations across languages, we can build generic tools that capture and process information during the execution of our programs and feed it back to our IDEs.
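To give an idea of what "standardizing the serialization" could mean, here is a sketch of one possible message: a versioned, JSON-serializable event that any language runtime could emit and any tool could read. The field names and the `encode` / `decode` helpers are invented for this example and do not describe an existing protocol.

```typescript
// Hypothetical wire format for one annotation event.
interface AnnotationEvent {
  version: 1;
  kind: "log" | "hit" | "assumption";
  label: string;
  source: { file: string; line: number };
  timestampMs: number;
  value?: unknown;
}

// One event per line of JSON keeps the format easy to stream and store.
function encode(event: AnnotationEvent): string {
  return JSON.stringify(event);
}

// Reject anything that does not look like a version 1 event.
function decode(line: string): AnnotationEvent {
  const data = JSON.parse(line);
  const valid =
    data !== null &&
    typeof data === "object" &&
    data.version === 1 &&
    typeof data.label === "string" &&
    typeof data.source === "object" &&
    data.source !== null &&
    typeof data.source.file === "string" &&
    typeof data.source.line === "number";
  if (!valid) throw new Error("unsupported annotation event");
  return data as AnnotationEvent;
}

const sample: AnnotationEvent = {
  version: 1,
  kind: "assumption",
  label: "items.length < 1000",
  source: { file: "cart.ts", line: 12 },
  timestampMs: 0,
  value: 1200,
};

const roundTripped = decode(encode(sample));
```

The useful property is the same one LSP has: a tool that understands this shape does not need to know which language produced the event.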

Futurama professor saying something similar to the caption below — So that's what things would be like if we had this, a man can dream though

Beware the tooling tarpit in which everything is possible but nothing of interest is easy.