0026: break, preimp essay, focus + clojure, zed experiments, decorrelation and nested relations, bunny, sqlite mode, reading, links

Published 2022-07-26


Nobody could come to our covid-era wedding, so instead we have a wave of family and friends visiting for our first anniversary. Which means that between July 16 and Aug 7 I'm barely computering at all. Progress will resume Aug 8.


I wrote a draft of an essay explaining preimp: the program is the database is the interface.

It's a long way from publishable.

At minimum I want to allow direct editing of values in derived views, which means that each value needs to track its upstream source, which means that I can't do it in clojure (because only collections can have metadata, not numbers, strings, nulls etc).
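To make the requirement concrete, here is a minimal sketch in python of values that carry their upstream source. It is not how preimp works; the `Tracked` wrapper, the `(table, row id, column)` source format and the `people` data are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Tracked:
    """A value paired with the place it was read from (hypothetical design)."""
    value: Any
    source: Optional[Tuple[str, int, str]] = None  # (table, row id, column)


def derived_view(db):
    # Build a view of (name, age) for adults, remembering where each cell came from.
    rows = []
    for row_id, row in db["people"].items():
        if row["age"] >= 18:
            rows.append((
                Tracked(row["name"], ("people", row_id, "name")),
                Tracked(row["age"], ("people", row_id, "age")),
            ))
    return rows


def edit(db, cell, new_value):
    # Write an edit made in the view back to the upstream cell.
    if cell.source is None:
        raise ValueError("value has no upstream source, so it cannot be edited")
    table, row_id, column = cell.source
    db[table][row_id][column] = new_value


db = {"people": {1: {"name": "alice", "age": 30}, 2: {"name": "bob", "age": 12}}}
view = derived_view(db)
edit(db, view[0][1], 31)  # edit alice's age through the view
```

The point is that even a bare number like `30` has to be wrapped to carry a source, which is exactly what a language without metadata on scalars makes awkward.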

I looked around to see if I could at least find a language implementation that I could easily modify but most implementations rely on nan-boxing for numbers, which makes it very difficult to modify the implementation to add extra metadata to numbers. The best candidate so far is janet - there's a compile-time option to disable nan-boxing.
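For anyone who hasn't met nan-boxing, here is a rough sketch in python of the general idea. The tag layout is invented for illustration and is not the layout used by janet or any other real implementation; it also only handles non-negative 32-bit integers.

```python
import struct

QNAN = 0x7FF8_0000_0000_0000     # exponent all ones plus the quiet bit: a quiet NaN
TAG_INT = 0x0001_0000_0000_0000  # hypothetical tag bit marking an integer payload


def box_float(x: float) -> int:
    # A float is stored as its own 64-bit pattern, so every bit is already in use.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_int(i: int) -> int:
    # Non-float values hide in the otherwise unused payload bits of a NaN.
    return QNAN | TAG_INT | (i & 0xFFFF_FFFF)


def is_float(bits: int) -> bool:
    # Anything that does not carry our tag is treated as a float.
    # (A real implementation would also canonicalize NaNs so none collide with tags.)
    return (bits & (QNAN | TAG_INT)) != (QNAN | TAG_INT)


def unbox(bits: int):
    if is_float(bits):
        return struct.unpack("<d", struct.pack("<Q", bits))[0]
    return bits & 0xFFFF_FFFF
```

Every value is exactly one 64-bit word and a float occupies all of it, so there is nowhere to put a pointer to metadata without changing the representation of every value in the runtime.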

I also want to use the destructuring forms and type annotations from the argument lists in functions to generate better function UIs. Eg for (fn [{^String foo :foo ^Bool bar :bar}] ...) I would render a form with a textbox input labeled 'foo' and a checkbox input labeled 'bar'. I can get this information in clojure but it's not compatible with the incremental evaluation I use for preimp. And it doesn't seem like janet has any way of adding type tags or other metadata to destructuring forms.
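As a rough analogue of the idea in python (not clojure, and not how preimp does anything), a function's parameter names and type annotations can be read back and mapped to widgets. The `WIDGETS` table and `form_fields` helper are hypothetical names.

```python
import inspect

# Hypothetical mapping from annotation to widget type.
WIDGETS = {str: "textbox", bool: "checkbox", int: "number", float: "number"}


def form_fields(fn):
    """Return (label, widget) pairs for each parameter of fn."""
    fields = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unknown types fall back to a free-text input.
        widget = WIDGETS.get(param.annotation, "textbox")
        fields.append((name, widget))
    return fields


def example(foo: str, bar: bool):
    ...


print(form_fields(example))  # [('foo', 'textbox'), ('bar', 'checkbox')]
```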

Finally, I would love to support live eval which is difficult in any language implementation that wasn't explicitly designed for it.

So increasingly it looks like August might be spent hacking together a simple language where I can support all these features.

Clojurescript woes continue too. In the draft essay above I show how the exported code can be run in the repl. In reality it took me over an hour to get a working clojurescript repl due to issues like not being able to start a repl in a too-long working directory.


I ported focus from sdl to mach-glfw. This fixed a few bugs - the most annoying was that if I started focus from the terminal then sdl would report a spurious enter keydown event.

I've decided not to try to build on top of mach itself until the apis settle down, but I am mirroring their event types and I pushed my additions upstream.

I added clojure syntax highlighting and structural navigation. I had to write a clojure tokenizer from scratch since all the existing parsers are either java or js.

Completions, smart indent, paren matching etc are now language-aware too, and I added a generic language mode that works ok for most c-like languages.


I mentioned zed previously. Here is a talk introducing the ideas in more depth.

I spent some time reading the spec and trying out examples.

The data language seems ideal. The storage model and type system are very much aligned with the ideas I was pursuing with imp's gradual typing - all data is self-describing but efficient binary representations still exist and schemas can be enforced using first-class types. The only major quibble I had is that all values are nullable.

The query language has some really nice features. Errors are first-class values and are propagated by most functions. Various shorthands make the language usable as a search bar while still allowing complex queries. But when I tried to express the tagging logic used in preimp every approach bumped up against one problem or another. The most troublesome is that streams can't be named - which strikes me as a repeat of the mistakes of sql before the introduction of CTEs. But the language is still very young, so there is plenty of time to improve.

decorrelation and nested relations

The trouble with the way that materialize decorrelates subqueries is that it produces a query plan which is a dag rather than a tree, and that makes other downstream optimizations harder to express.

Semantically, subqueries produce nested relations. But existing query planners require every plan to produce a single flat relation. Encoding arbitrarily nested relations into a single flat relation is very difficult, especially if you also need your encoding to be stable wrt incremental updates.

What materialize currently does is keep the outer query around, so that it can be joined against the result of the subquery to lazily reconstruct the nested relations. This is just a way to encode nested relations using multiple flat relations. But the query planner doesn't know about this encoding.
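A toy illustration of that encoding in python, with made-up `customers` and `orders` data. This is only a sketch of the general idea of representing a nested relation as two flat relations joined on the outer key, not materialize's actual plan or code.

```python
from collections import defaultdict

customers = [(1, "alice"), (2, "bob"), (3, "carol")]  # (id, name)
orders = [(10, 1, 5.0), (11, 1, 7.5), (12, 2, 1.0)]   # (order id, customer id, total)


def nested():
    # Nested form: what the correlated subquery means semantically.
    # Each outer row maps to the relation produced by its subquery.
    return {
        (cid, name): sorted(total for _, ocid, total in orders if ocid == cid)
        for cid, name in customers
    }


# Flat encoding: the outer relation plus a decorrelated subquery result,
# where each subquery row is tagged with the key of the outer row it belongs to.
outer = customers
sub_flat = [(ocid, total) for _, ocid, total in orders]


def reconstruct(outer, sub_flat):
    groups = defaultdict(list)
    for key, total in sub_flat:
        groups[key].append(total)
    # Joining against the outer relation restores rows whose subquery was empty.
    return {(cid, name): sorted(groups[cid]) for cid, name in outer}
```

Note that `carol` only appears in the reconstructed result because the outer relation was kept around; `sub_flat` alone has no trace of her.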

I think it might work better to have an IR where each plan is a tree and each operator produces a nested relation. The pipeline would then be:


I moved scattered-thoughts.net from netlify to bunny. The tooling for uploading static content is not nearly as nice but the site now loads much faster outside of the US/EU, and as a bonus I get raw (anonymized) server logs. I expect the total cost to be <$1 per month.

(Bunny vs netlify from a 3g connection in India)

sqlite mode

I complained in the shape of data that most sql databases don't print data in the same syntax that you must use to enter it.

Sqlite is an exception - with .mode quote it prints literals using the literal syntax.

sqlite> create table foo(x text, y float);
sqlite> insert into foo values ('foo''bar', 3.14);
sqlite> select * from foo;
sqlite> .mode quote
sqlite> select * from foo;
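The same round trip can be checked from python. `.mode quote` is a feature of the sqlite3 command-line shell, so this sketch uses the built-in SQL function `quote()` instead, which renders a value as a SQL literal; treating the two as equivalent for text values is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo(x text, y float)")
conn.execute("insert into foo values ('foo''bar', 3.14)")

# quote() renders the stored value using SQL literal syntax.
(literal,) = conn.execute("select quote(x) from foo").fetchone()
print(literal)

# Feed the printed literal back in as SQL text and check that it round-trips.
# (Splicing text into SQL is only safe here because it came from quote().)
(value,) = conn.execute(f"select {literal}").fetchone()
(stored,) = conn.execute("select x from foo").fetchone()
```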


OpSets: Sequential Specifications for Replicated Datatypes. A generic method for designing crdts by deterministically extending the causal order to some arbitrary total order.

The Intelligence Trap. Covering recent research in rationality (why smart people make dumb decisions). But published in 2019 and still uncritically quoting research that didn't survive the reproducibility crisis. Eg has a whole chapter reporting experiments on grit and growth mindset interventions in schools, but fails to mention that wide-scale testing of these interventions has shown near-zero effect. Read Scout Mindset instead, in which Julia Galef actually reads the underlying papers and reports the weight of evidence accurately.

Redpanda are offering free copies of WebAssembly: The Definitive Guide. It's not a particularly deep book, but a reasonable starting point if you don't know anything about wasm yet.

Stubborn Attachments. An argument for longer-term thinking, but from such a theoretical point of view that I found it hard to sustain interest. But I liked this thought experiment on how to hold political views:

That means your political views, though they are the best ones out there, will have grave negative consequences with probability .98 (one minus two percent, the latter being the chance that you are right on the details of the means-end relationships). In this setting, how confident should you really be about the details of your political beliefs? How firm should your dogmatism be about means-ends relationships? Probably not very; better to adopt a tolerant demeanor and really mean it. As a general rule, we should not pat ourselves on the back and feel that we are on the correct side of an issue.

The Practicing Mind. About being mostly process-oriented in the moment, rather than mostly goal-oriented. Nothing particularly novel or compelling compared to other similar books. I much preferred The Rock Warrior's Way, despite its goofiness.

Changing Minds. An older book about how teaching programming in schools is going to revolutionize education. 21 years later this has yet to materialize, but I still constantly talk to people who venerate these ideas. I'd be interested in reading an honest attempt to explain why the peak of computer-based education so far looks like the Khan Academy rather than the 'new literacy' that has been thrown around for decades.

cell is a relational language that bears an uncanny resemblance to the ideas in the essence of software. Despite the radio silence in the last few years, it appears there is still active work on the new c++ backend.

Ink+Switch published notes from their Berlin conference.

libEpollFuzzer mocks the epoll syscalls on linux to fuzz network applications.

I mentioned skiplang a while ago - an imperative language that supports incremental/reactive programming. I'd thought it was dormant, but apparently the authors were quietly working on skdb instead - a (proprietary?) sqlite-compatible database with incrementally maintained views. I haven't yet tested it for internal consistency but the documentation is promising.

Examples of information-dense responsive UIs that are trivial with imgui but feel very difficult to implement on the web:

"LLVM 11 tends to take 2x longer to compile code with optimizations, and as a result produces code that runs 10-20% faster [...] compared to LLVM 2.7 which is more than 10 years old"

"Simplicity is systemic [...] an important aspect is that of a possible local 'for a component' perfect simple solution can be 'for the system', a complex one - and vice versa."

Tyler Neely experimenting with building a database only from components that can each be written in a day.