0009: 2021 Q1 roundup, updates to internal consistency, garden of forking paths, push vs pull, beca, cambria

Hello to everyone who joined recently!

I made a short summary of what I've done so far in the first quarter of 2021 and some ideas for what I'll do next (blog version, twitter version). If any of them are exciting, let me know.

In my internal consistency post I claimed that the Flink DataStream API can't express the example. This is inaccurate. The built-in operators don't include unwindowed joins or retraction-aware aggregates, but you can write your own operators and Vasia Kalavri demonstrated an internally consistent version of the example. I updated the post accordingly.

I'm fascinated by methodology and the reproducibility crisis (eg see my notes on Everything is fucked). A while back Gelman and Loken coined the term the garden of forking paths to describe the numerous choices a researcher can make while analyzing data that might push the apparently-objective results one way or the other.

Just recently I ran across this preprint that actually studies this in practice. They gave the same question and dataset to 73 teams of social scientists and got back analyses that ranged from strongly negative to strongly positive. Hardly any of the variation was explained by the researchers' backgrounds or pre-existing opinions on the question. Instead:

...the massive variation in reported results originated from unique or idiosyncratic decisions in the data analysis process, leaving 95.2% of the total variance in results unexplained.

They also identified 107 forks in the path - more forks than teams.

(They also acknowledge in the summary that the only rational reaction to this paper is to have many different teams meta-analyze their results.)

This reminds me again of Uncontrolled, a fantastic book arguing that the soft sciences, and especially social science and economics, are just not tractable to standard scientific methods which expect to be able to find compact and broadly generalizable theories. Instead, we should invest in continuous A/B testing on a massive scale.

Justin Jaffray wrote about push vs pull models in query engines. In typical fashion, the explanations given in the academic literature don't really make sense or are missing pieces and Justin has to do some archeology to figure out what the actual tradeoffs are.

It seems to me that there should be many more than two strategies available. There is an underlying graph of data dependencies between the rows that are produced at each node in the graph, and the problem is to schedule the exploration of the graph in a way that minimizes unnecessary visits (eg producing rows that get discarded by a limit downstream), repeated visits (eg recalculating a row that was discarded) and total memory usage (the number of rows that we have cached/buffered at any one time).

This has been explored a little and even in this limited setting it seems that the optimal scheduling is often neither purely push nor purely pull.
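
To make the tradeoff concrete, here is a minimal sketch of the same scan -> filter -> limit pipeline written both ways. Everything in it (the function names, the counter, the toy pipeline) is made up for illustration and not taken from Justin's post or any real engine. In the pull version each operator asks its input for the next row, so the limit simply stops asking. In this naive push version the scan drives the pipeline and has no way to learn that the limit is already satisfied, so it produces rows that get discarded downstream.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

Row = Any
Pred = Callable[[Row], bool]


# --- Pull: each operator is a generator that asks its input for rows. ---

def pull_scan(rows: Iterable[Row], counter: List[int]) -> Iterator[Row]:
    for row in rows:
        counter[0] += 1  # count how many rows the scan actually produced
        yield row


def pull_filter(source: Iterator[Row], pred: Pred) -> Iterator[Row]:
    for row in source:
        if pred(row):
            yield row


def pull_limit(source: Iterator[Row], n: int) -> Iterator[Row]:
    if n <= 0:
        return
    taken = 0
    for row in source:
        yield row
        taken += 1
        if taken >= n:
            return  # stop pulling, so upstream does no more work


def run_pull(rows: Iterable[Row], pred: Pred, n: int) -> Tuple[List[Row], int]:
    counter = [0]
    out = list(pull_limit(pull_filter(pull_scan(rows, counter), pred), n))
    return out, counter[0]


# --- Push: each operator is a callback that hands rows to its consumer. ---

def push_limit(n: int, sink: Callable[[Row], None]) -> Callable[[Row], None]:
    taken = 0

    def on_row(row: Row) -> None:
        nonlocal taken
        if taken < n:
            taken += 1
            sink(row)
        # Rows past the limit are silently dropped. Nothing tells upstream.

    return on_row


def push_filter(pred: Pred, sink: Callable[[Row], None]) -> Callable[[Row], None]:
    def on_row(row: Row) -> None:
        if pred(row):
            sink(row)

    return on_row


def run_push(rows: Iterable[Row], pred: Pred, n: int) -> Tuple[List[Row], int]:
    out: List[Row] = []
    pipeline = push_filter(pred, push_limit(n, out.append))
    scanned = 0
    for row in rows:  # the scan drives the whole pipeline
        scanned += 1
        pipeline(row)
    return out, scanned


if __name__ == "__main__":
    is_even = lambda r: r % 2 == 0
    print(run_pull(range(10), is_even, 2))
    print(run_push(range(10), is_even, 2))
```

Both versions return the same rows, but the second element of each result (rows scanned) differs. Fixing the push version means adding some kind of stop signal flowing back upstream, at which point it is no longer purely push.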

Martin Kleppmann and Heidi Howard published a neat paper on byzantine eventual consistency in p2p databases. It characterizes exactly which invariants can be maintained on a p2p database under attack by arbitrary numbers of malicious nodes and gives an algorithm that works for all possible cases.

It seems very similar to the algorithm used by matrix. It would be interesting to analyze the invariants that should be upheld by matrix room properties (eg can't moderate after having your moderator privileges removed) to see if they are I-confluent...

Another PaPoC paper is on Cambria. I couldn't find a preprint but the web version from last year is much nicer anyway.

If you make local-first software, how do you cope with having different versions of the app with different schemas communicate with each other? Or different apps that operate on the same document in almost-compatible ways? The authors explore using bidirectional lenses to allow partial interop between different versions. One of the more mind-bending parts of the design is that lenses get written into the document itself, so that older versions of an app can automatically read newer versions of the document.
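
For anyone who hasn't met lenses before, here is a rough sketch of the idea. This is not Cambria's actual API: the `Lens` class, `rename`, `add_field` and the todo-item schema are all hypothetical names I picked for illustration. A lens is a pair of functions that translate a document forward to the new schema and backward to the old one, and lenses compose into a migration path.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Doc = Dict[str, Any]


@dataclass(frozen=True)
class Lens:
    """A pair of translations between two schema versions (hypothetical)."""
    forward: Callable[[Doc], Doc]
    backward: Callable[[Doc], Doc]

    def reverse(self) -> "Lens":
        return Lens(self.backward, self.forward)

    def then(self, other: "Lens") -> "Lens":
        # Forward applies self then other; backward undoes them in reverse.
        return Lens(
            forward=lambda doc: other.forward(self.forward(doc)),
            backward=lambda doc: self.backward(other.backward(doc)),
        )


def rename(old: str, new: str) -> Lens:
    def move(src: str, dst: str) -> Callable[[Doc], Doc]:
        def apply(doc: Doc) -> Doc:
            out = dict(doc)  # never mutate the caller's document
            if src in out:
                out[dst] = out.pop(src)
            return out
        return apply

    return Lens(forward=move(old, new), backward=move(new, old))


def add_field(name: str, default: Any) -> Lens:
    def forward(doc: Doc) -> Doc:
        out = dict(doc)
        out.setdefault(name, default)
        return out

    def backward(doc: Doc) -> Doc:
        out = dict(doc)
        out.pop(name, None)  # lossy: the old schema has nowhere to keep it
        return out

    return Lens(forward, backward)


# Assumed example schemas: v1 items have "name", v2 items have "title" and "done".
v1_to_v2 = rename("name", "title").then(add_field("done", False))

if __name__ == "__main__":
    old_doc = {"name": "buy milk"}
    new_doc = v1_to_v2.forward(old_doc)
    print(new_doc)
    print(v1_to_v2.backward(new_doc))
```

Note that the backward direction of `add_field` throws information away, which is one small instance of why the interop is only partial.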

Most of the rest of the post is about how hard the problem is and all the open problems that they discovered. The scale of the problem seems really daunting and yet I can't pin down what the essential difficulty is.

I just finished reading The scout mindset by Julia Galef. If you've spent time hanging around the rationality community then nothing in it will be surprising, but if you haven't then it's an excellent introduction.

I haven't decided exactly what I'm going to do next, but my wrist is feeling much better and I haven't written any real code for months now...