0012: dida wasm api + indexes + reduce, food and carbon emissions, async rust, handmade seattle, ideas matter, tools for thought and dida animations, redpanda wasm, live 2021, opportunity costs of twitter, work vs jobs, sourcehut simplicity, writing tools faster, ec2 trends, the state of academia

New stuff:

I wrote a detailed breakdown of things that are wrong with sql and how to do better in a new query language.
Dida updates:
- Added a js<->wasm api (example here, sugar upcoming so it looks more like this). From the js side the wasm api looks exactly the same as the node api, so you can share code between them.
- Indexes are now LSM trees instead of, uh, unsorted arrays. Joins are no longer silly.
- Distinct got a faster implementation.
- Added a reduce operator.
- Fixed a bug in pointstamp order by adding a new constraint to graph layout.
- Lots more unit tests.
Some minor bugfixes for focus.
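
To illustrate why sorted indexes matter for joins, here is a minimal sketch. It is not dida's actual code: plain sorted Python lists stand in for any sorted index (an LSM tree adds batching and merging on top), and the function names `nested_loop_join` and `merge_join` are made up for this example.

```python
def nested_loop_join(left, right):
    """Join two unsorted lists of (key, value) pairs by comparing every pair."""
    out = []
    comparisons = 0
    for lk, lv in left:
        for rk, rv in right:
            comparisons += 1
            if lk == rk:
                out.append((lk, lv, rv))
    return out, comparisons


def merge_join(left, right):
    """Join two lists of (key, value) pairs that are already sorted by key."""
    out = []
    steps = 0
    i = j = 0
    while i < len(left) and j < len(right):
        steps += 1
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side and emit their cross product.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out, steps


left = [(k, "l") for k in range(0, 1000, 2)]
right = [(k, "r") for k in range(0, 1000, 3)]
slow_out, comparisons = nested_loop_join(left, right)
fast_out, steps = merge_join(left, right)
print(comparisons, steps)
```

With unsorted arrays every pair of rows has to be compared, so the work grows with the product of the input sizes. With sorted inputs a single pass over both sides is enough, plus whatever it costs to emit the matches.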

Comparing food emissions to total carbon budget by Our World In Data

One-quarter to one-third of global greenhouse gas emissions come from our food systems

Emissions from food alone could use up all of our budget for 1.5C or 2C

They estimate that if everyone switched to the Lancet low-meat diet it would reduce carbon emissions from food by around half (meaning it would reduce total carbon emissions by 1/8 to 1/6, since food accounts for one-quarter to one-third of the total).

Many people are not aware that the impact of a US-style high-meat diet is so large.

Six ways to make async rust easier

It's far too late to actually change async rust, but this is nice as a retrospective on things that have not worked out as nicely as hoped in rust. And it's also compatible with decisions that have been made so far for async zig.

Handmade Seattle is coming up on Nov 11-12 and ticket sales are open.

Some highlights from previous years' recorded talks:

Dion systems is building a language and structured-ish editor that are designed together, allowing the editor to be richer but also allowing the language to be simpler by foisting features onto the editor.
Whitebox systems is making an instarepl-like tool for c. They've gone way further than anything else I've seen, with lots of original UX improvements.

Ideas not mattering is a psyop

In the same way most people think they understand how bikes work but cannot come close to drawing a working bike, many of us think we could have generated a seemingly obvious idea when really we would have come up with a version lacking key components that make the actual idea work. Note that it took decades between the introduction of the first bicycles for their designs to stop being utterly ridiculous and to start being actually convenient to use.

A variety of older essays on tools for thought:

If you talk through a problem with someone who is very experienced in a particular field, it's clear that they don't just have more knowledge about the field but also have better ways of structuring the problem and testing solutions in their head. This can include visual/spatial representations, useful techniques and knowledge of their limitations, lists of common mistakes, questions to ask about any potential solution etc.

I think we're really bad at eliciting and teaching these things. If you look at the typical class or textbook, almost all of it is dedicated to the knowledge that was produced by these mental tools instead of trying to share the tools themselves. The closest I've seen to directly sharing tools is How To Solve It which lists and demonstrates various heuristics that are useful for writing proofs.

The reason I want to add these interactive animations to the dida docs is to see if I can directly convey the pictures I see in my head that help me think about problems in incremental maintenance. If it's effective I can imagine wanting to try to do the same for other subjects that I feel like I have a better-than-average understanding of.

Redpanda added support for wasm plugins so you can send code to data instead of data to code.

I think we're going to see a lot more of this. It allows almost totally decoupling storage systems from query execution. Eg you could ship a SQL->wasm compiler as a totally separate library but still run it on redpanda without having to ship all the data back and forth between storage nodes and compute nodes. Ksqldb ships the entire dataset for many individual plan operators! Redpanda could potentially wipe the floor with them if they want to go that route.

So far redpanda doesn't offer an incremental/streaming layer on top. But dida does compile to wasm...
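
The difference between shipping data to code and shipping code to data can be sketched with a toy storage node. This is purely illustrative and assumes nothing about redpanda's real plugin API: `StorageNode`, `scan` and `run` are hypothetical names, and a Python function stands in for a wasm module.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    records: list
    shipped: int = 0  # number of records sent over the pretend network

    def scan(self):
        """Data to code: ship every record to the caller."""
        self.shipped += len(self.records)
        return list(self.records)

    def run(self, plugin):
        """Code to data: run the caller's code locally and ship only its output."""
        result = [out for rec in self.records for out in plugin(rec)]
        self.shipped += len(result)
        return result


def keep_multiples_of_1000(rec):
    # A plugin returns zero or more outputs per record, so it can filter or map.
    return [rec] if rec % 1000 == 0 else []


pull_node = StorageNode(records=list(range(10_000)))
pulled = [rec for rec in pull_node.scan() if rec % 1000 == 0]

push_node = StorageNode(records=list(range(10_000)))
pushed = push_node.run(keep_multiples_of_1000)

print(pull_node.shipped, push_node.shipped)
```

Both approaches compute the same answer, but the second only moves the matching records off the storage node.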

LIVE 2021 is looking for submissions. The deadline is Aug 6.

Programming is cognitively demanding, and way too difficult. LIVE is a workshop exploring new user interfaces that improve the immediacy, usability, and learnability of programming. Whereas PL research traditionally focuses on programs, LIVE focuses more on the activity of programming.

On the opportunity cost of tv.

It takes about 1000 hours to render yourself competent in a given field. Competence corresponds to the 'DIY level' which is a little below the level where other people would be willing to pay you. Given that the average time spent watching TV every week is about 22 hours, this means it takes about a year to replace a skill you were previously paying for.

The same goes for twitter, hacker news etc. The opportunity cost is huge.

Eg I can skim 5-10 papers per hour. A big conference publishes 50-100 papers per year. So if I wanted to skim every paper in CHI, SIGMOD, CIDR, PODS, VLDB, ICDE, POPL, SPLASH, PLDI and SOSP it would take me about 2 hours a week. What would 2 hours per week of twitter get me?
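
The back-of-envelope numbers above can be checked with a few lines. This is just the arithmetic from the text written out. The only added assumption is 52 weeks per year.

```python
conferences = ["CHI", "SIGMOD", "CIDR", "PODS", "VLDB",
               "ICDE", "POPL", "SPLASH", "PLDI", "SOSP"]
weeks_per_year = 52  # assumption: skimming is spread evenly over the year

# Best case: few papers per conference, fast skimming.
hours_low = len(conferences) * 50 / 10 / weeks_per_year
# Worst case: many papers per conference, slow skimming.
hours_high = len(conferences) * 100 / 5 / weeks_per_year

# The TV estimate: 1000 hours to competence at 22 hours per week.
tv_weeks = 1000 / 22

print(round(hours_low, 2), round(hours_high, 2), round(tv_weeks, 1))
```

The range works out to roughly 1 to 4 hours per week, so "about 2 hours" sits in the middle of it, and the TV figure comes to a bit under a year.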

How work became a job

...we have outsourced not just the manufacture of goods but every function of human life, from the telling of stories and the transmission of culture, to the rearing of our children and the care for our elderly, up to and including the very sense of undertaking a meaningful task.

You may 'own' a tractor, a smartphone, or an automobile, but as far as the ability to make upgrades or repairs is concerned, it is still owned by the manufacturer. The very institution of private property has been bifurcated: large corporations can be said to 'own' property, but for the individual and for the family what is referred to as ownership increasingly amounts to a protracted form of rent.

Prioritizing simplicity

Microsoft bought GitHub for 7 billion dollars. GitLab is valued at 3 billion dollars. Together, the two companies have over 2,000 employees. SourceHut made 4 thousand dollars in Q1 with two employees and an intern. How do we deliver on this level of reliability and performance compared to these giants? The answer is a fundamental difference in our approach and engineering ethos.

I hesitate to link to Drew DeVault's writing, which tends to lean towards trollish, but I do like aspects of the engineering ethos behind sourcehut.

I've learned a lot over the years from blog posts and talks from the team behind Bitsquid and now Our Machinery. This recent talk on Writing tools faster is no exception. It's a great example of how a small team can benefit from unifying and simplifying their entire stack rather than uncritically adopting the complexity inherent in whatever the tool-du-jour is.

It doesn't matter if you're doing things "the right way" if you never have time to do them.

Make sure you can move quickly and tackle problems in order of importance.

AWS EC2 hardware trends

Costs for compute, ram, storage and network bandwidth are pretty much constant for any given instance type over the last ~5 years. But occasionally a new instance type is introduced that gets significantly better value for money.

Some hero has made a list of psychology experiments that were widely publicised in books and talks but later could not be replicated.

Various links on the state of academia:

A little more space to play

I believe that science, at its most creative, is more akin to a hunter-gatherer society than it is to a highly regimented industrial activity, more like a play group than a corporation.

In hunter-gatherer societies, as I have learned from my good friend the psychologist Peter Gray, there are no chiefs. Play is the principal form of training, a lack of adult responsibility continues often into the child's twenties and children are not forced into molds by persistent, intrusive attention of their elders.

Playfulness is associated with learning to be an adult. In the end, hunter-gatherer children voluntarily focus on what they each do best: fishing, hunting, dancing, peacemaking, art, music or as repositories of information on healing, navigating, building.

What does this example have to do with science? Peter Medawar, who won the Nobel Prize for his work on the immunology of transplantation, captured the diversity of talents that scientists exhibit and should exploit.

"Scientists are people of very dissimilar temperaments doing different things in very different ways. Among scientists are collectors, classifiers and compulsive tidiers-up; many are detectives by temperament and many are explorers; some are artists and others artisans. There are poet-scientists and philosopher-scientists and even a few mystics. What sort of mind or temperament can all these people be supposed to have in common?" he wrote.

Instead, our courses, our hiring, our promotions, the criteria for papers being published in Nature or Science, our NIH grants, our PhD mentoring, our fellowships, even our awards and honors, are built on adherence to a constrained model.

We do not generally value the idiosyncratic, though we occasionally note it after extraordinary success. We consider the successful PhD or postdoc as someone who finishes quickly and uses a set of conventional assays on well-worked problems. In my 50 years of science, the scientific community itself has become more intolerant, more conservative.

Yet, as I look back at the best scientists from my own lab or in the departments or Universities I have been in, or the great innovators in the scientific fields I know, the element that characterizes these people is playfulness, independence and nonconformity.

New Science

New Science aims to build new institutions of basic science, starting with the life sciences.

Over the next several decades, New Science will create a network of new scientific institutes pursuing basic research while not being dependent on universities, the NIH, and the rest of traditional academia and, importantly, not being dominated culturally by academia.

Our goal is not to replace universities, but to develop complementary institutions and to provide the much needed “competitive pressure” on the existing ones and to prevent their further ossification. New Science will do to science what Silicon Valley did to entrepreneurship.

Golden eggs and better telescopes

The Max Planck Society gives the equivalent of one to a few million dollars a year to allow individuals to run research groups. These groups continue until the director retires. These groups are not founded upon detailed research proposals in which all of the problems and solutions have been mapped out. Rather they are founded on demonstrated ability and a vision for a future state of human knowledge. Once a topic is mainstream, the Max Planck Society loses interest. And it is happy to take risks. The Max Planck Society is venture capital for basic research.

The above is by the author of Statistical Rethinking, one of my favourite textbooks, and the first that made me feel like I actually understood how to do statistics at all. Their recent essays on causal modelling are maybe a hint as to what they think the golden telescope will be.

Distill is going on hiatus

Over the past five years, Distill has supported authors in publishing artifacts that push beyond the traditional expectations of scientific papers. From Gabriel Goh's interactive exposition of momentum, to an ongoing collaboration exploring self-organizing systems, to a community discussion of a highly debated paper, Distill has been a venue for authors to experiment in scientific communication.

But over this time, the editorial team has become less certain whether it makes sense to run Distill as a journal, rather than encourage authors to self-publish. Running Distill as a journal creates a great deal of structural friction, making it hard for us to focus on the aspects of scientific publishing we're most excited about. Distill is volunteer run and these frictions have caused our team to struggle with burnout.

Nonreplicable publications are cited more than replicable ones

We use publicly available data to show that published papers in top psychology, economics, and general interest journals that fail to replicate are cited more than those that replicate. This difference in citation does not change after the publication of the failure to replicate. Only 12% of postreplication citations of nonreplicable findings acknowledge the replication failure. Existing evidence also shows that experts predict well which papers will be replicated. Given this prediction, why are nonreplicable papers accepted for publication in the first place? A possible answer is that the review team faces a trade-off. When the results are more 'interesting', they apply lower standards regarding their reproducibility.

The Science Reform Brain Drain

The combination of hostility that reformers face from inside academia, the scarcity of jobs, and the intense expectations of productivity required for tenure mean that key members of the reform movement are beginning to leave the field of scientific psychology altogether.

I would not be surprised if, in a decade, a more reliable and stringent vetting process for scientific research was established outside the university system, and much of the work from 'big name' professors at major universities did not meet these standards.

Finally, it suggests that life is bigger than science, and reformers need to remember that we don't just need to consider whether academia wants us, but whether we want academia.

Against public engagement

...evaluating researchers more strongly based on what laypeople think about them and less strongly based on what their peers think about them can quickly turn into a quality problem for science: It creates an incentive for attention-grabbing claims in favour of rigorously tested ones.

Of course we want science to impact all of these areas. But again, it should be science as a system that has this impact. If we try to syphon knowledge off of its lower-level constituents instead, we effectively throw out the magical cumulative and self-correcting properties that made us so fond of science in the first place.

The generalizability crisis.

Suppose I hypothesize that high social status makes people behave dishonestly. If I claim that I can test this hypothesis by randomly assigning people to either read a book or watch television for 10 minutes, and then measuring their performance on a speeded dishwashing task, nobody is going to take me very seriously. It doesn't even matter how the results of my experiment turn out: there is no arrangement of numbers in a table, no p-value I could compute from my data, that could possibly turn my chosen experimental manipulation into a sensible proxy for social status. And the same goes for the rather questionable use of speeded dishwashing performance as a proxy for dishonesty.

This reminds me of the 'will it replicate' test that we ran in a cogsci tutorial a few years ago. Take 10 papers, of which 5 were successfully replicated and 5 failed, and ask the students to guess which is which. It's depressingly easy. One of the really successful heuristics was 'does the conclusion bear any resemblance to the experiment'. In theory this shouldn't affect replication at all - regardless of what was generalized from the experiment, the experiment itself should replicate - but it seems very highly correlated. Maybe sloppy thinking and sloppy experiments go hand in hand? Or maybe this detects hype-driven researchers?

But the same logic also applies to a large number of other factors that we do not routinely model as random effects - stimuli, experimenters, research sites, and so on. Indeed, as Brunswik long ago observed, "...proper sampling of situations and problems may in the end be more important than proper sampling of subjects, considering the fact that individuals are probably on the whole much more alike than are situations among one another." As we shall see, extending the random-effects treatment to other factors besides subjects has momentous implications for the interpretation of a vast array of published findings in psychology.

Let us assume for the sake of argument that there is a genuine and robust causal relationship between the manipulation and outcome employed in the Alogna et al study. I submit that there would still be essentially no support for the authors' assertion that they found a 'robust' verbal overshadowing effect, because the experimental design and statistical model used in the study simply cannot support such a generalization. The strict conclusion we are entitled to draw, given the limitations of the experimental design inherited from Schooler and Engstler-Schooler (1990), is that there is at least one particular video containing one particular face that, when followed by one particular lineup of faces, is more difficult for participants to identify if they previously verbally described the appearance of the target face than if they were asked to name countries and capitals. This narrow conclusion does not preclude the possibility that the observed effect is specific to this one particular stimulus, and that many other potential stimuli the authors could have used would have eliminated or even reversed the observed effect.
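
To make the single-stimulus problem concrete, here is a minimal simulation sketch. It is not taken from the paper. It assumes a design with one stimulus per condition, no true effect of the condition itself, and made-up values for the stimulus and noise standard deviations. The function name `false_positive_rate` and the 1.96 cutoff (a normal approximation to a two-sided 5% test) are also choices made for this example.

```python
import random
import statistics


def false_positive_rate(stimulus_sd, n_subjects=50, n_experiments=2000, seed=0):
    """Fraction of null experiments where a naive two-sample test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Each condition uses a single stimulus with its own idiosyncratic effect.
        stim_a = rng.gauss(0, stimulus_sd)
        stim_b = rng.gauss(0, stimulus_sd)
        a = [stim_a + rng.gauss(0, 1) for _ in range(n_subjects)]
        b = [stim_b + rng.gauss(0, 1) for _ in range(n_subjects)]
        # Welch-style t statistic that treats subjects as the only random factor.
        se = (statistics.variance(a) / n_subjects
              + statistics.variance(b) / n_subjects) ** 0.5
        t = (statistics.mean(a) - statistics.mean(b)) / se
        if abs(t) > 1.96:
            rejections += 1
    return rejections / n_experiments


rate_identical_stimuli = false_positive_rate(stimulus_sd=0.0)
rate_varying_stimuli = false_positive_rate(stimulus_sd=0.5)
print(rate_identical_stimuli, rate_varying_stimuli)
```

When stimuli do not vary, the naive test rejects at close to the nominal rate. Once stimuli differ from one another, the same test frequently reports an effect of the condition that is really just a difference between the two particular stimuli that happened to be chosen.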

It is instructive - and somewhat fascinating from a sociological perspective - to observe that while no psychometrician worth their salt would ever recommend a default strategy of measuring complex psychological constructs using a single unvalidated item, the majority of psychology studies do precisely that with respect to multiple key design factors. The modal approach is to stop at a perfunctory demonstration of face validity - that is, to conclude that if a particular operationalization seems like it has something to do with the construct of interest, then it is an acceptable stand-in for that construct. Any measurement-level findings are then uncritically generalized to the construct level, leading researchers to conclude that they've learned something useful about broader phenomena like verbal overshadowing, working memory, ego depletion, etc., when in fact such sweeping generalizations typically obtain little support from the reported empirical studies.

In many research areas, if generalizability concerns were to be taken seriously, the level of effort required to obtain even minimally informative answers to seemingly interesting questions would likely so far exceed conventional standards that I suspect many academic psychologists would, if they were dispassionate about the matter, simply opt out.

This is similar in direction to the book Uncontrolled, but goes much further by actually measuring the effects of the observed per-site variation in real experiments.