0028: HYTRADBOI jam, sqllogictest in a week, how safe is zig again, rr on alder lake, google maps jank, links

Published 2022-10-05

HYTRADBOI jam

The jam results are now up at hytradboi.com/jam. I haven't had a chance to read through all of them yet but it looks like there were some neat projects.

sqllogictest in a week

For my own jam project, I tried to make a sql frontend that could pass sqllogictest.

In sport and in music there's a kind of training where you try to push a skill to its maximum along some dimension to see where you break down first. It's a way of finding bottlenecks that can be improved. That's kind of what I had in mind.

Sqllogictest is one of the smaller test suites from sqlite and it doesn't have great coverage of the sql language. But it's a test suite I'm very familiar with because it was the first test suite that we tried to pass at materialize, and it also mostly tests generic sql rather than any particular dialect or interface. So it's well suited to a speed challenge.

I tried to roughly categorize the things I noted.

Mechanical.

Typing is still a bottleneck. I make a lot of errors. Correcting the errors sometimes breaks my concentration, and then it takes a while to recover what I was doing. I'm particularly bad around special characters. Some of this is caused by my new laptop having a slightly different keyboard layout. It might be worth spending a few minutes a day working on typing exercises, focused on the keys that are in different places or have different shapes on my new keyboard.

I spend a lot of time looking up definitions of functions or types. I've been avoiding adding support for zls to my editor because it doesn't handle comptime code and I assumed something better would come along eventually once the new self-hosted zig compiler was finished. But in practice most of the things I look up are static globals. So zls support is near the top of my todo list now.

My test workflow worked well. I have a process runner in my editor that auto-completes from my bash history and restarts the process on every save. I used this to run a tiny, manually edited subset of the tests so I could jump to errors easily in the editor. Then in a real terminal I ran the full tests.

Running full tests took up to 20 minutes by the end, so for the last two days I ended up having two branches of work in parallel in two different sway workspaces. I would code/debug on one while the tests ran on the other. This avoided long waits.

I worked pretty long hours with few breaks. This was probably not optimal. But I'm not sure yet where the sweet spot is for different kinds of work. A lot of this challenge was mechanical - code that I mostly just typed out more-or-less-correctly first try. But parts of parsing were tricky and could have done with a clear head and more planning time.

Strategic.

I stuck with the bnf for a long time because the other alternative was hand-writing a parser. Just transcribing the materialize parser would require typing 5 characters per second for 24 straight hours - definitely not a viable jam strategy. But this is a classic false binary. Once I realized that the bnf was also not viable I immediately had the idea to write a grammar and parser generator using the recursive descent + commit parsing style that I'm most familiar with. I could have saved ~3 days by spending a little more time on day one generating alternatives.

I invested a few hours in adding some debugging tools to the parser. This more than paid for itself - I did have a lot of parser bugs and they would have been painful to solve otherwise.

When working on the bnf, and for the first half-day working on the new parser, I tried to use a parsing technique that I wasn't very familiar with. I also didn't work through a proof that it was correct. This was also before I had invested time in debugging tools. As a result, I spent a lot of time trying to debug failures with very little visibility into the parsing process. I think I worked like this because of the time pressure, but it was false economy - if I had spent an hour or two thinking through the memoization algorithm on paper I probably would have gotten the caching key correct and saved several days of debugging.

I tweaked the test runner to print the test sql, filename and error code on every failure. I wrote the result of each run into a file and poked at it with ripgrep, cut, sort/uniq etc to prioritize which feature to work on next. The test coverage of sqllogictest is very skewed so this made a big difference to my final score.

It would have been nice to be able to store more detailed error information in a database to support more complex queries about usage. I should maybe invest some time in a very ergonomic interface to eg sqlite aimed at these kinds of ad-hoc debugging uses.

Rather than try to fix slow queries I added cpu/memory limits to the evaluator. If I hadn't done this, I don't think I would have been able to even complete a test run before the end of the week, which would have meant total failure.
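
To make that concrete, here's a minimal sketch of the kind of limit I mean - not the actual jam code, just a budget struct that every unit of evaluator work spends from, so a runaway query aborts with an error instead of stalling the whole test run:

const std = @import("std");

// A sketch of an evaluator budget: every unit of work spends from it, and
// running out aborts that query with an error instead of hanging the run.
const Budget = struct {
    steps_remaining: u64,
    bytes_remaining: u64,

    fn spendSteps(self: *Budget, steps: u64) error{BudgetExceeded}!void {
        if (steps > self.steps_remaining) return error.BudgetExceeded;
        self.steps_remaining -= steps;
    }

    fn spendBytes(self: *Budget, bytes: u64) error{BudgetExceeded}!void {
        if (bytes > self.bytes_remaining) return error.BudgetExceeded;
        self.bytes_remaining -= bytes;
    }
};

test "runaway queries abort instead of hanging" {
    var budget = Budget{ .steps_remaining = 3, .bytes_remaining = 1 << 20 };
    try budget.spendSteps(2);
    try std.testing.expectError(error.BudgetExceeded, budget.spendSteps(2));
}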

Architectural.

I really like the final parser. The code is as easy to understand as hand-written parsers that I've worked with before but the grammar is significantly denser which makes it easier to read and to modify. I didn't end up needing any custom parse actions, but they would be easy to add by switching on the comptime-known rulename in Parser.parse. I was also able to detect accidental left-recursion at compile-time whereas in hand-written parsers I would only find out when the fuzzer stack-overflows.

Early on in the parser I made tokens implicit to save having to declare them separately. This was a terrible idea. I spent much more time debugging typos than I saved by not typing out what ended up being a list of 30 words.

For the ast I tried to enable both static and dynamic typing. The main data stored in the parser is:

// AST nodes
nodes: u.ArrayList(Node),
// For each node, the location in the source code that it came from
node_ranges: u.ArrayList([2]usize),

The nodes themselves are just a giant enum with one case for each rule in the grammar:

pub const Node = union(enum) {
    select: select,
    ...
};

But the payloads in this enum are structs with static types:

pub const select = struct {
    distinct_or_all: NodeId(distinct_or_all),
    result_columns: NodeId(result_columns),
    from: NodeId(from),
    where: NodeId(where),
    group_by: NodeId(group_by),
    having: NodeId(having),
    window: NodeId(window),
    order_by: NodeId(order_by),
    limit: NodeId(limit),
};

Those NodeId types are just integer references into the nodes array, but in a statically-typed wrapper. So select.from.get(p) returns a from struct, but p.nodes.items[select.from.id] returns a Node.
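
As a self-contained sketch of the idea (a toy two-rule grammar with hand-written dispatch - the real NodeId and get are generated from the grammar, so the details differ):

const std = @import("std");

// Toy grammar: a number literal and a negation that points at another node.
const number = struct { value: i64 };
const negate = struct { operand: NodeId(number) };

pub const Node = union(enum) {
    number: number,
    negate: negate,
};

pub const Parser = struct {
    nodes: std.ArrayList(Node),
    node_ranges: std.ArrayList([2]usize),
};

pub fn NodeId(comptime T: type) type {
    return struct {
        id: usize,

        // Statically-typed access: returns the payload struct for this rule.
        // The raw integer `id` still works for generic traversals of `nodes`.
        pub fn get(self: @This(), p: *const Parser) T {
            const node = p.nodes.items[self.id];
            // In the real code this dispatch is generated from the grammar.
            // Accessing the wrong field is safety-checked at runtime.
            return switch (T) {
                number => node.number,
                negate => node.negate,
                else => @compileError("not a grammar rule"),
            };
        }
    };
}

test "typed and untyped access to the same node" {
    var p = Parser{
        .nodes = std.ArrayList(Node).init(std.testing.allocator),
        .node_ranges = std.ArrayList([2]usize).init(std.testing.allocator),
    };
    defer p.nodes.deinit();
    defer p.node_ranges.deinit();

    try p.nodes.append(.{ .number = .{ .value = 42 } });
    try p.nodes.append(.{ .negate = .{ .operand = .{ .id = 0 } } });

    const neg = p.nodes.items[1].negate;
    try std.testing.expectEqual(@as(i64, 42), neg.operand.get(&p).value);
}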

I really liked this combination. When working through a particular chunk of the tree I got static types, but I could also just use the union tags and the raw integer ids to do generic traversals (eg printing the parse tree). The usual alternative is some visitor pattern but I find that inversion of control pretty clunky.
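
Continuing the sketch above, a generic traversal doesn't need any of the static types - it can just walk the flat array and look at the union tags:

// (assumes the Parser/Node types and std import from the sketch above)
fn dumpParseTree(p: *const Parser) void {
    for (p.nodes.items) |node| {
        // @tagName works on any tagged union value, no per-rule code needed
        std.debug.print("{s}\n", .{@tagName(node)});
    }
}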

I took an experimental approach to name resolution in the planner. This is one of the tricky parts of sql because sql's scoping rules are bonkers.

In materialize we had a single pass that went from the sql ast to a high-level IR that referred to columns by position. Each function in the pass would take an ast node and some context, and return an ir node and a scope object representing what names were available in that object. There were two difficulties with this:

  1. SQL's scoping rules are incredibly entangled with its syntax, so the scope object had to become increasingly complicated to reflect all the parts of sql syntax that affect name resolution.
  2. The optimizer running on the IR wanted to move nodes around and change them, but it had to be very careful not to change the position of any columns, or add or remove any columns, lest it break the position-based references elsewhere in the IR.

For the jam I tried to separate out the layers a bit more. The rough steps were:

  1. Translate the sql ast to a high-level IR that retains the syntactic structure.
  2. Assign a unique id to each scalar or table expression that produces new columns.
  3. For each anonymous expression (eg in ON, GROUP BY, ORDER BY), move it into a map node with a unique id and replace the original with a reference to that id.
  4. For each column reference (eg foo, foo.bar, foo.*, *), find the expression that creates that column and replace the column reference with the unique id of that expression (the resulting IR is sketched below). Crucially, we still have all the important syntactic features in the IR itself so we don't have to make a separate scope object to mirror those features.
  5. At this point the IR only refers to columns by unique id. If I had an optimizer, it would run at this point and not have to worry about keeping track of columns.
  6. Finally, just before execution, make a pass through the IR to pick a layout and replace column ids with physical position.

I was in a rush so steps 1-3 were fused together, but I think it would have been better to separate them. Step 4 is here. Step 6 is here.
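
To make step 4 concrete, here's a rough sketch of the shape of the IR - my reconstruction for illustration, not the jam code - where every defining expression gets a ColumnId and column references are replaced by those ids rather than by names or positions (step 6 later lowers the ids to physical positions):

// Unique id assigned in steps 2/3 to each expression that produces a column.
const ColumnId = struct { id: usize };

const ScalarExpr = union(enum) {
    // before step 4: raw `foo` / `foo.bar` straight from the ast
    unresolved_column: struct { table_name: ?[]const u8, column_name: []const u8 },
    // after step 4: a reference to the unique id of the defining expression
    column: ColumnId,
    literal: i64,
};

const RelationExpr = union(enum) {
    get_table: struct { name: []const u8, column_ids: []const ColumnId },
    // step 3: anonymous expressions (ON, GROUP BY, ...) get moved into map
    // nodes, each output column getting its own ColumnId
    map: struct { input: *const RelationExpr, column_id: ColumnId, expr: ScalarExpr },
    filter: struct { input: *const RelationExpr, predicate: ScalarExpr },
};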

This worked well for the subset of sql that I implemented. The main missing feature is correlated subqueries / lateral joins - I'd have to implement those too to be sure that it's a good architecture.

Zig.

For context: I generally enjoy working with zig and I've found it a practical choice for the kinds of problems I often work on. Many of the nice things in the architecture section were dependent on zig's excellent metaprogramming. I'm sharing all the nits below in the hopes that it is useful feedback. Certainly not as complaints or demands.

I spend a lot of time doing basic iter/collection stuff like removing duplicates from an array. The stdlib isn't very fleshed out yet and it's not clear whether that's where collection utils will go anyway. In the meantime, it might make sense to start adding these to my util.zig that I copy-paste between projects.
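
For example, something like this dedup helper for sorted slices is the kind of thing I mean (a sketch of what might go in util.zig, not code from the jam):

const std = @import("std");

// Remove adjacent duplicates from a sorted slice in place, returning the
// deduplicated prefix.
pub fn dedupSorted(comptime T: type, items: []T) []T {
    if (items.len == 0) return items;
    var write: usize = 1;
    var read: usize = 1;
    while (read < items.len) : (read += 1) {
        if (!std.meta.eql(items[read], items[write - 1])) {
            items[write] = items[read];
            write += 1;
        }
    }
    return items[0..write];
}

test "dedupSorted" {
    var xs = [_]u32{ 1, 1, 2, 3, 3, 3 };
    try std.testing.expectEqualSlices(u32, &[_]u32{ 1, 2, 3 }, dedupSorted(u32, &xs));
}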

I had three bugs related to memory safety:

  1. In the parser I held onto a hashmap entry in the cache, invalidated it with a recursive call to parse and then wrote to the entry value_ptr (a minimal repro is sketched after this list). In the rare cases where the hashmap was resized in the recursive call, this write was lost and I was very confused about why certain deeply nested sql expressions failed to parse. This was not caught by the GPA or by any of the safety checks. I haven't thought of a runtime check I could add. I think my mitigation in future will be to only hold hashmap entries in very small blocks of code, if at all.

  2. The parser/planner/evaluator and the database used two separate arenas. Once, I forgot to clone a table name from the evaluator before inserting it into the database. The resulting UAF wasn't detected by the GPA because of the intervening ArenaAllocator. It manifested as allowing a table to be created twice which produced very confusing errors in downstream tests. If there was a cheap test of arena ownership I could have added asserts to the database methods. I think this might be possible with the existing arena allocator.

  3. When initializing an array of column references I put a break in the wrong place. Sometimes this left parts of the array uninitialized until they blew up downstream math. I fixed this by switching to a syntactic pattern that makes it impossible to leave entries uninitialized.
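
Coming back to 1., here's a minimal repro of the hazard (not the jam code) - the entry pointer returned by getOrPut is only valid until the next mutation of the map, so a recursive call that inserts into the same cache can leave it dangling:

const std = @import("std");

test "getOrPut entry pointers don't survive mutation" {
    var cache = std.AutoHashMap(u32, u32).init(std.testing.allocator);
    defer cache.deinit();

    const entry = try cache.getOrPut(0);
    entry.value_ptr.* = 0;

    // Stand-in for the recursive parse call: enough inserts to force the
    // map to grow and reallocate its backing storage.
    var i: u32 = 1;
    while (i < 100) : (i += 1) try cache.put(i, i);

    // The bug: writing through the possibly-stale pointer. Neither the GPA
    // nor any safety check flags it, the write just lands in freed memory.
    // entry.value_ptr.* = 42;
}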

In rust I would have made 3. even harder to mess up by passing a closure: alloc_array(len, |i| ...). But in zig this would be far too verbose:

alloc_array(len, table_def.columns, (struct {
    fn f(columns: []const Column, i: usize) ColumnName {
        return ...;
    }
}).f)

Many collection utils would also be much less verbose with closures. There is a proposal for second-class closures (closures which can only be passed as arguments, but not stored on the stack or heap). This would solve most of the usecases I have for closures without taking on the full complexity of closure memory management.

I had several bugs caused by forgetting to handle a field of a large struct, or by adding a field and then not handling it everywhere it was used. While zig has exhaustive compiler-checked switches for enums/unions, there is no equivalent exhaustive destructuring of structs and no ergonomic way to approximate it with metaprogramming. (I left a comment on the relevant issue in the zig repo.)
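
The closest approximation I know of is blunt enough to prove the point - a comptime field-count assert at each site that is supposed to handle every field (a sketch, assuming the select struct from earlier and std in scope):

comptime {
    // Fires at compile time if a field is added to or removed from `select`
    // without revisiting this site.
    if (std.meta.fields(select).len != 9) {
        @compileError("update this code to handle the changed select fields");
    }
}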

During the jam I didn't even bother returning diagnostics for most errors because maintaining a separate pathway from the error itself is extra work and, ironically, error-prone. I'm hopeful for error payloads, although it's not obvious how the various design issues can be neatly resolved.

Zig has defer and errdefer. I often found myself wanting noerrdefer. It's common to have some nested logic like:

{
    thing.push();
    ...lots of code...
    thing.pop();
}

(Here is a particularly nested example.)

It's easy to mess up that pairing and so I'm often tempted to turn it into:

{
    thing.push();
    defer thing.pop();
    ...lots of code...
}

But during the jam in most cases it was not correct or even possible to thing.pop() if an error was thrown - only if the block was exited successfully. I found a related issue that was closed as 'completed' but I'm not sure what the actual resolution was.
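
The workaround I'd reach for is a success flag checked in a defer, so the pop only runs when the block exits cleanly - a sketch with a toy stack, not the jam code:

const std = @import("std");

// Emulate the missing noerrdefer with a success flag: pop only runs if the
// block completes without an error.
fn pushPopScope(stack: *std.ArrayList(u32), should_fail: bool) !void {
    try stack.append(1); // thing.push()
    var succeeded = false;
    defer {
        if (succeeded) _ = stack.pop(); // thing.pop(), success path only
    }

    // ...lots of code that can fail...
    if (should_fail) return error.Oops;

    succeeded = true;
}

test "pop only happens on the success path" {
    var stack = std.ArrayList(u32).init(std.testing.allocator);
    defer stack.deinit();

    try pushPopScope(&stack, false);
    try std.testing.expectEqual(@as(usize, 0), stack.items.len);

    try std.testing.expectError(error.Oops, pushPopScope(&stack, true));
    try std.testing.expectEqual(@as(usize, 1), stack.items.len);
}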

The parser generator has several steps:

  1. grammar.txt -> rules
  2. rules -> ast
  3. rules -> parser

In theory it's possible to do all of these at comptime. Step 1 I decided would be best left until there is a ComptimeAllocator available. Step 2 should be possible using @Type but I ran into segfaults in both stage1 and stage2 (the old and new zig compilers) and couldn't find a way around them. Sadly I didn't check in the code, but I often cause crashes when I get ambitious with @Type so I can probably make an independent repro. Now that stage2 is near release this might be a good time for me to try some compiler development and track down the bug.

Overall.

I enjoyed the jam challenge a lot. I learned a lot more in a week than I usually do in a month. It was not a sustainable effort (~70 hours), so I can't work like that all the time. But I think there is probably value in regularly challenging myself in this way. Maybe every 3-4 months, to give me time to act on the lessons from each challenge before attempting another.

Something that maybe wasn't clear from the outside is that on Friday morning I was pretty sure I was going to fail miserably. I kept working at it because success at the problem itself was not the point. The point is to stretch my comfort zone, which requires spending a lot of time in the uncomfortable probably-going-to-fail zone. That's not something I really understood when I was younger.

how safe is zig again

I rewrote this article yet again. I focused it even more on actual examples, specific scenarios and lived experiences.

It's hopefully clear that there is a complex set of tradeoffs here and that as the author I'm not staking a position but instead tentatively feeling out the territory.

rr on alder lake

The rr in stable nixos doesn't work on alder lake, but 5.6.0 works.

(pkgs.rr.overrideAttrs (finalAttrs: previousAttrs: rec {
    version = "5.6.0";
    src = fetchFromGitHub {
        owner = "mozilla";
        repo = "rr";
        rev = version;
        sha256 = "H39HPkAQGubXVQV3jCpH4Pz+7Q9n03PrS70utk7Tt2k=";
    };
}))

The E cores apparently don't have perf counters, so you need to use rr record --bind-to-cpu=0.

I also learned that rr doesn't yet support.

google maps jank

On google maps you can drag intermediate points to change the route. If you share the result with someone and they open it in the browser they will see the altered route. But if they open it in the mobile app they will only see the original route.

Example

I learned this by taking someone on a significantly longer walk than they realized they were signing up for.

Val feels like an attempt to marry the best parts of swift and rust.

Constructing and Analyzing the LSM Compaction Design Space. An excellent systematic mapping of the design space and the implications of each axis.

Recent versions of gtk changed the default behaviour of the file picker to always start at 'recent files'. Here is how to change it back.

https://jasonpargin.substack.com/p/stop-telling-me-humanity-is-doomed

The fantasy, the "blue pill", is that you're one of the few enlightened souls in a world of mindless, sleepwalking drones and that your dark pessimism is proof of your superiority. The "red pill" truth is that everyone around you is just as important and that most of them are trying their best. The brutal reality no one wants to face isn't that there's some thrilling secret war being fought, but that the mindless hours Neo spent in that cubicle produced work that probably made other people's lives better.

[..] But if I'm right and the world has steadily improved over the centuries because of the strenuous efforts of billions of people, it means you are saddled with a tremendous responsibility to give back.

https://tis.so/divide-by-less

If [...] an intervention was a great help for 5% of people and irrelevant for the other 95%, your job isn't to open a Jupyter notebook and run a significance test to figure out if that intervention is 'real science'; it's to figure out what language describes the commonalities of that 5% such that anyone reading along at home will know if it's worth trying for them. Our modern conception of science often gets this exactly backwards: by 'removing outliers' and bounding findings in some scoring system, effects that are obviously ginormous when you see them in front of you get dully labeled as 'failure to replicate' because you picked a lot of people it wasn't true for also.