0029: san francisco, seattle, tigerbeetle, studying, links

Published 2022-11-04

I'll be in San Francisco Nov 6-11, and in Seattle for Handmade Nov 15-19.

tigerbeetle

In October I joined the database team at TigerBeetle.

Like most of my major life decisions, this one was made somewhat impulsively. There wasn't one big reason, rather many small nudges that all happened to peak at the same time.

I'll be working for TigerBeetle 3 out of every 4 weeks, with the other week reserved for research, tinkering and writing here. I think the direction of my own work next year will be different - I'll let go of the work on programming languages/environments which consumed most of this year to focus on more near-future work on databases and query engines.

I'm going to return my Emergent Ventures grant. That money will have more leverage elsewhere.

I'll leave my github sponsors open but I'll take down the banners here, and I think it's perfectly reasonable for y'all to cancel your sponsorships.

This log will keep going too, because I like reading other people's monthly notes. It's a version of social media that respects attention and focus.

The last two years were worthwhile. I will definitely try more independent efforts in the future. But I'll be better prepared next time, and I won't go it alone.

notes to self

I've been doing a lot of code review lately. It's difficult to really pay attention when reading code vs writing it. A trick I've been trying lately is to imagine that someone told me they already found a subtle bug in the PR I'm reviewing, and they bet I won't find it.

Similarly, for my own PRs I've started reviewing the diff myself before submitting it, with the goal of predicting all the questions that the reviewer will ask and answering them in advance.

TigerBeetle has a lot of asserts (eg). I'm finding they serve many different purposes.

I'm learning to think about disks in a similar way to distributed systems. Many reads and writes can be in flight at one time. They can be reordered before being applied to the disk. In the event of crash, they might not be applied at all. Driver bugs and hardware faults can result in reads and writes being corrupted, applied to the wrong location or dropped entirely. Resilient storage requires treating data with the same level of suspicion as reads/writes received over the network.

You can read cpu perf counters before/after a benchmark (eg) rather than trying to run enough iterations to amortize the noise of process setup/teardown. (This is sort of obvious, because the perf command is itself just a program, but it had never occurred to me to find out how to do this).

Zig's build tool has an undocumented function addOption, which provides an easy way to pass options at the command line or in the build file as comptime constants to the program being built.
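A minimal sketch of how that might look, assuming the Zig 0.10-era std.build API (the option name `loop_count` and the executable are made up for illustration):

```zig
// build.zig (sketch, Zig 0.10-era API)
const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    const exe = b.addExecutable("bench", "src/main.zig");

    // `zig build -Dloop_count=5000` overrides the default.
    const loop_count = b.option(u64, "loop_count", "benchmark iterations") orelse 1000;

    // Expose the value to the program as a comptime-known constant.
    const options = b.addOptions();
    options.addOption(u64, "loop_count", loop_count);
    exe.addOptions("build_options", options);

    exe.install();
}
```

The program can then read it with `const loop_count = @import("build_options").loop_count;`.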

Rather than writing slice[a * b .. (a + 1) * b], write slice[a * b ..][0..b]. It's clearer, and corresponds more directly to the underlying operation (slice.ptr += a * b and slice.len = b).
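A quick test sketching the equivalence (the values are hypothetical, just to illustrate):

```zig
const std = @import("std");

test "chunked slicing" {
    const slice = [_]u8{ 10, 11, 12, 13, 14, 15 };
    const a = 1; // chunk index
    const b = 2; // chunk size
    // Both expressions select the same length-b chunk starting at a * b.
    try std.testing.expectEqualSlices(u8, slice[a * b .. (a + 1) * b], slice[a * b ..][0..b]);
}
```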

When porting c build systems to zig, pick out the underlying commands with strace -fq -s 100 -e trace=execve make 2>&1 | grep $(which gcc).

studying

I'm slowly working through Understanding Software Dynamics.

Here is some cute zig code for the first chapter, featuring variables whose type depends on the op (total, incr) and compile-time manipulation of the asm strings (** unroll_count).

pub fn run(comptime op: anytype) void {
    var total = switch (op) {
        .add => @as(u32, 3),
        .fmul => @as(f32, 3),
        ...
        else => unreachable,
    };

    const incr = switch (op) {
        .add => @as(u32, 3),
        .fmul => @as(f32, 1.01),
        ...
        else => unreachable,
    };

    const start = util.rdtscp();

    var loop_index: usize = 0;
    while (loop_index < loop_count_max) : (loop_index += 1) {
        switch (op) {
            .add => asm volatile (
                \\add %%ebx, %%eax;
                ** unroll_count
                : [total] "={eax}" (total),
                : [total] "{eax}" (total),
                  [incr] "{ebx}" (incr),
            ),
            .fmul => asm volatile (
                \\fmul %%st(1), %%st(0);
                ** unroll_count
                : [total] "={st(0)}" (total),
                : [total] "{st(0)}" (total),
                  [incr] "{st(1)}" (incr),
            ),
            ...
            else => unreachable,
        }
    }

    const elapsed = util.rdtscp() - start;

    std.debug.print("{:>4} \t {d:>6.0} cycles \t {d:>6.2} cycles/iteration \t total={}\n", .{
        op,
        elapsed,
        @intToFloat(f64, elapsed) / loop_count_max / unroll_count,
        total,
    });
}
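`util.rdtscp` above is assumed to be a thin wrapper over the rdtscp instruction; a minimal x86_64 sketch (untested, Zig 0.10-era inline asm syntax) might look like:

```zig
pub fn rdtscp() u64 {
    var lo: u32 = undefined;
    var hi: u32 = undefined;
    // rdtscp returns the timestamp counter in edx:eax and clobbers ecx.
    asm volatile ("rdtscp"
        : [lo] "={eax}" (lo),
          [hi] "={edx}" (hi)
        :
        : "ecx"
    );
    return (@as(u64, hi) << 32) | lo;
}
```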

Along the way I learned how to print perf counters that aren't built in to the perf binary. Intel has a full list of counters here. For Alder Lake, the additional info for eg ASSISTS.FP shows EventSel=C1H UMask=02H. That becomes perf stat -e 'cpu_core/event=0xc1,umask=0x02,name=assists.fp/'.

Judging from my progress so far I'll be working on this book for a while, but next on the list is likely Algorithms for Modern Hardware followed by Operating Systems: Three Easy Pieces.

I skimmed this year's P99, HPTS and Strange Loop talks too. Not much stands out in my memory. The ScyllaDB IO scheduler seems interesting but is beyond me at the moment. Marc Brooker's talk on the need for tools to understand system dynamics resonated, but doesn't itself contain any answers.

The case for free online books. Among other arguments, notes the ability to cite a book with a link that other people can trivially check. This is huge - fact-checking citations used to require manual work (and often a university library subscription) but if all books are online then following citations is trivial.

BUGGIFY - notes on how FoundationDB helps the fuzzer into weird edge cases.

Everything about this automata toy is wonderful.

Zig 0.10.0 shipped. The big news is the self-hosted compiler, but some smaller features are also of note.

inline switch is nice by itself, but I'm excited to see if inline parameters will make it into the language too. Together they'd make it easy to write a single function in a vectorized interpreter and get both specialized versions for common arguments and a generic fallback for the rest.
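A sketch of the pattern (the `Op` enum and interpreter are hypothetical; `inline else` is the new 0.10 feature):

```zig
const Op = enum { add, mul, sub };

// Generic implementation; `op` is comptime-known here, so each call site
// below gets its own specialized instantiation.
fn evalOp(comptime op: Op, a: i64, b: i64) i64 {
    return switch (op) {
        .add => a + b,
        .mul => a * b,
        .sub => a - b,
    };
}

fn eval(op: Op, a: i64, b: i64) i64 {
    return switch (op) {
        // `inline else` stamps out one branch per enum tag, each with a
        // comptime-known `op`, while still dispatching on `op` at runtime.
        inline else => |comptime_op| evalOp(comptime_op, a, b),
    };
}
```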