Scattered Thoughts

After much delay, erl-telehash is getting some love with a 0.1.1 release. At this point the core features are implemented and seem to be working. I'm now working on some simple demo apps that I can use for larger-scale testing.

With my work on erl-telehash and at Smarkets I find myself fighting erlang more and more. The biggest pains are the dearth of libraries, the lack of polymorphism and being forced into a single model of concurrency.

The first is self-explanatory and pretty well-known. I frequently have to fire up a python process through a port to do something simple like send an email. Even the standard library is incomplete and inconsistent.

The second doesn't start to hurt until your codebase gets a bit bigger. For example, Smarkets makes a lot of use of fixed-precision decimal arithmetic, which leads to code like this: decimal:mult(Qty, decimal:sub(decimal:to_decimal(1), Price)). The lack of polymorphism also means that any time you want to swap a data structure for one with an equivalent interface you have to rewrite whole swaths of code.
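For comparison, here is a rough python sketch of the same calculation. The function names and sample values are made up for illustration. The first function mimics the shape of the erlang call above; the second relies on operators that dispatch on the operand types, so one expression covers several numeric types.

```python
from decimal import Decimal
from fractions import Fraction
import operator

def prefix_style(qty, price):
    # Same shape as decimal:mult(Qty, decimal:sub(decimal:to_decimal(1), Price)):
    # every step is an explicit call, and the Decimal type is baked in.
    return operator.mul(qty, operator.sub(Decimal(1), price))

def infix_style(qty, price):
    # The operators are polymorphic, so this works unchanged for
    # Decimal, Fraction or int operands.
    return qty * (1 - price)

decimal_result = infix_style(Decimal("10"), Decimal("0.25"))
fraction_result = infix_style(Fraction(10), Fraction(1, 4))
```

Switching the numeric representation only touches the call site, not the body of infix_style, which is the kind of substitution that is painful without polymorphism.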

The third point is a bit more contentious. I'm fairly convinced that the erlang philosophy of fail-early, crash-only, restartable tasks is the right solution for most problems. What bugs me is that erlang conflates addresses, queues and actors by giving each process a single mailbox. This leads to problems like requiring the recipient of a message to have a global name if it is to be independently restartable, which means you can't run more than one copy of that message topology on the same node. It also encourages processes to send messages directly to other processes which makes it difficult to create flexible, rewirable topologies or to isolate pieces of a topology for testing. I would prefer a model in which processes send and receive messages through queues which are wired together outside of the process. This would also allow restarting a process (and clearing but not deleting its queues) without giving it a global name.
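A minimal python sketch of the model I have in mind, using threads and the standard queue module as stand-ins for processes and mailboxes. All names here (worker, spawn, restart, STOP) are hypothetical, and python threads give none of erlang's crash isolation, so this only illustrates the wiring, not the fault tolerance.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit

def worker(fn, inbox, outbox):
    # A worker only knows its own two queues, not who is on the other end.
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        outbox.put(fn(msg))

def spawn(fn, inbox, outbox):
    t = threading.Thread(target=worker, args=(fn, inbox, outbox), daemon=True)
    t.start()
    return t

def restart(old, fn, inbox, outbox):
    # Stop the old worker, clear (but keep) its inbox, start a fresh one.
    inbox.put(STOP)
    old.join(timeout=5)
    while True:
        try:
            inbox.get_nowait()
        except queue.Empty:
            break
    return spawn(fn, inbox, outbox)

# The topology is wired here, outside the workers: a -> doubler -> b -> incrementer -> c
a, b, c = queue.Queue(), queue.Queue(), queue.Queue()
doubler = spawn(lambda x: x * 2, a, b)
incrementer = spawn(lambda x: x + 1, b, c)

a.put(3)
first = c.get(timeout=5)

# Restarting the doubler needs no global name: the queues are the addresses.
doubler = restart(doubler, lambda x: x * 2, a, b)
a.put(5)
second = c.get(timeout=5)
```

Because the queues are created and connected by the caller, a second copy of the same topology is just another set of queues, and a single worker can be tested by handing it two queues and nothing else.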

I'm not about to run out now and rewrite erl-telehash in another language. It's close enough to complete (for my purposes at least) that I'll just continue with the existing code. For future experiments, however, I want something better.

The top candidate at the moment is clojure. It has the potential to replace my use of both erlang and python, saving lots of cross-language pain. Agents look a lot like a (cleaner, saner) implementation of the mealy machines that I wrote at Smarkets. Lamina neatly solves the queue pains I described above. Datalog is the natural way to describe a lot of collections, including th_bucket, which in its current form is not obviously correct. The clojure community just seems to churn out well-designed libraries (lamina, aleph, slice, incanter, pallet, cascalog, storm, overtone etc).

In the short term I will get started by rewriting binmap, since it's fresh in my mind and simple enough to finish quickly. If that goes well it will eventually become an educational port of swift.

Copyright © Jamie Brandon 2011