This article is: created at don't-know-when; last modified at 2023-09-15; referenced as ia.www.b18

Using Janet as Database

You must clone this repo and read it together with this article. Otherwise you won’t understand this article.

Two branches of the repo are of importance.

  • branch sqlite: the version that uses SQLite3 for persistence
  • branch janet: the version that uses Janet for persistence

A few days ago, I was testing out writing an HTTP application in Zig with htmx. (Unrelated: htmx is very good.)

When I added persistence to the project, I first chose SQLite. It worked. However, the SQLite-interfacing code felt too “loose and moving” within the project as a whole. It felt flaky.

In pursuit of mathematical soundness, I swapped SQLite out for Janet. The result is pleasing to me, so I wrote about it here.

Here’s how I used Janet, the programming language, as the database of my toy application.

Query and Data Model

When using Janet as a database, the programming language itself is the query language.

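Since the application is a simple counter, I used (def stuff @{:counter 0}). Note the @: it creates a mutable table, whereas a plain {:counter 0} is an immutable struct that cannot be updated in place.

To get the data, I used (stuff :counter). To set the data, I used (set (stuff :counter) 1).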

If the data model of your application is more complex, then I highly recommend reading the official Janet documentation. I also wrote another article to describe how to mix Zig data and Janet data.

Persistence

To save data, I simply used the POSIX file system.

First, I used the Zig API janet.marshal to serialize stuff to a string. The string is then saved in a file with the following steps.

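P.S. On Linux, fdatasync skips metadata that is not needed to read the data back (such as timestamps), but a change to the file size does require a metadata flush (see man 2 fdatasync).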
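The exact steps used in the repo are not reproduced here. As a rough illustration only, one common crash-safe pattern on POSIX is "write to a temporary file, fsync it, rename it over the target, then fsync the directory". The sketch below shows that pattern in Python rather than Zig; the function names save_atomically and load are hypothetical, and this is an assumption about a reasonable approach, not a description of what the repo does.

```python
import os
import tempfile


def save_atomically(path: str, payload: bytes) -> None:
    """Write payload so that readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename
    # never crosses a file-system boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file contents to stable storage
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so that the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def load(path: str) -> bytes:
    """Read the whole file back as bytes."""
    with open(path, "rb") as f:
        return f.read()
```

In the article's setting, payload would be the string produced by janet.marshal, and the bytes returned by load would be handed to janet.unmarshal.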

To read the data, I did the opposite with janet.unmarshal.

The most important difference between Janet and an SQL database is that you can serialize cyclic data and functions! By using direct references to objects, we avoid SQL JOIN hell. In a sense, Janet is a network database.

In the Janet REPL, the database can be loaded from disk with (unmarshal (slurp "database")). This is useful for debugging, or for playing around with the data.
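For readers more at home in Python, the cyclic-data part of this claim can be illustrated with the standard pickle module. This is a loose analogy, not the Janet API: the record names are made up, and pickle stores functions by reference to an importable name rather than by value, so it does not cover serializing functions.

```python
import pickle

# Two records that point at each other, plus a self-reference.
alice = {"name": "alice"}
bob = {"name": "bob", "friend": alice}
alice["friend"] = bob
alice["self"] = alice

# Serialize the whole object graph to bytes, then load it back.
blob = pickle.dumps({"alice": alice, "bob": bob})
restored = pickle.loads(blob)

# The cycles survive: following references leads back to the same object,
# with no join on an ID column needed to find a record's friend.
r_alice = restored["alice"]
assert r_alice["friend"]["friend"] is r_alice
assert r_alice["self"] is r_alice
assert restored["bob"] is r_alice["friend"]
```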

Why use SQLite then

Persistence performance. I haven’t benchmarked this yet, but I assume that SQLite is faster, as it won’t write the whole database to file every time it needs to persist stuff.

Query performance. I don’t believe this is a problem. Many people write network-facing applications in Python. Janet is roughly as fast as CPython, so it should be ok. If I hit a performance bottleneck here, I would rather use BQN or Polars than SQL, or I can rewrite that part in Zig.

Type checking. SQL has built-in type checking. Janet can do this too with built-in functions, and its data model is not restricted to relational algebra.

What can be improved

To save disk space, a key-value database can be used instead of storing multiple files smaller than 4 KB each, since every small file typically still occupies a whole file-system block. Judging from the source code of Python’s shelve module, shelve uses the dbm module for backing storage, which selects a key-value database backend such as gdbm.

To store cyclic data without pain, we need GC. The same holds for a database, for example a graph database that automatically deletes one or both nodes of an edge when the edge is deleted. With some effort, maybe I can hack Janet and dbm together, so that they share the same GC.
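As a small sketch of the shelve approach mentioned above, the counter from this article could be kept in a single key-value file like this. The helper name bump and the key "counter" are assumptions for illustration; which dbm backend gets used depends on how the Python installation was built.

```python
import os
import shelve
import tempfile


def bump(db_path: str, key: str = "counter") -> int:
    """Increment a persistent counter stored in a shelve database."""
    # shelve.open creates the database if it does not exist yet;
    # leaving the with-block closes it and writes changes to disk.
    with shelve.open(db_path) as db:
        db[key] = db.get(key, 0) + 1
        return db[key]
```

Calling bump twice on the same path returns 1 and then 2, because the value is read back from disk on the second call.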

With Zig’s explicit allocator design and CTFE, it is possible to “relocate” nested structures into the same place by replacing pointers with relative offsets. It works like a moving GC, but without a GC. With some effort, maybe I can add cyclic support to s2s. The downside is, of course, the potential footgun of moving memory around.
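The relative-offset idea can be sketched without Zig. In the Python illustration below, byte positions in a buffer stand in for pointers, and each node stores the distance to the next node instead of its absolute position. The node layout and the function names are hypothetical. Because links are relative, the structure (cycle included) still works after a plain byte copy to a different base position.

```python
import struct

# One node: (value, offset of the next node relative to this node's position).
NODE = struct.Struct("<ii")


def build_ring(values):
    """Pack values into one buffer as a cyclic linked list of relative links."""
    buf = bytearray(NODE.size * len(values))
    for i, value in enumerate(values):
        here = i * NODE.size
        # The last node links back to the first, which makes the list cyclic.
        there = ((i + 1) % len(values)) * NODE.size
        NODE.pack_into(buf, here, value, there - here)
    return buf


def walk(buf, start, steps):
    """Follow `steps` links, beginning at byte position `start`."""
    seen, pos = [], start
    for _ in range(steps):
        value, rel_next = NODE.unpack_from(buf, pos)
        seen.append(value)
        pos += rel_next
    return seen


ring = build_ring([10, 20, 30])

# "Relocate" the structure: copy the raw bytes to a different base position.
moved_base = 64
bigger = bytearray(moved_base) + ring

# No link needs patching after the move.
assert walk(ring, 0, 4) == walk(bigger, moved_base, 4)
```

With absolute positions instead of relative ones, every link would have to be rewritten after the copy, which is the work a moving GC normally does.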

Finishing Thoughts

For the past few years, there was this popular business practice where they make a new database out of nowhere and call it something new. Here are some examples:

Since I have an affinity for data, those “products” are very confusing to me. Here’s what I observed in practice:

I don’t know where Janet fits on this performance scale, as I haven’t used it as a database before.

Again, I am in awe of how useful Janet is. It is the same feeling I got when I first learned that a Lua file can act as a configuration file — “code as data”.