This August in my Database project written in Rust

This month, besides adventures like moving into another apartment and shipping a release at my day job, I found time to work on the project, listen to Andy Pavlo’s Intro to Database Systems course, and think through ideas that had been floating in my mind. I will start with what happened to the database project from the code and development perspective and then share my ideas about the project's future.

PostgreSQL Protocol Extended Query

Steven continues to contribute to the PostgreSQL wire protocol implementation. This month he laid the foundation for the Extended Query part of the protocol. The database can now parse and save prepared statements. The two phases that remain are parameter binding and query execution. As for extracting the PG protocol into a separate crate, the work on the Extended Query and Authentication flows should be finished before we finalize the shape of the crate.
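To make the "parse and save" step concrete, here is a minimal sketch of per-connection storage for prepared statements. It is not the project's code: `Session`, `PreparedStatement`, and their fields are hypothetical names chosen for illustration. The one rule it encodes comes from the PostgreSQL protocol documentation: the unnamed statement can be overwritten by a new Parse message, while a named statement has to be closed before it can be redefined.

```rust
use std::collections::HashMap;

/// A statement remembered after a Parse message (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct PreparedStatement {
    /// SQL text with `$1`, `$2`, ... placeholders.
    sql: String,
    /// Parameter type OIDs sent by the client; 0 means "unspecified".
    param_types: Vec<u32>,
}

/// Per-connection state (hypothetical type).
#[derive(Default)]
struct Session {
    statements: HashMap<String, PreparedStatement>,
}

impl Session {
    /// Saves a statement under `name`. The unnamed statement ("") is
    /// silently replaced; redefining a named one is an error.
    fn parse(&mut self, name: &str, sql: &str, param_types: Vec<u32>) -> Result<(), String> {
        if !name.is_empty() && self.statements.contains_key(name) {
            return Err(format!("prepared statement \"{}\" already exists", name));
        }
        let statement = PreparedStatement {
            sql: sql.to_owned(),
            param_types,
        };
        self.statements.insert(name.to_owned(), statement);
        Ok(())
    }

    /// Looks a statement up, as the later Bind phase would need to.
    fn get(&self, name: &str) -> Option<&PreparedStatement> {
        self.statements.get(name)
    }
}
```

Parameter binding and execution, the two phases that remain, would start from `get` and are intentionally left out of the sketch.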

Dynamic expression evaluation

Last month, we introduced the evaluation of simple expressions: the database can add and subtract numbers or concatenate strings. This month Andrew started extending these capabilities to evaluate dynamic expressions that can contain column names. He submitted a PR for update queries that I hope we will merge soon. I think this work sets a direction toward evaluating predicates in the WHERE clause and executing more complex SELECT queries.
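As an illustration of what "dynamic" means here, the sketch below evaluates a small expression tree against a row, resolving column names at evaluation time. It is a simplified model under assumptions, not the code from the PR: the `Value`, `Expr`, and `Row` types and the `eval` function are all hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Expression tree; `Column` is what makes an expression dynamic.
#[derive(Debug, Clone)]
enum Expr {
    Const(Value),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Concat(Box<Expr>, Box<Expr>),
}

/// A row maps column names to values (a simplification for the sketch).
type Row = HashMap<String, Value>;

fn eval(expr: &Expr, row: &Row) -> Result<Value, String> {
    match expr {
        Expr::Const(value) => Ok(value.clone()),
        // Column references are resolved against the current row.
        Expr::Column(name) => row
            .get(name)
            .cloned()
            .ok_or_else(|| format!("column {} does not exist", name)),
        Expr::Add(left, right) => match (eval(left, row)?, eval(right, row)?) {
            (Value::Int(a), Value::Int(b)) => a
                .checked_add(b)
                .map(Value::Int)
                .ok_or_else(|| "integer overflow".to_owned()),
            _ => Err("+ expects two integers".to_owned()),
        },
        Expr::Sub(left, right) => match (eval(left, row)?, eval(right, row)?) {
            (Value::Int(a), Value::Int(b)) => a
                .checked_sub(b)
                .map(Value::Int)
                .ok_or_else(|| "integer overflow".to_owned()),
            _ => Err("- expects two integers".to_owned()),
        },
        Expr::Concat(left, right) => match (eval(left, row)?, eval(right, row)?) {
            (Value::Text(a), Value::Text(b)) => Ok(Value::Text(a + &b)),
            _ => Err("|| expects two strings".to_owned()),
        },
    }
}
```

With a row where `price` is 40, evaluating `price + 2` walks the tree, looks up the column, and yields `Value::Int(42)`. A predicate in a WHERE clause could be evaluated the same way once comparison operators are added.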

Definition Schema

Before this month, saving and handling metadata was messy, and it did not allow us to implement the SQL interface over it. So this month we reworked it to align with the DEFINITION SCHEMA described in the SQL standard. For now, it can save CATALOGs, SCHEMAs, and TABLEs. This restructuring will also help introduce users for authentication: it should only take adding the appropriate table to the DEFINITION SCHEMA and a couple of methods.
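A rough way to picture this layout is as a handful of flat metadata "tables", one each for catalogs, schemas, and tables, where every entry carries the names of its parents. The sketch below follows that idea loosely. It is an assumption-laden illustration rather than the project's implementation, and the struct, field, and method names are hypothetical.

```rust
use std::collections::BTreeSet;

/// In-memory stand-in for the metadata tables (hypothetical type).
#[derive(Default)]
struct DefinitionSchema {
    /// One entry per catalog.
    catalogs: BTreeSet<String>,
    /// (catalog, schema) pairs.
    schemata: BTreeSet<(String, String)>,
    /// (catalog, schema, table) triples.
    tables: BTreeSet<(String, String, String)>,
}

impl DefinitionSchema {
    fn create_catalog(&mut self, catalog: &str) -> Result<(), String> {
        if self.catalogs.insert(catalog.to_owned()) {
            Ok(())
        } else {
            Err(format!("catalog {} already exists", catalog))
        }
    }

    /// A schema can only be created inside an existing catalog.
    fn create_schema(&mut self, catalog: &str, schema: &str) -> Result<(), String> {
        if !self.catalogs.contains(catalog) {
            return Err(format!("catalog {} does not exist", catalog));
        }
        if self.schemata.insert((catalog.to_owned(), schema.to_owned())) {
            Ok(())
        } else {
            Err(format!("schema {}.{} already exists", catalog, schema))
        }
    }

    /// A table can only be created inside an existing schema.
    fn create_table(&mut self, catalog: &str, schema: &str, table: &str) -> Result<(), String> {
        let parent = (catalog.to_owned(), schema.to_owned());
        if !self.schemata.contains(&parent) {
            return Err(format!("schema {}.{} does not exist", catalog, schema));
        }
        if self.tables.insert((parent.0, parent.1, table.to_owned())) {
            Ok(())
        } else {
            Err(format!("table {}.{}.{} already exists", catalog, schema, table))
        }
    }

    fn table_exists(&self, catalog: &str, schema: &str, table: &str) -> bool {
        self.tables
            .contains(&(catalog.to_owned(), schema.to_owned(), table.to_owned()))
    }
}
```

In this picture, supporting users for authentication would amount to one more set of entries plus a couple of methods, which matches the intent described above.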

No more Frontend and Backend storage

Working through the materials of the Intro to Database Systems course, I started to realize that the thing I called FrontendStorage actually belongs to the SQL engine. It knew how to write to, read from, and delete from the underlying sled key-value storage. It also knew how to store metadata about existing schemas and tables and how to map sled’s Database and Tree structures to schemas and tables. It was reworked into CatalogManager, which manages how data is loaded and stored after DDL and DML queries are executed. CatalogManager relies on the DataDefinition structure, which manages everything that happens to the DEFINITION SCHEMA. These structures will help us work toward making the database transactional: they are going to provide a bridge between in-memory datasets that transactions can manipulate and data that is stored on disk.
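The write, read, and delete responsibilities can be sketched as follows. To keep the example self-contained it uses a standard-library `BTreeMap` as a stand-in for a sled Tree, so none of sled's API appears here. The method names and signatures are assumptions for illustration and do not reflect the real CatalogManager.

```rust
use std::collections::{BTreeMap, HashMap};

type Key = Vec<u8>;
type Bytes = Vec<u8>;

/// Simplified, in-memory model of the manager (hypothetical API).
#[derive(Default)]
struct CatalogManager {
    /// (schema, table) -> ordered key-value pairs; a stand-in for a sled Tree.
    tables: HashMap<(String, String), BTreeMap<Key, Bytes>>,
}

impl CatalogManager {
    /// DDL: registers an empty table if it is not there yet.
    fn create_table(&mut self, schema: &str, table: &str) {
        self.tables
            .entry((schema.to_owned(), table.to_owned()))
            .or_default();
    }

    fn tree_mut(&mut self, schema: &str, table: &str) -> Result<&mut BTreeMap<Key, Bytes>, String> {
        self.tables
            .get_mut(&(schema.to_owned(), table.to_owned()))
            .ok_or_else(|| format!("table {}.{} does not exist", schema, table))
    }

    /// DML: inserts or overwrites rows, returns how many were passed in.
    fn write(&mut self, schema: &str, table: &str, rows: Vec<(Key, Bytes)>) -> Result<usize, String> {
        let tree = self.tree_mut(schema, table)?;
        let written = rows.len();
        for (key, value) in rows {
            tree.insert(key, value);
        }
        Ok(written)
    }

    /// Returns all rows of a table in key order.
    fn read(&self, schema: &str, table: &str) -> Result<Vec<(Key, Bytes)>, String> {
        let tree = self
            .tables
            .get(&(schema.to_owned(), table.to_owned()))
            .ok_or_else(|| format!("table {}.{} does not exist", schema, table))?;
        Ok(tree.iter().map(|(k, v)| (k.clone(), v.clone())).collect())
    }

    /// Removes the given keys, returns how many actually existed.
    fn delete(&mut self, schema: &str, table: &str, keys: &[Key]) -> Result<usize, String> {
        let tree = self.tree_mut(schema, table)?;
        Ok(keys.iter().filter(|key| tree.remove(*key).is_some()).count())
    }
}
```

Keeping these operations behind one structure is what would later allow a transaction to work on an in-memory dataset and hand the result over for storage on disk.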

Persistence

Finally, the last improvement to the storage system is that the database no longer uses a temp folder to store data. The CatalogManager and DataDefinition structs were developed with the possibility of saving data on disk. By default, the database starts in in-memory mode, which simplifies SQL engine testing a lot, but if you pass the PERSISTENCE env variable to the start command, the database will store data on disk.
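The startup decision can be sketched like this. Only the variable name PERSISTENCE comes from the project. What its value means is an assumption: the sketch treats it as the directory where data should be stored, and the `StorageMode` enum and both functions are hypothetical.

```rust
use std::env;
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
enum StorageMode {
    /// Default: nothing is written to disk, handy for tests.
    InMemory,
    /// Data is stored under the given directory (assumed meaning of the value).
    Persistent(PathBuf),
}

/// Pure decision logic, kept separate from the environment for easy testing.
fn storage_mode(persistence: Option<String>) -> StorageMode {
    match persistence {
        Some(path) if !path.is_empty() => StorageMode::Persistent(PathBuf::from(path)),
        _ => StorageMode::InMemory,
    }
}

/// Reads the PERSISTENCE variable; an unset or non-Unicode value means in-memory.
fn storage_mode_from_env() -> StorageMode {
    storage_mode(env::var("PERSISTENCE").ok())
}
```

Separating `storage_mode` from the environment lookup keeps the default in-memory path trivially testable, which fits the goal of simple SQL engine tests.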

More information about project development is available on GitHub.

Thoughts on the Project's Future

Since the beginning of the project, I have been thinking about how to make it bigger than just a pet project that I am building in my garage. I started to notice how much happier I am when I see the number of stars, forks, and watchers increasing, and how differently I look at the project when people contribute code to it. This month was full of thoughts on how I can establish a commercial open source company. Working through that, I found out that I am missing a team, a prototype, and a product/company name. In the next few months, I am going to work on filling in the missing parts. I will start by gathering a team and establishing the goal and mission of my future company. In other words, I will spend some time on how to transform a garage idea into a scalable, high-performance NewSQL database written in Rust.

Thanks for reading.