Today we're sharing an update on Dialog, a far edge database for local-first applications and autonomous computing agents.
What is Dialog?
Back in April of this year we gave you a first look at Dialog and shared the details in this blog post. As we set out to build a decentralized edge database, our team had a few important questions we needed to resolve, such as how to reconcile changes made by different editors and how to retrieve data for offline access. Here's how we've answered these questions.
Server Shortcomings
In order to reconcile changes made during concurrent writing we have to determine what our source of truth will be. Centralized systems use servers to determine this, but there are shortcomings.
First, servers must remain online for reconciliation to occur. If a server goes offline, reconciliation can't happen. Think about the times you've tried to get some work done, only for everything to grind to a halt because the server went down. If the work you're doing is critical, say in the field of healthcare, this is unacceptable.
Another (more philosophical) concern with using servers as a source of truth is that a single reality does not necessarily equal truth. In the real world, there can be many truths! We as humans may not always agree on a source of truth because we each have our own subjective realities.
Concurrency Control
Furthermore, concurrent writes in a distributed system are challenging. In the paper Concurrency Control in Distributed Database Systems by Bernstein, P.A. and Goodman, N., the authors lay out the problem:
The concurrency control problem is exacerbated in a distributed DBMS (DDBMS) because (1) users may access data stored in many different computers in a distributed system, and (2) a concurrency control mechanism at one computer cannot instantaneously know about interactions at other computers.
But instead of turning to two-phase locking or timestamp ordering, we are using CRDTs (conflict-free replicated data types). The structure of CRDTs and the math underlying them allow us to guarantee eventual consistency.
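To make the convergence idea concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only set, written in Python. It is an illustration only and not Dialog's actual data structure: the `GSet` class, the tuple-shaped facts, and the replica names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GSet:
    """Grow-only set: a state-based CRDT whose merge is set union."""
    items: frozenset = field(default_factory=frozenset)

    def add(self, item):
        # Adding returns a new replica state; nothing is ever removed.
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so replicas
        # converge no matter what order they exchange state in.
        return GSet(self.items | other.items)


# Two hypothetical replicas edit concurrently while disconnected.
alice = GSet().add(("doc1", "title", "Hello"))
bob = GSet().add(("doc1", "author", "Bob"))

# Merging in either order yields the same state.
merged_ab = alice.merge(bob)
merged_ba = bob.merge(alice)
```

Because the merge is just a union, neither replica needs a server to decide whose write wins. Both end up holding the same two facts once they have seen each other's state.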
Offline Access
Many parts of the world do not have access to reliable Internet. People are also on the go and do much of their computing through their phones. This means connection to the Internet often gets interrupted as people are in transit. So how can we ensure that we can continue working while offline in a decentralized database that doesn't use a server to store data?
Enter IPFS. IPFS allows us to store data on decentralized nodes and link to them via content addressing. Our CRDT database structure can efficiently use these hash-linked files, allowing offline access to occur. (When we integrate Dialog with Webnative File System, users will be able to utilize WNFS for this and other purposes!)
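As a rough illustration of content addressing, the sketch below stores small JSON nodes under the hash of their bytes and links one node to another by that hash. It is a toy under stated assumptions: real IPFS uses CIDs with their own encoding rather than bare SHA-256 hex digests, and the `put` and `get` helpers and the in-memory `local_store` dictionary are hypothetical names invented for this example.

```python
import hashlib
import json


def put(store, node):
    """Store a node under the SHA-256 hash of its canonical JSON encoding."""
    data = json.dumps(node, sort_keys=True).encode("utf-8")
    address = hashlib.sha256(data).hexdigest()
    store[address] = data
    return address


def get(store, address):
    """Fetch a node and verify that its bytes still match its address."""
    data = store[address]
    if hashlib.sha256(data).hexdigest() != address:
        raise ValueError("content does not match its address")
    return json.loads(data)


local_store = {}  # stands in for a local cache of hash-linked blocks

# The root node links to the leaf by the leaf's hash, not by a location.
leaf = put(local_store, {"value": "hello"})
root = put(local_store, {"children": [leaf]})

# Following the link works against whichever store holds the bytes.
fetched = get(local_store, get(local_store, root)["children"][0])
```

Since an address is derived from the content itself, any copy of the bytes can be verified locally, which is what makes a cached copy usable while offline.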
Dialog Updates
One change since the last time we wrote about Dialog is that we are no longer using the Delete/Rederive algorithm for enabling incremental view maintenance. Instead we are using techniques inspired by differential dataflow.
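One idea from differential dataflow is to represent changes as weighted diffs and push only those diffs through the operators that define a view. The sketch below shows that idea for a single map operator. It is a simplified illustration, not Dialog's implementation, and the `apply_diffs` and `map_diffs` helpers and the sample facts are assumptions made for the example.

```python
from collections import Counter


def apply_diffs(collection, diffs):
    """Apply (record, weight) diffs; weight +1 inserts, -1 retracts."""
    for record, weight in diffs:
        collection[record] += weight
        if collection[record] == 0:
            del collection[record]


def map_diffs(diffs, fn):
    """Push input diffs through a map operator, preserving weights."""
    return [(fn(record), weight) for record, weight in diffs]


# Base facts are (person, city) pairs; the view is the multiset of cities.
facts = Counter()
cities = Counter()

batch1 = [(("ann", "Oslo"), +1), (("bo", "Oslo"), +1), (("cy", "Lima"), +1)]
apply_diffs(facts, batch1)
apply_diffs(cities, map_diffs(batch1, lambda r: r[1]))

# Retracting one fact sends only its delta to the view; nothing is recomputed.
batch2 = [(("bo", "Oslo"), -1)]
apply_diffs(facts, batch2)
apply_diffs(cities, map_diffs(batch2, lambda r: r[1]))
```

After the second batch the view holds one entry each for Oslo and Lima, and it got there by processing a single retraction rather than rescanning the base facts.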
We've also made progress on encryption and key management, and we have research code that has been used to prototype outstanding research questions and do rough performance testing.
Next Steps
We are currently in the specification process, laying out the requirements for integrating the prototype with IPFS and the Webnative SDK. We will also be starting a Rust implementation soon.
Stay up to date on Dialog news and learn how you can co-build with us by registering for our Fission Reactor community calls. Fission Reactor is what we call our applied research pod, and every month we share updates on Dialog's progress. We hope you can join us!
Bonus Treat: Our Applied Researcher Quinn Wilton recently gave a presentation on Dialog at HPTS. The event wasn't recorded, but she has kindly shared her slides for you to view here: Postmodern Systems and Datalog