Litestream author here. This is on the staging site for the docs for the upcoming v0.4.0 release. I’d love to hear feedback from anyone who wants to kick the tires before the release. Thanks!
Good, clear docs, but as a SRE I want to know a bunch about how it fails.
Is there any indication on the replicas that they are read-only? Can you query the webserver for whether it's read-only or read write?
Is there and plan for failover/promotion? Do I need to do anything besides update the config file? Can I start a server in the mode "Make sure you're up-to-date with what's remote, then switch to RW mode?"
What happens if I accidentally write to a read-only replica? Will the error be detectable immediately, or will I need to do some sort of exercise to discover it? Is it even possible by SQLite semantics, or will I get locking timeouts if I try to write while the process is running? Can I use permissions to help?
Thanks for the feedback. I'll definitely add a docs page to explain failure modes. That's a really good idea. I'll briefly answer your questions here as well.
> Is there any indication on the replicas that they are read-only?
No, Litestream runs as a separate process so it can't control whether the application writes to the SQLite database. That being said, you can set up your replicas to pass "mode=ro" in the DSN to SQLite to ensure they can't issue writes. SQLite also has a "PRAGMA query_only" to check if that flag is enabled.
> Is there and plan for failover/promotion?
There's no failover/promotion. Litestream was originally for continuously replicating a single node to S3 to ensure durability. The idea was that many VPS providers have decent reliability so you can achieve high uptime (e.g. 99.9%) with a single node and still have a fallback in case it fails catastrophically. If you have higher uptime requirements then Litestream may not work for you but there's a large class of applications where that works well. Recovery is just a matter of calling the "litestream restore" command so you can automate it pretty easily.
> What happens if I accidentally write to a read-only replica? Will the error be detectable immediately, or will I need to do some sort of exercise to discover it?
If you accidentally write to the replica then it will corrupt the replica. Litestream will not automatically detect it, however, you'll start getting errors from SQLite saying that your database is corrupt—usually pretty quickly.
> Is it even possible by SQLite semantics, or will I get locking timeouts if I try to write while the process is running?
Yes, SQLite works well in a multi-process environment and that's how Litestream interacts with it. Litestream occasionally obtains brief write locks so you should set the "busy_timeout" in your application so that it doesn't get an error when it tries to obtain a write lock at the same time.
> Can I use permissions to help?
You could probably run your application and Litestream as different users and adjust permissions accordingly. It's probably easier to set the "mode=ro" though.
Also, I'll note that I'm working on some future tooling for SQLite replication that acts more as a database-as-a-service that makes the administrative part more straightforward (e.g. any node can write, no need to worry about configuring read-only databases, etc).
> Is there any indication on the replicas that they are read-only?
Right, I get that Litestream can't prevent my application from opening the database RW, but what I'd like to see is that there's an endpoint on the http server that I can query to tell if it's in "Leader" or "Follower" mode; i.e. will Litestream be reading from the db file, or will it be writing to it? That could be used as a part of implementing a master-election scheme; An elector starts up Litestream in either "RW" or "RO" mode, and the application determines from litestream when it's safe to write. The only reason I'd not trust that to the elector is that I want litestream to finish it's replay first, as above.
> Is it even possible by SQLite semantics, or will I get locking timeouts if I try to write while the process is running?
Would it be possible for the follower Litestream copies to simply hold this lock all of the time, "preventing" (i.e. not through actual exclusion, but existing sqlite semantics) writes?
I haven't tested it yet but the feature is exactly as expected, the documentation is very clear, and the demo using fly.io is a wonderful example of how to use it.
Out of curiosity, how do you know when someone shares a link about LiteStream on HN? Asking because you never fail to answer people's questions and comments, which is impressive :)
Thanks for checking out the docs. That really helps.
As for HN, I use f5bot[1] for keyword notifications and then I use hnreplies[2] for getting notified when someone replies to a comment. They work great although I wish HN had some of these features built in.
Good question. Litestream does physical replication so it simply copies the underlying pages of the database. That means it’ll work with any database changes including DDL or even extensions.
One exception to that is encrypted databases like SQLCipher since Litestream needs to be able to read some metadata from the database.
Yep! The schema storage is kinda wild. They store the DDL for the creation of the objects as the raw SQL which is quite a bit different than how something like Postgres has a bunch of tables in the pg_catalog to specify the database structure.
Yes, this is a post about a new feature in Litestream though. It allows live replication from a primary node to one or more read-only replicas. It gives you the ability to do scale out reads as well as do geographic replication.
There's a small demo app where you can see a single primary in Chicago (ord) is updated when you click, "Increment", and it replicates out to 15 other regions worldwide via SQLite & Litestream: