SQLite Replication

SQLite is probably the world’s simplest Relational Database Management System. Basically it’s a C library which can be embedded in programs easily. There is no server/client mechanism, the database is a single file. For small work loads, it often makes no sense to use a big RDBMS package like MySQL or PostgreSQL, unless of course you need the special features they provide.

So, I came across this situation where I need replicate SQLite database. The problem arised because, data redundancy was needed across multiple servers and the program in question was supporting SQLite, MySQL and PostgreSQL; but one of the servers had only workload for PostgreSQL database and installing MySQL for the small amount of data the program handled wasn’t sensible. The other two servers had pure MySQL workload. Also, the updates needed to be propagated. So there is the deadlock.

I searched around to find nothing useful, but I remembered there is a cron called incrond which can watch files and directories for events using INOTIFY and execute commands when specific events occur. The solution is almost there. All I need is a script to copy the database file to other servers when data was written to it. I wrote a simple script which would copy the files to other servers; at first I tried rsync with incremental updates, but it didn’t work, because SQLite doesn’t delete data actually when rows are deleted as written in this FAQ at official website. The data is simply marked for deletion and reused during future inserts.

So I guess you now know why incremental updates won’t work: even if I delete a row, the size of database is going to remain same. Actually it should work with rsync’s checksum method, but it didn’t for me. Nevertheless, since the data size was pretty small, I used scp to transfer the database.

The script is ready, now you need a incrontab entry for IN_MODIFY event i.e. to run the script when the file is modified. Here’s a small example:

/var/lib/database.sqlite IN_MODIFY /scripts/copy_database.sh

That’s it. Whenever the file is modified, it will be transferred to the other servers. Wait, the story doesn’t end there.

I tried modifying the database on primary server and it did properly replicate to other servers, but the new changes weren’t visible neither on the primary server nor on the other servers. In the sense, the program could see the deleted row. This is mainly because of caching in memory. The solution was to reload the program (which didn’t cost much in this case). So I just added the commands for reloading the program in the script before syncing the database to other servers on the primary server and then two ssh commands in the script which execute the reload command on the other two servers.

/etc/init.d/program reload
ssh server1 '/etc/init.d/program reload'
ssh server2 '/etc/init.d/program reload'