Prepare for Multi-Region Data with LiteFS
Fly.io has built-in support for distributed Postgres and Redis, but by using LiteFS we'll be able to use SQLite as well.
LiteFS depends on the Fuse package, which acts as a virtual file system.
This means that our SQLite database will be moved into this virtual file system, which is how our applic
Lecturer: 0:00 Our application is humming along nicely, but we have one problem.
0:04 Let's say that we have users who are all over the world. How do we decide where we deploy our application based on our users? Do we think, "Let's look at the analytics, how many users are there in certain parts of the world? We'll put the application wherever the most users are"?
0:20 That is a strategy that works out pretty well for some use cases. If we want to maximize the performance of our application, though, we can co-locate it with our users in all the different regions throughout the world. We'll do some horizontal scaling, and put the different nodes of our application all throughout the world.
0:36 That would be a lot better, except for the data. If we don't also do the same to our data, then our applications have to reach all the way around the world to get the data as well. It's not a whole lot better. It is a little bit better, but not a whole lot.
0:50 We have to distribute our data. Once we start distributing our data, now, we have conflict resolution. What happens if the user up here is updating data, and the user up here is updating data? Which one wins? It can be a really complicated problem. However, there are some good solutions to this. These solutions are built into Fly.
1:09 With Fly, you can have PostgreSQL clusters. You get automatic read replicas. The idea is you have one node that is the primary node. It's the only one that is able to write to the database. All the other ones can get their own database, but they can only read from it. Any writes have to be sent to the primary node.
1:27 Fly has a great built-in way to accomplish this with very little impact on your application code. Luckily, for us, because we're using SQLite, Fly also has a solution for this with LiteFS.
1:40 LiteFS basically acts as a virtual file system. Remember, SQLite is just a file in the file system. What LiteFS does is give us this virtual file system that proxies any reads and writes, so that we can make sure that any writes go to the proper region, and get propagated to the proper spot.
2:02 We're going to set up LiteFS in our application. Then, eventually, we'll get to multiple regions. To get started, we need to have fuse3 installed. We are using our Dockerfile with apt-get. We'll add fuse3 to our apt-get requirements there.
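As a rough sketch of that Dockerfile change, assuming a Debian-based image that already installs packages with apt-get (the project's real install line may list other packages, which this lesson does not show):

```dockerfile
# Sketch only: the project's actual package list is not shown in the lesson.
# fuse3 provides the FUSE userspace tools LiteFS needs to mount its file system.
RUN apt-get update && apt-get install -y fuse3
```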
2:17 Then, because we're running with Docker, we can use the LiteFS distribution on Docker Hub to get the official Docker image. We're using the latest version that is in here. Use whatever the latest version is at the time that you're following along if you are following along.
2:31 We'll stick that toward the bottom of our Dockerfile. We need the LiteFS binary right when we start our application. We'll bring that in, and then the next thing that we need is configuration.
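A minimal sketch of that line, assuming the official `flyio/litefs` image on Docker Hub and its usual binary path. The `0.3` tag is only an illustrative assumption; the lesson just says to use whatever version is current.

```dockerfile
# Copy the LiteFS binary out of the official image on Docker Hub.
# The tag below is an assumed example; substitute the current release.
COPY --from=flyio/litefs:0.3 /usr/local/bin/litefs /usr/local/bin/litefs
```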
2:44 The default configuration location is /etc/litefs.yml. We'll put our own YAML configuration in the root of our project, and then we'll add it. We'll say ADD litefs.yml to that location where it's expecting it. Let's make that litefs.yml right here. We'll copy this configuration over here. Let's talk about this a little bit.
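In Dockerfile terms, that step might look like the following, assuming litefs.yml sits in the project root next to the Dockerfile:

```dockerfile
# Put the project's litefs.yml where LiteFS looks for it by default.
ADD litefs.yml /etc/litefs.yml
```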
3:07 Fuse is the library that's responsible for creating the virtual file system. That's the thing that we installed right here. It's going to create the virtual file system for us so that LiteFS can intercept all of the writes so that we get that propagation of all of those database writes.
3:26 The directory where we want this virtual file system to be is where our application should be accessing our database. The application should not access the actual data anymore. It should access it through the proxy so that we make sure that we're doing all the writes, and propagating those.
3:45 When we specify the DIR, this is saying, "Hey, fuse. This is the directory I want the virtual file system to be set up in." Because we've set /litefs here, we need to update our database URL to use /litefs instead.
3:59 Now, I don't like this configuration having to be the same and being in two different places. I think that'd be easy to mess up. Like, if I just decide I want it to be Lite-FS, I could break things there.
4:10 What we're going to do instead is we'll make an env called LITEFS_DIR. This will be...I want to spell it right, though. Otherwise, that'll be confusing. We'll say /litefs, and we'll stick LITEFS_DIR right there. That'll be interpolated. Then, over here, we can interpolate this as well, so LITEFS_DIR. That configuration will be interpolated for our fuse configuration.
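A sketch of the two environment variables in the Dockerfile, assuming the mount directory is /litefs and the database file is sqlite.db (both names appear in the deploy logs later in this lesson). The exact quoting in the real Dockerfile is an assumption. The litefs.yml side then refers to the same value as `${LITEFS_DIR}`.

```dockerfile
# Single source of truth for the LiteFS mount directory.
ENV LITEFS_DIR="/litefs"
# The app reads the database through the LiteFS mount, not from the volume directly.
ENV DATABASE_URL="file:$LITEFS_DIR/sqlite.db"
```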
4:35 Now, the directory for the data is where the actual data is going to live. We do need to have the underlying SQLite database. It needs to be somewhere. This is where LiteFS is going to put that SQLite database. Currently, our SQLite database, before we made this change, was in /data/sqlite.db.
4:58 We could put it in the same place. It's possible you may want to have it in a very specific place. For us, we're going to create a brand-new location, so we don't have any problems when we're deploying this for the first time. Then, you can do an import later if you want to import existing data.
5:16 I should also note that I'm going to be pushing this directly to our main branch, but we already set up the staging environment. If you wanted to test this out in staging, that'd probably be a good idea, especially if this is a long-running application where you've got existing data, existing users, and stuff.
5:31 For us, we have mounted our data persistence -- our persistent volume that we're calling data -- to /data. That is where our data needs to live inside of LiteFS. Where LiteFS creates a SQLite database and performs actual writes to, that needs to be under our data directory.
5:53 We're going to call this, simply, data. We're going to put our SQLite database under the LiteFS directory to communicate, "Hey, LiteFS is the one that's writing to this actual file here."
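Putting the two directories side by side, the relevant part of litefs.yml might look like this sketch. The key names assume the fuse/data layout described above, and /data/litefs assumes the persistent volume is mounted at /data.

```yaml
# litefs.yml (sketch; key names assume the fuse/data layout described in the lesson)
fuse:
  # Where the virtual file system is mounted; the app opens the database here.
  dir: "${LITEFS_DIR}"
data:
  # Where LiteFS keeps the underlying files, on the persistent volume mounted at /data.
  dir: "/data/litefs"
```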
6:04 With that, we have one other thing that we need to do in our LiteFS config and another thing to do in our Dockerfile. Let's come over here, and let's talk about lease configuration.
6:15 The idea is that we only have one region or one instance of our application that is allowed to write to its database. The rest are read replicas. This works well for most applications and allows us to co-locate our data as close to the users as possible.
6:34 The challenge here is, how do you decide which region, or which instance of your application, gets to be the primary instance? The one that actually gets to write to its database.
6:44 There are a couple of strategies that you can apply here. We have static and consul. We're going to start with static because it's a lot simpler. Then, we'll bring in consul because it has some nice capabilities we'll talk about later.
6:56 For the static, we basically are saying, "Hey, this one, 100 percent, is going to be the primary node." Because we're only deploying to one region, we don't have to do anything special with the candidate here. We can hard code it to say true.
7:09 That will mean that the instance that starts up here can assume that it is the primary, and it can write to its database. We don't have to worry about synchronizing writes, or anything. That's not a thing in LiteFS here. Our type is static. We are the candidate for being the primary node.
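The static lease described here could be sketched in litefs.yml as follows. The key names are assumed from the terms used in the lesson (lease, type, candidate); other lease options that a given LiteFS version may accept are left out.

```yaml
# litefs.yml (sketch of the lease section only)
lease:
  # "static" means the primary is fixed by configuration rather than elected.
  type: "static"
  # With a single instance in a single region, this node is always the primary.
  candidate: true
```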
7:26 Then, in our Dockerfile, we need to make sure that the virtual file system is up and ready, and the SQLite database is accessible through that before we start our application. To do that, we're going to run our application under the LiteFS binary, which we installed right here.
7:44 There are instructions for doing this. There are a couple of ways you can do it. You can simply run litefs mount, and then configure the exec property in your LiteFS config. Or you can run litefs mount with a double hyphen here, and then whatever you want it to exec.
8:01 In our case, we're going to go with this approach here. litefs mount, and then the double dash. Then, npm start is what we want it to execute. With that, we are ready to rock and roll.
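A sketch of the resulting Dockerfile command, assuming the app's start command is `npm start` and the exec-form `CMD` syntax is used:

```dockerfile
# LiteFS mounts the virtual file system first, then runs everything after "--" as a subprocess.
CMD ["litefs", "mount", "--", "npm", "start"]
```

The alternative mentioned above would keep the command as just `litefs mount` and move the start command into the exec property of litefs.yml.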
8:14 Let's git status, see what we changed. We modified the Dockerfile, added a new file, so git add all the things. Then, we'll git commit everything with the commit message, add LiteFS. Now, let's push that, and come over to our actions.
8:34 Here, we've got our add LiteFS action. That is going to deploy our application. Once that's done, we're going to get some interesting logs that I want to talk about. We'll wait for that to show up.
8:46 The logs are saying it's shutting down the virtual machine. We're closing down the old app, starting up the new one. We're pulling the container image, unpacking it, setting up all the Firecracker stuff in here. There are a couple of logs in here I want to focus on as this application gets started up.
9:02 The part where we start seeing some interesting stuff is right here where we're determining the primary region. Here, we're using static primary. That's because we configured our lease configuration to be static. We are the primary.
9:19 We're reading the config file from /app/litefs. The LiteFS version is printed out here. We've got our primary lease acquired, because we're the only instance that can be the primary instance. The /litefs is where our LiteFS directory is going to be mounted to.
9:36 We've got a couple of other logs here related to determining the primary node and the cluster. Then, we're starting the subprocess. This is where our code starts. Here, we're running node start.js, which loads our Prisma schema.
9:51 It identifies the SQLite database under file /litefs/sqlite.db. Then, the database file is zero length on initialization. We didn't have a database yet, because this is a brand-new location. The SQLite database, sqlite.db, is created at file /litefs/sqlite.db.
10:13 It found one migration because, again, this is a brand-new database. We're not using the existing one. We're applying this migration. Then, it's telling us it applied this migration. These logs are all coming from Prisma. Then, we get to start our app. Our server is listening, and health checks are passing.
10:32 In review, what we did to make all of this work was we updated our Dockerfile to include fuse3, so we can have that virtual file system all up and loaded. That's a dependency of LiteFS.
10:45 Then, we're copying the LiteFS binary from the latest version, adding our LiteFS config. Then, running our command, LiteFS mount, so that LiteFS can get all set up before our actual application starts.
10:57 Then, our LiteFS configuration specifies the directory to be LITEFS_DIR, which was configured in our Dockerfile right here so that those could be shared for both our database URL as well as our LiteFS configuration.
11:12 Then, the data is pointing to the /data, which is the destination for our persistent volume called data. We're putting this inside of the LiteFS directory. Then, we specify the lease type of static to make things as easy as possible. That gets our application running within LiteFS.