Current section: Query Optimization 4 exercises

Intro to Query Optimization


00:00 we're going to optimize the query that's on this page. So we're going to add just a silly number of users to our database, and we're going to figure out how to optimize that query. We're also going to be optimizing queries in general by adding some foreign key indexes that you'd probably

00:18 be doing by default. But first, let's talk about optimization of queries. So people have this choice of spending a ton of money on hardware and caching to make things fast, or spending still a ridiculous amount of money, but much less on consultants who know the knowledge that I'm about to teach you today.

00:38 So I want to call out, though, that speeding things up is awesome, but you should always still consider pending UI when it comes to your user interface because you don't ever control the network. So your refresh page when you're working locally and everything, that may be ultra fast, but you don't control the network,

00:57 and so it won't always be fast for your users. So just don't forget about pending UI. So I want to talk about what database indexes are. That's what we're going to be focusing on. Rather than diving any deeper into SQL, we're not going to dive that much deeper than we are now. Full stack developers like yourself don't typically

01:17 need to go that much deeper into SQL. And if you do, then yeah, you can absolutely do that. But for most of us, we don't really need to dive much deeper than we already have dove, diven, dived into SQL. But we do want to understand indexes.

01:36 I got this metaphor from Tyler, and we've got this idea of a photo album. So how do you organize your photos? At first, you might just think, well, we're going to be really naive and just stick all of our photos into a single folder. And over time, that's going to be really difficult to find the photo that you're looking for

01:54 because maybe you're randomly generating IDs like we're doing in our database. And so it's not even sorted by date taken or anything like that. So it can be very difficult to identify where your photos are. And so you might think, okay, well, I'll make an index.

02:12 I'll make these folders. And in this example, this is an imperfect metaphor here because in an index, you don't actually move the data anywhere. You just have like a SIM link over to it. So you have like this data structure over here that says,

02:28 okay, here's where all of these similarly grouped things are and I'm just going to point to where those are in the database. That's kind of what, database indexes are pretty deep, but this is a metaphor after all. So yeah, here I have all these photos copied in here, but this would actually be more like a SIM link where the photo is just going to be,

02:48 it's like a shortcut sort of thing. So in any case, now it's much easier for me to answer the question, hey, where were you in October of 2023? Oh, like here are all the photos for that. But then somebody asks you, well, what about, like, have you ever been to the Alps or have you been to Amsterdam or the Great Wall of China?

03:08 So you're like, okay, well, I'll keep what I had before, but now I'm going to add another index based on the other metadata that I have for those photos. And so now we've got SIM links for all of those photos for those locations. But then somebody says, well, okay,

03:24 what about the time that you were in Amsterdam in 2023? Were you ever in Amsterdam in 2023? Well, now you're like, well, I mean, I've got Amsterdam right here and I could look through all those and like iterate through every one of these and see if that was in 2023, or I can go in these 2023 folders

03:44 and look and see if any of those were in Amsterdam. So what we can do is actually do a multi-column index where we say, okay, well, I'm looking for when I was in Amsterdam and I was looking for when I was in Amsterdam in 2023. So we've got kind of these multi-segments. Now this is an imperfect metaphor, but hopefully it gives you an idea

04:04 of what an index can do for you. It's just some extra data that is completely managed by the database to more quickly and efficiently find the records that you're looking for based on some data about that column. And hopefully this also gives you an idea of what makes a good index

04:23 and what makes a bad one, right? So it probably wouldn't make a whole lot of sense to have an index for the exact timestamp of every one of the times that you were in certain places or whatever, because then you're literally, it's just like a folder of those exact things, right?

04:39 But there can be some really nice optimizations that the database can do for you, even in those cases. We're actually gonna be, spoiler alert, we're gonna be adding an index on an updated app as well. So there, like I said, metaphor is imperfect, but hopefully it gives you an idea

04:58 of the ways that indexes can be used to optimize queries and what indexes really are. So some things to consider on indexing. It does take more space because you're gonna be allocating some data storage for that index. And it's also gonna take some more time

05:17 to add new records or update records, because when you update the records, you have to go and update the index as well. But that said, sometimes it can make mutations faster because if we have an update where we want to update a specific user and we have this where clause, and if we have that username indexed, then we'll be able to find that user record a lot faster

05:37 so that we can make that update. So it's not always like indexes make things slower, it's indexes make things slower by default, but then can make up for that by making other things faster. And then in general for web applications, most of the web applications, all of the web applications I've built,

05:56 the read speed was a lot more important than the write speed anyway. And so that's not the case for every application, of course, but for many Epic web applications, you're going to be reading a lot more often than you write and users are a lot more accepting of a slower read.

06:14 And we're talking on the order of milliseconds, not seconds, but they're okay with a slower read or a slower write than they are okay with a slower read. So it's important to make your reads a lot faster than your writes. Okay, so a couple of things to think about indexing by default, primary keys.

06:32 So your ID, that's actually going to be done by default by your database. Unique fields, this is also a useful thing to have indexed. And Prisma actually does that for us automatically. So anything that we label as unique, Prisma will create an index for us. And so non-unique foreign keys is another thing

06:51 that you should probably index by default. This is not handled by Prisma, not handled by the database. And so this is something you'll have to set up manually. And yeah, pretty much like almost all the time, it makes sense to add an index here. You'll almost always gain a nice benefit in performance,

07:11 query performance for adding those kinds of indexes. Okay, so other things to consider indexing, stuff that shows up in your WHERE clause and in your ORDER BY clause, definitely a good spot to start thinking about indexing. And so as always, when you're doing these performance optimizations, you want to measure the before and then measure the after

07:31 and make sure that the difference is a positive one. So yeah, keep that in mind. You don't want to just index everything that you put in your WHERE clause and your ORDER BY clauses, but that's a good first place to look. So when you're looking for how to optimize things, you're going to look for memory and CPU spikes.

07:50 So if you see a bunch of spikes of CPU when like the users do this specific action, then one of those queries that were performed doing that action could be the cause of that. And then you're going to use the EXPLAIN QUERY PLAN

08:06 command or query that will output the plan that the database engine plans on executing to execute the query. So an example of that is you say EXPLAIN QUERY PLAN and then your query, and then it will tell you the query plan

08:23 is I'm going to scan the entire user table of the database. And we're going to do that because our user's name is not indexed. But then if we do this on the username, then we're going to say using index. And when we're using the index, that's going to be way, way faster

08:41 because we don't have to read every single record. And so, yeah, that will be faster. Okay, so indexing in Prisma. Like I mentioned, we already have indexes for our unique fields that's generated in our migrations SQL script. But if you want to manually index something,

09:01 you add the double at index directive or attribute here that allows you to say, I want to add an index on the location ID. And Prisma takes care of making this create index call and then the database takes care of that index.

09:21 And it will use it when it's executing queries. And then you can do multiple columns and we'll actually be doing that in this exercise. And so we want to, this is kind of like our Amsterdam and then year, right? Where we say like, okay, I'm looking for when I was in Amsterdam in 2023. Okay, so here's Amsterdam. And then here's the second part of the index

09:39 is the year or whatever. So yeah, we can say location ID and the name. And we're, yeah, absolutely going to be taking advantage of that capability as well. So we're going to be running SQL commands in this exercise. And so you have a couple options for what you can do.

09:59 You can create a temporary file in the playground if you want to, and then run this. You could run it with a NPX, TSX. Actually, you know what? I'll just show you. So I'm going to make temp ignored.ts and stick that right there. And then let's run NPX.

10:20 Let's, or yeah, TSX watch, and then temp ignored. And so now I get my results there. If I change this, it will rerun every time. And so you could do it that way. It's nice and easy to make that happen that way. The way I'm going to do it though

10:38 is I'm going to use the SQLite 3 CLI. So you'll get that installed on your machine. The instructions will have instructions on how to do that. And then you are in the SQLite CLI. So I can say, explain the query, and I get the output there too. So you have some options

10:57 on how you want to run these queries. Use whichever one you feel most comfortable with. And yeah, you're going to be explaining queries like nobody's business in this exercise. So it's a lot of fun. It can be really fun to go pretty deep on some of this stuff, especially the stuff that consultants get paid big bucks to do.

11:17 There are literally just, the first step of the exercise is like three lines of code changed, maybe just two. And then the second step is literally one line of code change makes all the difference. And that's something that I found in software development is that performance optimizations take an enormous amount of work

11:36 if you're building the tools to make it possible to make these optimizations work. But then when you're consuming those tools, it's normally just like a couple of lines to opt into the particular optimization and add some config or whatever. It's kind of interesting how that shakes out.

11:54 Think about adding react.memo on a component that you need to optimize its rendering. That's a very easy thing to do, but it takes a fair bit of work for React to do that and make that efficient for all of us, right? So all of that said,

12:11 I think we're ready to get into optimizing some queries. So have a great time with this one and we'll see you on the other side.