Improve UX and Security with Time-based One Time Passwords (TOTPs)

When a user creates a new account with their email address, we want to verify that the email address belongs to them. That way, if they ever need to change their password or we need to communicate with them about something, we know we've got the right address for them. But verifying the user's information can be tricky. We can send them something and have them send it back to us, but we need to make sure that what we send them is random enough to not be guessed, simple enough to be typed, and only valid for a short period of time. The Epic Stack used to generate an encrypted token, but there are limitations and vulnerabilities associated with doing this. So now the Epic Stack has adopted a new pattern called "Time-based One-time Passwords." Learn more about what went into the decision (along with the ChatGPT conversation that aided in making it) by reading the decision documents for the Email Code and the Time-based One-time Password Algorithm.

Transcript

Kent Dodds: 0:00 I had a big long conversation with ChatGPT about the merits of the authentication system in the Epic Stack and how we validate user emails, and I changed it. The way that we used to do it was with encrypted payloads: an encrypted token in a magic link. Now, we're using a time-based one-time password.

0:22 First off, I decided I wanted to make things a little bit easier for users to submit the one-time password thing. We send them a six-digit code that they can type in instead of a magic link that they have to click. The benefit of that is they don't have to have two different tabs: one where they requested the link, and another where they clicked and opened the link.

0:47 That's nice. It's also helpful if the device that you're opening the email on is separate from the one that you want to sign in on. That was a decision I had already accepted.

1:04 In doing that, I decided I should also probably secure the way that we do email verification in the first place, because we are susceptible to rainbow table attacks and other brute-force related attacks that could potentially leak our encryption secret.

1:23 Which is unlikely, but could definitely happen. We're also open to disgruntled employee attacks and stuff like that. By switching to a time-based one-time password algorithm, we sidestep those issues.

1:40 You can read through the decision documents to get a little bit more background on this. Of course, I do link to this entire conversation with ChatGPT where I'm weighing the different trade-offs. I'm really close to saying, "Forget it," and then I go back and forth. It's a pretty interesting conversation. Feel free to read through that.

1:59 I want to show you what the user experience looks like now. I go to login. Of course, you have to have the server running, npm run dev. Let's go to the home page again. I go to login. I am an existing user. That flow has been unchanged, still username and password. That's going to be the same.

2:22 The new flow comes in when creating an account and in "Forgot Password." Let's create an account first. We're going to say, jenny@example.com. We'll submit. That will send us to this new page. This page did not use to exist. It used to just say, "Check your email and then click on the magic link."

2:42 Now, we go to a /verify page that has the email in the URL. We're supposed to enter the code right here. Then we received an email here. During development, it is mocked out, and we just log the email to the console. We have both the text and HTML form.

2:58 Here in the text, we can see that is our code. We can copy that and paste it right here. Then we go submit and now we can go through the onboarding process. That is unchanged, same thing.

3:12 If we go to forgot password...Actually, let's do create account again and say, jenny@example.com. We submit. We'll have a new email with a new one-time password. We also have this link. This will pre-fill the code as a URLSearchParam. I'll copy that and paste that in place. That will instantly verify and redirect us.

3:39 Now, if that one-time password has already been used, then that's not going to work. It will say, "The code is invalid." If it's a completely invalid code, then that will also say it's invalid. All of the typical things that you would expect work in this scenario. Trying to use somebody else's code with a different email address is also not going to work, all of that stuff.

4:05 All right, let's take a look also at forgot password flow. If I say, username or email. We say, Kody, that's the username. It's important that we keep the username or email here rather than resolving this to a common thing, like username specifically or just email.

4:23 The reason for that is if we resolved it to email, for example, then somebody would be able to find a user's email address by typing their username in here. Then vice versa, if you had their email address and want to find out what their username was, you could use this. You don't want to do that. We keep the ambiguity of username or email here.

4:46 It's nice to be able to provide either username or email because this means we also don't need to have a "Remind me of my username. I forgot my username" thing. You just give us either one of those and we'll send you the email.
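The ambiguity described here can be sketched roughly as follows. This is only an illustration, not the Epic Stack's code: the `User` shape, the `requestPasswordReset` name, and the `sendEmail` callback are all hypothetical. The point is that the value carried forward is whatever the visitor typed, never the resolved username or email.

```typescript
// Hypothetical user shape and helper names, for illustration only.
type User = { username: string; email: string }

function requestPasswordReset(
	users: User[],
	usernameOrEmail: string,
	sendEmail: (to: string) => void,
): { target: string } {
	// Look the user up by either field.
	const user = users.find(
		u => u.username === usernameOrEmail || u.email === usernameOrEmail,
	)
	// The email only ever goes to the address stored on the account.
	if (user) sendEmail(user.email)
	// Echo back exactly what was typed, so the response never reveals
	// the email behind a username (or the username behind an email).
	return { target: usernameOrEmail }
}
```

Because the returned target is the raw input, nothing shown in the next page or URL tells the visitor more than they already knew.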

4:59 We will have gotten that email here. Again, we can click on this or we can type this in. We type that in and submit. Now, we can do the password reset. There we go. Now, I can authenticate and we're logged in. That is the user experience for the flow.

5:24 The way that this works is also interesting. The way that we do that is we have this util here. Let's start from the route. We go to our auth routes and we go to our signup. Then in the signup, we have...Of course, the UI is just a form. Nothing interesting there. In the action, we're parsing the form, like we do with our regular conform and Zod schema and stuff.

5:53 The interesting stuff comes in right here. First of all, we want this OTP, this one-time password, to be valid for 30 minutes. That's pretty long for a one-time password, but for this type of experience, I think 30 minutes is perfectly reasonable. Because this is a project generated for you, if you disagree, you can always just change this. It's very easy to do.

6:16 30 minutes is our default here. We generate the TOTP, the time-based one-time password. You can dive into that and look at the implementation. This is all custom implemented, based heavily on the notp package, which is great, but it only supports SHA-1.

6:37 That is not great. The maintainer is not actively responding to stuff. Anyone can feel free to make this an open-source package. I definitely welcome that. This was copied, pasted, modified, and improved based on that package.

6:54 generateTOTP generates an HOTP. An HOTP is an HMAC-based one-time password. For us, we want it to be time-based, because that means that it, by design, will expire after the validSeconds that you provide here. We can even keep it in the database indefinitely and once that validSeconds time has passed, there is no way for that thing to be valid anymore. That's one of the benefits of a TOTP.

7:29 The TOTP is based on the HOTP. We generate our counter based on the number of seconds we want it to be valid for. We generate a random key and then we return that random key, the algorithm that was used, the number of validSeconds, and the OTP value itself. We use those to create a verification.
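A minimal sketch of that generation step, assuming Node's built-in `crypto` module and the standard HOTP truncation from RFC 4226. This is not the Epic Stack's actual util: the function names, the SHA-256 default, the 32-byte key, and the `now` parameter are assumptions made for illustration.

```typescript
import { createHmac, randomBytes } from 'node:crypto'

// HOTP (RFC 4226): HMAC the 8-byte big-endian counter, then truncate.
// Assumes counter is a non-negative integer.
function generateHOTP(
	secret: Buffer,
	counter: number,
	digits: number = 6,
	algorithm: string = 'sha256',
): string {
	const counterBytes = Buffer.alloc(8)
	counterBytes.writeUInt32BE(Math.floor(counter / 2 ** 32), 0)
	counterBytes.writeUInt32BE(counter % 2 ** 32, 4)
	const digest = createHmac(algorithm, secret).update(counterBytes).digest()
	// The low 4 bits of the last byte pick where to read 31 bits from.
	const offset = digest.readUInt8(digest.length - 1) & 0x0f
	const truncated = digest.readUInt32BE(offset) & 0x7fffffff
	return (truncated % 10 ** digits).toString().padStart(digits, '0')
}

// TOTP: the counter is the current time divided into validSeconds-long steps.
function generateTOTP({
	validSeconds = 30 * 60,
	digits = 6,
	algorithm = 'sha256',
	now = Date.now(), // injectable for testing (an assumption of this sketch)
} = {}) {
	const secret = randomBytes(32).toString('hex')
	const counter = Math.floor(now / 1000 / validSeconds)
	const otp = generateHOTP(Buffer.from(secret, 'hex'), counter, digits, algorithm)
	// Everything needed to verify later is returned alongside the code.
	return { otp, secret, algorithm, validSeconds }
}
```

Calling `generateTOTP()` with no arguments yields a six-digit `otp` plus the `secret`, `algorithm`, and `validSeconds` values that would be stored with the verification record.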

7:51 First of all, we have a unique constraint, if we look at our schema.prisma. We have a model for verification and we have a unique constraint on the verificationTarget, and the type. That means it's impossible for a user to have two OTPs for the same email, for example, because that wouldn't make any sense and there could be a bit of confusion if they had multiple.

8:16 That's why we immediately run a deleteMany on these: we delete all of the ones that have the same verificationType and verificationTarget. Our verificationType is onboarding, and the target is their email. The thing we're trying to verify is the email.

8:35 We create a new one with that type, that target, the algorithm that was used to create the TOTP, and the secret that was generated to create that one-time password, the length of time that it's supposed to be valid, and the one-time password itself.
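The delete-then-create step can be pictured with an in-memory stand-in for the database table. This is only a sketch of the idea that a (type, target) pair holds at most one verification. The `VerificationStore` class and its field names are hypothetical and do not mirror the real Prisma model.

```typescript
// Hypothetical record shape, loosely following the fields named in the talk.
type Verification = {
	type: string
	target: string
	algorithm: string
	secret: string
	validSeconds: number
	otp: string
}

class VerificationStore {
	private records = new Map<string, Verification>()

	// One key per (type, target) pair stands in for the unique constraint.
	private key(type: string, target: string): string {
		return JSON.stringify([type, target])
	}

	// Delete any existing record for the pair, then create the new one.
	replace(verification: Verification): void {
		const key = this.key(verification.type, verification.target)
		this.records.delete(key)
		this.records.set(key, verification)
	}

	find(type: string, target: string): Verification | undefined {
		return this.records.get(this.key(type, target))
	}

	get size(): number {
		return this.records.size
	}
}
```

Requesting a second code for the same email replaces the first, so an older code can no longer be matched against a stored record.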

8:54 Then we redirect the user to the signup verify page with their email that they typed in. They provided that to us and we send them to the verify page with that prefilled in, because that's one part of the OTP.

9:13 You can't give a random number and have it sign you in as whichever user has that OTP number. That doesn't work. You have to have both of them. We prefilled that because you gave that to us. That's what's going to be our redirect URL.

9:32 Then we add the OTP to the email that we're going to be sending. We give them the OTP directly and then also give them the URL that they can click and have that prefilled in. Now, they have the option. They can do one or the other, and it will work just as well. If it was successful in sending the email, then we redirect them and they can continue on.
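As a rough sketch of the two URLs involved, the redirect URL carries only the email, while the link in the email carries the code as well. The `/signup/verify` path and the `email` and `code` parameter names are assumptions for illustration, not confirmed details of the Epic Stack.

```typescript
// Hypothetical path and parameter names.
function buildVerifyUrls(origin: string, email: string, otp: string) {
	// Where the user is redirected after submitting the signup form.
	const redirectTo = new URL('/signup/verify', origin)
	redirectTo.searchParams.set('email', email)

	// The link placed in the email: same page, with the code prefilled too.
	const verifyLink = new URL(redirectTo.toString())
	verifyLink.searchParams.set('code', otp)

	return {
		redirectTo: redirectTo.toString(),
		verifyLink: verifyLink.toString(),
	}
}
```

Using `searchParams.set` rather than string concatenation keeps the email correctly encoded in both URLs.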

9:55 Now, once they land on the signup verify page, in our loader, we're first going to check if it has the OTP in the query parameters. If it does, then they probably clicked on the magic link and here they are.

10:09 If it does, we're going to do the regular validation that we'll look at in a second. If it doesn't, then we'll return the form in an empty state so that the user can fill in the missing pieces.

10:22 The form, there's nothing too interesting about the form here. It's just a regular conform thing going on here. When the form is submitted, that is going to go into our validate logic. We run through that validate logic when they land on the page, if they have both of the pieces of information. If they don't, we wait for them to submit the form.

10:45 If they're submitting the form, that's coming from formData. If they are landing on the page, that's coming from the URLSearchParams. In either case, the logic is the same.

10:52 Here's regular conform validation stuff. We have a superRefine on our Zod schema that will go check the database for a verification that matches the email and the code.
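One way to see why the logic can be shared: both `FormData` and `URLSearchParams` expose a `get` method, so a single function can read the fields from either. The sketch below is illustrative only. `readVerifyFields` and the `email` and `code` field names are hypothetical.

```typescript
// Anything with a get(name) method works: URLSearchParams in the loader,
// FormData in the action. Field names here are assumptions.
type FieldSource = { get(name: string): unknown }

function readVerifyFields(
	source: FieldSource,
): { email: string; code: string } | null {
	const email = source.get('email')
	const code = source.get('code')
	// Both pieces are required before validation can run.
	if (typeof email !== 'string' || typeof code !== 'string') return null
	if (email === '' || code === '') return null
	return { email, code }
}

// Loader-style usage: the values come from the URL.
const fromUrl = readVerifyFields(
	new URL('https://example.com/verify?email=jenny%40example.com&code=123456')
		.searchParams,
)
```

When the function returns `null`, the page would render the empty form and wait for a submission instead of validating.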

11:06 We grab the pieces that we need to verify that code. If there is no verification, maybe the verification was deleted, they're using an old code or something like that, then we'll say, "Invalid code."

11:18 We're not going to give them any more helpful information there, because why would we? It would probably be useful to have a little "Request another code" thing or something. As far as the validation is concerned, it's not concerned with that.

11:34 Assuming there is a verification in the database that matches the OTP that they gave us and their email, then we're going to verify it with the OTP that they gave us, and the secretKey that we stored in the database.

11:52 We're going to make sure we use the same algorithm and the same valid number of seconds. That's also an important input here. Then we have the window set to 50 because I was testing something. This window should be set to one.

12:09 TOTPs are what's used for the one-time passwords in Google Authenticator or 1Password, or wherever it is that you scan the QR code and get a new password every 30 seconds.

12:21 When you have something like that, which we will in the future, you can end up with a situation where the clock that...The one-time password depends heavily on the clock. If the clock on the server and the clock on the device are off a little bit, then the one-time password can also be off and that's not great.

12:43 The window means, "I'm going to be valid for this length of time, but also for the previous length of time as well." That way you can account for that clock drift. You normally don't want to have much more than three in there, but it depends on the situation.

12:59 In our case, we are both the client and the server. The server is acting in both of those roles. We don't need to worry about clock drift. That's why we can say window of one for this specific usage.

13:13 If there's no result, then the code was invalid, it is not verified. We're going to add that issue. Otherwise, we don't add any issues and things are hunky-dory. Awesome. That's the integration with Zod here, and all of it's type-safe and it's amazing.
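A minimal sketch of window-based verification, assuming Node's built-in `crypto` module and RFC 4226 style HOTP. Here the window is interpreted as the number of time steps checked on either side of the current one, which is one common reading and an assumption about the actual implementation. The `verifyTOTP` name and its option names are also hypothetical.

```typescript
import { createHmac } from 'node:crypto'

// Same HOTP construction as RFC 4226; counter must be a non-negative integer.
function hotp(secret: Buffer, counter: number, digits: number, algorithm: string): string {
	const counterBytes = Buffer.alloc(8)
	counterBytes.writeUInt32BE(Math.floor(counter / 2 ** 32), 0)
	counterBytes.writeUInt32BE(counter % 2 ** 32, 4)
	const digest = createHmac(algorithm, secret).update(counterBytes).digest()
	const offset = digest.readUInt8(digest.length - 1) & 0x0f
	const truncated = digest.readUInt32BE(offset) & 0x7fffffff
	return (truncated % 10 ** digits).toString().padStart(digits, '0')
}

type VerifyOptions = {
	otp: string
	secret: Buffer
	validSeconds?: number
	window?: number
	algorithm?: string
	now?: number
}

// Returns how many steps off the match was, or null when nothing matched.
function verifyTOTP({
	otp,
	secret,
	validSeconds = 30 * 60,
	window = 1,
	algorithm = 'sha256',
	now = Date.now(),
}: VerifyOptions): { delta: number } | null {
	const current = Math.floor(now / 1000 / validSeconds)
	for (let delta = -window; delta <= window; delta++) {
		const counter = current + delta
		if (counter < 0) continue
		if (hotp(secret, counter, otp.length, algorithm) === otp) return { delta }
	}
	return null
}
```

A larger window tolerates more clock drift between two devices but also widens the span of time in which a code is accepted, which is why a small value fits the case where the server plays both roles.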

13:30 If we are doing a validation as the user is typing in stuff, that's where this can come in. We handle that case, if there is no value at this point, we know there was an error. We'll send the submission back, which will include all of the errors.

13:47 If we get past that, then we know that things are valid and we have a value. We're going to delete the verification from the database and then we'll go into the session and add the onboardingEmailSession, so we know which user is being onboarded in this process and what email it is that they have. They verified this email, this is theirs. Now, we can onboard them with that email.

14:12 That is the entire process. The forgot password flow is a very similar thing. We create that one-time password right here and send that as part of the email and everything. When they land on this page, we can check it in the loader, as well as in the action, if they manually type it in.

14:36 You might be looking at this and thinking, "There may be an abstraction there," and it's possible there is a good one. I did play around with it for a second and it didn't feel like we're quite there with a nicer abstraction. It's nice to be able to customize this very specifically.

14:53 That is the one-time password support in the Epic Stack. It is awesome and paving the way for us to do two-factor authentication. I hope that you give this a shot in your own applications. It strikes a great balance of security, ease of use, and complexity. I feel confident about it and I hope that it's helpful to you. See you.