6 Safe Refactorings for Untested Legacy Code

Nicolas Carlo

In this talk, Nicolas, a developer and legacy code specialist, breaks down the misunderstood concept of refactoring—defined as: changing the structure of code without altering its behavior. That includes keeping bugs in place until structural improvements are made.

He emphasizes that true refactoring is only possible when behavior is verified, ideally through automated tests. If there are no tests, you're dealing with legacy code, and the paradox begins: you want to refactor safely, but need tests to do that—yet you can’t test without refactoring first.

Nicolas proposes a structured, recipe-driven approach to navigating this paradox, especially when working with untested legacy code. The core principle? A three-step loop:

Create new code → Replace occurrences → Delete the old code.

Transcript

Hi, Epic Web. I hope you are enjoying your conference. I'm here to talk about refactorings. Refactoring is an abused word, so I'm gonna go straight to the definition. When I say refactoring, I mean Martin Fowler's definition: a change to the structure of the code that does not change its behavior. Bugs included.

When I'm refactoring code, I do not fix a bug. I refactor the code first, then I fix the bug. So with this definition of refactoring, how do you know after you change the code that you have refactored? Right? Well, the answer is you test it. You test the behavior before, you test the behavior after.

You can do that manually. In general, we recommend that people use automated tests because they are much faster and more reliable than you. If you're practicing TDD, it's actually part of the loop. If the code is not tested, what do you do? Well, congratulations. You're working with legacy code, as per Michael Feathers's definition of legacy code.

So code not tested, legacy code. And this is where I step in because my name is Nicolas, and I am a freelance web developer specialized in legacy code. I have this weird interest in everything refactoring-related.

I've developed my own VS Code extension to bring more automated refactorings, and I organize communities here in Montreal, Canada, where we talk a lot about maintainability. So the paradox of legacy code is this one. You have the code, it is not tested. You want tests to do safe refactorings.

But it turns out that when you try to do this, you realize you first need to refactor the code so you can write some tests. How do you deal with it? Well, the answer is that people picture things as black and white, and it's nuanced, like everything. Either your code is covered with automated tests, or it's not and you YOLO.

Actually, there is a middle ground, and we're aiming for this. So better than doing YOLO refactorings is to follow some recipes. That's what Martin Fowler's book, Refactoring, is about. It gives you 60-plus refactoring moves. Even better is to lean on automated refactorings provided by your tools.

In this presentation, I want to quickly show you six refactorings, because there are so many refactorings that you can learn out there. If there are only a few things that you need to learn, I would recommend these ones, especially if you're dealing with legacy code. So let's start with the first one, inline.

Inline. I start with this one because it's a basic one. It will almost feel super intuitive, like a non-move, because you already know how to do that. Consider this code. We realize, actually, that the decreaseQuality method prevents another, better refactoring.

So in this scenario, it is common that we like to undo that abstraction because it's not good anymore. We do that by inlining. There are many different scenarios, but in general, the recipe is always the same.

You take whatever is inside decreaseQuality, and you replace the occurrences with the body of the function. Even better if it's just a variable. Finally, after you repeat that for all the relevant occurrences, you may be able to delete the function. Bam. You have done inline function.
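The inline move described above can be sketched as follows. This is a minimal illustration, not the code from the slide: the Item shape, the body of decreaseQuality and the two update functions are assumptions made up for the example.

```typescript
// Hypothetical item shape, for illustration only.
interface Item {
  name: string;
  quality: number;
}

// Before: the abstraction we want to undo.
function decreaseQuality(item: Item): void {
  if (item.quality > 0) {
    item.quality = item.quality - 1;
  }
}

// A caller that still goes through the abstraction.
function updateBefore(item: Item): void {
  decreaseQuality(item);
}

// After: the occurrence is replaced with the body of the function.
// Once every relevant caller looks like this, decreaseQuality can be deleted.
function updateAfter(item: Item): void {
  if (item.quality > 0) {
    item.quality = item.quality - 1;
  }
}
```

Both versions behave the same, which is the whole point: replace first, delete last.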

It brings me to that consideration. The refactoring moves that we will make always follow the same recipe. And that's probably even more important than the refactorings themselves. It's to get into the mindset of: first, create new code. Second, replace occurrences. Third, delete. In the case of inline, there is no creation of new code.

We're removing an abstraction. But we first replace the occurrences, then we delete. We follow that recipe in this order. Okay. Personally, when I inline, I like to lean on my editor. WebStorm is the best editor I know out there to automate refactorings. But if you're using VS Code, my extension can give you some of the inline.

Eventually, the TypeScript team will automate these moves, I hope, but at least you can inline variables with my extension. And you should learn the shortcuts. Don't do the steps manually. Use the shortcut. That's it for inline. Okay. Second refactoring, a very, very important one: rename.

Rename is super important because you know that naming things is super hard. It's out of scope, but I point you to a series of articles by Arlo Belshee about naming as a process. The short of it is you don't have to get the names right on the first try. You can just make them better and better.

The worst scenario for names is misleading names. The function is named something, but it does more, or less, or different things. Right? So you want to get to better names. How do you rename things? Well, let's consider this dfrq variable that you may come across, because sometimes people do that.

It's not clear what this represents, but as you work a while with this code and with the context, at some point you realize, oh, this is the dice frequency. Okay. I mean, we're using JavaScript here, so we can express ourselves. Right? It's gonna be just as cheap to name it diceFrequency. So let's do that.

How would you do that though? Very, very often, I see people start replacing all occurrences, and the code will be broken for a while. If it's in a single file with a few occurrences, that's fine.

But what if this variable is exported and used across the codebase, with maybe a name that cannot be searched and replaced because there are other occurrences? Here is how to do that move safely. The first step, you don't replace anything. You create a proxy, create new code.

So we create the new variable that we want to use, with the name that we want. That's new code. It's not used, but we add it. Then, we replace the occurrences. That's easy because we can start replacing occurrences and we can stop at any time.

In general, the code will still be working, so we're in a working state most of the time. That is the trick. I can stop, I can go in a meeting, I can come back and continue my work. Or if I really need to ship something, I can stop and continue later. So we do that.

We replace all of them until the initial variable is only used by our proxy. In this case, well, we can inline it. That's why we learned inline first. So put your cursor on this, Control+Alt+N, and bam, inline. You're done. Right. That is rename. In practice, I never do it this way.

I showed you the move for you to understand the logic, but you should really be using your editor shortcut. It's F2 in VS Code. Please use it. It's very reliable in JavaScript and TypeScript. Use that.
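For reference, the manual rename-through-a-proxy move can be sketched like this. It is only an illustration under assumptions: the countFaces helper, the sample rolls and the two reader functions are hypothetical, and dfrq stands for the unclear original name.

```typescript
// Hypothetical helper: counts how many times each die face was rolled.
function countFaces(rolls: number[]): Record<number, number> {
  const counts: Record<number, number> = {};
  for (const face of rolls) {
    counts[face] = (counts[face] ?? 0) + 1;
  }
  return counts;
}

// The original variable, with its unclear name. Existing code reads it.
const dfrq = countFaces([1, 3, 3, 6]);

// Step 1: create new code. The proxy carries the name we want.
const diceFrequency = dfrq;

// Step 2: replace occurrences one by one. This reader is already migrated.
function timesRolled(face: number): number {
  return diceFrequency[face] ?? 0;
}

// This reader is not migrated yet. It still works, so we can stop here.
function wasRolled(face: number): boolean {
  return (dfrq[face] ?? 0) > 0;
}

// Step 3 (later): once only the proxy reads dfrq, inline dfrq into
// diceFrequency and the old name disappears.
```

At every point between step 1 and step 3 both names are valid, which is why the work can be interrupted.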

You see? We follow the same recipe. Create the new code, replace occurrences, delete the old code. That's it for rename. Let's move to extract. Right. So how do we extract code? Well, first of all, why would we extract code? I have met two major use cases.

The first one is when you want to clarify the intention, and that is probably not what you expected. So see this code, see this condition here? I can read that code. Fine. But what does it mean? Yeah. Okay. Player one's score lower than four, same for player two, and then the sum should be six.

What does it mean? I don't know yet. But these conditions, usually, when they tend to be long, it's worth extracting them at least into a variable that will give them a name. However, at this point, I don't know the perfect name. So what do we do? Naming is a process.

I'm gonna come up with a very nonsensical name. That's okay. Like, I can call it Applesauce. Go read the series I mentioned before. You will understand why, but it doesn't matter. What matters is it is obviously wrong. That's a problem for later. For the moment, I am extracting. Right?

And not having a good name should not prevent you from doing the move. So Applesauce it is. First, I copy the value and I assign it to a variable. I create new code. I don't cut anything. The code is still working. I'm not messing with stuff.

Then, I replace the occurrences with the variable I just created. Here, there is only one, but maybe there are many. Eventually, I will be done with the replace, and I will finally figure out what that means. This is when I hit F2, and I rename. Right? Rename, F2.
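Here is a small sketch of that extract-variable move on the tennis condition. The function names and return values are hypothetical. I am also assuming the last clause excludes a total of six (the 3-3 case), as in the public Tennis refactoring kata; the transcript is ambiguous on that point.

```typescript
// Before: the condition is readable, but its intention is not named.
function phaseBefore(p1: number, p2: number): string {
  if (p1 < 4 && p2 < 4 && p1 + p2 !== 6) {
    return "early";
  }
  return "late";
}

// After: the condition was first copied into a variable with a placeholder
// name (applesauce), the occurrence was replaced, and only then was the
// variable renamed to what it really means.
function phaseAfter(p1: number, p2: number): string {
  const isEarlyGame = p1 < 4 && p2 < 4 && p1 + p2 !== 6;
  if (isEarlyGame) {
    return "early";
  }
  return "late";
}
```

The placeholder name never had to be good. It only had to exist so the extraction could happen before the meaning was clear.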

This condition was for the early game in a tennis game. So isEarlyGame is actually the name of the variable. F2, rename. You see? You practice all of these small shortcuts as you're going. You're expressing yourself through refactorings. Let's see a second use case. Second use case, this one, you know it, duplication. Actual duplication.

You see this code, for instance, we're increasing the quality of the item, and every time we're doing that, we're checking that we're not over 50. We could write this code differently with a Math.min, or a clamp, or whatever. But the problem at this point is that this logic is spread across the code.

So let's extract it into a function this time. So the first thing we do is we create the new code. We don't cut anything. Don't mess with the code. First, we create the code. Right? We take the time to name it properly, then we copy the body inside.

When we do that, we realize, oh, we actually need to pass an item here. Maybe we need to return something. Like, we massage the signature of the function. Once we're ready, we start replacing the occurrences. So this one can be replaced, this one too, etcetera, etcetera. I can stop at any time.

The code is not broken. This is important. Keep doing that until we have replaced all the occurrences, and we're done. Again, this is the same recipe. And, hopefully, because I'm hammering that message, you should get out of this presentation with this concept. First, create the new code. Don't mess up with the thing.

Once you're ready, replace occurrences one by one. You can stop at any time. At the end, eventually, if you reach that point, you can delete the old stuff. Create new stuff, move to the new stuff, remove the old stuff. This is how we deal with legacy code in general, not just for refactoring.
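The extract-function move on the duplicated quality check might look like the sketch below, caught halfway through. The Item shape, the updateBackstagePass function and the sellIn thresholds are assumptions for the example; only the "not over 50" check comes from the talk.

```typescript
// Hypothetical item shape, for illustration only.
interface Item {
  name: string;
  sellIn: number;
  quality: number;
}

// Step 1: create the new code. Nothing is cut from the existing code yet.
function increaseQuality(item: Item): void {
  if (item.quality < 50) {
    item.quality = item.quality + 1;
  }
}

// Step 2, in progress: occurrences are replaced one by one.
function updateBackstagePass(item: Item): void {
  // Replaced occurrence.
  increaseQuality(item);

  if (item.sellIn < 11) {
    // Replaced occurrence.
    increaseQuality(item);
  }

  if (item.sellIn < 6) {
    // Not replaced yet: the old duplicated logic still lives here,
    // and the code keeps working if we stop now.
    if (item.quality < 50) {
      item.quality = item.quality + 1;
    }
  }
}
```

Once the last occurrence is replaced, there is nothing old left to delete except the duplication itself.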

Extracting is, again, a basic refactoring. It should be automated. The first three refactoring moves I've shown you should be automated. Use your shortcuts. This is actually handled nicely today in VS Code. But with Abracadabra, you can just have your cursor wherever you are, hit Control+Alt+V, and extract.

It even handles all the occurrences if you have many. Okay. We have seen inline, rename, extract. Now we're gonna use these three for a more involved refactoring move that is very useful too. It's called change signature. So let's look at this addReservation function.

What if it is used in many places and we want to update its signature? For example, we may want to add more arguments, and we don't want to have three, four, five. So instead, we want to convert to using an object. Right? A single params object.

But this addReservation is spread across the codebase. How do we do that while keeping the code in a working state? Let me show you. First, we apply extract function on the body of this function. So we create a new function that we name addReservation__new, whatever.

It doesn't matter as long as you give it a unique name that is different. And now we have a single proxy. So this new function with the body is only used by that one old function. So now we have only one place to adapt. Right?

We can add more parameters, or we could transform it to an object and update this proxy to call the correct thing. And it will be used by clients. Once we have done that, we replace all the occurrences of addReservation with addReservation__new. And maybe we can stop and pause and think, oh, this case is different.

Eventually, because we can stop at any time, we keep doing that, and then we reach the end where the old function is not used anymore, and we can remove it. And instead, we only use the new one, which we can rename. So extract, replace the occurrences, inline, and finally, you delete and you rename.
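As a rough sketch, the change-signature steps could look like this in the middle of the migration. The Reservation shape, the parameter names and the in-memory reservations array are hypothetical; only the addReservation and addReservation__new names come from the talk.

```typescript
// Hypothetical domain, for illustration only.
interface Reservation {
  customer: string;
  date: string;
}
const reservations: Reservation[] = [];

// The single params object we want callers to use from now on.
interface AddReservationParams {
  customer: string;
  date: string;
}

// Step 1: extract function on the body, under a unique temporary name,
// then adapt its signature. There is only one caller to update: the proxy.
function addReservation__new({ customer, date }: AddReservationParams): Reservation {
  const reservation: Reservation = { customer, date };
  reservations.push(reservation);
  return reservation;
}

// The old function is now a thin proxy with the old signature.
function addReservation(customer: string, date: string): Reservation {
  return addReservation__new({ customer, date });
}

// Step 2: replace occurrences one by one. Both kinds of call sites work.
addReservation("Ada", "2025-01-10"); // not migrated yet
addReservation__new({ customer: "Grace", date: "2025-01-11" }); // migrated

// Step 3 (later): when nobody calls the old function, delete it and
// rename addReservation__new to addReservation.
```

This intermediate state has two functions instead of one, so it is temporarily more complex than where we started.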

So extract, inline, rename, use these combos to do more advanced refactoring. Change signature is following the same recipe overall, and it brings me to something you need to realize. If you stop in the middle, you can. That's a good thing. But things may be more complex than they used to be before the refactoring. It is normal.

It's part of the process. This is because you can stop at any time, which is a very powerful advantage. However, if you go through, eventually, the complexity will drop. This is very common.

It blocks a lot of people from starting any large-scale refactoring because they're afraid that in the middle, it's much more complex, but you should go through.

Change signature is not always automated. Although with Abracadabra, I do have it. WebStorm, you have it. You're great. But, yeah, if it's not automated with your editor or if it doesn't work, just do the manual steps. We have seen four refactorings, and I have five to ten minutes to show you the remaining two.

These two remaining refactorings, I've picked because they are the most useful moves that you can do following a recipe, so safer than doing YOLO, that you will have to use when you're dealing with code that is not tested, legacy code. However, they will probably not be automated. Right?

So now you are equipped with inline, extract, rename, change signature. Let's dive into distill and extract. Okay. Now we're showing code that looks more realistic. I had to simplify it a little bit so it fits on the slide. But you can see this is an Express router. Right? It's a route. We have a request.

We have a response.

Why would people be reluctant to test this code? Well, the answer is things outside of your control. Things that are coming from the framework, from the outside, that are painful to replace. Basically, people will try to mock them. Some tools are helping you do that. Sure. You can have great tooling that is doing magic for you.

But in general, there is a simpler and better way, and you should not really mock what you don't own. But I'm stepping a bit aside here from what I want to tell you. Alright. So the request comes from the framework, and it's spread across the code.

So your logic, like all of these nested if-else with the calculations, that's what you want to test. I know this code: you will need at least a dozen tests to verify that all the scenarios are working as expected. But with this, you either need to address it by making the HTTP call every time.

And so, going through the framework, do you need to test the framework 12 times and have tests that will be slower? The second option is to mock the request, but you don't own it. It's gonna be annoying. And if you're upgrading, will your mock follow? It depends on your tooling.

Maybe for Express, you will have libraries that help you do that, but maybe you don't. I'm gonna show you a technique where you don't need that. Same goes with the response. So request, response, see how they're sprinkled across our business logic that we want to test. The idea here is to distill them.

So distillation is to separate these, request and response, that are outside of your control, into a thin layer, and you preserve pure business logic in a large layer. That's the goal. Then you can get your business logic and test it. So what do we do? Well, extract variable is your friend.

Here, I'm gonna only use the refactorings I showed you, and we're gonna use extract a lot. So extract variable: I go through all of the requests, and it turns out everything is coming from req.query. And as you can see, now the reference to the request that comes from Express is only at the top.

I destructure the variables that I use elsewhere, but the direct reference is only at the top. What do we do with the response? In this specific example, we could push it to the bottom, but it's not always the case. And I promised I would only use the refactorings that I showed you. So I'm gonna extract, right?

Extracting to pull up. So in this case, I extract a function in the scope of this controller because I need to have access to res. So I have a sendResponse function that forwards whatever arguments you pass it to res.json. And I replace all the occurrences. I move on with the extraction. Cool.

Now look at that. I have here pure business logic with, yeah, with a proxy, like sendResponse. It's a function. But you know what? I can pass that as a parameter to my function, and I don't have to pass the full response interface, in particular when you're doing TypeScript. So that's what I'm doing.

Another extraction: I take the whole body that we have distilled and we extract it. Distill and extract. Extract function, we give it a name. In this case, I'm gonna be honest. This is doing two things. I'm computing the cost, and I'm sending it as a response.

The name suggests that we should not end the refactoring here, but what is important is I can stop touching this code here, and now I can start writing some tests before I do more changes. Here, you have the pure business logic you can test.

There, you have the framework code that is necessary to make things happen, but you only need one test that verifies that when I call this endpoint, I do get the response. The logic was hard to test. We have distilled and extracted, and now the framework code is isolated. And I have pure logic I can put elsewhere.
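The end state of distill and extract could look roughly like the sketch below. To keep it runnable without Express installed, RequestLike and ResponseLike are minimal stand-ins and not the framework's real types. The pricing rules, the type and age query parameters, and the names computeAndSendCost and costHandler are all hypothetical.

```typescript
// Minimal stand-ins for the framework objects (assumptions, not Express types).
interface RequestLike {
  query: Record<string, string | undefined>;
}
interface ResponseLike {
  json(body: unknown): void;
}

// The only thing the business logic needs from the response.
type SendResponse = (body: unknown) => void;

// Large layer: business logic with no reference to the framework.
// The name admits it does two things, which hints at more refactoring later.
function computeAndSendCost(
  type: string | undefined,
  age: number | undefined,
  sendResponse: SendResponse
): void {
  if (age !== undefined && age < 6) {
    sendResponse({ cost: 0 });
    return;
  }
  const baseCost = type === "night" ? 19 : 35;
  if (age !== undefined && age > 64) {
    sendResponse({ cost: Math.ceil(baseCost * 0.75) });
  } else {
    sendResponse({ cost: baseCost });
  }
}

// Thin layer: the request is read once at the top, the response is wrapped
// in a small function, and everything else is delegated.
function costHandler(req: RequestLike, res: ResponseLike): void {
  const { type, age } = req.query;
  const sendResponse: SendResponse = (body) => res.json(body);
  computeAndSendCost(type, age === undefined ? undefined : Number(age), sendResponse);
}
```

A test of the logic then only needs a plain function that records what was sent, for example `computeAndSendCost("day", 70, (body) => sent.push(body))`, with no HTTP call and no mocked request.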

Distill and extract is very powerful. To be honest with you, that works most of the time. Sometimes you cannot use it. So that's where you will reach for extend and override. The approach is the same. It's just that we're reversing. Instead of extracting your business logic, we're gonna move away the framework code. So same stuff here.

What makes this code hard to test? Well, not really hard, but the console.log calls in the middle are annoying. They will pollute your test report, and some people will mock console.log, the global console.log. That works. But what if you actually need to do some console logging because you are debugging what you do?

Now you are doing stuff that's inconvenient because you're trying to do magical stuff with a hammer. You don't need that. We can be more surgical about this. So let's consider this. We take this console.log and we extract function on it. So we extract that; in this case, it's a class.

So it's gonna be a method; it could be a function. We extract this into a log method, local to that Game class. Right? It takes a message and it forwards it to console.log. Could have forwarded all the arguments, to be honest, but this works. Right?

Now that we have that, we have something that Michael Feathers, in his book, calls a seam. A seam is a way to modify the behavior of the code without changing the code itself. Consider this. We have isolated that annoying part in this log method. We can extend it.

We can create a TestableGame that extends the existing Game. And once we have that, we can modify the log method, and we can remove the annoying console.log. Right? We can even put in a custom implementation. So if I need to test whatever messages are logged because this is interesting behavior, I can do that.

I can do custom stuff. The truth is you won't stop here. Like, this is nice so you can start writing tests. This was the goal of this presentation, is to get you in a state where you can start easily writing tests. However, this calls for more refactoring.
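A minimal sketch of extend and override, assuming a very small Game class: addPlayer, howManyPlayers and the logged message are hypothetical, while the log method and the TestableGame name follow the talk.

```typescript
class Game {
  private players: string[] = [];

  addPlayer(name: string): void {
    this.players.push(name);
    this.log(`${name} was added`);
  }

  howManyPlayers(): number {
    return this.players.length;
  }

  // The seam: the console.log call was extracted into this method.
  // It is protected so that a subclass can override it.
  protected log(message: string): void {
    console.log(message);
  }
}

// Test-only subclass: same behavior as Game, except for the seam.
class TestableGame extends Game {
  readonly loggedMessages: string[] = [];

  // Custom implementation: record messages instead of printing them.
  protected log(message: string): void {
    this.loggedMessages.push(message);
  }
}
```

In a test you would instantiate TestableGame instead of Game. Nothing is printed, and loggedMessages can be asserted on if the messages themselves are interesting behavior.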

You probably need the concept of a logger, and we will move that behavior to the logger because this should not belong to Game. And eventually, you won't need the TestableGame anymore. You will just inject whatever logger that you're using. And in your test, it won't be logging stuff to the console. So that's it. You have it.
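One possible shape for that follow-up step, as a sketch under assumptions: the Logger interface, the consoleLogger default and the constructor injection are my own illustration of "inject whatever logger you're using", not code shown in the talk.

```typescript
// Hypothetical logger abstraction.
interface Logger {
  log(message: string): void;
}

// Production default: same behavior as before, it prints to the console.
const consoleLogger: Logger = {
  log: (message) => console.log(message),
};

class Game {
  private players: string[] = [];
  private logger: Logger;

  // The logger is injected, with a default so existing callers keep working.
  constructor(logger: Logger = consoleLogger) {
    this.logger = logger;
  }

  addPlayer(name: string): void {
    this.players.push(name);
    this.logger.log(`${name} was added`);
  }
}

// In a test: inject a logger that records messages instead of printing them.
const recordedMessages: string[] = [];
const recordingLogger: Logger = {
  log: (message) => {
    recordedMessages.push(message);
  },
};
const game = new Game(recordingLogger);
game.addPlayer("Chet");
```

With this in place the test-only subclass is no longer needed, because the behavior to replace is passed in rather than overridden.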

Six refactorings for legacy code. I know that was a lot of content, but hopefully you get the idea that refactoring is a process you can do as you go. Unless you're doing very large-scale refactoring. But for a lot of things like renaming, you can do them as you're implementing features, as you're fixing bugs.

You should not stop and create a ticket for it and ask permission for refactoring, etcetera. Do them as you go. But you can only do that if you learn to change the code while keeping it in a working state. So you have to go to that intermediate state that looks awesome. Right?

Where you're on two wheels with the old implementation and the new one going on, and carry that work until the end so you can move back and go faster. But you do that as you go. Do that. Learn these shortcuts. Learn automated refactorings, what your editor can do for you.

I mean, if renaming a variable is just a keystroke away, you will do that much more easily than if you have to do it manually, to be honest. That is the same for extract, inline. They are the bread and butter of refactoring moves.

If you want to go further, here are some resources I recommend you have a look at. Of course, Martin Fowler's book. Refactoring Guru: it's a nice website with a lot of examples and a much bigger catalog of refactoring moves. I also wrote a book myself with more techniques to deal with legacy codebases.

I have that VS Code extension. If you're using VS Code and coding in JavaScript, have a look. That's it. Here are the links to the slides. Thank you very much for having me, for giving me your time. I hope you have a great conference. I hope you learned a ton of things.

And if you want to keep in touch, the best ways to reach me are either on LinkedIn or on Bluesky. Thank you very much, and have a great day.
