How to Add Tests to Legacy Code Without Breaking Things

Adding tests to legacy code that was not designed for testability is one of the most valuable and most difficult things a development team can do. The key is to avoid rewriting the code first. Instead, write characterization tests that verify the code's current behavior, then use those tests as a safety net for any future changes. AI tools can accelerate this process by analyzing code paths and generating test cases automatically.

Why Legacy Code Is Hard to Test

Legacy code typically has characteristics that make testing difficult. Functions do too many things at once, mixing business logic with database calls, HTTP requests, and file system access. Global state creates hidden dependencies between functions. Classes are tightly coupled so that testing one component requires setting up half the application.

The instinct when facing untestable code is to refactor it first to make it testable. This is the wrong order. Refactoring without tests is dangerous because you have no way to verify that your refactoring preserved the original behavior. The correct order is: write tests for the current behavior first, then refactor with confidence that the tests will catch any regressions.

Characterization Tests: Your Starting Point

A characterization test does not assert what the code should do. It asserts what the code actually does right now. You call a function with specific inputs, observe the output, and write a test that expects exactly that output. If the function has a bug that produces wrong output for certain inputs, the characterization test still asserts the wrong output, because the goal is to detect changes, not to verify correctness.

This might feel backwards, but it serves a critical purpose. Once you have characterization tests covering the main code paths, you can refactor the code safely. If a refactoring accidentally changes behavior, a characterization test fails, and the failing assertion shows which input now produces different output. You can then decide whether the behavioral change was intentional or a regression.
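As a minimal sketch of the idea, the example below pins the behavior of a small function using Python's standard unittest module. The function legacy_shipping_cost and its pricing rules are hypothetical stand-ins for real legacy code; the expected values were obtained by running the function, not by consulting a specification.

```python
import unittest


def legacy_shipping_cost(weight_kg, country):
    """Hypothetical legacy function; stands in for code you did not write."""
    cost = 5.0 + 1.5 * weight_kg
    if country != "US":
        cost = cost * 2
    # Quirk: the flat discount makes a 21 kg parcel cheaper than a 20 kg one.
    # It looks like a bug, but a characterization test records it anyway.
    if weight_kg > 20:
        cost = cost - 10
    return round(cost, 2)


class ShippingCostCharacterization(unittest.TestCase):
    """Each expected value is whatever the function returned when observed."""

    def test_domestic_light_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, "US"), 8.0)

    def test_international_light_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, "DE"), 16.0)

    def test_heavy_parcel_quirk_is_pinned(self):
        # Asserts the suspicious output on purpose: the goal is to detect
        # change, not to judge correctness.
        self.assertEqual(legacy_shipping_cost(20, "US"), 35.0)
        self.assertEqual(legacy_shipping_cost(21, "US"), 26.5)
```

Saved in a test module, these cases can be run with python -m unittest. If a later refactoring "fixes" the heavy-parcel quirk, the third test fails and forces an explicit decision about whether that change is wanted.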

Finding the Right Entry Points

In legacy code, not every function is equally worth testing. Focus on functions that meet one or more of these criteria:

Frequently modified: Code that changes often is the most likely to break. Tests here pay off fastest because the next change, and the next chance for a regression, is never far away.
Business-critical: Functions that handle payments, authentication, or data integrity should be tested first because failures in these areas have the highest impact.
Already causing bugs: If a function has a history of bug reports, it needs tests to prevent recurrence.
About to be modified: Before making any change to legacy code, write tests for the current behavior. Coverage added at this point pays off immediately, because it guards the very change you are about to make.

Techniques for Testing Untestable Code

Extract and Override

When a function mixes business logic with external calls, extract the external calls into separate methods that can be overridden in a test subclass. This lets you test the business logic without setting up databases, HTTP servers, or file systems.
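A rough sketch of extract and override follows. The class InvoiceProcessor, its methods, and the discount rule are hypothetical; the sketch assumes the database lookup has already been pulled out of total_due into its own method, which is the only change made to the production class.

```python
class InvoiceProcessor:
    """Hypothetical legacy class after the external call was extracted."""

    def total_due(self, customer_id, amount):
        # Business logic under test: apply the customer's discount rate.
        rate = self.fetch_discount_rate(customer_id)
        return round(amount * (1 - rate), 2)

    def fetch_discount_rate(self, customer_id):
        # Extracted external call. In production this would query a database;
        # here it raises so the sketch never touches real infrastructure.
        raise RuntimeError("database not available in this sketch")


class TestableInvoiceProcessor(InvoiceProcessor):
    """Test subclass that overrides only the extracted external call."""

    def __init__(self, rate):
        self._rate = rate

    def fetch_discount_rate(self, customer_id):
        # Return a canned value instead of reaching for the database.
        return self._rate


processor = TestableInvoiceProcessor(rate=0.25)
assert processor.total_due(customer_id=42, amount=200.0) == 150.0
```

The override replaces only the method that talks to the outside world, so the arithmetic in total_due runs exactly as it does in production.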

Seam Insertion

A seam is a point in the code where you can change what the program does without editing the code at that point. Constructor injection, method parameters, and configuration values are all seams. Identify existing seams in the legacy code and use them to substitute test doubles for production dependencies.
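The sketch below shows a test using a seam that already exists. SubscriptionChecker and its clock parameter are hypothetical; the assumption is that the legacy constructor already accepts the dependency with a production default, so the test needs no change to the class.

```python
import datetime


class SubscriptionChecker:
    """Hypothetical legacy class; the `clock` parameter is the existing seam."""

    def __init__(self, clock=datetime.datetime.now):
        # Production callers rely on the default and get the real clock.
        self._clock = clock

    def is_expired(self, expires_at):
        return self._clock() >= expires_at


def fixed_clock():
    # Test double: always reports the same instant, so results are repeatable.
    return datetime.datetime(2024, 1, 15, 12, 0, 0)


checker = SubscriptionChecker(clock=fixed_clock)
assert checker.is_expired(datetime.datetime(2024, 1, 1)) is True
assert checker.is_expired(datetime.datetime(2024, 2, 1)) is False
```

Because the substitution happens at the constructor call, the body of is_expired is exercised unmodified.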

Approval Testing

For functions that produce complex output like HTML, reports, or data structures, approval testing captures the entire output and compares it against a saved reference. If the output changes, the test fails. This is especially useful for legacy code where you are not sure exactly what the correct output looks like but you know the current output is what users are seeing.
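One way to sketch approval testing by hand, using only the standard library, is shown below. The function render_report, the helper verify, and the .approved.txt and .received.txt file names are all assumptions made for illustration; dedicated approval-testing libraries exist and handle diffing and review workflows more thoroughly.

```python
from pathlib import Path


def render_report(orders):
    """Hypothetical legacy function that produces multi-line text output."""
    lines = ["ORDER REPORT", "============"]
    for name, qty in orders:
        lines.append(f"{name:<10}{qty:>5}")
    total = sum(qty for _, qty in orders)
    lines.append(f"{'TOTAL':<10}{total:>5}")
    return "\n".join(lines) + "\n"


def verify(name, received, directory):
    """Return True when `received` matches the saved reference for `name`.

    On a mismatch, or when no reference exists yet, the received output is
    written next to where the reference lives so a person can review it and,
    if it is acceptable, rename it to become the approved file.
    """
    directory = Path(directory)
    approved = directory / f"{name}.approved.txt"
    if approved.exists() and approved.read_text() == received:
        return True
    (directory / f"{name}.received.txt").write_text(received)
    return False
```

On the first run verify returns False and leaves a received file behind. After someone reviews that file and renames it to the approved name, later runs pass until the output changes.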

How AI Accelerates Legacy Testing

An AI agent can analyze a legacy function, trace its code paths, identify the inputs needed to exercise each path, and generate test cases automatically. This is particularly valuable for complex functions where manually determining all the code paths would take hours of careful reading.

The AI-generated tests serve as a starting point. A developer should review them to verify they make sense and add any domain-specific assertions the AI might have missed. But having a broad first set of characterization tests generated in minutes rather than days dramatically lowers the barrier to getting legacy code under test coverage.

Get your legacy code under test coverage without the risk of rewriting it first. See how an AI development team generates tests that protect your existing behavior.
