Refactoring Legacy Code with AI: A Bumpy Journey

Background: That Headache-Inducing Legacy Project
To be honest, it's a bit embarrassing. Our company has this project called "Phoenix" that went live in 2015, still using the old Spring 3.x + JSP stack. Over the years, business requirements kept piling up, and the code became increasingly messy - now it's a standard "legacy nightmare."
Last month, the boss suddenly said we needed to add a new feature involving user permissions and order flow modifications. I took a look at the code... wow, one Controller method with 800+ lines, SQL written directly in JSP files, not a single comment. Following normal procedures, it would take at least 3 months to get it done.
Since AI tools have been pretty hot lately, I thought I'd try using AI to help with the refactoring. After all, we were screwed either way, might as well give it a shot.
Week 1: Having AI Read Code - Epic Fail
Initially, I thought it would be simple - just throw the code at Claude and have it analyze everything. Turns out there were several problems:
- Too much code - The entire project had 200k+ lines, Claude couldn't handle it and kept hitting token limits
- Lost context - When analyzing in segments, AI often misunderstood business logic
- Mixed JSP code - Java code mixed with HTML confused the AI frequently
Later, a colleague recommended Cursor, saying it was specifically optimized for codebases. Tried it and it was indeed better - at least it could understand the project structure.
But there were still issues. For example, this piece of code:
// UserController.java line 156
if (user.getType().equals("1") && order.getStatus() == 2 &&
        (System.currentTimeMillis() - order.getCreateTime().getTime()) > 86400000) {
    // A bunch of business logic...
}
I asked the AI what this meant, and it said "Special handling for VIP users 24 hours after order creation." Sounds reasonable, but actually "1" represents regular users here, and "2" is paid status. The AI got it completely backwards.
Lesson 1: AI understands code logic fine, but human oversight is still needed for business meaning.
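One thing that helped afterwards was giving those magic values names before asking AI (or a new teammate) to read the code. Here's a minimal sketch of the idea - `OrderRules`, the constant names, and the method are illustrative, not the project's real definitions:

```java
// Sketch: naming the magic values from the condition above so the intent
// ("regular user, paid, more than 24h old") is explicit in the code itself.
// Names and thresholds are illustrative, not the real Phoenix definitions.
public class OrderRules {
    static final String TYPE_REGULAR_USER = "1";  // "1" = regular user, NOT VIP
    static final int STATUS_PAID = 2;             // 2 = paid, NOT shipped
    static final long ONE_DAY_MILLIS = 24L * 60 * 60 * 1000; // the 86400000

    static boolean needsFollowUp(String userType, int orderStatus,
                                 long orderCreateMillis, long nowMillis) {
        return TYPE_REGULAR_USER.equals(userType)
                && orderStatus == STATUS_PAID
                && (nowMillis - orderCreateMillis) > ONE_DAY_MILLIS;
    }
}
```

Once the constants carried the business meaning, the AI stopped guessing at what "1" and "2" stood for.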
Week 2: Small Steps, Divide and Conquer
After the first week's setbacks, I changed strategy. Instead of having AI analyze the entire project at once:
- Map core processes first - Manually drew out main business flow diagrams
- Process by modules - Only had AI handle one module at a time
- Write test cases - Had AI help write unit tests to ensure refactoring didn't break anything
This strategy worked much better. For example, refactoring the user login module:
// Original code (simplified)
public String login(HttpServletRequest request) {
    String username = request.getParameter("username");
    String password = request.getParameter("password");
    // 200 lines of validation logic...
    // Direct session and database operations...
    return "success";
}
I had AI refactor it to:
@PostMapping("/login")
public ResponseEntity<LoginResponse> login(@RequestBody LoginRequest request) {
    try {
        User user = userService.authenticate(request.getUsername(), request.getPassword());
        String token = jwtService.generateToken(user);
        return ResponseEntity.ok(new LoginResponse(token, user.getId()));
    } catch (AuthenticationException e) {
        return ResponseEntity.status(401).body(new LoginResponse(null, null));
    }
}
Taking it step by step worked well. But new problems emerged:
Inconsistent AI code style - sometimes it used Optional, sometimes it returned null directly; sometimes it threw exceptions, sometimes it returned error codes. It took constant coaching to keep everything standardized.
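What eventually worked was writing the convention down once and pasting it into every prompt. A minimal sketch of the convention we converged on - `UserLookup` and the method names are illustrative, not project code:

```java
import java.util.Map;
import java.util.Optional;

// Sketch of the style convention we fed back to the AI:
// "might be absent" returns Optional (never null);
// "must exist" throws (never returns null or an error code).
public class UserLookup {
    private final Map<Long, String> usersById;

    public UserLookup(Map<Long, String> usersById) {
        this.usersById = usersById;
    }

    // Optional models the legitimately-absent case.
    public Optional<String> findUsername(Long id) {
        return Optional.ofNullable(usersById.get(id));
    }

    // A missing "must exist" user is a programming error: throw.
    public String getUsername(Long id) {
        return findUsername(id)
                .orElseThrow(() -> new IllegalArgumentException("no user " + id));
    }
}
```

With the rule stated explicitly, the AI's output stopped flip-flopping between the two styles.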
Week 3: Pitfalls and Gains Coexist
By the third week, I'd basically figured out the AI's temperament. Here are some insights:
What AI does well:
- Code formatting and refactoring - Breaking down long methods, extracting common logic - AI excels at this
- Generating test cases - Give it a method signature, AI can generate tests covering various edge cases
- Code comments - AI writes more detailed comments than we do ourselves
What AI doesn't do well:
- Complex business decisions - AI often misunderstands business rules
- Performance optimization - AI tends to write "correct" code, but not necessarily "efficient" code
- Architecture design - Major architectural decisions still need human input
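To make the test-generation point concrete: given nothing but a method signature, the AI would reliably enumerate the null / empty / malformed-input cases. A hypothetical example - `PhoneUtil.normalizePhone` is not project code, and plain asserts stand in for the JUnit tests we actually used:

```java
public class PhoneUtil {
    // Hypothetical method handed to the AI for test generation.
    static String normalizePhone(String raw) {
        if (raw == null) return "";
        return raw.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        // The kind of edge cases the AI enumerated from the signature alone:
        assert normalizePhone(null).equals("");
        assert normalizePhone("").equals("");
        assert normalizePhone("138-0013-8000").equals("13800138000");
    }
}
```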
A Specific Example
There was an order status update method that originally looked like this:
public void updateOrderStatus(Long orderId, Integer status) {
    Order order = orderDao.findById(orderId);
    if (status == 1) {
        // Pending payment logic
    } else if (status == 2) {
        // Paid logic
        // Send SMS
        // Update inventory
        // Log records
    } else if (status == 3) {
        // Shipped logic
    }
    // ... 7-8 more statuses
}
I had AI refactor it, and it provided a state pattern implementation. The code structure was indeed much cleaner. But there was a problem: the original code sent SMS when status was 2, but AI's refactored version lost this logic.
When I asked AI why, it said "SMS sending shouldn't be handled in order status updates, should use event-driven approach." Technically correct, but reality is we don't have message queues, and SMS service is synchronous.
Lesson 2: AI often provides "ideal" solutions, but real projects have various constraints that humans need to balance.
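What we ended up shipping was a compromise: keep the state-pattern structure, but keep each status's side effects (including the synchronous SMS call) inside its handler, since we have no message queue. A minimal sketch under those constraints - `SmsSender` and the handler set are illustrative, not the real project code:

```java
import java.util.Map;

// Sketch: state pattern where each status handler owns its side effects,
// so the synchronous SMS call from the old "status == 2" branch is not lost.
// SmsSender and the handlers shown are illustrative, not project code.
interface SmsSender { void send(long orderId, String message); }

interface OrderStatusHandler { void apply(long orderId); }

public class OrderStatusMachine {
    private final Map<Integer, OrderStatusHandler> handlers;

    public OrderStatusMachine(SmsSender sms) {
        this.handlers = Map.<Integer, OrderStatusHandler>of(
            1, orderId -> { /* pending-payment logic */ },
            2, orderId -> {
                // Paid logic: update inventory, write logs...
                sms.send(orderId, "Payment received"); // the SMS stays here
            },
            3, orderId -> { /* shipped logic */ }
        );
    }

    public void updateOrderStatus(long orderId, int status) {
        OrderStatusHandler handler = handlers.get(status);
        if (handler == null) {
            throw new IllegalArgumentException("unknown status " + status);
        }
        handler.apply(orderId);
    }
}
```

The structure is as clean as the AI's version, but the constraint (synchronous SMS, no queue) is respected - exactly the kind of trade-off a human has to impose.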
Final Results: Wins and Losses
After three weeks of struggle, the final results were:
Successful aspects:
- Dramatically improved code readability - Original 800-line method split into 10+ smaller methods
- Test coverage improved from 0% to 70% - AI-generated test cases were decent quality
- Fixed several potential bugs - AI discovered some edge case issues during refactoring
Less ideal aspects:
- Time cost - While AI writes code fast, time spent coaching AI, validating results, and fixing issues was substantial
- Slight performance degradation - Refactored code was more standardized but performance dropped in some areas
- Team learning curve - Other colleagues needed time to adapt to the new code structure
Data comparison:
- Before refactoring: Core API average response time 650ms, 3-5 production bugs per week
- After refactoring: Core API average response time 720ms, 1-2 production bugs per week
Not as dramatically improved as imagined, but definitely better.
Some Reflections
This experience gave me a more realistic understanding of AI-assisted development:
- AI is a tool, not a silver bullet - It can improve efficiency but can't solve all problems
- Human-AI collaboration is key - AI handles execution, humans handle decisions and oversight
- Gradual improvement is more reliable - Trying to do everything at once leads to indigestion
For those wanting to try AI refactoring, my advice is:
- Start with small modules, don't try to refactor entire systems right away
- Definitely write test cases - they're your safety net
- Spend time understanding AI output, don't blindly trust it
- Keep realistic expectations - AI can help but isn't omnipotent
Overall, this attempt was worthwhile. While it wasn't as dramatic as "3 days for 3 months of work," it did improve our code quality and gave the team deeper understanding of AI tools.
Next time I encounter a similar project, I'll be more confident using AI to assist development.