Using AI to Modernize Our .NET Legacy Codebase

Introduction

We’re facing a common problem seen by many companies that have been around as long as we have: an aging codebase deeply rooted in legacy technologies that is difficult to port to anything more modern. In our case, it’s a large monolith solution with projects primarily targeting .NET Framework 4.8. It’s been working well for us, but it poses numerous difficulties. First, it requires Windows to develop and run, and while there are workarounds for developing on a Mac and potentially deploying to Windows containers, these add friction and are more difficult to maintain. Second, while .NET Framework 4.8 will be officially supported for the foreseeable future, it’s obviously a dead-end technology. There have been numerous advances made in .NET that we are simply not able to take advantage of. Finally, we’re living in an AI world now, and while we’ve been able to use AI successfully with .NET Framework 4.8, as you’ll soon see, it’s not as good an experience as it could be with modern .NET. This blog post will focus on my efforts using AI to aid in this modernization project.

Traditional manual labor

The modernization effort had started long before I got involved, but it primarily involved trying to carve pieces out of the monolith that could be run elsewhere rather than upgrading the monolith itself. This is certainly a valid approach, but I felt there was too much momentum behind the monolith to rely on that strategy alone. My early work involved purely mechanical changes that were relatively small in scope: converting the csproj files to the new SDK format, converting some of our smaller core libraries to .NET Standard 2.0, and trying to consolidate dependencies. This was very time-consuming work, and while it was progress that needed to be made, it didn’t provide noticeable value to most of the company. In fact, it added development friction and numerous dependency issues that were difficult to track down. I felt like Sisyphus pushing the rock up the hill.
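To make the csproj work concrete, here’s a minimal sketch of what a converted project file looks like in the new SDK format (the package reference is purely illustrative):

```xml
<!-- Sketch of an SDK-style project file after conversion;
     the package reference below is illustrative. -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- netstandard2.0 keeps the library consumable from both
         .NET Framework 4.8 and modern .NET. -->
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
  </ItemGroup>
</Project>
```

The payoff is less about line count and more about tooling: SDK-style projects are what modern .NET, NuGet, and most AI tooling expect to work with.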

Then I focused on upgrading our core Entity Framework project from EF 5 to EF 6.5.1 and EF Core 9, the largest upgrade I’d attempted to date. Like before, this was very manual work. I hand-wrote a large number of tests, modified T4 templates, and built out new context classes. These changes rippled throughout the codebase, requiring every existing call site to change over to the new interfaces while I hoped nothing broke. We were getting close to deploying the changes when priorities shifted and I got moved to a different project.
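To give a flavor of that work, here’s a hypothetical sketch of the pattern (the model and context names are made up, not our actual code): call sites depend on an interface so that either context can sit behind it during the transition.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical model, for illustration only.
public class Order
{
    public int Id { get; set; }
    public bool IsActive { get; set; }
}

// Call sites depend on this interface, so the EF 6 context and the
// EF Core 9 context can be swapped behind it during the transition.
public interface IOrdersContext : IDisposable
{
    IQueryable<Order> Orders { get; }
    Task<int> SaveChangesAsync(CancellationToken cancellationToken = default);
}

// The EF Core 9 side; an EF 6 twin implements the same interface.
// (Connection configuration elided.)
public sealed class OrdersCoreContext : DbContext, IOrdersContext
{
    public DbSet<Order> Orders => Set<Order>();
    IQueryable<Order> IOrdersContext.Orders => Orders;
}
```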

Early AI experiments

This turned out to be beneficial, as the move happened right as AI coding assistants were taking off. I was firmly in the skeptic camp at that point, but we had a clear mandate to use the technology as best we could. Looking back, I think I was setting the tool up for failure. I would periodically return to the modernization effort by telling Claude to “Update this project to .NET 8” and would get terrible results, as you would expect! Over time, however, I got better at using Claude Code and started to believe that there was something to it. I again tried returning to the modernization effort but still got similar results. In fact, I tried Microsoft’s .NET Upgrade Assistant, and while it did useful work for some of our projects, it failed miserably for others. This didn’t help my confidence in our ability to leverage AI for this process.

But then I really tried to rein in my expectations and asked Claude to just write some tests covering common query patterns for some of our most-used models. It did really well. Keeping the scope small let Claude do a much better job than it otherwise would have, with far less disappointment along the way.
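One plausible shape for such a test, with hypothetical names and a made-up TestContexts helper, pins down a single query pattern so the old and new stacks can be compared:

```csharp
using System.Linq;
using Xunit;

public class OrderQueryPatternTests
{
    // TestContexts is a hypothetical helper that opens both the EF 6
    // and EF Core contexts against the same test database.
    [Fact]
    public void ActiveOrders_SameResults_OnOldAndNewContexts()
    {
        using var ef6 = TestContexts.CreateEf6();
        using var efCore = TestContexts.CreateEfCore();

        var expected = ef6.Orders
            .Where(o => o.IsActive)
            .OrderBy(o => o.Id)
            .Select(o => o.Id)
            .ToList();

        var actual = efCore.Orders
            .Where(o => o.IsActive)
            .OrderBy(o => o.Id)
            .Select(o => o.Id)
            .ToList();

        Assert.Equal(expected, actual);
    }
}
```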

Building confidence… a little prematurely

Months had passed, but I was eventually able to return to the Entity Framework upgrade with the goal of actually getting it out the door. With the new tests written, I started asking Claude to investigate various code paths to see if it could detect any areas of concern. It did find things here and there, and I would have it go into a small TDD cycle: write a test to prove (or disprove) the issue, then try to fix it. It excelled at this, and I followed that pattern more and more, especially as I started doing real tests in our dev environment. When we actually deployed this large scope of work, we saw zero issues caused by the upgrade. I was proud of the work and very confident in what Claude could do for us.

So naturally, I fell back into the “Upgrade the next project to .NET 8” pattern again! Well, not quite that bad, but my confidence had me biting off larger chunks and attempting big swaths of upgrades at once. Luckily, the failures here were only my own and never even reached a PR stage. I was trying to build on the success of the EF deployment but was being dragged down by all these false starts.

The new SOAP client

Because of what we do, we have a large library of connectors and clients for third-party services, and many of these rely on WCF for SOAP communication. Our business layer depends on this project, so to really start digging into that business layer, this proxy library would have to be fully upgraded. After some discussion with my co-workers, we settled on an approach where we would create our own SOAP client that just used HttpClient and the basic XmlSerializer rather than the WCF NuGet packages. SOAP services generally expose WSDL files that define the interface and models, so I wanted to build a tool that would use the WSDL file to generate an HttpClient-based client. This seemed like the perfect job for an AI agent because the input was extremely well defined and we could compare the output with our existing WCF clients. It worked, and after a number of manual iterations with the tool, we got it to the point where it felt ready. I used Claude to generate the tool, write tests comparing the new client with the existing WCF client, and build an adapter layer that would let us use a feature flag to slowly transition customers.
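Here’s a rough sketch of the kind of client the tool generates; the service name, XML namespace, and endpoint path are illustrative, not our actual contract:

```csharp
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Xml;
using System.Xml.Serialization;

// Illustrative response model; the real ones come from the WSDL.
[XmlRoot("GetRatesResponse", Namespace = "urn:example")]
public class GetRatesResponse
{
    [XmlElement("TotalCharge")]
    public decimal TotalCharge { get; set; }
}

public sealed class RatesClient
{
    private static readonly XmlSerializer Serializer = new(typeof(GetRatesResponse));
    private readonly HttpClient _http;

    public RatesClient(HttpClient http) => _http = http;

    public async Task<GetRatesResponse> GetRatesAsync(string envelopeXml)
    {
        // SOAP 1.1: POST the envelope as text/xml with a SOAPAction header.
        var request = new HttpRequestMessage(HttpMethod.Post, "soap/rates")
        {
            Content = new StringContent(envelopeXml, Encoding.UTF8, "text/xml")
        };
        request.Headers.Add("SOAPAction", "\"urn:example/GetRates\"");

        using var response = await _http.SendAsync(request);
        response.EnsureSuccessStatusCode();

        // Advance past the SOAP envelope to the body payload, then
        // hand the reader to XmlSerializer.
        using var stream = await response.Content.ReadAsStreamAsync();
        using var reader = XmlReader.Create(stream);
        reader.ReadToFollowing("Body", "http://schemas.xmlsoap.org/soap/envelope/");
        reader.ReadStartElement();
        reader.MoveToContent();
        return (GetRatesResponse)Serializer.Deserialize(reader)!;
    }
}
```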

This process took longer than I would have hoped, but I had a pretty strong requirement that we generate exactly the same request body as WCF, because at least one of the services we interact with is extremely picky about element ordering. After spending some time in the dev environment, we finally pushed the new client to production and started moving customers.
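XmlSerializer makes that requirement tractable because element order can be pinned explicitly. A hedged sketch with made-up fields, where the Order values mirror the <xsd:sequence> in the WSDL:

```csharp
using System.Xml.Serialization;

// Explicit Order values tie serialization to the WSDL's schema
// sequence rather than member declaration order, so the request
// body matches what WCF produced. Field names are illustrative.
[XmlRoot("GetRatesRequest", Namespace = "urn:example")]
public class GetRatesRequest
{
    [XmlElement(Order = 0)]
    public string AccountId { get; set; } = "";

    [XmlElement(Order = 1)]
    public string OriginZip { get; set; } = "";

    [XmlElement(Order = 2)]
    public string DestinationZip { get; set; } = "";
}
```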

Slipping through the cracks

Eventually, we started noticing issues with the new client. This wasn’t unexpected; the surface of this service is very large, and it’s very difficult to test every endpoint and permutation of data against a live server. The feature flag allowed us to quickly turn the new client off and assess the issue without excessive pressure. That usually meant having Claude Code look at the codebase and the error, then come up with suggestions on what might be wrong. There were some red herrings, but overall this produced good results.
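The quick rollback comes from the adapter layer mentioned earlier. In rough, hypothetical form (the interface, method, and flag service below are illustrative), it routes each call to whichever client the flag selects for a customer:

```csharp
using System.Threading.Tasks;

// Hypothetical flag service; in reality this would be whatever
// feature-flag system the codebase already uses.
public interface IFeatureFlags
{
    bool IsEnabled(string flag, string customerId);
}

public interface ICarrierGateway
{
    Task<decimal> GetRateAsync(string customerId, string originZip, string destZip);
}

// One interface, two implementations: flip the flag off and traffic
// instantly falls back to the battle-tested WCF client.
public sealed class CarrierGatewayAdapter : ICarrierGateway
{
    private readonly ICarrierGateway _wcfClient;
    private readonly ICarrierGateway _httpClient;
    private readonly IFeatureFlags _flags;

    public CarrierGatewayAdapter(
        ICarrierGateway wcfClient, ICarrierGateway httpClient, IFeatureFlags flags)
        => (_wcfClient, _httpClient, _flags) = (wcfClient, httpClient, flags);

    public Task<decimal> GetRateAsync(string customerId, string originZip, string destZip)
        => _flags.IsEnabled("new-soap-client", customerId)
            ? _httpClient.GetRateAsync(customerId, originZip, destZip)
            : _wcfClient.GetRateAsync(customerId, originZip, destZip);
}
```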

I was frustrated knowing that my changes were getting in the way of customers getting their jobs done, but I still felt good because we were making progress. It felt like I was polishing a product as opposed to grinding away on something that might never be released.

The turning point

These issues helped me re-evaluate how I was approaching them with Claude. During this time, I came across an article (that I can’t seem to find anymore!) about how Anthropic developers actually use Claude Code. It mentioned mocking out external services whose absence would otherwise prevent Claude from running an iterative loop of evaluate, implement, and test. I had Claude build a simple mock endpoint that I could use along with tools like WireMock.Net, giving Claude full control over the system under test. I could then give it the issue we were seeing, tell it how to mock the service the client worked against, and tell it to iterate until it had a solution that it could verify as fixed. This usually involved writing a test that it controlled, so that not only could the agent use it to iterate, but we could use it for regressions going forward.
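In practice, the setup looked something like the following WireMock.Net sketch (the path and canned response are illustrative, and RatesClient is the hypothetical client from earlier). Once the fake server is up, the agent can change code, run the test, and inspect the result without ever touching a live service:

```csharp
using System;
using System.Net.Http;
using WireMock.RequestBuilders;
using WireMock.ResponseBuilders;
using WireMock.Server;

// Stand up a local fake of the SOAP service so the agent owns the
// whole evaluate/implement/test loop.
var server = WireMockServer.Start();

server
    .Given(Request.Create().WithPath("/soap/rates").UsingPost())
    .RespondWith(Response.Create()
        .WithStatusCode(200)
        .WithHeader("Content-Type", "text/xml")
        .WithBody(
            "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<soap:Body>" +
            "<GetRatesResponse xmlns=\"urn:example\"><TotalCharge>12.34</TotalCharge></GetRatesResponse>" +
            "</soap:Body></soap:Envelope>"));

// Point the client under test at the fake; the mock matches on path
// only, so a placeholder request body is fine here.
var client = new RatesClient(new HttpClient { BaseAddress = new Uri(server.Url!) });
var rates = await client.GetRatesAsync("<soap:Envelope>...</soap:Envelope>");
Console.WriteLine(rates.TotalCharge); // 12.34

server.Stop();
```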

This approach dramatically cut the time needed to find and fix some issues, but others still lingered. One such issue was a SOAP method that wasn’t implemented in our adapter layer. Because the service we were mocking was so large, I had opted to implement only the methods we actually used. Claude did the evaluation and built out the adapter, and I focused manual testing on the major methods: getting rates, getting labels, etc. As it turns out, some of the more obscure methods had been missed. Claude is really good at finding connections in code, but it’s not exhaustive, and sometimes that’s on purpose. Rather than trying to get Claude to do something it wasn’t great at, I instead had it build a static analysis tool on the Roslyn compiler platform that could do this kind of work. Visual Studio already shows usages of interface methods, so I was able to compare the new tool’s results with what Visual Studio reported to ensure the tool was correct. Now Claude could quickly find all the methods that had been missed and fix them, leaving me confident that we had caught them all. This immediate validation feedback let Claude work much more quickly and in a more hands-off way. I didn’t feel like I had to steer the agent nearly as much, because I had given it a way to check whether it was doing the right thing.
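Stripped down, such a tool is only a few dozen lines of Roslyn. This sketch (the solution path and interface name are placeholders) loads the solution, finds the proxy interface, and counts references to each of its methods in bulk:

```csharp
using System;
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.FindSymbols;
using Microsoft.CodeAnalysis.MSBuild;

// Requires the Microsoft.Build.Locator and
// Microsoft.CodeAnalysis.Workspaces.MSBuild packages.
MSBuildLocator.RegisterDefaults();
using var workspace = MSBuildWorkspace.Create();
var solution = await workspace.OpenSolutionAsync(@"C:\src\Monolith.sln");

// Find the proxy interface in whichever project declares it.
INamedTypeSymbol? proxy = null;
foreach (var project in solution.Projects)
{
    var compilation = await project.GetCompilationAsync();
    proxy = compilation?.GetTypeByMetadataName("Connectors.ICarrierService");
    if (proxy is not null) break;
}
if (proxy is null) return;

// Report usages of every method in one pass; a method with zero
// references is one the adapter layer missed.
foreach (var method in proxy.GetMembers().OfType<IMethodSymbol>())
{
    var refs = await SymbolFinder.FindReferencesAsync(method, solution);
    var count = refs.Sum(r => r.Locations.Count());
    Console.WriteLine($"{method.Name}: {count} usage(s)");
}
```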

This was something I had not thought of before. I’d had Claude build tools that I could use, but I hadn’t thought to have it build tools that it could use. When doing this kind of work manually, I could find all usages of a method in Visual Studio before implementing it by hand. I’d have no real need for a bulk usage tool, and even if it would have sped up my work a bit, it wouldn’t have been worth the time to write and debug it. But now writing a tool like that is something the agent can do on the side while I’m doing other work. Because the effort to create such a tool is so much lower, the bar for whether it’s worth building drops significantly. And while I would have only used the tool to see which methods I had left to implement, Claude can use it for any number of use cases I haven’t even thought of.

Conclusion

The modernization project is only accelerating, thanks in no small part to Claude Code and our growing skill at using it. This didn’t just make it easier to push the rock up the hill; it has removed the hill entirely. The work now makes demonstrable progress, putting the once-improbable goal of getting our core infrastructure onto .NET 8+ within reach.