What are the best ways to prevent refactoring problems?
How do refactoring problems come about, and is there a way to predict them? Contributor Brad Irby has the answers.
The most common cause of refactoring problems is not in the code -- it's a lack of proper tools. Refactoring nearly always includes renaming variables and methods, changing method signatures, and moving code around. Making all these changes by hand can easily lead to disaster. Suppose, for example, you rename a variable that is local to a method, but the new name hides a class-level variable of the same name. A refactoring tool should generate a warning during the rename; a rename done by hand produces none. Invest in the proper tools from the start.
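The renaming hazard can be sketched in a few lines. Python has no implicit access to class-level fields, so this sketch uses a module-level name as the closest analogue; `TAX_RATE`, `rate`, and both functions are hypothetical names invented for illustration.

```python
TAX_RATE = 0.20  # module-level value shared by the whole file (hypothetical)


def price_before_rename(net):
    """Original code: the local discount is called `rate`."""
    rate = 0.05  # local discount rate
    discounted = net * (1 - rate)
    return discounted * (1 + TAX_RATE)  # uses the module-level tax rate


def price_after_manual_rename(net):
    """Same function after renaming `rate` to `TAX_RATE` by hand."""
    TAX_RATE = 0.05  # the local name now hides the module-level TAX_RATE
    discounted = net * (1 - TAX_RATE)
    # This line was not edited, yet it now reads the local 0.05
    # instead of the module-level 0.20. Nothing fails; the result is just wrong.
    return discounted * (1 + TAX_RATE)
```

Both functions run without error, which is the point: the second one silently computes a different price. A rename tool or linter that understands scopes is in a position to flag the collision, while a text search and replace is not.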
Within the code itself, a common technique that causes refactoring problems is using the results of a database query directly instead of going through an intermediate class. In older applications without an object-relational mapping layer, it was common to access query results by field name or column number. This is dangerous because a typo will not surface at build time, only at runtime, when the application may crash due to a missing field or bad data type. The first refactoring for any legacy system that uses this data access method is to introduce the repository pattern. This will also help with the next common problem: a lack of unit tests.
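As a minimal sketch of that first refactoring, the example below uses Python's standard `sqlite3` module with an in-memory database. The `customers` table, the `Customer` class, and `CustomerRepository` are hypothetical names chosen for illustration, not taken from any particular system.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Intermediate class: the rest of the application sees this, not raw rows."""
    id: int
    name: str
    email: str


class CustomerRepository:
    """The only place that knows column names and SQL for customers."""

    def __init__(self, conn):
        self._conn = conn

    def get_by_id(self, customer_id):
        row = self._conn.execute(
            "SELECT id, name, email FROM customers WHERE id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            return None
        # Field names are spelled out once, here, instead of at every call site.
        return Customer(id=row["id"], name=row["name"], email=row["email"])


def make_demo_connection():
    """Build a throwaway in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows access by column name
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com')")
    return conn
```

With raw rows, a misspelling such as `row["emial"]` fails only when that line executes. With the repository in place, callers write `customer.email`, a misspelled attribute is something editors and static checkers can catch, and the repository itself can be replaced by a fake in unit tests.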
Refactoring without unit tests to validate the changes is asking for trouble. The obvious danger of refactoring without a unit test safety net is that defects introduced in the refactored code will go undetected. However, another danger exists when the legacy code already has an undetected defect that is not exposed until the refactoring. In such cases the natural response is to search through the latest changes for the error, when the root cause is actually in code that remained untouched. Creating a set of unit tests for the existing code first lets you identify these pre-existing defects before any logic is changed.
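One way to build that safety net is to write tests that record what the legacy code does today, defects included, before touching it. The sketch below uses Python's standard `unittest` module; `legacy_average_order_value` is a hypothetical stand-in for real legacy code, and its crash on an empty list plays the role of the pre-existing defect.

```python
import unittest


def legacy_average_order_value(order_totals):
    """Hypothetical legacy function, shown unchanged."""
    total = 0
    for amount in order_totals:
        total += amount
    # Pre-existing defect: an empty list raises ZeroDivisionError.
    return total / len(order_totals)


class AverageOrderValueCharacterization(unittest.TestCase):
    """Pins down current behavior before any refactoring starts."""

    def test_typical_orders(self):
        self.assertEqual(legacy_average_order_value([10, 20, 30]), 20)

    def test_single_order(self):
        self.assertEqual(legacy_average_order_value([42]), 42)

    def test_empty_list_is_an_existing_defect(self):
        # Recorded on purpose: this failure predates the refactoring.
        with self.assertRaises(ZeroDivisionError):
            legacy_average_order_value([])


def run_characterization_tests():
    """Run the suite programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        AverageOrderValueCharacterization
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the empty-list crash is written down as a test before the refactoring, nobody has to hunt for it in the latest changes afterward. Whether to fix it or preserve it becomes an explicit decision.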
Finally, refactoring code that has many external interfaces can be error-prone, especially when parameters are loosely typed or passed as strings. Refactoring the receiving system can introduce defects when the external system sends a particular data set that 'has always worked,' but that the receiver was never explicitly designed to expect. These defects will not be found during the build cycle; they surface only during a full QA cycle or even in production. A good solution is to capture all data transmitted between the two systems for a set period of time. Refactor the receiving system to be testable, then use this captured live data in unit tests to ensure the refactored code handles live traffic correctly.
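The capture-and-replay idea might look like the sketch below. Everything in it is an assumption made for illustration: the JSON message format, the `legacy_handle` and `refactored_handle` functions, and the inline `CAPTURED_MESSAGES` list, which in practice would be loaded from a file of recorded traffic.

```python
import json

# Stand-in for traffic recorded between the two systems (hypothetical format).
CAPTURED_MESSAGES = [
    '{"order_id": "1001", "qty": "3"}',
    '{"order_id": 1002, "qty": 2}',        # numeric id that "has always worked"
    '{"order_id": "1003", "qty": " 4 "}',  # padded string quantity
]


def legacy_handle(raw):
    """Hypothetical original receiver logic."""
    msg = json.loads(raw)
    return (str(msg["order_id"]), int(msg["qty"]))


def refactored_handle(raw):
    """Hypothetical refactored receiver; must behave the same on live data."""
    msg = json.loads(raw)
    order_id = str(msg["order_id"])
    quantity = int(str(msg["qty"]).strip())
    return (order_id, quantity)


def outcome(handler, raw):
    """Return the handler's result, or the type of exception it raised."""
    try:
        return ("ok", handler(raw))
    except Exception as exc:  # record failures instead of stopping the replay
        return ("error", type(exc).__name__)


def replay(messages, old_handler, new_handler):
    """Feed each captured message to both handlers and collect differences."""
    mismatches = []
    for raw in messages:
        expected = outcome(old_handler, raw)
        actual = outcome(new_handler, raw)
        if expected != actual:
            mismatches.append((raw, expected, actual))
    return mismatches
```

A unit test can then assert that `replay(CAPTURED_MESSAGES, legacy_handle, refactored_handle)` returns an empty list. Comparing outcomes rather than only return values means a message the old code rejected must still be rejected the same way by the new code.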