Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected. Debugging tends to be harder when various subsystems are tightly coupled, as changes in one may cause bugs to emerge in another. Many books have been written about debugging (see below: Further reading), as it involves numerous aspects, including: interactive debugging, control flow, integration testing, log files, monitoring (application, system), memory dumps, profiling, Statistical Process Control, and special design tactics to improve detection while simplifying changes.
Step 1. Identify the error.
This step sounds obvious but is a tricky one: misidentifying an error can waste a lot of development time. Production errors reported by users are often hard to interpret, and the information we get from them is sometimes misleading.
A few tips to make sure you identify the bug correctly:
See the error. This is easy if you spot the error yourself, but not if it comes from a user. In that case, see if you can get the user to send you a few screen captures, or even use a remote connection to see the error for yourself.
Reproduce the error. You should never say that an error has been fixed if you were not able to reproduce it.
Understand what the expected behavior should be. In complex applications it can be hard to tell what the correct behavior is, but that knowledge is essential to fixing the problem, so we will have to talk with the product owner, check the documentation, and so on to find this information.
Validate the identification. Confirm with the person responsible for the application that the error is actually an error and that the expected behavior is correct. The validation can also reveal that fixing the error is unnecessary or not worth the effort.
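The tips above can be sketched as a small reproduction script that records the reported scenario next to the expected behavior. This is only an illustration: `BugReport`, `reproduces`, and the buggy `total_price` function are hypothetical names invented for the example, not part of any real application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class BugReport:
    """What we know about a reported error (all fields are illustrative)."""
    description: str
    steps: Callable[[], Any]  # code that replays the reported scenario
    expected: Any             # behavior confirmed with the product owner


def reproduces(report: BugReport, attempts: int = 1) -> bool:
    """Return True if any attempt deviates from the expected result."""
    for _ in range(attempts):
        try:
            actual = report.steps()
        except Exception:
            return True  # an unexpected exception also counts as a reproduction
        if actual != report.expected:
            return True
    return False


# Hypothetical buggy function standing in for the application code.
def total_price(prices):
    return sum(prices[1:])  # bug: the first item is skipped


report = BugReport(
    description="Cart total is too low",
    steps=lambda: total_price([10, 20, 30]),
    expected=60,
)
print(reproduces(report))  # True: the error is reproduced
```

Keeping the expected value in the report makes the validation step explicit: if the product owner says 60 is not the right total, the report changes, not the code.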
Step 2. Find the error.
Once we have correctly identified an error, it is time to go through the code to find the exact spot where the error is located. At this stage we are not interested in understanding the big picture of the error; we are just focused on finding it. A few techniques that may help to find an error are:
Logging. It can be to the console, to a file, and so on. It should help you trace the error through the code.
Debugging. Debugging in the most technical sense of the word: turning on whatever debugger you are using and stepping through the code.
Removing code. I discovered this method a year ago when we were trying to fix a very challenging bug. We had an application that, a few seconds after performing an action, caused the system to crash, but only on some computers and only from time to time. When debugging, everything seemed to work as expected, and when the machine did crash it happened in many different patterns. We were completely lost, and then the removing code approach occurred to us. It worked more or less like this: we took out half of the code from the action causing the machine to crash and executed the remaining half hundreds of times, and the application crashed. We did the same with the other half of the code and the application didn’t crash, so we knew the error was in the first half. We kept splitting the code until we found that the error was in a third-party function we were using, so we decided to rewrite that function ourselves.
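The removing code approach, combined with a little logging, can be sketched as a bisection over the steps of the failing action. This is a simplified illustration rather than the code from the story: the step functions, `crashes`, and `find_culprit` are hypothetical, and the sketch assumes that exactly one step is faulty and that steps can run independently of each other.

```python
import logging
from typing import Callable, List, Sequence

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("bisect")

Step = Callable[[], None]


def crashes(steps: Sequence[Step], runs: int) -> bool:
    """Run the given steps `runs` times; report whether any run raised."""
    for attempt in range(runs):
        try:
            for step in steps:
                step()
        except Exception as exc:
            log.info("crash on attempt %d: %r", attempt + 1, exc)
            return True
    return False


def find_culprit(steps: List[Step], runs: int = 200) -> Step:
    """Halve the step list until a single crashing step remains."""
    while len(steps) > 1:
        middle = len(steps) // 2
        first, second = steps[:middle], steps[middle:]
        log.info("testing %d of %d remaining steps", len(first), len(steps))
        steps = first if crashes(first, runs) else second
    return steps[0]


# Hypothetical steps of the failing action; one of them fails intermittently.
calls = {"count": 0}

def load(): pass
def parse(): pass
def save(): pass

def third_party_render():
    calls["count"] += 1
    if calls["count"] % 50 == 0:  # simulated intermittent failure
        raise RuntimeError("intermittent failure")


culprit = find_culprit([load, parse, third_party_render, save])
print(culprit.__name__)  # third_party_render
```

The number of runs matters: if an intermittent crash happens not to show up within `runs` attempts, the sketch wrongly discards the faulty half, which is why the halves in the story were executed hundreds of times.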
Step 3. Analyze the error.
This is a critical step. Use a bottom-up approach from the place where the error was found, and analyze the code so you can see the big picture of the error. Analyzing a bug has two main goals: to check that there aren’t any other errors around that one (the iceberg metaphor), and to understand the risk of introducing collateral damage with the fix.
Step 4. Prove your analysis.
This is a straightforward step. After analyzing the original bug you may have come up with a few more errors that may appear in the application. This step is all about writing automated tests for these areas (it is better to use a test framework, such as any from the xUnit family).
Once you have your tests, run them. You should see all of them fail, which proves that your analysis is right.
Step 5. Cover lateral damage.
At this stage you are almost ready to start coding the fix, but you have to cover yourself before you change the code. Create or gather (if they already exist) all the unit tests for the code around where you will make the changes, so that after completing the modification you can be sure you have not broken anything else. If you run these unit tests, they should all pass.
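A minimal sketch of such protective tests, again using `unittest` and a hypothetical `average` function that truncates fractional results: these tests pin down behavior that already works and that the fix must not change. The function and the chosen cases are assumptions for illustration only.

```python
import unittest


def average(values):
    # Hypothetical code about to be fixed: floor division truncates results.
    return sum(values) // len(values)


class SurroundingBehaviorTests(unittest.TestCase):
    """Behavior that works today and must still work after the fix."""

    def test_even_split(self):
        self.assertEqual(average([2, 4]), 3)

    def test_single_value(self):
        self.assertEqual(average([5]), 5)

    def test_empty_input_raises(self):
        # Assumes callers rely on this exception for empty input.
        with self.assertRaises(ZeroDivisionError):
            average([])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SurroundingBehaviorTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True: all tests pass before the fix
```

If replacing `//` with `/` is the eventual fix, these tests keep passing, because `3.0 == 3` and `5.0 == 5` in Python and an empty list still raises `ZeroDivisionError`.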
Step 6. Fix the error.
That’s it, finally you can fix the error!
Step 7. Validate the solution.
Run all the test scripts and check that they all pass.
A mixed-control team organization attempts to combine the benefits of centralized and decentralized control, while minimizing or avoiding their disadvantages.
Rather than treating all members the same, as in a decentralized organization, or treating a single individual as the chief, as in a centralized organization, the mixed organization differentiates the engineers into senior and junior engineers. Each senior engineer leads a group of junior engineers and reports, in turn, to a project manager. Control is vested in the project manager and senior programmers, while communication is decentralized among each set of individuals, peers, and their immediate supervisors. The patterns of control and communication in mixed-control organizations are shown in Figure 8.2.
A mixed-mode organization tries to limit communication to within a group that is most likely to benefit from it. It also tries to realize the benefits of group decision making by vesting authority in a group of senior programmers or architects. The mixed-control organization is an example of the use of a hierarchy to master the complexity of software development as well as organizational structure.