Recommendations for Difficult Debug Sessions
The following recommendations will help in working through difficult debugging sessions; they are also beneficial for simple debug work.
Proceed logically; random debugging rarely works
Resolving a problem requires knowing its exact nature and, as you change the environment, keeping track of the result of each change. Without a logical approach and careful observation of what changes, the chances of resolving the problem drop sharply and the chances of wasting a lot of time rise. Proceed logically: reduce the problem to the bare facts of its manifestation, and then narrow down the cause.
Divide and conquer
In a sense you are at war with the bug, so one of the basic tenets of war should be followed: divide and conquer. Apply it to every facet of the debugging process, reducing the parameter space of the problem both when searching for the cause and when testing remedies. Always go back and test the fixes; check the application with and without each fix to make sure the problem has really been solved.
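One common way to apply divide and conquer is bisection: halve the search space with each test until a single candidate remains. The sketch below is only an illustration under stated assumptions. The name first_bad, the list of revisions, and the shows_bug predicate are all hypothetical, and the search assumes the candidates are ordered so that once the problem appears it stays present in every later candidate.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest candidate that shows the problem.

    Assumes (hypothetically) that candidates are ordered, that the last
    one shows the problem, and that every candidate after the first bad
    one is also bad.
    """
    if not candidates:
        raise ValueError("no candidates to search")
    lo, hi = 0, len(candidates) - 1
    if not is_bad(candidates[hi]):
        raise ValueError("the last candidate does not show the problem")
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid          # problem already present: look earlier
        else:
            lo = mid + 1      # still good: the culprit is later
    return candidates[lo]


# Hypothetical example: 100 numbered revisions, bug introduced in revision 37.
revisions = list(range(1, 101))


def shows_bug(revision: int) -> bool:
    return revision >= 37


print(first_bad(revisions, shows_bug))  # prints 37
```

With 100 candidates this needs at most eight tests rather than up to 100, which matters when each test is a slow rebuild or a manual reproduction. The same halving idea can be applied to configuration options, input data, or sections of code, as long as each half can be tested on its own.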
Be wary of the multiple problem situation
Cases where multiple problems are affecting the situation are rare, but they can be intensely difficult. They lead to frustration because applying possible fixes produces a myriad of different results. Sometimes one problem is unconsciously avoided or worked around, only to show up again while another problem is being debugged. Either way, if applying fixes to a problem only changes its behavior or only reduces its manifestations, it may be a multiple problem situation. In these cases, be particularly careful to change as little as possible when troubleshooting, so that changes in behavior can be interpreted as precisely as possible.
Watch for race conditions or intermittent bugs
Intermittent problems often take creativity and resourcefulness just to reproduce so that debugging can begin. Unfortunately, these cases are often misdiagnosed. One can mistakenly assume the problem was fixed by changing or "fixing" something, only to find it rearing its ugly head again. Worse yet, by its nature the problem may strike one user and not others, and then be discounted until it strikes another user. In these cases, do whatever you can to gather more data on the problem. Core files are often useful, since post-mortem forensics can always be done on them to try to find the root cause.
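One way to gather more data on an intermittent failure is to run the suspect operation many times under recorded conditions and keep every failure for later study. The following is a minimal sketch, not a definitive harness. The names hunt_intermittent, run_once, and flaky_operation are hypothetical, and it assumes the operation can be driven by a seed so that a failing run can be replayed.

```python
import random
from typing import Callable, List, Tuple


def hunt_intermittent(
    run_once: Callable[[int], None], attempts: int = 1000
) -> List[Tuple[int, str]]:
    """Run the operation once per seed and record every failure.

    Returns (seed, error description) pairs so that each failing run
    can be replayed later with the same seed.
    """
    failures: List[Tuple[int, str]] = []
    for seed in range(attempts):
        try:
            run_once(seed)
        except Exception as exc:  # record the failure, keep hunting
            failures.append((seed, repr(exc)))
    return failures


def flaky_operation(seed: int) -> None:
    """Hypothetical stand-in for code that fails roughly 1% of the time."""
    rng = random.Random(seed)  # seeded, so a given seed always behaves the same
    if rng.random() < 0.01:
        raise RuntimeError("lost update")


failures = hunt_intermittent(flaky_operation, attempts=1000)
for seed, error in failures:
    print(f"seed {seed} failed: {error}")
```

Recording the conditions of each failure turns "it fails sometimes" into a list of reproducible cases, and rerunning the same seeds after a change gives some evidence that a fix really worked instead of merely shifting the timing. A real race condition is usually not controlled by a seed, so in practice the recorded data would be things like timestamps, thread or process identifiers, load, and input, alongside any core file that was produced.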