

In some companies, programmers ignore non-reproducible errors; however, their normal job is to support testing and report them. Non-reproducible errors can be the most expensive bugs a software company faces. An error that cannot be reproduced often causes distrust in the product.
Inevitably, programs will have errors. No matter how careful the developer or how rigorous the testing, there will always be unexpected errors. People have to learn to live with them rather than close their eyes and try to ignore them. Since errors are part of software development and deployment, and can be quite problematic in terms of cost and image, it is best to keep them in check.
Anthony Vallone, a software developer and tester at Google, wrote an excellent piece on this issue for developers and quality assurance testers.
Understanding software bugs
Effective bug management is a critical activity in any software project. When bug management is ineffective, the project as a whole suffers: time, effort and energy are spent not on fixing bugs, but on arguments and delay tactics.
To validate that a bug exists, developers usually start by using the information in the bug report to reproduce the failure. However, reproducing reported bugs is not always straightforward. In fact, some reported bugs are impossible to reproduce. When all attempts at reproducing a reported bug fail, the bug is marked as non-reproducible.
Although unreproducible errors can occur at any stage of the software life cycle, they are more frequent during testing and after the product goes live. When errors occur, the log should ideally contain a lot of detail. Unfortunately, the detail that led up to an error is often unavailable once the error is encountered. Also, if you have followed advice about not logging too much, the log records written before the error record may not provide adequate detail.
Such errors can also stem from insufficient resources, timing issues, memory corruption, uninitialized memory, or memory leaks.
Vallone has provided guidelines for development and testing that reduce the likelihood of these bugs occurring. According to Vallone, the parameters involved in effective bug management range from purely technical issues to human behavior and organizational politics. In many cases, these parameters conflict: satisfying one means neglecting another. Finding the best resolution for these conflicts is not easy.
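One way to ease the tension between logging too much and having too little context is to keep recent low-level records in memory and write them out only when an error arrives. The sketch below is an illustration under assumptions, not something taken from Vallone's post: RingBufferHandler and ListHandler are hypothetical helper classes built on Python's standard logging module.

```python
import collections
import logging


class RingBufferHandler(logging.Handler):
    """Hold the most recent records in memory; emit them only when an error arrives.

    Hypothetical helper for illustration, not part of the standard library.
    """

    def __init__(self, target, capacity=100, flush_level=logging.ERROR):
        super().__init__(level=logging.DEBUG)
        self.target = target              # handler that receives flushed records
        self.flush_level = flush_level
        self.buffer = collections.deque(maxlen=capacity)  # oldest records fall off

    def emit(self, record):
        self.buffer.append(record)
        if record.levelno >= self.flush_level:
            # Replay the buffered context, ending with the error itself.
            for buffered in self.buffer:
                self.target.handle(buffered)
            self.buffer.clear()


class ListHandler(logging.Handler):
    """Collects messages in a list so the example is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def demo():
    sink = ListHandler()
    logger = logging.getLogger("ringbuffer_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers = [RingBufferHandler(sink, capacity=3)]

    for i in range(5):
        logger.debug("step %d", i)      # buffered, nothing written yet
    before_error = list(sink.messages)
    logger.error("request failed")      # triggers the flush
    return before_error, sink.messages
```

In this sketch nothing reaches the sink until the error is logged, at which point the last few records (bounded by capacity) are written together with the error. In a real system the target would more likely be a file or stream handler.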
Guidelines to reduce unreproducible errors
For errors caused by deadlocks, timing issues, memory corruption, uninitialized memory access, memory leaks, and resource issues, Vallone offers guidelines for development. As a precaution, organizations should simplify their synchronization logic. If the logic is too hard to understand, complex concurrency problems will be difficult to reproduce and debug.
The next step is to avoid deadlocks by defining a fixed order for obtaining multiple locks, and to be wary of fine-grained locks, since extra locks increase concurrency complexity. Developers should also avoid shared memory. Shared memory access is very easy to get wrong, and the resulting bugs may be quite difficult to reproduce.
For testers, Vallone suggests stress testing the system regularly to expose unexpected failures that occur under heavy load. Testers should test the software with both debug and optimized builds, and under constrained resources, by reducing the number of data centers, machines, processes, threads, available disk space, or available memory. They can also regularly use dynamic analysis tools such as memory debuggers, ASan, TSan, and MSan to identify many categories of unreproducible memory and threading issues.
Vallone also recommends tried-and-tested techniques for minimizing unreproducible bugs: defensive programming, fuzz testing, and error handling. He said defensive programming means verifying the work of dependencies that have known risks of failure, such as user-provided data, I/O operations, and RPC calls.
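The lock-ordering advice can be sketched as follows. This is a minimal illustration, assuming a hypothetical Account class whose account_id defines the global order in which locks are taken; it is not code from Vallone's post.

```python
import threading


class Account:
    """Hypothetical account type; `account_id` defines the global lock order."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    # Always lock the account with the smaller id first, regardless of the
    # transfer direction, so two opposite transfers cannot deadlock.
    # Assumes src and dst are distinct accounts.
    first, second = sorted((src, dst), key=lambda a: a.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_opposing_transfers(rounds=2000):
    a = Account(1, 1000)
    b = Account(2, 1000)

    def worker(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    # Two threads move money in opposite directions at the same time.
    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=30)
    finished = not any(t.is_alive() for t in threads)
    return finished, a.balance, b.balance
```

If transfer instead locked src first and dst second, the two workers could each hold one lock while waiting for the other, and the hang would appear only under particular timing, which is exactly the kind of failure that is hard to reproduce.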
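As a rough illustration of defensive programming combined with fuzz testing, the sketch below validates a user-supplied value and then feeds it random input. The function names parse_port and fuzz_parse_port, the input alphabet, and the port-number scenario are assumptions made for this example. The random generator is seeded so that a failing run can be replayed exactly.

```python
import random


def parse_port(text):
    """Defensively parse a user-supplied TCP port; raise ValueError on bad input."""
    if not isinstance(text, str):
        raise ValueError("port must be a string")
    stripped = text.strip()
    # isdigit() alone accepts non-ASCII digits, so require ASCII as well.
    if not stripped.isascii() or not stripped.isdigit():
        raise ValueError(f"port is not a decimal number: {text!r}")
    port = int(stripped)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {text!r}")
    return port


def fuzz_parse_port(seed, iterations=5000):
    """Feed random strings to parse_port.

    Any outcome other than a valid port or a ValueError counts as a bug.
    """
    rng = random.Random(seed)   # seeded so a failing run can be replayed
    alphabet = "0123456789 -+.eE\t\n\x00abc\u0663"
    for _ in range(iterations):
        length = rng.randint(0, 8)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            port = parse_port(candidate)
        except ValueError:
            continue  # rejected cleanly, the expected path for junk input
        assert 1 <= port <= 65535, (seed, candidate)
    return True
```

Logging the seed alongside any failure means the same sequence of inputs can be regenerated later, which turns an otherwise unreproducible fuzz failure into a reproducible one.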
“The most common sections of code to remain untested is error handling code. Don’t skip test coverage here. Bad error handling code can cause unreproducible bugs and create great risk if it does not handle fatal errors well,” he said.
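Error paths can be exercised by injecting failures rather than waiting for them to happen. The sketch below assumes a hypothetical load_settings function with an injectable opener parameter and a made-up fallback value; the tests use Python's unittest.mock to force the failures.

```python
import unittest
from unittest import mock


def load_settings(path, opener=open):
    """Return the file's text, or a safe default if the file cannot be read.

    `load_settings` and its default value are hypothetical, for illustration.
    """
    try:
        with opener(path, "r", encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # Error path: fall back rather than crash. This branch needs a test too.
        return "mode=safe"


class LoadSettingsErrorPathTest(unittest.TestCase):
    def test_falls_back_when_open_fails(self):
        # Simulate a failure while opening the file.
        failing_open = mock.Mock(side_effect=PermissionError("denied"))
        result = load_settings("/no/such/file", opener=failing_open)
        self.assertEqual(result, "mode=safe")
        failing_open.assert_called_once_with(
            "/no/such/file", "r", encoding="utf-8"
        )

    def test_falls_back_when_read_fails(self):
        # Simulate a failure after the file was opened successfully.
        handle = mock.MagicMock()
        handle.__enter__.return_value.read.side_effect = OSError("disk error")
        flaky_open = mock.Mock(return_value=handle)
        result = load_settings("settings.ini", opener=flaky_open)
        self.assertEqual(result, "mode=safe")
```

Passing the opener in as a parameter is one design choice among several; patching the built-in open with mock.patch would work as well, at the cost of a less explicit dependency.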
In addition, Vallone suggested other practices for avoiding non-reproducible bugs, including checking for duplicate keys, testing concurrent data access, developing APIs and following good logging practices.
See Vallone's blog post for more detail on minimizing unreproducible bugs.