Suman Jana and his colleagues have developed an automated tool that finds faulty error-handling code in programs written in the C programming language.

Silent bugs are among the hardest software flaws to catch. They rarely cause a program to freeze or crash, but can leave a computer open to attack until they are found and fixed.

Suman Jana, a computer science professor at Columbia Engineering and a member of the Data Science Institute, has led the development of a new tool to root out a type of silent bug that lurks in error-handling code, the set of instructions in a program that tells users when something has gone wrong.

Jana and his colleagues outline their automated bug-finder in a paper presented earlier this month at the International Conference on Automated Software Engineering in Singapore. Their tool targets error-handling bugs in C, a language that, though decades old, underlies most security browsers and encryption software and many industrial applications, from smartphones to self-driving cars.

Finding silent bugs in C is especially tricky because C, unlike Java or Python, does not provide specialized features for flagging and reporting mistakes. Ideally, error-handling code tells users if they have entered an incorrect value, tried to divide by zero, or made some other error. But if the code is flawed, mistakes slip through, and the consequences are often too subtle to detect. Hackers exploit these bugs to eavesdrop on encrypted emails and, in some cases, take over a machine.
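To make the problem concrete, here is a minimal sketch of how such a bug can hide in C, where errors are usually reported through a function's return value and each caller has to check that value by hand. The function names and the return-code convention below are hypothetical and chosen only for illustration; they are not taken from the study or from any real library.

```c
#include <stddef.h>

/* Hypothetical API: returns 1 if the signature is valid, 0 if it is
   invalid, and a negative value if verification itself failed. */
int verify_signature(const unsigned char *sig, size_t len)
{
    if (sig == NULL || len == 0)
        return -1;              /* internal error: nothing to verify */
    return sig[0] == 0x42;      /* stand-in for a real cryptographic check */
}

/* Buggy caller: treats any nonzero result as success, so the -1 error
   code is silently accepted as "valid". */
int accept_buggy(const unsigned char *sig, size_t len)
{
    return verify_signature(sig, len) != 0;
}

/* Careful caller: only the documented success value is accepted. */
int accept_checked(const unsigned char *sig, size_t len)
{
    return verify_signature(sig, len) == 1;
}
```

Nothing crashes in the buggy version. It simply answers "valid" when verification failed, which is why flaws of this kind can go unnoticed for a long time.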

In developing a tool to find bad error-handling code, Jana and his colleagues had a key insight. Error-handling code is shorter and more likely to use “goto” statements than regular code, leaving fewer branching points and program statements.
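The pattern is easy to recognize in practice. The sketch below shows a common C idiom in which every failure jumps to a single cleanup label; the function `read_first_line` and its return codes are hypothetical, invented here for illustration rather than drawn from the researchers' paper.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: copies the first line of a file into a new buffer.
   Returns 0 on success and -1 on any failure. */
int read_first_line(const char *path, char **out)
{
    int ret = -1;               /* assume failure until everything succeeds */
    FILE *fp = NULL;
    char *buf = NULL;

    fp = fopen(path, "r");
    if (fp == NULL)
        goto done;              /* error path: jump straight to cleanup */

    buf = malloc(256);
    if (buf == NULL)
        goto done;

    if (fgets(buf, 256, fp) == NULL)
        goto done;

    *out = buf;                 /* success: the caller now owns the buffer */
    buf = NULL;
    ret = 0;

done:
    free(buf);                  /* free(NULL) is a harmless no-op */
    if (fp != NULL)
        fclose(fp);
    return ret;
}
```

Each error path here is a jump, a few cleanup statements, and a constant return value, with no further branching, while the successful path does the actual work.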

Their tool, APEx, exploits this feature to pick out error-handling code in the subprograms, or functions, that carry out computations in a larger program. At the function level, APEx infers the constraints that determine whether an input is flagged as a mistake and how it gets communicated to the user. APEx then compares error constraints of the same function across different programs to identify the code that looks different: the bugs.
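The comparison step can be pictured as a kind of majority vote. The sketch below is a deliberately simplified illustration and not APEx's actual algorithm: it assumes each call site has already been reduced to one of a few hypothetical kinds of error check, then flags the call sites that disagree with the most common one. All type and function names are invented for this example.

```c
#include <stddef.h>

/* Hypothetical encoding of how a call site tests a return value for errors. */
enum err_check { CHECK_LT_ZERO, CHECK_EQ_ZERO, CHECK_NE_ONE, CHECK_KINDS };

struct call_site {
    const char *program;        /* which program the call appears in */
    enum err_check check;       /* the error condition that caller tests */
};

/* Return the check used by the most call sites (ties go to the lowest kind). */
enum err_check majority_check(const struct call_site *sites, size_t n)
{
    size_t counts[CHECK_KINDS] = {0};
    enum err_check best = CHECK_LT_ZERO;

    for (size_t i = 0; i < n; i++)
        counts[sites[i].check]++;
    for (int k = 0; k < CHECK_KINDS; k++)
        if (counts[k] > counts[best])
            best = (enum err_check)k;
    return best;
}

/* Store the indices of call sites that disagree with the majority in
   `flagged` (which must have room for n entries); return how many. */
size_t flag_outliers(const struct call_site *sites, size_t n, size_t *flagged)
{
    enum err_check expected = majority_check(sites, n);
    size_t found = 0;

    for (size_t i = 0; i < n; i++)
        if (sites[i].check != expected)
            flagged[found++] = i;
    return found;
}
```

If three programs treat a negative return value as the error and a fourth tests for zero instead, the fourth is the one flagged for a closer look.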

Testing APEx on more than 200 functions in C, the researchers found 118 bugs, most in the programming libraries that use the SSL or TLS networking protocols to encrypt communications. At least one bug was serious enough to be catalogued in the Common Vulnerabilities and Exposures (CVE) database, a Hall of Fame for bug finders and the software flaws they uncover.

A similar error-handling bug found in the GNU programming library in 2014 received wide news coverage for its potential to let attackers bypass security protections and read encrypted emails and other communications in hundreds of applications.

Most of the bugs that the researchers have reported have been fixed, says Jana. As a token of thanks, one company, wolfSSL, even sent a pint glass emblazoned with its logo that now rests above Jana’s desk. It’s a small reward compared to the five-figure bug bounty Apple recently announced, but Jana is happy for the recognition. Companies that once responded testily when bugs were brought to their attention now seem to embrace the extra oversight.

“The culture has changed,” says Jana. “They have accepted the fact that software has bugs and that fixing them quickly will enhance their reputation.”

APEx builds on an earlier tool, called EPEx, short for Error Path Explorer, that scans error paths, tests whether errors are handled correctly and reports any bugs. The main limitation of EPEx, which inspired the creation of APEx, was the need to manually create error specifications.

The number of software bugs is growing rapidly as the analog world moves to digital. In a joint project with the U.S. Department of Homeland Security, the cybersecurity firm Synopsys helps developers find and fix open-source software bugs in C and Java with its Coverity Scan service. In 2014, Coverity reported finding more than 240,000 bugs, more than in the previous seven years combined.

Finding bugs through trial and error is no longer practical.

“The world now runs on software,” says Junfeng Yang, a computer science professor at Columbia Engineering and member of the Institute who was not involved in the study. “There’s no way to manually inspect that code to find all the bugs. Automated error-detection will be the future.”

Repairing bugs once they are found is also time-consuming. Jana and his coauthor Baishakhi Ray, a computer science professor at the University of Virginia, recently received a $500,000 National Science Foundation grant to develop bug-patching tools that help developers quickly fix bad code.

Read the study: APEx: Automated Inference of Error Specifications for C APIs.
A profile of Suman Jana: Protecting Security and Privacy in an Age of Perceptual Computing

— Kim Martineau