Neha Rungta and Dr. Eric Mercer, Computer Science
As computer systems grow in complexity it is becoming harder to completely test them. The cost of finding and correcting errors after deployment is staggering. Current validation methods are not well suited to finding subtle errors laden in the concurrency of modern systems. This project searches the behavior in such systems, guiding the search to focus its time in areas that seem most likely to contain errors. This work decreases the amount of time and memory needed for the errors that are found and corrected before deployment of the system. In this project we have developed a heuristic that takes a guided search algorithm to find defects in formal models of computer systems, a communication protocol, or circuits. The guided search explores a model using a heuristic function to estimate error locations as opposed to an exhaustive search which enumerates all the possible states to find the error in a brute force manner.
Computer systems can be formally modeled to explore their behavior. Formal methods often consist of states and rules. States represent the conditions of the model and the rules are the transitions that lead to other states. Explicit state model checking is a verification method that proves whether a model can or will reach a state that violates a pre-defined property. This is an error state. Models used in this project are from a Model Checker, Estes which was developed in the Verification and Validation lab at BYU. Estes model checks embedded software at the machine-code level to reveal concurrency errors that are missed at higher levels of abstraction. Current heuristics ignore the structure of the model and only consider the property that is being verified. They estimate path cost solely based on the difference between the current state and error state in terms of variable assignment.
The research in this project created a new heuristic that uses the structural information in the system to refine the heuristic estimate of the path cost to an error. To use the structural information, we did a static analysis of the model and built a control flow graph. A control flow graph computes the data analysis of the program and represents its model of execution. Each node in the control flow graph represents a single atomic assembly level instruction. A path in the graph represents the computation path of the system.
There are two versions of the algorithm. The first is a straightforward naïve algorithm. When a state is generated in Estes, before it is inserted in the search queue a heuristic value is assigned to the state. If the heuristic value is very low, it indicates that the current state is really close to the error and is put in front of the search queue. To calculate the heuristic, we take the Program Counter (PC) of the state and find the corresponding PC value in the control flow graph. We then compute the shortest path from that point to the error state. This distance is the heuristic value of the state. If the current state is far from the error state the shortest distance will be high and corresponding heuristic value will be high too. The above algorithm will always underestimate the distance required to reach the error state since the heuristic is admissible. For example if the PC value of the current state is 105, the shortest distance will be 4 because based on the Control Flow graph shown in Figure 2 the algorithm calculates the path as: 105-100-101-102- 201(Error), and estimates it being reachable in 4 steps. But the true distance to the error state should be computed along the path: 105 -100 -101 -102 -106 — 200 – 100 -101 -102 – 201(Error). To compute a more accurate heuristic a modified version of the algorithm was developed where we build a new graph that is indexed on the PC and return values. The heuristic value is based more on the true distance between current state and error state. To further improve the heuristic estimate, for all jumps to subroutines which use indirect addressing mode, the values of the registers are resolved dynamically to eliminate all the false edges. At points of non-determinism like branch points, we look at the condition code registers to find which path is valid, so even if the other path is shorter but not valid, we ignore it and calculate the shortest distance on the actual path.
For testing purposes we used academic models dealing with concurrency. Two models we tested extensively are finding the starvation state for the Dining Philosopher’s problem and violation of the mutual exclusion property in Hyman’s Mutex algorithm. The results achieved with the both versions of the algorithm were very encouraging. The number states explored are far fewer compared to the exhaustive search (the first three columns in the Table 1 show the comparisons between the number of states, the next three compare time taken). Even with the extra time required to build the graph and compute the shortest distance it outperforms the exhaustive search in the time required. This is an efficient way to overcome the state explosion problem, and helps us use the resources like time and memory more efficiently shown in Table 1.
This algorithm is extensible to further optimizations. Further research could focus on constant propagation and dynamically resolving the loops. In constant propagation we can take the value of the constants and push them as far down in the Control Flow as possible. This dynamic analysis combined with the static analysis will be able to refine the prediction of the minimum of steps that are required for a property to hold or fail. The static analysis can be further extended to identify the start and end of loops and resolve loop invariants dynamically. So if a variable affecting the error property is changed in a loop, we can figure out exactly how many times we will go through the loop before we can reach the error state.