Dynamic Invariant Generation for Multithreaded Programs
Arijit Chattopadhyay
Thesis submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Engineering
Chao Wang, Chair
Michael S. Hsiao
Sandeep Shukla
April 30, 2014
Blacksburg, Virginia
Keywords: Concurrency, Likely Invariant, Dynamic Invariant Generation, Partial Order
Reduction, Error Diagnosis, Atomic Region
Copyright 2014, Arijit Chattopadhyay
Dynamic Invariant Generation for Multithreaded Programs
Arijit Chattopadhyay
(ABSTRACT)
We propose a fully automated and dynamic method for generating likely invariants from multithreaded programs and then leveraging these invariants to infer atomic regions and diagnose concurrency errors in the software code. Although existing methods for dynamic invariant generation
perform reasonably well on sequential programs, their effectiveness for multithreaded programs
often drops dramatically, in terms of both the number of invariants that they can generate and the
likelihood of them being true invariants. We solve this problem by developing a new dynamic invariant generator, which consists of a new LLVM-based code instrumentation tool, an Inspect-based
thread interleaving explorer, and a customized inference engine inside Daikon. We have evaluated
the resulting system on public domain multithreaded C/C++ benchmarks. Our experiments show
that the new method is effective in generating high-quality invariants. Furthermore, the state and
transition invariants generated by our new method have been proved useful in applications such as
error diagnosis and inferring atomic regions in the concurrent software code.
Dedication
To my mother Chandana Chattopadhyay
and
my father Sudhakar Chattopadhyay
Acknowledgments
I would like to extend my deep gratitude to my advisor, Professor Chao Wang, for giving me
the opportunity to work in Reliable Software and Systems (RSS) Lab at Virginia Tech. His
endless support, motivation, guidance, and unflagging faith in me made this work possible. I
am honored to have this opportunity to learn various aspects of research in software verification.
At the same time, I am extremely proud to have Dr. Michael Hsiao and Dr. Sandeep Shukla on my
MS thesis committee. Their insight, advice, and help contributed strongly to this work and kept
me motivated and focused. My thesis work is supported in part by NSF grant CCF-1149454
and ONR grant N00014-13-1-0527, and I would like to thank both institutions for providing the
necessary funding. I would also like to acknowledge Dr. Changhee Jung for his guidance on using
the LLVM compiler infrastructure.
I would like to express my sincere gratitude to my sister Ishita and brother-in-law Subhendu for
inspiring me to do scientific research. I would also like to thank Rajni for being extremely
supportive and helpful, despite my surliness and frustration.
I would like to take this opportunity to thank Professor Suvrojit Das from NIT-Durgapur, who first ignited
the passion for systems software in me. I would also like to thank Markus Kusano for providing
test programs and technical details about the Inspect tool. I acknowledge the help and inspiration
that I received from my friends in the RSS and PROACTIVE research groups, which engaged me
in doing hard work with light heart. I would also like to thank all my friends in Blacksburg, who
made my stay here even more enjoyable.
My implementation of the proposed methods uses various open source software tools and development frameworks. I have extensively used the LLVM compiler infrastructure, the GCC compiler,
and Java, Python and C/C++ languages. My research also builds upon research tools such as
Daikon and Inspect. I have used the VIM text editor for writing the code and the LaTeX text formatter
for writing the thesis and other documents. I would like to thank the contributors to these excellent
software tools.
Contents

List of Figures
List of Tables
1 Introduction
  1.1 Background
  1.2 Contribution
  1.3 Organization
2 Preliminaries
  2.1 Multithreaded Programs
  2.2 Dynamic Invariant Generation
  2.3 Motivating Examples
3 Dynamic Invariant Generation for Concurrent Programs
  3.1 LLVM Based Code Instrumentation
  3.2 Systematic Thread Interleaving Exploration
  3.3 Customized Invariant Inference Engine
4 Applications of the Generated Invariants
  4.1 Diagnosing Concurrency Errors
  4.2 Inferring Likely Atomic Regions
5 Experiments
  5.1 Results
  5.2 Discussions
    5.2.1 Threats to validity
    5.2.2 Limitations
6 R Tool User's Manual
  6.1 Installation
    6.1.1 Checking out the R Tool git repository
    6.1.2 Installing LLVM and the DaikonPass
    6.1.3 Installing the Inspect Tool
    6.1.4 Installing the Trace Classifier
    6.1.5 Installing Daikon
    6.1.6 Updating the path in exports.sh
    6.1.7 Installing Qt
  6.2 Using R Tool
    6.2.1 R Tool as the Frontend for Daikon
    6.2.2 R Tool for Generating Invariants from Concurrent Programs
    6.2.3 R Tool for Inferring Likely Atomic Code Regions
7 Conclusions and Future Work
Bibliography
List of Figures

1.1 Our new dynamic invariant generation method.
1.2 The if-statement that is intended to be atomic.
2.1 Motivating Example 1
2.2 The main function for the example in Figure 2.1.
2.3 Fixing withdraw() for the example in Figure 2.1.
2.4 Motivating Example 2
2.5 Generated Invariants by R Tool for Figure 2.4
4.1 Example for diagnosing concurrency errors.
4.2 Transitional invariants generated for Figure 4.1.
4.3 The bank account transfer example
5.1 Sync 01: part of the code for the two threads.
5.2 Scalability Comparison of DPOR, PCB and HaPSet algorithms.
6.1 Screenshot of the User Interface for R Tool
List of Tables

5.1 The characteristics of the benchmark examples used in our experimental evaluation.
5.2 Invariant Comparison between Daikon and R Tool
5.3 The quality of invariants generated by R Tool
5.4 Size of atomic regions under dynamic invariants.
5.5 Invariants generated with different search strategies.
List of Abbreviations

LLVM    Low Level Virtual Machine
DPOR    Dynamic Partial Order Reduction
PCB     Pre-emptive Context Bounding
HaPSet  History Aware Predecessor Set
Chapter 1
Introduction
Multithreaded programs remain notoriously difficult to design and analyze due to the often subtle
thread interactions and astronomically large number of possible thread interleavings. Existing
methods for dynamically generating likely invariants from sequential programs have proved to be
valuable in many settings including program understanding, software maintenance, verification,
error diagnosis, and automated repair. However, effective methods for dynamically generating
likely invariants from multithreaded C/C++ programs are still lacking.
While existing dynamic invariant generation methods work well on sequential programs, our experience
shows that for concurrent programs their effectiveness reduces dramatically in terms of both the
number of invariants that they can generate and the likelihood of them being true invariants. The
main reason is that dynamic invariant generation depends heavily on the set of execution traces
fed to the inference engine. In general, if the execution traces are not diversified enough, the
resulting likely invariants will be less representative of the actual program behavior. However,
generating a set of sufficiently diverse execution traces is difficult in the context of multithreaded
programs, since the program behavior depends not only on the program’s data input but also on the
thread schedules. In a typical execution environment, the thread scheduling is determined by the
operating system and the actual implementation of the thread library. Therefore, naively executing
the program many times would not necessarily increase the diversity of the thread interleavings.
For example, although Daikon is a highly successful tool for dynamically generating likely
invariants for sequential programs written in languages such as Java, C, C++, and Perl, we will show
later that for multithreaded programs, Daikon's invariant inference engine often gets confused
and therefore produces wrong invariants.
Another limitation of existing dynamic invariant generation methods is that they tend to report too
many invariants. Even if many of them are actually true invariants, it is unlikely that the developers
will have time to sift through all of them and make sense of them. Furthermore, in the context
of analyzing the behavior of multithreaded programs, the basic assumption often is that (1) the
sequential part of the computation is implemented correctly, to an extent that the program behaves
as expected most of the time, except for some rare thread interleavings; and (2) the concurrency
control part of the program is potentially problematic, where rare but subtle thread interactions may
cause the program to fail. In this case, we argue that the focus should be on analyzing a special
type of invariants called transition invariants. These are invariants over shared variables only,
and they are capable of capturing the programmer's intent regarding concurrency control, e.g., that a
certain code block should be kept atomic, or that certain operations should be made mutually exclusive.
1.1 Background
There is a large body of work on using abstract interpretation based static analysis [8, 9] for
invariant generation. The main advantage of these techniques is that the reported invariants are true
for every reachable state of the program. Typically, invariants generated by these techniques are
predicates expressed in some linear abstract domains, such as difference logic, UTVPI, octagonal,
polyhedral, etc. There is another line of research on static invariant generation, which relies on
the constraint based approach [7, 31, 30]. The main advantage of these methods is that they are
able to generate complex invariants, including polynomial invariants and non-linear invariants.
Recent developments along this line include the work by Furia et al. [15] on generating loop
invariants from post-conditions, and invariants related to integer arrays [4, 5, 19]. However, due
to the inherent limitations of static program analysis, invariant generation based on purely static
techniques tends to lack either precision or scalability. Our new method, in contrast, relies solely
on dynamic analysis techniques.
There is also a large body of work on using dynamic analysis techniques for invariant generation.
Tools developed using these techniques, such as Daikon [11, 12], have been highly successful
in practice. The main advantage of dynamic invariant generation techniques is their versatility.
They have been applied to realistic applications for which static techniques are difficult to handle
and have front-end tools for a wide range of programming languages. Daikon, for example, has a
front-end for C/C++ called Kvasir and a front-end for Java called Chicory. Other dynamic invariant
generation tools along this line include DIDUCE [18], DySy [10], Agitator [3], and Iodine [17].
However, existing dynamic generation tools do not work well on multithreaded programs due to
the nondeterminism in thread scheduling. We have solved this problem by developing a method
based on a new code instrumentation tool, a systematic interleaving explorer, a trace classifier, and
a customized invariant inference engine.
There are also hybrid techniques for invariant generation, which leverage both static analysis and
dynamic analysis to improve performance. For example, Nguyen et al. [26] proposed a method
for generating invariants expressed as polynomials and linear relations over a limited number of array
variables. Such invariants have been difficult to generate with existing methods. There are also
hybrid techniques based on random testing [22] and guess-and-check [32], which first generate a
set of candidate invariants from concrete execution data and then verify them using SMT solvers.
These techniques are orthogonal to our work.
There are many existing methods for detecting and diagnosing atomicity violations in multithreaded programs. Some of these methods are based on inferring likely atomic regions statically [38], whereas others are based on dynamic [23, 37, 6, 33] and hybrid [35] techniques. The
CTrigger [27] tool, for instance, first generates ordering constraints over global events from good
runs through profiling, and then leverages the constraints during test runs. However, existing
methods focus exclusively on the event ordering constraints, e.g., whether a set of shared variable
accesses from different threads should or should not appear in a certain order. In contrast, the
transition invariants generated and used by our new method are thread-modular, in that they refer
only to the current and past values of shared variables accessed from the same thread. In this sense,
our method is orthogonal to the existing techniques.
1.2 Contribution
In this thesis, we propose a fully automated and dynamic method for generating high-quality invariants from multithreaded C/C++ programs to address the aforementioned limitations of existing
methods. We also show that the state and transition invariants produced by our method are useful
in applications such as diagnosing concurrency errors and inferring likely atomic regions in the
software code. Our method builds on a new LLVM-based code instrumentation tool, an Inspect-based
thread interleaving explorer, and a customized invariant inference engine in Daikon.
The overall flow of our method is illustrated in Figure 1.1, which takes multithreaded C/C++ code
as input and generates a set of likely invariants as output. First, the program is processed by the
instrumentation tool, which adds monitoring and control capabilities to the program for dynamic
analysis. Then, we run the instrumented executable under Inspect, which systematically explores the possible thread interleavings. The trace logs are then fed to a test oracle based classifier,
which divides them into good traces and bad traces. Finally, a customized invariant inference
engine in Daikon takes the subset of the traces (e.g., the good traces or bad traces) as input and
returns the likely invariants as output.
Our new method has the capability of inferring both state invariants and transition invariants. A
state invariant is a predicate that holds at a thread-local program location and is expressed in terms
of the program variables at that location. For example, in the program in Figure 1.2, the predicate
(A.x == 10) is a state invariant at Line 4. In contrast, a transition invariant is a predicate that
holds at the entry and exit points of an arbitrary code block and is expressed in terms of the two
[Figure 1.1 shows the overall flow: multithreaded C/C++ code passes through the LLVM based instrumentation to produce an executable; Inspect explores the possible interleavings (under a chosen search strategy) to produce trace logs; the test oracle based trace classifier splits these into good traces and bad traces; and the Daikon based invariant generator, parameterized by a block size, emits the likely invariants.]
Figure 1.1: Our new dynamic invariant generation method.
Thread 1:
 1  p = &A;
 2  ...
 3  if (p != NULL) {
 4      ...
 5      p->x = 10;
 6      ...
 7  }

Thread 2:
 8  p = NULL;
 9  ...
10  ...
11  ...
12  ...
13  ...
14  ...

Figure 1.2: The if-statement that is intended to be atomic.
copies of program variables at the entry and exit points, respectively. For example, in the program
in Figure 1.2, the predicate (p == orig(p)) is a transition invariant over the code block starting
at Line 2 and ending at Line 3. Here orig(p) and p are values of variable p at the entry and exit
points of this code block, respectively.
Transition invariants are important in multithreaded programs, since they often indicate whether
the associated code block is atomic or should be atomic. Here, a code block is atomic only if the
execution of instructions in this code block is not interfered with by any conflicting instruction from a
concurrent thread. In the above example, the transition invariant (p == orig(p)) inferred from
the code block starting at Line 2 and ending at Line 3 indicates that the value of pointer p did not
change. Therefore, Line 2 and Line 3 belong to the same atomic region. Although the baseline
inference engine in Daikon can generate transition invariants, it works only at the method entry
and exit points, whereas our method can generate transition invariants over arbitrary code blocks;
the size of the code blocks over which our tool infers likely transition invariants is set by the user. This is
important for multithreaded programs, since the intended or actual atomic regions may not always
coincide with method boundaries. Indeed, most of the atomic regions that we have encountered
in the experiments are either code blocks inside a function body, or span across several methods.
In addition to generating a larger number of high-quality invariants compared to existing methods such as Daikon, we also propose a new method for leveraging these invariants to diagnose
concurrency errors and to infer likely atomic regions in the source code. In terms of error diagnosis,
we can compare and contrast the invariants learned from the good traces and the bad traces; the difference
is often useful in revealing the root cause of a concurrency bug. In terms of inferring atomic
regions, we start by generating likely transition invariants for code blocks of increasing size. If
a generated transition invariant holds in the original software code (which we currently check by manual
inspection), then we can conclude that the code block is indeed an atomic region.
We have implemented our new methods and conducted an experimental evaluation on open source
multithreaded C/C++ programs. Our results show that the state and transition invariants generated
by our method are of significantly higher quality than those generated by existing methods such as
Daikon; at the same time, these invariants are useful in the context of diagnosing concurrency errors
and inferring atomic regions.
To sum up, this thesis makes the following contributions:
• We develop a new method for dynamically generating high-quality state and transition invariants for multithreaded C/C++ programs.
• We propose new methods for leveraging these invariants to diagnose concurrency errors and
to infer likely atomic regions in the software code.
• We implement the proposed methods and evaluate them on a set of public-domain benchmarks to demonstrate their effectiveness.
1.3 Organization
The remainder of this thesis is organized as follows.
Chapter 2 presents the technical foundations needed to express the content of the thesis more
formally.
Chapter 3 details the process of dynamic invariant generation for concurrent programs and the
algorithms associated with each individual stage.
Chapter 4 presents the two main applications of the generated invariants.
Chapter 5 presents the complete experimental results obtained with R Tool, comparing it with
Daikon and evaluating the tool's other features.
Chapter 6 provides in-depth, step-by-step instructions that the user has to follow to install and run
R Tool in its various modes.
Chapter 7 concludes the thesis with a note on future extensions of this work.
Chapter 2
Preliminaries
2.1 Multithreaded Programs
A multithreaded program consists of a set of shared variables SV and a set of threads T1 . . . Tm.
Each thread Ti is a sequential program with a set of thread-local variables LVi, where 1 ≤ i ≤ m.
Let st be an instruction. An execution instance of st is called an event, denoted e = ⟨tid, l, st, l′⟩,
where tid is the thread index, and l and l′ are the thread program locations before and after executing
st. An event is said to be visible if it accesses a shared variable or a thread synchronization object
(a mutex lock or a condition variable). Otherwise, the event accesses only thread-local variables and
is said to be invisible. During systematic thread interleaving exploration and execution trace
logging, invisible events are ignored.
We model each thread Ti as a state transition system Mi. The transition system of the program,
denoted M = M1 × M2 × . . . × Mm, is constructed using standard interleaving composition. Let
M = ⟨S, R, s0⟩, where S is the set of global states, R is the set of transitions, and s0 is the initial state.
Each state s ∈ S is a tuple of thread program states. Each transition e ∈ R is an event from one of
the m threads.
An execution trace of M is a sequence π = s0 →e1 s1 →e2 · · · →en sn, where s0, . . . , sn are the
states and e1, . . . , en are the events. We shall use the special event halt to denote normal program
termination, and the special event abort to denote faulty program termination. Event abort
corresponds to a failed R_assert() statement.
Therefore, an event from thread Ti may have one of the following types:
• halt, which denotes normal program termination;
• abort, which denotes faulty program termination;
• fork(j) for creating child thread Tj, where j ≠ i;
• join(j) for joining back child thread Tj, where j ≠ i;
• lock(lk) for acquiring lock lk;
• unlock(lk) for releasing lock lk;
• signal(cv) for setting the signal on condition variable cv;
• wait(cv) for receiving a signal on condition variable cv;
• read(v) for reading shared variable v;
• write(v) for writing to shared variable v;
• mEntry(method) for entering a method call;
• mExit(method) for returning from a method call;
• bEntry(block) for starting a code block;
• bExit(block) for ending a code block.
Here, mEntry() and mExit() are supported by existing dynamic invariant generation tools such
as Daikon, whereas bEntry() and bExit() are new additions in our method.
The reason why thread synchronization events are modeled is that we need to control the execution
order of these events during thread interleaving exploration. In this work, we target multithreaded
C/C++ applications written using POSIX threads. We assume that the program uses thread
creation and join, mutex locks, and condition variables for synchronization. Locks and condition
variables are the most widely used primitives on mainstream platforms. If other blocking synchronizations are used, they also need to be modeled as events in a similar fashion. By doing this,
we will be able to know, at each moment during the program execution, which threads are in the
blocking mode (disabled) and which threads are in the execution mode (enabled).
A currently enabled thread becomes disabled if (i) it attempts to execute lock(lk) while lk is held
by another thread; (ii) it attempts to execute wait(cv) while the signal on cv has not yet been set; or
(iii) it attempts to execute join(j) while the child thread Tj is still running. Similarly, a currently
disabled thread becomes enabled if (i) another thread releases the lock lk by executing unlock(lk),
(ii) another thread sets the condition variable cv by executing signal(cv), or (iii) the waited child
thread Tj terminates. It is important to compute the sets of enabled and disabled threads at run time,
since attempts to schedule disabled threads while postponing the execution of enabled threads may
lead to artificial deadlocks.
2.2 Dynamic Invariant Generation
Likely invariants generated from a program dynamically are properties that hold over the execution
traces produced by a set of test cases. As such, it is not guaranteed that they would hold for
all possible executions of the program. Furthermore, the invariant inference engine often uses
statistical analysis and, in theory, is neither sound nor complete. In practice, the number of
invariants that can be generated, as well as the likelihood of them being true invariants, greatly
depends on the quality of the test suite.
Daikon is a highly successful dynamic invariant discovery tool. It supports programs written in
various programming languages including C/C++, C#, Eiffel, F#, Java, Lisp, and Visual Basic. For
each of these languages, Daikon provides a front-end tool for code instrumentation that adds self-logging capability to the target program. All these front-end tools prepare the program to generate
trace logs in a common format, which are then fed to the same backend invariant inference engine.
For example, the C/C++ front-end in Daikon is called Kvasir, which works directly on the compiled binary code. In this work, we have developed a new instrumentation tool based on the popular
LLVM compiler platform to replace Kvasir. The main advantage of our new instrumentation tool is
that it can leverage the large number of static program analysis procedures implemented in LLVM,
which can potentially increase the performance of the dynamic analysis.
As we have mentioned earlier, Daikon does not have the capability of diversifying the execution
traces in the presence of multithreading and, as a result, may generate many bogus invariants.
Furthermore, Daikon is effective in generating linear invariants of the form (ax + by < c), but
weak in generating more complex invariants such as polynomial invariants (c0 + c1x1 + c2x2 + · · · < 0)
and array invariants. For the latest development along this line of research, please refer to the
recent work by Nguyen et al. [26]. However, the focus of our work is not on improving the
expressiveness of the generated invariants, but on improving the quality of the invariants that are
relevant to concurrency. The vast majority of invariants generated by existing tools capture the
sequential program behavior. But we want to focus only on invariants that refer to shared variables
and therefore capture the behavior relevant to thread interference.
2.3 Motivating Examples
In this section, we will use a few examples to illustrate the problems with existing methods and
highlight our main contributions.
First, consider the program in Figure 2.1, which has a global variable named balance being
accessed by the methods getBalance() and setBalance(). A third method, withdraw(),
invokes the other two methods to deduct a certain amount from balance. Since the global variable
balance is protected by a setLock() and setUnLock() pair every time it is accessed, there
is no data race. However, there can be atomicity violations. Although the method withdraw()
is meant to be executed atomically, i.e., without other threads interleaved in between the calls to
15  int getBalance() {
16      int bal;
17      setLock();
18      bal = balance;
19      setUnLock();
20      return bal;
21  }
22  void setBalance(int bal) {
23      setLock();
24      balance = bal;
25      setUnLock();
26  }
27  void* withdraw(void *arg) {
28      int bal = getBalance();
29      bal = bal - 100;
30      setBalance(bal);
31  }

Figure 2.1: Motivating Example 1
getBalance() and setBalance(), this atomicity is not enforced by the programmer. For example,
start with balance=400: if two concurrent threads run withdraw() at the same time, the result may
be either 300 or 200.
Existing dynamic invariant generation tools do not work well in this case. For example, if we
instrument the compiled program with the Kvasir tool, and then run Daikon, most likely we
will get a false invariant. The reason is that Daikon relies on the native execution environment
to determine the program’s thread schedule at run time. In this example, since the number of
instructions in each thread is small (significantly smaller than the time slice of the Linux operating
system), most likely, each thread will have ample time for completion before the OS forces a
context switch.
Since the erroneous interleaving may never occur, Daikon may falsely report the invariant ::balance
- orig(::balance) + 100 = 0, where ::balance denotes the value of balance at the exit
point of withdraw() and orig(::balance) denotes the value of balance at the entry point.
This is not a true invariant, and it can make the developer believe that the execution of withdraw()
is atomic despite the fact that it is not. In other words, it has masked the symptom of a
concurrency bug.
Our new method, in contrast, relies on a systematic concurrency testing tool called Inspect to
diversify the thread interleavings before feeding the execution traces to the invariant inference
engine. For the example in Figure 2.1, we will be able to cover both the good interleaving
(leading to balance=200) and the bad interleaving (leading to balance=300). Consequently,
the invariant inference engine will be able to produce the correct transition invariant ::balance
< orig(::balance). This is the correct result since the best we can infer from the executions
of this program (two threads running withdraw() concurrently) is that balance decreases.
There are many possible applications for invariants such as (::balance - orig(::balance)
+ 100 = 0) and (::balance < orig(::balance)). In this thesis, we focus on two specific
applications: diagnosing concurrency errors, and inferring atomic regions.
For error diagnosis, we rely on a user provided test oracle to distinguish failing executions from
passing executions. During software testing, it is reasonable to expect the user to provide a test
oracle, which separates bad test runs from good test runs. For our running example, assume
that the oracle asserts that (balance==200) must hold at the end of the execution. This is
illustrated in Figure 2.2. For the buggy program in Figure 2.1, if the method withdraw() is
executed atomically by both threads, the assertion would pass; but if the method withdraw() is
not executed atomically, the assertion would fail.
int main(void)
{
...
pthread_create(&t1,0,withdraw,0);
pthread_create(&t2,0,withdraw,0);
...
pthread_join(t1,0);
pthread_join(t2,0);
R_assert(balance==200); // test oracle
return 0;
}
Figure 2.2: The main function for the example in Figure 2.1.
If we run our new invariant generation method on the passing traces only, according to our previous
discussion, it would report the transition invariant (::balance - orig(::balance) + 100
= 0). In contrast, if we run our new invariant generation method on the failing traces only, it would
report the transition invariant (::balance < orig(::balance)). Presenting the discrepancy
between these two sets of results (invariants generated from passing runs versus invariants generated
from failing runs) will help the user localize the root cause of the concurrency failure.
For atomic region inference, we present only the transition invariants obtained from the passing
runs to the user. For our running example, the transition invariant generated for the code block
Lines 27-32 is (::balance - orig(::balance) + 100 = 0). To determine if the code
block is intended to be atomic, we compute the thread-local transition relation of this code block.
For the method withdraw(), the transition relation projected to the global variables is represented
as follows:
::balance = orig(::balance) - 100
Therefore, the transition invariant generated by our method matches the thread-local transition
relation – which is another way of saying that, in all the passing test runs, the method withdraw() is
void* withdraw(void *arg) {
setLock();
int bal = getBalance();
bal = bal - 100;
setBalance(bal);
setUnlock();
}
Figure 2.3: Fixing withdraw() for the example in Figure 2.1.
executed atomically.
One way to fix the atomicity violation bug in withdraw() is to protect the method body with
another pair of setLock() and setUnlock() – we make sure that a reentrant lock is used here.
This is illustrated in Figure 2.3. To confirm that this is indeed a correct fix, we can run our new
dynamic invariant generation tool again, which would only encounter passing runs, based on which
it reports the expected transition invariant (::balance - orig(::balance) + 100 = 0).
 4  typedef struct { int a, b, c; } Data;
 5  Data *A[128];
 6  Data *p = NULL;
 7  void* thr1(void *arg) {
 8      ...
 9      ...
10      Data *tmp = p;
11      if (tmp != NULL) {
12          p->a = 100;
13          p->b = 200;
14          p->c = 300;
15          R_assert(tmp->a == 100 && tmp->b == 200 && tmp->c == 300);
16      }
17      ...
18      ...
19  }
20  void* thr2(void *arg) {
21      int i = rand() % 127 + 1;
22      p = A[i];
23      return 0;
24  }
Figure 2.4: Motivating Example 2
Although existing tools such as Daikon can already infer state and transition invariants, there
is a major limitation: they only generate such invariants at the method entry and exit points.
=======================================
..main.c_10_15_0():::ENTER
::p == 36497856
=======================================
..main.c_10_15_0():::EXIT
::p == orig(::p)
::p == 36497856
=======================================
..main.c_12_15_1():::ENTER
::p == 36497856
=======================================
..main.c_12_15_1():::EXIT
::p == orig(::p)
::p == 36497856
=======================================
..main.c_15_15_3():::ENTER
::p == 36497856
=======================================
..main.c_15_15_3():::EXIT
::p == orig(::p)
::p == 36497856
Figure 2.5: Generated Invariants by R Tool for Figure 2.4
Unfortunately, atomic regions in multithreaded programs often do not coincide with the procedural
boundaries. Our new method, in contrast, has the capability of generating state and transition
invariants for code blocks of arbitrary size – as shown in Figure 1.1, the user can set the block size
from the command line.
Consider our second motivating example, shown in Figure 2.4, which has a global array of struct
type Data. Within thr1(), the three fields are intended to be updated atomically – the programmer's intent is captured by the test oracle in R_assert(). In our dynamic invariant generation
tool, this test oracle allows us to distinguish passing test runs from failing test runs. The question
here is how to infer the intended atomic region, which spans from Line 10 to Line 15. In this case,
R Tool's capability of generating invariants for arbitrary code blocks is crucial.
Figure 2.5 shows the section of R Tool’s output regarding the transition invariant. Here, we start
with a code block size of 2, which means that for any code region that has two consecutive accesses
of a shared object, we will try to generate the transition invariants. The partitioning of the source
code into code regions was performed by our new LLVM based instrumentation tool, as shown in
Figure 1.1, which gradually increases the number of load/store instructions over global variables
in the code block, to allow the determination of the largest possible atomic code region.
For example, in Figure 2.5, ..main.c_10_15_0() means that R Tool is considering the code
block from Line 10 to Line 15 in Figure 2.4, whereas ..main.c_12_15_1() means that R Tool
is considering the code block from Line 12 to Line 15. In both cases, R Tool is able to generate
the invariant (::p == orig(::p)), which indicates that the value of the pointer p is never changed
in between. By comparing these automatically generated invariants with the transition relation of
the source code, the developer can reach a conclusion that, for the test runs to pass, the code block
from Line 10 to Line 15 must be kept atomic.
Chapter 3
Dynamic Invariants Generation for Concurrent Programs
In this section, we present our new dynamic invariant generation method, which consists of three
main components: an LLVM based code instrumentation tool, a systematic interleaving exploration
tool, and a customized inference engine in Daikon. The overall flow of our method is illustrated
by Algorithm 1, which takes the C/C++ source code of a multithreaded program as input and
returns a set of likely invariants as output.
Algorithm 1 R Tool for dynamically generating invariants.
inst_output ← inspect_pass(src_code)
inst_output ← daikon_pass(inst_output)
inst_output ← spacer_pass(inst_output, spacer_size)
trace_file ← gen_traces(inst_output)
thrd_mod_traces ← trace_classifier(trace_file)
invariants ← inv_inference(thrd_mod_traces)
3.1 LLVM Based Code Instrumentation
We have developed an LLVM based front-end for instrumenting multithreaded C/C++ code. As
shown in Algorithm 1, the instrumentation consists of three code transformation passes.
The first pass, named inspect_pass, takes the C/C++ code as input and returns an instrumented
version of the code as output. Inside this pass, we first identify all the program points where
the thread schedule needs to be controlled by the Inspect tool during systematic interleaving
exploration. These program points include the calls to thread synchronization routines, the read and
write operations on shared memory locations, and the calls to low-level built-in atomic instructions
in GNU, such as the compare-and-swap (CAS) operations. At each of these program points, we inject
new code before the visible operation, to allow the systematic interleaving explorer to control the
thread at run time.
The second pass, named daikon_pass, takes the instrumented code as input and returns another
instrumented version of the code as output. Inside this pass, we inject new code to add self-logging
capabilities to the program at execution time. The trace log generated by the program
will conform to the common file format required by the backend invariant inference engine in
Daikon. By default, this pass instruments the code only at the method invocation and response
points, which is compatible with the default front-end tool in Daikon, called Kvasir. However,
our implementation also has the capability of optimizing the runtime performance of trace logging
by limiting the number of program points to be considered. For example, if we pass the dump_ppt
option through the command line, the instrumentation tool will report the names of all functions in each
program module into a text file. A user may choose to edit the text file, remove some program
points, and therefore reduce the logging overhead at run time.
The third pass, named spacer_pass, takes the previously instrumented code as input and returns
the final version of the code as output. Inside this pass, we also inject new code to add self-logging
capabilities at the boundaries of arbitrary code blocks, not just at the procedural boundaries
as in Kvasir. This is accomplished by inserting hook function calls at the entry and exit
points of these code blocks, which in turn take care of the trace generation at run time. This is an
important pass because it allows our method to generate state and transition invariants for arbitrary
code blocks, which are crucial for inferring atomic regions in the software code. The size of the
code block, around which the hook function calls are inserted, can be controlled by the user. More
specifically, this pass takes an integer value, spacer_size, based on which it groups a set of
load/store instructions over global variables and inserts hook function calls at the boundaries
of this block of instructions.
Note that the events logged as a result of the second and third passes are kept in the same format.
From the standpoint of the backend invariant inference engine, there is no distinction between a
pair of entry and exit points for a method, and a pair of entry and exit points for an arbitrary code
block. Therefore, the backend inference engine does not have to be changed drastically in order to
infer invariants at the boundaries of arbitrary code blocks. By varying the value of the spacer count,
we can dynamically change the locations where state and transition invariants are generated. This
will help us to detect likely atomic regions in the code. We will discuss this application in detail
in Section 4.2.
Since the input program is a multithreaded program, during our instrumentation, we need to couple
the thread id information with each program point. The thread id information is derived from
the pthread_t field of a POSIX thread. However, we have made it a unique and persistent number
associated with each thread in the program. By persistent, we mean that the thread id remains the
same across different program executions, even though the value of the pthread_t field may change
due to the thread being loaded into different memory locations. This allows the thread id
to be coupled with the program seamlessly, based on which the backend inference engine can
deduce indexed predicates such as (::balance - orig(::balance) + thread_id*10 =
0). Such invariants will be useful in applications such as thread-modular verification [14, 21].
3.2 Systematic Thread Interleaving Exploration
Due to interleaving explosion, in general, we cannot afford to explicitly enumerate all possible
thread schedules while diversifying the execution traces for the backend inference engine. In this
work, we rely on a set of advanced search strategies implemented in the Inspect tool to produce
a subset of representative thread interleavings. These search strategies are based on the theory of
partial order reduction [16, 13]. They group the possible interleavings of a concurrent program
into equivalence classes, and then pick one representative from each equivalence class.
Equivalence classes are defined using Mazurkiewicz traces [24]. Two sequences of events are said
to be in the same equivalence class if we can create one sequence from the other by successively
permuting the adjacent and independent events. Two events are dependent if they are from two
different threads, access the same memory location, and at least one of them is a write or modify
operation; otherwise, they are independent. It has been proved [16] that in the context of verifying
concurrent systems, exploring one representative from each equivalence class is sufficient to catch
all deadlocks and assertion violations.
In this work, we will leverage a state-of-the-art algorithm implemented in the Inspect tool,
called dynamic partial order reduction (DPOR [13]). Here, the dependency relation is computed
dynamically at run time, which is the main reason why it can robustly handle realistic applications
written in C/C++ languages; for such applications, statically computing the dependency would
have been too difficult due to the widespread use of pointers, method calls, and heap allocated data
structures.
In addition to DPOR, we also employ two more reduction strategies: preemptive context bounding
(PCB [25]) and history-aware predecessor set (HaPSet [36]) reductions. The idea of using context
bounding to reduce complexity of software verification was first introduced for static analysis [29]
and later for testing [25]. It has since become an influential technique, as in practice many
concurrency bugs can be exposed by interleaving with fewer context switches. In this strategy,
named PCB, the interleaving explorer generate only those interleavings with less than or equal to
k preemptive context switches, where k is a small integer of 2 or 3. HaPSet can be viewed as a
further improvement over DPOR and PCB.
3.3 Customized Invariant Inference Engine
We customized the invariant generation engine in Daikon to focus on two types of invariants
in a multithreaded program: the state invariants and the transition invariants, both of which are
expressed in terms of shared variables. A state invariant is a predicate expressed in terms of variable
values at the current program location. A transition invariant is a predicate expressed in terms of
variable values both at the current location and at some previous program locations. Therefore, a
transition invariant is capable of capturing the impact of a code block.
More formally, we consider a program P as a state transition system P = ⟨S, R, I⟩, where S is the
set of states, I ⊆ S is the set of possible initial states, and R ⊆ S × S is the transition relation.
In general, a transitional invariant [28], denoted T, is an over-approximation of the transitive
closure of R restricted to the reachable state space, i.e., R⁺ ∩ (R*(I) × R*(I)) ⊆ T. In other
words, a transitional invariant captures the relation between the pre- and post-conditions of a set
of instructions.
In concurrent systems, transition invariants are extremely important because they would conform
to the transition relation of a sequential code block in the absence of thread interference, but would
deviate from the transition relation in the presence of thread interference. In testing and verification
of concurrent systems, we typically assume that the sequential part of the computation is correct
but the concurrency control part of the program is potentially buggy. In such cases, transition
invariants will be more important than most of the other invariants generated by the inference
engine. To accurately generate transition invariants, we need to feed the inference engine a
diverse and representative set of execution traces. This is the main reason why, without the help of
systematic exploration techniques, existing tools would produce many false invariants. R Tool, in
contrast, can robustly generate transition invariants by taking thread scheduling into consideration.
Besides thread interleaving, another reason why the inference engine may produce wrong results
is that the good and bad executions are typically mixed in the input. To solve this problem, we
have developed a trace classification module that separates the execution traces into two groups:
the passing traces and the failing traces, based on a test oracle. Specifically, R Tool provides a
macro called R_assert, through which the programmers can write correctness conditions that the
program is expected to satisfy. During the dynamic analysis, if the assertion fails, the corresponding
execution trace will be classified as a bad trace. While inferring likely invariants, the engine
may choose to consider only the bad traces, only the good traces, or all traces. Then, we can
compare and contrast the invariants generated in these three scenarios. We shall demonstrate in
the following sections that the resulting information is useful in applications such as diagnosing
concurrency bugs and inferring likely atomic regions in the software code.
Chapter 4
Applications of the Generated Invariants
In this section, we illustrate two applications where the dynamically generated likely invariants
from multithreaded programs would be useful. The first application is diagnosing the concurrency
bug in a multithreaded program. The second application is inferring likely atomic regions in the
software code.
4.1 Diagnosing Concurrency Errors
In this application, we assume that the multithreaded program has some assertions that would be
satisfied in most cases, but may fail in some rare thread interleavings. In this case, the execution
traces of the program are classified into two groups: the good traces, and the bad traces. In order
to identify the root cause of the concurrency error, we leverage the two sets of invariants generated
by R Tool and compare them to identify the discrepancy. Our conjecture is that the discrepancy
 8  const int NUM = 2;
 9  int sum = 0;
10  int array[128];
11  int array_index = -1;
12  void *thrd(void *arg) {
13      int temp = array_index;
14      ......
15      temp = temp + 1;
16      array_index = temp;
17      array[array_index] = 1;
18      ......
19  }
20  int main() {
21      .......
22      //After Joining all the NUM threads
23      for (i = 0; i < NUM; ++i) {
24          sum += array[i];
25      }
26      R_assert(sum == NUM); // test oracle
27      ......
28  }
Figure 4.1: Example for diagnosing concurrency errors.
will provide valuable information for us to understand why the concurrency error occurs.
More formally, let Invgood be the set of likely invariants generated from the good traces only, and
Invbad be the set of likely invariants generated from the bad traces only. Let Invdiff = (Invgood \
Invbad) be the result of the set-minus operation. That is, Invdiff consists of invariants that are
satisfied by the good traces but not by the bad traces.
Consider the example in Figure 4.1, where a parameterized number of threads share a global array
and an index named array_index. NUM is the number of threads that execute method thrd()
concurrently. The test oracle provided by the programmer is shown in Line 26, which states
that the expected result is (sum==NUM). The assertion passes in good runs where each thread
executes method thrd() atomically, but fails in bad runs where threads interfere with each other.
Depending on how they interfere with each other, the value of sum ranges from 1 to NUM.
When given the good traces, our R Tool will be able to generate transition invariants as reported
==========================
..main.c_13_17_0():::ENTER
::array_index < ::array
==========================
..main.c_13_17_0():::EXIT
::array == orig(::array)
......
::array_index - orig(::array_index) - 1 == 0
Figure 4.2: Transitional invariants generated for Figure 4.1.
in Figure 4.2. In particular, the invariant on the last line shows that the combined effect of running
Lines 13-17 in Figure 4.1 on the variable array_index is that it is increased by one.
When given the bad traces, however, our R Tool will not be able to generate the aforementioned
transition invariant. Instead, it will generate ::array_index > orig(::array_index), which
also covers cases in which array_index is increased by 2, 3, ...
By comparing the two sets of invariants, we will be able to see the difference: ::array_index ==
orig(::array_index) + 1 versus ::array_index > orig(::array_index), which highlights
the root cause of this concurrency error.
4.2 Inferring Likely Atomic Regions
In this application, we assume that certain code blocks are either atomic or meant to be atomic,
even if the atomicity is not properly enforced in the latter case. However, these code blocks are
not known to us a priori, and we would like to identify them automatically with the help of the
dynamically generated transition invariants.
When the transition relation of a code region is consistent with the transition invariant generated
for the same code region, we say that the section has been executed atomically. Furthermore, if the
transition invariant is generated from the good traces only – and it is not satisfied by the bad traces
– we say that the code region is meant to be atomic. If the transition invariant is generated from
all traces, we say with high confidence that the code region is atomic. For example, the method
thrd() in Figure 4.1 and the method withdraw() in Figure 2.1 are meant to be atomic, whereas
the fixed version of withdraw() in Figure 2.3 is atomic.
Another important notion in this context is the maximal atomic region. Although Lines 13-15 can
be regarded as a likely atomic region, it is embedded in Lines 13-17, which is a larger atomic
region. Since R Tool can vary the size and location of the code regions for which it generates
transition invariants, we are able to check code blocks with an increasing number of instructions.
Algorithm 2 shows the process of iteratively increasing the spacer size, until we reach a block of
instructions whose transition relation is no longer consistent with its transition invariant.
Algorithm 2 Computing the maximal likely atomic region.
spacer_size ← 1
do
    spacer_size ← spacer_size + 1
    tr_inv ← R_Tool(src_code, spacer_size)
    res ← chk_consistency(tr_inv, TR)
while res = true
Computing the Transition Relation
The transition relation of an instruction is the relation
between pre- and post-condition of that instruction. Similarly, the transition relation of a block
of instructions is the relation between their pre- and post-conditions. Therefore, the transition
relation, when being projected to the shared variables only, shows the cumulative effect of all the
Acc_1 || Acc_2

Transfer_x
----------------------
1: Acc_1 = Acc_1 - x;
   ...
2: Acc_2 = Acc_2 + x;
   ...

Transfer_y
----------------------
1': Acc_2 = Acc_2 - y;
    ....
2': Acc_1 = Acc_1 + y;
    ....
Figure 4.3: The bank account transfer example
instructions in the code block.
Consider a banking example for transferring funds from one account to another, shown in Figure
4.3. Here, Transfer_x() is called by one thread and Transfer_y() is called by another thread,
where x and y are (thread-local) parameters, respectively. In a good run, the entire method has to be
executed atomically by each thread, without interference from the other thread. The transition
relation of the first statement in Transfer_x() is Acc_1 = orig(Acc_1) - x. Similarly, the
transition relation of the entire method is Acc_1 + Acc_2 = orig(Acc_1 + Acc_2). The transition
relation for Transfer_y() is computed similarly.
Currently, the transition relation is computed by the user and then provided to R Tool. We
plan to automate this process in the future, by leveraging existing static program analysis tools
such as BLAST [21], which can compute transition relations using symbolic methods.
Chapter 5
Experiments
We have implemented the proposed methods in a software tool, called R Tool, which can handle
unmodified C/C++ code written for the Linux/Pthreads platform. R Tool consists of an LLVM
based instrumentation tool, an Inspect based interleaving explorer, a Java based trace classifier,
and a customized inference engine in Daikon. We implemented R Tool by writing 2126 lines of
new C++ code, 776 lines of Java code, and 229 lines of Python code.
We have evaluated our new methods on a set of public domain benchmark examples. Our targeted
applications are low level systems code, such as multithreaded implementations of concurrent
data structures and synchronization primitives. Due to this reason, we have extended the existing
infrastructure in Inspect to handle the GNU built-in atomic instructions, which are crucial in
directly and atomically accessing the shared memory. Our experimental evaluation is designed to
answer the following research questions:
• How good is the quality of the invariants generated by our new method, especially when
compared to existing tools such as Daikon?
• How useful are the invariants in the context of diagnosing concurrency errors and inferring
likely atomic regions?
We have evaluated our new methods on two sets of concurrency related benchmarks. The SVCOMP’14 examples come from the 2014 Software Verification Competition [34], consisting of
implementations of low level algorithms such as mutual exclusion and concurrent data structures.
The SCTBench benchmark example simulates a multithreaded banking transaction process. The
characteristics of these benchmark examples are summarized in Table 5.1. Note that some of these
benchmarks, such as Dekker, are inherently non-terminating. To apply dynamic analysis, we have
modified the source code of the test program slightly in order to turn the infinite loops into finite
loops, while maintaining the program’s intent.
5.1 Results
Table 5.2 shows the results of applying both R Tool and Daikon to the benchmark examples.
Column 1 shows the program name, Column 2 shows the number of program points, and Column 3
shows the number of shared variables. Since Daikon only generates invariants at the method entry
and exit points, for comparison purposes, we have taken only the results of R Tool at the same set
of method entry and exit points – otherwise, R Tool would always generate a lot more invariants
than Daikon. Columns 4-7 show the statistics of running Daikon, including the total number
of invariants reported, and among them, the true invariants and the bogus invariants, as well as the
execution time in seconds. Columns 8-11 show the statistics of running R Tool , including the total
number of invariants reported, and among them, the true invariants and the bogus invariants, as well
as the execution time in seconds. For both Daikon and R Tool , we have manually classified the
reported invariants (total) into true invariants (correct) and bogus invariants (incorrect). Due to the
use of systematic interleaving exploration, R Tool in general took more time to complete, but did
not report any bogus invariants. In contrast, a large number of invariants reported by Daikon are
bogus.
Table 5.3 shows the statistics of the invariants generated by R Tool in more detail. Since we are
not comparing the result against Daikon (baseline), we will consider the invariants generated at
all possible program points, not just the method entry and exit points. Therefore, the total number
of invariants reported in this table is larger than that of Table 5.2. Here, we first classify the
invariants generated by R Tool into two groups: Transition invariants over shared variables and
the rest (called the Regular Invariants). For each transition invariant reported by R Tool , we also
manually inspect the source code to see if we can manually construct the same transition invariant.
If the reported transition invariants match the manually generated ones, in the last column, we
put yes; otherwise, we put no.
Our results show that, except for Sync01, R Tool is always able to generate the correct transition
invariants over shared variables. The reason why R Tool cannot generate one of the two transition
invariants for Sync01 is the way the test program in this benchmark is written. Part of
the code for Sync01 is shown in Figure 5.1 (left), where the waiting thread at Line 20 only gets the
Thread 1 (left):
19  while (num > 0)
20      cond_wait(&empty,&m);
21  orig = num;
22  temp = num;
23  temp = temp+1;
24  num = temp;

Thread 2 (right):
40  while (num == 0)
41      cond_wait(&full,&m);
Figure 5.1: Sync 01: part of the code for the two threads.
signal to proceed when num is set to 0 by another thread, shown in Figure 5.1 (right). Even if we
try to run the tests with different values for num, this code block (Lines 21-24) will never get the
transition invariant ::num - orig(::num) - 1 == 0, because the inference engine (in Daikon as
well as in R Tool) needs different values of num in order to generate a linear invariant. (Recall
that, on a two-dimensional plane, two points are needed to determine a line.) Thus, it fails to
detect the transition invariant in this case.
Table 5.4 shows the result of using the generated invariants to infer atomic regions. Column 1
shows the name of the benchmark. Column 2 shows the number of atomic regions inferred from
the code. Columns 3-5 show the minimum, maximum, and average size of these atomic regions.
We have manually inspected all reported atomic regions, and checked whether they are correct or
not. Column 6 shows whether the atomic regions are indeed correct. Except for a few cases, we
have successfully detected all available atomic regions in the software code. However, for Sync01,
IncTrue, IncCas, and LogProcSweep, although the system detects only one atomic region, there are in
fact two atomic regions.
Finally, we evaluate the scalability of our new method. In general, the computational complexity
of R Tool depends on the length of each individual execution trace, as well as the number of
execution traces fed to the backend inference engine. Among them, the number of execution traces
is the main bottleneck: by using an Inspect based interleaving explorer, we subject R Tool,
in theory, to a potentially exponential number of thread interleavings in the worst case,
even with the help of partial order reduction techniques. In practice, however, we can still use
more aggressive reduction heuristics, such as PCB and HaPSet (covered in Section 3.2), during
interleaving exploration. The question is whether such interleaving reduction decreases the quality
of the generated invariants.
Table 5.5 shows the comparative results of using the three different exploration techniques in
R Tool. Here, we compare both the number of runs that each strategy requires to explore the
interleaving space and the number of transition invariants generated by R Tool as a result of
the exploration. The first thing we notice is that there is a significant reduction in the number
of runs when moving from DPOR to PCB and HaPSet. At the same time, we notice that the
number of transition invariants remains the same for all benchmarks. In other words,
using a more aggressive (and unsound) exploration strategy (such as PCB or HaPSet) does not
seem to noticeably reduce the effectiveness of the invariant generation algorithm. Figure 5.2 shows
the scalability of each algorithm with an increasing number of threads in the targeted program.
These experimental results were obtained by running R Tool on the Indexer benchmark from
SVCOMP'14 and varying the number of threads in the program. The abscissa denotes the
number of threads present in the system, whereas the ordinate denotes the number of iterations
that the Inspect tool requires to find the necessary interleavings. While DPOR times out very
quickly, HaPSet shows a significantly lower rate of increase in the number of iterations as the
number of threads in the system grows.
[Plot omitted: number of iterations (ordinate, 0 to 5000) against thread count (abscissa, 2 to 10) for HaPSet, PCB, and DPOR.]
Figure 5.2: Scalability comparison of the DPOR, PCB, and HaPSet algorithms.
5.2 Discussions
5.2.1 Threats to validity.
There are two threats to validity in our experimental evaluation. The first threat is the degree
to which the SVCOMP’14 and SCTBench programs used in our experiment represent realistic
applications. Although we believe that they are representative of the kind of low level systems
code that we target, there can be discrepancies. Another threat is that, for applications, we only
evaluated the use of transition invariants over shared variables for inferring likely atomic regions.
It remains unclear if the vast majority of the remaining invariants are useful (we suspect that they
are mostly sequential properties and therefore have nothing to do with concurrency, but this needs
to be investigated further).
5.2.2 Limitations.
Our method is limited in three aspects. First, it relies on the user to provide test oracles in the
form of hook assert conditions so that R Tool can separate good traces from bad traces. One
way to address this limitation is to design a static code analysis method for inserting assertions
automatically. Second, our method relies on systematic interleaving exploration to improve the
diversity of the execution traces. Although our experiments show that the current implementation
of interleaving exploration is good enough for low-level systems code whose size is small, scalability may become a problem for larger programs. Third, our method for automatically inferring
likely atomic regions still relies on the user's manual inspection to confirm the results. One way
to remedy this limitation is to use an automated verification procedure to formally prove the
correctness of the result.
Table 5.1: The characteristics of the benchmark examples used in our experimental evaluation.

Benchmark Name | LoC | Program Points                       | Description
FibBench       | 55  | t1, t2, main                         | This program computes the Fibonacci series using two shared variables, i and j, and two threads.
Lazy01         | 52  | thread1, thread2, thread3            | Method thread1 increases the variable data by 1, whereas thread2 increases it by 2. Method thread3 checks the result. All methods are protected by a lock.
Stateful01     | 55  | thread1, thread2, main               | Shared variables data1 and data2 are accessed by two threads. All accesses are protected by a lock.
Sync01         | 64  | thread1, thread2, main               | Shared variable num is accessed by two threads and is protected by a lock and condition variables.
Dekker         | 68  | thr1, thr2, main                     | This is an implementation of Dekker's algorithm for mutual exclusion.
Lamport        | 119 | thr1, thr2, main                     | This is an implementation of Lamport's algorithm for mutual exclusion for two threads.
Peterson       | 55  | Peterson                             | This is an implementation of Peterson's algorithm for mutual exclusion.
TimeVarMutex   | 55  | allocator, deallocator, main         | This is a scaled-down version of the inode allocation procedure. Here, inode is a shared variable set by allocator and reset by deallocator. Accesses are protected by two locks.
Szymanski      | 106 | thr1, thr2, main                     | This is an implementation of the mutual exclusion algorithm by Szymanski for two threads.
IncTrue        | 55  | thr1, main                           | This is an implementation of the atomic increment operation for an integer variable. The atomic increment is implemented using a shared integer flag.
IncCas         | 60  | thr1, main                           | This is another implementation of the atomic increment operation for an integer. This implementation mimics the atomic compare-and-set operation.
IncDec         | 58  | inc, dec, main                       | A shared integer variable is atomically incremented and decremented by two threads, for which mutual exclusion is maintained by using a shared integer flag.
IncDecCas      | 89  | inc, dec, main                       | Another implementation of the shared integer incrementing and decrementing procedure. Here, the increment and decrement follow a compare-and-set pattern.
ReOrder        | 63  | setThread, checkThread, main         | This example has two threads running setThread and checkThread, respectively. A shared variable is set by setThread and checked by checkThread.
AccountBad     | 61  | deposit, withdraw, main, check result | This example simulates a banking transaction application, where deposit adds a fixed amount of money to the shared variable balance, whereas withdraw removes the same fixed amount. The third method check result checks the result. All accesses to the shared variable are protected by a lock.
LogProcSweep   | 83  | new log entry, add log entry, logging | This sample code from the Apache Foundation is a logger that maintains a list of Log entry. It has two threads, one of which adds new log entries while the other resets the list.
pfscan         | 932 | worker                               | This is a multithreaded implementation of the file system scanning utility in Linux for scanning a particular path in the file system.
Table 5.2: Invariant comparison between Daikon and R Tool .

Benchmark    | Program | Global | Daikon (baseline)                        | R Tool (new)
Name         | points  | vars   | Total | Correct | Incorrect | Time (s)   | Total | Correct | Incorrect
FibBench     | 3       | 2      | 17    | 15      | 2         | 2.9        | 30    | 30      | 0
Lazy01       | 4       | 1      | 15    | 15      | 3         | 2.6        | 14    | 14      | 0
Stateful01   | 3       | 2      | 22    | 15      | 8         | 2.7        | 18    | 18      | 0
Sync01       | 3       | 1      | 14    | 12      | 2         | 1.6        | 9     | 9       | 0
Dekker       | 3       | 4      | 39    | 24      | 20        | 1.7        | 53    | 53      | 0
Lamport      | 3       | 5      | 48    | 17      | 31        | 5.0        | 76    | 76      | 0
Peterson     | 3       | 4      | 39    | 13      | 26        | 1.7        | 51    | 51      | 0
TimeVarMutex | 3       | 3      | 27    | 23      | 4         | 1.6        | 22    | 22      | 0
Szymanski    | 3       | 3      | 29    | 14      | 15        | 1.6        | 35    | 35      | 0
IncTrue      | 2       | 2      | 15    | 11      | 4         | 2.2        | 19    | 19      | 0
IncCaS       | 2       | 1      | 18    | 16      | 2         | 2.7        | 10    | 10      | 0
IncDec       | 3       | 6      | 57    | 31      | 26        | 3.4        | 75    | 75      | 0
IncDecCaS    | 3       | 3      | 24    | 24      | 0         | 2.9        | 40    | 40      | 0
Reorder2     | 3       | 4      | 47    | 38      | 9         | 2.5        | 56    | 56      | 0
AccountBad   | 4       | 6      | 68    | 50      | 18        | 3.8        | 78    | 78      | 0
LogProcSweep | 7       | 1      | 41    | 39      | 2         | 2.5        | 45    | 45      | 0
pfscan       | 1       | 1      | 0     | 0       | 0         | 1.4        | 1     | 1       | 0
Table 5.3: The quality of invariants generated by R Tool .

Benchmark    | R Tool Gen                            | Manual Inspection          |
Name         | Regular Invs | Transition Invs | Total | Transition Invs | Matched  | Time (s)
FibBench     | 99           | 2               | 101   | 2               | yes      | 258.6
Lazy01       | 39           | 2               | 41    | 2               | yes      | 5.5
Stateful01   | 58           | 4               | 62    | 4               | yes      | 4.0
Sync01       | 31           | 1               | 32    | 2               | no       | 2.0
Dekker       | 198          | 2               | 200   | 2               | yes      | 62.9
Lamport      | 399          | 2               | 401   | 2               | yes      | 9.1
Peterson     | 188          | 2               | 190   | 2               | yes      | 11.4
TimeVarMutex | 51           | 2               | 53    | 2               | yes      | 1.7
Szymanski    | 205          | 2               | 207   | 2               | yes      | 40.4
IncTrue      | 33           | 1               | 34    | 1               | yes      | 12.6
IncCaS       | 5            | 1               | 6     | 1               | yes      | 3.8
IncDec       | 229          | 2               | 231   | 2               | yes      | 21.3
IncDecCaS    | 25           | 2               | 27    | 2               | yes      | 3.6
Reorder2     | 113          | 2               | 115   | 2               | yes      | 273.1
AccountBad   | 174          | 2               | 176   | 2               | yes      | 8.2
LogProcSweep | 61           | 1               | 62    | 1               | yes      | 3.1
Table 5.4: Size of atomic regions under dynamic invariants.

Benchmark     | Atomic regs | Min size | Max size | Avg size | Correct
FibBench      | 2           | 2        | 5        | 3.5      | yes
Lazy01        | 2           | 2        | 5        | 3.5      | yes
Stateful01    | 4           | 3        | 5        | 4        | yes
Sync01*       | 1           | 3        | 9        | 6        | yes
Dekker        | 2           | 2        | 4        | 3        | yes
Lamport       | 2           | 2        | 4        | 3        | yes
Peterson      | 2           | 2        | 4        | 3        | yes
TimeVarMutex  | 2           | 2        | 2        | 2        | yes
Szymanski     | 2           | 2        | 10       | 6        | yes
IncTrue*      | 1           | 3        | 8        | 5.5      | yes
IncCaS*       | 1           | 2        | 19       | 10.5     | yes
IncDec        | 2           | 2        | 4        | 3        | yes
IncDecCaS     | 2           | 2        | 9        | 5.5      | yes
Reorder       | 2           | 2        | 3        | 2.5      | yes
AccountBad    | 2           | 2        | 6        | 4.5      | yes
LogProcSweep* | 1           | 2        | 3        | 2.5      | yes
Table 5.5: Invariants generated with different search strategies.

Benchmark    | DPOR              | PCB (k=2)         | HaPSet
Name         | # Runs  | # Invs  | # Runs  | # Invs  | # Runs  | # Invs
FibBench     | 71624   | 2       | 252     | 2       | 2576    | 2
Lazy01       | 160     | 2       | 260     | 2       | 132     | 2
Stateful01   | 48      | 4       | 68      | 4       | 32      | 4
Sync01       | 7       | 1       | 16      | 1       | 4       | 1
Dekker       | 3896    | 2       | 15      | 2       | 519     | 2
Lamport      | 392     | 2       | 21      | 2       | 321     | 2
Peterson     | 730     | 2       | 15      | 2       | 150     | 2
TimeVarMutex | 4       | 2       | 19      | 2       | 4       | 2
Szymanski    | 5980    | 2       | 23      | 2       | 3015    | 2
IncTrue      | 848     | 1       | 84      | 1       | 1140    | 1
IncCaS       | 132     | 1       | 100     | 1       | 256     | 1
IncDec       | 1936    | 2       | 240     | 2       | 1352    | 2
IncDecCaS    | 32      | 2       | 76      | 2       | 56      | 2
Reorder2     | 19425   | 2       | 832     | 2       | 71      | 2
AccountBad   | 160     | 2       | 260     | 2       | 36      | 2
LogProcSweep | 9       | 1       | 84      | 1       | 36      | 1
Chapter 6
R Tool User’s Manual
R Tool is a dynamic invariant generation tool for sequential and concurrent C/C++ programs. It is
built upon the LLVM compiler front-end (llvm-3.2), the Inspect concurrency testing tool (inspect-0.3), a Java based trace classifier (SimpleDeclParser), and the Daikon dynamic invariant generation
tool.
6.1 Installation
The installation process consists of the following steps:
1. Check out the R Tool git repository.
2. Install LLVM and the DaikonPass.
3. Install Inspect.
4. Install the trace classifier SimpleDeclParser.
5. Install Daikon.
6. Update the paths in exports.sh.
7. Install Qt (Optional – required only for using the GUI).
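For orientation, the whole sequence can be sketched as the following shell session. This is an illustrative summary of the sections below rather than a tested script; folder names such as LLVM_PASS and smt_dp are written approximately as they appear in the text, and the repository URL is the one given in Section 6.1.1.

```shell
# Step 1: check out the repository (URL as given in Section 6.1.1),
# then work from inside the checkout.
cd RTool

# Step 2: install LLVM and the DaikonPass.
tar -xvf llvm-3.2.src.tar.bz2
cp -r LLVM_PASS/Daikon llvm-3.2.src/lib/Transforms/
(cd build && chmod +x conf.sh && ./conf.sh)

# Step 3: install Inspect (the SMT helper first).
(cd smt_dp && make)          # folder name approximate; see Section 6.1.3
(cd inspect-0.3 && make)     # produces the inspect executable in bin/

# Step 5: install Daikon (after installing JDK 1.7; see Section 6.1.5).
source "$DAIKONDIR/scripts/daikon.bashrc"
(cd "$DAIKONDIR" && make)

# Steps 4 and 6: set CLASSPATH and the other path variables, then load them.
source exports.sh
```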
6.1.1 Checking out the R Tool git repository
Check out the git repository into a local folder by running the following command:
git clone [email protected]:arijitvt/RTool.git
This will create a folder named R Tool , which is a self-contained package with all dependencies.
For the rest of this document, the R Tool folder will denote this folder, and all relative paths will
be calculated with respect to it.
6.1.2 Installing LLVM and the DaikonPass
The R Tool folder contains our customized version of LLVM as a compressed file (llvm-3.2.src.tar.bz2). To install this version of LLVM:
• First, untar the package by running the command
tar -xvf llvm-3.2.src.tar.bz2.
This will create a folder named llvm-3.2.src and place the source inside the folder.
• Copy the Daikon folder (which is inside the LLVM PASS folder) to the existing folder
R Tool /llvm-3.2.src/lib/Transforms.
• Then, go to the build directory, which should contain the conf.sh file. Make the script
executable by running the command
chmod +x conf.sh
• Run conf.sh from the build folder. This will build LLVM and all the code transformation passes that are needed to run R Tool . The resulting files are placed inside the build/Release+Debug+Asserts folder.
• Finally, update the LLVM variables in R Tool /exports.sh so that they point to the complete
path of the Release+Debug+Asserts folder.
6.1.3 Installing the Inspect Tool
Since our customized version of the Inspect tool uses an SMT solver, we need to install smt dp
before installing Inspect. Go to the smt dp folder and run make there. It will compile and copy
the required files to the proper location.
Installing Inspect is also straightforward. Go to the inspect-0.3 folder and run the make
command there. It will compile and create the inspect executable inside the R Tool /bin
folder. To check that the make command executed properly, verify that the inspect executable
is present in the R Tool /bin folder.
After the installation, remember to set the insp path variable in the R Tool /exports.sh file to the
location of the inspect-0.3 folder.
6.1.4 Installing the Trace Classifier
The jar file for the trace classifier is in the R Tool /jars folder. Update the CLASSPATH in
R Tool /exports.sh with the location of parser.jar.
Since the source code for the jar file is also present in the R Tool /SimpleDeclParser folder,
one can, if needed, rebuild the jar file from the Java source using the standard jar packaging
process and then update the CLASSPATH in R Tool /exports.sh with the location of the new
jar file.
6.1.5 Installing Daikon
The original guide for Daikon installation can be found at the following URL:
http://plse.cs.washington.edu/daikon/download/doc/daikon.html.
Below is a brief summary of the major steps associated with this process.
• Download and install JDK 1.7 (I've found that JDK 1.6 does not work for the current
version of Daikon).
• Update JAVA HOME in .bashrc with the installation directory of JDK 1.7.
• Set the DAIKONDIR variable to the top directory of the daikon folder and export the
variable.
• Source daikon.bashrc by running
source $DAIKONDIR/scripts/daikon.bashrc
• Then run make from the daikon folder. This will install Daikon and update all the required
paths.
6.1.6 Updating the path in exports.sh
During the installation process, we have already updated various path variables. You may want to
double-check that all the path variables are set properly at this stage.
Once that is done, update the PATH variable with the location of the R Tool /bin folder and update
the daikon variable with the source folder where the DaikonPass resides. The DaikonPass folder
is located at llvm-3.2.src/lib/Transforms/Daikon. This folder should contain the hook.h and hook.c
files, which contain the source for the hook assert macro that we have defined.
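As an illustration, a minimal exports.sh might look like the following configuration sketch. All paths are placeholders; the variable names come from the sections above, with underscores assumed where the thesis text shows spaces (e.g., insp_path), and RTOOL_HOME is introduced here purely for readability.

```shell
# exports.sh -- illustrative sketch only; adjust every path to your checkout.
export RTOOL_HOME="$HOME/RTool"                  # assumed helper variable

# LLVM build output (Section 6.1.2).
export LLVM="$RTOOL_HOME/llvm-3.2.src/build/Release+Debug+Asserts"

# Inspect location (Section 6.1.3).
export insp_path="$RTOOL_HOME/inspect-0.3"

# Trace classifier (Section 6.1.4).
export CLASSPATH="$CLASSPATH:$RTOOL_HOME/jars/parser.jar"

# DaikonPass sources, containing hook.h and hook.c (this section).
export daikon="$RTOOL_HOME/llvm-3.2.src/lib/Transforms/Daikon"

# Make the controller scripts and the inspect executable visible.
export PATH="$PATH:$RTOOL_HOME/bin"
```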
6.1.7 Installing Qt
A user friendly interface for R Tool has been developed using the Qt 5 framework. If the user chooses
to use the GUI option of R Tool , then it is necessary to install the Qt 5 framework. Qt comes with
a standard installer, which can be downloaded from the following URL:
http://qt-project.org/downloads.
6.2 Using R Tool
R Tool can be used in three modes.
1. Use the existing Daikon infrastructure through a simplified interface.
2. Use the R Tool infrastructure to generate invariants at all function entry and exit points and
around the global variable access locations.
3. Generate invariants at various locations that can be varied dynamically using command-line control.
Before using R Tool , check that you have access to the three scripts: controller.py, runner.py,
and r tool.py. If these scripts are not accessible, update the path variable in
R Tool /exports.sh and check again.
6.2.1 R Tool as the Frontend for Daikon
R Tool can be used to run Daikon as-is, but in a more convenient way and with much shorter
commands. To compile the target program (the program for which you want to generate invariants)
and dump the program points, run the command
controller.py daikon dump
This compiles the target program, creates the executable, and also creates the program points and/or
variables of interest. You may want to edit these files, for example, to reduce the number
of program points and/or variables of interest (on which the invariants are generated).
Once this step is over, we should be able to run the instrumented target program using the
command
controller.py daikon rep #
Here, # should be the run count (iteration count), that is, the number of times we run the target
program before giving the traces to Daikon for invariant generation.
Since Daikon needs at least four different values of a variable for generating linear invariants,
you may need to edit the program source so that the program runs under different inputs
(different values of global variables during different runs). That is the main reason why we need
to supply this iteration count. This process will also dump all the thread-modular trace files inside
a trace folder called arijit, from which one can generate invariants using Daikon's backend by
running, from the arijit folder,
java daikon.Daikon *.dtrace
This invariant generation step using Daikon's backend is the same for all the other methods listed
here.
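Putting the steps of this mode together, a typical session might look as follows; the run count of 5 is an arbitrary choice for illustration.

```shell
controller.py daikon dump       # compile the target and dump program points
controller.py daikon rep 5      # run the instrumented program 5 times
cd arijit                       # trace folder produced by the runs
java daikon.Daikon *.dtrace     # generate invariants with Daikon's backend
```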
6.2.2 R Tool for Generating Invariants from Concurrent Programs
The main use of R Tool is to generate invariants not only for sequential C/C++ programs but
also for concurrent C/C++ programs. For this purpose, we invoke R Tool in a similar fashion
as discussed in the previous section. However, we also need to run the target program under the
supervision of the Inspect systematic concurrency testing tool.
To compile the program for Inspect and at the same time generate the program points, use the
following command.
controller.py rtool comp
Once the instrumented program is generated, we can use
controller.py rtool run $ # %
Here, $ corresponds to the iteration count (as described in the previous section), and # corresponds to the algorithm choice. Since Inspect can use three strategies (DPOR, PCB, or HaPSet)
for systematic exploration of the possible thread interleavings, we use this command-line
argument to choose the algorithm dynamically (0 -> DPOR, 1 -> PCB, 2 -> HaPSet).
Finally, % denotes the bound required by PCB (the context bound) or HaPSet (the size of the
HaPSet vectors). For DPOR, this number should be set to 0.
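For example, the following illustrative invocations select each of the three strategies; the iteration count of 10 and the bound of 2 are arbitrary values chosen for the example.

```shell
controller.py rtool comp         # instrument and compile for Inspect
controller.py rtool run 10 0 0   # DPOR (the bound must be 0)
controller.py rtool run 10 1 2   # PCB with context bound 2
controller.py rtool run 10 2 2   # HaPSet with vector size 2
```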
6.2.3 R Tool for Inferring Likely Atomic Code Regions
R Tool 's atomic section determination algorithm relies on oracle information provided by the
user. This requires additional manual work beyond the automated instrumentation framework
of R Tool : the user should include hook.h and provide the oracle information in the hook assert
macro. Compilation and linking of the program are taken care of automatically by the controller
scripts. When R Tool is used to automatically infer atomic sections in the program code, the
compilation follows the exact same procedure as above, except that the first argument is 'cs':
controller.py cs comp
To run the program,
controller.py cs run $ @ # %
The special characters $, #, and % have the same meaning as described in the previous
section, while @ should be replaced with the spacer count, which sets the number of global
instructions that should be included in the same atomic region, allowing the user to dynamically
vary the locations of instrumentation when determining the atomic regions.
All these utilities can be executed through a GUI built on the Qt framework. A sample of
this GUI can be seen in Figure 6.1.
Figure 6.1: Screenshot of the User Interface for R Tool
Chapter 7
Conclusions and Future Work
In this thesis, I have presented a new method for dynamically generating likely invariants from
multithreaded programs, as well as leveraging these invariants to diagnose concurrency errors and
infer likely atomic regions in the software code. Compared to existing methods, which often work
well on sequential programs but become ineffective on multithreaded programs, our new method
can generate a significantly larger number of high-quality invariants.
We have implemented the new methods in a software tool, called R Tool , which builds upon
the LLVM compiler frontend, the Inspect interleaving exploration tool, and a customized invariant
inference engine inside Daikon. Our experimental results show that the proposed methods are
effective in generating invariants from multithreaded C/C++ programs, and these invariants are
useful in applications such as diagnosing concurrency related bugs and identifying atomic code
regions.
This work can be extended in many ways. First, the state and transition invariants generated by
R Tool have strong potential in boosting the performance of software verification tools that are
based on predicate abstraction. For example, tools such as SLAM [2, 1] and BLAST [20, 21]
rely on proper predicates to construct effective models for sequential and concurrent software.
However, computing these predicates often requires many costly refinement iterations. Invariants
generated from R Tool can be provided to these tools as candidate predicates, which has the
potential of significantly accelerating the refinement process.
Another possible extension to R Tool is to increase the program path coverage of the execution
traces by leveraging automated test input generation techniques. Right now, the user needs to
diversify the test inputs manually. Having a fully automated method for generating quality test
input can increase the diversity of the traces, which in turn will increase the quality of the generated
invariants. Finally, right now the user still needs to provide the test oracle that R Tool can use to
classify execution traces into good traces and bad traces. It would be very useful if we could
automate this process, for example, by developing a software tool that can automatically insert test
oracles into the software code.
Bibliography
[1] T. Ball, A. Podelski, and S. K. Rajamani. Boolean and Cartesian abstraction for model
checking C programs. In International Conference on Tools and Algorithms for Construction
and Analysis of Systems, pages 268–283, 2001.
[2] T. Ball and S. K. Rajamani. The SLAM Toolkit. In International Conference on Computer
Aided Verification, pages 260–264, 2001.
[3] M. Boshernitsan, R. Doong, and A. Savoia. From Daikon to Agitator: lessons and challenges
in building a commercial tool for developer testing. In International Symposium on Software
Testing and Analysis, pages 169–180, New York, NY, USA, 2006. ACM.
[4] M. Bozga, P. Habermehl, R. Iosif, F. Konečný, and T. Vojnar. Automatic verification of
integer array programs. In International Conference on Computer Aided Verification, volume
5643 of Lecture Notes in Computer Science, pages 157–172. Springer, 2009.
[5] A. R. Bradley, Z. Manna, and H. B. Sipma. What’s decidable about arrays? In International
Conference on Verification, Model Checking, and Abstract Interpretation, pages 427–442,
Berlin, Heidelberg, 2006. Springer-Verlag.
[6] F. Chen, T. Serbanuta, and G. Rosu. jPredictor: a predictive runtime analysis tool for Java. In
International Conference on Software Engineering, pages 221–230, 2008.
[7] M. A. Colón, S. Sankaranarayanan, and H. B. Sipma. Linear invariant generation using
non-linear constraint solving. In International Conference on Computer Aided Verification,
pages 420–432. Springer-Verlag, 2003.
[8] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis
of programs by construction or approximation of fixpoints. In ACM SIGACT-SIGPLAN
Symposium on Principles of Programming Languages, pages 238–252, New York, NY, USA,
1977. ACM.
[9] P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among variables of a
program. In ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages,
POPL ’78, pages 84–96, New York, NY, USA, 1978. ACM.
[10] C. Csallner, N. Tillmann, and Y. Smaragdakis. DySy: dynamic symbolic execution for
invariant inference. In International Conference on Software Engineering, pages 281–290,
New York, NY, USA, 2008. ACM.
[11] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin. Dynamically discovering likely
program invariants to support program evolution. In International Conference on Software
Engineering, pages 213–224, New York, NY, USA, 1999. ACM.
[12] M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao.
The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program.,
69(1–3):35–45, Dec. 2007.
[13] C. Flanagan and P. Godefroid. Dynamic partial-order reduction for model checking software.
In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages
110–121, New York, NY, USA, 2005. ACM.
[14] C. Flanagan and S. Qadeer. Thread-modular model checking. In International SPIN
Workshop on Model Checking Software, pages 213–224, 2003.
[15] C. A. Furia and B. Meyer. Inferring loop invariants using postconditions. In Fields of Logic
and Computation, pages 277–300. Springer-Verlag, Berlin, Heidelberg, 2010.
[16] P. Godefroid. Model checking for programming languages using VeriSoft. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 174–186, New York,
NY, USA, 1997. ACM.
[17] S. Hangal, N. Chandra, S. Narayanan, and S. Chakravorty. Iodine: a tool to automatically
infer dynamic invariants for hardware designs. In Design Automation Conference, pages
775–778, 2005.
[18] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection.
In International Conference on Software Engineering, pages 291–301, New York, NY, USA,
2002. ACM.
[19] T. A. Henzinger, T. Hottelier, L. Kovács, and A. Voronkov. Invariant and type inference
for matrices. In International Conference on Verification, Model Checking, and Abstract
Interpretation, pages 163–179, Berlin, Heidelberg, 2010. Springer-Verlag.
[20] T. A. Henzinger, R. Jhala, and R. Majumdar. The BLAST software verification system. In
International SPIN Workshop on Model Checking Software, pages 25–26, 2005.
[21] T. A. Henzinger, R. Jhala, R. Majumdar, and S. Qadeer. Thread-modular abstraction
refinement. In International Conference on Computer Aided Verification, pages 262–274,
2003.
[22] M. Li. A practical loop invariant generation approach based on random testing, constraint
solving and verification. In International Conference on Formal Engineering Methods:
Formal Methods and Software Engineering, pages 447–461, Berlin, Heidelberg, 2012.
Springer-Verlag.
[23] S. Lu, J. Tucek, F. Qin, and Y. Zhou. AVIO: detecting atomicity violations via access interleaving invariants. In International Conference on Architectural Support for Programming
Languages and Operating Systems, pages 37–48, New York, NY, USA, 2006. ACM.
[24] A. Mazurkiewicz. Trace theory. In Advances in Petri Nets 1986, Part II on Petri Nets:
Applications and Relationships to Other Models of Concurrency, pages 279–324, New York,
NY, USA, 1987. Springer-Verlag New York, Inc.
[25] M. Musuvathi, S. Qadeer, T. Ball, G. Basler, P. A. Nainar, and I. Neamtiu. Finding and
reproducing Heisenbugs in concurrent programs. In USENIX Symposium on Operating
Systems Design and Implementation, pages 267–280, 2008.
[26] T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover
polynomial and array invariants. In International Conference on Software Engineering, pages
683–693, Piscataway, NJ, USA, 2012. IEEE Press.
[27] S. Park, S. Lu, and Y. Zhou. CTrigger: exposing atomicity violation bugs from their hiding
places. In International Conference on Architectural Support for Programming Languages
and Operating Systems, pages 25–36, New York, NY, USA, 2009. ACM.
[28] A. Podelski and A. Rybalchenko. Transition invariants. In IEEE Symposium on Logic in
Computer Science, pages 32–41, 2004.
[29] S. Qadeer and D. Wu. KISS: keep it simple and sequential. In ACM SIGPLAN Conference
on Programming Language Design and Implementation, pages 14–24, 2004.
[30] E. Rodrı́guez-Carbonell and D. Kapur. Generating all polynomial invariants in simple loops.
J. Symb. Comput., 42(4):443–476, Apr. 2007.
[31] S. Sankaranarayanan, H. B. Sipma, and Z. Manna. Non-linear loop invariant generation using
Gröbner bases. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, POPL ’04, pages 318–329, New York, NY, USA, 2004. ACM.
[32] R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, and A. V. Nori. A data driven
approach for algebraic loop invariants. In ESOP, pages 574–592, 2013.
[33] F. Sorrentino, A. Farzan, and P. Madhusudan. PENELOPE: weaving threads to expose
atomicity violations. In ACM SIGSOFT Symposium on Foundations of Software Engineering,
pages 37–46, 2010.
[34] 2014 Software Verification Competition. URL: http://sv-comp.sosy-lab.org/2014/.
[35] C. Wang, R. Limaye, M. Ganai, and A. Gupta. Trace-based symbolic analysis for atomicity
violations. In International Conference on Tools and Algorithms for Construction and
Analysis of Systems, pages 328–342, 2010.
[36] C. Wang, M. Said, and A. Gupta. Coverage guided systematic concurrency testing. In
International Conference on Software Engineering, pages 221–230, 2011.
[37] L. Wang and S. D. Stoller. Runtime analysis of atomicity for multithreaded programs. IEEE
Trans. Software Eng., 32(2):93–110, 2006.
[38] M. Xu, R. Bodı́k, and M. D. Hill. A serializability violation detector for shared-memory
server programs. In ACM SIGPLAN Conference on Programming Language Design and
Implementation, pages 1–14, 2005.