KOÇ UNIVERSITY
GRADUATE SCHOOL OF SCIENCES & ENGINEERING
COMPUTER ENGINEERING
PhD THESIS DEFENSE BY HASSAN SALEHE MATAR
Title: Runtime Race Detection for Shared Memory Programming Models
Speaker: Hassan Salehe Matar
Time: June 22, 2018, 15:00
Place: ENG 208
Koç University
Rumeli Feneri Yolu
Sariyer, Istanbul
Thesis Committee Members:
Asst. Prof. Dr. Didem Unat (Advisor, Koç University)
Asst. Prof. Dr. Ayşe Yılmazer (Istanbul Technical University)
Prof. Dr. Attila Gürsoy (Koç University)
Assoc. Prof. Dr. Öznur Özkasap (Koç University)
Asst. Prof. Dr. Tankut Barış Aktemur (Özyeğin University)
Abstract:
This dissertation proposes methods for detecting runtime data races in three shared memory programming models: (i) OpenMP tasks, (ii) dataflow programming models such as Atomic DataFlow (ADF), and (iii) the POSIX Threads in embedded systems. The need for methods of detecting races in these models is fueled by the fact that they are commonly used in the
HPC, parallel programming, and concurrent programming communities but there is a lack of tools to detect races in these programming models.
A determinacy race is a condition which occurs when concurrently executing entities (e.g; tasks) access the same memory location without specified ordering between them and at least one access is a write to that memory location. As a result, a program with determinacy races may produce different final output results at different runs on the same input. One potential problem when writing parallel programs with OpenMP is to introduce determinacy races where for a given input, the program may unexpectedly produce different final outputs at different runs. Such startling behavior can result from the incorrect ordering of OpenMP tasks. We present a method to detect determinacy races in OpenMP tasks at runtime. Based on OpenMP program semantics, our proposed solution models an OpenMP program as a collection of tasks with inferred dependencies among them where a task is implicitly created with a parallel region construct or explicitly created with a task construct. We define happens-before relation among tasks based on such dependencies for determining an execution order when detecting determinacy races. Based on this formalization, we developed a tool, TaskSanitizer, which detects and reports concurrent memory accesses whose tasks do not have common dependencies. Finally, TaskSanitizer works at runtime, has been able to find bugs in micro-benchmarks and it is reasonably efficient to be utilized in a working environment. Archer is an efficient tool for detecting data races in OpenMP programs between concurrent threads. In contrast, we detect determinacy races where ordering between concurrent components is missing. Archer may fail to detect such cases and it also misses concurrent tasks executed by the same thread. By building the happen-before relations on tasks rather than threads, we can catch these situations.
We decided to call determinacy races in ADF as output nondeterminism because all tasks are atomic and therefore different final program outputs can be observed at different runs of the same buggy program and input. Output nondeterminism is possible if programmer does not specify necessary dependency between tasks which access the same memory locations. This is possible as implementing highly concurrent programs can be challenging because programmers can easily introduce unintended nondeterminism, which has the potential to affect the program output. Such unintended nondeterminism is output nondeterminism which is a special determinacy race where a program produces different final outputs at different runs on the same input, without such intention of the programmer. We propose and implement a technique for detecting output nondeterminism in applications developed on shared memory systems with dataflow execution model. Such nondeterminism bugs may be caused by missing or incorrect ordering of task dependencies that are used for ensuring certain ordering of tasks. The proposed method is based on the formulation of happens-before relation on tasks in a dataflow dependency graph. Its implementation is composed of two main phases; log recording and detection. For recording the necessary information from the execution, the tool instruments the dataflow framework and the applications, on top of the LLVM compiler infrastructure. Later it processes the collected log and reports on the found output nondeterminism in the execution. The tool can integrate well with the development cycle to provide the programmer with a testing framework against possible nondeterminism bugs. To demonstrate its effectiveness, we study a set of benchmark applications written in Atomic DataFlow programming model and report on real nondeterminism bugs in them.
Lastly, we propose EmbedSanitizer, a tool for detecting concurrency data races in 32-bit ARM-based, multithreaded, POSIX Threads C/C++ applications. We motivate the idea of detecting data races in embedded systems software natively; without virtualization or emulation or use of alternative architecture. Detecting data races in applications on a target hardware provides more precise results and increased throughput and hence enhanced productivity of the developer. EmbedSanitizer extends ThreadSanitizer, a race detection tool for 64-bit applications, to do race detection for 32-bit ARM applications. We evaluate EmbedSanitizer using PARSEC benchmarks on an ARMv7 CPU with 4 logical cores and 933MB of RAM. Our race detection results precisely match with results when the same benchmarks run on 64-bit machine using ThreadSanitizer. Moreover, the performance overhead of EmbedSanitizer is relatively low as compared to running race detection on an emulator, which is a common platform for embedded software development.
This dissertation proposes a method for detecting determinacy races and it demonstrates its effectiveness using the OpenMP tasks. Moreover, it presents a technique for detecting determinacy races, dubbed output nondeterminism, in dataflow programming models such as the ADF. Lastly, it proposes a tool for detecting data races in 32-bit POSIX Threads applications for embedded systems. It is with anticipation that this dissertation will benefit both the industry and research communities. It also opens doors for further research on race detection for better solutions.