«Nastaran Shaﬁei nastaran Supervisor: Franck van Breugel Supervisory Committee Members: Eric Ruppert Jonathan S. Ostroﬀ July 17, 2013 ...»
Model Checking Distributed Java
PhD Dissertation Proposal
Supervisor: Franck van Breugel
Supervisory Committee Members:
Jonathan S. Ostroﬀ
July 17, 2013
2 Introduction 2
2.1 Distributed Applications..................... 2
2.2 The Java Platform........................ 4
2.3 Model Checking.......................... 4
2.4 Java PathFinder (JPF)...................... 5 3 Related Work 6
3.1 Cache-based Approach...................... 8
3.2 SUT Level Centralization.................... 10 4 Thesis Statement 15 5 Proposed Approach 16
5.1 Centralizing the Distributed Application............ 17 5.1.1 Separating Types..................... 17 5.1.2 Representing Processes.................. 22 5.1.3 Implementing Multiprocess VM............. 22
5.2 Modeling Inter Process Communication (IPC)......... 25
5.3 Capturing Communication with External Processes...... 25 6 Progress to Date 27 7 Timeline 28 1 Abstract The subject of the proposed dissertation is veriﬁcation of distributed applications. In the context of this proposal, we refer to a distributed application as a collection of communicating processes, regardless of where these processes physically reside or what communication channels they use. The veriﬁcation technique that we use is model checking [1, 2], which is an automated technique that examines all possible system states in a systematic way to check if desired properties are satisﬁed. In our research, we focus on distributed applications written in the Java programming language, and extend the Java PathFinder (JPF)1  model checker to verify these applications.
2 Introduction This section describes the goal and context of our research, namely
• the relevance of distributed applications and their speciﬁc veriﬁcation requirements,
• the relevance of Java as a platform for distributed applications,
• the suitability of model checking as a veriﬁcation technique for distributed applications, and
• the suitability of Java Pathﬁnder (JPF) as our model checker choice.
2.1 Distributed Applications Distributed computing is becoming more and more important these days as most systems in use are distributed. For example, mobile applications, the popularity of which keeps rising, are mostly distributed . Some examples of widely used, Java-based, mobile applications are Google maps mobile and Gmail mobile.
There are several key factors driving the development of distributed applications [5, 6]. Some services intrinsically require the use of a communication network to connect diﬀerent components. Massively multiplayer online games are among such services which allow a large numbers of people to play simultaneously, e.g., RuneScape2 which is written in Java.
http://babelfish.arc.nasa.gov/trac/jpf http://www.runescape.com Distributed computing can also allow for developing fault tolerant applications where a failure in a process does not stop other processes from running, and the application can still complete its task. The Netﬂix API is an example of a system that uses distributed components to provide fault tolerance3.
Moreover, distributed systems provide the use of the computational power of multiple machines to process tasks faster and handle larger problems. For example, Memcached4 is a high-performance distributed memory caching system designed to speed up dynamic web applications. The Netﬂix EVCache5 open-source project employs Memcached. Some other users of the Memcached caching system are Facebook, Twitter, Wikipedia, YouTube.
Distributed computing is also used in intensive scientiﬁc simulations to gain speed, e.g., CartaBlanca 6 is a physical system simulation package written in Java which uses MPJ Express7 (a Java message passing library) to parallelize its computation.
Finally, using distributed applications allows for sharing resources in a networked system such as disk, printers, ﬁles, databases. This can be seen in systems based on cloud computing  which are distributed systems following the client-server model wherein one or more clients request information from a server. Cloud computing is one of the major focuses of leading companies in the computer industry, such as Apple, Amazon, Google. The majority of Google services follow the cloud computing model. Some of those services are based on Java, such as Google Docs, Google Calendar, and Gmail.
In general, distributed applications are hard to develop. These applications are inherently concurrent, and their behavior is nondeterministic which makes it hard for programmers to consider all possible behaviors of the application. Other than concurrency errors, developers of such applications need to deal with other issues tied with a distributed setting, such as the possibility of failures at diﬀerent levels, for example, within the process initiating the communication, between the time that data is transmitted between processes, within the process receiving data. Another issue in programming distributed applications is gaining a consistent view of data across the system.
In general, testing distributed systems is hard. Diﬀerent components of the system may have diﬀerent software and hardware requirements, and http://techblog.netflix.com/2012/02/fault-tolerance-in-high-volume.
html http://memcached.org https://github.com/Netflix/EVCache http://www.lanl.gov/projects/CartaBlanca http://mpj-express.org therefore setting up an environment to test such applications can be diﬃcult. Furthermore, due to the possibility of failures at diﬀerent levels, testing such applications against potential defects requires injection or simulation of failures at several diﬀerent layers.
2.2 The Java Platform Java is one of the most popular programming languages 8. It is considered a language of choice for many developers of distributed applications, e.g., the majority of top most watched Java projects on GitHub, which is a popular web-based hosting service for software systems, are distributed applications.
Java has several features which makes it a powerful environment for developing such applications [5, 6]. Java is platform independent, that is, a single version of Java code can run on any platform with a Java virtual machine (JVM). It supports multithreaded programming and oﬀers an exception handling mechanism which is useful for developing fault tolerant applications.
It also provides multilevel support for network communication including basic networking support such as sockets used to establish connection between processes, and data communication protocols such as TCP and UDP. At a higher level, it provides networking capabilities such as distributed objects, communication with databases.
Finally, Java supports two aspects of security for distributed applications.
Since in a distributed Java application, running components (such as Java applets) can migrate across the network, Java provides ways to secure the runtime environment of recipient processes, for example, by restricting access to the local ﬁle system. It also allows for adding user authentication, and encryption of data sent across the network to establish secure network connections.
2.3 Model Checking Model checking has several advantages that make it superior to other veriﬁcation methods such as testing , runtime veriﬁcation [9, 10], theorem proving [11, 12], type checking [13, 14], and abstract interpretation [15, 16, 17].
Model checking is naturally preferred to testing and runtime veriﬁcation when concurrency comes into play, since unlike model checking these techniques do not have any control over the scheduling of the concurrently running components, and therefore are not able to check all possible execution http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html paths of the application. Model checkers can also easily provide counterexamples which make the process of ﬁxing errors much easier.
Model checking is mostly automated, and it is generally easier to apply compared to techniques such as theorem proving which requires a high level of expertise and user interaction. Moreover, model checkers allow for the speciﬁcation of properties related to the functionality of the application, hence allowing for verifying a wide range of requirements, i.e., unlike the type checking technique and static analyzers based on abstract interpretation which are implemented speciﬁc to certain properties. A detailed description of veriﬁcation techniques mentioned above and their comparison with model checking can be found in my qualifying exam .
The major challenge in model checking is the state space explosion problem which occurs when the state space of the system under test (SUT) becomes too large and available memory resources are not enough to store it.
A commonly used technique to address this problem is partial order reduction which reduces the number of executions that need to be checked by considering concurrently executed instructions that do not aﬀect each other.
Based on the way that states are represented, model checking algorithms can be classiﬁed into two main categories: explicit-state model checking which directly deals with states versus symbolic model checking which deals with sets of states . In this work we use explicit-state model checking.
This technique uses graph algorithms to create and explore the state space.
Vertices of the graph represent states and the edges represent instructions which, when executed, take the system from one state to another. While exploring the graph, the states are checked against the desired properties.
The algorithm keeps a record of visited states so that it can backtrack to states which encapsulate non-deterministic choices to explore new paths.
Figure 1: Diﬀerent centralization techniques applied at diﬀerent levels, (a) the SUT level, (b) the OS level, (c) the model checker level.
such as partial order reduction, garbage collection, and state compression.
JPF also oﬀers a highly conﬁgurable structure, and introduces numerous extension mechanisms which make it a suitable engine for many existing tools, e.g., tools that automatically create test cases by performing symbolic execution of Java applications [23, 24].
To capture nondeterministic choices, JPF introduces choice generators which make JPF to branch the executions at certain points. JPF allows for extending choice generators to deﬁne choices, such as scheduling sequences, timeouts and random values. Another important extension mechanisms of JPF are listeners, which are runtime conﬁgured plugins that allow for monitoring the SUT executions and interacting with JPF.
JPF also includes the model Java interface (MJI) to delegate code execution from the JPF level to the underlying host JVM. This mechanism is mainly used to handle native calls, i.e., calls to platform and OS speciﬁc library functions which are written in languages other than Java. Moreover, the JPF attribute system allows for associating veriﬁcation speciﬁc information to values of ﬁelds, local variables and stackframe operands which is manipulated and passed along the execution paths. This feature is suitable to implement data ﬂow related properties. Finally, JPF allows for changing the semantics of the execution of bytecode instructions. For more details on JPF, the reader is referred again to .
3 Related Work Conventional model checking techniques implemented by various Java model checkers [3, 19, 20, 21] are only applicable to single-process applications, and they cannot handle distributed systems. In general, applying the model checking technique on distributed Java applications is not trivial.
The techniques that have been proposed to model check distributed Java applications can be divided into two main categories: (1) cache-based, (2) centralization. In the cache-based approach, the model checker veriﬁes only one process and its communication with the rest of the processes. In the centralization approach, the distributed application is captured within a single process, and the model checker is able to verify all the communicating processes.
The cache-based approach runs only one process, as a SUT, within the model checker, and rest of the processes, henceforth called peers, run outside of the model checker either within their native environment or a virtualization thereof [25, 26, 27].
The main challenge of this approach is to keep the SUT in synchronization with its peers since the model checker does not have any control over the execution of peers, and their execution is not subject to backtracking.
After the model checker backtracks, the SUT may resend data which might interrupt the correct behavior of the peers. Moreover, after backtracking, peers do not resend previously sent data to the SUT. Existing cache-based techniques [25, 26, 27] address this problem by introducing a cache layer between the SUT and its peers. Section 3.1 explains these techniques in more details.
An alternative approach for the cache-based approach is centralization [28, 29, 30, 31]. The existing centralization techniques can be applied at either the SUT level or the operating system (OS) level. In centralization at the SUT level (see Figure 1 (a)), the distributed application is transformed into a single-process application which is then fed to a model checker as a SUT [28, 29]. In this technique, distributed processes are mapped onto communicating threads within a single process application. This usually includes a model of the inter-process communication (IPC) mechanism that is used for communication. Centralization at the SUT level requires dealing with several issues. How are processes represented? How is exclusive access to static attributes provided for diﬀerent processes? How are static synchronized methods handled? How is the shutdown semantics speciﬁed? Since the technique proposed in this research is also subjected to similar issues, we provide a detailed discussion of techniques applying centralization at the SUT level in Section 3.2.