Ross: a tool for precise points-to analysis based verification of Java programs

1. Unpacking the tool

The name of our tool is "Ross". The working of this tool is described in our OOPSLA 2018 paper titled "(R)efinenent in (O)bject-(S)ensitive points-to analysis via (S)licing". It uses points-to analysis to verify whether downcasts in a Java program are safe, and to identify allocation sites that create only immutable objects.

Please begin by downloading this zip file. When this zip file is unzipped, it results in a folder called "RossTool", which in turns contains two subfolders, namely petablox_v2 and petablox-bench. Throughout this document we refer to the full path of the "RossTool" folder on your computer as $ROOT.

To use our tool (i.e., Ross), we recommend a machine with at least 50 GB of free disk space, 32 GB of RAM, a Core i7 or comparable processor, and Java 1.8.

2. Running the tool on a demo program

2.1. To check if the tool is functioning, run it on the included demo program:

$ cd $ROOT/petablox_v2
$ ./runbm.sh ci-downcast demo

In case there is any problem with the step above, please recompile the tool as follows:

$ cd $ROOT/petablox_v2
$ ant clean
$ ant

2.2. Upon successful execution of the tool you will get output similar to the following:

Buildfile: $ROOT/petablox_v2/build.xml

run:
     [java] Redirecting stdout to file: $ROOT/petablox_v2/examples/demo/petablox_output/log.txt
     [java] Redirecting stderr to file: $ROOT/ross/petablox_v2/examples/demo/petablox_output/log.txt

BUILD SUCCESSFUL
Total time: 20 seconds

2.3. The output files from the run above can be found in the folder $ROOT/petablox_v2/examples/demo/result-ci-downcast/ folder.

2.4. Detailed instructions for interpreting the output files are given in Section 4 below. log.txt is the main output file.

2.5. Instructions for modifying the demo program (if desired) are given Section 5.

3. Names of the analyses and benchmarks

3.1. The names of analyses that are packaged in our zip are: ci-downcast, milanova1-downcast, milanova2-downcast, ross-downcast-MNE, zhang-downcast, ci-immutableobject-app, milanova1-immutableobject-MNEApp, milanova2-immutableobject-MNEApp, ross-immutableobject-MNEApp..

Note, ross-downcast-MNE and ross-immutableobject-MNEApp are our primary contributions, while the other analyses are existing analyses used for comparison purposes. We discuss the meanings of all the analyses named above in Section 4 below.

3.2. Wherever a benchmark name is expected in the steps below, one can use either the name demo, or one of the following names: antlr, avrora, batik, bloat, chart, luindex, lusearch, pmd, sunflow, xalan, toba-s, javasrc-p.

Note, the analysis zhang-downcast will not work on the demo program. It will work only on the other benchmarks.

3.3. To run any of the analyses named in Section 3.1 above on any of the benchmarks or on the demo program, follow these steps:

$ cd $ROOT/petablox_v2
$ ./runbm.sh <analysis-name> <benchmark-name>

For example if you want to run our approach (i.e., "Ross") in the context of the downcast safety client, on the benchmark antlr, follow these steps:

$ cd $ROOT/petablox_v2
$ ./runbm.sh ross-downcast-MNE antlr

4. The output files

The output files from the various analyses would be available in the folders as named below:

Benchmark name	Output folder
demo	$ROOT/petablox_v2/examples/demo/result-<analysis-name>/

antlr, avrora, batik, bloat, chart, luindex, lusearch, luindex, pmd, sunflow, xalan	$ROOT/petablox-bench/dacapo/benchmarks/<benchmark-name>/result-<analysis-name>/

toba-s, javasrc-p	$ROOT/petablox-bench/ashesJSuite/benchmarks/<benchmark-name>/result-<analysis-name>/

For example, to access the results of the analysis ross-downcast-MNE for the benchmark antlr, go to the directory $ROOT/petablox-bench/dacapo/benchmarks/antlr/result-ross-downcast-MNE/.

Note, in the zip provided we have included sample output files in most of the output folders. Some of the output files that we have included would match what the tool actually produces currently, while others are previously generated files that could differ from the files that the tool produces currently.

4.1 Results of "-downcast" analyses (except zhang-downcast)

4.1.1. These analyses correspond to the experiments discussed in Section 5.1 (and Table 1) in our OOPSLA '18 paper.

4.1.2. The main output file in any output folder is log.txt. In this file, a snippet similar to the following will occur at one or two locations:

Relation reachableCast: 1051 nodes, 187.0 elements (T0,V0)
Relation unsafeDowncast: 901 nodes, 148.0 elements (T0,V0)
Relation safeDowncast: 332 nodes, 39.0 elements (T0,V0)
.
.
Total time: 00:01:58:428 hh:mm:ss:ms

The snippet shown above corresponds to the ci-downcast analysis, on benchmark antlr. The 39 "elements" under the "safeDowncast" category are the reachable downcasts that have been reported as safe by the analysis (see Column (4) in the "antlr" row in Table 1 in the paper). The 148 "elements" under the "unsafeDowncast" category are the remaining reachable downcasts (i.e., these are to be regarded as potentially unsafe). "Total time" is the total (end-to-end) running time of the analysis. The numbers about "nodes" can be ignored ("elements" are source-level features, while "nodes" are corresponding intermediate-representation level features).

4.1.3. The analysis ross-downcast-MNE is our approach. The log.txt files from from this analysis contain two occurrences of the snippet mentioned above. The first occurrence is from the first iteration of the approach (using k=1 at all allocation sites), while the second occurrence is from the second iteration of the approach (using k=3 at refined sites, and hence more precise).

ci-downcast is a context-insensitive analysis, milanova1-downcast is Milanova et al.'s object sensitivity analysis with k=1 at all allocation sites, while milanova2-downcast is Milanova et al.'s object sensivity analysis with k=2 at all allocation sites.

4.1.4. Each analysis also produces the files safeDowncast.txt and unsafeDowncast.txt in the same output folder as log.txt. These files contain the locations of the individual safe and unsafe downcasts, respectively. Each line is a downcast location, and contains the name of the Soot intermediate-representation variable that is being downcasted, the class and method where the downcast occurs, and the type to which the variable is downcasted.

4.2 Results of "*-immutableobject" analysis

4.2.1. These analyses correspond to the experiments discussed in Section 5.2 (and Table 2) of our OOPSLA '18 paper.

4.2.2. The information about immutable sites and total time taken is available in the file log.txt file. Following is a snippet from log.txt corresponding to the ci-immutableobject-app analysis on antlr.


Relation VO: 184 nodes, 340.0 elements (H0)
..    
Relation NonVO: 185 nodes, 1122.0 elements (H0)
.
.
Total time: 00:13:19:102 hh:mm:ss:ms

In the output above, the number of allocation sites that are reported as immutable by the analysis is 340. This corresponds to Column (3) in the "antlr" row in Table 2 in the paper. 1122 allocation sites are reported as potentially mutable.

As with the downcast analysis, ross-immutableobject-MNEApp produces output twice in log.txt, corresponding to its two iterations.

The files VO.txt and NonVO.txt are emitted in the same directory as log.txt. Each line in these files identifies an allocation site (VO.txt contains the immutable sites while NonVO.txt contains the potentially mutable sites.)

4.3 Results of zhang-downcast analysis

Zhang's analysis internally runs a number of iterations, reporting reachable and unsafe downcasts after each iteration. The results get more and more precise with each iteration. To know the total number of iterations that were required for any benchmark (if the analysis terminates), check the iteration number of the last iteration in log.txt. Our script runbm.sh is configured to stop any run of this analysis after a 6-hour timeout.

For instance, following is a snippet from antlr's log.txt file. In this case, Iteration 15 was the last iteration (indexing of iterations starts from 0):

Iteration: 15 unresolved queries size: 1
.
.
.
Relation reachableCast: 963 nodes, 186.0 elements (T0,V0)
.
.
Relation unsafeDowncast: 362 nodes, 55.0 elements (T0,V0)

Only the output from the last iteration matters (the other iterations appear earlier in log.txt, and can be ignored). Note, this analysis does not directly report the number of safe downcasts. However, this can be deduced by subtracting the unsafeDowncast elements (which is 55 in the snippet above) from the reachableCast elements (which is 186 in the snippet above).

5. Modifying the demo application

If you would like to make changes in the demo program, after making the changes you would need to recompile the source code. Follow the instructions here:

5.1. "cd" to the directory $ROOT/petablox_v2/examples/demo/

5.2. The source code of the demo application program is stored inside "src" sub-folder. Alter the source code as per your requirement (if you add any new methods to the code, make sure they are called directly or transitively from "main" in class M).

5.3. To compile the source code, execute the following steps while being in the directory $ROOT/petablox_v2/examples/demo/:

$ ant clean
$ ant

Note, ant makes use of the file build.xml. Note also that the file petablox.properties contains important parameters used by all the analyses.

6. Obtaining all the numbers in the tables in the OOPSLA '18 paper

6.1. The information in Table 1 in the paper can be obtained from the corresponding log.txt files as outlined below:

Column No.	Analysis to Run	How to obtain from log.txt
2 (total casts)	ci-downcast	Number of elements in line "Relation reachableCast"

3 (#alloc sites)	ci-downcast	From line "dom H size"

4 (CI R)	ci-downcast	Number of elements in line "Relation safeDowncast"

5 (CI time)	ci-downcast	From line "Total time"

7 (base R)	milanova<k>-downcast^*	Number of elements in line "Relation safeDowncast" - number of elements in line "Relation reachableCast" + Value in Column 2 of Table 1

8 (base time)	milanova<k>-downcast^*	From line "Total time"

9 (our R)	ross-downcast-MNE	Number of elements in 2nd occurrence of line "Relation safeDowncast" - number of elements in 2nd occurrence of line "Relation reachableCast" + Value in Column 2 of Table 1

10 (our sites)	ross-downcast-MNE	From line "Relation RFN"

11 (our time)	ross-downcast-MNE	From line "Total time"

^*Value of k is as listed in Column 6 of Table 1.

Note, results from Milanova's approach with k=1 are the same as results from the first iteration of our approach (i.e., Ross).

6.2. The information in Table 2 in the paper can be obtained from the corresponding log.txt files as outlined below:

Column No.	Analysis to Run	How to obtain from log.txt
2 (# alloc sites)	ci-immutableobject-app	Num elements in line "Relation VO" + Num elements in line "Relation NonVO"

3 (CI R)	ci-immutableobject-app	Number of elements in line "Relation VO"

4 (CI time )	ci-immutableobject-app	From line "Total time"

6 (base R)	milanova<k>-immutableobject-MNEApp^*	Value in Column 2 of Table 2 - num elements in line "Relation NonVO"

7 (base time)	milanova<k>-immutableobject-MNEApp^*	From line "Total time"

8 (our R)	ross-immutableobject-MNEApp	Value in Column 2 of Table 2 - num elements in 2nd occurrence of line "Relation NonVO"

9 (our sites)	ross-immutableobject-MNEApp	From line "No of sites refined"

10 (our time)	ross-immutableobject-MNEApp	From line "Total time"

^*Value of k is as listed in Column 5 in Table 2.