blob: 165ae437face086d2f1558c4c22f4360c3231116 [file] [log] [blame]
<h2 id="Avoiding_exponential_blowup_when_processing_DAGs">Avoiding exponential blowup when processing DAGs</h2>
<!-- John Field of Google is interested in this. -->
<p>
Google's <a href="https://bazel.build/">Bazel</a> open-source project is a
publicly-released version of their build system, Blaze. Blaze builds every
line of source code that is written by any Google programmer &mdash; all of
that source code appears in a single repository! Therefore, Bazel/Blaze needs to
be fast. Bazel represents all of the source code and its dependencies as a
large DAG (a
<a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed
acyclic graph</a>). It needs to manipulate these DAGs efficiently.
</p>
<p>
One of the biggest problems that the Bazel developers face is exponential
blowup of DAG sizes and therefore of run time. Periodically, one of the
Bazel developers makes such a mistake, and Bazel becomes unusable until
they can diagnose and fix the problem.
</p>
<p>
Here are two different ways to view the problem.
</p>
<ol>
<li>
In a DAG, multiple nodes may have the same child. Traversing the DAG
naively would visit the child multiple times &mdash; in the worst case,
exponentially many times. It is necessary to avoid doing so.
</li>
<li>
Bazel contains a function that takes a DAG as input and generates a DAG
as output (the output is an object graph). The Bazel developers want to
ensure that the size of the output DAG is O(|input DAG|). The input DAG is
processed bottom-up (it ensures that each input node is visited once) and
Bazel stores the results of intermediate computations that construct the
output DAG with nodes of the input DAG. The key thing the Bazel developers
want to avoid is copying intermediate subgraphs that have unbounded size.
</li>
</ol>
<p>
More concretely, there is only one Java type for all DAGs, and there is a
method <code>flatten()</code>. It's a mistake to call
<code>flatten()</code> on certain DAGs, because doing so may cause
exponential blowup.
</p>
<!--
</p>
<p>
A pluggable type system can subdivide the Java type, using this type hierarchy:
</p>
<p>
@PossbilyPropagatable
|
@NotPropagatable
</p>
<p>
A call to flatten() is permitted only if the receiver is known to be
@NotPropagatable.
</p>
<p>
This should be very easy to code up and evaluate, once we have more precise
definitions.
-->
<p>
The goal of this project would be to better understand the problem with
Bazel, to formalize it, and to create a program analysis that solves
the problem. You would evaluate your work by running it on the Bazel
codebase to discover latent problems, or by providing it to the Bazel
developers to run each time they propose a code change. The Bazel
team is interested in collaborating by evaluating a tool.
</p>
<h2 id="case-study-index-out-of-bounds">Array indexing (index-out-of-bounds errors)</h2>
<p>
An index-out-of-bounds error occurs when a programmer provides an illegal
index to an array or list, as in <code>a[i]</code> or <code>a.get(i)</code>
where <code>i</code> is less than 0 or greater than the length of
<code>a</code>. In languages like C, this is disastrous: buffers
overflows lead to about 1/6 of all security vulnerabilities. In languages
like Java, the result is &ldquo;merely&rdquo; that the program crashes. In
both languages, it is desirable to prevent programmers from making this
error and to prevent users from suffering the bad consequences.
</p>
<p>
This project will be a substantial case study with
the <a href="https://checkerframework.org/manual/#index-checker">Index
Checker</a>. The first goal is to identify its merits and limitations.
Does it scale up to big,
interesting programs? Are there common, important code patterns that it
fails to handle? Does it produce too many false positive warnings? Does
it place too heavy a burden on the user, either in terms of annotations or
in terms of complexity of the error messages? Worst of all, are there
unknown unsoundnesses in the tool?
The second goal is to improve its precision enough to make it usable by
real-world programmers.
</p>
<p>
A <a href="#index-checker-mutable-length">related project</a> is to extend
the Index Checker to handle
mutable collections such as
<code>List</code>s, where the <code>remove()</code> method makes sound,
precise analysis very tricky.
</p>
<h3 id="case-study-nullness-bazel">Bazel tool</h3>
<!-- John Field of Google is interested in this. -->
<p>
This project is related to the
<a href="https://bazel.build/">Bazel</a> build system, and was
proposed by its development manager.
</p>
<p>
The Bazel codebase contains 1586 occurrences of the <code>@Nullable</code>
annotation. This annotation indicates that a variable may hold a null
value. This is valuable documentation and helps programmers avoid null
pointer exceptions that would crash Bazel. However, these annotations are
not checked by any tool. Instead, programmers have to do their best to
obey the <code>@Nullable</code> specifications in the source code. This is
a lost opportunity, since documentation is most useful when it is
automatically processed and verified. (For several years, Google tried
using <a href="http://findbugs.sourceforge.net/">FindBugs</a>, but they
eventually abandoned it: its analysis is too weak, suffering too many
false positives and false negatives.)
</p>
<p>
Despite the programmers' best efforts, null pointer exceptions do still
creep into the code, impacting users. The Bazel developers would like to
prevent these. They want a guarantee, at compile time, that no null
pointer exceptions will occur at run time.
</p>
<p>
Such a tool already exists: the
<a href="https://checkerframework.org/manual/#nullness-checker">Nullness
Checker</a> of the <a href="https://checkerframework.org/">Checker
Framework</a>. It runs as a compiler plug-in, and it issues a warning at
every possible null pointer dereference. If it issues no warnings, the
code is guaranteed not to throw a <code>NullPointerException</code> at run time.
</p>
<p>
The goal of this project is to do a large-scale case study of the Nullness
Checker on Bazel. The main goal is to understand how the Nullness Checker
can be used on a large-scale industrial codebase. How many lurking bugs
does it find? What
<a href="https://checkerframework.org/releases/1.9.13/api/org/checkerframework/checker/nullness/qual/Nullable.html"><code>@Nullable</code></a>
annotations are missing from the codebase because the developers failed to
write them? What are its limitations, such as code patterns that it cannot
recognize as safe? (You might create new analyses and incorporating them
into the Nullness Checker, or you might just reporting bugs to the Nullness
Checker developers for fixing.) What burdens does it place on users? Is
the cost-benefit tradeoff worth the effort &mdash; that is, should Google
adopt this tool more broadly? How should it be improved? Are the most
needed improvements in the precision of the analysis, or in the UI of the
tooling?
</p>
<h3 id="case-study-nullness-bcel">BCEL library</h3>
<p>
Annotate the BCEL library to express its contracts with respect to nullness.
Show that the BCEL library has no null pointer exceptions (or find bugs
in BCEL). There are
already <a href="https://github.com/apache/commons-bcel/compare/trunk...typetools:trunk?expand=1">some
annotations</a> in BCEL, but they have not been verified as correct by
running the Nullness Checker on BCEL. (Currently, those annotations are
trusted when type-checking clients of BCEL.)
</p>
<p>
To get started:
</p>
<ul>
<li>Fork https://github.com/typetools/commons-bcel.git</li>
<li>Clone your new fork</li>
<li><code>git checkout typecheck-nullness</code></li>
<li>mvn verify</li>
</ul>
<p>
Some challenging aspects of this case study are:
</p>
<ul>
<li>
There is some poor design that needs to be resolved in discussions with
the BCEL maintainers. For example, consider the <code>copy()</code>
method. Some implementations of <code>copy()</code> return null, but
are not documented to do so. In addition, some implementations
of <code>copy()</code> catch and ignore exceptions. I think it would
be nicest to change the methods to never return null, but to throw an
exception instead. (This is no more burdensome to users, who currently
have to check for null.) Alternately, the methods could all be
documented to return null.
</li>
</ul>
<h2 id="your-own-new-type-system">Invent your own new type system</h2>
<p>
We also welcome your ideas for new type systems. For example, any run-time
failure can probably be prevented at compile time with the right
analysis. Can you come up with a way to fix your pet peeve?
</p>
<p>
It is easiest, but not required, to choose an existing type system from the
literature, since that means you can skip the design stage and go right to
implementation.
</p>
<p>
This task can be simple or very
challenging, depending on how ambitious the type system is. Remember to
focus on what helps a software developer most!
</p>
<h2 id="Bounded-size_strings">Bounded-size strings</h2>
<!-- John Field of Google is interested in this. -->
<p>
Windows cannot run command lines longer than 8191 characters. Creating a
too-long command line causes failures when the program is run on Windows.
These failures are irritating when discovered during testing, and
embarrassing or worse when discovered during deployment. The same command
line would work on Unix, which has longer command-line limits, and as a
result developers may not realize that their change to a command can cause
such a problem.
</p>
<p>
Programmers would like to enforce that they don't accidentally pass a
too-long string to the <code>exec()</code> routine. The goal of this
project is to give a compile-time tool that provides such a guarantee.
</p>
<p>
Here are two possible solutions.
</p>
<p>
<b>Simple solution:</b>
For each array and list, determine whether its length is known at compile
time. The routines that build a command line are only allowed to take such
constant-length lists, on the assumption that if the length is constant,
its concatenation is probably short enough.
</p>
<p>
<b>More complex solution:</b>
For each String, have a compile-time estimate of its maximum length. Only
permit <code>exec()</code> to be called on strings whose estimate is no more than 8191.
String concatenation would return a string whose estimated size is the sum
of the maximums of its arguments, and likewise for concatenating an array
or list of strings.
</p>
<h2 id="lock-ordering">Lock ordering</h2>
<p>
The <a href="https://checkerframework.org/manual/#lock-checker">Lock
Checker</a> prevents race conditions by ensuring that locks are held when
they need to be. It does not prevent deadlocks that can result from locks
being acquired in the wrong order. This project would extend the Lock
Checker to address deadlocks, or create a new checker to do so.
</p>
<p>
Suppose that a program contains two different locks. Suppose that one
thread tries to acquire lockA then lockB, and another thread tries to
acquire lockB then lockA, and each thread acquires its first lock. Then
both locks will wait forever for the other lock to become available. The
program will not make any more progress and is said to
be <a href="https://en.wikipedia.org/wiki/Deadlock">deadlocked</a>.
</p>
<p>
If all threads acquire locks in the same order &mdash; in our example, say
lockA then lockB &mdash; then deadlocks do not happen. You will extend the
Lock Checker to verify this property.
</p>
<h2 id="asm">Upgrade to a newer version of ASM</h2>
<p>
The
<a href="https://checkerframework.org/annotation-file-utilities/">Annotation
File Utilities</a>, or AFU, insert annotations into, and extract
annotations from, <code>.java</code> files, <code>.class</code> files,
and text files. These programs were written before the
<a href="https://asm.ow2.org/">ASM</a> bytecode library supported Java 8's
type annotations. Therefore, the AFU has its own custom version of ASM
that supports type annotations. Now that ASM 6 has been released and it
supports type annotations, the AFU needs to be slightly changed to use
the official ASM 6 library instead of its own custom ASM variant.
</p>
<p>
This project is a good way to learn about <code>.class</code> files and
Java bytecodes: how they are stored, and how to manipulate them.
</p>
<p>
(Kush Gupta is working on this project.)
</p>
<h2 id="typestate">Stateful type systems</h2>
<p>
This project is to improve support for
<a href="https://checkerframework.org/manual/#typestate-checker">typestate checking</a>
</p>
<p>
Ordinarily, a program variable has
the same type throughout its lifetime from when the variable is declared
until it goes out of scope. "Typestate"
permits the type of an object or variable to <em>change</em> in a controlled way.
Essentially, it is a combination of standard type systems with dataflow
analysis. For instance, a file object changes from unopened, to opened, to
closed; certain operations such as writing to the file are only permitted
when the file is in the opened typestate. Another way of saying this is
that <code>write</code> is permitted after <code>open</code>, but not after <code>close</code>.
Typestate
is applicable to many other types of software properties as well.
</p>
<p>
Two <a href="https://checkerframework.org/manual/">typestate checking frameworks</a>
exist for the Checker Framework. Neither is being maintained; a new one
needs to be written.
</p>
<h2 id="analysis_diffs">Tool for analysis diffs</h2>
<!--
This project idea is duplicated at
~mernst/public_html/uw-only/research/potential-research-projects.html .
-->
<p>
Many program analyses are too verbose for a person to read their entire
output. However, after a program change, the analysis results may
change only slightly. An "analysis diff" tool could show the
difference between the analysis run on the old code and the analysis run
on the new code.
</p>
<ul>
<li>
The analysis diffs may help the programmer to better understand the
changes.
</li>
<li>
Bug detection tools. such
as <a href="http://findbugs.sourceforge.net/">FindBugs</a> or
the <a href="https://checkerframework.org/">Checker Framework</a>, have
extremely verbose output when first run on a program. Programmers
could examine and fix only the warnings about code they have changed
(and that they are currently thinking about).
</li>
<li>
Tools that always have large output, such as inference tools, could
become manageable to users if output is shown in small doses.
</li>
<li>
You can probably think of other uses.
</li>
</ul>
<p>
The analysis diff tool would take as input two analysis results (the
previous and the current one). It would output only the new parts of its
second input. (It could optionally output a complete diff between two
analysis results.)
</p>
<p>
One challenge is dealing with changed line numbers and other analysis
output differences between runs.
</p>
<p>
It would be nice to integrate the tool with git pre-commit hooks or GitHub
pull requests, to enable either of the following functionality (for either
commits to master or for pull requests):
</p>
<ul>
<li>
Permit only those commits/pulls that do not add any new analysis warnings.
</li>
<li>
Permit only those commits/pulls that are "clean" &mdash; the analysis
issues no warnings for any changed line.
</li>
</ul>
<p>
A concrete example of an analysis diff tool
is <a href="https://github.com/plume-lib/checklink/blob/master/checklink-persistent-errors">checklink-persistent-errors</a>;
see the documentation at the top of the file. That tool only works for
one particular analysis, the W3C Link Checker.
An analysis diff tool also appears to be built into FindBugs.
The goal of this project is to build a general-purpose tool that is easy
to apply to new analyses.
</p>
<h2 id="performance">Performance improvements</h2>
<p>
The Checker Framework runs much slower than the standard javac compiler
&mdash; often 20 times slower! This is not acceptable as part of a
developer's regular process, so we need to speed up the Checker
Framework. This project involves determining the cause of slowness in
the Checker Framework, and correcting those problems.
</p>
<p>
This is a good way to learn about performance tuning for Java applications.
</p>
<p>
Some concrete tasks include:
</p>
<ul>
<li>Profile the Checker Framework. Run a profiler such as
<a href="https://www.yourkit.com/java/profiler/">YourKit</a> to determine
which parts of the Checker Framework consume the most CPU time and memory.
</li>
<li>Perhaps compare a profile of the Checker Framework against a profile
of regular javac. This probably is not necessary because the Checker
Framework is so much slower than regular javac.
</li>
<li>Consider interning string values. The Checker Framework does a fair
amount of string manipulation, in part because it reads from resources
such as stub files that do not produce <code>Element</code>s. Interning
could save time when doing comparisons. You can verify the correctness
of the optimization by running the
<a href="https://checkerframework.org/manual/#interning-checker">Interning
Checker</a> on the Checker Framework code. Compare the run time of the
Checker Framework before and after this optimization.
</li>
<li> Based on profiling results, devise other optimizations, implement
them, and evaluate them.
</li>
</ul>
<h2 id="run-time-checking">Run-time checking</h2>
<p>
Implement run-time checking to complement compile-time checking. This will
let users combine the power of static checking with that of dynamic
checking.
</p>
<p>
Every type system is too strict: it rejects some programs that never go
wrong at run time. A human must insert a type loophole to make such a
program type-check. For example, Java takes this approach with its
cast operation (and in some other places).
</p>
<p>
When doing type-checking, it is desirable to automatically insert run-time
checks at each operation that the static checker was unable to verify.
(Again, Java takes exactly this approach.) This guards against mistakes by
the human who inserted the type loopholes. A nice property of this
approach is that it enables you to prevent errors in a program
with no type annotations: whenever the static checker is unable
to verify an operation, it would insert a dynamic check. Run-time checking
would also be useful in verifying whether the suppressed warnings are
correct &mdash; whether the programmer made a mistake when writing them.
</p>
<p>
The annotation processor (the pluggable type-checker) should automatically
insert the checks, as part of the compilation process.
</p>
<p>
There should be various modes for the run-time checks:
</p>
<ul>
<li>fail immediately.</li>
<li>logging, to permit post-mortem debugging without crashing the program.</li>
</ul>
<p>
The run-time penalty should be small: a run-time check is necessary only
at the location of each cast or suppressed warning. Everywhere that the
compile-time checker reports no possible error, there is no need to insert a
check. But, it will be an interesting project to determine how to minimize
the run-time cost.
</p>
<p>
Another interesting, and more challenging, design question is whether you need to add and maintain
a run-time representation of the property being tested. It's easy to test
whether a particular value is null, but how do you test whether it is
tainted, or should be treated as immutable? For a more concrete example,
see the discussion of the (not yet implemented)
[Javari run-time checker](http://pag.csail.mit.edu/pubs/ref-immutability-oopsla2005-abstract.html).
Adding this run-time support would be an interesting and challenging project.
</p>
<p>
We developed a prototype for the
<a href="https://ece.uwaterloo.ca/~wdietl/publications/pubs/EnerJ11-abstract.html">EnerJ runtime system</a>.
That code could be used as starting point, or you could start afresh.
</p>
<p>
In the short term, this could be prototyped as a source- or
bytecode-rewriting approach; but integrating it into the type checker is a
better long-term implementation strategy.
</p>
<h2 id="ide-support">IDE and build system support</h2>
<p>
The Checker Framework comes with support for
<a href="https://checkerframework.org/manual/#external-tools">external tools</a>,
including both IDEs (such as an Eclipse plug-in) and build tools
(instructions for Maven, etc.).
</p>
<p>
These plug-ins and other integration should be improved.
We have a number of concrete ideas, but you will also probably come up
with some after a few minutes of using the existing IDE plugins!
</p>
<p>
This is only a task for someone who is already an expert, such as someone
who has built IDE plugins before or is very familiar with the build system.
One reason is that these tools tend to be complex, which can lead to
subtle problems.
Another reason is that we don't want to be stuck maintaining code written by
someone who is just learning how to write an IDE plugin.
</p>
<p>
Rather than modifying the Checker Framework's existing support or
building new support from scratch, it may be better to adapt some other
project's support for build systems and IDEs. For instance, you might
make <a href="https://github.com/coala/coala">coala</a> support the
Checker Framework, or you might adapt the tool integration provided
by <a href="http://errorprone.info/">Error Prone</a>.
</p>
<!--
Integrate the Checker Framework with an IDE such as Eclipse, IntelliJ, IDEA or NetBeans,
which will make pluggable type-checking easier to use and more attractive to developers.
Some specific projects include:
* Create an IDE plug-in that invokes the checker and reports any errors to the developer, just as the IDE currently does for type errors.
* Improve the existing Eclipse plug-in.
* Add a button to the IDE that hides/shows all annotations of a given variety, to reduce clutter. This would be useful beyond type annotations.
* Implement quick fixes for common errors.
* Integrate type inference with the IDE, to make it easier for a developer to add annotations to code.
* Improve an IDE's existing refactoring support so that it is aware of, and properly retains, type annotations.
* Highlight defaulted types and flow-sensitive type refinements, making error messages easier to understand.
* Highlight the parts of the code that influenced flow-sensitive type refinements, again to make errors easier to comprehend.
IDEs that build on the OpenJDK Java compiler could benefit from even tighter integration with the Checker Framework.
This project may entail the following:
* familiarity with the IDE plug-in development environment, Eclipse PDT or NetBeans plug-in support: By the end of the project, the developer should be familiar with building reasonably complex IDE plug-ins.
* UI/UX design: The developer would design a UI experience that is appealing to developers and integrates well with the developer's workflow!
-->
<h2 id="exhaustive-testing">Model checking of a type system</h2>
<p>
Design and implement an algorithm to check type soundness of a type system
by exhaustively verifying the type checker on all programs up to a certain
size. The challenge lies in efficient enumeration of all programs and
avoiding redundant checks, and in knowing the expected outcome of the
tests. This approach is related to bounded exhaustive
testing and model checking; for a reference, see
[Efficient Software Model Checking of Soundness of Type Systems](http://www.eecs.umich.edu/~bchandra/publications/oopsla08.pdf).
</p>
<p>
A valuable project all by itself would be to compare heavy-weight and
light-weight type inference this whole-program inference vs. Checker
Framework Inference vs. Julia, to understand when each one is worth using.
</p>