| <!DOCTYPE html> |
| <html> |
| <head> |
| <title>Checker Framework ideas for new contributors (GSoC ideas)</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <link rel="icon" type="image/png" href= |
| "../logo/Checkmark/CFCheckmark_favicon.png"> |
| </head> |
| <body> |
| |
| <img src="../logo/Logo/CFLogo.png" alt="Checker Framework logo" /> |
| |
| <h1>New contributor ideas</h1> <!-- omit from toc --> |
| |
| <p>Contents:</p> |
| <!-- start toc. do not edit; run html-update-toc instead --> |
| <ul> |
| <li><a href="#introduction">Introduction</a> |
| <ul> |
| <li><a href="#get-started">How to get started: do a case study</a></li> |
| <li><a href="#ask-questions">How to get help and ask questions</a></li> |
| <li><a href="#types-of-projects">Types of projects</a></li> |
| <li><a href="#apply">How to apply</a></li> |
| </ul></li> |
| <li><a href="#evaluate-type-system">Evaluate a type system or a Checker Framework feature</a> |
| <ul> |
| <li><a href="#case-study-signature">Signature strings</a></li> |
| <li><a href="#case-study-signedness">Signed and unsigned numbers</a></li> |
| <li><a href="#optional-case-study">Java's Optional class</a></li> |
| <li><a href="#Whole-program_type_inference">Whole-program type inference</a></li> |
| <li><a href="#determinism">Determinism</a></li> |
| <li><a href="#sound-by-default">Sound checking by default</a></li> |
| <li><a href="#compare-other-tools">Comparison to other tools</a></li> |
| <li><a href="#case-study-android-support">Android support annotations</a></li> |
| </ul></li> |
| <li><a href="#annotate-library">Annotate a library</a> |
| <ul> |
| <li><a href="#choose-a-library">Choosing a library to annotate</a></li> |
| <li><a href="#case-study-nullness-guava">Guava library</a></li> |
| </ul></li> |
| <li><a href="#create-new-type-system">Create a new type system</a> |
| <ul> |
| <li><a href="#non-empty-checker">Non-Empty Checker for precise handling of Queue.peek() and poll()</a></li> |
| <li><a href="#custom-tainting-checking">Custom tainting checking</a></li> |
| <li><a href="#track-unsupported-operations">Track unsupported operations</a></li> |
| <li><a href="#overflow">Overflow checking</a></li> |
| <li><a href="#index-checker-mutable-length">Index checking for mutable length data structures</a></li> |
| <li><a href="#nullness-bug-detector">Nullness bug detector</a></li> |
| </ul></li> |
| <li><a href="#cf-other">Enhance the toolset</a> |
| <ul> |
| <li><a href="#index-errors">Improving error messages</a></li> |
| <li><a href="#java-expression-parser">Java expression parser</a></li> |
| <li><a href="#dataflow">Dataflow enhancements</a></li> |
| <li><a href="#Purity_analysis">Purity (side effect) analysis</a></li> |
| <li><a href="#javadoc">Javadoc support</a></li> |
| </ul></li> |
| </ul> |
| <!-- end toc --> |
| |
| <h1 id="introduction">Introduction</h1> |
| |
| <p> |
| The <a href="https://checkerframework.org/">Checker Framework</a> is an |
| innovative programming tool that prevents bugs at development |
| time, before they escape to production. |
| </p> |
| |
| <p> |
| Java's type system prevents some bugs, such as <code>int count = |
| "hello";</code>. However, it does not prevent other bugs, such as null |
| pointer dereferences, concurrency errors, disclosure of private |
| information, incorrect internationalization, out-of-bounds indices, etc. |
| <em>Pluggable type-checking</em> replaces a |
| programming language's built-in type system with a more powerful, |
| expressive one. |
| </p> |
| |
| <p> |
| We have created around 20 |
| <a href="https://checkerframework.org/manual/#introduction">new type |
| systems</a>, and other people have created |
| <a href="https://checkerframework.org/manual/#third-party-checkers">many |
| more</a>. |
| The more powerful type system is not just a |
| bug-finding tool: it is a verification tool that gives a guarantee that |
| no errors (of certain types) exist in your program. Even though it is |
| powerful, it is easy to use. It follows the standard typing rules |
| that programmers already know, and it fits into their workflow. |
| </p> |
| |
| <p> |
| The Checker Framework is popular: it is used daily at Google, Amazon, |
| Uber, on Wall Street, and in other companies from big to small. It is |
| attractive to programmers who care about their craft and the quality of |
| their code. The Checker Framework is the motivation for Java's type |
| annotations feature. It has received multiple awards. |
| <!-- at conferences such as JavaOne. --> |
| With this widespread use, there is a need for people to help with the |
| project: everything from bug fixes, to new features, to case studies, to |
| integration with other tools. We welcome your contribution! |
| </p> |
| |
| <p> |
| Why should you join this project? It's popular, so you will have an |
| impact. It makes code more robust and secure, which is a socially |
| important purpose. Past GSOC students have had great success. |
| (David Lazar became a graduate student at MIT; multiple students |
| have published papers in scientific conferences, such |
| as <a href="https://homes.cs.washington.edu/~mernst/pubs/array-indexing-issta2018-abstract.html">Vlastimil |
| Dort at ISSTA 2018</a>.) You will |
| get to scratch your own itch by creating tools that solve problems that |
| frustrate you. And, we have a lot of fun on this project! |
| </p> |
| |
| <p> |
| <b>Prerequisites:</b> You should be very comfortable with the Java |
| programming language and its type system. You should know how a type |
| system helps you and where it can hinder you. You should be willing to |
| dive into a moderately-sized codebase. |
| You should understand fundamental object-oriented programming concepts, |
| such as |
| <a href="https://en.wikipedia.org/wiki/Liskov_substitution_principle">behavioral |
| subtyping</a>: subtyping theory |
| permits argument types to change contravariantly (even though Java forbids it |
| for reasons related to overloading), whereas return types may change |
| <a href="https://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29">covariantly</a> |
| both in theory and in Java. |
| </p> |
| |
| |
| <p> |
| <b>Potential projects:</b> |
| Most of this document lists potential summer projects. The projects are |
| grouped roughly from easiest to most challenging. Many of the projects are |
| applicable beyond Google Summer of Code. |
| <!-- |
| You can find more potential projects in the |
| <a href="https://github.com/typetools/checker-framework/issues">issue |
| tracker</a>. |
| --> |
| </p> |
| |
| |
| |
| <h2 id="get-started">How to get started: do a case study</h2> |
| |
| <p> |
| To <b>get started</b>, first do a case study of using the Checker |
| Framework: that is, run the Checker Framework on some program. |
| A case study gives you experience in using the Checker |
| Framework, and it may reveal bugs in either the Checker Framework or in |
| the program it is analyzing. |
| </p> |
| |
| <p> |
| Do the case study before submitting your proposal. |
| You might want to start with a small program such as from your |
| coursework, then repeat the process with an open-source program or library. |
| </p> |
| <ol> |
| <li> |
| <a href="https://checkerframework.org/manual/#installation">Install</a> |
| the Checker Framework. |
| <li> |
| <a href="https://checkerframework.org/manual/#how-to-read-this-manual">Review |
| the Checker Framework documentation.</a> |
| <li> |
| Choose an existing library or program to type-check. |
| Choose a program that is at least 1000 lines long. |
| The library or program should be under active maintenance; don't choose one |
| that has not had a commit in several years. |
| You will find the case study easier if you are already familiar with |
| the program, or if it is written in good style. |
| <li> |
| Choose one type system, from |
| among <a href="https://checkerframework.org/manual/#introduction">those |
| distributed with the Checker Framework</a>, that is appropriate for the |
| program. |
| <li> |
| If the program is hosted on GitHub, fork it and create a branch for |
| your work. (Leave the master branch unchanged |
| from upstream.) |
| <li> |
| Annotate the program, based on its documentation. |
| <br/> |
| Please do <em>not</em> make changes unrelated to annotating the |
| program, such as inserting/removing whitespace or sorting |
| the <code>import</code> statements. Doing so bloats the size of the |
| diffs and makes it hard to understand the essential changes. |
| <li> |
| Change the build system so that building the annotated branch runs the type-checker. |
| <li> |
| Run the type-checker. If it issues |
| warnings, <a href="https://checkerframework.org/manual/#handling-warnings">correct them</a>. |
| This might require adding more annotations, |
| fixing bugs in the program, or suppressing warnings. |
| Be sure that the program's test suite continues to pass. |
| Repeat until |
| the type-checker passes on the program. |
| <ul> |
| <li>Don't add an <code>if</code> statement that always succeeds, just |
| to suppress a warning. Convince yourself that both branches can |
| execute, or else don't add the <code>if</code> statement. |
| <li>If you add a <code>@SuppressWarnings</code> annotation, |
| <a href="https://checkerframework.org/manual/#suppresswarnings-best-practices-smallest-scope">write |
| it on the smallest possible scope</a> and |
| <a href="https://checkerframework.org/manual/#suppresswarnings-best-practices-justification">explain |
| why</a> the checker warning is a false positive and you are certain |
| the code is safe. |
| </ul> |
| <li> |
| Share it with us so that |
| we can evaluate it and give you feedback. |
| |
| <p> |
| Share the case study as soon |
| as you finish it or as soon as you have a question that is not answered |
| in the <a href="https://checkerframework.org/manual/">manual</a>; |
| don't wait until you submit your proposal. |
| The subject line should be descriptive (not just "Case study", but |
| "Nullness case study of Apache Commons Exec library"). |
| You should give us access to |
| <ul> |
| <li>the original (unannotated) version of the program, |
| <li>the annotated version of the program, and |
| <li>the exact command that runs the type-checker from the command |
| line. |
| </ul> |
| The best way to give all this information is a pointer to your GitHub |
| fork of the library. |
| </ol> |
| |
| <p> |
| The primary result of your case study is that you will discover bugs in the |
| subject program, or you will verify that it has no bugs (of some particular |
| type). If you find bugs in open-source code, report them to the program's |
| maintainer, and let us know when they are resolved. |
| </p> |
| |
| <p> |
| Another outcome of your case study is that you may discover bugs, limitations, |
| or usability problems in the Checker Framework. Please |
| <a href="https://checkerframework.org/manual/#reporting-bugs">report them</a>. |
| We'll try to fix them, or they might give you inspiration for |
| improvements you would like to make to the Checker Framework this summer. |
| </p> |
| |
| <p> |
| You can also try to fix problems yourself and submit a |
| <a href="https://github.com/typetools/checker-framework/pulls">pull |
| request</a>, but that is <em>not</em> a requirement; most successful GSOC |
| applicants had not submitted a successful pull request before being selected. |
| (Some GSOC projects have a requirement to fix an issue in the issue tracker. |
| We do not, because it is unproductive. |
| Don't try to start fixing issues before you |
| understand the Checker Framework from the user point of view, which will |
| not happen until you have completed a case study on an open-source program.) |
| You may discuss your ideas with us by sending mail |
| to <a href="https://groups.google.com/forum/#!forum/checker-framework-gsoc">checker-framework-gsoc@googlegroups.com</a>. |
| </p> |
| |
| <p> |
| Why should you start with a case study? |
| <!-- , instead of diving right into fixing |
| bugs, designing a new type system, or making other changes to the Checker |
| Framework? |
| --> |
| Before you can contribute to any project, you must |
| understand the tool from a user point of view, including its strengths, |
| weaknesses, and how to use it. |
| A case study is the best way to |
| learn about the Checker Framework, determine whether you would enjoy |
| joining the project during the summer, and show your aptitude so that you |
| will be chosen for the summer. |
| </p> |
| |
| |
| <h2 id="ask-questions">How to get help and ask questions</h2> |
| |
| <p> |
| We are very happy to answer your questions, and we are eager to interact |
| with you! It's OK to have questions, and your questions can lead to |
| improvements in the documentation and the tool. |
| </p> |
| |
| <p> |
| Before you ask a question, read this file and the |
| <a href="https://checkerframework.org/manual/#troubleshooting">"Troubleshooting" |
| section</a> of the Checker Framework manual |
| (including <a href="https://checkerframework.org/manual/#reporting-bugs">"How |
| to report problems"</a>), |
| and also search in the |
| <a href="https://checkerframework.org/manual/">Checker Framework manual</a> |
| for the answer. |
| Don't send us a message |
| that says nothing but “please guide me” |
| or “tell me how to fix this bug”. Such a message disqualifies |
| you from participating in GSOC, because it shows that you do not read |
| instructions, and you haven't thought about the problem nor |
| tried to solve it. |
| </p> |
| |
| <p> |
| Your questions should show that you will be a productive colleague over the |
| summer: tell us what you have tried, tell us what went wrong or |
| where you got stuck, and ask a concrete technical question that will |
| help you get past your problem. If you can do that, then definitely ask |
| your question, because we don't want you to be stuck or frustrated. |
| </p> |
| |
| <p> |
| Whenever you send email (related to GSoC or not), |
| please use standard email etiquette, such as: avoid all-caps; use a |
| descriptive subject line; don't put multiple different topics in a single |
| email message; start a new thread with a new subject line |
| when you change the topic; don't clutter discussions with irrelevant |
| remarks; don't use screenshots (unless there is a problem with a GUI), but |
| instead cut-and-paste the output or code into your message; |
| if you are making a guess, clearly indicate that it is a guess and |
| your grounds for it. If you violate these basic rules, you will |
| demonstrate that you don't read instructions and you don't act |
| professionally. Bug |
| reports should be |
| <a href="https://checkerframework.org/manual/#reporting-bugs">complete</a> |
| and should usually be |
| <a href="https://checkerframework.org/manual/#reporting-bugs">reported</a> |
| to the issue tracker. |
| </p> |
| |
| |
| <h2 id="types-of-projects">Types of projects</h2> |
| |
| <p> |
| Here are some possible focuses for a project: |
| </p> |
| <ul> |
| <li> |
| <a href="#evaluate-type-system">Evaluate</a> a recently-written type |
| system, or a feature used by multiple type systems. |
| </li> |
| <li> |
| <a href="#annotate-library">Annotate</a> a popular library, so that it |
| is easier to type-check clients of the library. |
| </li> |
| <li> |
| <a href="#create-new-type-system">Create</a> a new type system, to |
| prevent some Java programming error. |
| </li> |
| <li> |
| <a href="#cf-other">Enhance</a> a type system or the Checker Framework |
| itself. |
| </li> |
| </ul> |
| |
| <p> |
| This document gives a few suggestions in each category. |
| </p> |
| |
| |
| <h2 id="apply">How to apply</h2> |
| |
| <p> |
| To <b>apply</b>, you will submit a single PDF through the Google Summer |
| of Code website. This PDF should contain two main parts. We suggest |
| that you number the parts and subparts to ensure that you don't forget anything, and |
| to ensure that we don't overlook anything when reading your application. You might find it |
| easiest to create multiple PDFs for the different parts, then concatenate |
| them before uploading to the website, but how you create your proposal is |
| entirely up to you. |
| </p> |
| |
| <ol> |
| <li>The proposal itself: what project you want to work on during the |
| summer. You might propose to do a project listed on this webpage, or |
| you might propose a different project. |
| |
| <p>The proposal should have a descriptive title, both in the PDF and in |
| the GSoC submission system. Don't use a title like "Checker |
| Proposal" or "Proposal for GSoC". Don't distract from content with |
| gratuitous graphics. |
| |
| <p>List the tasks or subparts that are required to complete your |
| project. This will help you discover a part that you had forgotten. |
| We do not require a detailed timeline, because you don't yet |
| know enough to create one. |
| </p> |
| |
| <p> |
| If you want to do a |
| case study, say what program you will do your case study on. |
| |
| <p>If you want to create a new type system (whether one proposed on |
| this webpage or one of your own devising), then your proposal should include |
| the type system's user manual. You don't have to integrate it in the Checker |
| Framework repository (in other words, use any word processor or text |
| editor you want to create a PDF file you will submit), but you should describe |
| your proposed checker's <a href="https://checkerframework.org/manual/#creating-parts-of-a-checker">parts</a> |
| in precise English or simple formalisms, and you should follow the |
| suggested <a href="https://checkerframework.org/manual/#creating-documenting-a-checker">structure</a>. |
| |
| <p> |
| If you want to do exactly what is already listed on this page, then |
| just say that (but be specific about which one!), and it will not hurt |
| your chances of being selected. However, show us what progress you |
| have made so far. You might also give specific ideas |
| about extensions, about details that are not mentioned on this webpage, |
| about implementation strategies, and so forth. |
| </p> |
| |
| <p>Never literally cut-and-paste text that was not written by you, because |
| that would be plagiarism. If you quote from text written by someone |
| else, give proper credit. |
| Don't |
| submit a proposal that is just a rearrangement of text that already |
| appears on this page or in the Checker Framework manual, because it does |
| not help us to assess your likelihood of being successful. |
| </p> |
| |
| </li> |
| |
| <li>Your qualifications. Please convince us that you |
| are likely to be successful in your proposed summer project. |
| |
| <ol> |
| <li>A URL that points to a code sample. |
| Don't write any new code, but provide code you wrote in the |
| past, such as for a class assignment |
| or a project you have worked on outside class. |
| It does not need to have anything to do with |
| the Checker Framework project. It should be your own personal work. |
| The purpose is to assess your programming skills so we can assign you |
| to an appropriate project. |
| A common problem is to submit undocumented code; we expect every |
| programmer to write documentation when working on the Checker |
| Framework. |
| Don't put a lot of different files in Google Drive and share that |
| URL; it's better to upload a single <code>.zip</code> file or |
| provide a GitHub URL. |
| </li> |
| <li> |
| What you have done to prepare yourself |
| for working with the Checker Framework during the summer. |
| You may wish to structure this as a list. |
| Examples of items in the list include: |
| <ul> |
| <li>A URL for code you have annotated as a case study. Please indicate the |
| original unannotated code, the annotated code, and the exact command to |
| run the type-checker from the command line. Ensure that the GSoC |
| mentors can compile your code. |
| (It is acceptable to use the same code, or different code, for this |
| item and the code sample above.) |
| </li> |
| <li>URLs for bugs or pull requests that you have filed.</li> |
| <li>Information about other projects you have done, or classes you |
| have taken, that prepare you for your proposed summer task. This |
| is optional, because it might already appear in your resume.</li> |
| </ul> |
| </li> |
| <li>A resume. |
| A <a href="https://en.wikipedia.org/wiki/R%C3%A9sum%C3%A9">resume</a> |
| contains a brief description of your skills and your job or project |
| experience. It will often list classes you have taken so far and your |
| GPA. It should not be longer than one page.</li> |
| <li>An unofficial transcript or grade report (don't spend |
| money for an official one).</li> |
| </ol> |
| </li> |
| </ol> |
| |
| <p> |
| The <b>best way</b> to impress us is by doing a thoughtful job in the case |
| study. The case study is even more important than the proposal text, |
| because it shows us your abilities. |
| The case study may result in you submitting issues against the issue tracker of the |
| program you are annotating or of the Checker Framework. |
| Pull requests against our GitHub project are a plus but are not required: |
| good submitted bugs are just as valuable as bug fixes! |
| You can also make a good impression by correctly answering questions from |
| other students on the GSOC mailing list. |
| </p> |
| |
| <p> |
| Get feedback! Feel free to <a href="#ask-questions">ask questions</a> |
| to make your application more |
| competitive. We want you to succeed. Historically, students who start |
| early and get feedback are most successful. You can submit a draft |
| proposal via the Google Summer of Code website, and we will review it. We |
| do <em>not</em> receive any notification when you submit a draft |
| proposal, so if you want feedback, please tell us that. |
| Also, we can only see draft proposals; we cannot see final proposals until |
| after the application deadline has passed. |
| </p> |
| |
| |
| <h1 id="evaluate-type-system">Evaluate a type system or a Checker Framework feature</h1> |
| |
| <p> |
| These projects evaluate a recently-written type system or a feature used |
| by multiple type systems. |
| Using the type systems on real code is our most important source of new ideas and improvements. |
| Many people have started out “just” doing a case |
| study but have ended up making deep, fundamental contributions and even |
| publishing scientific papers about their discoveries. |
| </p> |
| |
| <p> |
| One possible outcome is to identify |
| weaknesses in the type-checker so that we can improve it. Another |
| possible outcome is to provide evidence that the type-checker is |
| effective and convince more users to adopt it. You will probably |
| also discover defects (bugs) in the codebase being type-checked. |
| </p> |
| |
| |
| |
| <h2 id="case-study-signature">Signature strings</h2> |
| |
| <p> |
| Determine whether the <a href="https://asm.ow2.org/">ASM library</a>, or |
| some other library, properly handles signature strings. |
| </p> |
| |
| <p> |
| Some challenging aspects of this case study are: |
| </p> |
| <ul> |
| <li> |
| Some libraries define their own new signature string formats (!), which |
| you need to define in the Signature String Checker. |
| </li> |
| <li> |
| Sometimes the library's documentation is incorrect, and in other cases the |
| string format is not defined. |
| </li> |
| </ul> |
| |
| |
| <h2 id="case-study-signedness">Signed and unsigned numbers</h2> |
| |
| <p> |
| The <a href="https://checkerframework.org/manual/#signedness-checker">Signedness |
| Checker</a> ensures that you do not misuse unsigned values, such as |
| by mixing signed and unsigned values in a computation or by performing a |
| meaningless operation. |
| </p> |
| |
| <p> |
| Perform a case study of the Signedness Checker, in order to detect errors |
| or guarantee that code is correct. |
| </p> |
| |
| <p> |
| A good way to find projects that use unsigned arithmetic is to find a library that supports unsigned |
| arithmetic, then search on GitHub for projects that use that library. |
| </p> |
| |
| <p> |
| Here are some relevant libraries. |
| </p> |
| <ul> |
| <li>In the JDK's <code>Integer</code> |
| and <code>Long</code>, these include |
| <code>compareUnsigned</code>, |
| <code>divideUnsigned</code>, |
| <code>parseUnsignedInt</code>, |
| <code>remainderUnsigned</code>, and |
| <code>toUnsignedLong</code>. |
| <br/> |
| Classes like <code>DataInputStream</code>, <code>ObjectInputStream</code>, |
| and <code>RandomAccessFile</code> have <code>readUnsignedByte</code>. |
| <br/> |
| <code>Arrays</code> has <code>compareUnsigned</code>. |
| The JDK is already annotated; search for <code>@Unsigned</code> within |
| <a href="https://github.com/typetools/jdk">https://github.com/typetools/jdk</a>. |
| </li> |
| <li> |
| In Guava, see |
| its <a href="https://github.com/google/guava/wiki/PrimitivesExplained#unsigned-support">unsigned |
| support</a>, such |
| as <a href="https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/primitives/UnsignedBytes.html">UnsignedBytes</a>, |
| <a href="https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/primitives/UnsignedLong.html">UnsignedLong</a>, |
| <a href="https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/primitives/UnsignedLongs.html">UnsignedLongs</a>, |
| etc. |
| Guava is already annotated; search for <code>@Unsigned</code> within |
| <a href="https://github.com/typetools/guava">https://github.com/typetools/guava</a>. |
| </li> |
| <li>The <a href="https://github.com/jOOQ/jOOU">jOOU</a> library consists of support for unsigned |
| integers.</li> |
| </ul> |
| |
| <p> |
| Another possibility is to find Java projects that <em>could</em> use an |
| unsigned arithmetic library but do not. For |
| example, <a href="https://github.com/bcgit/bc-java">bc-java</a> defines |
| its own unsigned libraries, and some other programs might do direct bit |
| manipulation. |
| </p> |
| <!-- Not a good choice because this is just example code, not an actively-maintained project. |
| <li>Project |
| Nayuki: <a href="https://www.nayuki.io/page/forcing-a-files-crc-to-any-value">CRC</a>, <a href="https://www.nayuki.io/page/notepadcrypt-format-decryptor-java">crypt</a>, <a href="https://www.nayuki.io/page/native-hash-functions-for-java">hash</a>.</li> |
| --> |
| <!-- Not a good choice because it is not under active development: |
| <li><a href="https://bytonic.de/html/jake2.html">Jake2</a></li> |
| --> |
| |
| <p> |
| Your case studies will show the need for enhancements to the Signedness |
| Checker. For example, the Signedness Checker does not currently handle |
| boxed integers and BigInteger; these haven't yet come up in case studies |
| but could be worthwhile enhancements. You may also need to |
| write more annotations for libraries such as the JDK. |
| </p> |
| |
| |
| <h2 id="optional-case-study">Java's Optional class</h2> |
| |
| <p> |
| Java 8 introduced the |
| <a href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Optional.html"><code>Optional</code></a> |
| class, a container that is either empty or contains a non-null value. |
| It is intended to solve the problem of null |
| pointer exceptions. However, <code>Optional</code> has <a href="https://homes.cs.washington.edu/~mernst/advice/nothing-is-better-than-optional.html">its own problems</a>. |
| </p> |
| |
| <p> |
| Because of <code>Optional</code>'s problems, many commentators advise programmers to use |
| <code>Optional</code> only in limited ways. |
| </p> |
| |
| <p> |
| The goal of this project is to evaluate |
| the <a href="https://checkerframework.org/manual/#optional-checker">Optional |
| Checker</a>, which warns programmers who |
| have misused <code>Optional</code>. |
| Another goal is to extend the Optional Checker to make it more precise or |
| to detect other misuses of <code>Optional</code>. |
| </p> |
| |
| |
| <h2 id="Whole-program_type_inference">Whole-program type inference</h2> |
| |
| <p> |
| A type system is useful because it prevents certain errors. The downside |
| of a type system is the effort required to write the types. Type inference |
| is the process of writing the types for a program. |
| </p> |
| |
| <p> |
| The Checker Framework includes |
| a <a href="https://checkerframework.org/manual/#whole-program-inference">whole-program |
| inference</a> that inserts type qualifiers in the user's program. |
| It works well on some programs, but needs more enhancements to work well on |
| all programs. |
| </p> |
| |
| |
| <h2 id="determinism">Determinism</h2> |
| |
| <p> |
| Programs are easier to use and debug if their output is deterministic. For |
| example, it is easier to test a deterministic program, because |
| nondeterminism can lead to flaky tests that sometimes succeed and sometimes |
| fail. As another example, it is easier for a user or programmer to compare |
| two deterministic executions than two nondeterministic executions. |
| </p> |
| |
| <p> |
| We have created a prototype Determinism Checker. It is documented in |
| this <a href="https://checkerframework.org/determinism-checker-manual/manual.html#determinism-checker">draft |
| manual chapter</a>, and its implementation is in |
| <a href="https://github.com/t-rasmud/checker-framework/tree/nondet-checker">t-rasmud/nondet-checker</a>. |
| You will do a case study of this type system. |
| </p> |
| |
| |
| <h2 id="sound-by-default">Sound checking by default</h2> |
| |
| <p> |
| By default, the Checker Framework is |
| <a href="https://checkerframework.org/manual/#unsound-by-default">unsound |
| in</a> <a href="https://checkerframework.org/manual/#nullness-lint">several</a> |
| <a href="https://github.com/typetools/checker-framework/issues/986">circumstances</a>. |
| “Unsound” means that the Checker Framework |
| may report no warning even though the program can misbehave at run time. |
| </p> |
| |
| <p> |
| The reason that the Checker Framework is unsound is that we believe that |
| enabling these checks would cause too many false positive warnings: |
| warnings that the Checker Framework issues because it cannot prove that the |
| code is safe (even though a human can see that the code is safe). Having |
| too many false positive warnings would irritate users and lead them not to |
| use the checker at all, or would force them to simply disable those checks. |
| </p> |
| |
| <p> |
| We would like to do studies of these command-line options to see whether |
| our concern is justified. Is it prohibitive to enable sound checking? Or can we |
| think of enhancements that would let us turn on those checks that are |
| currently disabled by default? |
| </p> |
| |
| <p> |
| There is no need to annotate new code for this project. Just use existing |
| annotated codebases, such as those that are type-checked as part of the |
| Checker |
| Framework's <a href="https://github.com/typetools/checker-framework/blob/master/azure-pipelines.yml">Azure |
| Pipeline</a>. In other words, you can start by enabling Azure |
| Pipelines for your fork and then changing the default behavior in a |
| branch. The Azure Pipelines job will show you what new warnings appear. |
| </p> |
| |
| |
| <h2 id="compare-other-tools">Comparison to other tools</h2> |
| |
| <p> |
| Many other tools exist for prevention of programming errors, such as |
| Error Prone, NullAway, FindBugs, JLint, PMD, and IDEs such as Eclipse and |
| IntelliJ. These tools |
| are not as powerful as the Checker Framework (some are bug finders rather |
| than verification tools, and some perform a shallower analysis), but they |
| may be easier to use. |
| Programmers who use these tools wonder, "Is it worth my time to switch to |
| using the Checker Framework?" |
| </p> |
| |
| <p> |
| The goal of this project is to perform a head-to-head comparison of as |
| many different tools as possible. You will quantify: |
| </p> |
| |
| <ul> |
| <li>the number of annotations that need to be written</li> |
| <li>the number of bugs detected</li> |
| <li>the number of bugs missed</li> |
| <li>the number of false positive warnings</li> |
| </ul> |
| |
| <p> |
| This project will help programmers to choose among the different tools |
| — it will show when a programmer should or should not use the |
| Checker Framework. |
| This project will also indicate how each tool should be improved. |
| </p> |
| |
| <p> |
| One place to start would be with an old version of a program that is |
| known to contain bugs. Or, start with the latest version of the program |
| and re-introduce fixed bugs. (Either of these is more realistic than |
| introducing artificial bugs into the program.) A possibility would be to |
| use the Lookup program that has been used in previous case studies. |
| </p> |
| |
| |
| <h2 id="case-study-android-support">Android support annotations</h2> |
| |
| <p> |
| Android uses its own annotations that are similar to some in the Checker |
| Framework. Examples include the |
| <a href="https://tips.seebrock3r.me/annotations-to-support-your-contracts-609ff259d5df">Android |
| Studio support annotations</a>, |
| including <code>@NonNull</code>, <code>@IntRange</code>, <code>@IntDef</code>, |
| and others. |
| </p> |
| |
| <p> |
| The goal of this project is to implement support for these annotations. |
| That is probably as simple as creating aliased annotations |
| by calling method <code>addAliasedTypeAnnotation()</code> |
| in <a href="https://checkerframework.org/api/org/checkerframework/framework/type/AnnotatedTypeFactory.html">AnnotatedTypeFactory</a>. |
| </p> |
| |
| <p> |
| Then, do a case study to show the utility (or not) of |
| pluggable type-checking, by comparison with how Android Studio currently |
| checks the annotations. |
| </p> |
| |
| |
| <h1 id="annotate-library">Annotate a library</h1> |
| |
| <p> |
| These projects annotate a library, so that it is easier to |
| type-check clients of the library. Another benefit is that this may find |
| bugs in the library. It can also give evidence for the usefulness of |
| pluggable type-checking, or point out ways to improve the Checker |
| Framework. |
| </p> |
| |
| |
| <p> |
| When type-checking a method call, the Checker Framework uses the method |
| declaration's annotations. |
| This means that in order to type-check code that uses a library, the |
| Checker Framework needs an annotated version of the library. |
| </p> |
| |
| <p> |
| The Checker Framework comes with a |
| few <a href="https://search.maven.org/search?q=annotatedlib">annotated |
| libraries</a>. Increasing this number will make the Checker Framework even |
| more useful, and easier to use. |
| </p> |
| |
| <p> |
| After you have <a href="#choose-a-library">chosen a library</a>, |
| fork the library's source code, adjust |
| its <a href="https://checkerframework.org/manual/#external-tools">build |
| system</a> to run the Checker Framework, and add annotations to it until |
| the type-checker issues no warnings. |
| </p> |
| |
| <p> |
| Before you get started, be sure to read |
| <a href="https://checkerframework.org/manual/#get-started-with-legacy-code">How |
| to get started annotating legacy code</a>. More generally, read the |
| <a href="https://checkerframework.org/manual/#how-to-read-this-manual">relevant |
| sections of the Checker Framework manual</a>. |
| </p> |
| |
| |
| <h2 id="choose-a-library">Choosing a library to annotate</h2> |
| |
| <p> |
| There are several ways to <b>choose a library</b> to annotate: |
| </p> |
| <ul> |
| <li> |
| The best way to choose a library is to try to annotate a program and notice |
| that library annotations are needed in order to type-check the program. |
| </li> |
| <li> |
| Alternately, you can |
| choose a <a href="https://docs.google.com/spreadsheets/d/17x_jKkGquEFq7LBQhS9HGXiG7iIl2AlXoPGfB6N5_bw">popular |
| Java library</a>. |
| </li> |
| </ul> |
| |
| <p> |
| When annotating a library, it is important to type-check both the library |
| and at least one client that uses it. Type-checking the client will |
| ensure that the library annotations are accurate. |
| </p> |
| |
| <p> |
| Whatever library you choose, you will need to deeply understand its |
| source code. You will find it easier to work with a library that is |
| well-designed and well-documented. |
| </p> |
| |
| <p> |
| You should choose a library that is |
| not <a href="https://search.maven.org/search?q=org.checkerframework.annotatedlib">already |
| annotated</a>. There are two exceptions to this. |
| </p> |
| <ul> |
| <li> |
| A library might be annotated for one type system, but you add |
| annotations for a different type system. One advantage of this is that |
| the library's build system is already set up to run the Checker |
| Framework. You can tell which type systems a library is annotated for |
| by examining its source code. |
| </li> |
| <li> |
| A library might be annotated, but the annotations have not been |
| verified by running the type-checker on the library source code. You |
| would verify that the annotations in the library are correct. |
| </li> |
| </ul> |
| |
| |
| <h2 id="case-study-nullness-guava">Guava library</h2> |
| |
| <p> |
| Guava is already partially annotated with nullness annotations — in |
| part by Guava's developers, and in part by the Checker Framework team. |
| However, Guava does not yet type-check without errors. Doing so could |
| find more errors (the Checker Framework has found nullness and indexing |
| errors in Guava in the past) and would be a good case study to learn the |
| limitations of the Nullness Checker. |
| </p> |
| |
| |
| <h1 id="create-new-type-system">Create a new type system</h1> |
| |
| <p> |
| The Checker Framework is shipped with <a href="https://checkerframework.org/manual/#introduction">about 20 type-checkers</a>. Users can |
| <a href="https://checkerframework.org/manual/#creating-a-checker">create a |
| new checker</a> of their own. However, some users don't want to go to |
| that trouble. They would like to have more type-checkers packaged with the |
| Checker Framework for easy use. |
| </p> |
| |
| <p> |
| Each of these projects requires you to design a <a href="https://checkerframework.org/manual/#creating-a-checker">new type system</a>, |
| implement it, and perform case studies to demonstrate that it is both |
| usable and effective in finding/preventing bugs. |
| </p> |
| |
| |
| <h2 id="non-empty-checker">Non-Empty Checker for precise handling of Queue.peek() and poll()</h2> |
| |
| <p> |
| The Nullness Checker issues a false positive warning for this code: |
| </p> |
| |
| <pre> |
| import java.util.PriorityQueue; |
| import org.checkerframework.checker.nullness.qual.NonNull; |
| |
| public class MyClass { |
| public static void usePriorityQueue(PriorityQueue<@NonNull Object> active) { |
| while (!(active.isEmpty())) { |
| @NonNull Object queueMinPathNode = active.peek(); |
| } |
| } |
| } |
| </pre> |
| |
| <p> |
| The Checker Framework does not determine that <code>active.peek()</code> returns a non-null value in this context. |
| </p> |
| |
| <p> |
| The contract of <code>peek()</code> is that it returns a non-null value if the queue is not empty and the queue contains no null values. |
| </p> |
| |
| <p> |
| To handle this code precisely, the Nullness Checker needs to know, for each queue, whether it is empty. |
| This is analogous to how the Nullness Checker tracks whether a particular |
| value <a href="https://checkerframework.org/manual/#map-key-checker">is a key in a map</a>. |
| </p> |
| |
| <p> |
| It should be handled the same way: by adding a new subchecker, called the |
| Nonempty Checker, to the Nullness Checker. Its types are: |
| </p> |
| <ul> |
| <li><code>@UnknownNonEmpty</code> — the queue might or might not be empty |
| <li><code>@NonEmpty</code> — the queue is definitely non-empty |
| </ul> |
| |
| <p> |
| There is a start at this type-checker in branch <code>nonempty-checker</code>. It: |
| </p> |
| <ul> |
| <li>defines the annotations |
| </li> |
| <li>creates the integration into the Nullness Checker |
| </li> |
| </ul> |
| <p> |
| However, it is not done. (In fact, it doesn't even compile.) |
| For information about what needs to be done, see <a href="https://github.com/typetools/checker-framework/issues/399">issue #399</a>. |
| </p> |
| |
| <p> |
| When you are done, the Nullness Checker should issue only the <code>// ::</code> diagnostics from <code>checker/tests/nullness/IsEmptyPoll.java</code> — no more and no fewer. |
| You can test that by running the Nullness Checker on the file, and when you are done you should delete the <code>// @skip-test</code> line so that the file is run as part of the Checker Framework test suite. |
| </p> |
| |
| |
| <h2 id="custom-tainting-checking">Custom tainting checking</h2> |
| |
| <p> |
| The Checker Framework comes with |
| a <a href="https://checkerframework.org/manual/#tainting-checker">Tainting |
| Checker</a> that is so general that it is not good for much of anything. |
| In order to be useful in a particular domain, a user must customize it: |
| </p> |
| <ul> |
| <li> |
| rename the <code>@Tainted</code> and <code>@Untainted</code> qualifiers |
| to something more specific (such as <code>@Private</code> |
| or <code>@PaymentDetails</code> or <code>@HtmlQuoted</code>), and |
| <li> |
| annotate libraries. |
| </ul> |
| |
| <p> |
| The first part of this project is to make this customization easier to do |
| — preferably, a user will not have to change any code in the Checker |
| Framework (the |
| <a href="https://checkerframework.org/manual/#subtyping-checker">Subtyping |
| Checker</a> already works this way). |
| As part of making customization easier, a user should be able to specify |
| multiple levels of taint — many information classification hierarchies |
| have more than two levels (for example, the US government separates classified |
| information into three categories: Confidential, Secret, and Top Secret). |
| </p> |
| |
| <p> |
| The second part of this project is to provide several examples, and do case |
| studies showing the utility of compile-time taint checking. |
| </p> |
| |
| <p> |
| Possible examples include: |
| </p> |
| <ul> |
| <li>SQL injection |
| <li>OS command injection |
| <li>the <code>@PrivacySource</code> and <code>@PrivacySink</code> |
| annotations used by the Facebook <a href="http://fbinfer.com/">Infer |
| static analyzer</a>. |
| <li>information flow |
| <li>many of the <a href="http://cwe.mitre.org/top25/">CWE/SANS most |
| dangerous software programming errors</a> (and the "on the cusp" ones too) |
| <!-- More details appear in these files: |
| ~/research/games/notes/notes |
| ~/prof/grants/2012-02-darpa-verigames/proposal/top25-as-types.pdf |
| --> |
| </ul> |
| |
| <p> |
| For some microbenchmarks, see the Juliette test suite for Java from CWE. |
| </p> |
| |
| |
| <h2 id="track-unsupported-operations">Track unsupported operations</h2> |
| |
| <p> |
| Some objects do not fully implement their interface; they |
| throw <code>UnsupportedOperationException</code> for some operations. One |
| example is unmodifiable collections. They throw the exception when a |
| mutating operation is called, such |
| as <code>add</code>, <code>addAll</code>, <code>put</code>, <code>remove</code>, |
| etc. |
| </p> |
| |
| <p> |
| Design a type system to track which operations might not be supported. |
| The checker would issue a warning whenever |
| an <code>UnsupportedOperationException</code> might occur at run time. |
| </p> |
| |
| <p> |
| Here is a possible type hierarchy: |
| </p> |
| |
| <pre> |
| @MightSupportNothing ← every hierarchy must have a top type |
| | |
| @MightNotSupport({"add", remove"}) |
| | |
| @MightNotSupport("add") |
| | |
| @SupportsAllOperations ← this is the default type qualifier |
| </pre> |
| |
| <p> |
| <code>@SupportsAllOperations</code> |
| indicates that an object that supports all its documented operations: |
| it never throws <code>UnsupportedOperationException</code>. |
| <code>@MightNotSupport(operations)</code> indicates which methods might throw |
| <code>UnsupportedOperationException</code>. |
| Methods such |
| as <a href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Arrays.html#asList(T...)">Arrays.asList</a> |
| and <a href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Collections.html#emptyList()">Collections.emptyList</a> |
| must be annotated to return one of these less-capable supertypes. |
| </p> |
| |
| <p> |
| (An alternative would be to have annotations that are specific to certain |
| types, such as <code>@MutableList</code> for lists.) |
| </p> |
| |
| <p> |
| If a program uses no objects that |
| throw <code>UnsupportedOperationException</code>, no annotations are |
| required. Experiments must be done to determine how high the annotation |
| burden is, in a program that extensively uses such objects. If the whole |
| program is available, then type inference can infer the needed types. |
| </p> |
| |
| |
| <h2 id="overflow">Overflow checking</h2> |
| |
| <p> |
| Overflow is when 32-bit arithmetic differs from ideal arithmetic. For |
| example, in Java the <code>int</code> computation 2,147,483,647 + 1 yields |
| a negative number, -2,147,483,648. The goal of this project is to detect |
| and prevent problems such as these. |
| </p> |
| |
| <p> |
| One way to write this is as an extension of the Constant Value Checker, |
| which already keeps track of integer ranges. It even already |
| <a href="https://checkerframework.org/manual/#value-checker-overflow">checks |
| for overflow</a>, but it never issues a warning when it discovers |
| possible overflow. Your variant would do so. |
| </p> |
| |
| <p> |
| This problem is so challenging that there has been almost no previous |
| research on static approaches to the problem. (Two relevant papers are |
| <a href="https://www.researchgate.net/publication/221655385_IntScope_Automatically_Detecting_Integer_Overflow_Vulnerability_in_X86_Binary_Using_Symbolic_Execution">IntScope: |
| Automatically Detecting Integer Overflow Vulnerability in x86 Binary Using |
| Symbolic Execution</a> and |
| <a href="https://dl.acm.org/citation.cfm?id=3136872">Integer Overflow |
| Vulnerabilities Detection in Software Binary Code</a>.) Researchers are |
| concerned that users will have to write a lot of annotations indicating the |
| possible ranges of variables, and that even so there will be a lot of false |
| positive warnings due to approximations in the conservative analysis. |
| For example, will every loop that contains <code>i++</code> cause a warning that <code>i</code> might overflow? |
| That would not be acceptable: users would just disable the check. |
| </p> |
| |
| <p> |
| You can convince yourself of the difficulty by manually analyzing programs |
| to see how clever the analysis has to be, or manually simulating your |
| proposed analysis on a selection of real-world code to learn its |
| weaknesses. You might also try it |
| on <a href="https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html">good |
| and bad binary search code</a>. |
| </p> |
| |
| <p> |
| One way to make the problem tractable is to limit its scope: instead of |
| being concerned with all possible arithmetic overflow, focus on a specific |
| use case. |
| As one concrete application, |
| the <a href="https://checkerframework.org/manual/#index-checker">Index |
| Checker</a> is currently unsound in the presence of integer overflow. If |
| an integer <code>i</code> is known to be <code>@Positive</code>, and 1 is |
| added to it, then the Index Checker believes that its type |
| remains <code>@Positive</code>. If <code>i</code> was |
| already <code>Integer.MAX_VALUE</code>, then the result is negative — |
| that is, the Index Checker's approximation to it is unsound. |
| </p> |
| |
| <p> |
| This project involves removing this unsoundness by implementing a type system to track when an |
| integer value might overflow — but this only matters for values that |
| are used as an array index. |
| That is, checking can be restricted to computations that involve an operand |
| of type <code>@IntRange</code>). |
| Implementing such an analysis would permit the Index Checker |
| to extend its guarantees even to programs that might overflow. |
| </p> |
| |
| <p> |
| This analysis is important for some indexing bugs in practice. |
| Using the Index Checker, we found 5 bugs in Google |
| Guava related to overflow. Google marked these as high priority and |
| fixed them immediately. In practice, there would be a run-time exception |
| only for an array of size approximately <code>Integer.MAX_INT</code>. |
| </p> |
| |
| <p> |
| You could write an extension of the Constant Value Checker, which already |
| keeps track of integer ranges and |
| even <a href="https://checkerframework.org/manual/#value-checker-overflow">determines |
| when overflow is possible</a>. It doesn't issue a warning, but your |
| checker could record whether overflow was possible (this could be a |
| two-element type system) and then issue a warning, if the value is used as |
| an array index. |
| Other implementation strategies may be possible. |
| </p> |
| |
| <p> |
| Here are some ideas for how to avoid the specific problem |
| of issuing a warning about potential overflow for every <code>i++</code> in |
| a loop (but maybe other approaches are possible): |
| </p> |
| <ul> |
| <li> |
| The loop checks whether <code>i == Integer.MAX_VALUE</code> before |
| incrementing. This wide-scale, disruptive code change is not |
| acceptable. |
| </li> |
| <li> |
| Make the default array size (the length of an unannotated array) be |
| <code>@ArrayLenRange(0, Integer.MAX_VALUE-1)</code> rather |
| than <code>@UnknownVal</code>, which is equivalent |
| to <code>@ArrayLenRange(0, Integer.MAX_VALUE-1)</code>. Now, every |
| array construction requires the client to establish that the length is |
| not <code>Integer.MAX_VALUE</code>. I don't have a feel for whether |
| this would be unduly burdensome to users. |
| </li> |
| </ul> |
| |
| |
| <h2 id="index-checker-mutable-length">Index checking for mutable length data structures</h2> |
| |
| <p> |
| The <a href="https://checkerframework.org/manual/#index-checker">Index |
| Checker</a> is currently restricted to fixed-size data structures. A |
| fixed-size data structure is one whose length cannot be changed once it is |
| created, such as arrays and Strings. This limitation prevents the Index |
| Checker from verifying indexing operations on mutable-size data structures, |
| like Lists, that have <code>add</code> or <code>remove</code> |
| methods. Since these kind of collections are common in practice, this is a |
| severe limitation for the Index Checker. |
| </p> |
| |
| <p> |
| The limitation is caused by the Index Checker's use of types that are dependent on the length of data structures, |
| like <code>@LTLengthOf("data_structure")</code>. If <code>data_structure</code>'s length could change, |
| then the correctness of this type might change. |
| </p> |
| |
| <p> |
| A naive solution would be to invalidate these types any time a method is called on <code>data_structure</code>. |
| Unfortunately, aliasing makes this still unsound. Even more, a great solution to this problem would keep |
| the information in the type when a method like add or remove is called on <code>data_structure</code>. |
| A more complete solution might involve some special annotations on List that permit the information to be persisted. |
| </p> |
| |
| <p> |
| This project would involve designing and implementing a solution to this problem. |
| </p> |
| |
| |
| <h2 id="nullness-bug-detector">Nullness bug detector</h2> |
| |
| <p> |
| Verifying a program to be free of errors can be a daunting task. When |
| starting out, a user may be more interested in |
| <a href="https://checkerframework.org/manual/#other-tools">bug-finding</a> |
| than verification. The goal of this project is to create a nullness bug |
| detector that uses the powerful analysis of the Checker Framework and its |
| Nullness Checker, but omits some of its more confusing or expensive |
| features. The goal is to create a fast, easy-to-use bug detector. It |
| would enable users to start small and advance to full verification in the |
| future, rather than having to start out doing full verification. |
| </p> |
| |
| <p> |
| This could be structured as a new NullnessLight Checker, or as a |
| command-line argument to the current Nullness Checker. Here are some |
| differences from the real Nullness checker: |
| </p> |
| <ul> |
| <li>No initialization analysis; the checker assumes that every value is |
| initialized.</li> |
| <li>No map key analysis; assume that, at every call to |
| <code>Map.get</code>, the given key appears in the map.</li> |
| <li>No invalidation of dataflow facts. Assume all method calls are pure, |
| so method calls do not invalidate dataflow facts. Assume there is no |
| aliasing, so field updates do not invalidate dataflow facts. |
| </li> |
| <li>Assume that boxing of primitives is <code>@Pure</code>: it returns |
| the same value on every call.</li> |
| <li>If the Checker Framework cannot infer a type argument, assume that |
| the type argument is <code>@NonNull</code>.</li> |
| </ul> |
| <p> |
| Each of these behaviors should be controlled by its own command-line |
| argument, as well as being enabled in the NullnessLight Checker. |
| </p> |
| |
| <p> |
| The implementation may be relatively straightforward, since in most cases |
| the behavior is just to disable some functionality of existing checkers. |
| </p> |
| |
| <p>Tools such as FindBugs, NullAway, NullnessLight, and the Nullness |
| Checker form a spectrum from easy-to-use bug detectors to sound |
| verification. NullnessLight represents a new point in the design space. |
| It will be interesting to compare these checkers: |
| </p> |
| |
| <ul> |
| <li>How much easier is it to use? For example, how many fewer |
| annotations need to be written?</li> |
| <li> |
| How many more fewer true positives does it report — in other |
| words, how many more false negatives does it suffer? |
| </li> |
| <li> |
| How many fewer false positives does it report? |
| </li> |
| </ul> |
| |
| <p> |
| Uber's <a href="https://github.com/uber/NullAway">NullAway</a> tool is also |
| an implementation of this idea (that is, a fast, but incomplete and |
| unsound, nullness checker). NullAway doesn't let the user specify Java |
| Generics: it assumes that every type parameter is <code>@NonNull</code>. |
| Does Uber's tool provide users a good |
| introduction to the ideas that a user can use to transition to a nullness |
| type system later? |
| </p> |
| |
| |
| |
| <h1 id="cf-other">Enhance the toolset</h1> |
| |
| |
| <h2 id="index-errors">Improving error messages</h2> |
| |
| <p> |
| Compiler writers have come to realize that clarity of error |
| messages is as important as the speed of the executable |
| (<a href="http://www.brettbecker.com/wp-content/uploads/2016/06/Becker-Effective-2016-SIGCSE.pdf">1</a>, <a href="https://www.mville.edu/sites/default/files/p53-munson_1.pdf">2</a>, |
| <a href="http://se.ethz.ch/~meyer/publications/teaching/compiler-errors.pdf">3</a>, |
| <a href="http://static.barik.net/barik/publications/icse2017/PID4655707.pdf">4</a>). This is especially true when the language or type system has rich features. |
| </p> |
| |
| <p> |
| The goal of this project is to improve a compiler's error messages. Here are |
| some distinct challenges: |
| </p> |
| <ul> |
| <li> |
| Some type errors can be more concisely or clearly expressed than the |
| standard "found type A, expected type B" message. |
| </li> |
| <li> |
| Some types are complex. The error message could explain them, or link |
| to the manual, or give suggested fixes. |
| </li> |
| <li> |
| Compiler messages currently show |
| the <a href="https://checkerframework.org/manual/#effective-qualifier">effective |
| type</a>, which may be different than what the user wrote due to |
| defaulting, inference, and syntactic sugar. For example, a user-written |
| <code>@IndexFor("a")</code> annotation is syntactic sugar for |
| <code>@NonNegative @LTLengthOf("a")</code>, and those types are |
| the ones that currently appear in error messages. |
| It might be good to show simpler types or ones that the user wrote. |
| </li> |
| <li> |
| Some checkers combine multiple cooperating type systems; |
| the <a href="https://checkerframework.org/manual/#nullness-checker">Nullness |
| Checker</a> and |
| the <a href="https://checkerframework.org/manual/#index-checker">Index |
| Checker</a> are examples. If there is a problem with a variable's |
| lower bound type, then its upper bound type should not be shown in the |
| error message. This will make the message shorter and more specific, |
| and avoid distracting the user with irrelevant information. |
| </li> |
| <li> |
| When a checker has multiple type systems, a type error or the lack of one may depend on facts from multiple type systems, and this should be expressed to the user. |
| </li> |
| </ul> |
| |
| |
| <h2 id="java-expression-parser">Java expression parser</h2> |
| |
| <p> |
| A number of type annotations take, as an |
| argument, <a href="https://checkerframework.org/manual/#java-expressions-as-arguments">a |
| Java expression</a>. The representation for these |
| (the <a href="https://checkerframework.org/api/org.checkerframework.dataflow.expression.JavaExpression.html"><code>JavaExpression</code></a> |
| class) is a hack. The goal of this |
| project is to remove it. |
| </p> |
| |
| <p> |
| The <code>JavaExpression</code> class |
| represents an AST. There is no need for the Checker Framework to |
| define its own AST when the JavaParser AST already exists and is |
| maintained. In fact, <code>JavaExpressionParseUtil</code> uses JavaParser, |
| but needlessly converts a |
| JavaParser <code>Expression</code> into |
| a <code>JavaExpression</code>. |
| </p> |
| |
| <p> |
| The goals for the project include: |
| </p> |
| <ul> |
| <li> |
| Replace every use |
| of <a href="https://checkerframework.org/api/org.checkerframework.dataflow.expression.JavaExpression.html"><code>JavaExpression</code></a> |
| by a use of the JavaParser |
| class <a href="https://www.javadoc.io/static/com.github.javaparser/javaparser-core/3.15.22/com/github/javaparser/ast/expr/Expression.html"><code>com.github.javaparser.ast.expr.Expression</code></a>. |
| </li> |
| <li> |
| Replace every use of a subclass of <code>JavaExpression</code> (listed in the |
| "Direct Known Subclasses" section of |
| the <a href="https://checkerframework.org/api/org.checkerframework.dataflow.expression.JavaExpression.html"><code>JavaExpression</code> |
| API documentation)</a> by a use of a |
| <a href="https://www.javadoc.io/static/com.github.javaparser/javaparser-core/3.15.22/com/github/javaparser/ast/expr/Expression.html">subclass of <code>Expression</code></a>. For example, replace every use |
| of <a href="https://checkerframework.org/api/org/checkerframework/dataflow/expression/MethodCall.html"><code>MethodCall</code></a> by <a href="https://www.javadoc.io/static/com.github.javaparser/javaparser-core/3.15.22/com/github/javaparser/ast/expr/MethodCallExpr.html"><code>MethodCallExpr</code></a>. |
| </li> |
| <li> |
| The <a href="https://checkerframework.org/api/org/checkerframework/framework/util/JavaExpressionParseUtil.html"><code>JavaExpressionParseUtil</code></a> |
| class already uses JavaParser, but it uses <code>ExpressionToReceiverVisitor</code> it to construct |
| a <code>JavaExpression</code>. Have it return a |
| JavaParser <code>Expression</code> instead, and delete <code>ExpressionToReceiverVisitor</code>. |
| </li> |
| </ul> |
| |
| <p> |
| Direct replacement of the classes is not possible, or we would have done it |
| already. For example, <code>JavaExpression</code> contains some methods that |
| JavaParser lacks, such as <code>isUnassignableByOtherCode</code>. As a |
| first step before doing the tasks listed above, you may want to convert |
| these methods from instance methods of <code>JavaExpression</code> into static |
| methods in <code>JavaExpressions</code>, making <code>JavaExpression</code> more |
| like a standard AST that can be replaced by JavaParser classes. |
| You also need to decide how to store the <code>type</code> field |
| of <code>JavaExpression</code>, when <code>JavaExpression</code> is eliminated. |
| An |
| alternate design (or a partial step in the refactoring process) would be to |
| retain the <code>JavaExpression</code> class, but make it a thin wrapper around |
| JavaParser classes that do most of the real work. |
| </p> |
| |
| <p> |
| Another aspect of this project is |
| fixing <a href="https://github.com/typetools/checker-framework/issues?q=is%3Aopen+is%3Aissue+label%3AJavaExpressions">the |
| issues that are labeled "JavaExpression".</a> |
| </p> |
| |
| |
| <h2 id="dataflow">Dataflow enhancements</h2> |
| |
| <p> |
| The Checker |
| Framework's <a href="https://checkerframework.org/manual/#creating-dataflow">dataflow |
| framework</a> |
| (<a href="https://checkerframework.org/manual/checker-framework-dataflow-manual.pdf">manual |
| here</a>) implements flow-sensitive type refinement (local type |
| inference) and other features. It is used in the Checker |
| Framework and also in <a href="http://errorprone.info/">Error Prone</a>, |
| <a href="https://github.com/uber/NullAway">NullAway</a>, and elsewhere. |
| </p> |
| |
| <p> |
| There are a number |
| of <a href="https://github.com/typetools/checker-framework/issues?q=is%3Aopen+is%3Aissue+label%3ADataflow">open |
| issues</a> — both bugs and feature requests — related to the |
| dataflow framework. The goal of this project is to address as many of |
| those issues as possible, which will directly improve all the tools that |
| use it. |
| </p> |
| |
| |
| <h2 id="Purity_analysis">Purity (side effect) analysis</h2> |
| |
| <p> |
| A program analysis technique makes estimates about the current values of |
| expressions. When a method call occurs, the analysis has to throw away |
| most of its estimates, because the method call might change any variable. |
| If the method is known to have no side effects, then the analysis doesn't |
| need to throw away its estimates, and the analysis is more precise. |
| </p> |
| |
| <p> |
| For example, the Checker Framework unsoundly trusts but does not check |
| <a href="https://checkerframework.org/manual/#type-refinement-purity">purity annotations</a>. This makes |
| the system vulnerable to programmer mistakes when writing annotations. The |
| Checker Framework contains a sound checker for immutability annotations, |
| but it suffers too many false positive warnings and thus is not usable. A |
| better checker is necessary. It will also incorporate aspects of an escape |
| analysis. |
| </p> |
| |
| <p> |
| Choosing an algorithm from the literature is the best choice, but there |
| still might be research work to do: in the past, when implementing |
| algorithms from research papers, we have sometimes found that they did not |
| work as well as claimed, and we have had to enhance them. One challenge is |
| that any technique used by pluggable type-checking to verify immutability |
| must be modular, but many side effect analyses require examining the whole |
| program. The system should require few or no method annotations within |
| method bodies. I'm not sure whether such a system already exists or we |
| need to design a new one. |
| </p> |
| |
| <p> |
| Perhaps one of these existing side effect analyses could be used: |
| <a href="https://github.com/Sable/soot/wiki/Using-Side-Effect-Attributes">Soot</a>, |
| <a href="https://pdfs.semanticscholar.org/d2bd/29e20c99a10ab9d0c5ecf3acaf15606407d1.pdf">Geffken</a>. |
| </p> |
| |
| <!-- |
| For design and implementation ideas, see: |
| * $qn/notes-purity |
| * ~/prof/grants/2011-09-darpa-apac/phase3-extension-proposal |
| --> |
| |
| |
| <h2 id="javadoc">Javadoc support</h2> |
| |
| <p> |
| Currently, type annotations are only displayed in Javadoc if they are |
| explicitly written by the programmer. However, the Checker Framework |
| provides flexible defaulting mechanisms, reducing the annotation overhead. |
| This project will integrate the Checker Framework defaulting phase with |
| Javadoc, showing the signatures after applying defaulting rules. |
| </p> |
| |
| <p> |
| There are other type-annotation-related improvements to Javadoc that can be |
| explored, e.g. using JavaScript to show or hide only the type annotations |
| currently of interest. |
| </p> |
| |
| |
| |
| </body> |
| </html> |
| |
| <!-- LocalWords: GSoC uwplse codespecs randoop typetools blogosphere Shi |
| --> |
| <!-- LocalWords: Marks's Bikesheds ICST Gyori Legunsen Marinov NonDex bc |
| --> |
| <!-- LocalWords: PossiblyNonDeterministic PossiblyNonDeterministicOrder |
| --> |
| <!-- LocalWords: DeterministicOrder Bazel PossbilyPropagatable Lazar tex |
| --> |
| <!-- LocalWords: NotPropagatable JavaParser contravariantly mortem EnerJ |
| --> |
| <!-- LocalWords: unsoundnesses nondeterministic nondeterministically ASM |
| --> |
| <!-- LocalWords: deterministically prototyped Stateful plugins plugin |
| --> |
| <!-- LocalWords: PDFs Stubparser JLint NullnessLight bytecodes YourKit |
| --> |
| <!-- LocalWords: timeline Uber's NullAway Uber coala BCEL mvn BCEL's AFU |
| --> |
| <!-- LocalWords: Signedness Jake2 README Lookup ASM's Nayuki CRC mis CWE |
| --> |
| <!-- LocalWords: BigInteger lockB lockA signedness microbenchmarks pre |
| --> |
| <!-- LocalWords: checklink subparts UnsignedBytes runtime MyClass jOOU |
| --> |
| <!-- LocalWords: usePriorityQueue PriorityQueue queueMinPathNode didn |
| --> |
| <!-- LocalWords: subchecker IntScope UnsignedLong UnsignedLongs doesn |
| --> |
| <!-- LocalWords: rasmud nondet Geffken Dort unmodifiable MightNotSupport |
| --> |
| <!-- LocalWords: SupportsAllOperations asList emptyList JavaExpression |
| --> |