| \htmlhr |
| \chapterAndLabel{Tainting Checker}{tainting-checker} |
| |
| The Tainting Checker prevents certain kinds of trust errors. |
| A \emph{tainted}, or untrusted, value is one that comes from an arbitrary, |
| possibly malicious source, such as user input or unvalidated data. |
| In certain parts of your application, using a tainted value can compromise |
| the application's integrity, causing it to crash, corrupt data, leak |
| private data, etc. |
| |
| % Ought to have many more examples |
| |
| For example, a user-supplied pointer, handle, or map key should be |
| validated before being dereferenced. |
| As another example, a user-supplied string should not be concatenated into a |
| SQL query, lest the program be subject to a |
| \href{https://en.wikipedia.org/wiki/Sql_injection}{SQL injection} attack. |
| A location in your program where malicious data could do damage is |
| called a \emph{sensitive sink}. |
| |
| A program must ``sanitize'' or ``untaint'' an untrusted value before using |
| it at a sensitive sink. There are two general ways to untaint a value: |
| by checking |
| that it is innocuous/legal (e.g., it contains no characters that can be |
| interpreted as SQL commands when pasted into a string context), or by |
| transforming the value to be legal (e.g., quoting all the characters that |
| can be interpreted as SQL commands). A correct program must use one of |
| these two techniques so that tainted values never flow to a sensitive sink. |
| The Tainting Checker ensures that your program does so. |
| |
| If the Tainting Checker issues no warning for a given program, then no |
| tainted value ever flows to a sensitive sink. However, your program is not |
| necessarily free from all trust errors. As a simple example, you might |
| have forgotten to annotate a sensitive sink as requiring an untainted type, |
| or you might have forgotten to annotate untrusted data as having a tainted |
| type. |
| |
| To run the Tainting Checker, supply the |
| \code{-processor TaintingChecker} |
| command-line option to javac. |
| %TODO: For examples, see Section~\ref{tainting-examples}. |
| |
| |
| \sectionAndLabel{Tainting annotations}{tainting-annotations} |
| |
| % TODO: add both qualifiers explicitly, and then describe their relationship. |
| |
| The Tainting type system uses the following annotations: |
| \begin{description} |
| \item[\refqualclass{checker/tainting/qual}{Untainted}] |
| indicates |
| a type that includes only untainted (trusted) values. |
| \item[\refqualclass{checker/tainting/qual}{Tainted}] |
| indicates |
| a type that may include tainted (untrusted) or untainted (trusted) values. |
| \code{@Tainted} is a supertype of \code{@Untainted}. |
| It is the default qualifier. |
| \item[\refqualclass{checker/tainting/qual}{PolyTainted}] |
| indicates qualifier polymorphism. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| \end{description} |
| |
| |
| \sectionAndLabel{Tips on writing \code{@Untainted} annotations}{writing-untainted} |
| |
| Most programs are designed with a boundary that surrounds sensitive |
| computations, separating them from untrusted values. Outside this |
| boundary, the program may manipulate malicious values, but no malicious |
| values ever pass the boundary to be operated upon by sensitive |
| computations. |
| |
| In some programs, the area outside the boundary is very small: values are |
| sanitized as soon as they are received from an external source. In other |
| programs, the area inside the boundary is very small: values are sanitized |
| only immediately before being used at a sensitive sink. Either approach |
| can work, so long as every possibly-tainted value is sanitized before it |
| reaches a sensitive sink. |
| |
| Once you determine the boundary, annotating your program is easy: put |
| \code{@Tainted} outside the boundary, \code{@Untainted} inside, and |
| \code{@SuppressWarnings("tainting")} at the validation or |
| sanitization routines that are used at the boundary. |
| % (Or, the Tainting Checker may indicate to you that the boundary |
| % does not exist or has holes through which tainted values can pass.) |
| |
| The Tainting Checker's standard default qualifier is \code{@Tainted} (see |
| Section~\ref{defaults} for overriding this default). This is the safest |
| default, and the one that should be used for all code outside the boundary |
| (for example, code that reads user input). You can set the default |
| qualifier to \code{@Untainted} in code that may contain sensitive sinks. |
| |
| The Tainting Checker does not know the intended semantics of your program, |
| so it cannot warn you if you mis-annotate a sensitive sink as taking |
| \code{@Tainted} data, or if you mis-annotate external data as |
| \code{@Untainted}. So long as you correctly annotate the sensitive sinks |
| and the places that untrusted data is read, the Tainting Checker will |
| ensure that all your other annotations are correct and that no undesired |
| information flows exist. |
| |
| As an example, suppose that you wish to prevent SQL injection attacks. You |
| would start by annotating the |
| \sunjavadoc{java.sql/java/sql/Statement.html}{Statement} class to indicate that the |
| \code{execute} operations may only operate on untainted queries |
| (Chapter~\ref{annotating-libraries} describes how to annotate external |
| libraries): |
| |
| \begin{Verbatim} |
| public boolean execute(@Untainted String sql) throws SQLException; |
| public boolean executeUpdate(@Untainted String sql) throws SQLException; |
| \end{Verbatim} |
| |
| |
| \sectionAndLabel{\code{@Tainted} and \code{@Untainted} can be used for many purposes}{tainting-many-uses} |
| |
| The \code{@Tainted} and \code{@Untainted} annotations have only minimal |
| built-in semantics. In fact, the Tainting Checker provides only a small |
| amount of functionality beyond the Subtyping Checker |
| (Chapter~\ref{subtyping-checker}). This lack of hard-coded behavior has |
| two consequences. The first consequence is that |
| the annotations can serve many different purposes, such as: |
| |
| \begin{itemize} |
| \item |
| Prevent SQL injection attacks: \code{@Tainted} is external input, |
| \code{@Untainted} has been checked for SQL syntax. |
| \item |
| Prevent cross-site scripting attacks: \code{@Tainted} is external input, |
| \code{@Untainted} has been checked for JavaScript syntax. |
| \item |
| Prevent information leakage: \code{@Tainted} is secret data, |
| \code{@Untainted} may be displayed to a user. |
| \end{itemize} |
| |
| The second consequence is that the Tainting Checker is not useful unless |
| you annotate the appropriate sources, sinks, and untainting/sanitization |
| routines. |
| % This is similar to the \code{@Encrypted} annotation |
| % (Section~\ref{encrypted-example}), where the cryptographic functions are |
| % beyond the reasoning abilities of the type system. In each case, the type |
| % system verifies most of your code, and the \code{@SuppressWarnings} |
| % annotations indicate the few places where human attention is needed. |
| |
| |
| If you want more specialized semantics, or you want to annotate multiple |
| types of tainting (for example, HTML and SQL) in a single program, |
| then you can copy the definition of |
| the Tainting Checker to create a new annotation and checker with a more |
| specific name and semantics. You will change the copy to rename the |
| annotations, and you will annotate libraries and/or your code to identify |
| sources, sinks, and validation/sanitization routines. |
| See Chapter~\ref{creating-a-checker} for more |
| details. |
| |
| |
| \sectionAndLabel{A caution about polymorphism and side effects}{tainting-polymorphism-caution} |
| |
| Misuse of polymorphism can lead to unsoundness with the Tainting Checker |
| and other similar information flow checkers. To understand the potential |
| problem, consider the \code{append} function in |
| \code{java.lang.StringBuffer}: |
| |
| \begin{Verbatim} |
| public StringBuffer append(StringBuffer this, String toAppend); |
| \end{Verbatim} |
| |
| Given these declarations: |
| |
| \begin{Verbatim} |
| @Tainted StringBuffer tsb; |
| @Tainted String ts; |
| @Untainted StringBuffer usb; |
| @Untainted String us; |
| \end{Verbatim} |
| |
| \noindent |
| both of these invocations should be legal: |
| |
| \begin{Verbatim} |
| tsb.append(ts); |
| usb.append(us); |
| \end{Verbatim} |
| |
| That suggests that perhaps the function should be annotated as polymorphic: |
| |
| \begin{Verbatim} |
| // UNSOUND annotation -- do not do this! |
| public @PolyTainted StringBuffer append(@PolyTainted StringBuffer this, @PolyTainted String toAppend); |
| \end{Verbatim} |
| |
| The problem with the above annotation is that it permits the undesirable |
| invocation: |
| |
| \begin{Verbatim} |
| usb.append(ts); // illegal invocation |
| \end{Verbatim} |
| |
| \noindent |
| This invocation is permitted because, in the expression, all |
| \<@PolyTainted> annotations on formal parameters are instantiated to |
| \<@Tainted>, the top annotation, and each argument is a subtype of the |
| corresponding formal parameter. |
| |
| Beware this problem both in code you write, and also in annotated libraries |
| (such as stub files). The correct way to annotate this class is to |
| add a class qualifier parameter; see Section~\ref{class-qualifier-polymorphism}. |
| |
| (Side note: if \code{append} were purely functional (had no side effects |
| and returned a new \<StringBuffer>) the method call would be acceptable, |
| because the return type is instantiated to \<@Tainted StringBuffer> for the |
| expression \<usb.append(ts)>. However, the \code{append} method works via |
| side-effect, and only returns a reference to the buffer as a convenience |
| for writing ``fluent'' client code.) |
| |
| % TODO: one could add a String[] value to @Untainted to distinguish different |
| % values, eg @Untainted{``SQL''} versus @Untainted{``HTML''}. |
| |
| % LocalWords: quals untaint PolyTainted mis untainting sanitization java |
| %% LocalWords: TaintingChecker untrusted unvalidated usb PolyDet |