| \htmlhr |
| % Reinstate when lists are supported: |
| % \chapterAndLabel{Index Checker for sequence bounds (arrays, strings, lists)}{index-checker} |
| \chapterAndLabel{Index Checker for sequence bounds (arrays and strings)}{index-checker} |
| |
| The Index Checker warns about potentially out-of-bounds accesses to sequence |
| data structures, such as arrays |
| % , lists, |
| and strings. |
| |
| The Index Checker prevents \<IndexOutOfBoundsException>s that result from |
| an index expression that might be negative or might be equal to or larger |
| than the sequence's length. |
| It also prevents \<NegativeArraySizeException>s that result from a negative |
| array dimension in an array creation expression. |
| (A caveat: the Index Checker does not check for arithmetic overflow. If |
| an expression overflows, the Index Checker might fail to warn about a |
| possible exception. This is unlikely to be a problem in practice unless |
| you have an array whose length is \<Integer.MAX\_VALUE>.) |
| |
| % Here's a pathological example of overflow leading to unsoundness: |
| % |
| % public class IndexOverflow { |
| % public static void main(String[] args) { |
| % @Positive int x = 1073741825; // 2 ^ 30 + 1 |
| % @Positive int x2 = x + x; // 2 ^ 31 + 2 == - 2 ^ 31 + 2 |
| % int[] a = new int[0]; |
| % if (x2 < a.length) { |
| % a[x2] = 42; |
| % } |
| % } |
| % } |
| |
| The programmer can write annotations that indicate which expressions are |
| indices for which sequences. The Index Checker prohibits any operation that |
| may violate these properties, and the Index Checker takes advantage of |
| these properties when verifying indexing operations. |
| % |
| Typically, a programmer writes few annotations, because the Index Checker |
| infers properties of indexes from |
| the code around them. For example, it will infer that \<x> is positive |
| within the \<then> block of an \code{if (x > 0)} statement. |
| The programmer does need to write field types and method pre-conditions or post-conditions. For instance, |
| if a method's formal parameter is used as an index for |
| \<myArray>, the programmer might need to |
| write an \refqualclasswithparams{checker/index/qual}{IndexFor}{"myArray"} |
| annotation on the formal parameter's types. |
| |
| The Index Checker checks fixed-size data structures, whose size is never |
| changed after creation. A fixed-size data structure has no \<add> or |
| \<remove> operation. Examples are strings and arrays, and you can add |
| support for other fixed-size data structures (see |
| Section~\ref{index-annotating-fixed-size}). |
| |
| To run the Index Checker, run the command |
| |
| \begin{alltt} |
| javac -processor index \emph{MyJavaFile}.java |
| \end{alltt} |
| |
| Recall that in Java, type annotations are written before the type; |
| in particular, |
| array annotations appear immediately before ``\<[]>''. |
| Here is how to declare a length-9 array of positive integers: |
| |
| \begin{Verbatim} |
| @Positive int @ArrayLen(9) [] |
| \end{Verbatim} |
| |
| Multi-dimensional arrays are similar. |
| Here is how to declare a length-2 array of length-4 arrays: |
| |
| \begin{Verbatim} |
| String @ArrayLen(2) [] @ArrayLen(4) [] |
| \end{Verbatim} |
| |
| |
| \sectionAndLabel{Index Checker structure and annotations}{index-annotations} |
| |
| Internally, the Index Checker computes information about integers that |
| might be indices: |
| \begin{itemize} |
| \item |
| the lower bound on an integer, such as whether it is known to be positive |
| (Section~\ref{index-lowerbound}) |
| \item |
| the upper bound on an integer, such as whether it is less than the length |
| of a given sequence (Section~\ref{index-upperbound}) |
| \item |
| whether an integer came from calling the JDK's binary search routine on |
| an array (Section~\ref{index-searchindex}) |
| \item |
| whether an integer came from calling a string search routine |
| (Section~\ref{index-substringindex}) |
| \end{itemize} |
| |
| \noindent |
| and about sequence lengths: |
| \begin{itemize} |
| \item |
| the minimum length of a sequence, such ``\<myArray> contains at least 3 |
| elements'' (Section~\ref{index-minlen}) |
| \item |
| whether two sequences have the same length (Section~\ref{index-samelen}) |
| \end{itemize} |
| |
| The Index Checker checks of all these properties at once, but |
| this manual discusses each type system in a different section. |
| There are some annotations that are shorthand for writing multiple |
| annotations, each from a different type system: |
| |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{IndexFor}{String[] names}] |
| The value is a valid index for the named sequences. For example, the |
| \sunjavadoc{java.base/java/lang/String.html\#charAt(int)}{String.charAt(int)} |
| method is declared as |
| |
| \begin{Verbatim} |
| class String { |
| char charAt(@IndexFor("this") index) { ... } |
| } |
| \end{Verbatim} |
| |
| More generally, a variable |
| declared as \<@IndexFor("someArray") int i> has type |
| \<@IndexFor("someArray") int> and its run-time value is guaranteed to be |
| non-negative and less than the length of \<someArray>. You could also |
| express this as |
| \<\refqualclass{checker/index/qual}{NonNegative} |
| \refqualclasswithparams{checker/index/qual}{LTLengthOf}{"someArray"} |
| int i>, |
| but \<@IndexFor("someArray") int i> is more concise. |
| |
| \item[\refqualclasswithparams{checker/index/qual}{IndexOrHigh}{String[] names}] |
| The value is non-negative and is less than or equal to the length of |
| each named sequence. This type combines |
| \refqualclass{checker/index/qual}{NonNegative} and |
| \refqualclass{checker/index/qual}{LTEqLengthOf}. |
| |
| For example, the |
| \sunjavadoc{java.base/java/util/Arrays.html\#fill(java.lang.Object\%5B\%5D,int,int,java.lang.Object)}{Arrays.fill} |
| method is declared as |
| |
| \begin{mysmall} |
| \begin{Verbatim} |
| class Arrays { |
| void fill(Object[] a, @IndexFor("#1") int fromIndex, @IndexOrHigh("#1") int toIndex, Object val) |
| } |
| \end{Verbatim} |
| \end{mysmall} |
| |
| \item[\refqualclasswithparams{checker/index/qual}{LengthOf}{String[] names}] |
| The value is exactly equal to the length of the named |
| sequences. In the implementation, this type aliases |
| \refqualclass{checker/index/qual}{IndexOrHigh}, so writing it |
| only adds documentation (although future versions of the Index Checker |
| may use it to improve precision). |
| |
| \item[\refqualclasswithparams{checker/index/qual}{IndexOrLow}{String[] names}] |
| The value is -1 or is a valid index for |
| each named sequence. This type combines |
| \refqualclass{checker/index/qual}{GTENegativeOne} and |
| \refqualclass{checker/index/qual}{LTLengthOf}. |
| |
| % Example commented out; IndexOrLow is not sound for indexOf, because "".indexOf("") returns 0 |
| % |
| % For example, the |
| % \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf(String)} |
| % method is declared as |
| % |
| % \begin{Verbatim} |
| % class String { |
| % @IndexOrLow("this") int indexOf(String str) { ... } |
| % } |
| % \end{Verbatim} |
| |
| \item[\refqualclass{checker/index/qual}{PolyIndex}] |
| indicates qualifier polymorphism. This type combines |
| \refqualclass{checker/index/qual}{PolyLowerBound} and |
| \refqualclass{checker/index/qual}{PolyUpperBound}. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| |
| \item[\refqualclass{checker/index/qual}{PolyLength}] |
| is a special polymorphic qualifier that combines |
| \refqualclass{checker/index/qual}{PolySameLen} and |
| \refqualclass{common/value/qual}{PolyValue} from the |
| Constant Value Checker (see \chapterpageref{constant-value-checker}). |
| \refqualclass{checker/index/qual}{PolyLength} exists |
| as a shorthand for these two annotations, since |
| they often appear together. |
| |
| \end{description} |
| |
| \sectionAndLabel{Lower bounds}{index-lowerbound} |
| |
| The Index Checker issues an error when |
| a sequence is indexed by an integer that might be negative. |
| The Lower Bound Checker uses a type system (Figure~\ref{fig-index-int-types}) with the following |
| qualifiers: |
| |
| \begin{description} |
| \item[\refqualclass{checker/index/qual}{Positive}] |
| The value is 1 or greater, so it is not too low to be used as an index. |
| Note that this annotation is trusted by the Constant Value Checker, |
| so if the Constant Value Checker is run on code containing this annotation, |
| the Lower Bound Checker must be run on the same code in order to |
| guarantee soundness. |
| \item[\refqualclass{checker/index/qual}{NonNegative}] |
| The value is 0 or greater, so it is not too low to be used as an index. |
| \item[\refqualclass{checker/index/qual}{GTENegativeOne}] |
| The value is -1 or greater. |
| It may not be used as an index for a sequence, because it might be too low. |
| (``\<GTE>'' stands for ``Greater Than or Equal to''.) |
| \item[\refqualclass{checker/index/qual}{PolyLowerBound}] |
| indicates qualifier polymorphism. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| \item[\refqualclass{checker/index/qual}{LowerBoundUnknown}] |
| There is no information about the value. |
| It may not be used as an index for a sequence, because it might be too low. |
| \item[\refqualclass{checker/index/qual}{LowerBoundBottom}] |
| The value cannot take on any integral types. The bottom type, which |
| should not need to be written by the programmer. |
| \end{description} |
| |
| \begin{figure} |
| \begin{center} |
| \hfill |
| \includeimagenocentering{lowerbound}{5cm} |
| ~~~~\hfill~~~~ |
| \includeimagenocentering{upperbound}{7cm} |
| \hfill |
| \end{center} |
| \caption{The two type hierarchies for integer types used by the Index |
| Checker. On the left is a type system for lower bounds. On the right |
| is a type system for upper bounds. Qualifiers written in gray should |
| never be written in source code; they are used internally by the type |
| system. |
| % Using "\\" works for some but not all installations of LaTeX. |
| \newline |
| In the Upper Bound type system, subtyping rules depend on both the |
| array name (\<"myArray">, in the figure) and on the offset (which is 0, |
| the default, in the figure). Another qualifier is |
| \refqualclass{checker/index/qual}{UpperBoundLiteral}, whose subtyping |
| relationships depend on its argument and on offsets for other qualifiers. |
| } |
| \label{fig-index-int-types} |
| \end{figure} |
| |
| |
| \sectionAndLabel{Upper bounds}{index-upperbound} |
| |
| The Index Checker issues an error when a sequence index might be |
| too high. To do this, it maintains information about which expressions are |
| safe indices for which sequences. |
| The length of a sequence is \code{arr.length} for arrays and |
| \code{str.length()} for strings. |
| It uses a type system (Figure~\ref{fig-index-int-types}) with the following |
| qualifiers: |
| |
| It issues an error when a sequence \code{arr} |
| is indexed by an integer that is not of type \code{@LTLengthOf("arr")} |
| or \code{@LTOMLengthOf("arr")}. |
| |
| \begin{description} |
| |
| \item[\refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names, String[] offset}] |
| An expression with this type |
| has value less than the length of each sequence listed in \<names>. |
| The expression may be used as an index into any of those sequences, |
| if it is non-negative. |
| For example, an expression of type \code{@LTLengthOf("a") int} might be |
| used as an index to \<a>. |
| The type \code{@LTLengthOf(\{"a", "b"\})} is a subtype of both |
| \code{@LTLengthOf("a")} and \code{@LTLengthOf("b")}. |
| (``\<LT>'' stands for ``Less Than''.) |
| |
| \<@LTLengthOf> takes an optional \<offset> element, meaning that the |
| annotated expression plus the offset is less than the length of the given |
| sequence. For example, suppose expression \<e> has type \<@LTLengthOf(value |
| = \{"a", "b"\}, offset = \{"-1", "x"\})>. Then \<e - 1> is less than |
| \<a.length>, and \<e + x> is less than \<b.length>. This helps to make |
| the checker more precise. Programmers rarely need to write the \<offset> |
| element. |
| |
| \item[\refqualclasswithparams{checker/index/qual}{LTEqLengthOf}{String[] names}] |
| An expression with this type |
| has value less than or equal to the length of each sequence listed in \<names>. |
| It may not be used as an index for these sequences, because it might be too high. |
| \code{@LTEqLengthOf(\{"a", "b"\})} is a subtype of both |
| \code{@LTEqLengthOf("a")} and \code{@LTEqLengthOf("b")}. |
| (``\<LTEq>'' stands for ``Less Than or Equal to''.) |
| |
| \<@LTEqLengthOf(\{"a"\})> = \<@LTLengthOf(value=\{"a"\}, offset=-1)>, and \\ |
| \<@LTEqLengthOf(value=\{"a"\}, offset=x)> = \<@LTLengthOf(value=\{"a"\}, |
| offset=x-1)> for any x. |
| |
| \item[\refqualclasswithparams{checker/index/qual}{LTOMLengthOf}{String[] names}] |
| An expression with this type |
| has value at least 2 less than the length of each sequence listed in \<names>. |
| It may always used as an index for a sequence listed in \<names>, if it is |
| non-negative. |
| |
| This type exists to allow the checker to infer the safety of loops of |
| the form: |
| \begin{Verbatim} |
| for (int i = 0; i < array.length - 1; ++i) { |
| arr[i] = arr[i+1]; |
| } |
| \end{Verbatim} |
| This annotation should rarely (if ever) be written by the programmer; usually |
| \refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names} |
| should be written instead. |
| \code{@LTOMLengthOf(\{"a", "b"\})} is a subtype of both |
| \code{@LTOMLengthOf("a")} and \code{@LTOMLengthOf("b")}. |
| (``\<LTOM>'' stands for ``Less Than One Minus'', because another way of |
| saying ``at least 2 less than \<a.length>'' is ``less than \<a.length-1>''.) |
| |
| \<@LTOMLengthOf(\{"a"\})> = \<@LTLengthOf(value=\{"a"\}, offset=1)>, and \\ |
| \<@LTOMLengthOf(value=\{"a"\}, offset=x)> = \<@LTLengthOf(value=\{"a"\}, |
| offset=x+1)> for any x. |
| |
| \item[\refqualclass{checker/index/qual}{UpperBoundLiteral}] |
| repesents a constant value, typically a literal written in source code. |
| Its subtyping relationship is: |
| \<@UpperBoundLiteral(lit)> <: \<LTLengthOf(value="myArray", offset=off)> |
| if \<lit>+\<offset> $\le$ -1. |
| |
| \item[\refqualclass{checker/index/qual}{PolyUpperBound}] |
| indicates qualifier polymorphism. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| |
| \item[\refqualclass{checker/index/qual}{UpperBoundUnknown}] |
| There is no information about the upper bound on the value of an expression with this type. |
| It may not be used as an index for a sequence, because it might be too high. |
| This type is the top type, and should never need to be written by the |
| programmer. |
| |
| \item[\refqualclass{checker/index/qual}{UpperBoundBottom}] |
| This is the bottom type for the upper bound type system. It should |
| never need to be written by the programmer. |
| |
| \end{description} |
| |
| The following method annotations can be used to establish a method postcondition |
| that ensures that a certain expression is a valid index for a sequence: |
| |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOf}{String[] value, String[] targetValue, String[] offset}] |
| When the method with this annotation returns, the expression (or all the expressions) given in the \code{value} element |
| is less than the length of the given sequences with the given offsets. More precisely, the expression |
| has the \code{@LTLengthOf} qualifier with the \code{value} and \code{offset} arguments |
| taken from the \code{targetValue} and \code{offset} elements of this annotation. |
| \item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOfIf}{String[] expression, boolean result, String[] targetValue, String[] offset}] |
| If the method with this annotation returns the given boolean value, |
| then the given expression (or all the given expressions) |
| is less than the length of the given sequences with the given offsets. |
| \end{description} |
| |
| There is one declaration annotation that indicates the relationship between |
| two sequences: |
| |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{HasSubsequence}{String[] |
| value, String[] from, String[] to}] |
| indicates that a subsequence (from \code{from} to \code{to}) of the |
| annotated sequence is equal to some other sequence, named by |
| \code{value}). |
| |
| For example, to indicate that \<shorter> is a subsequence of \<longer>: |
| |
| \begin{Verbatim} |
| int startIndex; |
| int endIndex; |
| int[] shorter; |
| @HasSubsequence(value="shorter", from="this.start", to="this.end") |
| int[] longer; |
| \end{Verbatim} |
| |
| Thus, a valid index into \<shorter> is also a valid index (between |
| \code{start} and \code{end-1} inclusive) into \<longer>. More generally, |
| if \code{x} is \code{@IndexFor("shorter")} in the example above, then |
| \code{start + x} is \code{@IndexFor("longer")}. If \code{y} is |
| \code{@IndexFor("longer")} and \code{@LessThan("end")}, then \code{y - |
| start} is \code{@IndexFor("shorter")}. Finally, \code{end - start} is |
| \code{@IndexOrHigh("shorter")}. |
| |
| This annotation is in part checked and in part trusted. When an array is |
| assigned to \code{longer}, three facts are checked: that \code{start} is |
| non-negative, that \code{start} is less than or equal to \code{end}, and |
| that \code{end} is less than or equal to the length of \code{longer}. This |
| ensures that the indices are valid. The programmer must manually verify |
| that the value of \code{shorter} equals the subsequence that they describe. |
| \end{description} |
| |
| |
| \sectionAndLabel{Sequence minimum lengths}{index-minlen} |
| |
| The Index Checker estimates, for each sequence expression, how long its value |
| might be at run time by computing a minimum length that |
| the sequence is guaranteed to have. This enables the Index Checker to |
| verify indices that are compile-time constants. For example, this code: |
| |
| \begin{Verbatim} |
| String getThirdElement(String[] arr) { |
| return arr[2]; |
| } |
| \end{Verbatim} |
| |
| \noindent |
| is legal if \<arr> has at least three elements, which can be indicated |
| in this way: |
| |
| \begin{Verbatim} |
| String getThirdElement(String @MinLen(3) [] arr) { |
| return arr[2]; |
| } |
| \end{Verbatim} |
| |
| When the index is not a compile-time constant, as in \<arr[i]>, then the |
| Index Checker depends not on a \<@MinLen> annotation but on \<i> being |
| annotated as |
| \refqualclasswithparams{checker/index/qual}{LTLengthOf}{"arr"}. |
| |
| The MinLen type qualifier is implemented in practice by the Constant Value Checker, |
| using \<@ArrayLenRange> annotations (see \chapterpageref{constant-value-checker}). |
| This means that errors related to the minimum lengths of arrays must be suppressed using |
| the "value" argument to \<@SuppressWarnings>. |
| \refqualclass{common/value/qual}{ArrayLenRange} and \refqualclass{common/value/qual}{ArrayLen} |
| annotations can also be used to establish the minimum length of a sequence, if a |
| more precise estimate of length is known. For example, |
| if \<arr> is known to have exactly three elements: |
| |
| \begin{Verbatim} |
| String getThirdElement(String @ArrayLen(3) [] arr) { |
| return arr[2]; |
| } |
| \end{Verbatim} |
| |
| The following type qualifiers (from \chapterpageref{constant-value-checker}) |
| can establish the minimum length of a sequence: |
| |
| \begin{description} |
| \item[\refqualclasswithparams{common/value/qual}{MinLen}{int value}] |
| The value of an expression of this type is a sequence with at least |
| \code{value} elements. The default annotation is |
| \code{@MinLen(0)}, and it may be applied to non-sequences. |
| \code{@MinLen($x$)} is a subtype of \code{@MinLen($x-1$)}. |
| An \code{@MinLen} annotation is treated internally as an |
| \refqualclass{common/value/qual}{ArrayLenRange} with only its |
| \code{from} field filled. |
| \item[\refqualclasswithparams{common/value/qual}{ArrayLen}{int[] value}] |
| The value of an expression of this type is a sequence whose |
| length is exactly one of the integers listed in its argument. |
| The argument can contain at most ten integers; larger collections of |
| integers are converted to \refqualclass{common/value/qual}{ArrayLenRange} |
| annotations. The minimum length of a sequence with this annotation |
| is the smallest element of the argument. |
| \item[\refqualclasswithparams{common/value/qual}{ArrayLenRange}{int from, int to}] |
| The value of an expression of this type is a sequence whose |
| length is bounded by its arguments, inclusive. |
| The minimum length of a sequence with this annotation is its \<from> argument. |
| \end{description} |
| |
| \begin{figure} |
| \begin{center} |
| \hfill |
| \includeimage{samelen}{5cm} |
| \hfill |
| \end{center} |
| \caption{The type hierarchy for arrays of equal length ("a" and "b" are |
| assumed to be in-scope sequences). Qualifiers |
| written in gray should never be written in source code; they are used |
| internally by the type system.} |
| \label{fig-index-array-types} |
| \end{figure} |
| |
| The following method annotation can be used to establish a method postcondition |
| that ensures that a certain sequence has a minimum length: |
| |
| \begin{description} |
| \item[\refqualclasswithparams{common/value/qual}{EnsuresMinLenIf}{String[] expression, boolean result, int targetValue}] |
| If the method with this annotation returns the given boolean value, |
| then the given expression (or all the given expressions) is a sequence |
| with at least \code{targetValue} elements. |
| \end{description} |
| |
| \sectionAndLabel{Sequences of the same length}{index-samelen} |
| |
| The Index Checker determines whether two or more sequences have the same length. |
| This enables it to verify that all the indexing operations are safe in code |
| like the following: |
| |
| \begin{Verbatim} |
| boolean lessThan(double[] arr1, double @SameLen("#1") [] arr2) { |
| for (int i = 0; i < arr1.length; i++) { |
| if (arr1[i] < arr2[i]) { |
| return true; |
| } else if (arr1[i] > arr2[i]) { |
| return false; |
| } |
| } |
| return false; |
| } |
| \end{Verbatim} |
| |
| When needed, you can specify which sequences have the same length using the following type qualifiers (Figure~\ref{fig-index-array-types}): |
| |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{SameLen}{String[] names}] |
| An expression with this type represents a sequence that has the |
| same length as the other sequences named in \<names>. In general, |
| \code{@SameLen} types that have non-intersecting sets of names |
| are \textit{not} subtypes of each other. However, if at least one |
| sequence is named by both types, the types are actually the same, |
| because all the named sequences must have the same length. |
| \item[\refqualclass{checker/index/qual}{PolySameLen}] |
| indicates qualifier polymorphism. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| \item[\refqualclass{checker/index/qual}{SameLenUnknown}] |
| No information is known about which other sequences have the same length |
| as this one. |
| This is the top type, and programmers should never need to write it. |
| \item[\refqualclass{checker/index/qual}{SameLenBottom}] |
| This is the bottom type, and programmers should rarely need to write it. |
| \code{null} has this type. |
| \end{description} |
| |
| |
| \sectionAndLabel{Binary search indices}{index-searchindex} |
| |
| The JDK's |
| \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch} |
| method returns either where the value was found, or a negative value |
| indicating where the value could be inserted. The Search Index Checker |
| represents this concept. |
| |
| \begin{figure} |
| \begin{center} |
| \hfill |
| \includeimage{searchindex}{7cm} |
| \hfill |
| \end{center} |
| \caption{The type hierarchy for the Index Checker's internal type system |
| that captures information about the results of calls to |
| \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}.} |
| \label{fig-index-searchindex} |
| \end{figure} |
| |
| The Search Index Checker's type hierarchy (Figure~\ref{fig-index-searchindex}) has four type qualifiers: |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{SearchIndexFor}{String[] names}] |
| An expression with this type represents an integer that could have been |
| produced by calling |
| \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}: |
| for each array \<a> specified in the annotation, the annotated integer is |
| between \<-a.length-1> and \<a.length-1>, inclusive |
| \item[\refqualclasswithparams{checker/index/qual}{NegativeIndexFor}{String[] names}] |
| An expression with this type represents a ``negative index'' that is |
| between \<a.length-1> and \<-1>, inclusive; that is, a value that is both |
| a \<@SearchIndex> and is negative. Applying the bitwise complement |
| operator (\verb|~|) to an expression of this type produces an expression |
| of type \refqualclass{checker/index/qual}{IndexOrHigh}. |
| \item[\refqualclass{checker/index/qual}{SearchIndexBottom}] |
| This is the bottom type, and programmers should rarely need to write it. |
| \item[\refqualclass{checker/index/qual}{SearchIndexUnknown}] |
| No information is known about whether this integer is a search index. |
| This is the top type, and programmers should rarely need to write it. |
| \end{description} |
| |
| |
| \sectionAndLabel{Substring indices}{index-substringindex} |
| |
| The methods |
| \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf} |
| and |
| \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf} |
| return an index of a given substring within a given string, or -1 if no |
| such substring exists. The index \<i> returned from |
| \<receiver.indexOf(substring)> satisfies the following property, which is |
| stated here in three equivalent ways: |
| \begin{Verbatim} |
| i == -1 || ( i >= 0 && i <= receiver.length() - substring.length() ) |
| i == -1 || ( @NonNegative && @LTLengthOf(value="receiver", offset="substring.length()-1") ) |
| @SubstringIndexFor(value="receiver", offset="substring.length()-1") |
| \end{Verbatim} |
| |
| % This new annotation is similar to \<@LTLengthOf\allowbreak(value = |
| % "receiver", offset = "substring.length()-1")>, but explicitly allows the |
| % index to be -1 even if the upper bound would not allow it because of the |
| % offset. The Upper Bound Checker can infer the corresponding |
| % \<@LTLengthOf> annotation for expressions that have a |
| % \<@SubstringIndexFor> annotation and at the same time are known to be |
| % non-negative (according to the Lower Bound Checker). |
| |
| The return type of methods \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf} |
| and \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf} has the annotation |
| \refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{value="this", offset="\#1.length()-1")}. |
| This allows writing code such as the following with no warnings from the |
| Index Checker: |
| |
| \begin{Verbatim} |
| public static String removeSubstring(String original, String removed) { |
| int i = original.indexOf(removed); |
| if (i != -1) { |
| return original.substring(0, i) + original.substring(i + removed.length()); |
| } |
| return original; |
| } |
| \end{Verbatim} |
| |
| % The code removes the first occurrence of \<removed> from |
| % \<original>. After checking that \code{i != -1}, the value of \<i> must |
| % be a valid index for \<original>. Because this index is the start of an |
| % occurrence of \<removed>, \code{i + removed.length()} is the index of the |
| % end of the occurrence. Without the \<@SubstringIndexFor> annotation, the |
| % Upper Bound Checker would not be able to verify that \code{i + |
| % removed.length()} is a valid argument to \<substring>, which requires |
| % both arguments to be \<@IndexOrHigh("original")>. |
| |
| \begin{figure} |
| \begin{center} |
| \hfill |
| \includeimage{substringindex}{3.5cm} |
| \hfill |
| \end{center} |
| \caption{The type hierarchy for the Substring Index Checker, which |
| captures information about the results of calls to |
| \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf} |
| and |
| \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}.} |
| \label{fig-index-substringindex} |
| \end{figure} |
| |
| The \<@SubstringIndexFor> annotation is implemented in a Substring Index |
| Checker that runs together with the Index Checker and has its own type |
| hierarchy (Figure~\ref{fig-index-substringindex}) with three type |
| qualifiers: |
| \begin{description} |
| \item[\refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{String[] value, String[] offset}] |
| An expression with this type represents an integer that could have been |
| produced by calling |
| \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}: |
| the annotated integer is either -1, or it is non-negative and is less |
| than or equal to \<receiver.length - offset> (where the sequence |
| \<receiver> and the offset \<offset> are corresponding elements of the |
| annotation's arguments). |
| \item[\refqualclass{checker/index/qual}{SubstringIndexBottom}] |
| This is the bottom type, and programmers should rarely need to write it. |
| \item[\refqualclass{checker/index/qual}{SubstringIndexUnknown}] |
| No information is known about whether this integer is a substring index. |
| This is the top type, and programmers should rarely need to write it. |
| \end{description} |
| |
| |
| \subsectionAndLabel{The need for the \<@SubstringIndexFor> annotation}{index-substringindex-justification} |
| |
| No other annotation supported by the Index Checker precisely represents the |
| possible return values of methods |
| \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf} |
| and |
| \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}. |
| The reason is the methods' special cases for empty strings and for failed matches. |
| |
| Consider the result \<i> of \<receiver.indexOf(substring)>: |
| |
| \begin{itemize} |
| \item |
| \<i> is \<@GTENegativeOne>, because \code{i >= -1}. |
| \item |
| \<i> is \<@LTEqLengthOf("receiver")>, because \code{i <= receiver.length()}. |
| \item |
| \<i> is not \<@IndexOrLow("receiver")>, because for |
| \code{receiver = "", substring = "", i = 0}, the property |
| \code{i >= -1 \&\& i < receiver.length()} does not hold. |
| \item |
| \<i> is not \<@IndexOrHigh("receiver")>, because for |
| \code{receiver = "", substring = "b", i = -1}, the property |
| \code{i >= 0 \&\& i <= receiver.length()} does not hold. |
| \item |
| \<i> is not |
| \<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>, |
| because for \code{receiver = "", substring = "abc", i = -1}, the property |
| \code{i + substring.length() - 1 < receiver.length()} does not hold. |
| \end{itemize} |
| |
| \noindent |
| The last annotation in the list above, |
| \<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>, |
| is the correct and precise upper bound for all values of \<i> except -1. |
| The offset expresses the fact that we can add \<substring.length()> to this |
| index and still get a valid index for \<receiver>. That is useful for |
| type-checking code that adds the length of the substring to the found |
| index, in order to obtain the rest of the string. However, the upper bound |
| applies only after the index is explicitly checked not to be -1: |
| |
| \begin{Verbatim} |
| int i = receiver.indexOf(substring); |
| // i is @GTENegativeOne and @LTEqLengthOf("receiver") |
| // i is not @LTLengthOf(value = "receiver", offset = "substring.length()-1") |
| if (i != -1) { |
| // i is @NonNegative and @LTLengthOf(value = "receiver", offset = "substring.length()-1") |
| int j = i + substring.length(); |
| // j is @IndexOrHigh("receiver") |
| return receiver.substring(j); // this call is safe |
| } |
| \end{Verbatim} |
| |
| The property of the result of \<indexOf> cannot be expressed by any |
| combination of lower-bound (Section~\ref{index-lowerbound}) and upper-bound |
| (Section~\ref{index-upperbound}) annotations, because the upper-bound |
| annotations apply independently of the lower-bound annotations, but in this |
| case, the upper bound \code{i <= receiver.length() - substring.length()} |
| holds only if \code{i >= 0}. Therefore, to express this property and make |
| the example type-check without false positives, a new annotation such as |
| \<@SubstringIndexFor\allowbreak(value = "receiver", offset = "substring.length()-1")> |
| is necessary. |
| |
| \sectionAndLabel{Inequalities}{index-inequalities} |
| |
| The Index Checker estimates which expression's values are less than other expressions' values. |
| |
| \begin{description} |
| |
| \item[\refqualclasswithparams{checker/index/qual}{LessThan}{String[] values}] |
| An expression with this type has a value that is less than the value of each |
| expression listed in \<values>. The expressions in values must be composed of |
| final or effectively final variables and constants. |
| |
| \item[\refqualclass{checker/index/qual}{LessThanUnknown}] |
| There is no information about the value of an expression this type relative to other expressions. |
| This is the top type, and should not be written by the programmer. |
| |
| \item[\refqualclass{checker/index/qual}{LessThanBottom}] |
| This is the bottom type for the less than type system. It should |
| never need to be written by the programmer. |
| |
| \end{description} |
| |
| \sectionAndLabel{Annotating fixed-size data structures}{index-annotating-fixed-size} |
| |
| The Index Checker has built-in support for Strings and arrays. |
| You can add support for additional fixed-size data structures by writing |
| annotations. |
| This allows the Index Checker to typecheck the data structure's |
| implementation and to typecheck uses of the class. |
| |
| This section gives an example: a fixed-length collection. |
| |
| %% The code that follows is copied from checker/tests/index/ArrayWrapper.java. |
| %% If this code is updated, please update that file, too. |
| |
| \begin{Verbatim} |
| /** ArrayWrapper is a fixed-size generic collection. */ |
| public class ArrayWrapper<T> { |
| private final Object @SameLen("this") [] delegate; |
| |
| @SuppressWarnings("index") // constructor creates object of size @SameLen(this) by definition |
| ArrayWrapper(@NonNegative int size) { |
| delegate = new Object[size]; |
| } |
| |
| public @LengthOf("this") int size() { |
| return delegate.length; |
| } |
| |
| public void set(@IndexFor("this") int index, T obj) { |
| delegate[index] = obj; |
| } |
| |
| @SuppressWarnings("unchecked") // required for normal Java compilation due to unchecked cast |
| public T get(@IndexFor("this") int index) { |
| return (T) delegate[index]; |
| } |
| } |
| \end{Verbatim} |
| |
| The Index Checker treats methods annotated with \code{@LengthOf("this")} as |
| the length of a sequence like \code{arr.length} for arrays and |
| \code{str.length()} for strings. |
| |
| With these annotations, client code like the following typechecks with no |
| warnings: |
| \begin{Verbatim} |
| public static void clearIndex1(ArrayWrapper<? extends Object> a, @IndexFor("#1") int i) { |
| a.set(i, null); |
| } |
| |
| public static void clearIndex2(ArrayWrapper<? extends Object> a, int i) { |
| if (0 <= i && i < a.size()) { |
| a.set(i, null); |
| } |
| } |
| \end{Verbatim} |
| |
| %% LocalWords: NegativeArraySizeException pre myArray IndexFor someArray |
| %% LocalWords: MyJavaFile LTLengthOf LTEqLengthOf GTENegativeOne GTE str |
| %% LocalWords: LowerBoundUnknown LTOMLengthOf LTEq LTOM UpperBoundBottom |
| %% LocalWords: UpperBoundUnknown MinLen MinLenBottom SameLen indexOf abc |
| %% LocalWords: SameLenUnknown SameLenBottom lastIndexOf html lang charAt |
| %% LocalWords: LengthOf IndexOrLow PolyIndex PolyLowerBound PolyLength |
| %% LocalWords: PolyUpperBound PolySameLen PolyValue LowerBoundBottom |
| %% LocalWords: lowerbound upperbound EnsuresLTLengthOf targetValue |
| %% LocalWords: EnsuresLTLengthOfIf boolean HasSubsequence LessThan |
| %% LocalWords: minlen ArrayLenRange ArrayLen EnsuresMinLenIf samelen |
| %% LocalWords: searchindex binarySearch SearchIndexFor NegativeIndexFor |
| %% LocalWords: SearchIndex bitwise SearchIndexBottom SearchIndexUnknown |
| %% LocalWords: Substring substringindex substring SubstringIndexFor |
| %% LocalWords: SubstringIndexBottom SubstringIndexUnknown LessThanBottom |
| %% LocalWords: LessThanUnknown typecheck typechecks |