| \htmlhr |
| \chapterAndLabel{Signedness Checker}{signedness-checker} |
| |
| The Signedness Checker guarantees that signed and unsigned integral values are not mixed |
| together in a computation. In addition, it prohibits meaningless operations, such |
| as division on an unsigned value. |
| |
| Recall that a computer represents a number as a sequence of bits. |
| Signedness indicates how to interpret the most significant bit. For |
| example, the bits \<10000010> ordinarily represent the value -126, but when |
| interpreted as unsigned, those bits represent the value 130. The bits |
| \<01111110> represent the value 126 in signed and in unsigned interpretation. |
| The range of signed byte values is -128 to 127. The range of unsigned byte |
| values is 0 to 255. |
| |
| Signedness is only applicable to the integral types \<byte>, |
| \<short>, \<int>, and \<long> and their boxed variants \<Byte>, |
| \<Short>, \<Integer>, and \<Long>. |
| \<char> and \<Character> are always unsigned. |
| Floating-point types \<float>, \<double>, \<Float>, and \<Double> are always signed. |
| % , because they do not have operations that interpret the bits as unsigned. |
| |
| Signedness is primarily about how the bits of the representation are |
| \emph{interpreted}, not about the values that it can represent. An unsigned value |
| is always positive, but just because a variable's value is positive does |
| not mean that it should be marked as \<@Unsigned>. If variable $v$ will be |
| compared to a signed value, or used in arithmetic operations with a signed |
| value, then $v$ should have signed type. |
| To indicate the range of possible values for a variable, use the |
| \refqualclass{checker/index/qual}{NonNegative} annotation of the Index |
| Checker (see \chapterpageref{index-checker}) or the |
| \refqualclass{common/value/qual}{IntRange} annotation of the Constant Value |
| Checker (see \chapterpageref{constant-value-checker}). |
| |
| |
| \sectionAndLabel{Annotations}{signedness-checker-annotations} |
| |
| The Signedness Checker uses type annotations to indicate the signedness that the programmer intends an expression to have. |
| |
| \begin{figure} |
| \includeimage{signedness}{3.5cm} |
| \caption{The type qualifier hierarchy of the signedness annotations. |
| Qualifiers in gray are used internally by the type system but should never be written by a programmer.} |
| \label{fig-signedness-hierarchy} |
| \end{figure} |
| |
| These are the qualifiers in the signedness type system: |
| |
| \begin{description} |
| |
| \item[\refqualclass{checker/signedness/qual}{Unsigned}] |
| indicates that the programmer intends the value to be |
| interpreted as unsigned. |
| That is, if the most significant bit in the bitwise representation is |
| set, then the bits should be interpreted as a large positive value. |
| |
| \item[\refqualclass{checker/signedness/qual}{Signed}] |
| indicates that the programmer intends the value to be |
| interpreted as signed. |
| That is, if the most significant bit in the bitwise representation is |
| set, then the bits should be interpreted as a negative value. |
| This is the default annotation. |
| |
| \item[\refqualclass{checker/signedness/qual}{SignedPositive}] |
| indicates that a value is known at compile time to be in the positive |
| signed range, so it has the same interpretation as signed or unsigned |
| and may be used with either interpretation. (Equivalently, the most |
| significant bit is guaranteed to be 0.) Programmers should usually |
| write \refqualclass{checker/signedness/qual}{Signed} or |
| \refqualclass{checker/signedness/qual}{Unsigned} instead. |
| |
| \item[\refqualclass{checker/signedness/qual}{SignedPositiveFromUnsigned}] |
| indicates that a value is in the positive signed range, as with |
| \refqualclass{checker/signedness/qual}{SignedPositive}. Furthermore, |
| the value was derived from values that can be interpreted as |
| \refqualclass{checker/signedness/qual}{Unsigned}. Programmers should |
| rarely write this annotation. |
| |
| \item[\refqualclass{checker/signedness/qual}{SignednessGlb}] |
| indicates that a value may be interpreted as unsigned or signed. It |
| covers the same cases as |
| \refqualclass{checker/signedness/qual}{SignedPositive}, plus manifest literals, to |
| prevent the programmer from having to annotate them all explicitly. |
| This annotation should almost never be written by the |
| programmer. |
| |
| \item[\refqualclass{checker/signedness/qual}{PolySigned}] |
| indicates qualifier polymorphism. |
| For a description of qualifier polymorphism, see |
| Section~\ref{method-qualifier-polymorphism}. |
| |
| \item[\refqualclass{checker/signedness/qual}{UnknownSignedness}] |
| indicates that a value's type is not relevant or known to this checker. |
| This annotation is used internally, and should not be |
| written by the programmer. |
| |
| \item[\refqualclass{checker/signedness/qual}{SignednessBottom}] |
| indicates that the value is \<null>. |
| This annotation is used internally, and should not |
| be written by the programmer. |
| |
| \end{description} |
| |
| |
| \subsectionAndLabel{Default qualifiers}{signedness-checker-annotations-default-qualifiers} |
| |
| The only type qualifier that the programmer should need to write is |
| \refqualclass{checker/signedness/qual}{Unsigned}. |
| When a programmer leaves an expression unannotated, the |
| Signedness Checker treats it in one of the following ways: |
| |
| \begin{itemize} |
| |
| \item |
| All \code{byte}, \code{short}, \code{int}, and \code{long} literals default |
| to \refqualclass{checker/signedness/qual}{SignednessGlb}. |
| \item |
| All \code{byte}, \code{short}, \code{int}, and \code{long} variables default |
| to \refqualclass{checker/signedness/qual}{Signed}. |
| \item |
| All other expressions default to \refqualclass{checker/signedness/qual}{UnknownSignedness}. |
| |
| \end{itemize} |
| |
| |
| \subsectionAndLabel{Widening conversions}{signedness-checker-widening-conversions} |
| |
| Figure~\ref{fig-signedness-hierarchy} shows the type qualifier hierarchy. |
| When upcasting among integral types, the expression has type |
| \<@SignedPositive> because the value is guaranteed to be within the signed |
| positive range. For example: |
| |
| \begin{Verbatim} |
| @Signed int sint; |
| @Unsigned int uint; |
| @SignedPositive long splong1 = sint; // legal |
| @SignedPositive long splong2 = uint; // legal |
| ... (long) sint ... // this expression has type @SignedPositive long |
| ... (long) uint ... // this expression has type @SignedPositive long |
| \end{Verbatim} |
| |
| |
| \sectionAndLabel{Prohibited operations}{signedness-checker-prohibited-operations} |
| |
| The Signedness Checker prohibits the following uses of operators: |
| |
| \begin{itemize} |
| |
| \item |
| Division (\code{/}) or modulus (\code{\%}) with an \code{@Unsigned} |
| operand. |
| \item |
| Signed right shift (\verb|>>|) with an \code{@Unsigned} left operand. |
| \item |
| Unsigned right shift (\verb|>>>|) with a \code{@Signed} left operand. |
| \item |
| Greater/less than (or equal) comparators |
| (\code{<}, \code{<=}, \code{>}, \code{>=}) with an \code{@Unsigned} |
| operand. |
| \item |
| Any other binary operator with one \code{@Unsigned} operand and one |
| \code{@Signed} operand, with the exception of left shift (\verb|<<|). |
| |
| \end{itemize} |
| |
| There are some special cases where these operations are permitted; see |
| Section~\ref{signedness-checker-permitted-shifts}. |
| |
| Like every type-checker built with the Checker Framework, the Signedness |
| Checker ensures that assignments and pseudo-assignments have consistent types. |
| For example, it is not permitted to assign a \code{@Signed} expression to an |
| \code{@Unsigned} variable or vice versa. |
| |
| |
| \subsectionAndLabel{Rationale}{signedness-checker-rationale} |
| |
| The Signedness Checker prevents misuse of unsigned values in Java code. |
| Most Java operations interpret operands as signed. If applied to unsigned |
| values, those operations would produce unexpected, incorrect results. |
| |
| Consider the following Java code: |
| |
| \begin{Verbatim} |
| public class SignednessManualExample { |
| |
| int s1 = -2; |
| int s2 = -1; |
| |
| @Unsigned int u1 = 2147483646; // unsigned: 2^32 - 2, signed: -2 |
| @Unsigned int u2 = 2147483647; // unsigned: 2^32 - 1, signed: -1 |
| |
| void m() { |
| int w = s1 / s2; // OK: result is 2, which is correct for -2 / -1 |
| int x = u1 / u2; // ERROR: result is 2, which is incorrect for (2^32 - 2) / (2^32 - 1) |
| } |
| |
| int s3 = -1; |
| int s4 = 5; |
| |
| @Unsigned int u3 = 2147483647; // unsigned: 2^32 - 1, signed: -1 |
| @Unsigned int u4 = 5; |
| |
| void m2() { |
| int y = s3 % s4; // OK: result is -1, which is correct for -1 % 5 |
| int z = u3 % u4; // ERROR: result is -1, which is incorrect for (2^32 - 1) % 5 = 2 |
| } |
| } |
| \end{Verbatim} |
| |
| These examples illustrate why division and modulus with an unsigned operand |
| are illegal. Other uses of operators are prohibited for similar reasons. |
| |
| |
| \subsectionAndLabel{Permitted shifts}{signedness-checker-permitted-shifts} |
| |
| As exceptions to the rules given above, the Signedness Checker permits |
| certain right shifts which are immediately followed by a cast or |
| masking operation. |
| |
| For example, right shift by 8 then mask by 0xFF evaluates to the same value |
| whether the argument is interpreted as signed or unsigned. Thus, the |
| Signedness Checker permits both \verb|((myInt >> 8) & 0xFF)| and |
| \verb|((myInt >>> 8) & 0xFF)|, regardless of the qualifier on the type of |
| \<myInt>. |
| |
| Likewise, right shift by 8 then cast to byte evaluates to the |
| same value whether the argument is interpreted as signed or unsigned, so |
| the Signedness Checker permits both \verb|(byte) (myInt >> 8)| and |
| \verb|(byte) (myInt >>> 8)|, regardless of the type of \<myInt>. |
| |
| %% TODO: This is not yet implemented. Should it be? |
| % These masked/casted shift expressions have type |
| % \refqualclass{checker/signedness/qual}{SignednessGlb}. They are bit |
| % patterns that can be interpreted as either signed or unsigned values. |
| |
| |
| % There are two distinct reasons to use \verb|>>>|, the unsigned right shift operator: |
| % \begin{itemize} |
| % \item |
| % To perform arithmetic: dividing or multiplying by 2. |
| % The left-hand operand must be an unsigned value, and |
| % the result is an unsigned value. |
| % \item |
| % To extract bits. |
| % The left-hand operand may be arbitrary, and \verb|>>>| behaves the same |
| % as \verb|>>|. |
| % % TODO: The result should be @SignednessUnknown. |
| % % The result is a bit string: no arithmetic should be performed on it (@SignednessUnknown). |
| % \end{itemize} |
| |
| |
| \sectionAndLabel{Utility routines for manipulating unsigned values}{signedness-utilities} |
| |
| Class \refclass{checker/signedness/util}{SignednessUtil} provides static |
| utility methods for working with unsigned values. They are |
| properly annotated with \refqualclass{checker/signedness/qual}{Unsigned} |
| where appropriate, so using them may reduce the number of annotations that |
| you need to write. |
| To use the \refclass{checker/signedness/util}{SignednessUtil} class, the |
| \<checker-util.jar> file must be on the classpath at run time. |
| |
| Class \refclass{checker/signedness/util}{SignednessUtilExtra} contains more utility |
| methods that reference packages not included in Android. This class is not |
| included in \code{checker-util.jar}, so you may want to copy the methods to your code. |
| |
| |
| \sectionAndLabel{Local type refinement}{signedness-refinement} |
| |
| Local type refinement/inference (Section~\ref{type-refinement}) may be |
| surprising for the Signedness type system. Ordinarily, an expression with |
| unsigned type may not participate in a division, as shown in |
| Sections~\ref{signedness-checker-prohibited-operations} |
| and~\ref{signedness-checker-rationale}. However, if a constant is assigned |
| to a variable that was declared with \<@Unsigned> type, then --- just like |
| the constant --- the variable may be treated as either signed or unsigned. |
| For example, it can participate in division. Since the result of the |
| division is signed, you cannot accidentally assign the division result to |
| an \<@Unsigned> variable. |
| |
| \begin{Verbatim} |
| void useLocalVariables() { |
| |
| int s1 = -2; |
| int s2 = -1; |
| |
| @Unsigned int u1 = 2147483646; // unsigned: 2^32 - 2, signed: -2 |
| @Unsigned int u2 = 2147483647; // unsigned: 2^32 - 1, signed: -1 |
| |
| int w = s1 / s2; // OK: result is 2, which is correct for -2 / -1 |
| int x = u1 / u2; // OK; computation over constants, interpreted as signed; result is signed |
| } |
| \end{Verbatim} |
| |
| |
| \sectionAndLabel{Other signedness annotations}{signedness-other-annotations} |
| |
| The Checker Framework's signedness annotations are similar to annotations used |
| elsewhere. |
| |
| If your code is already annotated with a different |
| annotation, the Checker Framework can type-check your code. |
| It treats annotations from other tools |
| as if you had written the corresponding annotation from the |
| Signedness Checker, as described in Figure~\ref{fig-signedness-refactoring}. |
| % If the other annotation is a declaration annotation, it may be moved; see |
| % Section~\ref{declaration-annotations-moved}. |
| |
| |
| % These lists should be kept in sync with SignednessAnnotatedTypeFactory.java . |
| \begin{figure} |
| \begin{center} |
| % The ~ around the text makes things look better in Hevea (and not terrible |
| % in LaTeX). |
| \begin{tabular}{ll} |
| \begin{tabular}{|l|} |
| \hline |
| ~jdk.jfr.Unsigned~ \\ \hline |
| \end{tabular} |
| & |
| $\Rightarrow$ |
| ~org.checkerframework.checker.signedness.qual.Unsigned~ |
| \end{tabular} |
| \end{center} |
| %BEGIN LATEX |
| \vspace{-1.5\baselineskip} |
| %END LATEX |
| \caption{Correspondence between other signedness annotations |
| and the Checker Framework's annotations.} |
| \label{fig-signedness-refactoring} |
| \end{figure} |
| |
| % LocalWords: Signedness signedness IntRange bitwise SignedPositive myInt |
| % LocalWords: SignedPositiveFromUnsigned SignednessGlb PolySigned qual |
| % LocalWords: UnknownSignedness SignednessBottom upcasting comparators |
| % LocalWords: SignednessUtil SignednessUtilExtra |