blob: 73a9e8401592b06024582ef43c69573db2dd6d0a [file] [log] [blame] [edit]
\htmlhr
\chapterAndLabel{Signedness Checker}{signedness-checker}
The Signedness Checker guarantees that signed and unsigned integral values are not mixed
together in a computation. In addition, it prohibits meaningless operations, such
as division on an unsigned value.
Recall that a computer represents a number as a sequence of bits.
Signedness indicates how to interpret the most significant bit. For
example, the bits \<10000010> ordinarily represent the value -126, but when
interpreted as unsigned, those bits represent the value 130. The bits
\<01111110> represent the value 126 in signed and in unsigned interpretation.
The range of signed byte values is -128 to 127. The range of unsigned byte
values is 0 to 255.
Signedness is only applicable to the integral types \<byte>,
\<short>, \<int>, and \<long> and their boxed variants \<Byte>,
\<Short>, \<Integer>, and \<Long>.
\<char> and \<Character> are always unsigned.
Floating-point types \<float>, \<double>, \<Float>, and \<Double> are always signed.
% , because they do not have operations that interpret the bits as unsigned.
Signedness is primarily about how the bits of the representation are
\emph{interpreted}, not about the values that it can represent. An unsigned value
is always positive, but just because a variable's value is positive does
not mean that it should be marked as \<@Unsigned>. If variable $v$ will be
compared to a signed value, or used in arithmetic operations with a signed
value, then $v$ should have signed type.
To indicate the range of possible values for a variable, use the
\refqualclass{checker/index/qual}{NonNegative} annotation of the Index
Checker (see \chapterpageref{index-checker}) or the
\refqualclass{common/value/qual}{IntRange} annotation of the Constant Value
Checker (see \chapterpageref{constant-value-checker}).
\sectionAndLabel{Annotations}{signedness-checker-annotations}
The Signedness Checker uses type annotations to indicate the signedness that the programmer intends an expression to have.
\begin{figure}
\includeimage{signedness}{3.5cm}
\caption{The type qualifier hierarchy of the signedness annotations.
Qualifiers in gray are used internally by the type system but should never be written by a programmer.}
\label{fig-signedness-hierarchy}
\end{figure}
These are the qualifiers in the signedness type system:
\begin{description}
\item[\refqualclass{checker/signedness/qual}{Unsigned}]
indicates that the programmer intends the value to be
interpreted as unsigned.
That is, if the most significant bit in the bitwise representation is
set, then the bits should be interpreted as a large positive value.
\item[\refqualclass{checker/signedness/qual}{Signed}]
indicates that the programmer intends the value to be
interpreted as signed.
That is, if the most significant bit in the bitwise representation is
set, then the bits should be interpreted as a negative value.
This is the default annotation.
\item[\refqualclass{checker/signedness/qual}{SignedPositive}]
indicates that a value is known at compile time to be in the positive
signed range, so it has the same interpretation as signed or unsigned
and may be used with either interpretation. (Equivalently, the most
significant bit is guaranteed to be 0.) Programmers should usually
write \refqualclass{checker/signedness/qual}{Signed} or
\refqualclass{checker/signedness/qual}{Unsigned} instead.
\item[\refqualclass{checker/signedness/qual}{SignedPositiveFromUnsigned}]
indicates that a value is in the positive signed range, as with
\refqualclass{checker/signedness/qual}{SignedPositive}. Furthermore,
the value was derived from values that can be interpreted as
\refqualclass{checker/signedness/qual}{Unsigned}. Programmers should
rarely write this annotation.
\item[\refqualclass{checker/signedness/qual}{SignednessGlb}]
indicates that a value may be interpreted as unsigned or signed. It
covers the same cases as
\refqualclass{checker/signedness/qual}{SignedPositive}, plus manifest literals, to
prevent the programmer from having to annotate them all explicitly.
This annotation should almost never be written by the
programmer.
\item[\refqualclass{checker/signedness/qual}{PolySigned}]
indicates qualifier polymorphism.
For a description of qualifier polymorphism, see
Section~\ref{method-qualifier-polymorphism}.
\item[\refqualclass{checker/signedness/qual}{UnknownSignedness}]
indicates that a value's type is not relevant or known to this checker.
This annotation is used internally, and should not be
written by the programmer.
\item[\refqualclass{checker/signedness/qual}{SignednessBottom}]
indicates that the value is \<null>.
This annotation is used internally, and should not
be written by the programmer.
\end{description}
\subsectionAndLabel{Default qualifiers}{signedness-checker-annotations-default-qualifiers}
The only type qualifier that the programmer should need to write is
\refqualclass{checker/signedness/qual}{Unsigned}.
When a programmer leaves an expression unannotated, the
Signedness Checker treats it in one of the following ways:
\begin{itemize}
\item
All \code{byte}, \code{short}, \code{int}, and \code{long} literals default
to \refqualclass{checker/signedness/qual}{SignednessGlb}.
\item
All \code{byte}, \code{short}, \code{int}, and \code{long} variables default
to \refqualclass{checker/signedness/qual}{Signed}.
\item
All other expressions default to \refqualclass{checker/signedness/qual}{UnknownSignedness}.
\end{itemize}
\subsectionAndLabel{Widening conversions}{signedness-checker-widening-conversions}
Figure~\ref{fig-signedness-hierarchy} shows the type qualifier hierarchy.
When upcasting among integral types, the expression has type
\<@SignedPositive> because the value is guaranteed to be within the signed
positive range. For example:
\begin{Verbatim}
@Signed int sint;
@Unsigned int uint;
@SignedPositive long splong1 = sint; // legal
@SignedPositive long splong2 = uint; // legal
... (long) sint ... // this expression has type @SignedPositive long
... (long) uint ... // this expression has type @SignedPositive long
\end{Verbatim}
\sectionAndLabel{Prohibited operations}{signedness-checker-prohibited-operations}
The Signedness Checker prohibits the following uses of operators:
\begin{itemize}
\item
Division (\code{/}) or modulus (\code{\%}) with an \code{@Unsigned}
operand.
\item
Signed right shift (\verb|>>|) with an \code{@Unsigned} left operand.
\item
Unsigned right shift (\verb|>>>|) with a \code{@Signed} left operand.
\item
Greater/less than (or equal) comparators
(\code{<}, \code{<=}, \code{>}, \code{>=}) with an \code{@Unsigned}
operand.
\item
Any other binary operator with one \code{@Unsigned} operand and one
\code{@Signed} operand, with the exception of left shift (\verb|<<|).
\end{itemize}
There are some special cases where these operations are permitted; see
Section~\ref{signedness-checker-permitted-shifts}.
Like every type-checker built with the Checker Framework, the Signedness
Checker ensures that assignments and pseudo-assignments have consistent types.
For example, it is not permitted to assign a \code{@Signed} expression to an
\code{@Unsigned} variable or vice versa.
\subsectionAndLabel{Rationale}{signedness-checker-rationale}
The Signedness Checker prevents misuse of unsigned values in Java code.
Most Java operations interpret operands as signed. If applied to unsigned
values, those operations would produce unexpected, incorrect results.
Consider the following Java code:
\begin{Verbatim}
public class SignednessManualExample {
int s1 = -2;
int s2 = -1;
@Unsigned int u1 = 2147483646; // unsigned: 2^32 - 2, signed: -2
@Unsigned int u2 = 2147483647; // unsigned: 2^32 - 1, signed: -1
void m() {
int w = s1 / s2; // OK: result is 2, which is correct for -2 / -1
int x = u1 / u2; // ERROR: result is 2, which is incorrect for (2^32 - 2) / (2^32 - 1)
}
int s3 = -1;
int s4 = 5;
@Unsigned int u3 = 2147483647; // unsigned: 2^32 - 1, signed: -1
@Unsigned int u4 = 5;
void m2() {
int y = s3 % s4; // OK: result is -1, which is correct for -1 % 5
int z = u3 % u4; // ERROR: result is -1, which is incorrect for (2^32 - 1) % 5 = 2
}
}
\end{Verbatim}
These examples illustrate why division and modulus with an unsigned operand
are illegal. Other uses of operators are prohibited for similar reasons.
\subsectionAndLabel{Permitted shifts}{signedness-checker-permitted-shifts}
As exceptions to the rules given above, the Signedness Checker permits
certain right shifts which are immediately followed by a cast or
masking operation.
For example, right shift by 8 then mask by 0xFF evaluates to the same value
whether the argument is interpreted as signed or unsigned. Thus, the
Signedness Checker permits both \verb|((myInt >> 8) & 0xFF)| and
\verb|((myInt >>> 8) & 0xFF)|, regardless of the qualifier on the type of
\<myInt>.
Likewise, right shift by 8 then cast to byte evaluates to the
same value whether the argument is interpreted as signed or unsigned, so
the Signedness Checker permits both \verb|(byte) (myInt >> 8)| and
\verb|(byte) (myInt >>> 8)|, regardless of the type of \<myInt>.
%% TODO: This is not yet implemented. Should it be?
% These masked/casted shift expressions have type
% \refqualclass{checker/signedness/qual}{SignednessGlb}. They are bit
% patterns that can be interpreted as either signed or unsigned values.
% There are two distinct reasons to use \verb|>>>|, the unsigned right shift operator:
% \begin{itemize}
% \item
% To perform arithmetic: dividing or multiplying by 2.
% The left-hand operand must be an unsigned value, and
% the result is an unsigned value.
% \item
% To extract bits.
% The left-hand operand may be arbitrary, and \verb|>>>| behaves the same
% as \verb|>>|.
% % TODO: The result should be @SignednessUnknown.
% % The result is a bit string: no arithmetic should be performed on it (@SignednessUnknown).
% \end{itemize}
\sectionAndLabel{Utility routines for manipulating unsigned values}{signedness-utilities}
Class \refclass{checker/signedness/util}{SignednessUtil} provides static
utility methods for working with unsigned values. They are
properly annotated with \refqualclass{checker/signedness/qual}{Unsigned}
where appropriate, so using them may reduce the number of annotations that
you need to write.
To use the \refclass{checker/signedness/util}{SignednessUtil} class, the
\<checker-util.jar> file must be on the classpath at run time.
Class \refclass{checker/signedness/util}{SignednessUtilExtra} contains more utility
methods that reference packages not included in Android. This class is not
included in \code{checker-util.jar}, so you may want to copy the methods to your code.
\sectionAndLabel{Local type refinement}{signedness-refinement}
Local type refinement/inference (Section~\ref{type-refinement}) may be
surprising for the Signedness type system. Ordinarily, an expression with
unsigned type may not participate in a division, as shown in
Sections~\ref{signedness-checker-prohibited-operations}
and~\ref{signedness-checker-rationale}. However, if a constant is assigned
to a variable that was declared with \<@Unsigned> type, then --- just like
the constant --- the variable may be treated as either signed or unsigned.
For example, it can participate in division. Since the result of the
division is signed, you cannot accidentally assign the division result to
an \<@Unsigned> variable.
\begin{Verbatim}
void useLocalVariables() {
int s1 = -2;
int s2 = -1;
@Unsigned int u1 = 2147483646; // unsigned: 2^32 - 2, signed: -2
@Unsigned int u2 = 2147483647; // unsigned: 2^32 - 1, signed: -1
int w = s1 / s2; // OK: result is 2, which is correct for -2 / -1
int x = u1 / u2; // OK; computation over constants, interpreted as signed; result is signed
}
\end{Verbatim}
\sectionAndLabel{Other signedness annotations}{signedness-other-annotations}
The Checker Framework's signedness annotations are similar to annotations used
elsewhere.
If your code is already annotated with a different
annotation, the Checker Framework can type-check your code.
It treats annotations from other tools
as if you had written the corresponding annotation from the
Signedness Checker, as described in Figure~\ref{fig-signedness-refactoring}.
% If the other annotation is a declaration annotation, it may be moved; see
% Section~\ref{declaration-annotations-moved}.
% These lists should be kept in sync with SignednessAnnotatedTypeFactory.java .
\begin{figure}
\begin{center}
% The ~ around the text makes things look better in Hevea (and not terrible
% in LaTeX).
\begin{tabular}{ll}
\begin{tabular}{|l|}
\hline
~jdk.jfr.Unsigned~ \\ \hline
\end{tabular}
&
$\Rightarrow$
~org.checkerframework.checker.signedness.qual.Unsigned~
\end{tabular}
\end{center}
%BEGIN LATEX
\vspace{-1.5\baselineskip}
%END LATEX
\caption{Correspondence between other signedness annotations
and the Checker Framework's annotations.}
\label{fig-signedness-refactoring}
\end{figure}
% LocalWords: Signedness signedness IntRange bitwise SignedPositive myInt
% LocalWords: SignedPositiveFromUnsigned SignednessGlb PolySigned qual
% LocalWords: UnknownSignedness SignednessBottom upcasting comparators
% LocalWords: SignednessUtil SignednessUtilExtra