Data Types
Java is a strongly typed language. This means that every variable must have a declared type. There are eight primitive types in Java. Four of them are integer types; two are floating-point number types; one is the character type char, used for characters in the Unicode encoding (see the section on the char type), and one is a boolean type for truth values.
Java has an arbitrary precision arithmetic package. However, “Big numbers,” as they are called, are Java objects and not a new Java type. You will see how to use them later in this chapter.
Java has an arbitrary precision arithmetic package. However, “Big numbers,” as they are called, are Java objects and not a new Java type. You will see how to use them later in this chapter.
Integers
The integer types are for numbers without fractional parts. Negative values are allowed. Java provides the four integer types shown in Table 3-1.
Table 3-1. Java integer types
Type |
Storage Requirement |
Range (inclusive) |
---|---|---|
int |
4 bytes |
–2,147,483,648 to 2,147,483, 647 (just over 2 billion) |
short |
2 bytes |
–32,768 to 32,767 |
long |
8 bytes |
–9,223,372,036,854,775,808L to 9,223,372,036,854,775,807L |
byte |
1 byte |
–128 to 127 |
In most situations, the int type is the most practical. If you want to represent the number of inhabitants of our planet, you'll need to resort to a long. The byte and short types are mainly intended for specialized applications, such as low-level file handling, or for large arrays when storage space is at a premium.
Under Java, the ranges of the integer types do not depend on the machine on which you will be running the Java code. This alleviates a major pain for the programmer who wants to move software from one platform to another, or even between operating systems on the same platform. In contrast, C and C++ programs use the most efficient integer type for each processor. As a result, a C program that runs well on a 32-bit processor may exhibit integer overflow on a 16-bit system. Since Java programs must run with the same results on all machines, the ranges for the various types are fixed.
Long integer numbers have a suffix L (for example, 4000000000L). Hexadecimal numbers have a prefix 0x (for example, 0xCAFE). Octal numbers have a prefix 0. For example, 010 is 8. Naturally, this can be confusing, and we recommend against the use of octal constants.
In C and C++, int denotes the integer type that depends on the target machine. On a 16-bit processor, like the 8086, integers are 2 bytes. On a 32-bit processor like the Sun SPARC, they are 4-byte quantities. On an Intel Pentium, the integer type of C and C++ depends on the operating system: for DOS and Windows 3.1, integers are 2 bytes. When using 32-bit mode for Windows programs, integers are 4 bytes. In Java, the sizes of all numeric types are platform independent.
Note that Java does not have any unsigned types.
In C and C++, int denotes the integer type that depends on the target machine. On a 16-bit processor, like the 8086, integers are 2 bytes. On a 32-bit processor like the Sun SPARC, they are 4-byte quantities. On an Intel Pentium, the integer type of C and C++ depends on the operating system: for DOS and Windows 3.1, integers are 2 bytes. When using 32-bit mode for Windows programs, integers are 4 bytes. In Java, the sizes of all numeric types are platform independent.
Note that Java does not have any unsigned types.
Floating-Point Types
The floating-point types denote numbers with fractional parts. There are two floating-point types, as shown in Table 3-2.
Table 3-2. Floating-point types
Type |
Storage Requirement |
Range |
---|---|---|
float |
4 bytes |
approximately ±3.40282347E+38F (6–7 significant decimal digits) |
double |
8 bytes |
approximately ±1.79769313486231570E+308 (15 significant decimal digits) |
The name double refers to the fact that these numbers have twice the precision of the float type. (Some people call these double-precision numbers.) Here, the type to choose in most applications is double. The limited precision of float is simply not sufficient for many situations. Seven significant (decimal) digits may be enough to precisely express your annual salary in dollars and cents, but it won't be enough for your company president's salary. The only reason to use float is in the rare situations in which the slightly faster processing of single-precision numbers is important, or when you need to store a large number of them.
Numbers of type float have a suffix F, for example, 3.402F. Floating-point numbers without an F suffix (such as 3.402) are always considered to be of type double. You can optionally supply the D suffix such as 3.402D.
All floating-point computations follow the IEEE 754 specification. In particular, there are three special floating-point values:
-
positive infinity
-
negative infinity
-
NaN (not a number)
to denote overflows and errors. For example, the result of dividing a positive number by 0 is positive infinity. Computing 0/0 or the square root of a negative number yields NaN.
There are constants Double.POSITIVE_INFINITY, Double.NEGATIVE_ INFINITY and Double.NaN (as well as corresponding Float constants) to represent these special values. But they are rarely used in practice. In particular, you cannot test
if (x == Double.NaN) // is never true
to check whether a particular result equals Double.NaN. All “not a number” values are considered distinct. However, you can use the Double.isNaN method:
if (Double.isNaN(x)) // check whether x is "not a number"
There are constants Double.POSITIVE_INFINITY, Double.NEGATIVE_ INFINITY and Double.NaN (as well as corresponding Float constants) to represent these special values. But they are rarely used in practice. In particular, you cannot test
if (x == Double.NaN) // is never true
to check whether a particular result equals Double.NaN. All “not a number” values are considered distinct. However, you can use the Double.isNaN method:
if (Double.isNaN(x)) // check whether x is "not a number"
Floating-point numbers are not suitable for financial calculation where roundoff errors cannot be tolerated. For example, the command System.out.println(2.0 - 1.1) prints 0.8999999999999999, not 0.9 as you would expect. Such roundoff errors are caused by the fact that floating-point numbers are represented in the binary number system. There is no precise binary representation of the fraction 1/10, just as there is no accurate representation of the fraction 1/3 in the decimal system. If you need precise numerical computations without roundoff errors, you need to use the BigDecimal class which is introduced later in this chapter.
Floating-point numbers are not suitable for financial calculation where roundoff errors cannot be tolerated. For example, the command System.out.println(2.0 - 1.1) prints 0.8999999999999999, not 0.9 as you would expect. Such roundoff errors are caused by the fact that floating-point numbers are represented in the binary number system. There is no precise binary representation of the fraction 1/10, just as there is no accurate representation of the fraction 1/3 in the decimal system. If you need precise numerical computations without roundoff errors, you need to use the BigDecimal class which is introduced later in this chapter.
The Character Type
First, single quotes are used to denote char constants. For example, 'H' is a character. It is different from "H", a string containing a single character. Second, the char type denotes characters in the Unicode encoding scheme. You may not be familiar with Unicode, and, fortunately, you don't need to worry much about it if you don't program international applications. (Even if you do, you still won't have to worry about it too much because Unicode was designed to make the use of non-Roman characters easy to handle.) Because Unicode was designed to handle essentially all characters in all written languages in the world, it is a 2-byte code. This allows 65,536 characters, of which about 35,000 are currently in use. This is far richer than the ASCII codeset, which is a 1-byte code with 128 characters, or the commonly used ISO 8859-1 extension with 256 characters. That character set (which some programmers call the “Latin-1” character set) is a subset of Unicode. More precisely, it is the first 256 characters in the Unicode coding scheme. Thus, character codes like 'a', '1', '[' and 'ä' are valid Unicode characters with character codes < 256. Unicode characters have codes between 0 and 65535, but they are usually expressed as hexadecimal values that run from '\u0000' to '\uFFFF' (with '\u0000' to '\u00FF' being the ordinary ISO 8859-1 characters). The \u prefix indicates a Unicode value, and the four hexadecimal digits tell you what Unicode character. For example, \u2122 is the trademark symbol (™). For more information on Unicode, you might want to check out the Web site at http://www.unicode.org.
Besides the \u escape character that indicates the encoding of a Unicode character, there are several escape sequences for special characters shown in Table 3-3.
Table 3-3. Special characters
Escape Sequence |
Name |
Unicode Value |
---|---|---|
\b |
backspace |
\u0008 |
\t |
tab |
\u0009 |
\n |
linefeed |
\u000a |
\r |
carriage return |
\u000d |
\" |
double quote |
\u0022 |
\' |
single quote |
\u0027 |
\\ |
backslash |
\u005c |
Although you can theoretically use any Unicode character in a Java application or applet, whether you can actually see it displayed depends on your browser (for applets) and (ultimately) on your operating system for both. For example, you cannot use Java to output Kanji on a machine running the U.S. version of Windows. For more on internationalization issues, please see Chapter 12 of Volume 2.
Although you can theoretically use any Unicode character in a Java application or applet, whether you can actually see it displayed depends on your browser (for applets) and (ultimately) on your operating system for both. For example, you cannot use Java to output Kanji on a machine running the U.S. version of Windows. For more on internationalization issues, please see Chapter 12 of Volume 2.
The boolean Type
The boolean type has two values, false and true. It is used for evaluating logical conditions. You cannot convert between integers and boolean values.
In C++, numbers and even pointers can be used in place of boolean values. The value 0 is equivalent to the bool value false, and a non-zero value is equivalent to true. This is not the case in Java. Thus, Java programmers are shielded from accidents such as
if (x = 0) // oops...meant x == 0
In C++, this test compiles and runs, always evaluating to false. In Java, the test does not compile because the integer expression x = 0 cannot be converted to a boolean value.
In C++, numbers and even pointers can be used in place of boolean values. The value 0 is equivalent to the bool value false, and a non-zero value is equivalent to true. This is not the case in Java. Thus, Java programmers are shielded from accidents such as
if (x = 0) // oops...meant x == 0
In C++, this test compiles and runs, always evaluating to false. In Java, the test does not compile because the integer expression x = 0 cannot be converted to a boolean value.