Character Literals

Character literals are enclosed in single quotation marks.

Any printable character, other than a backslash (\) or a single quote ('), can be specified as the single character itself enclosed in single quotes. Some examples of these literals are 'a', 'A', '9', '+', '_', and '~'.

Some characters, such as the backspace, cannot be written out like this, so these characters are represented by escape sequences. Escape sequences, like all character literals, are enclosed within single quotes. They consist of a backslash followed by one of the following:

  • A single character (b, t, n, f, r, ", ', or \)

  • An octal number of one to three digits, with a value between 000 and 377

  • A u followed by four hexadecimal digits specifying a Unicode character

The escape sequences built from single characters are shown in Table 3.7.

Table 3.7 Escape Sequences

Escape Sequence   Unicode   Meaning
'\b'              \u0008    Backspace
'\t'              \u0009    Horizontal tab
'\n'              \u000a    Linefeed
'\f'              \u000c    Form feed
'\r'              \u000d    Carriage return
'\"'              \u0022    Double quotation mark
'\''              \u0027    Single quotation mark
'\\'              \u005c    Backslash
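
As a quick check of Table 3.7, the following sketch prints the numeric value of each single-character escape as four hexadecimal digits. The class name EscapeDemo and the helper hex are illustrative names, not part of the original text.

```java
// Hypothetical demo class; the names here are illustrative only.
class EscapeDemo {
    // Formats the numeric value of a char as four lowercase hex digits.
    static String hex(char c) {
        return String.format("%04x", (int) c);
    }

    public static void main(String[] args) {
        // The eight single-character escape sequences from Table 3.7.
        char[] escapes = { '\b', '\t', '\n', '\f', '\r', '\"', '\'', '\\' };
        for (char c : escapes) {
            System.out.println(hex(c));
        }
    }
}
```

Each printed value should match the Unicode column of the table, starting with 0008 for the backspace and ending with 005c for the backslash.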



Caution - Don't use the Unicode format (\u000a or \u000d) to express an end-of-line character. Use the '\n' or '\r' escape sequences instead.


The octal values allowed in character literals cover the Unicode values from '\u0000' to '\u00ff'. The lower half of this range, '\u0000' to '\u007f', is the traditional ASCII range. Table 3.8 shows some examples of octal character literals.

Table 3.8 Octal Character Literals

Octal Literal   Unicode   Meaning
'\007'          \u0007    Bell
'\101'          \u0041    'A'
'\141'          \u0061    'a'
'\071'          \u0039    '9'
'\042'          \u0022    Double quotation mark
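
The rows of Table 3.8 can be verified with a small sketch that compares each octal literal with its plain counterpart. OctalDemo and its value helper are hypothetical names introduced only for this illustration.

```java
// Hypothetical demo class; the names here are illustrative only.
class OctalDemo {
    // Widens a char to its numeric value so it can be printed or compared.
    static int value(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println('\101' == 'A');    // true: octal 101 is decimal 65
        System.out.println('\141' == 'a');    // true: octal 141 is decimal 97
        System.out.println('\071' == '9');    // true: octal 071 is decimal 57
        System.out.println('\042' == '"');    // true: octal 042 is decimal 34
        System.out.println(value('\007'));    // 7, the bell character
        System.out.println(value('\377'));    // 255, the largest octal escape
    }
}
```

The comparison form also shows why octal literals are rarely needed for printable characters: the plain literal on the right-hand side is the same value and is easier to read.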


You can use Unicode sequences anywhere in your Java code, not just as character literals. As indicated earlier, identifiers can contain Unicode letters and digits. In fact, comments, identifiers, and the contents of character and string literals can all be expressed using Unicode sequences. You must use caution, however, because the compiler translates Unicode sequences into the characters they stand for very early, before it breaks the source code into tokens. For example, if you were to use the Unicode representation for a linefeed ('\u000a') as part of a print statement, it would cause a compiler error. This is because the compiler would see an actual linefeed in your source code that occurs before the closing single quote of a character literal. This is the reason for the earlier caution to always use '\n' and '\r' for line termination literals.

For an example of using Unicode, look at the following statements that declare and reference a variable using an identifier specified with a Unicode sequence:

int \u0074\u0065\u0073\u0074 = 3;
System.out.println( test );
System.out.println( \u0074\u0065\u0073\u0074 );

This code probably looks strange to you, but the first statement in this example declares and initializes an integer variable named test ('\u0074' equates to 't', '\u0065' equates to 'e', and so on). Although quite different in appearance, both println statements are equivalent; they each display the value assigned to test when executed.
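
For readers who want to compile the fragment, here is a minimal self-contained sketch of the same idea. The class and method names (UnicodeIdentifierDemo, readTest, spelled) are assumptions added for illustration; only the escaped identifier comes from the statements above.

```java
// Hypothetical demo class; the names here are illustrative only.
class UnicodeIdentifierDemo {
    // Declares a variable with an escaped identifier, then reads it by its plain name.
    static int readTest() {
        int \u0074\u0065\u0073\u0074 = 3;   // the compiler sees: int test = 3;
        return test;
    }

    // The same four escapes inside a string literal spell the word "test".
    static String spelled() {
        return "\u0074\u0065\u0073\u0074";
    }

    public static void main(String[] args) {
        System.out.println(readTest());   // 3
        System.out.println(spelled());    // test
    }
}
```

Because the escapes are translated before the compiler identifies tokens, the escaped and plain spellings name the same variable.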

Now look at two attempts to output a linefeed using different representations:

System.out.print( "\n" );   // OK
System.out.print( '\u000a' ); // a compiler error

The first statement is valid and writes a single linefeed. It matches a call to System.out.println() only on platforms whose line separator is "\n", because println() writes the platform's line separator, which on some systems is "\r\n". The second statement, however, causes a compiler error. As mentioned previously, the Unicode sequence is interpreted early, and it appears to the compiler that the argument to print is a character literal that is prematurely terminated by a linefeed.
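
The following sketch collects a few ways to write a linefeed that do compile. LinefeedDemo and its linefeed helper are hypothetical names, and the comments deliberately avoid spelling out the Unicode sequence, since it would be translated even inside a comment.

```java
// Hypothetical demo class; the names here are illustrative only.
class LinefeedDemo {
    // Returns the linefeed character using the escape sequence form.
    static char linefeed() {
        return '\n';
    }

    public static void main(String[] args) {
        // Escape sequence: the usual way to write a linefeed.
        System.out.print("first" + '\n');

        // Numeric cast: the same character, written as a hex integer.
        System.out.print("second" + (char) 0x0a);

        // Platform line separator: what println() itself appends.
        System.out.print("third" + System.lineSeparator());

        System.out.println((int) linefeed());   // 10
    }
}
```

The first two forms always produce the single character with value 10, while System.lineSeparator() depends on the platform running the program.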
