3.4. Analyzing Character Attributes

You want to evaluate the individual characters in a string to determine a character's attributes.

Technique

The System.Char structure contains several static methods that let you test individual characters. You can test whether a character is a digit, letter, or punctuation symbol, or whether it is lowercase or uppercase.

Comments

One of the hardest issues to handle when writing software is making sure that users enter valid data. You can use many different techniques, such as restricting input to digits only, but ultimately you always need an underlying validation test of the input data.
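As a minimal sketch of the digits-only restriction mentioned above, the following helper rejects any string that contains a non-digit character. The class name DigitValidation and the method name IsAllDigits are hypothetical and are not part of the book's listing.

```csharp
using System;

public class DigitValidation
{
    // Hypothetical helper: returns true only if the string is non-empty
    // and every character in it is a decimal digit.
    public static bool IsAllDigits(string input)
    {
        if (input == null || input.Length == 0)
            return false;

        foreach (char ch in input)
        {
            if (!Char.IsDigit(ch))
                return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(IsAllDigits("2024"));   // True
        Console.WriteLine(IsAllDigits("20a4"));   // False
        Console.WriteLine(IsAllDigits(""));       // False
    }
}
```

Char.IsDigit also accepts decimal digits from scripts other than Latin. If only the ASCII digits are acceptable, a range comparison such as ch >= '0' && ch <= '9' is a stricter alternative.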

You can use the System.Char structure to perform a variety of text-validation procedures. Listing 3.5 demonstrates validating user input as well as inspecting the characteristics of a character. It begins by displaying a menu and then waits for user input using the Console.ReadLine method. Once a user enters a command, the ValidateMainMenuInput method checks it. This method rejects a first character that is a digit, punctuation mark, or symbol, and then rejects anything other than the N and Q commands. If the validation passes and the command is N, the program prompts for a phrase and passes it to a method that inspects each character. This method simply enumerates all the characters in the string and prints descriptive messages based on their characteristics.

Listing 3.5 does not use every inspection method in System.Char. Table 3.3 lists the inspection methods, including those the listing leaves out, along with their functionality. The results of running the application in Listing 3.5 appear in Figure 3.1.

Listing 3.5 Using the Static Methods in System.Char to Inspect the Details of a Single Character

using System;

namespace _4_CharAttributes
{
  class Class1
  {
    [STAThread]
    static void Main(string[] args)
    {
      char cmd = 'x';

      string input;
      do
      {
        DisplayMainMenu();
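        input = Console.ReadLine();

        // ReadLine returns null at end of input; treat that as a request to quit
        if( input == null )
          break;

        if( (input.Length == 0 ) ||
           ValidateMainMenuInput( Char.ToUpper(input[0]) ) == 0 )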
        {
          Console.WriteLine( "Invalid command!" );
        }
        else
        {
          cmd = Char.ToUpper(input[0]);

          switch( cmd )
          {
            case 'Q':
            {
              break;
            }

            case 'N':
            {
              Console.Write( "Enter a phrase to inspect: " );
              input = Console.ReadLine();
              // ReadLine returns null at end of input; skip the inspection in that case
              if( input != null )
                InspectPhrase( input );
              break;
            }
          }
        }
      } while ( cmd != 'Q' );
    }

    private static void InspectPhrase( string input )
    {
      foreach( char ch in input )
      {
        Console.Write( ch + " - ");

        if( Char.IsDigit(ch) )
          Console.Write( "IsDigit " );
        if( Char.IsLetter(ch) )
        {
          Console.Write( "IsLetter " );
          Console.Write( "(lowercase={0}, uppercase={1})", 
            Char.ToLower(ch), Char.ToUpper(ch));
        }
        if( Char.IsPunctuation(ch) )
          Console.Write( "IsPunctuation " );
        if( Char.IsWhiteSpace(ch) )
          Console.Write( "IsWhitespace" );

        Console.Write("\n");
        
      }
    }
    private static int ValidateMainMenuInput( char input )
    {
      // a simple check to see if input == 'N' or 'Q' is good enough
      // the following is for illustrative purposes
      if( Char.IsDigit( input ) == true )
        return 0;
      else if ( Char.IsPunctuation( input ) )
        return 0;
      else if( Char.IsSymbol( input ))
        return 0;
      else if( input != 'N' && input != 'Q' )
        return 0;

      return (int) input;
    }

    private static void DisplayMainMenu()
    {
      Console.WriteLine( "\nPhrase Inspector\n-------------------" );
      Console.WriteLine( "N)ew Phrase" );
      Console.WriteLine( "Q)uit\n" );
      Console.Write( ">> " );
    }
  }
}
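The comment inside ValidateMainMenuInput notes that a simple comparison against 'N' and 'Q' would be good enough. One way that simpler check might look is sketched below. The class name MenuValidation, the method name ParseMenuCommand, and the choice of '\0' as the "invalid" return value are assumptions made for illustration, not part of Listing 3.5.

```csharp
using System;

public class MenuValidation
{
    // Hypothetical simplified validator: returns the uppercase command
    // character ('N' or 'Q'), or '\0' when the input is not a menu command.
    public static char ParseMenuCommand(string input)
    {
        if (input == null || input.Length == 0)
            return '\0';

        char cmd = Char.ToUpper(input[0]);
        if (cmd == 'N' || cmd == 'Q')
            return cmd;

        return '\0';
    }

    public static void Main()
    {
        Console.WriteLine(ParseMenuCommand("n") == 'N');     // True
        Console.WriteLine(ParseMenuCommand("quit") == 'Q');  // True
        Console.WriteLine(ParseMenuCommand("7") == '\0');    // True
    }
}
```

Because digits, punctuation marks, and symbols can never equal 'N' or 'Q', the single comparison covers every case that the listing tests one category at a time.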

Table 3.3 System.Char Inspection Methods

Name             Description
IsControl        Indicates a control character such as a tab or carriage return.
IsDigit          Indicates a single decimal digit.
IsLetter         Indicates an alphabetic character.
IsLetterOrDigit  Indicates a character that is either a letter or a decimal digit.
IsLower          Indicates a lowercase letter.
IsNumber         Indicates any numeric character: a decimal digit or another Unicode number character such as a fraction or superscript digit.
IsPunctuation    Indicates a punctuation mark.
IsSeparator      Indicates a Unicode separator character, such as the space character.
IsSurrogate      Indicates a surrogate code unit, one half of the pair of 16-bit values that together encode a Unicode character outside the 0 to 0xFFFF range.
IsSymbol         Indicates a symbol character such as $ or +.
IsUpper          Indicates an uppercase letter.
IsWhiteSpace     Indicates a character classified as whitespace, such as a space, tab, or carriage return.
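Several of the methods in Table 3.3 overlap, so it can help to run a few characters through them side by side. The sketch below is illustrative only; the class name CategoryDemo and the method name Describe are hypothetical.

```csharp
using System;

public class CategoryDemo
{
    // Builds a space-separated list of the categories a character falls into.
    public static string Describe(char ch)
    {
        string result = "";
        if (Char.IsDigit(ch)) result += "Digit ";
        if (Char.IsNumber(ch)) result += "Number ";
        if (Char.IsUpper(ch)) result += "Upper ";
        if (Char.IsLower(ch)) result += "Lower ";
        if (Char.IsPunctuation(ch)) result += "Punctuation ";
        if (Char.IsSymbol(ch)) result += "Symbol ";
        if (Char.IsControl(ch)) result += "Control ";
        if (Char.IsWhiteSpace(ch)) result += "WhiteSpace ";
        return result.Trim();
    }

    public static void Main()
    {
        Console.WriteLine(Describe('7'));       // Digit Number
        Console.WriteLine(Describe('\u00BD'));  // Number (the fraction one-half)
        Console.WriteLine(Describe('#'));       // Punctuation
        Console.WriteLine(Describe('$'));       // Symbol
        Console.WriteLine(Describe('\t'));      // Control WhiteSpace
    }
}
```

Note that every digit is also a number but not the reverse, and that a tab counts as both a control character and whitespace.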


Figure 3.1 Use the static methods in the System.Char structure to inspect character attributes.

The System.Char structure is designed to work with a single Unicode character. Because a char is a 16-bit (2-byte) value, its range is from 0 to 0xFFFF. For portability reasons in future systems, you can always check the upper bound of a char by using the MaxValue constant declared in the System.Char structure.

One thing to keep in mind when working with characters is to avoid the confusion of mixing char types with integer types. Characters have an ordinal value, which is an integer value used as a lookup into a table of symbols. One example of a table is the ASCII table, which contains 128 characters and includes the digits 0 through 9, letters, punctuation symbols, and formatting characters. The confusion lies in the fact that the digit 6, for instance, has an ordinal char value of 0x36. Therefore, the line of code meant to initialize a character to the digit 6

char ch = (char) 6;

is wrong because the character it actually produces is ^F, the ACK (acknowledge) control character used in modem handshaking protocols. Writing this value to the console does not display the 6 that you were looking for. You can initialize the variable correctly in two ways. The first is

char ch = (char) 0x36;

which produces the desired result and prints the digit 6 to the console if passed to the Console.Write method. However, unless you have the ASCII table memorized, this approach can be cumbersome. The second, simpler way is to place the character between single quotes:

char ch = '6';
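The following sketch puts the three initializations side by side and shows the reverse conversion, from a digit character back to its integer value. The class name CharOrdinalDemo and the helper DigitToInt are hypothetical names introduced for illustration.

```csharp
using System;

public class CharOrdinalDemo
{
    // Hypothetical helper: converts a digit character such as '6' to the
    // integer 6. Returns -1 for characters outside '0' through '9'.
    public static int DigitToInt(char ch)
    {
        if (ch < '0' || ch > '9')
            return -1;

        return ch - '0';
    }

    public static void Main()
    {
        char fromLiteral = '6';
        char fromOrdinal = (char) 0x36;
        char control = (char) 6;

        Console.WriteLine(fromLiteral == fromOrdinal);  // True
        Console.WriteLine((int) fromLiteral);           // 54
        Console.WriteLine(Char.IsControl(control));     // True
        Console.WriteLine(DigitToInt('6'));             // 6
        Console.WriteLine((int) Char.MaxValue);         // 65535
    }
}
```

Subtracting '0' works because the digits 0 through 9 occupy consecutive ordinal values starting at 0x30.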