This chapter is from the book

Sorting with Foreign Languages

<?php
  function compare($a, $b) {
    if ($a == $b) {
      return 0;
    } else {
      for ($i = 0; $i < min(strlen($a), strlen($b)); $i++) {
        $cmp = compareChar(substr($a, $i, 1), substr($b, $i, 1));
        if ($cmp != 0) {
          return $cmp;
        }
      }
      // No difference in the common prefix: the longer string is "greater"
      return strlen($a) - strlen($b);
    }
  }

  function compareChar($a, $b) {
    // ...
  }

  $a = array('Frédéric', 'Froni', 'Frans');
  usort($a, 'compare');
  echo implode(' < ', $a);
?>

Sorting an Array with Language-Specific Characters (languagesort.php; excerpt)

Sorting works well as long as only the standard ASCII characters are involved. However, as soon as language-specific characters come into play, the result is usually not what you want. For instance, calling sort() on an array with the values 'Frans', 'Frédéric', and 'Froni' puts 'Frédéric' last because the strings are compared by character code, and the é character has a much larger character code than o.
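The effect can be reproduced with a few lines. This is only a minimal sketch; the variable name $names is chosen for illustration, and the file is assumed to be saved in either UTF-8 or ISO-8859-1 (in both encodings, é starts with a byte larger than o):

```php
<?php
  $names = array('Frans', 'Frédéric', 'Froni');
  // sort() compares these strings by character code, not by alphabet
  sort($names);
  echo implode(' < ', $names);  // Frans < Froni < Frédéric
?>
```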

For this special case, PHP offers no built-in sorting function; however, you can use usort() with a custom comparison function to emulate this behavior. The idea is to define a new order for some special characters; in the comparison function, you then use this order to find out which character is “larger” and which is “smaller.”

You first need a function that can sort single characters:

  function compareChar($a, $b) {
    $characters = 'AÀÁÄBCÇDEÈÉFGHIÌÍJKLMNOÒÓÖPQRSTUÙÚÜVWXYZ';
    $characters .= 'aàáäbcçdeèéfghiìíjklmnoòóöpqrstuùúüvwxyz';
    $pos_a = strpos($characters, $a);
    $pos_b = strpos($characters, $b);
    if ($pos_a === false) {
      if ($pos_b === false) {
        return 0;
      } else {
        return 1;
      }
    } elseif ($pos_b === false) {
      return -1;
    } else {
      return $pos_a - $pos_b;
    }
  }

Then, the main sorting function calls compareChar(), character for character, until a difference is found. If no difference is found, the longer string is considered to be the “greater” one. If both strings are identical, 0 is returned. The code at the beginning of this phrase shows the compare function. The result of this code is, as desired, Frans < Frédéric < Froni.
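Note that strlen(), substr(), and strpos() work on bytes, so the approach above relies on a single-byte encoding such as ISO-8859-1. If the script and its data are UTF-8, the same idea might be sketched with the mbstring functions. This is an illustration under stated assumptions, not part of the original listing: ext/mbstring is assumed to be loaded, and the names compareCharMb() and compareMb() are hypothetical.

```php
<?php
  // Assumption: this file and all strings are UTF-8; ext/mbstring is loaded.
  function compareCharMb($a, $b) {
    $characters = 'AÀÁÄBCÇDEÈÉFGHIÌÍJKLMNOÒÓÖPQRSTUÙÚÜVWXYZ'
                . 'aàáäbcçdeèéfghiìíjklmnoòóöpqrstuùúüvwxyz';
    // mb_strpos() returns a character position, not a byte offset
    $pos_a = mb_strpos($characters, $a, 0, 'UTF-8');
    $pos_b = mb_strpos($characters, $b, 0, 'UTF-8');
    if ($pos_a === false) {
      // Characters outside the list sort after all listed characters
      return ($pos_b === false) ? 0 : 1;
    }
    if ($pos_b === false) {
      return -1;
    }
    return $pos_a - $pos_b;
  }

  function compareMb($a, $b) {
    $len_a = mb_strlen($a, 'UTF-8');
    $len_b = mb_strlen($b, 'UTF-8');
    $limit = min($len_a, $len_b);
    for ($i = 0; $i < $limit; $i++) {
      $cmp = compareCharMb(
        mb_substr($a, $i, 1, 'UTF-8'),
        mb_substr($b, $i, 1, 'UTF-8')
      );
      if ($cmp != 0) {
        return $cmp;
      }
    }
    // Common prefix is identical: the longer string is "greater"
    return $len_a - $len_b;
  }

  $names = array('Frédéric', 'Froni', 'Frans');
  usort($names, 'compareMb');
  echo implode(' < ', $names);  // Frans < Frédéric < Froni
?>
```

Passing the encoding explicitly to each mb_ function keeps the sketch independent of the mbstring.internal_encoding setting.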

Starting with PHP 5.3, the ext/intl extension provides a mechanism for natural language sorting, as well. If the extension is installed (which may additionally require the ICU library set from http://site.icu-project.org/), you first need to create a so-called collator, which expects a locale (for instance, en_US, en_CA, en_GB, fr_FR, or de_AT). Then, you can call the sort() and asort() methods, which work analogously to their PHP counterparts but take the locale information into account:

$a = array('Frédéric', 'Froni', 'Frans');
$coll = new Collator('fr_FR');
$coll->sort($a);
echo implode(' < ', $a);

Sorting an Array with Locale Information (collator.php)
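The asort() method mentioned above can be used in the same way when the array keys need to stay attached to their values. The following is a minimal sketch, not taken from the book's listings: the numeric keys are hypothetical IDs, and the class_exists() check is one possible way to guard against a missing ext/intl.

```php
<?php
  // Hypothetical IDs as keys, used only for illustration
  $names = array(10 => 'Frédéric', 20 => 'Froni', 30 => 'Frans');

  if (class_exists('Collator')) {
    $coll = new Collator('fr_FR');
    // Sorts by value according to the locale and preserves the keys
    $coll->asort($names);
  } else {
    // ext/intl is missing: plain asort() ignores the locale,
    // so 'Frédéric' ends up last here
    asort($names);
  }

  foreach ($names as $id => $name) {
    echo $id . ': ' . $name . "\n";
  }
?>
```

With ext/intl available, the values should come out in the order Frans, Frédéric, Froni, with the keys 30, 10, and 20 still attached.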
