UNIX Disk Usage

This chapter is from the book

Simplifying Analysis with sort

The output of du has been very informative, but it's difficult to scan a listing to ascertain the four or five largest directories, particularly as more and more directories and files are included in the output. The good news is that the Unix sort utility is just the tool we need to sidestep this problem.

Task 3.3: Piping Output to sort

Why eyeball page after page of listings when Unix has tools that can pick out the biggest and smallest entries for us? One of the great analysis tools in Unix is sort, even though you rarely see it mentioned in other Unix system administration books.

  1. At its most obvious, sort alphabetizes output:

    # cat names
    Linda
    Ashley
    Gareth
    Jasmine
    Karma
    # sort names
    Ashley
    Gareth
    Jasmine
    Karma
    Linda

    No rocket science about that! However, what happens if the output of du is fed to sort?

    # du -s * | sort
    0      gif.gif
    10464  IBM
    13984  Lynx
    16     Exchange
    196    DEMO
    3092   Gator
    36     CraigsList
    412    bin
    48     elance
    4      badjoke
    4      badjoke.rot13
    4      browse.sh
    4      buckaroo
    4      getmodemdriver.sh
    4      getstocks.sh
    4      gettermsheet.sh
    76     CBO_MAIL
    84     etcpasswd

    Sure enough, it's sorted. But probably not as you expected: sort compared the lines as text, character by character, so 10464 lands ahead of 16 because the character 0 sorts before the character 6. Not good.

  2. That's where the -n flag is a vital addition: with -n specified, sort treats the number at the start of each line as a numeric value and orders the lines by that value:

    # du -s * | sort -n
    0      gif.gif
    4      badjoke
    4      badjoke.rot13
    4      browse.sh
    4      buckaroo
    4      getmodemdriver.sh
    4      getstocks.sh
    4      gettermsheet.sh
    16     Exchange
    36     CraigsList
    48     elance
    76     CBO_MAIL
    84     etcpasswd
    196    DEMO
    412    bin
    3092   Gator
    10464  IBM
    13984  Lynx

    A much more useful result, if I say so myself!

  3. The only thing I'd like to change in the sorting here is that I'd like to have the largest directory listed first, and the smallest listed last.

    The order of a sort can be reversed with the -r flag, and that's the magic needed:

    # du -s * | sort -nr
    13984  Lynx
    10464  IBM
    3092   Gator
    412    bin
    196    DEMO
    84     etcpasswd
    76     CBO_MAIL
    48     elance
    36     CraigsList
    16     Exchange
    4      gettermsheet.sh
    4      getstocks.sh
    4      getmodemdriver.sh
    4      buckaroo
    4      browse.sh
    4      badjoke.rot13
    4      badjoke
    0      gif.gif

    One final concept and we're ready to move along. If you want to see only the five largest files or directories in a specific directory, all you need to do is pipe the command sequence to head:

    # du -s * | sort -nr | head -5
    13984  Lynx
    10464  IBM
    3092   Gator
    412    bin
    196    DEMO

    This sequence of sort|head will prove very useful later in this hour.
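
The difference between the three sort invocations in the steps above can be reproduced without du, which makes it easy to experiment. The following is only a minimal sketch: the sample numbers and the helper name sizes are hypothetical, LC_ALL=C is added to pin the text ordering so the result does not depend on locale, and head -n 2 is the POSIX spelling of the head -2 style used in this task.

```shell
# Hypothetical sample sizes, loosely modeled on the du listings above.
sizes() {
    printf '%s\n' 16 10464 4 196 3092
}

# Plain sort compares lines as text, so "10464" lands ahead of "16".
# LC_ALL=C pins the character ordering (an assumption added for repeatability).
sizes | LC_ALL=C sort

# -n compares the number at the start of each line instead.
sizes | sort -n

# -r reverses the order; head keeps only the first two lines.
sizes | sort -nr | head -n 2
```

With these sample values, the first pipeline prints 10464, 16, 196, 3092, 4; the second prints 4, 16, 196, 3092, 10464; and the third prints 10464 and 3092.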

A key concept with Unix is understanding how the commands are all essentially Lego pieces, and that you can combine them in any number of ways to get exactly the results you seek. In this vein, sort -rn (the same as the sort -nr used above, since the order of the flags doesn't matter) is a terrific piece, and you'll find yourself using it again and again as you learn more about system administration.
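
One way to snap these pieces together is to wrap the du, sort, and head pipeline in a small shell function. This is a sketch under stated assumptions rather than anything from the book: the function name largest and its two arguments are hypothetical, and -k is added to du so that sizes are reported in kilobytes regardless of the system's default block size.

```shell
# largest [DIR] [COUNT]: show the COUNT biggest entries in DIR, largest first.
# The name and arguments are hypothetical; both default (to "." and 5).
# The parentheses run the body in a subshell, so the cd does not
# change the caller's working directory.
largest() (
    cd "${1:-.}" || exit 1
    # Like the original command, the * glob skips dot files.
    du -sk -- * | sort -nr | head -n "${2:-5}"
)
```

A call such as largest . 3 would then list the three largest entries in the current directory, and largest with no arguments behaves like the du -s * | sort -nr | head -5 sequence shown earlier, apart from the fixed kilobyte units.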
