Simplifying Analysis with sort
The output of du has been very informative, but it's difficult to scan a listing to ascertain the four or five largest directories, particularly as more and more directories and files are included in the output. The good news is that the Unix sort utility is just the tool we need to sidestep this problem.
Task 3.3: Piping Output to sort
Why should we have to go through all the work of eyeballing page after page of listings when there are Unix tools to easily let us ascertain the biggest and smallest? One of the great analysis tools in Unix is sort, even though you rarely see it mentioned in other Unix system administration books.
At its most obvious, sort alphabetizes output:
# cat names
Linda
Ashley
Gareth
Jasmine
Karma
# sort names
Ashley
Gareth
Jasmine
Karma
Linda
No rocket science about that! However, what happens if the output of du is fed to sort?
# du -s * | sort
0       gif.gif
10464   IBM
13984   Lynx
16      Exchange
196     DEMO
3092    Gator
36      CraigsList
412     bin
48      elance
4       badjoke
4       badjoke.rot13
4       browse.sh
4       buckaroo
4       getmodemdriver.sh
4       getstocks.sh
4       gettermsheet.sh
76      CBO_MAIL
84      etcpasswd
Sure enough, it's sorted. But probably not as you expected: it's sorted by the ASCII digit characters, so 10464 lands ahead of 16! Not good.
That's where the -n flag is a vital addition: With -n specified, sort will assume that the lines contain numeric information and sort them numerically:
# du -s * | sort -n
0       gif.gif
4       badjoke
4       badjoke.rot13
4       browse.sh
4       buckaroo
4       getmodemdriver.sh
4       getstocks.sh
4       gettermsheet.sh
16      Exchange
36      CraigsList
48      elance
76      CBO_MAIL
84      etcpasswd
196     DEMO
412     bin
3092    Gator
10464   IBM
13984   Lynx
A much more useful result, if I say so myself!
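To see the two behaviors side by side without running du, here is a minimal sketch that feeds three of the sizes from the listing above to sort, first without and then with -n. The variable names are made up for this illustration.

```shell
# Three sample sizes taken from the du listing (no du needed).
sizes='10464
16
196'

# Default sort compares character by character, so "10464" precedes "16"
# because '0' sorts before '6' in the second position.
lexical=$(printf '%s\n' "$sizes" | sort)

# With -n, each line's leading number is compared by value.
numeric=$(printf '%s\n' "$sizes" | sort -n)

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$lexical" "$numeric"
```

Without -n the order is 10464, 16, 196; with -n it becomes 16, 196, 10464.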
The only thing I'd like to change in the sorting here is that I'd like to have the largest directory listed first, and the smallest listed last.
The order of a sort can be reversed with the -r flag, and that's the magic needed:
# du -s * | sort -nr
13984   Lynx
10464   IBM
3092    Gator
412     bin
196     DEMO
84      etcpasswd
76      CBO_MAIL
48      elance
36      CraigsList
16      Exchange
4       gettermsheet.sh
4       getstocks.sh
4       getmodemdriver.sh
4       buckaroo
4       browse.sh
4       badjoke.rot13
4       badjoke
0       gif.gif
One final concept and we're ready to move along. If you want to see only the five largest files or directories in a specific directory, all you need to do is pipe the command sequence to head:
# du -s * | sort -nr | head -5
13984   Lynx
10464   IBM
3092    Gator
412     bin
196     DEMO
This sequence of sort|head will prove very useful later in this hour.
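If you find yourself typing the sort|head pair often, one option is to wrap it in a small shell function. The sketch below is only an illustration: top_n is a hypothetical name, not a standard command, and it uses head -n N, which is the POSIX spelling of the head -N form shown above. The sample data is copied from the du listing so the example runs anywhere.

```shell
# top_n: hypothetical helper, not a standard command.
# Reads "size name" lines on standard input and prints the N largest
# (default 5), biggest first.
top_n() {
    sort -nr | head -n "${1:-5}"
}

# Sample lines copied from the du listing, deliberately out of order.
sample='412 bin
13984 Lynx
196 DEMO
10464 IBM'

# Keep only the two largest entries.
top_two=$(printf '%s\n' "$sample" | top_n 2)
printf '%s\n' "$top_two"
```

On a live system the helper would simply take the place of the last two commands in the pipeline, as in du -s * | top_n 5.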
A key concept with Unix is understanding how the commands are all essentially Lego pieces, and that you can combine them in any number of ways to get exactly the results you seek. In this vein, sort -rn is a terrific piece, and you'll find yourself using it again and again as you learn more about system administration.
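As one more illustration of snapping the pieces together, the sketch below uses sort's -k option (a standard sort flag, though not one used elsewhere in this task) to order lines by size, largest first, and then break ties alphabetically by name. The sample lines come from the earlier listing, and the variable names are assumptions made for the example.

```shell
# Three lines from the du listing, tab-separated as du prints them.
sample=$(printf '4\tbrowse.sh\n16\tExchange\n4\tbadjoke\n')

# -k1,1nr : first field only, compared numerically, largest first
# -k2     : remaining ties broken by the name, in ascending order
by_size_then_name=$(printf '%s\n' "$sample" | sort -k1,1nr -k2)

printf '%s\n' "$by_size_then_name"
```

Here Exchange (16) comes first, and the two 4-block entries follow in name order, badjoke before browse.sh, rather than in the reversed order that plain sort -nr produces for ties.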