Mac OS X Unleashed

By John Ray and William C. Ray

Searching for Files, Directories, and More

Unix traditionally has provided very useful tools for searching for files by name and by content, and Apple has expanded on these by making available an interface into the Sherlock databases from the command line. Unix's traditional tools don't work from a database as Sherlock does, so they run a little slower. But they aren't hampered by needing a database to run, or by being only as current in their results as the last database update.

Finding Files: locate, find

Sometimes you want to find some files, but you are not sure where they are. There are two tools available to search for files: locate and find. If you know some of the name of a file, you can use the locate utility to try to find it.

For example, our user nermal looked earlier at a file called system.log. Does our machine have other files that have log in their name? You bet! The syntax for locate is

locate <pattern>

We encourage you to try the locate command for files with log in them (locate log) to see the output, but it is much too long to include here. locate searches a database of pathnames on the machine.

Further information on locate is in the command documentation table, Table 13.13.

Table 13.13. The Command Documentation Table for locate

locate Finds files.
locate <pattern>

            
 
Searches a database for all pathnames that match <pattern>. The database is rebuilt periodically and contains the names of all publicly accessible files.
Shell and globbing characters (*, ?, \, [, and ]) may be used in <pattern>, although they must be escaped from the shell. Preceding a character by \ eliminates any special meaning for it. No characters must be explicitly matched, including /.
As a special case, a pattern with no globbing characters (foo) is matched as (*foo*).
Useful files:
/var/db/locate.database Database
/usr/libexec/locate.updatedb Script to update database

A more powerful and more ubiquitous tool for finding files is find. It is much slower than the locate command because it actually searches the file system every time it's used, rather than consulting a database, but that also means it doesn't depend on a database for its information and the information is always completely up to date.

After running her search for files containing log in the name, our sample user nermal was overwhelmed by the results. However, she thinks that she might have heard that general system log files might be located in /usr or /var. To check whether what she recalls is correct, she decides to run find:

[localhost:~] nermal% find /var /usr -name \*log\*

     /usr/bin/logger
     /usr/bin/login
     /usr/bin/logname
     /usr/bin/rcs2log
     /usr/bin/rlog
     /usr/bin/rlogin
     /usr/bin/slogin
     /usr/include/hfs/hfscommon/headers/CatalogPrivate.h
     /usr/include/httpd/http_log.h
     /usr/include/mach/mig_log.h
     /usr/include/netinet6/natpt_log.h
     /usr/include/php/ext/standard/php_ext_syslog.h
     /usr/include/php/main/logos.h
     /usr/include/php/main/php_logos.h
     /usr/include/php/main/php_syslog.h
     /usr/include/sys/syslog.h
     /usr/include/syslog.h
     /usr/libexec/emacs/20.7/powerpc-apple-darwin1.0/rcs2log
     /usr/libexec/httpd/mod_log_config.so
     /usr/libexec/rlogind
     /usr/sbin/logresolve
     /usr/sbin/rotatelogs
     /usr/sbin/sliplogin
     /usr/sbin/syslogd

     /usr/share/emacs/20.7/lisp/add-log.el
     /usr/share/emacs/20.7/lisp/add-log.elc
     /usr/share/emacs/20.7/lisp/gnus/gnus-logic.el
     /usr/share/emacs/20.7/lisp/gnus/gnus-logic.elc
     /usr/share/emacs/20.7/lisp/progmodes/prolog.el
     /usr/share/emacs/20.7/lisp/progmodes/prolog.elc
     /usr/share/emacs/20.7/lisp/rlogin.el
     /usr/share/emacs/20.7/lisp/rlogin.elc
     /usr/share/groff/font/devps/prologue
     /usr/share/init/tcsh/login
     /usr/share/init/tcsh/logout
     /usr/share/man/cat1/logger.0
     /usr/share/man/cat1/login.0

     /usr/share/man/cat1/logname.0
     /usr/share/man/cat1/rlog.0
     /usr/share/man/cat1/rlogin.0
     /usr/share/man/cat2/getlogin.0
     /usr/share/man/cat3/login.0
     /usr/share/man/cat3/Sys::Syslog.0
     /usr/share/man/cat3/syslog.0
     /usr/share/man/cat5/.k5login.0

     /usr/share/man/cat5/syslog.conf.0
     /usr/share/man/cat8/logresolve.0
     /usr/share/man/cat8/nologin.0
     /usr/share/man/cat8/rlogind.0
     /usr/share/man/cat8/rotatelogs.0
     /usr/share/man/cat8/sliplogin.0
     /usr/share/man/cat8/syslogd.0
     /usr/share/man/man1/logger.1
     /usr/share/man/man1/login.1
     /usr/share/man/man1/logname.1
     /usr/share/man/man1/rlog.1
     /usr/share/man/man1/rlogin.1
     /usr/share/man/man1/slogin.1
     /usr/share/man/man2/getlogin.2
     /usr/share/man/man3/login.3
     /usr/share/man/man3/Sys::Syslog.3
     /usr/share/man/man3/syslog.3
     /usr/share/man/man5/.k5login.5
     /usr/share/man/man5/syslog.conf.5
     /usr/share/man/man8/logresolve.8
     /usr/share/man/man8/nologin.8
     /usr/share/man/man8/rlogind.8
     /usr/share/man/man8/rotatelogs.8
     /usr/share/man/man8/sliplogin.8
     /usr/share/man/man8/syslogd.8
     /usr/share/vi/catalog

In the preceding statement, nermal searches /usr and /var. The results, though, do not include the system.log file that nermal knows user joray was looking at earlier. According to these results, there are many files in /usr that contain log, but nothing in /var. But nermal is sure that /var is the other possibility she has heard. So, she decides that maybe the problem has something to do with /var actually being a symbolic link in OS X. She adds another option, -H, for find to return information on the referenced file, rather than the link:

[localhost:~] nermal% find -H /var -name \*log\*

     find: /var/cron: Permission denied
     find: /var/db/dhcpclient: Permission denied
     find: /var/db/netinfo/local.nidb: Permission denied
     /var/log
     /var/log/ftp.log
     /var/log/ftp.log.0.gz
     /var/log/ftp.log.1.gz
     /var/log/lookupd.log
     /var/log/lookupd.log.0.gz
     /var/log/lookupd.log.1.gz
     /var/log/lpr.log
     /var/log/lpr.log.0.gz
     /var/log/lpr.log.1.gz
     /var/log/mail.log
     /var/log/mail.log.0.gz
     /var/log/mail.log.1.gz
     /var/log/netinfo.log
     /var/log/netinfo.log.0.gz
     /var/log/netinfo.log.1.gz
     /var/log/secure.log
     /var/log/system.log
     /var/log/system.log.0.gz
     /var/log/system.log.1.gz
     /var/log/system.log.2.gz
     /var/log/system.log.3.gz
     /var/log/system.log.4.gz
     /var/log/system.log.5.gz
     /var/log/system.log.6.gz
     /var/log/system.log.7.gz
     find: /var/root: Permission denied
     /var/run/syslog
     /var/run/syslog.pid
     find: /var/spool/lpd: Permission denied
     find: /var/spool/mqueue: Permission denied
     find: /var/spool/output: Permission denied
     find: /var/spool/printing/74FE197C-2928-11D5-AFCC.Q: Permission denied
     find: /var/spool/printing/95E0CC6A-2F6F-11D5-B43C.Q: Permission denied
     find: /var/spool/printing/B6E8E296-2EB1-11D5-98A5.Q: Permission denied
     /var/tmp/console.log

There, in the middle of that output, is the system.log file, as well as some additional files with system.log in their name. As we see from the output, nermal does not have permission to search everywhere, but find responds with information for areas where permissions permit it. nermal was lucky that her machine's logs appear to include log in the name. That is not the case on all systems.
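The Permission denied lines arrive on a separate error stream, so they can be discarded without losing any matches. The following is a minimal sketch, assuming a Bourne-style shell (sh, bash, or zsh) rather than the tcsh shown in the prompts above, which has no equivalent 2> redirection. The directory names are hypothetical stand-ins built in a temporary location.

```shell
#!/bin/sh
# Build a small throwaway tree with one unreadable directory.
top=$(mktemp -d)
mkdir -p "$top/var/log" "$top/var/root"
touch "$top/var/log/system.log"
chmod 000 "$top/var/root"    # unreadable, like /var/root in the listing above

# 2>/dev/null throws away the error stream, leaving only the matches.
found=$(find "$top/var" -name '*log*' 2>/dev/null | sort)
echo "$found"

# Restore permissions so the tree can be removed.
chmod 755 "$top/var/root"
rm -rf "$top"
```

Discarding errors is convenient, but it also hides the fact that parts of the tree were never searched, so it is worth running the command once without the redirection.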

There are numerous options available in find. In addition to being able to search on a pattern, find can also run searches based on ownership, file modification times, file access times, and much more. The complete syntax and options for find are in the command documentation table, Table 13.14.
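As a rough illustration of searching on something other than the name alone, the sketch below combines -type with -name and with -mtime. It is written for a Bourne-style shell, and the file and directory names are hypothetical examples created in a temporary location so that the results are predictable.

```shell
#!/bin/sh
# Build a small throwaway tree so the searches have known answers.
top=$(mktemp -d)
demo="$top/demo"
mkdir -p "$demo/logs"
touch "$demo/logs/system.log" "$demo/notes.txt"
# Back-date one file to 1 January 2001 so that it counts as old.
touch -t 200101010000 "$demo/logs/old.log"

# Directories only (-type d) whose name contains "log".
dirs=$(find "$demo" -type d -name '*log*')

# Regular files (-type f) last modified more than 30 days ago (-mtime +30).
old=$(find "$demo" -type f -mtime +30)

echo "$dirs"
echo "$old"
rm -rf "$top"
```

Primaries listed one after another are joined by an implicit -and, so each search above returns only the entries that satisfy both tests.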

Table 13.14. The Command Documentation Table for find

find Finds files.
find [-H | -L | -P] [-Xdx] [-f <file>] <file> ... <expression>

            
find recursively descends the directory tree of each file listing, evaluating an <expression> composed of primaries and operands.
Options
-H Causes the file information and file type returned for each symbolic link on the command line to be those of the file referenced, rather than those of the link itself. If the file does not exist, the information is for the link itself. File information of symbolic links not on the command line is that of the link itself.
-L Causes the file information and file type returned for each symbolic link to be those of the referenced file, rather than those of the link itself. If the referenced file does not exist, the information is for the link itself.
-P Causes the file information and file type returned for each symbolic link to be those of the link itself.
-X Permits find to be safely used with xargs. If a filename contains any delimiting characters used by xargs, an error message is displayed and the file is skipped. The delimiting characters include single quote, double quote, backslash, space, tab, and newline.
-d Causes a depth-first traversal of the hierarchy. In other words, directory contents are visited before the directory itself. The default is for a directory to be visited before its contents.
-x Excludes find from traversing directories that have a device number different from that of the file from which the descent began.
-h Causes the file information and file type returned for each symbolic link to be those of the referenced file, rather than those of the link itself. If the referenced file does not exist, the information returned is for the link itself.
-f Specifies a file hierarchy for find to traverse. File hierarchies may also be specified as operands immediately following the options listing.
Primaries (expressions)  
All primaries that can take a numeric argument allow the number to be preceded by +, -, or nothing. n takes on the following meanings:


  +n More than n
  -n Less than n
   n Exactly n

-atime n True if the file was last accessed n days ago. Note that find itself will change the access time.
-ctime n True if the file's status was changed n days ago.
-mtime n True if the file was last modified n days ago.
-newer <file> True if the current file has a more recent modification time than <file>.
-exec <command> ; True if <command> returns a zero-value exit status. Optional arguments may be passed to <command> . The expression must be terminated by a semicolon. If {} appear anywhere in the command name or arguments, they are replaced by the current pathname.
-follow Follows symbolic links.
-fstype True if the file is contained in a file system specified by -fstype. Issue the command sysctl vfs to determine the available types of file systems on the system. There are also two pseudo types: local and rdonly. local matches any file system physically mounted on the system where find is being executed; rdonly matches any mounted read-only file system.
-group <gname> True if the file belongs to the specified group name. If <gname> is numeric and there is no such group name, <gname> is treated as the group ID.
-user <uname> True if the file belongs to the user <uname>. If <uname> is numeric and there is no such user <uname>, it is treated as the user ID.
-nouser True if the file belongs to an unknown user.
-nogroup True if the file belongs to an unknown group.
-inum n True if the file has inode number n .
-links n True if the file has n links.
-ls Always true. Prints the following file statistics: inode number, size in 512-byte blocks, file permissions, number of hard links, owner, group, size in bytes, last modification time, and filename. If the file is a symbolic link, the display of the file it is linked to is preceded by ->. The display from this ls is identical to that displayed by ls -dgils.
-ok <command> Same as -exec, except that confirmation from the user is requested before executing <command> .
-name <pattern> True if the last component of the pathname (the filename) matches <pattern>. Special shell pattern matching characters ([, ], *, ?) may be used as part of <pattern>. A backslash (\) is used to escape those characters to explicitly search for them as part of <pattern>.
-path <pattern> True if the pathname matches <pattern>. Special shell pattern matching characters ([, ], *, ?) may be used as part of <pattern>. A backslash (\) is used to escape those characters to explicitly search for them as part of <pattern>. Slashes (/) are treated as normal characters and do not need to be escaped.
-perm [-] <mode> <mode> may be either symbolic or octal (see chmod). If <mode> is symbolic, a starting value of 0 is assumed, and <mode> sets or clears permissions without regard to the process's file mode creation mask. If <mode> is octal, only bits 0777 of the file's mode bits are used in the comparison. If <mode> is preceded by a dash (-), this evaluates to true if at least all the bits in <mode> are set in the file's mode bits. If <mode> is not preceded by a dash, this evaluates to true if the bits in <mode> match exactly the file's mode bits. If <mode> is symbolic, the first character may not be a dash.
-print0 Always true. Prints the current pathname followed by a null character.
-print Always true. Prints the current pathname followed by a newline character. If none of -exec, -ls, -ok, or -print0 is specified, -print is assumed.
-prune Always true. Does not descend into current file after the pattern has been matched. If -d is specified, -prune has no effect.
-size n [c] True if the file size, rounded up, is n 512-byte blocks. If c follows n , it is true if the file size is n bytes.
-type t True if the file is of the specified type. Possible file types are
 
  • W Whiteout
  • b Block special
  • c Character special
  • d Directory
  • f Regular file
  • l Symbolic link
  • p FIFO
  • s Socket
Operators  
Primaries may be combined using the following operators (in order of decreasing precedence).
( expression ) True if the parenthesized expression evaluates to true.
! expression True if the expression is false. (! is the unary NOT operator.)
expression [-and] expression
expression expression True if both expressions are true. The second expression is not evaluated if the first is false. (-and is the logical AND operator.)
expression -or expression True if either expression is true. The second expression is not evaluated if the first is true. (-or is the logical OR operator.)
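The operators and a few of the primaries above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive recipe: it uses a Bourne-style shell, it spells the OR operator as -o (the short POSIX form of -or), and the tree it searches is a hypothetical one built in a temporary directory.

```shell
#!/bin/sh
# Build a small throwaway tree with known contents.
top=$(mktemp -d)
tree="$top/tree"
mkdir -p "$tree/keep" "$tree/skip"
touch "$tree/keep/a.log" "$tree/keep/b.txt" "$tree/keep/c.gz" "$tree/skip/d.log"

# Parentheses group the OR test; each is preceded by \ so that the shell
# passes it through to find untouched.
either=$(find "$tree" \( -name '*.log' -o -name '*.gz' \) -print | sort)

# ! negates a primary: regular files whose names do NOT end in .log.
notlog=$(find "$tree" -type f ! -name '*.log' | sort)

# -prune keeps find out of the skip directory; the -o branch handles
# everything else and prints only the .log files.
pruned=$(find "$tree" -name skip -prune -o -name '*.log' -print)

# -exec runs a command on each match; {} is replaced by the pathname and
# the escaped semicolon ends the command.
names=$(find "$tree/keep" -name '*.log' -exec basename {} \;)

echo "$either"
echo "$notlog"
echo "$pruned"
echo "$names"
rm -rf "$top"
```

Note that the patterns are quoted so that the shell does not expand them before find sees them, which serves the same purpose as the backslashes in \*log\* earlier.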

Finding Files with Specific Contents: grep

Trying to remember what you've named a file that you need can sometimes be a real chore, especially if you haven't used the file for a long time, or its name is similar to many other files on your system. For situations like these, it is useful to be able to search for files based on patterns contained within the contents of the files themselves, rather than just the filenames. The basic syntax for grep is

grep <pattern> <files>

Here is a sample of using grep:

[localhost:~] joray% grep me file*

     grep: file1: Permission denied
     file2:It's me.  Doing some
     file3:Yep, me again..
     file4:me again
     file5:Another test by me...

In the preceding statement, we see that grep provides output as permissions permit. We also see that the default output lists only the filename and the lines containing the searched pattern. A number of options are available in grep. For example, we could ask grep to list the line numbers on which our pattern, me, appears in the files:

[localhost:~] joray% grep -n me file*

     grep: file1: Permission denied
     file2:2:It's me.  Doing some
     file3:2:Yep, me again..
     file4:6:me again
     file5:1:Another test by me...

Another available option is the recursive option, for descending a directory tree searching all the contents.
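A minimal sketch of a recursive search, assuming a Bourne-style shell and a hypothetical docs directory created in a temporary location, might look like this:

```shell
#!/bin/sh
# Create a directory with a subdirectory and a few small text files.
top=$(mktemp -d)
mkdir -p "$top/docs/sub"
printf 'first line\nIt is me.\n' > "$top/docs/file2"
printf 'nothing here\n' > "$top/docs/file3"
printf 'me again\n' > "$top/docs/sub/file4"

# -r descends into docs and every directory beneath it;
# -n adds the line number, as in the previous example.
matches=$(cd "$top" && grep -rn me docs | sort)
echo "$matches"
rm -rf "$top"
```

Each line of output carries the path of the file relative to the starting directory, so matches in subdirectories are easy to tell apart.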

The grep command is even more powerful than might be immediately apparent because it is also very useful for searching for patterns in the output of other commands. It could, for example, have been used to filter the rather verbose output from the preceding finds, to print out only the specific lines containing exact matches to the filename of interest. Although we haven't gotten to the syntax of the more complex matter of chaining Unix commands together to make sophisticated commands, keep grep in mind as a building block, and consider its possible uses when you reach the end of Chapter 14, "Advanced Shell Concepts and Commands."
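As a brief preview of that idea, the sketch below pipes the output of find into grep. It assumes a Bourne-style shell, and the filenames are hypothetical ones modeled on the earlier find listing and created in a temporary directory.

```shell
#!/bin/sh
# Create a few empty files whose names all contain "log".
top=$(mktemp -d)
mkdir -p "$top/tree/bin" "$top/tree/man"
touch "$top/tree/bin/login" "$top/tree/bin/syslogd"
touch "$top/tree/man/syslog.conf.5" "$top/tree/man/rlogin.1"

# find lists every name containing "log"; the pipe (|) hands that list to
# grep, which keeps only the lines that also contain "syslog".
filtered=$(cd "$top" && find tree -name '*log*' | grep syslog | sort)
echo "$filtered"
rm -rf "$top"
```

Here grep reads from standard input because no file is named, so it filters whatever the preceding command prints.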

The complete syntax and options for grep are shown in the command documentation table, Table 13.15.

Table 13.15. The Command Documentation Table for grep

grep, egrep, fgrep Prints lines matching a pattern.
grep [options] <pattern> <file1> <file2> ...
grep [options] [-e <pattern> | -f <file>] <file1> <file2>
            
grep searches the list of files enumerated by <file1> <file2> …, or standard input if no file is specified or if - is specified. By default, the matching lines are printed.
Two additional variants of the program are available as egrep (same as grep -E) or fgrep (same as grep -F).
-A <num> Prints <num> lines of trailing context after matching lines.
--after-context=<num> Same as -A <num>.
-a Processes a binary file as if it were a text file. Equivalent to --binary-files=text option.
--text Same as -a.
-B <num> Prints <num> lines of leading context before matching lines.
--before-context=<num> Same as -B <num>.
-C <num> Prints <num> lines of output context. Default is 2.
-<num> Same as -C <num>.
--context[=<num>] Same as -C <num>.
-b Prints the byte offset within the input file before each line of output.
--byte-offset Same as -b.
--binary-files=<type> Assumes a file is type <type> if the first few bytes of a file contain binary data.
Default <type> is binary, and grep normally outputs a one-line message indicating the file is binary, or nothing if there is no match.
If <type> is without-match, it is assumed that a binary file does not match. Equivalent to -I option.
If <type> is text, it processes the file as though it were a text file. Equivalent to -a option. Warning: Using this option could result in binary garbage being output to a terminal, some of which could be interpreted by the terminal as commands, resulting in unwanted side effects.
-I Assumes that a binary file does not match. Equivalent to --binary-files=without-match option.
-c Prints a count of matching lines for each file. Combined with -v, counts nonmatching lines.
--count Same as -c.
-v Inverts matching to select nonmatching lines.
--invert-match Same as -v.
-d <action> If input file is a directory, uses <action> to process it.
If <action> is read, grep reads directories as if they were normal files. This is the default.
If <action> is skip, grep silently skips directories.
If <action> is recurse, grep recursively reads files under the directory. Equivalent to -r.
--directories=<action> Same as -d <action>.
-r Recursively reads files under directories. Equivalent to -d recurse option.
--recursive Same as -r.
-f <file> Reads a list of patterns from <file>, which contains one pattern per line. An empty file has no patterns and matches nothing.
--file=<file> Same as -f <file>.
-e <pattern> Uses <pattern> as the pattern. Useful for protecting patterns beginning with -.
--regexp=<pattern> Same as -e <pattern>.
-G Interprets <pattern> as a basic regular expression. This is the default behavior.
--basic-regexp Same as -G.
-E Interprets <pattern> as an extended regular expression. Equivalent to egrep.
--extended-regexp Same as -E.
-F Interprets <pattern> as a list of fixed strings, separated by newlines, any of which is to be matched. Equivalent to fgrep.
--fixed-strings Same as -F.
-H Prints the filename for each match.
--with-filename Same as -H.
-h Suppresses filenames on output when multiple files are searched.
--no-filename Same as -h.
--help Displays a brief help message.
-i Ignores case in <pattern> and input files.
--ignore-case Same as -i.
-L Prints a list of files that do not have matches. Stops scanning after the first match.
-l Prints a list of files that contain matches.
--mmap If possible, uses mmap(2) system call rather than the default read(2) system call. Sometimes --mmap results in better performance. However, it can cause unexpected behavior, such as core dumps, if the file shrinks while grep is reading it or if an I/O error occurs.
-n Output includes the line number where the match occurs.
--line-number Same as -n.
-q Quiet. Suppresses normal output. Scanning stops on the first match. Also see the -s and --no-messages options.
--quiet Same as -q.
--silent Same as -q.
-s Suppresses error messages about nonexistent or unreadable files.
--no-messages Same as -s.
-V Prints the version number of grep to standard error. Include this version number in all bug reports.
--version Same as -V.
-w Selects only lines that have matches that form whole words.
--word-regexp Same as -w.
-x Selects only those matches that exactly match the whole line.
--line-regexp Same as -x.
-Z Outputs a zero byte (the ASCII NUL character) instead of the character that normally follows a filename. This option makes the output unambiguous, even for filenames containing unusual characters such as newlines.
--null Same as -Z.
-y Obsolete equivalent for -i.
-U Has no effect on platforms other than MS-DOS and MS Windows. On those platforms, -U treats files as binary files to affect how CR characters are handled.
--binary Same as -U.
-u Has no effect on platforms other than MS-DOS and MS Windows. On those platforms, reports Unix-style byte offsets; that is, with CR characters stripped off.
--unix-byte-offsets Same as -u.
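A few of the more commonly used options from the table can be tried out with the sketch below. It assumes a Bourne-style shell, and the two files and their contents are hypothetical examples written to a temporary directory.

```shell
#!/bin/sh
# Two small files; the second contains no "me" at all.
top=$(mktemp -d)
printf 'Me first\nsome text\nme again\n' > "$top/file2"
printf 'no match\n' > "$top/file3"

# -c counts matching lines; "some" contains "me", so it counts too.
plain=$(grep -c me "$top/file2")

# Adding -i ignores case, so "Me first" also matches.
nocase=$(grep -ci me "$top/file2")

# Adding -w requires "me" to stand alone as a whole word.
word=$(grep -cw me "$top/file2")

# -l prints only the names of the files that contain a match.
files=$(grep -l me "$top/file2" "$top/file3")

# -v inverts the test and prints the lines that do NOT match.
inverted=$(grep -v me "$top/file2")

echo "$plain $nocase $word"
echo "$files"
echo "$inverted"
rm -rf "$top"
```

The difference between the first and third counts shows why -w is useful when the pattern is a short word that also occurs inside longer words.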
