Web Search: the Principle of Onions

Searching the Internet can be a daunting task, particularly with more than four billion Web pages indexed and the number still growing, but Tara Calishain offers the "Principle of Onions" to help you find the information you need more efficiently.

The Principle of Onions

The Principle of Onions: When searching, it's better to start with very specific search queries and then get more and more general. If you start with more general queries you will tend to get overwhelmed with results.

Google indexes over four billion Web pages but admits that it doesn't index the entire Internet. The Internet Archive, a collection of older copies of Web pages, is by itself well over ten billion pages. And the Internet is getting larger, not smaller.

It's crucial that you start your Internet search right by structuring a query such that you get a limited number of results—otherwise you're going to get overwhelmed with information. If the very narrow query doesn't work, you should slowly get more and more general, until you achieve a good balance of useful—but not overwhelming—information. That's what the Principle of Onions is all about.
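If you like to think in code, the narrow-to-general loop can be sketched roughly as follows. This is only an illustration: `count_results` is a hypothetical function you would supply yourself (it is not part of any search engine's API), and the `max_results` threshold of 200 is an arbitrary stand-in for "not overwhelming." The sketch also assumes that each query in the list is more general than the one before it.

```python
def onion_search(queries, count_results, min_results=1, max_results=200):
    """Walk a list of queries ordered from most specific to most general.

    count_results is a caller-supplied (hypothetical) function that takes a
    query string and returns the number of hits it produced.
    Returns (query, hits) for the first query whose hit count falls between
    min_results and max_results, or None if no query strikes that balance.
    """
    for query in queries:
        hits = count_results(query)
        if hits < min_results:
            # Too narrow: peel back a layer and try the next, broader query.
            continue
        if hits <= max_results:
            # Useful but not overwhelming: stop here.
            return query, hits
        # Already too many results. Assuming later queries are broader,
        # they would only add more, so give up.
        break
    return None
```

You would call it with your own list of queries and your own way of counting hits, for example `onion_search(my_queries, my_counter)`.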

Of course, different types of search engines index different amounts of information. What is just narrow enough on a full-text engine is impossibly narrow on a searchable subject index. That's because a full-text engine indexes the entire text of a Web page, making searches on obscure keywords possible. Searchable subject indexes include only the title, URL, and description of a site, leaving little room for narrow searches.

How does this apply to the Principle of Onions? Let's take a look at four search scenarios and see how "narrow" means different things to the two kinds of search engine. For these scenarios we'll use Google as our full-text engine and Yahoo as our searchable subject index.

Searching for Lyrics

Are there really lyrics to that song you're listening to or is the singer just saying "Ugga oomp blat argh"? Now you can find out by searching for the lyrics online. Though the lyrics transcriptions provided by any given Web site are not guaranteed to be accurate, they'll get you a lot further than "ugga oomp."

Full-Text Engines

For full-text engines the onion is simple. Search for the name of the artist, if you know it, and one line of lyrics that you're absolutely sure of, the more distinctive the better. If you're not sure of a single word within a line, put the full-word wildcard (an asterisk, *) in its place. A nice narrow search therefore might look like this:

"they might be giants" "dot or is he a speck"

This search will get you results. But if it doesn't, start by removing the line of lyrics (you might be absolutely certain that you have the right lyrics, but you might be wrong) and substitute the name of the song:

"they might be giants" "particle man"

If that still doesn't work, drop the song title and search for the band name plus the words "lyric" and "search." If you're lucky you'll land on a lyrics search engine.

Do you see how you're moving from more specific to more general with just a few searches?
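The three steps above can be written down as a small "query ladder." The sketch below only builds the query strings, most specific first; it doesn't contact any search engine, and the function names `quote` and `lyrics_query_ladder` are made up for illustration.

```python
def quote(phrase):
    """Wrap a phrase in double quotes so the engine treats it as exact."""
    return '"' + phrase.strip().strip('"') + '"'


def lyrics_query_ladder(artist, lyric_line=None, song_title=None):
    """Return lyrics queries ordered from most specific to most general."""
    ladder = []
    if lyric_line:
        # Step 1: artist plus one line of lyrics you are sure of.
        ladder.append(f"{quote(artist)} {quote(lyric_line)}")
    if song_title:
        # Step 2: swap the lyric line for the song title.
        ladder.append(f"{quote(artist)} {quote(song_title)}")
    # Step 3: drop the title and look for a lyrics search engine.
    ladder.append(f"{quote(artist)} lyric search")
    return ladder


if __name__ == "__main__":
    for q in lyrics_query_ladder(
        "they might be giants",
        lyric_line="dot or is he a speck",
        song_title="particle man",
    ):
        print(q)
```

Try the queries in order and stop at the first one that gives you a manageable set of results.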

Searchable Subject Indexes

Searchable subject indexes are lousy for finding the lyrics to a single song. Instead, you'll want to use them to search for lyrics collections. Search for the name of the band and the word "lyrics":

"they might be giants" lyrics

Not only will that get you results, but you'll also discover that there's a category for that band, which will lead you to other resources if you want them.

The two techniques above will work for any band that's halfway well-known. But you may be interested in a band that has only a regional following. In that case you'll have to realize that "as specific as possible" is not going to be very specific. You may have to go to Google and just search for the band name and the word "lyrics."

