Organizing Content around Taxonomies
The organizational techniques just described provide the skeletal structure of well-designed portals: a logical organization that spans everything from individual pages to the entire portal. It is now time to turn our attention to the finer-grained structures needed to organize the content that surrounds that coarser-grained structure. In this section we discuss taxonomies, a commonly used organizing framework, and the facets around which they organize information.
Classifying Content with Taxonomies
Taxonomies are quickly gaining prominence as navigational tools in portals, and with good reason. Taxonomies, or classification schemes, provide a high-level view of the content and other resources available in a portal. Search tools are useful when we are looking for a targeted piece of information, but taxonomies provide an easy-to-use browsing method. Users do not have to know what terms to search for or even whether specific information exists. Taxonomies allow us to move quickly from high-level groupings (e.g., Business, Weather, Politics, and Sports) to narrow subjects (e.g., Marketing, Finance, and Investment).
While it is often easy to start constructing taxonomies, the process becomes more difficult as you move to more specific categories and realize there may be multiple ways to classify the same topic. This leads to the first rule of taxonomy development: There is not a single correct taxonomy. There are many. For example, suppose you want to find a speech by the president of the Federal Reserve Bank of Chicago using the Yahoo! directory (http://www.yahoo.com). You could find the bank's Web site in at least two different ways using the directory. One way is to start at Home and then follow the taxonomy based on organizational structure:
Home > Government > U.S. Government > Agencies > Independent > Federal Reserve System > Federal Reserve Banks
Alternatively, you could follow the taxonomy based on geographical organization:
Home > Regional > U.S. States > Illinois > Cities > Chicago > Community > Government > Federal Reserve Bank of Chicago
What seems like a logical organization is a function of how we think about a topic, not how the topic is organized according to a predefined scheme. We are not under the same constraints as librarians who have to manage physical assets. A book can be in only one location at a time, so librarians need to adopt a single arbitrary scheme (such as the Library of Congress Subject Headings or the Dewey Decimal System) to effectively manage these assets. Digital assets are easily categorized with multiple schemes. However, with this flexibility comes a new problem: integrating these multiple schemes.
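The idea that one digital asset can sit under several classification paths at once can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TaxonomyIndex class and its method names are hypothetical, and the two paths are adapted from the directory example above rather than taken from any real directory API.

```python
from collections import defaultdict


class TaxonomyIndex:
    """Hypothetical index that files a resource under any number of paths."""

    def __init__(self):
        self._by_path = defaultdict(set)      # path tuple -> resources
        self._by_resource = defaultdict(set)  # resource -> path tuples

    def classify(self, resource, path):
        """File a resource under one taxonomy path (a sequence of labels)."""
        path = tuple(path)
        self._by_path[path].add(resource)
        self._by_resource[resource].add(path)

    def browse(self, prefix):
        """Return every resource filed at or below the given path prefix."""
        prefix = tuple(prefix)
        found = set()
        for path, resources in self._by_path.items():
            if path[:len(prefix)] == prefix:
                found |= resources
        return found

    def paths_for(self, resource):
        """Return all paths that lead to a resource, in sorted order."""
        return sorted(self._by_resource.get(resource, set()))


index = TaxonomyIndex()
site = "Federal Reserve Bank of Chicago"

# Organizational path and geographical path to the same Web site.
index.classify(site, ["Home", "Government", "U.S. Government", "Agencies",
                      "Independent", "Federal Reserve System",
                      "Federal Reserve Banks"])
index.classify(site, ["Home", "Regional", "U.S. States", "Illinois",
                      "Cities", "Chicago", "Community", "Government"])

by_structure = index.browse(["Home", "Government"])
by_geography = index.browse(["Home", "Regional", "U.S. States", "Illinois"])
```

Both browse calls reach the same site. Unlike a book on a shelf, the resource is stored once and merely referenced from each scheme, which is also where the integration problem begins: nothing in this sketch reconciles the two schemes with each other.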
For those developers convinced that a taxonomy is needed for their portals, the next question is where to begin.
Portal developers have a number of options for building taxonomies:
Start with an existing third-party taxonomy.
Use enterprise structures (e.g., directory structures).
Use automated clustering.
Each has its benefits and drawbacks, but using a combination of these techniques can often meet most needs.
The quickest way to develop a taxonomy is to simply use an existing one. Publicly available classification schemes, such as the Library of Congress Subject Headings, cover a wide range of topics but may not be suited to commercial organizations because of their focus on comprehensive coverage of top-level topics. Industry- and discipline-specific taxonomies are widely available and often provide a good starting point. Remember to match the coverage of the taxonomy to your specific needs. For example, a taxonomy from an electrical engineering organization will work well for electrical engineers but may not work as well for teams that combine electrical engineering, computer science, and chemistry experts. Multidisciplinary teams tend to focus on particular problems (e.g., low power consumption circuit design), and taxonomies organized around those problems are better suited than discipline-centric ones. General business taxonomies are available from news aggregation services that have often developed the classification schemes for their own use. Even if a third-party taxonomy is not a perfect fit “out of the box,” it can be combined with schemes developed in-house.
For as long as we have had subdirectories in file systems, we have been categorically organizing content. Many organizations have large shared directories organized around business processes, organizational structures, and ad hoc practices. These directory structures are useful starting points for building taxonomies because they tend to reflect the way users, or at least some users, organize their work. When using network directory structures as a guide, we need to remember that some subdirectories are created for ad hoc tasks, some are used simply to share files much like an FTP site, and some are no longer used but continue to exist because of poor directory management practices. Nonetheless, within the sometimes sprawling directory structures we can find elements of organizational structures that reflect existing business processes.
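One way to mine a shared directory for candidate categories is sketched below. It assumes a plain list of file paths has already been collected; the paths, the candidate_categories function, and the min_files threshold are all hypothetical. Counting files is only a crude stand-in for judging whether a directory is ad hoc or abandoned, so the output is a list for human review, not a finished taxonomy.

```python
from collections import Counter
from pathlib import PurePosixPath


def candidate_categories(file_paths, min_files=2):
    """Count files under each directory; keep directories with enough files."""
    counts = Counter()
    for raw in file_paths:
        # Every ancestor directory of a file gets credit for that file.
        for parent in PurePosixPath(raw).parents:
            if str(parent) not in (".", "/"):
                counts[parent.parts] += 1
    return {parts: n for parts, n in counts.items() if n >= min_files}


# Hypothetical listing of a shared network directory.
paths = [
    "finance/budgets/q1.xls",
    "finance/budgets/q2.xls",
    "finance/audits/2002.doc",
    "marketing/campaigns/spring.ppt",
    "marketing/campaigns/fall.ppt",
    "tmp/bob-scratch/notes.txt",
]

candidates = candidate_categories(paths)
for parts in sorted(candidates):
    print(" > ".join(parts), candidates[parts])
```

Sparse directories such as the scratch folder fall below the threshold, but so does a legitimate category that happens to hold a single file, which is why the result still needs to be checked against how users actually organize their work.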
Automatic clustering of documents can also provide insight into the logical grouping of content. Basically, the process involves analyzing patterns within documents and grouping documents with similar patterns. This technique is useful when the logical grouping of documents is not clear, for example, when doing research in an unfamiliar domain. Clustering can definitely help discern the groups of documents, but it cannot be the sole technique used to define taxonomies. Not all groups identified by the clustering tool will make sense. Clustering tools can name the groups using terms that frequently occur in the member documents, but these are generally insufficient labels for end users.
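The grouping and labeling steps can be illustrated with a deliberately simple sketch. It assumes Jaccard similarity over word sets and a greedy single-pass grouping, which are stand-ins chosen for brevity and not the method of any particular clustering tool; the sample documents, stopword list, and threshold are hypothetical.

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "for", "in", "to", "on"}


def terms(text):
    """Reduce a document to its set of lowercase non-stopword terms."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def jaccard(a, b):
    """Share of terms two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold=0.2):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of (name, term_set) pairs
    for name, text in docs.items():
        t = terms(text)
        for members in clusters:
            if jaccard(t, members[0][1]) >= threshold:
                members.append((name, t))
                break
        else:
            clusters.append([(name, t)])
    return clusters


def label(members, k=2):
    """Name a cluster with its k most frequent terms (ties broken A to Z)."""
    counts = Counter(term for _, t in members for term in t)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


docs = {
    "d1": "quarterly budget forecast for finance",
    "d2": "finance budget review",
    "d3": "circuit design low power",
    "d4": "low power circuit layout",
}

clusters = cluster(docs)
groups = [[name for name, _ in members] for members in clusters]
labels = [label(members) for members in clusters]
```

With this data the two groups come out as expected, but the second one is labeled with the terms "circuit" and "low". That is a fair summary of word frequencies and a poor category name, which is the labeling weakness described above.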
When developing taxonomies, our primary focus must be on the way users think about their domain, not on what third-party experts have decreed and not on the output of automatic categorization algorithms. There is no “right” answer. The best taxonomies are the ones that match the users' model of the organization and its processes.
Taxonomies are typically, although not exclusively, hierarchical. Taxonomies allow us to think about topics in relation to broader and narrower topics. When we make these distinctions between broader and narrower topics we are doing it based on some overriding concept. For example, when we navigate from a point labeled “United States” to “Illinois” to “Chicago,” the overriding concept is geography. When we navigate from “United States” to “Federal Government” to “Supreme Court,” the overriding concept is government structure. Clearly we can categorize topics in different ways depending on our particular interest at the time.
Similar structures exist in OLAP applications. We can think about sales figures by product, by sales region, and by time. Traditionally, these organizing principles are called dimensions. In the world of taxonomies and content management, we refer to these as facets.
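The parallel between facets and OLAP dimensions can be made concrete with a small sketch. The documents, facet names, and helper functions below are hypothetical; each document carries one value per facet, and filtering or counting by a facet plays the role that slicing by a dimension plays for sales figures.

```python
from collections import Counter


def filter_by_facets(items, **wanted):
    """Keep items whose facet values match every requested facet."""
    return [
        item for item in items
        if all(item["facets"].get(f) == v for f, v in wanted.items())
    ]


def count_by_facet(items, facet):
    """Count items per value of one facet, much like a one-dimensional rollup."""
    return Counter(
        item["facets"][facet] for item in items if facet in item["facets"]
    )


# Hypothetical content items tagged along two facets.
documents = [
    {"title": "Speech by the bank president",
     "facets": {"geography": "Chicago", "government": "Federal Reserve System"}},
    {"title": "City council minutes",
     "facets": {"geography": "Chicago", "government": "Local"}},
    {"title": "Court opinion",
     "facets": {"geography": "Washington", "government": "Supreme Court"}},
]

in_chicago = filter_by_facets(documents, geography="Chicago")
local_chicago = filter_by_facets(documents, geography="Chicago",
                                 government="Local")
per_city = count_by_facet(documents, "geography")
```

Because each facet is stored independently, a user can start from geography, from government structure, or from both at once without the content being filed into a single fixed hierarchy.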