Home > Articles

Software Documentation: Identifying Authoritative Knowledge

When it comes to software documentation, we need to become our own curators, working on all the knowledge that is already there to turn it into something meaningful and useful. This chapter demonstrates how to start turning knowledge into something useful through curation, while embracing that this knowledge is continuously changing.

Save 35% off the list price* of the related book or multi-format eBook (EPUB + MOBI + PDF) with discount code ARTICLE.
* See informit.com/terms

This chapter is from the book

Remember that most of the knowledge related to a system is already there in that system—and there is a lot of it. One crucial way to exploit all that knowledge is through curation. The idea of curation is to select the few relevant bits of knowledge out of the ocean of data in the system, in order to help people working on it in their future work assignments. Because this system is always changing, it is safest to ensure that this curation evolves naturally, without any manual maintenance.

Dynamic Curation

In art exhibitions, the curator is as important as the director in a movie. In contemporary art, the curator selects and often interprets works of art. For example, the curator searches for prior work and places that inspired the artist, and he or she proposes a narrative or a structured analysis that links the selected works together in a way that transcends each individual piece. When a work that is essential for the exhibition is not in the collection, the curator will borrow it from another museum or from a private collection or may even commission the artist to create it. In addition to selecting works, the curator is responsible for writing labels and catalog essays and overseeing the scenography of the exhibition to help convey the chosen messages.

When it comes to documentation, we need to become our own curators, working on all the knowledge that is already there to turn it into something meaningful and useful.

Curators select works of art based on many objective criteria, such as the artist name, the date and place of creation of the works, or the private collectors who first bought the works. They also rely on more subjective criteria, such as the relationships to art movements or to major events in history, such as wars or popular scandals. The curator needs metadata about each painting, sculpture, or video performance. When the metadata is missing, the curator has to create it, sometimes by doing research.

Curation is something that you already do, perhaps without being aware of it. For example, when you are asked to demo an application to a customer or to a top manager, you have to choose just a few use cases and screens to show in order to convey a message, such as “everything is under control” or “buy our product because it will help you do your job.” If you have no underlying message, it’s likely that your demo with be an unconvincing mess.

Unlike in art exhibitions, in software development what we need is more like a living exhibition with content that adjusts according to the latest changes. As the knowledge evolves over time, we need to automate the curation on the most important topics.

Therefore: Adopt the mindset of a curator to tell a meaningful story out of all the available knowledge in the source code and artifacts. Don’t select a fixed list of elements. Instead, rely on tags and other metadata in each artifact to dynamically select the cohesive subset of knowledge that is of interest for the long term. Augment the code when the necessary metadata is missing and add any missing pieces of knowledge when they are needed for the story.

Curation is the act of selecting relevant pieces out of a large collection to create a consistent narrative that tells a story. It’s like a remix or a mashup. Curation is key for knowledge work like software development. Source code is full of knowledge about many facets of the development, and different parts of it are of varying degrees of importance. On anything bigger than a toy application, extracting knowledge from the source artifacts immediately overflows our regular cognitive capabilities with too many details, and the knowledge becomes meaningless and therefore useless (see Figure 5.1).


Figure 5.1 Too much information is as useless as no information

The solution is to aggressively filter the signal from the noise for a particular communication intent; as the little buffoon monster in Figure 5.1 says, “Too much information is as useless as no information.” What would be the noise from a particular perspective might be the signal from another perspective. For example, the method names are an unnecessary detail in an architecture diagram, but they might be important in a close-up diagram about how two classes interact, with one being an adapter to the other.

Curation as its core is the selection of pieces of knowledge to include or to ignore, according to a chosen editorial perspective. It’s a matter of scope. Dynamic curation goes one step further, with the ability to do the selection continuously on an ever-changing set of artifacts.

Examples of Dynamic Curation

A Twitter search is an example of automated dynamic curation, and it is a resource in itself that you can follow just as you would follow any Twitter handle. People on Twitter also do a manual form of curation when they retweet content they have (more or less) carefully selected according to their own editorial perspective (perhaps). A Google search is another example of simple automated curation.

As another example, selecting an up-to-date subset of artifacts based on a criterion is something we do every day when using an IDE:

  • Show every type with a name that ends with “DAO.”

  • Show every method that calls this method.

  • Show every class that references this class.

  • Show every class that references this annotation.

  • Show every type that is a subtype of this interface.

When a tag is missing to help select the pieces, you should introduce it with annotations, naming conventions, or any other means. When a piece of knowledge is missing, in order to show a complete picture, you need to add it in a just-in-time fashion.

Editorial Curation

Curation is an editorial act. Deciding on an editorial perspective is the essential step. There should be one and only one message at a time. A good message is a statement with a verb, like “No dependency is allowed from the domain model layer to the other layers” rather than just “Dependencies between layer,” where there is no message, and it’s up to the reader to guess what is meant. At a minimum, dynamic curation should be given an expressive name that reflects the intended message.

Low-Maintenance Dynamic Curation

Selecting subsets of knowledge can be hazardous if done in a rigid way. For example, a direct reference to a list of classes, tests, or scenarios will rapidly become obsolete and will require maintenance. It is a form of copy and paste, and it makes change more expensive; it also is subject to the risk of someone forgetting to update it. This is not a good practice, and it should be avoided at all cost.

You can describe the artifacts of interest in a stable way by using one of the stable selection criteria described here:

  • Folder organization: For example, “everything in the folder named ‘Return Policy’”

  • Naming conventions: For example, “every test with ‘Nominal’ in its name”

  • Tags or annotations: For example, “every scenario tagged as ‘WorkInProgress’”

  • Links registry over which you have control (which may need some maintenance from time to time, but at least it is in a central place): For example, “the URL registered under this shortlink”

  • Tool output: For example, “every file that has been processed by the compiler, as visible in its log”

When you use stable criteria, the work is done by tools that automatically extract the latest content that meets the criteria to insert it into the published output. Because it is fully automated, it can be run as often as possible—perhaps continuously on each build.

One Corpus of Knowledge for Multiple Uses

Everything can be curated—code, configuration, tests, business behavior scenarios, datasets, tools, data, and so on. All the knowledge available can be considered as a huge corpus, accessible via automated means for analysis and curated extractions.

Provided that the content of the knowledge corpus is adequately tagged, it is possible to extract by curation out of it a business view of a glossary (that is, a living glossary), a technical view of the architecture (that is, a living diagram), and any other perspective you can imagine, including the following:

  • Audience-specific content, such as business-readable content only versus technical details

  • Tasks-specific content, such as how to add one more currency

  • Purpose-specific content, such as an overview of content versus a references section

Curation is possible only to the extent that metadata about the source knowledge is available to enable relevant selection of material of interest.

Scenario Digests

Curation is not just about code; it’s also about tests and scenarios. A good example of dynamic curation is a scenario digest, in which the corpus of business scenarios is curated under various dimensions in order to publish reports tailored for particular audiences and purposes.

When a team makes use of BDD together with an automated tool such as Cucumber, a large number of scenarios are written in feature files. Not every scenario is equally interesting for everyone and for every purpose, so you need a way to do a dynamic curation of the scenarios, and for that you need to have the scenarios marked with a nicely designed system of tags. Remember from Chapter 2, “Behavior-Driven Development as an Example of Living Specifications,” that tags are documentation.

Each scenario can have tags like the following:

1  @acceptancecriteria @specs @returnpolicy @nominalcase
2  Scenario: Full reimbursement for return within 30 days
3  ...
5  @acceptancecriteria @specs @returnpolicy @nominalcase
6  Scenario: No reimbursement for return beyond 30 days
7  ...
9  @specs @returnpolicy @controversial
10 Scenario: No reimbursement for return with no proof of purchase
11  ...
13 @specs @returnpolicy @wip @negativecase
14 Scenario: Error for unknown return
15  ...

Note that almost all these tags are totally stable and intrinsic to the scenario they relate to. I say almost because @controversial and @wip (work in progress) are actually not meant to last too long, but they are convenient for a few days or weeks for easy reporting.

Thanks to all these tags, it is easy to extract only a subset of scenarios, by title only or complete with step-by-step descriptions. The following are some examples:

  • When meeting business experts who have very limited time, perhaps you could focus only on the information tagged @keyexample and @controversial:

    1 @keyexample or @controversial Scenarios:
    2 - Full reimbursement for return within 30 days
    3 - No reimbursement for return with no proof of purchase
  • When reporting to the sponsor about the progress, the @wip and @pending scenarios are probably more interesting for this audience, along with the proportion of @acceptancecriteria passing green:

    1 @wip, @pending or @controversial Scenarios:
    2 - Error for unknown return
  • When onboarding a new team member, going through the @nominalcase scenarios of each @specs section may be enough:

    1 @nominalcase Scenarios:
    2 - Full reimbursement for return within 30 days
    3 - No reimbursement for return beyond 30 days
  • Compliance officers want everything that is not @wip. However, even in that case, they might want to have the big document show a summary of the @acceptancecriteria first and the rest of the scenarios in addendum.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information

To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.


Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.


If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information

Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.


This site is not directed to children under the age of 13.


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information

If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information

Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents

California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure

Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact

Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice

We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020