1.3 Information systems

An information system is a collection of information that is, in essence, a model of some aspects of the world. It is of interest to its users because it can answer questions about these aspects of the world. Before the advent of computerized information systems the only way to find out if, for example, a book was available in a library or lent out was to go and look at the shelf where the book ought to be. If it was not there, it would be assumed that someone was currently reading it. Today, however, librarians will consult the library information system to see whether the book is available or not, and only check the information from the system against external reality (the shelf) if the reader insists.

An information system need not be digital: A paper encyclopedia, for example, is an information system that can answer a large number of questions when consulted by a human. This book, however, is written strictly with digital information systems in mind. These are usually used to store information about the world external to the computer, but not always. One exception might be the registries that many computer systems maintain of installed software and configuration information for that software.

1.3.1 Anatomy of classical information systems

Any information system exists as part of a larger context in which the system plays a specific role. In the case of the library, the information system will be consulted and updated by the librarians. This is its context, and its role is to help answer questions such as "what books does the library have," "where can I find this book," "is this book available or not," and so on.

Figure 1–3 shows a diagrammatic outline of what the library information system might look like. In the center, there is a data store of some kind, most likely a relational database. Around it are several applications which all access the central data repository, without being aware of each other. These applications are used by three different groups of people: the librarians, the readers, and the system administrators. This is how classical information systems have generally been structured. The exact structure varies from system to system, but in broad outline these are the features that most such systems have had.

Figure 1–3 The anatomy of the library information system

In such systems, the basis of the entire system is the schema used to define the internal data representation in the data store. The schema defines how data is stored in the data store, and this determines how applications can access and work with the data. The schema defines the structure of the data and lays down constraints on it. For example, the schema might say that the book ID code must be unique, that each row in the loan table must have valid book and reader ID codes, and so on. These rules are (usually) enforced by the data store, which means that even though there are many different applications, perhaps written by different people over a long period of time, one can be certain that none of the applications will violate these rules.
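As a minimal sketch of how a data store can enforce such rules, the following uses Python's built-in sqlite3 module. The table and column names are assumptions made for illustration; they are not taken from Figure 1–3. The point is only that the data store itself rejects the offending rows, regardless of which application tries to insert them.

```python
import sqlite3

# Hypothetical schema for the library example; all names are illustrative.
SCHEMA = """
CREATE TABLE book (
    book_id TEXT PRIMARY KEY,   -- book ID codes must be unique
    title   TEXT NOT NULL
);
CREATE TABLE reader (
    reader_id TEXT PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE loan (
    -- every loan must point at an existing book and an existing reader
    book_id   TEXT NOT NULL REFERENCES book(book_id),
    reader_id TEXT NOT NULL REFERENCES reader(reader_id)
);
"""


def open_library() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only checks foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the data store accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_library()
results = [
    try_insert(conn, "INSERT INTO book VALUES (?, ?)", ("b1", "Example Title")),
    try_insert(conn, "INSERT INTO book VALUES (?, ?)", ("b1", "Another Title")),  # duplicate ID
    try_insert(conn, "INSERT INTO reader VALUES (?, ?)", ("r1", "Example Reader")),
    try_insert(conn, "INSERT INTO loan VALUES (?, ?)", ("b1", "r1")),
    try_insert(conn, "INSERT INTO loan VALUES (?, ?)", ("b1", "r99")),  # unknown reader
]
print(results)  # [True, False, True, True, False]
```

The second and fifth inserts fail inside the data store, so an application that forgets to check for duplicates or dangling references still cannot corrupt the data.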

Another role played by the schema is that of documenting the structure of the data that the system manages as well as many of the assumptions made in the system design. Together with prose, the schema is very valuable as documentation, since it is concise, clear, and unambiguous.

It should also be noted that the schema plays a very important role in that it effectively defines the limits for what kinds of functionality can be supported by the applications in the system. For example, if the library information system does not record the Dewey classification code of each book, searching for books by their Dewey codes cannot be supported at all.

The information stored in a database is in a half-way state between liveness and suspendedness, not really being entirely in memory or entirely serialized. It should probably be considered to be live, since the application does not need to expend much effort to access the data and the data certainly are not serialized in the database. The data export application in Figure 1–3 would serialize information from the system into some kind of transport notation, whether for sending to other installations elsewhere or for backup purposes. Other than that, the library system does not really do any serialization or deserialization. It holds all its needed data internally and has little need for communication with the outside world, except through user interaction.
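What a data export application does can be sketched roughly as follows, assuming that the rows have already been read from the data store and that XML is the transport notation. The element and attribute names are invented for this illustration, and a real export would cover far more of the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as they might come out of the data store.
books = [
    {"id": "b1", "title": "Example Title", "available": True},
    {"id": "b2", "title": "Another Title", "available": False},
]


def export_books(rows: list) -> str:
    """Serialize live data into an XML transport notation."""
    root = ET.Element("library")
    for row in rows:
        book = ET.SubElement(root, "book", id=row["id"])
        book.set("available", "yes" if row["available"] else "no")
        ET.SubElement(book, "title").text = row["title"]
    return ET.tostring(root, encoding="unicode")


def import_books(xml_text: str) -> list:
    """Deserialize the transport notation back into live data."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": book.get("id"),
            "title": book.findtext("title"),
            "available": book.get("available") == "yes",
        }
        for book in root.findall("book")
    ]


exported = export_books(books)
print(exported)
print(import_books(exported) == books)  # True: the round trip loses nothing
```

The receiving installation, or a later restore from backup, would run the deserializing half to bring the information back to a live state.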

1.3.2 Formality in information systems

Digital information systems can usefully be divided into two categories: formal and informal systems. In formal systems the information follows strict rules, while informal systems are free-form. This division is not absolute, since systems can have varying degrees of formality, but a typical example of an informal information system might be a collection of word processor documents containing a list of the CDs available in a library in the form of prose.

Even though this collection of documents could be consulted by a human to find, for example, the number of songs in the CD collection, a computer would not be able to do the same, since it cannot read text and understand what it says. To enable a computer to answer this question, one would have to develop a formal information system to store the information in such a way that the computer, still without knowing what a song or a CD really is, could perform some simple operations that would result in the number of songs being counted.

Doing this, however, means formalizing the system and making it more rigid, which may be hard if the information in it has a very complex structure, or if that structure is poorly understood. Furthermore, a formal system will be harder to extend, since formal systems give much less flexibility in terms of how information is expressed. The benefit is much greater convenience in use through automation. For example, although a human might in theory count the songs on all the CDs in a library, that would require a large amount of manual work, while a properly designed digital information system could answer the question within seconds.
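The contrast can be made concrete with a small sketch. The prose note and the record structure below are both invented for illustration; the point is that once the information follows a fixed structure, counting songs becomes a mechanical operation that requires no understanding of what a song is.

```python
from dataclasses import dataclass

# Informal: a human can read this, but a program cannot reliably count from it.
informal_note = (
    "We also have Example Album, which has three songs on it, "
    "and Second Album with a couple more."
)


# Formal: every CD follows the same structure (hypothetical field names).
@dataclass(frozen=True)
class CD:
    title: str
    songs: tuple[str, ...]


collection = [
    CD("Example Album", ("Song A", "Song B", "Song C")),
    CD("Second Album", ("Song D", "Song E")),
]


def count_songs(cds: list) -> int:
    """Count songs by a simple operation on the structure alone."""
    return sum(len(cd.songs) for cd in cds)


print(count_songs(collection))  # 5
```

The price of this convenience is visible too: anything that does not fit the title-and-songs structure has nowhere to go until the structure itself is changed.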

Quite often, an organization will start out with a highly formal system, such as one for books in a library. After the system has been in use for a while, the library starts stocking CDs in addition to books, but since CDs do not fit in the information system (the structure being too specifically directed towards books) the list of CDs is kept in simple text documents instead.

Eventually, this solution is bound to become insufficient to support the number of CDs that the library accumulates. To solve this, the original information system is extended with support for CDs, and the information in the text documents is migrated to the larger system. From this point on, both CDs and books will be supported. Most large real-life information systems will at any point in their lifetime consist of a highly formal core with several smaller informal systems clustered around it. These informal systems will typically contain less data and often be only temporary in nature. Some of them, however, will grow and eventually need to be made more formal and be given applications of their own.

One of the strengths of XML is that it handles this situation well, since it can represent both relatively informal and quite formal data. XML information systems also tend to be easier to set up initially, and easier to change later, than their more formal competitors. XML data is generally less formal and less controlled than data in ordinary databases. With XML, checking validity is a separate operation, performed when necessary, and not something enforced by the data storage mechanism itself (except when an XML database with such functionality is used, which is relatively rare).
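The separation between storing or parsing XML and checking its validity can be sketched as follows. Python's xml.etree.ElementTree parser only checks well-formedness and does not validate against a DTD or schema, so the hand-written check_cd_list function below is a hypothetical stand-in for a real validator, and the element names are assumptions.

```python
import xml.etree.ElementTree as ET


def check_cd_list(root: ET.Element) -> list:
    """Hypothetical hand-written rules standing in for a DTD or schema."""
    problems = []
    if root.tag != "cds":
        problems.append("root element must be <cds>")
    for i, cd in enumerate(root.findall("cd"), start=1):
        if not (cd.findtext("title") or "").strip():
            problems.append(f"cd {i}: missing <title>")
        if not cd.findall("track"):
            problems.append(f"cd {i}: no <track> elements")
    return problems


loose = "<cds><cd><title>Example Album</title><note>ask at the desk</note></cd></cds>"
strict = "<cds><cd><title>Example Album</title><track>Song A</track></cd></cds>"

# Parsing succeeds for both documents: each is well-formed XML.
loose_root = ET.fromstring(loose)
strict_root = ET.fromstring(strict)

# Validity is checked separately, and only when someone asks for it.
print(check_cd_list(loose_root))   # ['cd 1: no <track> elements']
print(check_cd_list(strict_root))  # []
```

The loose document can be stored, exchanged, and processed as it is; nothing forces it through the check until the system has become formal enough to need one.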

1.3.3 Ontologies

To be able to formalize the system, one really should design a schema that defines the structure, but before a schema can be made there are two steps that need to be taken. Often, these are taken without being explicitly thought through, and this may even work well, but it is still worth knowing about the steps.

The first step is to decide which parts of the world are within the scope of the information system; these are often called the Universe of Discourse (UoD) for that particular information system. In the example above, the UoD would be the CD collection of some library, and implicitly, only the music CDs (since we mentioned songs) and not the CD-ROMs with software and data. The next step towards a schema is to analyze the UoD to find out what it consists of and which parts of it are considered interesting.

This analysis would result in what is called an ontology, which means a theory of reality. Such a theory of reality might state that our particular UoD consists of CDs, artists, and songs. This is a pretty naive theory, though, as it omits many interesting aspects of the UoD. For example, artists can be individual people, such as Mariss Jansons and Peter Gabriel, but also groups of people, such as the Oslo Philharmonic Orchestra and Genesis. Some artists have released music both individually and as part of a group of people (for example, Peter Gabriel was a member of Genesis until 1975, but released solo albums after that).

Another, and even subtler, problem arises when we try to count the songs in the CD collection, because we haven't decided what a song really is. For example, Peter Gabriel has released three different CDs that all contain a song titled Biko. Does this count as one song, or as three? The version on the album usually known as 3 is the original studio version, the version on Plays Live is a live recording, and the version on Shaking the Tree is indistinguishable from the original studio version on 3.

The complexity does not stop there, for these CDs are issued in slightly different versions in different countries, and records that were originally released as LPs are often re-released once on CD with poor quality and later remastered to much better quality. This produces CDs with identical titles and song listings, identical (or near-identical) covers, but with subtly different sounds.

Clearly, to be able to make a structured information system for something as messy as this, we need a theory of reality, an ontology that can tell us what is what. One such ontology already exists, and is known as IFLA FRBR, or Functional Requirements for Bibliographic Records, defined by the International Federation of Library Associations and Institutions. The specification can be found at http://www.ifla.org/VII/s13/frbr/frbr.pdf. This ontology deals with what it calls creations (not just music) and defines three main categories of creations:

manifestations

These are tangible creations that are either physical objects composed of atoms and molecules or digital objects consisting of bits and bytes. A CD and a track on a CD would both be manifestations, as would notes printed on paper.

performances

These are spatio-temporal creations, that is, creations that have taken place as events in space and time. A concert would be a typical example of a performance. If a performance is recorded somehow, that recording becomes a manifestation of the performance.

works

Works are the least tangible category of creations, being abstract creations. For example, if you think of a new melody, that becomes an abstract creation, and its existence will not be revealed until you either make a manifestation of it (by writing down the notes) or a performance of it (by humming it or singing it out loud).

With this ontology in hand, we can suddenly make sense of the confusion we suffered earlier. The question "How many songs are there?" was ill-posed, in the sense that we had not properly defined the term "song." Instead, we have three new terms, and occurrences of these we can count with confidence. So, Biko is a work, which has been performed in the studio and also live in concert. The three occurrences of the work are three different manifestations of two different performances of one work.
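The counting can be spelled out in a few lines, assuming the three categories as they are described above. The class and field names are invented for this sketch, and the Shaking the Tree track is treated as a manifestation of the studio performance, as the text does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str


@dataclass(frozen=True)
class Performance:
    work: Work
    setting: str  # hypothetical label, e.g. "studio" or "live"


@dataclass(frozen=True)
class Manifestation:
    performance: Performance
    album: str


biko = Work("Biko")
studio = Performance(biko, "studio")
live = Performance(biko, "live")

tracks = [
    Manifestation(studio, "3"),
    Manifestation(live, "Plays Live"),
    Manifestation(studio, "Shaking the Tree"),
]

# Each term can now be counted separately and unambiguously.
manifestations = len(tracks)
performances = len({t.performance for t in tracks})
works = len({t.performance.work for t in tracks})
print(manifestations, performances, works)  # 3 2 1
```

The single question about "songs" has become three questions, each with a definite answer.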

1.3.4 Information models

With the ontology in place, we can start to make an information model for our UoD. The information model is a detailed conceptual model of all the information in the system, including all types of items with their fields (or properties) and the relationships between them. For our example we could start by defining the item types CD, track, person, artist, and work (choosing to disregard performances) and then continue by defining the attributes of each and their relationships.
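One way to picture such a model is the sketch below, which writes the five item types as Python data classes. This is only one possible rendering: an information model is conceptual, and the attributes and relationships chosen here (an artist having persons as members, a track tying a work to an artist) are assumptions made for illustration, not definitions from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Artist:
    # An artist may be a single person or a group; membership over time
    # is deliberately ignored in this sketch.
    name: str
    members: tuple[Person, ...] = ()


@dataclass(frozen=True)
class Work:
    title: str


@dataclass(frozen=True)
class Track:
    work: Work
    artist: Artist


@dataclass
class CD:
    title: str
    tracks: list[Track] = field(default_factory=list)


gabriel = Person("Peter Gabriel")
solo = Artist("Peter Gabriel", members=(gabriel,))
genesis = Artist("Genesis", members=(gabriel,))  # other members omitted

cd = CD("3", [Track(Work("Biko"), solo)])


def artists_including(person: Person, artists: list) -> list:
    """Follow the person-to-artist relationship in the model."""
    return [artist.name for artist in artists if person in artist.members]


print(artists_including(gabriel, [solo, genesis]))  # ['Peter Gabriel', 'Genesis']
```

Separating person from artist is what lets the model say that the same individual has released music both alone and as part of a group.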

An information model differs from a schema in that the schema is defined in terms of a data model, while the information model is independent of any particular data model. In fact, part of the reason for making an information model is that the model is not plagued by the weaknesses of some data models, and this means that we can model the data more or less directly. The information model is generally created either informally, using some undefined data model, or it is created using some formal modelling language. Among the possibilities are the Entity-Relationship (ER) language, Object Role Modeling (ORM), and Unified Modeling Language (UML). Some people also use the EXPRESS schema language, since it is so powerful that even though one doesn't plan to use EXPRESS in the system to be developed, EXPRESS can serve to define the information model.

Once all the item types, their attributes, and relationships were worked out and clearly defined, we would have an information model for the information system. This would not be something that could be used directly to generate programs or to configure software to manage the system for us, but would be a conceptual specification that could serve as documentation for the system. Typically, the developers of the software components in the system would use the information model as guidance when developing the components, and it would also be used to set up any central data repositories such as a database.

To make a schema for the system, the developers would need to select a schema language and express the information model in terms of the data model used by that schema language. This step often involves more than a simple reformulation of the information model, since changes may prove necessary for various kinds of performance reasons. Generally, the information model is designed to be easy to understand, while the schema must be designed to be efficient.

1.3.5 Summary

To briefly reiterate the terms introduced in this section, an information system is a model of a subset of the external world known as the Universe of Discourse. The basis for the model is an ontology, a theory of reality, based on which a conceptual information model describing the detailed structure of the system is created. The information model is then turned into a schema for the data model used by the system (or possibly more than one schema, if the system uses more than one data model).
