Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

English Query - Part 5

Last updated Mar 28, 2003.

In this tutorial on the series on Natural Language Processing, we'll finalize the design for our simple application for SQL Server's English Query. I've shown you how to follow the process for developing an English Query application that I explained in the first tutorial in this series. Through the tutorials that followed we gathered the requirements, and examined the physical structure of the database to ensure that we could get the information back to the user. Then we created the semantic and physical models within English Query using a Wizard that pulled in most of the Entities and Relationships that we needed to answer the user's questions.

Now we're to the point where we need to complete the design for our project and ensure that we can answer all of the questions our users presented to us early in the design. To refresh your memory, here are those questions:

  • Who are the authors?
  • How many books has author (Name) written?
  • What books have they written?
  • Who sold the most books?

Remember that we checked the physical database using Transact-SQL (T-SQL) code to ensure that we have that information, so the system should be able to provide those answers to the users. Since we have all of the data in the database and we've "told" the English Query system where that information is, the application should "know" the answers. I've put quotes around two words in the last sentence to show how easy it is to be deceived by the English Query engine. As technical developers we're not immune to assigning the computer intelligence, even when it doesn't have any. The system, amazing as it is, only knows what we code it to know. It's vital that we keep this in mind, so that we can arm the applications with as many hints as possible to answer the user's questions.

Let's examine what answers we can get back from the system as it stands. We'll do this in two phases. First, we'll ask the system the questions from the list, and make notes about the ones that the system can't answer. We'll then refine the design until it can. In the second phase we'll ask the same questions in multiple ways, and see if the application returns the correct data. In effect we'll "train" the system so that it can recognize the various question formats as the same.

Let's begin with the questions in the list. Just as in the previous tutorial, I've opened the English Query application and accessed the pubs project.

I then ran the application and asked it the questions from the list. Here are the results from my test system:

Question

Result

Who are the authors?

Correct

How many books has author (Name) written?

Not Correct

What books have they written?

Not Correct

Who sold the most books?

Not Correct

Ah, well – one out of four isn't bad for a first pass. The wizard did a fair job of discovering most of the Entities and Relationships. For the corrections, let's focus on the last question for this exercise. Once you understand the process you can use to answer it, you can examine the other questions and correct the model the same way.

Here's a shot of the question and answer session for the last question on my test system:

This shows the resulting message the users get when they ask this question. They'll get this response not only when the Entities and Relationships aren't defined, but also when you haven't included enough synonyms and other attributes for the Entities. That's part two of our review.

So where do we start? We know that the system doesn't understand the Entities as they relate to the question or it doesn't have a Relationship that can get to the answer. We can define those by hand, using the same tools I showed you in the last tutorial. Instead, I like to start off with another Wizard, built right into the application panel we're in. The icon to launch the Wizard looks like a magic wand, just to the right of the table output icon in the icon bar. Once you click that, you're shown a panel similar to this one:

When the panel first comes up, the top two items are blank. Let's take a look at each of the fields. The first question the system asks is what the word "Who" refers to. When you pull down that menu list, you're shown all of the Entities that you have in your model. In our case that's quite a few, since we started by using the Wizard and it created dozens of them for us. Here's the concept you need to keep in mind: English Query objects build on each other. We are going to create more Entities and Relationships based on what we already have.

If you find that your model doesn't have an Entity you need to answer this question, you can click the Entity icon that looks similar to a green diamond to the right of the selection. From there you can build your semantic to physical mappings from scratch. If you find at that time that the database doesn't have the data you need to build the object, you'll have to create it using one of the techniques I mentioned in the last tutorial. In this case I set "Who" to be equal to authors.

You need to be careful in this selection, however, since your application may have more than one "Who" answer. Is the "Who" referring to the publisher, the store clerk or the author? In this case, I'm setting the answer to be the author, but it's important that the users know that, or they'll operate on bad data. This is the main reason you need to think your model design through very carefully, and always go back to the users to verify the requirements.

The second item refers to the Relationships that are present in the question. In this example the system correctly guesses that "books" are really the numbers that identify them (title id's) and that authors sell them. The last checkbox tells the system to build or modify the Relationships and Entities. Once you click the OK button you'll get a screen similar to this one:

The system has built a new Relationship from the question, and dropped us in the design area. After I review the Relationship the Wizard created I run the application again and ask the same question:

This time the answer is a success, even determining that multiple authors have sold the same amount of books.

Now we move on to phase two of the review. What are some of the other ways users can ask the same questions? You need to think of as many as you can, write them down, and then try the application against them. As the system finds the correct answer, switch over to the Analysis tab and examine the objects it used to find the answer.

Locating the Entities and Relationships this way will help you when the system doesn't answer a question. When the system doesn't find the answer, you'll have to work with either the Entities by adding more or editing their attributes or you'll have to change or add Relationships. Learning what it uses to answer the question one way will help you design the synonyms or other edits you need to correct the problem.

We've worked through a very simple exercise of creating an English Query application using only Wizards and editing the objects they create. In an actual application, you'll find yourself creating Entities and Relationships by hand. I still advise that you start with a fully Wizard-created application and work up to the more complex models.