3.3 Internet Applications

In the latter part of the 1990s, if the press wasn't talking about the Microsoft/Java wars, it was talking about the Internet. The Internet was a people's revolution and no vendor has been able to dominate the technology. Within IT, the Internet has changed many things, for instance:

  • It hastened (or perhaps caused) the dominance of TCP/IP as a universal network standard.

  • It led to the development of a large amount of free Internet software at the workstation.

  • It inspired the concept of thin clients, where most of the application is centralized. Indeed, the Internet has led to a return to centralized computer applications.

  • It led to a new fashion for data to be formatted as text (e.g., HTML and XML). The good thing about text is that it can be read easily and edited by a simple editor (such as Notepad). The bad thing is that it is wasteful of space and requires parsing by the recipient.

  • It changed the way we think about security (discussed in Chapter 10).

  • It liberated us from the notion that specific terminals are of a specific size.

  • It led to a better realization of the power of directories, in particular the Domain Name System (DNS) for translating the host names used in Web addresses (i.e., URLs) into network (i.e., IP) addresses.

  • It led to the rise of intranets—Internet technology used in-house—and extranets—private networks between organizations using Internet technology.

  • It has to some extent made people realize that an effective solution to a problem does not have to be complicated.

Internet applications differ from traditional applications in at least five significant ways.

First, the user is in command. In the early days, computer input was by command strings and the user was in command. The user typed and the computer answered. Then organizations implemented menus and forms interfaces, where the computer application was in command. The menus guide the user by giving them restricted options. Menus and forms together ensure work is done only in one prescribed order. With the Internet, the user is back in command in the sense that he or she can use links, Back commands, Favorites, and explicit URL addresses to skip around from screen to screen and application to application. This makes a big difference in the way applications are structured and is largely the reason why putting a Web interface on an existing menu and forms application may not work well in practice.

Second, when writing a Web application you should be sensitive to the fact that not all users are equal. They don't all have high-resolution, 17-inch monitors attached to 100Mbit or faster Ethernet LANs. Screens are improving in quality but new portable devices will be smaller again. And in spite of the spread of broadband access to the Internet, there are, and will continue to be, slow telephone-quality lines still in use.

Third, you cannot rely on the network address to identify the user, except over a short period of time. On the Internet, the IP address is assigned by the Internet provider when someone logs on. Even on in-house LANs, many organizations use dynamic address allocation (the DHCP protocol), and every time a person connects to the network he or she is liable to get a different IP address.

Fourth, the Internet is a public medium and security is a major concern. Many organizations have built a security policy on the basis that (a) every user can be allocated a user code and password centrally (typically the user is given the opportunity to change the password) and (b) every network address is in a known location. Someone logging on with a particular user code at a particular location is given a set of access rights. The same user at a different location may not have the same access rights. We have already noted that point (b) does not hold on the Internet, at least not to the same precision. Point (a) is also suspect; it is much more likely that user code security will come under sustained attack. (We discuss these points when we discuss security in Chapter 10.)

Fifth and finally, it makes much more sense on the Internet to load a chunk of data, do some local processing on it, and send the results back. This would be ideal for filling in big forms (e.g., a tax form). At the moment these kinds of applications are handled by many short interactions with the server, often with frustratingly slow responses. We discuss this more in Chapters 6 and 13.

Most nontrivial Web applications are implemented in a hardware configuration that looks something like Figure 3-4.

Figure 3-4 Web hardware configuration

You can, of course, amalgamate the transaction and database server with the Web servers and cut out the network between them. However, most organizations don't do this, partly because of organizational issues (e.g., the Web server belongs to a different department). But there are good technical reasons for making the split, for instance:

  • You can put a firewall between the Web server and the transaction and database server, thus giving an added level of protection to your enterprise data.

  • It gives you more flexibility to choose platforms and technology for the Web servers that differ from those of the back-end servers.

  • A Web server often needs to access many back-end servers, so there is no obvious combination of servers to bring together.

Web servers are easily scalable by load balancing across multiple servers (as long as they don't hold session data). Others, for example, database servers, may be harder to load balance. By splitting them, we have the opportunity to use load balancing for one and not the other. (We discuss load balancing in Chapter 8.)

The Transactional Component Middleware was designed to be the middleware between front- and back-end servers.

Many applications require some kind of session concept to be workable. A session makes the user's life easier by

  • Providing a logon at the start, so authentication need be done only once.

  • Providing for traversal from screen to screen.

  • Making it possible for the server to collect data over several screens before processing.

  • Making it easier for the server to tailor the interface for a given user, that is, giving different users different functionality.

In old-style applications these were implemented by menu and forms code back in the server. Workstation GUI applications are also typically session-based; the session starts when the program starts and stops when it stops. But the Web is stateless, by which we mean that it has no built-in session concept. It does not remember any state (i.e., data) from one request to another. (Technically, each HTTP request stands alone, and in the original protocol each page was retrieved over a separate TCP/IP connection.) Sessions are so useful that there needs to be a way to simulate them. One way is to use applets. This essentially uses the Web as a way of downloading a GUI application. But there are problems.

If the client code is complex, the applet is large and it is time consuming to load it over a slow line. The applet opens a separate session over the network back to the server. If the application is at all complex, it will need additional middleware over this link.

A simple sockets connection has the specific problem that it can run foul of a firewall since firewalls may restrict traffic to specific TCP port numbers (such as for HTTP, SMTP, and FTP communication). The applet also has very restricted functionality on the browser (to prevent malicious applets mucking up the workstation).

Java applets have been successful in terminal emulation and other relatively straightforward work, but in general this approach is not greatly favored. It's easier to stick to standard HTML or dynamic HTML features where possible.

An alternative strategy is for the server to remember the client's IP address. This limits the session to the length of time that the browser is connected to the network since on any reconnect it might be assigned a different IP address. There is also a danger that a user could disconnect and another user could be assigned the first user's IP address, and therefore pick up their session!
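The hazard described above can be illustrated with a minimal sketch. It assumes a hypothetical server that keys its sessions purely on the client's IP address; the class name, the addresses, and the users "Alice" and "Bob" are invented for illustration.

```python
class IpKeyedSessions:
    """Hypothetical session table keyed only by the client's IP address."""

    def __init__(self):
        self._by_ip = {}

    def session_for(self, ip_address):
        # The first request from an address creates a session;
        # every later request from that address reuses it.
        return self._by_ip.setdefault(ip_address, {"cart": []})


sessions = IpKeyedSessions()

# Alice connects and is assigned 203.0.113.7 (a documentation-only address).
alice = sessions.session_for("203.0.113.7")
alice["cart"].append("book")

# Alice disconnects and the provider reassigns the same address to Bob.
bob = sessions.session_for("203.0.113.7")
```

In this sketch Bob receives the very same session object as Alice, including her cart, because the server has nothing other than the address to tell the two users apart.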

A third strategy is for the server to hide a session identifier on the HTML page in such a way that it is returned when the user asks for the next screen (e.g., put the session identifier as part of the text that is returned when the user hits a link). This works well, except that if the user terminates the browser for any reason, the session is broken.
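One way to picture this third strategy is the sketch below, which appends a session identifier to every link the server generates and recovers it from the next request. The parameter name `sid`, the helper names, and the path are assumptions made for illustration, not part of any particular Web server product.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def new_session_id():
    # A random, hard-to-guess identifier for the session.
    return secrets.token_urlsafe(16)


def link_with_session(path, session_id):
    # Embed the identifier in the link so the browser sends it back
    # when the user asks for the next screen.
    return f"{path}?{urlencode({'sid': session_id})}"


def session_from_url(url):
    # Recover the identifier from an incoming request URL, if present.
    values = parse_qs(urlparse(url).query).get("sid")
    return values[0] if values else None


sid = new_session_id()
next_link = link_with_session("/catalogue/page2", sid)
```

Because the identifier lives only in the pages currently held by the browser, nothing survives once the browser is closed, which is the weakness noted above.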

Finally, session management can be done with cookies. Cookies are small amounts of data the server can send to the browser and request that it be loaded on the browser's disk. (You can look at any text in the cookies with a simple text editor such as Notepad.) When the browser sends a message to the same server, the cookie goes with it. The server can store enough information to resume the session (usually just a simple session number). The cookie may also contain a security token and a timeout date. Cookies are probably the most common mechanism for implementing Web sessions. Cookies can hang around for a long time; therefore, it is possible for the Web application to notice a single user returning again and again to the site. (If the Web page says "Welcome back <your name>", it's done with cookies.) Implemented badly, cookies can be a security risk, for instance, by holding important information in clear text, so some people disable them from the browser.
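As a rough sketch of the mechanics, the following uses Python's standard `http.cookies` module to build the header value a server might send and to read the cookie back from a later request. The cookie name `session`, the session number, and the 30-minute lifetime are illustrative assumptions.

```python
from http.cookies import SimpleCookie


def build_set_cookie(session_number, max_age_seconds):
    """Return the value for a Set-Cookie header carrying a session number."""
    cookie = SimpleCookie()
    cookie["session"] = str(session_number)
    cookie["session"]["max-age"] = max_age_seconds  # timeout, in seconds
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True  # hide the cookie from page scripts
    return cookie["session"].OutputString()


def read_session(cookie_header):
    """Extract the session number from a Cookie request header, if any."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    return morsel.value if morsel is not None else None


header_value = build_set_cookie(4711, 1800)
returned = read_session("session=4711; theme=dark")
```

Note that the cookie here holds only a session number, not the data itself, which is in line with the warning above about keeping important information in clear text.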

All implementations of Web sessions differ from traditional sessions in one crucial way. The Web application server cannot detect that the browser has stopped running on the user's workstation.
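Since the server never hears that the browser has gone, the usual workaround is to treat a period of inactivity as the end of the session. The sketch below shows one possible shape for this; the class, its method names, and the ten-minute timeout are assumptions for illustration, and the clock is injectable only so that the behavior can be demonstrated without waiting.

```python
import time


class SessionStore:
    """Tracks the last activity per session and expires idle ones."""

    def __init__(self, idle_timeout_seconds, clock=time.monotonic):
        self._timeout = idle_timeout_seconds
        self._clock = clock
        self._last_seen = {}

    def touch(self, session_id):
        # Call on every request that belongs to the session.
        self._last_seen[session_id] = self._clock()

    def is_live(self, session_id):
        last = self._last_seen.get(session_id)
        return last is not None and self._clock() - last < self._timeout

    def purge_idle(self):
        # Remove sessions with no activity for the timeout period
        # and return their identifiers.
        now = self._clock()
        stale = [sid for sid, last in self._last_seen.items()
                 if now - last >= self._timeout]
        for sid in stale:
            del self._last_seen[sid]
        return stale


# Demonstration with a fake clock measured in seconds.
now = [0.0]
store = SessionStore(600, clock=lambda: now[0])
store.touch("a")
now[0] = 300.0
store.touch("b")
now[0] = 700.0
removed = store.purge_idle()
```

At the final clock reading, session "a" has been idle for 700 seconds and is purged, while "b" has been idle for only 400 seconds and survives.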

How session state is handled becomes an important issue. Let us take a specific example—Web shopping cart applications. The user browses around an online catalogue and selects items he wishes to purchase by pressing an icon in the shape of a shopping cart. The basic configuration is illustrated in Figure 3-4. We have:

  • A browser on the user's workstation

  • A Web server, possibly a Web server farm implemented using Microsoft ASP (Active Server Pages), Java JSP (JavaServer Pages), or other Web server products

  • A back-end transaction server using .NET or EJB

Let us assume the session is implemented by using cookies. That means that when the shopping cart icon is pressed, the server reads the cookie to identify the user and displays the contents of the shopping cart. When an item is added to the shopping cart, the cookie is read again to identify the user so that the item is added to the right shopping cart. The basic problem becomes converting cookie data to the primary key of the user's shopping cart record in the database. Where do you do this? There are several options of which the most common are:

  • Do it in the Web server.

  • Hold the shopping cart information in a session bean.

  • Put the user's primary key data in the cookie and pass it to the transaction server.

The Web server solution requires holding a lookup table in the Web server to convert the cookie data value to a shopping cart primary key. The main problem is that if you want to use a Web server farm for scalability or resiliency, the lookup table must be shared across all the Web servers. This is possible, but it is not simple. (The details are discussed in Chapter 7.)
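The difficulty with a server farm can be seen in a small sketch. It models two hypothetical Web servers, first with a private lookup table each and then with one table shared between them. The class and the cookie value are invented for illustration, and the in-process dictionary only stands in for whatever shared store a real farm would need, which would also have to cope with concurrent access and failure.

```python
class WebServer:
    """Hypothetical Web server that maps cookie values to cart keys."""

    def __init__(self, name, lookup_table):
        self.name = name
        self.lookup = lookup_table  # cookie value -> shopping cart primary key

    def register(self, cookie_value, cart_key):
        self.lookup[cookie_value] = cart_key

    def cart_key_for(self, cookie_value):
        return self.lookup.get(cookie_value)


# Farm 1: each server keeps its own private table.
private_a = WebServer("a", {})
private_b = WebServer("b", {})
private_a.register("cookie-123", 42)
# The next request is load balanced to server b, which knows nothing.
miss = private_b.cart_key_for("cookie-123")

# Farm 2: both servers consult one shared table.
shared_table = {}
shared_a = WebServer("a", shared_table)
shared_b = WebServer("b", shared_table)
shared_a.register("cookie-123", 42)
hit = shared_b.cart_key_for("cookie-123")
```

With private tables the second server cannot find the cart; with the shared table it can, at the cost of building and operating that shared table.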

Holding the shopping cart information in a session bean also runs into difficulties when there is a Web server farm, but in this case the session bean cannot be shared. This is not an insurmountable problem because in EJB you can read a handle from the object and store it on disk, and then the other server can read the handle and get access to the object. But you would have to ensure the two Web servers don't access the same object at the same time. Probably the simplest way to do this is to convert the handle into an object reference every time the shopping cart icon is pressed. Note that a consequence of this approach is that with 1,000 concurrent users you would need 1,000 concurrent session beans. A problem with the Web is that you don't know when the real end user has gone away, so deleting a session requires detecting a period of time with no activity. A further problem is that if the server goes down, the session bean is lost.

The simplest solution is to store the shopping cart information in the database and put the primary key of the user's shopping cart directly in the cookie. The cookie data is then passed through to the transaction server. This way, both the Web server and the transaction server are stateless, all these complex recovery problems disappear, and the application is more scalable and efficient.
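A minimal sketch of this stateless approach follows, using an in-memory SQLite database as a stand-in for the real database. The table layout, function names, and the cart key 42 are assumptions for illustration; a production application would also want to protect the cookie value, for example with the security token mentioned earlier, so that a user cannot simply guess another cart's key.

```python
import sqlite3


def open_database():
    # In-memory database used here only as a stand-in for the real one.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cart_item ("
        " cart_id INTEGER NOT NULL,"
        " item TEXT NOT NULL,"
        " quantity INTEGER NOT NULL,"
        " PRIMARY KEY (cart_id, item))"
    )
    return db


def add_item(db, cart_id, item, quantity=1):
    # One short transaction per request; no state is kept between requests.
    with db:  # commits on success, rolls back on an exception
        updated = db.execute(
            "UPDATE cart_item SET quantity = quantity + ?"
            " WHERE cart_id = ? AND item = ?",
            (quantity, cart_id, item),
        )
        if updated.rowcount == 0:
            db.execute(
                "INSERT INTO cart_item (cart_id, item, quantity)"
                " VALUES (?, ?, ?)",
                (cart_id, item, quantity),
            )


def cart_contents(db, cart_id):
    rows = db.execute(
        "SELECT item, quantity FROM cart_item"
        " WHERE cart_id = ? ORDER BY item",
        (cart_id,),
    )
    return rows.fetchall()


db = open_database()
cart_id_from_cookie = 42  # in practice, read from the request's cookie
add_item(db, cart_id_from_cookie, "book")
add_item(db, cart_id_from_cookie, "book")
add_item(db, cart_id_from_cookie, "pen", 3)
contents = cart_contents(db, cart_id_from_cookie)
```

Every request carries the key it needs, so any Web server or transaction server can handle it, and a server failure loses nothing that is not already in the database.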

In our view, stateful session beans are most useful in a nontransactional application, such as querying a database. We can also envisage situations where it would be useful to keep state that had nothing to do with transaction recovery, for instance, for performance monitoring. But as a general principle, if you want to keep transactional state, put it in the database.

On the other hand, keeping state during a transaction is no problem as long as it is reinitialized if the transaction aborts, so the COM model is a good one. To do the same in EJB requires using a stateful session bean but explicitly reinitializing the bean at the start of every transaction.

But if you needed session state for mainframe transaction monitors, why not now? Transaction monitors needed state because they were dealing with dumb terminals, which didn't have cookies; state was related to the terminal identity. Also, the applications were typically much more ruthless about removing session state if there was a recovery and forcing users to log on again. For instance, if the network died, the mainframe applications would be able to log off all the terminals and remove session state. This simplified recovery. In contrast, if the network dies somewhere between the Web server and the browser, there is a good chance the Web server won't even notice. Even if it does, the Web server can't remove the cookie. In the olden days, the session was between workstation and application; now it is between cookie and transaction server. Stateful session beans support a session between the Web server and the transaction server, which is only part of the path between cookie and transaction server. In this case, having part of an implementation just gets in the way.

Entity beans, on the other hand, have no such problems. They have been criticized for forcing the programmer to do too many primary key lookup operations on the database, but we doubt whether this performance hit is significant.
