System Modeling
Models simplify the details of a system, making it easier to understand. Choosing exactly what to model can have an enormous effect on your understanding of the problems at hand. Web applications, much like other software-intensive systems, are normally represented with a set of models. Model types include use case models, implementation models, deployment models, and security models. Site maps, which are abstractions of the Web pages and navigation routes throughout the system, are models used exclusively by Web systems.
Determining the correct level of abstraction and detail is crucial if the model is to benefit its users. It is often best to model the artifacts of the system, meaning the entities that will be constructed and manipulated to produce the final product. Modeling pages, hyperlinks, and dynamic content on the client and server is very important. Modeling the internals of the Web server or the details of the browser is not very helpful to the designers and architects of a Web application.
The artifacts should be mapped to modeling elements. Hyperlinks map to association elements in the model and represent navigational paths between pages. Pages might also map to classes in the logical view of the model. If a Web page were a class in the model, the page's scripts would map to operations of the class. Any variables in the scripts that are page-scoped would map to class attributes.
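The mapping described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `PageClass`, `Association`, and `navigable_from`, and the example pages, are assumptions made for the example and are not part of the UML or of any tool.

```python
from dataclasses import dataclass, field


@dataclass
class PageClass:
    """A Web page represented as a class in the logical view."""
    name: str
    attributes: list[str] = field(default_factory=list)  # page-scoped script variables
    operations: list[str] = field(default_factory=list)  # scripts contained in the page


@dataclass
class Association:
    """A hyperlink represented as a navigational path between two pages."""
    source: PageClass
    target: PageClass


def navigable_from(page: PageClass, links: list[Association]) -> list[str]:
    """Return the names of the pages reachable from `page` through one hyperlink."""
    return [link.target.name for link in links if link.source is page]


# Hypothetical pages, used only for illustration.
catalog = PageClass("Catalog", attributes=["selectedCategory"], operations=["filterItems"])
cart = PageClass("Cart", attributes=["itemCount"], operations=["updateTotal"])
links = [Association(catalog, cart), Association(cart, catalog)]

print(navigable_from(catalog, links))  # prints ['Cart']
```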
A problem arises when you consider that a Web page may contain a set of scripts that execute on the server in order to prepare the dynamic content of the page, along with a completely different set of scripts that execute on the client. JavaScript embedded in the page is the typical example of the latter. When a page mixes the two, it can be unclear which operations, attributes, and relationships are active on the server and which are active on the client. In addition, a Web page as delivered in a Web application is best modeled as a component of the system. Simply mapping a Web page to a UML class does not help us understand the system any better.
The creators of the UML realized that the language as defined would not always be sufficient to capture the relevant semantics of a particular domain or architecture. To address this problem, a formal extension mechanism was defined so that practitioners can extend the semantics of the UML. This mechanism allows the definition of stereotypes, tagged values, and constraints that can be applied to model elements.
A stereotype allows you to define a new semantic meaning for a modeling element. Tagged values are key-value pairs that can be associated with a modeling element. These allow you to attach any value to a modeling element. Constraints are rules that define the well-formedness of a model. They can be expressed as free-form text or with the more formal Object Constraint Language (OCL).
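To make the three extension mechanisms concrete, here is a minimal Python sketch. It is not UML tooling. The `ModelElement` type, the "server page" stereotype name, the `engine` tag, and the rule itself are all hypothetical choices made for illustration. The constraint is written as free-form text paired with a check function, standing in for a rule that could also be written in OCL.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ModelElement:
    name: str
    stereotype: Optional[str] = None  # gives the element a new semantic meaning
    tagged_values: dict[str, str] = field(default_factory=dict)  # arbitrary key-value pairs


# A constraint here is a free-form description plus a check on one element.
Constraint = tuple[str, Callable[[ModelElement], bool]]

# Hypothetical rule: an element stereotyped "server page" must name its engine.
constraints: list[Constraint] = [
    (
        "server pages must carry an 'engine' tagged value",
        lambda e: e.stereotype != "server page" or "engine" in e.tagged_values,
    ),
]


def violations(element: ModelElement, rules: list[Constraint]) -> list[str]:
    """Return the descriptions of every rule the element breaks."""
    return [text for text, check in rules if not check(element)]


checkout = ModelElement("Checkout", "server page", {"engine": "ASP"})
login = ModelElement("Login", "server page")

print(violations(checkout, constraints))  # prints []
print(violations(login, constraints))
```

The second call reports the one rule that `login` breaks, because it has the stereotype but no `engine` tag.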
In modeling, a very clear distinction needs to be made between business logic and presentation logic. In typical business applications, presentation details such as animated buttons, fly-over help, and other UI enhancements do not normally belong in the model unless a separate UI model is constructed for the application.
Web Application Architecture
Basic Web application architecture includes browsers, a network, and a Web server. Browsers request Web pages from the server. Each page contains a mix of content and formatting instructions, expressed in HTML. Some pages include client-side scripts that define additional dynamic behavior for the displayed page. These scripts (which are interpreted by the browser) interact with the browser, the page content, and additional controls such as applets, ActiveX controls, and plug-ins that are contained in the page.
Users view and interact with the content in the page. Often there are field elements in the page that are filled in and submitted to the server by the user for processing. Users also interact with the system by navigating to different pages in the system via hyperlinks. In both cases, the user supplies input to the system that may alter the business state of the system.
The client sees a Web page as an HTML formatted document. On the server, however, a Web page may manifest itself in several different ways. Early Web applications used the Common Gateway Interface (CGI) for dynamic Web pages. CGI defines an interface that scripts and compiled modules use to gain access to the information passed along with a page request. In a CGI-based system, a special directory is usually configured on the Web server to execute scripts in response to page requests. When a CGI script is requested, instead of just returning the contents of the file (as it would for any HTML formatted file), the server executes the file with the appropriate interpreter (often Perl). The end result of this processing is an HTML formatted stream, which is sent back to the requesting client. Business logic is executed while the file is being processed; during that time, the script can interact with server-side resources such as databases and middle-tier components.
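As a rough illustration of the CGI flow just described, the following sketch shows what a minimal CGI script might look like in Python (rather than Perl). The function name `handle_request` and the `name` parameter are assumptions made for the example. The sketch relies only on the standard CGI convention that the server passes the query string in the `QUERY_STRING` environment variable and returns whatever the script writes to standard output.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ) -> str:
    """Build a complete CGI response: header, blank line, then the HTML body."""
    # The Web server places the request's query string in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]  # hypothetical request parameter
    # Escape user input before embedding it in the HTML stream.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by the server, stdout is streamed back to the requesting client.
    sys.stdout.write(handle_request(os.environ))
```

Any business logic, such as a database query, would run inside `handle_request` before the HTML is produced.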
Web servers have improved on this basic design over time, becoming more security aware, and including features such as client state management, transaction processing integration, remote administration, and resource pooling. The latest generation of Web servers addresses issues that are important to architects of mission-critical, scalable, and robust applications.
Today's Web servers support dynamic pages in three major ways: scripted pages, compiled pages, and hybrids of the two. In scripted pages, each Web page that can be requested by a client browser is represented on the Web server's file system as a script file. This file is often a mix of HTML and a scripting language. Upon page request, the Web server delegates processing of the page to an engine that recognizes it; the engine produces an HTML formatted stream that is sent back to the requesting client. Prime examples are Microsoft's Active Server Pages, JavaServer Pages, and Allaire Cold Fusion.
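The idea of a scripted page can be sketched with a toy engine. This is a simplified illustration, not how any of the products named above actually work. The `<%= ... %>` tag syntax is borrowed loosely from that family of tools, and `process_page` and the sample page are assumptions made for the example.

```python
import re

# Matches an embedded expression such as <%= request['user'] %>.
TAG = re.compile(r"<%=(.*?)%>")


def process_page(page_text: str, request: dict) -> str:
    """Replace each embedded expression with its value, yielding plain HTML."""
    def evaluate(match: re.Match) -> str:
        expression = match.group(1).strip()
        # The page author's expression sees only the `request` dictionary.
        return str(eval(expression, {"__builtins__": {}}, {"request": request}))
    return TAG.sub(evaluate, page_text)


# A hypothetical page file: HTML mixed with script expressions.
page = "<p>Hello, <%= request['user'] %>! You have <%= len(request['items']) if False else request['count'] %> items.</p>"

print(process_page(page, {"user": "Ann", "count": 3}))
```

The engine interprets the page on every request, which is what makes scripted pages easy to change and comparatively slow to serve.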
In compiled pages, the Web server loads and executes a binary component that has access to all of the information accompanying the page request, such as the values of form fields and parameters. The compiled code uses the request details and accesses server-side resources to produce the HTML stream that is returned to the client. Compiled pages often offer more functionality than do scripted pages. Different functionality can be obtained by passing parameters to the compiled page request. Any single compiled component has the ability to include all of the functionality of an entire directory's scripted pages. Examples of this architecture include Microsoft's ISAPI and Netscape's NSAPI.
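The point that one compiled component can stand in for a whole directory of scripted pages can be sketched as a single entry point that dispatches on a request parameter. Python is used here only to show the dispatch idea; a real ISAPI or NSAPI module would be a compiled binary loaded by the server. The `action` parameter and the handler names are hypothetical.

```python
from urllib.parse import parse_qs


def show_catalog(params: dict) -> str:
    return "<h1>Catalog</h1>"


def show_cart(params: dict) -> str:
    items = params.get("item", [])  # repeated form fields arrive as a list
    return f"<h1>Cart</h1><p>{len(items)} item(s)</p>"


# One component, many "pages": each action would be a separate scripted page.
ACTIONS = {"catalog": show_catalog, "cart": show_cart}


def handle(query_string: str) -> str:
    """Single entry point that selects functionality from the request parameters."""
    params = parse_qs(query_string)
    action = params.get("action", ["catalog"])[0]
    handler = ACTIONS.get(action)
    if handler is None:
        return "<h1>Unknown action</h1>"
    return handler(params)


print(handle("action=cart&item=book&item=pen"))  # prints <h1>Cart</h1><p>2 item(s)</p>
```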
The third category is a hybrid of the previous two: scripted pages that are compiled on first request, with the compiled version then being used by all subsequent requests. When the original page's contents change, the page is compiled again. This category is a compromise between the flexibility of scripted pages and the efficiency of compiled pages.
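The compile-on-request behavior can be sketched as a small cache. This is a minimal illustration under stated assumptions. The `PageCache` class is hypothetical, a page script is assumed to assign its output to a variable named `html`, and changes are detected by hashing the page source, where a real server might instead compare file modification times.

```python
import hashlib


class PageCache:
    """Compile a page script on first request and reuse it until the source changes."""

    def __init__(self) -> None:
        self._compiled = {}  # page name -> (source digest, code object)
        self.compiles = 0    # how many times any page has been compiled

    def render(self, name: str, source: str, request: dict) -> str:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._compiled.get(name)
        if cached is None or cached[0] != digest:
            # First request, or the page contents changed: compile again.
            code = compile(source, name, "exec")
            self._compiled[name] = (digest, code)
            self.compiles += 1
        else:
            code = cached[1]
        scope = {"request": request}
        exec(code, scope)     # run the compiled page for this request
        return scope["html"]  # assumed convention: the page sets `html`


cache = PageCache()
page_v1 = 'html = "<p>Hello, " + request["user"] + "</p>"'
print(cache.render("greet", page_v1, {"user": "Ann"}))  # compiles, then runs
print(cache.render("greet", page_v1, {"user": "Bob"}))  # reuses the compiled code
print(cache.compiles)  # prints 1
```

Passing a modified source for the same page name triggers one more compile, after which the new version is reused.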