Home > Articles > Web Development

This chapter is from the book

HTTP and REST

Now that you have your instances proxied and ready to accept requests, it's time to think about how to accept requests. In general, it's a bad practice to reinvent the wheel when you can just use another protocol that's already been established and well tested. There are entire books on using HTTP and REST to build your own SaaS, but this section provides the basic details.

Although you can use HTTP in many ways, including SOAP, the simplest of all these is Representational State Transfer (REST), which was officially defined in 2000 by Roy Fielding in a doctoral dissertation "Architectural Styles and the Design of Network-based Software Architectures" (http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm). It uses HTTP as a communication medium and is designed around the fundamental idea that HTTP already defines how to handle method names, authentication, and many other things needed when working with these types of communications. HTTP is split into two different sections: the header and the body (not to be confused with the HTML <head> and <body> tags), each of which is fully used by REST.

This book uses REST and XML for most of the examples, but this is not the only option and may not even suite your specific needs. For example, SOAP is still quite popular for many people because of how well it integrates with Java. It also makes it easy for other developers to integrate with your APIs if you provide them with a Web Service Definition Language (WSDL) that describes exactly how a system should use your API. The important point here is that the HTTP protocol is highly supported across systems and is one of the easiest to use in many applications because much of the lower-level details, such as authentication, are already taken care of.

The Header

The HTTP header describes exactly who the message is designed for, and what method the user is instantiating on the recipient end. REST uses this header for multiple purposes. HTTP method names can be used to define the method called and the arguments (path) which is sent to that method. The HTTP header also includes a name, which can be used to differentiate between applications running on the same port. This shouldn't be used for anything other than differentiating between applications because it's actually the DNS name and shouldn't be used for anything other than a representation of the server's address.

The method name and path are both passed into the application. Typically you want to use the path to define the module, package, or object to use to call your function. The method name is typically used to determine what function to call on that module, package, or object. Lastly, the path also contains additional arguments after the question mark (?) that usually are passed in as arguments to your function. Now take a look at a typical HTTP request:

GET /module_name/id_argument?param1=value1&param2=value2

In this example, most applications would call module_name.get(id_argument,param1=value1,param2=value2) or module_name.get(id_argument,{param1=value1,param2=value2}). By using this simple mapping mechanism, you're decoupling your interface (the web API) from your actual code, and you won't actually need to call your methods from the website. This helps greatly when writing unit tests.

Many libraries out there can handle mapping a URI path to this code, so you should try to find something that matches your needs instead of creating your own. Although REST and RESTful interfaces are defined as using only four methods, most proxies and other systems, including every modern web browser, support adding your own custom methods. Although many REST developers may frown on it, in general it does work, and when a simple CRUD interface isn't enough, it's much better than overloading an existing function to suit multiple needs. The following sections reference some of the most common HTTP headers and how you can use them in a REST API.

If-Match

The most interesting header that you can provide for is the If-Match header. This header can be used on any method to indicate that the request should be performed only if the conditions in the header represent the current object. This header can be exceptionally useful when you operate with databases that are eventually consistent, but in general, because your requests can be made in rapid succession, it's a good idea to allow for this so that they don't overwrite each other. One possible solution to this is to provide for a version number or memento on each object or resource that can then be used to ensure that the user knows what the value was before it replaces it.

In some situations, it may be good to require this field and not accept the special * case for anyone other than an administrative user. If you require this field to be sent and you receive a request that doesn't have it, you should respond with an error code of 412 (Precondition Failed) and give the users all they need to know to fill in this header properly. If the conditions in this header do not match, you must send back a 412 (Precondition Failed) response. This header is typically most used when performing PUT operations because those operations override what's in the database with new values, and you don't know if someone else may have already overridden what you thought was there.

If-Modified-Since

The If-Modified-Since header is exceptionally useful when you want the client to contain copies of the data so that they can query locally. In general, this is part of a caching system used by most browsers or other clients to ensure that you don't have to send back all the data if it hasn't been changed. The If-Modified-Since header takes an HTTP-date, which must be in GMT, and should return a 304 (Not Modified) response with no content.

If-Unmodified-Since

If you don't have an easy way to generate a memento or version ID for your objects, you can also allow for an If-Unmodified-Since header. This header takes a simple HTTP date, formatted in GMT, which is the date the resource was last retrieved by the client. This puts a lot of trust in the client, however, to indicate the proper date. It's generally best to use the If-Modified header instead, unless you have no other choice.

Accept

The Accept header is perhaps the most underestimated header in the entire arsenal. It can be used not only to handle what type of response to give (JSON, XML, and so on), but also to handle what API version you're dealing with. If you need to support multiple versions of your API, you can support this by attaching it to the content type. This can be done by extending the standard content types to include the API version number:

Accept: text/xml+app-1.0

This enables you to specify not only a revision number (in this case, 1.0) and content type, but also the name of the application so that you can ensure the request came from a client that knew who it was talking to. Traditionally, this header will be used to send either HTML, XML, JSON, or some other format representing the resource or collection being returned.

Authorization

The Authorization header can be used just like a standard HTTP authentication request, encoding both the password and the username in a base64 encoded string, or it can optionally be used to pass in an authentication token that eventually expires. Authentication types vary greatly, so it's up to you to pick the right version for your application. The easiest method is by using the basic HTTP authentication, but then you are sending the username and password in every single request, so you must ensure that you're using SSL if you're concerned about security.

In contrast, if you choose to use a token or signing-based authentication method, the user has to sign the request based on some predetermined key shared between the client and server. In this event, you can hash the entire request in a short string that validates that the request did indeed come from the client. You also need to make sure to send the username or some other unique identifier in this header, but because it's not sending a reversible hash of the password, it's relatively safe to send over standard HTTP. We won't go into too much depth here about methods of hashing because there are a wide variety of hashing methods all well-documented online.

The Body

The body of any REST call is typically either XML or JSON. In some situations, it's also possible to send both, depending on the Accept header. This process is fairly well documented and can be used to not only define what type of response to return, but also what version of the protocol the client is using. The body of any request to the system should be in the same format as the response body.

In my applications, I typically use XML only because there are some powerful tools, such as XSLT, that can be used as middleware for authorization purposes. Many clients, however, like the idea of using JSON because most languages serialize and deserialize this quite well. REST doesn't specifically require one form of representation over the other and even enables for the clients to choose which type they want, so this is up to you as the application developer to decide what to support.

Methods

REST has two distinct, important definitions that you need to understand before continuing. A collection is a group of objects; in your case this usually is synonymous with either a class in object terms, or a table in database terms. A resource is a specific instantiation of a collection, which can be thought of as an instance in object terms or a row in a table in database terms. This book also uses the term property to define a single property or attribute on an instance in object terms, or a cell in database terms. Although you can indeed create your own methods, in general you can probably fit most of your needs into one of the method calls listed next.

GET

The GET method is the center of all requests for information. Just like a standard webpage, applications can use the URL in two parts; everything before the first question mark (?) is used as the resource to access, and everything after that is used as query parameters on that resource. The URL pattern can be seen here:

/collection_name/resource_id/property_name?query

The resource_id property_name and query in the preceding example are all optional, and the query can be applied to any level of the tree. Additionally, this tree could expand exponentially downward if the property is considered a reference. Now take a simple example of a request on a web blog to get all the comments of a post specified by POST-ID submitted in 2010. This query could look like this:

/posts/POST-ID/comments?submitted=2010%

The preceding example queries for the posts collection for a specific post identified as POST-ID. It then asks for just the property named comments and filters specifically for items with the property submitted that matches 2010%.

Responses to this method call can result in a redirection if the resource is actually located at a different path. This can be achieved by sending a proper redirect response code and a Location header that points to the actual resource.

A GET on the root of the collection should return a list of all resources under that path. It can also have the optional ?query on the end to limit these results. It's also a good idea to implement some sort of paging system so that you can return all the results instead of having to limit because of HTTP timeouts. In general, it's never a good idea to have a request that takes longer then a few seconds to return on average because most clients will assume this is an error. In general, most proxy systems will time out any connection after a few minutes, so if your result takes longer than a minute, it's time to implement paging.

If you use XML as your communication medium, think about implementing some sort of ATOM style next links. These simple tags give you a cursor to the next page of results, so you can store a memento or token of your query and allow your application to pick up where it left off. This token can then be passed in via a query parameter to the same URL that was used in the original request. In general, your next link should be the full URL to the next page of results. By doing this, you leave yourself open to the largest range of possibilities for implementing your paging system, including having the ability to perform caching on the next page of results, so you can actually start building it before the client even asks for it.

If you use Amazon Web Services and SimpleDB, it's generally a good idea to use the next_token provided by a SimpleDB query as your memento. You also need to provide enough information in the next link to build the entire original query, so just using the URL that was originally passed and adding the next_token to the end of the query is generally a good idea. Of course if this is a continuation, you have to replace the original next_token with the new one.

Performing a GET operation on the root path (/) should return a simple index that states all the available resource types and the URLs to those resource types. This machine-readable code should be simple enough that documentation is not required for new developers to build a client to your system. This methodology enables you to change around the URLs of the base resources without modifying your client code, and it enables you build a highly adaptable client that may adapt to new resources easily without making any modifications.

PUT

The HTTP definition of a PUT is replace, so if you use this on the base URL of a resource collection, you're actually requesting that everything you don't pass in is deleted, anything new is created, and any existing resources passed in are modified. Because PUT is logically equivalent to a SET operation, this is typically not allowed on a collection.

A PUT on a resource should update or create the record with a specific ID. The section after the collection name is considered the ID of the resource, and if a GET is performed after the PUT, it should return the resource that was just created or updated. PUT is intended as an entire replacement, but in general, if you don't pass in an attribute, that is assumed to be "don't change," whereas if you pass in an empty value for the attribute, you are actually requesting it to be removed or set to blank.

A PUT on a property should change just that specific property for the specified resource. This can be incredibly useful when you're putting files to a resource because those typically won't serialize very well into XML without lots of base64 conversions.

Most PUT requests should return either a 201 Created with the object that was just created, which may have been modified due to server application logic, or a 204 No Content if there are no changes to the original object request. If you operate with a database that has eventual consistency, it may also be a good idea to instead return a 202 Accepted request to indicate that the client should try to fetch the resource at a later time. You should also return an estimate of how long it will be before this object is created.

POST

A POST operation on a collection is defined as a creation request. This is the user requesting to add a new resource to the collection without specifying a specific ID. In general, you want to either return a redirection code and a Location header, or at the least the ID of the object you just created (if not the whole object serialized in your representation).

A POST operation on a resource should actually create a subobject of that resource; although, this is often not used. Traditionally, browsers don't handle changing form actions to PUT, so a POST is typically treated as a special form of a PUT that takes form-encoded values.

A POST operation on a property is typically used only for uploading files but could also be used as appending a value to a list. In general, this could be considered an append operation, but if your property is a single value, it's probably safe to assume that the client wanted this to be a PUT, not a POST.

DELETE

A DELETE operation on a collection is used to drop the entire collection, so you probably don't want to allow this. If you do allow it, this request should be treated as a request to remove every single resource in the collection.

A DELETE operation on a specific resource is simply a request to remove that specific resource. If there are errors in this request, the client should be presented with a message explaining why the request failed. The resulting error code should explain if the request can be issued again, or if the user is required to perform another operation before reissuing the request. The most typical error message back from this request is a 409 Conflict, which indicates that another resource is referencing this resource, and the server is refusing to cascade the delete request. A DELETE may also return a 202 Accepted response if the database has eventual consistency.

A DELETE operation on a specific property is identical to a PUT with an empty value for that property. It can be used to delete just a single property from a resource instead of having to send a PUT request. This can also be used as a differentiation between setting a value to blank and removing the value entirely. In programming terms, this the difference between an empty string and None or Null.

HEAD

A HEAD request on any URL should return the exact same headers as a standard GET request but shouldn't send back the body. This is typically used to get a count of the number of results in a response without retrieving the actual results. In my applications, I use this to send an additional header X-Results, which contains the number of results that would have been retrieved.

OPTIONS

An OPTIONS request on any URL returns the methods allowed to be performed on this URL. This can return just the headers with the additional Accept header, but it can also return a serialized version of them in the body of the results that describes what each method actually does. This response should be customized for the specific user that made the request, so if the user is not allowed to perform DELETE operations on the given resource for any reason, that option should not be returned. This allows the client to specifically hide options that the users aren't allowed to perform so that they don't get error responses.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020