Avoiding Cold Startups
In the early days of the Google App Engine, any request could lead to a new instance being launched. For applications with low traffic, there was a high risk of long response times on the first request by a visitor, especially if the application was not optimized for fast cold startups.
Only high-traffic applications with a relatively constant load could serve a large percentage of users without confronting them with longer response times. But even those would lose a few visitors when instances were started and stopped.
Later, Google added new features for paying customers that help avoid longer response times. Note, however, that these strategies may fail when the application experiences very sudden spikes in traffic.
Reserving Instances with Always On
Paying customers can reserve instances that are never turned off. This solves the problem for low-traffic applications, where almost every visit leads to an instance being launched.
The Always On instances are supplemented with dynamic instances when demand exceeds the capacity of the available Always On instances. This means that just switching to Always On does not completely fix the problem of long response times on cold startups.
Always On can be configured in the admin console, as described in Google’s documentation at http://code.google.com/appengine/docs/adminconsole/instances.html.
Preloading Classes Using Warm-Up Requests
When at least one instance is running, whether Always On or dynamic, the App Engine can sometimes predict when a new instance will be required.
As long as you haven’t explicitly turned off warm-up requests in the appengine-web.xml configuration file, the App Engine can send a request to /_ah/warmup sometime before a new instance is required. You can configure your own servlet to listen on that address and make sure that classes and other data are preloaded before a visitor starts accessing that instance.
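The preloading step can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: WarmupPreloader and its preload method are hypothetical names, and the class is kept free of the servlet API so that it runs on a plain JVM. It assumes that a servlet mapped to /_ah/warmup would call it with the names of the classes the application needs on its first real request.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that a servlet mapped to /_ah/warmup could call. */
final class WarmupPreloader {

    private WarmupPreloader() {}

    /**
     * Loads and initializes each named class so that the first visitor
     * does not pay the class-loading cost. Returns the names that could
     * not be loaded, so the caller can log them.
     */
    static List<String> preload(List<String> classNames) {
        List<String> failed = new ArrayList<>();
        ClassLoader loader = WarmupPreloader.class.getClassLoader();
        for (String name : classNames) {
            try {
                // 'true' also runs static initializers, not just loading.
                Class.forName(name, true, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                failed.add(name);
            }
        }
        return failed;
    }
}
```

A warm-up servlet could call WarmupPreloader.preload(...) from its doGet method and log any names that are returned. Because a warm-up request is not guaranteed to arrive, the application must still work correctly when nothing has been preloaded.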
Warm-up requests do not work when no instances are running, so they do not add much value for low-traffic applications unless Always On is used.
Even with instances running, warm-up requests do not always work, because the App Engine cannot always predict traffic in advance.
More information on warm-up requests can be found at http://code.google.com/appengine/docs/adminconsole/instances.html.
Handling Concurrent Requests with Thread-Safe Mode
By default, an instance handles only a single request at a time. If an instance takes a long time to respond while other requests arrive, the App Engine launches additional instances to handle the rest of the traffic.
In some cases, loading new instances can be avoided by allowing concurrent requests. This requires you to develop thread-safe servlets. More information on thread-safe mode can be found at http://code.google.com/appengine/docs/java/config/appconfig.html.
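What thread safety means for a servlet can be illustrated with a small sketch. HitCounterHandler is a hypothetical stand-in for a servlet, written without the servlet API so that it runs on a plain JVM. Like a servlet, a single handler object is shared by all concurrent requests, so any state held in its fields must be safe for concurrent access, while per-request data stays in local variables.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a servlet instance shared by concurrent requests. */
final class HitCounterHandler {

    // Shared state: a plain long incremented with ++ could lose updates.
    private final AtomicLong hits = new AtomicLong();

    /** Handles one simulated request; 'n' is local and therefore thread-confined. */
    String handle(String visitor) {
        long n = hits.incrementAndGet();
        return "Hello " + visitor + ", you are request " + n;
    }

    long totalHits() {
        return hits.get();
    }

    /** Sends requests from several threads to one handler and returns the final count. */
    static long simulate(int threads, int requestsPerThread) {
        HitCounterHandler handler = new HitCounterHandler();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    handler.handle("visitor-" + id);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handler.totalHits();
    }
}
```

With the atomic counter, simulate(8, 1000) counts all 8000 requests. The same reasoning applies to caches, date formatters, and any other objects stored in servlet fields.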
Handling Memory-Intensive Requests with Backends
In addition to Always On instances, you can purchase, for a higher fee, specialized instances that are optimized for requests of a backend nature, that is, requests that take longer than 30 seconds to finish. Backend workloads are also characterized by higher memory consumption.
More information on backend instances can be found on Google’s website at http://code.google.com/appengine/docs/java/backends/.