Predictability and Determinism
Two other important qualities of a real-time system are predictability and determinism. A real-time system must behave in a way that can be predicted mathematically, with respect to its deadlines in both relative and absolute time. For instance, it must be possible to determine mathematically whether the amount of work to be done can be completed before a given deadline. Factors that go into this calculation are system workload, the number of CPUs (or CPU cores) available for processing, running threads in the real-time system, process and thread priorities, and the operating system scheduling algorithm.
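As a rough illustration of such a calculation, the sketch below assumes a set of periodic tasks, each described by a worst-case execution time and a period that also serves as its deadline. The class name, method names, and task numbers are hypothetical. It checks only a necessary condition: total processor utilization must not exceed the number of available cores. Passing this check does not prove that every deadline will be met, since the answer also depends on priorities and the scheduling algorithm.

```java
// Sketch only: a necessary (not sufficient) utilization check for periodic tasks.
// All task parameters are hypothetical and chosen for illustration.
final class UtilizationCheck {

    /** Sum of (worst-case execution time / period) over all tasks. */
    static double totalUtilization(double[] executionMillis, double[] periodMillis) {
        if (executionMillis.length != periodMillis.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double utilization = 0.0;
        for (int i = 0; i < executionMillis.length; i++) {
            utilization += executionMillis[i] / periodMillis[i];
        }
        return utilization;
    }

    /** If this returns false, the workload cannot meet all deadlines on the given cores. */
    static boolean mayBeSchedulable(double[] executionMillis, double[] periodMillis, int cores) {
        return totalUtilization(executionMillis, periodMillis) <= cores;
    }

    public static void main(String[] args) {
        double[] execution = {2.0, 5.0, 10.0};  // worst-case execution times (ms)
        double[] period = {10.0, 20.0, 40.0};   // periods, also used as deadlines (ms)
        System.out.println("utilization = " + totalUtilization(execution, period));
        System.out.println("may fit on 1 core: " + mayBeSchedulable(execution, period, 1));
    }
}
```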
Determinism represents the ability to ensure the execution of an application without concern that outside factors will upset the execution in unpredictable ways. In other words, the application will behave as intended in terms of functionality, performance, and response time, all of the time without question. In many respects, determinism and predictability are related, in that one results in the other. However, the important distinction is that a deterministic system puts the control of execution behavior in the hands of the application developer. Predictability is then the result of proper programming practice on a system that enables such behavior. This book will explore the statement “proper programming practice” in relation to real-time applications written in Java because using a real-time language or operating system is never enough—discipline is also required.
Another aspect of deterministic application behavior is that it’s fixed, more or less. This means that unforeseen events, such as garbage collection in Java, must never upset a real-time application’s ability to meet its deadlines and thereby make it less predictable. A real-time system such as an anti-lock brake system, or an airplane’s flight-control system, must always be 100% deterministic and predictable or human lives may be at stake.
Many practical discussions of real-time systems and their requirements involve the terms latency and jitter. Let’s examine these now, and form precise definitions that we’ll use in our discussion going forward.
Much of the discussion so far has been about responding to an event before a deadline. This is certainly a requirement of a real-time system. Latency is a measure of the time between a particular event and a system’s response to that event, and it’s often a key measurement in any real-time system. The usual focus is to minimize system latency. However, in a real-time system the true goal is to normalize latency, not simply to minimize it. In other words, the goal is to make latency a known, reasonably small, and consistent quantity that can then be predicted. Whether the latency in question is measured in seconds or microseconds, the fact that it can be predicted is what truly matters to real-time developers. Nonetheless, more often than not, real-time systems also include the requirement that latency be minimized and bounded, often in the sub-millisecond range.
To meet a system’s real-time requirements, all sources of latency must be identified and measured. To do this, you need the support of your host system’s operating system, its environment, network relationship, and programming language.
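As a minimal sketch of one such measurement, the code below times a single event handler with System.nanoTime, which is intended for measuring elapsed time. The class name, the helper method, and the handler body are hypothetical stand-ins. The measurement covers only the handler itself; latency contributed by the operating system, the network, or the runtime would have to be measured separately.

```java
// Sketch only: measures the elapsed time of one hypothetical event handler.
final class LatencyProbe {

    /** Runs the handler and returns the elapsed time in nanoseconds. */
    static long measureNanos(Runnable handler) {
        long start = System.nanoTime();  // clock intended for elapsed-time measurement
        handler.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long latency = measureNanos(() -> {
            // Hypothetical handler: a loop standing in for real response work.
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println("unreachable");
            }
        });
        System.out.println("latency (ns): " + latency);
    }
}
```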
Jitter is an irregular variation, or unsteadiness, in some measured quantity. For example, in an electronic device, jitter often refers to a fluctuation in an electrical signal. In a real-time system, jitter is the fluctuation in latency for similar event processing. Simply measuring the latency of message processing for one event, or averaging it over many events, is not enough. For instance, if the average latency from request to response for a certain web server is 250 milliseconds, that figure alone gives us no insight into jitter. Only by looking at all of the numbers that go into the average (all of the individual request/response round-trip times) can we begin to examine it. As a real-time developer, you must look at the distribution and standard deviation of the responses over time.
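The following sketch shows one way to summarize a set of latency samples beyond their average, using the mean, the population standard deviation, and the maximum. The class name and the sample values are hypothetical, and a real analysis would likely examine the full distribution (for example, a histogram or percentiles) as well.

```java
// Sketch only: basic statistics over latency samples (values are hypothetical).
final class JitterStats {

    static double mean(double[] samples) {
        requireNonEmpty(samples);
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Population standard deviation: a simple measure of how far samples stray from the mean. */
    static double standardDeviation(double[] samples) {
        double mean = mean(samples);
        double sumOfSquares = 0.0;
        for (double s : samples) {
            double diff = s - mean;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    /** Worst observed latency, which an average alone hides. */
    static double max(double[] samples) {
        requireNonEmpty(samples);
        double worst = samples[0];
        for (double s : samples) {
            worst = Math.max(worst, s);
        }
        return worst;
    }

    private static void requireNonEmpty(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
    }

    public static void main(String[] args) {
        double[] roundTripMillis = {110, 95, 120, 100, 1000, 75};  // hypothetical samples
        System.out.println("mean   = " + mean(roundTripMillis));
        System.out.println("stddev = " + standardDeviation(roundTripMillis));
        System.out.println("max    = " + max(roundTripMillis));
    }
}
```

A standard deviation that is large relative to the mean is a hint that the average conceals significant jitter.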
The chart in Figure 1-2 shows a sampling of latency data for a web server’s request/response round-trip time. You can see that although the average of 250 milliseconds seems pleasing, a look at the individual numbers shows that some of the responses were delivered with up to one-second latency. These “large” latency responses stand out above most of the others, and are hence labeled outliers, since they fall outside of the normal, or even acceptable, response time range.
Figure 1-2 The average response time measurement of a transaction can cover up latency outliers.
If the system being measured is simply a web application without real-time requirements, this chart should not be alarming; the outliers aren’t important as long as the average is acceptable. However, if the system were truly a real-time system, these outliers could represent disaster. In a real-time system, every response must be sent within a bounded amount of latency.
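The difference between the two acceptance criteria can be sketched as follows. The class name, the sample latencies, and the 500-millisecond bound are hypothetical; the samples were chosen so that they average 250 milliseconds while containing a single one-second outlier.

```java
// Sketch only: contrasts an average-based check with a per-response bound.
final class LatencyBoundCheck {

    /** Non-real-time view: acceptable if the average stays within the limit. */
    static boolean averageWithin(long[] latenciesMillis, long limitMillis) {
        double sum = 0.0;
        for (long latency : latenciesMillis) {
            sum += latency;
        }
        return sum / latenciesMillis.length <= limitMillis;
    }

    /** Real-time view: acceptable only if every response stays within the bound. */
    static boolean allWithin(long[] latenciesMillis, long boundMillis) {
        for (long latency : latenciesMillis) {
            if (latency > boundMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] latencies = {100, 120, 90, 110, 1000, 80};  // hypothetical samples (ms)
        System.out.println("average within 250 ms: " + averageWithin(latencies, 250));
        System.out.println("all within 500 ms:     " + allWithin(latencies, 500));
    }
}
```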
Hard and Soft Real-Time
In the real-time problem domain, discussions often involve the terms hard real-time and soft real-time. Contrary to what many people assume, these terms have nothing to do with the size of the deadline, or the consequence of missing that deadline. In particular, a hard real-time system does not necessarily have a smaller, or tighter, deadline in terms of overall time than a soft real-time system. Instead, a hard real-time system is one that cannot miss a single deadline or the system will go into an abnormal state. In other words, the correctness of the system depends not only on the responses it generates, but also on the time frame in which each and every response is delivered. A soft real-time system is one that may have a similar deadline in terms of time, but that can tolerate an occasional missed deadline without generating an error condition.
For example, let’s compare a hypothetical video player software application to an automated foreign exchange trading application. Both systems have real-time qualities:
- The video player must retrieve and display video frames continuously, with each frame being updated by a deadline of, say, one millisecond.
- A foreign exchange trade must be settled (moneys transferred between accounts) within exactly two days of the trade execution.
The video player has a far more constraining deadline at one millisecond compared to the two-day deadline of the trading system. However, according to our definition, the trading system qualifies as a hard real-time system, and the video player as a soft real-time system. A missed settlement puts the entire trade, and the trading system, into a bad state: the trade needs to be rolled back, money is lost, and a trading relationship is strained. For the video player, an occasional missed deadline results in some dropped frames and a slight loss of video quality, but the overall system is still valid. However, the video player is still a real-time system, since it must not miss too many deadlines (and drop too many frames) or that, too, will be considered an error.
Additionally, the severity of the consequence of missing a deadline has nothing to do with the definition of hard versus soft. Looking closer at the video player software in the previous example, the requirement to match audio to the corresponding video stream is also a real-time requirement. Many people don’t consider a video player to be as critical as an anti-lock brake system, or a missile-tracking system. However, the requirement to align audio with its corresponding video is a hard real-time constraint because not doing so is considered an error condition, even though the consequence of a misaligned video/audio stream is minimal.
Therefore, to summarize hard and soft real-time constraints, a hard real-time system goes into a bad state when a single deadline is missed, whereas a soft real-time system has a more flexible deadline, and can tolerate occasional misses. In reality, it’s best to avoid these terms and their distinction and instead focus on whether a system has a real-time requirement at all. If there truly is a deadline within which the system must respond, then the system qualifies as real-time, and every effort should be made to ensure the deadline is met each time.
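The distinction summarized above can be sketched as two acceptance rules applied to the same set of response times. The class name, the sample values, and the tolerated miss ratio are hypothetical; in particular, the text does not quantify what "occasional" means for a soft real-time system, so the ratio here is purely an assumption.

```java
// Sketch only: hard versus soft acceptance of the same response times.
final class DeadlinePolicy {

    /** Number of responses that arrived after the deadline. */
    static int countMisses(long[] responseMillis, long deadlineMillis) {
        int misses = 0;
        for (long response : responseMillis) {
            if (response > deadlineMillis) {
                misses++;
            }
        }
        return misses;
    }

    /** Hard real-time: a single missed deadline is an error. */
    static boolean hardRealTimeOk(long[] responseMillis, long deadlineMillis) {
        return countMisses(responseMillis, deadlineMillis) == 0;
    }

    /** Soft real-time: misses are tolerated up to an assumed ratio of all responses. */
    static boolean softRealTimeOk(long[] responseMillis, long deadlineMillis, double maxMissRatio) {
        if (responseMillis.length == 0) {
            return true;  // nothing was due, so nothing was missed
        }
        double missRatio = (double) countMisses(responseMillis, deadlineMillis) / responseMillis.length;
        return missRatio <= maxMissRatio;
    }

    public static void main(String[] args) {
        long[] responses = {8, 9, 12, 7, 10};  // hypothetical response times (ms)
        long deadline = 10;                    // hypothetical deadline (ms)
        System.out.println("hard ok: " + hardRealTimeOk(responses, deadline));
        System.out.println("soft ok: " + softRealTimeOk(responses, deadline, 0.25));
    }
}
```

Note that both rules use the same deadline value; only the tolerance for misses differs.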
In some cases, responding to an event before a deadline is not enough; the response must not be sent too early either. In many control systems, responses must be sent within a window of time after the request, and before the absolute deadline (see Figure 1-3). Such a system has an isochronal real-time requirement.
Figure 1-3 Isochronal real-time: the deadline must be met, but a response must not be sent too early, either.
Although clearly distinct from a hard real-time task that needs to complete any time before its deadline, in most cases isochronal real-time tasks are simply classified as hard real-time with an additional timing constraint. This certainly makes it easier to describe the tasks in a system design. However, this added constraint does make a difference to the real-time task scheduler, which is something we’ll explore later in this chapter.
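A minimal sketch of the isochronal constraint follows, with all times expressed in milliseconds relative to the request. The class and method names are hypothetical. The second method illustrates one simple way to handle a result that is ready too early: hold it until the window opens.

```java
// Sketch only: an isochronal response must fall inside a window, not merely before a deadline.
final class IsochronalWindow {

    /** True if the response time lies within [earliestMillis, deadlineMillis]. */
    static boolean withinWindow(long responseMillis, long earliestMillis, long deadlineMillis) {
        return responseMillis >= earliestMillis && responseMillis <= deadlineMillis;
    }

    /** How long a result that is ready at readyMillis must be held before it may be sent. */
    static long holdMillis(long readyMillis, long earliestMillis) {
        return Math.max(0L, earliestMillis - readyMillis);
    }

    public static void main(String[] args) {
        long earliest = 5;   // hypothetical start of the window (ms after the request)
        long deadline = 10;  // hypothetical absolute deadline (ms after the request)
        System.out.println("3 ms in window:  " + withinWindow(3, earliest, deadline));
        System.out.println("7 ms in window:  " + withinWindow(7, earliest, deadline));
        System.out.println("hold if ready at 3 ms: " + holdMillis(3, earliest) + " ms");
    }
}
```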
Real-Time Versus Real Fast
Application or system performance is a relative measurement rather than a precise mathematical statement. When a system is said to be fast or slow, it’s usually in comparison to something else: an older system, a user expectation, or an analogous real-world system. As discussed earlier in this chapter, real-time does not necessarily equal real fast.
Instead, whereas the objective of fast computing is to minimize the average response time of a given set of tasks, the objective of real-time computing is to meet the individual time-critical requirement of each task. Consider this anecdote: there once was a man who drowned in a river with an average depth of 6 inches [Buttazzo05]. Of course, the key to that sentence is the use of the average depth, which implies the river is deeper at some points. A real-time system is characterized by its deadline, which is the maximum time within which it must complete its execution, not the average.
However, the goals of most real-time systems are to meet critical deadlines and to perform optimally and efficiently. For example, a system with sub-millisecond deadlines will most likely require high-performance computer hardware and software. For this reason, real-time systems programming is often associated with high-performance computing (HPC). However, it’s important to remember that high-performance does not imply real-time, and vice versa.
Real-Time Versus Throughput
Another area of system performance that is often confused with real-time is that of system throughput. Throughput is often used to describe the number of requests, events, or other operations, that a software system can process in any given time frame. You often hear of software positively characterized with terms like “messages-per-second,” or “requests-per-second.” A system with high throughput can give its operators a false sense of security when used in a real-time context.
This is because a system with high throughput is not necessarily a real-time system, although it is often misunderstood to be. For instance, a system that supports thousands of requests per second may have some responses with up to a second of latency. Even though a majority of the requests may be handled with low latency, the few with large latency are outliers (those outside the normal response time). In other words, in this example, most requestors received their responses well within the one-second window, but some waited the full second for their response. Because the magnitude and the number of these outliers are unpredictable, high-throughput systems are not necessarily real-time systems.
Typically, real-time systems exhibit lower average throughput than non-real-time systems. This engineering trade-off is well known and accepted in the real-time community, and it stems from many factors, including how tasks are scheduled and how resources are allocated. We’ll explore these factors in relation to Java RTS throughout this book.
Task Completion Value
In a modern computing system, the basic unit of execution is called a thread, or a task. A process is defined as an application launched by the user (either explicitly through a command, or implicitly by logging in) that contains one or more threads of execution. To simplify things going forward, the thread will be the focus of the discussion.
The value, or usefulness, that a task has to any running system is usually dependent upon when it gets its work done, not just that the work is done properly. Even non-real-time systems have this quality. For example, the chart in Figure 1-4 shows that the value of tasks in a non-real-time system usually increases as more of them are completed over time. This is evidence of the throughput quality of non-real-time systems, as explained in the previous section.
Figure 1-4 In a non-real-time system, the perceived value of task completion is directly proportional to the total number completed over time.
For a soft real-time system, the value of task completion rapidly decreases once the task’s deadline passes. Although the correct answer may have been generated, it gets more and more useless as time passes (see Figure 1-5).
Figure 1-5 In a soft real-time system, the value of task completion, after the deadline, decays over time.
Contrast this to a hard real-time system, where the task has zero value after the deadline (see Figure 1-6).
Figure 1-6 The value of task completion in a hard real-time system is zero the moment the deadline passes.
The discussion so far assumes that task completion anytime before the deadline is acceptable. In some cases, as with firm, or isochronal, real-time systems, the task must complete before the deadline, but no earlier than a predefined time. In this case, the value of task completion is, or quickly goes to, zero both before that earliest time and after the deadline (see Figure 1-7).
Figure 1-7 The value of task completion in a firm, isochronal, real-time system is zero if it completes early, or late.
Of course, these graphs are only general visual representations of task completion value in non-real-time and real-time systems; actual value is derived on a case-by-case basis. Later in this chapter, we’ll examine this in more detail when we use task cost functions to reason about efficient real-time scheduling algorithms.
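The general shapes described for the hard, soft, and isochronal cases can be sketched as simple value functions that return a number between 0 and 1 for a given completion time. The class name, the linear decay, and the step-shaped isochronal window are assumptions made for illustration; the figures indicate only the general shape, and real systems would define these functions case by case.

```java
// Sketch only: illustrative task-completion value functions (shapes are assumptions).
final class CompletionValue {

    /** Hard real-time: full value up to the deadline, zero afterward. */
    static double hard(double completion, double deadline) {
        return completion <= deadline ? 1.0 : 0.0;
    }

    /** Soft real-time: full value up to the deadline, then an assumed linear decay to zero. */
    static double soft(double completion, double deadline, double decayWindow) {
        if (completion <= deadline) {
            return 1.0;
        }
        double lateness = completion - deadline;
        return Math.max(0.0, 1.0 - lateness / decayWindow);
    }

    /** Isochronal: value only when completion falls inside [earliest, deadline]. */
    static double isochronal(double completion, double earliest, double deadline) {
        return (completion >= earliest && completion <= deadline) ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        double deadline = 10.0;  // hypothetical deadline (ms)
        for (double t = 0.0; t <= 20.0; t += 5.0) {
            System.out.println("t=" + t
                    + " hard=" + hard(t, deadline)
                    + " soft=" + soft(t, deadline, 10.0)
                    + " isochronal=" + isochronal(t, 5.0, deadline));
        }
    }
}
```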