## 1.3. Random Variables

So far, we have restricted ourselves to studying events, which are collections of outcomes of experiments or observations. However, we are often interested in abstract quantities that are derived from events and observations but are not themselves events or observations. For example, if we throw a fair die, we may want to compute the probability that the square of the face value is smaller than 10. This quantity is random, can be associated with a probability, and depends on some underlying random events. Yet, it is neither an event nor an observation: It is a **random variable**. Intuitively, a random variable is a quantity that can assume any one of a set of values, called its **domain** *D*, and whose value can be stated only probabilistically. In this section, we will study random variables and their distributions.
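The die example can be made concrete with a few lines of Python; the enumeration approach below is our illustration, not part of the text:

```python
from fractions import Fraction

# Sample space: the six equally likely faces of a fair die.
faces = [1, 2, 3, 4, 5, 6]

# The random variable Y = (face value)^2. The event "Y < 10" corresponds
# to the underlying outcomes whose squares are below 10: faces 1, 2, 3.
favorable = [f for f in faces if f * f < 10]

probability = Fraction(len(favorable), len(faces))
print(probability)  # 1/2
```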

More formally, a **real random variable**—the type most commonly encountered in computer networking applications—is a mapping from events in a sample space *S* to the domain of real numbers. The probability associated with each value assumed by a real random variable^{2} is the probability of the underlying event in the sample space, as illustrated in Figure 1.1.

Figure 1.1. The random variable *X* takes on values from the domain *D*. Each value taken on by the random variable is associated with a probability corresponding to an event *E*, which is a subset of outcomes in the sample space *S*.

A random variable is **discrete** if the set of values it can assume is finite or countably infinite. The elements of *D* should be *mutually exclusive*—that is, the random variable cannot simultaneously take on more than one value—and *exhaustive*—that is, the random variable cannot assume a value that is not an element of *D*.

A random variable is **continuous** if the values it can take on form one or more continuous intervals of the real line.

### 1.3.1. Distribution

In many cases, we are not interested in the actual value taken by a random variable but in the probabilities associated with each value that it can assume. To make this more precise, consider a discrete random variable *X*_{d} that assumes distinct values *D* = {*x*_{1}, *x*_{2},..., *x*_{n}}. We define the value *p*(*x*_{i}) to be the probability of the event that results in *X*_{d} assuming the value *x*_{i}. The function *p*(*X*_{d}), which characterizes the probability that *X*_{d} will take on each value in its domain, is called the **probability mass function** of *X*_{d}.^{3} It is also sometimes called the **distribution** of *X*_{d}.

Unlike a discrete random variable, which has nonzero probability of taking on any particular value in its domain, the probability that a continuous real random variable *X*_{c} will take on any specific value in its domain is 0. Nevertheless, in nearly all cases of interest in the field of computer networking, we will be able to assume that we can define the **density** function *f*(*x*) of *X*_{c} as follows: The probability that *X*_{c} takes on a value between two reals, *x*_{1} and *x*_{2}, *p*(*x*_{1} ≤ *x* ≤ *x*_{2}), is given by the integral ∫_{*x*_{1}}^{*x*_{2}} *f*(*x*)*dx*. Of course, we need to ensure that ∫_{–∞}^{∞} *f*(*x*)*dx* = 1. Alternatively, we can think of *f*(*x*) as being implicitly defined by the statement that a variable *x* chosen randomly in the domain of *X*_{c} has probability *f*(α)Δ of lying in the range [α, α + Δ] when Δ is very small.
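Both definitions can be checked numerically. The sketch below uses a fair-die probability mass function and an exponential density with rate λ = 2; both distributions, and the midpoint-rule integration, are illustrative choices, not taken from the text:

```python
import math

# Discrete case: probability mass function of a fair die.
pmf = {face: 1 / 6 for face in range(1, 7)}
assert abs(sum(pmf.values()) - 1.0) < 1e-9  # the masses must sum to 1

# Continuous case: exponential density f(x) = lam * exp(-lam * x), x >= 0.
lam = 2.0
def f(x):
    return lam * math.exp(-lam * x)

# p(x1 <= X <= x2) as a numerical integral of the density (midpoint rule).
def prob_between(x1, x2, steps=100_000):
    dx = (x2 - x1) / steps
    return sum(f(x1 + (i + 0.5) * dx) for i in range(steps)) * dx

# For the exponential, the exact value is exp(-lam*x1) - exp(-lam*x2).
exact = math.exp(-lam * 0.5) - math.exp(-lam * 1.5)
print(abs(prob_between(0.5, 1.5) - exact) < 1e-6)  # True
```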

### 1.3.2. Cumulative Distribution Function

The domain of a discrete real random variable *X*_{d} is totally ordered; that is, for any two distinct values *x*_{1} and *x*_{2} in the domain, either *x*_{1} > *x*_{2} or *x*_{2} > *x*_{1}. We define the **cumulative distribution function** *F*(*X*_{d}) by

*F*(*x*) = *p*(*X*_{d} ≤ *x*) = Σ_{*x*_{i} ≤ *x*} *p*(*x*_{i})

Note the difference between *F*(*X*_{d}), which denotes the cumulative distribution of random variable *X*_{d}, and *F*(*x*), which is the value of the cumulative distribution for the value *X*_{d} = *x*.

Similarly, the cumulative distribution function of a continuous random variable *X*_{c}, denoted *F*(*X*_{c}), is given by

*F*(*x*) = *p*(*X*_{c} ≤ *x*) = ∫_{–∞}^{*x*} *f*(*y*)*dy*

By definition of probability, in both cases, 0 ≤ *F*(*X*_{d}) ≤ 1 and 0 ≤ *F*(*X*_{c}) ≤ 1.
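As a small illustration, the discrete cumulative distribution can be computed by summing the masses of all values at or below *x*; the fair-die probability mass function is an assumed example:

```python
from fractions import Fraction

# Probability mass function of a fair die (an assumed example).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# F(x) = p(X_d <= x): sum the masses of all values at or below x.
def F(x):
    return sum(p for value, p in pmf.items() if value <= x)

print(F(3))  # 1/2, since p(X_d <= 3) = 3/6
print(F(6))  # 1, the sum over the whole domain
```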

### 1.3.3. Generating Values from an Arbitrary Distribution

The cumulative distribution function *F*(*X*), where *X* is either discrete or continuous, can be used to generate values drawn from the underlying discrete or continuous distribution *p*(*X*_{d}) or *f*(*X*_{c}), as illustrated in Figure 1.2. Consider a discrete random variable *X*_{d} that takes on values *x*_{1}, *x*_{2},..., *x*_{n} with probabilities *p*(*x*_{i}). By definition, *F*(*x*_{k}) = *F*(*x*_{k – 1}) + *p*(*x*_{k}). Moreover, *F*(*X*_{d}) always lies in the range [0,1]. Therefore, if we were to generate a random number *u* with uniform probability in the range [0,1], the probability that *u* lies in the range [*F*(*x*_{k – 1}), *F*(*x*_{k})] is *p*(*x*_{k}). Moreover, *x*_{k} = *F*^{–1}(*u*). Therefore, the procedure to generate values from the discrete distribution *p*(*X*_{d}) is as follows: First, generate a random number *u* uniformly in the range [0,1]; second, compute *x*_{k} = *F*^{–1}(*u*).

Figure 1.2. Generating values from an arbitrary (a) discrete or (b) continuous distribution

We can use a similar approach to generate values from a continuous random variable *X*_{c} with associated density function *f*(*X*_{c}). By definition, *F*(*x* + δ) = *F*(*x*) + *f*(*x*)δ for very small values of δ. Moreover, *F*(*X*_{c}) always lies in the range [0,1]. Therefore, if we were to generate a random number *u* with uniform probability in the range [0,1], the probability that *u* lies in the range [*F*(*x*), *F*(*x* + δ)] is *f*(*x*)δ, which means that *x* = *F*^{–1}(*u*) is distributed according to the desired density function *f*(*X*_{c}). Therefore, the procedure to generate values from the continuous distribution *f*(*X*_{c}) is as follows: First, generate a random number *u* uniformly in the range [0,1]; second, compute *x* = *F*^{–1}(*u*).
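For instance, for an exponential density *f*(*x*) = λ*e*^{–λ*x*}, the cumulative distribution is *F*(*x*) = 1 – *e*^{–λ*x*}, so *F*^{–1}(*u*) = –ln(1 – *u*)/λ. A minimal sketch, with λ = 2 as an arbitrary choice:

```python
import math
import random

lam = 2.0  # rate parameter; an arbitrary choice for illustration

def draw_exponential(rng=random):
    # Step 1: u uniform in [0, 1); step 2: x = F^{-1}(u) = -ln(1 - u)/lam.
    u = rng.random()
    return -math.log(1.0 - u) / lam

# Sanity check: the sample mean should approach E[X] = 1/lam.
samples = [draw_exponential() for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 0.5
```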

### 1.3.4. Expectation of a Random Variable

The **expectation**, **mean**, or **expected value** *E*[*X*_{d}] of a discrete random variable *X*_{d} that can take on *n* values *x*_{i} with probability *p*(*x*_{i}) is given by

*E*[*X*_{d}] = Σ_{*i*=1}^{*n*} *x*_{i}*p*(*x*_{i})

Similarly, the expectation *E*[*X*_{c}] of a continuous random variable *X*_{c} with density function *f*(*x*) is given by

*E*[*X*_{c}] = ∫_{–∞}^{∞} *x* *f*(*x*)*dx*

Intuitively, the expected value of a random variable is the value we expect it to take, knowing nothing else about it. For instance, if you knew the distribution of the random variable corresponding to the time it takes you to travel from home to work, then you would expect your commute time on a typical day to be the expected value of this random variable.

The expectation of a random variable gives us a reasonable idea of how it behaves in the long run. It is important to remember, however, that two random variables with the same expectation can have rather different behaviors.

We now state, without proof, four useful properties of expectations.

- For constants *a* and *b*: *E*[*aX* + *b*] = *aE*[*X*] + *b*
- *E*[*X* + *Y*] = *E*[*X*] + *E*[*Y*], or, more generally, for any set of random variables *X*_{i}: *E*[Σ_{i} *X*_{i}] = Σ_{i} *E*[*X*_{i}]
- For a discrete random variable *X*_{d} with probability mass function *p*(*x*_{i}) and any function *g*(.): *E*[*g*(*X*_{d})] = Σ_{i} *g*(*x*_{i})*p*(*x*_{i})
- For a continuous random variable *X*_{c} with density function *f*(*x*) and any function *g*(.): *E*[*g*(*X*_{c})] = ∫_{–∞}^{∞} *g*(*x*)*f*(*x*)*dx*

Note that, in general, *E*[*g*(*X*)] is not the same as *g*(*E*[*X*]); that is, a function cannot be taken out of the expectation.
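The last point is worth checking on an example. For a fair die, *E*[*X*] = 7/2, whereas *E*[*X*^{2}] = 91/6 ≠ (7/2)^{2} = 49/4; the computation below is our illustration:

```python
from fractions import Fraction

# Expectation of a fair-die roll: E[X] = sum of x_i * p(x_i).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
E_X = sum(x * p for x, p in pmf.items())
print(E_X)  # 7/2

# E[g(X)] with g(x) = x^2, computed as sum of g(x_i) * p(x_i).
E_X2 = sum(x * x * p for x, p in pmf.items())
print(E_X2)      # 91/6
print(E_X * E_X) # 49/4 -- so g(E[X]) differs from E[g(X)]
```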

### 1.3.5. Variance of a Random Variable

The **variance** of a random variable is defined by *V*(*X*) = *E*[(*X* – *E*[*X*])^{2}]. Intuitively, it shows how far the values taken on by a random variable tend to lie from its expected value. Expanding the square and using the linearity of expectation, we can express the variance in terms of two expectations as *V*(*X*) = *E*[*X*^{2}] – 2*E*[*X*]*E*[*X*] + *E*[*X*]^{2} = *E*[*X*^{2}] – *E*[*X*]^{2}.

In practical terms, the distribution of a random variable over its domain *D*—this domain is also called the **population**—is not usually known. Instead, the best we can do is to sample the values it takes on by observing its behavior over some period of time. Given *n* observations *x*_{i}, we can estimate the variance of the random variable by keeping running counters for Σ*x*_{i} and Σ*x*_{i}^{2}. Then,

*V*(*X*) ≈ (1/*n*)Σ*x*_{i}^{2} – ((1/*n*)Σ*x*_{i})^{2}

where this approximation improves with *n*, the size of the sample, as a consequence of the law of large numbers, discussed in Section 1.7.4.
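A sketch of this estimator, using simulated fair-die rolls as a stand-in for real observed samples (the simulation and seed are our assumptions):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# Two running counters suffice: the sum of the samples and the sum of
# their squares. The samples are simulated fair-die rolls, whose true
# variance is 91/6 - (7/2)^2 = 35/12 ~ 2.9167.
n = 100_000
sum_x = 0.0
sum_x2 = 0.0
for _ in range(n):
    x = random.randint(1, 6)  # observe one value of the random variable
    sum_x += x
    sum_x2 += x * x

estimate = sum_x2 / n - (sum_x / n) ** 2
print(estimate)  # close to 35/12
```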

The following properties of the variance of a random variable can be easily shown for both discrete and continuous random variables.

- For constant *a*: *V*(*X* + *a*) = *V*(*X*)
- For constant *a*: *V*(*aX*) = *a*^{2}*V*(*X*)
- If *X* and *Y* are independent random variables: *V*(*X* + *Y*) = *V*(*X*) + *V*(*Y*)