# Multicore Application Programming: Identifying Opportunities for Parallelism

• Print
This chapter is from the book

## How Dependencies Influence the Ability Run Code in Parallel

Dependencies within an application (or the calculation it performs) define whether the application can possibly run in parallel. There are two types of dependency: loop- or data-carried dependencies and memory-carried dependencies.

With a loop-carried dependency, the next calculation in a loop cannot be performed until the results of the previous iteration are known. A good example of this is the loop to calculate whether a point is in the Mandelbrot set. Listing 3.4 shows this loop.

#### Listing 3.4. Code to Determine Whether a Point Is in the Mandelbrot Set

int inSet(double ix, double iy)
{
int iterations=0;
double x = ix, y = iy, x2 = x*x, y2 = y*y;
while ( (x2+y2 < 4) && (iterations < 1000) )
{
y  = 2 * x * y + iy;
x  = x2 - y2 + ix;
x2 = x * x;
y2 = y * y;
iterations++;
}
return iterations;
}

Each iteration of the loop depends on the results of the previous iteration. The loop terminates either when 1,000 iterations have been completed or when the point escapes a circle centered on the origin of radius two. It is not possible to predict how many iterations this loop will complete. There is also insufficient work for each iteration of the loop to be split over multiple threads. Hence, this loop must be performed serially.

Memory-carried dependencies are more subtle. These represent the situation where a memory access must be ordered with respect to another memory access to the same location. Consider the snippet of code shown in Listing 3.5.

#### Listing 3.5. Code Demonstrating Ordering Constraints

int val=0;

void g()
{
val = 1;
}

void h()
{
val = val + 2;
}

If the routines g() and h() are executed by different threads, then the result depends on the order in which the two routines are executed. If g() is executed followed by h(), then the val will hold the result 3. If they are executed in the opposite order, then val will contain the result 1. This is an example of a memory-carried dependence because to produce the correct answer, the operations need to be performed in the correct order.

### Antidependencies and Output Dependencies

Suppose one task, A, needs the data produced by another task, B; A depends on B and cannot start until B completes and releases the data needed by A. This is often referred to as true dependency. Typically, B writes some data, and A needs to read that data. There are other combinations of two threads reading and writing data. Table 3-1 illustrates the four ways that tasks might have a dependency.

#### Table 3-1. Possible Ordering Constraints

When both threads perform read operations, there is no dependency between them, and the same result is produced regardless of the order the threads run in.

With an antidependency, or write after read, one task has to read the data before the second task can overwrite it. With an output dependency, or write after write, one of the two tasks has to provide the final result, and the order in which the two tasks write their results is critical. These two types of dependency can be most clearly illustrated using serial code.

In the code shown in Listing 3.6, there is an antidependency on the variable data1. The first statement needs to complete before the second statement because the second statement reuses the variable data1.

#### Listing 3.6. An Example of an Antidependency

void anti-dependency()
{
result1 = calculation( data1 );  // Needs to complete first
data1   = result2 + 1;           // Will overwrite data1
}

If one of the statements was modified to use an alternative or temporary variable, for example, data1_prime, then both statements could proceed in any order. Listing 3.7 shows this modified code.

#### Listing 3.7. Fixing an Antidependency

void anti-dependency()
{
data1_prime = data1;      // Local copy of data1
result1 = calculation( data1_prime );
data1   = result2 + 1;   // No longer has antidependence
}

The code shown in Listing 3.8 demonstrates an output dependency on the variable data1. The second statement needs to complete after the first statement only because they both write to the same variable.

#### Listing 3.8. An Output Dependency

void output-dependency()
{
data1 = result1 + 2;
data1 = result2 + 2; // Overwrites same variable
}

If the first target variable was renamed data1_prime, then both statements could proceed in any order. Listing 3.9 shows this fix.

#### Listing 3.9. Fixing an Output Dependency

void output-dependency()
{
data1_prime = result1 + 2;
data1       = result2 + 2; // No longer has output-dependence
}

What is important about these two situations is that both output and antidependencies can be avoided by renaming the data being written, so the final write operation goes to a different place. This might involve taking a copy of the object and having each task work on their own copy, or it might be a matter of duplicating a subset of the active variables. In the worst case, it could be resolved by both tasks working independently and then having a short bit of code that sets the variables to the correct state.

### Using Speculation to Break Dependencies

In some instances, there is a clear potential dependency between different tasks. This dependency means it is impossible to use a traditional parallelization approach where the work is split between the two threads. Even in these situations, it can be possible to extract some parallelism at the expense of performing some unnecessary work. Consider the code shown in Listing 3.10.

#### Listing 3.10. Code with Potential for Speculative Execution

void doWork( int x, int y )
{
int value = longCalculation( x, y );
if (value > threshold)
{
return value + secondLongCalculation( x, y );
}
else
{
return value;
}
}

In this example, it is not known whether the second long calculation will be performed until the first one has completed. However, it would be possible to speculatively compute the value of the second long calculation at the same time as the first calculation is performed. Then depending on the return value, either discard the second value or use it. Listing 3.11 shows the resulting code parallelized using pseudoparallelization directives.

#### Listing 3.11. Speculatively Parallelized Code

void doWork(int x, int y)
{
int value1, value2;
#pragma start parallel region
{
{
value1 = longCalculation( x, y );
}
{
value2 = secondLongCalculation( x, y );
}
}
#pragma wait for parallel tasks to complete
if (value1 > threshold)
{
return value1 + value2;
}
else
{
return value1;
}
}

The #pragma directives in the previous code are very similar to those that are actually used in OpenMP, which we will discuss in Chapter 7, "OpenMP and Automatic Parallelization." The first directive tells the compiler that the following block of code contains statements that will be executed in parallel. The two #pragma directives in the parallel region indicate the two tasks to be performed in parallel. A final directive indicates that the code cannot exit the parallel region until both tasks have completed.

Of course, it is important to consider whether the parallelization will slow performance down more than it will improve performance. There are two key reasons why the parallel implementation could be slower than the serial code.

• The overhead from performing the work and synchronizing after the work is close in magnitude to the time taken by the parallel code.
• The second long calculation takes longer than the first long calculation, and the results of it are rarely used.

It is possible to put together an approximate model of this situation. Suppose the first calculation takes T1 seconds and the second calculation takes T2 seconds; also suppose that the probability that the second calculation is actually needed is P. Then the total runtime for the serial code would be T1 + P * T2.

For the parallel code, assume that the calculations take the same time as they do in the serial case and the probability remains unchanged, but there is also an overhead from synchronization, S. Then the time taken by the parallel code is S + max (T1,T2).

Figure 3.22 shows the two situations.

We can further deconstruct this to identify the constraints on the two situations where the parallel version is faster than the serial version:

• If T1 > T2, then for the speculation to be profitable, S+T1 < T1+P*T2, or S < P*T2. In other words, the synchronization cost needs to be less than the average amount of time contributed by the second calculation. This makes sense if the second calculation is rarely performed, because then the additional overhead of synchronization needed to speculatively calculate it must be very small.
• If T2 > T1 (as shown in Figure 3.21), then for speculation to be profitable, S+T2 < T1+P*T2 or P > (T2 +S -T1)/T2. This is a more complex result because the second task takes longer than the first task, so the speculation starts off with a longer runtime than the original serial code. Because T2 > T1, T2 + S -T1 is always >0. T2 + S -T1 represents the overhead introduced by parallelization. For the parallel code to be profitable, this has to be lower than the cost contributed by executing T2. Hence, the probability of executing T2 has to be greater than the ratio of the additional cost to the original cost. As the additional cost introduced by the parallel code gets closer to the cost of executing T2, then T2 needs to be executed increasingly frequently in order to make the parallelization profitable.

The previous approach is speculative execution, and the results are thrown away if they are not needed. There is also value speculation where execution is performed, speculating on the value of the input. Consider the code shown in Listing 3.12.

#### Listing 3.12. Code with Opportunity for Value Speculation

void doWork(int x, int y)
{
int value = longCalculation( x, y );
return secondLongCalculation( value );
}

In this instance, the second calculation depends on the value of the first calculation. If the value of the first calculation was predictable, then it might be profitable to speculate on the value of the first calculation and perform the two calculations in parallel. Listing 3.13 shows the code parallelized using value speculation and pseudoparallelization directives.

#### Listing 3.13. Parallelization Using Value Speculations

void doWork(int x, int y)
{
int value1, value2;
static int last_value;
#pragma start parallel region
{
{
value1 = longCalculation( x, y );
}
{
value2 = secondLongCalculation( lastValue );
}
}
#pragma wait for parallel tasks to complete
if (value1 == lastvalue)
{
return value2;
}
else
{
lastValue = value1;
return secondLongCalculation( value1 );
}
}

The value calculation for this speculation is very similar to the calculation performed for the speculative execution example. Once again, assume that T1 and T2 represent the costs of the two routines. In this instance, P represents the probability that the speculation is incorrect. S represents the synchronization overheads. Figure 3.23 shows the costs of value speculation.

The original code takes T1+T2 seconds to complete. The parallel code takes max(T1,T2)+S+P*T2. For the parallelization to be profitable, one of the following conditions needs to be true:

• If T1 > T2, then for the speculation to be profitable, T1 + S + P*T2 < T1 +T2. So, S < (1-P) * T2. If the speculation is mostly correct, the synchronization costs just need to be less than the costs of performing T2. If the synchronization is often wrong, then the synchronization costs need to be much smaller than T2 since T2 will be frequently executed to correct the misspeculation.
• If T2 > T1, then for the speculation to be profitable, T2 + S + P*T2 < T1 +T2. So, S <T1 – P*T2. The synchronization costs need to be less than the cost of T1 after the overhead of recomputing T2 is included.

As can be seen from the preceding discussion, speculative computation can lead to a performance gain but can also lead to a slowdown; hence, care needs to be taken in using it only where it is appropriate and likely to provide a performance gain.

### Critical Paths

One way of looking at parallelization is by examining the critical paths in the application. A critical path is the set of steps that determine the minimum time that the task can complete in. A serial program might complete tasks A, B, C, and D. Not all of the tasks need to have dependencies. B might depend on the results of A, and D might depend on the results of B and C, but C might not depend on any previous results. This kind of data can be displayed on a graph such as the one in Figure 3.24.

It is relatively straightforward to identify the critical path in a process once the dependencies and durations have been identified. From this graph, it is apparent that task C could be performed in parallel with tasks A and B. Given timing data, it would be possible to estimate the expected performance of this parallelization strategy.

• 🔖 Save To Your Account

### InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

## Overview

Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

## Collection and Use of Information

To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

### Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

### Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

### Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

### Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

### Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

### Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

## Other Collection and Use of Information

### Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

### Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

### Do Not Track

This site currently does not respond to Do Not Track signals.

## Security

Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

## Children

This site is not directed to children under the age of 13.

## Marketing

Pearson may send or direct marketing communications to users, provided that

• Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
• Such marketing is consistent with applicable law and Pearson's legal obligations.
• Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
• Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

## Correcting/Updating Personal Information

If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

## Choice/Opt-out

Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

## Sale of Personal Information

Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

## Supplemental Privacy Statement for California Residents

California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

## Sharing and Disclosure

Pearson may disclose personal information, as follows:

• As required by law.
• With the consent of the individual (or their parent, if the individual is a minor)
• In response to a subpoena, court order or legal process, to the extent permitted or required by law
• To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
• In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
• To investigate or address actual or suspected fraud or other illegal activities
• To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
• To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
• To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.