Efficient C++: Performance Programming Techniques

By Dov Bulka, David Mayhew
Published Nov 3, 1999 by Addison-Wesley Professional.

Sorry, this book is no longer in print.

Description

Copyright 2000
Dimensions: 7-1/4" x 9-1/4"
Pages: 336
Edition: 1st

Book
ISBN-10: 0-201-37950-3
ISBN-13: 978-0-201-37950-1

Far too many programmers and software designers consider efficient C++ to be an oxymoron. They regard C++ as inherently slow and inappropriate for performance-critical applications. Consequently, C++ has had little success penetrating domains such as networking, operating system kernels, device drivers, and others.

Efficient C++ explodes that myth. Written by two authors with first-hand experience wringing the last ounce of performance from commercial C++ applications, this book demonstrates the potential of C++ to produce highly efficient programs. The book reveals practical, everyday object-oriented design principles and C++ coding techniques that can yield large performance improvements. It points out common pitfalls in both design and code that generate hidden operating costs.

This book focuses on combining C++'s power and flexibility with high performance and scalability, resulting in the best of both worlds. Specific topics include temporary objects, memory management, templates, inheritance, virtual functions, inlining, reference-counting, STL, and much more.

With this book, you will have a valuable compendium of the best performance techniques at your fingertips.

Sample Content

(Each chapter concludes with Key Points.)

Preface.

Introduction.

Roots of Software Inefficiency.

Our Goal.

Software Efficiency: Does It Matter?

Terminology.

Organization of This Book.

1. The Tracing War Story.

Our Initial Trace Implementation.

What Went Wrong.

The Recovery Plan.

2. Constructors and Destructors.

Inheritance.

Composition.

Lazy Construction.

Redundant Construction.

Key Points.

3. Virtual Functions.

Virtual Function Mechanics.

Templates and Inheritance.

Hard Coding.

Inheritance.

Templates.

4. The Return Value Optimization.

The Mechanics of Return-by-Value.

The Return Value Optimization.

Computational Constructors.

5. Temporaries.

Object Definition.

Type Mismatch.

Pass by Value.

Return by Value.

Eliminate Temporaries with op=().

6. Single-Threaded Memory Pooling.

Version 0: The Global new() and delete().

Version 1: Specialized Rational Memory Manager.

Version 2: Fixed-Size Object Memory Pool.

Version 3: Single-Threaded Variable-Size Memory Manager.

7. Multithreaded Memory Pooling.

Version 4: Implementation.

Version 5: Faster Locking.

8. Inlining Basics.

What Is Inlining?

Method Invocation Costs.

Why Inline?

Inlining Details.

Inlining Virtual Methods.

Performance Gains from Inlining.

9. Inlining—Performance Considerations.

Cross-Call Optimization.

Why Not Inline?

Development and Compile-Time Inlining Considerations.

Profile-Based Inlining.

Inlining Rules.

Singletons.

Trivials.

10. Inlining Tricks.

Conditional Inlining.

Selective Inlining.

Recursive Inlining.

Inlining with Static Local Variables.

Architectural Caveat: Multiple Register Sets.

11. Standard Template Library.

Asymptotic Complexity.

Insertion.

Deletion.

Traversal.

Find.

Function Objects.

Better than STL?

12. Reference Counting.

Implementation Details.

Preexisting Classes.

Concurrent Reference Counting.

13. Coding Optimizations.

Caching.

Precompute.

Reduce Flexibility.

80-20 Rule: Speed Up the Common Path.

Lazy Evaluation.

Useless Computations.

System Architecture.

Memory Management.

Library and System Calls.

Compiler Optimization.

14. Design Optimizations.

Design Flexibility.

Caching.

Web Server Timestamps.

Data Expansion.

The Common Code Trap.

Efficient Data Structures.

Lazy Evaluation.

getpeername().

Useless Computations.

Obsolete Code.

15. Scalability.

The SMP Architecture.

Amdahl’s Law.

Multithreaded and Synchronization Terminology.

Break Up a Task into Multiple Subtasks.

Cache Shared Data.

Share Nothing.

Partial Sharing.

Lock Granularity.

False Sharing.

Thundering Herd.

Reader/Writer Locks.

16. System Architecture Dependencies.

Memory Hierarchies.

Registers: Kings of Memory.

Disk and Memory Structures.

Cache Effects.

Cache Thrash.

Avoid Branching.

Prefer Simple Calculations to Small Branches.

Threading Effects.

Context Switching.

Kernel Crossing.

Threading Choices.

Bibliography.

Index.

Preface

If you conducted an informal survey of software developers on the issue of C++ performance, you would undoubtedly find that the vast majority of them view performance issues as the Achilles' heel of an otherwise fine language. We have heard it repeatedly ever since C++ burst on the corporate scene: C++ is a poor choice for implementing performance-critical applications. In the mind of developers, this particular application domain was ruled by plain C and, occasionally, even assembly language.

As part of that software community we had the opportunity to watch that myth develop and gather steam. Years ago, we participated in the wave that embraced C++ with enthusiasm. All around us, many development projects plunged in headfirst. Some time later, software solutions implemented in C++ began rolling out. Their performance was typically less than optimal, to put it gently. Enthusiasm over C++ in performance-critical domains has cooled. We were in the business of supplying networking software whose execution speed was not up for negotiation--speed was top priority. Since networking software is pretty low on the software food chain, its performance is crucial. Large numbers of applications were going to sit on top of it and depend on it. Poor performance in the low levels ripples all the way up to higher-level applications.

Our experience was not unique. All around, early adopters of C++ had difficulties with the resulting performance of their C++ code. Instead of attributing the difficulties to the steep learning curve of the new object-oriented software development paradigm, we blamed them on C++, the dominant language for the expression of the paradigm. Even though C++ compilers were still essentially in their infancy, the language was branded as inherently slow. This belief spread quickly and is now widely accepted as fact. Software organizations that passed on C++ frequently pointed to performance as their key concern. That concern was rooted in the perception that C++ cannot match the performance delivered by its C counterpart. Consequently, C++ has had little success penetrating software domains that view performance as top priority: operating system kernels, device drivers, networking systems (routers, gateways, protocol stacks), and more.

We have spent years dissecting large systems of C and C++ code trying to squeeze every ounce of performance out of them. It is through our experience of slugging it out in the trenches that we have come to appreciate the potential of C++ to produce highly efficient programs. We've seen it done in practice. This book is our attempt to share that experience and document the many lessons we have learned in our own pursuit of C++ efficiency. Writing efficient C++ is not trivial, nor is it rocket science. It takes the understanding of some performance principles, as well as information on C++ performance traps and pitfalls.

The 80-20 rule is an important principle in the world of software construction. We adopt it in the writing of this book as well: 20% of all performance bugs will show up 80% of the time. We therefore chose to concentrate our efforts where it counts the most. We are interested in those performance issues that arise frequently in industrial code and have significant impact. This book is not an exhaustive discussion of the set of all possible performance bugs and their solutions; hence, we will not cover what we consider esoteric and rare performance pitfalls.

Our point of view is undoubtedly biased by our practical experience as programmers of server-side, performance-critical communications software. This bias impacts the book in several ways:

The profile of performance issues that we encounter in practice may be slightly different in nature than those found in scientific computing, database applications, and other domains. That's not a problem. Generic performance principles transcend distinct domains, and apply equally well in domains other than networking software.
At times, we invented contrived examples to drive a point home, although we tried to minimize this. We have made enough coding mistakes in the past to have a sizable collection of samples taken from real production-level code that we have worked on. Our expertise was earned the hard way--by learning from our own mistakes as well as those of our colleagues. As much as possible, we illustrated our points with real code samples.
We do not delve into the asymptotic complexity of algorithms, data structures, and the latest and greatest techniques for accessing, sorting, searching, and compressing data. These are important topics, but they have been extensively covered elsewhere [Knu73, BR95, KP74]. Instead, we focus on simple, practical, everyday coding and design principles that yield large performance improvements. We point out common design and coding practices that lead to poor performance, whether it be through the unwitting use of language features that carry high hidden costs or through violating any number of subtle (and not so subtle) performance principles.

So how do we separate myth from reality? Is C++ performance truly inferior to that of C? It is our contention that the common perception of inferior C++ performance is invalid. We concede that when comparing a C program to a C++ version of what appears to be the same thing, the C program is generally faster. However, we also claim that the apparent similarity of the two programs typically is based on their data handling functionality, not their correctness, robustness, or ease of maintenance. Our contention is that when C programs are brought up to the level of C++ programs in these regards, the speed differences disappear, or the C++ versions are faster.

Thus C++ is inherently neither slower nor faster. It could be either, depending on how it is used and what is required from it. It's the way it is used that matters: If used properly, C++ can yield software systems exhibiting not just acceptable performance but superior performance.

We would like to thank the many people who contributed to this work. The toughest part was getting started and it was our editor, Marina Lang, who was instrumental in getting this project off the ground. Julia Sime made a significant contribution to the early draft and Yomtov Meged contributed many valuable suggestions as well. He also was the one who pointed out to us the subtle difference between our opinions and the absolute truth. Although those two notions may coincide at times, they are still distinct.

Many thanks to the reviewers hired by Addison-Wesley; their feedback was extremely valuable.

Thanks also to our friends and colleagues who reviewed portions of the manuscript. They are, in no particular order, Cyndy Ross, Art Francis, Scott Snyder, Tricia York, Michael Fraenkel, Carol Jones, Heather Kreger, Kathryn Britton, Ruth Willenborg, David Wisler, Bala Rajaraman, Don "Spike" Washburn, and Nils Brubaker.

Last but not least, we would like to thank our wives, Cynthia Powers Bulka and Ruth Washington Mayhew.


