Reflections on a Near Disaster
The situation could have been much, much worse. Finding two low-level hardware faults so close to production is painful, but it is nothing compared to a line-down crisis. In response to these two close calls, Acme asked Bob to lead an all-day process review; he helped the team retrace the trail that led to the failures:
- The signal integrity team did not have a representative at the table when the architecture was being defined. A cross-disciplinary design team is the cornerstone of a healthy, solid design process.
- There was no cost-performance analysis of the PC board stack-up. This is admittedly one of the most difficult things any engineer has to do.
- Schedule should never take precedence over product quality unless the risk of non-functional hardware is deemed acceptable.
- High-quality models are hard to come by, but planning ahead and involving procurement make it possible to have the models in place when they are needed.
- If a design is sensitive to edge rate and output impedance, then the component datasheet should specify these parameters.
- A signal integrity engineer needs to evaluate every net in a design and make a decision about the appropriate level of analysis required for that net or group of nets. The decisions may range from no analysis at all to a thorough electrical characterization of each component in the net and end-to-end coupled simulation of the IO circuits and everything in between.
During the review, Bob pointed out that both failures were, at their root, timing failures. Crosstalk from the PCI Express net occurred at the same time the I2C clock was switching through its threshold. The reflection on the memory address net occurred at the same time the DRAM was sampling it.
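The common thread in both failures, a disturbance arriving at the instant a receiver is deciding what it sees, can be expressed as a simple interval-overlap check. The sketch below is illustrative only: the `Window` type, the helper names, and every number in it are hypothetical and are not taken from the design reviewed here. It assumes the disturbance and the receiver's setup-and-hold window can each be approximated as a time interval measured from a common reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A time interval in nanoseconds (hypothetical helper type)."""
    start_ns: float
    end_ns: float


def sampling_window(edge_ns: float, setup_ns: float, hold_ns: float) -> Window:
    """Interval around a clock edge during which the input must be stable."""
    return Window(edge_ns - setup_ns, edge_ns + hold_ns)


def overlaps(a: Window, b: Window) -> bool:
    """True if the two intervals share any span of time."""
    return a.start_ns < b.end_ns and b.start_ns < a.end_ns


# Hypothetical numbers, chosen only to illustrate the check.
dram_window = sampling_window(edge_ns=10.0, setup_ns=0.6, hold_ns=0.6)
reflection_hit = Window(9.8, 10.3)    # arrives while the receiver is sampling
reflection_miss = Window(11.5, 12.0)  # arrives after the sampling window closes

print(overlaps(reflection_hit, dram_window))   # True
print(overlaps(reflection_miss, dram_window))  # False
```

The same disturbance is harmless in one case and fatal in the other; only its arrival time relative to the sampling window differs. That is the sense in which a noise problem is, at its root, a timing problem.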
Circumstances do not usually present themselves in such an obvious logical progression as they did in this fictional scenario. It is only careful retrospection that reveals the sequence of events that led to a particular conclusion. To some degree, the job of the engineer is to play the role of the seer, who can predict these circumstances and avoid them without becoming a whiner to whom nobody pays much attention.
It must be tempting for those who are making weighty architectural decisions during the earliest stages of a new product to avoid addressing implementation details. As any experienced engineer will attest, there is usually a price to be paid for defining architecture without input from those whose job it is to implement the architecture. The worst possible scenario is a product that is marginally functional—except the marginal part does not become apparent until production is in full swing. This is even more treacherous than the product that overruns its budget, misses its milestones, and never makes it to market at all. Assembly lines come down. Companies—more than one—lose large quantities of money each day. There may be recalls. There will certainly be redesigns under intense pressure. At the end of the whole experience lies a painful loss of reputation. This is a scenario that no Vice President of Technology or Chief Financial Officer would choose to put in motion if the choice were made clear.
One thing is crystal clear: A company greatly enhances its chance for success by building a cross-disciplinary team in the early architectural phases of the project. In addition to a system architect, a minimum team would include a board layout designer, firmware programmer, and engineers from the following disciplines: logic, software, mechanical, thermal, power, manufacturing, electromagnetic compliance, and signal integrity. A large company may have a separate engineer to represent each of these disciplines, while in most other companies, one engineer plays several roles. In any case, it is vital that each discipline have representation at the table and the resources to do its job.