Distributed computing differs from traditional computing in many ways. The scale is larger: there are many machines, each performing specialized tasks. Services are replicated to increase capacity. Hardware failure is treated not as an emergency or an exception but as an expected part of the system; thus the system is designed to work around failure.
Large systems are built through composition of smaller parts. We discussed three ways this composition is typically done: a load balancer with many backend replicas, a frontend with many different backends, and a server tree.
The load balancer divides traffic among many duplicate replicas of the same service. The frontend with many different backends queries those backends in parallel, with each backend performing a different function. The server tree arranges servers in a tree configuration, with each level of the tree serving a different purpose.
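The first pattern can be sketched in a few lines. This is a hypothetical illustration, not a production design: a round-robin balancer that divides incoming requests evenly among duplicate backend replicas (the `RoundRobinBalancer` name and the backend labels are invented for the example).

```python
import itertools

# Hypothetical sketch: a round-robin load balancer dividing
# requests among duplicate backend replicas.
class RoundRobinBalancer:
    def __init__(self, replicas):
        # Cycle endlessly through the list of replicas.
        self._cycle = itertools.cycle(replicas)

    def pick(self):
        # Each request goes to the next replica in turn.
        return next(self._cycle)

balancer = RoundRobinBalancer(["backend-1", "backend-2", "backend-3"])
picks = [balancer.pick() for _ in range(6)]
print(picks)  # each replica receives an equal share of traffic
```

Real load balancers track replica health and remove failed backends from the rotation, which is how the "work around failure" principle above is put into practice.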
Maintaining state in a distributed system is complex, whether it is a large database of constantly updated information or a few key bits to which many systems need constant access. The CAP Principle states that it is not possible to build a distributed system that simultaneously guarantees consistency, availability, and tolerance of network partitions. At most two of the three can be achieved.
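The trade-off can be made concrete with a toy sketch (all names here are invented for illustration): a key-value store with two replicas. While a partition separates the replicas, a write must either be rejected (preserving consistency at the cost of availability) or accepted by the reachable replica alone (preserving availability at the cost of consistency, since the replicas diverge).

```python
# Toy sketch of the CAP trade-off; class names are hypothetical.
class Replica:
    def __init__(self):
        self.data = {}

class PartitionedStore:
    def __init__(self, mode):
        self.mode = mode                      # "CP" or "AP"
        self.replicas = [Replica(), Replica()]
        self.partitioned = False              # simulated network partition

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistent but unavailable: refuse the write.
                raise RuntimeError("unavailable during partition")
            # Available but inconsistent: update only the reachable replica.
            self.replicas[0].data[key] = value
        else:
            # No partition: update every replica.
            for r in self.replicas:
                r.data[key] = value

store = PartitionedStore(mode="AP")
store.partitioned = True
store.write("x", 1)
print(store.replicas[0].data, store.replicas[1].data)  # replicas have diverged
```

Real systems sit on a spectrum between these two extremes (quorums, eventual consistency), but during a partition each write still falls on one side of the choice.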
Systems are expected to evolve over time. To make this easier, the components are loosely coupled. Each embodies an abstraction of the service it provides, such that the internals can be replaced or improved without changing the abstraction. Thus, dependencies on the service do not need to change other than to benefit from new features.
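Loose coupling of this kind is often expressed as an interface that callers depend on, while implementations behind it can be swapped freely. A minimal sketch, with invented names (`MessageQueue`, `InMemoryQueue`):

```python
from abc import ABC, abstractmethod

# Hypothetical abstraction: callers depend only on this interface,
# so the internals can be replaced without changing the callers.
class MessageQueue(ABC):
    @abstractmethod
    def send(self, msg): ...

    @abstractmethod
    def receive(self): ...

class InMemoryQueue(MessageQueue):
    # One implementation; it could later be replaced by a networked
    # queue without any change to code that uses MessageQueue.
    def __init__(self):
        self._items = []

    def send(self, msg):
        self._items.append(msg)

    def receive(self):
        return self._items.pop(0)

def process(queue: MessageQueue):
    # Depends only on the abstraction, not on any implementation.
    queue.send("job")
    return queue.receive()

print(process(InMemoryQueue()))
```

Because `process` names only the abstraction, upgrading the queue's internals never forces its dependents to change, which is exactly the evolvability the paragraph above describes.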
Designing distributed systems requires an understanding of how long various operations take to run so that time-sensitive processes can be designed to meet their latency budgets.
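A latency budget is ultimately simple arithmetic: sum the expected latency of each operation on the request path and compare the total against the budget. The figures below are illustrative assumptions, not measured values.

```python
# Hypothetical latency budget check; all numbers are illustrative.
BUDGET_MS = 200.0

path = {
    "load balancer hop": 0.5,
    "frontend processing": 10.0,
    "backend RPCs (in parallel)": 50.0,  # parallel calls: count the slowest
    "database read": 30.0,
    "response assembly": 5.0,
}

total = sum(path.values())
print(f"total {total:.1f} ms of a {BUDGET_MS:.0f} ms budget,"
      f" headroom {BUDGET_MS - total:.1f} ms")
```

Note that operations issued in parallel contribute only the latency of the slowest one, which is one reason the frontend-with-many-backends pattern queries its backends concurrently.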