Designing Abstractions
- What Are You Hiding?
- Impedance Mismatch
- Unifying Abstractions
- Is Your Abstraction Recursive?
- Think Twice, Code Once
Computer science is all about abstractions. The mark of a good programmer is the ability to think at multiple levels of abstraction at once. To write really good code, you need to think about what the computer will be doing while it's executing, the complexity class of the algorithm, and the way the user will interact with it. Neglect either of the first two and you'll end up with poorly performing code. Neglect the third and no one will care that your code is fast; they'll hate using it despite its speed.
With so much depending on the choice of good abstractions, it's surprising how often people abstract the wrong thing. Picking the wrong abstraction can complicate life massively later on, whereas picking a clean model makes the rest of a system much easier to build.
What Are You Hiding?
The first question to ask when designing an abstraction is what it's supposed to hide. An abstraction provides a simple model of something complicated. For example, most programming languages provide some simple collection types that hide the details of their implementation from the programmer.
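As a minimal sketch of this idea in Python, consider two set-like collections that store their elements differently but present the same small interface. The class names and the `count_unique` helper are hypothetical, invented for illustration; they are not taken from any particular library.

```python
from bisect import bisect_left


class SortedListSet:
    """Hypothetical set backed by a sorted list (lookups use binary search)."""

    def __init__(self):
        self._items = []

    def add(self, value):
        i = bisect_left(self._items, value)
        # Insert only if the value is not already present.
        if i == len(self._items) or self._items[i] != value:
            self._items.insert(i, value)

    def __contains__(self, value):
        i = bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def __len__(self):
        return len(self._items)


class HashSet:
    """Hypothetical set backed by a dict (lookups use hashing)."""

    def __init__(self):
        self._items = {}

    def add(self, value):
        self._items[value] = True

    def __contains__(self, value):
        return value in self._items

    def __len__(self):
        return len(self._items)


def count_unique(values, set_type):
    """Caller code relies only on add() and len(), not on the storage strategy."""
    seen = set_type()
    for value in values:
        seen.add(value)
    return len(seen)
```

The caller never sees whether elements live in a sorted list or a hash table. What the abstraction hides is the storage strategy; what it exposes is membership, insertion, and size.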
You also need to ask the converse question: what are you exposing? A good abstraction should expose the features that interest users, but hide things that are irrelevant.
Sometimes you'll want to provide a mechanism for breaking the abstraction. The NSFileHandle class in Cocoa is a good example: It provides a high-level, object-oriented way of interacting with files, but it also allows you to access the raw file descriptor if you need to do that. This capability is useful because it means that you can use the abstraction for the subset of your problem where the abstraction is applicable, and not use it for the subset where it isn't.
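Python's file objects offer a comparable escape hatch, which makes for a convenient illustration (this is an analogy, not the Cocoa API). In the sketch below, the hypothetical `write_durably` function uses the high-level file object for writing, then drops to the raw descriptor through `fileno()` for an operation the high-level interface does not provide directly.

```python
import os


def write_durably(path, text):
    """Write text through the high-level file object, then use the raw
    file descriptor to ask the OS to flush the data to storage."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)         # high-level abstraction: text in, encoding handled
        f.flush()             # push Python's own buffer down to the OS
        os.fsync(f.fileno())  # escape hatch: operate on the raw descriptor
```

Most of the function stays inside the abstraction. Only the one step that needs the descriptor steps outside it.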
Hiding too little or too much can cause problems:
- Hiding too little. Files are a good example of an abstraction that hides too little. As a user-interface concept, files are meaningless. In the UNIX sense, a file is a stream of bytes in persistent storage. That definition means nothing to a typical user, whereas concepts like documents and folders do. Many documents contain multiple, moderately independent objects, and serializing these objects in separate files makes more sense. Systems such as Mac OS and RISC OS allow these objects to be presented to the user as a single document, unlike systems that expose files to the user directly. On those other systems, the file abstraction forces developers to use features such as zip archives for documents, which increases the I/O required for saving.
- Hiding too much. This problem is much more obvious. If you hide low-level features that users need, they simply can't use those features. Anyone who has used a high-level API has probably encountered this common problem at least once. Complaints about programming languages tend to fall into two categories: complaints that a language is too high-level, and complaints that it is too low-level; in other words, complaints that the language exposes too little or too much of the machine's workings.
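To make the multi-file document idea from the first bullet concrete, here is a rough Python sketch of a document stored as a directory of independent parts. The `DocumentBundle` class and its method names are hypothetical, and this is not how Mac OS or RISC OS actually implement such documents; it only illustrates why separate files can reduce the work done on each save.

```python
import os


class DocumentBundle:
    """Hypothetical document stored as a directory of independent parts."""

    def __init__(self, path):
        self.path = path
        self._parts = {}     # part name -> bytes
        self._dirty = set()  # names changed since the last save

    def set_part(self, name, data):
        """Replace one part of the document and mark it as modified."""
        self._parts[name] = data
        self._dirty.add(name)

    def save(self):
        """Write only the parts that changed; return the names written."""
        os.makedirs(self.path, exist_ok=True)
        written = sorted(self._dirty)
        for name in written:
            with open(os.path.join(self.path, name), "wb") as f:
                f.write(self._parts[name])
        self._dirty.clear()
        return written
```

After the first save, editing one part rewrites only that part's file. A document packed into a single archive would typically have to be rewritten as a whole, which is the extra I/O the bullet describes.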