Some Other Types of Testing
The vocabulary of testing is indeed rich and plentiful. The following terms come up frequently and relate to developer testing in one way or another.
The term smoke testing originated from engineers testing pipes by blowing smoke into them. If there was a crack, the smoke would seep out through it. In software development, smoke testing refers to one or a few simple tests executed immediately after the system has been deployed. The “Hello World” of smoke testing is logging into the application.6 Trivial as it may seem, such a test provides a great deal of information. For example, it will show that
The application has been deployed successfully
The network connection works (in case of network applications)
The database could be reached (because user credentials are usually stored in the database)
The application starts, which means that it isn’t critically flawed
Smoke tests are perfect candidates for automation and should be part of an automated build/deploy cycle. Earlier we touched on the subject of regression tests. Smoke tests are the tests that are run first in a regression test suite or as early as possible in a continuous delivery pipeline.
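The login smoke test can be sketched as follows. This is a minimal illustration, not a real deployment check: the `App` class and its credential store are hypothetical stand-ins for the deployed application, and a real smoke test would talk to the running system over its actual interface.

```python
class App:
    """Hypothetical stand-in for a deployed application."""

    def __init__(self):
        # Simulates the credential store (normally a database).
        self._users = {"admin": "secret"}

    def login(self, user, password):
        """Return True if the credentials match the stored ones."""
        return self._users.get(user) == password


def smoke_test_login(app):
    """The 'Hello World' of smoke testing: can we log in at all?

    If this passes, the application started, accepted a request,
    and reached its credential store.
    """
    assert app.login("admin", "secret"), "Smoke test failed: login rejected"


smoke_test_login(App())
```

In a build/deploy pipeline, a script like this would run immediately after deployment and fail the pipeline as early as possible if the application is critically flawed.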
Sometimes we encounter the term end-to-end testing. Most commonly, the term refers to system testing on steroids. The purpose of an end-to-end test is to exercise the entire execution path or process through a system, which may involve actions outside the system. The difference from system testing is that a process or use case may span not only one system, but several. This is certainly true in cases where the in-house systems are integrated with external systems that cannot be controlled. In such cases, the end-to-end test is supposed to make sure that all systems and subsystems perform correctly and produce the desired result.
What’s problematic about this term is that its existence is inseparably linked to one’s definition of a system and system boundary. In short, if we don’t want to make a fuss about the fact that our e-commerce site uses a payment gateway operated by a third party, then we’re perfectly fine without end-to-end tests.
Characterization testing is the kind of testing you’re forced to engage in when changing old code that supposedly works, but it’s unclear what requirements it’s based on, and there are no tests around to explain what it’s supposed to be doing. Trying to figure out the intended functionality from old documentation is usually futile, because the code has long since diverged from the scribblings on a wrinkled piece of paper covered with coffee stains.7 In such conditions, one has to assume that the code’s behavior is correct and pin it down with tests (preferably unit tests), so that changing it becomes less scary. Thus, the existing behavior is “characterized.” Characterization tests differ from regression tests in that they aim at stabilizing existing behavior, not necessarily correct behavior.
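A characterization test might look like the sketch below. The `legacy_discount` function is a made-up example of inherited code with undocumented rules; the point is that the expected values in the test were obtained by running the code, not derived from any specification.

```python
def legacy_discount(order_total):
    """Hypothetical legacy code: nobody remembers why these
    thresholds and rates were chosen."""
    if order_total > 1000:
        return order_total * 0.85
    if order_total > 100:
        return order_total * 0.95
    return order_total


def test_characterize_legacy_discount():
    # These expectations record what the code currently does,
    # not what any requirement says it should do.
    assert legacy_discount(50) == 50
    assert legacy_discount(200) == 190.0
    assert legacy_discount(2000) == 1700.0


test_characterize_legacy_discount()
```

With the existing behavior pinned down like this, a refactoring that accidentally changes one of the thresholds will be caught immediately, even though we never learned whether the thresholds were right in the first place.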
Positive and Negative Testing
The purpose of positive testing is to verify that whatever is tested works as expected and behaves like it’s supposed to. In order to do so, the test itself is friendly to the tested artifact. It supplies inputs that are within allowed ranges, in a timely fashion, and in the correct order. Tests that are run in such a manner and exercise a typical use case are also called happy path tests.
The purpose of negative testing is to verify that the system behaves correctly when supplied with invalid values and that it doesn’t produce any unexpected results. What outcome to expect depends on the test level. At the system level, we generally want the system to “do the right thing”: either reject the faulty input in a user-friendly manner, or recover somehow. At the unit level, throwing an exception may be the right thing to do. For example, if a function exercised by a unit test expects a positive number, then throwing an IllegalArgumentException or ArgumentOutOfRangeException in a negative test may be fine. What’s important is that the developer has anticipated the scenario.
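The positive/negative pair can be sketched with a small example. The `square_root` function below is illustrative; `ValueError` plays the role that IllegalArgumentException or ArgumentOutOfRangeException would play in Java or C#.

```python
import math


def square_root(x):
    """Return the square root of a non-negative number."""
    if x < 0:
        # The anticipated reaction to invalid input.
        raise ValueError("x must be non-negative")
    return math.sqrt(x)


def test_square_root_positive():
    # Positive (happy path) test: valid input, expected result.
    assert square_root(9) == 3.0


def test_square_root_negative():
    # Negative test: invalid input must raise the anticipated exception.
    try:
        square_root(-1)
    except ValueError:
        pass  # the scenario was anticipated
    else:
        raise AssertionError("Expected ValueError for negative input")


test_square_root_positive()
test_square_root_negative()
```

The negative test passes not because the function “works,” but because it fails in exactly the way the developer anticipated.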
Small, Medium, and Large Tests
When it comes to pruning terminology, Google may serve as a source of inspiration. To avoid the confusion between terms like end-to-end test, system test, functional test, Selenium8 test, or UI test, the engineers at Google divided tests into only three categories—small, medium, and large (Stewart 2010).
Small tests—Correspond closely to unit tests; they’re small and fast. They’re not allowed to access networks, databases, file systems, and external systems. Neither are they allowed to contain sleep statements or test multithreaded code. They must complete within 60 seconds.
Medium tests—May check the interactions between different tiers of the application, which means that they can use databases, access the file system, and test multithreaded code. They should stay away from external systems and remote hosts, though, and should execute for no longer than 300 seconds.
Large tests—Subject to no restrictions.