InformIT

Fighting Fire with Fire: Designing a "Good" Computer Virus

Date: Oct 15, 2004

Cyrus Peikari demonstrates methods to design and test a live, attenuated computer virus vaccine using real-world simulation.

Introduction

Computer viruses have outgrown traditional antivirus methods. Despite the widespread use of antivirus software, viruses still cause billions of dollars in damage each year. To counter the threat, this article offers suggestions for designing and testing unique, computer-based solutions inspired by the field of biology. Specifically, we describe ideas for designing and testing a live, attenuated computer virus vaccine, including methods for successfully testing the vaccine using real-world simulation.

Defining the Virus Problem

Currently, no standard definition exists for a computer virus. In fact, the term has been debated for the 20 years since these viruses were first described. [1] For purposes of this article, let's just define a virus as "a self-replicating pathogen." A pathogen in this context is any agent designed to generate harm to a computing system or network. Self-replication means that the virus can copy itself (either locally or across a network) without user input. This broad definition of computer virus includes what some call computer worms.

Current antivirus solutions are inadequate. [2, 3, 4] Although software to protect against computer viruses is in widespread use, [5, 6] each year viruses cause $10–20 billion in damage worldwide. [7] The average business currently spends $81,000 to clean up after each virus outbreak. [8] The Code Red virus itself is estimated to have cost $2.5 billion worldwide. [9] In fact, Code Red still exists in the wild nearly two years after it was first released, and some researchers have shown that the Internet might remain infected with it indefinitely. [10]

The growing threat from wireless devices could amplify the danger. Viruses that infect wireless devices already exist, [11] and hundreds of millions of smartphones will soon be potential victims. Microsoft Corporation, for example, reported that it's "only a matter of time" before its Windows Mobile Smartphone platform is attacked by viruses. [12]

The explosion of hundreds of millions of "smart" handheld devices such as personal digital assistants (PDAs) and smartphones poses a double risk. On the one hand, these mobile devices generally lack antivirus software and have little or no security architecture. On the other hand, they often incorporate multiple communication protocols and methods for data transfer, which can increase the number of virus vectors. For example, a typical handheld device might allow data transfer via WiFi, GSM/GPRS, memory cards, infrared beaming, desktop synchronization, Bluetooth, and firmware upgrades. Each one of these data-transfer mechanisms can increase the opportunity for viruses to spread. [13]

Because current antivirus solutions are inadequate, there is a pressing demand for new techniques. [14] One controversial area of research involves using "good" viruses to counter pathogens. The concept of a "good" virus is not new. [15] Although some researchers have argued against using beneficial viruses, [16] others have extensively countered these arguments. [17, 18, 19]

In fact, such "antivirus viruses" already exist. For example, the Cheese worm [20] automatically seeks out and patches hosts (vulnerable computers) that have been exploited by the Lion worm. [21] Another example is the Nachi worm, [22] which searches for machines infected with the Blaster worm, and then repairs and patches the infected hosts. Similarly, CRClean [23] is a passively spreading worm designed to counter the Code Red worm. Instead of actively spreading, CRClean listens for incoming Code Red probes from infected hosts. CRClean then attaches itself to the infected host generating the incoming Code Red request and patches the infected host.

Existing antivirus viruses such as the Nachi worm [24] have been limited by their poor quality. [25] However, this problem can be addressed by introducing quality-control mechanisms. [26] One area of research involves self-replicating "vaccines" that use attenuated (weakened) strains of live pathogens in order to boost global immunity on the Internet. This type of computer vaccine is modeled after its biological vaccine counterpart. Such a vaccine would benefit from being open source for transparency. In addition, it should be developed under the umbrella of an international monitoring body, analogous to the World Health Organization for biological viruses. [27]

Designing a Virus Vaccine

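This article demonstrates methods to design and test a live, attenuated computer virus vaccine using real-world simulation. The design goals of our vaccine, which the Conclusions revisit, are that it should confer immunity, conserve resources and reduce morbidity, be easily distributed and accessible to the entire population, and be cost-effective.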

The major objection to releasing a live, replicating vaccine on the Internet—even under the aegis of a world governance body—is that the vaccine alters target host machines without permission of the owner. Thus, the vaccine can (and will) cause damage to a certain number of critical systems. However, this objection is overcome by comparison to biological vaccines. For example, with the measles vaccine, a small percentage of children who receive it are injured or die due to side effects. Despite these risks, however, parents line up each year to have their children receive this attenuated form of the deadly virus. In fact, the United States government mandates that all children must receive the measles vaccine. Thus, although mandatory biological vaccination results in a small percentage of death and disease, it has resulted in an overall net benefit of saving millions of lives.

Given the billions of dollars of damage caused by computer viruses each year, a computer virus vaccine could be designed with the same benefit-to-risk ratio as that of biological vaccines. As stated above, the Code Red virus caused approximately $2.5 billion in damage. However, Code Red actually came in three discrete variants, which were released over time. All three versions exploited the same vulnerability discovered in Microsoft's IIS web server software on June 18, 2001. [28] However, the first version of Code Red was not seen in the wild until more than three weeks later, on July 12, 2001. This version infected at least 359,104 machines within 14 hours. Following the first outbreak, the second, more virulent version of Code Red didn't appear for another week. Finally, it took another two weeks for the third and final form to appear in the wild. [29]

Thus, there was a considerable amount of time for a vaccine to be developed and deployed between these individual versions of Code Red. For example, suppose a vaccine was developed and released in the interim between the first and second versions of Code Red; at that stage, a vaccine might have improved global immunity and reduced the remainder of the $2.5 billion in damage.

This article examines the theory that a live, attenuated vaccine released in the interim between the variants of Code Red would improve outcomes. It also tests the case of releasing the vaccine in the three-week interim after the vulnerability was first discovered, but before the first virus was released in the wild.

Several models have been used to simulate the spread of viruses. [30, 31] However, optimal testing of a live vaccine requires a simulator that most closely models the real-world behavior of viruses. NWS, a "network worm simulation system," [32] is a framework of objects and methods written in the Perl programming language that allows programmers to design and test live, self-replicating pathogens. NWS provides the advantage of modeling viral behavior by executing actual virus code. Thus, the simulation allows viruses to perform arbitrary actions, which may model real-world behavior more accurately than rigid mathematical simulation. NWS is open source and is available under the GNU General Public License (GPL).

In the following simulations, NWS is initialized with a sample address space of 65,535 addresses (an arbitrary choice), which represents the entire network. This simulated network is then populated with 10,000 vulnerable hosts. These hosts represent the number of vulnerable systems that are initially present in the address space. This simulation is analogous to an Internet populated with Microsoft IIS web servers that are vulnerable to the Code Red virus.

NWS measures the passage of time in the simulated network based on discrete time steps. At each time step, an object may perform an action such as a virus probing or infecting a host. In the current simulations, the address space is populated with an initial number of seven (7) Code Red and seven (7) vaccine instances, respectively. In each run, the results are recorded for a total of 150 discrete time steps. Each simulation is run through 20 iterations on a Pentium IV machine running Linux, and the results of the 20 iterations are averaged and plotted.
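The setup described above can be mimicked in a few lines of Python. The following is only a minimal sketch of a discrete-time, random-probing simulation of Code Red alone; it is not NWS and does not execute virus code. The function names (simulate_worm, average_runs), the random placement of hosts, and the rule that new infections start probing on the next step are all assumptions made for illustration.

```python
import random

def simulate_worm(address_space=65535, n_vulnerable=10_000, n_seeds=7,
                  probes_per_step=2, steps=150, seed=0):
    """Return the number of infected hosts after each discrete time step."""
    rng = random.Random(seed)
    # Place the vulnerable hosts at random addresses (an assumption).
    vulnerable = set(rng.sample(range(address_space), n_vulnerable))
    # Seed the outbreak on a few of those hosts.
    infected = set(rng.sample(sorted(vulnerable), n_seeds))
    vulnerable -= infected
    history = []
    for _ in range(steps):
        newly_infected = set()
        # Every infected host sends its probes to random addresses.
        for _ in range(len(infected) * probes_per_step):
            target = rng.randrange(address_space)
            if target in vulnerable:
                newly_infected.add(target)
        # Apply infections at the end of the step, so new instances
        # begin probing on the following step.
        vulnerable -= newly_infected
        infected |= newly_infected
        history.append(len(infected))
    return history

def average_runs(n_runs=20, **kwargs):
    """Average the per-step infected counts over several seeded runs."""
    runs = [simulate_worm(seed=s, **kwargs) for s in range(n_runs)]
    return [sum(column) / n_runs for column in zip(*runs)]
```

Calling average_runs() with its defaults corresponds to the 20 iterations of 150 time steps described above; smaller values of address_space and n_vulnerable make quick experiments faster.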

One of the most devastating impacts of modern viruses is their negative effect on overall network performance. For example, the rapid spread of the Slammer infection in 2003 knocked most of South Korea's Internet offline for several hours. [33] Thus, an important goal is to design the vaccine to have minimal impact on network bandwidth. In NWS, network messages are counted at each time step. For the purpose of this simulation, each instance of Code Red sends out two probes to random addresses during each time step. Vaccines are attenuated to send out fewer probes per time step than Code Red. The number of messages passed per time step in the network is recorded as a measure of bandwidth consumption.
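The bandwidth measure can be written down explicitly. The notation below is introduced here for illustration and does not come from NWS; the expected-infection estimate assumes uniform random probing and ignores both duplicate hits and competition from the vaccine, so it is a rough guide rather than a result.

```latex
% Assumed notation (not from the article):
%   N    size of the address space (65,535 in these simulations)
%   S_t  vulnerable hosts at time step t
%   I_t  Code Red instances, each sending p_c probes per time step
%   V_t  vaccine instances, each sending p_v probes per time step

% Messages counted at time step t:
M_t = p_c I_t + p_v V_t

% Seven instances of each, vaccine sending as many probes as Code Red (p_c = p_v = 2):
M_0 = 2 \cdot 7 + 2 \cdot 7 = 28

% Seven instances of each, vaccine sending half as many probes (p_v = 1):
M_0 = 2 \cdot 7 + 1 \cdot 7 = 21

% Each probe finds a vulnerable host with probability S_t / N, so roughly
\mathbb{E}[I_{t+1} - I_t] \approx p_c \, I_t \, \frac{S_t}{N},
\qquad \frac{S_0}{N} \approx \frac{10000}{65535} \approx 0.15
```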

In biology, many vaccines use live, attenuated strains of the virus to fight the virus itself. Thus, the current vaccine is based on Code Red itself. However, it has been modified in two ways. After attaching to a vulnerable host, the vaccine "patches" the vulnerability, thus marking the host as resistant to further infection. In addition, in this study the vaccine is attenuated. Thus, the vaccine is not as virulent as Code Red—it doesn't send as many probes per time step. For example, if a variant of Code Red sends out two random probes per time step, a vaccine can be attenuated to send only one probe per time step (a 50% attenuation in virulence). As in biology, the advantage of weakening the virus vaccine is that it should then spread with less damage (for example, with less consumption of network resources) than the more virulent strains of the real virus.
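Attenuation as described here amounts to scaling the number of probes per time step. The sketch below shows one way to turn an attenuation percentage into a probe count. The article does not say how a fractional rate (for example, 1.5 probes per step) is realized, so the randomized rounding used here is an assumption, and both function names are hypothetical.

```python
import random

def expected_probes(base_probes, attenuation):
    """Mean probes per time step for a vaccine attenuated by `attenuation` (0.0 to 1.0)."""
    if not 0.0 <= attenuation <= 1.0:
        raise ValueError("attenuation must be between 0 and 1")
    return base_probes * (1.0 - attenuation)

def probes_this_step(base_probes, attenuation, rng):
    """Integer probe count for one step; its long-run mean equals expected_probes()."""
    mean = expected_probes(base_probes, attenuation)
    whole = int(mean)
    # Assumption: send one extra probe with probability equal to the
    # fractional part, so 1.5 probes means 1 or 2 with equal chance.
    extra = 1 if rng.random() < (mean - whole) else 0
    return whole + extra

rng = random.Random(0)
print(expected_probes(2, 0.5))         # 1.0: the 50% attenuation example from the text
print(probes_this_step(2, 0.25, rng))  # 1 or 2, averaging 1.5 over many steps
```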

Results of Our Virus Simulation

Figure 1 shows our simulated network with 10,000 vulnerable hosts. At initial time T1, Code Red is released and continues to infect vulnerable hosts until saturation. At some arbitrary point after the virus has reached steady state (with all 10,000 hosts infected), the vaccine is released at time T2. After release at time T2, the vaccine begins to compete successfully with Code Red for vulnerable hosts.

In the simulation depicted in Figure 1, the vaccine has not been attenuated. Both Code Red and the vaccine have been set to send two random probes per time step. Thus, both Code Red and the vaccine are at equal strength. However, because the vaccine patches systems that it infects, the vaccinated hosts are no longer vulnerable to Code Red. Thus, the number of Code Red infections eventually drops to zero, leaving only immunized (repaired) hosts that are no longer vulnerable to future infection.

Figure 1

Code Red is released at time T1, and continues to infect vulnerable hosts until saturation. At time T2, the vaccine is released and competes with Code Red for vulnerable hosts. Because the vaccine patches systems that it infects, the vaccinated hosts are no longer vulnerable to Code Red. The number of Code Red infections eventually drops to zero.
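The scenario in Figure 1 can be approximated with a small three-state simulation (vulnerable, infected, vaccinated). This is a sketch under stated assumptions rather than the NWS code that produced the figure: the function name simulate, the seeding of the vaccine on randomly chosen hosts, the vaccine's ability to repair already infected hosts, and the rule that the vaccine wins when both agents reach the same host in one step are all choices made here for illustration.

```python
import random

def _probe(rng, n_probes, address_space):
    """Return the set of addresses hit by n_probes uniformly random probes."""
    return {rng.randrange(address_space) for _ in range(n_probes)}

def simulate(release_step, address_space=65535, n_hosts=10_000, n_seeds=7,
             worm_probes=2, vaccine_probes=2, steps=150, seed=0):
    """Return a list of (infected, vaccinated) counts, one pair per time step."""
    rng = random.Random(seed)
    hosts = rng.sample(range(address_space), n_hosts)
    infected = set(hosts[:n_seeds])      # Code Red is released at step 0 (T1)
    vulnerable = set(hosts[n_seeds:])
    vaccinated = set()
    history = []
    for step in range(steps):
        if step == release_step:         # vaccine release (T2)
            pool = sorted(vulnerable | infected)
            seeds = rng.sample(pool, min(n_seeds, len(pool)))
            vaccinated.update(seeds)
            vulnerable.difference_update(seeds)
            infected.difference_update(seeds)
        # All probes in a step are evaluated against the state at its start.
        worm_hits = _probe(rng, len(infected) * worm_probes, address_space) & vulnerable
        vaccine_hits = (_probe(rng, len(vaccinated) * vaccine_probes, address_space)
                        & (vulnerable | infected))
        worm_hits -= vaccine_hits        # assumption: the vaccine wins ties
        vulnerable -= worm_hits | vaccine_hits
        infected -= vaccine_hits         # repaired hosts stop probing
        infected |= worm_hits
        vaccinated |= vaccine_hits       # patched hosts carry the vaccine onward
        history.append((len(infected), len(vaccinated)))
    return history
```

For example, simulate(release_step=60) releases the vaccine well after Code Red has had time to spread, and passing vaccine_probes=1 gives a vaccine that sends half as many probes as Code Red.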

Figure 2 shows an alternate method for vaccination. In this case, the vaccine is automatically released when a new strain of Code Red appears in the wild and begins spreading. In practice, this automatically "triggered" vaccine can be achieved through honeynets set to monitor the Internet for new global infections. A honeynet is a laboratory network of computers set up to attract and study Internet viruses and hackers in the wild. [34] For the simulation depicted in Figure 2, the vaccine is not attenuated; it's full strength and spreads as rapidly as Code Red itself. In fact, the vaccine is identical to Code Red in every respect, except that the vaccine patches vulnerable hosts that it infects, thus repairing the hosts and leaving them immune to future infection.

As shown in Figure 2, the vaccine begins to spread immediately when triggered by the appearance of a new Code Red infection at time T1. Because the vaccine is full strength and because it confers immunity, it rapidly outpaces Code Red. Thus, the epidemic is contained and is quickly eradicated. Once a steady state is reached, only protected, immunized hosts remain.

Figure 2

A non-attenuated vaccine is automatically triggered by an outbreak of Code Red. The full-strength vaccine quickly controls and eradicates Code Red.
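The article does not specify how a honeynet would decide that a new outbreak is under way. As a purely hypothetical illustration, the trigger could be as simple as counting probes that arrive at monitored, otherwise unused addresses and firing once a threshold is reached. The class name, its parameters, and the thresholding rule below are all assumptions.

```python
class HoneynetTrigger:
    """Fire once enough probes have reached monitored addresses (hypothetical sketch)."""

    def __init__(self, monitored_addresses, threshold=3):
        self.monitored = frozenset(monitored_addresses)
        self.threshold = threshold
        self.hits = 0
        self.fired_at = None  # time step at which the trigger fired, if any

    def observe(self, step, target_address):
        """Record one probe; return True only on the probe that reaches the threshold."""
        if self.fired_at is not None or target_address not in self.monitored:
            return False
        self.hits += 1
        if self.hits >= self.threshold:
            self.fired_at = step
            return True
        return False
```

A threshold of 1 approximates the immediate release at time T1 used in these simulations; a higher threshold trades response time for fewer false alarms.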

Now suppose we attenuate the vaccine. An attenuated vaccine should be less damaging to network resources. In other words, it should consume less overall bandwidth than a full-strength vaccine. For example, a vaccine that spreads more slowly than Code Red might use less total bandwidth, but it would still confer immunity. Figure 3 shows the results of a vaccine that has been attenuated by 25%. As in Figure 2, this vaccine is automatically triggered at initial time T1 when a new infection of Code Red is detected in the wild.

Figure 3

The spread of Code Red triggers a vaccine. However, this vaccine is attenuated to spread 25% more slowly than Code Red. Although the vaccine ultimately controls and eradicates the outbreak, it does so more slowly.

In the simulation depicted in Figure 3, the immune response is not as rapid because the vaccine's replication speed is attenuated by 25%, so it spreads more slowly. However, the triggered, weakened vaccine still ameliorates the Code Red epidemic, albeit more slowly than the full-strength vaccine depicted in Figure 2. As seen in Figure 3, the Code Red outbreak is partially blunted by the vaccine. Meanwhile, the vaccine spreads more slowly, consuming fewer network resources yet still ultimately defeating and eradicating Code Red.

The simulation depicted in Figure 4 shows a triggered vaccine that has been further weakened. The replication speed of this vaccine has been attenuated by 50% relative to the speed of the Code Red virus. In this case, the greatly attenuated vaccine causes even less blunting of the Code Red outbreak.

Figure 4

A highly weakened vaccine (50% attenuation) responds even more slowly to an outbreak of Code Red. However, attenuating the vaccine (making it spread more slowly) reduces network bandwidth consumption.

The previous simulations demonstrated the effect of vaccines on ameliorating a Code Red epidemic. As shown in these simulations, attenuating a vaccine results in delayed efficacy against a Code Red outbreak. However, does attenuating a vaccine confer the advantage of reduced overall network bandwidth consumption? Figure 5 demonstrates the effect of attenuation on total bandwidth. As the vaccine becomes more attenuated, both the rate of new message generation and the magnitude of total messages per time step are reduced. For example, a vaccine with 25% attenuation generates fewer messages per time step than a vaccine with 0% attenuation. Similarly, a vaccine with 50% attenuation generates fewer messages per time step than a vaccine with 25% attenuation. Thus, greater vaccine attenuation leads to reduced total bandwidth consumption in the network.

Figure 5

Bandwidth consumption in the network is represented by total messages generated by time step. As the vaccine becomes more attenuated, both the rate of new message generation and the magnitude of total messages per time step are reduced. Thus, the more attenuated vaccines have a more gentle impact on the network.
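The relationship between attenuation and message volume can also be explored with a deterministic approximation. The sketch below is not the NWS simulation behind Figure 5; it swaps in a simple mean-field model that tracks expected host counts instead of executing individual probes. It assumes that Code Red and a triggered vaccine both start with seven instances, that the vaccine repairs infected hosts as well as patching vulnerable ones, and that every instance keeps probing indefinitely. The function name is hypothetical.

```python
def mean_field_messages(vaccine_attenuation, address_space=65535, n_hosts=10_000,
                        n_seeds=7, worm_probes=2, steps=150):
    """Return the expected number of messages at each time step (deterministic model)."""
    vaccine_probes = worm_probes * (1.0 - vaccine_attenuation)
    infected = float(n_seeds)
    vaccinated = float(n_seeds)
    vulnerable = float(n_hosts - 2 * n_seeds)
    messages = []
    for _ in range(steps):
        # Bandwidth measure: every active instance sends its probes.
        messages.append(worm_probes * infected + vaccine_probes * vaccinated)
        # A random probe lands on a given class of host with
        # probability (hosts in that class) / address_space.
        worm_hits = worm_probes * infected * vulnerable / address_space
        vax_hits_vulnerable = vaccine_probes * vaccinated * vulnerable / address_space
        vax_hits_infected = vaccine_probes * vaccinated * infected / address_space
        # Never convert more hosts than actually exist in a class.
        taken = worm_hits + vax_hits_vulnerable
        if taken > vulnerable:
            scale = vulnerable / taken
            worm_hits *= scale
            vax_hits_vulnerable *= scale
        vax_hits_infected = min(vax_hits_infected, infected)
        vulnerable = max(0.0, vulnerable - worm_hits - vax_hits_vulnerable)
        infected += worm_hits - vax_hits_infected
        vaccinated += vax_hits_vulnerable + vax_hits_infected
    return messages

for attenuation in (0.0, 0.25, 0.5):
    curve = mean_field_messages(attenuation)
    print(attenuation, round(curve[0], 1), round(curve[-1]))
```

In this approximation, once every host is vaccinated the message rate settles toward the vaccine's probes per step multiplied by the number of hosts, so a more attenuated vaccine ends at a lower level.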

A more controversial use of the vaccine is to release it in the interim after the vulnerability is found, but before the first virus appears in the wild. This strategy is more controversial because it involves releasing a self-replicating vaccine on the Internet (and thus damaging some small percentage of systems) before any immediate threat from a virus has appeared. Based on historical trends, we know that after a new vulnerability becomes known, it's only a matter of time before a virus writer creates and releases a virus to exploit that vulnerability.

In the case of Code Red, such a "prophylactic" vaccine could have been released in the three-week interim between the discovery of the vulnerability and the release of the first Code Red virus to exploit it. As in biology, the goal is to vaccinate as much of the population as possible before an outbreak occurs. In the first simulation (Figure 1), reversing the order to release the vaccine before the virus ever appears would make all hosts immune to the virus. Thus, the outbreak would never occur, which is the optimal goal.

However, what effect does releasing the vaccine before an outbreak have on total network bandwidth consumption? In the simulation depicted in Figure 6, the bandwidth used by the prophylactic vaccine is compared to previous simulations. In this example, the prophylactic vaccine is attenuated by 50%. The prophylactic bandwidth curve is seen in the lower right corner of the graph. In other words, the prophylactic vaccine has a more gentle impact on the network than any vaccine released after a virus outbreak.

Figure 6

An attenuated, "prophylactic" vaccine is released before a virus outbreak occurs. The prophylactic vaccine uses the least total bandwidth, and thus has a more gentle impact on the network.

Conclusions

This article demonstrated methods to design and test a live, attenuated computer virus vaccine using computer simulation. For the purpose of the simulations, the vaccine meets our design goals. For example, the vaccine confers immunity by patching systems that it infects. In addition, attenuating the vaccine conserves resources and reduces morbidity by incurring a slower, more controlled use of network bandwidth. Because it's self-replicating, the vaccine is easily distributed and accessible to the entire population. Automatically controlling and eradicating even a small part of a $2.5 billion outbreak of Code Red would also make such a vaccine cost-effective.

The simulations in this article examine the effect of a self-replicating vaccine that repairs the flaw exploited by its virus counterpart. In each simulation, the vaccine is effective in eradicating the virus. Attenuating (weakening) the vaccine results in a more blunted response to the virus outbreak. However, attenuated vaccines eventually contain the virus outbreak, while resulting in less overall network bandwidth consumption.

The optimal case is to release a "prophylactic" vaccine to immunize the network before a virus ever appears. In the case of Code Red, this vaccine could have been released in the three-week interim after the web server vulnerability was discovered, but before the first virus to exploit the vulnerability was released. In the simulation, such a vaccine prevents an outbreak by automatically seeking out and repairing all vulnerable hosts in the network, so the Code Red infection never has a chance to start in the first place. Moreover, the simulation shows that an attenuated, prophylactic vaccine is the least damaging in terms of network bandwidth consumption.

There are several drawbacks to these simulations. For example, the simulations occur in discrete time steps, rather than in continuous time like a real-world virus. In addition, the simulation distributes hosts randomly in the address space, in contrast to the Internet, where hosts are often grouped in blocks of adjacent address space, with other blocks of empty address space filling the interstices. Finally, these simulations don't take into account the effect of real-world devices such as firewalls and routers on bandwidth and message flow.

However, this method of simulation has certain advantages. For example, NWS allows for the execution of actual virus code, which can potentially provide a more realistic simulation than other methods such as numerical simulation. Also, NWS has an object-oriented design, in which program objects correspond directly to real-world objects such as software, messages, and hosts. In addition, this object-oriented design allows vaccines to be easily customized for each simulation.

For researchers who are interested in controversial areas of research such as antivirus viruses, a caution is in order. Despite the fact that current antivirus technology is no longer effective, some industry leaders are vehemently opposed to novel research that includes writing self-replicating code. An example of this opposition has occurred at the University of Calgary Department of Computer Science. Despite having the full approval and support of his department and university, one professor came under intense public criticism—and was even threatened with physical violence—for attempting to teach virus writing as part of a university course on viruses. [35]

This trend of initial opposition follows historical examples from biology. For example, the original smallpox vaccine killed a staggering 1% of patients who received it, and injured many more. Even scientists such as Benjamin Franklin were frightened of it. Nevertheless, this early, poor-quality smallpox vaccine was still highly successful in preventing pandemics. Over time, the vaccine was refined and improved, and is now relatively safe.

Similarly, the computer virus vaccine is now at a very early stage, and is likely to be met with initial opposition. Designing a live, attenuated vaccine safe enough to be released on the Internet will require international cooperation among computer scientists, biologists, engineers, epidemiologists, ethicists, and government agencies. The approach modeled here is not yet meant for large-scale implementation. Rather, this article provides a starting point for laboratory experimentation, with the hope that this research will stimulate further research in computer virus vaccine simulation.

Acknowledgements

Seth Fogie, Dr. Anton Chuvakin, and Dr. Daniel Pak each provided a complete technical review of this article. Ivan Acre provided valuable references and encouragement. The author is extremely grateful to Bruce Ediger for designing NWS, a world-class network worm simulator, and for making it available to the public for free.

References

[1] C. Peikari and S. Fogie. InformIT Security Reference Guide. "Malware" (specifically, see the section "Viruses vs. Worms"). July 19, 2004.

[2] Aberdeen Group. "Is Antivirus Software Becoming Irrelevant?" January 6, 2004. (Free registration required.)

[3] E. Joyce. "Is Anti-Virus Software Obsolete?" July 19, 2002.

[4] P. Heikkila. "Anti-virus vendors failing users, claims CA [Computer Associates] chief." January 14, 2002.

[5] R. Greenspan. "Computers Still Insecure." September 27, 2002.

[6] ICSA Labs 2002 Computer Virus Prevalence Survey (PDF). (Free registration required.)

[7] R. Lemos. "Security: What's going on?" ZDNet, January 22, 2002.

[8] BBC News. "Viruses bite businesses hard." BBC.com. April 29, 2003.

[9] J. Lyman. "In Search of the World's Costliest Computer Virus." February 21, 2002.

[10] A. Chuvakin. "Where Worms Go To Die?" December 9, 2003.

[11] C. Peikari and S. Fogie. Maximum Wireless Security. Sams. 2002.

[12] D. Muriel. "Threat of mobile virus attack real." CNN.com. October 17, 2003.

[13] C. Peikari. "Airborne Viruses." Presented at the Computer Security Institute 30th Annual International Security Conference. November 5, 2003.

[14] C. Wang, J. Davidson, J. Hill, J. Knight. "Protection of Software-based Survivability Mechanisms." International Conference of Dependable Systems and Networks, Goteborg, Sweden. July, 2001.

[15] "A Computer Virus." Fred Cohen & Associates, 1984.

[16] V. Bontchev. "Are 'Good' Computer Viruses Still a Bad Idea?" 1994.

[17] F. Cohen. It's Alive! The New Breed of Living Computer Programs. John Wiley & Sons. 1994.

[18] G. Moorer. "The case for beneficial computer viruses and worms" (PDF). 23rd NISSC Proceedings. October 2000.

[19] C. Peikari. "Good Viruses Crucial to Self-Healing Internet." Secure Computing Magazine. March 2002.

[20] "The 'cheese' Worm." CERT Incident Note IN-2001-05. May 17, 2001.

[21] M. Vision. "Lion Internet Worm Analysis." (Undated.)

[22] B. Krebs. "'Good' Worm Fixes Infected Computers." washingtonpost.com. August 18, 2003.

[23] M. Kern. CRClean. September 1, 2001.

[24] K. Poulsen. "Nachi worm infected Diebold ATMs." November 24, 2003.

[25] C. Webb. "Worm vs. Worm." washingtonpost.com. August 19, 2003.

[26] S. Costello. "A Virus to fight viruses?" IDG News Service/Boston Bureau. July 16, 2001.

[27] C. Peikari. "An Open Source, International, Attenuated, Computer Virus Vaccine." Presented at the Defcon 9th Annual Security Conference. July 14, 2001.

[28] CAIDA. "CAIDA Analysis of Code-Red." April 8, 2003.

[29] D. Moore, C. Shannon, and J. Brown. "Code-Red: a case study on the spread and victims of an Internet worm" (PDF). Presented at the Internet Measurement Workshop (IMW) in 2002.

[30] C. C. Zou, W. Gong, D. Towsley. "Code Red Worm Propagation Modeling and Analysis" (PDF). Ninth ACM Conference on Computer and Communication Security (CCS'02). November 18–22, 2002.

[31] M. Liljenstam. "SSF.App.Worm: A Network Worm Modeling Package for SSFNet." April 28, 2003.

[32] B. Ediger. "Simulating Network Worms." (Undated.)

[33] S. Cowley and M. Williams. "Slammer Worm Slaps Net Down, But Not Out." IDG News Service. January 26, 2003.

[34] C. Peikari and A. Chuvakin. Security Warrior. O'Reilly, 2004.

[35] C. Peikari. "Best Medicine to Treat Viruses?" Secure Computing Magazine. July 2003.
