Home > Articles > Security > General Security and Privacy

  • Print
  • + Share This
This chapter is from the book

Custom Fuzzer Case Study: Shockwave Flash

A good representation of the currently available fuzzing frameworks was discussed in the previous section. For a more complete listing, refer to this book's companion Web site at http://www.fuzzing.org. We encourage you to download the individual framework packages and explore the complete documentation and various examples. No single framework stands out as the best of breed, able to handle any job thrown at it better than the others. Instead, one framework might be more applicable depending on the specific target, goals, timeline, budget, and other factors. Familiarity with a few frameworks will provide a testing team with the flexibility to apply the best tool for any given job.

Although most protocols and file formats can be easily described in at least one of the publicly available fuzzing frameworks, there will be instances where none of them are applicable. When dealing with specialized software or protocol auditing, the decision to build a new, target-specific fuzzer can yield greater results and save time over the life of the audit. In this section we discuss Adobe's Macromedia Shockwave Flash18 (SWF) file format as a real-world fuzzing target and provide portions of a custom-built testing solution. Security vulnerabilities in Shockwave Flash are typically high impact as some version of Flash Player is installed on almost every desktop on the planet. In fact, modern Microsoft Windows operating systems ship with a functional version of Flash Player by default. Dissecting every detail of a full-blown SWF audit is well beyond the scope of this chapter, so only the most relevant and interesting portions are discussed. For further information, up-to-date results, and code listings,19 visit the official book Web site at http://www.fuzzing.org.

A couple of factors make SWF a great candidate for a custom fuzzing solution. First, the structure of the SWF file format is completely documented in an Adobe developer reference titled "Macromedia Flash (SWF) and Flash Video (FLV) File Format Specification."20 The version of the document referenced for this writing was version 8. Unless you are auditing an open source or in-house developed file format, such thorough documentation is not usually available and might limit your testing to more generic approaches. With this specific case, we will most definitely leverage every detail provided by Adobe and Macromedia. Examining the specification reveals that the SWF file format is largely parsed as a bit stream versus the typical byte-level granularity by which most files and protocols are parsed. As SWFs are typically delivered over the Web, streamlined size is likely the motivation behind the choice to parse at the bit level. This adds a level of complexity for our fuzzer and provides further reason to create a custom solution, as none of the fuzzing frameworks we explored in this chapter support bit types.

Modeling SWF Files

The first milestone that must be met to successfully fuzz the SWF file format is for our fuzzer to be able to both mutate and generate test cases. SWF consists of a header followed by a series of tags that resemble the following high-level structure:

 [Header]
   <magic>
   <version>
   <length>
   <rect>
     [nbits]
     [xmin]
     [xmax]
     [ymin]
     [ymax]
   <framerate>
   <framecount>

 [FileAttributes Tag]
 [Tag]
   <header>
   <data>
     <datatypes>
     <structs>
     ...
 [Tag]
   <header>
   <data>
     <datatypes>
     <structs>
     ...
...
[ShowFrame Tag]
[End Tag]

This is how the various fields break down:

  • magic. Consists of three bytes and for any valid SWF file must be "FWS."
  • version. Consists of a single byte representing the numerical version of Flash that generated the file.
  • length. A four-byte value indicating the size of the entire SWF file.
  • rect. This field is divided into five components:
    • nbits. This is five bits wide and indicates the length, in bits, of each of the next four fields.
    • xmin,xmax,ymin, and ymax. These fields represent the screen coordinates to which to render the Flash content. Note that these fields do not represent pixels but rather twips, a Flash unit of measurement that represents 1/20th of a "logical pixel."

The rect field is where we can start to see the complexity of this file format. If the value of the nbits field is 3, then the total length of the rect portion of the header is 5 + 3 + 3 + 3 + 3 = 17 bits. If the value of the nbits field is 4 then the total length of the rect portion of the header is 5 + 4 + 4 + 4 + 4 = 21 bits. Representing this structure with one of the available fuzzing frameworks is not a simple task. The final two fields in the header represent the speed at which to play back frames and the number of total frames in the SWF file.

Following the header is a series of tags that define the content and behavior of the Flash file. The first tag, FileAttributes, was introduced in Flash version 8 and is a mandatory tag that serves two purposes. First, the tag specifies whether or not the SWF file contains various properties such as a title and description. This information is not used internally but is instead made available for reference by external processes such as search engines. Second, the tag specifies whether the Flash Player should grant the SWF file either local or network file access. Each SWF file has a variable number of remaining tags that fall into one of two categories: definition and control tags. Definition tags define content such as shapes, buttons, and sound. Control tags define concepts such as placement and motion. The Flash Player processes the tags until it hits a ShowFrame tag, at which point content is rendered to screen. The SWF file is terminated with the mandatory End tag.

There are a number of available tags to choose from, each consisting of a two-byte tag header followed by data. Depending on the length of the data, the tag can be defined as long or short. If the length of the data is less than 63 bits, then the tag is considered short and is defined as: [ten-bit tag id] [six-bit data length] [data]. If the data block is larger, then the tag is considered long and is defined with an additional four-byte field in the header that specifies the length of the data: [ten-bit tag id] [111111] [four-byte data length] [data]. Again, we can see that the complexity of the SWF format does not lend itself well to being modeled within any of the aforementioned fuzzing frameworks.

Things only get worse from here. Each tag has its own set of fields. Some contain raw data, whereas others consist of SWF basic types named structures (struct or structs). Each struct consists of a set of defined fields that can include both raw data and even other structs! It still only gets worse. Groups of fields within both tags and structs may or may not be defined depending on the value of another field within that same tag or struct. Even the explanation can get confusing, so let's start building from the ground up and then show some examples to better explain everything.

To begin, we define a bit field class in Python for representing arbitrary numeric fields of variable bit lengths. We extend this bit field class to define bytes (chars), words (shorts), dwords (longs), and qwords (doubles) as they can be easily defined as 8-bit, 16-bit, 32-bit, and 64-bit instantiations of the bit field class. These basic data structures are accessible under the package Sulley (not to be confused with the Sulley fuzzing framework discussed in the next section):

BIG_ENDIAN    = ">"
LITTLE_ENDIAN = "<"
class bit_field (object):
    def __init__ (self, width, value=0, max_num=None):
        assert(type(value) is int or long)

        self.width   = width
        self.max_num = max_num
        self.value   = value
        self.endian  = LITTLE_ENDIAN
        self.static  = False
        self.s_index = 0

        if self.max_num == None:
            self.max_num = self.to_decimal("1" * width)

    def flatten (self):
        '''
        @rtype:  Raw Bytes
        @return: Raw byte representation
        '''

        # pad the bit stream to the next byte boundary.
        bit_stream = ""

        if self.width % 8 == 0:
            bit_stream += self.to_binary()
        else:
            bit_stream  = "0" * (8 - (self.width % 8))
            bit_stream += self.to_binary()


        flattened = ""

        # convert the bit stream from a string of bits into raw bytes.
        for i in xrange(len(bit_stream) / 8):
            chunk = bit_stream[8*i:8*i+8]
            flattened += struct.pack("B", self.to_decimal(chunk))

        # if necessary, convert the endianess of the raw bytes.
        if self.endian == LITTLE_ENDIAN:
            flattened = list(flattened)
            flattened.reverse()
            flattened = "".join(flattened)

        return flattened

    def to_binary (self, number=None, bit_count=None):
        '''
    @type number:     Integer
    @param number:    (Opt., def=self.value) Number to convert
    @type bit_count:  Integer
    @param bit_count: (Opt., def=self.width) Width of bit string

    @rtype:  String
    @return: Bit string
    '''

    if number == None:
        number = self.value

    if bit_count == None:
        bit_count = self.width

    return "".join(map(lambda x:str((number >> x) & 1),               range(bit_count -1, -1, -1)))

def to_decimal (self, binary):
    '''
    Convert a binary string into a decimal number and return.
    '''

    return int(binary, 2)

def randomize (self):
    '''
    Randomize the value of this bitfield.
    '''

    self.value = random.randint(0, self.max_num)

def smart (self):
    '''
    Step the value of this bitfield through a list of smart values.
    '''

    smart_cases =     [
        0,
        self.max_num,
        self.max_num / 2,
        self.max_num / 4,
        # etc...
    ]
        self.value    = smart_cases[self.s_index]
        self.s_index += 1


class byte (bit_field):
    def __init__ (self, value=0, max_num=None):
        if type(value) not in [int, long]:
            value = struct.unpack(endian + "B", value)[0]

        bit_field.__init__(self, 8, value, max_num)


class word (bit_field):
    def __init__ (self, value=0, max_num=None:
        if type(value) not in [int, long]:
            value = struct.unpack(endian + "H", value)[0]

        bit_field.__init__(self, 16, value, max_num)


class dword (bit_field):
    def __init__ (self, value=0, max_num=None):
        if type(value) not in [int, long]:
            value = struct.unpack(endian + "L", value)[0]

        bit_field.__init__(self, 32, value, max_num)


class qword (bit_field):
    def __init__ (self, value=0, max_num=None):
        if type(value) not in [int, long]:
            value = struct.unpack(endian + "Q", value)[0]

        bit_field.__init__(self, 64, value, max_num)


# class aliases
bits   = bit_field
char   = byte
short  = word
long   = dword
double = qword

The bit_field base class defines the width of the field (width), the maximum number the field is capable of representing (max_num), the value of the field (value), the bit order of the field (endian), a flag specifying whether or not the field's value can be modified (static), and finally an internally used index (s_index). The bit_field class further defines a number of useful functions:

  • The flatten() routine converts and returns a byte-bounded sequence of raw bytes from the field.
  • The to_binary() routine can convert a numeric value into a string of bits.
  • The to_decimal() routine does the opposite by converting a string of bits into a numeric value.
  • The randomize() routine updates the field value to a random value within the fields valid range.
  • The smart() routine updates the field value through a sequence of "smart" boundaries, and only an excerpt of these numbers is shown.

When constructing a complex type from the bit_field building block, these last two routines can be called to mutate the generated data structure. The data structure can then be traversed and written to a file by recursively calling the flatten() routines of each individual component.

Using these basic types, we can now begin to define the more simple SWF structs, such as RECT and RGB. The following code excerpts show the definitions of these classes, which inherit from a base class not shown (refer to http://www.fuzzing.org for the definition of the base class):

class RECT (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.fields =         [
          ("Nbits", sulley.numbers.bits(5, value=31, static=True)),
          ("Xmin" , sulley.numbers.bits(31)),
          ("Xmax" , sulley.numbers.bits(31)),
          ("Ymin" , sulley.numbers.bits(31)),
          ("Ymax" , sulley.numbers.bits(31)),
        ]

class RGB (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.fields =         [

          ("Red"  , sulley.numbers.byte()),
          ("Green", sulley.numbers.byte()),
          ("Blue" , sulley.numbers.byte()),
        ]

This excerpt also shows a simple shortcut that was discovered and agreed on during development. To simplify fuzzing, variable width fields are all defined at their maximum length. To represent structs with dependent fields a new primitive, dependent_bit_field, was derived from the bit_field class as shown here:

class dependent_bit_field (sulley.numbers.bit_field):
    def __init__ (self, width, value=0, max_num=None, static=False,                   parent=None, dep=None, vals=[]):
        self.parent = parent
        self.dep    = dep
        self.vals   = vals

        sulley.numbers.bit_field.__init__(self, width, value,                                           max_num, static)

    def flatten (self):
        # if there is a dependency for flattening (including) this
        # structure, then check it.
        if self.parent:
            #                              VVV - object value
            if self.parent.fields[self.dep][1].value not in self.vals:
                # dependency not met, don't include this object.
                return ""

        return sulley.numbers.bit_field.flatten(self)

This extended class specifies a field index and value to examine prior to generating and returning data. If the appropriate field referenced by dep does not contain a value from the list vals then no data is returned. The MATRIX struct demonstrates the usage of this newly defined primitive:

class MATRIX (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.fields =         [
          ("HasScale"     ,  sulley.numbers.bits(1)),
          ("NScaleBits"   ,  dependent_bits(5, 31, parent=self,                              dep=0, vals=[1])),
          ("ScaleX"        , dependent_bits(31, parent=self,                              dep=0, vals=[1])),
          ("ScaleY"        , dependent_bits(31, parent=self,                              dep=0, vals=[1])),

          ("HasRotate"     , sulley.numbers.bits(1)),

          ("NRotateBits"   , dependent_bits(5, 31, parent=self,                              dep=4, vals=[1])),
          ("skew1"         , dependent_bits(31, parent=self,                              dep=0, vals=[1])),
          ("skew2"         , dependent_bits(31, parent=self,                              dep=0, vals=[1])),

          ("NTranslateBits"  , sulley.numbers.bits(5, value=31)),
          ("TranslateX"      , sulley.numbers.bits(31)),
          ("TranslateY"      , sulley.numbers.bits(31)),
        ]

From this excerpt, the NScaleBits field within the MATRIX struct is defined as a five-bit-wide field with a default value of 31 that is included in the struct only if the value of the field at index 0 (HasScale) is equal to 1. The ScaleX, ScaleY, skew1, and skew2 fields also depend on the HasScale field. In other words, if HasScale is 1, then those fields are valid. Otherwise, those fields should not be defined. Similarly, the NRotateBits field is dependent on the value of the field at index 4 (HasRotate). At the time of writing, more than 200 SWF structs have been accurately defined in this notation.21

With all the necessary primitives and structs, we can now begin to define whole tags. First, a base class is defined from which all tags are derived:

class base (structs.base):
    def __init__ (self, parent=None, dep=None, vals=[]):
        self.tag_id = None
        (structs.base).__init__(self, parent, dep, vals)

    def flatten (self):
        bit_stream = structs.base.flatten(self)

        # pad the bit stream to the next byte boundary.

    if len(bit_stream) % 8 != 0:
        bit_stream  = "0" * (8-(len(bit_stream)%8)) + bit_stream

    raw = ""

    # convert the bit stream from a string of bits into raw bytes.
    for i in xrange(len(bit_stream) / 8):
        chunk = bit_stream[8*i:8*i+8]
        raw += pack("B", self.to_decimal(chunk))

    raw_length = len(raw)

    if raw_length >= 63:
        # long (record header is a word + dword)
        record_header = self.tag_id
        record_header <<= 6
        record_header |= 0x3f
        flattened = pack('H', record_header)

        record_header <<= 32
        record_header |= raw_length
        flattened += pack('Q', record_header)
        flattened += raw 

    else:
        # short (record_header is a word)
        record_header = self.tag_id
        record_header <<= 6
        record_header |= raw_length
        flattened = pack('H', record_header)
        flattened += raw

    return flattened

The flatten() routine for the base tag class automatically calculates the length of the data portion and generates the correct short or long header. At the time of writing more than 50 SWF tags have been accurately defined on this framework. Here are a few examples of simple and medium complexity:

class PlaceObject (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.tag_id = 4
        self.fields =         [
          ("CharacterId"    , sulley.numbers.word(value=0x01)),
          ("Depth"          , sulley.numbers.word()),
          ("Matrix"         , structs.MATRIX()),
          ("ColorTransform" , structs.CXFORM()),
        ]


class DefineBitsLossless (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.tag_id = 20
        self.fields =         [
          ("CharacterId"          , sulley.numbers.word()),
          ("BitmapFormat"         , sulley.numbers.byte()),
          ("BitmapWidth"          , sulley.numbers.word()),
          ("BitmapHeight"         , sulley.numbers.word()),
          ("BitmapColorTableSize" , structs.dependent_byte(                                     parent=self, dep=1, vals=[3])),
          ("ZlibBitmapData"       , structs.COLORMAPDATA(                                     parent=self, dep=1, vals=[3])),
          ("ZlibBitMapData_a"     , structs.BITMAPDATA(                                     parent=self, dep=1, vals=[4, 5])),
        ]


class DefineMorphShape (base):
    def __init__ (self, *args, **kwargs):
        base.__init__(self, *args, **kwargs)

        self.tag_id = 46
        self.fields =         [
          ("CharacterId"      , sulley.numbers.word()),
          ("StartBounds"      , structs.RECT()),
          ("EndBounds"        , structs.RECT()),
          ("Offset"           , sulley.numbers.word()),
          ("MorphFillStyles"  , structs.MORPHFILLSTYLE()),
          ("MorphLineStyles"  , structs.MORPHLINESTYLES()),
          ("StartEdges"       , structs.SHAPE()),
          ("EndEdges"         , structs.SHAPE()),
        ]

A lot of information has been presented, so let's review. To properly model an arbitrary SWF, we began with the most basic primitive bit_field. From there, we derived primitives for bytes, words, dwords, and qwords. Also from bit_field, we derived dependent_bit_field, from which we further derived dependent bytes, words, dwords, and qwords. These types, in conjunction with a new base class, form the basis for SWF structs. A base class for tags is derived from the structs base class that in conjunction with structs and primitives form SWF tags. The relationship depicted in Figure 21.1 can further clarify.

Figure 21.1

Figure 21.1 SWF fuzzer component relationship

Some other building blocks, such as string primitives, are also necessary for complete SWF modeling. Refer to the source for their definitions. With all these classes in place, we construct a global SWF container and begin instantiating and appending tags to it. The resulting data structure can then be easily traversed and modified either manually or automatically with each individual component's randomize() or smart() routines. Finally, the structure can be written to a file by once again traversing the complex data structure and concatenating the results of the flatten() routine. This design allows for special cases to be transparently handled within individual tags or structs.

Generating Valid Data

Not long after conquering our first challenge of successfully representing basic SWF structures, we come across another. The SWF specification states that tags can only depend on previously defined tags. When fuzzing a tag with multiple dependencies, all of the dependencies must first be successfully parsed; otherwise, the target tag is never reached. Guessing valid field values is a painstaking process, but there is a clever alternative. Utilizing the Google SOAP API,22 we can search the Web for SWF files with the keyword filetype:swf. A simple script is written to incrementally search for a filetype:swf, b filetype:swf, and through to z filetype:swf. The script parses the returned results looking for and downloading discovered SWF files. Files are named by the MD5 hash to prevent storage of duplicates.

Approximately 10,000 unique SWF files consuming more than 3GB of space have been spidered. Examination of the version field in the SWF header can reveal a statistical sample of the distribution, which is shown in Table 21.1.

Table 21.1. Spidered Flash SWF Version Distribution

SWF Version

%of Total

Flash 8

< 1%

Flash 7

~ 2%

Flash 6

~ 11%

Flash 5

~ 55%

Flash 4

~ 28%

Flash 1 - Flash 3

~ 3%

These results are interesting. Unfortunately, no conclusions can be drawn as to the distribution of active Flash Players from these statistics. Another script is then written that parses each file, extracting and storing individual tags. This repository of valid tags can now be used to assist in the creation of SWF test cases.

Fuzzing Environment

The PaiMei 23 reverse engineering framework is used to facilitate the instantiation and monitoring of individual test cases. PaiMei is a set of pure Python classes designed to assist in the development of reverse engineering automation tools. Use of the framework is described in more detail in Chapter 23. The actual testing process is similar to that used by the generic PaiMei FileFuzz as follows:

  1. Load an SWF test case within a Flash Player under the PaiMei PyDbg scripted debugger.
  2. Start the player and monitor execution for five seconds. The delay of five seconds is arbitrarily chosen and assumes that if an error is going to occur during the parsing of the SWF file it will occur within this time frame.
  3. If, within that time, an error is detected, store the test case and relevant debug information.
  4. If, within that time, no error is detected, close the Flash Player instance.
  5. Continue to the next test case.

For more efficient testing, we can modify the frame rate in the SWF header from the default value of 0x000C to the maximum value of 0xFFFF. By doing so, we allow for larger portions of the SWF file to be processed during the five seconds allotted to each test case. The default method of viewing SWF files is to load them in a Web browser such as Microsoft Internet Explorer or Mozilla Firefox. Although this approach is acceptable for this specific case, an alternative method that comes in handy, especially for fuzzing ActionScript,24 is to use SAFlashPlayer.exe. This small stand-alone player is distributed with Macromedia Studio.25

Testing Methodologies

The most generic approach to fuzzing SWF files is to use bit flipping. This rudimentary testing technique has been seen in previous chapters at the byte level. Applying this technique at the more granular bit level for SWF is prudent as a single byte might span more than one field. Moving a step beyond bit flipping, a custom-written SWF tag parsing script can be used to change the order of tags within an individual SWF file and swap tags between two different SWFs. Finally, the most thorough approach for SWF fuzzing is to individually stress test every field within every struct and tag. The previously generated tag and struct repository can be referenced to piecemeal a base SWF for mutation.

With all three approaches, the generated files are fed through the fuzzing environment in the same way. In the next section, we explore a new fuzzing framework introduced and released by the authors in conjunction with this book.

  • + Share This
  • 🔖 Save To Your Account