Dynamically Emitting Compiled Regular Expressions

You can emit a compiled regular expression to an assembly. A regular expression compiled into an assembly loads more slowly, but runs faster once it is loaded. With this capability, you can let users select the expressions they run frequently, and then compile and emit those expressions. Let's quickly review Reflection and emitting assemblies.

In the simplest terms, Reflection is a .NET technology that supports dynamic discovery and use of code. It's analogous to Run-Time Type Information (RTTI), but Reflection goes much further: .NET Reflection allows programmers to write code that writes code. This is referred to as emitting. In short, your programs can write programs. This is precisely what CompileToAssembly does: it writes code after you've compiled your application.


For more on reflection, see my upcoming book The Visual Basic .NET Developer's Book (Addison-Wesley, scheduled for publication Fall 2002, ISBN 0-672-3240705).

Listing 2 demonstrates the brief code necessary to emit a regular expression to an assembly at runtime.

Listing 2—Emitting a Regular Expression to an Assembly

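// Requires: using System.Reflection; using System.Text.RegularExpressions;
// A verbatim string (@"...") keeps the backslashes intact; \. matches a literal dot.
const string expression = @"mailto:\w+@\w+\.senate\.gov";
RegexCompilationInfo[] info = new RegexCompilationInfo[]
 { new RegexCompilationInfo(expression, RegexOptions.Compiled,
  "SenateMail", "CompiledExpressions", true)};

// The assembly name determines the file name of the emitted DLL.
AssemblyName assemblyName = new AssemblyName();
assemblyName.Name = "Regex";

Regex.CompileToAssembly(info, assemblyName);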

The code defines a regular expression that could just as easily come from dynamic user input. The Regex.CompileToAssembly method requires an array of RegexCompilationInfo objects. A RegexCompilationInfo object holds everything needed to define a custom Regex class:

  • The first argument (expression) is the expression string.

  • The second argument (RegexOptions.Compiled) is the RegexOptions value to apply.

  • The third argument ("SenateMail") is the name of the Regex derivative class.

  • The fourth argument ("CompiledExpressions") is the namespace to emit.

  • The fifth argument (true) is a Boolean that sets the visibility of the new class; true makes the class public.
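To make the argument list concrete, here is a minimal sketch of a helper that builds the RegexCompilationInfo array from expressions a user has selected. The CompilationInfoBuilder class, its Build method, and the dictionary that maps class names to patterns are hypothetical names introduced for illustration; only RegexCompilationInfo, Regex, and RegexOptions come from the .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET Framework.
static class CompilationInfoBuilder
{
    // Turns class-name/pattern pairs chosen by the user into the array
    // that Regex.CompileToAssembly expects.
    public static RegexCompilationInfo[] Build(
        IDictionary<string, string> patternsByClassName)
    {
        var infos = new List<RegexCompilationInfo>();
        foreach (KeyValuePair<string, string> entry in patternsByClassName)
        {
            // Constructing a Regex validates the pattern up front; a
            // malformed pattern throws ArgumentException here.
            new Regex(entry.Value);

            infos.Add(new RegexCompilationInfo(
                entry.Value,            // first argument: the expression string
                RegexOptions.Compiled,  // second argument: the options
                entry.Key,              // third argument: the class name
                "CompiledExpressions",  // fourth argument: the namespace
                true));                 // fifth argument: make the class public
        }
        return infos.ToArray();
    }
}
```

Validating each pattern before adding it is a design choice in this sketch: it surfaces a bad user-supplied expression early, before any assembly is written.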

Finally, we create an AssemblyName object and pass it, together with the RegexCompilationInfo array, to the Regex.CompileToAssembly method. When the last statement runs, there will be an assembly named Regex.dll on your disk containing a module with the namespace CompiledExpressions and one class, SenateMail. SenateMail is derived from System.Text.RegularExpressions.Regex.
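Once the assembly exists, the emitted class can be used like any other Regex. The following is a minimal sketch of loading it through Reflection, assuming that Listing 2 has already run and that Regex.dll was written to the current working directory (an assumption about your setup). The UseEmittedRegex class name and the sample input string are made up for illustration. Note that Regex.CompileToAssembly is supported only on the .NET Framework; on .NET Core and later versions it throws PlatformNotSupportedException.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Text.RegularExpressions;

class UseEmittedRegex
{
    static void Main()
    {
        // Assumption: Regex.dll sits in the current working directory.
        string path = Path.GetFullPath("Regex.dll");
        Assembly assembly = Assembly.LoadFrom(path);

        // The full type name is the namespace plus the class name from
        // Listing 2; passing true makes GetType throw if it is missing.
        Type type = assembly.GetType("CompiledExpressions.SenateMail", true);

        // The emitted class derives from Regex, so it can be used
        // through the base type.
        Regex senateMail = (Regex)Activator.CreateInstance(type);

        // Hypothetical sample input.
        string input = "mailto:someone@example.senate.gov";
        Console.WriteLine(senateMail.IsMatch(input));
    }
}
```

If your project references Regex.dll at compile time instead, the Reflection calls are unnecessary and you can simply write new CompiledExpressions.SenateMail().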
