The Common Language Runtime: Overview of the Runtime Environment
In This Chapter
Introduction to the Runtime
Starting a Method
At a high level, the CLR is simply an engine that takes in IL instructions, translates them into machine instructions, and executes them. This does not mean that the CLR interprets the instructions; it means only that the CLR forms an environment in which IL code can be executed. For this to work, the execution engine must provide a runtime environment that is both efficient and portable. Efficiency is key; if the code does not run quickly enough, all of the other features of the system become moot.
Portability is important because of the number of processors and devices on which the CLR is slated to run. For a long time, Microsoft and Intel seemed to be close partners. Microsoft more or less picked the Intel line of processors to run the software that the company produced. This allowed Microsoft to build and develop software without worrying about supporting multiple CPU architectures and instruction sets. The company didn't have to worry about shipping a Motorola 68XXX version of the software because that processor was not supported. Limiting the scope of processor support became a problem as Win16 gave way to Win32. (No APIs were called Win16, but this is the name I will give the APIs that existed before Win32.) Building software that took advantage of the features of a 32-bit CPU yet remained somewhat backward compatible with the older Win16 APIs proved to be a major undertaking. With Win64 on the horizon, Microsoft must realize that it cannot continue to "port" all of its software with each new CPU that is released if it wants to stay alive as a company. Microsoft is also trying to penetrate the mobile phone, hand-held, and tablet markets, which are powered by a myriad of different processors and architectures. Too much software is produced at Microsoft for the company to continue producing a separate version for each CPU.
The answer to the problem of base address and data size (Win32 versus Win64) and to the problem of providing general portability to other processors came in the form of the runtime environment, or the Common Language Runtime. Without going into the details of the specific instructions that the CLR supports (this is done in Chapter 5, "Intermediate Language Basics"), this chapter details the architecture of the runtime that goes into making a managed application run.
Introduction to the Runtime
Before .NET, an executable (usually a file with an .exe suffix) was the application. In other words, the application was contained within one file. To make the overall system run more efficiently, the application could elect to use shared code (usually a file with a .dll suffix). If the program elected to use shared code, you could either link against an import library (a file that points function references to the DLL that is associated with the import library), or load the DLL explicitly at runtime (using LoadLibrary, LoadLibraryEx, and GetProcAddress). With .NET, the unit of execution and deployment is the assembly. Execution usually begins with an assembly that has an .exe suffix. The application can use shared code by importing the assembly that contains the shared code with an explicit reference. (You can add the reference via the "Add References" node in Visual Studio .NET or include it via the command-line switch /r.) The application can also explicitly load an assembly with Assembly.Load or Assembly.LoadFrom.
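As a rough illustration of the explicit-load path in the pre-.NET model, the following sketch uses Python's ctypes module, which defers to the platform's loader calls (the LoadLibrary family and GetProcAddress on Windows, dlopen and dlsym on POSIX systems). The choice of Python, the C runtime's abs function as the target, and the helper names load_c_runtime and resolve_abs are all assumptions made for illustration; none of them comes from the text.

```python
import ctypes
import ctypes.util
import sys


def load_c_runtime():
    """Explicitly load the C runtime shared library at run time."""
    if sys.platform == "win32":
        # Assumption: the classic C runtime DLL is available on this system.
        name = "msvcrt"
    else:
        # find_library may return None in minimal environments; CDLL(None)
        # then opens the symbols already loaded into the current process,
        # which include the C library.
        name = ctypes.util.find_library("c")
    return ctypes.CDLL(name)


def resolve_abs(library):
    """Look up the exported function 'abs' by name and declare its signature."""
    func = getattr(library, "abs")  # name-based lookup, as with GetProcAddress
    func.argtypes = [ctypes.c_int]
    func.restype = ctypes.c_int
    return func


c_runtime = load_c_runtime()
c_abs = resolve_abs(c_runtime)
print(c_abs(-5))  # prints 5
```

Nothing in the program names the library or the function until run time, which is the essential difference from the import-library route, where the references are fixed when the program is linked.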
Before going further, you need to learn definitions of some of the terms:
Assembly: The assembly is the primary unit of deployment within the .NET Framework. Within the base class libraries is a class, appropriately named Assembly, that encapsulates a physical assembly. When this book refers to the class or an instance of the class, it will be denoted as Assembly. This class exists in the System.Reflection namespace. An assembly can contain references to other assemblies and modules. Chapter 4, "The Assembly," contains more detailed information about assemblies.
Module: A module is a single file that contains executable content. An assembly can encapsulate one or more modules; a module does not stand alone without an assembly referring to it. As with the assembly, a class in the base class library, called Module, encapsulates most of the features of a module. When this book refers to Module, it is referring to the class in the base class library. This class exists in the System.Reflection namespace.
AppDomain: An application domain has been referred to as a lightweight process. Before .NET, isolation was achieved through separate processes, with assistance from the OS and the supporting hardware. If one process ran amok, it would not bring down the whole system, just that process. Because types are so tightly controlled within the .NET Framework, it is possible to have a mechanism whereby this same level of isolation can occur within a process. This mechanism is called the application domain, or AppDomain. As with modules and assemblies, a class in the base class library, called AppDomain, encapsulates many of the features and functionality of an application domain. This class exists in the System namespace. When this book refers to the class, it will be called AppDomain.
IL or MSIL: IL stands for Intermediate Language, and MSIL stands for Microsoft Intermediate Language. IL is the language in which assemblies are written. It is a set of instructions that represent the code of the application. It is intermediate because it is not turned into native code until needed. When the code that describes a method is required to run, it is compiled into native code by the JIT compiler. Chapter 5 contains information about individual IL instructions.
JIT: JIT stands for Just-In-Time. This term refers to the compiler that is run against IL code on an as-needed basis.
After the code is "loaded," execution of the code can begin. This is where the old (pre-.NET) model and the new (.NET) model start to diverge significantly. In the case of unmanaged code, the compiler and linker have already turned the source into native instructions, so those instructions can begin to execute immediately. Of course, this means that you have to compile a separate version of the code for every different native environment. In some cases, because it is undesirable to ship and maintain a separate version for every possible native environment, only a compatible version is compiled and shipped. This leads to a lowest common denominator approach, because companies want to ship software that can run on as wide a range of environments as possible. Currently, few companies ship programs that target environments that have an accelerated graphics engine. Not only would the manufacturer need to ship a different program for each graphics accelerator card, but a different program also would need to be developed for those cases where a graphics accelerator was lacking. Other hardware features for which specific optimizations could be made include disk cache, memory cache, high-speed networks, multiple CPUs, specialized hardware for processing images, accelerated math functions, and so forth. In these and numerous other cases, compiling a program ahead of time results in either a highly optimized yet very specific program or an unoptimized and general program.
One of the first steps that the CLR takes in running a program is checking the method that is about to be run to see whether it has been turned into native code. If the method has not been turned into native code, then the code in the method is Just-In-Time compiled (JIT-compiled). Delaying the compilation of a method yields two immediate benefits. First, it is possible for a company to ship one version of the software and have the CLR on the machine where the program is installed take care of the specific optimizations that are appropriate for the hardware environment. Second, it is possible for the JIT compiler to take advantage of specific optimizations that allow the program to run more quickly than a general-purpose, unmanaged version of the program. For example, systems built with a 64-bit processor will have a "compatibility" mode that allows 32-bit programs to run unmodified on the 64-bit CPU. This compatibility mode will not result in the most efficient or fastest possible throughput, however. If an application is compiled into IL, it can take advantage of the 64-bit processing as long as a JIT engine can target the new 64-bit processor.
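The check-then-compile step described at the start of this passage can be modeled in a few lines. The sketch below is only an analogy and not how the CLR is implemented: Python source text stands in for IL, a Python function object stands in for native code, and the LazyMethod class and its members are hypothetical names introduced for illustration.

```python
class LazyMethod:
    """A method whose body is compiled the first time it is called.

    Hypothetical model: 'source' stands in for the method's IL, and
    'compiled' stands in for the native code produced by a JIT compiler.
    """

    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.compiled = None      # nothing has been compiled yet
        self.compile_count = 0    # how many times compilation has happened

    def __call__(self, *args):
        # Check whether the method has already been compiled.
        if self.compiled is None:
            namespace = {}
            code = compile(self.source, f"<{self.name}>", "exec")
            exec(code, namespace)
            self.compiled = namespace[self.name]
            self.compile_count += 1
        # Run the compiled form; later calls skip the compile step.
        return self.compiled(*args)


square = LazyMethod("square", "def square(x):\n    return x * x\n")
cube = LazyMethod("cube", "def cube(x):\n    return x * x * x\n")

results = [square(3), square(4)]  # 'cube' is never called
print(results, square.compile_count, cube.compile_count)  # prints [9, 16] 1 0
```

The method that is called twice is compiled once, and the method that is never called is never compiled, which is the as-needed behavior that the term Just-In-Time describes.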
The process of loading a method and compiling it if necessary is repeated until either all of the methods in the application have been compiled or the application terminates. The rest of this chapter explores the environment in which the CLR encloses each class method.