Programming in .NET: The Type System
Chapter 1 provided a high-level overview of the issues involved in building distributed systems. It introduced a solution to these issues, the .NET Framework, and used a simple "Hello World" example to highlight the language interoperability offered by the .NET Framework. But, as is so often the case, the devil lies in the details. Chapters 2 through 4 describe in more depth the three CLR subsystems: the type system (described in this chapter) and the metadata and execution systems (described in Chapters 3 and 4, respectively).
As noted in Chapter 1, the facilities provided by the type, metadata, and execution systems are not new. However, the CLR does provide functionality in addition to the services provided by other architectures, such as COM/DCOM, CORBA, and Java. For example:
The type system supports many programming styles and languages, allowing types defined in one language to be first-class citizens in other languages.
The metadata system supports an extensibility mechanism, called custom attributes, that allows developers to extend the metadata annotations.
The execution system ensures security and supports versioning on types in the CLR.
Using the .NET Framework, developers can both define and share types. Defining and sharing new types in a single language is not particularly challenging; allowing a newly defined type to be used in other languages is much more problematic. This chapter offers a sufficiently detailed understanding of the CLR type system so that developers can appreciate how it achieves type interoperability.
The Relationship Between Programming Languages and Type Systems
The Evolution of Type Systems
Why is a type system necessary at all? Some early programming languages did not provide a type system; they simply saw memory as a sequence of bytes. This perspective required developers to manually craft their own "types" to represent user-defined abstractions. For example, if a developer needed four bytes to represent integer values, then he or she had to write code to allocate four bytes for these integers and then manually check for overflow when adding two integers, byte by byte.
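To make the burden described above concrete, here is a minimal sketch in Python of adding two four-byte integers one byte at a time while tracking the carry and detecting overflow by hand. The function name `add_u32_bytes` and the little-endian byte order are assumptions chosen for illustration; they do not come from any particular early language.

```python
from typing import Tuple


def add_u32_bytes(a: bytes, b: bytes) -> Tuple[bytes, bool]:
    """Add two 4-byte little-endian unsigned integers one byte at a time.

    Returns the 4-byte sum and a flag that is True when the true result
    did not fit in four bytes (overflow).
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both operands must be exactly four bytes")
    result = bytearray(4)
    carry = 0
    for i in range(4):               # least significant byte first
        total = a[i] + b[i] + carry
        result[i] = total & 0xFF     # keep the low eight bits
        carry = total >> 8           # carry into the next byte
    return bytes(result), carry != 0


x = (300).to_bytes(4, "little")
y = (500).to_bytes(4, "little")
total, overflowed = add_u32_bytes(x, y)
print(int.from_bytes(total, "little"), overflowed)  # prints: 800 False
```

With a built-in integer type, all of this bookkeeping collapses into a single `+` expression, which is the abstraction that later type systems supplied.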
Later programming languages provided type systems, which included a number of built-in abstractions for common programming types. The first type systems were very low level, providing abstractions for fundamental types, such as characters, integers, and floating-point numbers, but little more. These types were commonly supported by specific machine instructions that could manipulate them. As type systems became more expressive and powerful, programming languages emerged that allowed users to define their own types.
Of course, type systems provide more benefits than just abstraction. Types are a specification, which the compiler uses to validate programs through a mechanism such as static type checking. (In recent years, dynamic type checking has become more popular.) Types also serve as documentation, allowing developers to more easily decipher code and understand its intended semantics. Unfortunately, the type systems provided by many programming languages are incompatible, so language integration requires the integration of different types to succeed.
Programming Language-Specific Type Systems
Before attempting to design a type system for use by multiple languages, let's briefly review the type systems used by some of the more popular programming languages.
The C programming language provides a number of primitive built-in types, such as int and float. These types are said to closely resemble a machine's architecture, as they can often be held in a single register and may have specific machine instructions to process them. The C programmer can also create user-defined types, such as enumerations or structures. Structures are essentially aggregate types that contain members of one or more other types.
The C++ programming language takes the type system of C and extends it with object-oriented and generic programming facilities. C++'s classes (essentially C structures) can inherit from multiple other classes and extend these classes' functionality. C++ adds few new built-in types (bool is one example) but does offer libraries, such as the Standard Template Library (STL), that greatly enhance the language's functionality.
SmallTalk is an object-oriented language in which all types are classes. SmallTalk's type system provides single-implementation inheritance, and every type usually directly or indirectly inherits from a common base class called Object, providing a common root class in the SmallTalk type system. SmallTalk is an example of a dynamically type-checked language.
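The two ideas in this description, a common root class and dynamic type checking, can be sketched in Python, which is used here only as a stand-in because it shares both properties; this is not SmallTalk code. The `Account` class and `deposit` function are hypothetical names introduced for illustration.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


# Every class, built-in or user-defined, descends from the root class `object`.
roots = [issubclass(t, object) for t in (int, str, Account)]


def deposit(account, amount):
    # No parameter types are declared: the operand types are checked
    # only when the addition actually executes.
    account.balance = account.balance + amount
    return account.balance


ok = deposit(Account(100), 50)            # int + int succeeds

try:
    deposit(Account(100), "fifty")        # int + str is rejected at run time
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True

print(roots, ok, failed_at_runtime)
```

A statically type-checked language would typically reject the second call at compile time instead, provided the parameter types were declared.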
Like SmallTalk, Java provides an object-oriented type system; unlike SmallTalk, it also supports a limited number of primitive built-in types. Java provides a single-implementation inheritance model with multiple inheritance of interfaces.
The Design Challenge: Development of a Single Type System for Multiple Languages
Given the variety of type systems associated with these programming languages, it should be readily apparent that developing a single type system for multiple languages poses a difficult design challenge. (Most of the languages mentioned previously are object-oriented.) Also, it is clear from the preceding survey that not all type systems are compatible. For example, the single-implementation inheritance model of SmallTalk and Java differs from the multiple-implementation inheritance capabilities of C++.
The approach taken when designing the CLR generally accommodated most of the common types and operations supported in modern object-oriented programming languages. In general terms, the CLR's type system can be regarded as the union of the type systems of many object-oriented languages. For example, many languages support primitive built-in types; the CLR's type system follows suit. The CLR's type system also supports more advanced features such as properties and events, two concepts that are found in more modern programming languages. An example of a feature not currently supported in the CLR is multiple-implementation inheritance, an omission that naturally affects languages that do support multiple inheritance, such as C++, Eiffel, and Python.
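As a minimal sketch of what multiple-implementation inheritance means in one of the affected languages, the Python fragment below defines a class that inherits method bodies from two unrelated base classes. The class names `Logger`, `Serializer`, and `LoggingSerializer` are hypothetical and chosen only for illustration.

```python
class Logger:
    def describe(self):
        return "logs messages"


class Serializer:
    def describe(self):
        return "serializes objects"

    def dump(self):
        return "dumped"


# One class inheriting implementation from two base classes.
class LoggingSerializer(Logger, Serializer):
    pass


obj = LoggingSerializer()

# Both bases define describe(); Python picks the first one found in the
# method resolution order, which lists Logger before Serializer.
print(obj.describe())   # prints: logs messages
print(obj.dump())       # prints: dumped

mro = [cls.__name__ for cls in LoggingSerializer.__mro__]
print(mro)  # prints: ['LoggingSerializer', 'Logger', 'Serializer', 'object']
```

A type system restricted to single-implementation inheritance cannot express `LoggingSerializer` directly, so a compiler targeting it has to map such a class onto some other arrangement of types.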
Although the CLR's type system most closely matches the typical object-oriented type system, nothing in the CLR precludes non-object-oriented languages from using or extending the type system. Note, however, that the mapping from the CLR type system provided by a non-object-oriented language may involve contortions. Interested readers should see the appendices at the end of this book for more details on language mapping in non-object-oriented languages.
CLR–Programming Language Interaction: An Overview
Figure 2.1 depicts the relationship between elements of the CLR and programming languages. At the top of the diagram, the source file may hold a definition of a new type written in any of the .NET languages, such as Python. When the Python.NET compiler compiles this file, the resulting executable code is saved in a file with a .DLL or .EXE extension, along with the new type's metadata. The metadata format used is independent of the programming language in which the type was defined.
Once the executable file for this new type exists, other source files, perhaps written in languages such as C#, Managed C++, Eiffel, or Visual Basic (VB), can then import the file. The type that was originally defined in Python can then be used, for example, within a VB source code file just as if it were a VB type. The process of importing types may be repeated numerous times between different languages, as represented by the arrow from the executable file returning to another source file in Figure 2.1.
At runtime, the execution system will load and start executing an executable file. A reference to a type defined in a different executable file will cause that file to be loaded and its metadata to be read, after which values of the new type can be exposed to the runtime environment. This scenario is represented by the line running from the execution system back to the executable files in Figure 2.1.
Figure 2.1 Interaction between languages, compilers, and the CLR
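The load-on-first-reference behavior described above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the `ToyExecutionSystem` class, the `FILES` dictionary, and the file and type names are all hypothetical, and nothing here reflects how the CLR is actually implemented.

```python
# Hypothetical stand-ins for executable files:
# file name -> metadata for the types that file defines.
FILES = {
    "shapes.dll": {"Circle": {"fields": ["radius"]}},
    "app.exe": {"Program": {"fields": []}},
}


class ToyExecutionSystem:
    """Toy model: a file is loaded only when something in it is first needed."""

    def __init__(self, files):
        self._files = files
        self.loaded = {}       # file name -> metadata read so far
        self.load_order = []   # records the order in which files were loaded

    def load(self, file_name):
        # Read the file's metadata once; later requests reuse it.
        if file_name not in self.loaded:
            self.loaded[file_name] = self._files[file_name]
            self.load_order.append(file_name)
        return self.loaded[file_name]

    def resolve(self, file_name, type_name):
        # A reference to a type triggers loading of the file that defines it.
        return self.load(file_name)[type_name]


runtime = ToyExecutionSystem(FILES)
runtime.load("app.exe")                            # start the program
before = "shapes.dll" in runtime.loaded            # not referenced yet
circle = runtime.resolve("shapes.dll", "Circle")   # first reference loads it
print(before, circle, runtime.load_order)
```

In this model the second file is untouched until the first reference to `Circle`, mirroring the on-demand loading that the paragraph describes.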