.NET Web Services and SOAP
As you've learned in Chapter 1, "Web Service Fundamentals," SOAP is a critical technological component in the .NET Web Service scheme. SOAP in general defines a mechanism for encoding information into an XML wrapper. For Web Services, SOAP gets a bit more specific when it facilitates the mappings between method signatures and the XML document.
The basic idea behind SOAP, at least as it was originally envisioned, is to interpret a remote method's parameter values at runtime and stuff those values into an XML document. The XML data is then transported to the remote server using HTTP (other transport protocols are also used, albeit currently outside .NET). If you use SOAP in this manner, you are using it as an implementation of a more general concept: the invocation of some remote method (implemented as a Web Service). Remote Procedure Call (RPC) protocols have the same objective: to carry local computer information to a remote computer, including information that might not make sense to a remote system without some conversion (for example, converting addresses of data to their textual equivalents). This allows the remote computer to execute the remote method on your behalf and return a result.
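To make the idea of stuffing parameter values into an XML document concrete, here is a minimal sketch in Python rather than .NET. The method name Add, its parameters a and b, and the namespace http://example.org/calc are hypothetical and chosen only for illustration; the envelope namespace is the standard SOAP 1.1 one. This is not how .NET builds its packets internally, only an approximation of the resulting shape.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(method_name, service_ns, params):
    """Wrap a method call and its parameter values in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # The method becomes an element in the service's namespace,
    # and each parameter becomes a child element holding its value as text.
    call = ET.SubElement(body, f"{{{service_ns}}}{method_name}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical method and namespace, for illustration only.
request_xml = build_soap_request("Add", "http://example.org/calc", {"a": 2, "b": 3})
print(request_xml)
```

The printed document contains an Envelope element, a Body element inside it, and inside the Body an Add element with one child per parameter. A server that understands the same mapping can reverse the process to recover the method name and arguments.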
In this chapter, you'll explore SOAP in some detail to see how it acts as a messaging protocol. To SOAP, the concept of an RPC protocol has a specific meaning. .NET, however, actually uses the SOAP protocol in a dual fashion. It uses SOAP to carry the information back and forth as SOAP messages or as true SOAP RPC calls, depending on how you configure your Web method.
After learning why SOAP is quickly becoming so successful in the industry, you'll dive into the protocol itself to see specifically how SOAP carries remote method information to and from a Web Service. You'll also learn how .NET employs the protocol.
Why Is SOAP Needed?
This innocent-looking question is actually a very good one to ask. RPC protocols grew from research in the mid-1980s that had roots all the way back to Vinton Cerf and Robert Kahn and their description of TCP/IP, and even to the invention of Ethernet itself, circa 1973 by Bob Metcalfe, later of 3Com fame. The concept of distributed computing dates back farther than that. The creators of the ENIAC envisioned that one day large numbers of computers would be linked to solve very complex problems.
In the case of RPC and SOAP, the distributed computing issue is simply one of consuming resources on a remote computer as if the remote computer and the calling (local) machine were the same machine. The goal is to seamlessly tie the distributed systems together so that when you call a given method, you don't know (and presumably don't care) whether the call is actually handled by a remote system. In real-world situations, you very often do care, if only because the call latency is greatly increased: it simply takes longer for the method to complete its task. For the purposes of discussing SOAP as a protocol, however, let's ignore those issues and imagine that SOAP, as an RPC protocol, actually does seamlessly integrate distributed systems. In critical cases when this model breaks down, you'll find the issue noted in the text.
Why Do You Need to Understand SOAP?
This is also a very good question. After all, given the power of .NET, should you be concerned about the underlying protocols .NET uses? As an analogy, when you bring up your email client, do you care how your email is sent or how attachments to your email messages are encoded?
You could make an excellent argument that you don't need to understand SOAP to program Web Services today. .NET handles the details for you. You write the code that .NET requires to handle the Web Service, or the invocation of the Web Service if you're writing client-side code. Then .NET takes care of the serialization aspects as well as the transmission of the information back and forth. As you'll see in Chapter 5, "Web Services and Description and Discovery," you don't require the more complex aspects of the SOAP protocol because you tell the world what your packets look like using the Web Services Description Language (WSDL).
Those complex SOAP encoding practices were required when client and server had to agree on a protocol using static code. WSDL allows for dynamic packet layout and description—at least, from an early binding perspective—rendering the deeper encoding structures less necessary and even obsolete. This is so because the RPC style of encoding is going out of fashion in favor of the wider range of encoding possibilities that WSDL document/literal encoding offers. You are free to encode information as you see fit rather than blindly follow the SOAP specification itself, at least as far as Section 5 is concerned.
In this case, the term early binding refers to tying the client code to the Web Service when the code is compiled. With .NET, the WSDL is read to create proxy source code that is compiled into the client. Dynamic proxy generation, to be used for truly late-bound Web Services at runtime, is certainly a possibility but is not currently supported by .NET. It is an alternative offered by the SOAP Toolkit, however, if you require this capability.
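The difference between the two binding styles can be sketched in a few lines of Python. Everything here is hypothetical: the class names, the Add operation, the dictionary standing in for a parsed WSDL description, and the fake_transport function that replaces the network round trip. It illustrates the concept only and does not reflect how .NET or the SOAP Toolkit implement their proxies.

```python
def fake_transport(operation, params):
    """Hypothetical stand-in for serializing a call and sending it to a server."""
    if operation == "Add":
        return params["a"] + params["b"]
    raise ValueError(f"unknown operation: {operation}")


class EarlyBoundCalculator:
    """Stands in for proxy source generated from the description ahead of time.

    The Add method exists in the source code, so a typo in the method name
    is caught before the client ever runs against the service.
    """

    def __init__(self, transport):
        self._transport = transport

    def Add(self, a, b):
        return self._transport("Add", {"a": a, "b": b})


class LateBoundProxy:
    """Resolves operations from a service description at runtime."""

    def __init__(self, description, transport):
        self._operations = description  # maps operation name -> parameter names
        self._transport = transport

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails.
        if name not in self._operations:
            raise AttributeError(name)
        param_names = self._operations[name]

        def call(*args):
            if len(args) != len(param_names):
                raise TypeError(f"{name} expects {len(param_names)} arguments")
            return self._transport(name, dict(zip(param_names, args)))

        return call


early = EarlyBoundCalculator(fake_transport)
late = LateBoundProxy({"Add": ["a", "b"]}, fake_transport)
print(early.Add(2, 3), late.Add(2, 3))
```

The early-bound class must be regenerated whenever the service description changes, whereas the late-bound proxy adapts to whatever description it is handed, at the cost of discovering mistakes only when the call is made.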
But if you really examine what you intend to do, the "why do I care" façade breaks down. Returning to our analogy, an email consumer doesn't need to understand the details associated with the Internet email protocols, or email attachment encoding, for that matter. But developers need to understand these protocols to write code that uses them directly. Blindly trusting infrastructure might get you 80% of the features you require. After all, the infrastructure was designed to satisfy the needs of the general populace. That other 20% or so requires the true ingenuity that comes from understanding the lower levels of the technology.
This chapter won't make you an expert on SOAP, but it will give you the understanding that you'll require to write professional-grade Web Services and clients. So back to the initial question: why SOAP? Let's explain it this way.
The SOAP Advantage
Probably the best-known RPC protocol is DCE-RPC, which is the Distributed Computing Environment's implementation. Many Unix environments use DCE-RPC, as does Microsoft Windows (which modified it slightly to handle object references across machines to support DCOM). DCE-RPC requires the use of a port mapper, which you'll find listening (monitoring network traffic) on TCP/UDP Port 135. Whenever you want to access a remote computer using DCE-RPC, you access the remote system's port mapper and request a socket address. The actual distributed communication then takes place over the assigned port.
The issue here is one of security when your business-critical servers are safely tucked behind a firewall. For DCE-RPC to work, not only do you have to open Port 135 to the world for port-mapping purposes, but you also need to have a range of other socket addresses available for the general public to use for RPC communications. This very often leads to an opening through which some 13-year-old will ruin your crucial data as well as your day. So it isn't surprising to find that nearly every business IT guru locks down Port 135 and almost every other port. The one universal exception is Port 80, or Port 443 for secure sockets.
Port 80 is the network socket port used by HTTP, at least as it is nominally configured (Port 8080 is often used to manage the Web server, and HTTP is also spoken there). As you probably already know, the Hypertext Transfer Protocol (HTTP) is the lower-level network protocol used to shuttle Hypertext Markup Language (HTML) documents around the Internet. HTTP is the transport protocol for Web pages, and because you can bet that practically every corporate vice president likes to surf the Net, you'll probably find Port 80 open through any firewall you're likely to encounter.
This is SOAP's secret weapon and one of the sources of its power. It's almost unheard of to find someone blocking Port 80 with a corporate firewall, so SOAP (as bound to HTTP) should pass through corporate firewalls untouched.
The other source of SOAP's power is the fact that the information transported by the HTTP protocol is actually XML (which is why Chapter 3, "Web Services and XML," dealt so heavily with XML within the .NET Framework). To be more specific, the content-type of the HTTP packet is text/xml. The remainder of this chapter is dedicated to uncovering the XML format that SOAP uses to serialize method parameter information, starting with the SOAP XML object model.
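As a rough illustration of SOAP bound to HTTP, the following Python sketch prepares (but does not send) an HTTP POST carrying an XML payload with a content-type of text/xml. The URL, the empty envelope, and the action URI are hypothetical. The SOAPAction header is not discussed above; it is included here because the SOAP 1.1 HTTP binding defines it.

```python
import urllib.request


def make_soap_http_request(url, soap_xml, soap_action):
    """Prepare an HTTP POST that carries a SOAP envelope as text/xml."""
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # SOAP 1.1 HTTP binding header; the value is a quoted URI.
        "SOAPAction": f'"{soap_action}"',
    }
    return urllib.request.Request(
        url, data=soap_xml.encode("utf-8"), headers=headers, method="POST"
    )


# Hypothetical endpoint, payload, and action URI, for illustration only.
envelope = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body/></soap:Envelope>"
)
req = make_soap_http_request(
    "http://example.org/calc.asmx", envelope, "http://example.org/calc/Add"
)

# urllib stores header names capitalized as "Content-type" and "Soapaction".
print(req.get_method(), req.full_url)
print(req.get_header("Content-type"))
```

Because the URL uses the http scheme with no explicit port, sending this request would target Port 80, which is the same path through the firewall that ordinary Web page traffic takes.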