Cloud Computing (Distributed Computing) Paradigms
Last updated Mar 28, 2003.
The term "Cloud Computing" is thrown around so much these days that, like most buzzwords, it has lost much of its meaning. In this article I'll explain "the cloud" for data professionals and other technical folks using some industry-standard terms, and by explaining the places where each form of cloud computing fits. Before I begin, it's important to remember that no single technology or methodology is a panacea for all computing needs. The cloud, just like any other computing technology, should be used to solve a problem, not to make a single approach your only strategy.
Instead of using the (hopefully soon to be outdated) term "cloud", I prefer the term "Distributed Computing" because it more accurately reflects how these technologies are actually arranged. The general concept of "the cloud" is that some part of the computing or storage is out of your direct control. The other part of distributed computing that makes it a cloud is that the organization providing it has some sort of automated provisioning of the resources. In other words, it's treated more as a service than something you have to build in a case-by-case fashion.
But even within this definition there are multiple ways to create a good distributed architecture. Each approach has its strengths and weaknesses, and each has its own set of good use cases. Many companies sell systems built on these designs, which is called a "public cloud", and you can also implement them yourself, which is called a "private cloud". Those terms merely refer to who owns the artifacts, not how they are implemented.
In this article, I'll describe these paradigms. Like all categories, they aren't absolute — you might have a mixture of one in another and so on. I'll also mention only Microsoft's implementation of these architectures — I'll leave other companies to describe their own implementations. While the terms I'll use are industry standard, each vendor implements them in their own way.
Infrastructure as a Service (IaaS)
In this paradigm, the hardware is abstracted away.
This is one of the simplest paradigms to understand and implement. In an IaaS arrangement, you simply virtualize a server, and run the Virtual Machine (VM) somewhere else.
There are other parts of infrastructure that can be hosted by someone else as well, such as remote storage, remote security devices and so on. Anything that moves the hardware portion of your infrastructure to another location and provisions it for you is considered an IaaS "cloud".
In an IaaS situation, the IT team for the most part treats each VM just like it would any other server in a Data Center building or room. You have complete control of the operating system, virus scanners, drivers, patch-levels and even drive letters.
This makes it pretty easy to move just about any application you have on-premise to an IaaS provider — even if you are that provider (private cloud). It's quick, simple and easy to understand.
Using IaaS is easiest to think of in terms of installation. If you need to run SETUP.EXE, or an INSTALL.CMD, then IaaS should be an option. For instance, if you need to run SQL Server in another location, IaaS is a good candidate. Of course, you can also use IaaS when you write software as a target for distribution as well.
Virtualizing into an IaaS service allows agility as well. If you need a server, you can simply use the provisioning process provided by the IaaS vendor (or your own processes in the case of a private cloud) to instantly obtain a new "server". Normally these are pre-staged images with the desired patches, configurations and sometimes even other software (such as SQL Server) installed.
Another use-case for IaaS is when the software to be installed is not designed to scale outward — meaning that the original intent of the package was to be installed on one system at a time. In the case of SQL Server, although you can install the individual parts of the server such as the database engine and the Reporting Services components, each of those is "atomic" — each must be installed on a single server. In other words, you can't install the SQL Server database engine component on multiple servers under one name, in a load-balancing configuration. It's designed to be installed on one system at a time. This holds true for other software packages such as Exchange and so on.
Since you're only moving the servers, memory and other physical components away, you still have to pay for, configure, maintain and upgrade the operating systems, drivers, run-times (such as Java or .NET) and so on in an IaaS configuration. And the more of these you have, the more configurations and systems you need to keep in sync at one time.
There's also an issue of latency to consider. Although everything is self-contained on a single Virtual Machine and the performance may be quite good, the applications that use these systems may be located far away from the systems. If the application is designed from the outset to understand this latency and deal with it, then you may not have an issue. But if the application normally expects the server to be on a private network, you either need to make changes to it (if possible) for retry logic and so on, or consider using IaaS in a private cloud configuration.
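To make the "retry logic" idea concrete, here is a minimal sketch of one common approach, retrying a remote call with exponential backoff. The names `TransientError`, `call_with_retry` and `make_flaky_query` are hypothetical and exist only for this illustration; a real application would catch whatever timeout or connection errors its own database driver raises.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error standing in for a dropped or timed-out remote call."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            # Wait base_delay, then twice that, then four times that, and so
            # on, plus a little random jitter so that many clients do not all
            # retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))


def make_flaky_query(failures_before_success):
    """Build a fake remote query that fails a set number of times, then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("simulated network timeout")
        return "result after %d calls" % state["calls"]

    return query
```

Calling `call_with_retry(make_flaky_query(2))` would fail twice, wait after each failure, and return on the third attempt. The `sleep` parameter is there only so the waiting can be swapped out when experimenting; the defaults are arbitrary and would need tuning for a real workload.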
Virtual Machines only scale "up" to a certain extent. That means you can only add so many processors, so much memory and so on to make the server "bigger" in an IaaS configuration. If you are writing your own software, you can compensate by using a scale-out development paradigm, reducing this issue.
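As a rough illustration of what "scale-out" can mean in your own code, the sketch below spreads work across several servers by hashing a key, instead of making one server bigger. The server names and the `pick_server` and `distribute` functions are assumptions made up for this example, not part of any vendor's API.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def distribute(keys, servers):
    """Group keys by the server each one is assigned to."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[pick_server(key, servers)].append(key)
    return placement


if __name__ == "__main__":
    # Hypothetical VM names; adding capacity means adding names to this list.
    vms = ["vm-a", "vm-b", "vm-c"]
    customers = ["customer-%d" % n for n in range(10)]
    for server, assigned in distribute(customers, vms).items():
        print(server, assigned)
```

The same key always lands on the same server, so each VM only has to handle its own share of the data. One caveat of this simple modulo scheme is that changing the number of servers reassigns most keys; consistent hashing is a common refinement when that matters.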
Another consideration is how you implement High Availability and Disaster Recovery (HA/DR) in an IaaS configuration. You will need to work closely with your public or private cloud provider to ensure that you have whatever geographical fault-domains specified that the server needs. Not all virtualization software supports the use of Windows or Linux clustering, since some of those technologies require direct hardware attachments that may not be possible for some VM Hypervisors.
There are many companies that offer IaaS. Microsoft uses a mix of the Windows Server operating systems, its Hyper-V technology for VMs, and a new add-in to Microsoft's System Center to allow you to provision and manage a private cloud. They do not host Virtual Machines in an IaaS configuration, but they do allow you to work with several hosting facilities that provide this function. You can read more about how they do that here.
Software as a Service (SaaS)
In this paradigm, everything is abstracted away.
It's actually pretty simple to grasp this concept — you log on to a system, use it, and then log off. There's nothing to install, configure, control, patch, or even understand — the only thing you focus on is the user interface provided to you by the SaaS provider.
If you think about it, this is the way our users view us as the IT staff. They log on to their e-mail server, use a CRM client that hits SQL Server or any number of other software services we provide for them each day. The only difference is that in our case, the business is footing the bill — when you or I decide to use a SaaS provider we pay for it.
If a software offering is a perfect fit for your needs, then you should consider SaaS. In many cases you can simply bypass buying hardware and software, learning it, and configuring and maintaining it when there is something out there that will already provide the software service you need. In fact, most organizations already use a SaaS provider for things like payroll and even human resources software.
Another use-case is when an organization has limited IT staff available. In very small shops, say fewer than 15 or 20 users, there's no way to afford a full-time IT person to run the organization's technology needs. The organization can purchase everything from word processing to e-mail, finance, and even business operations online.
Of course, there are drawbacks. One is that you may be locked in to the vendor. Often the vendor doesn't allow you to move the data from your applications to another vendor, or even back on site.
Which brings up a matter of trust. Will the vendor still be in business in a year? Five years? Twenty? How do they protect that data you're storing there? Who has access to it?
And then there's the cost. It's not just a simple matter of paying for the initial cost of each seat or person that uses the software; you also have to understand the ways in which the vendor can raise prices. After all, once your entire business is running on a vendor's SaaS offering, and especially if you can't easily transfer it to another vendor or simply bring the data to your location, the vendor might want to raise the price of your access to that data. It's the same concern most people (including myself) have with Home Owner's Associations here in the United States. Many of the contracts that govern those have no written "cap", meaning that theoretically they could raise fees immensely and there's little you could do about it.
The Microsoft solutions for SaaS are Office 365 and Hotmail, along with Office Online. These are offerings you can pay a subscription to access, and then you can use typical Microsoft Office applications without installing or configuring servers. Since the formats are standard Microsoft documents, you can bring much of this data back, but the other caveats still apply. You can read more about these here.
Platform as a Service (PaaS)
In this paradigm, the hardware, operating system, virtualization and code runtime are abstracted away, as is the High Availability to some degree. Scale is also handled by the PaaS provider.
It's easiest to think of a PaaS system as one single computing resource. You can write code that will scale outward as you pay for more capacity. Of course, there is a specific way to write this type of code, called "stateless programming", but it's well understood and not specific to any one vendor. In fact, at its core, good web development is usually written in a stateless fashion.
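Here is a minimal sketch of what "stateless" means in practice. The handler keeps nothing in its own memory between requests; everything it needs either arrives with the request or lives in storage that every instance can reach. `SharedStore` is a hypothetical in-memory stand-in for that external storage, and `handle_add_to_cart` is an invented example, not any particular provider's API.

```python
class SharedStore:
    """Hypothetical stand-in for external storage shared by all instances."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def handle_add_to_cart(store, user_id, item):
    """Stateless handler: reads the cart from the store, updates it, writes it back.

    Nothing is remembered in module-level or instance variables, so any
    copy of this code on any machine can serve the next request.
    """
    cart = list(store.get(user_id, []))
    cart.append(item)
    store.put(user_id, cart)
    return cart
```

Because the cart lives in the store and not in the handler, the provider is free to run one copy of the code or fifty, and to send each request to whichever copy happens to be available.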
A PaaS system starts with writing code. PaaS is not meant to host 3rd party software — that's the job of IaaS. In PaaS you write code, deploy that to the PaaS provider, and your users access that. In this case your users might be internal to your company or they may be customers accessing a website. An example is the http://LoveCleanLondon.org site. This website is running on the Windows Azure and SQL Azure PaaS, and the users are actually the citizens of the city of London, England in the UK.
Another use-case for PaaS is when you have no desire or need to control the underlying components such as the operating system or the scaling technology. Your desire is to have a place that will run your code and be available for your users or customers.
The first considerations for a PaaS provider are the platforms available. That normally equates to a programming language, but may include underpinnings such as .NET, Java runtimes, environments and so on.
Once again, you need to be able to trust the vendor. This time you not only have to trust that they will be around in a few years, but you need to trust their security, facilities, global reach, bandwidth, and even their upgrade procedures. This means you need to learn what those are, and operate within them.
The cost model for a PaaS provider often has multiple factors. For instance, there may be a charge for the computing power, the storage, and the network bandwidth or connections — or all of the above. This is a different way of thinking about computing, since we're often used to paying a large fee up-front when we purchase a technology and then using it "for free" from then on. Of course, we're always renewing software contracts, buying new servers, hiring new people to learn and run those systems and so on, but we tend to forget those costs. You'll need to factor those in when you're comparing the costs of a PaaS offering, and you may even need to restructure accordingly.
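A small worked comparison can make this kind of cost thinking easier. The sketch below adds up a metered monthly bill and sets it beside an up-front purchase spread over its useful life. Every rate and price in it is an invented placeholder, not a real vendor price, and the function names are hypothetical.

```python
def monthly_paas_cost(compute_hours, storage_gb, egress_gb, rates):
    """Sum the metered charges for one month."""
    return (compute_hours * rates["compute_per_hour"]
            + storage_gb * rates["storage_per_gb"]
            + egress_gb * rates["egress_per_gb"])


def monthly_on_premises_cost(hardware_price, lifetime_months,
                             licences_per_year, staff_per_year):
    """Spread the up-front purchase over its lifetime and add recurring costs."""
    return hardware_price / lifetime_months + (licences_per_year + staff_per_year) / 12


if __name__ == "__main__":
    # Placeholder rates, made up for illustration only.
    rates = {"compute_per_hour": 0.10, "storage_per_gb": 0.15, "egress_per_gb": 0.12}
    # Two instances running all month (about 720 hours each).
    paas = monthly_paas_cost(compute_hours=1440, storage_gb=100, egress_gb=50, rates=rates)
    on_prem = monthly_on_premises_cost(hardware_price=12000, lifetime_months=36,
                                       licences_per_year=2400, staff_per_year=6000)
    print("PaaS per month:        %.2f" % paas)
    print("On-premises per month: %.2f" % on_prem)
```

The point is not the particular numbers but the shape of the comparison: the on-premises side only looks "free" after purchase if the renewals, replacement hardware and staff time are left out.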
Microsoft implements PaaS in two systems: Windows Azure and SQL Azure.
Windows Azure is composed of three parts. The first is the "Computing" function, which provides the ability to write web sites using .ASPX or PHP or Tomcat servers. You can also write .NET code, Java, Ruby on Rails, C++ and other code using the compute function.
The second component of Windows Azure is storage. Windows Azure has multiple types of storage, from Binary Large Objects (Blobs) to Key-Value Pair C++-like "HashTables" which can be quite large in size and are often used in a fashion similar to "NoSQL" systems. There are also queues that allow the compute functions to maintain a stateless paradigm.
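To show how a queue helps compute code stay stateless, here is a small sketch in which a front end drops a self-describing message onto a queue and a separate worker picks it up later. Python's standard `queue.Queue` is used purely as a local stand-in for a hosted queue service; `submit_order` and `process_next` are hypothetical names for this illustration and do not represent the Windows Azure API.

```python
import queue


def submit_order(work_queue, order_id, quantity, unit_price):
    """Front end: describe the work fully in a message and return immediately."""
    work_queue.put({"order_id": order_id, "quantity": quantity, "unit_price": unit_price})


def process_next(work_queue, results):
    """Worker: take one message and process it using only what the message contains.

    Returns True if a message was processed, False if the queue was empty.
    """
    try:
        message = work_queue.get_nowait()
    except queue.Empty:
        return False
    results[message["order_id"]] = message["quantity"] * message["unit_price"]
    work_queue.task_done()
    return True
```

Since each message carries everything the worker needs, it does not matter which worker instance handles it, or whether the front end that created it is still running.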
The third component of Windows Azure is the "Application Fabric". This component provides a security layer so that you can use Windows Authenticated logins for your applications, or OpenID systems such as Windows Live, Facebook and Yahoo logins. It also has a "Service Bus" layer that you can use for message-queue type communications both in and out of your own organization, which you could use to make a "Hybrid Cloud", using applications in your organization on the web. It also has a cache component, so that your applications can perform better.
SQL Azure is essentially a version of SQL Server running on the web — and I've described it in more detail here.
You can use any or all of these components (compute, storage, application fabric, SQL Azure) together or separately. You pay for what you use in different ways for each of them.