Why cloud computing?

Ed Felten writes about the economic forces that drive cloud computing, arguing that a prime driver is the desire to reduce administrative costs:
Why, then, are we moving into the cloud? The key issue is the cost of management. Thus far we focused only on computing resources such as storage, computation, and data transfer; but the cost of managing all of this -- making sure the right software version is installed, that data is backed up, that spam filters are updated, and so on -- is a significant part of the picture. Indeed, as the cost of computing resources, on both client and server sides, continues to fall rapidly, management becomes a bigger and bigger fraction of the total cost. And so we move toward an approach that minimizes management cost, even if that approach is relatively wasteful of computing resources. The key is not that we're moving computation from client to server, but that we're moving management to the server, where a team of experts can manage matters for many users.

This certainly is true to an extent and it's one of the driving factors behind all sorts of outsourced hosting. Educated Guesswork, for instance, is hosted on Dreamhost, in large part because I didn't want the hassle of maintaining yet another public Internet-accessible server. I'm not sure I would call this "cloud computing", though, except retroactively.

That said, the term "cloud computing" covers a lot of ground (see the Wikipedia article), and I don't think Felten's argument holds up as well when we look at examples that look less like outsourced applications. Consider, for example, Amazon's Elastic Compute Cloud (EC2). EC2 lets you rapidly spin up a large number of identical servers on Amazon's hardware and bring them up and down as required to service your load. Now, there is a substantial reduction in management overhead at the hardware level, in that you don't need to contract for Internet, power, HVAC, etc., but since you're running a virtualized machine, you still have all the software management issues Ed mentions, and they're somewhat worse since you have to work within Amazon's infrastructure (see here for some complaining about this). Much of the benefit of an EC2-type solution is extreme resource flexibility: if you have a sudden load spike, you don't need to quickly roll out a bunch of new hardware, you just bring up some EC2 instances. When the spike goes away, you shut them down.

A related benefit is that this reduces resource consumption via a crude form of stochastic multiplexing: if EC2 is running a large number of Web sites, they're probably not all experiencing spikes at the same time, so the total amount of spare capacity required in the system is a lot smaller.
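To make the multiplexing argument concrete, here is a toy simulation. It is only a sketch under invented assumptions: every site has the same baseline load, spikes independently with the same small probability, and all of the numbers (200 sites, a 2% spike probability, a 10x spike) are made up for illustration rather than taken from EC2 or any real workload. It compares the capacity you'd need if every site provisioned for its own peak against the capacity a shared pool needs to cover the worst aggregate load it saw.

```python
import random

def simulate_capacity(num_sites=200, num_steps=2000, base_load=1.0,
                      spike_load=10.0, spike_prob=0.02, seed=0):
    """Return (dedicated, pooled) capacity for a toy hosting workload.

    All parameters are hypothetical illustration values, not measurements.
    dedicated: sum of each site's own peak load (everyone provisions alone).
    pooled:    peak of the summed load (one shared pool serves everyone).
    """
    rng = random.Random(seed)
    per_site_peak = [0.0] * num_sites
    pooled_peak = 0.0
    for _ in range(num_steps):
        total = 0.0
        for site in range(num_sites):
            # Assumption: spikes are independent across sites and time steps.
            load = spike_load if rng.random() < spike_prob else base_load
            if load > per_site_peak[site]:
                per_site_peak[site] = load
            total += load
        if total > pooled_peak:
            pooled_peak = total
    return sum(per_site_peak), pooled_peak

if __name__ == "__main__":
    dedicated, pooled = simulate_capacity()
    print(f"dedicated capacity: {dedicated:.0f}")
    print(f"pooled capacity:    {pooled:.0f}")
    print(f"pooled / dedicated: {pooled / dedicated:.2f}")
```

The pooled figure can never exceed the dedicated one, since the peak of a sum is at most the sum of the peaks. How much smaller it is depends entirely on the independence assumption: if every site spikes at once (say, the day before a big holiday), most of the saving disappears.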

Both of these benefits apply as well to applications in the cloud (for instance, Ed's Gmail example). If you run your own mail server, it's idle almost all the time. On the other hand, if you use Gmail (or even a hosted service), then you are sharing that resource with a whole bunch of different people and so Amazon just needs enough capacity to service the projected aggregate usage of all those people, most of whom aren't using the system very hard (what, you thought that Amazon really had 8G of disk for each user?). At the end of the day, I suspect that the management cost Ed cites is the dominant issue here, though, which, I suppose, argues that lumping outsourced applications ("software as a service") together with outsourced/virtualized hardware as "cloud computing" isn't really that helpful.


> …Amazon just needs enough capacity…
> …Amazon really had 8G of disk for each user?

I think you mean Google.

I wouldn't call EC2 a "crude form of stochastic multiplexing" -- nothing crude about it. The larger they get, the more the law of large numbers governs.

On the shared resource vs. administrative load front, administrative load explains why the customer wants to use Gmail instead of a low-performance VM in a cloud, but not why it is profitable for Google to run the thing -- that part is multiplexing of resources.

I think you and Ed are talking about two different things. Ed's talking about the "cloud-ization" of *client* applications--the move from pure desktop applications to desktop-plus-cloud hybrids or even browser-based applications with little-to-no client-side processing. You're talking about the "cloud-ization" of *server* applications--the shift from internally operated enterprise servers and datacenters to external hosting services.

In both cases, though, I'd say the same general effect is at work: various operational costs scale better than linearly with the number of application users. As Ed points out, certain aspects of management, such as the cost of sophisticated management tools and the expertise to use them, fit that pattern. So, as you note, do peak resource requirements, including CPU and bandwidth. No doubt there are others as well, including perhaps some of the infrastructure costs associated with running processing, storage and communications gear.
