This is the first time I have heard of power constraints on the grid as justification for quarterly earnings results. In no way am I refuting it; it's actually becoming a critical problem – California is a great example. In 2006 we had a rather warm day – about 114 degrees as I was driving along Interstate 280. Fortunately it was a Sunday, so there were not a lot of commercial enterprises in full-tilt operation. The next day it was reported that the northern California power grid had come within about 380 megawatts of the point where rolling blackouts start.
To put this in perspective, I was at a data center the other day – Switch in Las Vegas – that is a 100 megawatt critical load facility. The new Microsoft data center in Chicago is in the same range, and while I have seen no hard data on the size of the Google or Apple facilities, given the square footage and power densities available today they are most likely in the same power range too. Short version: northern California was three to four data centers short of rolling blackouts.
See, here’s what happened. For a long time we didn’t care. When we were designing products like the Catalyst switches in the 2001-2006 timeframe, power simply was not a concern – density was the main concern. We needed to get more and more switch ports into a smaller form factor because our customers were running out of space. We saw this in every tech segment – denser servers, denser storage, denser networks. I often joke that we had the densest product management teams out there – I can say this, I was one of them.
Coupled with this density increase was the trend toward data center consolidation. The gist of it: as bandwidth costs went down post-2001 and the operating costs of these increasingly complex data centers went up, the cost slopes inflected, and it became operationally advantageous to run an IT infrastructure with as few physical facilities as possible while maintaining a viable disaster recovery/avoidance plan. When people consolidated their data center facilities, what had been distributed became centralized, what was centralized became measurable behind the power meter, and the facilities team decided to pass the power bill for this now measurable, consolidated facility to the IT department. Oops – that wasn’t in the budget plan!
We ran out of space – the industry delivered denser equipment.
It got expensive to maintain them, and more cost effective to connect them – data centers consolidated.
Then we ran out of cooling capacity – and this made many cooling vendors very happy. Enter the Computer Room Air Conditioner, colloquially known as the CRAC unit – and yes, to a facilities manager being pushed to cool these massively dense systems, they are quite the addictive element when you have cooling problems.
Then, lastly, as Stacey indicates, we started running out of power. You go to the local power provider and say, “Hey! I need another 10 megawatts, please sir.” Then they pretty much laugh at you, almost hysterically. In some cases you are asked to build your own substation and finance their build-out to support you; in some cases you opt for co-generation; in others you hope a second provider exists. But generally your option is to find a new location that has the power density you require.
Now let’s look at this from an economic perspective in California. California, as a stand-alone entity, is the world’s 6th largest economy. And just FOUR more data centers would break the power infrastructure in this state. Thus we have witnessed the ‘Flight of the Data Centers’ (cue Wagner now for a laugh). Businesses have moved their data center facilities in one safe direction – away. Las Vegas, Oregon, Washington – places with sustainably generated, renewable, low/no carbon emission power sources that are cheap.
What does this mean for capital equipment sales taxes, import tariffs, infrastructure investment, job creation, and property taxes in California? I think we can see that it is not positive at a local level, although most would not argue that it is the right choice for the business.
Q: So what can, or should we do about this?
A: There is no silver bullet.
Every company you talk to will have their own angle on what can be done. There is no one all-encompassing right answer.
“Cloud!” some will say – but that moves the problem from you to someone else; granted, it is someone who hopefully has the technical expertise and economies of scale to run the IT infrastructure more efficiently than you can in your own domain.
“DC power!” Of course – run everything DC, that will save power. The studies vary, just like the mileage on my car; however, since most of the IT infrastructure is AC today, that is an awful lot of equipment that would have to be replaced – which generally seems to benefit the vendors pushing DC.
Higher-voltage AC – possible; not the be-all/end-all, but it could improve efficiencies, especially in the US where 110V may be the norm. Even going from 208V to ~240V would be an improvement.
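The physics behind this point can be sketched with a toy calculation: for a fixed load, a higher supply voltage means lower current, and resistive losses in the distribution wiring fall with the square of that current. The wire resistance below is a made-up round number for illustration, not a measured value from any real facility.

```python
# Illustrative only: resistive (I^2 * R) loss in a power feed for a fixed
# load at different supply voltages. The 0.05 ohm wire resistance is an
# assumed figure, chosen just to make the comparison visible.

def distribution_loss_watts(load_watts, volts, wire_ohms=0.05):
    """Watts lost in the feed wiring for a given load and supply voltage."""
    amps = load_watts / volts
    return amps ** 2 * wire_ohms

for volts in (110, 208, 240):
    loss = distribution_loss_watts(5000, volts)
    print(f"{volts:>3} V feed for a 5 kW load: {loss:6.1f} W lost in wiring")
```

The absolute numbers are hypothetical, but the shape is real: moving from 110V to 240V cuts the current by more than half, and the wiring loss by roughly a factor of five.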
Solid state disk – can certainly handle more IOPS than spinning disks at a more efficient power draw. A good example: Cisco was touting 400,000 IOPS per blade in a UCS system. If you supported this with rotating media, it would take around 2,000 hard drives per UCS blade to absorb the I/O load – suffice to say, that is a lot of capacity. At 11-16W of power per hard drive (idle versus active), and assuming usage averages 50%, that’s 27 kilowatts of power per blade in storage requirements – thus SSD is the more logical choice, with far faster I/O rates.
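The arithmetic above is easy to verify. The 400,000 IOPS figure and the 11-16W / 50% usage assumptions come from the text; the 200 IOPS per hard drive is my own assumed figure for rotating media, which is what makes the drive count come out to 2,000.

```python
# Back-of-the-envelope check of the SSD-vs-HDD power numbers.
BLADE_IOPS = 400_000
HDD_IOPS = 200            # assumed per-drive rate for rotating media
IDLE_W, ACTIVE_W = 11, 16  # per-drive power range from the text
DUTY = 0.5                # 50% average usage, per the text

drives_needed = BLADE_IOPS // HDD_IOPS           # 2000 drives
avg_watts = IDLE_W + (ACTIVE_W - IDLE_W) * DUTY  # 13.5 W per drive
total_kw = drives_needed * avg_watts / 1000      # 27 kW per blade
print(f"{drives_needed} drives at {avg_watts} W each = {total_kw} kW")
# → 2000 drives at 13.5 W each = 27.0 kW
```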
Energy Efficient Ethernet – why run a port at 10Gb when it may only need 1Gb or 100Mb? Energy Efficient Ethernet is being developed within the IEEE to lower the power draw, rate-adapt, and reduce power when load factors are low or the port is in a shutdown state.
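The rate-adaptation idea can be modeled in a few lines. The per-rate power figures below are invented round numbers for illustration, not measurements from any real PHY; the point is simply that a port which steps down to match demand spends most of its time at a fraction of its peak draw.

```python
# Toy model of Energy Efficient Ethernet-style rate adaptation.
# Power figures are assumed, for illustration only.
PORT_POWER_W = {10_000: 5.0, 1_000: 1.0, 100: 0.3}  # link rate (Mb/s) -> watts

def adapted_rate(demand_mbps):
    """Pick the lowest link rate that still satisfies current demand."""
    for rate in sorted(PORT_POWER_W):
        if demand_mbps <= rate:
            return rate
    return max(PORT_POWER_W)  # saturated: run at full rate

# A port fixed at 10Gb burns the full 5 W all day; a rate-adapting port
# steps down when traffic is light.
for demand in (30, 800, 4000):
    rate = adapted_rate(demand)
    print(f"demand {demand:>5} Mb/s -> link at {rate} Mb, {PORT_POWER_W[rate]} W")
```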
Smarter software – if the ports are administratively shut down, why do the chips initialize and draw power? By that I mean: when a port is put in an administratively down/down state, why should the PHY and MAC power up?
Why not keep as many system components as possible in a low power draw state, and have a sublinear power curve based on true demand, rather than having the device run ‘hot’?
Now, I know why this is – it’s because it is easier on the software developers in the engineering team. But I think we should all start demanding more efficient and effective power management – a variable-speed fan is great, but there is a lot more that can be done.
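The demand-driven alternative asked for above can be sketched simply: power components up lazily when they are enabled, rather than initializing everything ‘hot’ at boot. The class names and per-component wattages here are hypothetical, chosen only to show the shape of the idea.

```python
# Sketch of demand-based power-up: a port's PHY/MAC draw power only when
# the port is administratively enabled. All figures are assumed.

class Port:
    PHY_W, MAC_W = 0.8, 0.4  # assumed draw per powered port (watts)

    def __init__(self):
        self.admin_up = False
        self.powered = False  # lazy: nothing draws power until enabled

    def enable(self):
        self.admin_up = True
        self.powered = True   # power PHY/MAC only when actually in use

    def disable(self):
        self.admin_up = False
        self.powered = False  # admin down -> chips fully de-powered

    @property
    def watts(self):
        return (self.PHY_W + self.MAC_W) if self.powered else 0.0

ports = [Port() for _ in range(48)]
for p in ports[:8]:
    p.enable()                # only 8 of 48 ports carry traffic
print(f"draw: {sum(p.watts for p in ports):.1f} W")  # scales with demand
```

With the lazy scheme, the switch draws power in proportion to enabled ports; a run-hot design would burn the full 48-port figure regardless of demand.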
The list goes on. What other ways can power be lowered and workloads be processed more efficiently?