Running databases in virtualized environments

I have long believed that databases can be successfully deployed in virtual machines. Among other things, that is one of the central ideas behind ParElastic, a start-up I helped launch earlier this year. Many companies (Amazon, Rackspace, Microsoft, for example) offer you hosted databases in the cloud!

But yesterday I read this post in RWW. This article talks about a report published by Principled Technologies in July 2011, a report commissioned by Intel, that

tested 12 database applications simultaneously – and all delivered strong and consistent performance. How strong? Read the case study, examine the results and testing methodology, and see for yourself.

Unfortunately, I believe that discerning readers of this report are more likely to question the conclusion(s) based on the methodology. What do you think?


A Summary of the Principled Technologies Report

In a nutshell, this report seeks to make the case that industry standard servers with virtualization can in fact deliver the performance required to run business critical database applications.

It attempts to do so by running VMware vSphere 5.0 on the newest four-socket Intel Xeon E7-4870 based server and hosting 12 database applications, each of which has an 80GB database in its own virtual machine. The Intel Xeon E7-4870 is a 10-core processor with two hardware threads per core. It was clocked at 2.4GHz, and the server had 1TB of RAM (64 modules, each of which had 16GB). The storage in this server was 2 disks, each of which was 146GB in size (10k SAS). In addition, the test used an EMC CLARiiON Fibre Channel SAN with some disks configured in RAID0. In total they configured 6 LUN’s, each of which was 1066GB (over a TB each). The VM’s ran Windows Server 2008 R2 and SQL Server 2008 R2.
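As a quick sanity check on those numbers, here is a minimal sketch that simply multiplies out the figures quoted above. The variable names are mine, and nothing here comes from the report beyond the raw counts and sizes.

```python
# Totals implied by the configuration quoted from the report.
# All names are illustrative; only the raw figures come from the report.
SOCKETS = 4
CORES_PER_SOCKET = 10
THREADS_PER_CORE = 2
MEMORY_MODULES = 64
GB_PER_MODULE = 16
VMS = 12
DATABASE_GB_PER_VM = 80
LUNS = 6
GB_PER_LUN = 1066

hardware_threads = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE
ram_gb = MEMORY_MODULES * GB_PER_MODULE
total_database_gb = VMS * DATABASE_GB_PER_VM
san_gb = LUNS * GB_PER_LUN

print(f"hardware threads:    {hardware_threads}")
print(f"RAM (GB):            {ram_gb}")
print(f"total database (GB): {total_database_gb}")
print(f"SAN LUN space (GB):  {san_gb}")
```

By this arithmetic the twelve databases together come to a little under the physical RAM in the server. I am not drawing any conclusion about caching from that, since it depends on how memory was assigned to each VM.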

The report claims that the test that was performed was “Benchmark Factory’s TPC-H like workload”. Appendix B somewhat (IMHO) misleadingly calls this “Benchmark Factory TPC-H score”.

The result is that these twelve VM’s running against an 80GB database were able to consistently process in excess of 10,000 queries per hour each.

A comparison is made to the Netezza whitepaper that claims that the TwinFin data warehouse appliance running the “Nationwide Financial Services” workload was able to process around 2,500 queries per hour and a maximum of 10,000 queries per hour.

The report leaves the reader to believe that since the 12 VM’s in the test ran consistently more than 10,000 queries per hour, business critical applications can in fact be deployed in virtualized environments and deliver good performance.

The report concludes therefore that business critical applications can be run on virtualized platforms, deliver good performance, and reduce cost.


My opinion

While I entirely believe that virtualized database servers can produce very good performance, and while I entirely agree with the conclusion that was reached, I don’t believe that this whitepaper makes even a modestly credible case.

I ask you to consider this question, “Is the comparison with Netezza running 2,500 queries per hour legitimate?”

Without digging too far, I found that the Netezza whitepaper talks of a data warehouse with “more than 4.5TB of data”, 10 million database changes per day, and 50 concurrent users at peak time (10-15 on average). It reports 2,500 qph with a peak of 10k qph at month end, with 99.5% of queries completing in under one minute.

Based on the information disclosed, this comparison does not appear to be valid. Note well that I am not saying that this comparison is invalid, rather that the case has not been made sufficiently to justify it.

An important reason for my skepticism is that when processing database operations like joins between two tables, doubling the data volume may quadruple the amount of computation required. If you are performing three-table joins, doubling the data may increase the computation involved by as much as eight times. This is the very essence of the scalability challenge with databases!
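To make the scaling argument concrete, here is a small sketch that counts row comparisons for a naive nested-loop join with no indexes, which is the worst case I have in mind. The function names and the base row count are my own illustrative choices; real optimizers use hash or merge joins and usually do far better than this. The last two lines apply the same worst-case reasoning to the two data sizes, assuming 4.5TB is read as 4,500GB.

```python
def nested_loop_comparisons(*table_rows):
    """Row comparisons made by a naive nested-loop join over the given tables."""
    total = 1
    for rows in table_rows:
        total *= rows
    return total


def growth_factor(num_tables, scale=2, base_rows=1000):
    """How much the comparison count grows when every table grows by `scale`."""
    before = nested_loop_comparisons(*[base_rows] * num_tables)
    after = nested_loop_comparisons(*[base_rows * scale] * num_tables)
    return after // before


print(growth_factor(2))  # two-table join, data doubled
print(growth_factor(3))  # three-table join, data doubled

# Size ratio between the Netezza warehouse and one test database (assumed units).
data_ratio = 4500 / 80
print(f"data ratio: {data_ratio}, worst-case two-table join work: {data_ratio ** 2:.0f}x")
```

The doubling cases come out at 4 and 8, matching the argument above. The final line is only an upper bound under the naive-join assumption, not a claim about how either system actually executes its queries.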

I get an inkling that this may not be a valid comparison when we look at Appendix B that states that the total test time was under 750 seconds in all cases.

This feeling is compounded when I don’t see how many concurrent queries are run against each database. Single-user database performance is a whole lot better and more predictable than multi-user performance. The Netezza paper specifically talks about multi-user concurrency performance, not single-user performance.

Reading very carefully, I did find a mention that a single server running 12 VM’s hosted the client(s) for the benchmark. Since ~15k queries were completed in under 750s, we can say that each query lasted about 0.05s. Now, those are really really short queries. Impressive, but not the kind of workload for which I would generally expect Netezza to be deployed. The Netezza report does clearly state that 99.5% completed in under one minute, which leads me to conclude that the queries being run in the subject benchmark are at least two orders of magnitude away!
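The arithmetic behind that estimate is short enough to write down. This sketch assumes the queries ran one after another on each database (the report does not state the concurrency), and it treats the Netezza "under one minute" figure as an upper bound rather than a typical query time. The variable names are mine.

```python
import math

QUERIES_COMPLETED = 15_000       # approximate count, per the report
TEST_SECONDS = 750               # upper bound on total test time, per Appendix B
NETEZZA_BOUND_SECONDS = 60       # 99.5% of Netezza queries finished within this

# Assumes queries executed serially, which the report does not confirm.
seconds_per_query = TEST_SECONDS / QUERIES_COMPLETED
ratio = NETEZZA_BOUND_SECONDS / seconds_per_query
orders_of_magnitude = math.log10(ratio)

print(f"seconds per query: {seconds_per_query:.2f}")
print(f"ratio to the one-minute bound: {ratio:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

Measured against the one-minute bound the gap is about three orders of magnitude. Since most Netezza queries presumably finished well inside that bound, "at least two" is the more cautious statement.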

Conclusion

Virtualized environments like Amazon EC2, Rackspace, Microsoft Azure, and VMware are perfectly capable of running databases and database applications. One need only look at Amazon RDS (now with MySQL and Oracle), database.com, SQL Azure, and offerings like that to realize that this is in fact the case!

However, this report fails to make a compelling case for this. Making a comparison to a different whitepaper, and simply relating the results to the “queries per hour” in the other paper, causes me to question the methodology. Once readers question the method(s) used to reach a conclusion, they are likely to question the conclusion itself.

Therefore, I don’t believe that this report achieves what it set out to do.


References

You can get a copy of the white paper here, a link to scribd, or here, a link to the PDF on RWW.

This case study references a Netezza whitepaper on concurrency, which you can get here. The Netezza whitepaper is “CONCURRENCY & WORKLOAD MANAGEMENT IN NETEZZA”, and prepared by Winter Corp and sponsored by Netezza.

I have also archived copies of the two documents here and here.

A link to the TPC-H benchmark can be found on the TPC web site here.

Disclosure

In the interest of full disclosure, in the past I was an employee of Netezza, a company that is referenced in this report.

Boston Cloud Services meetup yesterday

Summary of the Boston Cloud Services meetup yesterday.

Tsahy Shapsa of Aprigo organized the second Boston Cloud Services meetup yesterday. There were two very informative presentations, the first by Riki Fine of EMC on the EMC Atmos project and the second by Craig Halliwell from TwinStrata.

What I learnt was that Atmos is EMC’s entry into the cloud arena. The initial product is a cloud storage offering with some additional functionality over other offerings like Amazon’s. Key product attributes appear to be scalability into the petabytes, policy and object metadata based management, multiple access methods (CIFS/NFS/REST/SOAP), and a common “unified namespace” for the entire managed storage. While the initial offering is cloud storage, there was a mention of a compute offering in the not too distant future.

In terms of delivery, EMC has set up its own data centers to host some of the Atmos clients. But they have also partnered with other vendors (AT&T was mentioned) who would provide cloud storage offerings that expose the Atmos API. AT&T’s web page reads

AT&T Synaptic Cloud Storage uses the EMC Atmos™ backend to deliver an enterprise-grade global distribution system. The EMC Atmos™ Web Services API is a Web service that allows developers to enable a customized commodity storage system over the Internet, VPNs, or private MPLS connectivity.

I read this as a departure from the approach being taken by the other vendors. I don’t believe that other offerings (Amazon, Azure, …) provide a standardized API and allow others to offer cloud services compliant to that interface. In effect, I see this as an opportunity to create a marketplace for “plug compatible” cloud storage. If a half dozen more vendors begin to offer Atmos based cloud storage, each offering a different location, SLA and price point, an end user will have the option to pick and choose from that set. To the best of my knowledge, today the best one can do is pick a vendor and then decide where in that vendor’s infrastructure the data would reside.

Atmos also seems to offer some cool resiliency and replication functionality. An application can leverage a collection of Atmos storage providers. Based on policy, an object could be replicated (synchronously or asynchronously) to multiple locations on an Atmos cloud with the options of having some objects only within the firewall and others being replicated outside the firewall.

Enter TwinStrata, who are an Atmos partner. They have a cool iSCSI interface to the Atmos cloud storage. With a couple of clicks of a mouse, they demonstrated the creation of a small Atmos based iSCSI block device. Going over to a Windows Server machine and rescanning disks, they found the newly created volume. A couple of clicks later there was a newly minted “T:” that the application could use, just as it would a piece of local storage. TwinStrata provides some additional caching and ease of use features. We saw the “ease of use” part yesterday. The demo lasted a couple of minutes and took no more than about a dozen mouse clicks. The version that was demo’ed was the iSCSI interface; there was talk of a file system based interface in the near future.

Right now, all of these offerings are expected to be for Tier-3 storage. Over time, there is a belief that T2 and T1 will also use this kind of infrastructure.

Very cool stuff! If you are in the Boston area and are interested in the Cloud paradigm, definitely check out the next event on Sept 23rd.

Pizza and refreshments were provided by Intuit. If you haven’t noticed, the folks from Intuit are doing a magnificent job fostering these kinds of events all over the Boston area. I have attended several excellent events that they have sponsored. A great big “Thank You” to them!

Finally, a big “Thank You” to Tsahy and Aprigo for arranging this meetup and offering their premises for the meetings.

Boston Cloud Services- June Meetup.

Tsahy set up a meetup group for Cloud Services at http://www.meetup.com/Boston-cloud-services/. The first meeting is today; check out the meeting link at

Boston Cloud Services- June Meetup.

Location

460 Totten Pond rd
suite 660
Waltham, MA 02451

All,
We have a great agenda for this 1st Boston cloud services meetup, and we are broadcasting live on http://www.stickam.co…

1. Tsahy Shapsa – 15 minutes – a case study of an early stage start-up and a talk about what it’s like to build a new business nowadays, with all this cloud stuff going around, covering where we’re using cloud/SaaS to run our business, operations, IT etc., where we’re not and why, challenges that we faced / are facing etc. We can have an open discussion on the good, bad & ugly and I wouldn’t mind taking a few tips from the audience…

2. John Bennett – 30 minutes – will give a talk on separating fact from fiction in the cloud computing market. John is the former marketing director of a cloud integration vendor (SnapLogic), and has been watching this market closely for a couple of years now.
Blog: http://bestrategic.blogspot.com.
bio here: http://www.bennettstr…

3. Mark E. Hodapp – 30 minutes – ‘Competing against Amazon’s EC2’
Mark was Director R&D / CTO at Sun Microsystems, where he led a team of 20 engineers working on an advanced research effort, Project Caroline, a horizontally scalable platform for the development and deployment of Internet services.

No cloud in sight!

The conventional wisdom at the beginning of ’09 was that the economic downturn would catapult cloud adoption but that hasn’t quite happened. This post explores trends and possible reasons for the slow adoption as well as what the future may hold.

A lot has been written in the past few days about Cloud Computing adoption based on a survey by ITIC (http://www.itic-corp.com/). At the time of this writing, I haven’t been able to locate a copy of this report or a link with more details online but most articles referencing this survey quote Laura DiDio as saying,

“An overwhelming 85% majority of corporate customers will not implement a private or public cloud computing infrastructure in 2009 because of fears that cloud providers may not be able to adequately secure sensitive corporate data”.

In another part of the country, structure09 had a lot of discussion about Cloud Computing. Moderating a panel of VC’s, Paul Kedrosky asked for a show of hands of VC’s who run their business on the cloud. To quote Liz Gannes,

“Let’s just say the hands did not go flying up”.

Elsewhere, a GigaOM report by George Gilbert and Juergen Urbanski concludes that leading storage vendors are planning their innovation around a three year time frame, expecting adoption of new storage technologies to coincide with emergence from the current recession.

My point of view

In the short term, services that are already “networked” will begin to migrate into the cloud. The migration may begin at the individual and SMB end of the market rather than at the Fortune 100. Email and CRM applications will be the poster-children for this wave.

PMCrunch also lists some SMB ERP solutions that will be in this early wave of migration.

But, this wave will primarily target the provision of application services through a different delivery model (application hosted on a remote server instead of a corporate server).

It will be a while before cloud based office applications (word-processing, spreadsheets, presentations) become mainstream. The issue is not so much security as it is network connectivity. The cloud is useless to a person who is not on the network and until ubiquitous high bandwidth network connectivity is available everywhere, and at an accessible and reasonable cost, the cloud platform will not be able to move forward.

We are beginning to see increased adoption in Broadband WiFi or Cellular Data in the US but the costs are still too high and service is still insufficient. Just ask anyone who has tried to get online at one of the many airports and hotels in the US.

Gartner highlights five key attributes of Cloud Computing.

  1. Uses Internet Technologies
  2. Service Based
  3. Metered by Use
  4. Shared
  5. Scalable and Elastic

Note that I have re-ordered them into what I believe is the order in which cloud adoption will progress. The early adoption will be in applications that fit the “Uses Internet Technologies” and “Service Based” attributes, and the last will be those that need to be “Scalable and Elastic”.

As stated above, the early adopters will deploy applications with a clearly defined and “static” set of deliverables in areas that currently require the user to have network connectivity (i.e. do no worse than current, change the application delivery model from in-house to hosted). In parallel, corporations will begin to deploy private clouds for use within their firewalls.

As high bandwidth connectivity becomes more easily available, adoption will increase; currently I think that is the real limitation.

Data Security will be built along the way, as will best practices on things like Escrow and mechanisms to migrate from one service provider to another.

Two other things that could kick cloud adoption into high gear are

  1. the delivery of a cloud platform from a company like Akamai (why hasn’t this happened yet?)
  2. a mechanism that would allow applications to scale based on load and use the right amount of cloud resource. Applications like web servers can scale based on client demand but this isn’t (yet) the case with other downstream services like databases or mail servers.
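To illustrate the second item, here is a minimal sketch of the kind of sizing rule a stateless tier such as a web server farm could use. Everything in it is hypothetical: the function name, the 60% target and the instance limits are my own assumptions, not any provider's API. Utilization is expressed as whole percentages so the arithmetic stays in integers.

```python
def desired_instances(current_instances, observed_pct, target_pct=60,
                      min_instances=1, max_instances=20):
    """Instance count that would bring average utilization back to the target.

    Hypothetical rule: scale the current count by observed/target utilization,
    round up, then clamp to the allowed range.
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Integer ceiling division avoids floating point rounding surprises.
    needed = -(-(current_instances * observed_pct) // target_pct)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(4, 90))   # overloaded: grow
print(desired_instances(4, 30))   # underused: shrink
```

A rule like this is easy to apply when instances are interchangeable. For databases and mail servers the hard part is what it leaves out, namely moving or re-partitioning the data when the instance count changes.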

That’s my point of view, and I’d love to hear yours especially in the area of companies that are addressing the problem of providing a cloud user the ability to migrate from one provider to another, or mechanisms to dynamically scale services like databases and mail servers.