ParAccel TPC-H 30TB results challenged!

Watch the feeding frenzy now that ParAccel’s TPC-H 30TB results have been challenged.

Before I had my morning cup of coffee, I found an email message with the subject “ParAccel ADVISORY” sitting in my mailbox. Now, I’m not exactly sure why I got this message from Renee Deger of GlobalFluency, so my first suspicion was that this was a scam and that someone in Nigeria would be giving me a million dollars if I did something.

But, I was disappointed. Renee Deger is not a Nigerian bank tycoon who will make me rich. In fact, ParAccel’s own blog indicates that their 30TB results have been challenged.

We wanted you to hear it from us first.  Our TPC-H Benchmark for performance and price-performance at the 30-terabyte scale was taken down following a challenge by one of our competitors and a review by the TPC.  We executed this benchmark in collaboration with Sun and under the watch of a highly qualified and experienced auditor.   Although it has been around for many years, the TPC-H specification is still subject to interpretation, and our interpretation of some of the points raised was not accepted by the TPC review board.

None of these items impacts our actual performance, which is still the best in the industry.  We will rerun the benchmark at our earliest opportunity according to the interpretations set forth by the TPC review board. We remain committed to the organization, as outlined in a blog post by Barry Zane, our CTO, here: http://paraccel.com/data_warehouse_blog/?p=74#more-74.

Please see the company blog for a post by David Steinhoff in the office of the CTO for further info: http://paraccel.com/data_warehouse_blog/?p=104#more-104

I read David Steinhoff’s blog as well. He writes:

This last week, our June 2009 30TB results were challenged and found to be in violation of certain TPC rules. Accordingly, our results will soon be taken down from the TPC website.

We published our results in good faith. We used our standard, customer available database to run the benchmark (we wanted the benchmark to reflect the incredible performance our customers receive). However, the body of TPC rules is complex and undergoes constant interpretation; we are still relatively new to the benchmark game and are still learning, and we made some mistakes.

While we cannot get into the details of the challenges to our results (TPC proceedings are confidential and we would be in violation if we did), we can state with confidence that our query performance was in no way enhanced by the items that were challenged.

We can also say with confidence that we will publish again, soon.

Now, some competitor or competitor loyalist may try to make more of this than there is … we all know there is the risk of tabloid frenzy around any adversity in a society with free speech … and we wouldn’t have it any other way.

It is unfortunate that the proceedings are confidential and cannot be shared. I hope you republish your results at 30TB.

Contrary to a long list of pundits, I believe that these benchmarks have an important place in the marketing of a product and its capabilities.

I reiterate what I said in a previous blog post:

ParAccel’s solution is based on high-performance trilithium crystals. (Note: I don’t know why this wasn’t disclosed in the full disclosure report).

I hope the challenge was not about the trilithium crystals and the fact that you didn’t disclose it in the full disclosure report.

Wow, the TPC-H speculation continues!

The real technology behind the ParAccel TPC-H results is revealed here!

It is definitely interesting to see that the ParAccel TPC-H result saga is not yet done. Posts by Daniel Abadi (interesting analysis but it seems simplistic at first blush) and reflections on Curt Monash’s blog are proving to be amusing, to say the least.

At the end of the day, whether ParAccel is a “true column store” (or a “vertically partitioned row store”), and whether it has too much disk capacity and disk bandwidth strike me as somewhat academic arguments that don’t recognize some basic facts.

  • ParAccel’s system (bloated, oversized, undersized, truly columnar, OR NOT) is only the second system to post 30TB results.
  • That system costs less than half the price of the other set of published results. Since I said “basic facts”, I will point out that the other entry may provide higher resiliency and that may be a part of the higher price. I have not researched whether the additional price is justified or whether it is related to the higher resiliency. I only want to make the point that there is a difference in the stated resiliency of the two solutions that is not apparent from the performance claims.

It might just be my sheltered upbringing but:

  • I know of few customers who purchase cars solely for the manufacturer’s advertised MPG and similarly, I know of no one who purchases a data warehouse from a specific vendor because of the published TPC-H results.
  • I know of few customers who will make a purchase decision based on whether a car has an inline engine, a V-engine or a rotary engine and similarly, I know of no one who makes a buying decision on a data warehouse based on whether the technology is “truly columnar”, “vertically partitioned row store”, “row store with all indexes” or some other esoteric collection of interesting sounding words.

So, I ask you, why are so many people so stewed about ParAccel’s TPC-H numbers? In a few more days, will we have more posts about ParAccel’s TPC-H numbers than we will about whether Michael Jackson really died or whether this is all part of a “comeback tour”?

I can’t answer either of the questions above for sure, but what I do know is this: ParAccel is getting some great publicity out of this whole thing.

And, I have it on good authority (it came to me in a dream last night) that ParAccel’s solution is based on high-performance trilithium crystals. (Note: I don’t know why this wasn’t disclosed in the full disclosure report). I hear that they chose 43 nodes because someone misremembered the “universal answer” from The Hitchhiker’s Guide to the Galaxy. By the time someone realized this, it was too late because the data load had begun. Remember you read it here first 🙂

Give it a rest folks!

P.S. Within minutes of posting this a well known heckler sent me email with the following explanation that confirms my hypothesis.

When a beam of matter and antimatter collide in dilithium we get a plasma field that powers warp drives within the “Sun” workstations. The warp drives that ParAccel uses are the Q-Warp variant which allow queries to run faster than the speed of light. A patent has been filed for this technique, don’t mention it in your blog please.

More on TPC-H comparisons

Three charts showing comparisons of TPC-H benchmark data.

Just a quick post to upload three charts that help visualize the numbers that Curt and I have been referring to in our posts. Curt’s original post was, my post was.

The first chart shows the disk to data ratio that was mentioned. Note that the X-Axis showing TPC-H scale factor is a logarithmic scale. The benchmark information shows that the ParAccel solution has in excess of 900TB of storage for the 30TB benchmark, so the ratio is in excess of 30:1.

The second chart shows the memory to data ratio. Note that both the X and Y axes are logarithmic scales. The benchmark information shows that the ParAccel solution has 43 nodes and approximately 2.7TB of RAM, so the ratio is approximately 1:11 (or 9%).

The third chart shows the load time (in hours) for various recorded results. The ParAccel results indicate a load time of 3.48 hours. Note again that the X-Axis is a logarithmic scale.

For easy reading, I have labeled the ParAccel 30TB value on the chart. I have to admit, I don’t understand Curt’s point. And maybe others share this bewilderment? I think I’ve captured the numbers correctly. Could someone help verify these, please?

If the images above are shown as thumbnails, you may not be able to see the point I’m trying to make. You need to see the bigger images to see the pattern.

Revision

In response to an email, I looked at the data again and got the following RANK() information. Of the 151 results available today, the ParAccel 30TB numbers are 58th in Memory to Data and 115th in Disk to Data. It is meaningless to compare load time ranks without factoring in the scale and I’m not going to bother with that as the sample size at SF=30,000 is too small.
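For readers unfamiliar with RANK(), here is a minimal Python sketch of the same idea: tied values share a rank and the next rank is skipped. The ratios in the example are made up for illustration and are not taken from the spreadsheet, the function name `sql_rank` is hypothetical, and ranking from highest to lowest is an assumption about the ordering used.

```python
def sql_rank(values, descending=True):
    """Rank values the way SQL RANK() does: ties share a rank, gaps follow ties."""
    ordered = sorted(values, reverse=descending)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Remember only the first position at which each value appears.
        first_position.setdefault(value, position)
    return [first_position[v] for v in values]


# Hypothetical memory-to-data ratios for five results (illustration only).
ratios = [0.50, 0.09, 0.25, 0.09, 1.00]
ranks = sql_rank(ratios)
print(ranks)  # [2, 4, 3, 4, 1]
```

A rank on its own says nothing about how far apart two results are, which is one reason ranking load times across different scale factors is not very informative.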

If you are willing to volunteer some of your time to review the spreadsheet with all of this information, I am happy to send you a copy. Just ask!

Is TPC-H really a blight upon the industry?

A recap of some posts (Curt Monash, Merv Adrian) about the ParAccel TPC-H 30 TB benchmark numbers.

On June 22, Curt Monash posted an interesting entry on his blog about TPC-H in the wake of an announcement by ParAccel. On the same day, Merv Adrian posted another take on the same subject on his blog.

Let me begin with a couple of disclaimers.

First, I am currently employed by Dataupia, and I was previously employed at Netezza. I am not affiliated with ParAccel in any way, nor Sun, nor the TPC committee, nor the Toyota Motor Corporation, the EPA, nor any other entity related in any way with this discussion. And if you are curious about my affiliations with any other body, just ask.

Second, this blog is my own and does not represent or intend to represent the point of view of my employer, former employer(s), or any other person or entity than myself. Any resemblance to the opinions or points of view of anyone other than myself is entirely coincidental.

As with any other benchmark, TPC-H only serves to illustrate how well or poorly a system was able to process a specified workload. If you happen to run a data warehouse that tracks parts, orders, suppliers, and lineitems in orders across 25 nations and 5 regions that resemble the TPC-H specification, your data warehouse may look something like the one specified in the benchmark specification. And if your business problems are similar to the twenty-something queries that are presented in the specification, you can leverage hundreds of person-hours of free tuning advice given to you by the makers of most major databases and hardware.

In that regard, I feel that excellent performance on a published TPC-H benchmark does not guarantee that the same configuration would work well in my data warehouse environment.

But, if I understand correctly, the crux of the argument that Curt makes is that the benchmark configurations are bloated (and he cites the following examples)

  • 43 nodes to run the benchmark at SF 30,000
  • each node has 64 GB of RAM (total of over 2.5TB of RAM)
  • each node has 24 TB of disk (total of over 900TB of disk)

which leads to a “RAM:DATA ratio” of approximately 1:11 and a “DISK:DATA ratio” of approximately 32:1.
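The arithmetic behind those two ratios can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the nominal per-node figures listed above. It assumes decimal units (1TB = 1000GB) and treats SF 30,000 as 30TB of data. The nominal disk figure multiplies out a little higher than the rounded “over 900TB”, so the disk ratio here lands in the low-to-mid 30s rather than at exactly 32.

```python
NODES = 43
RAM_PER_NODE_GB = 64
DISK_PER_NODE_TB = 24
DATA_TB = 30  # assumption: SF 30,000 is treated as 30TB of data

# Assumption: decimal units, 1TB = 1000GB.
total_ram_tb = NODES * RAM_PER_NODE_GB / 1000
total_disk_tb = NODES * DISK_PER_NODE_TB

data_per_ram = DATA_TB / total_ram_tb     # the "11" in RAM:DATA of 1:11
disk_per_data = total_disk_tb / DATA_TB   # the DISK:DATA ratio

print(f"Total RAM:  {total_ram_tb:.2f}TB, RAM:DATA is roughly 1:{data_per_ram:.0f}")
print(f"Total disk: {total_disk_tb}TB, DISK:DATA is roughly {disk_per_data:.0f}:1")
```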

Let’s look at the DISK:DATA ratio first

What no one seems to have pointed out (and I apologize if I didn’t catch it in the ocean of responses) is that this 32:1 DISK:DATA ratio is the ratio between total disk capacity and data and therefore includes overheads.

First, whether it is in a benchmark context or a real life situation, one expects data protection in one form or another. The benchmark report indicates that the systems used RAID 0 and RAID 1 for various components. So, at the very least, the number should be approximately 16:1. In addition, the same disk space is also used for the Operating System, Operating System Swap as well as temporary table space. Therefore, I don’t know whether it is reasonable to assume that even with good compression, a system would achieve a 1:1 ratio between data and disk space, but I would like to know more about this.

“By way of contrast, real-life analytic DBMS with good compression often have disk/data ratios of well under 1:1.”

Leaving the issue of DISK:DATA ratio aside, one thing that most performance tuning looks at is the number of “spindles”. And, having a large number of spindles is a good thing for performance whether it is in a benchmark or in real life. Given current disk drive prices, it is reasonable to assume that a pre-configured server comes with 500GB drives, as is the case with the Sun system that was used in the ParAccel benchmark. If I were to purchase a server today, I would expect either 500GB drives or 1TB drives. If it were necessary to have a lower DISK:DATA ratio and reducing that ratio had some value in real life, maybe the benchmark could have been conducted with smaller disk drives.

Reading section 5.2 of the Full Disclosure Report, it is clear that the benchmark did not use all 900 or so Terabytes of disk. If I understand the math in that section correctly, the benchmark is using the equivalent of 24 drives and 50GB per drive on each node for data. That is a total of approximately 52TB of storage set aside for the database data. That’s pretty respectable! Richard Gostanian in his post to Curt’s blog (June 24th, 2009 7:34 am) indicates that they only needed about 20TB of data. I can’t reconcile the math but we’re at least in the right ball-park.
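For anyone who wants to check my reading of section 5.2, here is the same calculation as a small Python sketch. It assumes the figures as I have stated them (24 drives per node, 50GB used per drive, 43 nodes), decimal units, and a nominal 24TB of raw disk per node. It is my interpretation, not a figure taken directly from the report.

```python
NODES = 43
DRIVES_PER_NODE = 24
GB_USED_PER_DRIVE = 50        # my reading of section 5.2
RAW_DISK_PER_NODE_TB = 24     # nominal raw capacity per node
DATA_TB = 30                  # assumption: SF 30,000 is treated as 30TB

# Assumption: decimal units, 1TB = 1000GB.
db_storage_tb = NODES * DRIVES_PER_NODE * GB_USED_PER_DRIVE / 1000
raw_disk_tb = NODES * RAW_DISK_PER_NODE_TB
share_of_raw = db_storage_tb / raw_disk_tb

print(f"Storage set aside for data: {db_storage_tb:.1f}TB")
print(f"Share of raw disk capacity: {share_of_raw:.0%}")
print(f"Set-aside storage to data:  {db_storage_tb / DATA_TB:.2f}:1")
```

Seen this way, the storage actually set aside for the database is a small fraction of the raw capacity, which is a very different picture from the headline DISK:DATA ratio.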

And as for the RAM:DATA ratio, the ratio is 1:11. I find it hard to understand how the benchmark could have run entirely from RAM as conjectured by Curt.

“And so I conjecture that ParAccel’s latest TPC-H benchmark ran (almost) entirely in RAM as well.”

From my experience in sizing systems, one looks at more things than just the physical disk capacity. One should also consider things like concurrency, query complexity, and expected response times. I’ve been analyzing TPC-H numbers (for an unrelated exercise) and I will post some more information from that analysis over the next couple of weeks.

On the whole, I think TPC-H performance numbers (QPH, $/QPH) are as predictive of system performance in a specific data warehouse implementation as the EPA ratings on cars are of actual mileage that one may see in practice. If available, they may serve as one factor that a buyer could consider in a buying decision. In addition to reviewing the mileage information for a car, I’ll also take a test drive, speak to someone who drives the same car, and if possible rent the same make and model for a weekend to make sure I like it. I wouldn’t rely on just the EPA ratings so why should one assume that a person purchasing a data warehouse would rely solely on TPC-H performance numbers?

As an aside, does anyone want to buy a 2000 Toyota Sienna Mini Van? It is white in color and gave 22.4 mpg over the last 2000 or so miles.