It is definitely interesting to see that the ParAccel TPC-H result saga is not yet done. Posts by Daniel Abadi (interesting analysis but it seems simplistic at first blush) and reflections on Curt Monash’s blog are proving to be amusing, to say the least.
At the end of the day, whether ParAccel is a “true column store” (or a “vertically partitioned row store”), and whether it has too much disk capacity and disk bandwidth strike me as somewhat academic arguments that don’t recognize some basic facts.
- ParAccel’s system (bloated, oversized, undersized, truly columnar, OR NOT) is only the second system to post 30TB results.
- That system costs less than half the price of the other set of published results. Since I said “basic facts”, I will point out that the other entry may provide higher resiliency, and that may be part of the higher price. I have not researched whether the additional price is justified or related to the higher resiliency; I only want to make the point that there is a difference in the stated resiliency of the two solutions that is not apparent from the performance claims.
It might just be my sheltered upbringing but:
- I know of few customers who purchase cars solely for the manufacturer’s advertised MPG and similarly, I know of no one who purchases a data-warehouse from a specific vendor because of the published TPC-H results.
- I know of few customers who will make a purchase decision based on whether a car has an inline engine, a V-engine or a rotary engine and similarly, I know of no one who makes a buying decision on a data warehouse based on whether the technology is “truly columnar”, “vertically partitioned row store”, “row store with all indexes” or some other esoteric collection of interesting-sounding words.
So, I ask you, why are so many people so stewed about ParAccel’s TPC-H numbers? In a few more days, will we have more posts about ParAccel’s TPC-H numbers than we will about whether Michael Jackson really died or whether this is all part of a “comeback tour”?
I can’t answer either of the questions above for sure, but what I do know is this: ParAccel is getting some great publicity out of this whole thing.
And, I have it on good authority (it came to me in a dream last night) that ParAccel’s solution is based on high-performance trilithium crystals. (Note: I don’t know why this wasn’t disclosed in the full disclosure report.) I hear that they chose 43 nodes because someone misremembered the “universal answer” from The Hitchhiker’s Guide to the Galaxy. By the time someone realized this, it was too late because the data load had begun. Remember, you read it here first 🙂
Give it a rest, folks!
P.S. Within minutes of my posting this, a well-known heckler sent me an email with the following explanation, which confirms my hypothesis.
When a beam of matter and antimatter collide in dilithium we get a plasma field that powers warp drives within the “Sun” workstations. The warp drives that ParAccel uses are the Q-Warp variant which allow queries to run faster than the speed of light. A patent has been filed for this technique, don’t mention it in your blog please.
2 thoughts on “Wow, the TPC-H speculation continues!”
Well you have to admit at the very least it’s entertaining and quite informative from an industry perspective. When’s the last time it was hip or cool to joust on EDW topics? 🙂
Personally, as a customer, I wouldn’t give much of a hoot how they got there (unless there was a blatant inconsistency in the report, which would cast doubt on the vendor’s competency). My main concerns would be cost, ease of use and scalability. If you can show me these prevail using my own data, I don’t care much if you get there via Trilithium crystals or columns or sharding or, God forbid, another approach, much like the one XSPRADA takes. Results matter. Contrary to leisure travelling, in this business, it’s definitely all about final destinations :)