Highland Light, Provincetown, MA

If you have ever been to the Highland Light Lighthouse at Provincetown, you will likely recognize the picture below. The image is panoramic; the high-resolution version is 15MB and about 13k pixels wide. The image appears cropped here, but you can see the complete image at flickr (link on the left).

Highland Light, Provincetown, MA

The image is a composite of 10 discrete images that were stitched using AutoStitch. The software is available for free evaluation at this site.

For more information about Highland Light you can visit their web page at http://www.lighthouse.cc/highland/

IE+Microsoft ROCKS. Ubuntu+Firefox BLOWS

I’m running Ubuntu 9.04 and it appears that upgrading to Firefox 3.5 requires a PhD. Could I borrow yours?

Kevin Purdy has the following “One line install”

That’s not a true install; he just unpacks the tar-ball into the current directory.

Web Upd8 has the following steps

But, now I get daily updates?

What’s with this garbage? Complain as much as you want but Microsoft Software Updates will give you the latest Internet Explorer easily.

Could someone please make this easy?

It’s just numbers people!

I hadn’t read Justin Swanhart’s post (Why is everybody so steamed about a benchmark anyway?) from June 24th until just a short while ago. You can see it here. Justin is right, it’s just numbers.

He did remind me of a statement from a short while ago by another well known philosopher.

It's a budget

I could not embed the script provided by quotesdaddy.com into this post. I have used their material in the past; they have a great selection of statements that will live on in history. Check them out at http://www.quotesdaddy.com/ 🙂

Head for the hills, here comes more jargon: Anti-Databases and NoSQL

We all know what SQL is; get ready to learn about NoSQL. There was a meet-up last month in San Francisco to discuss it, and some details are available at this link. The call for the “inaugural get-together of the burgeoning NoSQL community” drew 150 people and, if you are so inclined, you can subscribe to the email list here.

I’m no expert at NoSQL but the major gripe seems to be schema restriction and what Jon Travis calls “twisting your object” data to fit an RDBMS.

NoSQL appears to be a newly coined collective noun for a lot of technologies you have heard about before. They include Hadoop, BigTable, HBase, Hypertable, and CouchDB, and maybe some that you have not heard of before, like Voldemort, Cassandra, and MongoDB.

But, then there is this thing called NoSQL. The first line on that web page reads

NoSQL is a fast, portable, relational database management system without arbitrary limits, (other than memory and processor speed) that runs under, and interacts with, the UNIX Operating System.

Now, I’m confused. Is this the NoSQL that was referenced in the non-conference at non-san-francisco last non-month?

My Point of View

From what I’ve seen so far, NoSQL seems to be a case of disruption at the “low end”. By “low end”, I mean circumstances where the data model is simple; in those cases I can see the point that SQL imposes a very severe overhead. The same can be said of a model that is relatively simple but refers to “objects” that don’t fit a relational model (like a document). But, while this appears to be “low end” disruption right now, there is real reason to believe that over time, NoSQL will grow into more complex models.
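To make the “twisting your object” gripe concrete, here is a minimal sketch that stores the same small object two ways. Everything in it is an assumption for illustration: the post dictionary and its tags are made up, the plain dict called doc_store is only a stand-in for a document store (it is not any real product’s API), and the two-table layout is just one of several reasonable relational designs.

```python
import json
import sqlite3

# A hypothetical document: one blog post with a nested list of tags.
post = {
    "id": 1,
    "title": "Learning from Joost",
    "tags": ["startups", "funding"],
}

# Document style: store and fetch the object whole under a single key.
doc_store = {}  # stand-in for a document store, not a real product's API
doc_store[f"post:{post['id']}"] = json.dumps(post)
from_doc = json.loads(doc_store["post:1"])

# Relational style: the same object is split across two tables...
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE post_tag (post_id INTEGER, position INTEGER, tag TEXT)")
conn.execute("INSERT INTO post VALUES (?, ?)", (post["id"], post["title"]))
conn.executemany(
    "INSERT INTO post_tag VALUES (?, ?, ?)",
    [(post["id"], i, tag) for i, tag in enumerate(post["tags"])],
)

# ...and has to be reassembled on the way out.
title = conn.execute("SELECT title FROM post WHERE id = ?", (1,)).fetchone()[0]
tags = [row[0] for row in conn.execute(
    "SELECT tag FROM post_tag WHERE post_id = ? ORDER BY position", (1,))]
from_sql = {"id": 1, "title": title, "tags": tags}

print(from_doc == from_sql)
```

Both routes return the same object. The difference is how much mapping code sits between the object and the store, which is small for a document this simple and grows as the object gets more nested.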

But, the big benefit of SQL is that it is a declarative programming language, which gives implementations the flexibility to implement the program flow in a manner that matches the physical representation of the data. Witness the fact that the same basic SQL language has produced a myriad of representations and architectures, ranging from shared nothing to shared disk, SMP to MPP, and row oriented to column oriented, with cost-based and rules-based optimizers, and so on.
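A small sketch of that point, using the sqlite3 module from the Python standard library. The orders table, its rows, and the index name idx_customer are all hypothetical. The query text never changes; only the physical design does, and the engine is free to pick a different plan for the same declarative statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 10.0), ("acme", 5.0), ("globex", 7.5)],
)

# The declarative statement: it says what is wanted, not how to get it.
QUERY = "SELECT SUM(amount) FROM orders WHERE customer = ?"

def run(conn):
    """Return the query result and the plan SQLite chose for it."""
    total = conn.execute(QUERY, ("acme",)).fetchone()[0]
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in
            conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("acme",))]
    return total, plan

total_before, plan_before = run(conn)

# Change only the physical design; the query text is untouched.
conn.execute("CREATE INDEX idx_customer ON orders (customer)")

total_after, plan_after = run(conn)

print(total_before, plan_before)
print(total_after, plan_after)
```

The totals match while the reported plans differ (a table scan before, an index search after). The exact wording of the plan text varies between SQLite versions, so treat the printed strings as informational.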

There is (thus far) a concern about the relative complexity of the NoSQL offerings when compared with the currently available SQL offerings. I’m sure that folks are giving this considerable attention and what comes next may be a very interesting piece of technology.



More on TPC-H comparisons

Three charts showing comparisons of TPC-H benchmark data.

Just a quick post to upload three charts that help visualize the numbers that Curt and I have been referring to in our posts. Curt’s original post was, my post was.

The first chart shows the disk to data ratio that was mentioned. Note that the X-Axis showing TPC-H scale factor is a logarithmic scale. The benchmark information shows that the ParAccel solution has in excess of 900TB of storage for the 30TB benchmark; the ratio is therefore in excess of 30:1.

The second chart shows the memory to data ratio. Note that both the X and Y axes are logarithmic scales. The benchmark information shows that the ParAccel solution has 43 nodes and approximately 2.7TB of RAM; the ratio is therefore approximately 1:11 (or 9%).
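For anyone who wants to check the arithmetic behind those two ratios, here is a minimal sketch. The three inputs are the figures quoted in this post; the helper name ratio is mine, and the disk figure is a lower bound because the storage is stated as “in excess of” 900TB.

```python
# Figures quoted in the post for the ParAccel 30TB result.
DATA_TB = 30.0    # size of the benchmark data set
DISK_TB = 900.0   # "in excess of 900TB", so this is a lower bound
MEMORY_TB = 2.7   # "approximately 2.7TB of RAM"

def ratio(resource_tb: float, data_tb: float) -> float:
    """Return resource capacity per unit of benchmark data."""
    return resource_tb / data_tb

disk_to_data = ratio(DISK_TB, DATA_TB)
memory_to_data = ratio(MEMORY_TB, DATA_TB)

print(f"disk to data:   {disk_to_data:.0f}:1")
print(f"memory to data: 1:{1 / memory_to_data:.0f} ({memory_to_data:.0%})")
```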

The third chart shows the load time (in hours) for various recorded results. The ParAccel results indicate a load time of 3.48 hours. Note again that the X-Axis is a logarithmic scale.

For easy reading, I have labeled the ParAccel 30TB value on the chart. I have to admit, I don’t understand Curt’s point. And maybe others share this bewilderment? I think I’ve captured the numbers correctly; could someone help verify these, please?

If the images above are shown as thumbnails, you may not be able to see the point I’m trying to make. You need to see the bigger images to see the pattern.

Revision

In response to an email, I looked at the data again and got the following RANK() information. Of the 151 results available today, the ParAccel 30TB numbers are 58th in Memory to Data and 115th in Disk to Data. It is meaningless to compare load time ranks without factoring in the scale and I’m not going to bother with that as the sample size at SF=30,000 is too small.
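For readers unfamiliar with RANK(), the sketch below reproduces its usual SQL semantics: tied values share a rank and the following rank skips ahead. The ratios in it are placeholders chosen to show a tie; they are not TPC-H results. Ranking from highest ratio to lowest is also an assumption, since the sort direction is not stated here.

```python
def rank(values, descending=True):
    """SQL-style RANK(): ties share a rank and the next rank skips ahead."""
    ordered = sorted(values, reverse=descending)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (best) position seen for each distinct value.
        first_position.setdefault(value, position)
    return [first_position[v] for v in values]

# Placeholder ratios for illustration only; these are NOT TPC-H results.
sample_disk_to_data = [12.0, 30.0, 8.5, 30.0, 45.0]
print(rank(sample_disk_to_data))
```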

If you are willing to volunteer some of your time to review the spreadsheet with all of this information, I am happy to send you a copy. Just ask!

Learning from Joost

A lot of interesting articles have emerged in the wake of the recent happenings at Joost.

One post that caught my attention and deserves reading, and re-reading, and then reading one more time just to be sure is Ed Sim’s post, where he writes

raising too much money can be a curse and not a blessing

There is a lot to learn from all of these articles.