NoSQL and “single-table” design pattern

The NoSQL “single-table” design pattern appears to be a polarizing topic, with strong opinions for and against it. As best I can tell, there’s no good reason for that!

I gave a talk at AWS re:Invent last week along with Alex DeBrie. The talk was about deploying modern and efficient data models with Amazon DynamoDB. One small part of the talk was about the “single-table” design pattern. Over the following couple of days I was flooded with questions about this pattern. I’m not really sure what all the hoopla is about, or why there is so much passion and almost religious fervor around this topic.

With an RDBMS there are clearly defined benefits and drawbacks to normalization (and denormalization). Normalization and denormalization are an exercise in trading off a well understood set of redundancies and anomalies against runtime complexity and cost. When you normalize your data, the only mechanism to get it “back together” is a “join”.

If you happen to use a database that doesn’t support joins, or if joins turn out to be expensive, you may prefer to accept the redundancies and anomalies that come with denormalization. This has been a long-established pattern, for example in the analytics realm.

The “single-table” design pattern extends traditional RDBMS denormalization in three interesting ways. First, it quite often uses non-atomic datatypes that are not allowed under the normalization rules of Codd, Date, and others. Second, it makes use of the flexible schema support in NoSQL databases to commingle data from different entities in a single table. Finally, it uses the data colocation guarantees in NoSQL databases to minimize the number of blocks read, and the number of API calls required, when fetching related data.

Here’s what I think these options look like in practice.

First, this is a normalized schema with three tables. When you want to reconstruct the data, you join the tables. There are primary and foreign key constraints in place to ensure that data is consistent.

The next option is the fully denormalized structure where data from all tables is “pre-joined” into a single table.

The single-table schema is just slightly different: data for all entities is commingled into a single table.
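Since the schema diagrams from the talk aren’t reproduced here, here’s a minimal sketch of the three layouts using a hypothetical Customer/Order example (the table names, key names, and attributes are mine, not from the talk):

```python
# Hypothetical Customer/Order data, shown in all three layouts.

# 1. Normalized: one table per entity, related by keys; reconstructing
#    the data requires a join (or multiple round trips without one).
customers = [{"customer_id": 1, "name": "Alice"}]
orders = [{"order_id": 101, "customer_id": 1, "total": 25.00}]
order_items = [{"order_id": 101, "sku": "X-42", "qty": 2}]

# 2. Fully denormalized: everything "pre-joined" into one wide table.
#    Customer attributes repeat on every row -- the redundancy you accept.
denormalized = [
    {"customer_id": 1, "name": "Alice", "order_id": 101,
     "total": 25.00, "sku": "X-42", "qty": 2},
]

# 3. Single-table: different entities commingled in one table. A shared
#    partition key colocates related items, and a non-atomic attribute
#    (the item list) folds one entity into its parent.
single_table = [
    {"PK": "CUST#1", "SK": "PROFILE", "name": "Alice"},
    {"PK": "CUST#1", "SK": "ORDER#101", "total": 25.00,
     "items": [{"sku": "X-42", "qty": 2}]},
]

# One lookup on the partition key fetches the customer and all orders:
related = [item for item in single_table if item["PK"] == "CUST#1"]
print(related)
```

Because the profile and the orders share a partition key, a store like DynamoDB can return all of them in a single Query call, which is exactly the colocation benefit described above.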

Application designers and data modelers should look at their specific use cases and determine whether they want to eliminate redundancies and inconsistencies and pay the cost of joins (or perform the join themselves in multiple steps if the database doesn’t support it), or denormalize and benefit from lower complexity and sometimes lower cost!

The other thing to keep in mind is that nothing in the single-table design pattern requires you to bring all entities into a single table. A design where some entities are combined into a single table coexists perfectly with others that are not.

What’s all the hoopla about?

The GCE outage on June 2, 2019

I happened to notice the GCE outage on June 2 for an odd reason. I have a number of motion-activated cameras that continually stream to a small Raspberry Pi cluster (where TensorFlow does some nifty stuff). This cluster pushes the more serious processing onto GCE. Just as a fail-safe, I have the system also generate an email when it notices an anomaly, some unexplained movement, and so on.

And on June 2nd, this all went dark for a while, and I wasn’t quite sure why. Digging around later, I realized that the issue was that I relied on GCE for the cloud infrastructure, and Gmail for the email. So when GCE had an outage, the whole thing came apart – there’s no resiliency if you have a single point of failure (SPOF), and GCE was my SPOF.

While I was receiving mobile alerts that there was motion, I got no notification of what the cause was. The expected behavior was that I would receive alerts on my mobile device, and explanations as email. For example, the alert would read “Motion detected, camera-5 <time>”. The explanation would be something like “NORMAL: camera-5 motion detected at <time> – timer activated light change”, “NORMAL: camera-3 motion detected at <time> – garage door closed”, or “WARNING: camera-4 motion detected at <time> – unknown pattern”.

I now realize that the email notification and the pattern detection both relied on GCE, and that SPOF caused delays in processing and in email notification. OK, so I fixed my error and now use Office365 for email generation, so at least I’ll get a warning email.

But I’m puzzled by Google’s blog post about this outage. The summary of that post is that a configuration change intended for a small number of servers ended up going to other servers, shit happened, and the shit cleanup took longer because the troubleshooting network was the same as the affected network.

So, just as I had a SPOF, Google appears to have had an SPOF. But, why is it that we still have these issues where a configuration change intended for a small number of servers ends up going to a large number of servers?

Wasn’t this the same kind of thing that caused the 2017 Amazon S3 outage?

At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.

Shouldn’t there be a better way to detect the actual scope of a change, and to verify that it matches what was intended? Seems like an opportunity for a different kind of check-and-balance.
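One could imagine a simple check-and-balance along these lines: the deployment tool resolves the change to the concrete set of servers it will touch, compares that against the scope the operator declared, and fails closed on a mismatch. A purely hypothetical sketch; none of these names come from Google’s or Amazon’s actual tooling:

```python
class ScopeExceededError(Exception):
    """Raised when a change resolves to more targets than were declared."""

def apply_change(change, declared_max_targets, resolve_targets, push):
    """Push a change only if its blast radius matches the declared scope.

    resolve_targets maps the change to the concrete servers it will touch;
    push performs the rollout to one server.  Both would be supplied by
    the (hypothetical) deployment system.
    """
    targets = resolve_targets(change)
    if len(targets) > declared_max_targets:
        # Fail closed: a typo in the change spec should not silently
        # expand a rollout from "a few servers" to most of the fleet.
        raise ScopeExceededError(
            f"change resolves to {len(targets)} servers, but only "
            f"{declared_max_targets} were declared"
        )
    for server in targets:
        push(server, change)
```

It would not have prevented every failure mode in these postmortems, but it is the kind of verification step the question above is asking for.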

Building completely redundant systems sounds like a simple solution, but at some point the cost becomes exorbitant. Building completely independent control and user networks may seem like the obvious answer, but is it cost effective to do that?

Old write-up about CAP Theorem

In 2011, Ken Rugg and I were having a number of conversations about the CAP Theorem, and after much discussion we came up with a succinct inequality. It helped us speak much more precisely about what constituted “availability”, “partition tolerance”, and “consistency”. It also confirmed our suspicion that availability and partition tolerance were not simple binary attributes, yes or no, but rather that they had shades of gray.

So here’s the write-up we prepared at that time.

parelastic-brewers-conjecture

Unfortunately, the six-part blog post that we wrote (on parelastic.com) never survived the transition to the new owners.

Blockchain is an over-hyped technology solution looking for a problem

The article makes a very simple argument for something that I have felt for a while: blockchain is a cool technology, but the majority of the use cases people talk about are just bullshit.

http://www.coindesk.com/blockchain-intermediaries-hype/

More than 99% of Blockchain Use Cases Are Bullshit

I’ve been following the blockchain ecosystem for some time now largely because it strikes me as yet another distributed database architecture, and I dream about those things.

For some time now I’ve been wondering what to do after Tesora, and blockchain was one of the things I was looking at as a promising technology, but I wasn’t seeing it. Of late I’ve been asking people who claim to be devotees at the altar of blockchain what they see as the killer app. All I hear is a large number of low rumbling sounds.

And then I saw this article by Jamie Burke of Convergence.vc and I feel better that I’m not the only one who feels that this emperor is in need of a wardrobe.

Let’s be clear, I absolutely agree that bitcoin is a wonderful use of blockchain technology, and it solves the issue of trust very cleverly through proof of work. I think there is little dispute about the elegance of this solution.

But once we go past bitcoin, the applications largely sound and feel like my stomach after eating gas station sushi; they are horrible and throw me into convulsions of pain.

In his article, Jamie Burke talks of 3D printing based on a blockchain-sharded CAD file. I definitely don’t see how blockchain can prevent the double-spend (i.e., buy one Yoda CAD file, print 10,000).

Most of the blockchain ideas I’m seeing are attempts to piggy-back on the hot new buzzword, with “blockchain” being used to mean “secure and encrypted database”. After all, there’s a bunch of crypto involved and there’s data stored there, right? So it must be a secure encrypted database.

To which I say, Bullshit!

P.S. Oh, the irony. This blog post references a blog post with a picture labeled “Burke’s Bullshit Cycle”, and the name of this blog is hypecycles.com.

Comparing parallel databases to sharding

I just posted an article comparing parallel databases to sharding on the ParElastic blog at http://bit.ly/JaMeVr

It was motivated by the fact that I’ve been asked a couple of times recently how the ParElastic architecture compares with sharding, and it occurred to me this past weekend that:

“Parallel Database” is a database architecture but sharding is an application architecture
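The distinction is easy to see in code. With sharding, routing logic like the sketch below lives in the application itself; a parallel database does the equivalent work behind a single SQL interface. A hypothetical example (the shard names and routing rule are mine):

```python
# Sharding is an *application* architecture: the application decides
# which database holds a given row.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(customer_id: int) -> str:
    """Route a customer to one of the shard databases by its key."""
    return SHARDS[customer_id % len(SHARDS)]

# Every query path must know this rule, and cross-shard operations
# (joins, aggregates, transactions) become application code too.
print(shard_for(42))  # -> db-shard-2
```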

Read the entire blog post here:

http://bit.ly/JaMeVr

SQL, NoSQL, NewSQL and now a new term …

NonsenSQL!

Read all about it at C Mohan’s blog (cmohan.tumblr.com).

Mohan knows a thing or two about databases. As a matter of fact, keeping track of his database related achievements, publications and citations is in itself a big-data problem.

NonsenSQL, read all about it!

 

Revisiting Network I/O APIs: The netmap Framework

Great article on High Scalability entitled Revisiting Network I/O APIs: The netmap Framework. Get the paper they reference here.

As network performance continues to improve, the bottleneck becomes the time spent moving packets between the wire (hardware) and the application (software), and vice versa. The netmap framework is an interesting approach to addressing this.

On migrating from Microsoft SQL Server to MongoDB

I was just reading this article http://www.wireclub.com/development/TqnkQwQ8CxUYTVT90/read describing one company’s experience migrating from SQL Server to MongoDB.

Having read the article, my only question to these folks is “why do it”?

Let’s begin by saying that we should discount all one-time costs related to data migration. They are just that, one-time migration costs. However monumental, if you believe that the final outcome will justify them, grin and bear the cost.

But, once you are in the (promised) MongoDB land, what then?

The things that this author believes that they will miss are:

  • maturity
  • tools
  • query expressiveness
  • transactions
  • joins
  • case insensitive indexes on text fields

Really? And you would still roll the dice in favor of a NoSQL science project? Well, then the benefits must be really, really awesome! Let’s take a look at what those are:

  • MongoDB is free
  • MongoDB is fast
  • Freedom from rigid schemas
  • ObjectIDs are expressive and handy
  • GridFS for distributed file storage
  • Developed in the open

OK, I’m scratching my head now. None of these really blows me away. Let’s look at these one at a time.

  • MongoDB is free
    • So are PostgreSQL and MySQL
  • MongoDB is fast
    • So are PostgreSQL and MySQL if you put them on the same SSD and multiple HDDs like you claim you do with MongoDB
  • Freedom from rigid schemas
    • I’ll give you this one; relational databases are kind of “old school” in this department
  • ObjectIDs are expressive and handy
    • Elastic Transparent Sharding schemes like ParElastic overcome this with Elastic Sequences, which give you the same benefits. A half-way decent developer could do this for you with a simple sharded architecture (see the sketch after this list).
  • GridFS for distributed file storage
    • Replication anyone?
  • Developed in the open
    • Yes, MongoDB is free and developed in the open like a puppy is “free”. You just told us all the “costs” that come with this “free puppy”.
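As promised above, here is what “do this for you” might look like. A minimal sketch that assumes the classic ObjectID layout (4-byte timestamp, 3-byte machine hash, 2-byte process id, 3-byte counter); it is illustrative, not wire-compatible with MongoDB:

```python
import hashlib
import os
import socket
import struct
import threading
import time

class SimpleObjectId:
    """Generate ObjectID-shaped unique ids in application code."""

    _counter = 0
    _lock = threading.Lock()
    _machine = hashlib.md5(socket.gethostname().encode()).digest()[:3]

    @classmethod
    def generate(cls) -> str:
        with cls._lock:
            cls._counter = (cls._counter + 1) % 0xFFFFFF
            count = cls._counter
        raw = (
            struct.pack(">I", int(time.time()))        # 4-byte timestamp
            + cls._machine                             # 3-byte machine hash
            + struct.pack(">H", os.getpid() % 0xFFFF)  # 2-byte process id
            + struct.pack(">I", count)[1:]             # 3-byte counter
        )
        return raw.hex()  # 24 hex characters, sortable by creation time

print(SimpleObjectId.generate())
```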

So really, why do people use MongoDB? I know there are good circumstances where MongoDB will whip the pants off any relational database, but I submit to you that those are the 1%.

To this day, I believe that the best description of MongoDB is this one:

“Mongo DB is web scale”, by gar1t:

http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale

Database scalability myth (again)

A common myth that has been perpetuated is that relational databases do not scale beyond two or three nodes. That, and the CAP Theorem, are considered to be the reasons why relational databases are unscalable and why NoSQL is the only feasible solution!

Yesterday I ran into a very thought-provoking article that makes just this case. You can read that entire post here. In this post, the author Srinath Perera provides an interesting template for choosing the data store for an application. In it, he makes the case that relational databases do not scale beyond 2 or 5 nodes. He writes,

The low scalability class roughly denotes the limits of RDBMS where they can be scaled by adding few replicas. However, data synchronization is expensive and usually RDBMSs do not scale for more than 2-5 nodes. The “Scalable” class roughly denotes data sharded (partitioned) across many nodes, and high scalability means ultra scalable systems like Google.

In 2002, when I started at Netezza, the first system I worked on (affectionately called Monolith) had almost 100 nodes. The first production-class “Mercury” system had 108 nodes (112 nodes, 4 spares). By 2006, the systems had over 650 nodes, and more recently much larger systems have been put into production. Yet, people still believe that relational databases don’t scale beyond two or three nodes!

Systems like ParElastic (Elastic Transparent Sharding) can certainly scale to much more than two or three nodes, and I’ve run prototype systems with up to 100 nodes on Amazon EC2!

Srinath’s post does contain an interesting perspective on unstructured and semi-structured data though, one that I think most will generally agree with.

All you ever wanted to know about the CAP Theorem but were scared to ask!

I just posted a longish blog post (six parts, actually) about the CAP Theorem on the ParElastic blog.

http://www.parelastic.com/database-architectures/an-analysis-of-the-cap-theorem/

-amrith

High Availability and Fault Tolerance

This article on HA and FT at ReadWriteWeb caught my eye. A while ago I used to work at Stratus, and it is not often that I hear their name these days. Stratus’ Fault Tolerant systems achieve their impressive uptime through hardware redundancy.

In very simple terms, if the probability of some component or sub-system failure is p, then the probability of two failures at the same time is a much smaller p * p.

When I was at Stratus, we used to guarantee “five nines”, an uptime of 99.999%, on systems that ran credit card networks, banking systems, air traffic control systems, and so on. Systems where the cost of downtime could be measured either in hundreds of thousands or millions of dollars an hour, or in human lives potentially lost.

Before I worked at Stratus, I used to work for a Stratus customer, and my first experience with Fault Tolerance was when I received a box in the mail with a note that said, in effect, that a CPU board had failed in one of our systems (about a month earlier), so please pop that board out and put this replacement board in its place.

And we hadn’t realized it; the system had been chugging along just fine!

So what does uptime % translate to in terms of hours and minutes?

  • 99% uptime: 3.65 days of downtime per year
  • 99.9% uptime: 8.76 hours of downtime per year
  • 99.99% uptime: 52.56 minutes of downtime per year
  • 99.999% uptime: 5.256 minutes of downtime per year

Stratus claims that across its customer base of 8,000 servers the uptime is 99.9998%.

  • 99.9998% uptime: 63 seconds of downtime per year

Now, that’s pretty awesome!
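A quick sanity check of the arithmetic in this post; the downtime figures are just the uptime fraction applied to the 31,536,000 seconds in a year, and the redundancy argument is the p * p multiplication mentioned earlier:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def annual_downtime_seconds(uptime_percent: float) -> float:
    """Seconds of downtime per year implied by an uptime percentage."""
    return SECONDS_PER_YEAR * (1 - uptime_percent / 100)

for pct in (99.0, 99.9, 99.99, 99.999, 99.9998):
    print(f"{pct}% uptime -> {annual_downtime_seconds(pct):,.0f} seconds/year")

# Redundancy: if one component is down with probability p, two independent
# components are down together with probability p * p.
p = 0.001        # a component that is unavailable 0.1% of the time
print(p * p)     # ~1e-06 for the duplexed pair
```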

And when I flew into Schiphol Airport, or saw containers being loaded onto ships in Singapore, or used my American Express credit card, logged into AOL, or looked at my 401(k) on Fidelity, I felt pretty darn proud of it!
