Architecture – Hype Cycles

Xtremedata has announced their entry into the data warehouse appliance space. Called the dbX, this product is an interesting twist on the use of FPGA’s (Field Programmable Gate Arrays). Unlike the other data warehouse appliance companies, Xtremedata is not a data warehouse appliance startup. They are a six year old company with an established product line and a patent for the In Socket Accelerator that is part of dbX.

What is ISA?

The core technology that Xtremedata makes is a device called an In Socket Accelerator. Occupying a CPU or co-processor slot on a motherboard, this plug compatible device emulates a CPU using an FPGA. But, the ISA goes beyond the functionality provided by the CPU that it replaces.

Xtremedata’s patent (US Patent 7454550 – Systems and methods for providing co-processors to computing systems. issued in 2008) includes the following description

It is sometimes desirable to provide additional functionality/capability or performance to a computer system and thus, motherboards are provided with means for receiving additional devices, typically by way of “expansion slots.” Devices added tothe motherboard by these expansion slots communicate via a standard bus. The expansion slots and bus are designed to receive and provide data transport of a wide array of devices, but have well-known design limitations.

One type of enhancement of a computer system involves the addition of a co-processor. The challenges of using a co-processor with an existing computer system include the provision of physical space to add the co-processor, providing power to the co-processor, providing memory for the co-processor, dissipating the additional heat generated by the co-processor and providing a high-speed pipe for information to and from the co-processor.

Without replacing the socket, which would require replacing the motherboard, the CPU cannot be presently changed to one for which the socket was not designed, which might be desirable in providing features, functionality, performance and capabilities for which the system was not originally designed.

and

FPGA accelerators and the like, including counterparts therefore, e.g., application-specific integrated circuits (ASIC), are well known in the high-performance computing field. Nallatech, (see http://www.nallatech.com) is an example of several vendors that offer FPGA accelerator boards that can be plugged into standard computer systems. These boards are built to conform to industry standard I/O (Input/Output) interfaces for plug-in boards. Examples of such industry standards include: PCI (PeripheralComponent Interconnect) and its derivatives such as PCI-X and PCI-Express. Some computer system vendors, for example, Cray, Inc., (see http://www.cray.com) offer built-in FPGA co-processors interfaced via proprietary interfaces to the CPU.

What else does Xtremedata do?

Their core product is the In Socket Accelerator with applications in Financial Services, Medical, Military, Broadcast, High Performance Computing and Wireless (source: company web page). The dbX is their latest venture integrating their ISA in a standard HP server and providing “SQL on a chip”.

How does this compare with Netezza?

The currently shipping systems from Netezza also include an FPGA but, as described in their patent (US patent 709231000, Field oriented pipeline architecture for a programmable data streaming processor) places an FPGA between a disk drive and a processor (as a programmable data streaming processor). In the interest of full disclosure, I must also point out that I worked at Netezza and on the product described here. All descriptions in this blog are strictly based on publicly available information and references are provided.

Xtremedata, on the other hand, uses the FPGA in an ISA and emulates the processor.

So, the two are very different architectures, and accomplish very different things.

Other fun things that can be done with this technology 🙂

Similar technology was used to make a PDP-10 processor emulator. You can read more about that project here. Using similar technology and a Xilinx FPGA, the folks described in this project were able to make a completely functional PDP-10 processor.

Or, if you want a new Commodore 64 processor, you can read more about that project here.

Why are these links relevant? ISA’s and processor emulators seem like black magic. After all, Intel and AMD spend tons of money and their engineers spend a huge amount of time and resource to design and build CPU’s. It may seem outlandish to claim that one can reimplement a CPU using some device that one analyst called “Field Programmable Gatorade”. These two projects give you an idea of what people do with this Gatorade thing to make a processor.

And if you think I’m making up the part about Gatorade, look here and here (search for Gatorade, it is a long article).

About Xtremedata

XtremeData, Inc. creates hardware accelerated Database Analytics Appliances and is the inventor and leader in FPGA-based In-Socket Accelerators(ISA). The company offers many different appliances and FPGA-based ISA solutions for markets such as Decision Support Systems, Financial Analytics, Video Transcoding, Life Sciences, Military, and Wireless. Founded in 2003 XtremeData has established two centers: Headquarters are located in Schaumburg, IL (near Chicago, Illinois, USA) and a software development location in Bangalore, India.

XtremeData, Inc. is a privately held company.

source: company web page

My Opinion

This is a technology that I have been watching for some time now. It will be interesting to see how this technique compares (price, performance, completeness) with other vendors who are already in the field. By adopting industry standard hardware (HP) and being a certified vendor for HP (Qualified by HP’s accelerator program), and because this isn’t the only thing they do with these accelerators, it promises to be an interesting development in the market.

Over the past several months, I have been wondering about the Shared Nothing architecture that seems to be very common in parallel and massively parallel systems (specifically for databases) these days. With the advent of technologies like cheap servers and fast interconnects, there has been a considerable amount of literature that point to an an apparent “consensus” that Shared Nothing is the preferred architecture for parallel databases. One such example is Stonebraker’s 1986 paper (Michael Stonebraker. The case for shared nothing. Database Engineering, 9:4–9, 1986). A reference to this apparent “consensus” can be found in a paper by Dewitt and Gray (Dewitt, Gray Parallel database systems: the future of high performance database systems 1992).

A consensus on parallel and distributed database system architecture has emerged. This architecture is based on a shared-nothing hardware design [STON86] in which processors communicate with one another only by sending messages via an interconnection network.

But two decades or so later, is this still the case, is “Shared Nothing” really the way to go? Are there other, better options that one should consider?

As long ago as 1996, Norman, Zurek and Thanisch (Norman, Zurek and Thanisch. Much ado about shared nothing 1996) made a compelling argument for hybrid systems. But even that was over a decade ago. And the last decade has seen some rather interesting changes. Is the argument proposed in 1996 by Norman et al., still valid? (Updated 2009-08-02: A related article in Computergram in 1995 can be read at http://www.cbronline.com/news/shared_nothing_parallel_database_architecture)

With the advent of clouds and virtualization, doesn’t one need to seriously consider the shared nothing architecture again? Is a shared disk architecture in a cloud even a worthwhile conversation to have? I was reading the other day about shared nothing cluster storage. What? That seems like an contradiction, doesn’t it?

Some interesting questions to ponder:

In the current state of technology, are the characterizations “Shared Everything”, “Shared Memory”, “Shared Disk” and “Shared Nothing” still sufficient? Do we need additional categories?
Is Shared Nothing the best way to go (As advocated by Stonebraker in 1986) or is a hybrid system the best way to go (as advocated by Norman et al., in 1996) for a high performance database?
What one or two significant technology changes could cause a significant impact to the proposed solution?

I’ve been pondering this question for some time now and can’t quite seem to decide which way to lean. But, I’m convinced that the answer to #1 is that we need additional categories based on advances in virtualization. But, I am uncertain about the choice between a Hybrid System and a Shared Nothing system. The inevitable result of advancements in virtualization and clouds and such technologies seem to indicate that Shared Nothing is the way to go. But, Norman and others make a compelling case.

What do you think?

Michael Stonebraker. The case for shared nothing. Database Engineering, 9:4–9, 1986

Tag: Architecture

Xtremedata’s latest announcement

Wondering about “Shared Nothing”

Share this:

Share this: