Blog: Colin White

Colin White

I like the various blogs associated with my many hobbies and even those to do with work. I find them very useful and I was excited when the Business Intelligence Network invited me to write my very own blog. At last I now have somewhere to park all the various tidbits that I know are useful, but I am not sure what to do with. I am interested in a wide range of information technologies and so you might find my thoughts will bounce around a bit. I hope these thoughts will provoke some interesting discussions.

About the author

Colin White is the founder of BI Research and president of DataBase Associates Inc. As an analyst, educator and writer, he is well known for his in-depth knowledge of data management, information integration, and business intelligence technologies and how they can be used for building the smart and agile business. With many years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles and papers on deploying new and evolving information technologies for business benefit and is a regular contributor to several leading print- and web-based industry journals. For ten years he was the conference chair of the Shared Insights Portals, Content Management, and Collaboration conference. He was also the conference director of the DB/EXPO trade show and conference.


November 2008 Archives

Relational database systems, such as IBM DB2 and Oracle Database, have undergone over a quarter century of development. During that time they have managed to successfully fight off competing database technologies for supporting mainstream database management. Do you remember the object/relational wars of the eighties?

MapReduce, a software framework introduced by Google for supporting parallel processing over very large (petabyte-scale) files, has garnered significant attention of late. IBM is experimenting with this in conjunction with Google, and GreenPlum recently announced support.
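For readers who have not seen the programming model, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using the familiar word-count example. It is only an illustration of the idea, not Google's implementation or the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real framework would run the map and reduce calls in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence (hypothetical driver, no parallelism)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog"]))
```

Note that every document is scanned in full and no index is consulted, which is the brute-force behavior that critics from the relational camp object to.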

The significant interest in MapReduce, and related technologies such as Hadoop and HDFS, has led to a backlash from the relational camp. David DeWitt and Michael Stonebraker have been especially outspoken.

Here is a small quote from their thoughts on the topic:

"As both educators and researchers, we are amazed at the hype that the MapReduce proponents have spread about how it represents a paradigm shift in the development of scalable, data-intensive applications. MapReduce may be a good idea for writing certain types of general-purpose computations, but to the database community, it is:

1. A giant step backward in the programming paradigm for large-scale data intensive applications

2. A sub-optimal implementation, in that it uses brute force instead of indexing

3. Not novel at all -- it represents a specific implementation of well known techniques developed nearly 25 years ago

4. Missing most of the features that are routinely included in current DBMS

5. Incompatible with all of the tools DBMS users have come to depend on"

Does this mean the database wars are starting up again?

My opinion is that MapReduce is not intended for general-purpose commercial database processing and is therefore not a major threat to relational systems. However, it does have its uses (as Google has demonstrated) for certain types of high-volume processing. It also demonstrates that as data volumes get bigger, and the complexity of data and data structures increases, other types of database technology may start to gain traction in certain niche marketplaces. IBM's use of the SPADE language, instead of StreamSQL, in its InfoSphere Streams product (System S) is another sign of the changes going on in the database market.

What do you think?

Posted November 25, 2008 4:20 PM