Blog: Colin White

Colin White

I like the various blogs associated with my many hobbies and even those to do with work. I find them very useful and I was excited when the Business Intelligence Network invited me to write my very own blog. At last I now have somewhere to park all the various tidbits that I know are useful, but I am not sure what to do with. I am interested in a wide range of information technologies and so you might find my thoughts will bounce around a bit. I hope these thoughts will provoke some interesting discussions.

About the author

Colin White is the founder of BI Research and president of DataBase Associates Inc. As an analyst, educator and writer, he is well known for his in-depth knowledge of data management, information integration, and business intelligence technologies and how they can be used for building the smart and agile business. With many years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles and papers on deploying new and evolving information technologies for business benefit and is a regular contributor to several leading print- and web-based industry journals. For ten years he was the conference chair of the Shared Insights Portals, Content Management, and Collaboration conference. He was also the conference director of the DB/EXPO trade show and conference.


August 2006 Archives

I read with interest Claudia Imhoff's recent blog about the ODS being alive and well.

I agree! I have also heard statements about the ODS being dead and of course that's absurd. The problem is that people like to put labels on things and many of the data stores given the ODS label are work data sets or staging areas, and are not in fact ODSs.

In an ideal world we would have one data store that all applications use, and there would be no need for an ODS, or even data stores for data warehousing, CDI or MDM. Unfortunately, applications are developed at different times, have different data and data model requirements, performance needs, and so forth.

I remember when distributed database was the flavor of the day. Many of these efforts failed because of the heterogeneous data involved, and because of integrity and performance issues. Since then we have come up with a variety of data management and data integration options to try to overcome these issues. Like distributed database, these technologies work for certain types of application. What we have learnt from these efforts is that one size does not fit all when it comes to data management.

In general, there are five types of application processing in IT systems: business transaction, collaborative, business intelligence, planning, and master data processing. These applications may require current data or past data.

Most operational business transaction applications require current data. These applications and their associated data stores are dispersed throughout the enterprise. This dispersed data makes it difficult to gain a current view of enterprise-wide business operations. To solve this problem, required subsets of the operational data are integrated into an operational data store (ODS). The latency of the data in the ODS will vary based on business needs. The ODS can be used for operational reporting, to feed downstream applications, and as a temporary bridge during legacy application migration.

Dispersed operational data also makes it difficult to gain a past view of enterprise-wide business operations. To solve this problem, the data warehouse concept was created. A data warehouse contains a historical subset of past data captured from the operational environment. The data sources for the data warehouse can be the operational applications themselves, or an ODS.
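To make the current-versus-past distinction concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the class names, the `load` and `load_from` methods, and the sample order are all hypothetical and do not describe any product. The ODS overwrites each record so it always shows the current state, while the warehouse appends every value it captures, using the ODS as its data source.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class OperationalDataStore:
    """Current view: one integrated record per business key (hypothetical)."""
    current: dict = field(default_factory=dict)

    def load(self, key: str, value: float, as_of: date) -> None:
        # Overwrite in place; the ODS does not keep earlier values.
        self.current[key] = (value, as_of)


@dataclass
class DataWarehouse:
    """Past view: every captured value is appended and kept (hypothetical)."""
    history: list = field(default_factory=list)

    def load_from(self, ods: OperationalDataStore) -> None:
        # Here the ODS acts as the data source for the warehouse.
        for key, (value, as_of) in ods.current.items():
            self.history.append((key, value, as_of))


ods = OperationalDataStore()
dw = DataWarehouse()

# Day 1: a made-up order total arrives from an operational application.
ods.load("order-1001", 250.0, date(2006, 8, 1))
dw.load_from(ods)

# Day 2: the same order is amended in the source system.
ods.load("order-1001", 300.0, date(2006, 8, 2))
dw.load_from(ods)
```

After the second load the ODS holds only the amended order, while the warehouse holds both versions. A real warehouse load would of course involve far more than this (change detection, transformation, and so on); the sketch only shows the difference in what each store retains.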

Collaborative applications are also dispersed throughout the enterprise. These applications require both current and past data. In this case, an enterprise view is obtained by capturing the dispersed data in a content management system (CMS). The data from the CMS data store can also act as a data source for a data warehouse. Planning data follows a similar pattern to collaborative data. The industry direction here is to consolidate planning data into a shared data store. Again this data store can be used as a data source for a data warehouse.

When data cannot be consolidated into a single data store for operational, BI or collaborative processing, then data federation approaches can help to create an integrated view of the dispersed data.

The last piece in the puzzle is master data processing. It is this type of processing that has sometimes led to the comment that the ODS is dead. Current master data is often intermingled with operational business transaction data. In these cases, master data finds its way into an ODS for operational processing, and into a data warehouse for BI and decision processing.

For ease of management, the direction of the industry is to separate master data from standard transaction data, and to manage both current and past master data in a separate integrated store. This approach removes current master data from business transaction systems and the ODS. It also removes past master data from the data warehouse. Standard transaction data, however, still flows into the ODS and the data warehouse.

It will take some time to get to full enterprise MDM. Meanwhile, interim solutions like CDI hubs are being used to propagate master data between systems. The key thing to note here is that these hubs only process master data. The remaining transaction data continues to flow into the ODS and data warehouse. If a CDI hub processes all the transaction data, then it is performing the same function that an ODS does today, and there is no point in creating a new name for it.
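The separation described above can be sketched as a simple routing rule. This is a rough Python illustration under my own assumptions, not a description of how any CDI or MDM product works: the `Record` type, the `route` function, and the store names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    kind: str      # "master" or "transaction" (hypothetical labels)
    key: str
    payload: dict


def route(record: Record) -> list:
    """Return the names of the stores that should receive this record."""
    if record.kind == "master":
        # Current and past master data are managed in one separate store.
        return ["master_data_store"]
    if record.kind == "transaction":
        # Standard transaction data still flows to both of these.
        return ["ods", "data_warehouse"]
    raise ValueError(f"unknown record kind: {record.kind!r}")


customer = Record("master", "cust-42", {"name": "Example Customer"})
order = Record("transaction", "order-1001", {"customer": "cust-42", "total": 300.0})

customer_targets = route(customer)
order_targets = route(order)
```

With this rule the customer record goes only to the master data store, and the order goes to the ODS and the data warehouse. If the first branch were changed to accept transaction records as well, the hub would simply be doing the job of an ODS, which is the point made above.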

Any comments?

Posted August 24, 2006 3:59 PM

IBM's proposed acquisition of FileNet and OpenText's acquisition of Hummingbird demonstrate that the enterprise content management (ECM) space is consolidating. As in other markets, the number of independent vendors is diminishing as the big infrastructure players like BEA, IBM, Microsoft, Oracle, and SAP acquire and develop new products and barge their way into new markets. The infrastructure vendors are bringing together content management, search, collaboration, portals and process management to create what can be thought of as a knowledge management platform. I guess knowledge management is still a dirty word in many organizations. I think that at one time, to overcome this issue, Gartner added business intelligence to this mix and created the concept of the smart enterprise suite. Like many Gartner buzzwords, it has fallen by the wayside.

The challenge for companies like IBM and OpenText is integrating products and dealing with overlapping function. IBM particularly has a challenge here. It has a bewildering number of options for content management, search, and rules management. In many cases, large vendors just see which products stick and gather revenue from their overlapping products. Computer Associates has used this model for years. The problem for the customer is wading through this morass to determine which products to buy and which are likely to survive.

Posted August 23, 2006 10:18 AM