Blog: William McKnight

William McKnight

Hello and welcome to my blog!

I will periodically be sharing my thoughts and observations on information management here in the blog. I am passionate about the effective creation, management and distribution of information for the benefit of company goals, and I'm thrilled to be a part of my clients' growth plans and connect what the industry provides to those goals. I have played many roles, but the perspective I come from is benefit to the end client. I hope the entries can be of some modest benefit to that goal. Please share your thoughts and input to the topics.

About the author

William is the president of McKnight Consulting Group, a firm focused on delivering business value and solving business challenges utilizing proven, streamlined approaches in data warehousing, master data management and business intelligence, all with a focus on data quality and scalable architectures. William functions as strategist, information architect and program manager for complex, high-volume, full life-cycle implementations worldwide. William is a Southwest Entrepreneur of the Year finalist, a frequent best-practices judge, has authored hundreds of articles and white papers, and given hundreds of international keynotes and public seminars. His team's implementations from both IT and consultant positions have won Best Practices awards. He is a former IT Vice President of a Fortune company, a former software engineer, and holds an MBA. William is author of the book 90 Days to Success in Consulting. Contact William at

Editor's Note: More articles and resources are available in William's BeyeNETWORK Expert Channel. Be sure to visit today!

As applications discover the need to work with MDM for their projects to be successful, and the data and the parties are identified, the MDM team needs to be able to engage the requirement.  This is what I call setting up shop. 


You will need to balance structure and agility.


Early in the project's planning/research phase, a general conversation about the project and its data needs should be conducted with MDM Leadership.  As enough information is made available to complete the project plan tasks, the MDM team could provide the project team with the appropriate tasks for inclusion into their project plan.  A description of a superset of those tasks follows.   


All projects engaging MDM data need to share the following documents with the MDM team, or give MDM Leadership outright sign-off on them, during the Requirements phase:


  • Business Requirements  

  • Non-functional Requirements  


The Business Requirements should contain diagrams and commentary on the interface(s) that the project will have to MDM.  Depending on your working model, either the MDM team or the application team provides the technical aspects of the application integration with MDM.  Figure out which in the manifesto or be prepared for MDM to be labeled "hard to work with."

Posted February 19, 2011 9:39 AM


I have completed this paper, where I make the case for one of the largest trends in BI, mobility.


"No matter what business you are in, you are in the business of information. And it's business intelligence that has long been the discipline to deliver the needed information. Demand for business intelligence as a means to get maximum value from information has never been higher as businesses increasingly compete in real time and require information that is integrated from across the enterprise. The old saw about business intelligence is that it gets "the right information to the right people at the right time." It's really time to add "right medium" to that mix.


Automating business decisions and action is one path to business intelligence maturity. Determining what actions to trigger automatically based on changes in corporate data can come from a solid understanding of how decisions are made today. However, many decisions are multifaceted, and a knowledge worker's analysis will continue to be a part of effective business intelligence.


Effective analysis is getting more complicated for knowledge workers. The more complicated aspects include sensing what is happening and combining that with summarized historical data to build a set of possible actions. These "analytics" are the basis of competitive advantage for organizations today. Once calculated, they must be put to effective use, again utilizing the best medium available for real-time delivery."


Please see here for the full paper.  The contents include:


Business Intelligence Deployment Option History

Business Mobility

Mobile Business Intelligence Deployed

GUESS? Store Managers Don't Have to Second Guess Data

PriceLYNX: Going Mobile to Curb Supply Spend

What These Stories Tell Us Tips

Approaches to Mobile Business Intelligence

MicroStrategy Mobile


Posted February 1, 2011 8:33 AM

MDM programs are generally designed to provide the data needed by a cross-section of applications or for data that can utilize its workflow capabilities for its origination and updates.  It's an approach usually not taken for data needed by a single application, although it may be done as a set-up for future applications. 


Part of the MDM manifesto must address how teams will source MDM data.  Over 75% of the post-implementation requests made of MDM will be around this question.


In order to acquire the data, it must be mapped to the data structures of the target application.  Who does this? The MDM team, the application team, a separate integration team, or a separate architecture team?
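Whoever owns the mapping, the work itself looks roughly like the sketch below.  It is only an illustration: the field names, the FIELD_MAP dictionary and the map_to_target function are all hypothetical, not part of any MDM product.

```python
# Hypothetical mapping from MDM field names to a target application's field names.
# Every name here is invented for illustration.
FIELD_MAP = {
    "customer_id": "cust_no",
    "full_name": "name",
    "postal_code": "zip",
}


def map_to_target(mdm_record, field_map):
    """Rename the mapped MDM fields to the names the target application expects."""
    missing = [src for src in field_map if src not in mdm_record]
    if missing:
        # Fail loudly so a gap in MDM is noticed rather than silently dropped.
        raise KeyError(f"MDM record is missing fields: {missing}")
    return {target: mdm_record[src] for src, target in field_map.items()}


mdm_record = {
    "customer_id": 1001,
    "full_name": "Pat Example",
    "postal_code": "75001",
    "mdm_version": 7,  # not mapped, so it is not passed to the application
}
target_record = map_to_target(mdm_record, FIELD_MAP)
```

Keeping the mapping in one visible place, whoever maintains it, makes the ownership question easier to settle in the manifesto.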


Regardless, all new projects should meet with MDM Leadership in a very early phase of their project to determine:


  1. Data available in MDM that should be used in the project
  2. Data not available in MDM that should be sourced by the MDM team for the project (and other projects)
  3. Data that the project is generating that the MDM team should source into MDM
  4. Time and resource estimate for the MDM team contribution


Data that is not in MDM but needs to be can be added in one of two ways:


  1. MDM (the MDM team usually) can source the data from its origination point or a third party system
  2. MDM can update or add to its workflow environment, which incorporates manual entry of the information at the right point  




Posted January 29, 2011 9:45 AM

Several MDM programs out there are in development and about to go to production.  Several others are struggling in production as they try to move the program into a second subject area or to engage more publishing or subscribing systems to the information.  Others need to extend the data governance beyond a single business group.


Few have made the leap to successfully setting up their MDM program as a fully functioning member of the 'major system' ecosystem of the company.  The guidelines in these blog entries will help those shops make that transition and address the questions that the REST of the company may have about MDM.  It is absolutely essential that MDM be properly positioned to these important evaluators of the program success.


Enterprise MDM cannot be successful "in a vacuum" - built to meet the need of a single application/subject area that is well-known.  Building MDM with this hyper-focus to the exclusion of all concerns for scalability results in just what I am seeing now: MDM re-dos and multiple MDMs where there could be one, enterprise MDM.


These questions include (again, from the perspective of those not in the MDM team):


What is MDM?

What data is available there?

Do I have to use MDM's data?

Do I really have to use MDM's data?  Who will care if I don't?

What if the MDM data is not suitable for my application needs?

How long does it take to incorporate my data?

Whose role is it to add data to MDM?

Is it push or pull?

I'm just going to bring this third-party data into my application, not MDM, OK?

Is my need an extension of a subject area or a new one?

Who do I talk to about MDM?

Do I have to contribute my data?

How do I modify the existing MDM workflows?

Does the MDM team carry a separate project plan for my need?

Who builds the plan and manages those tasks?

How do I unit test, do quality assurance testing, etc. with MDM data?


Just knowing these questions could trigger the necessary action, but in case it doesn't, I'll keep posting here (and you can as well) some tips for setting up shop with MDM.

Posted January 23, 2011 9:29 PM

In my last post, I talked about Microsoft's new, upcoming columnar offering, Apollo.  I said it was designed to take some pressure off the core DBMS to do it all and do it all fast.  That's doubly true for Parallel Data Warehouse (PDW), the new MPP offering from Microsoft.  This is probably one of the last times you'll hear the word DataAllegro, but that technology, acquired in 2008 by Microsoft, is what PDW is based on.  Microsoft has spent the last 2 years replacing the core (Ingres) DBMS with SQL Server and the Linux/Java with Windows/C#.  PDW currently works on HP hardware and is in early release.

Microsoft is giving its users 2 major additional data storage options in Denali - columnar and MPP.  Microsoft is going down the path of functional parity between the core SMP offering and PDW, which is already integrated with the SQL BI stack.  It hopes to keep some of those SMP customers hitting its scalability limits in the Microsoft tent.

There is a lot of overlap in capabilities among SMP, columnar and MPP.  It's your job to sort through your workloads and make a plan.  I have found MPP much more advantageous the larger the data is and columnar useful for those high column selectivity workloads.

I'll be part of a virtual seminar focused on PDW on Tuesday.  I'll be talking about data consolidation strategies, a topic Microsoft is ready to take on with PDW.


As budgets languish, data growth balloons and business demand intensifies, BI and data warehousing professionals are under immense pressure to squeeze every last dollar of value from existing investments, while providing 24/7 access to mission-critical business information. That's the bad news.

The good news is you're invited to join renowned visionaries Bill Inmon (the father of data warehousing) and William McKnight (leading information management consultant), for our LIVE, interactive virtual seminar on November 16th (9:00 AM - 1:30 PM (EDT))  - designed to help you leverage next-generation data warehousing technologies for maximum gain.

Posted November 14, 2010 8:44 AM

It was columnar day for me at SQL PASS on Wednesday.  On Tuesday, Microsoft announced that Denali, the code name for its next release, would have a columnar data store option.  My talk on Wednesday was on columnar databases.  Here are some of the details I shared about Denali's column store, which has a project name of Apollo.  If you're interested in columnar databases in general, see my blog entries here.

In Denali, there will be an index type of "COLUMN STORE".  I find this to be an interesting use of index, because the resultant data stores that are created are not like traditional indexes.  However, Microsoft has never been a conformist organization.   The column stores are non-clustered indexes.  No indexes can be created on top of the column stores.

Where the column store is like an index is that you need the accompanying row-wise database.  The column stores are created alongside the row-wise database, not in place of it.  To my knowledge, this is the only database that requires this.  I don't expect this to be a long-term requirement.  While this may seem like it's expanding your storage needs (and it is), it may not be as much as you initially think because some non-clustered indexes might become redundant in this architecture.

The good news about this is that the optimizer has been updated to route queries to the column stores or the row store accordingly.  This could prove to be a competitive differentiator.  Few other database systems have this.  An intelligent hybrid optimizer will be key to the success of databases that are at least partly columnar. 

Apollo's vectors (per my Sybase IQ language in my earlier posts) are called column segments, although there can be multiple segments per column, as explained below.  You can only have one column store index per table, but you can name as many columns as you want.  Of course, it doesn't matter what order you use because each column forms an independent segment.  Only single-column segments are supported in Apollo.
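To make the "each column forms an independent segment" point concrete, here is a toy sketch.  It is an illustration only and assumes nothing about Apollo's real storage format; the sample rows and the function names are all made up for the example.

```python
# Toy illustration only: this is NOT how SQL Server lays out data on disk.
rows = [
    {"order_id": 1, "region": "West", "amount": 120.0},
    {"order_id": 2, "region": "East", "amount": 75.5},
    {"order_id": 3, "region": "West", "amount": 210.0},
]


def to_column_store(rows, columns):
    """Build one independent list of values per named column."""
    return {col: [row[col] for row in rows] for col in columns}


def sum_amount_row_wise(rows):
    """Sum 'amount' while counting every value read from each full row."""
    total = 0.0
    values_touched = 0
    for row in rows:
        values_touched += len(row)  # the whole row comes along for the ride
        total += row["amount"]
    return total, values_touched


def sum_amount_columnar(store):
    """Sum 'amount' while reading only that column's values."""
    values = store["amount"]
    return sum(values), len(values)


# The order in which the columns are named does not change any column's values.
store_a = to_column_store(rows, ["amount", "region"])
store_b = to_column_store(rows, ["region", "amount"])

row_total, row_touched = sum_amount_row_wise(rows)
col_total, col_touched = sum_amount_columnar(store_a)
```

Both paths return the same total, but the row-wise path touches every value in every row while the columnar path touches only the one column the query needs.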

Apollo leverages the intellectual property, patents, and good experiences that Microsoft has had with Vertipaq, the client-side, columnar, in-memory structure used with PowerPivot.  Columnar remains the preferred, future, and only, format for Vertipaq. 

In Apollo, no inserts, updates or deletes are allowed on the tables that have a COLUMN STORE (this is the part of the talk where I did a mock exit).  You can, however, do incremental loads and you can switch partitions to add data.  You can also sandwich DISABLE and REBUILD of the segments around your updates.  I expect this will improve over time.

As long as I'm on limitations, the columns selected have data type restrictions.  The columns must be integer, real, string, money, datetime or a decimal that is 18 digits or less.  No other data types are supported.
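If you want to screen a table for candidate columns, the rule above can be written down as a quick check.  This is a rough sketch: the type labels are the descriptive names used in this post rather than actual SQL Server type names, and the schema format and function names are hypothetical.

```python
# Type labels follow the wording of this post, not real SQL Server type names.
SUPPORTED_TYPES = {"integer", "real", "string", "money", "datetime"}
MAX_DECIMAL_DIGITS = 18


def is_column_store_eligible(data_type, precision=None):
    """Return True if a column of this type fits the restrictions described above."""
    if data_type == "decimal":
        # Decimals qualify only at 18 digits or less.
        return precision is not None and precision <= MAX_DECIMAL_DIGITS
    return data_type in SUPPORTED_TYPES


def ineligible_columns(schema):
    """List the columns in a {name: (type, precision)} schema that do not qualify."""
    return [
        name
        for name, (data_type, precision) in schema.items()
        if not is_column_store_eligible(data_type, precision)
    ]


# Hypothetical table definition for illustration.
schema = {
    "order_id": ("integer", None),
    "amount": ("decimal", 18),
    "big_amount": ("decimal", 38),
    "notes": ("xml", None),
}
rejected = ineligible_columns(schema)
```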

As we know, getting more (relevant) information in the I/O is one of the advantages of a columnar data store.  Microsoft has taken this to a new level.  While data is still stored in pages (blocks), the unit of I/O is actually one million data values.  That forms a "segment."  You read it right - the unit of I/O is not a certain number of "K" but has to do with the NUMBER of data values.  Inside those pages, the data is stored in blobs.  Bitmapping is part of the storage somehow as well, although columnar data page layouts are not public information.  Neither is how it's doing materialization.  As for I/O, compression algorithms have been reengineered for columnar.  These are not the same compression algorithms from the row-wise database.
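Some back-of-the-envelope arithmetic shows what a one-million-value segment means for a table.  This sketch assumes each segment holds exactly one million values with a partial segment at the end, and it ignores partitioning; the real segment boundaries may differ, and the function names are mine.

```python
VALUES_PER_SEGMENT = 1_000_000  # the unit of I/O described above


def segments_per_column(row_count, values_per_segment=VALUES_PER_SEGMENT):
    """Number of segments one column needs, counting a final partial segment."""
    return (row_count + values_per_segment - 1) // values_per_segment


def total_segments(row_count, column_count):
    """Each column in the column store gets its own set of segments."""
    return segments_per_column(row_count) * column_count


def segments_read(row_count, columns_in_query):
    """A query that scans the whole table reads segments only for the columns it names."""
    return segments_per_column(row_count) * columns_in_query


# A hypothetical 2.5 million row table with 4 columns in the column store.
per_column = segments_per_column(2_500_000)
all_segments = total_segments(2_500_000, 4)
one_column_scan = segments_read(2_500_000, 1)
```

Under these assumptions, the table has 3 segments per column and 12 in total, and a full scan of a single column reads only 3 of them.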

Posted November 12, 2010 10:22 AM


For much of the last decade, conventional theories surrounding decision support architectures have focused more on cost than business benefit. Lack of Return on Investment (ROI) quantification has resulted in platform selection criteria being focused on perceived minimization of initial system cost rather than maximizing lasting value to the enterprise. Often these decisions are made within departmental boundaries without consideration of an overarching data warehousing strategy.


This reasoning has led many organizations down the eventual path of data mart proliferation. This represents the creation of non-integrated data sets developed to address specific application needs, usually with an inflexible design. In the vast majority of cases, data mart proliferation is not the result of a chosen architectural strategy, but a consequence due to lack of an architectural strategy.


To further complicate matters, the recent economic environment and ensuing budget reduction cycles have forced IT managers to find ways of squeezing every drop of performance out of their systems while still managing to meet users' needs. In other words, we're all being asked to do more with less. Wouldn't it be great to follow in others' footsteps and learn from their successes while still being considered a thought



The good news is that the data warehousing market is now mature enough that there are successes and best practices to be leveraged. There are proven methods to reduce costs, gain efficiencies, and increase the value of enterprise data. Pioneering organizations have found a way to save millions of dollars while providing their users with integrated, consistent, and timely information. The path that led to these results started with a rapidly emerging trend in data warehousing today - Data Mart Consolidation (DMC). I've learned that companies worldwide are embracing DMC as a way to save large amounts of money while still providing high degrees of business value with ROI. DMC is an answer to the issues many face today.


Posted November 7, 2010 4:44 PM

Customer contact center data contains hidden nuggets of insight about customers, products, and business operations, and it provides the foundation for effective customer relationship management (CRM). Mining this data for insights can be daunting, however.

The databases that support operational activities such as call center operations are tuned for the performance of those operations and are usually inappropriate for data analysis. The database structures are designed for transaction processing. The databases themselves contain limited historical content because data retention is typically limited to a maximum of three to six months. And the data in them is only a subset of the total contact activities the business handles, either because geographically dispersed contact centers handle enterprise contact activities or because different applications or divisions handle telephone, e-mail, and Web-based contacts.

Deriving the full value from customer contact data requires the integration of all contact records, regardless of how or where they were received. Data might represent phone contacts routed to a regional call center, e-mails sent to a service organization, or Web interactions between sales agents and prospects surfing a company's Web site. The ability to optimize the business to better meet the needs of customers depends on knowing what those customers are doing, regardless of the communication channel they use and regardless of how or where their contacts were routed within an enterprisewide contact


Posted October 31, 2010 11:08 AM

For any well-done information management project, there exists a set of documentation.

I've recently come into many shops to help clients overcome process, organizational and technical challenges, and many times documentation is non-existent.  Of course, we all know the many benefits of good documentation, but it also helps consultants be a quick study and get straight to advice or action.

But what documentation is necessary and is it necessary if you are deploying an agile methodology?  I'll answer the first part below but my answer to documentation and agile is that yes, documentation is important in agile.  If you are going to support the system, potentially do something similar again, hire consultants or have employee turnover, you need documentation - at the appropriate times and with appropriate, not necessarily excruciating, detail. 

You also probably do not want to be the person who repeats himself repeatedly, which is what you'll do without documentation available to hand out.  Repeating oneself is hardly the most advanced use of time.

The documentation can be built with agility just like the systems, but here is a list to think about delivering (or, as the case may be, retrofitting).  It's a STARTER list for what is necessary.

  • Non Functional Requirements - Describes the environment in which the system operates

  • Decisions Capture - A place to catch all those important decisions that are made and a vehicle for making those decisions visible in the culture

  • Logical Data Model - The logical data model should not exist without narrative

  • Test Approach - Describe how the development will be tested - systems, data, user involvement, players, etc.

  • Interface Specification - Interfaces are largely what these projects are all about - either interfacing existing systems or adding multiple components that need interfacing themselves

  • Startup Plan - Day 1 of production is hardly when data can begin accruing in new information management projects; loading the data backlog needs a plan of its own

  • Data Access Specification - Describes how the new data being made available will be accessed by users and systems


Do yourself a favor.  Inventory your documentation set and make sure it's the right level for sustaining the system.

Posted September 21, 2010 11:58 AM

Do we need business intelligence (BI) tools to be successful?
Get William's take on whether you need business intelligence (BI) tools to be successful and learn about low-cost BI, including cheap and open source BI tools.

Fastest way to learn business intelligence (BI)
Find out how you can learn business intelligence (BI), get business intelligence training and discover why analytics is the key to marketing efforts in this BI tip.

Data architect careers: The benefits of working at a System Integrator
Working as a data architect at a consultancy or System Integrator can be challenging, but has benefits. Find out the career value of working at a System Integrator, as advised by William McKnight.

Posted August 20, 2010 3:27 PM
