Blog: William McKnight

William McKnight

Hello and welcome to my blog!

I will periodically be sharing my thoughts and observations on information management here in the blog. I am passionate about the effective creation, management and distribution of information for the benefit of company goals, and I'm thrilled to be a part of my clients' growth plans and to connect what the industry provides to those goals. I have played many roles, but the perspective I come from is benefit to the end client. I hope the entries can be of some modest benefit to that goal. Please share your thoughts and input on the topics.

About the author >

William is the president of McKnight Consulting Group, a firm focused on delivering business value and solving business challenges utilizing proven, streamlined approaches in data warehousing, master data management and business intelligence, all with a focus on data quality and scalable architectures. William functions as strategist, information architect and program manager for complex, high-volume, full life-cycle implementations worldwide. William is a Southwest Entrepreneur of the Year finalist, a frequent best-practices judge, has authored hundreds of articles and white papers, and given hundreds of international keynotes and public seminars. His team's implementations from both IT and consultant positions have won Best Practices awards. He is a former IT Vice President of a Fortune company, a former software engineer, and holds an MBA. William is author of the book 90 Days to Success in Consulting. Contact William at

Editor's Note: More articles and resources are available in William's BeyeNETWORK Expert Channel. Be sure to visit today!

October 2008 Archives

Fall conference season is a wrap for me except for my virtual presentation next week for the online DAMA-NCR Wilshire Symposium "Leveraging Information Asset Management." I'm presenting "Incorporating Syndicated Data into your Information Management environment" on Wednesday, November 5 at 12:30 ET.

Observations from the field will continue in the blog.

Technorati tags: DAMA, Syndicated Data

Posted October 31, 2008 11:31 AM
Permalink | No Comments |

With a title like “It’s a Data World After All!”, I had to attend because I couldn’t agree more with the principle. I’d also like to attend a presentation with a counter point of view - if you know of any.

Anyway, this session was by Pat Wilson at Disney, from just down the street. She's on the data warehouse team. The premise was that the approach to gathering requirements for a data project is different from that for other projects. Like many companies using analytics, Disney tailors hotel offers based on your answers to its questions, selecting the offers most likely to succeed.

She went through some BI basics and made a good point about not collecting data whose eventual usage is unknown. Avoid the "kitchen sink" requirement, which goes like "just give us everything, we'll use what we need." Also avoid the "mind reader" approach, which goes like "just give us what we want, we'll know what to do with it." And avoid the "trial and error" approach, which goes like "we'll know it's right when we see it."

I think this presentation could correlate each scenario above to a Disney character. May I suggest, respectively, Goofy, Evil Queen, and Sleepy.

The (great) overall point is to get DEEPER with your requirements for these projects and don't accept the shallow "requirements" the business tries to get away with. That's a recipe for failure. She recommended an approach that follows along with this model:


I definitely agree with everything said here. It tracked pretty closely to my "ROI and Focus" blog in terms of why the data warehouse team exists and my article on Requirements Gathering (search Google for “McKnight Data Warehouse Requirements Analysis”).

Technorati tags: Business rules, Business Rules Forum, Business Requirements

Posted October 29, 2008 8:31 AM
Permalink | No Comments |

I was at the Business Rules Forum in Orlando yesterday to present and otherwise take in a few seminars. I'll report on a few here. First up is "Beyond Subject Matter Expertise" by Steve Demuth of ILOG.

What does business DNA look like? Process is the heart and soul of how things get done. Then, decisions go into those processes. And just as organisms evolve, a business can change its DNA without becoming a new business.

Process describes the how of the core activities of the enterprise. A decision determines the what of enterprise activity, and it's typically either automatable or human, but not a mixture.

Automated decisions are ubiquitous (e.g., commissions, cross-selling, fraud detection). The talk then focused on how to automate a decision intelligently. Sometimes it's a decision table, sometimes a rule flow (flowchart). How to turn analysis into rules: model the landscape, understand the business goal, and formulate and formalize the solution. The last step is where your business rule management system (BRMS, like ILOG or Fair Isaac) comes in. A value-added step at that point is to add simulation on historical data, perhaps in your data warehouse, and then analyze the simulated outcome.
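The decision-table and simulation steps can be sketched in a few lines of Python. This is a minimal illustration, not how any particular BRMS works, and the commission tiers and rates are invented:

```python
# A decision table maps conditions on inputs to outcomes (here, a commission
# rate). Each row is (predicate, outcome); the first matching row wins,
# which is the essence of a rule flow.
COMMISSION_TABLE = [
    (lambda sale: sale >= 100_000, 0.05),   # large deals
    (lambda sale: sale >= 10_000,  0.03),   # mid-size deals
    (lambda sale: True,            0.01),   # default rule
]

def decide_commission_rate(sale_amount):
    """Evaluate the decision table top-down and return the first match."""
    for predicate, rate in COMMISSION_TABLE:
        if predicate(sale_amount):
            return rate

def simulate(history):
    """Replay historical transactions through the rules and total the
    outcome, the 'simulate before deploying' step the talk recommended."""
    return sum(amt * decide_commission_rate(amt) for amt in history)
```

Running past transactions through `simulate` before go-live is exactly the value-added step above: you see what the rules would have cost or earned historically before trusting them in production.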

Then, Steve talked about numerically characterizing history to evaluate those outcomes and predict the best solution to take. It's about predictability and likelihood. For example, in a group of transactions, how many will be fraudulent, or how many will take up the cross-sell offer?

Then, Steve talked about planning and scheduling with BRMS and how it can create a (for example) optimized nurse schedule for a hospital and deal with the inevitable last-minute decisions that must occur based on last-minute no-shows. This is an example of creating the adaptive enterprise – one that adapts to business changes. However, to get there, we need to break down the hedgerows between business departments, specifically IT and business groups.

Finally, 80% of a business’ problems are about being better at what you do. Business rules can help. 20% of the problems are about being something different than what you are. Both this “adaptation” and “creation” (potentially “destruction”) are necessary.

Technorati tags: Business rules, Business Rules Forum

Posted October 29, 2008 8:28 AM
Permalink | No Comments |

And then there was the case study from my client, Commerzbank. I co-presented this one with Carolina Posada, Vice President at Commerzbank. Regarding the presentation topic of data governance: as a midsize organization (the US branch), they combined Program Governance (program direction) and Data Governance (standards) into a single Steering Committee. They also have a data stewardship program.

I pointed out that the important thing about these committees is that all of the necessary information management functions for the organization get done. These committees normally comprise data governance, data stewardship, program governance and a business intelligence competency center (or center of excellence). I do not wish to overdo committees at my clients, but I do want to be sure all of the required functions for success are being performed.

The benefits Carolina cited for their Analytical MDM implementation were:
1. Data management is aligned with the company strategy
2. Operational systems (by product) support reporting and compliance
3. The hub allows the single customer master to be shared to all product systems
4. Early data issues detection
5. They know their complete exposure to clients, whereas before it was piecemeal and incomplete
6. Reconciliation of transformed data to GL metrics
7. Managers consuming information and providing constant feedback for improvements
8. A unified customer view... for all its other benefits

I generalized from many MDM implementations and presented "Top 10 Mistakes Companies Make in Forming Data Governance." They are (in no particular order):
1. Not Translating IT Investments into Business Objectives
2. Thinking of it as a Technical Function
3. Scope Creep
4. A Revolving Door of Membership and Participation
5. No Decision Maker
6. Failure to Create a Charter
7. Turning Governance into the Blame Game
8. Lack of Customization to the Culture
9. Thinking of it as "just meetings"
10. Hyperfocus on a tactical issue

Technorati tags: Master Data Management, Business Intelligence, MDM Summit

Posted October 24, 2008 9:18 AM
Permalink | No Comments |

I made it out for a day on Monday to the MDM Summit in New York. The conference has picked up some from years past. Their information has it that case studies are the draw so the conference had quite a few of them. RR Donnelley (using Purisma) had a great case study because they have followed some best practices like:

1. Knowing BI & MDM go hand-in-hand
2. Focusing on MDM when combining 3 large organizations to form RR Donnelley
3. They didn't pick the technology first, but grew into it
4. Somebody there had the wisdom to declare early that MDM must be minimally invasive to the source systems, and it was something RR Donnelley followed
5. They used D&B DUNS number for identifying (B2B) customers
6. They built in capability for (what I call) master data query
7. Data governance and stewardship

They use the Registry model for MDM.

The "ROI" from the effort was in sales reporting, reduced manual work in reviewing customer names, and knowing their exposure to companies who were/are potentially going under in the challenging economy.

The last best practice was to use outside implementation services. I know of one that can help there.

Technorati tags: Master Data Management, merger, Business Intelligence, MDM Summit

Posted October 24, 2008 8:54 AM
Permalink | No Comments |

Next up was “Unstructured Text Analysis” from Julie Hartigan, who holds a PhD in natural language processing. First, she established that search is not mining or analysis.
Then I learned another new word: zettabyte, which is a million petabytes. Used in a sentence by Julie: an IDC study projects the digital universe to be 1.8 zettabytes (roughly 1,800,000 petabytes) in 2011.

Since 85% of data is unstructured, we are making decisions on 15% of data. Search is the backbone of access, but it's only one way we find information. We also categorize, cluster, and link (if you like a, you'll like b). She made the case that most searches are small (i.e., few predicates) and query revision is the norm. I think she means queries issued by people, not queries in production, which tend to grow large over development and are frequently run. However, the point is taken that we want our data back fast, whether we're IT people or business people, and without writing long queries. E.g., select gross_profit, not select *. Taking this a step further, I am continually pushing in my consulting for deriving more data in the back end and making it easy on the end user. Easy = they use it; hard = they don't.
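The "derive in the back end" point can be sketched in Python. The column names here (revenue, cogs, gross_profit) are hypothetical; the idea is simply that the formula runs once at load time so users never write it:

```python
# Back-end derivation: compute gross_profit once during the load so end
# users can "select gross_profit" instead of repeating the formula in
# every query they write.
def derive_gross_profit(rows):
    """rows: list of dicts with 'revenue' and 'cogs'; adds 'gross_profit'
    to each row in place and returns the list."""
    for row in rows:
        row["gross_profit"] = row["revenue"] - row["cogs"]
    return rows

sales = [{"revenue": 1200.0, "cogs": 700.0},
         {"revenue": 300.0,  "cogs": 120.0}]
derive_gross_profit(sales)
```

The design point: one tested derivation in the load beats many hand-written (and possibly inconsistent) formulas in the query layer.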

Back to Julie: she said search is not the answer, but language may be. However, we need to get ambiguity out of language – and there's plenty of that. E.g., "light brown dog", "pressing a suit", "the chicken is ready to eat", "he saw her duck". Then there are similes, metaphors and idioms. We need text mining, and NLP (natural language processing, not neuro-linguistic programming) is the most effective approach. We decomposed some sentences the way the tools do.

Adding unstructured data makes it not the "single version of the truth" but the "single version of the whole truth."

Julie went into "voice of the customer" – mining call logs, for example, to determine that "unhappy" sentiment was due to beneficiaries not getting the support they needed. Regarding VOC, what if call records were not structured? Search (e.g., for the word "billing") would not suffice for analysis. You really must structure the data to see trends and norms.

She showed a dashboard from Claraview of some hotel sentiment analysis, replete with categorization of sentiment and drill-through to the detailed feedback. E.g., "This was one of the worst hotels I have stayed at. The room heater made deafening noises all night long. The staff at the hotel was all extremely rude and condescending."
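As a toy illustration of why structuring beats raw search, here is a naive keyword tagger in Python. Real products use full NLP rather than keyword lists, and the categories and terms below are invented:

```python
# Naive voice-of-the-customer structuring: tag each free-text comment
# with topic categories so trends can be counted, rather than merely
# searched. A real NLP tool handles ambiguity; this is only a sketch.
CATEGORIES = {
    "noise":   ["noisy", "deafening", "loud"],
    "staff":   ["rude", "condescending", "unhelpful"],
    "billing": ["billing", "charged", "invoice"],
}

def categorize(comment):
    """Return the sorted list of categories whose terms appear in the text."""
    text = comment.lower()
    return sorted(cat for cat, terms in CATEGORIES.items()
                  if any(term in text for term in terms))

feedback = "The room heater made deafening noises; the staff was rude."
```

Once every comment carries category tags, "what share of complaints are about staff?" becomes a simple count, which is exactly the analysis plain search cannot give you.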

Examples used were from the products Attensity and Clarabridge.

In closing, there was some discussion of mining abbreviations. One language that will be a real challenge to mine is the one we heard Tuesday night at Cirque du Soleil's "Ka" show – Grammelot (yet another new word tangentially credited to Partners).

A couple of classic Teradata case studies and a panel of grumpy old men (their title) – which we were asked not to journal – later, and it was time to leave Partners. What I'll remember beyond the conference is the high quality of the Teradata team, both Teradata employees and the community of professionals. That is certainly one of Teradata's biggest assets.

Technorati tags: Teradata, Teradata Partners, Unstructured Data

Posted October 17, 2008 9:30 PM
Permalink | No Comments |

There were a couple of great best practices in Claudia Imhoff’s operational business intelligence presentation. How about

1. Any weaknesses you have in your BI will only get worse when you speed it up
2. Don’t make (or think you’ll make) big changes to operational processes – that’s done a bit at a time

Technorati tags: Teradata, Teradata Partners

Posted October 17, 2008 9:28 PM
Permalink | No Comments |

First up yesterday for me was "Teradata Spatial" by Michael Watzke of Teradata. He talked about the ST_GEOMETRY data type. Ideas for its use included area, perimeter, census tracts and streets that intersect. For example, find all customers within 100 meters of another customer. Here's the new word from this presentation: tessellated.

Use cases presented were in insurance (customer density) and communications (find customers near stores or in a marketing area.)
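ST_GEOMETRY handles this kind of query in SQL inside the database; outside it, the 100-meter proximity idea can be sketched in Python with a haversine distance. This is only an illustration, and the coordinates used in testing are arbitrary:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def customers_within(customers, lat, lon, radius_m=100):
    """Return ids of customers within radius_m of the given point.
    customers: dict of id -> (lat, lon)."""
    return [cid for cid, (clat, clon) in customers.items()
            if haversine_m(lat, lon, clat, clon) <= radius_m]
```

A database spatial type does this with indexing (hence "tessellated" tiling of space) instead of the brute-force scan above, which is why pushing the query into the platform matters at scale.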

Teradata 14 will add the Raster data type for imagery and geocoding.

Overall, I think this is a nice addition to Teradata’s already strong feature/function set. It will be very important to some and ignored by others.

Technorati tags: Teradata, Teradata Partners, spatial, data warehouse

Posted October 17, 2008 9:25 PM
Permalink | No Comments |

I conducted some podcasts for the B-eye-network this week at Teradata Partners, which are available here on (not surprisingly) the B-eye-network. Check out the podcasts in the lower left corner of the home page.

I spoke with Deb Hoefer of Teradata, who is spearheading a training program for Teradata business users. For example, one course is Teradata SQL for Business Users. I definitely agree they need training and we spent some time talking about how they can customize a program – something obviously needed. I think this is a great idea as user communities are clamoring for empowerment! This topic actually has been a top three requirement in the strategies that I have been conducting lately for clients.

I also spoke with Ray Wilson. We talked about Teradata’s programs for anti-money laundering and meeting the regulations for uncovering the concealment of sources of income.

Finally, I had a podcast with Alok Pareek, the CTO at GoldenGate. We talked about their technology (they did some co-development with Teradata), but primarily we talked about dual-active. It was a good quick journey into the value of dual-active, some of it not so obvious, as in environments where there can be zero downtime or risk for an ERP upgrade. By using dual-active, some companies run multiple versions of their ERP against the SAME data, with only limited pressure on end users to move to the more recent version. I mention at the end that upgrades are among the most challenging things IT does, and better ways than the high-risk weekend war rooms are needed.

Enjoy all the Partners podcasts.

Technorati tags: Teradata, Teradata Partners, business intelligence, data warehouse

Posted October 17, 2008 9:24 PM
Permalink | No Comments |

I caught the session by Donald Feinberg with Gartner. He mentioned there was yet another market plunge happening today and had other acknowledgment of the bleak broader business environment. Regardless, Donald said that there are no indications that budgets for IT are being cut. I think with his Dow announcement he lost the audience for a bit, while the drop was being checked on cell phones throughout.

Some of the highlights:

IT doesn’t need to communicate to the business, they need to work with the business. Nicely put!

By 2012, BI analytics will be an element of more than 85% of business applications, and users will interact with it there. This BI will need the DW.

Half of BI and IM initiatives will fail. How will they fail if they’re already successful? There will be different measurements going forward. IT doesn’t listen and ignores the need for change.

There is a lack of vision for BI. Creating strategy is not fun, so it doesn't get done. I note that's what I do, and I think it's fun.

DW/BI is the #1 technical priority for CIOs and IT for 4 years straight now with a 9.2% growth estimate in 2008.

Every student graduating from college today has knowledge of IT and ability to get their own data. This will lead to Excel and Access proliferation and data quality problems if we’re not careful. It will also lead to a rise in SaaS and DW in the cloud.

The DW is mission critical!

Linux revenue growth from 2006 to 2007 was 61.8% and Unix was negative. He stated Linux will be able to handle any workload.

DW appliances are getting popular (several available now from Teradata) because nobody wants to lose their job due to a misconfiguration of a machine and there is one vendor to call with problems.

Technorati tags: Teradata, Teradata Partners, business intelligence, data warehouse

Posted October 15, 2008 11:29 PM
Permalink | No Comments |

Lance Armstrong gave the keynote this morning. He spoke just before embarking on his daily 5-hour bike ride, practicing for the next Tour de France! (I immediately quit complaining about the half-mile walk between room and conference at Mandalay Bay.) He was very conversational and open about his cancer and how his riding helps create awareness around this major killer. It was definitely an inspiring story. I don't track cycling much, but he is one of the people I have great admiration for.

Good luck in France Lance and great job Partners for bringing us Lance.

Technorati tags: Teradata, Teradata Partners, Lance Armstrong

Posted October 15, 2008 11:07 PM
Permalink | No Comments |

I’m here at the Teradata Partners conference in Las Vegas. As always, it has been a very educational and fun conference – extremely well done as always. This is my 6th or 7th and they’re still very vibrant. Though it wasn’t the first platform I deployed a data warehouse on, my first deployment with Teradata was where I believe I received an educational foundation in best practices. I’ll share a few observations from my conference experience in some entries.

My first session was a presentation from Gartner analyst Gareth Herschel on “Emerging Insights for Customer Segmentation – Attitudes, not zip codes.” He talked about a 360 degree view of the customer, but mentioned that 350 degrees (or so) may be good enough. He also said there may be value in spending some extra money to get the same data from multiple sources and compare. I’ve come to some of the same conclusions in my work with syndicated data, which I’ll be sharing in my syndicated data presentation. I’ll post my speaking engagements here as they get close.

Gareth also suggested a "working backwards" (my words) strategy, moving from sales & marketing and service & support to what data you get (through buying, collecting or integrating). Susan Sarandon, the actress, once said that for every scene she does, at the end she wants the audience to know more about her character and the movie. Likewise, we should use every interaction with the customer to develop a relationship and make them more satisfied and/or more profitable.

He had some interesting statistics about “enterprise initiated, marketing driven” approaches versus “customer triggered” and “customer initiated” approaches. He quoted 1-5% response for the first, 5x that for the second and 10x that for the third approach. In other words, make those customers initiate! Teradata has solutions across all 3 ways.

He shared that Zappos (online shoe store) has a policy to offer every new employee $2000 to WALK AWAY from the company after 2 weeks of training, no questions asked. Why? The training is 6 weeks in total and they’d rather not do the additional training and starting of the employee in the job if it’s not right for them. I forget the context to the session topic, but that was interesting by itself.

Next, he suggested we break down the decision making process and ask if you have the right tools. He also talked about customer deciling, which I found interesting (and speak on in my Understanding MDM and the Benefits talk). You don’t hear about that topic enough.
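Customer deciling itself is simple to sketch. A minimal Python version, assuming a customer-to-value mapping, ranks customers and assigns decile 1 to the top 10% by value:

```python
def decile(customers_by_value):
    """Assign each customer a decile: 1 = top 10% by value, 10 = bottom 10%.
    customers_by_value: dict of customer id -> numeric value."""
    ranked = sorted(customers_by_value.items(),
                    key=lambda kv: kv[1], reverse=True)
    n = len(ranked)
    # Integer arithmetic splits the ranked list into ten equal buckets.
    return {cid: (i * 10 // n) + 1 for i, (cid, _) in enumerate(ranked)}
```

The point of deciling is operational: treatment strategies (retention offers, service tiers) attach to deciles rather than to individual scores, which is why it deserves more airtime than it gets.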

Finally, he talked about the different ways of influence: "guru" (we only know what they're saying, not who's listening), peer, friend (we're unsure how to get this information today) and family (maybe we can know this). He warned against placing too much emphasis on blogs – use them only to IDENTIFY issues, not to quantify them. So, with the realization that this entry is not quantifying an issue, I wrap up this entry.

Technorati tags: Teradata, Teradata Partners, customer segmentation

Posted October 15, 2008 10:44 PM
Permalink | No Comments |

Acquisitions are all around us these days, both through the natural movements of the market and through activity made increasingly necessary by the business downturn. This can double, or more, the number of customer records that need to be managed. Duplicates will undoubtedly be a challenge in this process, and it would be unheard of for this situation not to create data conversion issues.

It used to be that I would advocate IT be a part of due diligence in M&A. With the urgent nature of much of the recent M&A activity, that is not happening. IT must assess the master data issues in an M&A and take appropriate action. Just getting the application layer together (i.e., by integrating ERP) is not enough. The data layer is equally, if not more, important to enable answers to questions like these (all for the combined entity):

Who are the customers?
Who are the most/least profitable customers?
What customers are shared by the pre-merge companies?
How do I reform my sales staff to address the customers?
What suppliers are common to the pre-merge companies and what is the total spend with them?
What is my total exposure to each customer and supplier?
How do I reform my vendor management?

Some will turn to a "neutral" source such as D&B for keying the customers at this point; other times it's appropriate to form a new surrogate key for the customers. Either way, physical co-habitation inside a database and true integration of the customer lists are a must, and M&A is a good time for MDM.
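The surrogate-key approach can be crudely sketched in Python. The normalization rules below are invented placeholders for what a real matching engine (or a D&B D-U-N-S key) does far more robustly:

```python
import re

def normalize(name):
    """Crude name normalization for matching: lowercase, strip
    punctuation, and drop common corporate suffixes. A real matching
    engine uses far richer rules than this sketch."""
    name = re.sub(r"[^a-z0-9 ]", "", name.lower())
    for suffix in (" inc", " corp", " llc", " ltd"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name.strip()

def merge_customers(list_a, list_b):
    """Merge two pre-merger customer lists, assigning one surrogate key
    per matched (normalized) name across both companies."""
    keys, next_key = {}, 1
    for name in list_a + list_b:
        norm = normalize(name)
        if norm not in keys:
            keys[norm] = next_key
            next_key += 1
    return keys
```

With a shared key in place, the combined-entity questions above (shared customers, total exposure, total supplier spend) become straightforward joins instead of guesswork.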

Posted October 8, 2008 11:45 AM
Permalink | No Comments |

I'll be speaking on "Incorporating Syndicated Data Into Your Information Management Environment" at the Business Rules Forum in Orlando, FL on October 28.

Information Management is increasingly turning its focus to input that comes from outside of the company. These days, we can get much of the information we need through a reverse-append with a syndicated data provider. However, data providers make generalizations to balance the breadth and quality of the data, making this an imperfect science. This seminar will help you consider the tradeoffs (in breadth, depth and accuracy) and how to architect for and wisely use external data.


Technorati tags: Syndicated data, Business Rules Forum

Posted October 7, 2008 12:46 PM
Permalink | No Comments |

Two concepts that must go together are Master Data Management and Data Quality. One reason why: no matter that you are calling it MDM, which is supposed to carry some cachet, to many people in the organization it's just another place where customer (or other subject area) master data is going to be maintained. In this "free market" where the MDM data store is about the 25th such store to hold master data, it had better be the best store for master data.

It must be up-to-date, able to take on syndicated data, and void of all intolerable defects across the spectrum of referential integrity, uniqueness, cardinality, subtype and supertype rules, value reasonability, consistency, formatting, data derivation, completeness, correctness and conformance to a clean set of values.
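A few of those rules can be expressed as executable checks. This is a minimal Python illustration; the field names and the valid status set are hypothetical:

```python
def check_master_records(records):
    """Run a handful of master data quality rules and return violations
    as (rule, customer_id) pairs. Only a sketch of a real DQ engine."""
    violations = []
    seen_ids = set()
    for rec in records:
        # Uniqueness: a customer_id must not repeat across the master.
        if rec["customer_id"] in seen_ids:
            violations.append(("duplicate_id", rec["customer_id"]))
        seen_ids.add(rec["customer_id"])
        # Completeness: required attributes must be populated.
        if not rec.get("name"):
            violations.append(("missing_name", rec["customer_id"]))
        # Conformance to a clean set of values.
        if rec.get("status") not in {"active", "inactive", "prospect"}:
            violations.append(("bad_status", rec["customer_id"]))
    return violations
```

Running checks like these on every load, before propagation, is what keeps the hub the "best store" rather than the 25th-best.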

Poor data quality in MDM, the most leverageable of the master data stores and the place from which master data will propagate throughout the organization, will not provide a foundation that management will support, and it will hurt the project more than just about anything.

On the positive side, if the MDM hub can provide and propagate high-quality master data, that will almost surely provide a unique, high-value proposition to the organization.

Posted October 6, 2008 3:53 PM
Permalink | No Comments |

I'll be speaking with Carolina Posada of Commerzbank in "Case Study: Top 10 Mistakes in Forming Enterprise Data Governance" at the MDM Fall Summit in New York, NY on October 20 at 2:30 ET.

It is increasingly apparent that information management is an essential competitive arena in which the company must excel. Business participation comes in the form of data governance. Many have taken strides to forge data governance over one system, but heterogeneous, disparate applications with overlapping data focus are a reality that governance must address. Get pragmatic, explicit advice to forge effective governance that transcends the pitfalls in positioning information as a competitive asset, determining the contributions of each level of the business to data governance, aligning governance with business strategy and building a business case for data governance.


Technorati tags: Master Data Management, MDM Summit, Business Intelligence

Posted October 5, 2008 7:37 PM
Permalink | No Comments |
