-
MIT Information Quality Symposium Day 2
Posted on July 17th, 2009
With Day 2 of the MIT IQIS complete, I thought it would be good to write up another summary. I was very impressed with the quality of speakers and their dedication to the field of Information Quality. The work shows a lot of innovative thinking and pride. (I’ll add in links and update later today)
Robert Grossman – Information Quality in the Cloud
Bob is part of the Open Cloud Consortium and passionate about the topic. He presented everything you need to know to understand where Cloud Computing is today, where it’s going next (based on open debate among dueling standards boards), and how it affects Information Quality discussions. He has a unique ability to take very complex topics and break them down into simple conversations.
The most interesting part for me was defining Public, Community and Private Clouds, which I couldn’t have described before this talk. I also appreciated his comment that Cloud is the only way to analyze 100TB of data, and that the alternative is to merely entomb it.
Delphine Clement - Cost of Non Quality Data
Delphine is from HP in France and discussed how they have approached their KQIs (Key Quality Indicators). I like that KQIs mirror KPIs while staying separate from them, since Information Quality reporting is about metadata rather than business metrics. Delphine also presented a methodology for measuring direct vs. indirect cost savings from Data Quality initiatives. She has clearly spent a lot of time working on this approach and is doing a great job. I really enjoyed this presentation.
Lyn Robison - Diagnosing IT’s Impact on the Business
Lyn, from The Burton Group, has a theory on how to measure data quality from an IT perspective, but I thought it was very pie in the sky. There were lots of questions about the politics of such an effort, and I don’t think the approach was practical. For instance, if your measured data quality metrics turn out to be poor, the IT organization will blame the business. There’s no way this could work politically.
I liked that Lyn tried to compare the business people’s perception of Data Maturity vs. the IT perception, but how do you align IT perception and Business perception? Someone also asked, should IT be measured on poor data quality? The answer: Not if the Business owns the data.
Steve Sarsfield - Using Data Quality Scores to Sell IQ Value
Steve echoed others who encouraged Information Quality progress by “Leveraging a Crisis” to build momentum. He also asked us to present the “Do Nothing” approach, i.e. present to our management what would happen if they ignored the problem. Steve’s scoring method was based on the Trillium TS Insight product, but appeared to be a practical way to measure Data Quality. I think some of this can be done easily with or without Trillium, but I appreciated how the tool can manage the measurements over time.
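As a rough illustration of how a simple score could be tracked over time without a dedicated tool, here is a minimal Python sketch. It is not Steve's method and has nothing to do with how Trillium TS Insight works: the `completeness_score` helper, the field names, and the sample records are all made up for illustration, and completeness is only one of many possible Data Quality measures.

```python
from datetime import date


def completeness_score(records, required_fields):
    """Return the percentage of required fields that are populated."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        # Treat None, empty strings, and whitespace-only values as missing.
        if str(record.get(field) or "").strip()
    )
    return round(100.0 * filled / total, 1)


# Hypothetical monthly snapshots of the same two CRM records.
snapshots = {
    date(2009, 6, 1): [
        {"name": "Acme", "email": "", "phone": "555-0100"},
        {"name": "Globex", "email": "info@globex.example", "phone": None},
    ],
    date(2009, 7, 1): [
        {"name": "Acme", "email": "sales@acme.example", "phone": "555-0100"},
        {"name": "Globex", "email": "info@globex.example", "phone": None},
    ],
}

REQUIRED = ["name", "email", "phone"]

# One score per snapshot date, so the trend can be reported over time.
history = {day: completeness_score(rows, REQUIRED) for day, rows in snapshots.items()}

for day in sorted(history):
    print(day.isoformat(), history[day])
```

Keeping one score per snapshot date is what makes the trend visible, which is the part I appreciated about having a tool manage the measurements.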
Marillo Boccia – Data Quality in the Media Industry
Marillo is the Director of Database Marketing at Grupo Abril, the largest publisher in the Southern Hemisphere. He presented a project (done with the help of service provider Assesso) where his team personalized magazine ads for Banco Itaú to 1.2 million subscribers. Cool stuff. They merged their subscriber database with the bank’s and did a massive customer data cleanup to ensure very high data quality. They amazed their customers in the process.
Dan Defend and Aparna Vani - Data Quality Challenges for Yahoo’s Massive Data Environment
Dan and Aparna presented the Data Quality and Analytics sides respectively. They monitor website interaction and uncover trending and outage information by analyzing a constant flow of clickstream data. Their group deals with duplication challenges, security issues, and the need to report outage alerts instantly. Their work was also driven by past MIT IQIS conferences, and they presented their practical approach to establishing a central data quality process and framework.
-
MIT Information Quality Industry Symposium Day 1
Posted on July 16th, 2009
I’m just settling in for Day 2 of the MIT IQIS 2009 and thought I’d throw out some thoughts for a couple of future posts I’m drafting. Here are the quick recaps from yesterday.
Danette McGilvray - Ten Steps to Data and Trusted Information
A great primer on how to manage any Data Quality project. Her 10 steps made a lot of sense. Not all of the methodology would be used for any given project, but still it worked for me. I also won her book, “Executing Data Quality Projects,” in the drawing at the end of the class.
Bill Inmon - DW2.0 and Unstructured Data
After 10+ years in Data Warehousing I finally got to see Bill Inmon speak. Bill is the rockstar of the DW world. He’s regarded as the Father of Data Warehousing and treated as royalty at a conference like this. His new stuff was all about contextual ETL. Sounded interesting, but I believe there are others working on the same thing.
Keynote: Ronald Bechtold - Transforming the Army with High
Ronald is the Chief Data Officer at the Army. Cool title. Not what you’re picturing. He is a passionate CIO type who has a huge challenge. Definitely some words of wisdom in there. “Focus on solving problems,” rather than tools, technology or data. Good stuff!
Joe Bugajski - MDM Improves Information Quality to Deliver Value
Joe had some great examples where Data Quality actually led to increased revenue. Imagine that! Value from Data work. I think that’s what we’re all striving for. Joe is a big personality who speaks well, so this one was entertaining.
Mark Goloboy (that’s right, me) - CRM Data Quality for Sales and Marketing
After a bit of nerves, I found my groove and thought the presentation went really well. Some good questions about where my company started with Data Governance - it’s a very new focus for us. I also got to push back on some industry experts when asked why we weren’t focusing on MDM to start. Plus, Bill Inmon attended.
Martin Boyd - Product Data Quality Product from Silver Creek
More contextual analysis. Seemed to be done in a very smart way. The software was functional at big clients, and they had figured out how to solve some complex issues around improving poor product data. If they had the same thing for Customers, it would be a more interesting product. More development or a merger is needed here.
-
DEBATE: How should data governance and data quality work together?
Posted on June 30th, 2009
I added a comment to a fun debate on the Data Quality Pro site. The question was around the interaction between Data Governance and Data Quality. Most people agreed they were connected, but I think some people are living in the details of Data Governance. It has so much more potential than just fixing data models. My thoughts…
I think people are on the right track linking Data Governance and Data Quality. No need to rehash above. My one input would be that Data Quality should be the measuring stick for the success of Data Governance programs.
I do feel that some people align Data Governance too closely with data modeling and data object definition. There are examples in the comments above. Sure, clean data is the end goal, but Data Governance is the journey to get there. Data Governance is more of a cultural shift starting with 1) Aligning business strategy toward a common goal; 2) Building definitions, re-engineering processes & updating systems to represent those aligned business strategies; 3) Defining data objects to store the common definitions; and finally 4) Measuring success based on quantitative analysis of Data Quality. Without that, Data Governance becomes a theoretical data modeling exercise, and all of our work is minimized. I’m sure this is more of a top-down approach than the classical definition, but it’s where Data Governance will go next.
-
Lightweight Data Governance
Posted on May 11th, 2009
Last week I read a great article from First Spike on the upcoming demand for Data Governance work. The author referenced several sources who predicted a sharp rise in demand for Data Governance. One even predicted that it will be a regulatory requirement. I followed up with Mark Cowan from First Spike last week to discuss our definitions of Data Governance.
Mark was very interested in what a Data Governance program looked like at an Internet company. His point was that it’s more typical to see Data Governance in the Health Care and Financial Services industries. That makes sense since those types of organizations are more likely to have higher data quality standards and regulatory requirements. Without going into too many specifics, I let him know my approach on lightweight Data Governance. I think it’s something that I’ll continue to explore, and develop further. I had never articulated it that way before, but it sums up my theory well.
We got to talking about structured vs. unstructured data, and approaches for dealing with each. Lightweight Data Governance is very much unstructured Data Governance. Rather than building formalized organizations to manage data governance and large scale Master Data Management solutions, my approach has been to improve existing infrastructure, systems and processes piece by piece.
This approach can lead to early success in Data Governance programs, backing from colleagues in other departments and an understanding of the value that Data Governance can bring to an organization. It also eliminates some of the arguments from critics regarding high program start up costs, number of dedicated resources, etc. I would highly recommend it as a starting point.
Conversely, most existing theory is based on top-down, large-scale Data Governance. I’ve attended webinars that promote getting buy-in from the CEO down for Data Governance programs. To paraphrase, “Without executive support, Data Governance programs cannot succeed.” I think it’s critical to make some early progress in a new Data Governance program, and get mid-level support. The Directors and VPs who own not only business usage, but also the data and reporting technology, need to understand the value of Data Governance. Many do already. If you partner with those leads, then executive support will be there when you need it. That’s my theory at least.
-
Letting B2B Data Die or: How I Learned to Stop Worrying and Ignore the Problem
Posted on May 7th, 2009
I had a conversation with a colleague from another company recently. They mentioned a data quality project to clean up old B2B CRM Data. I had to stop and ask, “Why?” This wasn’t just old data, but really old pre-acquisition data from another source system.
We discussed further and I found that the data in question was sales data from 2006. When it was migrated, the database keys were butchered, and many of the relationships were lost. It was going to take four FTE resources over 6 months to fix the problem. In a bad economy, I just couldn’t justify why any company would spend that effort. My suggestion… Dump the data from the reporting tables. That’s right. Delete it. Archive it for audit purposes, but get it out of the way and focus on today’s problems.
My colleague was shocked. Here I am, a Data Governance expert and Data Quality evangelist telling them to ignore the problem. My reasoning? You need to constantly prioritize your projects. If something is more valuable in terms of revenue, or presents a greater risk, or is a bigger pain to more people, fix that first! Don’t dwell on perfection. You’ll never get there. Just try to make the most improvements you can, as quickly as you can. Grab the low hanging fruit rather than re-planting the tree.
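To make that prioritization a bit more concrete, here is a small Python sketch of one way to rank competing cleanup projects. Everything in it is an assumption for illustration: the 1-to-5 scales, the equal weighting of revenue, risk and pain, the `priority` formula, and the two comparison projects are invented. Only the 24 person-months for the 2006 data (four FTEs over 6 months) comes from the conversation above.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    revenue_impact: int    # 1 (low) to 5 (high), hypothetical scale
    risk: int              # 1 (low) to 5 (high), hypothetical scale
    people_affected: int   # 1 (few) to 5 (many), hypothetical scale
    effort_months: float   # person-months; must be greater than zero


def priority(project):
    """Value per person-month: a higher result means fix it sooner."""
    value = project.revenue_impact + project.risk + project.people_affected
    return value / project.effort_months


projects = [
    Project("Repair 2006 pre-acquisition sales data", 1, 2, 1, 24.0),
    Project("De-duplicate current CRM leads", 4, 3, 4, 3.0),
    Project("Validate emails on the signup form", 3, 2, 3, 1.0),
]

# Sort so the best value for the effort comes first.
ranked = sorted(projects, key=priority, reverse=True)

for project in ranked:
    print(f"{priority(project):5.2f}  {project.name}")
```

With these made-up numbers the old migration repair lands at the bottom of the list, which is the whole point: the low-hanging fruit wins.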
If this offends your data quality sensibilities, please comment. I’m curious to know whether my opinion resonates.


