Commentary on Data Governance, Marketing Technology and Web Analytics.
  • Five New Ideas From 2010 MIT Information Quality Industry Symposium

    Posted on July 15th, 2010 goloboym 1 comment

    Here are some quick thoughts from the first day of the MIT Information Quality Industry Symposium. It’s my favorite event of the year. I refer to it as the “anti-boondoggle.” All academic theory and very little vendor fluff. I suppose that’s what you get when MIT and the University of Arkansas organize events. I’ll either post another top 5 tomorrow, or a full recap.

    Please comment if you’d like me to dive further into any of these topics.

    1) Cloud Is No Longer The Focus

    Last year everyone talked about Governance in the cloud. This year it’s dead. Why? I think it may be that this group, unlike the Sales 2.0 crowd, is focused on Enterprise-scale monolithic systems. Last year at MITIQIS, many presentations focused on the cloud’s impact on large-scale information quality programs. This year, it’s all about internal, installed systems. I find this fascinating. Did this group try cloud and not see the value? Or is there still a duality of ideologies: one that prefers to keep things internal, and a second that wants to move its IT responsibility to SaaS apps?

    2) Master Data Management (MDM) Isn’t The Only Solution

    I was surprised that among the Information Quality vendors and practitioners, MDM was no longer the focus. Joe Bugajski focused on it, but others merely touched on how they would interact with MDM rather than focus on MDM as the central system in an Information Quality focused environment. This year, many people talked about Information Quality at the system level, and fixing business process and human interfaces to eliminate dirty data at the source. This reminded me of the Data Warehouse to Data Mart paradigm shift of 10-15 years ago. I just felt old writing that.

    3) Data Quality is a Dirty Word

    “Information Quality” is now in vogue. I was corrected several times in conversations when I mentioned data quality. This is somewhere between a more highbrow way of marketing ourselves and snobbery. I don’t think this matters in the least, but others believe it’s more accurate and lends more credibility to our practice. As you’ll notice throughout my writing, I heavily resist the practice of pluralizing the word data. I never write “data are,” even though I believe it is grammatically accurate. I feel the same way here. I do “Data Quality” work, regardless of who says that term is wrong. All right… I’ll use it in this post and try it on for size. This is the Information Quality Symposium after all.

    4) Free Sources Drive Down R&D Cost

    Data is available from government sources and tools are available from open source communities. No surprise there, but there was an increased focus on it here at MITIQIS. Why? Talend, an information quality vendor, builds its tools on the back of those open source libraries. They credited various shared data models, methodologies and data sources that allow them to shortcut proprietary R&D spend. Trillium also spoke up, and mentioned that they leverage some of the same open-source thinking in their full-price solutions.

    5) 60-90% of Operational Data is Valueless

    I won’t say worthless, since there is some operational necessity to the transactional systems that created it, but valueless from an analytic perspective. Credit to Kirk Amidon for this insight - he attended the session where this stat was quoted. Similarly, Steve Adler from IBM and others discussed it in their presentations. Data only has value, and is only worth passing through to the Data Warehouse, if it can be directly used for analysis and reporting. No news on that front, but it has become more of a focus now that the proliferation of data is driving up storage spend. That wasn’t discussed at the conference… just my opinion.

  • MIT Information Quality Symposium Day 2

    Posted on July 17th, 2009 goloboym 1 comment

    With Day 2 of the MIT IQIS complete, I thought it would be good to write up another summary. I was very impressed with the quality of speakers and their dedication to the field of Information Quality. The work shows a lot of innovative thinking and pride. (I’ll add in links and update later today)

    Robert Grossman – Information Quality in the Cloud

    Bob is part of the Open Cloud Consortium and passionate about the topic. He presented everything you need to know to understand where Cloud Computing is today, where it’s going next (based on open debate among dueling standards boards), and how it affects Information Quality discussions. He has a unique ability to take very complex topics and break them down into simple conversations.

    The most interesting part for me was defining Public, Community and Private Clouds, which I couldn’t have described before this talk. I also appreciated his comment that Cloud is the only way to analyze 100TB of data, and that the alternative is to merely entomb it.

    Delphine Clement - Cost of Non Quality Data

    Delphine is from HP in France and discussed how they have approached their KQIs (Key Quality Indicators). I like that KQIs mirror KPIs, but because Information Quality is metadata reporting rather than business metrics, they are tracked separately. Delphine also presented a methodology for measuring direct vs. indirect cost savings from Data Quality initiatives. She has clearly spent a lot of time working on this approach and is doing a great job. I really enjoyed this presentation.

    Lyn Robison - Diagnosing IT’s Impact on the Business

    Lyn, from The Burton Group, has a theory on how to measure data quality from an IT perspective, but I thought it was very pie in the sky. There were lots of questions about the politics of such an effort, and I don’t think the approach was practical. For instance, if your measured data quality metrics turn up as poor, the IT organization will blame the business. There’s no way this could work politically.

    I liked that Lyn tried to compare the business people’s perception of Data Maturity vs. the IT perception, but how do you align IT perception and Business perception? Someone also asked, should IT be measured on poor data quality? The answer: Not if the Business owns the data.

    Steve Sarsfield - Using Data Quality Scores to Sell IQ Value

    Steve echoed others who encouraged Information Quality progress by “Leveraging a Crisis” to build momentum. He also asked us to present the “Do Nothing” approach, i.e. present to our management what would happen if they ignored the problem. Steve’s scoring method was based on the Trillium TS Insight product, but appeared to be a practical way to measure Data Quality. I think some of this can be done easily with or without Trillium, but I appreciated how the tool can manage the measurements over time.
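
    To make the “with or without Trillium” point concrete, here is a minimal sketch of rule-based scoring in Python. This is not Trillium’s method or anything Steve showed. The field names, the rules, and the sample records are all hypothetical, chosen only to illustrate the idea of scoring each field by the share of records that pass a check.

```python
import re

# Hypothetical validation rules, one per field. Each returns True when the
# value passes. These are illustrative assumptions, not a vendor's rule set.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

RULES = {
    "email": lambda v: bool(v) and EMAIL_RE.match(v) is not None,
    "country": lambda v: v in {"US", "CA", "GB"},
    "company": lambda v: bool(v and v.strip()),
}


def quality_scores(records):
    """Return, per field, the percentage of records that pass its rule."""
    total = len(records)
    scores = {}
    for field, rule in RULES.items():
        passed = sum(1 for r in records if rule(r.get(field)))
        scores[field] = 100.0 * passed / total if total else 0.0
    return scores


def overall_score(scores):
    """Unweighted average of the per-field scores."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Made-up sample records for demonstration only.
SAMPLE = [
    {"email": "ann@example.com", "country": "US", "company": "Acme"},
    {"email": "bob@example", "country": "US", "company": ""},
    {"email": None, "country": "XX", "company": "Globex"},
    {"email": "cy@example.org", "country": "CA", "company": "Initech"},
]

if __name__ == "__main__":
    scores = quality_scores(SAMPLE)
    print(scores)
    print(round(overall_score(scores), 1))
```

    Run on a schedule and stored with a date, scores like these would give you the trend line over time, which is the part a packaged tool handles for you.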

    Marillo Boccia – Data Quality in the Media Industry

    Marillo is the Director of Database Marketing at Grupo Abril, the largest publisher in the Southern Hemisphere. He presented a project (done with the help of service provider Assesso) where his team personalized magazine ads for Banco Itaú to 1.2 million subscribers. Cool stuff. They merged their subscriber database with the bank’s and did a massive customer data cleanup to ensure very high data quality. They amazed their customers in the process.

    Dan Defend and Aparna Vani - Data Quality Challenges for Yahoo’s Massive Data Environment

    Dan and Aparna presented the Data Quality and Analytics sides respectively. They monitor website interaction and uncover trending and outage information by analyzing a constant flow of clickstream data. Their group deals with duplication challenges, security issues, and the need to report outage alerts instantly. Their work was also driven by past MIT IQIS conferences, and they presented their practical approach to establishing a central data quality process and framework.

  • MIT Information Quality Industry Symposium Day 1

    Posted on July 16th, 2009 goloboym No comments

    I’m just settling in for Day 2 of the MIT IQIS 2009 and thought I’d throw out some thoughts for a couple of future posts I’m drafting. Here are the quick recaps from yesterday. 

    Danette McGilvray - Ten Steps to Data and Trusted Information

    A great primer on how to manage any Data Quality project. Her 10 steps made a lot of sense. Not all of the methodology would be used for any given project, but still it worked for me. I also won her book, “Executing Data Quality Projects,” in the drawing at the end of the class.

    Bill Inmon - DW2.0 and Unstructured Data

    After 10+ years in Data Warehousing I finally got to see Bill Inmon speak. Bill is the rockstar of the DW world. He’s regarded as the Father of Data Warehousing and treated as royalty at a conference like this. His new stuff was all about contextual ETL. Sounded interesting, but I believe there are others working on the same thing.

    Keynote: Ronald Bechtold - Transforming the Army with High

    Ronald is the Chief Data Officer at the Army. Cool title. Not what you’re picturing. He is a passionate CIO type who has a huge challenge. Definitely some words of wisdom in there. “Focus on solving problems,” rather than tools, technology or data. Good stuff!

    Joe Bugajski - MDM Improves Information Quality to Deliver Value

    Joe had some great examples where Data Quality actually led to increased revenue. Imagine that! Value from Data work. I think that’s what we’re all striving for. Joe is a big personality who speaks well, so this one was entertaining.

    Mark Goloboy (that’s right, me) - CRM Data Quality for Sales and Marketing

    After a bit of nerves, I found my groove and thought the presentation went really well. Some good questions about where my company started with Data Governance - it’s a very new focus for us. I also got to push back on some industry experts when asked why we weren’t focusing on MDM to start. Plus, Bill Inmon attended.

    Martin Boyd - Product Data Quality Product from Silver Creek

    More contextual analysis. Seemed to be done in a very smart way. The software was functional at big clients, and they had figured out how to solve some complex issues around improving poor product data. If they had the same thing for Customers, it would be a more interesting product. More development or a merger is needed here.