In a previous post, I mentioned that one of the impacts of implementing a master data management system was removing some or all of the data integration responsibilities from the data warehouse. Since the data warehouse is the usual place for data integration, in particular customer data integration, I would like to explain myself and see if you agree, or at least see my point of view.

What I mean is that in the big picture of enterprise data management, there are data applications that produce data and data applications that consume data. In a lot of cases, a data application can (and must) play both roles. I make the distinction here to show one of the great promises of master data management: less duplicated work in the software processes that transform data between the applications that produce it and the applications that consume it. The following diagram shows what can happen if several data consumer applications (right side) must connect directly to several data producing applications (left side) to get (and transform) the data they need.

Not a Pretty Picture.

As you can see, it can get pretty messy, pretty quickly. Approaches such as SOA and EAI exist in the IT world to try to reduce this complexity, and some do a reasonable job, but in the end more is needed to really remove the duplication of effort. This is where master data management (MDM) comes in. If we place an MDM hub in the middle of the above diagram, to act as a level of abstraction between data producers and data consumers, we can significantly simplify the picture and remove the redundant transformations: each producer and each consumer connects to the hub once, rather than to every application on the other side.

Much better.
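As a rough back-of-the-envelope sketch of the two pictures, assume the worst case in which every consumer needs data from every producer. The function names and the counts of 4 producers and 5 consumers are made up for illustration; real landscapes are rarely fully connected.

```python
def point_to_point_interfaces(producers: int, consumers: int) -> int:
    """Worst case without a hub: every consumer connects to every producer."""
    return producers * consumers


def hub_interfaces(producers: int, consumers: int) -> int:
    """With a hub: each application connects to the hub exactly once."""
    return producers + consumers


# Illustrative numbers only, not taken from any real system.
producers, consumers = 4, 5
print(point_to_point_interfaces(producers, consumers))  # 20
print(hub_interfaces(producers, consumers))  # 9
```

The gap widens as applications are added: the first count grows with the product of the two sides, the second only with their sum.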

The MDM hub works most effectively when there is an actual data store implemented within it. The efficiency arises from being able to separate source-system specific data transformations from source-system independent data transformations. The hub enables this by maintaining, at its core, an enterprise-level business entity view of the data held in the department-level (purpose-specific) data applications. This business view is populated from the source systems’ data using source-system specific transformations, which map the data into a data model that represents how the business sees the data entities, rather than how the purpose-specific applications see them. Once the data is in this enterprise-level business view, the enterprise as a whole can understand it and use it by transforming it further into purpose-specific data (e.g. data to improve marketing, customer care, or logistics efficiency).

The beauty of the whole system is that the end users of the data do not need to worry about directly transforming the source system data into their own purpose-specific data, especially when they might need the same type of data from multiple source systems (each with its own way of storing that information). They only need to go to one place, where the structure of the data is oriented to how they see the data, and where the data means the same thing to everyone (enterprise level). They are also isolated from the changes underneath: if a source system is replaced, decommissioned, or radically changed, the impact is kept to a minimum. The only changes required are to the source-system specific transformations, and nothing else.
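To make the two kinds of transformation concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Customer` entity, the imagined CRM and billing source systems, and their field names are hypothetical, and a real MDM hub would also match and merge records, which is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical enterprise-level view of a customer, as held in the hub."""
    full_name: str
    email: str


# Source-system specific transformations: one per source system.
def from_crm(record: dict) -> Customer:
    """Map a record from an imagined CRM application (field names assumed)."""
    return Customer(
        full_name=f"{record['first_name']} {record['last_name']}",
        email=record["email"].strip().lower(),
    )


def from_billing(record: dict) -> Customer:
    """Map a record from an imagined billing application (field names assumed)."""
    return Customer(
        full_name=record["ACCT_NAME"].title(),
        email=record["CONTACT_EMAIL"].strip().lower(),
    )


# Source-system independent transformations: written once, against Customer.
def to_marketing_contact(customer: Customer) -> dict:
    """Purpose-specific view for a marketing consumer."""
    return {"name": customer.full_name, "email": customer.email}


def to_warehouse_row(customer: Customer) -> tuple:
    """Purpose-specific view for a data warehouse load."""
    domain = customer.email.split("@")[1]
    return (customer.email, customer.full_name, domain)


crm_customer = from_crm(
    {"first_name": "Jane", "last_name": "Doe", "email": " Jane.Doe@Example.com "}
)
billing_customer = from_billing(
    {"ACCT_NAME": "JANE DOE", "CONTACT_EMAIL": "jane.doe@example.com"}
)

print(crm_customer == billing_customer)
print(to_marketing_contact(billing_customer))
```

In this sketch, replacing the billing application would mean rewriting only `from_billing`; `to_marketing_contact` and `to_warehouse_row` never see the source layout, so they stay as they are.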

To get back to my original argument, I believe that an MDM system is mandatory for the efficient operation of any medium to large-sized organization, and that it gets more important as the enterprise gets bigger. I also like to keep the main roles of the data managers separate from one another (at least at a personnel level, if not at an organizational structure level). That is, I prefer to have those individuals who use the data to run the business (data consumer enterprise roles) focused on using the data, and those individuals who are responsible for producing the data focused on the applications that produce it, regardless of where they fit into the organization’s operational structure. Therefore, I maintain that those individuals who are responsible for the MDM system should focus on doing that to the best of their abilities, and those who use the MDM system, such as those who would use it as the main source for the data warehouse, should focus on building and maintaining the data warehouse.

So, at the end of the day, much (possibly the bulk) of your data integration would happen in the MDM system, using source-system specific transformations. This would allow the data warehouse to focus on generating analytical data for downstream use, without the added burden of being responsible for the data integration as well. Like many other downstream applications, it would only have to worry about source-system independent transformations.


© 2009 Business Intelligence Review