Component Interaction Diagram
Whether a component relationship diagram represents a good component model can be assessed through use-case-driven validation with component interaction diagrams. In subsequent chapters, you will see many more component interaction diagrams for various use cases. In this section, however, we start with a basic use case in a business-to-consumer (B2C) scenario where we would like to add influencer scores to the customer master data records in the Master Data Hub. To determine influencer ratings, we need to analyze information coming from sources such as videos on YouTube, blogs, wikis, forums, posts on Facebook and Twitter, and so on. Analyzing these sources requires a broad range of analytics such as voice-to-text or video analytics on the raw data, followed by deeper analytics on topic and sentiment detection. These analytics may be implemented as a sequence of MapReduce jobs based on a social analytics library. The output of these jobs is written to a well-modeled social media mart where predictive analytics are deployed to determine influencer scores. In addition, the data derived from social media can also be used for matching social personas to customer records in the Master Data Hub. Figure 4.8 shows the end-to-end component interaction diagram for this use case.
Figure 4.8 Component interaction diagram for enriching master data with influencer scores
- Step 1. In this step, connectors consume data feeds from the new information sources such as Facebook and Twitter. Depending on software selection, the connectors might be part of the information virtualization component or the MapReduce engine. For monitoring ongoing marketing campaigns, events, or 1:1 customer engagement opportunities, information is accumulated frequently, usually every few minutes. In some situations, for example where customer service is integrated with social media, updates may arrive continuously.
- Step 2. In this step, the raw data is loaded into the MapReduce engine.
- Step 3. Configured as a series of analytical operations, the MapReduce engine executes a broad range of analytics. Initially, “document-centric” analytics strive to identify concepts, social personas, authoring location, demographics, behavioral patterns, and sentiment. Subsequently, topic detection analytics are applied, including correlation and assignment of individual mentions of concepts to topics. Leading MapReduce engines can execute this analytical sequence in a matter of minutes.
- Steps 4 and 5. After the MapReduce engine finishes, the results are moved to a social media mart. Depending on the database used for the social media mart as well as the software for the MapReduce engine, this could be as simple as a flat file produced by the MapReduce engine that gets loaded through a bulk load interface into the database. Alternatively, the flat file might be loaded with ETL software (the consolidation engine in the information virtualization component), which can also restructure and enhance the information as it is loaded.
- Step 6. Using the matching capability of the Master Data Hub,³ we can now determine which social persona might correspond to a customer record. Some social personas might have no match in the Master Data Hub, and conversely a single customer might use different social personas on different social platforms.
- Step 7. The outcome of the matching task can be visualized with reports using the reporting component.
- Step 8. Using predictive analytics with appropriate models and scoring functions, we can compute influencer scores.
- Steps 9 and 10. The influencer scores can be moved from the social media mart to the Master Data Hub using capabilities such as ETL (part of the consolidation functionality) from the information virtualization layer. As a result, the master data records in the Master Data Hub are enriched with the influencer scores from the social media platforms.
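The data flow of steps 3 through 10 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the post records, the `Customer` class, the name-similarity matching, and the "average reshares per post" score are hypothetical stand-ins for the MapReduce engine, the matching capability of the Master Data Hub, and the predictive models, none of which are specified here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical raw feed records as delivered by the connectors (steps 1 and 2).
RAW_POSTS = [
    {"persona": "@ann_lee", "name": "Ann Lee", "reshares": 40},
    {"persona": "@ann_lee", "name": "Ann Lee", "reshares": 25},
    {"persona": "@bobk", "name": "Bob King", "reshares": 3},
    {"persona": "@zed", "name": "Zed Q", "reshares": 1},
]


@dataclass
class Customer:
    """Hypothetical, heavily simplified master data record."""
    customer_id: str
    name: str
    influencer_score: Optional[float] = None


def map_phase(posts):
    """Step 3 (map): emit one (persona, (name, reshares, 1)) pair per post."""
    for post in posts:
        yield post["persona"], (post["name"], post["reshares"], 1)


def reduce_phase(pairs):
    """Step 3 (reduce): aggregate per persona into social media mart rows."""
    mart = {}
    for persona, (name, reshares, count) in pairs:
        row = mart.setdefault(persona, {"name": name, "reshares": 0, "posts": 0})
        row["reshares"] += reshares
        row["posts"] += count
    return mart


def match_persona(row, customers, threshold=0.85):
    """Step 6: naive matching on name similarity (assumed threshold)."""
    best, best_ratio = None, 0.0
    for customer in customers:
        ratio = SequenceMatcher(None, row["name"].lower(), customer.name.lower()).ratio()
        if ratio > best_ratio:
            best, best_ratio = customer, ratio
    return best if best_ratio >= threshold else None


def influencer_score(row):
    """Step 8: placeholder scoring function, average reshares per post."""
    return row["reshares"] / row["posts"]


def enrich(customers, mart):
    """Steps 9 and 10: write scores to matched records, return unmatched personas."""
    unmatched = []
    for persona, row in mart.items():
        customer = match_persona(row, customers)
        if customer is None:
            unmatched.append(persona)
        else:
            customer.influencer_score = influencer_score(row)
    return unmatched


customers = [Customer("C1", "Ann Lee"), Customer("C2", "Robert King")]
mart = reduce_phase(map_phase(RAW_POSTS))
unmatched = enrich(customers, mart)
print(customers, unmatched)
```

In this sketch the persona "@bobk" stays unmatched because the name on the social platform differs too much from the name on the customer record, which is the kind of outcome the reports in step 7 would surface. A real matching engine would typically compare many more attributes than a display name.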
In addition to these core steps of the use case, there are further options to consider. Influencer scores might change over time from a more coarse-grained to a more fine-grained level or vice versa. Influencer categories based on lower and upper thresholds might be defined through reference values. If the influencer score analytics suggest a change in these categories, the corresponding reference value sets might require an update in the Reference Data Hub (step 11). Social media sources might also contain opportunities to enrich master data beyond the influencer scores. For example, if a match is found for a social persona, there might be pictures or documents available from the social media that could be persisted in a Content Hub (step 12) and linked as unstructured master data to the master data record in the Master Data Hub. There might also be additional contact, address, or other demographic information that, after cleansing through appropriate data quality services (step 13), could be added to the matched master data records as well.
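To make the idea of threshold-based influencer categories concrete, the following Python sketch treats a reference value set as a sorted list of lower thresholds and category names. The threshold values, the category names, and the `categorize` function are hypothetical; the actual reference values would be maintained in the Reference Data Hub.

```python
from bisect import bisect_right

# Hypothetical coarse-grained reference value set: (lower threshold, category),
# sorted by threshold. Each category ends where the next one begins.
COARSE = [(0.0, "low"), (10.0, "medium"), (30.0, "high")]

# Hypothetical fine-grained set that might replace it after an update (step 11).
FINE = [(0.0, "low"), (10.0, "medium"), (30.0, "high"), (60.0, "very high")]


def categorize(score, reference_values):
    """Return the category whose lower threshold is the largest one <= score."""
    lowers = [lower for lower, _ in reference_values]
    index = bisect_right(lowers, score) - 1
    if index < 0:
        raise ValueError(f"score {score} is below the lowest threshold")
    return reference_values[index][1]


print(categorize(32.5, COARSE))  # high
print(categorize(75.0, COARSE))  # high
print(categorize(75.0, FINE))    # very high
```

Keeping the thresholds as data rather than as code means that a change from a coarse-grained to a fine-grained categorization only requires a new reference value set, not a change to the scoring logic.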