Data Processing Components
Data processing components include Hadoop, Snowflake, SOLR, and Tableau.
Audience Manager uses the following components to process data:
In Audience Manager, Hadoop is the master database that contains everything Audience Manager knows about a user. For example, when the Profile Cache Servers create log files that contain data about your users, they send that data to Hadoop for storage. Other important Hadoop elements include:
- Hive: A data warehouse for Hadoop. Hive manages ad hoc queries to the data stored in Hadoop.
- HBase: A very large Hadoop database. It processes and manages inbound and outbound data, trait rules, and algorithmic modeling information, and it performs many other functions related to storing data and moving it to different systems.
Customers do not have direct access to these systems. However, customers do work with them indirectly, because these components store important data about their site visitors.
Snowflake is a massive cloud database. It provides data to many of the dashboard graphs and their related text boxes that display the % change for each item in the graph. If you use Audience Manager and look at the dashboard reports, you're interacting with data provided by Snowflake.
The following is not a comprehensive list, but common dashboard reports that rely on data from Snowflake include:
- All the overlap reports (see the Interactive Reports section for information about each overlap report).
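The % change values shown next to the dashboard graphs can be illustrated with a short calculation. The sketch below assumes the conventional definition of percent change between two reporting periods. The source does not state how Audience Manager computes this value, and the function name `percent_change` and its parameters are hypothetical.

```python
from typing import Optional


def percent_change(previous: float, current: float) -> Optional[float]:
    """Percent change from `previous` to `current`.

    Assumes the conventional formula (current - previous) / |previous| * 100.
    This is an illustration only, not Audience Manager's documented method.
    Returns None when `previous` is 0, because there is no baseline to compare against.
    """
    if previous == 0:
        return None
    return (current - previous) / abs(previous) * 100.0
```

For example, under this assumption a trait that had 200 unique users in the previous period and 250 in the current period would show a change of 25%.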
SOLR is an open-source search platform from Apache. It provides robust and fast search capabilities over our large data sets. As an Audience Manager customer, you can see SOLR in action when you build segments, because it provides data to the Estimated Historic Segment Size report. SOLR is ideal for this role because of its speed. For example, SOLR can update the historic size data as you build rules and add new traits to a segment.
Audience Manager uses Tableau to display data in the Interactive Reports and the Audience Optimization Reports. The interactive reports display performance and overlap data for traits and segments. Instead of using numbers arranged in columns and rows, they return data using different shapes, colors, and sizes. Additionally, you can select individual data points or groups of data points and drill down into the report results for more details. These visualization techniques and report interactivity help make large amounts of numeric data easier to understand.