Data Ingestion overview
Adobe Experience Platform brings data from multiple sources together to help marketers better understand the behavior of their customers. Adobe Experience Platform Data Ingestion covers the methods by which Experience Platform ingests data from these sources, as well as how that data is persisted within the Data Lake for use by downstream Experience Platform services.
This document introduces the three main ways in which data is ingested into Experience Platform, with links to their respective overview documentation for more detailed information.
Batch ingestion
Batch ingestion allows you to ingest data into Experience Platform as batch files. Batches are units of data that consist of one or more files to be ingested as a single unit. Once ingested, batches provide metadata that describes the number of records successfully ingested, as well as any failed records and associated error messages.
Manually uploaded data files, such as flat CSV files (mapped to XDM schemas) and Parquet files, must be ingested using this method.
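To make the idea of a batch and its metadata concrete, here is a minimal sketch in Python. It does not call the Experience Platform API; the `FileResult` and `Batch` classes, the field names in the summary, and the sample file names are all hypothetical and only illustrate how per-file outcomes might roll up into batch-level counts of ingested and failed records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileResult:
    """Outcome of ingesting one file in a batch (hypothetical structure)."""
    name: str
    records_ok: int
    errors: List[str] = field(default_factory=list)  # one message per failed record


@dataclass
class Batch:
    """A batch: one or more files ingested as a single unit."""
    batch_id: str
    files: List[FileResult]

    def summary(self) -> dict:
        """Aggregate per-file outcomes into batch-level metadata."""
        failed = [msg for f in self.files for msg in f.errors]
        return {
            "batchId": self.batch_id,
            "fileCount": len(self.files),
            "recordsIngested": sum(f.records_ok for f in self.files),
            "recordsFailed": len(failed),
            "errors": failed,
        }


# Illustrative data only: two files, the second with two failed records.
batch = Batch(
    batch_id="example-batch-001",
    files=[
        FileResult("profiles_part1.csv", records_ok=120),
        FileResult(
            "profiles_part2.csv",
            records_ok=98,
            errors=[
                "row 14: missing required field 'email'",
                "row 57: invalid date in 'birthDate'",
            ],
        ),
    ],
)
report = batch.summary()
print(report["recordsIngested"], report["recordsFailed"])
```

The summary is computed from the files rather than stored, so the counts cannot drift from the underlying per-file results.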
See the batch ingestion overview for more information.
Streaming ingestion
Streaming ingestion allows you to send data from client- and server-side devices to Experience Platform in real time. Experience Platform supports the use of data inlets to stream incoming experience data, which is persisted in streaming-enabled datasets within the Data Lake. Data inlets can be configured to automatically authenticate the data they collect, ensuring that the data is coming from a trusted source.
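The following Python sketch illustrates the behavior described above: an inlet that persists incoming events to a dataset and, when authentication is enabled, rejects events from untrusted senders. It is a simplified model, not the Platform API. The `DataInlet` class, its `send` method, and the shared-token check are assumptions made for illustration; the actual service uses its own authentication mechanism.

```python
import hmac
from typing import List, Optional


class DataInlet:
    """Hypothetical stand-in for a data inlet; not the Platform API."""

    def __init__(self, dataset: List[dict], expected_token: Optional[str] = None):
        self.dataset = dataset                # stands in for a streaming-enabled dataset
        self.expected_token = expected_token  # None means authentication is disabled

    def send(self, event: dict, token: Optional[str] = None) -> bool:
        """Persist the event if the sender is trusted; return whether it was accepted."""
        if self.expected_token is not None:
            if token is None:
                return False
            # Constant-time comparison avoids leaking the token through timing.
            if not hmac.compare_digest(token.encode(), self.expected_token.encode()):
                return False
        self.dataset.append(event)
        return True


events: List[dict] = []
inlet = DataInlet(events, expected_token="example-token")

accepted = inlet.send({"eventType": "pageView"}, token="example-token")
rejected = inlet.send({"eventType": "pageView"}, token="wrong-token")
print(accepted, rejected, len(events))
```

Only the event sent with the matching token reaches the dataset; the other is dropped before it is persisted.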
See the streaming ingestion overview for more information.
Sources
Experience Platform allows you to set up source connections to various data providers. These connections enable you to authenticate to your external data sources, set times for ingestion runs, and manage ingestion throughput.
Source connections can be configured to gather data from other Adobe applications (such as Adobe Analytics and Adobe Audience Manager), third-party cloud storage sources (such as Azure Blob, Amazon S3, FTP servers, and SFTP servers), and third-party CRM systems (such as Microsoft Dynamics and Salesforce).
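As a rough illustration of setting times for ingestion runs, the Python sketch below computes the run times for a fixed-interval schedule. The function name `ingestion_run_times` and its parameters are hypothetical; an actual source connection is configured through the Platform UI or API rather than with code like this.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def ingestion_run_times(start: datetime, interval: timedelta, count: int) -> List[datetime]:
    """Return the first `count` run times of a fixed-interval schedule."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    return [start + i * interval for i in range(count)]


# Illustrative schedule: every six hours, starting at 06:00 UTC.
start = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
runs = ingestion_run_times(start, timedelta(hours=6), 3)
for run in runs:
    print(run.isoformat())
```

Using timezone-aware datetimes keeps the schedule unambiguous when the data provider and the ingesting system sit in different time zones.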
See the Sources overview for more information.
ML-assisted schema creation
To quickly integrate new data sources, you can now use machine learning algorithms to generate a schema from sample data. This automation simplifies the creation of accurate schemas, reduces errors, and speeds up the process from data collection to analysis and insights.
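To show what generating a schema from sample data means in practice, here is a deliberately simple Python sketch. It uses plain rule-based type inference, not machine learning, and is not how the Platform feature works internally. The function names and the type labels (`integer`, `number`, `boolean`, `string`) are assumptions chosen for illustration.

```python
import csv
import io
from typing import Dict, List, Optional


def infer_type(values: List[Optional[str]]) -> str:
    """Pick the narrowest type that fits every non-empty sample value."""
    non_empty = [v for v in values if v]  # skip empty strings and missing cells
    if not non_empty:
        return "string"
    for type_name, parse in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                parse(v)
            return type_name
        except ValueError:
            continue
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    return "string"


def infer_schema(sample_csv: str) -> Dict[str, str]:
    """Map each column of a CSV sample to an inferred type label."""
    rows = list(csv.DictReader(io.StringIO(sample_csv)))
    if not rows:
        return {}
    return {col: infer_type([row[col] for row in rows]) for col in rows[0]}


sample = "id,price,active,name\n1,9.99,true,Ana\n2,12,false,Ben\n"
schema = infer_schema(sample)
print(schema)
```

A heuristic like this only looks at value formats. It cannot tell, for example, that a numeric column is really an identifier, which is the kind of judgment an ML-assisted workflow is meant to help with.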
See the ML-assisted schema creation guide for more information on this workflow.
Next steps and additional resources
This document provided a brief introduction to the different aspects of Data Ingestion in Experience Platform. Continue to the overview documentation for each ingestion method to familiarize yourself with their different capabilities, use cases, and best practices. For information on how Experience Platform tracks the metadata for ingested records, see the Catalog Service overview.