Data Science Workspace overview

Last update: August 14, 2023

Topics:
Data Science Workspace

CREATED FOR:

User
Developer

NOTE

Please note that the presence of documentation about a feature on Experience League does not guarantee its availability to every customer. This feature is only available to existing customers who purchased an Adobe Experience Platform or Adobe Experience Platform Intelligence License. Please refer to the official product description to understand features and other details associated with your purchased SKUs/products.

Adobe Experience Platform Data Science Workspace uses machine learning and artificial intelligence to unleash insights from your data. Integrated into Adobe Experience Platform, Data Science Workspace helps you make predictions using your content and data assets across Adobe solutions.

Data scientists of all skill levels will find sophisticated, easy-to-use tools that support rapid development, training, and tuning of machine learning recipes - all the benefits of AI technology, without the complexity.

With Data Science Workspace, data scientists can easily create intelligent service APIs powered by machine learning. These services work with other Adobe services, including Adobe Target and Adobe Analytics Cloud, to help you automate personalized, targeted digital experiences in web, desktop, and mobile apps.

This guide provides an overview of the key concepts related to Data Science Workspace.

Introduction

Today’s enterprises put a high priority on mining big data for predictions and insights that will help them personalize customer experiences and deliver more value to customers - and to the business.
As important as it is, getting from data to insights can come at a high cost. It typically requires skilled data scientists who conduct intensive and time-consuming data research to develop machine-learning models, or recipes, which power intelligent services. The process is lengthy, the technology is complex, and skilled data scientists can be hard to find.

With Data Science Workspace, Adobe Experience Platform allows you to bring experience-focused AI across the enterprise, streamlining and accelerating data-to-insights-to-code with:

A machine learning framework and runtime
Integrated access to your data stored in Adobe Experience Platform
A unified data schema built on Experience Data Model (XDM)
The computing power essential for machine learning/AI and managing big datasets
Prebuilt machine learning recipes to accelerate the leap into AI-driven experiences
Simplified authoring, reuse, and modification of recipes for data scientists of varied skill levels
Intelligent service publishing and sharing in just a few clicks - without a developer - and monitoring and retraining for continuous optimization of personalized customer experiences

Data scientists of all skill levels will achieve insights faster and more effective digital experiences sooner.

Getting started

Before diving into the details of Data Science Workspace, here is a brief summary of the key terms:

Data Science Workspace: Data Science Workspace within Experience Platform enables customers to create machine learning models that use data across Experience Platform and Adobe solutions to generate intelligent insights and predictions, which in turn power delightful end-user digital experiences.

Artificial Intelligence: Artificial intelligence is the theory and development of computer systems that are able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.

Machine Learning: Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.

Sensei ML Framework: Sensei ML Framework is a unified machine learning framework across Adobe that leverages data on Experience Platform to empower data scientists to develop machine-learning-driven intelligent services in a faster, more scalable, and reusable manner.

Experience Data Model: Experience Data Model (XDM) is the standardization effort led by Adobe to define standard schemas, such as Profile and ExperienceEvent, for Customer Experience Management.

JupyterLab: JupyterLab is an open-source web-based interface for Project Jupyter and is tightly integrated into Experience Platform.

Recipes: A recipe is Adobe’s term for a model specification. It is a top-level container representing a specific machine learning or AI algorithm, or ensemble of algorithms, together with the processing logic and configuration required to build and execute a trained model and so help solve specific business problems.

Model: A model is an instance of a machine learning recipe that is trained using historical data and configurations to solve for a business use case.

Training: Training is the process of learning patterns and insights from labeled data.

Trained Model: A trained model represents the executable output of a model training process, in which a set of training data was applied to the model instance. A trained model maintains a reference to any intelligent web service that is created from it. The trained model is suitable for scoring and for creating an intelligent web service. Modifications to a trained model can be tracked as a new version.

Scoring: Scoring is the process of generating insights from data using a trained model.

Service: A deployed service exposes the functionality of an artificial intelligence, machine learning model, or advanced algorithm through an API so that it can be consumed by other services or applications to create intelligent apps.

The following chart outlines the hierarchical relationship between Recipes, Models, Training Runs, and Scoring Runs.
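
As a rough illustration of that hierarchy, the relationship can be sketched as nested data structures. This is plain Python written only for this guide: the class and field names are hypothetical and are not part of any Adobe API. Based on the definitions above, it assumes that a recipe can have many models, a model many training runs, and a training run many scoring runs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScoringRun:
    name: str


@dataclass
class TrainingRun:
    name: str
    scoring_runs: List[ScoringRun] = field(default_factory=list)


@dataclass
class Model:
    name: str
    training_runs: List[TrainingRun] = field(default_factory=list)


@dataclass
class Recipe:
    name: str
    models: List[Model] = field(default_factory=list)


def outline(recipe: Recipe) -> List[str]:
    """Return the hierarchy as indented lines, one level per nesting depth."""
    lines = [f"Recipe: {recipe.name}"]
    for model in recipe.models:
        lines.append(f"  Model: {model.name}")
        for run in model.training_runs:
            lines.append(f"    Training run: {run.name}")
            for scoring in run.scoring_runs:
                lines.append(f"      Scoring run: {scoring.name}")
    return lines


# Hypothetical example: one recipe, one model, one training run, one scoring run.
recipe = Recipe(
    "retail-sales",
    [Model("store-a", [TrainingRun("run-1", [ScoringRun("score-1")])])],
)
print("\n".join(outline(recipe)))
```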

Understanding Data Science Workspace

With Data Science Workspace, your data scientists can streamline the cumbersome process of uncovering insights in large datasets. Built on a common machine learning framework and runtime, Data Science Workspace delivers advanced workflow management, model management, and scalability. Intelligent services support re-use of machine learning recipes to power a variety of applications created using Adobe products and solutions.

One-stop data access

Data is the cornerstone of AI and machine learning.

Data Science Workspace is fully integrated with Adobe Experience Platform, including the Data Lake, Real-Time Customer Profile, and Unified Edge. Explore all your organizational data stored in Adobe Experience Platform at once, along with common big data and deep learning libraries, such as Spark ML and TensorFlow. If you don’t find what you need, ingest your own datasets using the XDM standardized schema.

Prebuilt machine learning recipes

Data Science Workspace includes prebuilt machine learning recipes for common business needs, such as retail sales prediction and anomaly detection, so data scientists and developers don’t have to start from scratch. Three recipes are currently offered: product purchase prediction, product recommendations, and retail sales.

If you prefer, you can adapt a prebuilt recipe to your needs, import a recipe, or start from scratch to build a custom recipe. However you begin, once you train and hyper-tune a recipe, creating a custom intelligent service doesn’t require a developer - just a few clicks and you’re ready to build a targeted, personalized digital experience.

Workflow focused on the data scientist

Whatever your level of data science expertise, Data Science Workspace helps simplify and accelerate the process of finding insights in data and applying them to digital experiences.

Data exploration

Finding the right data and preparing it is the most labor-intensive part of building an effective recipe. Data Science Workspace and Adobe Experience Platform will help you get from data to insights more quickly.

On Adobe Experience Platform, your cross-channel data is centralized and stored in the XDM standardized schema, so data is easier to find, understand, and clean. A single store of data based on a common schema can save you countless hours of data exploration and preparation.

Use R, Python, or Scala with the integrated, hosted Jupyter Notebook to browse the catalog of data on Platform. With any of these languages, you can also take advantage of Spark ML and TensorFlow. Start from scratch, or use one of the notebook templates provided for specific business problems.

As part of the data exploration workflow, you can also ingest new data or use existing features to help with data preparation.

Authoring

With Data Science Workspace, you decide how you want to author recipes.

Save time by browsing for a prebuilt recipe that addresses your business needs, which you can use as is or configure to meet your specific requirements.
Create a recipe from scratch, using the authoring runtime in Jupyter Notebook to develop and register the recipe.
Upload a recipe authored outside Adobe Experience Platform into Data Science Workspace or import recipe code from a repository, such as Git, using the authentication and integration available between Git and Data Science Workspace.

Experimentation

Data Science Workspace brings tremendous flexibility to the experimentation process. Start with your recipe. Then create a separate instance, using the same core algorithm paired with unique characteristics, such as hyper-tuning parameters. You can create as many instances as you need, training and scoring each instance as many times as you want. As you train them, Data Science Workspace tracks recipes, recipe instances, and trained instances, along with evaluation metrics, so you don’t have to.
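
To make the idea of several instances sharing one core algorithm more concrete, here is a minimal sketch in plain Python. It does not use the Data Science Workspace API. The toy algorithm (fitting y = w * x by gradient descent), the function name `train_and_evaluate`, and the hyperparameter names are all assumptions chosen for illustration, and the `experiment_log` list stands in for the tracking that Data Science Workspace performs for you.

```python
from itertools import product


def train_and_evaluate(learning_rate, n_steps, data):
    """Stand-in for a recipe's core algorithm: fit y = w * x by gradient descent."""
    w = 0.0
    for _ in range(n_steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    mse = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return {"weight": w, "mse": mse}


# Toy training data that follows y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Each combination of hyperparameters is one instance of the same algorithm.
grid = {"learning_rate": [0.01, 0.05], "n_steps": [10, 100]}

experiment_log = []
for lr, steps in product(grid["learning_rate"], grid["n_steps"]):
    metrics = train_and_evaluate(lr, steps, data)
    experiment_log.append({"learning_rate": lr, "n_steps": steps, **metrics})

for entry in experiment_log:
    print(entry)
```

Keeping the hyperparameters and the resulting metrics in the same record is what makes it possible to compare instances afterwards.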

Operationalization

When you’re happy with your recipe, it’s just a few clicks to create an intelligent service. No coding required - you can do it yourself, without enlisting a developer or engineer. Finally, publish the intelligent service to Adobe IO and it’s ready for your digital experience team to consume.

Continuous improvement

Data Science Workspace tracks where intelligent services are invoked and how they’re performing. As data rolls in, you can evaluate intelligent service accuracy to close the loop, and retrain the recipes as needed to improve performance. The result is continuous refinement in the precision of customer personalization.
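
The closing-the-loop step can be pictured as a simple check: compare what the service predicted with what actually happened, and flag the recipe for retraining when accuracy drops too low. The sketch below is plain Python under stated assumptions. The function names and the 0.8 threshold are hypothetical and do not reflect how Data Science Workspace decides when to retrain.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals) or not actuals:
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def needs_retraining(predictions, actuals, threshold=0.8):
    """Return True when measured accuracy falls below the chosen threshold."""
    return accuracy(predictions, actuals) < threshold


# Hypothetical outcomes: 1 = customer used the offer, 0 = customer did not.
predicted = [1, 0, 1, 1]
observed = [1, 0, 0, 1]

print(accuracy(predicted, observed))
print(needs_retraining(predicted, observed))
```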

Access to new features and datasets

Data scientists can take advantage of new technologies and datasets as soon as they are available through Adobe services. Through frequent updates, we do the work of integrating datasets and technologies into the platform, so you don’t have to.

Security and peace of mind

Securing your data is a top priority for Adobe. Adobe protects your data with security processes and controls developed to help comply with industry-accepted standards, regulations, and certifications.

Security is built into software and services as part of the Adobe Secure Product Lifecycle.
To learn about Adobe data and software security, compliance, and more, visit the security page at https://www.adobe.com/security.html.

Data Science Workspace in action

Predictions and insights provide the information you need to deliver a highly personalized experience to each customer who visits your web site, contacts your call center, or engages in other digital experiences. Here’s how your day-to-day work happens with Data Science Workspace.

Define the problem

It all starts with a business problem. For example, an online call center needs context to help them turn a negative customer sentiment positive.

There’s plenty of data about the customer. They’ve browsed the site, put items in their cart, and even placed orders. They might have received emails, used coupons, or contacted the call center previously. The recipe, then, needs to use the data available about the customer and their activities to determine propensity to buy and recommend an offer that the customer is likely to appreciate and use.

At the time of the call center contact, the customer still has two pairs of shoes in the cart, but removed a shirt. With this information, the intelligent service might recommend that the call center agent offer a coupon for 20% off on shoes during the call. If the customer uses the coupon, that information is added to the dataset and the predictions become even better the next time the customer calls.

Explore and prepare the data

Based on the business problem defined, you know the recipe should look at all the customer’s web transactions, including site visits, searches, page views, links clicked, cart actions, offers received, emails received, call center interactions, and so on.

A data scientist typically spends up to 75% of the time required to create a recipe exploring and transforming the data. Data often comes from multiple repositories and is saved in different schemas - it must be combined and mapped before it can be used to create a recipe.
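
To give a sense of what combining and mapping involves when sources use different schemas, the following sketch maps records from two hypothetical sources onto one shared set of field names and merges them into a single timeline. All field names here are invented for illustration. They are not XDM fields, and the code does not call any Adobe API.

```python
# Source field name -> common field name (both mappings are hypothetical).
WEB_MAPPING = {"visitor_id": "customer_id", "ts": "timestamp", "action": "event_type"}
CALL_CENTER_MAPPING = {"caller": "customer_id", "call_time": "timestamp", "reason": "event_type"}


def to_common(record, mapping, source):
    """Rename a record's fields to the common names and tag where it came from."""
    common = {target: record[src] for src, target in mapping.items()}
    common["source"] = source
    return common


web_events = [
    {"visitor_id": "c-1", "ts": "2023-08-01T10:00:00Z", "action": "add_to_cart"},
    {"visitor_id": "c-1", "ts": "2023-08-01T11:00:00Z", "action": "remove_from_cart"},
]
calls = [
    {"caller": "c-1", "call_time": "2023-08-01T10:30:00Z", "reason": "order_question"},
]

# ISO 8601 timestamps in the same time zone sort correctly as plain strings.
combined = sorted(
    [to_common(r, WEB_MAPPING, "web") for r in web_events]
    + [to_common(r, CALL_CENTER_MAPPING, "call_center") for r in calls],
    key=lambda r: r["timestamp"],
)

for event in combined:
    print(event)
```

Every additional source needs its own mapping, which is the effort a shared, standardized schema is meant to remove.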

If you’re starting from scratch or configuring an existing recipe, you begin your data search in a centralized and standardized data catalog for your organization, which simplifies the hunt considerably. You might even find that another data scientist in your organization has already identified a similar dataset, and choose to fine-tune that dataset rather than start from scratch.
All the data in Adobe Experience Platform complies with a standardized XDM schema, eliminating the need to create a complex model for joining data or obtain help from a data engineer.

If you don’t immediately find the data you need, but it exists outside Adobe Experience Platform, it’s a relatively simple task to ingest additional datasets, which will also transform into the standardized XDM schema.
You can use Jupyter Notebook to simplify data pre-processing - possibly starting with a notebook template or a notebook you’ve used previously for propensity to buy.

Author the recipe

If you already found a recipe that meets all your needs, you can move on to experimentation. Or, you can modify the recipe a bit or create one from scratch - taking advantage of the Data Science Workspace authoring runtime in Jupyter Notebook. Using the authoring runtime ensures that you can both use the Data Science Workspace training and scoring workflow and convert the recipe later so it can be stored and reused by others in your organization.

You can also import a recipe into Data Science Workspace and take advantage of the experimentation workflows as you create your intelligent service.

Experiment with the recipe

With a recipe that incorporates your core machine learning algorithms, many recipe instances can be created with a single recipe. These recipe instances are referred to as models. A model requires training and evaluation to optimize its operating efficiency and efficacy, a process typically consisting of trial and error.

As you train your models, training runs and evaluations are generated. Data Science Workspace keeps track of evaluation metrics for each unique model and its training runs. Evaluation metrics generated through experimentation allow you to determine the training run that performs best.
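
Choosing the best training run from recorded metrics can be sketched as follows. The records, metric names, and values are made up for illustration, and `best_run` is a hypothetical helper rather than a Data Science Workspace function. The one detail worth noting is that some metrics are better when higher (accuracy) and others when lower (log loss).

```python
# Hypothetical evaluation records, one per training run.
training_runs = [
    {"model": "model-a", "run": "run-1", "metrics": {"accuracy": 0.81, "log_loss": 0.52}},
    {"model": "model-a", "run": "run-2", "metrics": {"accuracy": 0.86, "log_loss": 0.41}},
    {"model": "model-b", "run": "run-1", "metrics": {"accuracy": 0.84, "log_loss": 0.39}},
]


def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best value for the given metric."""
    chooser = max if higher_is_better else min
    return chooser(runs, key=lambda r: r["metrics"][metric])


by_accuracy = best_run(training_runs, "accuracy")
by_log_loss = best_run(training_runs, "log_loss", higher_is_better=False)

print(by_accuracy["model"], by_accuracy["run"])
print(by_log_loss["model"], by_log_loss["run"])
```

In this made-up data the two metrics disagree, which is why it helps to decide which metric matters for the business problem before comparing runs.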

To learn how to train and evaluate models in Data Science Workspace, see either the API or UI tutorial.

Operationalize the model

When you’ve selected the best trained recipe to address your business needs, you can create an intelligent service in Data Science Workspace without developer assistance. It’s just a couple of clicks - no coding required. A published intelligent service is accessible to other members of your organization without the need to recreate the model.

A published intelligent service can be configured to retrain itself automatically from time to time using new data as it becomes available. This ensures your service maintains its efficiency and efficacy over time.

Next steps

Data Science Workspace helps streamline and simplify the data science workflow, from data gathering to algorithms to intelligent services for data scientists of all skill levels. With the sophisticated tools Data Science Workspace provides, you can significantly shorten the time from data to insights.

More importantly, Data Science Workspace puts the data science and algorithmic optimization capabilities of Adobe’s leading marketing platform in the hands of enterprise data scientists. For the first time, enterprises can bring proprietary algorithms to the platform, taking advantage of Adobe’s powerful machine learning and AI capabilities to deliver highly personalized customer experiences at massive scale.

With the marriage of brand expertise and Adobe’s machine learning and AI prowess, enterprises have the power to drive more business value and brand loyalty by giving customers what they want, before they ask for it.

For additional information, such as a complete day-to-day workflow, please begin by reading the Data Science Workspace walk-through documentation.

Additional resources

The following video is designed to support your understanding of Data Science Workspace.

https://video.tv.adobe.com/v/30567?quality=12&enable10seconds=on&speedcontrol=on

Transcript

In this video, we are going to introduce you to Data Science Workspace in Adobe Experience Platform. Data science permeates Adobe products. For marketers and analysts, we provide AI and ML features directly in the Experience Cloud applications as well as AI as a service in our Intelligent Services such as Customer AI and Attribution AI. For data scientists, we have Data Science Workspace, which was designed to streamline and simplify the data science workflow from data gathering to algorithms, all while significantly shortening the time it takes to get from raw data to actionable insights. Data Science Workspace lets you easily bring existing models into activation workflows or build entirely new ones. Data science teams can directly integrate their AI and ML investments with Experience Platform’s Real-Time Customer Profile. Let’s take a look at some common use cases. One is AI-enabled segment discovery. Who are the key visitors to your site? Which prospects are likely to convert? And which existing customers are likely to churn? Experience Platform provides an end-to-end machine learning framework to develop models to help drive segmentation on your Real-Time Customer Profile data. This is done through scheduled training and scoring jobs to generate insights as new input and training data becomes available. Another scenario is machine-learning-driven omni-channel experiences such as web and app personalization. With Real-Time Customer Profile, Experience Cloud data, and any custom data, you can easily develop your machine learning models to derive insights for actioning.

Although this may sound straightforward, implementing this process comes with a set of challenges. Existing systems were not built to support continuous insights and intelligence. Data is everywhere and siloed leading to higher latencies. Data preparation is slow and tedious. As we all know, a lot of a data scientist’s time is spent on data preparation instead of modeling. Systems are not designed for real time. Decisions are locked away in specific applications across the enterprise. Privacy and policies are constantly changing making it difficult to adapt. And insights and intelligence are not connected to engagement systems.

As a business analyst or data scientist, overcoming these challenges to drive personalized experiences for your business is extremely difficult.

With that in mind, let’s dive deeper into what the machine learning and artificial intelligence journey looks like in Data Science Workspace and look at how we can solve these machine learning challenges. The journey starts with data preparation. A data scientist should have easy access to all omni-channel data and shouldn’t have to spend too much time preparing for insights. With JupyterLab notebooks natively integrated with Platform, accessing this data is only a few clicks away. To assist with data preparation and data analysis, data scientists can leverage Real-Time Customer Profile. Profile gives you a holistic view of each individual customer by combining data from multiple channels, including online, offline, CRM, and third parties, allowing you to consolidate your customer data into a unified view that offers an actionable, timestamped account of every customer interaction. This gives data scientists full access to stitched and raw attribute as well as event data in an optimized format. Once data preparation has taken place, we move to model development. This is the core task of a data scientist, and where most of their time should be spent. To do this effectively, a data scientist is able to use state-of-the-art technologies to build models: bring existing code or use prebuilt Adobe models with either the Platform interface or the Adobe Sensei API. Like the rest of Platform, Data Science Workspace was built with an API-first approach. The data available in Platform can be accessed through Data Science Workspace using Query Service, reducing the need to clean, scrub, and fix incoming data, since Adobe’s Experience Data Model (XDM) has already standardized all Platform schemas to reduce latency and data collection. Next, Data Science Workspace provides the ability to train and evaluate models directly within Platform.
The machine learning workflow helps you rapidly experiment with different model configurations against any of your datasets, small or large, in the data lake. After developing a model, a data scientist can easily activate the model by deploying it as a service. Once a service is deployed, you’re able to monitor it using Data Science Workspace’s ML service management to ensure that the output is being generated as expected and written back to the data lake. This enables continuous learning and retraining of the models to improve your predictions over time. Your predictions can then be executed through batch and real-time jobs to enrich your profile data. These predictions and machine learning and artificial intelligence insights can easily be activated in both Adobe and non-Adobe products by your marketing team, leaving customers with more personalized experiences. In other words, Data Science Workspace provides one seamless workflow that allows you to go from data to consuming your insights, all within Experience Platform.

Consider the customer journey: how a customer discovers your brand, how they try your products or services, buy and use those products, and any engagement thereafter. AI can play a key role across each phase of the customer journey, whether you want to curate content targeted to customers or calculate propensity of conversions. Having an AI playbook across the customer journey is key to continuously engaging with your customers in meaningful ways to build long-term brand loyalty. With Data Science Workspace, data science can now happen as close as possible to the origin of the data. In addition, data scientists can spend most of their time on predictions and insights, while marketers can easily activate the outputs of a data scientist’s efforts in both Adobe and non-Adobe products. You should now have a sense of what Data Science Workspace is and how your business can leverage it. Thanks for watching.
