In this blog we define the process for earning AWS customer credits when migrating Data and AI workloads to Databricks on Amazon Web Services (AWS) with the AWS Migration Acceleration Program (MAP). We will show you how to use AWS MAP tagging to identify newly migrated workloads, such as Hadoop and Enterprise Data Warehouses (EDW), so that they qualify for valuable AWS customer credits. This information is helpful for customers, technical professionals at technology and consulting partners, and AWS Migration Specialists and Solution Architects.
Databricks overview
Databricks is the data and AI company. More than 7,000 organizations worldwide — including Comcast, Condé Nast, H&M and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics and AI. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world’s toughest problems. Databricks is recognized by Gartner as a Leader in both Cloud Database Management Systems and Data Science and Machine Learning Platforms.
The Databricks Lakehouse on AWS unifies the best of data warehouses and data lakes in one simple platform to handle all your data, analytics and AI use cases. It’s built on an open and reliable data foundation that efficiently handles all data types and applies one common security and governance approach across all of your data and cloud platforms.
What is the AWS Migration Acceleration Program (MAP)?
The AWS Migration Acceleration Program (MAP) is a comprehensive and proven cloud migration program based upon AWS’s experience migrating thousands of enterprise customers to the cloud. Enterprise migrations can be complex and time-consuming, but MAP can help you accelerate your cloud migration and modernization journey with an outcome-driven methodology.
MAP provides tools that reduce costs and automate and accelerate execution through tailored training approaches and content, expertise from AWS Professional Services, a global partner network, and AWS investment. MAP also uses a proven three-phased framework (Assess, Mobilize, and Migrate and Modernize) to help you achieve your migration goals. Through MAP, you can build strong AWS cloud foundations, accelerate your migration, reduce risk, and offset the initial cost of migrations, while leveraging the performance, security, and reliability of the cloud.
Why do you need to tag resources?
Migrated resources must be identified with a specific map-migrated tag (tag key is case sensitive) to ensure AWS credits are provided to customers as an incentive and to reduce the cost of migrations. The tagging process explained below should be used for Hadoop, Data Warehouse, on-premises, or other cloud workload migrations to AWS.
Steps to Tag Migrated Resources
The following infographic provides an overview of the seven-step process:
Set up an AWS Organization account
Set up a Databricks Workspace
Set up your Databricks workspace via AWS CloudFormation or the Databricks account console in less than 15 minutes.
Activate AWS MAP Tagging
Provide the Migration Program Engagement ID (MPE ID), which you receive after signing an AWS MAP Agreement with your AWS representatives, on the CloudFormation stack that creates the dependent AWS objects. The stack sets up Cost and Usage Reports (CUR) and generates a server ID to be used by the AWS Migration Hub for migrations.
AWS CloudFormation template for generating server IDs and setting up Cost and Usage Reports
Providing the MPE ID before initiating the AWS CloudFormation Stack for MAP
After the AWS CloudFormation stack has run successfully, copy the Migration Hub server IDs from the stack output and set them as the value of the map-migrated tag on the Databricks clusters used as the target clusters for migration. In addition to Databricks clusters, apply the same tag to the other AWS resources used for the migration, including the Amazon S3 buckets and Amazon Elastic Block Store (EBS) volumes.
Copying the server IDs from the AWS CloudFormation output to be used in MAP tagging
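If you prefer to read the server IDs programmatically rather than copy them from the console, the following is a minimal Python sketch. It assumes a boto3 CloudFormation client is passed in; the stack name and the output key shown in the comments are hypothetical placeholders, because the actual names depend on the MAP template you deployed.

```python
def stack_outputs(cfn_client, stack_name):
    """Return a stack's outputs as a {OutputKey: OutputValue} dict.

    cfn_client is expected to behave like boto3.client("cloudformation").
    """
    response = cfn_client.describe_stacks(StackName=stack_name)
    outputs = response["Stacks"][0].get("Outputs", [])
    return {o["OutputKey"]: o["OutputValue"] for o in outputs}


def map_tag(server_id):
    """Build the tag to apply; the key is case sensitive."""
    return {"map-migrated": server_id}


# Example usage (requires boto3 and AWS credentials; names are hypothetical):
#   import boto3
#   outputs = stack_outputs(boto3.client("cloudformation"), "my-map-stack")
#   print(outputs)  # inspect the keys to find the server ID output
```

Printing the full output dictionary first is deliberate: it lets you confirm which output key holds the server ID before you use it as a tag value.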
Databricks clusters being used for migration
Spin up the Databricks clusters for migration and tag them with map-migrated tags in one of three ways: (1) the Databricks console, (2) the AWS console, or (3) the Databricks API and its cluster policies.
1. MAP tagging Databricks clusters using the Databricks console (preferred)
Amazon EBS volumes are automatically MAP tagged when tagging is done via the Databricks console
2. MAP tagging Databricks clusters via the AWS console
3. Databricks cluster tagging can be performed via cluster policies
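For the API and cluster policy route, the sketch below builds the two JSON documents involved. It assumes the Databricks cluster policy format, in which a custom_tags.<tag name> attribute can be pinned with a "fixed" rule, and the custom_tags field of a Clusters API cluster specification. The server ID value and the cluster settings are placeholders, not values from this post.

```python
import json

MAP_TAG_KEY = "map-migrated"  # the tag key is case sensitive


def map_tag_policy(server_id):
    """Policy definition that pins the map-migrated tag to a single value."""
    return {f"custom_tags.{MAP_TAG_KEY}": {"type": "fixed", "value": server_id}}


def with_map_tag(cluster_spec, server_id):
    """Return a copy of a cluster spec with map-migrated added to custom_tags."""
    tags = dict(cluster_spec.get("custom_tags", {}))
    tags[MAP_TAG_KEY] = server_id
    return {**cluster_spec, "custom_tags": tags}


# Placeholder server ID; use the value from your CloudFormation output.
policy_json = json.dumps(map_tag_policy("YOUR-SERVER-ID"), indent=2)
cluster_json = json.dumps(
    with_map_tag({"cluster_name": "migration-target"}, "YOUR-SERVER-ID"), indent=2
)
print(policy_json)
print(cluster_json)
```

A fixed policy rule is a reasonable choice here because every cluster created under the policy then carries the tag, without relying on each user to add it by hand.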
Be sure to tag the associated Amazon S3 buckets
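Bucket tagging can also be scripted. The following is a minimal sketch, assuming a boto3 S3 client is passed in; the bucket name and server ID in the usage comment are placeholders. Because the S3 put_bucket_tagging call replaces a bucket's entire tag set, the sketch reads the existing tags first and merges the map-migrated tag into them.

```python
MAP_TAG_KEY = "map-migrated"  # the tag key is case sensitive


def merge_map_tag(tag_set, server_id):
    """Return a new TagSet with map-migrated set and all other tags kept."""
    kept = [t for t in tag_set if t["Key"] != MAP_TAG_KEY]
    return kept + [{"Key": MAP_TAG_KEY, "Value": server_id}]


def tag_bucket(s3_client, bucket, server_id):
    """Add the map-migrated tag to a bucket without dropping its other tags.

    s3_client is expected to behave like boto3.client("s3").
    """
    try:
        existing = s3_client.get_bucket_tagging(Bucket=bucket)["TagSet"]
    except s3_client.exceptions.ClientError as err:
        # A bucket with no tags reports NoSuchTagSet; anything else is re-raised.
        if err.response["Error"]["Code"] != "NoSuchTagSet":
            raise
        existing = []
    merged = merge_map_tag(existing, server_id)
    s3_client.put_bucket_tagging(Bucket=bucket, Tagging={"TagSet": merged})
    return merged


# Example usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   tag_bucket(boto3.client("s3"), "my-migration-bucket", "YOUR-SERVER-ID")
```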
Once all Databricks on AWS resources are tagged appropriately, perform the migration and track the usage via AWS Cost Explorer. Organizations that have signed an AWS MAP Agreement and performed all the required steps will see credits applied to their AWS account. Remember to activate the MAP tags in the Cost Allocation Tags section of the AWS Billing Console. The map-migrated tags may take up to 24 hours to appear in that section after you deploy the CloudFormation template.
Activating Cost Allocation Tags
Automatically Delivered Cost and Usage Reports
Services > Billing > Cost & Usage Reports.
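The activation and tracking steps can also be done through the Cost Explorer API instead of the console. The sketch below is one way to do it, assuming a boto3 Cost Explorer ("ce") client is passed in; the dates and server IDs in the usage comment are placeholders. The tag can only be activated once it has appeared in the Cost Allocation Tags section, which, as noted above, may take up to 24 hours.

```python
MAP_TAG_KEY = "map-migrated"  # the tag key is case sensitive


def activate_map_tag(ce_client):
    """Mark map-migrated as an active cost allocation tag.

    ce_client is expected to behave like boto3.client("ce").
    Returns the list of errors reported by the API (empty on success).
    """
    response = ce_client.update_cost_allocation_tags_status(
        CostAllocationTagsStatus=[{"TagKey": MAP_TAG_KEY, "Status": "Active"}]
    )
    return response.get("Errors", [])


def map_cost_query(server_ids, start, end):
    """Build keyword arguments for get_cost_and_usage, limited to MAP-tagged usage."""
    return {
        "TimePeriod": {"Start": start, "End": end},  # YYYY-MM-DD, End is exclusive
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Tags": {"Key": MAP_TAG_KEY, "Values": list(server_ids)}},
    }


# Example usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   ce = boto3.client("ce")
#   activate_map_tag(ce)
#   ce.get_cost_and_usage(**map_cost_query(["YOUR-SERVER-ID"], "2024-01-01", "2024-02-01"))
```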
Summary
In this blog we explained how to successfully tag migrated workloads to Databricks on AWS using the AWS Migration Acceleration Program (MAP). Using tags to identify migrated workloads will benefit customers through AWS credits. The steps involved include generating server IDs on the AWS Migration Hub, setting up cost allocation tags, using MAP tags to target Databricks clusters, automatically delivering cost and usage reports, and tracking usage via Cost Explorer.
Questions? Email us at aws@databricks.com.
Additional Resources
AWS Migration Acceleration Program (MAP)
- AWS Migration Acceleration Program
- AWS Migration Acceleration Program Tagging Instructions Guide (Note: Refer to this guide for the latest CloudFormation template.)
Hadoop Migrations
- Migration Guide: Hadoop to Databricks
- The Hidden Value of Migrating from Hadoop to Databricks
- 5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
SAS Migrations
Data Warehouse Migrations
The post How to Migrate Your Data and AI Workloads to Databricks With the AWS Migration Acceleration Program appeared first on Databricks.