Migration and Transfer
At a high level, the following are the steps involved in planning a migration from on-premise to AWS:
- Assessment and Inventory
- Categorization
- Determining Cloud Services
- Migration Planning
- Migration Execution
AWS Migration Hub
AWS Migration Hub assists in planning the migration activities to the AWS cloud.
It helps to discover details of the on-premise environment, either through agent-based or agent-less discovery strategies.
It generates a list of servers with their CPU, memory and network utilization, and recommends target EC2 instance types along with the associated cost.
AWS Application Discovery Service
AWS Application Discovery Service helps gather necessary information of your on-premise applications and infrastructure.
It helps identify relationships and dependencies between servers.
The data is pushed from agents to ADS every 15 minutes.
AWS Application Migration Service
AWS Application Migration Service performs the lift and shift of on-premise VMs to the cloud.
The service uses a staging area and a migrated resources area.
A Replication agent needs to be installed on the on-premise servers.
It integrates with AWS Systems Manager, S3 and Elastic Disaster Recovery.
It continuously replicates data from on-prem servers to the staging area. From the staging area, test and cutover instances can then be launched into production.
AWS Database Migration Service
AWS Database Migration Service helps to migrate databases from on-premise to cloud.
- DMS Fleet Advisor: provides insight into your existing databases.
- Replication Instance: an EC2 instance you must provision; it runs the migration tasks and replicates the data.
- Replication Task: defines how the replication should occur.
Migration Types:
- Full Load: there is an associated downtime.
- Full Load + Change Data Capture (CDC): no downtime.
- CDC only: use the database's native tool to copy the existing data; CDC then replicates only the delta.
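The three migration types map directly to the `MigrationType` parameter of a DMS replication task. A minimal sketch of building the task parameters with boto3, assuming placeholder ARNs and a hypothetical helper name:

```python
import json

# Hypothetical helper: builds the parameters for a DMS replication task.
# MigrationType maps to the three types above:
# "full-load", "full-load-and-cdc", and "cdc".
def build_dms_task_params(task_id, source_arn, target_arn, instance_arn,
                          migration_type="full-load-and-cdc"):
    assert migration_type in ("full-load", "full-load-and-cdc", "cdc")
    # Selection rule that includes every schema and table.
    table_mappings = {
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": migration_type,
        "TableMappings": json.dumps(table_mappings),
    }

params = build_dms_task_params(
    "demo-task",
    "arn:aws:dms:...:endpoint/source",   # placeholder ARNs
    "arn:aws:dms:...:endpoint/target",
    "arn:aws:dms:...:rep/instance",
)
# boto3.client("dms").create_replication_task(**params) would start the task.
```

Choosing `"full-load-and-cdc"` is the usual way to avoid downtime: the full load copies the data while CDC keeps replaying changes until cutover.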
Migration can be homogeneous (e.g. Oracle → Oracle) or heterogeneous (e.g. MongoDB → DocumentDB)
Use Case:
RDS MySQL → Aurora
Option 1: DB Snapshots from RDS MySQL are restored as an Aurora MySQL DB
Option 2: Create an Aurora Read Replica from your RDS MySQL and, when replication lag is 0, promote it to its own DB cluster (can take time and $$)
Use Case:
External MySQL → Aurora
Option 1: Use Percona XtraBackup to create a file backup in S3, then create an Aurora MySQL DB from Amazon S3.
Option 2: Create an Aurora MySQL DB and use the mysqldump utility to migrate MySQL into Aurora (slower than the S3 approach).
Use DMS if both databases are up and running.
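Option 1 above has a direct boto3 counterpart: Aurora can be created straight from Percona XtraBackup files stored in S3. A sketch with placeholder names and credentials (all values here are assumptions for illustration):

```python
# Sketch of the "backup in S3 -> Aurora MySQL" path via boto3.
# Every identifier, bucket and role below is a placeholder.
restore_params = {
    "DBClusterIdentifier": "aurora-from-s3",   # hypothetical cluster name
    "Engine": "aurora-mysql",
    "MasterUsername": "admin",
    "MasterUserPassword": "change-me",         # placeholder credential
    "SourceEngine": "mysql",                   # engine of the external DB
    "SourceEngineVersion": "8.0.32",           # version of the source DB
    "S3BucketName": "my-backup-bucket",        # holds the XtraBackup files
    "S3IngestionRoleArn": "arn:aws:iam::123456789012:role/aurora-s3-role",
}
# boto3.client("rds").restore_db_cluster_from_s3(**restore_params)
# creates the Aurora cluster from the S3 backup.
```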
Use Case:
RDS PostgreSQL→ Aurora
Option 1: DB Snapshots from RDS PostgreSQL are restored as an Aurora PostgreSQL DB
Option 2: Create an Aurora Read Replica from your RDS PostgreSQL and, when replication lag is 0, promote it to its own DB cluster (can take time and $$)
Use Case:
External PostgreSQL→ Aurora
Create a backup and put it in Amazon S3. Import it using the aws_s3 Aurora extension.
Use DMS if both databases are up and running
AWS Elastic Disaster Recovery
Maintaining a disaster recovery site on-premise can be costly.
AWS Elastic Disaster Recovery is a fully managed disaster recovery service for physical, virtual and cloud-based servers.
Disaster Recovery Types:
- On-premise to On-premise: traditional DR, very expensive.
- On-premise to Cloud: hybrid recovery.
- Cloud to Cloud: DR between two cloud regions.
The below terminology is important for managing DR:
- Recovery Point Objective (RPO): defines how often you need to run backups and how far back in time you need to go to recover.
- Recovery Time Objective (RTO): defines the acceptable downtime until you recover.
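A small illustration of how the two objectives fall out of a backup schedule (the interval and restore time here are assumed example values):

```python
from datetime import timedelta

# Illustrative only: with periodic backups, the worst-case data loss (RPO)
# equals the backup interval, and RTO is the time to restore and restart.
backup_interval = timedelta(hours=4)      # backups run every 4 hours
restore_time    = timedelta(minutes=45)   # measured time to restore a backup

worst_case_rpo = backup_interval          # up to 4 hours of data can be lost
rto            = restore_time             # ~45 minutes of downtime

print(f"Worst-case RPO: {worst_case_rpo}, RTO: {rto}")
```

Tightening RPO means backing up (or replicating) more often; tightening RTO means keeping more infrastructure warm, which is exactly the trade-off behind the DR strategies below.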
Strategies for DR:
- Backup / Restore: high RPO, high RTO, cheapest option.
- Pilot Light: a small deployment is always ready on the cloud; useful for critical workloads. RDS is running, EC2 is not.
- Warm Standby: the full system is up and running, but at minimal size. In case of disaster, it can scale to production load.
- Hot Site / Multi-Site Approach: very low RTO, very expensive. Full production scale runs on AWS and on-premise, aka active-active.
An AWS Replication Agent needs to be installed on any source server that needs to be backed up.
Staging area is the location where AWS receives the data.
A Launch template is used to configure the specifications of the recovery servers.
Elastic Disaster Recovery
can minimize downtime and data loss with reliable recovery.
On the Cloud we have:
- The staging subnet, which replicates all data to one EBS volume per source disk
- The recovery subnet
S3 can be used for Disaster Recovery and it offers 11 9s of data durability.
EBS Snapshots can be used for DR. The snapshots are incremental.
AWS Backup
AWS Backup is a fully managed service that centrally manages and automates backups across AWS services.
Backup Vault stores your data.
Backup Plan defines the configuration for backup.
Recovery Point is the point of time that data can be restored.
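The three pieces fit together in one request: a Backup Plan names a rule, points it at a Vault, and its lifecycle controls how long recovery points are kept. A minimal sketch with assumed names and values:

```python
# Minimal AWS Backup plan sketch (all names and timings are example values).
# This is the shape boto3's backup client accepts in create_backup_plan.
backup_plan = {
    "BackupPlanName": "daily-plan",                # hypothetical plan name
    "Rules": [{
        "RuleName": "daily-35day-retention",
        "TargetBackupVaultName": "my-vault",       # Backup Vault, must exist
        "ScheduleExpression": "cron(0 5 * * ? *)", # daily at 05:00 UTC
        "Lifecycle": {"DeleteAfterDays": 35},      # recovery point retention
    }],
}
# boto3.client("backup").create_backup_plan(BackupPlan=backup_plan)
```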
AWS Backup Vault Lock enforces the WORM model i.e. Write Once Read Many.
Backups cannot be deleted even by the root user.
AWS Mainframe Modernization
AWS Mainframe Modernization moves mainframes present on-premise to the cloud and changes the runtime to a modern system.
Supports Refactoring and Re-platforming.
AWS DataSync
AWS DataSync helps to move large amounts of data to and from AWS.
Components include:
Agent: to be installed on the source
Location: defines the source and destination
Task: describes the transfer (blueprint)
Task Execution: actual execution of a task
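The components above can be sketched as the parameters of one DataSync task: two Locations (source and destination, referenced by ARN) wired into a Task with a schedule. ARNs and the name are placeholders:

```python
# Hypothetical DataSync task parameters mapping the components above:
# two Locations (source and destination) wired into one scheduled Task.
task_params = {
    "Name": "nfs-to-efs-nightly",                              # example name
    "SourceLocationArn": "arn:aws:datasync:...:location/loc-src",   # placeholder
    "DestinationLocationArn": "arn:aws:datasync:...:location/loc-dst",
    "Schedule": {"ScheduleExpression": "rate(24 hours)"},      # run daily
}
# boto3.client("datasync").create_task(**task_params) returns the Task ARN;
# each scheduled run of the task is a separate Task Execution.
```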
A Task could be in one of the following states:
- Available
- Running
- Unavailable (agent offline)
- Queued (another task is using the agent)
A Task Execution could be in one of the following states:
- Queued
- Launching
- Preparing
- Transferring
- Verifying
- Success
- Error
A DataSync agent connects to the on-premise data center and sends data to the DataSync Discovery service, which then provides recommendations for your setup.
Example: Configure an AWS DataSync agent on the on-premise server that has access to the on-premise NFS file system. Transfer data over the AWS Direct Connect connection to an AWS PrivateLink interface VPC endpoint for Amazon EFS by using a private virtual interface (VIF). Set up an AWS DataSync scheduled task to send the files to the Amazon EFS file system every 24 hours.
AWS Transfer Family
AWS Transfer Family is a secure transfer service that enables you to transfer files in and out of AWS storage services.
It works with S3 and EFS.
Transfer Family Server
is a fully managed and highly available file transfer endpoint; the files themselves are stored in S3 or EFS.
It supports the following protocols: SFTP, FTPS, FTP and AS2.
We can implement AWS Transfer Family with automated lifecycle policies to transition older data to more cost-effective storage classes.
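A sketch of the parameters for a Transfer Family server, assuming an SFTP endpoint backed by S3 with service-managed users (the values are illustrative choices, not the only options):

```python
# Sketch of Transfer Family server parameters (assumed values):
# an SFTP endpoint backed by S3 with service-managed users.
server_params = {
    "Protocols": ["SFTP"],                    # FTPS, FTP and AS2 also exist
    "Domain": "S3",                           # or "EFS"
    "IdentityProviderType": "SERVICE_MANAGED",# Transfer Family manages users
}
# boto3.client("transfer").create_server(**server_params)
```

Because the data lands in S3, ordinary S3 lifecycle rules can then transition older objects to cheaper storage classes, as the note above suggests.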
AWS Snow Family
AWS Snow Family consists of a set of physical hardware devices that facilitate copying data to and from AWS.
They are rugged, portable and highly secure.
They are recommended when it takes more than 1 week to transfer data on the network.
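The one-week rule of thumb is easy to check with back-of-the-envelope arithmetic; a small sketch (the dataset size, link speed and utilization factor are assumed example values):

```python
# Rough sizing: days needed to transfer a dataset over a network link.
# A Snow device is worth considering when this comes out at a week or more.
def transfer_days(dataset_tb: float, link_mbps: float,
                  utilization: float = 0.8) -> float:
    bits = dataset_tb * 1e12 * 8                    # TB -> bits (decimal units)
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400

# 100 TB over a 1 Gbps link at 80% utilization:
days = transfer_days(100, 1000)
print(f"{days:.1f} days")  # well over a week -> consider a Snow device
```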
Apart from migrating data, some devices can collect and process data at the edge.
The following devices support Data Migration:
Snowcone, Snowball Edge, Snowmobile
The following devices support Edge computing:
Snowcone, Snowball Edge
- Snowball Edge: can move terabytes/petabytes of data. It provides block storage and Amazon S3-compatible object storage. It weighs 50 pounds and has compute power. It is generally used for large data cloud migrations, decommissioning a data center, or disaster recovery setups.
- Snowball Edge Storage Optimized: 80 TB of HDD capacity for block volumes and S3-compatible object storage, 40 vCPUs, 80 GiB memory.
- Snowball Edge Compute Optimized: 42 TB of HDD or 28 TB NVMe capacity for block volumes and S3-compatible storage, 104 vCPUs, 416 GiB memory, optional GPU, Storage Clustering available (up to 16 nodes).
- Snowcone: a small and portable device which has compute power. It weighs only 4.5 pounds and comes in SSD and HDD variants. Snowcone is used in scenarios where Snowball does not fit (usually space-constrained environments). Customers must use their own batteries and cables. Data can be sent to AWS either offline or by connecting the device to the internet and using AWS DataSync.
- Snowcone (HDD): 8 TB of HDD storage, 2 CPUs, 4 GB memory.
- Snowcone SSD: 14 TB of SSD storage, 2 CPUs, 4 GB memory.
- Snowmobile: a truck that can transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB). Each Snowmobile has a capacity of 100 PB. It is highly secure, with temperature control, GPS and 24/7 video surveillance. It is preferred over Snowball when transferring more than 10 PB of data.
Edge computing is processing data while it is being created, at an edge location. Edge locations have limited or no internet access.
They can run EC2 Instances and AWS Lambda Functions.
AWS OpsHub
is a graphical user interface (GUI) to manage your Snow Family Device.