Migrate Data from AWS to GCP
Requirements change from time to time. One day you are using AWS, and the next you want to leverage the powerful BigQuery from GCP (or you have some other use case). In that case, you may want to migrate your whole data lake from AWS (or any other cloud provider, on-prem storage, or a different GCP account) to GCP.
There are several ways to migrate your data from AWS to GCP. One is to write a Python script that reads all the blobs from your S3 bucket and writes them to a GCS bucket. Another is to use a GCP service; the process is hassle-free (only the hassle is free, it comes with a cost): you are charged for the bytes that GCP transfers.
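For comparison, the script approach could look roughly like the sketch below. It assumes a boto3 S3 client and a google-cloud-storage bucket object are passed in; the function names `iter_s3_keys` and `copy_s3_to_gcs` are my own, and there is no retry or parallelism here, so treat it as a starting point rather than a finished migration tool.

```python
from typing import Any, Iterator


def iter_s3_keys(s3_client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key under `prefix`, following S3 pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" entry when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def copy_s3_to_gcs(s3_client: Any, gcs_bucket: Any, s3_bucket: str, prefix: str = "") -> int:
    """Copy each S3 object to the GCS bucket under the same key; return the count."""
    copied = 0
    for key in iter_s3_keys(s3_client, s3_bucket, prefix):
        obj = s3_client.get_object(Bucket=s3_bucket, Key=key)
        # Stream the body straight into GCS instead of saving it to disk first.
        gcs_bucket.blob(key).upload_from_file(obj["Body"], size=obj["ContentLength"])
        copied += 1
    return copied


# Example wiring (needs `pip install boto3 google-cloud-storage` and credentials
# for both clouds; the bucket names are placeholders):
#   import boto3
#   from google.cloud import storage
#   copy_s3_to_gcs(boto3.client("s3"), storage.Client().bucket("my-gcs-bucket"), "my-s3-bucket")
```

A loop like this copies one object at a time through the machine running it, which is the main reason the managed service is attractive for a whole data lake.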
So, I will share the step-by-step process for using the service provided by GCP, which is *drumroll* DATA TRANSFER (officially, Storage Transfer Service).
Step 1: Log in to your GCP account, search for Data Transfer in the GCP console search bar, and click on the product.
Step 2: Click on Create Transfer Job
Step 3: Select the source from which you want to migrate the data.
Step 4: Select the bucket or the path that you want to migrate. If you want to migrate the whole bucket, just pass the bucket name. The bucket must have read access enabled so that the blobs can be read.
Next, you have to pass AWS credentials so that GCP can connect to AWS on your behalf.
You can pass either an Access Key ID and Secret Access Key, or an AWS IAM role with the particular permissions shown in the attached screenshot.
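If you ever script this step instead of clicking through it, the source and credentials end up as a small JSON object. Here is a minimal sketch of that object, assuming the field names of the Storage Transfer Service REST API as I understand them (`awsS3DataSource` with `bucketName`, `path`, `awsAccessKey`, `roleArn`); the helper `build_s3_source` is hypothetical, so check the field names against the current API reference before relying on them.

```python
from typing import Optional


def build_s3_source(
    bucket: str,
    path: str = "",
    *,
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    role_arn: Optional[str] = None,
) -> dict:
    """Build the S3 source part of a transfer job (hypothetical helper)."""
    has_key_pair = access_key_id is not None and secret_access_key is not None
    has_role = role_arn is not None
    # Exactly one credential style is accepted: key pair or IAM role.
    if has_key_pair == has_role:
        raise ValueError("provide exactly one of: access key pair, role ARN")

    source: dict = {"bucketName": bucket}
    folder = path.strip("/")
    if folder:
        # Assumption: the path is treated as a prefix and should end with "/".
        source["path"] = folder + "/"
    if has_role:
        source["roleArn"] = role_arn
    else:
        source["awsAccessKey"] = {
            "accessKeyId": access_key_id,
            "secretAccessKey": secret_access_key,
        }
    return source
```

Leaving `path` empty corresponds to migrating the whole bucket, as described above.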
Step 5: This step is optional. You can pass filters in the form of prefixes so that you migrate only particular data from the bucket/folder.
Similarly, you can pass a time range to copy only the files modified within that time frame.
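The prefix and time-range filters can be sketched as data too. The snippet below assumes the REST API calls this block `objectConditions`, with `includePrefixes`, `lastModifiedSince` and `lastModifiedBefore` fields that take RFC 3339 timestamps; the helper names are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def to_rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC RFC 3339 string."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_object_conditions(
    include_prefixes: Iterable[str] = (),
    modified_since: Optional[datetime] = None,
    modified_before: Optional[datetime] = None,
) -> dict:
    """Build the optional filter block; an empty dict means 'copy everything'."""
    if modified_since and modified_before and modified_since >= modified_before:
        raise ValueError("modified_since must be earlier than modified_before")

    conditions: dict = {}
    prefixes = list(include_prefixes)
    if prefixes:
        conditions["includePrefixes"] = prefixes
    if modified_since is not None:
        conditions["lastModifiedSince"] = to_rfc3339(modified_since)
    if modified_before is not None:
        conditions["lastModifiedBefore"] = to_rfc3339(modified_before)
    return conditions
```

For example, passing `["logs/2023/"]` together with a January 2023 window would restrict the job to objects under that prefix that were modified in that month.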
Step 6: Next, pass the destination bucket/folder where you want to put the data.
Step 7: You can also schedule how often the job runs, based on your requirements.
Step 8: You can write a description to identify the job. In addition, you get the option to overwrite the data or to keep a copy of the data if the checksums are the same.
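As a rough illustration of what a schedule looks like outside the console, here is a sketch that assumes the REST API's `schedule` block uses `scheduleStartDate` and `scheduleEndDate` as year/month/day objects and `repeatInterval` as a duration string in seconds. My understanding is that equal start and end dates mean a one-time run and that the interval cannot be shorter than one hour, but please verify both against the API reference; `build_schedule` is a hypothetical helper.

```python
from datetime import date
from typing import Optional


def _as_date_dict(day: date) -> dict:
    """Convert a date into the {year, month, day} shape used for schedule dates."""
    return {"year": day.year, "month": day.month, "day": day.day}


def build_schedule(
    start: date,
    end: Optional[date] = None,
    repeat_hours: Optional[int] = None,
) -> dict:
    """Build a schedule block; pass end == start for a single run."""
    if end is not None and end < start:
        raise ValueError("end date must not be before start date")
    if repeat_hours is not None and repeat_hours < 1:
        raise ValueError("repeat interval must be at least one hour")

    schedule: dict = {"scheduleStartDate": _as_date_dict(start)}
    if end is not None:
        schedule["scheduleEndDate"] = _as_date_dict(end)
    if repeat_hours is not None:
        # Durations are written as a number of seconds followed by "s".
        schedule["repeatInterval"] = f"{repeat_hours * 3600}s"
    return schedule
```

So `build_schedule(date(2024, 1, 15), date(2024, 1, 15))` would describe a one-off migration, while `build_schedule(date(2024, 1, 15), repeat_hours=6)` would describe a job that repeats every six hours.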
Step 9: You are all done. Just click the Create button and GCP will take care of the rest. You will be able to see the run history on the homepage.
It’s that easy.
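If you would rather keep the job definition in code than in the console, the nine steps boil down to one JSON document. The sketch below assembles it under the assumption that the REST API expects a body with `projectId`, `description`, `status`, `schedule` and a `transferSpec` containing `awsS3DataSource`, `gcsDataSink`, `objectConditions` and `transferOptions.overwriteWhen`. The function `build_transfer_job`, the project ID, the bucket names and the role ARN are all placeholders of mine.

```python
import json
from typing import Optional

# Assumed values of the overwrite option: overwrite only when the object
# differs, never overwrite, or always overwrite.
OVERWRITE_CHOICES = ("DIFFERENT", "NEVER", "ALWAYS")


def build_transfer_job(
    project_id: str,
    description: str,
    source: dict,
    sink_bucket: str,
    sink_path: str = "",
    schedule: Optional[dict] = None,
    object_conditions: Optional[dict] = None,
    overwrite_when: str = "DIFFERENT",
) -> dict:
    """Assemble a transfer job body from the pieces chosen in Steps 3 to 8."""
    if overwrite_when not in OVERWRITE_CHOICES:
        raise ValueError(f"overwrite_when must be one of {OVERWRITE_CHOICES}")

    sink: dict = {"bucketName": sink_bucket}
    folder = sink_path.strip("/")
    if folder:
        sink["path"] = folder + "/"

    spec: dict = {
        "awsS3DataSource": source,                              # Steps 3 and 4
        "gcsDataSink": sink,                                    # Step 6
        "transferOptions": {"overwriteWhen": overwrite_when},   # Step 8
    }
    if object_conditions:
        spec["objectConditions"] = object_conditions            # Step 5

    job: dict = {
        "projectId": project_id,
        "description": description,                             # Step 8
        "status": "ENABLED",
        "transferSpec": spec,
    }
    if schedule:
        job["schedule"] = schedule                              # Step 7
    return job


job = build_transfer_job(
    project_id="my-gcp-project",
    description="Nightly copy of S3 logs",
    source={"bucketName": "my-s3-bucket", "roleArn": "arn:aws:iam::123456789012:role/transfer"},
    sink_bucket="my-gcs-bucket",
    sink_path="landing",
    object_conditions={"includePrefixes": ["logs/"]},
)
print(json.dumps(job, indent=2))
```

A body like this is what I would expect to send when creating the job through the API or a client library instead of the Create button; the console remains the easier route for a one-off migration.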