Snowflake Data Migration - Part 1

Introduction

Being a data engineer has never been easy, because most of our time 🕜 at work goes into moving data from one place to another.

The move may be from on-prem (🏨) to cloud (☁️) or from cloud (☁️) to cloud (☁️). Data is the new revolution, and nowadays almost every decision is backed by data. So, to get insights or make better decisions from that data, a data engineer must work super hard to clean it.

So in this series I will walk you through all the steps that will make your next migration project on Snowflake a super success.

First, as discussed, the data usually comes from:

1) On-premises systems (🏨).

2) Cloud databases (☁️ DB).

3) System logs, etc.

But before sending any data to Snowflake, we usually build a data lake on cloud storage, where the data is annotated and stored by date. Here comes a daunting problem when you transfer data from on-prem to cloud: if you aren't annotating the data correctly, the data lake becomes a data swamp. So make sure you never make that kind of mistake while building a data lake.

lore_iq_datalakecomp.jpg Image source: getlore.io/blog/getting-rescued-from-the-da..

If you ever ask a data engineer to clean a data swamp:

5ilesl.jpg

So as a first and important step, try to maintain a clean data lake, which will become the point of contact for any data issues on the cloud. A data lake is not only useful as a point of contact, though: it can also be used to set up multiple data pipelines to varied destinations.
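To make the idea of annotating data by date a bit more concrete, here is a minimal sketch in Python of one way to build date-partitioned object keys for the lake. The `raw/` prefix, the `year=/month=/day=` layout, and the names `lake_key`, `onprem_sales`, and `orders` are my own assumptions for illustration, not a layout this series prescribes.

```python
from datetime import date
from pathlib import PurePosixPath


def lake_key(source: str, table: str, load_date: date, filename: str) -> str:
    """Build a date-partitioned object key for one file in the data lake.

    The layout (raw/<source>/<table>/year=/month=/day=/<file>) is an
    assumed convention; adapt it to whatever your team agrees on.
    """
    return str(
        PurePosixPath(
            "raw",
            source,
            table,
            f"year={load_date:%Y}",
            f"month={load_date:%m}",
            f"day={load_date:%d}",
            filename,
        )
    )


# Hypothetical source system, table, and file name.
print(lake_key("onprem_sales", "orders", date(2022, 3, 7), "orders_0001.csv"))
# prints raw/onprem_sales/orders/year=2022/month=03/day=07/orders_0001.csv
```

Because every file lands under a predictable prefix, it stays easy to find one day's data for one table, which is a big part of keeping the lake from turning into a swamp.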

Below is an example of a data lake on AWS:

getoto_Aws-cdk-pipelines-blog-datalake-data_lake.png Image source: noise.getoto.net/tag/data-lake

In addition, once the data lake is set up, we can start writing our Snowflake procedures, which will transfer the data from S3 to Snowflake. This first transfer can be treated as the historical load.
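As a rough preview of what such a load can look like, the sketch below uses Python to assemble a Snowflake `COPY INTO` statement that reads from an external stage pointing at the S3 bucket. This is not the procedure the next part will present. The stage name `lake_stage`, the table `analytics.orders`, the CSV file format, and the helper `build_copy_statement` are all assumptions made up for this example.

```python
import re

# Allow plain identifiers with up to three dot-separated parts,
# e.g. db.schema.table. Anything else is rejected.
_IDENTIFIER = re.compile(
    r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$"
)


def build_copy_statement(table: str, stage: str, prefix: str) -> str:
    """Return a COPY INTO statement for a historical load from a stage.

    Assumes CSV files with one header row; change FILE_FORMAT to
    match the files that actually sit in your lake.
    """
    for name in (table, stage):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    path = prefix.strip("/") + "/"
    return (
        f"COPY INTO {table}\n"
        f"FROM @{stage}/{path}\n"
        "FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)\n"
        "ON_ERROR = 'ABORT_STATEMENT'"
    )


# Hypothetical table, stage, and lake prefix.
print(build_copy_statement("analytics.orders", "lake_stage", "raw/onprem_sales/orders"))
```

The generated text would still have to be run against Snowflake, for example from a worksheet or from inside a stored procedure. The function only builds the statement so the shape of the load is easy to see.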

💡 Can you list, in the comment section, the tools available on the market for transferring data from an on-prem database to AWS S3?

So here is an outline of the architecture for the migration project we are going to implement.

Note 😀: I am not a super professional at creating astounding architectures, so just let me know if there is any room for improvement.

sample_data_migration.png

Image Author: Naveen Kumar Vadlamudi

In the next part I will:

a) Describe the detailed Snowflake procedure that will transfer the data from the S3 bucket to Snowflake.

b) Show you how to automate the process for all the tables.

c) Cover capturing CDC (change data capture) for SCD-2 (slowly changing dimension type 2).

d) Cover performing unit tests on the data, and much more.

Until then, stay tuned to the blog, and if you have learned something new, please don't forget to hit the like button, as it will encourage me to write more blogs like this.

Please share this article if you find it interesting or helpful 🙏.
