Airflow server to manage data pipelines
Siminvest is an investing platform where users can find the best mutual funds and invest their money without hassle.
Apache Airflow is an open-source workflow management platform for data engineering pipelines. It lets us author and schedule workflows programmatically and monitor them through the built-in Airflow user interface. It is commonly used to orchestrate ETL (Extract, Transform, Load) workflows.
Using Airflow, we write data pipelines that monitor data as it moves from various sources (iqplus, etc.) to its destination.
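As a rough illustration of what such a monitoring pipeline could look like, here is a minimal sketch of a daily DAG that compares row counts between a source and its destination. The DAG name, the task name, the two fetch_* helpers, and the row-count check itself are assumptions made for this example, not the actual Siminvest code. It assumes Airflow 2.4 or later.

```python
from datetime import datetime

try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:  # lets the check itself be tested without Airflow installed
    DAG = None


def fetch_source_count() -> int:
    """Hypothetical placeholder: count rows available at the source (e.g. iqplus)."""
    return 100


def fetch_destination_count() -> int:
    """Hypothetical placeholder: count rows that arrived at the destination."""
    return 100


def check_counts(source_count: int, destination_count: int) -> None:
    """Raise if the counts differ; an exception marks the Airflow task as failed."""
    if source_count != destination_count:
        raise ValueError(
            f"Row count mismatch: source={source_count}, "
            f"destination={destination_count}"
        )


def monitor_iqplus() -> None:
    """Task entry point: fetch both counts and compare them."""
    check_counts(fetch_source_count(), fetch_destination_count())


if DAG is not None:
    with DAG(
        dag_id="monitor_iqplus",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
        catchup=False,
    ) as dag:
        PythonOperator(task_id="check_row_counts", python_callable=monitor_iqplus)
```

Keeping the check in a plain function means it can be unit-tested without an Airflow installation, while the DAG block only handles scheduling.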
We use the open-source Apache Airflow tool; all scripts are written in Python and deployed on a GCP VM instance.
From day one, we built the CI/CD pipeline with Bitbucket so that as soon as we commit code, it is deployed to testing and production.
We also integrated Slack to receive alerts for failed jobs.
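One common way to wire up this kind of alert is an Airflow on_failure_callback that posts to a Slack incoming webhook. The sketch below is an assumption about how it could be done, not the project's actual integration: the function names and the SLACK_WEBHOOK_URL environment variable are hypothetical, and only the Python standard library is used.

```python
import json
import os
import urllib.request


def build_failure_message(context: dict) -> dict:
    """Turn the Airflow task context into a Slack webhook payload."""
    ti = context["task_instance"]
    text = (
        f"Task failed: {ti.dag_id}.{ti.task_id}\n"
        f"Error: {context.get('exception')}\n"
        f"Logs: {ti.log_url}"
    )
    return {"text": text}


def notify_slack_on_failure(context: dict) -> None:
    """Intended as an on_failure_callback; posts the message to Slack."""
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]  # hypothetical variable name
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(build_failure_message(context)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()
```

To apply it to every task in a DAG, the callback can be passed through default_args, for example default_args={"on_failure_callback": notify_slack_on_failure}.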
We enabled flake8 in the CI/CD pipeline for code linting.