airflow Archives - Magenaut

Read and group JSON files by date element using PySpark

August 22, 2022 by Magenaut

I have multiple JSON files (~10 TB) in an S3 bucket, and I need to organize these files by a date element present in every JSON document.

Setting up S3 for logs in Airflow

August 15, 2022 by Magenaut

I am using docker-compose to set up a scalable Airflow cluster. I based my approach on this Dockerfile: https://hub.docker.com/r/puckel/docker-airflow/

How to create a conditional task in Airflow

August 11, 2022 by Magenaut

I would like to create a conditional task in Airflow as described in the schema below. The expected scenario is the following: