Read and group JSON files by a date element using PySpark
I have multiple JSON files (about 10 TB in total) in an S3 bucket, and I need to organize them by a date element that is present in every JSON document.
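To make the goal concrete, here is a rough sketch of the kind of job I have in mind. It is not a tested solution. The bucket name, the prefixes, the field name `date`, and the helper names `s3_uri` and `organize_by_date` are all placeholders I made up for illustration. It also assumes one JSON document per line and a date field stored as an ISO-8601 string.

```python
def s3_uri(bucket: str, prefix: str) -> str:
    """Build an s3a:// URI. Bucket and prefix are hypothetical placeholders."""
    return f"s3a://{bucket}/{prefix.strip('/')}/"


def organize_by_date(spark, src: str, dst: str, date_field: str = "date") -> None:
    """Read JSON documents from src and rewrite them under dst, one folder per day."""
    # Imported here so this module still loads where PySpark is not installed.
    from pyspark.sql import functions as F

    # Assumes JSON Lines input. Without an explicit schema, Spark scans the
    # data once just to infer one, which is costly at this volume.
    df = spark.read.json(src)

    (
        # Assumes date_field holds an ISO-8601 date or timestamp string.
        df.withColumn("event_date", F.to_date(F.col(date_field)))
        # Produces dst/event_date=YYYY-MM-DD/ directories.
        .write.partitionBy("event_date")
        .json(dst)
    )


def main() -> None:
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("group-json-by-date").getOrCreate()
    # "my-bucket", "raw" and "by-date" are made-up names.
    organize_by_date(spark, s3_uri("my-bucket", "raw"), s3_uri("my-bucket", "by-date"))
```

In this sketch `partitionBy` does the grouping at write time, so no explicit `groupBy` is involved. I am not sure whether that is the right approach for roughly 10 TB of input.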