GroupBy column and filter rows with maximum value in PySpark
I am almost certain this has been asked before, but a search through Stack Overflow did not answer my question. It is not a duplicate of [2], since I want the maximum value, not the most frequent item. I am new to PySpark and trying to do something really simple: I want to groupBy column “A” and then keep only the row of each group that has the maximum value in column “B”. Like this: