Notes about saving data with Spark 3.0 | by David Vrba ...
Oct 3, 2020 · sortWithinPartitions: this is also a DataFrame transformation, but unlike in the previous case, Spark will not try to achieve a global sort; instead, it sorts each partition separately. You can therefore first distribute the data across the Spark cluster as the final layout requires, using the repartition() function (which also triggers a shuffle), and then call sortWithinPartitions to sort the rows inside each partition.
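
To make the difference from a global sort concrete, here is a minimal plain-Python model of the two steps. It does not use Spark. The helper names `repartition` and `sort_within_partitions`, the column names `user_id` and `amount`, and the use of `key % num_partitions` in place of Spark's real hash partitioner are all assumptions made for illustration. In PySpark the corresponding chain would be roughly `df.repartition(2, "user_id").sortWithinPartitions("amount")`.

```python
from collections import defaultdict


def repartition(rows, num_partitions, key):
    """Model of the shuffle: route each row to a partition chosen from its key.

    Assumption: `key % num_partitions` stands in for Spark's hash partitioner.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key] % num_partitions].append(row)
    return [buckets[i] for i in range(num_partitions)]


def sort_within_partitions(partitions, column):
    """Sort every partition on its own; no rows move between partitions."""
    return [sorted(part, key=lambda r: r[column]) for part in partitions]


# Hypothetical sample data.
rows = [
    {"user_id": 1, "amount": 50},
    {"user_id": 2, "amount": 10},
    {"user_id": 3, "amount": 40},
    {"user_id": 4, "amount": 30},
    {"user_id": 5, "amount": 20},
    {"user_id": 6, "amount": 60},
]

partitions = sort_within_partitions(repartition(rows, 2, "user_id"), "amount")

for index, part in enumerate(partitions):
    print(index, [r["amount"] for r in part])
# 0 [10, 30, 60]
# 1 [20, 40, 50]
```

Each partition is ordered by `amount`, but reading the partitions one after another gives 10, 30, 60, 20, 40, 50, which is not a global order. That is the trade-off the excerpt describes: the layout is controlled by the repartitioning step, and the sort only applies inside each resulting partition.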