Cleaning data is a routine task for data professionals. The data we read from source systems is sometimes corrupt or duplicated, or needs some other kind of transformation before it fits our needs.
In this post, I demonstrate a few common data-cleaning tasks in Spark, using Python and SQL.
See the full post on my blog - https://chenhirsh.com/cleaning-data-with-spark/