This tutorial covers how to explore, clean, and model data related to book sales. The data comes from Kaggle, courtesy of the BookCrossing project.
The data for this project is split into three datasets: one describing the users, one describing the books, and one containing all of the user-generated ratings of the books.
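To make the relationship between the three datasets concrete, here is a minimal Python sketch of how they might fit together outside Fire Insights. It is an illustration under assumptions, not the tutorial's own method: the column names (`User-ID`, `ISBN`, `Book-Rating`, `Book-Title`, `Book-Author`, `Location`, `Age`) are assumed, the sample rows are invented placeholders, and the helper names `read_rows` and `join_ratings` are hypothetical.

```python
import csv
import io

# Invented placeholder rows; column names are assumptions about the dataset layout.
USERS_CSV = """User-ID,Location,Age
1,"springfield, usa",34
2,"lyon, france",27
"""

BOOKS_CSV = """ISBN,Book-Title,Book-Author
111,Example Book A,Author One
222,Example Book B,Author Two
"""

RATINGS_CSV = """User-ID,ISBN,Book-Rating
1,111,8
2,111,6
2,222,9
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_ratings(users, books, ratings):
    """Attach user and book details to each rating (an inner join on both keys)."""
    users_by_id = {u["User-ID"]: u for u in users}
    books_by_isbn = {b["ISBN"]: b for b in books}
    joined = []
    for r in ratings:
        user = users_by_id.get(r["User-ID"])
        book = books_by_isbn.get(r["ISBN"])
        if user is None or book is None:
            # Skip ratings that reference an unknown user or book.
            continue
        joined.append({
            "user_id": r["User-ID"],
            "age": int(user["Age"]),
            "isbn": r["ISBN"],
            "title": book["Book-Title"],
            "rating": int(r["Book-Rating"]),
        })
    return joined


joined = join_ratings(
    read_rows(USERS_CSV), read_rows(BOOKS_CSV), read_rows(RATINGS_CSV)
)
for row in joined:
    print(row)
```

The ratings dataset acts as the link between the other two: each rating refers to a user by ID and to a book by ISBN, so the user and book datasets can each be cleaned on their own before being combined.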
To reflect how data analysis would be done in Fire Insights, the different tasks have been split into separate workflows. This keeps the project organized, leaves room for future expansion, and provides performance benefits during testing and execution.
All 3 tutorials are available here: https://docs.sparkflows.io/en/latest/tutorials/end-to-end/books-recommendations/index.html