
Overview

Organizations across many industries are building powerful Recommender Systems that aim to serve relevant recommendations to their users at the right time.

However, building successful Recommender Systems is extremely challenging for several reasons. Cleaning the incoming datasets, joining very different datasets, enriching them further, building big data machine learning models to predict recommendations, and loading these models and the associated datasets into serving stores like Apache HBase™ and Apache Cassandra™ requires a great deal of data processing, coordination, and orchestration. Handling near-real-time (NRT) data adds even more complexity to an already complex problem.

Sparkflows addresses each of these steps with pre-built Connectors, Processors, and Workflows. In addition to the standard connectors and processors, Sparkflows provides streaming workflows that process NRT streaming data and load it into HBase and similar stores. As a result, pipelines are built and tested in hours instead of weeks.

Sparkflows has built-in support for machine learning: test various ML models, compute results, and load them into HBase and similar stores for serving. It thus smoothly supports the Lambda Architecture, combining both Batch and Streaming to get great results!

Consumers expect systems (including websites) to be highly intelligent, understand their needs, and recommend the products and services they would like at that specific time. Consumers love systems that can read their minds and make their engagements seamless.

There are far too many options today

Everyone is overloaded with choices

But everyone consumes only a select few things

It is extremely important to recommend the right things at the right time to every person.

There are many kinds of Recommendations, and many Contexts in which consumers consume them

recommendation-datasets.png

Several datasets from a wide variety of systems are used to Predict Recommendations that will drive sales and engagement 

However, Building Powerful End-to-End Recommender Systems is Extremely Complex

Challenges

Distributed Systems

Data from many different systems needs to be connected. Handling the various file formats, images, etc. gets daunting.

Complex Jobs for Data Enriching

Acquiring, Cleaning, Combining and Enriching Big Data is very complex.

Building Jobs for Performance

Building Big Data Batch & Streaming Jobs for Performance is very hard.

Predicting Recommendations

Choosing algorithms and predicting recommendations at Big Data scale gets very challenging

Operationalization

Operationalizing the distributed system end to end quickly becomes complex

Team Size

Most teams are not large enough to build and operationalize these complex end-to-end Big Data systems

Recommender Systems can be built on Sparkflows quickly, using the pre-built connectors and processors

Various Kinds of Recommender Systems

Collaborative Filtering

People who agreed in the past will agree in the future

Content Based Filtering

Recommend Items similar to what the user liked in the past

Hybrid Recommendations

Collaborative Filtering + Content Based

Frequent Pattern Mining (FP-Growth)

Find Items which are frequently bought together

Simple Aggregates
  • Top N

  • Most Popular

  • Recent Uploads

Search Based

Using Search Engines

sparkflows-recommender-flow.png

Recommender System powered by Sparkflows

Sparkflows powers each step of building a Recommender System. Building a Recommender System is a highly iterative, complex process involving many people, which can make it immensely difficult to build out.

Sparkflows makes each step of that process seamless, and makes it easy for anyone to understand and update the system at any point in time.

customer-360-read-data.png

Step 1: Choose your data source

Sparkflows supports a variety of data sources both batch and streaming.

 

Connectors for CSV, Apache Kafka, JDBC, Marketo, MongoDB, Apache HBase, etc. are available out of the box. You only need to configure them to point to the right data source.
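
Under the hood, Sparkflows workflows run on Apache Spark, so the equivalent hand-written step looks roughly like the sketch below. The paths, hostnames, topic names, and credentials are placeholders, and the Kafka read assumes the spark-sql-kafka package is on the classpath.

```python
# Sketch: reading batch and streaming sources with PySpark.
# Paths, hostnames, table names, and credentials are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("recommender-ingest").getOrCreate()

# Batch: CSV files with a header row, schema inferred.
ratings = (spark.read
           .option("header", "true")
           .option("inferSchema", "true")
           .csv("/data/ratings.csv"))

# Batch: a JDBC table (e.g. orders from an operational database).
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://db-host:5432/shop")
          .option("dbtable", "public.orders")
          .option("user", "reader")
          .option("password", "secret")
          .load())

# Streaming: click events from an Apache Kafka topic.
clicks = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "kafka-host:9092")
          .option("subscribe", "click-events")
          .load())
```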

clean-transform-enrich.png

Step 2: Clean, Transform, Combine and Enrich

Clean, Combine, Join, De-duplicate, Transform and Enhance data with 200+ pre-built processors.
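
As a rough PySpark equivalent of this step, the sketch below de-duplicates the ratings DataFrame from the previous sketch, joins it with an assumed items metadata DataFrame, and derives a simple recency feature; all column names are illustrative.

```python
# Sketch: typical cleaning and enrichment steps in PySpark.
# Assumes `ratings` from the previous sketch and an `items` metadata DataFrame;
# column names (user_id, item_id, event_time, ...) are illustrative.
from pyspark.sql import functions as F

# Drop exact duplicates and rows missing the keys we join on.
ratings_clean = (ratings
                 .dropDuplicates(["user_id", "item_id", "event_time"])
                 .dropna(subset=["user_id", "item_id"]))

# Combine with item metadata and derive a simple recency feature.
enriched = (ratings_clean
            .join(items, on="item_id", how="left")
            .withColumn("days_since_event",
                        F.datediff(F.current_date(), F.col("event_time"))))
```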

Step 3: Build a Variety of Recommenders with Sparkflows

Collaborative Filtering

Using ALS Processor
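
An ALS-based collaborative filter corresponds to the ALS estimator in Spark MLlib. A minimal hand-written sketch, assuming the cleaned ratings DataFrame from Step 2 with user_id, item_id, and rating columns:

```python
# Sketch: collaborative filtering with ALS in Spark MLlib.
# Assumes `ratings_clean` from the Step 2 sketch; column names are illustrative.
from pyspark.ml.recommendation import ALS

als = ALS(userCol="user_id", itemCol="item_id", ratingCol="rating",
          rank=32, regParam=0.1, coldStartStrategy="drop")
model = als.fit(ratings_clean)

# Top 10 recommended items per user, ready to be written to a serving store.
user_recs = model.recommendForAllUsers(10)
```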

Rich User & Item Profiles

Using the power of 190+ Processors

Content Based Filtering

Using Similarity Processor

Using Clustering Processors
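
A hand-written content-based sketch, assuming an items DataFrame with item_id and description columns: TF-IDF features are normalized to unit length and an approximate similarity join finds near-neighbour items (on unit vectors, a small Euclidean distance corresponds to a high cosine similarity).

```python
# Sketch: content-based similarity on item descriptions using TF-IDF features
# and an approximate similarity join. Column names are illustrative.
from pyspark.ml import Pipeline
from pyspark.ml.feature import (Tokenizer, HashingTF, IDF, Normalizer,
                                BucketedRandomProjectionLSH)

pipeline = Pipeline(stages=[
    Tokenizer(inputCol="description", outputCol="tokens"),
    HashingTF(inputCol="tokens", outputCol="tf", numFeatures=1 << 16),
    IDF(inputCol="tf", outputCol="tfidf"),
    # Unit-length vectors so Euclidean distance tracks cosine similarity.
    Normalizer(inputCol="tfidf", outputCol="features"),
])
featurized = pipeline.fit(items).transform(items)

# Approximate nearest neighbours: pairs of distinct items within a small distance.
lsh = BucketedRandomProjectionLSH(inputCol="features", outputCol="hashes",
                                  bucketLength=0.5)
similar_items = (lsh.fit(featurized)
                 .approxSimilarityJoin(featurized, featurized, 0.8, distCol="dist")
                 .filter("datasetA.item_id != datasetB.item_id"))
```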

Frequent Pattern Mining

Using FP-Growth Processor
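
Frequent pattern mining corresponds to FPGrowth in Spark MLlib. A minimal sketch, assuming a transactions DataFrame with a basket column holding the array of item ids in each order:

```python
# Sketch: frequent pattern mining with FP-Growth in Spark MLlib.
# Assumes `transactions` with a "basket" array column; thresholds are illustrative.
from pyspark.ml.fpm import FPGrowth

fp = FPGrowth(itemsCol="basket", minSupport=0.01, minConfidence=0.2)
fp_model = fp.fit(transactions)

fp_model.freqItemsets.show()       # items frequently bought together
fp_model.associationRules.show()   # "people who bought X also bought Y"
```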

Top-N

Easily Compute various Aggregates
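
Popularity-style aggregates are plain DataFrame operations. For example, the top 10 items per category over the last 30 days, assuming the enriched DataFrame from Step 2 (the category and recency columns are illustrative):

```python
# Sketch: simple popularity-based recommendations, e.g. top 10 items per category.
from pyspark.sql import functions as F
from pyspark.sql.window import Window

recent = enriched.filter(F.col("days_since_event") <= 30)
counts = recent.groupBy("category", "item_id").count()

w = Window.partitionBy("category").orderBy(F.desc("count"))
top_n = (counts
         .withColumn("rank", F.row_number().over(w))
         .filter("rank <= 10"))
```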

Stream Processing

Handle NRT data with seamless Streaming Processors
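
A Structured Streaming sketch of the same idea, assuming the Kafka stream from the Step 1 sketch; the JSON schema is illustrative, and the console sink stands in for a real serving-store sink:

```python
# Sketch: parsing NRT click events from Kafka with Structured Streaming
# and maintaining rolling item counts. Schema and windows are illustrative.
from pyspark.sql import functions as F

events = (clicks  # the Kafka stream from the Step 1 sketch
          .select(F.from_json(F.col("value").cast("string"),
                              "user_id STRING, item_id STRING, ts TIMESTAMP")
                  .alias("e"))
          .select("e.*"))

rolling_counts = (events
                  .withWatermark("ts", "10 minutes")
                  .groupBy(F.window("ts", "1 hour"), "item_id")
                  .count())

query = (rolling_counts.writeStream
         .outputMode("update")
         .format("console")   # swap for an HBase/Cassandra sink in practice
         .start())
```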

Step 4: Build Hybrid Recommender Systems

hybrid-recommender.png

Easily combine the results of the various recommenders you have built into a single, stronger set of recommendations.
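
A simple way to blend recommenders is a weighted combination of their scores. The sketch below assumes two hypothetical DataFrames, cf_scores (user_id, item_id, cf_score) from collaborative filtering and content_scores (user_id, item_id, cb_score) from content-based filtering; the 0.7/0.3 weights are illustrative.

```python
# Sketch: hybrid recommendations as a weighted blend of two score sources.
# `cf_scores` and `content_scores` are assumed outputs of earlier recommenders.
from pyspark.sql import functions as F

hybrid = (cf_scores
          .join(content_scores, on=["user_id", "item_id"], how="outer")
          .fillna(0.0, subset=["cf_score", "cb_score"])
          .withColumn("hybrid_score",
                      0.7 * F.col("cf_score") + 0.3 * F.col("cb_score")))
```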

customer-360-predictions.png

Step 5: Apply more ML/NLP

Enrich the user and item profiles with further ML/NLP in Sparkflows.
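
As one example of ML-based enrichment, the sketch below segments users with K-Means on a few behavioural features; the user_profiles DataFrame and its feature columns are assumptions made for illustration.

```python
# Sketch: enriching user profiles with an ML-derived segment via K-Means.
# `user_profiles` and its feature columns are illustrative assumptions.
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

assembler = VectorAssembler(
    inputCols=["days_since_event", "order_count", "avg_basket_value"],
    outputCol="features")
profiles = assembler.transform(user_profiles)

kmeans = KMeans(k=8, featuresCol="features", predictionCol="segment")
segmented_profiles = kmeans.fit(profiles).transform(profiles)
```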

recommendation-power-apps.png

Step 6: Load Recommendations into Serving Stores & Power Intelligent Applications 

Load the recommendations and profiles into serving stores such as Apache HBase, Apache Cassandra, and Elasticsearch, and power intelligent applications such as personalization, virtual assistants, proactive case management, demand prediction, churn prediction, fraud detection, etc. with ease.
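
A sketch of the final load, here using the DataStax Spark Cassandra Connector (which must be on the Spark classpath); final_recs is assumed to be a flat DataFrame of (user_id, item_id, score), and the keyspace and table names are placeholders. Writes to HBase or Elasticsearch follow the same pattern with their respective connectors.

```python
# Sketch: publishing final recommendations to a Cassandra serving table.
# Requires the Spark Cassandra Connector; names below are placeholders.
(final_recs.write
 .format("org.apache.spark.sql.cassandra")
 .option("keyspace", "recommendations")
 .option("table", "user_recs")
 .mode("append")
 .save())
```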

recommendation-sparkflows-arch.png

Bringing it All Together

Sparkflows makes it seamless to build out these powerful Recommender Systems.

Sparkflows handles both streaming and batch workloads, thus enabling the Lambda Architecture. Process streams from Apache Kafka and load them into HBase, Solr, etc.

Process batch jobs, perform ML/NLP and load results into the serving stores.

Sparkflows Differentiators

10X Faster

Build out use cases in weeks instead of months with native connectors and processors

Iterate quickly

Iterate quickly with visual workflows and built-in version control

Go Further

Go even further with built-in nodes for ML, NLP, sentiment analysis, etc.
