


Collaborative Self-Serve Advanced Analytics with Sparkflows + Google Cloud Platform (GCP)
Perform data analytics, data exploration, and data engineering, and build ML models, in minutes using the 450+ processors in Sparkflows

Sparkflows is deeply integrated with and certified on GCP. It can be installed on a Compute Engine VM, run in standalone mode, or submit jobs to Dataproc or Google Cloud Dataprep. It can process data from Google Cloud Storage, BigQuery, Google Cloud Dataflow, etc.

Build and run analytics and ML jobs on Dataproc or standalone machines.
Seamlessly read files from Google Cloud Storage and process them.
Send data to and build ML models on Google Cloud Datalab.
Read and write data to Google BigQuery.
Read and process streaming data from Apache Kafka and Dataflow.
Results include data in charts, tables, text, etc.

Integration with Google Cloud Dataproc
Sparkflows can be easily installed on a Google Dataproc cluster. It is installed on the master node of the Dataproc cluster and then submits jobs to that cluster.
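
For readers who want to see what a job submission to Dataproc looks like outside Sparkflows, here is a minimal sketch that builds a `gcloud dataproc jobs submit pyspark` command. It is not how Sparkflows submits jobs internally, which this page does not describe. The helper name `dataproc_submit_command` and the bucket, cluster, and region values are hypothetical placeholders.

```python
import shlex


def dataproc_submit_command(main_file, cluster, region, job_args=()):
    """Build the argument list for `gcloud dataproc jobs submit pyspark`.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    cmd = [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", main_file,
        f"--cluster={cluster}",
        f"--region={region}",
    ]
    if job_args:
        # Everything after "--" is passed through to the job itself.
        cmd.append("--")
        cmd.extend(job_args)
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace with a real script, cluster, and region.
    cmd = dataproc_submit_command(
        "gs://example-bucket/jobs/wordcount.py",
        "example-cluster",
        "us-central1",
        ["gs://example-bucket/input/"],
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where the Google Cloud CLI is installed and authenticated.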

Integration with Google Cloud Dataprep
Sparkflows can submit analytical jobs to run on Google Cloud Dataprep. The results and visualizations are displayed back in Sparkflows.

Integration with BigQuery
Sparkflows is fully integrated with BigQuery. Sparkflows has processors for reading from and writing to BigQuery. They include:
Read BigQuery
Write BigQuery
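
As a rough illustration of what reading from and writing to BigQuery involves in a Spark job, the sketch below uses the open-source spark-bigquery-connector data source (`format("bigquery")`). It assumes the connector is available on the cluster. It is not the implementation of the Read BigQuery and Write BigQuery processors, which this page does not describe. The function names and the table and bucket values are hypothetical.

```python
def read_bigquery(spark, table):
    """Load a BigQuery table into a Spark DataFrame.

    `spark` is an existing SparkSession; `table` is a table reference
    such as "project.dataset.table".
    """
    return spark.read.format("bigquery").option("table", table).load()


def write_bigquery(df, table, temp_bucket):
    """Append a Spark DataFrame to a BigQuery table.

    The connector's indirect write path stages data in a Cloud Storage
    bucket, named here by `temp_bucket`, before loading it into BigQuery.
    """
    (
        df.write.format("bigquery")
        .option("table", table)
        .option("temporaryGcsBucket", temp_bucket)
        .mode("append")
        .save()
    )


# Example usage inside a PySpark job (placeholder names):
#   df = read_bigquery(spark, "example-project.sales.orders")
#   write_bigquery(df, "example-project.sales.orders_copy", "example-temp-bucket")
```

Passing the session and DataFrame in as arguments keeps the helpers independent of how the Spark session is created on Dataproc or a standalone machine.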

Integration with Google Cloud Storage
Sparkflows allows you to access your files on Cloud Storage. The jobs run by Sparkflows can read from and write to files on Cloud Storage. The files can be in various file formats including CSV, JSON, Parquet, Avro, etc. Sparkflows also allows you to browse your files on Cloud Storage.

Integration with Google Cloud Datalab
Sparkflows is fully integrated with Google Cloud Datalab. Sparkflows provides a number of processors for doing model building with Datalab. These include:
LinearLearnerBinaryClassifier
LinearLearnerRegressor
PCASageMakerEstimator
SaveSageMaker
KMeansSageMakerEstimator
XGBoostSageMakerEstimator
LDASageMakerEstimator
Benefits of Sparkflows on Google Cloud
Find quick value with Sparkflows and GCP
Enable Business Analysts
Enable business analysts to find quick value with GCP clusters.
Self-Serve Advanced Analytics
Enable users to do analytics and machine learning in minutes.
10x More Users
Enable 10x more users to build data science use cases.
Makes it easy to build, maintain and execute