
Collaborative Self-Serve Advanced Analytics with Sparkflows + Google Cloud Platform (GCP)

Perform data analytics, data exploration, and data engineering, and build ML models, in minutes using the 450+ processors in Sparkflows.


Sparkflows is deeply integrated with and certified on GCP. It can be installed on a Google Compute Engine machine, run in standalone mode, or submit jobs to Dataproc or Google Cloud Dataprep. It can process data from Google Cloud Storage, BigQuery, Google Cloud Dataflow, etc.

[Figure: Sparkflows on GCP architecture]

Build and Run Analytics and ML jobs on Dataproc or standalone machines.

Seamlessly read files from Google Cloud Storage and process them.

Send data to and build ML models on Google Cloud Datalab

Read and write data to Google BigQuery

Read and process streaming data from Apache Kafka and Dataflow

Results are displayed as charts, tables, text, etc.

Integration with Google Cloud Dataproc

Sparkflows can be easily installed on a Google Dataproc cluster. It is installed on the master node of the cluster and then submits jobs to that Dataproc cluster.
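
For readers who want a feel for what submitting a Spark job to Dataproc involves, here is a minimal sketch that assembles a `gcloud dataproc jobs submit spark` command line. It is an illustration only and does not describe how Sparkflows itself submits jobs. The helper name `dataproc_submit_command` and every value passed to it (cluster name, region, class, JAR path, job arguments) are hypothetical placeholders.

```python
import shlex


def dataproc_submit_command(cluster, region, main_class, jar_uri, job_args=()):
    """Build the argument list for `gcloud dataproc jobs submit spark`.

    Hypothetical helper for illustration; all inputs are placeholders.
    """
    argv = [
        "gcloud", "dataproc", "jobs", "submit", "spark",
        f"--cluster={cluster}",
        f"--region={region}",
        f"--class={main_class}",
        f"--jars={jar_uri}",
    ]
    if job_args:
        # Everything after "--" is handed to the Spark job, not to gcloud.
        argv.append("--")
        argv.extend(job_args)
    return argv


if __name__ == "__main__":
    cmd = dataproc_submit_command(
        cluster="example-cluster",            # placeholder cluster name
        region="us-central1",                 # placeholder region
        main_class="com.example.WorkflowJob", # placeholder main class
        jar_uri="gs://example-bucket/jars/workflow-job.jar",
        job_args=["--workflow", "example.json"],
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument intact if it is later passed to `subprocess.run`.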

Integration with Google Cloud Dataprep

Sparkflows can submit analytical jobs to run on Google Cloud Dataprep. The results and visualizations are displayed back in Sparkflows.

Integration with BigQuery

Sparkflows is fully integrated with BigQuery. Sparkflows has processors for reading from and writing to BigQuery. They include:

Read BigQuery

Write BigQuery
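
As a rough idea of what reading from and writing to BigQuery looks like in plain Spark, the sketch below uses the open-source spark-bigquery connector. This is an assumption made for illustration: the page does not say how the Read BigQuery and Write BigQuery processors are implemented. The function names and the project, dataset, table, and bucket values are hypothetical, and the `spark` and `df` arguments are expected to be an existing SparkSession and DataFrame on a cluster where the connector is available.

```python
def bigquery_read_options(project, dataset, table):
    """Connector options for reading one table (hypothetical helper)."""
    return {"table": f"{project}.{dataset}.{table}"}


def bigquery_write_options(project, dataset, table, temp_bucket):
    """Connector options for an indirect write staged through a GCS bucket."""
    return {
        "table": f"{project}.{dataset}.{table}",
        "temporaryGcsBucket": temp_bucket,
    }


def read_table(spark, project, dataset, table):
    """Load a BigQuery table as a DataFrame using an existing SparkSession."""
    reader = spark.read.format("bigquery")
    for key, value in bigquery_read_options(project, dataset, table).items():
        reader = reader.option(key, value)
    return reader.load()


def write_table(df, project, dataset, table, temp_bucket, mode="append"):
    """Write a DataFrame to a BigQuery table, appending by default."""
    writer = df.write.format("bigquery").mode(mode)
    options = bigquery_write_options(project, dataset, table, temp_bucket)
    for key, value in options.items():
        writer = writer.option(key, value)
    writer.save()
```

Keeping the option dictionaries separate from the Spark calls makes the table reference easy to check without a running cluster.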

Integration with Google Cloud Storage

Sparkflows allows you to access your files on Cloud Storage. The jobs run by Sparkflows can read from and write to files on Cloud Storage. The files can be in various file formats including CSV, JSON, Parquet, Avro, etc. Sparkflows also allows you to browse your files on Cloud Storage.
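
To make the file-format point concrete, here is a small sketch that builds a `gs://` path and picks a format name from the file extension. It covers only the four formats named above and is not taken from Sparkflows; the helper names `gcs_uri` and `infer_format`, and the bucket and object names in the usage lines, are hypothetical.

```python
import posixpath

# Formats named in the text above, keyed by file extension.
SUPPORTED_FORMATS = {
    ".csv": "csv",
    ".json": "json",
    ".parquet": "parquet",
    ".avro": "avro",
}


def gcs_uri(bucket, object_name):
    """Join a bucket and object name into a gs:// URI."""
    return f"gs://{bucket}/{object_name.lstrip('/')}"


def infer_format(uri):
    """Return the format name for a path, based on its extension."""
    extension = posixpath.splitext(uri)[1].lower()
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(f"Unrecognized file extension in {uri!r}")
    return SUPPORTED_FORMATS[extension]


if __name__ == "__main__":
    path = gcs_uri("example-bucket", "data/events.parquet")  # placeholder names
    print(path, infer_format(path))
```

The returned names match the short format names that Spark's DataFrame reader accepts, so a result could be passed to `spark.read.format(...)` on a cluster with the Cloud Storage connector configured.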

Integration with Google Cloud Datalab

Sparkflows is fully integrated with Google Cloud Datalab. Sparkflows provides a number of processors for model building with Datalab. These include:

LinearLearnerBinaryClassifier

LinearLearnerRegressor

PCASageMakerEstimator

SaveSageMaker

KMeansSageMakerEstimator

XGBoostSageMakerEstimator

LDASageMakerEstimator

Benefits of Sparkflows on Google Cloud

Find quick value with Sparkflows and GCP

Enable Business Analysts

Enable Business Analysts to find quick value with GCP clusters.

Self Serve Advanced Analytics

Enable users to do analytics and Machine Learning in minutes.

10x More Users

Enable 10x more users to build data science use cases.

No code and low code platform

Makes it easy to build, maintain, and execute jobs.

Return on Investment (ROI)

Solve your data science use cases 10x faster.
