Self-Service Data Science AI Platform

Installation

Overview

Sparkflows can be installed in the cloud or on premises. It can be installed on AWS, Azure, Google Cloud, Databricks, Cloudera, and Hortonworks.

Databricks

Sparkflows can be installed on one or more machines. It submits its jobs to the Databricks cluster.

AWS

Sparkflows can be installed on AWS. It can be deployed on a standalone EC2 machine, where it can read data from sources such as S3 and Redshift, process it, and write the results back to S3, Redshift, and similar stores.

Alternatively, it can be installed on the edge node of an EMR cluster, in which case it submits jobs to the EMR cluster for processing.
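
As a rough illustration of what submitting a job from an edge node can look like, the sketch below assembles a standard `spark-submit` command that targets YARN, the resource manager on an EMR cluster. This is not Sparkflows' own submission code, which the text does not describe; the function name `build_spark_submit`, the jar name, and the class name are hypothetical placeholders.

```python
from typing import List


def build_spark_submit(app_jar: str, main_class: str,
                       deploy_mode: str = "cluster") -> List[str]:
    """Build a spark-submit argument list that targets YARN.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",            # hand the job to the cluster's YARN
        "--deploy-mode", deploy_mode,  # where the Spark driver runs
        "--class", main_class,         # entry point inside the jar
        app_jar,
    ]


# Placeholder jar and class names, not real Sparkflows artifacts.
cmd = build_spark_submit("workflow-job.jar", "com.example.WorkflowJob")
print(" ".join(cmd))
```

With `--deploy-mode cluster` the driver runs inside the cluster, so the edge node only launches the job; `client` keeps the driver on the edge node.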

GCP

Sparkflows can be installed on GCP. It can be deployed on a standalone Compute Engine VM, where it can read data from sources such as Cloud Storage and BigQuery, process it, and write the results back to Cloud Storage, BigQuery, and similar stores.

Alternatively, it can be installed on a node of a Dataproc cluster, in which case it submits jobs to the Dataproc cluster for processing.

Azure

Sparkflows can be installed on Azure. It can be deployed on a standalone machine, where it can read data from sources such as ADLS and SQL Server, process it, and write the results back to ADLS, SQL Server, and similar stores.

Alternatively, it can be installed on the edge node of an HDInsight cluster, in which case it submits jobs to the HDInsight cluster for processing.

Cloudera

Sparkflows can be installed on the edge node of a Cloudera cluster. It then submits jobs to the cluster and interacts with Hive, HDFS, Kafka, and related services.

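
The installation options for Databricks, AWS, Azure, and Cloudera can be summarized as a small lookup table. The sketch below is only a reading aid: the `Target` type, the mode labels `standalone` and `edge-node`, and the `describe` function are hypothetical names and are not part of Sparkflows.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Target(NamedTuple):
    install_location: str
    job_cluster: Optional[str]  # None: the text names no separate cluster


# Keys are (platform, mode); both labels are illustrative assumptions.
TARGETS: Dict[Tuple[str, str], Target] = {
    ("databricks", "standalone"): Target("one or more machines", "Databricks"),
    ("aws", "standalone"): Target("a standalone EC2 machine", None),
    ("aws", "edge-node"): Target("the edge node of an EMR cluster", "EMR"),
    ("azure", "standalone"): Target("a standalone machine", None),
    ("azure", "edge-node"): Target("the edge node of an HDInsight cluster", "HDInsight"),
    ("cloudera", "edge-node"): Target("the edge node of a Cloudera cluster", "Cloudera"),
}


def describe(platform: str, mode: str) -> str:
    """Return a one-line summary of where Sparkflows is installed and where jobs go."""
    target = TARGETS[(platform.lower(), mode.lower())]
    if target.job_cluster is None:
        return f"Install on {target.install_location}; no separate cluster is named."
    return (f"Install on {target.install_location}; "
            f"jobs are submitted to the {target.job_cluster} cluster.")


print(describe("AWS", "edge-node"))
```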