Installation

Overview

Sparkflows can be installed in the cloud or on-premises. Supported platforms include AWS, Azure, Google Cloud, Databricks, Cloudera, and Hortonworks.

AWS

Sparkflows can be installed on AWS. It can be deployed on a standalone EC2 machine, where it reads data from sources such as S3 and Redshift, processes it, and writes the results back to S3, Redshift, and similar stores.

Alternatively, it can be installed on the edge node of an EMR cluster, in which case it submits jobs to the EMR cluster for processing.
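
To make the edge-node pattern concrete, the sketch below builds a generic `spark-submit` command that targets YARN, the resource manager EMR clusters use. It is only an illustration of what a submission from an edge node typically looks like. Sparkflows performs the submission itself, and how it does so is not described here. The function name, jar path, class name, and bucket paths are all hypothetical.

```python
import shlex
from typing import List, Optional


def build_spark_submit(
    app_jar: str,
    main_class: str,
    app_args: Optional[List[str]] = None,
    deploy_mode: str = "cluster",
) -> List[str]:
    """Build a spark-submit argument list that targets the cluster's YARN resource manager.

    Hypothetical helper for illustration only; it is not part of Sparkflows.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    cmd = [
        "spark-submit",
        "--master", "yarn",            # hand the job to the cluster instead of running locally
        "--deploy-mode", deploy_mode,  # 'cluster' runs the driver on the cluster as well
        "--class", main_class,
        app_jar,
    ]
    cmd.extend(app_args or [])
    return cmd


if __name__ == "__main__":
    # All names and paths below are placeholders.
    command = build_spark_submit(
        "/path/to/example-job.jar",
        "com.example.ExampleJob",
        ["s3://example-bucket/input", "s3://example-bucket/output"],
    )
    print(shlex.join(command))
```

The same shape applies to any Hadoop-style cluster with an edge node: the machine only needs the cluster's client configuration, and the processing happens on the cluster.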

Azure

Sparkflows can be installed on Azure. It can be deployed on a standalone machine, where it reads data from sources such as ADLS and SQL Server, processes it, and writes the results back to ADLS, SQL Server, and similar stores.

Alternatively, it can be installed on the edge node of an HDInsight cluster, in which case it submits jobs to the HDInsight cluster for processing.

Cloudera

Sparkflows can be installed on the edge node of a Cloudera cluster. It then submits jobs to the cluster. Sparkflows interacts with Hive, HDFS, Kafka, and other services.

Databricks

Sparkflows can be installed on one or more machines. Jobs are submitted to the Databricks cluster.

Laptop

Sparkflows can be installed on a standalone machine, such as a laptop.
