Visual Application Development

Build workflows by dragging and dropping

Rich collection of 250+ Processors

View results of previous runs

Machine Learning

Classification / Clustering / Regression

Collaborative Filtering

Save/Load Model / Predict

Cross Validator
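As a rough illustration of what a cross validator does, the sketch below splits row indices into k folds, so that each fold serves once as the held-out test set. The function name `k_fold_indices` and the contiguous, unshuffled fold layout are assumptions made for this example and do not describe the Sparkflows node itself.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be trained on each `train` split and scored on the matching `test` split, with the scores averaged across folds.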

Machine Learning Engines




scikit-learn


Data Preparation

Prepare Data Seamlessly

Connect to various Sources & Sinks

Filter, join, group, validate, and impute data

Connect to various Sources & Sinks

Batch sources: HDFS, Apache Hive, Amazon S3

Streaming sources: Kafka, Flume

NoSQL sources: HBase, Solr, Elasticsearch

File Formats

Work with a variety of file formats including CSV/TSV, Avro, Parquet, JSON.

Intelligent Schema Inference for the various Datasets
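To make the idea of schema inference concrete, here is a minimal sketch that guesses a type for each column of a small CSV sample by trying progressively wider types. The helper names (`infer_type`, `infer_schema`) and the type labels are hypothetical, chosen for this example only; the product's actual inference logic is not described here.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first and widen only when parsing fails.
    for type_name, caster in (("integer", int), ("double", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text, delimiter=","):
    """Infer a {column: type} mapping from CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    columns = zip(*data) if data else [[] for _ in header]
    return {name: infer_type(list(col)) for name, col in zip(header, columns)}


sample = "id,price,city\n1,9.5,Pune\n2,12,Austin\n"
schema = infer_schema(sample)
print(schema)
```

Passing `delimiter="\t"` would handle a TSV sample in the same way.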


Perform NLP on large-scale data with Apache OpenNLP & StanfordNLP

Perform OCR with Tesseract

Multi-tenancy & User Management

Users can share Datasets and Workflows with groups

Create users with different roles & permissions

LDAP Integration


View workflow output as line charts, histograms, and bar charts

View random forests visually

Feature Generation


TF-IDF, One Hot Encoder

String Indexer, Impute, Scaler
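As a small illustration of two of the transformations listed above, the sketch below indexes string labels and then one-hot encodes them in plain Python. The function names and the frequency-descending index order are assumptions for this example and are not taken from the Sparkflows nodes.

```python
from collections import Counter


def fit_string_indexer(values):
    """Map each distinct label to an index, most frequent label first."""
    counts = Counter(values)
    # Sort by descending frequency, then alphabetically to break ties.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: index for index, label in enumerate(ordered)}


def one_hot(index, size):
    """Return a list of length `size` with a single 1 at position `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


colors = ["red", "blue", "red", "green", "red", "blue"]
index_of = fit_string_indexer(colors)
encoded = [one_hot(index_of[c], len(index_of)) for c in colors]
print(index_of)
print(encoded)
```

Indexing first keeps the encoding compact: each label becomes a single integer, and the one-hot vector is derived from that integer only when a model needs it.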

Developer Toolkit

Add code using SQL, Scala, and Jython nodes

Develop custom Nodes and have them available in Workflows


Access Sparkflows through a rich set of REST APIs

Workflows / Datasets / Dashboards / Execute Workflows / Access Results of Execution / Browse HDFS / Browse Hive


Assemble the output of various workflows and nodes into a Dashboard

Build Dashboards from Relational Sources, adding filtering & drill down capabilities

Workflow Scheduling

Schedule workflows to run at various times of the day, week, or month

Trigger workflows by events in a Kafka topic.

Streaming Analytics

Connect to Apache Kafka, Apache Flume, Sockets, Twitter

Perform Streaming Analytics

Load results into Apache HBase, Apache Solr, Elasticsearch, etc.


Sparkflows Features

Fire Insights Experience