
Why Sparkflows

Uncover Data Brilliance: Embrace Sparkflows for Analytics Excellence

Introduction to Generative AI

The Sparkflows platform lets customers use Generative AI capabilities either by hosting the model and infrastructure in-house (on-prem or in the cloud within a VPC) or through APIs to licensed models such as GPT-4.

Visual Application Development

Build workflows by dragging and dropping

Rich collection of 250+ Processors

View results of previous runs

Machine Learning

Classification / Clustering / Regression

Collaborative Filtering

Save/Load Model / Predict

Cross Validator
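As a rough illustration of what a cross validator does, the sketch below splits rows into k folds, trains on k-1 of them, scores on the held-out fold, and averages the scores. It is plain Python, not Sparkflows code; the names `k_fold_indices`, `cross_validate`, `fit`, and `score` are hypothetical and chosen only for this example.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, test
        start += size


def cross_validate(rows, k, fit, score):
    """Average score(model, held_out_rows) over k folds.

    fit(train_rows) returns a model; score(model, test_rows) returns a number.
    Both are supplied by the caller (hypothetical callbacks for this sketch).
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx])
        scores.append(score(model, [rows[i] for i in test_idx]))
    return sum(scores) / len(scores)
```

In practice the rows would usually be shuffled before splitting; that step is left out here to keep the sketch short.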

Machine Learning Engines




scikit-learn


Data Preparation

Prepare Data Seamlessly

Connect to various Sources & Sinks

Filter, join, group, validate, and impute data, and more.

Connect to various Sources & Sinks

Batch sources: HDFS, Apache Hive, Amazon S3

Streaming sources: Kafka, Flume

NoSQL sources: HBase, Solr, Elasticsearch

File Formats

Work with a variety of file formats including CSV/TSV, Avro, Parquet, JSON.

Intelligent Schema Inference for the various Datasets
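To give a feel for what schema inference involves, here is a minimal sketch that samples rows from CSV text and picks the narrowest type that fits each column. It is an assumption-laden illustration in plain Python, not how Sparkflows implements it; `infer_type`, `infer_schema`, and the three type names are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of 'int', 'float', 'string' that fits all non-empty values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "string"  # nothing to go on, fall back to the widest type
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text, sample_rows=100):
    """Infer {column_name: type_name} from the header and the first sample_rows rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    sample = [row for _, row in zip(range(sample_rows), reader)]
    # Transpose rows into columns; with no data rows every column is empty.
    columns = list(zip(*sample)) if sample else [() for _ in header]
    return {name: infer_type(col) for name, col in zip(header, columns)}
```

For example, `infer_schema("id,price,name\n1,2.5,apple\n2,3,pear\n")` maps `id` to `int`, `price` to `float`, and `name` to `string`. A real implementation would also need to handle dates, booleans, and nested formats such as JSON.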


Perform NLP on large-scale data with Apache OpenNLP and Stanford NLP

Perform OCR with Tesseract

Multi-tenancy & User Management

Users can share Datasets and Workflows with groups

Create users with different roles & permissions

LDAP Integration


View output of workflows as Linechart, Histogram, Barchart

View Random forests visually

Feature Generation


TF-IDF, One Hot Encoder

String Indexer, Impute, Scaler
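The feature generators listed above can be sketched in a few lines of plain Python. This is an illustration of the general techniques only, not the Sparkflows processors; the function names are hypothetical, and the TF-IDF weighting shown (raw term count times a smoothed IDF, log((N + 1) / (df + 1))) is one common variant chosen as an assumption.

```python
import math
from collections import Counter


def string_index(values):
    """Map each distinct label to an integer, most frequent label first."""
    counts = Counter(values)
    # Ties are broken alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


def one_hot(index, size):
    """Return a list of length size with a 1 at position index and 0 elsewhere."""
    vec = [0] * size
    vec[index] = 1
    return vec


def tf_idf(documents):
    """Compute one {term: weight} dict per document.

    documents is a list of token lists. TF is the raw count of the term in
    the document; IDF is log((N + 1) / (df + 1)), where N is the number of
    documents and df is the number of documents containing the term.
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append(
            {t: c * math.log((n_docs + 1) / (doc_freq[t] + 1)) for t, c in tf.items()}
        )
    return weights
```

A typical pipeline would index a categorical column with `string_index`, expand each index with `one_hot`, and weight tokenized text with `tf_idf`. With this IDF variant, a term that appears in every document gets a weight of zero.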

Developer Toolkit

Add code using SQL, Scala, and Jython nodes

Develop custom Nodes and have them available in Workflows


Access Sparkflows through a rich set of REST APIs

The APIs cover workflows, datasets, dashboards, workflow execution, execution results, and browsing HDFS and Hive


Assemble the output of various workflows and nodes into a Dashboard

Build Dashboards from Relational Sources, adding filtering & drill down capabilities

Workflow Scheduling

Schedule workflows to run at various times of the day, week, or month

Trigger workflows by events in a Kafka topic.

Streaming Analytics

Connect to Apache Kafka, Apache Flume, Sockets, Twitter

Perform Streaming Analytics

Load results into Apache HBase, Apache Solr, Elasticsearch, etc.


The Sparkflows Experience

