Data Preparation

Sparkflows lets users build data pipelines from 100+ pre-built processors that validate, transform, and clean data. Users can also extend the set of processors with our SDK.

Push-down analytics is built into the core Sparkflows architecture: processing takes place where the data resides, which makes data governance easier.

Data Cleaning

Fix incomplete, incorrect, duplicate, or otherwise erroneous data in a dataset with Sparkflows' visual drag-and-drop data cleaning processors. Let your analysts and experts focus on delivering quality data by combining and cleaning datasets with the hassle-free, low-code options Sparkflows provides.

Data Cleaning

  • Data Wrangling

  • Dedup

  • Drop Duplicate Rows

  • Drop Rows with Null

  • Find and replace Using Regex Multiple

  • Imputing with constant

  • Imputing with a mean value

  • Imputing with mode value

  • Remove Duplicate rows

  • Remove unwanted characters

  • Imputing with Median

  • Find and Replace Using Regex

  • Remove Unwanted characters Multiple
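
As a rough illustration of what cleaning steps like these do to a dataset, here is a minimal plain-Python sketch. It is not the Sparkflows API: the function names, the sample rows, and the column names are all hypothetical, chosen only to mirror the processor names listed above.

```python
import re
from statistics import mean, median, mode

# Hypothetical sample data: one exact duplicate, one missing age, one missing city.
rows = [
    {"id": 1, "city": "New-York", "age": 30},
    {"id": 2, "city": "Boston", "age": None},
    {"id": 2, "city": "Boston", "age": None},
    {"id": 3, "city": None, "age": 40},
]

def drop_duplicate_rows(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def drop_rows_with_null(rows, columns):
    """Drop rows where any of the given columns is missing (None)."""
    return [r for r in rows if all(r[c] is not None for c in columns)]

def impute(rows, column, strategy="mean", constant=None):
    """Fill missing values with a constant or the column's mean, median, or mode."""
    present = [r[column] for r in rows if r[column] is not None]
    stats = {"mean": mean, "median": median, "mode": mode}
    fill = constant if strategy == "constant" else stats[strategy](present)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def find_and_replace_regex(rows, column, pattern, replacement):
    """Apply a regex substitution to every non-missing value in a column."""
    return [
        {**r, column: None if r[column] is None
         else re.sub(pattern, replacement, r[column])}
        for r in rows
    ]

deduped = drop_duplicate_rows(rows)
imputed = impute(deduped, "age", strategy="mean")
with_city = drop_rows_with_null(imputed, ["city"])
cleaned = find_and_replace_regex(with_city, "city", r"-", " ")
```

The order of the steps matters: here the mean age is computed before the row with the missing city is dropped, so that row still contributes to the imputed value.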

Filter

  • Drop Columns

  • Select Columns

  • Filter by Date Range

  • Row Filter

  • Filter By String Length

  • Filter By Number Range

Data Filtering

Sparkflows helps your data analysts refine datasets by selecting only the required fields and rows and discarding unwanted readings that could reduce the efficiency of data processing. Filtering out irrelevant data can also improve the accuracy of the data model.
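
The column and row filters listed above can be pictured with a short plain-Python sketch. The helper names, sample readings, and thresholds are assumptions made for illustration; they are not taken from Sparkflows.

```python
from datetime import date

# Hypothetical sensor readings used only for this example.
readings = [
    {"sensor": "A1", "value": 12.5, "taken_on": date(2023, 1, 5), "note": "ok"},
    {"sensor": "B22", "value": 250.0, "taken_on": date(2023, 2, 10), "note": "spike"},
    {"sensor": "C3", "value": 14.1, "taken_on": date(2023, 3, 1), "note": "ok"},
]

def select_columns(rows, columns):
    """Keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]

def drop_columns(rows, columns):
    """Remove the named columns and keep everything else."""
    return [{k: v for k, v in r.items() if k not in columns} for r in rows]

def filter_by_number_range(rows, column, low, high):
    """Keep rows whose numeric value lies in [low, high]."""
    return [r for r in rows if low <= r[column] <= high]

def filter_by_date_range(rows, column, start, end):
    """Keep rows whose date lies in [start, end]."""
    return [r for r in rows if start <= r[column] <= end]

def filter_by_string_length(rows, column, min_len, max_len):
    """Keep rows whose string length lies in [min_len, max_len]."""
    return [r for r in rows if min_len <= len(r[column]) <= max_len]

in_range = filter_by_number_range(readings, "value", 0, 100)
recent = filter_by_date_range(in_range, "taken_on", date(2023, 2, 1), date(2023, 12, 31))
result = select_columns(recent, ["sensor", "value"])
```

Filtering rows first and selecting columns last keeps the date column available for the range check before it is discarded.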

Data Aggregation

Sparkflows' pre-built data enrichment processors increase the information value of your data by letting users join and summarize data from multiple, disparate sources. This makes it easier to see patterns and trends in your data that would not be apparent during standard processing.

Join

  • Geo Join

  • Join on Columns

  • Join using SQL

  • Join on Common Column

  • Join on Common Columns

Union

  • Union Strict

  • Union All

  • Union Distinct
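
To make the join, union, and summarize ideas concrete, the sketch below shows an inner join on a shared column, the difference between a union that keeps duplicates and one that removes them, and a simple per-group total. It is plain Python with hypothetical names and data, not a description of how the Sparkflows processors are implemented.

```python
# Hypothetical inputs from two different sources.
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Grace"}]
orders = [
    {"cust_id": 1, "amount": 40.0},
    {"cust_id": 1, "amount": 60.0},
    {"cust_id": 3, "amount": 15.0},
]

def join_on_common_column(left, right, column):
    """Inner join: pair every left row with each right row sharing the key."""
    index = {}
    for r in right:
        index.setdefault(r[column], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[column], [])]

def union_all(a, b):
    """Concatenate two row sets, keeping duplicates."""
    return a + b

def union_distinct(a, b):
    """Concatenate two row sets and drop exact duplicate rows."""
    seen, out = set(), []
    for row in a + b:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

joined = join_on_common_column(customers, orders, "cust_id")

# Summarize: total order amount per customer name.
totals = {}
for row in joined:
    totals[row["name"]] = totals.get(row["name"], 0.0) + row["amount"]

all_rows = union_all(customers, customers[:1])
distinct_rows = union_distinct(customers, customers[:1])
```

Because this is an inner join, the customer with no orders and the order with no matching customer both drop out of `joined`.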

String

  • String Functions

  • String Functions Multiple

  • Text Case Transformer

Math

  • Math Expression

  • Math Functions Multiple

String & Math Functions

Use Sparkflows' extensive String and Math processor functions to convert data to a different format or case, compute metrics about the data, or perform mathematical manipulations. These functions help produce a clean dataset, which in turn improves the quality of the data model.
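
A small plain-Python sketch of the kind of string and math transformations described here: trimming and changing text case, then deriving a new numeric column from an expression. The helper names, columns, and sample values are hypothetical and do not reflect the actual processor configuration.

```python
import math

# Hypothetical product rows with untidy names.
products = [
    {"name": "  blue widget ", "price": 2.5, "qty": 4},
    {"name": "GADGET", "price": 10.0, "qty": 1},
]

def text_case_transform(value, case):
    """Convert a string to upper, lower, or title case."""
    return {"upper": str.upper, "lower": str.lower, "title": str.title}[case](value)

def apply_to_column(rows, column, func, output=None):
    """Apply func to one column, writing to `output` (defaults to the same column)."""
    target = output or column
    return [{**r, target: func(r[column])} for r in rows]

def add_expression(rows, output, expr):
    """Add a column computed from the whole row, e.g. price * qty."""
    return [{**r, output: expr(r)} for r in rows]

step1 = apply_to_column(products, "name", str.strip)
step2 = apply_to_column(step1, "name", lambda s: text_case_transform(s, "title"))
step3 = add_expression(step2, "total", lambda r: r["price"] * r["qty"])
transformed = apply_to_column(step3, "price", math.floor, output="price_floor")
```

Writing derived values to a new column, as with `total` and `price_floor`, leaves the original values intact for later checks.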

Date Time Functions

Sparkflows allows end users to easily handle and normalize date-related operations. Extracting date components into separate columns, converting a value to a different date format, and converting a string to a date are a few of the useful date functions available in Sparkflows.

Date Functions

  • Date Difference

  • Date Time Field Extract

  • Date to String

  • String to Unix time

Time Functions

  • Time functions

  • Unix time to string

  • String to Date time
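
The date and time conversions listed above can be sketched with Python's standard `datetime` module. The function names, the default format strings, and the choice to treat every timestamp as UTC are assumptions made for this example; they are not taken from Sparkflows.

```python
from datetime import datetime, timezone

DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed input format for this sketch

def string_to_datetime(text, fmt=DEFAULT_FORMAT):
    """Parse a string into a datetime, assumed to be in UTC."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)

def string_to_unix_time(text, fmt=DEFAULT_FORMAT):
    """Parse a string and return seconds since the Unix epoch."""
    return int(string_to_datetime(text, fmt).timestamp())

def unix_time_to_string(seconds, fmt=DEFAULT_FORMAT):
    """Format epoch seconds as a UTC date-time string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)

def date_to_string(dt, fmt="%d/%m/%Y"):
    """Render a datetime in a different date format."""
    return dt.strftime(fmt)

def extract_fields(dt):
    """Split a datetime into separate components, one per output column."""
    return {"year": dt.year, "month": dt.month, "day": dt.day, "hour": dt.hour}

def date_difference_days(start, end):
    """Whole days between two datetimes."""
    return (end - start).days

start = string_to_datetime("2024-01-15 08:30:00")
end = string_to_datetime("2024-03-01 08:30:00")

fields = extract_fields(start)
formatted = date_to_string(start)
days_between = date_difference_days(start, end)
epoch_seconds = string_to_unix_time("1970-01-02 00:00:00")
round_trip = unix_time_to_string(epoch_seconds, "%Y-%m-%d")
```

Pinning everything to UTC keeps the string-to-Unix-time round trip independent of the machine's local time zone.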