Data Preparation
Sparkflows lets users build data pipelines from 100+ pre-built processors that validate, transform, and clean data. Users can also extend the processors by leveraging our SDK.
Push-down analytics is built into the core architecture of Sparkflows, so processing takes place where the data resides, which simplifies data governance.
Data Cleaning
Fix incomplete, incorrect, duplicate, or otherwise erroneous data in a dataset with Sparkflows' visual drag-and-drop data cleaning processors. Let your analysts and experts focus on providing quality data by combining and cleaning datasets with the hassle-free, low-code options Sparkflows provides.
Data Cleaning
- Data Wrangling
- Dedup
- Drop Duplicate Rows
- Drop Rows with Null
- Find and replace Using Regex Multiple
- Imputing with constant
- Imputing with a mean value
- Imputing with mode value
- Remove Duplicate rows
- Remove unwanted characters
- Imputing with Median
- Find and Replace Using Regex
- Remove Unwanted characters Multiple
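To make the cleaning operations listed above concrete, here is a minimal sketch in plain Python. It is not Sparkflows code and does not use the Sparkflows SDK; the function names and sample rows are hypothetical and only illustrate what deduplication, mean imputation, null dropping, and regex-based character removal amount to.

```python
import re
from statistics import mean

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"id": 1, "city": "  Austin#", "age": 30},
    {"id": 2, "city": "Boston", "age": None},
    {"id": 1, "city": "  Austin#", "age": 30},  # exact duplicate of the first row
    {"id": 3, "city": None, "age": 40},
]

def drop_duplicate_rows(rows):
    """Keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def impute_with_mean(rows, column):
    """Replace missing values in `column` with the mean of the present values."""
    present = [row[column] for row in rows if row[column] is not None]
    fill = mean(present)
    return [{**row, column: fill if row[column] is None else row[column]} for row in rows]

def drop_rows_with_null(rows, column):
    """Discard rows whose value in `column` is missing."""
    return [row for row in rows if row[column] is not None]

def remove_unwanted_characters(rows, column, pattern):
    """Delete regex matches from a string column, then trim whitespace.

    Assumes the column has no missing values (drop or impute them first).
    """
    return [{**row, column: re.sub(pattern, "", row[column]).strip()} for row in rows]

cleaned = drop_duplicate_rows(rows)
cleaned = impute_with_mean(cleaned, "age")
cleaned = drop_rows_with_null(cleaned, "city")
cleaned = remove_unwanted_characters(cleaned, "city", r"[^A-Za-z ]")
```

The order of the steps matters: imputing before dropping rows lets the row with a missing city still contribute its age to the mean.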
Filter
- Drop Columns
- Select Columns
- Filter by Date Range
- Row Filter
- Filter By String Length
- Filter By Number Range
Data Filtering
Sparkflows helps your data analysts refine datasets by selecting only the required fields and rows and excluding unwanted readings that could reduce the efficiency of data processing. Removing irrelevant data in this way can also improve the accuracy of the data model.
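As an illustration of the filtering ideas described here, the following plain-Python sketch selects columns and filters rows by date and number range. It is not Sparkflows code; the helper names, column names, and sample readings are hypothetical.

```python
from datetime import date

# Hypothetical sensor readings used only for illustration.
readings = [
    {"sensor": "A1", "taken": date(2023, 1, 5), "value": 12.5, "note": "ok"},
    {"sensor": "B22", "taken": date(2023, 2, 10), "value": 250.0, "note": "spike"},
    {"sensor": "C3", "taken": date(2023, 3, 1), "value": 14.0, "note": "ok"},
]

def select_columns(rows, columns):
    """Keep only the named columns in every row."""
    return [{column: row[column] for column in columns} for row in rows]

def filter_by_date_range(rows, column, start, end):
    """Keep rows whose date in `column` lies in [start, end]."""
    return [row for row in rows if start <= row[column] <= end]

def filter_by_number_range(rows, column, low, high):
    """Keep rows whose number in `column` lies in [low, high]."""
    return [row for row in rows if low <= row[column] <= high]

def filter_by_string_length(rows, column, min_len, max_len):
    """Keep rows whose string in `column` has a length in [min_len, max_len]."""
    return [row for row in rows if min_len <= len(row[column]) <= max_len]

# Keep January and February, discard implausible values, then trim the columns.
recent = filter_by_date_range(readings, "taken", date(2023, 1, 1), date(2023, 2, 28))
plausible = filter_by_number_range(recent, "value", 0, 100)
result = select_columns(plausible, ["sensor", "value"])
short_ids = filter_by_string_length(readings, "sensor", 1, 2)
```

Filtering rows before selecting columns keeps the `taken` column available for the date check even though it is absent from the final result.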
Data Aggregation
Sparkflows' pre-built data enrichment processors increase the information value of your data by letting users join and summarize data from multiple, disparate sources. This makes it easier to visualize patterns and trends that would not be apparent during standard processing.
Join
- Geo Join
- Join on Columns
- Join using SQL
- Join on Common Column
- Join on Common Columns
Union
- Union Strict
- Union All
- Union Distinct
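The join and union operations above can be sketched in plain Python as follows. This is not Sparkflows code: the inner-join behavior and the SQL-style meaning of "union all" (keep duplicates) and "union distinct" (drop duplicates) are assumptions, and the actual processors may define their semantics differently. All names and sample data are hypothetical.

```python
# Hypothetical source tables that share a `cust_id` column.
customers = [{"cust_id": 1, "name": "Ada"}, {"cust_id": 2, "name": "Lin"}]
orders = [
    {"cust_id": 1, "total": 20},
    {"cust_id": 1, "total": 5},
    {"cust_id": 3, "total": 9},
]

def join_on_common_column(left, right, column):
    """Inner join: combine every pair of rows whose `column` values match."""
    index = {}
    for row in right:
        index.setdefault(row[column], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[column], [])]

def union_all(first, second):
    """Concatenate two row lists, keeping duplicates."""
    return first + second

def union_distinct(first, second):
    """Concatenate two row lists, keeping only the first copy of each row."""
    seen, unique = set(), []
    for row in first + second:
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

joined = join_on_common_column(customers, orders, "cust_id")

# Summarize the joined data: total order value per customer name.
totals = {}
for row in joined:
    totals[row["name"]] = totals.get(row["name"], 0) + row["total"]

january = [{"cust_id": 1}, {"cust_id": 2}]
february = [{"cust_id": 2}, {"cust_id": 3}]
all_ids = union_all(january, february)
distinct_ids = union_distinct(january, february)
```

The per-customer totals only become visible after the join, which is the kind of cross-source pattern the section describes.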
String
- String Functions
- String Functions Multiple
- Text Case Transformer
Math
- Math Expression
- Math Functions Multiple
String & Math Functions
Use Sparkflows' extensive String and Math processor functions to convert data to a different format or case, compute metrics about the data, or perform mathematical manipulations. These functions help produce a clean dataset, which in turn improves the quality of the data model.
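A small plain-Python sketch of a case conversion and a computed numeric column may help picture these functions. It is not Sparkflows code; the helper names, column names, and sample row are hypothetical.

```python
import math

# Hypothetical product rows used only for illustration.
parts = [{"sku": "ab-101", "width": 3, "height": 4}]

def transform_case(rows, column, mode):
    """Convert a string column to upper, lower, or title case."""
    converters = {"upper": str.upper, "lower": str.lower, "title": str.title}
    convert = converters[mode]
    return [{**row, column: convert(row[column])} for row in rows]

def add_math_column(rows, new_column, expression):
    """Add a column computed from each row by `expression`."""
    return [{**row, new_column: expression(row)} for row in rows]

normalized = transform_case(parts, "sku", "upper")
with_area = add_math_column(normalized, "area", lambda r: r["width"] * r["height"])
with_diagonal = add_math_column(
    with_area, "diagonal", lambda r: math.hypot(r["width"], r["height"])
)
```

Each helper returns new rows instead of mutating its input, so intermediate results such as `normalized` stay available for inspection.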
Date Time Functions
Sparkflows allows end users to easily handle and normalize date-related data. Extracting date components into separate columns, converting a value to a different date format, and converting a string to a date are a few of the useful date functions available in Sparkflows.
Date function
- Date Difference
- Date Time Field Extract
- Date to String
- String to Unix time
Time Function
- Time functions
- Unix time to string
- String to Date time
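The date and time operations listed above can be sketched with Python's standard `datetime` module. This is not Sparkflows code; the helper names are hypothetical, and the sketch assumes all input strings are in UTC.

```python
from datetime import datetime, timezone

def string_to_datetime(text, fmt):
    """Parse a string into a datetime (assumption: the input is in UTC)."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)

def date_to_string(value, fmt):
    """Render a datetime in the requested format."""
    return value.strftime(fmt)

def date_difference_days(start, end):
    """Whole days between two datetimes."""
    return (end - start).days

def extract_fields(value):
    """Split a datetime into separate components, one per output column."""
    return {"year": value.year, "month": value.month, "day": value.day, "hour": value.hour}

def string_to_unix_time(text, fmt):
    """Seconds since 1970-01-01 00:00:00 UTC for the parsed string."""
    return int(string_to_datetime(text, fmt).timestamp())

def unix_time_to_string(seconds, fmt):
    """Render Unix seconds as a formatted UTC string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)

start = string_to_datetime("2024-01-01", "%Y-%m-%d")
end = string_to_datetime("2024-03-01 09:30:00", "%Y-%m-%d %H:%M:%S")
days_between = date_difference_days(start, end)
fields = extract_fields(end)
reformatted = date_to_string(end, "%d/%m/%Y")
epoch_seconds = string_to_unix_time("1970-01-02 00:00:00", "%Y-%m-%d %H:%M:%S")
round_trip = unix_time_to_string(epoch_seconds, "%Y-%m-%d")
```

Pinning everything to UTC keeps the Unix-time conversions independent of the machine's local time zone.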