Data Preparation

Join

Union

Filter

Date Time

Parse 

Data Cleaning

Math

String

Split

Code

Condition

Group

Rich library of operators to enrich data without writing a single line of code

Parse

  • Apache Log

  • Field Splitter

  • Fixed Length Fields

  • Multi Regex Extractor

  • Parse JSON Col

  • Regex Tokenizer

  • OCR

Join/Union

  • Geo Join

  • Join on Columns

  • Join using SQL

  • Union All

  • Union Strict

  • Join on Common Column

  • Join on Common Columns

  • Union Distinct
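As a rough illustration of what the join and union operators above compute, here is a minimal plain-Python sketch. The function names (`union_all`, `union_distinct`, `join_on_column`) and the row representations are assumptions made for this example; the source does not describe how the operators are implemented, and Geo Join and Union Strict are not covered here.

```python
def union_all(left, right):
    """Concatenate two row lists, keeping duplicates."""
    return left + right


def union_distinct(left, right):
    """Concatenate two row lists, keeping the first copy of each row."""
    seen = set()
    result = []
    for row in left + right:
        if row not in seen:  # rows are tuples, so they are hashable
            seen.add(row)
            result.append(row)
    return result


def join_on_column(left, right, key):
    """Inner join of two lists of dicts on one shared column name."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Merge each left row with every right row that has the same key value.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


a = [(1, "x"), (2, "y")]
b = [(2, "y"), (3, "z")]
all_rows = union_all(a, b)
distinct_rows = union_distinct(a, b)

customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 1, "total": 30}, {"id": 1, "total": 12}]
joined = join_on_column(customers, orders, "id")
```

Union All keeps the row that appears in both inputs twice, while Union Distinct keeps it once; the join returns one merged row per matching pair.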

Group

  • Cube

  • Group By

  • Pivot By

  • Roll up

Date Time

  • Date Difference

  • Date Time Field Extract

  • Date to String

  • String to Unix time

  • Time Functions

  • Unix time to string

  • String to Date
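For readers who want to see what conversions of this kind involve, the following is a minimal sketch using only the Python standard library. The helper names, the default format strings, and the choice to treat every timestamp as UTC are assumptions for this example, not details taken from the operators listed above.

```python
from datetime import datetime, timezone


def string_to_unix(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp string (assumed UTC) into seconds since the epoch."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def unix_to_string(seconds, fmt="%Y-%m-%d %H:%M:%S"):
    """Format seconds since the epoch as a UTC timestamp string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)


def date_difference_days(start, end, fmt="%Y-%m-%d"):
    """Whole days from start to end, both given as date strings."""
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).days


one_day = string_to_unix("1970-01-02 00:00:00")
round_trip = unix_to_string(one_day)
gap = date_difference_days("2024-01-01", "2024-03-01")
```

Fixing the time zone explicitly keeps the string-to-Unix and Unix-to-string conversions inverse to each other regardless of the machine's local settings.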

Data Cleaning

  • Data Wrangling

  • Dedup

  • Drop Duplicate Rows

  • Drop Rows with Null

  • Find and Replace Using Regex Multiple

  • Imputing with constant

  • Imputing with a mean value

  • Imputing with mode value

  • Remove Duplicate Rows

  • Remove unwanted characters

  • Imputing with Median

  • Find and Replace Using Regex

  • Remove Unwanted characters Multiple
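The imputation and de-duplication entries above name standard techniques, which can be sketched in a few lines of plain Python. The function names (`impute`, `drop_duplicate_rows`), the `strategy` parameter, and the use of `None` for missing values are assumptions for this example; the source does not say how the operators behave internally.

```python
from statistics import mean, median, mode


def impute(values, strategy="mean", constant=None):
    """Replace None entries with a constant, or with the mean, median, or mode
    of the values that are present."""
    present = [v for v in values if v is not None]
    if strategy == "constant":
        fill = constant
    elif strategy == "mean":
        fill = mean(present)
    elif strategy == "median":
        fill = median(present)
    elif strategy == "mode":
        fill = mode(present)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


by_mean = impute([1, None, 3])
by_median = impute([1, 3, 10, None], strategy="median")
by_mode = impute([1, 1, 2, None], strategy="mode")
by_constant = impute([None, 5], strategy="constant", constant=0)
unique_rows = drop_duplicate_rows([[1, "a"], [1, "a"], [2, "b"]])
```

The mean is sensitive to outliers, the median is not, and the mode also suits categorical columns, which is one reason a tool might offer all three alongside a constant fill.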

Code

  • SQL 

  • SQL Executor

  • Python

  • Pipe Python

  • Pipe Python 2

  • Scala

  • Scala VDF

  • Jython

  • Pyspark

  • MultiInput Pyspark

  • MultiInput To MultiOutput Pyspark

  • Run Hive QL

  • Unix Shell Commands

Filter

  • Drop Columns

  • Select Columns

  • Filter by Date Range

  • Row Filter

  • Filter By String Length

  • Filter By Number Range

Math

  • Math Expression

  • Math Functions Multiple

String

  • String Functions

  • String Functions Multiple

  • Text Case Transformer

Split

  • Compare All Columns

  • Compare All Columns Single Output

  • Compare Specific Columns

  • Split By Expression

  • Split by Multiple Expressions

Condition

  • Assert

  • Decision

Cast Data Type

  • CastColumnType

  • CastMultipleColumnType

Add Column

  • Add columns

  • Case When

  • Concat Columns

  • Expressions

  • Generate UUID

  • Generate UID

  • Hash

  • Zip with Index

Others

  • CDC Using Full Table Merge

  • Columns Rename

  • Count

  • Geo IP

  • Geo Point

  • Multi Window Analytics

  • Multi Window Ranking

  • Recover Hive Partitions

  • Register Temp Table

  • Round Value

  • Sample

  • Sort By

  • Sort Columns

  • Transpose

  • Window Analytics

  • Window Ranking