New Count and Assert Nodes in Fire


Overview

There are use cases where we want to focus on Data Quality. One such use case is to stop execution if the number of records being processed is less than a certain number.

Fire Insights now supports 2 new Processors:

  • NodeCount

  • NodeAssert

NodeCount counts the number of records in the Dataset and stores it in a variable in the JobContext.

NodeAssert allows the user to provide a conditional expression to be evaluated. NodeAssert has 2 outputs. Based on the result of evaluating the expression, execution is sent to one of the outputs.

The conditional expression can use variables set by earlier nodes in the workflow, such as the count stored by NodeCount.
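As a rough illustration of how the two Processors relate, the sketch below models the JobContext as a plain Python dictionary. It is not Fire Insights code: the functions node_count and node_assert, the output labels "true" and "false", and the sample records are hypothetical names and data chosen for this example.

```python
# Illustrative model only; this is not the Fire Insights API.

def node_count(records, job_context, variable_name):
    """Count the records and store the count under variable_name."""
    job_context[variable_name] = len(records)
    return records  # assumption: the dataset passes through unchanged


def node_assert(job_context, condition):
    """Evaluate condition against the context and pick one of two outputs."""
    return "true" if condition(job_context) else "false"


job_context = {}
records = [{"trip_id": i} for i in range(150)]  # made-up sample data

node_count(records, job_context, "cnt")
branch = node_assert(job_context, lambda ctx: ctx["cnt"] > 100)
print(job_context["cnt"], branch)
```

The point of the sketch is the ordering: the count must be written to the context before the condition that reads it is evaluated.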

Workflow

Below is a workflow which uses the 2 new Processors: NodeCount and NodeAssert.

The workflow does the following:

  1. Reads in the NYC Trip Data.

  2. Finds the number of incoming records.

  3. Evaluates whether the number of incoming records is greater than 100.

  4. If it is greater than 100, then it saves the dataset in Parquet format.

  5. Else, it prints the records.
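The control flow of these steps can be sketched in plain Python. This is a stand-in, not the actual workflow definition: run_workflow, the two callback parameters, and the sample rows are hypothetical, and the callbacks only record which branch was taken instead of writing Parquet or printing.

```python
# Hypothetical stand-in for the workflow; the sinks are placeholders.

def run_workflow(records, save_as_parquet, print_records, threshold=100):
    job_context = {"cnt": len(records)}   # step 2: count the incoming records
    if job_context["cnt"] > threshold:    # step 3: evaluate the condition
        save_as_parquet(records)          # step 4: first output
        return "saved"
    print_records(records)                # step 5: second output
    return "printed"


saved, printed = [], []
small = [{"fare": 7.5}] * 3     # made-up rows, not real NYC trip data
large = [{"fare": 7.5}] * 101

result_small = run_workflow(small, saved.append, printed.append)
result_large = run_workflow(large, saved.append, printed.append)
print(result_small, result_large)
```

Because the comparison is strictly greater than, a dataset of exactly 100 records takes the print branch.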


NodeCount Configuration

Below is the configuration of the NodeCount Processor.

It finds the count of the incoming records and stores it in the variable cnt in the JobContext.


NodeAssert Configuration

Below is the configuration of the NodeAssert Processor.

It evaluates whether the value of cnt is greater than 100.
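To make the idea of evaluating an expression against stored variables concrete, here is a minimal sketch that parses a string such as "cnt > 100" and checks it against a dictionary of variables. The expression syntax NodeAssert actually accepts is not described here, so the Python-style comparison syntax, the evaluate_condition function, and the supported operator set are all assumptions made for illustration.

```python
import ast
import operator

# Comparison operators this sketch understands (an arbitrary subset).
_COMPARATORS = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def evaluate_condition(expression, variables):
    """Evaluate a single '<variable> <op> <constant>' comparison."""
    tree = ast.parse(expression, mode="eval").body
    if not (isinstance(tree, ast.Compare) and len(tree.ops) == 1):
        raise ValueError("expected a single comparison")
    left, right = tree.left, tree.comparators[0]
    if not isinstance(left, ast.Name) or not isinstance(right, ast.Constant):
        raise ValueError("expected '<variable> <op> <constant>'")
    compare = _COMPARATORS.get(type(tree.ops[0]))
    if compare is None:
        raise ValueError("unsupported operator")
    # Look up the variable (for example cnt) and apply the comparison.
    return compare(variables[left.id], right.value)


print(evaluate_condition("cnt > 100", {"cnt": 150}))
```

Parsing the expression rather than passing it to eval keeps the sketch limited to the one comparison shape it claims to support.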




© 2020 Sparkflows, Inc. All rights reserved. 
