
Workflow Automation Templates

A library of ready-to-use workflow templates to accelerate your data journey

Group By FE

Generate user insights from transactions

Overview

This workflow uses PySpark to aggregate transactional data into user-level insights and behavioral metrics. For each user, it calculates purchase frequency, recency, the average interval between purchases, total spending, and customer age to support advanced analytics and modeling.

Details

The workflow begins by loading the GroupByFeatureTest.csv dataset, which includes user ID, purchase date, purchase amount, product category, and date of birth. The Feature Engineering node groups data by user ID and computes various metrics, including the number of purchases per user, days since the last transaction, average time between purchases, total purchase value, and customer age derived from the date of birth. The processed dataset, enriched with these user-level statistics, is then displayed using the Print N Rows node for verification and exploration.
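
The grouping step described above can be sketched in plain Python (standard library only, not PySpark) to show what each metric means. This is a minimal illustration, not the Feature Engineering node's actual implementation: the column names (`user_id`, `purchase_date`, `purchase_amount`, `product_category`, `date_of_birth`), the `user_features` helper, the sample rows, and the `as_of` reference date are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def user_features(rows, as_of):
    """Group transactions by user and compute user-level metrics.

    `rows` is an iterable of dicts; the key names are hypothetical and
    only mirror the fields the dataset is described as containing.
    `as_of` is an assumed reference date for recency and age.
    """
    by_user = defaultdict(list)
    for row in rows:
        by_user[row["user_id"]].append(row)

    features = {}
    for user_id, purchases in by_user.items():
        dates = sorted(p["purchase_date"] for p in purchases)
        # Day gaps between consecutive purchases (empty for a single purchase).
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        dob = purchases[0]["date_of_birth"]
        # Age in whole years: subtract one if the birthday has not occurred yet.
        age = as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))
        features[user_id] = {
            "frequency": len(purchases),
            "recency_days": (as_of - dates[-1]).days,
            "avg_interval_days": sum(gaps) / len(gaps) if gaps else None,
            "total_spend": sum(p["purchase_amount"] for p in purchases),
            "age_years": age,
        }
    return features


# Made-up sample rows for illustration only.
rows = [
    {"user_id": "u1", "purchase_date": date(2024, 1, 1), "purchase_amount": 20.0,
     "product_category": "books", "date_of_birth": date(1990, 6, 15)},
    {"user_id": "u1", "purchase_date": date(2024, 1, 11), "purchase_amount": 30.0,
     "product_category": "games", "date_of_birth": date(1990, 6, 15)},
    {"user_id": "u1", "purchase_date": date(2024, 1, 31), "purchase_amount": 50.0,
     "product_category": "books", "date_of_birth": date(1990, 6, 15)},
    {"user_id": "u2", "purchase_date": date(2024, 2, 10), "purchase_amount": 15.0,
     "product_category": "music", "date_of_birth": date(2000, 3, 1)},
]

features = user_features(rows, as_of=date(2024, 3, 1))
for user_id, metrics in sorted(features.items()):
    print(user_id, metrics)
```

A user with a single purchase has no interval to average, so the sketch reports `None` for that metric; how the workflow itself handles this case is not stated here.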

By summarizing purchasing behavior at the user level, the workflow supports scalable, data-driven customer segmentation and analytics.
