
Workflow Automation Templates

A library of ready-to-use workflow templates to accelerate your data journey

Split Stratified Sampling

Preserve class ratios in train-test splits

Overview

This workflow demonstrates how stratified sampling preserves the proportion of each class when splitting data into training and test sets, which is crucial for imbalanced datasets such as credit card fraud detection.

Details

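The Split With Stratified Sampling node divides the Credit Card Fraud dataset while maintaining the same class proportions across both subsets. Unlike simple random sampling, which can leave the minority class over- or under-represented in one subset by chance, stratified sampling draws from each class separately, so minority and majority classes appear in the same ratio in both subsets. This reduces the risk of training or evaluating a model on an unrepresentative split.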
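The per-class splitting described above can be sketched in plain Python. This is only an illustration of the general technique, not the node's actual implementation: the function name `stratified_split`, its parameters, the `is_fraud` field, and the synthetic rows are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, test_fraction=0.2, seed=0):
    """Split rows into (train, test), sampling each class separately.

    Hypothetical helper for illustration; not the workflow node's code.
    """
    rng = random.Random(seed)

    # Group rows by class label so each class can be split on its own.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's list is not reordered
        rng.shuffle(members)
        # Take the same fraction from every class.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])

    # Shuffle again so rows are not grouped by class in the output.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Synthetic stand-in for an imbalanced dataset: 20 fraud rows out of 1000 (2%).
rows = [{"id": i, "is_fraud": 1 if i < 20 else 0} for i in range(1000)]
train, test = stratified_split(rows, lambda r: r["is_fraud"], test_fraction=0.2, seed=42)

print(len(train), sum(r["is_fraud"] for r in train))
print(len(test), sum(r["is_fraud"] for r in test))
```

Because the fraction is applied within each class, the 2% fraud rate of the synthetic data carries over to both subsets, whereas a single shuffle over all rows would only match it on average.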

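The results are verified using Print N Rows nodes to confirm that both subsets preserve the original class distribution. Because the test set then reflects the same fraud rate as the full dataset, performance metrics measured on it are more reliable than those from a purely random split.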
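Outside the workflow, the same check can be made numerically by comparing class proportions between the two subsets. The sketch below is an assumption-laden illustration, not part of the template: the helper names `class_proportions` and `max_proportion_gap` and the label lists are hypothetical.

```python
from collections import Counter


def class_proportions(labels):
    """Return the share of each class label as a fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def max_proportion_gap(labels_a, labels_b):
    """Largest absolute difference in class share between two label lists."""
    pa = class_proportions(labels_a)
    pb = class_proportions(labels_b)
    return max(abs(pa.get(k, 0.0) - pb.get(k, 0.0)) for k in set(pa) | set(pb))


# Hypothetical labels from a stratified 80/20 split of data with 2% fraud.
train_labels = [1] * 16 + [0] * 784
test_labels = [1] * 4 + [0] * 196

gap = max_proportion_gap(train_labels, test_labels)
print(class_proportions(train_labels))
print(class_proportions(test_labels))
print(gap)
```

A gap close to zero indicates that the split preserved the class ratios; a noticeably larger gap would suggest the split was not stratified on that label.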
