Train-Test Split

Before training a machine learning model, the dataset must be divided into two parts:

Training Dataset

Used to teach the model patterns present in historical transactions.

Testing Dataset

Used to evaluate how well the trained model performs on unseen transactions.

A common split ratio is:

  • 80% Training Data
  • 20% Testing Data

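Separating the data does not by itself prevent overfitting, but it makes overfitting detectable: a model that scores well on the training data and poorly on the testing data has memorized its training examples rather than learned patterns that generalize. Performance on the held-out testing data is therefore a more reliable estimate of how the model will behave on new transactions.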
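
As an illustration, the 80/20 split described above can be sketched in a few lines of plain Python. This is a minimal sketch, not a prescribed implementation: the function name `train_test_split_rows`, the `seed` parameter, and the made-up `transactions` list are all assumptions introduced for the example. The rows are shuffled before splitting so that both parts are drawn from the same mix of records.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists.

    Hypothetical helper for illustration; `test_fraction` is the share
    of rows held out for testing (0.2 gives an 80/20 split).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy, so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up data: (transaction_id, amount) pairs, assumed only for this example.
transactions = [(i, 10.0 * i) for i in range(100)]

train, test = train_test_split_rows(transactions)
print(len(train), len(test))
```

The model would then be fitted on `train` only and scored on `test`. In practice, libraries such as scikit-learn offer an equivalent utility (`train_test_split`), which is usually preferable to a hand-written helper.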