Creating Training and Testing Sets

Code

    from sklearn.model_selection import train_test_split

    # Split features X and labels y into training and testing subsets
    X_train, X_test, y_train, y_test = train_test_split(
        X,
        y,
        test_size=0.2,
        random_state=2,
    )

Understanding the Parameters

test_size=0.2

Reserves 20% of the dataset for testing.

This means:

  • 80% used for training
  • 20% used for testing
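As a rough illustration of the 80/20 proportion, the sketch below splits a small made-up dataset and prints the resulting shapes. The toy `X` and `y` arrays (50 samples, 2 features) are assumptions for demonstration only, not the dataset used in this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 50 samples with 2 features each, plus 50 labels.
X = np.arange(100).reshape(50, 2)
y = np.arange(50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# 20% of 50 samples go to the test set, the rest to the training set.
print(X_train.shape, X_test.shape)  # (40, 2) (10, 2)
```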

random_state=2

Seeds the random shuffle that decides which samples go to the training set and which go to the testing set.

Using a fixed random state ensures that the same split is generated each time the notebook is executed.

This improves reproducibility.
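A minimal sketch of this reproducibility, again on made-up data (the toy `X` and `y` are assumptions): calling `train_test_split` twice with the same `random_state` should select the same test samples.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 20 samples with 2 features each, plus 20 labels.
X = np.arange(40).reshape(20, 2)
y = np.arange(20)

# Two independent calls that share the same seed.
_, X_test_a, _, y_test_a = train_test_split(X, y, test_size=0.2, random_state=2)
_, X_test_b, _, y_test_b = train_test_split(X, y, test_size=0.2, random_state=2)

# Both calls pick the same samples, in the same order.
same_split = np.array_equal(X_test_a, X_test_b) and np.array_equal(y_test_a, y_test_b)
print(same_split)  # True
```

Leaving `random_state` unset would instead draw a fresh shuffle on each run, so the split would generally change between executions.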