Lesson 3: Understanding the Dataset

After loading the dataset, the next step is to understand its structure and contents. This step confirms that the data has been loaded correctly and helps us understand what each column represents before performing any cleaning or analysis.

Viewing Sample Records

To get a quick overview of the dataset, we display its first rows. Called without arguments, head() returns the first five.

# View first few rows

df.head()

This output shows a snapshot of student records, including various performance-related factors and exam scores. Each row represents an individual student’s data.
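As a small self-contained sketch of how head() behaves, the example below builds a tiny stand-in DataFrame. The column names and values are hypothetical and are not taken from the lesson's actual dataset. It also shows tail(), which is not used in this lesson but is a common companion for checking the end of a file.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's dataset; the column names
# and values are assumptions made only for illustration.
df = pd.DataFrame({
    "Hours_Studied": [23, 19, 24, 29, 19, 25, 17],
    "Attendance": [84, 64, 98, 89, 92, 78, 70],
    "Exam_Score": [67, 61, 74, 71, 70, 68, 59],
})

first_rows = df.head()     # no argument: first 5 rows
first_three = df.head(3)   # explicit row count
last_two = df.tail(2)      # last 2 rows, handy for spotting truncated files

print(first_rows)
print(first_three)
print(last_two)
```

Passing a number to head() is useful when five rows are too few to see the variety of values in a column.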

Checking Dataset Size

Next, we check the number of rows and columns in the dataset.

# Check number of rows and columns

df.shape

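df.shape is an attribute, not a method, so it is written without parentheses. It returns a tuple of the form (rows, columns), which tells us how many student records are available and how many features the dataset includes.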
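The sketch below shows one way to unpack the shape into named variables so the counts can be reused later. The DataFrame is a hypothetical stand-in with assumed column names, not the lesson's real data.

```python
import pandas as pd

# Hypothetical stand-in data; column names are assumptions.
df = pd.DataFrame({
    "Hours_Studied": [23, 19, 24, 29],
    "Attendance": [84, 64, 98, 89],
    "Exam_Score": [67, 61, 74, 71],
})

# shape is a (rows, columns) tuple, so it can be unpacked directly
n_rows, n_cols = df.shape

print(f"{n_rows} student records, {n_cols} features")
```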

Inspecting Dataset Structure

Finally, we examine detailed information about the dataset.

# Get basic information about the dataset

df.info()

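This step lists each column's name, its data type, and its count of non-null values, along with the total number of entries. A column whose non-null count is lower than the total number of entries contains missing values. The output helps identify which columns are numerical, which are categorical, and whether cleaning or type conversion is required.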
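The checks that df.info() supports can also be made explicit. The following is a minimal sketch on a hypothetical stand-in DataFrame (the column names and the deliberately missing values are assumptions). It uses isna().sum() to count missing values per column and select_dtypes() to separate numeric columns from the rest.

```python
import pandas as pd

# Hypothetical stand-in data with two missing values inserted on purpose;
# column names are assumptions, not the lesson's real schema.
df = pd.DataFrame({
    "Hours_Studied": [23, 19, None, 29],
    "Parental_Involvement": ["Low", "Medium", None, "High"],
    "Exam_Score": [67, 61, 74, 71],
})

# Prints column names, non-null counts, and dtypes
df.info()

# Count of missing values in each column
missing_per_column = df.isna().sum()

# Split columns by type: numeric versus everything else (text, categories)
numeric_columns = df.select_dtypes(include="number").columns.tolist()
categorical_columns = df.select_dtypes(exclude="number").columns.tolist()

print(missing_per_column)
print("Numeric:", numeric_columns)
print("Categorical:", categorical_columns)
```

Treating every non-numeric column as categorical is a simplification; date or free-text columns would need to be handled separately.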

By completing these steps, we gain a clear picture of the dataset's size, columns, and data types, which sets us up for the data cleaning and preparation covered in the next lesson.