Lesson 2: Loading the Dataset

After extracting the ZIP file, the next step is to load the student performance dataset into Google Colab. We also import the libraries used throughout the project for data handling and visualization.

Importing Required Libraries

  • Pandas is used for data manipulation and analysis.
  • Matplotlib and Seaborn are used for data visualization and creating charts.

Code To Import Libraries And Load The Dataset

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the CSV file into a DataFrame (pandas reads it as UTF-8 by default)
df = pd.read_csv("/content/StudentPerformanceFactors.csv")

After running this code, the dataset is stored inside the DataFrame named df. This DataFrame will be used in all upcoming steps such as understanding the dataset structure, cleaning the data, performing aggregation, and conducting exploratory data analysis.
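If the file is not where the code expects it, pd.read_csv raises an error that can be hard to read at first. The sketch below is one possible way to guard against that. It wraps the load in a small helper that checks the path first and lets you pass an encoding explicitly. The helper name load_dataset and the DATA_PATH constant are hypothetical names chosen for illustration and are not part of the lesson; only the file path itself comes from the code above.

```python
from pathlib import Path

import pandas as pd


def load_dataset(csv_path, encoding="utf-8"):
    """Hypothetical helper: read a CSV into a DataFrame, failing early if the file is missing."""
    path = Path(csv_path)
    if not path.is_file():
        # Give a clearer message than the default pandas error
        raise FileNotFoundError(
            f"CSV not found at {path}. Check that the ZIP file was extracted to this location."
        )
    # The encoding is passed explicitly; change it if the file uses a different one
    return pd.read_csv(path, encoding=encoding)


# Path used in this lesson (assumes the Colab layout shown above)
DATA_PATH = "/content/StudentPerformanceFactors.csv"

# Only load and preview when the file is actually present
if Path(DATA_PATH).is_file():
    df = load_dataset(DATA_PATH)
    print(df.shape)   # (number of rows, number of columns)
    print(df.head())  # first five rows as a quick sanity check
```

Printing df.shape and df.head() right after loading is a quick way to confirm that the rows and column names look as expected before moving on to the later steps.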