Lesson 2: Loading the Dataset
After extracting the ZIP file, the next step is to load the sales data into the Google Colab notebook. At this stage, we also import the libraries that will be used throughout the project for data handling and visualization.
Pandas is used to load and work with the dataset, while Matplotlib and Seaborn are used later for creating charts and visual analysis. Importing them early keeps the workflow clean and organized.
Code to import libraries and load the dataset:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load CSV file with specified encoding
df = pd.read_csv("/content/sales_data_sample.csv", encoding='latin1')
The encoding is specified because pd.read_csv decodes files as UTF-8 by default and raises a UnicodeDecodeError when it meets bytes that are not valid UTF-8, which is common in real-world datasets. Reading the file as latin1 avoids this error. Once this code runs successfully, the sales data is stored in the DataFrame df, which is used in all upcoming steps: inspection, cleaning, aggregation, and exploratory data analysis.
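To see why the encoding argument matters, the following minimal sketch reproduces the problem on a tiny stand-in file rather than the real sales data. The file name, column names, and rows are hypothetical and chosen only for illustration; the sketch assumes nothing about the actual contents of sales_data_sample.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row sample (not the real sales data). The customer name
# contains "é", which Latin-1 stores as the single byte 0xE9.
sample_text = (
    "order_id,customer,amount\n"
    "1,Café Express,2871.0\n"
    "2,Land of Toys,2765.9\n"
)

# Write the sample to a temporary file using Latin-1 encoding.
tmp_path = os.path.join(tempfile.mkdtemp(), "latin1_sample.csv")
with open(tmp_path, "w", encoding="latin1") as f:
    f.write(sample_text)

# Reading with the default UTF-8 decoder fails on the 0xE9 byte.
try:
    pd.read_csv(tmp_path)
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Reading with encoding="latin1" succeeds and preserves the accented character.
sample_df = pd.read_csv(tmp_path, encoding="latin1")

print("Default UTF-8 read failed:", utf8_failed)
print(sample_df)
```

Latin-1 maps every possible byte to a character, so reading with it does not raise a decoding error. The trade-off is that a file actually saved in another encoding would load without complaint but show wrong characters, so it is worth glancing at text columns after loading.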










