

Web Scraping: Scraping Multiple URLs

This tutorial guides you through performing web scraping on multiple URLs in a single script, although you would likely have figured it out yourself in the hour of need.

Some of you might have already guessed it: yes, we will use a for loop.

We will start by creating an array (a Python list) to store the URLs:

```python
# array holding URL values
urls = ['https://xyz.com/page/1', 'https://xyz.com/page/2']
```
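Since these URLs differ only by the page number, the list can also be built programmatically instead of typed out by hand. A minimal sketch, assuming a hypothetical site with five pages (the domain and the page count are placeholders):

```python
# build the URL list from a range of page numbers
# (xyz.com and the range 1..5 are placeholder assumptions)
base = 'https://xyz.com/page/{}'
urls = [base.format(n) for n in range(1, 6)]
```

This keeps the list and the page count in one place, so adding more pages later is a one-character change.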


You can have as many URLs in the array as you need. A word of advice, though: do not include any URL unnecessarily, because every request we make costs the website owner an additional hit on their server.

> _And never run a web scraping script in an **infinite** loop._
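One simple way to stay polite when looping over many URLs, not covered further in this tutorial, is to pause between requests. A minimal sketch using only the standard library (the URLs and the one-second delay are arbitrary placeholder choices):

```python
import time

# placeholder URLs for illustration
urls = ['https://xyz.com/page/1', 'https://xyz.com/page/2']

visited = []
for url in urls:
    # ... make the request and parse the page here ...
    visited.append(url)
    time.sleep(1)  # wait one second before hitting the server again
```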

Once you have created an **array**, start a loop from the beginning and do everything inside the loop:

```python
# importing bs4, requests, fake_useragent and csv modules
import bs4
import requests
from fake_useragent import UserAgent
import csv

# create an array with URLs
urls = ['https://xyz.com/page/1', 'https://xyz.com/page/2']

# initializing the UserAgent object
user_agent = UserAgent()

# starting the loop
for url in urls:
    # getting the response from the page using the get method of the requests module
    page = requests.get(url, headers={"user-agent": user_agent.chrome})

    # storing the content of the page in a variable
    html = page.content

    # creating a BeautifulSoup object
    soup = bs4.BeautifulSoup(html, "html.parser")

    # then parse the HTML, extract any data,
    # and write it to a file
```

When you run a script on multiple URLs and want to write the scraped data to a file as well, make sure you store each record in the form of a **tuple** and then write it to the file.
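For example, the records can be collected as tuples and written out with the `csv` module. A minimal sketch with made-up data (the file name, field names, and values are placeholders, not from any real scrape):

```python
import csv

# each scraped record stored as a tuple (placeholder data)
rows = [
    ('https://xyz.com/page/1', 'Page One Title'),
    ('https://xyz.com/page/2', 'Page Two Title'),
]

with open('scraped.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(('url', 'title'))  # header row
    writer.writerows(rows)             # one CSV line per tuple
```

Because `csv.writer` accepts any sequence, a list of tuples maps directly onto rows of the output file.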

The next tutorial is a simple exercise where you will run a web scraping script on Studytonight's website. Excited? Head to the next tutorial.