Web Scraping with Python: A Practical Introduction Using BeautifulSoup 🍜 (Step-by-Step Guide for Beginners)

by John March 27, 2025



 

Web scraping 🕸️ is a powerful technique for automated data collection, letting us extract information from websites programmatically. In Python 🐍, one of the most widely used and beginner-friendly tools for the task is BeautifulSoup. It’s clean, flexible, and makes working with HTML and XML very easy. In this hands-on guide, we’ll walk through a practical example: building a scraper that grabs real-time 🌡️ temperature data from timeanddate.com and saves the results straight into a CSV file 💾, ready for analysis or automation.

 

🚀 Why Learn Web Scraping with Python? 🤖💡

 

- The ability to build your own web scraping tools with Python is invaluable 💪

- While there are tons of paid web scraping services out there, doing it yourself means unlimited access to the internet’s data 🛢️📊. They say data is the new oil 🤷‍♂️, so why not drill it yourself?

 

We hope this mini project sparks interest for those who’ve never tried web scraping before. We don’t dive into every detail; the goal here is to give you a hands-on taste of what’s possible, and maybe even get you hooked 🔍💻. Let's begin!

 


 

 

🔧 Setting Up

Make sure you’ve got Python installed on your system. If not, you can grab the latest version from python.org 🐍. To follow along with this tutorial, you’ll also need two Python packages: BeautifulSoup (for parsing HTML) and requests (for making the web requests). Install both with a single command:

 

pip install beautifulsoup4 requests
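
If you want to confirm the install worked, a quick optional sanity check (assuming both installs succeeded) is to import the packages and print their versions:

import bs4
import requests

# both imports succeeding means we are ready to go
print(bs4.__version__, requests.__version__)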

 

 

Parsing HTML with BeautifulSoup in Python 🥣

 

To start scraping, we need to fetch the web page and parse its HTML. Recall that our target is https://www.timeanddate.com/weather/, which displays a table like the one shown in the image below.

 

 

Weather Data Scraping Target With Python

 

Here’s how to use requests and BeautifulSoup to get the raw data from timeanddate.com:

 

import requests
from bs4 import BeautifulSoup
import pandas as pd  # used later to turn our results into a DataFrame and a CSV


# fetch the page, then parse the returned HTML
response = requests.get('https://www.timeanddate.com/weather/')

soup = BeautifulSoup(response.text, 'html.parser')
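
A quick aside: some sites turn away requests that arrive without a browser-like User-Agent header. timeanddate.com answered our plain request just fine, but if you ever get back an empty page or an error status, passing your own headers is an easy first fix (the header string below is just an illustrative placeholder):

# optional fallback: retry with a browser-like User-Agent header
headers = {'User-Agent': 'Mozilla/5.0 (weather-scraper-tutorial)'}
response = requests.get('https://www.timeanddate.com/weather/', headers=headers)
response.raise_for_status()  # raise an error on a 4xx/5xx response
soup = BeautifulSoup(response.text, 'html.parser')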

 

Once we have all the elements pulled into Python, we’ve got some thinking to do 🤔. The BeautifulSoup library isn’t a magic wand; there’s still a bit of detective work involved to extract meaningful data. After inspecting the page, we can see that the information we want (city name and temperature 🌡️) is organized in a table like the one we showed above. That means the relevant values will be found inside <td></td> elements, which represent individual pieces of table data.

 

Below we use the find_all() method to narrow down our search. 

 

tds = soup.find_all('td')

print(f"there are {len(tds)} table data elements")


'''
there are 564 table data elements

'''

 

Looks like we have 564 table data elements to parse!

BeautifulSoup Table Extraction

 

Looks like we are making some progress! Now let's print out the first ten elements to see if we can find some sort of pattern.

 

for td in tds[:10]:
    print(td)


'''
<td><a href="/weather/ghana/accra">Accra</a><span class="wds" id="p0s"></span></td>
<td class="r" id="p0">Sat 05:47</td>
<td class="r"><img alt="Clear. Warm." height="40" src="//c.tadst.com/gfx/w/svg/wt-13.svg" title="Clear. Warm." width="40"/></td>
<td class="rbi">27 Β°C</td>
<td><a href="/weather/canada/edmonton">Edmonton</a><span class="wds" id="p47s"></span></td>
<td class="r" id="p47">Fri 22:47</td>
<td class="r"><img alt="Passing clouds. Cold." height="40" src="//c.tadst.com/gfx/w/svg/wt-14.svg" title="Passing clouds. Cold." width="40"/></td>
<td class="rbi">-4 Β°C</td>
<td><a href="/weather/india/new-delhi">New Delhi</a><span class="wds" id="p94s"></span></td>
<td class="r" id="p94">Sat 11:17</td>

'''

 

Now we are getting even closer: it appears we have found the necessary pattern, and we'll sanity-check it in the snippet after this list. It is as follows:

- The presence of a link element <a></a> indicates we have found a new place; between the <a> tags we find the place name, as in the <a>Accra</a>, <a>Edmonton</a>, and <a>New Delhi</a> entries above.

- Three elements after a <td> containing an <a> tag, we find the temperature in degrees Celsius, held in a <td> with the class "rbi".
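
To make sure the pattern holds before writing the full loop, we can poke at a couple of elements directly; the indices below line up with the output we just printed:

# tds[0] holds the Accra link, tds[3] holds the matching temperature cell
print(tds[0].find('a').get_text())  # Accra
print(tds[3].get('class'))          # ['rbi']
print(tds[3].get_text())            # 27 °C (that space is really \xa0, a non-breaking space)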

 

 

Most of the hard work has now been completed; all we need to do is write a simple script to extract the data. First we create a small helper function to parse the temperature. Note that \xa0 represents a non-breaking space in HTML, used to ensure the characters on either side stay on the same line.

 

def extract_temp_as_float(temp):
    return float(temp.split("\xa0")[0])
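
Called on the raw cell text, the helper splits on the non-breaking space and keeps just the numeric part:

print(extract_temp_as_float('27\xa0°C'))  # 27.0
print(extract_temp_as_float('-4\xa0°C'))  # -4.0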

 

And then loop over the soup 🍜

 

temps = []
current_city = None

for td in tds:
    # check if there is an <a> tag present in this td
    if td.find('a'):
        # if we have a link, we extract the city name
        current_city = td.get_text().strip()  # get the city name

    # if we have a city and this is a temperature cell
    elif 'rbi' in td.get('class', []) and current_city:
        temp = td.get_text().strip()  # get the raw temperature text
        # run the raw temperature through our helper function and
        # add the city/temp pair to our list of temperatures
        temps.append({'city': current_city, 'temp': extract_temp_as_float(temp)})
        current_city = None  # reset the current city


print(temps[:3])

 

And the first three results are shown below. We will assume all the others are correct, and verify a few more data points once we have the file in CSV format.

 

[{'city': 'Accra', 'temp': 27.0},
 {'city': 'Edmonton', 'temp': -4.0},
 {'city': 'New Delhi', 'temp': 13.0}]

 


 

Export Scraped Data in Python

 

It appears our script has worked. Now all we need to do is save the data as a CSV so we don't lose it.

 


df = pd.DataFrame(temps)


print(df.head())
print(df.tail())

"""
          city  temp
0        Accra  27.0
1     Edmonton  -4.0
2    New Delhi  13.0
3  Addis Ababa  15.0
4    Frankfurt   9.0
        city  temp
134   Zürich   8.0
135    Dubai  23.0
136  Nairobi  19.0
137   Dublin   5.0
138   Nassau  20.0
"""

 

 

And that's it! The file will be saved to your current working directory, and you can now open it in Excel or Notepad, or maybe even store it on a Linux virtual machine if you are feeling adventurous.

 

 

df.to_csv('global_temps.csv', index=False)
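
As promised, we can spot-check a few values by reading the file straight back in. A minimal sanity check, assuming the CSV landed in the current working directory:

# read the CSV back and compare a couple of cities against the output above
check = pd.read_csv('global_temps.csv')

print(check.shape)                       # (139, 2) in our run
print(check[check['city'] == 'Dublin'])  # should show Dublin at 5.0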

 

 

✅ Summary: What We Covered

 

In this hands-on guide, you learned how to build a basic but effective web scraper using Python and the BeautifulSoup library. Here's what we tackled:

 

  • 🔧 Setup – Installed beautifulsoup4, requests, and made sure Python was ready to go.

 

  • 🌐 Fetched Web Content – Used requests.get() to pull data from timeanddate.com.

 

  • 🍲 Parsed the HTML Soup – Leveraged BeautifulSoup to navigate and inspect the page structure.

 

  • 📊 Extracted Key Data – Identified and pulled the relevant <td> elements holding city names and temperatures.

 

  • 💾 Saved to Disk – Stored the scraped data in a clean CSV file using pandas.

 

 

📚 Further Reading