Web Scraping with Python: A Practical Introduction Using BeautifulSoup 🍜 (Step-by-Step Guide for Beginners)

by John March 27, 2025



 

Web scraping 🕸️ is a powerful technique for automated data collection, letting us extract information from websites programmatically. In Python 🐍, one of the most widely used and beginner-friendly tools for the job is BeautifulSoup. It’s clean, flexible, and makes working with HTML and XML very easy. In this hands-on guide, we’ll walk through a practical example: building a scraper that grabs real-time 🌡️ temperature data from timeanddate.com and saves the results straight into a CSV file 💾, ready for analysis or automation.

 

🚀 Why Learn Web Scraping with Python? 🤖💡

 

- The ability to build your own web scraping tools with Python is invaluable 💪

- While there are tons of paid web scraping services out there, doing it yourself means unlimited access to the internet’s data 🛢️📊
They say data is the new oil 🤷‍♂️, so why not drill it yourself?

 

We hope this mini project sparks some interest in those who’ve never tried web scraping before. While we don’t dive into every detail, the goal here is to give you a hands-on taste of what’s possible, and maybe even get you hooked 🔍💻. Let’s begin!


🔧 Getting Set Up

Make sure you’ve got Python installed on your system. If not, you can grab the latest version from python.org 🐍. To follow along with this tutorial, you’ll also need two Python packages: BeautifulSoup (for parsing HTML) and requests (for making the web requests). Install both with a single command:

 

pip install beautifulsoup4 requests
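Optionally, if you’d like to keep these packages isolated from the rest of your system, you can install them inside a virtual environment first (the environment name scraper-env below is just an example):

python -m venv scraper-env
source scraper-env/bin/activate    # on Windows: scraper-env\Scripts\activate
pip install beautifulsoup4 requests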

 

 

Parsing HTML with BeautifulSoup in Python 🥣

 

To start scraping, we need to fetch the web page and parse its HTML. Recall that our target is https://www.timeanddate.com/weather/, which displays a nice table like the one shown in the image below.

 

 

Weather Data Scraping Target With Python

 

 Here’s how to use requests and BeautifulSoup to get the raw data from timeanddate.com:

 

import requests
from bs4 import BeautifulSoup
import pandas as pd  # used later to build and export the DataFrame


# fetch the page and parse the returned HTML
response = requests.get('https://www.timeanddate.com/weather/')

soup = BeautifulSoup(response.text, 'html.parser')
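One hedge worth adding in practice: requests won’t raise an error by itself when the server responds with a 404 or 500 page. A slightly more defensive version of the fetch might look like this (the User-Agent header is purely illustrative, not something the site is known to require):

# a minimal defensive fetch: fail loudly on HTTP errors instead of
# silently parsing an error page
response = requests.get(
    'https://www.timeanddate.com/weather/',
    headers={'User-Agent': 'Mozilla/5.0'},  # illustrative UA string
    timeout=10,                             # avoid hanging forever on a slow connection
)
response.raise_for_status()                 # raises HTTPError on 4xx/5xx responses
soup = BeautifulSoup(response.text, 'html.parser')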

 

Once we have all the elements pulled into Python, we’ve got some thinking to do 🤔. The BeautifulSoup library isn’t a magic wand; there’s still a bit of detective work involved to extract meaningful data. After inspecting the page, we can see that the information we want (city name and temperature 🌡️) is organized in a table like the one we showed above. That means the relevant values will be found inside <td></td> elements, which represent individual pieces of table data.

 

Below we use the find_all() method to narrow down our search. 

 

tds = soup.find_all('td')

print(f"there are {len(tds)} table data elements")


'''
there are 564 table data elements
'''

 

Looks like we have 564 table elements to parse! 
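A quick aside before we dig in: find_all('td') searches the entire document, so if the page contained other tables we would be collecting their cells too. If you wanted to be stricter, you could scope the search to a single table first. Here is a sketch that assumes the weather listing is the first <table> on the page (an assumption you should verify in your browser’s inspector):

# scope the search to one table rather than the whole document
table = soup.find('table')                   # assumes the weather table comes first
tds = table.find_all('td') if table else []  # fall back to an empty list if not found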

BeautifulSoup Table Extraction

 

We’re making some progress. Now let’s print out the first 10 elements to see if we can spot a pattern.

 

for td in tds[:10]:
    print(td)


'''
<td><a href="/weather/ghana/accra">Accra</a><span class="wds" id="p0s"></span></td>
<td class="r" id="p0">Sat 05:47</td>
<td class="r"><img alt="Clear. Warm." height="40" src="//c.tadst.com/gfx/w/svg/wt-13.svg" title="Clear. Warm." width="40"/></td>
<td class="rbi">27 ยฐC</td>
<td><a href="/weather/canada/edmonton">Edmonton</a><span class="wds" id="p47s"></span></td>
<td class="r" id="p47">Fri 22:47</td>
<td class="r"><img alt="Passing clouds. Cold." height="40" src="//c.tadst.com/gfx/w/svg/wt-14.svg" title="Passing clouds. Cold." width="40"/></td>
<td class="rbi">-4 ยฐC</td>
<td><a href="/weather/india/new-delhi">New Delhi</a><span class="wds" id="p94s"></span></td>
<td class="r" id="p94">Sat 11:17</td>

'''

 

Now we’re getting even closer: it appears we’ve found the necessary pattern. It is as follows:

- The presence of a link element <a></a> indicates we have found a new place; between the <a> tags we find the city name, as in <a>Accra</a>, <a>Edmonton</a>, and <a>New Delhi</a> above.

- The third <td> after the one containing the <a> tag holds the temperature in degrees Celsius, inside a <td> with the class "rbi".
 

 

Most of the hard work has now been done; all that’s left is to write a simple script to extract the data. First we create a small helper function to parse the temperature. Note that \xa0 represents a non-breaking space in HTML, used to ensure the characters on either side stay on the same line.

 

def extract_temp_as_float(temp):
    # "27\xa0°C" -> 27.0: keep only the number before the non-breaking space
    return float(temp.split("\xa0")[0])
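As a quick sanity check, here’s the helper applied to strings shaped like the cell text we saw above:

print(extract_temp_as_float('27\xa0°C'))   # 27.0
print(extract_temp_as_float('-4\xa0°C'))   # -4.0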

 

And then loop over the soup 🍜

 

temps = []
current_city = None

for td in tds:
    # check whether there is an <a> tag present in this td
    if td.find('a'):
        # if we have a link, extract the city name
        current_city = td.get_text().strip()

    # if we already have a city and this is a temperature cell
    elif 'rbi' in td.get('class', []) and current_city:
        temp = td.get_text().strip()
        # run the raw temperature through the helper function and
        # add the city/temp pair to our list of temperatures
        temps.append({'city': current_city, 'temp': extract_temp_as_float(temp)})
        current_city = None  # reset the current city


print(temps[:3])

 

The first three results are shown below. We’ll assume the rest are correct, and spot-check a few more data points once we have the file in CSV format.

 

[{'city': 'Accra', 'temp': 27.0},
 {'city': 'Edmonton', 'temp': -4.0},
 {'city': 'New Delhi', 'temp': 13.0}]
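One caveat worth flagging: if a temperature cell ever contained something non-numeric (say, a hypothetical "N/A" while a station is offline; we didn’t observe this in our run), extract_temp_as_float() would raise a ValueError and stop the whole loop. A slightly more defensive variant could return None instead:

def extract_temp_as_float(temp):
    # same split as before, but return None for non-numeric cells
    try:
        return float(temp.split("\xa0")[0])
    except ValueError:
        return None

If you use this variant, remember to filter the None entries out of temps before building the DataFrame.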

 


 

Export Scraped Data in Python

 

It appears our script has worked. Now all we need to do is save the data as a CSV so we don’t lose it.

 


# build a DataFrame from our list of city/temp dictionaries
df = pd.DataFrame(temps)


print(df.head())
print(df.tail())

"""
          city  temp
0        Accra  27.0
1     Edmonton  -4.0
2    New Delhi  13.0
3  Addis Ababa  15.0
4    Frankfurt   9.0
        city  temp
134   Zürich   8.0
135    Dubai  23.0
136  Nairobi  19.0
137   Dublin   5.0
138   Nassau  20.0
"""

 
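Before saving, pandas also makes the spot-checking we promised easy. Sorting by temperature, for instance, lets us eyeball the extremes for anything obviously wrong:

print(df.sort_values('temp').head(3))   # the three coldest cities in our scrape
print(df.sort_values('temp').tail(3))   # the three warmest cities in our scrape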

 

And to save it, we call to_csv(). The file will be written to your current working directory, and you can then open it in Excel or Notepad, or maybe even store it on a Linux virtual machine if you’re feeling adventurous.

 

 

df.to_csv('global_temps.csv', index=False)  # index=False keeps the row numbers out of the file
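If you want to confirm the export worked, reading the file straight back in is a one-liner. For the run shown above, the shape should report 139 rows (the tail of the DataFrame ended at index 138) and two columns:

check = pd.read_csv('global_temps.csv')
print(check.shape)   # expected for this run: (139, 2)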

 

 

✅ Summary: What We Covered

 

In this hands-on guide, you learned how to build a basic but effective web scraper using Python and the BeautifulSoup library. Here's what we tackled:

 

  • 🔧 Setup – Installed beautifulsoup4 and requests, and made sure Python was ready to go.

 

  • ๐ŸŒ Fetched Web Content – Used requests.get() to pull data from timeanddate.com.

 

  • ๐Ÿฒ Parsed the HTML Soup – Leveraged BeautifulSoup to navigate and inspect the page structure.

 

  • 📊 Extracted Key Data – Identified and pulled the relevant <td> elements holding city names and temperatures.

 

  • 💾 Saved to Disk – Stored the scraped data in a clean CSV file using pandas.

 

 

📚 Further Reading