How to Read an HTML Page with Pandas Using Python
Here is how to read a pandas DataFrame from an HTML page using the pandas.read_html() method.
Here is the code
How to read data
Here we read a table from a Wikipedia page into a DataFrame.
The pandas.read_html() method returns a list containing one DataFrame for each table it finds on the page. Here we take the sixth DataFrame, which sits at index 5 because Python lists are zero-indexed.
# We import the pandas library for the dataframes
import pandas as pd
# We read every table on the page and keep the one at index 5
df = pd.read_html("https://en.wikipedia.org/wiki/National_Basketball_Association")[5]
# We print the dataframe
print(df)
We now have the list of NBA championships in a DataFrame.
There you go! You now know how to read an HTML web page with Pandas.
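Picking a table by position can break if the page gains or loses a table. As a minimal sketch, the example below uses a small inline HTML string (a made-up stand-in for a real page, not the Wikipedia article) to show both selection by index and selection with the match parameter of pandas.read_html(), which keeps only the tables whose text matches a string or regular expression. It assumes an HTML parser such as lxml, or beautifulsoup4 with html5lib, is installed alongside pandas.

```python
from io import StringIO

import pandas as pd

# Hypothetical inline HTML standing in for a real page with two tables.
html = """
<table>
  <tr><th>Team</th><th>City</th></tr>
  <tr><td>Celtics</td><td>Boston</td></tr>
  <tr><td>Lakers</td><td>Los Angeles</td></tr>
</table>
<table>
  <tr><th>Year</th><th>Champion</th></tr>
  <tr><td>2008</td><td>Celtics</td></tr>
  <tr><td>2010</td><td>Lakers</td></tr>
</table>
"""

# read_html returns one DataFrame per <table> element it finds.
tables = pd.read_html(StringIO(html))
print(len(tables))

# Select the second table by position (index 1)...
champions_by_index = tables[1]

# ...or keep only the tables whose text matches "Champion".
champions_by_match = pd.read_html(StringIO(html), match="Champion")[0]
print(champions_by_match)
```

On a real page, it may help to print the number of tables and the columns of each one before settling on an index or a match string.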