The top 5 libraries for Data Science in 2021


If you ever work on a Data Science project, you will need the following 5 libraries.

Pandas, Numpy, Matplotlib, Requests, BeautifulSoup

  1. Pandas deals with the manipulation of datasets in pretty much any shape or form.
  2. Numpy deals with mathematical operations on arrays.
  3. Matplotlib helps when charting the data.
  4. Requests will help you fetch data over HTTP.
  5. BeautifulSoup will help you parse HTML data.

Pandas

The most important one will be Pandas.

Pandas will give you superpowers when dealing with DataFrame-like structures.

It can read almost anything: CSV, TSV, JSON, SQL, Excel, pickle, API responses, etc.

Its performance-critical parts are written in C and Cython, which makes it more efficient.

You will be able to read files, manipulate data, handle missing data, etc.
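
As a minimal sketch of reading data and handling missing values, the example below parses a small CSV and fills a gap with the column mean. The sample data and column names are made up for illustration; in practice you would pass a file path or URL to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical sample data; one temperature value is missing on purpose.
csv_text = """city,temperature
Paris,21.5
Lyon,
Nice,25.0
"""

# read_csv accepts a path, a URL, or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in each column.
missing = df.isna().sum()

# Fill the missing temperature with the mean of the known values.
df["temperature"] = df["temperature"].fillna(df["temperature"].mean())
```

Filling with the mean is only one possible strategy; dropping the row with `df.dropna()` is another common choice.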

Tags : Pandas, Data Manipulation, Data Handling, DataFrame

Numpy

Numpy is to Pandas what butter is to bread. They are destined to be used together.

Numpy will give you tools to handle multi-dimensional arrays and matrices, along with powerful high-level mathematical functions to operate on these arrays.
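
Here is a small sketch of what that looks like in practice. The array values are arbitrary and only serve as an illustration.

```python
import numpy as np

# A 2x2 matrix and a vector with arbitrary illustrative values.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
vector = np.array([10.0, 20.0])

# Element-wise arithmetic applies to the whole array at once.
scaled = matrix * 2

# Reductions can run along an axis: axis=0 gives one mean per column.
column_means = matrix.mean(axis=0)

# The @ operator performs a matrix-vector product.
product = matrix @ vector
```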

Tags : Numpy, Mathematics

Matplotlib

Matplotlib is a powerful charting library.

It will allow you to create amazing graphs for your reports or presentations.
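
A minimal sketch of a line chart follows. The sales figures and labels are invented for the example, and the `Agg` backend is chosen here only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Invented data for illustration.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 150, 170]

fig, ax = plt.subplots()
ax.plot(months, sales, marker="o")
ax.set_title("Monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")

# Render the chart as PNG into memory; a filename such as "sales.png" works too.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```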

Tags : Matplotlib, Plot

Requests

Requests will help you when dealing with HTTP requests.

You can get JSON, CSV or HTML from APIs or web pages.
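
The sketch below shows one way to fetch JSON from an API. The helper name `fetch_json` and the URL `https://api.example.com/items` are hypothetical placeholders, not a real service. The last lines only build a request without sending it, so you can inspect the final URL offline.

```python
import requests


def fetch_json(url, params=None, timeout=10):
    """Send a GET request and return the decoded JSON body (hypothetical helper)."""
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.json()


# Build a request without sending it, to see how query parameters are encoded.
# The URL is a placeholder, not a real API.
prepared = requests.Request(
    "GET", "https://api.example.com/items", params={"page": 2}
).prepare()
print(prepared.url)
```

For CSV or HTML responses, `response.text` gives you the body as a string instead of `response.json()`.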

Tags : Requests, APIs, Data Gathering, Data Handling, HTTP

Beautiful Soup

Beautiful Soup is a powerful HTML parsing library.

Sometimes, when you don't get a proper JSON response, an HTML parser can come in handy.

It will help you extract structured data from unstructured HTML.
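
As a rough sketch, here is how the rows of an HTML table can be turned into plain Python lists. The HTML snippet is made up for the example; a real page would usually come from a Requests response.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration.
html = """
<table>
  <tr><th>Library</th><th>Purpose</th></tr>
  <tr><td>Pandas</td><td>Data manipulation</td></tr>
  <tr><td>Matplotlib</td><td>Charting</td></tr>
</table>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row only has <th> cells, so it is skipped
        rows.append(cells)
```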

Tags : BeautifulSoup, APIs, HTTP, Data Gathering, HTML