How to compute the efficient frontier with Pandas using Python
When building a portfolio, you will eventually have to decide which assets to pick and how much to allocate to each of them.
This is one of the everlasting questions in finance.
What is an Optimal Portfolio Composition?
An optimal portfolio composition is one that meets your objective better than the alternatives, which usually means beating the market or competing strategies.
What is beating the market?
Well, it can be many things.
Your portfolio can:
- be more stable than the market
- deliver higher returns than the market
- hedge the market (neutralize its risk)
- generate income (e.g. with dividends)
- etc.
It all depends on what your benchmark is and what goal you are trying to achieve.
Different financial goals will aim for different portfolios and thus different optimal portfolio compositions.
Ok, but now.
How to find an Optimal Portfolio Composition?
Once you have defined the benchmark to beat, you can look for the optimal portfolio composition for your strategy.
Using the efficient frontier is one way to go about it.
What is the efficient frontier?
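The efficient frontier is the set of portfolios that offer the highest expected return for each level of risk (equivalently, the lowest risk for each level of expected return) for a given selection of assets.
Each point on it corresponds to an allocation, so it shows you which allocations give the best risk-reward trade-off.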
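To make the risk-reward idea concrete, here is a minimal sketch of how the expected return and volatility of a portfolio follow from its weights. The two assets, their expected returns, and their covariance matrix are made-up numbers chosen for illustration, and portfolio_stats is a hypothetical helper name, not something from a library.

```python
import numpy as np

# Hypothetical annualized figures for two assets (illustrative, not real data)
expected_returns = np.array([0.08, 0.12])
cov_matrix = np.array([
    [0.04, 0.006],
    [0.006, 0.09],
])

def portfolio_stats(weights, expected_returns, cov_matrix):
    """Return (expected return, volatility) for a vector of weights."""
    weights = np.asarray(weights, dtype=float)
    port_return = float(weights @ expected_returns)
    port_vol = float(np.sqrt(weights @ cov_matrix @ weights))
    return port_return, port_vol

# Compare holding each asset alone with an equal split
for w in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):
    r, v = portfolio_stats(w, expected_returns, cov_matrix)
    print(f"weights={w} return={r:.3f} volatility={v:.3f}")
```

With these assumed numbers, the equal split ends up less volatile than either asset held alone, because the two assets are only weakly correlated. The efficient frontier is what you get when you search over all such weight combinations.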
How to do it in Python?
Computing the efficient frontier with Pandas in Python requires the following steps:
- Load financial data into a Pandas dataframe. This data should include returns for multiple assets.
- Calculate the expected returns and covariance matrix for the assets.
- Use the scipy.optimize library to minimize the portfolio variance for different target returns.
- Plot the resulting efficient frontier, with the x-axis being the standard deviation (risk) and the y-axis being the expected return.
Here's a code example that demonstrates these steps:
import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
import yfinance as yf
# Load financial data into a Pandas dataframe
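# period="5y" is an illustrative choice; recent yfinance versions only return "Adj Close" when auto_adjust=False
prices = yf.download(["AAPL", "MSFT", "TSLA", "GOOG", "BTC-USD", "XOM"], period="5y", auto_adjust=False)["Adj Close"]
# Align all assets on a daily calendar and carry the last price forward over non-trading days
prices = prices.resample("D").mean().ffill()
# Compute daily returns from the prices
returns = prices.pct_change().dropna()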
# Calculate expected returns and covariance matrix
expected_returns = returns.mean()
cov_matrix = returns.cov()
# Define the objective to minimize: portfolio volatility (standard deviation of returns)
def portfolio_volatility(weights, expected_returns, cov_matrix):
    # expected_returns is unused here; it is kept so the signature matches the args passed to minimize
    weights = np.asarray(weights)
    return np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
# Minimize portfolio volatility for each target return
# Targets span the range of daily returns reachable with long-only weights that sum to 1
n_assets = returns.shape[1]
frontier_y = np.linspace(expected_returns.min(), expected_returns.max(), 50)
frontier_volatility = []
bounds = [(0, 1)] * n_assets
initial_guess = np.full(n_assets, 1 / n_assets)
for possible_return in frontier_y:
    constraints = (
        # Weights must sum to 1
        {'type': 'eq', 'fun': lambda x: np.sum(x) - 1},
        # Portfolio expected return must equal the target return
        {'type': 'eq', 'fun': lambda x, target=possible_return: np.dot(x, expected_returns) - target},
    )
    result = minimize(portfolio_volatility, initial_guess, args=(expected_returns, cov_matrix), method='SLSQP', bounds=bounds, constraints=constraints)
    frontier_volatility.append(result['fun'])
# Plot the efficient frontier
plt.plot(frontier_volatility, frontier_y, 'g--')
plt.xlabel('Portfolio Volatility')
plt.ylabel('Expected Return')
plt.title('Efficient Frontier')
plt.show()
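The curve traced this way also contains portfolios below the minimum-volatility point, and only the part above that point is efficient. As a rough sketch of how you might pick specific portfolios on the frontier, the example below finds the minimum-volatility and maximum-Sharpe allocations. It runs on synthetic returns for three hypothetical assets rather than real market data, and it assumes a risk-free rate of zero; the tight ftol is there only because daily figures are small numbers.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic daily returns for three hypothetical assets (not real market data)
rng = np.random.default_rng(0)
true_means = np.array([0.0004, 0.0006, 0.0009])
true_vols = np.array([0.01, 0.015, 0.025])
sample = rng.normal(true_means, true_vols, size=(1000, 3))

expected_returns = sample.mean(axis=0)
cov_matrix = np.cov(sample, rowvar=False)
n_assets = len(expected_returns)

def volatility(weights):
    # Standard deviation of the portfolio's daily return
    return np.sqrt(weights @ cov_matrix @ weights)

def negative_sharpe(weights, risk_free_rate=0.0):
    # Assumption: risk-free rate of zero per period
    return -(weights @ expected_returns - risk_free_rate) / volatility(weights)

# Long-only weights that sum to 1
constraints = ({'type': 'eq', 'fun': lambda w: np.sum(w) - 1},)
bounds = [(0, 1)] * n_assets
x0 = np.full(n_assets, 1 / n_assets)
options = {'ftol': 1e-12}

min_vol = minimize(volatility, x0, method='SLSQP', bounds=bounds,
                   constraints=constraints, options=options)
max_sharpe = minimize(negative_sharpe, x0, method='SLSQP', bounds=bounds,
                      constraints=constraints, options=options)

print("Minimum-volatility weights:", np.round(min_vol.x, 3))
print("Maximum-Sharpe weights:", np.round(max_sharpe.x, 3))
```

The minimum-volatility portfolio marks where the efficient part of the frontier begins, while the maximum-Sharpe portfolio is one common way to formalize "the best risk-reward ratio".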