Vilhelm Bergsøe

Notes on portfolio optimization

Thu 07 Aug 2025 - 14 min read | 47 view(s)

In the realm of finance, portfolio optimization is the process of selecting the optimal allocation of assets to maximize returns while minimizing risk. This blog post shares my notes on the mathematics of portfolio optimization, starting with single-asset optimization and progressing to more complex scenarios.

Just a heads-up!

While I've explored portfolio optimization and have gained a decent grasp of the maths involved, these notes are more for my understanding than expert advice. So, take it all with a grain of salt! ;)

Let's begin with the simplest problem:

Optimizing the allocation to a single asset.

Suppose we have an asset with a expected return of RR and a variance of σ2\sigma^2. Our goal is to find the optimal allocation ww that minimizes the variance while maximizing the return. This is described using an objective function as follows:

min12σ2w2Rw\min \quad \frac{1}{2}\sigma^2 w^2 - R w

There’s a lot happening here, so let’s break it down.

You might notice the quadratic function form ax2+bx+cax^2+bx+c; this is intentional. The ax2ax^2 term represents our risk, scaled by the square of the allocation ww.

The squared allocation w2w^2 can be a bit confusing, but here’s why it matters: When we scale the asset return RR by ww, we get a new return wRwR. The variance of this scaled return is given by:

Var(wR)=E[(wRE[wR])2]\text{Var}(wR) = \mathbb{E}\left[\left(wR - \mathbb{E}\left[wR\right]\right)^2\right]

Since ww is a constant, we can factor it out: =E[(w(RE[R]))2] = \mathbb{E}\left[\left(w\left(R - \mathbb{E}\left[R\right]\right)\right)^2\right] =w2E[(RE[R])2] = w^2 \cdot \mathbb{E}\left[\left(R - \mathbb{E}\left[R\right]\right)^2\right]

The last term is simply the variance of RR: =w2σ2 = w^2 \sigma^2

Thus, we see that the variance of the scaled return is proportional to the square of the allocation.

In short: Increasing your investment in an asset not only boosts returns but also raises your exposure to return variability. For example, doubling your allocation doesn’t just double the return; it quadruples your risk. This quadratic relationship explains the squared term in the variance calculation.

Now, back to our objective function: the coefficient aa is 12σ2\frac{1}{2}\sigma^2, representing the asset's risk.

The 12\frac{1}{2} in front of the risk term is really just to give us a cleaner derivative as we'll see later

The bxbx term, on the other hand, represents the return of the asset, scaled by the allocation ww. Here, bb is equal to R−R, which is the negative of the return of the asset. The negation is used because we want to maximize the return, and the minimization problem will try to minimize the negative return, effectively maximizing the actual return.

The cc term is absent in this case, as there is no constant term in our objective function.

Now, to find the optimal allocation ww, we need to minimize this quadratic function. We can do this by taking the derivative of the function with respect to ww, setting it to zero, and solving for ww. This will give us the value of ww that minimizes the function.

Let's do that:

w(12σ2w2Rw)=0\frac{\partial{}}{\partial{w}} (\frac{1}{2}\sigma^2w^2 - Rw) = 0

σ2wR=0\sigma^2w - R = 0

w=Rσ2w = \frac{R}{\sigma^2}

So, the optimal allocation to the single asset is given by the ratio of the return to the variance. This makes intuitive sense, as we want to allocate more to assets with high returns and low risk, and less to assets with low returns and high risk.

But, what if we have multiple assets to choose from? How do we optimize the allocation across multiple assets?

Optimizing a Portfolio of multiple assets

We’ve seen how to optimize the allocation to a single asset, where the trade-off is between return and risk. The optimal allocation was determined by balancing the expected return RR against the variance σ2\sigma2 (the risk) of the asset. But what happens when we have two or more assets to choose from?

Extending to Two Assets

Let's start small, with just two assets, Asset A and Asset B. Each asset has its own expected return RAR_{A}, RBR_{B} and variance σA2\sigma^2_{A}, σB2\sigma^2_{B}, just like the single asset case. If we want to allocate our capital between these two assets, we'll assign weights wAw_{A} and wBw_{B}, where the expected return of the portfolio is a simple weighted average of the expected returns of the individual assets:

Rp=wARA+wBRBR_{p} = w_{A}R_{A} + w_{B}R_{B}

This is straightforward: the more you allocate to an asset, the more its return contributes to the portfolio's overall return.

Here's where things start to get interesting. You might assume that the portfolio risk is also just the weighted average of the individual risks:

σp2=wA2σA2+wB2σB2\sigma^2_{p} = w^2_{A}\sigma^2_{A} + w^2_{B}\sigma^2_{B}

At first glance, this seems reasonable. But it's incomplete.

Why? Because it doesn't take into account how the two assets interact. We need to consider how the returns of Asset A and Asset B move in relation to each other. This is where covariance comes in.

Why Covariance Matters

Imagine that Asset A and Asset B are highly correlated -- when Asset A goes up, Asset B tends to go up too. In this case, the portfolio's risk will be higher because both assets are likely to experience gains and losses together, reinforcing each other's volatility.

On the other hand, if Asset A and Asset B tend to move in opposite directions (i.e., they have negative correlation), the portfolio's overall risk will be lower. This is because the losses in one asset might be offset by gains in the other, reducing the total volatility of the portfolio.

Deriving the formula for variance of multiple assets

To derive the formula for the variance of a portfolio with multiple assets, we begin with the definition of variance. For any random variable, the variance measures the spread of its possible outcomes around its mean. In the case of a portfolio, the variance of the portfolio return, RpR_p, is given by:

σp2=E[(RpE[Rp])2]\sigma^2_p = \mathbb{E}\left[\left(R_p - \mathbb{E}[R_p]\right)^2\right]

If our portfolio consists of two assets, the total return of the portfolio is a weighted sum of the individual asset returns. Specifically, if wAw_A and wBw_B represent the weights of Asset A and Asset B, respectively, then the portfolio return is, like described before:

Rp=wARA+wBRBR_p = w_A R_A + w_B R_B

Substituting this into the variance formula, we get:

σp2=E[(wARA+wBRBE[wARA+wBRB])2]\sigma^2_p = \mathbb{E}\left[\left(w_A R_A + w_B R_B - \mathbb{E}[w_A R_A + w_B R_B]\right)^2\right]

Since wAw_A and wBw_B are constants, we can factor them out of the expectation, giving us:

σp2=E[(wA(RAE[RA])+wB(RBE[RB]))2]\sigma^2_p = \mathbb{E}\left[\left(w_A (R_A - \mathbb{E}[R_A]) + w_B (R_B - \mathbb{E}[R_B])\right)^2\right]

At this stage, we expand the square inside the expectation, which gives us three terms:

σp2=E[wA2(RAE[RA])2+wB2(RBE[RB])2+2wAwB(RAE[RA])(RBE[RB])]\sigma^2_p = \mathbb{E}\left[w_A^2 (R_A - \mathbb{E}[R_A])^2 + w_B^2 (R_B - \mathbb{E}[R_B])^2 + 2 w_A w_B (R_A - \mathbb{E}[R_A])(R_B - \mathbb{E}[R_B])\right]

The first two terms correspond to the individual variances of each asset, scaled by the square of their respective weights like we showed earlier with the single-asset case. Specifically, w2(RE[R])2w^2 (R - \mathbb{E}[R])^2 simplifies to w2σ2w^2 \sigma^2.

The third term, 2wAwB(RAE[RA])(RBE[RB])2 w_A w_B (R_A - \mathbb{E}[R_A])(R_B - \mathbb{E}[R_B]), measures how the returns of the two assets move together—specifically, this is the covariance between the returns of Asset A and Asset B. By definition:

Cov(A,B)=E[(RAE[RA])(RBE[RB])]\text{Cov}(A, B) = \mathbb{E}\left[(R_A - \mathbb{E}[R_A])(R_B - \mathbb{E}[R_B])\right]

The factor of 2 comes from the expansion of (wARA+wBRB)2(w_A R_A + w_B R_B)^2, where the covariance term appears twice: once for RAR_A paired with RBR_B and once for RBR_B paired with RAR_A.

Thus, the cross-term becomes:

2wAwBCov(A,B)2 w_A w_B \text{Cov}(A, B)

Finally, combining all the terms, the variance of the portfolio is:

σp2=wA2σA2+wB2σB2+2wAwBCov(A,B)\sigma^2_p = w_A^2 \sigma^2_A + w_B^2 \sigma^2_B + 2 w_A w_B \text{Cov}(A,B)

This formula shows that the portfolio variance depends not only on the individual variances of each asset but also on how the returns of the assets interact, as captured by the covariance term. This is why diversification works — if the assets are not perfectly correlated, the covariance term can reduce overall portfolio risk, even while the portfolio maintains a positive return.

This formula for two assets is already getting a bit long. Imagine trying to write it for a portfolio of 100 assets! This is where the elegance of linear algebra comes to our rescue, allowing us to express these concepts in a much cleaner way.

Generalizing to N Assets with Matrix Notation

To handle a portfolio with any number of assets, we switch to vectors and matrices. Let's define the key components for a portfolio of NN assets:

  1. The Weights Vector (w\mathbf{w}): A column vector containing the allocation to each asset. w=[w1w2wN]\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}

  2. The Returns Vector (R\mathbf{R}): A column vector of the expected returns for each asset. R=[R1R2RN]\mathbf{R} = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{bmatrix}

  3. The Covariance Matrix (Σ\mathbf{\Sigma}): An N×NN \times N matrix that captures the variance of each asset and the covariance between each pair of assets. Σ=[σ12Cov(1,2)Cov(1,N)Cov(2,1)σ22Cov(2,N)Cov(N,1)Cov(N,2)σN2]\mathbf{\Sigma} = \begin{bmatrix} \sigma_1^2 & \text{Cov}(1,2) & \cdots & \text{Cov}(1,N) \\ \text{Cov}(2,1) & \sigma_2^2 & \cdots & \text{Cov}(2,N) \\ \vdots & \vdots & \ddots & \vdots \\ \text{Cov}(N,1) & \text{Cov}(N,2) & \cdots & \sigma_N^2 \end{bmatrix}

    Note that the diagonal of Σ\mathbf{\Sigma} contains the individual asset variances (σi2\sigma_i^2), and the off-diagonal elements contain the covariances. Since Cov(i,j)=Cov(j,i)\text{Cov}(i,j) = \text{Cov}(j,i), the matrix is symmetric.

With this notation, the portfolio's expected return and variance are expressed very neatly:

  • Portfolio Return: Rp=wTRR_p = \mathbf{w}^T \mathbf{R}
  • Portfolio Variance: σp2=wTΣw\sigma_p^2 = \mathbf{w}^T \mathbf{\Sigma} \mathbf{w}

This is the exact same math as before, just generalized for NN assets. The term wTR\mathbf{w}^T \mathbf{R} is the dot product of the weights and returns, giving us a weighted average. The term wTΣw\mathbf{w}^T \mathbf{\Sigma} \mathbf{w} is a quadratic form that perfectly captures all the variance and covariance terms.

The N-Asset Optimization Problem

Now we can formulate the core problem of portfolio optimization. We want to find the best weight vector w\mathbf{w}. A common way to frame this is as maximizing a utility function that balances return and risk:

maxwwTRλ2wTΣw\max_{\mathbf{w}} \quad \mathbf{w}^T \mathbf{R} - \frac{\lambda}{2} \mathbf{w}^T \mathbf{\Sigma} \mathbf{w}

This should look familiar! It's the multi-asset version of our single-asset objective function. The new term, λ\lambda (lambda), is a risk-aversion parameter.

What is λ\lambda?

λ\lambda represents how much you dislike risk.

  • A high λ\lambda means you are very risk-averse. The optimization will heavily penalize variance, leading to a safer, lower-return portfolio.
  • A low λ\lambda means you are more willing to take on risk for potentially higher returns.

By varying λ\lambda, we can trace out a whole set of optimal portfolios.

We also typically add the constraint that all weights must sum to one: i=1Nwi=1\sum_{i=1}^{N} w_i = 1.

Solving the Optimization Problem

So we have our objective function, but how do we actually find the optimal weight vector w\mathbf{w} that maximizes it?

The process is remarkably similar to the single-asset case. We take the derivative with respect to our variable (w\mathbf{w}), set it to zero, and solve. The only difference is that we're now using matrix calculus.

Our objective function is: maxwwTRλ2wTΣw\max_{\mathbf{w}} \quad \mathbf{w}^T \mathbf{R} - \frac{\lambda}{2} \mathbf{w}^T \mathbf{\Sigma} \mathbf{w}

Taking the derivative with respect to the vector w\mathbf{w} and setting it to zero gives us: RλΣw=0\mathbf{R} - \lambda\mathbf{\Sigma}\mathbf{w} = 0

Now, we just need to solve for w\mathbf{w}. Rearranging the equation, we get: λΣw=R\lambda\mathbf{\Sigma}\mathbf{w} = \mathbf{R} w=1λΣ1R\mathbf{w} = \frac{1}{\lambda} \mathbf{\Sigma}^{-1} \mathbf{R}

Let's pause and appreciate how elegant this solution is. It tells us that the optimal allocation w\mathbf{w} is proportional to Σ1R\mathbf{\Sigma}^{-1}\mathbf{R}.

  • This is the direct multi-asset equivalent of our single-asset solution, w=R/σ2w = R/\sigma^2.
  • Instead of dividing by the variance (σ2\sigma^2), we multiply by the inverse of the covariance matrix (Σ1\mathbf{\Sigma}^{-1}). This matrix inversion is the magic step that accounts for all the complex interactions between every asset in the portfolio.
  • The result is then scaled by our risk aversion, 1/λ1/\lambda.

A note on constraints: The solution above is for the unconstrained case. Adding the constraint that all weights must sum to one (wT1=1\mathbf{w}^T \mathbf{1} = 1) makes the math a bit more involved (it requires a technique called Lagrange multipliers), but the core concept remains the same: we are finding the weights that give us the best risk-return trade-off.

By solving this problem for different values of λ\lambda, we can generate a whole family of optimal portfolios. This whole set of optimal portfolios is actually called the efficient frontier. It's something that I'll dive into in a future post, but for now, let's solve an example portfolio.

A Simple Worked Example

Let's make this real with a simple two-asset portfolio. Suppose we have two stocks:

  • TechCorp (T): A high-growth tech stock.
  • GlobalGoods (G): A stable, global consumer goods company.

We've estimated their financial characteristics as follows:

  • Expected Returns (R\mathbf{R}): TechCorp: 10%, GlobalGoods: 6%
  • Standard Deviations (σ\sigma): TechCorp: 20%, GlobalGoods: 15%
  • Correlation (ρ\rho): The returns have a low positive correlation of 0.3.

First, let's assemble our vectors and matrices.

The returns vector R\mathbf{R} is straightforward: R=[0.100.06]\mathbf{R} = \begin{bmatrix} 0.10 \\ 0.06 \end{bmatrix}

For the covariance matrix Σ\mathbf{\Sigma}, we need the variances and the covariance.

  • Variance of TechCorp: σT2=(0.20)2=0.04\sigma_T^2 = (0.20)^2 = 0.04
  • Variance of GlobalGoods: σG2=(0.15)2=0.0225\sigma_G^2 = (0.15)^2 = 0.0225
  • Covariance: Cov(T,G)=ρTGσTσG=0.3×0.20×0.15=0.009\text{Cov}(T,G) = \rho_{TG} \sigma_T \sigma_G = 0.3 \times 0.20 \times 0.15 = 0.009

So, our covariance matrix Σ\mathbf{\Sigma} is: Σ=[0.040.0090.0090.0225]\mathbf{\Sigma} = \begin{bmatrix} 0.04 & 0.009 \\ 0.009 & 0.0225 \end{bmatrix}

Now, let's solve for the optimal weights w\mathbf{w} using our formula: w=1λΣ1R\mathbf{w} = \frac{1}{\lambda} \mathbf{\Sigma}^{-1} \mathbf{R}. We'll assume a moderate risk-aversion parameter of λ=2\lambda = 2.

  1. Find the inverse of the covariance matrix (Σ1\mathbf{\Sigma}^{-1}): For a 2x2 matrix, this is a standard procedure. The result is: Σ1[27.4710.9910.9948.84]\mathbf{\Sigma}^{-1} \approx \begin{bmatrix} 27.47 & -10.99 \\ -10.99 & 48.84 \end{bmatrix}

  2. Multiply by the returns vector (Σ1R\mathbf{\Sigma}^{-1} \mathbf{R}): [27.4710.9910.9948.84][0.100.06]=[(27.47×0.10)+(10.99×0.06)(10.99×0.10)+(48.84×0.06)]=[2.0881.831]\begin{bmatrix} 27.47 & -10.99 \\ -10.99 & 48.84 \end{bmatrix} \begin{bmatrix} 0.10 \\ 0.06 \end{bmatrix} = \begin{bmatrix} (27.47 \times 0.10) + (-10.99 \times 0.06) \\ (-10.99 \times 0.10) + (48.84 \times 0.06) \end{bmatrix} = \begin{bmatrix} 2.088 \\ 1.831 \end{bmatrix}

  3. Scale by risk aversion (1/λ1/\lambda): With λ=2\lambda = 2, our scaling factor is 1/2=0.51/2 = 0.5. w=0.5×[2.0881.831]=[1.0440.915]\mathbf{w} = 0.5 \times \begin{bmatrix} 2.088 \\ 1.831 \end{bmatrix} = \begin{bmatrix} 1.044 \\ 0.915 \end{bmatrix}

So our solution is to allocate 104.4% of our capital to TechCorp and 91.5% to GlobalGoods.

Wait, 104.4% + 91.5% = 195.9%? How is that possible?

Welcome to the concept of leverage. The unconstrained solution we just calculated doesn't assume our weights must sum to 100%. A total allocation over 100% implies borrowing money to invest more than you have. In this case, you would borrow 95.9% of your capital to achieve this high-return (and high-risk!) portfolio. This is a strategy an investor with very low risk aversion might take.

What if you can't borrow?

Most investors operate under the constraint that their weights must sum to 1 (wT1=1\mathbf{w}^T \mathbf{1} = 1). While the formal solution involves more complex math (Lagrange multipliers), the unconstrained solution we found still gives us something incredibly useful.

If we normalize our unconstrained weights so they sum to 1, we get the portfolio with the best possible risk-return trade-off.

  • Total Weight = 1.044+0.915=1.9591.044 + 0.915 = 1.959
  • Normalized weight for TechCorp: wT=1.044/1.95953.3%w_T = 1.044 / 1.959 \approx 53.3\%
  • Normalized weight for GlobalGoods: wG=0.915/1.95946.7%w_G = 0.915 / 1.959 \approx 46.7\%

This particular portfolio (53.3% TechCorp, 46.7% GlobalGoods) is special. It's the portfolio that maximizes our return for the risk we are taking.

And that’s pretty much the core of it! We've walked through the math, starting with a single asset and building up to a neat way of handling a whole portfolio with matrices. By solving for the weights and then normalizing them, we landed on a specific mix: about 53% in TechCorp and 47% in GlobalGoods.

It's interesting that no matter what our personal risk aversion λ was, the ratio between the assets stayed the same. Normalizing them just gives us a tangible portfolio to look at.

But this definitely opens up more questions. What's so special about this specific 53/47 mix? And what about the constraint that most of us have: that our investments must add up to 100%? We kind of sidestepped that with our normalization trick.

These are deeper topics in portfolio theory, and we'll definitely dig into them in a future post. For now, hopefully, this gives a solid feel for the fundamental math involved. Thanks for reading

tags: [economics, investment, optimization]