Driving AI Success In Investment Management With Factor Timing

Executive Summary

Machine learning in investment management works very well for bottom-up fundamental analysis
Artificial intelligence can also make useful predictions when applied to factor timing
Factor timing aggregates the star rankings and factor scores the machine learning assigns to each equity in a user’s stock universe
Our AI factor timing led to average returns of +1.42% in momentum factors, +5.39% in volatility factors and +2.22% in size factors

Machine learning improves investment results

The rapid growth of processing power and big data is illustrated perfectly in machine learning (ML). Artificial intelligence (AI) breakthroughs are creating machines that are able to beat humans on reading tests, win Jeopardy, best a world chess master, and even teach themselves to win games of chess, Go and Shogi. The power of machine learning to digest huge amounts of data is also helping investment managers find performance in their portfolios.

In previous posts, we’ve already discussed the predictive power of machine learning in finance, from black swan events like Covid-19 and the quant shock to flagging risk like GME to the importance of relative valuation in machine learning models. There are a few basic tenets to best utilizing machine learning in institutional investment.

ML is very good at picking stocks in support of bottom-up fundamental analysis, where it analyzes hard data about a company’s balance sheet and financial health and compares it against other companies. It works even better when given more examples to learn from: the more at-bats the machine learning has, the higher the learning quality, which leads to better performance. This makes sense. In a small universe like, say, the Dow, there are only 30 companies for the AI to learn from, whereas the Russell 2000 offers roughly 2,000 companies from which the machine learning can glean what distinguishes stocks that perform well.

Driving AI success using factor timing

To give ML models the best chance of success in investment management, we want to give the machine as many opportunities as possible to find successful equities and win over the long run. One way to do that is to build a portfolio from machine learning generated picks. Another is to aggregate its AI scores by specified buckets.

How can we time factors using machine learning? One way is to aggregate the model’s predictions by factor. First, we score all the stocks in the universe with our star rankings (5 stars means the ML predicts a stock will do well, 1 or 0 stars means the ML predicts it will do poorly). We can also rank all the stocks in an investment universe by their factor scores. A factor score for momentum, as an example, would place the top momentum stocks into the top quantile and the bottom momentum stocks into the bottom quantile (we will also call these quantiles buckets). Momentum Q1 is the bucket of top-quantile momentum equities and Momentum Q5 is the bucket of bottom-quantile momentum equities.
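As a rough illustration of the bucketing step, here is a minimal Python sketch. The tickers, the momentum scores and the function name `assign_quantiles` are all hypothetical, and the equal-count split into five buckets is an assumption rather than a description of our production models.

```python
def assign_quantiles(factor_scores, n_buckets=5):
    """Map each ticker to a bucket 1..n_buckets, where 1 holds the highest scores.

    Assumes equal-count buckets; the real bucketing rule may differ.
    """
    ranked = sorted(factor_scores, key=factor_scores.get, reverse=True)
    n = len(ranked)
    return {
        ticker: position * n_buckets // n + 1
        for position, ticker in enumerate(ranked)
    }


# Hypothetical momentum scores for a ten-stock universe (illustrative only).
momentum_scores = {
    "AAA": 0.91, "BBB": 0.84, "CCC": 0.77, "DDD": 0.63, "EEE": 0.58,
    "FFF": 0.41, "GGG": 0.35, "HHH": 0.22, "III": 0.14, "JJJ": 0.05,
}

momentum_buckets = assign_quantiles(momentum_scores)
momentum_q1 = [t for t, b in momentum_buckets.items() if b == 1]
momentum_q5 = [t for t, b in momentum_buckets.items() if b == 5]
print("Momentum Q1:", momentum_q1)
print("Momentum Q5:", momentum_q5)
```

The same function could be reused for any other factor score, such as volatility or size, by passing in a different dictionary.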

We can organize the stocks in any universe across a multitude of different factors. We can then cross-reference each bucket (e.g. Momentum Q5) with the predictive machine-learned star rankings to see what the machine thinks of that bucket. If, say, the average predictive score for Momentum Q5 is 2 stars, but the average predictive score for Momentum Q1 is 4 stars, we can assume that the machine expects momentum to be a factor that performs well in the upcoming period.

Here is a streamlined example (with simplified data) where we assign TSLA, AAPL, and MSFT to the top momentum quantile (Momentum Q1) and GOOG, FB, and NFLX to the bottom quantile (Momentum Q5):

You can see in the above example that Momentum Q5 has an average star ranking of 4 while Momentum Q1 has an average star ranking of 2. If we assume the star rankings are accurate, then we can assume that Momentum Q5 will outperform Momentum Q1, or put another way – momentum will underperform.
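The averaging step in this example can be sketched in a few lines of Python. The individual star values below are hypothetical, chosen only so that the bucket averages match the 2 and 4 quoted above; they are not output from our model, and the helper name `average_stars_by_bucket` is our own for illustration.

```python
from statistics import mean

# Bucket membership from the example above.
bucket = {
    "TSLA": "Momentum Q1", "AAPL": "Momentum Q1", "MSFT": "Momentum Q1",
    "GOOG": "Momentum Q5", "FB": "Momentum Q5", "NFLX": "Momentum Q5",
}

# Hypothetical star rankings (illustrative only, not real model output).
stars = {"TSLA": 2, "AAPL": 3, "MSFT": 1, "GOOG": 4, "FB": 5, "NFLX": 3}


def average_stars_by_bucket(stars, bucket):
    """Average the predictive star ranking of the stocks in each bucket."""
    grouped = {}
    for ticker, label in bucket.items():
        grouped.setdefault(label, []).append(stars[ticker])
    return {label: mean(values) for label, values in grouped.items()}


avg = average_stars_by_bucket(stars, bucket)

# Signal is always Q1 minus Q5: positive means long the factor, negative means short.
signal = avg["Momentum Q1"] - avg["Momentum Q5"]
if signal > 0:
    call = "long momentum"
elif signal < 0:
    call = "short momentum"
else:
    call = "no view"

print(avg, signal, call)
```

With these illustrative inputs the signal is negative, which corresponds to the "momentum will underperform" reading.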

Real Results

Of course, that rests on a lot of assumptions. How does this application of machine learning work in the real world? We ran a simple experiment using one of our existing models (an S&P 500 model, tested from 2010 to present). The model generated these factor scores at the start of each month, and we measured the success of each prediction over the following 6 months. The below chart shows what actually happened during that time period for a handful of impactful factors.

The above chart shows that for Momentum, for example, Momentum Q1 outperformed Momentum Q5 68 times (51%) during the observation period, and the average return of Momentum Q1 minus Momentum Q5 was +0.37%. This is the baseline. If our factor timing is successful, we would hope to be both more accurate (i.e. predict Momentum outperformance correctly more often than 51% of the time) and get better performance (i.e. when we predict Momentum outperforms, the average return should be higher than +0.37%).

We always calculate average return as Q1 minus Q5. So you would want to see larger numbers when we predict Q1 is going to outperform Q5 (Q1 > Q5) and smaller numbers when we predict the opposite (Q1 < Q5). To help visualize this, the chart for Momentum is below. Since we are looking at forward return, whenever the blue bar (star rating) is above zero, we would want the red line (forward return) to be above zero as well.
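To make the comparison concrete, here is a small Python sketch of how accuracy and average return could be measured against the baseline. The observations are made up for illustration and the function name `summarize` is hypothetical; this is one plausible way to do the bookkeeping, not the exact code behind our results.

```python
from statistics import mean

# Each observation: (predicted star spread Q1 - Q5, realized forward return Q1 - Q5 in %).
# Hypothetical numbers, illustrative only.
observations = [
    (1.2, 2.0),
    (0.8, -1.0),
    (-0.5, -3.0),
    (-1.1, 1.5),
    (0.4, 4.0),
    (-0.9, -2.5),
]


def hit_rate(returns, want_positive):
    """Share of periods where the realized Q1 - Q5 return had the wanted sign."""
    hits = sum(1 for r in returns if (r > 0) == want_positive and r != 0)
    return hits / len(returns)


def summarize(observations):
    all_returns = [r for _, r in observations]
    long_returns = [r for s, r in observations if s > 0]   # predicted Q1 > Q5
    short_returns = [r for s, r in observations if s < 0]  # predicted Q1 < Q5
    return {
        "baseline_hit_rate": hit_rate(all_returns, want_positive=True),
        "baseline_avg_return": mean(all_returns),
        "long_hit_rate": hit_rate(long_returns, want_positive=True),
        "long_avg_return": mean(long_returns),
        "short_hit_rate": hit_rate(short_returns, want_positive=False),
        "short_avg_return": mean(short_returns),  # negative is good when short
    }


result = summarize(observations)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A successful timing signal would show a long hit rate above the baseline hit rate and a long average return above the baseline average return, with the short average return below it.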

Below we list out the prediction accuracy in table form. Interestingly, the machine does an excellent job of helping to predict the momentum, volatility, and size factors while doing no better than the baseline on value. In momentum, the percentage of correct predictions on long momentum was 57.8%, a full 6.8 percentage points higher than expected (51%), and the machine improved the odds of success on short momentum from 49% to 60.8%.

The volatility results are even more interesting. You can see that the machine rarely predicts long volatility, but when it does, it is very accurate (63% vs 51% expected) and the returns are strong (+5.39% vs +0.65% expected). Similarly, even though it frequently predicts short volatility, it is more accurate than expected (52.6% vs 49%) and delivers better returns (-0.54% vs +0.65%; remember that here we are “short”, so negative numbers are a good thing).
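For readers who want the accuracy figures quoted above side by side, the short Python snippet below computes the improvement in percentage points for each case. The numbers are taken from the text; only the dictionary layout and variable names are ours.

```python
# (factor, direction): (expected baseline accuracy %, accuracy with ML factor timing %)
accuracy = {
    ("momentum", "long"): (51.0, 57.8),
    ("momentum", "short"): (49.0, 60.8),
    ("volatility", "long"): (51.0, 63.0),
    ("volatility", "short"): (49.0, 52.6),
}

# Improvement over the baseline, in percentage points.
uplift = {
    key: round(timed - base, 1)
    for key, (base, timed) in accuracy.items()
}

for (factor, direction), points in uplift.items():
    print(f"{direction} {factor}: +{points} percentage points")
```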

Takeaways

Artificial intelligence and machine learning are powerful tools in an investment manager’s arsenal. Adding factor aggregation to their models can help managers understand which factors the AI thinks will perform well (or poorly) in the future. In our tests, factor timing accuracy improved substantially with machine learning factor aggregation for momentum, volatility and size. This led to average returns of +1.42% in momentum factors, +5.39% in volatility factors and +2.22% in size factors. Managers can use the insights generated by artificial intelligence to help position their portfolios and improve their factor timing.