Guess what? The model we built in our last post actually suffers from snooping. We did this deliberately to show how easy it is to get mixed up when translating forecasting models into trading signals. Let’s explain. Our momentum model uses a 12-week cumulative return lookback to forecast the next 12-week cumulative return. That may have produced a pretty good explanatory model compared to the others. But we need to be careful.
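To make the worry concrete, here's a minimal sketch of one common way a lookback/look-forward setup can leak information. It isn't necessarily the exact issue in our model. The idea is that a 12-week forward return measured from week t isn't known until week t + 12, so a model used to trade at a given week should only be fit on pairs whose forward window has already closed. The function names, the synthetic price series, and the decision week are all hypothetical and for illustration only.

```python
def cumulative_return(prices, start, end):
    """Simple cumulative return from prices[start] to prices[end]."""
    return prices[end] / prices[start] - 1.0


def build_pairs(prices, lookback=12, forward=12):
    """Return (t, lookback_return, forward_return) for every week t
    where both the lookback and the forward window fit in the data."""
    pairs = []
    for t in range(lookback, len(prices) - forward):
        x = cumulative_return(prices, t - lookback, t)
        y = cumulative_return(prices, t, t + forward)
        pairs.append((t, x, y))
    return pairs


def usable_at(pairs, decision_week, forward=12):
    """Keep only pairs whose forward return was fully observed
    by decision_week; anything later would be peeking ahead."""
    return [p for p in pairs if p[0] + forward <= decision_week]


# Synthetic weekly prices growing 1% a week (made-up data).
prices = [100.0 * (1.01 ** i) for i in range(60)]
pairs = build_pairs(prices)
train = usable_at(pairs, decision_week=40)

print(len(pairs), "pairs in total")
print(len(train), "pairs usable for a decision at week 40")
```

Fitting on all of `pairs` and then "trading" at week 40 would quietly use forward returns that hadn't happened yet.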
Unlucky 13! Or contrarian indicator? There’s really nothing so heartwarming as magical thinking. Whatever the case, on Day 12 we iterated through 320 different model and train-step combinations to settle on 10 potential candidates. Today, we look at the best-performing candidate and discuss the process to see if the forecasts produce a viable trading strategy.
As we noted before, we could have launched directly into testing Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility, but what would be the thesis as to why these indicators work better than others?
On Day 11, we presented an initial iteration of train/forecast steps to see if one combination performs better than another. Our metric of choice was root mean-squared error (RMSE), which is frequently used to compare model performance in machine learning circles. The advantage of RMSE is that it is in the same units as the forecast variable. The drawback is that it is tough to interpret on its own: a model with an RMSE of 10 is better than one with an RMSE of 20, but the number alone doesn’t say whether 10 is any good.
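For reference, RMSE is just the square root of the average squared forecast error. Here's a small sketch on made-up numbers. The comparison against a naive "always forecast zero" benchmark is our own addition for illustration, one simple way to give the number some context.

```python
import math


def rmse(actual, predicted):
    """Root mean-squared error, in the same units as the inputs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up forward returns and forecasts, for illustration only.
actual = [0.02, -0.01, 0.05, 0.00]
predicted = [0.01, 0.00, 0.03, 0.01]
naive = [0.0] * len(actual)  # hypothetical benchmark: always forecast zero

model_rmse = rmse(actual, predicted)
naive_rmse = rmse(actual, naive)
print(f"model RMSE: {model_rmse:.4f}, naive RMSE: {naive_rmse:.4f}")
```

Because the result is in return units, an RMSE of roughly 0.013 here means the forecasts miss by a bit over one percentage point in a root-mean-square sense.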
On Day 10, we analyzed the performance of the 12-by-12 model by examining the predicted values and residuals. Our initial takeaway suggested the model did not seem overly biased or misspecified in the -10% to 10% region. But when it gets outside that range, watch out! We suspected that there was some autocorrelation in the residuals, which we want to discuss today. There are different statistical tests for autocorrelation and normality that would take too long to explain in a blog post.
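Short of a formal test, a quick first look is the sample autocorrelation of the residuals at a given lag. The sketch below is a simple diagnostic, not one of the statistical tests mentioned above, and the two residual series are made up to show the extremes.

```python
def autocorrelation(residuals, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(residuals)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(residuals) - 1")
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    if denom == 0:
        raise ValueError("series has zero variance")
    num = sum(
        (residuals[i] - mean) * (residuals[i - lag] - mean)
        for i in range(lag, n)
    )
    return num / denom


# Made-up residuals: one series drifts steadily, the other flips sign.
trending = [1, 2, 3, 4, 5, 6]
alternating = [1, -1, 1, -1, 1, -1]

print(autocorrelation(trending))     # positive: errors tend to persist
print(autocorrelation(alternating))  # negative: errors tend to reverse
```

Values near zero suggest little linear dependence between one residual and the next; values well away from zero are a hint to dig further.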
On Day 9 we conducted a walk-forward analysis on the 12-by-12 week lookback/look-forward combination. We then presented the canonical actual vs. predicted value graph with a \(45^\circ\) line overlay to show what a perfect forecast would look like. Here’s the graph again.
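If walk-forward analysis is new to you, the gist is: fit on one window, forecast the window right after it, slide forward, repeat. Here's a minimal sketch of the index bookkeeping. The rolling (rather than expanding) window and the specific sizes are assumptions for illustration, not the settings used in the post.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_indices, test_indices) pairs for a rolling window.

    Each test window sits immediately after its training window,
    and the whole thing slides forward by test_size each step.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Toy example: 10 observations, train on 4, forecast the next 2.
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
for train_idx, test_idx in splits:
    print("train", train_idx, "-> test", test_idx)
```

Every test index comes strictly after every training index in its split, which is the whole point: the model never sees the period it is asked to forecast.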
As noted previously, we limited the scale of the axes to make it easier to interpret. This omits some outliers, which we’ll touch on below. The main body of the graph shows a nice scattering of the data around the line.
Yesterday we finished up our analysis of the regression models we built using different combinations of lookback and look forward momentum values. Today, we see if we can generate good forecasts using that data. If you’re wondering why we still haven’t tested Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility, the reason is that we’re first trying to establish some rigor – albeit modest – to our tests.
Yesterday, we discussed the effect sizes, their statistical significance (e.g., p-values), and some other summary statistics for the various momentum combinations – namely, 3, 6, 9, and 12 week lookback and look forward returns. We found that effect sizes were small, but a few were significant, and that in the case of the 12-by-12 combination about 75% of the results clustered in the -10% to 10% range for both directions – forward and back.
Welcome to the last day of the first week of 30 days of backtesting! We hope you’re enjoying the ride. If you have any questions or concerns, you can reach us at the contact details listed at the bottom of this post.
On Day 6 we defined momentum rather roughly and ran a bunch of tests to identify the linear relationship between different lookback and look forward periods. However, we didn’t go into detail about the results.
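As a refresher on what "linear relationship" means here, the sketch below fits an ordinary least squares line of look-forward return on lookback return. The data are made up and the helper name is hypothetical. This shows the general technique rather than the exact code behind the tests.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up lookback returns, with forward returns built from a known line.
lookback_returns = [-0.05, 0.00, 0.05, 0.10]
forward_returns = [0.01 + 0.2 * r for r in lookback_returns]

intercept, slope = ols_fit(lookback_returns, forward_returns)
print(f"intercept: {intercept:.4f}, slope: {slope:.4f}")
```

The slope is the effect we care about: how much the forward return changes, on average, for a one-unit change in the lookback return.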
Yesterday we examined the eponymous Fama-French factors to see if we could find something that will help us develop an investment strategy to backtest. It turned out the best performing factor was the market risk premium, which is essentially the return to the market in excess of the risk-free rate. In other words, the best factor is buy-and-hold! I guess that means we’ve finished 24 days early. Just buy the index.
The day has finally arrived! Time to start backtesting! We’ve always wanted to test how Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility would perform while implementing rolling stop-loss updates based on the ATR scaled by the 7-day minus 5-day implied volatility rank. Maybe we’re getting ahead of ourselves. Expeditions are fun and it’s always thrilling to explore uncharted territory. But it’s also easy to get lost and forget that we’re ultimately trying to generate superior, risk-adjusted returns.