Data Science

Day 22: Error Correction

On Day 21, we wrung our hands with frustration over how to proceed. The results of our circular block sampling suggested we shouldn’t expect much outperformance from our 12-by-12 model out-of-sample. Our choices seemed to be two: back to the drawing board to start over, or off to the waterboard to torture the data until it told us what we wanted. However, we found a third way, in which we could use the information we already had to make a few minor tweaks to improve the model.

Day 21: Drawing Board

On Day 20, we completed our analysis of the 12-by-12 strategy using circular block sampling on the 3 and 7 blocks. We found the strategy did not outperform buy-and-hold on average, and its frequency of outperformance was modest – in the 28-31% range – insufficient to warrant actually executing the strategy. What to do? Back to the drawing board to test a new strategy? Or off to the waterboard to torture the current one?

Day 20: Strategy Sample

On Day 19, we introduced circular block sampling and used it to test the likelihood that the 200-day SMA strategy would outperform buy-and-hold over a five-year period. We found that the 200-day outperformed buy-and-hold a little over 25% of the time across 1,000 simulations. The frequency with which the 200-day’s Sharpe Ratio exceeded buy-and-hold’s was about 30%. Today, we apply the same analysis to the 12-by-12 strategy. When we run the simulations, we find that the strategy has an average underperformance of 11 percentage points vs.

Day 19: Circular Sample

On Day 18, we started to discuss simulating returns to quantify the probability of a strategy’s success out-of-sample. The reason for this was that we were unsure how much credit to give the 12-by-12’s performance relative to the 200-day SMA. We discussed various simulation techniques, but settled on historical sampling that accounts for the autocorrelation of returns. We then showed the autocorrelation plots for the strategy and the three benchmarks – buy-and-hold, 60-40, and 200-day – with up to 10 lags.
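The autocorrelation at lag k is just the correlation of the return series with itself shifted k periods. A quick sketch of how those lag-1-through-10 values can be computed by hand, on a placeholder return series rather than the actual strategy data:

```python
import numpy as np

def acf(x, max_lag=10):
    """Sample autocorrelation for lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)  # lag-0 (variance) term used to normalize
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
rets = rng.normal(0.0, 0.02, size=500)  # placeholder return series
lags = acf(rets, max_lag=10)
```

Plotting `lags` as a bar chart reproduces the kind of autocorrelation plot shown in the post; significant bars at short lags are what motivate block (rather than single-observation) resampling.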

Day 18: Autocorrelation Again!

On Day 17, we compared the 12-by-12 and 200-day SMA strategies in terms of the magnitude and duration of drawdowns, finding in favor of the 200-day. We also noted that most of the difference in performance was due to two periods at the beginning and end of the span we were examining. That suggests regime-driven performance, which raises the question: how much can we rely on our analysis if we have no certainty that any of the regimes identified will recur, or even be forecastable?
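Drawdown magnitude is the percentage decline from the running peak of the equity curve, and duration is how long the curve stays below that peak. A small self-contained sketch of both metrics (our own helper, not code from the series):

```python
import numpy as np

def drawdown_stats(returns):
    """Max drawdown depth and longest underwater stretch from simple returns."""
    wealth = np.cumprod(1 + np.asarray(returns, dtype=float))
    peak = np.maximum.accumulate(wealth)     # running high-water mark
    dd = wealth / peak - 1.0                 # drawdown at each period
    max_depth = dd.min()
    # longest run of consecutive periods spent below the prior peak
    longest = run = 0
    for underwater in dd < 0:
        run = run + 1 if underwater else 0
        longest = max(longest, run)
    return max_depth, longest
```

For example, the return path `[0.1, -0.5, 0.1, 0.2, 1.0]` gives a maximum drawdown of -50% and a three-period underwater stretch.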

Day 17: Drawdowns

On Day 16, we showed the adjusted 12-by-12 strategy with full performance metrics against buy-and-hold, the 60-40 SPY-IEF ETF portfolio, and the 200-day SMA strategy. In all cases, it tended to perform better than the benchmarks. However, against the 200-day SMA that performance came primarily at the end of the period. This raises the question of what to make of the performance differences between the 12-by-12 and 200-day SMA strategies.

Day 16: Comps

On Day 15, we adjusted our model to use more recent data to forecast the 12-week forward return. As before, we used that forecast to generate a trading signal telling us to go long the SPY if the forecast is positive, and to exit (or go short, for the long-short strategy) otherwise. We saw this tweak generated about 10 percentage points of cumulative outperformance and a 20-percentage-point higher Sharpe Ratio.

Day 15: Backtest II

On Day 14, we showed how the trading model we built was snooping and provided one way to correct it. Essentially, we ensure that the time at which we actually have the target-variable data aligns with when the trading signals are produced. We then fed the value at the next time step into the model to generate a forecast. If the forecast was positive, we’d go long the SPY ETF; if negative, we’d stay out of the market or go short, depending on the strategy.
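The essence of the alignment fix is that a position held over period t can only depend on information available at the close of period t-1, which in pandas is a one-step shift of the signal. A minimal sketch on simulated weekly prices, using a trailing-return momentum rule as a stand-in for the fitted model:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# hypothetical weekly closes; real data would be SPY from a price feed
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0.001, 0.02, size=120)))

# stand-in forecast: trailing 12-week return proxies for the model output
forecast = prices.pct_change(12)
signal = (forecast > 0).astype(int)  # 1 = long, 0 = flat (NaN warm-up -> flat)

# The key fix: shift the signal one step so the position over period t
# uses only information available at the close of period t-1.
strategy_rets = signal.shift(1) * prices.pct_change()
```

Dropping `.shift(1)` would credit the strategy with returns earned while the signal was being computed, which is exactly the snooping being corrected.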

Day 14: Snooping

Guess what? The model we built in our last post actually suffers from snooping. We did this deliberately to show how easy it is to get mixed up when translating forecasting models into trading signals. Let’s explain. Our momentum model uses a 12-week cumulative return lookback to forecast the next 12-week cumulative return. That may have produced a pretty good explanatory model compared to the others. But we need to be careful.
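To see why care is needed, consider how the feature and target for this momentum model line up in time. A hedged sketch on simulated prices (the regression setup is our reconstruction, not the series’ exact code):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0.001, 0.02, size=260)))

past_12w = prices.pct_change(12)             # feature: trailing 12-week return
next_12w = prices.pct_change(12).shift(-12)  # target: forward 12-week return

pairs = pd.DataFrame({"x": past_12w, "y": next_12w}).dropna()
slope, intercept = np.polyfit(pairs["x"], pairs["y"], deg=1)
# Snooping trap: y at time t is not observable until t+12. A model fit
# on data through t has implicitly peeked 12 weeks ahead unless the
# fitting window stops at t-12.
```

The fit itself is harmless as an explanatory exercise; the trap springs when forecasts from it are turned into trade signals dated before the target data would actually exist.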

Day 13: Backtest I

Unlucky 13! Or contrarian indicator? There’s really nothing so heartwarming as magical thinking. Whatever the case, on Day 12 we iterated through 320 different combinations of models and training steps to settle on 10 potential candidates. Today, we look at the best-performing candidate and walk through the process of seeing whether its forecasts produce a viable trading strategy. As we noted before, we could have launched directly into testing Fibonacci retracements with Bollinger Band breakouts filtered by Chaikin Volatility, but what would be the thesis for why these indicators work better than others?