OPTIMUM PREDICTIVE FILTERS

by

John Ehlers

Source: MESA Software Technical Papers

 

 

INTRODUCTION

Technical Analysis is necessarily reactive to the action of the market. The indicators we develop are largely generated to sense the expected price direction. The predictive nature of these indicators is based on correlation to our past experience, so the expectation is that if something happened before it will happen again. However, none of the indicators are truly predictive in the scientific sense.

In this article we describe what a predictive filter is, how to generate this filter, and (most importantly) the conditions under which the filter can be most effectively used. Like all technical indicators, the optimum predictive filter cannot be used universally in all market conditions. However, carefully observing those conditions where it is appropriate can make it a valuable weapon in your technical analysis arsenal.

 

WHAT IT IS

An optimum predictive filter is just the difference between the original function and its exponential moving average[1]. That's it! It really is that simple! While the implementation is simple, the derivation is considerably more complex. In general, the response of an optimum system is described by the solution of the Wiener-Hopf equation.

Having defined an optimum predictive filter we must quickly specify the conditions that are required for that filter to be valid. The two conditions are that the amplitude swings of the original function must be limited and the probability of the function passing through its zero value must satisfy a Poisson probability distribution. It turns out that these conditions are easy to satisfy.

Without getting into the math, a Poisson probability distribution means that if there is an average number of zero crossings, then the number of crossings we actually expect is not too far removed from that average. An approximation to the Poisson probability distribution can be achieved using market data if the prices have been "properly" detrended. It is absolutely crucial that "proper" detrending be accomplished because the buy/sell signals are obtained by the crossing of the detrended price and the predictive filter lines. If the price has not been properly detrended to meet the Poisson probability constraint, then the lines will not cross correctly.

It is also easy to satisfy the amplitude swing requirement using conventional technical indicators such as RSI or Stochastic. Welles Wilder defined the RSI as

$$RSI = 100 - \frac{100}{1 + RS}, \qquad RS = \frac{CU}{CD}$$

CU, or Closes Up, means the sum of the day-to-day differential closing prices over an observation period, counting only the days on which the close rose. If the closing price differential is down for a given day, then its contribution to CU is zero. Similarly, CD, or Closes Down, means the sum of the day-to-day differential closing prices over the observation period. Only declining prices are considered, and these are summed as positive numbers. Using a little algebra and neglecting the scale factor of 100, the RSI is just the ratio of the value of closes up over the observation period to the sum of all closing price differentials:

$$RSI = \frac{CU}{CU + CD}$$

Using this formulation it is easy to see that the RSI has a maximum value of one when there are no closes down and a minimum value of zero when there are no closes up. Therefore, the RSI satisfies the condition of having limited amplitude swings from minimum to maximum. The RSI can be centered on zero (when "properly detrended") by subtracting 0.5 from the computed RSI.
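As a minimal sketch of the ratio form described above, the following Python computes CU / (CU + CD) from a list of closing prices and then centers it on zero. The function names and the choice to return a neutral value for a perfectly flat window are assumptions made for illustration; they are not part of Wilder's definition.

```python
def rsi_ratio(closes, period):
    """RSI on a 0-to-1 scale: CU / (CU + CD) over the last `period` day-to-day differences."""
    if period < 1 or len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    diffs = [window[k + 1] - window[k] for k in range(period)]
    closes_up = sum(d for d in diffs if d > 0)      # CU: rising days only
    closes_down = sum(-d for d in diffs if d < 0)   # CD: falling days, summed as positive numbers
    total = closes_up + closes_down
    if total == 0:
        return 0.5  # assumption: treat a perfectly flat window as neutral
    return closes_up / total


def centered_rsi(closes, period):
    """Shift the 0-to-1 RSI so that its median value is zero."""
    return rsi_ratio(closes, period) - 0.5
```

With no closes down the ratio is one, with no closes up it is zero, and a window that rises and falls by equal amounts gives 0.5, so the centered value always stays between -0.5 and +0.5.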

The Stochastic is also amplitude limited between zero and one. It has a unity value when the current closing price is equal to the highest high over the observation period and it has a zero value when the current closing price is equal to the lowest low over the observation period.
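The Stochastic described above can be sketched the same way. This is an unsmoothed version on a 0-to-1 scale; the function name, the argument layout, and the neutral value returned when the high-low range is zero are illustrative assumptions.

```python
def stochastic(highs, lows, closes, period):
    """Position of the latest close within the high-low range of the last `period` bars."""
    if period < 1 or min(len(highs), len(lows), len(closes)) < period:
        raise ValueError("need at least `period` bars of highs, lows and closes")
    highest_high = max(highs[-period:])
    lowest_low = min(lows[-period:])
    if highest_high == lowest_low:
        return 0.5  # assumption: no range at all is treated as neutral
    return (closes[-1] - lowest_low) / (highest_high - lowest_low)
```

The result is one when the latest close equals the highest high of the window and zero when it equals the lowest low. As with the RSI, subtracting 0.5 centers it on zero.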

"Proper" detrending of the RSI or Stochastic is accomplished by altering their observation period. Proper detrending might be best understood by examining the extremes of improper detrending. If we used a one year observation period of daily data to create the RSI, the RSI would stay very near 0.5 because the sum of the closes up would statistically be near half the total of all differential closes. On the other hand, the RSI would erratically bounce from 0 to 1 if we had a one day observation period. As we increase the observation period, the indicator takes on more the characteristic of a rectangular wave between the 0 and 1 limits when we plot it. Increasing the observation period still further, the RSI ideally assumes the shape of a sinewave where the peak of the wave barely touches the minimum and maximum values. When this condition is reached, the RSI has been properly detrended. If you look at the market from a cycles perspective, the proper detrending occurs when the observation period is between a half cycle and a full cycle length. The shorter length is more appropriate if the resulting RSI is near a maximum or minimum value. If you prefer, you get to the same proper detrending if you sequentially shorten the observation period, starting from a long period. In this case, the RSI swings increase until the extremes just touch the 0 and 1 limits.

A Stochastic is more likely to be properly detrended when the observation period is approximately one full cycle period. (A full cycle length is the period between successive maxima or minima of the resulting Stochastic.) It is often difficult to properly detrend the Stochastic because it persistently sticks near a maximum or minimum value. This is a clue that the price is in a trend mode, and the Poisson probability distribution constraint cannot be met. In such cases, the only thing to do is to switch to a trend-following technical approach.

 

WHAT IT IS NOT

An optimum predictive filter is not a component of a trend following system. Since we are working with detrending indicators, the intended use of the optimum predictive filter is to anticipate short term market turning points. If the conditions of use cannot be met you should not try to force this indicator. It will just end up costing you money.

 

PREDICTION LIMITATIONS

An exponential moving average (EMA) produces two functional results. First, the averaged output is delayed relative to the original function. Secondly, the output amplitude is reduced by the smoothing action of the average. The relationship between delay and the EMA constant is described in the sidebar for a pure trend mode condition. Delay for the sinewave-like detrended function is more complicated because of nonlinearities.

We can get some insight into how the optimum predictor works if we momentarily ignore the reduced amplitude of the EMA. If we describe the angle generating a cosine wave as $\phi$ and the phase angle lag of the EMA as the angle $\theta$, then the simplified equation for the optimum predictive filter is:

$$\cos(\phi) - \cos(\phi - \theta) = 2 \sin(\theta/2) \, \cos(\phi + 90^\circ - \theta/2)$$

This equation tells us that the predictor leads the original wave and that the lead shrinks as the EMA lag angle grows. In this simplified form the lead is 90 degrees minus half the lag angle. Once the amplitude reduction of the EMA is included, as in Table 1, the lead is very nearly the complement of the EMA lag angle. We therefore have some control over the amount of prediction we can expect. Further, a very short EMA lag produces near the maximum amount of prediction lead. However, the short EMA lag is not useful because the amplitude of the predictor is small due to the $\sin(\theta/2)$ term.
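The trade-off between lead and amplitude in this simplified picture can be checked numerically. The sketch below assumes the simplification stated above (the EMA is a pure delay of the cosine, with no amplitude loss) and uses the standard sum-to-product identity. The function name and the returned tuple are illustrative only.

```python
import math


def simplified_predictor(phi_deg, theta_deg):
    """Compare cos(phi) - cos(phi - theta) with its amplitude-and-phase form."""
    phi = math.radians(phi_deg)
    theta = math.radians(theta_deg)
    direct = math.cos(phi) - math.cos(phi - theta)
    amplitude = 2.0 * math.sin(theta / 2.0)   # small when the EMA lag is short
    lead_deg = 90.0 - theta_deg / 2.0         # phase lead of the simplified predictor
    closed_form = amplitude * math.cos(phi + math.radians(lead_deg))
    return direct, closed_form, amplitude, lead_deg


for theta_deg in (5, 20, 45):
    _, _, amplitude, lead_deg = simplified_predictor(0.0, theta_deg)
    print(theta_deg, round(amplitude, 3), lead_deg)
```

A 5 degree lag gives a lead close to 90 degrees but an amplitude below 0.1, which is the sense in which a very short EMA lag predicts the most yet is the least usable.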

A number of nonlinearities enter the real-world picture. For example, the EMA lag can never exceed a quarter cycle. The only practical way to assess the performance of the optimum predictive filter is by tabulating the results, as we have done in Table 1. The entry point for Table 1 is the fraction of a full cycle period you expect to induce as lag through the use of the EMA. Knowing the length of the full cycle, you can easily calculate the EMA constant from the final equation in the sidebar. Table 1 shows that the lag angle is very nearly the same as the induced lag when the angle is small, but the lag angle never gets to 90 degrees (a quarter cycle). The table also shows that as the induced lag is increased the amplitude of the predictor rises and the amplitude of the EMA decreases. An oversimplified but easy-to-remember rule is that the best induced lag is one-eighth of a cycle (45 degrees), resulting in both the EMA and predictor having equal amplitudes of about 0.7 times the amplitude of the RSI (or Stochastic).

Table 1

EMA and Predictor Responses

Delay                     EMA                        Predictor
(fraction of a cycle)     lag angle    amplitude     lead angle    amplitude
                          (degrees)                  (degrees)

.05                       17           .96           72            .28
.10                       32           .87           57            .52
.15                       43           .77           46            .67
.20                       52           .70           37            .77
.25                       57           .63           32            .83
.30                       61           .57           26            .87
.35                       64           .53           23            .89
.40                       49           .50           19            .91

 

An even easier rule to remember is to use an EMA constant of 0.25. This corresponds to an induced lag of 3 days. You can then expect reasonable performance from the optimum predictive filter for cycle periods ranging from 12 to 24 days. Over these cycle periods, the 3 day induced lag ranges from a quarter cycle down to 1/8th of a cycle.
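The arithmetic behind this rule of thumb is short enough to sketch. The helper below is hypothetical; it uses the sidebar relationship (EMA constant equals the reciprocal of one plus the lag) to express the induced lag as a fraction of a cycle.

```python
def induced_lag_fraction(alpha, cycle_period):
    """Induced EMA lag, in days, as a fraction of the cycle period (also in days)."""
    lag_days = 1.0 / alpha - 1.0   # sidebar relationship solved for the lag
    return lag_days / cycle_period


for cycle_period in (12, 16, 24):
    print(cycle_period, induced_lag_fraction(0.25, cycle_period))
```

For an EMA constant of 0.25 this gives a quarter cycle at 12 days and one-eighth of a cycle at 24 days, which are the two ends of the range quoted above.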

 

STEP BY STEP PROCEDURE

The following procedure assumes the use of an RSI as the starting point indicator. You can equally well use a Stochastic or other amplitude-limiting indicator.

 

1. Optimally detrend the indicator by gradually decreasing the observation period so that the peak values almost reach the minimum and maximum indicator limits. The resulting waveform should look sinewave-like, having relatively consistent crossings of the median value. If you cannot get a proper looking waveform, it's probably best to abandon the predictor at this point.

 

2. Subtract 0.5 from the indicator so that the median value is zero. (Subtract 50 if you use a range described in terms of percent.) For simplicity, we will call this zero-centered indicator the RSI.

 

3. Take an EMA of the RSI. The most commonly used value of EMA constant is 0.25.

 

4. Subtract the EMA of step 3 from the RSI. This is the optimum predictive line. Plot the optimum predictor as an overlay to the RSI.

 

5. The buy and sell signals are generated when the optimum predictor crosses the RSI.

 

If the signals are too noisy, you may wish to smooth the RSI in step 2 with a moving average before you take the lagging EMA in step 3. Other smoothing techniques can also be used.
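Steps 3 through 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a series that has already been detrended and centered on zero (steps 1 and 2), the EMA is seeded with the first value of the series, and the function names are hypothetical. The text does not say which crossing direction is a buy and which is a sell, so the sketch only reports the direction of each crossing.

```python
import math


def ema(values, alpha=0.25):
    """Exponential moving average, seeded with the first value (an assumption)."""
    smoothed = []
    previous = values[0]
    for value in values:
        previous = alpha * value + (1.0 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def predictor_crossings(indicator, alpha=0.25):
    """Return the predictor line and the bars where it crosses the indicator."""
    smoothed = ema(indicator, alpha)                             # step 3
    predictor = [x - s for x, s in zip(indicator, smoothed)]     # step 4
    crossings = []
    for i in range(1, len(indicator)):                           # step 5
        before = predictor[i - 1] - indicator[i - 1]
        after = predictor[i] - indicator[i]
        if before <= 0 < after:
            crossings.append((i, "predictor crosses above indicator"))
        elif before >= 0 > after:
            crossings.append((i, "predictor crosses below indicator"))
    return predictor, crossings


# Synthetic zero-centered indicator: a 24 bar sinewave swinging between -0.5 and +0.5.
wave = [0.5 * math.sin(2.0 * math.pi * i / 24.0) for i in range(96)]
predictor, crossings = predictor_crossings(wave)
for bar, kind in crossings:
    print(bar, kind)
```

On this synthetic wave the crossings settle into a regular spacing of half a cycle once the startup transient of the EMA has died out. On market data the spacing is only as regular as the detrending allows.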

 

EXAMPLES

 

Figure 1

Optimized Predictive Filter for a 24 bar Sinewave

My approach is to always test my theories on theoretical waveforms before trying to use them in real trading situations. Using the theoretical waveforms allows the testing to be done under controlled conditions. This, in turn, lets me examine the limit of usefulness of the technique. Figure 1 is a theoretical 24 day sinewave. The RSI is plotted below the barchart. The optimum detrending occurs when the observation period of the RSI is a half-cycle, or 12 days. The optimum predictive filter is calculated using 1/8th of a cycle induced lag, or 3 days. That is, the EMA constant is 0.25. Figure 1 shows the reduced amplitude of the predictive filter line and how it leads the action of the RSI itself. Most importantly, the buy/sell indications are provided in time to take advantage of the full cycle swing as a trading position.
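The same kind of controlled test can be done analytically. The sketch below evaluates the steady-state response of a discrete EMA, and of the predictor formed by subtracting it from its input, to a pure sinewave. It is an independent calculation under stated assumptions, not the method used to produce Figure 1 or Table 1: the filter is applied directly to a unit sinewave rather than to an RSI, and the function names are hypothetical.

```python
import cmath
import math


def ema_gain(alpha, period):
    """Complex gain of y[i] = alpha*x[i] + (1 - alpha)*y[i-1] for a sinewave of `period` bars."""
    omega = 2.0 * math.pi / period
    return alpha / (1.0 - (1.0 - alpha) * cmath.exp(-1j * omega))


def describe(alpha, period):
    """Lag and amplitude of the EMA, lead and amplitude of (input - EMA)."""
    ema = ema_gain(alpha, period)
    predictor = 1.0 - ema
    return {
        "ema_lag_deg": -math.degrees(cmath.phase(ema)),
        "ema_amplitude": abs(ema),
        "predictor_lead_deg": math.degrees(cmath.phase(predictor)),
        "predictor_amplitude": abs(predictor),
    }


result = describe(0.25, 24)
for name, value in result.items():
    print(name, round(value, 2))
```

For a 24 bar cycle and an EMA constant of 0.25, this calculation gives an EMA that lags by roughly 35 degrees and a predictor that leads by roughly 47 degrees with a reduced amplitude. These figures come from the discrete filter alone, so they need not match the rounded entries of Table 1.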

Knowing that the optimum predictive filter works in controlled conditions, we can turn our attention to real-world situations. In Figure 2 we have optimally detrended the April 1995 Gold Futures contract using an 8 day period of observation for the RSI because the contract had a 16 day cycle over most of the screen. We retain the 3 day induced lag for the EMA. The resulting buy/sell signals, indicated by the up and down arrows where the optimum predictor crosses the RSI, are outstanding where the contract was properly detrended! Detrending was not proper in the left third of the chart, and bad signals resulted. At the risk of being redundant, the indicator works because all the conditions have been met. The RSI has been "properly" detrended. As a result, the crossings of the median values have an approximate Poisson probability distribution, i.e. these crossings occur regularly. We are rewarded in our search with these outstanding trading signals.

 

Figure 2

Optimized Predictive Filter for April 95 Gold

SKINNING THE CAT

The response of an optimum system is described by a solution of the Wiener-Hopf equation. There is more than one solution to the equation. The various solutions are determined by the characteristics of the waveforms being filtered. In the case of the optimum predictive filter we have just described, the signals were required to have limited amplitudes and to have a Poisson probability distribution of their median crossings.

Other optimum predictive filters are possible. For example, an optimum predictive filter was described as part of the BandPass indicator[2]. This kind of predictive filter is described as a "pure predictor" because it does not consider the impact of noise. Since the pure predictor is not constrained by noise considerations, the amplitude of the originating function need not be limited nor are there any probability restrictions on its use. The pure predictor is generated by carefully scaling the momentum of a smooth function. If the function is noisy the pure predictor is so erratic that it is almost useless. However, the pure predictor is appropriate for use with the BandPass indicator because the higher order filters used in the calculation provide a high degree of smoothing. The BandPass indicator can be found in my MESA for Windows and 3D for Windows programs, as well as on several bulletin boards around the country.

 

CONCLUSIONS

An optimized predictive filter can be generated as a minor extension of conventional indicators such as the RSI or Stochastic. The predictive filter can be programmed into most toolbox programs, improving the functionality of the indicator. While the optimized predictive filter can be valuable, the conditions for its use must be carefully observed. These conditions are that the RSI or Stochastic must be properly detrended and the resulting crossings of the median value must be relatively consistent. The typical prediction is approximately 1/8th of a cycle. For a cycle as short as 8 days, this is a one day advance warning, just enough to make an entry at the proper time. When the cycles are longer, you may want to wait a day or two before making an entry because the prediction is a little early. In any event, the optimum predictive filter is a major tool to overcome the reactive nature, i.e. the lag, of technical indicators.


 


SIDEBAR

The relationship between averaging lag and the EMA constant

 

The equation to compute an Exponential Moving Average (EMA) is:

$$EMA_i = \alpha \, P_i + (1 - \alpha) \, EMA_{i-1}$$

where $P_i$ is the value of the function being averaged on the "ith" day and $\alpha$ is the EMA constant.

Picture a function that increases by 1 for each new day. Then, on the generalized "ith" day the function will have a value of "i". Assume the EMA lag is K. Then the EMA will have a value of (i - K) on the "ith" day and will have a value of (i - K - 1) on the previous day. Inserting these values in the EMA equation, we have:

$$i - K = \alpha \, i + (1 - \alpha)(i - K - 1)$$

$$0 = \alpha (K + 1) - 1$$

$$\alpha = \frac{1}{K + 1}$$

Thus, the EMA constant is computed as the reciprocal of one plus the expected delay. This method of determining the EMA constant is far more functional than relating the EMA constant to a simple moving average period.
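The ramp argument above is easy to confirm numerically. In this sketch the EMA is run over a function that increases by 1 each day, and the distance by which it trails the ramp is measured once the startup transient has decayed. The function names and the zero starting value are illustrative assumptions.

```python
def ema_constant(lag):
    """EMA constant as the reciprocal of one plus the expected delay."""
    return 1.0 / (lag + 1.0)


def steady_state_lag(alpha, n_bars=500):
    """Run an EMA over the ramp 0, 1, 2, ... and return how far it trails the last value."""
    ema = 0.0
    for i in range(n_bars):
        ema = alpha * i + (1.0 - alpha) * ema
    return (n_bars - 1) - ema


alpha = ema_constant(3)
print(alpha, steady_state_lag(alpha))
```

A 3 day delay gives an EMA constant of 0.25, and the simulated EMA settles 3 days behind the ramp, which is the pure trend mode case the derivation describes.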

 

 




[1] Y.W. Lee, "Statistical Theory of Communication", John Wiley & Sons, p. 417

[2] John Ehlers, "The BandPass Indicator", Stocks & Commodities, Sep 1994, p. 51