Some of the independent variables are highly correlated, and this will cause the tuned parameters to differ from their expected values, even to the extent of sign reversal. So the tuned parameters may be misleading outside the model, but the descriptive model as a whole remains valid.
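A quick numerical sketch of that point (a toy regression, not your model): when two predictors are nearly identical copies of each other, ordinary least squares can only pin down their *sum*, so the individual coefficients wander far from the "true" values while the fit itself stays excellent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# x2 is almost a copy of x1, so the two predictors are highly correlated
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
y = 2.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=n)  # "true" weights: 2 and 2

X = np.column_stack([x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Individual coefficients drift far from (2, 2) -- one can even flip sign --
# but their sum (the direction the data actually constrains) stays near 4,
# and the fitted values still track y closely.
pred = X @ coef
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(coef, coef.sum(), r2)
```

The sum of the coefficients and the R² are stable across reruns; the individual coefficients are not, which is exactly why they mislead outside the model.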
I suspect you already know this, but it is worth re-emphasizing: extensive tuning to fit historical data often leads to worse results going forward than less-tuned variants. I find what Bill Eckhardt has to say on the topic (albeit with respect to futures trading) useful:
from http://www.futuresmag.com/2011/03/01/williameckhardtthema...
FM: Talk about the battle between optimization and curve fitting.
BE: By trying to improve your system you can make it worse. You can overfit to past data or maybe just do something that is statistically invalid. There is an idea, though it is not universally subscribed to, that you should not optimize your systems. That you should just figure out what are reasonable numbers and go with that. I don’t believe in that; we optimize all the time, but there is some truth to it in the sense that if you overfit, you are going to hurt yourself. Optimizing is a somewhat hazardous procedure, as is trading. And it has to be done with carefulness and deliberateness, and you have to make sure that you are not overfitting to past data.
FM: How do you ward off curve fitting?
BE: What most people use to ward it off is the in-sample/out-of-sample technique where they keep half their data for optimization and half their data for testing. That is an industry standard. We don’t do that; it wastes half of the data. We have our own proprietary techniques for overfitting that we actually just improved on a year ago. It is important to test for overfitting; if you don’t have your own test use the in-sample/out-of-sample [technique].
I can talk a little more about overfitting, if not my personal proprietary techniques. First of all I like the [term] overfitting rather than curve fitting because curve fitting is a term from nonlinear regression analysis. It is where you have a lot of data and you are fitting the data points to some curve. Well, you are not doing that with futures. Technically there is no curve fitting here; the term does not apply. But what you can do is you can overfit. The reason I like the term overfit rather than curve fit is that overfit shows that you also can underfit. The people who do not optimize are underfitting.
Now the two numbers that most determine if you are overfitting are the number of degrees of freedom in the system. Every time you need a number to define the system, like a certain number of days back, a certain distance in price, a certain threshold, anything like that is a degree of freedom. The more degrees of freedom that you have the more likely that you are to overfit. Now the other side of it is the number of trades you have. The more trades you have, the less you tend to overfit, so you can afford slightly more degrees of freedom. We don’t allow more than 12 degrees of freedom in any system. If you put more bells and whistles on your system it is easy to get 40 degrees of freedom but we hold it to 12. On the other side of that, for us to make a trade we have to have a sample of at least 1,800; we won’t make a trade unless we have 1,800 examples. That is our absolute minimum. Typically we would have 15,000 trades of a certain kind before we would make an inference as to whether we want to do it.
The reason you need so many is the heavy tail phenomena. It is not only that heavy tails cause extreme events, which can mess up your life, the real problem with the heavy tails is that they can weaken your ability to make proper inferences. Normal distribution people say that large samples kick in around 35. In other words, if you have a normal distribution and you are trying to estimate a mean, if you have more than 35 you’ve got a good estimate. [In] contrast, with the kind of distributions we have with futures trading you can have hundreds of samples and they could still be inadequate; that is why we go for 1,800 as a minimum. That is strictly a function of the fatness of tails of the distribution. You have to use robust statistical techniques and these robust statistical techniques are blunt instruments. [They] are data hogs, so both seem to be disadvantages but they have the advantages of tending to be correct.
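The in-sample/out-of-sample technique Eckhardt calls the industry standard can be sketched in a few lines. This uses a made-up one-parameter toy rule (go long when the trailing mean return is positive) purely to illustrate the split; it is not Eckhardt's system, and the data here is random.

```python
import numpy as np

rng = np.random.default_rng(2)

# Fake daily returns; the "system" has one degree of freedom: `lookback`.
returns = rng.normal(0.0002, 0.01, size=4000)

def system_pnl(returns, lookback):
    # Rolling mean of the previous `lookback` returns, no look-ahead:
    # the position for day i+lookback is decided from days i..i+lookback-1.
    means = np.convolve(returns, np.ones(lookback) / lookback, mode="valid")
    signal = (means[:-1] > 0).astype(float)
    return float(np.sum(signal * returns[lookback:]))

# Optimize the parameter on the first half of the data only...
half = len(returns) // 2
insample, outsample = returns[:half], returns[half:]
best = max(range(2, 200), key=lambda lb: system_pnl(insample, lb))

# ...then judge it on data the optimizer never saw. A large drop from the
# in-sample figure to the out-of-sample figure is the overfitting warning.
print(best, system_pnl(insample, best), system_pnl(outsample, best))
```

Since the returns here are (nearly) pure noise, the optimized in-sample result typically looks far better than the out-of-sample one, which is the whole point of holding data back.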
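The heavy-tail point about sample sizes is easy to demonstrate. Below, the sample mean of normal data is already a tight estimate at n = 35, while the sample mean of heavy-tailed data is still swinging wildly at n = 500. The specific distribution (numpy's `pareto`, i.e. Lomax, with shape 1.3, so the mean exists but the variance is infinite) is my choice for illustration, not anything from the interview.

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 2000

# Normal data: by n = 35 the sample mean is a tight estimate of the true mean.
normal_means = rng.normal(size=(trials, 35)).mean(axis=1)

# Heavy-tailed data: Lomax with shape 1.3 has mean 1/0.3 ~= 3.33 but infinite
# variance. Even with 500 samples per trial, the sample means are spread out
# far more widely than the normal-data means were at n = 35.
heavy_means = rng.pareto(1.3, size=(trials, 500)).mean(axis=1)

def spread(x):
    # Robust 95% spread of the estimates across trials.
    return float(np.percentile(x, 97.5) - np.percentile(x, 2.5))

print(spread(normal_means), spread(heavy_means))
```

This is the sense in which hundreds of samples "could still be inadequate": the usual large-sample intuition is calibrated on thin-tailed distributions.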