No. of Recommendations: 38

We've debated this point in the past and not really gotten to a strong consensus. There are a number of possible ways to fit a line to a set of data points. The most familiar, and the one I'm currently using for the charts, is the least squares algorithm. The "Average CAGR" line is the line that produces the lowest sum of squared residuals over all the data points (the residuals being the differences between the actual stock price and the Average CAGR line on each date).

There is one major problem with this algorithm: when the stock price goes much too high or much too low, the algorithm weights those crazy prices extra heavily, by the square of the residual. The result is that the line is pulled much more strongly toward the crazy prices than toward normal prices that are closer to the Average CAGR. We see this a lot with stocks that had crazy high prices during the 2000-era bubble; the Average CAGR line is pulled up toward the bubble. You can see what is going on if you look at the Log Chart for any stock. A stock price twice as far from the Average CAGR line as another price counts 4 times as heavily in the sum being minimized (because 2 squared = 4), in effect pulling the line toward it disproportionately.
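The pull described above is easy to demonstrate numerically. Here is a minimal sketch (my own illustration, not the actual chart code) fitting a least-squares line to synthetic log prices with and without a bubble period; the steady 0.01-per-month trend and the bubble size are made-up numbers:

```python
import numpy as np

# Ten years of monthly log prices on a perfectly steady trend.
x = np.arange(120, dtype=float)
y = 0.01 * x                      # true slope: 0.01 per month

slope_clean = np.polyfit(x, y, 1)[0]

# Add a "bubble": the last 20 months trade far above trend.
y_bubble = y.copy()
y_bubble[100:] += 1.5
slope_bubble = np.polyfit(x, y_bubble, 1)[0]

print(slope_clean)   # the true 0.01 trend
print(slope_bubble)  # roughly double: the line is dragged up by the bubble
```

Twenty crazy months out of 120 are enough to roughly double the fitted slope, which is exactly the distortion visible on the bubble-era charts.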

One of the more interesting alternatives is a fit algorithm called Least Absolute Deviation ("LAD", or more formally, the L1 Norm). Instead of minimizing the sum of the squared residuals, it minimizes the sum of the absolute values of the residuals. Each residual thus counts in direct proportion to its distance from the Average CAGR line, so outliers get no extra weight. I think this is a much better approach.
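For anyone curious how a LAD fit can actually be computed, one common approach is iteratively reweighted least squares: refit repeatedly with each point weighted by 1/|residual|, which converts the squared penalty into an absolute one. A sketch of that generic L1 solver follows (this is my own illustration, not necessarily what the chart scripts will use):

```python
import numpy as np

def lad_line(x, y, iters=100, eps=1e-8):
    """Fit y ~ a + b*x minimizing sum(|residual|) (the L1 norm)
    via iteratively reweighted least squares."""
    b, a = np.polyfit(x, y, 1)                   # least-squares starting point
    for _ in range(iters):
        r = np.abs(y - (a + b * x))
        w = np.sqrt(1.0 / np.maximum(r, eps))    # down-weight big residuals
        A = np.column_stack([np.ones_like(x), x]) * w[:, None]
        a, b = np.linalg.lstsq(A, y * w, rcond=None)[0]
    return a, b

# Steady 0.01-per-month trend plus a 20-month "bubble" at the end.
x = np.arange(120, dtype=float)
y = 0.01 * x
y[100:] += 1.5

a_lad, b_lad = lad_line(x, y)
print(b_lad)  # stays near the true 0.01, while LSQ gives about double that
```

Because the 100 non-bubble points all sit on one line, the L1 fit snaps to that line and simply ignores the bubble months.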

The downside: the charts will change (the Average CAGR will move, and the RMS, RF, and everything else will move with it), although many stocks will not be greatly affected. Here are some examples using this week's data:

PFE:

LSQ: http://invest.kleinnet.com/bmw1/stats25/PFE.html

LAD: http://invest.kleinnet.com/bmw1/special/PFE.html

CSCO:

LSQ: http://invest.kleinnet.com/bmw1/stats16/CSCO.html

LAD: http://invest.kleinnet.com/bmw1/special/CSCO.html

KO:

LSQ: http://invest.kleinnet.com/bmw1/stats30/KO.html

LAD: http://invest.kleinnet.com/bmw1/special/KO.html

S&P 500:

LSQ: http://invest.kleinnet.com/bmw1/stats30/$INX.html

LAD: http://invest.kleinnet.com/bmw1/special/$INX.html

Nasdaq:

LSQ: http://invest.kleinnet.com/bmw1/stats30/$COMPX.html

LAD: http://invest.kleinnet.com/bmw1/special/$COMPX.html

Most do not change much, but as far as I can tell, all the changes are in the right direction.

I would like to switch over to the LAD algorithm for next week's regular chart runs. I think it's the right choice in the long run, and sooner or later I think it will have to be done.

-Mike

No. of Recommendations: 1

LSQ versus LAD

Hi Mike!

I couldn't agree more...go with LAD....it's the right move, IMH( and mathematically impaired )O! ;-)

Cheers!

Murph

No. of Recommendations: 1

Mike,

Going with LAD is the right move IMH (pretty math aware) O.

Mark

No. of Recommendations: 3

Mike,

It looks like I might be the lone dissenter, but changing over to a LAD algorithm would be a disaster for me. I have a spreadsheet showing how the RMS and RF change with time. My variant on the BMW method depends more on that change than on the absolute values themselves. If you change algorithms, I'll have to start over on the BMW method altogether, and several months' work will go up in smoke.

Also, I consider myself pretty math-savvy, and from a theoretical standpoint I much prefer the LSQ algorithm. I'm aware of the problems that outliers cause for LSQ, but IMO you just have to be aware and live with those problems. The advantage of LSQ is that the RMS is solidly grounded in statistical principles, and you can make all sorts of confidence-level statements based on it (of course, you have to assume a {log-}normal distribution).
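To make Robert's confidence-level point concrete: if the residuals of log price around the Average CAGR line really were normally distributed, about 95.4% of prices would sit within ±2 RMS of the line. A quick simulation (illustrative numbers only, not real stock data):

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated log-price residuals under the (log-)normal assumption.
residuals = rng.normal(0.0, 1.0, 200_000)

rms = np.sqrt(np.mean(residuals ** 2))
frac_within_2rms = np.mean(np.abs(residuals) < 2 * rms)
print(round(frac_within_2rms, 3))  # close to the theoretical 0.954
```

Under a LAD fit the line still exists, but this normal-theory reading of the RMS no longer follows automatically, which is the heart of the objection.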

Finally, I don't believe that **any** mechanical approach will be able to accurately "filter out" the effects of a series of extreme outliers such as those experienced by the tech stocks in the late '90s. Personally, I just stay away from those stocks. If you must invest in that area, I feel that BuildMWell's method of hand-fitting CAGR curves is the only way to properly account for the "bubble" effects.

I know that it would be a lot of extra work, but if you feel that you must switch over to the LAD algorithm, is there any chance that you could do it in addition to, instead of in place of, the LSQ algorithm?

I really do appreciate all that you do for the group, and I don't want to seem like a Luddite, but I would really prefer that we maintain the *status quo*.

Best regards,

Robert

No. of Recommendations: 0

I like the LAD method best. It is closer to what my eye tells me the CAGR should be. My eye wants to weed out the bubble impact, so I end up drawing the CAGR visually from your charts and lowering it to a more consistent value absent that bubble event.

ToroBravo2003

No. of Recommendations: 2

Mike, I do not see enough difference between the two approaches to matter. Where there is a huge bubble effect, the Average CAGR drops very little with your LAD approach. I think the decision between LSQ and LAD is moot.

I have no preference either way, but I think you are worrying too much about this. Your statistical approach is what we need, and it gives us the correct picture either way. We just need to remember that the bubble gives us an unrealistically high average CAGR...it doesn't detract from the overall analysis in any way. If it ain't broke, why fix it?

My variable percent lines show a different perspective, but the low CAGR is essentially the same on your charts and on mine. That is the key ingredient. You arrive at the same low CAGR line either way you work the data. Now, I have not graphed your LSQ versus LAD low CAGR lines, but I believe they will be pretty much identical.

Have you tried showing both approaches on the same chart? I mean overlaid on each other? I think it would be very interesting to see how they compare to each other...side by side.

The problem with a bubble, as I see it, is that it will very much over-shoot reality to the upside, but the price cannot under-shoot proportionally. At the top, the investor can justify all sorts of rosy scenarios that drive the bubble prices higher and higher. The high price becomes a self-fulfilling prophecy. Each time the price rises beyond rational levels, it justifies an even higher jump.

On the down-side, the earnings are there to support the price. Now, they may be suffering, but the underlying company has real value to establish a foundation. There is no comparable ceiling at the top to give us a solid cap. It is like a balloon being pushed into a cinder block: the top stays rounded while the bottom is compressed flat.

It seems to me that the top is more elastic than the bottom. Don't the statistics assume the same elasticity in both directions? By the way, I did not do all that well in statistics class, as you can tell.

No. of Recommendations: 3

Hi Mike

First, congratulations on your good work in computing and publishing CAGR curves for the stocks along with their standard deviations from the mean.

I was involved in land surveying for a number of years, and it was common practice to balance surveys when we finished. We would have coordinates for a starting point, and we would always close the survey, computing coordinates for the closing point, which, in theory, should equal the coordinates of the point of beginning. There was always some error due to the imprecision of instruments and accidental human error, but if it was in excess of 1/10000 we worried and rechecked.

We took special precautions to negate the systematic errors which might be introduced by instruments (for example, a tape being .01 foot out of calibration). What was left after we eliminated systematic errors were accidental or random errors, which were just as likely to be positive as negative.

We used three methods to balance our surveys: the transit method, the compass method, and the method of least squares. The transit method assumed that most of the random errors were due to measurements of distance, and was not widely used. The compass method assumed that errors were about equally likely to be in angle measurement as in distance measurement. The least squares method assumed that the most likely solution was the one where the sum of the squared residual-error distances from the north/south and east/west axes was a minimum. This was actually the best method to use, but it was very burdensome, mathematically speaking, prior to the introduction of hand calculators and computers.

And we didn't have many extreme points, since we would predict the line in which a blunder occurred and also predict whether the blunder was in distance measurement or angular measurement.

I think this is analogous to the method that you currently use to generate the CAGR curves. Of course, you don't have a precise starting point, as we had, for measuring the error of the results. So I think you do a darn good job of generating charts, and I commend you for it. Also, I use the charts.

In retrospect, I'm not certain about weighting the deviations from the average by their absolute values rather than their squares, but what you say makes sense. It would certainly reduce the influence of prices that were extremely different from the average. So I encourage you to go for it. Every measurement contains error due to assumptions, instruments, and human error.

I have enjoyed this board so much. And I hope it prospers for a long, long time.

Delwin

No. of Recommendations: 0

My only concern with LAD: if it assigns less weight to the bubble, then in the future, *by virtue of having assigned less weight to the bubble*, it will cause us to miss opportunities at or below -2RMS that would have been pretty clear from LSQ.

Bosco.

No. of Recommendations: 0

Mike,

I've toyed with weighting deviations from the mean by their square root. Alas, I have lots of math background but *NO* background in statistics :( to help me with this.

I think using LAD is correct.

Jim

No. of Recommendations: 4

I agree with your suggestion, from both an intuitive and mathematical standpoint.

Benoit Mandelbrot strongly cautions about "fat tails" in "The Misbehavior of Markets." Mathematically, a bubble is a "fat tail" (an event that occurs in greater number/degree than predicted by a Gaussian probability distribution). Intuitively, we perceive this as an outlier. Bubbles, in retrospect (I wouldn't want to seem more knowledgeable than Alan Greenspan), show greater numbers of outliers. Same for crashes (see "Manias, Panics and Crashes" by Kindleberger).

In 2000/2001, an intuitive understanding of the exponentially growing deviation of the stock market from its underlying fundamentals (GDP growing by 3% a year, the market averages by 25% a year) led me to withdraw my money from the market. In retrospect, this was the right thing to do.

Using the LAD, instead of the least squares, reduces the effect of bubbles. That's a good thing to do.

However, the LAD still includes the outlying data.

Perhaps you could use a normalization function to reduce the effect of "fat tails." Data points that deviate by more than 3 SD could be "dampened."
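The "dampening" suggested above corresponds to what statisticians call winsorizing: clip each deviation at some multiple of the standard deviation before it can influence a fit. A hypothetical sketch (the 3-SD cutoff and the function name are my own, not anything in the chart code):

```python
import numpy as np

def dampen(log_prices, trend, k=3.0):
    """Clip deviations from the trend line at k standard deviations,
    so fat-tail points cannot dominate a subsequent fit."""
    r = log_prices - trend
    limit = k * r.std()
    return trend + np.clip(r, -limit, limit)

# Twenty normal months plus one absurd spike.
trend = np.zeros(21)
prices = np.zeros(21)
prices[10] = 100.0          # the "bubble" point

damped = dampen(prices, trend)
print(damped.max())  # the spike is pulled in to the 3-SD limit
```

A Huber-style loss, which squares small residuals but treats large ones linearly, is a smoother variant of the same idea.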

From the standpoint of protecting investors, it might be worthwhile to highlight the segment of the chart that deviates from the norm. Due diligence could then help reveal whether the deviation from normal growth is occurring because of a business reason (such as release of an important new product line), or because of a bubble.

Wendy

No. of Recommendations: 2

*LSQ versus LAD*

Hi Mike!

I couldn't agree more...go with LAD....it's the right move, IMH( and mathematically impaired )O! ;-)

Cheers!

Murph

Wait a second, here! Is Murph making mathematical algorithm recs..... or is there an adult beverage that is abbreviated with the initials LAD???

:))) Steve

No. of Recommendations: 2

Personally, I don't think it matters which measuring stick is used. The graduations on the ruler may differ somewhat but the scope of what we are measuring is several orders of magnitude greater. I don't think the difference in accuracy is significant.

They are your charts Mike........do what makes you happy.

MW (2 cents)

No. of Recommendations: 1

Wait a second, here! Is Murph making mathematical algorithm recs..... or is there an adult beverage that is abbreviated with the initials LAD???

:))) Steve

Hey, Steve!

Wait a minute! You mean LAD doesn't stand for "Lager, Ale & Daiquiri?"

What's up with that!?!

Cheers!

Murph