Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Correct process for creating a linear trend line in Excel

22 views
Skip to first unread message

Onion Knight

unread,
Jun 15, 2012, 5:16:44 PM6/15/12
to
There has been some debate in COLA as to the correct way to create a
linear trendline in Excel. Someone in the forum kindly made a video to
show what he thought was the correct method
http://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov

A couple of other people in the group say that the video skips a
number of steps but when I look at the directions from Microsoft I do
not see any they could be referring to.

Does anyone see any missing steps for the creation of a linear trend
line? I admit this is a bit over my head so any input would be welcome.

Peter Köhlmann

unread,
Jun 15, 2012, 6:29:39 PM6/15/12
to
Give it up, Snit Glasser.

Nobody here believes that "Onion Knight" is anything but a sock

joeu2004

unread,
Jun 15, 2012, 7:22:04 PM6/15/12
to
"Onion Knight" <onionkn...@gmail.com> wrote:
> There has been some debate in COLA as to the correct way
> to create a linear trendline in Excel. Someone in the forum
> kindly made a video to show what he thought was the correct
> method
> http://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov
[....]
> Does anyone see any missing steps for the creation of a
> linear trend line?

I cannot find the original debate in comp.os.linux.advocacy (COLA); just the
thread titled "Visualizing where to draw the standard deviation line", which
refers to yet-another "a debate in COLA".

No matter; I'm not interested. I think we can answer your Excel questions
without getting into anything so esoteric.

I see no missing steps in creating the linear trendline per se using Excel.

I would also select the trendline options to display the trendline equation
and the R-squared of the trendline.

And after displaying the trendline equation, I might select the option to
Format Trendline Label in order to alter the display precision of the
coefficients. I would select Scientific with 14 decimal places if I want to
see the trendline coefficients "exactly"; that is, the most precision that
Excel will display.

But none of those "steps" affects how the linear trendline is created.

FYI, we can also use the Excel function LINEST to determine the regression
line coefficients.

If the dates are in A1:A24 and the percentages are in B1:B24, select two
horizontal cells and array-enter the following formula (press
ctrl+shift+Enter instead just Enter): =LINEST(B1:B24,A1:A24).

To compare with the "exact" trendline coefficients, I would also format the
LINEST results as Scientific with 14 decimal places.

You might notice infinitesimal differences between the trendline
coefficients and the LINEST results. But in this case, they are same up 10
or 11 decimal places in this case; "close enough for government work". As a
guess, the difference might be attributed to physically different internal
algorithms and/or to differences in internal floating-point arithmetic
anomalies.

For some insight into how the Excel LINEST and linear trendline coefficients
might be derived, see http://en.wikipedia.org/wiki/Simple_linear_regression.

Note that =COVAR(A1:A24,B1:B24)/VARP(A1:A24) in E3 and
=AVERAGE(B1:B24)-E3*AVERAGE(A1:A24) are about the same as the LINEST
results. The difference is in the 3 least-significant binary bits of the
floating-point representations in this case.

You might also take note of the wiki page section titled "Normality
assumption". That may or may not be relevant to the COLA debates.

Snit

unread,
Jun 15, 2012, 8:45:21 PM6/15/12
to
On 6/15/12 4:22 PM, in article jrgg4u$to5$1...@dont-email.me, "joeu2004"
<joeu...@foo.bar> wrote:

> "Onion Knight" <onionkn...@gmail.com> wrote:
>> There has been some debate in COLA as to the correct way
>> to create a linear trendline in Excel. Someone in the forum
>> kindly made a video to show what he thought was the correct
>> method
>> http://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov
> [....]
>> Does anyone see any missing steps for the creation of a
>> linear trend line?
>
> I cannot find the original debate in comp.os.linux.advocacy (COLA); just the
> thread titled "Visualizing where to draw the standard deviation line", which
> refers to yet-another "a debate in COLA".
>
> No matter; I'm not interested. I think we can answer your Excel questions
> without getting into anything so esoteric.

Really the video is all you need to see (I am the one who created it and the
other related images with trend lines)... it shows the steps and they match
what MS suggests exactly (with the exception that I use the right-click
method to get to some tools where the MS page suggests using the Ribbon).

> I see no missing steps in creating the linear trendline per se using Excel.

Thank you. I think anyone who knows Excel well... or even not so well... or
even poorly but can read the steps provided by MS... can see this. But
someone in COLA kept saying I was missing steps:

cc:
------
I will gladly educate your ignorant ass, and give you step by
step instructions on how to correctly do it so you can see
all your missing steps, but first you have to stop lying.
------
So you are missing some key steps before doing the trendline.
I have said that many times now.
------

He later said he was in reference to doing more than just creating the trend
line, which is all I said I was doing - he claimed I was missing steps to do
part of a process I never claimed to do. Yes: his claims were idiotic.
This is the nature of the "advocacy" groups.

Bottom line: my steps for creating a linear trend line were flawless, which
is not to say creating such a line was the only form of analysis that could
be done on such data or that the creation of a linear trend line on data
that is clearly not linear is the best way to do an analysis.

Using the same steps, though, I was able to show a trend line with a very
good fit for the second half of 2011:
<http://tmp.gallopinginsanity.com/LinuxTrend2011-2ndhalf.png>

And also able to show how the trend changed for the first half of 2012:
<http://tmp.gallopinginsanity.com/LinuxTrendLine2012.jpg>

The fact is, a linear trend line is not going to show a very good fit with
non-linear data... or data where the trend changes.

> I would also select the trendline options to display the trendline equation
> and the R-squared of the trendline.

You can see where I did that in the above links... and in others that I
made... a whole bunch of them combined:
<http://tmp.gallopinginsanity.com/LinuxMultLinearTrendLines.png>

Even then my labeling is not done well... but my only claim was that I made
the linear trend line correctly, not that the graph had good labeling, etc.
I openly acknowledge it did not.

> And after displaying the trendline equation, I might select the option to
> Format Trendline Label in order to alter the display precision of the
> coefficients. I would select Scientific with 14 decimal places if I want to
> see the trendline coefficients "exactly"; that is, the most precision that
> Excel will display.

Seems a bit excessive for the needs here - but I did know about those
options.

> But none of those "steps" affects how the linear trendline is created.

Correct. Nor do they remove the fact there are other forms of analysis that
can be done.

> FYI, we can also use the Excel function LINEST to determine the regression
> line coefficients.
>
> If the dates are in A1:A24 and the percentages are in B1:B24, select two
> horizontal cells and array-enter the following formula (press
> ctrl+shift+Enter instead just Enter): =LINEST(B1:B24,A1:A24).

I have used this function a couple of times... but never for any real work.
:)

> To compare with the "exact" trendline coefficients, I would also format the
> LINEST results as Scientific with 14 decimal places.
>
> You might notice infinitesimal differences between the trendline
> coefficients and the LINEST results. But in this case, they are same up 10
> or 11 decimal places in this case; "close enough for government work". As a
> guess, the difference might be attributed to physically different internal
> algorithms and/or to differences in internal floating-point arithmetic
> anomalies.
>
> For some insight into how the Excel LINEST and linear trendline coefficients
> might be derived, see http://en.wikipedia.org/wiki/Simple_linear_regression.
>
> Note that =COVAR(A1:A24,B1:B24)/VARP(A1:A24) in E3 and
> =AVERAGE(B1:B24)-E3*AVERAGE(A1:A24) are about the same as the LINEST
> results. The difference is in the 3 least-significant binary bits of the
> floating-point representations in this case.
>
> You might also take note of the wiki page section titled "Normality
> assumption". That may or may not be relevant to the COLA debates.

Not really... though the idea of a normal distribution was also debated. A
bit of info you might or might not know: you can easily visualize where the
standard deviation (sigma) lines should be drawn on a normal distribution...
the lines are *always* at the distance from the mean to the inflection point
on the curve (where the concavity changes). I was repeatedly told by the
same person the distance was irrelevant:

cc:
-----
There'se nothing wrong with the image, other than some weird
axis labeling.
-----
Snit's so fucking stupid he thinks the sigma lines are drawn
based on distance from the mean, not area under the curve.
-----
| The sigma lines are drawn based on the area of the curve -
| which is easy to see when the images screw it up, esp. when
| they do so really badly, like in some of the ones I showed
| you.
They are not wrong.
------
LOL!!!! All of those links are fine. The first sigma lines
cover 68% of the area UNDER THE CURVE.
-----
If you would like to prove, on any single one of the links
you call incorrect, that the first sigma lines do not bound
an area that is 68.2% of the area UNDER THE CURVE, then I
would like to see it.
-----
Hahahaha your "approximate inflection points" are hilarious.
Please, post more on this subject.
------

I explained to him why he was wrong here: <http://youtu.be/MoW3hMq-eIc>

There are many examples of people who should know better getting this wrong.
Here are some

<http://www.udel.edu/htr/Statistics/Images/Class12/normal2.gif> From:
<http://www.udel.edu/htr/Statistics/Notes/class12.html>

Which is the example I used for showing how him can make a decent
approximation visually: <http://tmp.gallopinginsanity.com/sd.png>.

I also pointed to some other examples which at least appear incorrect to me
(though they are not as far off as the above example):

<http://www.footballguys.com/shickstandard_1_files/image009.gif> From:
<http://www.footballguys.com/shickstandard_1.htm>
Sigma lines clearly not at a far enough distance from the mean, esp. on the
graph to the right.

<http://www.gsseser.com/images/StandardDeviation2s.gif> From:
<http://www.gsseser.com/Deviation.htm>
Sigma lines clearly not at a far enough distance from the mean.

You would think that such sites would be made by people who knew better. I
openly admit I am not a math wiz but it is rather silly when sites claiming
to be teaching such things get their depictions wrong (of course, one of the
sites above is from "Footballguys"... and you might not expect them, by
stereotype, to be the most knowledgeable in such areas anyway). :)

Anyway, sorry to have the idiotic debate spread to other forums... though it
is sorta fun to see people who clearly know what they are talking about rip
apart those who were calling me names and insisting they "knew" I was wrong.
:)




--
The indisputable facts about that absurd debate: <http://goo.gl/2337P>
cc being proved wrong about his stats BS: <http://goo.gl/1aYrP>
7 simple questions cc will *never* answer: <http://goo.gl/cNBzu>
cc again pretends to be knowledgeable about things he is clueless about.

Steve Carroll

unread,
Jun 15, 2012, 9:39:48 PM6/15/12
to
On Jun 15, 5:22 pm, "joeu2004" <joeu2...@foo.bar> wrote:
> "Onion Knight" <onionknight...@gmail.com> wrote:
> > There has been some debate in COLA as to the correct way
> > to create a linear trendline in Excel. Someone in the forum
> > kindly made a video to show what he thought was the correct
> > method
> >http://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov
> [....]
> > Does anyone see any missing steps for the creation of a
> > linear trend line?
>
> I cannot find the original debate in comp.os.linux.advocacy (COLA); just the
> thread titled "Visualizing where to draw the standard deviation line", which
> refers to yet-another "a debate in COLA".
>
> No matter; I'm not interested.  I think we can answer your Excel questions
> without getting into anything so esoteric.
>
> I see no missing steps in creating the linear trendline per se using Excel.

Given your reply it's pretty evident you realize that the process you
undertake depends on what your goal for creating the trend line is and
why you'd want to do the steps you mentioned in your reply. Earlier,
in another forum, Snit was talking about this and a poster cut right
to the chase by asking Snit (based on common sense and what Snit had
previously written) the two most obvious questions:

"I gather there are two questions here:

1) Is the trendline approriately fitted to that data?

2) Is the trendline useful in some way?"

Snit's reply:
"Not quite: the question was merely if the process of creating the
trend line was correct - did it follow the process of creating a
linear trend line that is supported by the build in "linear trend
line" properties of the program."

As you can see, Snit himself explains that his goal was not to see if
the "trendline approriately fitted to that data" nor to have the
trendline be "useful in some way". Personally I find this rather
strange... but Snit's goal was revealed in the same post the above
quotes came from... and it's actually even more strange. Notably,
Snit wrote about a correlation between Linux UI improvements and an
increase in market share (a thing he has been arguing in another
newsgroup):

"The point, however, was not to predict the future past where I first
created the line. I had noted that the desktop Linux distros were
clearly focusing on improving their usability - and I predicted that
an increase in usability would lead to an increase in users."

And this, from the same post:

"But the graph was initially made to show that the then-current data
fit with my past predictions - not to make any specific prediction
about the future (such as that the same level of upward trend would be
seen)."


Anyone familiar with this topic knows that, regardless of the method
he used to create his "past predictions" (or any, for that matter) the
data Snit used would not aid in supporting a correlation between UI
improvements in Linux and a rise in marketshare because it's the wrong
kind of data.

We know Snit didn't care if the line fit his data or was useful in any
way... if he's going to argue that he didn't produce the line to show
evidence of his alleged correlation (a thing he is saying up above...
but possibly without realizing it) then one would have to logically
ask:

What the hell was the purpose of his trend line?

Martin Brown

unread,
Jun 17, 2012, 3:57:14 AM6/17/12
to
The difference is that (apart from in some very early versions of XL2007
where the graph regression function was wrecked to make it agree with
another well known products answers) the linear regression on XL charts
uses a well designed algorithm that gives a very good approximation to
the true least squares parameter fit to data. LINEST is demonstrably
numerically unstable although it requires at least a 3rd order
polynomial fit and moderately hostile data to break it. Examples have
been posted here or in another m.p.e group previously.
(some years back)

For linear fits the two are more or less interchangable but for higher
orders of polynomial the chart function fit is more accurate.
>
> For some insight into how the Excel LINEST and linear trendline
> coefficients might be derived, see
> http://en.wikipedia.org/wiki/Simple_linear_regression.
>
> Note that =COVAR(A1:A24,B1:B24)/VARP(A1:A24) in E3 and
> =AVERAGE(B1:B24)-E3*AVERAGE(A1:A24) are about the same as the LINEST
> results. The difference is in the 3 least-significant binary bits of the
> floating-point representations in this case.
>
> You might also take note of the wiki page section titled "Normality
> assumption". That may or may not be relevant to the COLA debates.

What is COLA?

--
Regards,
Martin Brown

Onion Knight

unread,
Jun 19, 2012, 12:29:26 AM6/19/12
to
On Jun 15, 9:16 pm, Onion Knight <onionknight...@gmail.com> wrote:
> There has been some debate in COLA as to the correct way to create a
> linear trendline in Excel. Someone in the forum kindly made a video to
> show what he thought was the correct methodhttp://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov
>
> A couple of other people in the group say that the video skips a
> number of steps but when I look at the directions from Microsoft I do
> not see any they could be referring to.
>
> Does anyone see any missing steps for the creation of a linear trend
> line? I admit this is a bit over my head so any input would be welcome.

Thank you to all who responded. The answer is now clear and Snit made
his trend lines correctly. They showed an increase in Linux usage in
2011 and a decrease in 2012. CC has now admitted his process was so
fucked up he could not even see the change in the trend and thought
the data was at flaw when it was his own process.

Snit

unread,
Jun 19, 2012, 2:16:55 AM6/19/12
to
On 6/18/12 9:29 PM, in article
0c0f10cd-e98c-4a52...@8g2000vbu.googlegroups.com, "Onion
Exactly correct: cc ending up claiming the data he asked us to use was wrong
and admitted he missed the trends I was able to show. He incorrectly
thought that his ability to get a higher R^2 value somehow proved that some
of the data points were wrong... he could not accept that the trend changed,
even though it was easy to see it did:

Second half of 2011 - clearly trending up:
<http://tmp.gallopinginsanity.com/LinuxTrend2011-2ndhalf.png>

First half of 2012 - clearly trending down:
<http://tmp.gallopinginsanity.com/LinuxTrendLine2012.jpg>

cc screwed up his analysis and missed this... though he was also
inconsistent with his claims about this:

cc #1:
-----
It will be 1%. Same as it ever was.
-----
cc #2:
-----
Linux has been on a significant downward trend since then.
-----
And if you look at January and then look at now, then there
is a downward trend.
-----

It is not possible for both of his claims to be correct, of course.
0 new messages