piecewise regression

21 views
Skip to first unread message

Christoph Ruehlemann

unread,
Mar 5, 2018, 5:04:50 AM3/5/18
to corplin...@googlegroups.com
Hi all,

Can't seem to figure out where the mistake is in this code for a piecewise regression model:

model <- lm(X ~ (xyz < 0.89474) * xyz + (xyz >= 0.89474) * xyz)

Its plot has an ugly V-shape added to the break point (see plot attached):

XYZ=sort(unique(xyz))
plot(XYZ ~ xyz, frame=F, type="n")
lines(XYZ, predict(model, list(xyz=XYZ)))

Does anybody spot the mistake?

Chris
Piecewise regression.tiff

Stefan Th. Gries

unread,
Mar 5, 2018, 11:52:49 PM3/5/18
to CorpLing with R
Let me first seize this opportunity to remind everyone of a few useful
links (and their content):
- <https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example>
- <http://adv-r.had.co.nz/Reproducibility.html>
- and maybe <https://cran.r-project.org/web/packages/reprex/index.html>,
which I haven't used myself.

In that spirit:

rm(list=ls(all=TRUE))
# define some data
qwe <- structure(list(x = c(0.98332368268674, 0.98332368268674,
1.52269832078749,
2.14415170816445, 2.61317313259988, 3.04701795020266, 3.64502026635785,
4.52443543717429, 5.08726114649681, 5.69698899826288, 5.49765489287782,
5.49765489287782, 4.92310364794441, 4.38372900984366, 3.77400115807759,
3.38705848291836, 2.39038795599305, 2.39038795599305, 1.86273885350318,
1.40544296467863, 0.936421540243196, 1.92136653155761, 3.31670526925304,
4.16094383323683, 5.00518239722061, 5.55628257093225, 6.28326577880718,
6.65848291835553, 7.25648523451071, 8.21797915460336, 8.69872611464968,
9.30845396641575, 8.79253039953677, 8.39386218876665, 8.39386218876665,
7.77240880138969, 7.02197452229299, 6.77573827446439, 6.31844238563984,
6.31844238563984), y = c(5.47882045322487, 5.47882045322487,
4.70223707147008, 4.45514235909355, 4.10214991284137, 4.03155142359094,
3.64325973271354, 2.54898314933178, 1.87829750145264, 2.12539221382917,
2.26658919233004, 2.26658919233004, 3.07847181871005, 3.07847181871005,
3.53736199883789, 3.78445671121441, 4.87873329459616, 4.87873329459616,
5.30232423009878, 5.83181289947705, 5.58471818710052, 5.23172574084834,
4.59633933759442, 3.25496804183614, 3.11377106333527, 3.67855897733876,
4.73753631609529, 5.65531667635096, 6.50249854735619, 8.19686228936665,
9.00874491574666, 9.43233585124927, 8.83224869262057, 8.16156304474143,
8.16156304474143, 7.06728646135967, 5.40822196397443, 5.40822196397443,
4.63163858221964, 4.63163858221964)), .Names = c("x", "y"))

# visualize
plot(qwe); grid()

# prepare for analysis
y <- qwe$y; x <- qwe$x # original data
splitter <- ifelse(x>=5, "hi", "lo") # breakpoint indicator
summary(model <- lm(y ~ x*splitter)) # fit and summarize the model

# prepare for adding regression lines
PREDICTOR <- seq(1, 9, 0.25) # x-axis values for predictions
PREDICTIONS <- predict( # y-axis values for predictions
model,
newdata=list(
x=PREDICTOR,
splitter=ifelse(PREDICTOR>=5, "hi", "lo")))

# augment visualization
abline(v=5) # add line for where the split occurs
# add regression line(s)
lines(PREDICTOR, PREDICTIONS)

HTH,
STG

Christoph Ruehlemann

unread,
Mar 6, 2018, 4:39:00 AM3/6/18
to corplin...@googlegroups.com
thanks for this, but it seems this graph too plots the two regression lines with an unwanted 'nose', doesn't it (see attachment)?


--
You received this message because you are subscribed to the Google Groups "CorpLing with R" group.
To unsubscribe from this group and stop receiving emails from it, send an email to corpling-with-r+unsubscribe@googlegroups.com.
To post to this group, send email to corpling-with-r@googlegroups.com.
Visit this group at https://groups.google.com/group/corpling-with-r.
For more options, visit https://groups.google.com/d/optout.



--
Screen Shot 2018-03-06 at 10.35.10.png

Alex Perrone

unread,
Mar 6, 2018, 8:48:27 AM3/6/18
to corplin...@googlegroups.com
No, it doesn't. Change this line

PREDICTOR <- seq(1, 9, 0.25) # x-axis values for predictions

to 

PREDICTOR <- seq(1, 9, 0.01) # x-axis values for predictions

and re-plot.



Alex Perrone

unread,
Mar 6, 2018, 8:49:22 AM3/6/18
to corplin...@googlegroups.com



On Tue, Mar 6, 2018 at 8:48 AM, Alex Perrone <boet...@gmail.com> wrote:
No, it doesn't. Change this line

PREDICTOR <- seq(1, 9, 0.25) # x-axis values for predictions

to 

PREDICTOR <- seq(1, 9, 0.01) # x-axis values for predictions

and re-plot.



Stefan Th. Gries

unread,
Mar 6, 2018, 11:25:18 AM3/6/18
to CorpLing with R
Well, even

PREDICTOR <- seq(1, 9, 0.01) # x-axis values for predictions

has that nose, it's 'behind' the vertical abline. The nose is 'the connection' between the first and the second regression line - if you don't have the abline hiding it, just plot them separately:

lines(PREDICTOR[PREDICTOR>=5], PREDICTIONS[PREDICTOR>=5])
lines(PREDICTOR[PREDICTOR<5], PREDICTIONS[PREDICTOR<5])

Christoph Ruehlemann

unread,
Mar 8, 2018, 1:58:18 PM3/8/18
to corplin...@googlegroups.com
The nose and the disconnect between the segments may be due to lack of precision in the way the break point is determined.
After re-determining ithe break point for my data based on the method detailed in Crawley (2007: 427), the two segments perfectly connect.

The steps involved are:

1. define a vector "breaks" for potential breaks:
2. run a for loop for piecewise regressions for all potential break points and yank out the minimal residual standard error (mse) for each model:

mse <- numeric(length(breaks))
for(i in 1:length(breaks)){
  piecewise <- lm(V_indep ~ V_dep*(V_dep < breaks[i]) + V_dep*(V_dep >=breaks[i]))
  mse[i] <- summary(piecewise)[6]
}
mse <- as.numeric(mse)

3. identify the break point with the least mse:

breaks[which(mse==min(mse))]

This methodology solved my problem. Perhaps it solves the problem in other data sets too.

Chris

--
You received this message because you are subscribed to the Google Groups "CorpLing with R" group.
To unsubscribe from this group and stop receiving emails from it, send an email to corpling-with-r+unsubscribe@googlegroups.com.
To post to this group, send email to corpling-with-r@googlegroups.com.
Visit this group at https://groups.google.com/group/corpling-with-r.
For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages