Finding longest climb

Miki Tebeka

May 17, 2018, 1:19:14 AM

to PyData

Hi,

I have a log of a run:

                     time   latitude  longitude      height
0 2015-08-20 03:48:07.235  32.519585  35.015021  136.199997
1 2015-08-20 03:48:24.734  32.519606  35.014954  126.599998
2 2015-08-20 03:48:25.660  32.519612  35.014871  123.000000
3 2015-08-20 03:48:26.819  32.519654  35.014824  120.500000
4 2015-08-20 03:48:27.828  32.519689  35.014776  118.900002
...

I'd like to find the longest climb. I can easily find the height difference between rows with

df['height'] - df['height'].shift()

and can mark up/down with

np.sign(df['height'] - df['height'].shift())

How can I get the longest climb? groupby won't help directly, since it would group all the ups together and all the downs together, whereas I'd like to find the longest consecutive sequence of ups and then sum it.

Any ideas?

Thanks!

Miki

Pietro Battiston

May 17, 2018, 3:47:21 AM

to pyd...@googlegroups.com

On Wed, 16/05/2018 at 22:19 -0700, Miki Tebeka wrote:
> Hi,
>
> I have a log of a run:
>
>                      time   latitude  longitude      height
> 0 2015-08-20 03:48:07.235  32.519585  35.015021  136.199997
> 1 2015-08-20 03:48:24.734  32.519606  35.014954  126.599998
> 2 2015-08-20 03:48:25.660  32.519612  35.014871  123.000000
> 3 2015-08-20 03:48:26.819  32.519654  35.014824  120.500000
> 4 2015-08-20 03:48:27.828  32.519689  35.014776  118.900002
> ...
>
> I'd like to find the longest climb. I can easily find the height
> difference between rows with
> df['height'] - df['height'].shift()
>
> and can mark up/down with
> np.sign(df['height'] - df['height'].shift())
>
> How can I get the longest climb?

.diff() (applied on the result of np.sign) will tell you if the sign
changes
np.where() will then tell you _where_ the sign changes
diff() (or equivalent numpy indexing) on the result will tell you the
length of the streaks
You'll then just need to identify which are climbs and which are falls
(by looking back at the result of the original diff())
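
The steps above can be sketched roughly as follows. This is only an illustration on a made-up height series (the values and the variable names are assumptions, not taken from the log), and "longest" is counted in number of rows rather than in time or distance:

```python
import numpy as np
import pandas as pd

# Hypothetical heights, not the values from the log in this thread.
height = pd.Series([100.0, 102.0, 105.0, 104.0, 101.0, 103.0, 106.0, 110.0, 109.0])

step = height.diff().to_numpy()[1:]   # height change per row, leading NaN dropped
sign = np.sign(step)                  # +1 up, -1 down, 0 flat

# np.diff on the signs is non-zero exactly where the direction changes;
# np.where turns that into positions (shifted by 1 to mark the new streak).
change = np.where(np.diff(sign) != 0)[0] + 1

# Streak boundaries, and streak lengths by differencing the boundaries.
bounds = np.concatenate(([0], change, [len(sign)]))
starts, lengths = bounds[:-1], np.diff(bounds)

# Look back at the sign to keep only the climbs.
is_climb = sign[starts] > 0
best = np.argmax(np.where(is_climb, lengths, 0))   # longest climb, in steps
best_start, best_len = starts[best], lengths[best]
gain = step[best_start:best_start + best_len].sum()
print(best_start, best_len, gain)
```

For this series the longest climb covers three steps and gains 9.0 in height. The sketch assumes the series contains at least one climb.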

Notice that to limit switching back and forth between arrays and
Series, you can replace "np.sign(x)" with "x > 0" as long as you are
interested in "strict climbs".
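
To make the difference concrete, here is a small sketch on invented data (the series is an assumption for illustration). With a flat step in the middle, the strict test `x > 0` splits the climb in two, while `x >= 0` would keep it together:

```python
import numpy as np
import pandas as pd

# Hypothetical heights with one flat step (101 -> 101).
height = pd.Series([100.0, 101.0, 101.0, 103.0, 102.0])
delta = height.diff()

sign = np.sign(delta)      # NaN for the first row, then +1 / 0 / -1
strict_up = delta > 0      # boolean Series: strictly climbing rows only
non_strict_up = delta >= 0 # flat rows also count as part of a climb

print(pd.DataFrame({"sign": sign, "strict": strict_up, "non_strict": non_strict_up}))
```

Both comparisons stay pandas Series, so no conversion to arrays is needed, and the leading NaN simply compares as False.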

Pietro

Paul Hobson

May 17, 2018, 11:52:04 AM

to pyd...@googlegroups.com

Miki,

I had a similar problem delineating storm events (e.g., I know when it's raining, but what's the longest contiguous record?).

I got a really good answer on Stack Overflow:

https://stackoverflow.com/questions/22290793/fill-na-values-in-pandas-series-with-a-stop

You can see the final version in action here:

https://github.com/Geosyntec/cloudside/blob/master/cloudside/storms.py#L113

What I think you should note about that implementation for storm delineation is the notion of an inter-event period. In the context of precipitation, we typically use 6 hours. That means that if it rains, stops for 2 hours, then rains again, we'll typically call that a single storm (due to the lag of the hydrologic response of the watershed, but I digress). I bring this up because, as a cyclist, I don't think a flat spot or brief dip in the elevation profile should break up, e.g., a 2-hour slog uphill. Maybe you'll feel the same way ;)
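
Applied to climbs, that idea might look something like the sketch below. It is not the cloudside implementation: `climb_ids` and `max_gap` are hypothetical names, the gap is counted in rows rather than hours, and the heights are made up.

```python
import pandas as pd

def climb_ids(height, max_gap=1):
    """Label climbs, bridging pauses of at most max_gap rows (hypothetical helper)."""
    up = height.diff() > 0
    run_id = (up != up.shift()).cumsum()          # one id per run of equal values
    run_len = run_id.map(run_id.value_counts())   # length of the run each row is in
    # Only bridge pauses that have a climb both before and after them.
    up_i = up.astype(int)
    interior = (up_i.cumsum() > 0) & (up_i[::-1].cumsum()[::-1] > 0)
    bridged = up | (~up & (run_len <= max_gap) & interior)
    ids = (bridged != bridged.shift()).cumsum()
    return ids.where(bridged)                     # NaN outside climbs

# Hypothetical profile: a brief 0.5 dip inside a climb, then a real descent.
height = pd.Series([100.0, 102.0, 104.0, 103.5, 106.0, 108.0, 101.0, 100.0, 103.0])
ids = climb_ids(height, max_gap=1)
gain = height.diff().groupby(ids).sum()           # net gain per merged climb
rows = ids.value_counts()                         # rows per merged climb
print(gain.idxmax(), gain.max(), rows.max())
```

With `max_gap=1` the one-row dip is absorbed, so the first five steps form a single climb with a net gain of 8.0. With `max_gap=0` the same profile splits into separate climbs at the dip.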

-Paul


Miki Tebeka

May 20, 2018, 3:33:57 AM

to PyData

Hi Pietro,

Thanks.

> > How can I get the longest climb?
>
> .diff() (applied on the result of np.sign) will tell you if the sign
> changes
> np.where() will then tell you _where_ the sign changes
> diff() (or equivalent numpy indexing) on the result will tell you the
> length of the streaks
> You'll then just need to identify which are climbs and which are falls
> (by looking back at the result of the original diff())

Is there a way to do this vectorized? I can do a for loop over all the locations where the sign changes, but would like to avoid it.

> Notice that to limit switching back and forth between arrays and
> Series, you can replace "np.sign(x)" with "x > 0" as long as you are
> interested in "strict climbs".

Thanks.

Miki Tebeka

May 20, 2018, 3:34:28 AM

to PyData

Thanks! I'll have a look and try to understand the code.


Pietro Battiston

May 20, 2018, 4:24:38 AM

to pyd...@googlegroups.com

On Sun, 20/05/2018 at 00:33 -0700, Miki Tebeka wrote:
> [...]

>
> > > How can I get the longest climb?
> >
> > .diff() (applied on the result of np.sign) will tell you if the
> > sign
> > changes
> > np.where() will then tell you _where_ the sign changes
> > diff() (or equivalent numpy indexing) on the result will tell you
> > the
> > length of the streaks
> > You'll then just need to identify which are climbs and which are
> > falls
> > (by looking back at the result of the original diff())
>
> Is there a way to do this vectorized? I can do a for loop over all the
> locations where the sign changes, but would like to avoid it.
>

Sure! You can do something like
climbs = streaks[result_of_first_np_sign == 1]

But now that I think about it, there is probably a simpler way to solve
the entire problem. Let s be the original series:

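diff_sign = (s.diff() > 0).astype(int)                         # 1 where the series rises
streak_diff = diff_sign.diff().abs()                           # 1 where a streak starts or ends
streak_count = streak_diff.cumsum() * diff_sign                # streak id on climbs, 0 elsewhere
longest_streak = streak_count.value_counts().drop(0).index[0]  # id of the climb with most rows
streak_count[streak_count == longest_streak]                   # rows of that climb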
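
As a quick check, the same lines can be run on a small made-up series (the data and the final `gain` line are additions for illustration, not part of the snippet above). Note that "longest" here means the climb with the most rows, and that `.drop(0)` assumes at least one non-climbing row ends up labelled 0:

```python
import pandas as pd

# Hypothetical heights, not the values from the log in this thread.
s = pd.Series([100.0, 102.0, 105.0, 104.0, 101.0, 103.0, 106.0, 110.0, 109.0])

diff_sign = (s.diff() > 0).astype(int)           # 0 1 1 0 0 1 1 1 0
streak_diff = diff_sign.diff().abs()             # NaN 1 0 1 0 1 0 0 1
streak_count = streak_diff.cumsum() * diff_sign  # NaN 1 1 0 0 3 3 3 0
longest_streak = streak_count.value_counts().drop(0).index[0]
longest_rows = streak_count[streak_count == longest_streak]

# Summing the row-to-row differences over that streak gives the height gained.
gain = s.diff()[streak_count == longest_streak].sum()
print(longest_rows.index.tolist(), gain)
```

Here the longest climb is the streak labelled 3 (rows 5 to 7), with a total gain of 9.0.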

Pietro

Miki Tebeka

May 21, 2018, 4:06:23 AM

to PyData

Thanks! I'll study this code.
