Finding longest climb

Miki Tebeka

May 17, 2018, 1:19:14 AM
to PyData
Hi,

I have a log of a run:

                     time   latitude   longitue      height
0 2015-08-20 03:48:07.235  32.519585  35.015021  136.199997
1 2015-08-20 03:48:24.734  32.519606  35.014954  126.599998
2 2015-08-20 03:48:25.660  32.519612  35.014871  123.000000
3 2015-08-20 03:48:26.819  32.519654  35.014824  120.500000
4 2015-08-20 03:48:27.828  32.519689  35.014776  118.900002
...

I'd like to find the longest climb. I can easily find the height difference between rows with
df['height'] - df['height'].shift()

and can mark up/down with
np.sign(df['height'] - df['height'].shift())

How can I get the longest climb? groupby won't help, since it would lump all the ups together and all the downs together; I'd like to find the longest consecutive sequence of ups and then sum it.

Any ideas?

Thanks!
Miki


Pietro Battiston

May 17, 2018, 3:47:21 AM
to pyd...@googlegroups.com
On Wed, 16/05/2018 at 22:19 -0700, Miki Tebeka wrote:
> Hi,
>
> I have a log of a run:
>
>                      time   latitude   longitue      height
> 0 2015-08-20 03:48:07.235  32.519585  35.015021  136.199997
> 1 2015-08-20 03:48:24.734  32.519606  35.014954  126.599998
> 2 2015-08-20 03:48:25.660  32.519612  35.014871  123.000000
> 3 2015-08-20 03:48:26.819  32.519654  35.014824  120.500000
> 4 2015-08-20 03:48:27.828  32.519689  35.014776  118.900002
> ...
>
> I'd like to find the longest climb. I can easily find the height
> difference between rows with
> df['height'] - df['height'].shift()
>
> and can mark up/down with
> np.sign(df['height'] - df['height'].shift())
>
> How can I get the longest climb?

.diff(), applied to the result of np.sign, will tell you whether the sign
changes.
np.where() will then tell you _where_ the sign changes.
.diff() (or equivalent numpy indexing) on that result will give you the
length of each streak.
You'll then just need to identify which streaks are climbs and which are
falls (by looking back at the result of the original diff()).

Notice that to limit switching back and forth between arrays and
Series, you can replace "np.sign(x)" with "x > 0" as long as you are
interested in "strict climbs".
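
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the heights are made-up sample values (not the original log), the names `step`, `up`, `starts` and `lengths` are hypothetical, and "climb" means a strictly positive height change. It fails if the data contains no climb at all.

```python
import numpy as np
import pandas as pd

# Made-up sample heights, for illustration only
height = pd.Series([136.2, 126.6, 123.0, 125.0, 127.5, 131.0, 130.0, 132.0, 129.0])

step = height.diff().to_numpy()[1:]   # height change per step (leading NaN dropped)
up = step > 0                         # True where the step is a strict climb

# Positions where the up/down flag changes, i.e. where a new streak begins
change = np.where(np.diff(up.astype(int)) != 0)[0] + 1
starts = np.concatenate(([0], change))        # first step of each streak
ends = np.concatenate((change, [len(up)]))    # one past the last step of each streak
lengths = ends - starts                       # streak lengths, in steps

# Look back at the flags to tell climbs from falls
is_climb = up[starts]
climb_starts, climb_lengths = starts[is_climb], lengths[is_climb]

best = climb_lengths.argmax()                 # raises ValueError if there are no climbs
first, n = climb_starts[best], climb_lengths[best]
gain = step[first:first + n].sum()            # total height gained on that climb
print(first, n, gain)
```

Here `first` and `n` are counted in steps between rows, so the climb runs from row `first` to row `first + n` of the original data.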

Pietro

Paul Hobson

May 17, 2018, 11:52:04 AM
to pyd...@googlegroups.com
Miki,

I had a similar problem delineating storm events (e.g., I know when it's raining, but what's the longest contiguous record?).

I got a really good answer on stack overflow:

You can see the final version in action here:

What I think you should note about that implementation for storm delineation is the notion of an inter-event period. In the context of precipitation, we typically use 6 hours. That means that if it rains, stops for 2 hours, then rains again, we'll typically call that a single storm (due to the lag of the hydrologic response of the watershed, but I digress). I bring this up b/c as a cyclist, I don't think a flat spot or brief dip in the elevation profile should break up, e.g., a 2-hour slog uphill. Maybe you'll feel the same way ;)
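
To make the inter-event idea concrete for climbs, here is a minimal sketch. It is not the Stack Overflow implementation mentioned above: the function name `longest_climb_with_gaps` and the `max_gap` parameter are hypothetical, and the gap is measured in number of samples rather than in time, which is a simplifying assumption. A flat spot or dip of at most `max_gap` steps that sits between two climbing stretches does not end the climb.

```python
import numpy as np
import pandas as pd

def longest_climb_with_gaps(height, max_gap=1):
    """Longest climb in `height` (at least two samples), where interruptions of
    at most `max_gap` non-climbing steps do not end the climb.

    Returns (start_label, end_label, net_gain), or None if nothing climbs.
    """
    step = height.diff().to_numpy()[1:]              # change per step
    up = pd.Series(step > 0)                         # strict climbs only
    run_id = up.astype(int).diff().ne(0).cumsum()    # id for each run of equal flags
    run_len = run_id.map(run_id.value_counts())      # length of the run each step is in
    # Only gaps strictly between two climbing runs may be bridged
    interior = (run_id > run_id.iloc[0]) & (run_id < run_id.iloc[-1])
    bridged = up | (interior & (run_len <= max_gap))
    merged_id = bridged.astype(int).diff().ne(0).cumsum()
    true_runs = bridged[bridged].groupby(merged_id[bridged]).size()
    if true_runs.empty:
        return None
    idx = np.flatnonzero((merged_id == true_runs.idxmax()).to_numpy())
    first, last = idx[0], idx[-1]                    # first and last step of the climb
    gain = float(height.iloc[last + 1] - height.iloc[first])
    return height.index[first], height.index[last + 1], gain

# Made-up heights: a one-sample dip (104 -> 103) inside an otherwise steady climb
height = pd.Series([100.0, 102.0, 104.0, 103.0, 106.0, 109.0, 104.0, 100.0, 101.0])
print(longest_climb_with_gaps(height, max_gap=1))
print(longest_climb_with_gaps(height, max_gap=0))
```

With `max_gap=0` the function falls back to strict climbs. Note that a gap counted in samples will also bridge a single large drop, so a time-based or height-based threshold may suit real data better.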

-Paul


--
You received this message because you are subscribed to the Google Groups "PyData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Miki Tebeka

May 20, 2018, 3:33:57 AM
to PyData
Hi Pietro,

Thanks.

> > How can I get the longest climb?
>
> .diff() (applied on the result of np.sign) will tell you if the sign
> changes
> np.where() will then tell you _where_ the sign changes
> diff() (or equivalent numpy indexing) on the result will tell you the
> length of the streaks
> You'll then just need to identify which are climbs and which are falls
> (by looking back at the result of the original diff())

Is there a way to do this vectorized? I can do a for loop over all the locations where the sign changes, but I would like to avoid it.

> Notice that to limit switching back and forth between arrays and
> Series, you can replace "np.sign(x)" with "x > 0" as long as you are
> interested in "strict climbs".

Thanks.

Miki Tebeka

May 20, 2018, 3:34:28 AM
to PyData
Thanks! I'll have a look and try to understand the code.

Pietro Battiston

May 20, 2018, 4:24:38 AM
to pyd...@googlegroups.com
On Sun, 20/05/2018 at 00:33 -0700, Miki Tebeka wrote:
> [...]
>
> > > How can I get the longest climb? 
> >
> > .diff() (applied on the result of np.sign) will tell you if the
> > sign 
> > changes 
> > np.where() will then tell you _where_ the sign changes 
> > diff() (or equivalent numpy indexing) on the result will tell you
> > the 
> > length of the streaks 
> > You'll then just need to identify which are climbs and which are
> > falls 
> > (by looking back at the result of the original diff()) 
>
> Is there a way to do this vectorized? I can do a for loop on all the
> location where the sign changes but would like to avoid it.
>

Sure! You can do something like
climbs = streaks[result_of_first_np_sign == 1]

But now that I think about it, there is probably a simpler way to solve
the entire problem. Let s be the original series:

diff_sign = (s.diff() > 0).astype(int)
streak_diff = diff_sign.diff().abs()
streak_count = streak_diff.cumsum() * diff_sign
longest_streak = streak_count.value_counts().drop(0).index[0]
streak_count[streak_count == longest_streak]
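
For anyone who wants to run the snippet above, here is a self-contained sketch of it with the final summing step added. The heights are made-up sample values, `rows` and `total_gain` are names introduced here for illustration, and `errors="ignore"` is a small addition so that `drop(0)` does not fail on data with no non-climbing rows.

```python
import pandas as pd

# Made-up sample heights, for illustration only
s = pd.Series([136.2, 126.6, 123.0, 125.0, 127.5, 131.0, 130.0, 132.0, 129.0])

diff_sign = (s.diff() > 0).astype(int)           # 1 on rows that climbed, else 0
streak_diff = diff_sign.diff().abs()             # 1 wherever a streak starts or ends
streak_count = streak_diff.cumsum() * diff_sign  # streak id on climbing rows, 0 elsewhere

# value_counts() sorts by frequency, so the first remaining label is the
# climb with the most rows; label 0 (non-climbing rows) is discarded first
longest_streak = streak_count.value_counts().drop(0, errors="ignore").index[0]
rows = streak_count[streak_count == longest_streak].index

total_gain = s.diff().loc[rows].sum()            # height gained over that climb
print(list(rows), total_gain)
```

"Longest" here means the climb spanning the most rows, which is not necessarily the one with the largest height gain.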

Pietro

Miki Tebeka

May 21, 2018, 4:06:23 AM
to PyData
Thanks! I'll study this code.