Join without index after groupby

Mansour

Apr 25, 2024, 11:12:41 PM
to PyData
I am hoping to find a clean way to count the zeros between two non-zero rows in a specific column. I can do this in a loop, but I would prefer to take the opportunity to do it with groupby.
For the sake of the discussion, I have prepared an example:


import numpy as np
import pandas as pd

# Preparing the data: event_id is the input, exp_occ is the expected zero count
evnt_id = [1, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, -1, 0, 0, 1]
expctd = [0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 4, 0, 0, 2]

dt_rng = pd.date_range(start='1-1-2018', periods=len(evnt_id), freq='1h', name="time")
dt_idx = pd.DatetimeIndex(dt_rng)
arr = np.column_stack((evnt_id, expctd))
df = pd.DataFrame(arr, columns=['event_id', 'exp_occ'], index=dt_idx)
rng_idx = pd.RangeIndex(start=0, stop=len(evnt_id), step=1, name='seq')

data = df.copy()
# A new group starts at the first zero that follows a non-zero row
grp = (
    (data['event_id'] == 0) & (data['event_id'].shift() != 0)
).cumsum()

occurance_count = data.groupby(grp).agg({
    # "WHAT_TO_USE_FOR_INDEX": "idx",
    "event_id": "count"
})


This produces results that are close enough, but I see no way to join them back onto the original frame without some dirty hacks. How would others approach the problem?
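
One possible direction is groupby(...).transform(...), which returns one value per row instead of one per group, so the result stays aligned with the original index. The following is only a minimal sketch: it assumes the exp_occ column is the desired output and that the count should appear only on the non-zero rows. The names is_zero, zero_count, result and the "occ" column are introduced here for illustration.

```python
import pandas as pd

# Same sample data as above
evnt_id = [1, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, -1, 0, 0, 1]
expctd = [0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 4, 0, 0, 2]
idx = pd.date_range(start="2018-01-01", periods=len(evnt_id), freq="1h", name="time")
data = pd.DataFrame({"event_id": evnt_id, "exp_occ": expctd}, index=idx)

is_zero = data["event_id"] == 0
# Same grouping key as above: a new group starts at the first zero of each run,
# so each group holds a run of zeros plus the non-zero row that ends it.
grp = (is_zero & (data["event_id"].shift() != 0)).cumsum()

# transform returns one value per row, aligned with data.index (no join needed)
zero_count = is_zero.astype(int).groupby(grp).transform("sum")

# Assumption: the count should be reported only on the non-zero rows
result = zero_count.where(~is_zero, 0)
data["occ"] = result
```

Summing a zero indicator rather than counting rows avoids the off-by-one that "count" gives, since each group also contains the non-zero row that closes it.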

Thank you


