Converting DataFrame columns full of minutes and seconds


ryan h

May 14, 2013, 12:07:34 AM
to pyd...@googlegroups.com


So I have a dataframe like this:

df = pd.DataFrame({'A':[u'23:30', u'31:42']})

How do I convert the values to floats, dividing the seconds by 60 to turn them into a decimal fraction of a minute, so the output looks like this?

      A
   23.5
   31.7

I can't find anything in the book about converting just minutes and seconds. I only see material on dates.

Wouter Overmeire

May 14, 2013, 3:08:09 AM
to pyd...@googlegroups.com



2013/5/14 ryan h <ryan...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "PyData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pydata+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

In [34]: df = pd.DataFrame({'A':[u'23:30', u'31:42']})

In [35]: df
Out[35]:
       A
0  23:30
1  31:42

In [36]: def convert(x):
   ....:     m, s = x.split(':')
   ....:     return float(m) + float(s) / 60.0
   ....:

In [37]: convert('23:30')
Out[37]: 23.5

In [38]: df.applymap(convert)
Out[38]:
      A
0  23.5
1  31.7
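
As a possible alternative to the element-by-element function above, the same arithmetic can be sketched with pandas string methods. This assumes every value in the column has the form minutes:seconds with exactly one colon; the column name A_minutes is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'A': ['23:30', '31:42']})

# Split each "MM:SS" string into two columns (labelled 0 and 1),
# then cast both columns to float.
parts = df['A'].str.split(':', expand=True).astype(float)

# Minutes plus seconds expressed as a fraction of a minute.
df['A_minutes'] = parts[0] + parts[1] / 60.0

print(df)
```

Writing the result to a new column keeps the original strings around, which makes it easy to compare against the output of convert.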

ryan h

May 14, 2013, 6:23:40 PM
to pyd...@googlegroups.com

It works, but the actual frame has other columns that contain other data, so it won't work on the whole frame.

In [21]: df = pd.DataFrame({'A': [u'23:30', u'31:42'],
   ....:                    'B': [25.3, 456]})

In [22]: df
Out[22]:
       A      B
0  23:30   25.3
1  31:42  456.0

In [23]: df.applymap(convert)
AttributeError: ("'numpy.float64' object has no attribute 'split'", u'occurred at index B')

And I can't just convert the specific column:

In [24]: df['A'].applymap(convert)
AttributeError: 'Series' object has no attribute 'applymap'

I also tried pulling the column out first, but I get the same error:

In [25]: colA = df['A']

In [26]: colA.applymap(convert)
AttributeError: 'Series' object has no attribute 'applymap'


ryan h

May 15, 2013, 1:28:03 AM
to pyd...@googlegroups.com
I guess I just need to change the function so that it leaves the float data alone.  
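
Something like the following might do that. It is only a rough sketch, assuming Python 3 (where the u'...' values are plain str) and that any string with exactly one colon is a minutes:seconds value; the name convert_safe is made up for this example.

```python
def convert_safe(x):
    """Convert 'MM:SS' text to decimal minutes; return anything else unchanged."""
    # Only touch strings that look like minutes:seconds.
    if isinstance(x, str) and x.count(':') == 1:
        m, s = x.split(':')
        return float(m) + float(s) / 60.0
    # Floats, ints and other strings pass through untouched.
    return x


print(convert_safe('23:30'))  # a minutes:seconds string gets converted
print(convert_safe(25.3))     # a float is returned as-is
```

Passing this function to applymap in place of convert should leave the float column alone, though it would also silently skip malformed strings rather than raise on them.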

Wouter Overmeire

May 15, 2013, 3:30:08 AM
to pyd...@googlegroups.com



2013/5/15 ryan h <ryan...@gmail.com>


On a Series, apply (instead of applymap) must be used to process the values element by element.

In [6]: df['A'] = df['A'].apply(convert)

In [7]: df
Out[7]:
      A      B
0  23.5   25.3
1  31.7  456.0


Or select the columns in such a way that a DataFrame is always returned, so that applymap can be used. By using a list as the indexer, multiple columns can be converted in one go:

In [21]: df[['A']] = df[['A']].applymap(convert)

In [22]: df
Out[22]:
      A      B
0  23.5   25.3
1  31.7  456.0

Example for converting multiple columns (and leaving other columns untouched):
In [23]: df = pd.DataFrame({'A': [u'23:30', u'31:42'],
   ....:                    'B': [u'11:10', u'8:23'],
   ....:                    'C': ['foo', 'bar']}
   ....: )

In [24]: df
Out[24]:
       A      B    C
0  23:30  11:10  foo
1  31:42   8:23  bar

In [25]: cols = ['A', 'B']

In [26]: df[cols] = df[cols].applymap(convert)

In [27]: df
Out[27]:
      A          B    C
0  23.5  11.166667  foo
1  31.7   8.383333  bar
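
For readers on newer pandas releases: DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map. One way to avoid depending on either name is to apply Series.map column by column. The snippet below is a sketch of that idea using the same convert function and example frame as above.

```python
import pandas as pd


def convert(x):
    # Same helper as earlier in the thread: 'MM:SS' -> decimal minutes.
    m, s = x.split(':')
    return float(m) + float(s) / 60.0


df = pd.DataFrame({'A': ['23:30', '31:42'],
                   'B': ['11:10', '8:23'],
                   'C': ['foo', 'bar']})

cols = ['A', 'B']

# DataFrame.apply hands each selected column to the lambda as a Series;
# Series.map then runs convert on every element of that column.
df[cols] = df[cols].apply(lambda col: col.map(convert))

print(df)
```

Column C is never passed to convert, so its strings stay as they are.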




ryan h

May 15, 2013, 11:47:58 AM
to pyd...@googlegroups.com
That's what I needed.  Thank You.