Fwd: [pydata] Re: pandas: applying a function successively over rows

Peter Lubell-Doughtie

Aug 13, 2013, 1:14:22 PM
to bambo...@googlegroups.com

Wow, that is much faster. Time to reexamine our use of apply.

---------- Forwarded message ----------
From: "Jeff" <jeffr...@gmail.com>
Date: Aug 13, 2013 9:44 AM
Subject: [pydata] Re: pandas: applying a function successively over rows
To: <pyd...@googlegroups.com>
Cc: <public-pydata-/JYPxA39Uh5...@plane.gmane.org>

Please note that using a native operation is MUCH faster than apply:
In [5]: %timeit df.apply(lambda x: x+1)
1000 loops, best of 3: 1.04 ms per loop

In [6]: %timeit df+1
10000 loops, best of 3: 71.7 µs per loop
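
For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes a 100x10 float DataFrame like the one in this thread (the seed and the variable names are made up for illustration) and only checks that both spellings give the same values; the timings themselves will depend on your machine and pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical test frame: 100 rows x 10 float columns, seeded for repeatability.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 10)))

# Column-by-column application of a Python function.
via_apply = df.apply(lambda col: col + 1)

# The native (vectorized) operation on the whole frame.
native = df + 1

# Both approaches should produce identical values.
same = np.array_equal(via_apply.to_numpy(), native.to_numpy())
print(same)
```

Running `%timeit` on the two expressions in IPython, as above, then shows the speed difference on your own setup.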

Here is the docstring for expanding_apply; the function must reduce, i.e. take a 1-dim ndarray and produce a single value:

In [7]: pd.expanding_apply?
Type:       function
String Form:<function expanding_apply at 0x40aa488>
File:       /usr/local/lib/python2.7/site-packages/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/stats/moments.py
Definition: pd.expanding_apply(arg, func, min_periods=1, freq=None, center=False, time_rule=None)
Docstring:
Generic expanding function application

Parameters
----------
arg : Series, DataFrame
func : function
    Must produce a single value from an ndarray input
min_periods : int
    Minimum number of observations in window required to have a value
freq : None or string alias / date offset object, default=None
    Frequency to conform to before computing statistic
center : boolean, default False
    Whether the label should correspond with center of window
time_rule : Legacy alias for freq

Returns
-------
y : type of input argument

In [8]: pd.expanding_apply(df,lambda x: x.sum())
Out[8]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 10 columns):
0    100  non-null values
1    100  non-null values
2    100  non-null values
3    100  non-null values
4    100  non-null values
5    100  non-null values
6    100  non-null values
7    100  non-null values
8    100  non-null values
9    100  non-null values
dtypes: float64(10)
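
Note for readers on recent pandas releases: the top-level `pd.expanding_apply` used in this thread (pandas 0.12) was later deprecated and removed in favour of the `.expanding()` method. A rough equivalent, assuming a recent pandas and a small made-up frame, might look like this:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values chosen only to keep the sums easy to check).
df = pd.DataFrame(np.arange(12, dtype=float).reshape(4, 3))

# The function receives each expanding window as a 1-dim ndarray (raw=True)
# and must reduce it to a single value.
expanded = df.expanding().apply(lambda x: x.sum(), raw=True)

# For a plain sum, the built-in cumulative sum gives the same result.
running = df.cumsum()

print(expanded)
```

As with `df + 1` versus `apply`, the built-in (`cumsum`, or `expanding().sum()`) is the better choice when one exists; `apply` is for reductions that have no optimized version.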

On Tuesday, August 13, 2013 7:57:29 AM UTC-4, Timmie wrote:


> sure: df.apply(lambda x: x + 1) (or whatever)
df['A'].apply(lambda x: x + 1).head()

==> works very well.

> ewma should be done using these optimized functions (and many more
> goodies in there)
> http://pandas.pydata.org/pandas-docs/dev/computation.html#expanding-window-moment-functions
Thanks for that pointer!

pd.expanding_sum(df['A']).head()

==> works very well.


I tried:
func = (lambda x: x + 1)
pd.expanding_apply(df['A'], func).head()

It says:
TypeError: only length-1 arrays can be converted to Python scalars

What am I missing?
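
The error is consistent with the docstring quoted at the top of the thread: `func` must produce a single value from each window, while `lambda x: x + 1` returns an array as long as the window. The following is only a conceptual sketch of an expanding apply in plain NumPy, not pandas' actual implementation; `expanding_apply_sketch` is a hypothetical helper written for illustration.

```python
import numpy as np


def expanding_apply_sketch(values, func):
    """Conceptual stand-in for an expanding apply (not the pandas code)."""
    values = np.asarray(values, dtype=float)
    out = np.empty(len(values))
    for i in range(len(values)):
        # The window grows by one observation at each step.
        result = func(values[: i + 1])
        # Each window has to collapse to one scalar for the output slot.
        if np.ndim(result) != 0:
            raise TypeError("func must reduce each window to a single value")
        out[i] = result
    return out


# A reducing function works: one value per window.
sums = expanding_apply_sketch([1.0, 2.0, 3.0], lambda x: x.sum())

# An elementwise function does not reduce, so it is rejected.
try:
    expanding_apply_sketch([1.0, 2.0, 3.0], lambda x: x + 1)
    rejected = False
except TypeError:
    rejected = True
```

For an elementwise operation such as adding 1, no expanding window is needed at all; `df['A'] + 1` does the job directly.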
