Fwd: [pydata] Re: pandas: applying a function successively over rows

Peter Lubell-Doughtie

Aug 13, 2013, 1:14:22 PM

to bambo...@googlegroups.com

Wow, that is much faster. Time to reexamine our use of apply.

---------- Forwarded message ----------
From: "Jeff" <jeffr...@gmail.com>
Date: Aug 13, 2013 9:44 AM
Subject: [pydata] Re: pandas: applying a function successively over rows
To: <pyd...@googlegroups.com>
Cc: <public-pydata-/JYPxA39Uh5...@plane.gmane.org>

Please note that using a native operation is MUCH faster than apply:

In [5]: %timeit df.apply(lambda x: x+1)
1000 loops, best of 3: 1.04 ms per loop

In [6]: %timeit df+1
10000 loops, best of 3: 71.7 µs per loop
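The comparison above can be reproduced outside IPython with a minimal sketch like the following. The frame here is a hypothetical stand-in (100 rows by 10 float columns, matching the shape shown later in the thread), and the timings will differ by machine and pandas version, so no figures are claimed.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for the thread's df: 100 rows x 10 float64 columns.
df = pd.DataFrame(np.arange(1000, dtype="float64").reshape(100, 10))

# Both expressions add 1 to every element; apply calls the lambda per column.
via_apply = df.apply(lambda col: col + 1)
via_native = df + 1

# Time 100 repetitions of each approach (total seconds, not per-loop).
t_apply = timeit.timeit(lambda: df.apply(lambda col: col + 1), number=100)
t_native = timeit.timeit(lambda: df + 1, number=100)

print("same result:", via_apply.equals(via_native))
print("apply : %.6f s for 100 runs" % t_apply)
print("native: %.6f s for 100 runs" % t_native)
```

The native form hands the whole frame to one vectorized operation, while apply goes through Python once per column, which is the likely source of the gap reported above.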

Here is the docstring for expanding_apply. The function must reduce, i.e. take a 1-dim ndarray and produce a single value:

In [7]: pd.expanding_apply?
Type:        function
String Form: <function expanding_apply at 0x40aa488>
File:        /usr/local/lib/python2.7/site-packages/pandas-0.12.0-py2.7-linux-x86_64.egg/pandas/stats/moments.py
Definition:  pd.expanding_apply(arg, func, min_periods=1, freq=None, center=False, time_rule=None)
Docstring:
Generic expanding function application

Parameters
----------
arg : Series, DataFrame
func : function
    Must produce a single value from an ndarray input
min_periods : int
    Minimum number of observations in window required to have a value
freq : None or string alias / date offset object, default=None
    Frequency to conform to before computing statistic
center : boolean, default False
    Whether the label should correspond with center of window
time_rule : Legacy alias for freq

Returns
-------
y : type of input argument

In [8]: pd.expanding_apply(df,lambda x: x.sum())
Out[8]:
Int64Index: 100 entries, 0 to 99
Data columns (total 10 columns):
0    100  non-null values
1    100  non-null values
2    100  non-null values
3    100  non-null values
4    100  non-null values
5    100  non-null values
6    100  non-null values
7    100  non-null values
8    100  non-null values
9    100  non-null values
dtypes: float64(10)
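For readers on newer pandas releases, here is a hedged sketch of the same point. It assumes the method-style API, df.expanding().apply(...), which to my knowledge replaced pd.expanding_apply in later versions. The frame is a hypothetical stand-in for df. The non-reducing call is caught broadly because the exact exception may vary by version.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the thread's df: 100 rows x 10 float64 columns.
df = pd.DataFrame(np.arange(1000, dtype="float64").reshape(100, 10))

# Reducing function: each expanding window arrives as a 1-d ndarray
# (raw=True) and is mapped to one value, so this works.
expanding_total = df.expanding(min_periods=1).apply(lambda x: x.sum(), raw=True)

# An expanding sum is a running total, so it should agree with cumsum().
matches_cumsum = np.allclose(expanding_total.to_numpy(), df.cumsum().to_numpy())

# Non-reducing function: x + 1 returns an array as long as the window,
# which cannot be stored in a single output cell.
try:
    df[0].expanding(min_periods=1).apply(lambda x: x + 1, raw=True)
    error_name = None  # no exception raised on this pandas version
except Exception as exc:  # broad on purpose; exact type may vary by version
    error_name = type(exc).__name__

print("matches cumsum:", matches_cumsum)
print("non-reducing call raised:", error_name)
```

For an element-wise operation such as x + 1, no window function is needed at all; the native df['A'] + 1 does the job.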

On Tuesday, August 13, 2013 7:57:29 AM UTC-4, Timmie wrote:

> sure: df.apply(lambda x: x + 1) (or whatever)
df['A'].apply(lambda x: x + 1).head()

==> works very well.

> ewma should be done using these optimized functions (and many more
> goodies in there)
> http://pandas.pydata.org/pandas-docs/dev/computation.html#expanding-window-moment-functions
Thanks for that pointer!

pd.expanding_sum(df['A']).head()

==> works very well.

I tried:
func = (lambda x: x + 1)
pd.expanding_apply(df['A'], func).head()

It says:
TypeError: only length-1 arrays can be converted to Python scalars

What am I missing?
