Proposal: Using pandas as backend

Stefan Urbanek

Jul 27, 2012, 4:58:45 AM

to datab...@googlegroups.com

Hi,

Here is a proposal for using the Pandas data analysis library as one of the potential backends for Brewery:

http://blog.databrewery.org/post/28088920149

Note that a dependency on Pandas will also impose a dependency on numpy and a couple of other packages. I do not think that more dependencies are a good idea, so Pandas should be only an alternative backend. It will require a different approach to stream construction and a different Node interface.

What do you think?

Stefan

Twitter: @Stiivi

Home: http://stiivi.com

Brewery: http://databrewery.org

Github: https://github.com/Stiivi

Adrian Klaver

Jul 28, 2012, 10:23:58 AM

to datab...@googlegroups.com, Stefan Urbanek

On 07/27/2012 01:58 AM, Stefan Urbanek wrote:
> Hi,
>
> Here is a proposal for using the Pandas data analysis library as one of
> the potential backends for Brewery:
>
> http://blog.databrewery.org/post/28088920149

The blog page seems to be down. I cannot access the above post.

>
> Note that a dependency on Pandas will also impose a dependency on numpy
> and a couple of other packages. I do not think that more dependencies are
> a good idea, so Pandas should be only an alternative backend. It will
> require a different approach to stream construction and a different Node
> interface.
>
> What do you think?

The furthest I have gotten with Pandas is bookmarking the site for further
review. It looks interesting, so I could see integrating it. I will try
to delve deeper this weekend.

>
> Stefan

Thanks,

--
Adrian Klaver
adrian...@gmail.com

Stefan Urbanek

Jul 29, 2012, 1:38:28 PM

to datab...@googlegroups.com, Adrian Klaver

On 28.7.2012, at 16:23, Adrian Klaver <adrian...@gmail.com> wrote:

> On 07/27/2012 01:58 AM, Stefan Urbanek wrote:
>> Hi,
>>
>> Here is a proposal for using the Pandas data analysis library as one of
>> the potential backends for Brewery:
>>
>> http://blog.databrewery.org/post/28088920149
>
> The blog page seems to be down. I cannot access the above post.
>

It is hosted on Tumblr; refreshing a few times sometimes helps (or opening blog.databrewery.org first and then the article).

>>
>> Note that a dependency on Pandas will also impose a dependency on numpy
>> and a couple of other packages. I do not think that more dependencies are
>> a good idea, so Pandas should be only an alternative backend. It will
>> require a different approach to stream construction and a different Node
>> interface.
>>
>> What do you think?
>
> The furthest I have gotten with Pandas is bookmarking the site for further review. It looks interesting, so I could see integrating it. I will try to delve deeper this weekend.
>

That would be nice. Here is the backend idea explained:

http://yfrog.com/z/kl45czp

Basically, Brewery streams are a metadata-based description of workflow/thought-flow alternatives. The backend serves as the computational engine.
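
As a rough illustration of that split, here is a minimal sketch in Python. All names in it (`StreamDescription`, `ListBackend`, `run`, the `filter` and `select` operations) are hypothetical and are not Brewery's actual API; it only shows a stream kept as plain metadata and a separate backend that interprets it.

```python
from dataclasses import dataclass, field


@dataclass
class StreamDescription:
    """Hypothetical metadata-only description of a workflow; it computes nothing."""

    # Each step is a pair (operation name, parameters); names are illustrative.
    steps: list = field(default_factory=list)

    def add(self, operation, **params):
        self.steps.append((operation, params))
        return self  # allow chaining


class ListBackend:
    """Hypothetical backend that executes a description on lists of dicts."""

    def run(self, stream, rows):
        for operation, params in stream.steps:
            if operation == "filter":
                key, value = params["column"], params["equals"]
                rows = [r for r in rows if r[key] == value]
            elif operation == "select":
                rows = [{c: r[c] for c in params["columns"]} for r in rows]
            else:
                raise ValueError(f"unsupported operation: {operation}")
        return rows


# The description holds no data and no engine-specific code.
stream = (
    StreamDescription()
    .add("filter", column="country", equals="SK")
    .add("select", columns=["name"])
)

rows = [
    {"name": "a", "country": "SK", "amount": 10},
    {"name": "b", "country": "US", "amount": 20},
]
result = ListBackend().run(stream, rows)
```

Under these assumptions, a Pandas-based backend would be another class with the same `run` method that interprets the same `steps` using data frames instead of lists.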

I am going to be offline for most of next week and will be back on the 5th. We can discuss it afterwards.

Naveen Michaud-Agrawal

Sep 10, 2012, 9:26:24 AM

to datab...@googlegroups.com, Adrian Klaver

Hi Stefan,

I think this is a great idea! Has any progress been made on using pandas as a backend?

Naveen

Stefan Urbanek

Sep 13, 2012, 2:20:37 AM

to datab...@googlegroups.com

Hi Naveen,

I haven't had much time to implement that yet, unfortunately. However, I found another alternative: carray

https://github.com/francescAlted/carray/

It is a nice small framework that provides a fast column-based storage layer with persistence. The implementation would be simpler than with Pandas, while still providing structures for algorithms that work with numpy arrays:

http://cl.ly/image/271k0d1k0815
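
To make the idea of a persistent column-based layer concrete, here is a minimal sketch using plain numpy only. It does not use carray's API; the column names and the `.npz` file are assumptions made purely for illustration.

```python
import os
import tempfile

import numpy as np

# Column-oriented layout: one array per field instead of one record per row.
columns = {
    "year": np.array([2010, 2011, 2012, 2012]),
    "amount": np.array([10.0, 20.0, 30.0, 40.0]),
}

# Persist all columns to one file (a stand-in for a real on-disk column store).
path = os.path.join(tempfile.mkdtemp(), "table.npz")
np.savez(path, **columns)

# Reload the columns and run a numpy-based computation directly on them.
with np.load(path) as stored:
    loaded = {name: stored[name] for name in stored.files}

mask = loaded["year"] == 2012
total_2012 = float(loaded["amount"][mask].sum())
```

The point of the layout is that an algorithm touching only `year` and `amount` reads just those two arrays, and any function that accepts numpy arrays can work on them unchanged.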

What do you think?

s.

Stefan Urbanek

data analyst and data brewmaster

Naveen Michaud-Agrawal

Oct 4, 2012, 11:26:30 AM

to datab...@googlegroups.com

Carray looks like a good alternative for persistence. I was thinking that pandas would help in implementing some of the higher-level backend interface, since it has very fast grouping and filtering.
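
For example, filtering and grouping in pandas can look like the short sketch below. The data and column names are made up for illustration, and how these operations would map onto Brewery nodes is an assumption, not an existing design.

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
frame = pd.DataFrame(
    {
        "category": ["a", "b", "a", "b"],
        "amount": [10, 20, 30, 40],
    }
)

# Filtering: keep only rows that match a condition (what a filter node might do).
large = frame[frame["amount"] >= 20]

# Grouping: aggregate the remaining rows per category (an aggregate node).
totals = large.groupby("category")["amount"].sum()
```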

Naveen
