Avoiding a spurious RuntimeWarning in functions.rescaleData


Antony Lee

Sep 2, 2013, 10:07:25 PM
to pyqt...@googlegroups.com
Hi,

I would suggest patching the last few lines of rescaleData as follows:

-        d2 = data-offset
-        d2 *= scale
-        data = d2.astype(dtype)
+        data = (data.astype(dtype) - offset) * scale

The reason is that d2 *= scale will sometimes trigger a RuntimeWarning, e.g. if data happens to be a uint16 array and is multiplied in place by a large scale (a float64), even though we want to cast the data to float64 immediately afterwards anyway.
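
A minimal standalone sketch of the proposed fix (the data, offset and scale values here are made up for illustration):

```python
import numpy as np

# Made-up values illustrating the scenario: integer input data and a
# large float64 scale factor.
data = np.array([0, 1000, 65535], dtype=np.uint16)
offset, scale = 500, 1e6

# Casting to float64 up front avoids any in-place integer arithmetic,
# so no overflow warning can occur.
rescaled = (data.astype(np.float64) - offset) * scale
```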

As a side note, while I haven't tried it, I would guess that the weave implementation of rescaleData can't be faster than a simple numpy expression ("(data.astype(dtype) - offset) * scale") if it must invoke a compiler every time...

Antony

Antony Lee

Sep 4, 2013, 6:57:28 PM
to pyqt...@googlegroups.com, anton...@berkeley.edu

Luke Campagnola

Sep 7, 2013, 12:10:01 AM
to pyqt...@googlegroups.com
On Mon, Sep 2, 2013 at 10:07 PM, Antony Lee <anton...@berkeley.edu> wrote:
Hi,

I would suggest patching the last few lines of rescaleData as follows:

-        d2 = data-offset
-        d2 *= scale
-        data = d2.astype(dtype)
+        data = (data.astype(dtype) - offset) * scale

Thanks, Antony!
I am currently preparing 0.9.8 and will include this fix.
 
The reason is that d2 *= scale will sometimes trigger a RuntimeWarning, e.g. if data happens to be a uint16 array and is multiplied in place by a large scale (a float64), even though we want to cast the data to float64 immediately afterwards anyway.

As a side note, while I haven't tried it, I would guess that the weave implementation of rescaleData can't be faster than a simple numpy expression ("(data.astype(dtype) - offset) * scale") if it must invoke a compiler every time...

Weave is significantly faster--it only invokes the compiler once per dtype and stores the compiled modules to disk.

I am hoping to replace weave with a small library of pre-built C functions that would support most platforms while not requiring any compiler (and still falling back to pure python when necessary). Not yet sure how feasible this is..


Luke

Guillaume Poulin

Sep 7, 2013, 2:59:26 AM
to pyqt...@googlegroups.com


Le samedi 7 septembre 2013 12:10:01 UTC+8, Luke Campagnola a écrit :

I am hoping to replace weave with a small library of pre-built C functions that would support most platforms while not requiring any compiler (and still falling back to pure python when necessary). Not yet sure how feasible this is..


It's easily feasible. It's what I do in one of my projects: https://github.com/gpoulin/pybeem/tree/folding. I have two modules, beem.experiment._pure_python and beem.experiment._pure_c. The import is done in beem/experiment/experiment.py: if importing beem.experiment._pure_c fails, beem.experiment._pure_python is used instead.
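
The pattern can be sketched like this (the module name _pure_c is modeled on the pybeem layout above and is hypothetical here; the list comprehension stands in for the real pure-Python implementation):

```python
# Hypothetical compiled-extension name, modeled on the pybeem layout;
# if the C module cannot be imported, fall back to pure Python.
try:
    from beem.experiment import _pure_c as _impl
except ImportError:
    _impl = None

def rescale(values, scale, offset):
    """Dispatch to the C implementation when available."""
    if _impl is not None:
        return _impl.rescale(values, scale, offset)
    return [(v - offset) * scale for v in values]  # pure-Python fallback
```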

Guillaume Poulin

Sep 7, 2013, 3:19:53 AM
to pyqt...@googlegroups.com
However, distribution starts to become more complex, because you need a binary for each target platform and each target Python. I would suggest distributing binaries only for Windows and relying on the system compiler for all *nix with a simple "python setup.py build" (it also lets the user easily set custom CFLAGS). Already for Windows, that's 4 binaries: win32 Python 2.7, win32 Python 3.3, win64 Python 2.7, win64 Python 3.3.

So in the end, if you want to remove the weave code just for the sake of removing weave, I'm not sure it's worth it.

Luke Campagnola

Sep 7, 2013, 11:09:18 AM
to pyqt...@googlegroups.com
On Sat, Sep 7, 2013 at 3:19 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:

Le samedi 7 septembre 2013 14:59:26 UTC+8, Guillaume Poulin a écrit :

Le samedi 7 septembre 2013 12:10:01 UTC+8, Luke Campagnola a écrit :

I am hoping to replace weave with a small library of pre-built C functions that would support most platforms while not requiring any compiler (and still falling back to pure python when necessary). Not yet sure how feasible this is..

It's easily feasible. It's what I do in one of my project: https://github.com/gpoulin/pybeem/tree/folding. I have two module beem.experiment._pure_python and beem.experiment._pure_c. The importation is done in beem/experiment/experiment.py. If the importation of beem.experiment._pure_c fails, beem.experiment._pure_python is used instead.

However, distribution starts to become more complex, because you need a binary for each target platform and each target Python. I would suggest distributing binaries only for Windows and relying on the system compiler for all *nix with a simple "python setup.py build" (it also lets the user easily set custom CFLAGS).

This was my main concern--how many different platforms would I need to compile for? If I remember correctly, OSX does not have a compiler by default. Many Linux distributions have also elected to leave out the compilers. For individual developers this is not much of a problem, but if you want to distribute an application to many users then it will be important to support many platforms.
 
Already for Windows, that's 4 binaries: win32 Python 2.7, win32 Python 3.3, win64 Python 2.7, win64 Python 3.3.

My plan is to use ctypes to access the libraries, so a single module per OS/architecture will work for any version of python. It also makes it much more likely that users will be able to compile their own modules, since the C files can be very simple with no dependencies on python.
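
A sketch of the ctypes approach (using libm's sqrt as a stand-in, since the prebuilt pyqtgraph library doesn't exist yet; the "libm.so.6" fallback name assumes a glibc system):

```python
import ctypes
import ctypes.util

# Locate and load a prebuilt shared library by name; no compiler and no
# Python headers are needed, so one binary per OS/arch serves every
# Python version.
path = ctypes.util.find_library("m")
libm = ctypes.CDLL(path if path else "libm.so.6")

# Declaring the signature is all the "binding" ctypes requires.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]
```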
 

Luke

Antony Lee

Sep 10, 2013, 11:08:57 PM
to pyqt...@googlegroups.com

Indeed I misunderstood the way weave.inline works.  Still...

import time, timeit, numpy as np
from scipy import weave

def rescale_np(data, scale, offset):
    return (data - offset) * scale

def rescale_weave(data, scale, offset): # basically copied from rescaleData
    size = data.size
    input = np.ascontiguousarray(data).reshape(size)
    output = np.empty((size,), dtype=data.dtype)
    code = "for (int i = 0; i < size; ++i) { output[i] = ((double)input[i] - offset) * scale; }"
    weave.inline(code, ["size", "input", "output", "offset", "scale"], compiler="gcc")
    return output.reshape(data.shape)

if __name__ == "__main__":
    code = "rescale_weave(np.random.random((512,)), 10, 1)"
    number = 100000
    timer = time.clock # process time
    print(timeit.repeat("np.random.random((512,))", setup="import numpy as np",
                        number=number, timer=timer)) # approx. time required to generate the data
    print(timeit.repeat(code.replace("weave", "np"),
                        setup="import numpy as np; from __main__ import rescale_np",
                        number=number, timer=timer)) # numpy version
    print(timeit.repeat(code,
                        setup="import numpy as np; from scipy import weave; "
                        "from __main__ import rescale_weave; " + code,
                        number=number, timer=timer)) # weave version, calling the function once during setup so that compilation time is subtracted

Output:

[1.0582530498504639, 1.0574111938476562, 1.0580999851226807] # data generation
[2.238456964492798, 2.2196199893951416, 2.217461109161377] # numpy
[2.6102700233459473, 2.6073598861694336, 2.6155309677124023] # weave

Once you subtract the time spent calling np.random.random, it seems that the weave solution is ~33% slower... or did I miss something?

Versions used:

$ python2 --version; python2 -c "import numpy, scipy; print numpy.__version__, scipy.__version__"
Python 2.7.5
1.7.1 0.12.0

Jochen Schröder

Sep 10, 2013, 11:38:18 PM
to pyqt...@googlegroups.com
I'm not really surprised by this. Beating vectorized numpy calculations
is actually really difficult. For a program of mine I was trying to
optimise some functions using Cython. That code actually had some for
loops in it, which is usually what really slows down numpy code. But
once I played some indexing tricks, my numpy code was actually
significantly faster than the compiled Cython code (which was probably
not fully optimised, but still). BTW I highly recommend line_profiler and
kernprof (http://pythonhosted.org/line_profiler/) for profiling this sort
of thing.

Cheers
Jochen

Luke Campagnola

Sep 11, 2013, 1:09:16 AM
to pyqt...@googlegroups.com
My experience with this has been a bit different--often a simple C routine is much faster, especially when you are performing multiple operations in serial. For example:

>>> timeit("((((np.arange(1000000, dtype=int)+5)*7)%10)-2).astype(np.ubyte)", setup="import numpy as np", number=100)
5.4135632038116455

Compare to this C program:

#include <stdlib.h>
int main() {
    int i;
    char* data = malloc(100000000);
    for( i=0; i<100000000; i++) {
        data[i] = (((i + 5) * 7) % 10) - 2;
    }
    free(data);
    return 0;
}

$ gcc test.c && time ./a.out
real    0m0.897s
user    0m0.800s
sys     0m0.080s

So in this example, the C code is over 6x faster. I suspect a few reasons: 
 - Data is generated on the CPU and makes only one trip out to memory, whereas numpy must visit the complete memory block for each operation.
 - The C code increments i only once per final value, whereas numpy must increment a counter for each operation.
 - Numpy performs more memory allocations (and thus is also less memory-efficient; note that I could not even run the 100MB test in python without going to swap, hence the 1MBx100 instead). 
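
The temporaries point can be seen in numpy itself: each chained operation allocates a fresh array, which in-place variants avoid (a sketch of the allocation behaviour, not a timing claim):

```python
import numpy as np

i = np.arange(1000000, dtype=int)

# Chained expression: every intermediate (+, *, %, -) allocates a new
# temporary array before the final cast.
out_chained = ((((i + 5) * 7) % 10) - 2).astype(np.ubyte)

# Reusing one buffer with in-place operators cuts those allocations.
tmp = i + 5
tmp *= 7
tmp %= 10
tmp -= 2
out_inplace = tmp.astype(np.ubyte)
```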
 

Luke

Guillaume Poulin

Sep 11, 2013, 1:10:05 AM
to pyqt...@googlegroups.com
You don't need to reshape the array to use weave; that's where you lose precious milliseconds. Using the code below, I'm getting

[0.6200000000000001, 0.6099999999999999, 0.6200000000000001] # data generation
[1.3499999999999996, 1.3500000000000005, 1.3499999999999996] # numpy
[1.1100000000000003, 1.1199999999999992, 1.120000000000001] # weave

import time, timeit, numpy as np
from scipy import weave

def rescale_np(data, scale, offset):
    return (data - offset) * scale

def rescale_weave(data, scale, offset): # basically copied from rescaleData
    size = data.size
    output = np.empty(data.shape, dtype=data.dtype)
    code = r"""
        for (int i = 0; i < size; ++i) {
            output[i] = ((double)data[i] - offset) * scale;
        }"""
    weave.inline(code, ["data", "size", "output", "offset", "scale"],
                 compiler="gcc", extra_compile_args=['-march=native -mtune=native -O3'])
    return output

if __name__ == "__main__":
    code = "rescale_weave(np.random.random((25,2,10)), 10, 1)"
    number = 100000
    timer = time.clock # process time
    print(timeit.repeat("np.random.random((25,2,10))", setup="import numpy as np",
                        number=number, timer=timer)) # approx. time required to generate the data
    print(timeit.repeat(code.replace("weave", "np"),
                        setup="import numpy as np; from __main__ import rescale_np",
                        number=number, timer=timer)) # numpy version
    print(timeit.repeat(code,
                        setup="import numpy as np; from scipy import weave; "
                        "from __main__ import rescale_weave; " + code,
                        number=number, timer=timer)) # weave version, calling the function once during setup so that compilation time is subtracted

After this you could add some OpenMP to boost the weave code even more. You could use something like:

import time, timeit, numpy as np
from scipy import weave

def rescale_np(data, scale, offset):
    return (data - offset) * scale

def rescale_weave(data, scale, offset): # basically copied from rescaleData
    size = data.size
    output = np.empty(data.shape, dtype=data.dtype)
    code = r"""
        #pragma omp parallel for
        for (int i = 0; i < size; ++i) {
            output[i] = ((double)data[i] - offset) * scale;
        }"""
    weave.inline(code, ["data", "size", "output", "offset", "scale"],
                 compiler="gcc", extra_compile_args=['-march=native -mtune=native -O3 -fopenmp'],
                 headers=['<omp.h>'], extra_link_args=['-lgomp'])
    return output

if __name__ == "__main__":
    code = "rescale_weave(np.random.random((25,2,10)), 10, 1)"
    number = 100000
    timer = time.clock # process time
    print(timeit.repeat("np.random.random((25,2,10))", setup="import numpy as np",
                        number=number, timer=timer)) # approx. time required to generate the data
    print(timeit.repeat(code.replace("weave", "np"),
                        setup="import numpy as np; from __main__ import rescale_np",
                        number=number, timer=timer)) # numpy version
    print(timeit.repeat(code,
                        setup="import numpy as np; from scipy import weave; "
                        "from __main__ import rescale_weave; " + code,
                        number=number, timer=timer)) # weave version, calling the function once during setup so that compilation time is subtracted

I don't have results for this code because I have difficulties with some parallel code inside Python on my system (maybe because I have linked numpy with OpenBLAS). It should make the code faster but more difficult to compile, which I think is a good argument against using OpenMP, given pyqtgraph's objective of being easy to install.

Luke Campagnola

Sep 11, 2013, 1:14:16 AM
to pyqt...@googlegroups.com
On Tue, Sep 10, 2013 at 11:08 PM, Antony Lee <anntz...@gmail.com> wrote:

On Friday, September 6, 2013 9:10:01 PM UTC-7, Luke Campagnola wrote:
On Mon, Sep 2, 2013 at 10:07 PM, Antony Lee <anton...@berkeley.edu> wrote:
 
The reason is that d2 *= scale will sometimes trigger a RuntimeWarning, e.g. if data happens to be a uint16 array and is multiplied in place by a large scale (a float64), even though we want to cast the data to float64 immediately afterwards anyway.

As a side note, while I haven't tried it, I would guess that the weave implementation of rescaleData can't be faster than a simple numpy expression ("(data.astype(dtype) - offset) * scale") if it must invoke a compiler every time...

Weave is significantly faster--it only invokes the compiler once per dtype and stores the compiled modules to disk.

I am hoping to replace weave with a small library of pre-built C functions that would support most platforms while not requiring any compiler (and still falling back to pure python when necessary). Not yet sure how feasible this is..

Indeed I misunderstood the way weave.inline works.  Still...

[ snip ] 
Output:

[1.0582530498504639, 1.0574111938476562, 1.0580999851226807] # data generation
[2.238456964492798, 2.2196199893951416, 2.217461109161377] # numpy
[2.6102700233459473, 2.6073598861694336, 2.6155309677124023] # weave

Once you substract the time spent calling np.random.random it seems that the weave solution is ~33% slower... or did I miss something?

Usually rescaleData includes a type conversion to uint8 or uint16 as well. If you add that in, then the situation reverses--I get 2.5 for numpy and 2.3 for weave. Still, the difference is much smaller than I remember it being in the past. I suspect numpy has been optimized since the last time I profiled that code, which was quite a long time ago. I'll definitely consider removing weave altogether  :)
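
For reference, the extra conversion just appends a cast to the numpy expression (the offset and scale values here are arbitrary):

```python
import numpy as np

data = np.random.random((512,))
offset, scale = 0.0, 255.0

# Rescale in float, then convert down to the LUT-index dtype.
out = ((data - offset) * scale).astype(np.uint8)
```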


Luke

Guillaume Poulin

Sep 11, 2013, 1:22:34 AM
to pyqt...@googlegroups.com
Another argument against weave:

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    from scipy import weave
  File "/usr/lib64/python3.3/site-packages/scipy/weave/__init__.py", line 23, in <module>
    raise ImportError("scipy.weave only supports Python 2.x")
ImportError: scipy.weave only supports Python 2.x


Luke Campagnola

Sep 11, 2013, 1:29:38 AM
to pyqt...@googlegroups.com
On Wed, Sep 11, 2013 at 1:10 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:
You don't need to reshape the array to use weave; that's where you lose precious milliseconds. Using the code below, I'm getting

[0.6200000000000001, 0.6099999999999999, 0.6200000000000001] # data generation
[1.3499999999999996, 1.3500000000000005, 1.3499999999999996] # numpy
[1.1100000000000003, 1.1199999999999992, 1.120000000000001] # weave

That's also very interesting... I assumed that reshaping was basically free, since it should only involve computing new strides. I'll have a closer look at this.
I note that you removed the call to ascontiguousarray, which might be the most expensive part and was required to fix [some bug I don't remember].

 

Guillaume Poulin

Sep 11, 2013, 1:43:45 AM
to pyqt...@googlegroups.com
By adding data = np.ascontiguousarray(data), I get:

[0.6100000000000001, 0.6199999999999999, 0.6099999999999999]
[1.3600000000000003, 1.3499999999999996, 1.3600000000000003]
[1.25, 1.25, 1.25]

and with the reshape and ascontiguousarray

[0.6200000000000001, 0.6499999999999997, 0.6300000000000003]
[1.4099999999999997, 1.37, 1.3600000000000003]
[1.509999999999999, 1.5, 1.5200000000000014]

The weave code is slower but still faster than numpy alone with ascontiguousarray. So the reshape seems to have some cost; I'm also surprised, I thought it just updated some metadata on the array. I also added extra_compile_args=['-march=native -mtune=native -O3'], which is an easy way to speed up the weave code.
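
For what it's worth, reshaping a C-contiguous array does return a view rather than a copy, so the cost presumably comes from per-call overhead rather than data movement:

```python
import numpy as np

a = np.random.random((25, 2, 10))
b = a.reshape(a.size)          # view with a new shape and strides
assert np.shares_memory(a, b)  # no data was copied
```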

Antony Lee

Sep 11, 2013, 2:27:50 AM
to pyqt...@googlegroups.com
In the "who can speed up numpy code" competition, I realized that we can avoid one array allocation (for a big gain) as follows...

import time
import timeit

import numpy as np
from scipy import weave

def rescale_np(data, scale, offset):
    return (data - offset) * scale # allocates two arrays

def rescale_np_savealloc(data, scale, offset):
    aux = data - offset
    aux *= scale # reuses the first array
    return aux

def rescale_weave(data, scale, offset):

    size = data.size
    input = np.ascontiguousarray(data)
    output = np.empty((size,), dtype=data.dtype)
    code = "for (int i = 0; i < size; ++i) { output[i] = ((double)input[i] - offset) * scale; }"
    weave.inline(code, ["size", "input", "output", "offset", "scale"],
                 compiler="gcc", extra_compile_args=["-march=native -mtune=native -O3"])
    return output.reshape(data.shape)

if __name__ == "__main__":
    code = "rescale_{0}(np.random.random((512, 512)), 10, 1)"
    setup = "import numpy as np; from __main__ import rescale_{0}; " + code
    number = 100
    timer = time.clock
    print(timeit.repeat("np.random.random((512, 512))", setup="import numpy as np",
                        number=number, timer=timer))
    print(timeit.repeat(code.format("np"), setup=setup.format("np"),
                        number=number, timer=timer))
    print(timeit.repeat(code.format("np_savealloc"), setup=setup.format("np_savealloc"),
                        number=number, timer=timer))
    print(timeit.repeat(code.format("weave"), setup=setup.format("weave"),
                        number=number, timer=timer))

...
[0.493684, 0.48480900000000005, 0.490529]
[1.103758, 1.1512509999999998, 1.1011769999999999] # naive numpy
[0.9201769999999998, 0.9463699999999999, 0.9118899999999996] # quite some gains...
[0.8682890000000008, 0.8683599999999991, 0.8692340000000005] # weave

Also, you cannot remove ascontiguousarray, which seems to be a no-op for arrays that are already C-contiguous anyway (try t = np.random.random((512, 512)); np.ascontiguousarray(t) is t ==> True). If the array is not contiguous then the straight C loop doesn't work as expected...

t = np.random.random((5, 5))
u = t[:, ::2]
print(u.flags)
print(rescale_np(u, 10, 1))
print(rescale_weave(u, 10, 1))
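
The no-op claim, and the copy for strided views, are easy to check standalone:

```python
import numpy as np

t = np.random.random((512, 512))
assert np.ascontiguousarray(t) is t  # no-op for contiguous input

u = t[:, ::2]                        # strided view
assert not u.flags["C_CONTIGUOUS"]
c = np.ascontiguousarray(u)          # copies only when necessary
assert c.flags["C_CONTIGUOUS"]
```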

Oh and I don't use Python 2 at all anymore so all this is quite theoretical for me anyways :-)

Antony


2013/9/10 Guillaume Poulin <poulin.g...@gmail.com>

--
-- [ You are subscribed to pyqt...@googlegroups.com. To unsubscribe, send email to pyqtgraph+...@googlegroups.com ]
---
You received this message because you are subscribed to the Google Groups "pyqtgraph" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pyqtgraph+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Antony Lee

Sep 11, 2013, 3:19:07 PM
to pyqt...@googlegroups.com
It's not over yet... as Luke mentioned, the most common (only? I don't know) use case is to use the rescaled data as indices into a LUT of size 256... so what happens if we just set the dtype of the output array properly? See https://gist.github.com/anntzer/6528382
Now weave creates output as a char* and gcc magic outputs much faster code:
0.70 µs # generation (yes, I'm just using empty now, but the results below are similar using random.random)
5463.93 µs # naive numpy
3330.78 µs # numpy saving one allocation
475.17 µs # weave

Antony

Guillaume Poulin

Sep 11, 2013, 9:25:23 PM
to pyqt...@googlegroups.com, anton...@berkeley.edu
I did some tests for fun with a Python module written in C. I profiled rescaling a matrix of shape (512, x) and looked at the performance of the different implementations. I attached the results as graphs. For small x, the C module outperforms all other implementations. With bigger x, weave gives the same results as the C module.

I haven't tried it yet. I will see what it gives for me and with my C module.
big.png
small.png

Luke Campagnola

Sep 11, 2013, 10:39:50 PM
to pyqt...@googlegroups.com
This is great!
I recommend we move development discussions over to github. I opened an issue here:


Luke

Guillaume Poulin

Sep 12, 2013, 12:50:10 AM
to pyqt...@googlegroups.com
You confuse me. Where should we put the pull request? At https://github.com/lcampagn/pyqtgraph or https://github.com/pyqtgraph/inp? Is there a reason to create a new project and not simply a new branch for inp?

Luke Campagnola

Sep 12, 2013, 1:01:15 AM
to pyqt...@googlegroups.com
On Thu, Sep 12, 2013 at 12:50 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:
Le jeudi 12 septembre 2013 10:39:50 UTC+8, Luke Campagnola a écrit :

This is great!
I recommend we move development discussions over to github. I opened an issue here:


Luke

You confuse me. Where should we put the pull request? At  https://github.com/lcampagn/pyqtgraph or https://github.com/pyqtgraph/inp?

I think perhaps a new topic branch within github.com/pyqtgraph/inp would be best, but I'm not totally clear on github's intended organization..
 
Is there a reason to create a new project and not simply a new branch for inp?

It just seemed like the thing to do :)
Creating a project allows the possibility of more team-management in the future, and gives pyqtgraph an unambiguous canonical repository.


Guillaume Poulin

Sep 12, 2013, 1:38:14 AM
to pyqt...@googlegroups.com
So you want to use pyqtgraph/inp for development and lcampagn/pyqtgraph for mirroring? I just have the impression that lcampagn/pyqtgraph is not so useful. Creating a project pyqtgraph/pyqtgraph with a branch "stable" and a branch "inp" would have done the trick. But you are the boss and... I don't really care as long as I know where to push. So I'm gonna fork pyqtgraph/inp and push back there.

Guillaume Poulin

Sep 12, 2013, 1:39:16 AM
to pyqt...@googlegroups.com
By the way, you can delete lcampagn/test

Luke Campagnola

Sep 12, 2013, 2:13:49 AM
to pyqt...@googlegroups.com
On Thu, Sep 12, 2013 at 1:38 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:


Le jeudi 12 septembre 2013 13:01:15 UTC+8, Luke Campagnola a écrit :
On Thu, Sep 12, 2013 at 12:50 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:

You confuse me. Where should we put the pull request? At  https://github.com/lcampagn/pyqtgraph or https://github.com/pyqtgraph/inp?

I think perhaps a new topic branch within github.com/pyqtgraph/inp would be best, but I'm not totally clear on github's intended organization..
 
Is there a reason to create a new project and not simply a new branch for inp?

It just seemed like the thing to do :)
Creating a project allows the possibility of more team-management in the future, and gives pyqtgraph an unambiguous canonical repository.

So you want to use pyqtgraph/inp for the development and lcampagn/pyqtgraph for mirroring? I just have the impression that lcampagn/pyqtgraph is not so useful.

I think lcampagn/pyqtgraph will stick around as my personal repository, where I can push junk without it going into the main branches. If it turns out to be not useful, then I'll delete it to avoid more confusion.
 
Creating a project pyqtgraph/pyqtgraph with a branch "stable" and a branch "inp" would have done the trick. But you are the boss and... I don't really care as long as I know where to push. So I'm gonna fork pyqtgraph/inp and push back there.
 
With this I was just trying to mimic the current dev/inp structure that exists on launchpad, but if that's counterintuitive then I'm perfectly happy to go with your suggestion. I'm really just making a lot of guesses here; probably I need to find some other well-organized projects to compare to.



Guillaume Poulin

Sep 12, 2013, 3:03:54 AM
to pyqt...@googlegroups.com


Le jeudi 12 septembre 2013 14:13:49 UTC+8, Luke Campagnola a écrit :



On Thu, Sep 12, 2013 at 1:38 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:


Le jeudi 12 septembre 2013 13:01:15 UTC+8, Luke Campagnola a écrit :
On Thu, Sep 12, 2013 at 12:50 AM, Guillaume Poulin <poulin.g...@gmail.com> wrote:

You confuse me. Where should we put the pull request? At  https://github.com/lcampagn/pyqtgraph or https://github.com/pyqtgraph/inp?

I think perhaps a new topic branch within github.com/pyqtgraph/inp would be best, but I'm not totally clear on github's intended organization..
 
Is there a reason to create a new project and not simply a new branch for inp?

It just seemed like the thing to do :)
Creating a project allows the possibility of more team-management in the future, and gives pyqtgraph an unambiguous canonical repository.

So you want to use pyqtgraph/inp for the development and lcampagn/pyqtgraph for mirroring? I just have the impression that lcampagn/pyqtgraph is not so useful.

I think lcampagn/pyqtgraph will stick around as my personal repository, where I can push junk without it going into the main branches. If it turns out to be not useful, then I'll delete it to avoid more confusion.

First, I would rename pyqtgraph/inp to pyqtgraph/pyqtgraph. The name pyqtgraph/inp is confusing and gives the impression that it is another project related to pyqtgraph. If you want to keep lcampagn/pyqtgraph to mess around with the code, it's probably better to fork pyqtgraph/pyqtgraph from github (using the button at the top right). Then people will see that lcampagn/pyqtgraph is a fork and that pyqtgraph/pyqtgraph is the canonical repo.
 
Creating a project pyqtgraph/pyqtgraph with a branch "stable" and a branch "inp" would have done the trick. But you are the boss and... I don't really care as long as I know where to push. So I'm gonna fork pyqtgraph/inp and push back there.
 
With this I was just trying to mimic the current dev/inp structure that exists on launchpad, but if that's counterintuitive then I'm perfectly happy to go with your suggestion. I'm really just making a lot of guesses here; probably I need to find some other well-organized projects to compare to.

To mimic the dev/inp structure of Launchpad, it is probably better to keep the master branch for more stable code, with tags for releases (the equivalent of dev), and create an inp branch for more "experimental" code. You then ask in the README to put pull requests against the inp branch. This way, you would still have a dev/inp structure and it would be less confusing for everybody. In addition, I would try to push more regularly to the dev/master branch. People who use code directly from repositories should expect working, if not completely stable, code. Also, long periods without pushing to the master branch can give the impression that the project is dead: http://www.ohloh.net/p/pyqtgraph.

Guillaume Poulin

Sep 12, 2013, 4:46:04 AM
to pyqt...@googlegroups.com, anton...@berkeley.edu
I put a ctypes example to the test, since it's what Luke plans to use. The result is in the image attached to this post. The problem with ctypes seems to be the big call overhead. For big arrays it's not much of an issue, but for small ones, naive numpy code outperforms ctypes.
optest2.png

Guillaume Poulin

Sep 12, 2013, 5:31:17 AM
to pyqt...@googlegroups.com, anton...@berkeley.edu
There is a small range of array sizes where the weave code seems a little faster on my system. For all practical purposes, ctypes, weave and a C module give the same performance on big arrays.
optest2.png

samue...@gmail.com

Mar 20, 2016, 12:28:50 AM
to pyqtgraph
It looks like this has still not been fixed. Using version 0.9.10, I get an error caused by the 'd2 *= scale' line when I use setFrame() with a uint8 array:

TypeError: Cannot cast ufunc multiply output from dtype('float64') to dtype('uint8') with casting rule 'same_kind'
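
A minimal standalone reproduction and workaround (no pyqtgraph needed; recent numpy raises TypeError for the in-place multiply, while older versions only warned):

```python
import numpy as np

d2 = np.arange(10, dtype=np.uint8)
scale, offset = 1.5, 1

try:
    d2 *= scale  # float64 result can't be cast back to uint8 in place
except TypeError:
    pass         # numpy >= 1.10 raises; d2 is left unchanged

# Casting up front produces a float64 result with no in-place cast.
rescaled = (d2.astype(np.float64) - offset) * scale
```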

 

On Monday, September 2, 2013 at 7:07:25 PM UTC-7, Antony Lee wrote:
Hi,

I would suggest patching the last few lines of rescaleData as follows:

-        d2 = data-offset
-        d2 *= scale
-        data = d2.astype(dtype)
+        data = (data.astype(dtype) - offset) * scale


The reason is that d2 *= scale will sometimes trigger a RuntimeWarning, e.g. if data happens to be a uint16 array and is multiplied in place by a large scale (a float64), even though we want to cast the data to float64 immediately afterwards anyway.

As a side note, while I haven't tried it, I would guess that the weave implementation of rescaleData can't be faster than a simple numpy expression ("(data.astype(dtype) - offset) * scale") if it must invoke a compiler every time...

Antony

Luke Campagnola

Mar 20, 2016, 11:11:18 AM
to pyqt...@googlegroups.com
That's correct; the bug has been fixed in github but the fix hasn't been released yet.



Edward Hartley

Oct 27, 2016, 4:55:54 AM
to pyqtgraph
So it's correct that I can't see the change to fix this in the public repo develop branch?