
Jan 31, 2019, 5:34:55 AM1/31/19

to reikna

If I try the following example:

https://github.com/fjarri/reikna/blob/develop/examples/demo_real_to_complex_fft.py

and set the array size to 100000 elements:

arr = numpy.random.normal(size=100000).astype(numpy.float32)

I get an assertion error, implying that the result does not coincide with the one obtained with numpy. I also plotted the outputs of the two methods against each other:

plt.plot(reference,result,"o")

And saw that the errors were not small.
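To quantify "not small", this is the kind of error metric I'm looking at (a minimal numpy sketch; `result` and `reference` are placeholders for the reikna output and the numpy reference from the demo script):

```python
import numpy

def rel_error(result, reference):
    """Relative L2 error of a computed transform against a reference."""
    result = numpy.asarray(result, dtype=numpy.complex128)
    reference = numpy.asarray(reference, dtype=numpy.complex128)
    return numpy.linalg.norm(result - reference) / numpy.linalg.norm(reference)
```

For a correct single-precision FFT I would expect something on the order of 1e-6 or better; the non-power-of-2 results were nowhere near that.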

However, if I do the same for an array with comparable size but with a number of elements that is a power of 2 (for example, 2**18), I obtain a correct result.

I get a correct result if the array is small enough even if the size is not a power of 2.

I am aware that:

"""

Note

Current algorithm works most effectively with array dimensions being power of 2. This mostly applies to the axes over which the transform is performed, because otherwise the computation falls back to the Bluestein's algorithm, which effectively halves the performance.

"""

But this should not affect the accuracy of the result, right?


Jan 31, 2019, 5:48:19 AM1/31/19

to reikna

I need to look into it, but I doubt it is Bluestein's algorithm itself that is incorrect. I suspect that some inaccuracies creep in when the size becomes large. While a regular FFT involves multiplying by factors like exp(ik), where k is of the order of the problem size, the power in Bluestein's factors is of the order of the size squared, and it is possible that trigonometric functions do not like that. It may be possible to take a modulo of the power beforehand to prevent it. Were you just testing it out of curiosity, or do you actually need large non-power-of-2 FFTs?
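To illustrate the modulo idea: the chirp factor exp(-i*pi*k^2/N) is periodic in k^2 with period 2N, so the quadratic phase can be reduced modulo 2N before the trig call. In pure numpy (double precision) the whole algorithm looks roughly like this — just a sketch to show the structure, not reikna's actual kernel:

```python
import numpy

def bluestein_chirp(N):
    """Chirp factors w[k] = exp(-i*pi*k^2/N). The quadratic phase k^2 is
    reduced modulo 2*N before the trig call, since exp(-i*pi*m/N) has
    period 2*N in m; this keeps the argument small for large N."""
    k = numpy.arange(N, dtype=numpy.int64)
    m = (k * k) % (2 * N)
    return numpy.exp(-1j * numpy.pi * m / N)

def bluestein_fft(x):
    """DFT of arbitrary length N, expressed as a convolution of
    power-of-2 size (Bluestein's algorithm)."""
    x = numpy.asarray(x, dtype=numpy.complex128)
    N = len(x)
    chirp = bluestein_chirp(N)
    M = 1 << (2 * N - 2).bit_length()  # next power of 2 >= 2*N - 1
    a = numpy.zeros(M, dtype=numpy.complex128)
    a[:N] = x * chirp
    b = numpy.zeros(M, dtype=numpy.complex128)
    b[:N] = chirp.conjugate()
    b[M - N + 1:] = chirp[:0:-1].conjugate()  # wrap negative indices of the chirp
    # Circular convolution of a and b via power-of-2 FFTs
    conv = numpy.fft.ifft(numpy.fft.fft(a) * numpy.fft.fft(b))
    return conv[:N] * chirp
```

In double precision this agrees with `numpy.fft.fft` to near machine precision for non-power-of-2 sizes, which is why I suspect the single-precision phase, not the algorithm.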

Jan 31, 2019, 6:09:27 AM1/31/19

to reikna

Thank you very much for the quick answer.

I need to perform large FFTs, although in principle they do not need to be non-power-of-2.

Jan 31, 2019, 6:40:02 AM1/31/19

to reikna

I did some tests, and the modulo does not seem to help. Apparently there is some accuracy threshold around the size 65536 (which clearly points to something, but I can't figure out what), after which the results deteriorate quickly. Neither double precision nor fast_math helps; interestingly enough, the problem does not appear with Intel integrated graphics and OpenCL. Unfortunately, I cannot promise a quick solution at the moment. I guess for now you'll have to use power-of-2 sizes (if that makes it any better, that's what is usually recommended anyway, e.g. in Numerical Recipes :)). Ideally, the FFT should use small prime radices other than 2 and only apply Bluestein for prime problem sizes, but that will take quite a bit of time to implement.
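If padding is acceptable for your application, picking the size is a one-liner (a hypothetical helper, not part of reikna) — keeping in mind that zero-padding changes the frequency grid, so it is only a drop-in replacement if you can work with the padded transform:

```python
def next_power_of_2(n):
    """Smallest power of 2 greater than or equal to n."""
    return 1 << max(n - 1, 0).bit_length()

# e.g. a 100000-point signal would be zero-padded to 131072 = 2**17 points
# before handing it to the FFT
```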

Jan 31, 2019, 9:06:27 AM1/31/19

to reikna

Thank you, I will take it into account then and only use power-of-2 sizes.
