Attended the first GTC Workshop in Singapore


xman

May 13, 2011, 1:30:36 AM
to sgc-ru...@googlegroups.com
It was all about CPU vs GPU ("what's your speed up?"), CUDA programming, and coffee. Some of the talks were quite interesting, showing animations of simulations and huge speed ups in the range of 20x to 100x. I hope Ruby CUDA can maintain that kind of speed up :)

In one talk presented by an NTU professor, he showed a 3x speed up against a highly optimized CPU implementation. Someone questioned whether such a low speed up is still convincing enough to use GPUs ... the professor replied that it's a good speed up. What if your salary were multiplied by 3? :D



xman

May 13, 2011, 1:44:14 AM
to sgc-ru...@googlegroups.com
So far I have got good encouragement from ex-colleagues and friends on developing SGC Ruby CUDA, but I have yet to see great interest in using Ruby for HPC.

Instead I was asked: what about Perl CUDA? :D I don't think I've seen any effort on that. But I thought, since Ruby inherited a lot from Perl, would it work if you took Ruby CUDA and used it as if it were Perl CUDA programming? :p

Essentially, I'm now aware that many bioinformatics applications use Perl. I wonder if this is due to legacy, or whether Perl continues to gain momentum in building new applications.

hyqneuron

May 25, 2011, 7:52:33 AM
to SGC Ruby CUDA
I wanted to go, but couldn't make it because of the time.

As for switching to the GPU, well, that 3x speedup alone doesn't tell much. What CPU is that person using? How many cores? How many threads? What's the upfront price? How much power does it consume at full load? All these have to be considered.
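
Just to show the arithmetic behind that (every figure below is made up, purely to illustrate the normalization, nothing measured at the workshop):

/* All figures are hypothetical -- only the normalization arithmetic matters. */
#include <stdio.h>

int main(void)
{
    double cpu_time_s = 30.0, gpu_time_s = 10.0;   /* 3x raw speedup        */
    double cpu_watts  = 130.0, gpu_watts = 250.0;  /* full-load power draw  */
    double cpu_price  = 300.0, gpu_price = 450.0;  /* upfront hardware cost */

    double speedup = cpu_time_s / gpu_time_s;
    /* Energy to solution is time * power, so the ratio shrinks when the GPU burns more watts. */
    double energy_ratio = (cpu_time_s * cpu_watts) / (gpu_time_s * gpu_watts);
    /* Throughput per dollar spent on the device itself. */
    double perf_per_dollar_ratio = speedup * (cpu_price / gpu_price);

    printf("raw speedup              : %.2fx\n", speedup);
    printf("energy-to-solution ratio : %.2fx\n", energy_ratio);
    printf("perf-per-dollar ratio    : %.2fx\n", perf_per_dollar_ratio);
    return 0;
}

With those made-up figures the 3x shrinks to about 1.6x in energy and 2x per dollar, and that still ignores the host CPU the GPU box needs anyway.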

A 3x speedup indeed does not look impressive at all. Algorithms that leave little room for parallelism or that have low arithmetic intensity are just not suited for the GPU. I guess the professor was picking the wrong algorithms to present.
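
To put a number on "low arithmetic intensity", here is a toy CUDA example (my own, not from any of the talks): a vector add moves 12 bytes for every single FLOP, so it is memory-bound and its speedup is capped by the memory-bandwidth ratio, not by the number of GPU cores.

// Toy example: vector add has ~0.08 FLOP/byte (1 add per two float loads plus
// one store), so the kernel is limited by memory bandwidth, not compute.
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];   // 1 FLOP, 12 bytes of traffic
}

int main(void)
{
    int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *ha = (float *)malloc(bytes);
    float *hb = (float *)malloc(bytes);
    float *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hc[0]);   // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}

A dense matrix multiply, by contrast, does O(n) FLOPs per element loaded, which is why that kind of kernel can show the famous 20x-100x numbers.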

xman

May 25, 2011, 9:23:25 AM
to sgc-ru...@googlegroups.com
> As for switching to the GPU, well, that 3x speedup alone doesn't tell much.
> What CPU is that person using? How many cores? How many threads? What's the
> upfront price? How much power does it consume at full load? All these have
> to be considered.

The actual specs are not the main point here. You can just assume it's the latest CPU. The point is that you might get only a 3x speed up with a modern GPU, which is seldom reported in the literature.
 
> A 3x speedup indeed does not look impressive at all. Algorithms that leave
> little room for parallelism or that have low arithmetic intensity are just
> not suited for the GPU. I guess the professor was picking the wrong
> algorithms to present.

When you have an important problem to solve, you'll still solve that same problem, not something irrelevant. Whether a 3x speed up is impressive is subjective and depends on the context. If the context were your salary, wouldn't a 3x salary sound good? Given that a professor's salary is already quite good.

When you see a 20x speed up reported, do you really trust the experiments? There are tons of experiments with severe numerical errors.

The professor's point was that he was comparing against a highly optimized CPU implementation. If you compare against a slow CPU implementation, of course you'll get a 20x speed up or more, any number you like.
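
Just to sketch what I'd call an honest comparison (a made-up SAXPY harness, not the professor's code): time the GPU against a CPU baseline that at least uses all the cores, and check the result instead of only admiring the speedup.

// Made-up SAXPY benchmark harness, purely illustrative.
// Build with something like: nvcc -O3 -Xcompiler -fopenmp bench.cu
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <omp.h>
#include <cuda_runtime.h>

__global__ void saxpy_gpu(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

// CPU baseline uses every core; comparing against a single-threaded loop
// inflates the reported speedup by roughly the core count.
static void saxpy_cpu(int n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    int n = 1 << 24;
    size_t bytes = n * sizeof(float);
    float *x = (float *)malloc(bytes);
    float *y_cpu = (float *)malloc(bytes);
    float *y_gpu = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y_cpu[i] = 2.0f; y_gpu[i] = 2.0f; }

    double t0 = omp_get_wtime();
    saxpy_cpu(n, 3.0f, x, y_cpu);
    double t_cpu = omp_get_wtime() - t0;

    float *dx, *dy;
    cudaMalloc((void **)&dx, bytes);
    cudaMalloc((void **)&dy, bytes);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y_gpu, bytes, cudaMemcpyHostToDevice);

    // Kernel-only time; include the transfers if the data does not already
    // live on the GPU, because that is the time the user actually waits for.
    double t1 = omp_get_wtime();
    saxpy_gpu<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);
    cudaDeviceSynchronize();
    double t_gpu = omp_get_wtime() - t1;
    cudaMemcpy(y_gpu, dy, bytes, cudaMemcpyDeviceToHost);

    // Verify: a big speedup with a big numerical error proves nothing.
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i) {
        float err = fabsf(y_cpu[i] - y_gpu[i]);
        if (err > max_err) max_err = err;
    }
    printf("CPU %.2f ms, GPU %.2f ms, speedup %.1fx, max abs error %g\n",
           t_cpu * 1e3, t_gpu * 1e3, t_cpu / t_gpu, max_err);

    cudaFree(dx); cudaFree(dy);
    free(x); free(y_cpu); free(y_gpu);
    return 0;
}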

hyqneuron

May 25, 2011, 12:58:44 PM
to SGC Ruby CUDA
> When you have an important problem to solve, you'll still solve that same
> problem, not something irrelevant. Whether a 3x speed up is impressive is
> subjective and depends on the context. If the context were your salary,
> wouldn't a 3x salary sound good? Given that a professor's salary is already
> quite good.
>
> When you see a 20x speed up reported, do you really trust the experiments?
> There are tons of experiments with severe numerical errors.
>
> The professor's point was that he was comparing against a highly optimized
> CPU implementation. If you compare against a slow CPU implementation, of
> course you'll get a 20x speed up or more, any number you like.

I certainly agree with this. I've seen people getting 2000x speedups (LOL). I recently helped a person optimize part of an algorithm for electrical grid simulation. Despite the final speedup of 17x, I bet that with some optimization of the CPU code it would have turned into less than 1x (the algorithm was too data-dependent). Some people just love fancy numbers.