R


Hans Salomonsson

Jan 24, 2012, 9:25:51 AM
to ml_chalmers
Hi,

R seems to be the language of statistics and machine learning. Open
source, > 2m users, > 2000 packages in CRAN, etc.
Why use Matlab, which is focused on engineering computations (other than
for the convenience of course assistants)?

I vote that it should also be OK to do the home assignments in
R. Any thoughts?

Best regards,
Hans Salomonsson

Vinay Jethava

Jan 24, 2012, 11:35:26 AM
to ml_ch...@googlegroups.com
Dear Hans, 

R vs Matlab reminds me of Emacs vs Vi! It mostly boils down to communities (statistics vs machine learning: http://www-stat.stanford.edu/~tibs/stat315a/glossary.pdf ). Here are a few reasons: 

1. (Communities) R is used mostly by statisticians and is focused on traditional statistics - small numbers of samples, hypothesis testing, etc. :) Machine learning is largely the domain of computer scientists and engineers, who are more interested in "quickly" building "large-scale" systems (once they get out of grad school, or leave it for big money!)

2. (Education) Most machine learning courses that I’m aware of use Matlab as the programming language of choice, as does one of the suggested books (Barber). It is standardized, it is easier to find help, and it has an active community beyond statisticians. (Think Mac vs Linux!)

Indeed, Matlab has become the mainstay of academic research in engineering (quick prototyping, accessibility in “all” universities, standard platform, etc.) (See http://www.phdcomics.com/comics/archive.php?comicid=626). Most people in the class who come from an engineering background will most likely have used Matlab somewhere already.

3. (Matrix laboratory, “Matlab”) The key to Matlab's success was “matrix computations made easy”, with everything starting out from a matrix. This allowed it to be a workhorse for developing techniques in different fields, e.g. statistics, optimization (linear programming), electrical engineering (FFTs), etc. This is true even today compared to R, for example, where one has to write more code just to get a matrix. In contrast, I’m sure there are a number of statistical tests in R that are simply not present in Matlab.

a <- c(10, 20, 15, 43, 76, 41, 25, 46)
b <- c(2, 5, 8, 3, 6, 1, 5, 6)
mymatrix <- matrix(c(a, b), 8)
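
For comparison, here is a minimal sketch of a shorter route to the same 8-by-2 matrix, assuming base R's cbind, which binds the two vectors as columns. The only difference from the version above is that the columns end up named a and b.

```r
# Same two vectors as in the example above.
a <- c(10, 20, 15, 43, 76, 41, 25, 46)
b <- c(2, 5, 8, 3, 6, 1, 5, 6)

# cbind() binds the vectors column-wise into a matrix.
mymatrix <- cbind(a, b)

# 8 rows, 2 columns
dim(mymatrix)
```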

Hope this makes some of the rationale for choosing Matlab clear. I’m sure a statistician will find a number of arguments against this. One possibility is to use Octave (an open-source clone of Matlab) if you don’t wish to use Matlab for privacy reasons. We’ll try to see if we can design the assignments so that you don’t have to submit code, but this would possibly be for the next offering of the course, since the assignments have largely been decided for this iteration.

Regards
Vinay 

Vinay Jethava

Jan 24, 2012, 11:47:29 AM
to ml_chalmers
I forgot to mention that a number of techniques that you will be
introduced to in this class have their origin in linear algebra
(matrices) and optimization theory, e.g. Support Vector Machines,
kernels, semi-definite programming - which are not taught in any
statistics course. Though there might be some implementation in R for
all of this, Matlab is naturally suited here.


Hans Salomonsson

Jan 30, 2012, 7:24:34 AM
to ml_ch...@googlegroups.com
Dear Vinay,

Thanks for your reply. I will use Matlab for the assignments. However, R is huge among data mining practitioners. For example, most Kaggle users use R. You can build large-scale systems in R as well: DB integration is good, and you can run on multiple cores/computers quite easily nowadays.
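
As a minimal sketch of the multi-core point, assuming the parallel package that ships with R (the worker count of 2 and the toy squaring task are chosen only for illustration):

```r
library(parallel)

# Start a small local cluster of 2 worker processes.
cl <- makeCluster(2)

# Apply a function to each element, spread across the workers.
squares <- parLapply(cl, 1:8, function(i) i^2)

# Always shut the workers down when done.
stopCluster(cl)

# Flatten the list of results into a numeric vector.
result <- unlist(squares)
```

The same cluster object can in principle point at other machines instead of local processes, which is what makes the cores/computers distinction small from the caller's side.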

Linear algebra and numerical math support is good in Matlab, but it's also expensive and you can't examine the source of functions/algorithms. I am not saying that Matlab is bad, just that R is also a good choice for machine learning, and in a perfect world the students could choose between them.
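
To illustrate the point about examining source, here is a small sketch of how this looks on the R side, using the built-in sd function purely as an example:

```r
# Typing a function's name without parentheses prints its definition.
sd

# The same definition can be captured as text, one string per line.
src <- deparse(sd)
```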

Another point is that, in my opinion, academia should support open-source alternatives, especially when a great part of the community consists of academic researchers.

Regards,
Hans


On 24 Jan 2012, at 17:47, Vinay Jethava wrote:

> I forgot to mention that a number of techniques that you would be
> introduced to in this class have their origin in linear algebra
> (matrices) and optimization theory e.g. Support Vector Machines,
> Kernels, Semi-definite programming - which are not taught in any
> statistics course. Though there might be some implementation in R for
> all of this - matlab is naturally suited here.
>

R has implemenations o

gustav...@gmail.com

Feb 1, 2012, 4:30:55 AM
to ml_chalmers
I have implemented my own programming language. I vote that it
should also be OK for me to use it for the home assignments.