כמה רשמים על useR!2016


Tal Galili

Jul 17, 2016, 1:00:46 AM
To: israel-r-...@googlegroups.com
Hello everyone,
I returned to Israel a few days ago and I'm still recovering from a formidable jet lag.

Two weeks ago I attended the useR!2016 conference, and I thought you would be interested in reading some impressions.
I am writing from partial memory only (so re-watching the talks may reveal inaccuracies in my descriptions), for which I apologize in advance.
Happily, this year the conference was (almost) fully recorded, so you can sit and watch hours of talks from the conference (which I think is really cool):

This year was my seventh consecutive useR! conference. It was the largest one ever held (reportedly close to 900 participants attended). Beyond the interesting talks, the interaction with countless R users was inspiring for me. It is, for me, without a doubt the most interesting conference in the world.
Next year the conference will take place in Belgium (around the beginning of July, I think), and I warmly recommend that you make the effort to fly there. There is no website up yet (that I have found).

Best regards,
Tal

==============


This is the first year that useR had (almost) all of its talks recorded on video. They can be seen here:

There are MANY interesting talks to watch there. I am not going to discuss the invited speakers, since all of their talks were worthwhile. I also attended fewer talks than I had intended to (partly because they were being recorded, but mainly because of the interesting conversations I had between sessions that ran long).

The schedule is listed here:
The links I provide are for the abstracts of the schedule, but you can search the video site to find the talks.

Day 1
The first day was devoted to workshops (these were, sadly, not recorded).

In the morning I partially attended
which was less deep than I had hoped. But the tutorial's repository is very detailed and holds interesting references for common "machine learning" algorithms:

I also briefly went to
where the big take-home message is that for class imbalance we may want to use measures other than the misclassification rate (things like AUC, sensitivity, specificity, or Kappa). The problem is that some models (such as CART/rpart) don't allow us to fit them based on these measures directly, but alternative tuning parameters (e.g., class weights) can be used to search for alternative models.
I think he has more on this in his book "Applied Predictive Modeling" (but I'm not sure).
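As a small sketch of that idea, assuming the caret package (the simulated data via twoClassSim and the weight of 5 are made up for illustration): we tune an rpart tree by ROC rather than accuracy, and up-weight the minority class.

```r
# Sketch: handling class imbalance by (a) evaluating with ROC/Sens/Spec
# instead of misclassification rate, and (b) up-weighting the rare class.
library(caret)
library(rpart)

set.seed(1)
dat <- twoClassSim(1000, intercept = -12)  # caret helper; makes Class2 rare
table(dat$Class)

ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)  # reports ROC, Sens, Spec

# Class weights as an alternative tuning knob (the factor 5 is arbitrary here)
fit <- train(Class ~ ., data = dat, method = "rpart",
             metric = "ROC", trControl = ctrl,
             weights = ifelse(dat$Class == "Class2", 5, 1))
fit$results
```

In practice one would also tune the weight itself (and compare against sampling-based fixes such as up/down-sampling).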


In the second part I went to 
Regression Modeling Strategies and the rms package by Frank Harrell, who talked in detail about using smoothing splines with the rms package. The idea is to have flexible models for various non-linear relationships where polynomials may not be flexible enough. One important note: since interpreting such a model at the equation level is hard, using various plots is essential (at least for simple enough data sets).
More details are available in his book (Regression Modeling Strategies).
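A minimal sketch of this approach with rms (the simulated sine data is my own invention, just to show the interface): fit a restricted cubic spline with rcs() and read the model off a plot rather than off the coefficients.

```r
# Sketch: a flexible non-linear fit via restricted cubic splines (rms::rcs),
# interpreted through a plot of the predicted curve.
library(rms)

set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.3)

dd <- datadist(x)            # rms needs variable summaries for Predict()
options(datadist = "dd")

fit <- ols(y ~ rcs(x, 5))    # least squares with a 5-knot restricted cubic spline
plot(Predict(fit, x))        # the fitted curve is far easier to read than the equation
```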

The R consortium was mentioned many times during the conference as a new place for companies to donate money to R projects. They are currently funding several projects but nothing substantial came out of it yet (maybe within the next year - since they are funding some cool projects).

Day 2


Kaggle seems to have some nice datasets. They are also working on encouraging sharing of data analysis. This is interesting.
It appears xgboost is getting better results than random forests. Also, Python is growing fast in popularity (due to one machine-learning area; I think it was deep learning, but I'm not sure).
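For anyone who hasn't tried it, here is a minimal xgboost run on the package's built-in agaricus (mushroom) data; the hyperparameters are arbitrary, not anything reported at the conference:

```r
# Minimal gradient-boosted trees with xgboost on its bundled example data.
library(xgboost)
data(agaricus.train, package = "xgboost")

bst <- xgboost(data = agaricus.train$data,      # sparse feature matrix
               label = agaricus.train$label,    # 0/1 outcome
               nrounds = 10,
               objective = "binary:logistic",
               verbose = 0)

pred <- predict(bst, agaricus.train$data)
mean((pred > 0.5) == agaricus.train$label)      # training accuracy
```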

A lot of people are thinking about how to teach R,
and also about how to teach statistics to non-R people using shiny:

The broom package (which gets model output into a tidy format, ready for piping into other functions, such as ggplot2 functions) is emerging as a very powerful tool, gaining more and more support from the community:
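To illustrate, broom turns a fitted model object into ordinary data frames:

```r
# broom in a nutshell: model objects become tidy data frames.
library(broom)

fit <- lm(mpg ~ wt + hp, data = mtcars)
tidy(fit)     # one row per coefficient: term, estimate, std.error, statistic, p.value
glance(fit)   # one-row model summary: r.squared, AIC, ...
```

Because the output is a plain data frame, it pipes straight into dplyr verbs or ggplot2 layers.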

Google is working on the next generation of R (called Rho):
This work is still preliminary but very interesting.

There is a gap in the literature about color schemes. A recent cognitive study made an interesting distinction between variance and bias in color-value recall:
The big take-home message for me was that palettes with many colors (such as rainbow) reduce variance, but their lack of perceptual uniformity results in bias. Sadly, viridis was not tested in this experiment.
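For comparison, here is how the two palettes look side by side in ggplot2 (using the viridis package; this only illustrates usage, and says nothing about the study above, which did not test viridis):

```r
# Same heatmap, two palettes: many-hued rainbow vs. perceptually uniform viridis.
library(ggplot2)
library(viridis)

p <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster()

p + scale_fill_gradientn(colours = rainbow(7))  # many hues, not perceptually uniform
p + scale_fill_viridis()                        # perceptually uniform alternative
```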

Day 3


There is now an ipython-notebook-like alternative within RStudio (!)
It is going to be very interesting to see the impact of this on people's workflows with R.

A nice package for helping with randomization tests:

My heatmap overview talk went well

A very nice work on a package for simulations with R:

Domino Lab is offering a GitHub-like system for data analysis, with as much transparency as they could think of. An interesting product:

Day 4


Making books with R and R Markdown is now easier:

There are now more specialized random forest packages for very wide or very long datasets; ranger is one of them:
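The ranger interface is close to the classic randomForest one; a minimal example (on iris, just to show the call, not a wide/long dataset):

```r
# Minimal ranger fit: a fast random forest implementation.
library(ranger)

fit <- ranger(Species ~ ., data = iris,
              num.trees = 500,
              probability = TRUE)   # class probabilities instead of hard votes

fit$prediction.error                # out-of-bag error estimate
```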

DataSHIELD is an interesting R-based framework for distributed, privacy-preserving data analysis:

Torsten is working on a more generalized way of describing various regression models:

On this day I chaired a lightning talk session, so I would recommend all of these talks as well :D






Jonathan Rosenblatt

Jul 17, 2016, 4:02:01 AM
To: israel-r-user-group
Thanks for the update!


--
You received this message because you are subscribed to the Google Groups "Israel R User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to israel-r-user-g...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Jonathan Rosenblatt
Dept. of Industrial Engineering and Management
Ben Gurion University of the Negev
