Bootcamp slots still available


Stefan Th. Gries

Jun 1, 2018, 6:20:32 PM

to StatForLing with R

Hi all

Some quick self-advertising (with apologies; this should be the last or second-to-last time for this calendar year). There are still two slots available in a 'Statistical modeling with R' bootcamp taking place 03-07 Sept 2018 at Maria Curie-Skłodowska University in Lublin, Poland. Also, I can now give more info about the content of both this bootcamp and the one in Erfurt, which go considerably beyond what SFLWR2 offers:

###########################################

Day 1:

- simple and multifactorial linear regression modeling
- interactions, model selection & simplification (predictors & levels)
- orthogonal contrasts, predictions, model diagnostics and validation
- different approaches to curvature

Day 2:

- simple and multifactorial generalized linear regression modeling
- predictions and approaches to model validation
- exercises on example data set(s)

Day 3:

- intro to linear mixed-effects modeling
- reanalyzing published data: 3 case studies
- excursus:
-- writing your own functions
-- multimodel inferencing
-- remarks on multicollinearity

Day 4:

- residualization
- generalized linear mixed-effects & multilevel modeling
- reanalyzing published data: 2 case studies
- excursus: influential data points

Day 5:

- classification and regression trees: 3 different packages
- problems of trees, potential solutions
- quick overview of random forests
- (missing data) imputation
- (case study)

###########################################

If you are interested, plz check out this website for details on how to sign up etc.

Cheers,

STG

Martin Schweinberger

Oct 31, 2018, 7:34:06 PM

to statforli...@googlegroups.com

Hi all,

I have a problem that I haven’t found a satisfying solution for yet.

I ran two vector space models on amplifier-adjective bigrams (very + good, really + good, very + nice, really + nice, etc.) in two corpora.

So, the data the models worked on looked something like this:

         nice  good  beautiful  bad  …
so         15     5         13    2  …
really     10    54         12    5  …
very       30    24         34   20  …
…

Now I have two vectors of cosine values for the amplifiers, reflecting their distributional similarity in the two corpora. In creating the models, I essentially followed Levshina, Natalia. 2015. How to do Linguistics with R: Data exploration and statistical analysis. Amsterdam & Philadelphia: John Benjamins, pp. 323-332.
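To make the setup concrete, here is a minimal sketch of how pairwise cosine values can be computed from a count table like the one above. It uses only the illustrative counts shown in this message and works on raw counts; any weighting step applied before computing cosines in the actual models is not reproduced here, and the helper names `cosine` and `cosine_table` are made up for this illustration.

```python
import math

# Illustrative counts copied from the example table in this message
# (rows: amplifiers; columns: nice, good, beautiful, bad).
counts = {
    "so":     [15, 5, 13, 2],
    "really": [10, 54, 12, 5],
    "very":   [30, 24, 34, 20],
}


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_table(rows):
    """Cosine similarity for every unordered pair of row labels."""
    labels = sorted(rows)
    return {
        (a, b): cosine(rows[a], rows[b])
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    }


# One such table per corpus gives the two vectors of cosine values.
similarities = cosine_table(counts)
for pair, value in similarities.items():
    print(pair, round(value, 3))
```

Running this once per corpus yields one cosine value per amplifier pair and corpus, which is the kind of data the question below is about.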

I want to test whether the cosine values for very and really are more similar in one corpus than in the other. In other words, I want to test whether two individual values that are part of two vectors of values differ significantly (mind you: I am not interested in whether the distributions differ, only in two specific values out of the two distributions).

Comparing the cluster solutions and reporting the number of nodes connecting really and very in the two corpora seems quite unsatisfactory in this case… I thought about z-transforming the values so that I can at least compare them, but this does not tell me much in terms of significance. I also thought about modeling the data as a repeated-measures design, but this does not strike me as a recommendable procedure for my data either (as it is not truly a repeated-measures design). I am sorry for bothering you all with this, but I am at a loss about how to test for significance here. Any ideas?
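For reference, the z-transformation idea can be sketched as follows. All cosine values below are HYPOTHETICAL placeholders, not results from my corpora, and the extra amplifier "pretty" is only there to have a few more pairs. The sketch standardizes the cosine values within each corpus and then looks at where the really/very pair falls in each; as said above, this makes the two values comparable but is not a significance test.

```python
import statistics

# HYPOTHETICAL cosine values per amplifier pair (placeholders only).
corpus_a = {
    ("really", "very"): 0.68,
    ("really", "so"): 0.41,
    ("so", "very"): 0.88,
    ("pretty", "very"): 0.52,
}
corpus_b = {
    ("really", "very"): 0.81,
    ("really", "so"): 0.47,
    ("so", "very"): 0.79,
    ("pretty", "very"): 0.35,
}


def z_scores(values):
    """Standardize a dict of values using its own mean and sample SD."""
    raw = list(values.values())
    mean = statistics.mean(raw)
    sd = statistics.stdev(raw)
    return {key: (value - mean) / sd for key, value in values.items()}


z_a = z_scores(corpus_a)
z_b = z_scores(corpus_b)

# Position of the really/very pair within each corpus, in SD units.
pair = ("really", "very")
print(round(z_a[pair], 2), round(z_b[pair], 2), round(z_b[pair] - z_a[pair], 2))
```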

Best, Martin

=====================================

Dr. Martin Schweinberger

5/221 Sir Fred Schonell Drive

St Lucia, QLD, 4067

Fon.: +61 (0)404 228 226

Home: http://www.martinschweinberger.de/
