Test set makeup

Oscar Palmqvist

Feb 25, 2024, 7:04:57 AM
to Ideology and Power in Parliamentary Speeches
Hi!

We were wondering how the test set was created. Was it randomly sampled, or was a fixed number of samples taken from each country? We are interested in whether the distributions of labels, countries, and speakers in the training set are likely to carry over to the test set.

Kind regards,
Team policy-parsing-panthers

cagri coltekin

Feb 25, 2024, 4:49:22 PM
to Oscar Palmqvist, Ideology and Power in Parliamentary Speeches
Hi,

We are preparing a more detailed description, but in a nutshell:

- The test set for political orientation detection is essentially
random. Speakers in the training and test sets are disjoint,
but otherwise the speeches are randomly sampled.

- For power identification, the constraint is the other way
around. As far as possible, the speakers in the training and
test sets overlap. For speakers who changed power role within
the available data, we place their speeches in one role in the
training set and those in the other role in the test set.
However, this is not always possible, so the number of speakers
with a power role switch varies across the parliaments.
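
For participants who want a validation split that mimics these two constraints, here is a rough sketch. It is not the organizers' actual procedure, only one way to read the description above. The row keys "speaker" and "role", the function names, and the held-out fraction are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def speaker_disjoint_split(rows, held_out_fraction=0.2, seed=0):
    """Split rows so that no speaker appears on both sides.

    rows: list of dicts with a hypothetical "speaker" key.
    """
    speakers = sorted({row["speaker"] for row in rows})
    random.Random(seed).shuffle(speakers)
    n_held_out = max(1, round(len(speakers) * held_out_fraction))
    held_out = set(speakers[:n_held_out])
    train = [row for row in rows if row["speaker"] not in held_out]
    dev = [row for row in rows if row["speaker"] in held_out]
    return train, dev


def split_role_switchers(rows, seed=0):
    """Separate speakers who appear in two roles.

    For each such speaker, one role (picked at random, an assumption)
    goes to train and the other to test. Speakers seen in a single role
    are returned untouched in `rest`, since the description does not say
    how their speeches are divided.
    """
    rng = random.Random(seed)
    roles_by_speaker = defaultdict(set)
    for row in rows:
        roles_by_speaker[row["speaker"]].add(row["role"])

    train_role = {}
    for speaker in sorted(roles_by_speaker):
        roles = roles_by_speaker[speaker]
        if len(roles) == 2:
            train_role[speaker] = rng.choice(sorted(roles))

    train, test, rest = [], [], []
    for row in rows:
        chosen = train_role.get(row["speaker"])
        if chosen is None:
            rest.append(row)
        elif row["role"] == chosen:
            train.append(row)
        else:
            test.append(row)
    return train, test, rest
```

Splitting by speaker rather than by speech is the point of the first function: a plain random split over speeches would leak speakers across the two sides, which the orientation test set does not do.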

The test set size for every parliament, in both tasks, is
(approximately) 2000 speeches. The label distributions are
similar in the training and test sets.
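
If you build your own held-out split from the training data, a quick check like the following can show whether its label distribution stays close to the rest of the training data. This is only an illustrative sketch: the function names are made up here, and both label lists are assumed to be non-empty.

```python
from collections import Counter


def label_distribution(labels):
    """Return the share of each label in a non-empty list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


def max_share_gap(labels_a, labels_b):
    """Largest absolute difference in label share between two splits."""
    dist_a = label_distribution(labels_a)
    dist_b = label_distribution(labels_b)
    all_labels = set(dist_a) | set(dist_b)
    return max(abs(dist_a.get(k, 0.0) - dist_b.get(k, 0.0)) for k in all_labels)
```

A small gap suggests the held-out split resembles the training data in label balance. It says nothing about the distribution of speakers or countries.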

I hope this helps.

Best,
Cagri