Hi,
We are preparing a more detailed description, but in a nutshell:
- For political orientation detection, the test set is essentially
random. Speakers in the training and test sets are disjoint,
but otherwise the sampling is random.
- For power identification, the constraint is the other way
around. As much as possible, the speakers in the training and
test sets overlap. For speakers who changed their power role
within the available data, we place their speeches in one role
in the training set and their speeches in the other role in
the test set. However, this is not always possible, so the
number of speakers with a power-role switch varies across the
parliaments.
The test set size is approximately 2000 speeches for each
parliament in both tasks. The label distributions are similar
in the training and test sets.
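
To illustrate the two constraints, a minimal Python sketch follows. It is
an illustration under assumptions, not the actual sampling procedure: the
field names "speaker" and "label", both function names, and the way whole
speakers are added until the target size is reached are all hypothetical.
It also assumes at most two power roles per speaker and does not try to
match label distributions.

```python
import random
from collections import defaultdict


def speaker_disjoint_split(speeches, test_size, seed=0):
    """Orientation-style split: no speaker ends up in both sets.

    `speeches` is a list of dicts with a hypothetical "speaker" key.
    """
    by_speaker = defaultdict(list)
    for speech in speeches:
        by_speaker[speech["speaker"]].append(speech)

    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    train, test = [], []
    for speaker in speakers:
        # Add whole speakers to the test set until it reaches the target
        # size (it may overshoot slightly); the rest go to training.
        target = test if len(test) < test_size else train
        target.extend(by_speaker[speaker])
    return train, test


def role_switch_split(speeches, seed=0):
    """Power-style split for speakers who appear in two roles.

    One role goes to training and the other to the test set. Speeches of
    speakers with a single role are returned separately as `rest`, to be
    sampled by whatever other rule applies (an assumption of this sketch).
    """
    by_role = defaultdict(lambda: defaultdict(list))
    for speech in speeches:
        by_role[speech["speaker"]][speech["label"]].append(speech)

    rng = random.Random(seed)
    train, test, rest = [], [], []
    for speaker in sorted(by_role):
        labels = sorted(by_role[speaker])
        if len(labels) < 2:
            # No role switch for this speaker within the data.
            rest.extend(by_role[speaker][labels[0]])
            continue
        rng.shuffle(labels)  # pick at random which role goes to training
        train.extend(by_role[speaker][labels[0]])
        test.extend(by_role[speaker][labels[1]])
    return train, test, rest
```

With the first function, the sets of speakers in the two outputs never
intersect. With the second, a speaker who switched roles appears in both
outputs, but with a different label in each.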
I hope this helps.
Best,
Cagri