reliability test

Karin Edlund

Jun 1, 2020, 4:09:57 AM
to ChiBolts
Hi,

I have a question about reliability testing and about how much of my material I should check for reliability.

I am planning to use the RELY function in CLAN to check reliability on 10% of my data.

I know that in some studies as much as 25% is checked for agreement.

What are your guidelines? Does it depend on how much data has been collected, or on the content of the data?
If you have suggestions for references, I would be grateful to receive them.

Best Regards,
Karin Edlund,
PhD student in Special Education
Stockholm University

Brian MacWhinney

Jun 1, 2020, 9:44:27 PM
to ChiBolts
Dear Karin,
    RELY was originally designed to check the reliability of coding tiers, rather than the main line. For that, it does a great job, because the shape of coding tiers is so well defined. If that is what you are doing, then I would say that the decision about what percentage to check is really a function of your overall corpus. For example, if you only have two hours of recording and you check 10% of that, then that would seem incomplete. However, if you have 200 hours and you check a random 10% of that and show good reliability, then I would be pretty convinced that you are okay. So, the percentage is a function of corpus size.
    I suppose a similar argument could be made regarding corpus or data type, but that argument would be a bit less convincing.
    Later on, Leonid expanded the function of RELY to include comparisons of the main line. This method just looks for an overall match of words. Please see the manual for details on these differences.

Best,

-- Brian MacWhinney
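
To make the corpus-size point above concrete, here is a minimal sketch that converts a percentage into the absolute amount of recording it covers. The function name `sample_minutes` is made up for this illustration and is not part of CLAN or RELY.

```python
def sample_minutes(corpus_hours: float, fraction: float) -> float:
    """Return the duration, in minutes, of a reliability sample.

    corpus_hours: total length of the corpus in hours
    fraction:     share of the corpus to be double-checked (0.10 = 10%)
    """
    return corpus_hours * 60 * fraction


# The two corpus sizes used as examples in the message above.
for hours in (2, 200):
    minutes = sample_minutes(hours, 0.10)
    print(f"{hours} h corpus, 10% sample: {minutes:.0f} min ({minutes / 60:.1f} h)")
```

The same 10% corresponds to 12 minutes of a 2-hour corpus but to 20 hours of a 200-hour corpus, which is why a fixed percentage is more convincing for the larger corpus.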

Karin Edlund

Jun 2, 2020, 3:47:30 AM
to ChiBolts
Dear Brian,

Thank you for clarifying the RELY function.
I will check reliability both for the coding tiers and at the word/utterance level. For the coding tiers, RELY seems an excellent choice. Perhaps I can also use RELY to check agreement at the word/utterance level, even if it then only compares the main lines; I am interested in the overall match at the word level. Or is there another suitable way of checking agreement at the word/utterance level?
My corpus is approximately 15 hours, spread over 55 video recordings of teacher-student verbal interaction during text talks.
The mean duration of a text talk is approximately 15 minutes.
I have randomly selected 15 recordings, and 6 minutes of each are checked for reliability. This equals 10% of the corpus.
I am uncertain whether 10% is an appropriate proportion for a corpus of this size.

Best Regards,
Karin Edlund
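
The sampling arithmetic in the message above can be checked with a short sketch. The figures come from the post; the recording numbers 1 to 55 and the random seed are assumptions made only for illustration, and the random draw is not the selection procedure actually used for the corpus.

```python
import random

# Figures stated in the post.
N_RECORDINGS = 55            # video recordings in the corpus
CORPUS_MINUTES = 15 * 60     # "approximately 15 hours"
MEAN_TALK_MINUTES = 15       # approximate mean duration of a text talk
N_SAMPLED = 15               # recordings selected for the reliability check
MINUTES_PER_SAMPLE = 6       # minutes checked in each selected recording

# Share of the corpus covered by the reliability check.
checked_minutes = N_SAMPLED * MINUTES_PER_SAMPLE
share = checked_minutes / CORPUS_MINUTES
print(f"Checked: {checked_minutes} min = {share:.1%} of {CORPUS_MINUTES} min")

# Cross-check against the total implied by the mean duration.
estimated_minutes = N_RECORDINGS * MEAN_TALK_MINUTES
print(f"Implied total: {estimated_minutes} min, "
      f"share = {checked_minutes / estimated_minutes:.1%}")

# One reproducible way to draw the recordings (hypothetical numbering 1..55).
rng = random.Random(2020)    # arbitrary seed, fixed so the draw can be repeated
selected = sorted(rng.sample(range(1, N_RECORDINGS + 1), N_SAMPLED))
print("Selected recordings:", selected)
```

Fifteen recordings at 6 minutes each give 90 minutes, which is 10% of 900 minutes. Using the total implied by 55 recordings of about 15 minutes each (825 minutes), the same 90 minutes is closer to 11%.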




