RELY function 3

Karin Edlund

Aug 20, 2020, 5:00:27 AM

to ChiBolts

Hi,

I have a question about RELY function 3: how is it calculated?

My transcriptions consist of teacher and student dialogue.

In the manual it says it will calculate the overall match on the main line between two versions and that it uses a rough-and-ready "bag of words" method to compare them.

What does this mean?

I would be grateful for some more information or explanation about this.

Best regards,

Karin Edlund

Brian MacWhinney

Aug 20, 2020, 10:17:32 AM

to ChiBolts

Dear Karin,

The bag of words method just means that the program looks at the words in one transcript and then checks to see if the other transcript uses the same words. This method doesn't pay attention to the order of words. So, a transcript that says "This bus always arrives early" and one that says "This bus arrives always early" will end up being judged as the same, because they have the same words.
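To make the idea concrete, here is a minimal Python sketch of a bag-of-words comparison. It is an illustration only, not the actual RELY code: the function names `bag_of_words` and `same_bag` are hypothetical, and splitting on whitespace and ignoring case are assumptions made for the example.

```python
from collections import Counter


def bag_of_words(utterance: str) -> Counter:
    """Count how often each word occurs, discarding word order."""
    # Assumption: words are whitespace-separated and compared case-insensitively.
    return Counter(utterance.lower().split())


def same_bag(a: str, b: str) -> bool:
    """Return True when both utterances contain the same words the same number of times."""
    return bag_of_words(a) == bag_of_words(b)


# Same words in a different order: judged as the same.
print(same_bag("This bus always arrives early", "This bus arrives always early"))

# One word differs: judged as different.
print(same_bag("This bus always arrives early", "This bus always arrives late"))
```

The first call prints `True` and the second prints `False`, because only the set of words and their counts is compared, never their positions.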

-- Brian MacWhinney

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CANkUE2wWy05oX3Yn8DfnzAOinQ6qh9nC5Jej4jkyPLPkhn0COQ%40mail.gmail.com.

Karin Edlund

Aug 20, 2020, 2:11:40 PM

to chib...@googlegroups.com

Hi Brian,

Thank you so much for your answer!

Does it calculate for all the words or a percentage of the words in the transcripts?

Are the speakers, the information before the asterisk, included in the calculation or just the words spoken?

Best regards,

Karin Edlund

Brian MacWhinney

Aug 20, 2020, 2:51:43 PM

to ChiBolts

Dear Karin,

RELY looks at each utterance one at a time. Method 3 assumes that the utterances in the two files being compared correspond fully, and that the only differences are in the words used. If the words fully match, then reliability is 100%. If not, it is reduced according to the word mismatches.
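A rough sketch of how such an utterance-by-utterance score could be computed is shown below. The thread does not give the exact formula RELY uses, so the scoring rule here (matched words divided by the larger of the two word counts, summed over all utterances) is an assumption, and `word_agreement` is a hypothetical name.

```python
from collections import Counter
from typing import List


def word_agreement(utts_a: List[str], utts_b: List[str]) -> float:
    """Percent word agreement over utterances that are paired one-to-one."""
    if len(utts_a) != len(utts_b):
        # The method assumes full correspondence between the two files.
        raise ValueError("utterances are not aligned one-to-one")
    matched = 0
    total = 0
    for a, b in zip(utts_a, utts_b):
        bag_a, bag_b = Counter(a.split()), Counter(b.split())
        # Multiset intersection: words found in both versions, ignoring order.
        matched += sum((bag_a & bag_b).values())
        # Assumed denominator: the longer of the two versions of the utterance.
        total += max(sum(bag_a.values()), sum(bag_b.values()))
    return 100.0 * matched / total if total else 100.0


version_a = ["this bus always arrives early", "okay"]
version_b = ["this bus arrives always early", "ok"]
print(word_agreement(version_a, version_a))  # identical files
print(word_agreement(version_a, version_b))  # one of six words differs
```

With these inputs, identical files score 100.0, and the second comparison scores about 83.3 because five of the six words match. Note that only the spoken words enter the count in this sketch; speaker codes are not part of the bags.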

--Brian

Karin Edlund

Aug 20, 2020, 3:20:17 PM

to chib...@googlegroups.com

Hi Brian,

OK, I understand. If I get a lot of errors that say "tier names do not match", does that reduce the correspondence between the two transcriptions? And would that mean that tier names are included in the word matching?

The ID headers are exactly the same in both transcriptions, but at the utterance level we sometimes transcribe differently, and we also add different numbers of %com comment tiers. Because of that there is some shifting between the transcriptions, and so the tier names do not match, if I understand it correctly?

Thank you again and best regards,

Karin Edlund

Brian MacWhinney

Aug 20, 2020, 4:17:37 PM

to ChiBolts, Karin Edlund

Dear Karin,

The tier names are not included in the bag of words match, but they have to match and align in order for the program to run. If you think they are fully aligned and that this is somehow an error, I would need to see your input files. It is possible that adding additional tiers could be messing things up.
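For checking alignment by hand before running RELY, something like the following Python sketch may help locate where two files drift apart. It is not part of CLAN. It assumes that main tiers start with `*` and dependent tiers with `%`, each followed by a name and a colon, and the helper names `tier_names` and `first_tier_mismatch` are hypothetical. Whether RELY itself compares dependent tiers in this way is not stated in this thread.

```python
def tier_names(chat_text):
    """List the tier names (e.g. '*TEA', '%com') in the order they appear."""
    names = []
    for line in chat_text.splitlines():
        # Assumption: tier lines look like '*TEA:<tab>text' or '%com:<tab>text'.
        if line.startswith(("*", "%")) and ":" in line:
            names.append(line.split(":", 1)[0])
    return names


def first_tier_mismatch(text_a, text_b):
    """Return (position, name_a, name_b) for the first differing tier, or None."""
    names_a, names_b = tier_names(text_a), tier_names(text_b)
    for i, (a, b) in enumerate(zip(names_a, names_b)):
        if a != b:
            return i, a, b
    if len(names_a) != len(names_b):
        # One file simply has extra tiers at the end.
        i = min(len(names_a), len(names_b))
        a = names_a[i] if i < len(names_a) else None
        b = names_b[i] if i < len(names_b) else None
        return i, a, b
    return None


version_a = "*TEA:\thello class .\n%com:\tteacher waves\n*STU:\thello .\n"
version_b = "*TEA:\thello class .\n*STU:\thello .\n"
print(first_tier_mismatch(version_a, version_b))
```

Here the extra `%com` tier in the first version shifts everything after it, so the sketch reports a mismatch at position 1 (`%com` against `*STU`), even though the spoken words are identical.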

--Brian
