Creating sub-categories with GEM/GEMFREQ


Macarena Quiroga

Oct 12, 2021, 2:05:42 PM
to chibolts
Hi everyone,
I've got a group of files in which I want to mark segments with gems. I'll need to establish a hierarchy of categories, for example "reading-child-night" and "reading-adult-night". After marking the gems, I'll need to run FREQ analyses on them. Is there any way to run those analyses on each of those gems without losing track of the tags? I know I can use the +s switch to extract gems with "reading", for example, and then run FREQ, but I'm afraid that would lose the rest of the information. I'll also probably end up with a large number of different gem tags, so I don't think running a separate FREQ for each one of them would be a good idea.

In other words, the output I'd like to obtain is a spreadsheet with each distinct gem in the files as a row (for example, "reading-child-night" and "reading-adult-night" as different rows), and types, tokens and ratio as the columns.
I haven't started identifying the segments yet, so if there's a better way to do it, please let me know.
Thank you!

Leonid Spektor

Oct 12, 2021, 4:27:11 PM
to chib...@googlegroups.com
Hi,

The +f option allows you to specify a unique keyword that you can use to identify the file. Using your example, here is what you would do:

gem +n +sreading-child-night +fgem.reading-child-night +d1 +t*par *.cha
gem +n +sreading-adult-night +fgem.reading-adult-night +d1 +t*par *.cha

Then, assuming you are looking for speaker *PAR, run the FREQ command:

freq +d3 +t*PAR *.gem*.cex

In the stat.frq.xls output file, the first column will contain the filename, which includes the name of the GEM. For the commands above, the rows will start as follows:

File Language                   Corpus   Code   Age
.gem.reading-adult-night   ........................................
.gem.reading-child-night   ........................................
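
If you end up with many gem tags, typing one GEM command per tag gets tedious. Below is a minimal Python sketch (not part of CLAN) that collects the distinct gem labels from the @Bg: and @G: headers in your .cha files and prints one GEM command per label, using exactly the switches shown above, followed by the FREQ command. The function names collect_gem_tags and gem_command are made up for illustration, and the sketch assumes the labels contain no spaces and that the files are UTF-8 and sit in the current directory. You would still paste the printed lines into the CLAN Commands window yourself.

```python
import re
from pathlib import Path

# CHAT gem headers: "@Bg:" opens a gem and "@G:" marks a lazy gem.
# "@Eg:" (end of gem) is deliberately not matched.
GEM_HEADER = re.compile(r"^@(?:Bg|G):\s*(\S.*?)\s*$")


def collect_gem_tags(lines):
    """Return the sorted, de-duplicated gem labels found in CHAT lines."""
    tags = set()
    for line in lines:
        match = GEM_HEADER.match(line)
        if match:
            tags.add(match.group(1))
    return sorted(tags)


def gem_command(tag, speaker="*par", files="*.cha"):
    """Build one GEM command line with the switches used in this thread."""
    return f"gem +n +s{tag} +fgem.{tag} +d1 +t{speaker} {files}"


if __name__ == "__main__":
    all_lines = []
    # Assumption: the transcripts live in the current working directory.
    for path in sorted(Path(".").glob("*.cha")):
        all_lines.extend(path.read_text(encoding="utf-8").splitlines())
    for tag in collect_gem_tags(all_lines):
        print(gem_command(tag))
    # One FREQ run over all the extracted gem files, as in the reply above.
    print("freq +d3 +t*PAR *.gem*.cex")
```

Because each GEM command writes its output with the tag in the filename, the single FREQ run at the end should still give one row per tag in the spreadsheet.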


Leonid.
