MEGRASP for French


June 29, 2022, 2:29:46 PM
Hi everyone,

I would like to study different syntactic measures on my transcripts, but I saw that MEGRASP is not available for French in the current version of CLAN. I also read in the manual that it is possible to create a training corpus (and thus a megrasp.mod file), so I have two questions:
- Can you confirm that it is possible to use a training corpus in order to run MEGRASP on my French transcripts?
- I have just tried the procedure described in the manual, and I am not sure I am doing it right...
Here is part of the CLAN output (running MEGRASP in training mode on a short training corpus):
Finished processing file with 7 and 1 errors.
widthfactor = 1.000000
preparing for estimation...
number of samples = 66
number of features = 599
calculating empirical expectation...
performing LMVM
  0 of 301  logl(err) = -2.708050 (0.4091)
  1 of 301  logl(err) = -1.856078 (0.4091)
  2 of 301  logl(err) = -1.780707 (0.4091)


But I didn't find any megrasp.mod file among the core files. Do I have to use the English grammar for this procedure, with the megrasp.mod file created from a French training corpus?

Thanks a lot for your help,

Kind regards,

Fagniart S.
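
For anyone who wants to follow how training progresses, the iteration lines in the log quoted above can be pulled into numbers with a few lines of Python. This is only a sketch: it assumes the line layout is exactly as shown in the excerpt, and reading the value in parentheses as an error rate is a guess based on the `logl(err)` label, not something stated in the manual. The function name `parse_training_log` is made up for this example.

```python
import re

# Matches iteration lines such as "  1 of 301  logl(err) = -1.856078 (0.4091)".
# The layout is taken from the excerpt above; other CLAN versions may differ.
LINE_RE = re.compile(
    r"^\s*(\d+) of (\d+)\s+logl\(err\) = (-?\d+\.\d+) \((\d+\.\d+)\)"
)


def parse_training_log(text):
    """Return a list of (iteration, log_likelihood, parenthesized_value) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(3)), float(m.group(4))))
    return rows


# The three iteration lines quoted in the post.
SAMPLE = """\
performing LMVM
  0 of 301  logl(err) = -2.708050 (0.4091)
  1 of 301  logl(err) = -1.856078 (0.4091)
  2 of 301  logl(err) = -1.780707 (0.4091)
"""

rows = parse_training_log(SAMPLE)
for iteration, logl, value in rows:
    print(iteration, logl, value)
```

In the excerpt, the log-likelihood rises from one iteration to the next while the value in parentheses stays at 0.4091.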

June 29, 2022, 2:44:14 PM
Editing my message:

The answer was in my question: by using the English grammar, I succeeded in running MEGRASP on my training corpus. MEGRASP now runs on all my transcripts; I now need to check whether its output is reliable.

If anyone has tried the same thing on French transcripts, I would be very interested in any feedback.


Fagniart S.
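
One way to put a number on "reliable" is to hand-correct the %gra tier for a few utterances and compare it with the automatic one. The sketch below assumes each %gra item has the form `index|head|RELATION` and computes two standard agreement figures: the share of words with the correct head (unlabeled) and the share with the correct head and relation (labeled). The function names and the example tiers are invented for illustration; they are not CLAN commands or real data.

```python
def parse_gra(tier):
    """Turn '1|2|DET 2|0|ROOT' into {1: (2, 'DET'), 2: (0, 'ROOT')}."""
    relations = {}
    # Drop the tier label if the whole line was pasted in.
    for item in tier.replace("%gra:", "").split():
        index, head, relation = item.split("|")
        relations[int(index)] = (int(head), relation)
    return relations


def attachment_scores(auto_tier, gold_tier):
    """Return (unlabeled, labeled) agreement between two %gra tiers."""
    auto, gold = parse_gra(auto_tier), parse_gra(gold_tier)
    if not gold or auto.keys() != gold.keys():
        raise ValueError("tiers must be non-empty and cover the same word indices")
    n = len(gold)
    unlabeled = sum(auto[i][0] == gold[i][0] for i in gold) / n
    labeled = sum(auto[i] == gold[i] for i in gold) / n
    return unlabeled, labeled


# Invented example: the automatic tier has one wrong head and one wrong label.
auto_tier = "%gra:\t1|3|DET 2|3|OBJ 3|0|ROOT 4|3|PUNCT"
gold_tier = "%gra:\t1|2|DET 2|3|SUBJ 3|0|ROOT 4|3|PUNCT"

unlabeled, labeled = attachment_scores(auto_tier, gold_tier)
print(unlabeled, labeled)
```

Averaging these scores over a sample of corrected utterances would give a rough picture of how far the English-trained model can be trusted on French.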

Brian MacWhinney

June 29, 2022, 3:52:11 PM
Dear Sophie,
    We don’t distribute the training corpora for each language along with the MOR grammars, so you are not seeing the crucial piece that would be needed for French.  You can read chapter 11 of the MOR manual to understand which grammatical relations would have to be tagged.  I would say that creating a reasonable training corpus for French would take about 10 days of solid work.  It could be a bit faster if one uses the trick of starting by using the English MOR.
    Later this summer I will be exploring the use of Universal Dependency taggers for this purpose, but I can’t promise anything about this now.

—Brian MacWhinney


June 30, 2022, 3:32:59 AM
Dear Mr MacWhinney,

Thanks a lot for your answer. I understand that creating a solid training corpus will take considerable effort. As you suggest, I'll start by using the English MEGRASP function to annotate the grammatical relations in some transcripts: it will be a useful base (the basic sentence structures of French and English being quite similar) for then correcting the tagging by following the manual's instructions.
I'm looking forward to seeing the results of your summer work.

Kind regards,


Brian MacWhinney

June 30, 2022, 11:32:42 AM
Yes, that approach will work to some degree. If you manage by this method to create a properly tagged corpus, we can then turn around and use it as a training corpus for French.
