Concordance Analysis for a parallel corpus

ah.hai...@gmail.com

Oct 25, 2016, 6:59:46 AM

to WordSmith Tools

Hi Mike,

I am currently working on a parallel corpus of Arabic and English, and was wondering whether WS6 (the version I am now using) enables me to create concordance lines for a particular node word in one language and get the equivalent concordance lines in the other language.

It is worth mentioning that the two corpora are saved as two .txt files and are aligned at sentence level, with exactly one sentence per line; i.e. line 1 (sentence 1) in the Arabic file corresponds to line 1 (sentence 1) in the English file.

Yesterday, I used the aligner in WS6, and the results were more than perfect, but still I couldn’t carry out a concordance analysis for the two files.

I have tried AntPConc, but found that it doesn’t seem to fully support Arabic.

I was wondering if you could consider adding such a feature to WS. That would be really great.

Regards,

Ahmad

Mike Scott

Oct 25, 2016, 11:41:48 AM

to WordSmith Tools

Ahmad, hi

It ought to be possible to get the Aligner to display all the English and the Arabic pairs of sentences where the English contains a given word or phrase, yes. That wouldn't be a normal concordance --- because WordSmith doesn't know any translation equivalences so it can only show the corresponding whole sentence without knowing which word(s) correspond in meaning to a given search-word --- but it might suit your needs, mightn't it?

I could consider doing that for WS7, but not WS6 which like WS1 - WS5 isn't being developed any more. Essentially you'd type in a search-word and the Aligner would take all sentences in one language which contained that together with all the corresponding sentences of the other language, and I think it'd open those up in a new copy of the Aligner as a kind of follow-up.
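
As a rough illustration outside WordSmith itself, the lookup described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not how the Aligner is implemented: the function name `find_parallel_pairs` is hypothetical, the sample sentences are invented, and it assumes the two texts are already aligned one sentence per line, as in Ahmad's files.

```python
import re
from typing import List, Tuple


def find_parallel_pairs(
    source_sentences: List[str],
    target_sentences: List[str],
    search_word: str,
) -> List[Tuple[int, str, str]]:
    """Return (line number, source sentence, target sentence) for every
    source sentence that contains search_word as a whole word.

    Hypothetical helper for illustration; it relies purely on line
    alignment and knows nothing about translation equivalents.
    """
    if len(source_sentences) != len(target_sentences):
        raise ValueError("both texts must have the same number of lines")
    # For str patterns in Python 3, \b word boundaries are Unicode-aware.
    pattern = re.compile(r"\b" + re.escape(search_word) + r"\b", re.IGNORECASE)
    return [
        (number, source, target)
        for number, (source, target) in enumerate(
            zip(source_sentences, target_sentences), start=1
        )
        if pattern.search(source)
    ]


# Invented sample data: line n of one list corresponds to line n of the other.
english = [
    "The book is on the table.",
    "She reads every day.",
    "I bought a new book.",
]
arabic = [
    "الكتاب على الطاولة.",
    "هي تقرأ كل يوم.",
    "اشتريت كتابا جديدا.",
]

for number, en, ar in find_parallel_pairs(english, arabic, "book"):
    print(number, en, ar, sep="\t")
```

With real files, the two lists could be read with `open(path, encoding="utf-8").read().splitlines()`; the search can run in either direction by swapping the two arguments.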

Let me know whether that is what you'd like.

Cheers -- Mike

ah.hai...@gmail.com

Oct 25, 2016, 1:24:36 PM

to WordSmith Tools

Hi Mike,

Yes, that would be really good. So, if my search word is (X), I will get all of the sentences that contain that word in one language as in the concordance tool (KWIC: the node word with the surrounding words to the left and right), followed by their corresponding sentences in the other language, without knowing which word(s) correspond in meaning to (X). Would it also be possible to sort the sentences by a given number of words to the left or right of (X) (L1, L2, ...)? If yes, this will be really helpful and useful for anyone working on parallel corpora.

	X
The corresponding sentence in the other language
	X
The corresponding sentence in the other language
	X
The corresponding sentence in the other language
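
To make the layout and the L1/L2 sorting concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration: the names `KwicLine`, `parallel_kwic` and `sort_by_position` are hypothetical, the sentences and the `AR-n` placeholders are invented, and splitting on `\w+` is a simplification (it would, for example, break up fully vocalised Arabic words). It is not a description of how WordSmith sorts concordances.

```python
import re
from typing import List, NamedTuple


class KwicLine(NamedTuple):
    left: List[str]    # words before the node word
    node: str          # the node word as it appears in the text
    right: List[str]   # words after the node word
    translation: str   # the aligned sentence in the other language


def parallel_kwic(source: List[str], target: List[str], node_word: str) -> List[KwicLine]:
    """Build one KWIC line per occurrence of node_word in the source text."""
    lines = []
    for sentence, translation in zip(source, target):
        words = re.findall(r"\w+", sentence)  # simplistic tokenisation
        for i, word in enumerate(words):
            if word.lower() == node_word.lower():
                lines.append(KwicLine(words[:i], word, words[i + 1:], translation))
    return lines


def sort_by_position(lines: List[KwicLine], position: int) -> List[KwicLine]:
    """Sort by the word at a given offset: -1 is L1, -2 is L2, 1 is R1, 2 is R2."""
    if position == 0:
        raise ValueError("position must be negative (left) or positive (right)")

    def key(line: KwicLine) -> str:
        if position < 0:
            index = len(line.left) + position
            return line.left[index].lower() if index >= 0 else ""
        index = position - 1
        return line.right[index].lower() if index < len(line.right) else ""

    return sorted(lines, key=key)


# Invented sample data; AR-1..AR-3 stand in for the aligned Arabic sentences.
english = [
    "I bought a new book yesterday.",
    "The old book was lost.",
    "She wrote a book about birds.",
]
arabic = ["AR-1", "AR-2", "AR-3"]

for line in sort_by_position(parallel_kwic(english, arabic, "book"), -1):
    print(" ".join(line.left), line.node, " ".join(line.right), sep="\t")
    print(line.translation)
```

Each hit prints as a KWIC line followed by its aligned sentence, matching the layout above; passing `1` instead of `-1` sorts on the first word to the right of the node.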

Thanks,

Ahmad

Mike Scott

Oct 25, 2016, 2:31:24 PM

to WordSmith Tools

I expect so. I will try this when I get time -- it won't be for a while, as I am busy with research myself at the moment.

Cheers -- Mike

ah.ha...@gmail.com

Jan 6, 2017, 9:46:24 AM

to WordSmith Tools

Hi Mike,

I was wondering whether the new feature for processing parallel corpora will be available to WS7 users soon.

Wish you a Happy and Prosperous New Year 2017,

Ahmad

Mike Scott

Jan 8, 2017, 6:00:49 AM

to WordSmith Tools

Thanks for the best wishes and a Happy New Year to all from me too!

I'm working on this feature, Ahmad. I hope to have something for you soon.

Cheers -- Mike

Mike Scott

Jan 22, 2017, 5:57:11 AM

to WordSmith Tools

Ahmad, hi

I'm slowly getting there. There is already a search facility I've put in, in the Aligner, and it does a language-specific search but the function is still needing improvement. I should be able to get it to save all relevant pairs/triplets but that is not yet done. When I have more done, I will improve the Help on Aligner so you could look out for a reference to searching there.

Cheers -- Mike

Mike Scott

Jan 29, 2017, 12:49:11 PM

to WordSmith Tools

Ahmad, hi

At http://lexically.net/downloads/version7/HTML/index.html?viewer_settings_info.htm you will see I've been working on ensuring searching works in multiple languages and with a chance to save language-specific colours. You can already test the save and search in English and Arabic.

Now I need to generate a way to show only the lines which contain search hits. That will come soon.

Cheers -- Mike

Mike Scott

Feb 17, 2017, 1:25:39 PM

to WordSmith Tools

At last, there is a version which I hope will do what you need, Ahmad. See http://lexically.net/downloads/version7/HTML/index.html?parallel_concordancing.htm