elwiki and elwiktionary update request


arnaud prinstet

Mar 11, 2023, 11:18:02 AM
to aard...@googlegroups.com
Please update the Greek Wiktionary and Wikipedia! Thanks in advance!

aard...@gmail.com

Mar 16, 2023, 7:34:50 AM
to aarddict
Will be uploaded today. 
Check it out on ftp.halifax.rwth-aachen.de/aarddict/elwiki
and let me know if this is what you are looking for.
Markus 

Arnaud Prinstet

Mar 16, 2023, 12:49:15 PM
to aard...@googlegroups.com
Thanks a lot! For now the elwiki directory still doesn't exist, but I am waiting for it!


aard...@gmail.com

Mar 16, 2023, 5:35:04 PM
to aarddict
It is synchronized and available now. :)
Have fun

Arnaud Prinstet

Mar 17, 2023, 1:02:03 AM
to aard...@googlegroups.com
This is great! Many thanks!!!


--
You received this message because you are subscribed to a topic in the Google Groups "aarddict" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/aarddict/eu_q33XbRhk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to aarddict+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aarddict/5c36272a-7c0b-4f4a-806b-f5f464bddcabn%40googlegroups.com.

Arnaud Prinstet

Mar 17, 2023, 1:20:59 AM
to aard...@googlegroups.com
Just one strange thing: the size of the elwiktionary (193 MB) is smaller than last year's version (210 MB). I don't know if that is normal?


AardFeeder

Mar 17, 2023, 5:04:41 AM
to aard...@googlegroups.com

Short answer: yes

Long answer:

I did not create the other elwiktionary. However, I guess that the older version uses the standard compression.

As our phones have become more powerful, I ran some tests and concluded that higher compression does not affect usability. I am not using a top-notch phone but a mid-range Galaxy A52, and I can't see a difference.

And I have never received a complaint that the wikis are sluggish.

So my (new) standard is 1024 chunks instead of 384, which makes the files smaller.
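
The effect of the chunk size on file size can be sketched roughly as follows. This is only an illustration, not the code that builds the Aard dictionaries. It assumes that 384 and 1024 are the sizes, in KiB, of blocks that are compressed independently (the message does not state the unit). It uses Python's lzma module as a stand-in compressor, and `sample_corpus` and `compressed_size` are hypothetical helpers working on synthetic text.

```python
import lzma
import random


def sample_corpus(seed: int = 0) -> bytes:
    """Build synthetic text in which whole 'articles' repeat at random positions."""
    rng = random.Random(seed)
    vocab = [f"word{n}" for n in range(2000)]
    articles = [" ".join(rng.choice(vocab) for _ in range(300)) for _ in range(200)]
    return "\n".join(rng.choice(articles) for _ in range(1000)).encode()


def compressed_size(data: bytes, chunk_kib: int) -> int:
    """Compress data in independent chunks of chunk_kib KiB; return total bytes."""
    chunk = chunk_kib * 1024
    return sum(
        len(lzma.compress(data[start:start + chunk]))
        for start in range(0, len(data), chunk)
    )


corpus = sample_corpus()
small_chunks = compressed_size(corpus, 384)    # assumed old setting
large_chunks = compressed_size(corpus, 1024)   # assumed new setting
print(f"384 KiB chunks:  {small_chunks} bytes")
print(f"1024 KiB chunks: {large_chunks} bytes")
```

On this synthetic corpus the run with larger chunks comes out smaller, because each chunk can reuse repeated text from a larger window. The saving on a real wiki depends on its content, and larger chunks also mean more data has to be decompressed to read a single article.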


Arnaud Prinstet

Mar 17, 2023, 8:20:15 AM
to aard...@googlegroups.com
Great! Thank you for your detailed response, for the great work that you do for Aard, and for publishing regular updates of the Wiktionary and Wikipedia archives in all languages!


arnaud prinstet

Mar 12, 2024, 1:51:10 PM
to aard...@googlegroups.com
Thanks for the great work that you do on those updates. Following the Greek Wiktionary and Wikipedia, I note that the Greek Wiktionary updates seem to be going fine. As for Wikipedia, the elwiki 20240201 counts 235.524 items, so more than the 20231201 (229.878 items) but still fewer than the elwiki20230901 (255.556 items), so I am sticking with the September 2023 update!



AardF...@web.de

Mar 15, 2024, 4:05:33 AM
to aarddict
I hear you. 
I have the same concerns.
That's why I tracked the content and compared it against the dumps and the scraping. Both seem to have glitches in providing data.
I am referring here to the dumps, as I use scraping for the Wiktionaries only.
And in the end I am not sure which count is correct. It looks to me like the elwiki20230901 (255.556 items) is overstated.
According to the current statistics, the Greek Wikipedia has 232999 articles as of today, compared to the 237139 of the dump file of March 1st, 2024.
The article count in the slob files counts blobs. Some blobs are needed for internal organisation, so there are slightly more blobs than articles. In this case around 4000 blobs are for internal organisation. And this is the best number we can get.
It is _very_ close to the actual number of articles.
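
As a quick check of the "around 4000" figure, the two counts quoted above can be subtracted. This is a small sketch using only the numbers from this message. The variable names are made up for illustration, and attributing the whole difference to internal blobs is the estimate given above, not something the arithmetic proves (the statistics and the dump were also taken on different days).

```python
# Figures quoted in the message above.
live_articles = 232_999   # article count from the current statistics
dump_blobs = 237_139      # blob count of the March 1st, 2024 dump file

# Blobs that do not correspond to a counted article.
extra_blobs = dump_blobs - live_articles
share = extra_blobs / dump_blobs

print(f"extra blobs: {extra_blobs}")
print(f"share of all blobs: {share:.2%}")
```

The difference is 4140 blobs, a bit under 2% of the total.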

These are the historical values for the elwikis:
elwiki20230601   703072 kB   blob count: 225960
elwiki202308*    894704 kB   blob count: 253935
elwiki202312*    718108 kB   blob count: 229878
elwiki202401*    748140 kB   blob count: 235884
elwiki202402*    734596 kB   blob count: 235524
elwiki202403*    746812 kB   blob count: 237139
which looks pretty accurate. 
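
To compare the releases at a glance, the list above can be parsed and the change in blob count from one release to the next computed. This is a minimal sketch: the data is copied from the list above, while `parse_history`, `blob_deltas` and the regular expression are hypothetical helpers written for this example, not part of any Aard tooling.

```python
import re

# Values copied from the list above.
HISTORY = """\
elwiki20230601 703072kB blob count: 225960
elwiki202308* 894704kB blob count: 253935
elwiki202312* 718108kB blob count: 229878
elwiki202401* 748140kB blob count: 235884
elwiki202402* 734596kB blob count: 235524
elwiki202403* 746812kB blob count: 237139
"""

ROW = re.compile(r"(elwiki\d+\*?)\s+(\d+)\s*kB\s+blob count:\s*(\d+)")


def parse_history(text):
    """Return (name, size_kb, blobs) tuples, one per listed release."""
    return [(m[1], int(m[2]), int(m[3])) for m in ROW.finditer(text)]


def blob_deltas(rows):
    """Change in blob count from each release to the next."""
    return [(cur[0], cur[2] - prev[2]) for prev, cur in zip(rows, rows[1:])]


rows = parse_history(HISTORY)
for name, delta in blob_deltas(rows):
    print(f"{name:<16}{delta:+d} blobs")
for name, size_kb, blobs in rows:
    print(f"{name:<16}{size_kb / blobs:.2f} kB per blob")
```

The 202308 build stands out with a jump of 27975 blobs followed by a drop of 24057, while the later releases change by about 6000 blobs or fewer.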

I can only compile the data I get. The content is given.

But of course you can use whatever version you like,

have fun
Markus



Arnaud Prinstet

Mar 15, 2024, 1:49:03 PM
to 'AardF...@web.de' via aarddict
Thank you for those explanations and for the great and very useful work that you do on those updates! So now I am reassured and will stick with the latest update!
A big thank you for everything.
