About the static dictionary


Cody L

Nov 12, 2020, 6:06:24 PM
to Brotli
Where is the best place to ask questions about the contents of the static dictionary?

eus...@google.com

Dec 21, 2020, 8:19:32 AM
to Brotli
Hello.

Feel free to ask such questions in this group.

Cody L

Dec 23, 2020, 12:48:41 AM
to Brotli
Hello. Thank you for getting back to me.

Based on a little bit of reading I have done in this group and through the GitHub repository, I get the impression that the static dictionary was generated by analyzing many sources of text and intelligently selecting and extracting text fragments.

My question is, was there any inspection of the resulting static dictionary for possibly offensive words? I ask after inspecting the dictionary myself (i.e. the one provided in RFC 7932).

Thanks again for your time,

eus...@google.com

Jan 8, 2021, 7:13:39 AM
to Brotli
I believe Jyrki Alakuijala knows the answer.

There are a few obscene words in the dictionary (including the f-word), but that reflects the state of the internet and other text corpora at the time the dictionary was created.

Cody L

Jan 11, 2021, 10:33:58 PM
to Brotli
Okay, makes sense. Hopefully the words aren't too obscene (at least not more obscene than the f-word). I noticed that the dictionary also has non-English words; is there any chance either Jyrki Alakuijala or someone else on the Brotli team has checked those as well?

eus...@google.com

Jan 12, 2021, 8:02:15 AM
to Brotli
I took a look at the Russian part: it is clean (no obscene words). There are also Chinese, Hindi and Arabic parts, though...
Words that do not belong to those 5 languages are native language names (e.g. Ελληνικά for Greek), pieces of JS, CSS and HTML, and years (e.g. 2020)...
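
For anyone who wants to repeat this kind of inspection, here is a minimal Python sketch, not the Brotli team's tooling. It assumes `data` already holds the 122,784 raw dictionary bytes from Appendix A of RFC 7932 (loading them is left to you). The NDBITS table is the one given in Section 8 of the RFC; the helper names `split_words`, `rough_script` and `script_histogram` are made up for illustration, and the script guess is only a rough heuristic based on the first non-ASCII letter of each word.

```python
import unicodedata
from collections import Counter

# Bit widths from RFC 7932, Section 8: there are 1 << NDBITS[n] words of length n.
NDBITS = [0, 0, 0, 0, 10, 10, 11, 11, 10, 10, 10, 10, 10,
          9, 9, 8, 7, 7, 8, 7, 7, 6, 6, 5, 5]
DICT_SIZE = 122784  # total dictionary size in bytes


def split_words(data: bytes) -> list:
    """Cut the raw dictionary into its fixed-length words (lengths 4..24)."""
    if len(data) != DICT_SIZE:
        raise ValueError(f"expected {DICT_SIZE} bytes, got {len(data)}")
    words, offset = [], 0
    for length in range(4, 25):
        for _ in range(1 << NDBITS[length]):
            words.append(data[offset:offset + length])
            offset += length
    return words


def rough_script(word: bytes) -> str:
    """Guess a script from the first non-ASCII letter (illustrative heuristic)."""
    text = word.decode("utf-8", errors="ignore")
    for ch in text:
        if ord(ch) > 127 and ch.isalpha():
            # Unicode names start with the script, e.g. "CYRILLIC SMALL LETTER PE".
            return unicodedata.name(ch, "UNKNOWN").split()[0]
    return "ASCII"


def script_histogram(data: bytes) -> Counter:
    """Count dictionary words per guessed script."""
    return Counter(rough_script(w) for w in split_words(data))
```

Calling `script_histogram(data)` gives a per-script word count, and filtering `split_words(data)` by `rough_script` yields a list per script that a native speaker can then review by eye. Words labelled "ASCII" still mix English with JS, CSS and HTML fragments, so that bucket needs a manual pass too.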

Cody L

Jan 19, 2021, 9:57:04 PM
to Brotli
Cool for now. I'll keep watching this discussion in case someone else finds something/nothing.

Thank you for your time,