José Aguirre's metalexicography


Gilles-Maurice de Schryver

Nov 8, 2012, 3:52:15 PM
to José Aguirre, eur...@freelists.org, asi...@freelists.org, afr...@freelists.org, DS...@yahoogroups.com, lexicogr...@yahoogroups.com, is...@lists.le.ac.uk, lexico...@googlegroups.com

Hello José Aguirre,

 

Are we living on the same planet? Are we sharing the same discipline?

 

> First, as for web connectivity,

 

The digital revolution is not only about online dictionaries, and thus the need to be connected to the Internet at all times.

 

> By a strange coincidence, richer countries tend to have better Internet access

 

If this is really inexplicable to you, one of us could try to explain ... any volunteers?

 

> My main point is that this "paperless" "Second Revolution" in lexicography has to do more with money and marketing strategies than anything else.

 

I invite you to go back to the early days, say the Hector project dreamed up by such giants as Sue Atkins and Patrick Hanks. Money? Marketing? No, it was and is all about making better dictionaries and offering new types of results in a new environment.

 

> if I jot down a random Chinese character on a piece of paper for both of us to look up, 10 times out of 10 I will find it in a paper dictionary before you do in your digital dictionary.

 

So here’s a nice challenge for the CJK gurus! If Jack Halpern’s tools can’t already beat you on this, let this be the Deep Blue lexicographic equivalent. Jack!?

 

> clicking ...

 

True browse modes, mimicking the paper experience, can easily be created, Kindle-style. (Stop thinking that ads and thus the need for clicks are the only possible business model.)

 

> Higgs boson

 

As we know, dictionary criticism is part of metalexicography, so always welcome! It’s one of the ways to improve the ‘classic’ contents of dictionaries. As all practicing lexicographers will confirm: we take reviews seriously, and if a good point is made, we improve in the next edition (paper) or the next day (online).

 

> Open Dictionary

 

More metalexicography. My reply: Well, yes, that’s what you get with users’ input. Lexicography is an art, so we shouldn’t worry too much that we’ll all be out of a job too soon.

 

> The main constraint of a paper dictionary used to be the space available to each entry.

 

Not at all. This is the most trivial one, yes. What defined paper is the fact that it’s all about text and figures only, no sound, no video, no way to talk to it (well, not in a meaningful, look-up mode), no way to go from any word straight into a corpus to see real use (as was done, a decade ago, in the eCOBUILD with its Wordbank), no way to see Adam Kilgarriff-style wordsketches being proffered anywhere as pop-ups, no way to see Patrick Hanks-style verb patterns to make sense of how words are really being used (to then take you to the meaning), etc. Think out of the box when thinking digital, not just plain paper in electronic form, please.

 

> Definition of “dictionary”

 

Indeed, it needs a bit of an update, doesn’t it!? (And the examples could be multiplied a hundred-fold: writing in English in and about lexicography has unfortunately suffered from a bit too much navel gazing, as any lexicographer working on an ‘exotic’ language will confirm. What to do about it? Let us -- we, who are not working with and on and in English -- publish more, so that the Anglo-Saxons will take note. It’s not their fault, it’s ours. Do you have a definition for “dictionary” from a Korean or Chinese dictionary at hand? Would love to see those (translated, as Google Translate will make a dog’s breakfast of it).)

 

> So, please, tell me, what was new in all this, where was the revolution, where is "the true power of the digital medium" exploited here?

 

Unfortunately, and typical of most metalexicography, your criticism is just that: criticism. Your piece is short on ways to do better, solutions to the problems you point out, answers to the questions you raise. Getting us to do that will lead to the new dictionaries we are hoping to see. A revolution is the starting point. It’s not 1789 that is important, it’s what came after. Similarly, 2013 will only be remembered as the starting point of a new type of lexicography.

 

> despite all media hype and shock headline therapy, the Second Revolution in lexicography has not happened

 

I beg to differ.

 

All best,

Gilles-Maurice.

 

 

From: José Aguirre [mailto:jagui...@yahoo.com]
Sent: Wednesday 7 November 2012 14:58
To: eur...@freelists.org; gillesmauric...@UGent.be; Michael Rundell
Subject: Re: [euralex] Re: End of print dictionaries at Macmillan

 

I'd like to reply to some of Michael Rundell's arguments in favour of the online dictionaries.

First, as for web connectivity, Wikipedia, quoting the International Telecommunication Union, states that in 2011 65% of the world population was "Not using the Internet" as opposed to 35% "Using the Internet" [http://en.wikipedia.org/wiki/Internet_access]. I.e., roughly two thirds of the world don't access the web. By a strange coincidence, richer countries tend to have better Internet access (we'll probably have to wait until mining companies get wider mining rights to see the situation improve). My main point is that this "paperless" "Second Revolution" in lexicography has to do more with money and marketing strategies than anything else.

Secondly, digital dictionaries can be divided between "online dictionaries" (that require Internet access) and "electronic dictionaries" (that require just an electronic device that can run that dictionary). Commercial ads in an electronic dictionary would quickly be labelled adware and would be frowned upon by anyone from here to Antarctica. Commercial ads in an online dictionary... well, what do you expect? Surely someone has to pay for that! Yesterday I wrote in a hurry and thought this would be about subscription fees; Michael Rundell himself pointed out that this would be unlikely, even if desirable.

Both types of digital dictionaries have some advantages over paper dictionaries, but only some. And by saying this I don't advocate going back to living in caves. Your online dictionary will be of no use when, for any number of reasons, you are not online. Your electronic dictionary will usually get you the information you want faster than your paper dictionary. Usually, but not always. If you have to switch on your computer, by the time your operating system finishes loading I will have found in a paper dictionary the entry we were looking for. Even with your computer switched on already, if I jot down a random Chinese character on a piece of paper for both of us to look up, 10 times out of 10 I will find it in a paper dictionary before you do in your digital dictionary.

As for browsing, well, it might seem we use the same word, but in reality we do not mean the same thing. Browsing a page in a printed dictionary can give a certain amount of unquantifiable information about the relevance and place of an entry word in a list of lemmas, etc. Your idea of browsing in fact means "clicking". And this is the key to the whole "revolution". In an online dictionary you can click everywhere. In fact, you should click anywhere. A click is a "hit". If you want to place your ads somewhere on the internet, you want to find a website that attracts more "hits" than others, so that more people will see your ads. You can even have your website design geared to attracting hits. For instance, do not provide a scrollbar with a list of lemmas. Scrolling down is not "clicking"; when you scroll down you do not hit. Make anything clickable and you'll generate more hits, and more hits will mean higher fees for more ads. That is the "Second Revolution" in lexicography.

Let me quote from the "Stop the presses – the end of the printed dictionary" (http://www.macmillandictionaryblog.com/bye-print-dictionary):

"The digital medium is the best platform for a dictionary. One of its advantages is that we can now provide all kinds of supplementary resources – like this blog."

Non sequitur. Like in "This is the best basket to carry apples. One of its advantages is that we can now put into it all sorts of fruits and vegetables." But let's have a look at the blog.

One of the jewels of that blog seems to be Kerry Maxwell’s weekly Buzzwords column, which "has been keeping us up-to-date with changes in the language for almost ten years." Well, it seems to amount to one post per week, each devoted to one word. From the list, I chose just one recent entry, "Higgs boson", where it is defined as:

"noun [countable]
in physics, a particle (= an extremely small piece of matter that is part of an atom) that could explain where mass (= the amount of matter that something contains) comes from."

Then, one single quotation from CNN dated 5th July 2012:

'It's like molasses! But sort of like the air! Yet it also behaves like fans of Justin Bieber! Everyone's talking about the Higgs boson, even though there's no really great metaphor for describing what it is and how it works. We know that this particle is responsible for the fact that matter - i.e. the stuff we are made of - has mass.'

Using a printed copy of the Oxford English Dictionary Additions Series, vol. 2, I could, at a glance, find a quotation for "Higgs boson" dated 1974. Of course the CNN quotation is much more fun (it even manages to mention Justin Bieber and Higgs in the same paragraph) and cooler than the ones that have been appearing in physics literature for the past 40 years. No mention of who Higgs may be or what a boson might be. Seriously, is this all you could come up with? Is this keeping us up-to-date with changes in the language? I can't help but think of what John Algeo and his colleagues from "Among the New Words" could have done had they had the digital resources that you seem to have at your disposal.

The final pearl of the blog is the Open Dictionary, where users can send their own entries. You wrote: "Thirty years ago, the arrival of corpus data sparked a revolution in the way dictionaries are created." It must have been a pretty short-lived revolution when you need user-made examples like this one to illustrate the use of "graph" as a verb: "This site has graphs that graph what's what."

Sorry, but the one thing the Internet could do without is one more blog on language issues full of trendy words but no real substance.

You wrote: "Finally getting rid of the paper constraints, and starting to exploit the true power of the digital medium -- and to be able to do just that -- is nothing less than a revolution."

The main constraint of a paper dictionary used to be the space available to each entry. Now, the online dictionary reproduces verbatim the whole entry for the word "dictionary" exactly as it was printed in my 2002 paper copy of Macmillan English Dictionary for Advanced Learners. When I showed this definition ("a book that gives a list of words in alphabetical order and explains what they mean") to some of my students, those from European countries had nothing to comment, while my Korean and Chinese students were scratching their heads trying to reconcile that with their own dictionaries. Koreans, after some interesting discussion, agreed that they could accept it provided they were allowed to stretch the meaning of "alphabetical" quite a bit. In the Chinese lexicographic tradition based on Kangxi radicals this definition just would not work: words are not arranged in alphabetical order, and yet these are dictionaries. Michael Rundell mentioned that it was this kind of young "cohort", or "market segment", that they were trying to reach. Well, if you intend to wean them off their Naver and Daum bilingual dictionaries, you'll have to do much better than that.

When I wrote yesterday about "recycling old stuff over and over again to diversify their products", I meant exactly this: a 2002 definition lifted from a printed dictionary is "released of all its paper constraints" and placed verbatim on an online dictionary. The remaining differences between the online version and the paper one are: there is a clickable button to hear the pronunciation (remember, 1 click = 1 hit), though this has been around in electronic dictionaries for the past 20 years; you can also click on a number of words to go to their definitions (1 click = 1 hit). You can also click on a thesaurus entry for every sense of the word that will show a list of 10 items (of course you can click on the "more" button to see more, but you won't get any sense discrimination between related words, just a list of words with their definitions... ). Bear in mind, while you do all this clicking, that the page is almost empty of any useful information; all that space could have been designed to accommodate a better "user experience" without any clutter at all, but that wouldn't generate clicks (remember, no clicks, no hits). So, please, tell me, what was new in all this, where was the revolution, where is "the true power of the digital medium" exploited here?

I wish Macmillan all the best in whatever business model they choose, but let's be clear about this: despite all media hype and shock headline therapy, the Second Revolution in lexicography has not happened.

Best wishes

José Aguirre

 

Gilles-Maurice de Schryver

Nov 8, 2012, 5:20:01 PM
to David Joffe, José Aguirre, eur...@freelists.org, asi...@freelists.org, afr...@freelists.org, DS...@yahoogroups.com, lexicogr...@yahoogroups.com, is...@lists.le.ac.uk, lexico...@googlegroups.com

Thanks David -- perhaps we ought to declare interest here: we're (also) in the language technology business (but not for CJK, yet) ...

 

So, based on the video at the link below, I'd say the contest has been won by the dictionary of the future already (unless José Aguirre's handwriting is that of a medical doctor: sorry wanted a lighter note ;-).

 

Perhaps I should also point out that more than just single Kanji characters are recognized at a time here (which was the initial challenge): the dictionary of the future recognizes full meaningful chunks, to take one from the video ‘flash’.

 

That's thus 10 for the electronic dictionary, zero for the paper dictionary.

 

Let's bring on the next challenge, please!

 

This is not a joke, and this is not about techies having fun. Colleagues, what we mean when we say that the "second revolution" in our field has arrived is exactly examples like this. Leave the paper world behind, and start viewing lexicography in the digital age. Coming up with new solutions to the age-old look-up problems in Chinese and Japanese dictionaries is one of them.

 

All best,

Gilles-Maurice.

 

 

-----Original Message-----
From: David Joffe [mailto:david...@tshwanedje.com]
Sent: Thursday 8 November 2012 22:39
To: 'José Aguirre'; gillesmauric...@UGent.be
Cc: eur...@freelists.org; asi...@freelists.org; afr...@freelists.org; DS...@yahoogroups.com; lexicogr...@yahoogroups.com; is...@lists.le.ac.uk; lexico...@googlegroups.com
Subject: Re: [euralex] José Aguirre's metalexicography

 

On 8 Nov 2012 at 21:52, Gilles-Maurice de Schryver wrote:

 

> > if I jot down a random Chinese character on a piece of paper for both of us to look up, 10 times out of 10 I will find it in a paper dictionary before you do in your digital dictionary.

> So here's a nice challenge for the CJK gurus! If Jack Halpern's tools can't already beat you on this, let this be the Deep Blue lexicographic equivalent. Jack!?

 

If I am not mistaken, digital solutions for this problem have already begun to be implemented, e.g.:

 

http://www.techinasia.com/pleco-dictionary-android/

 

Basically, point your smartphone camera at a character, it runs it through OCR, and performs a dictionary search for you. I'm sure it's not perfect, but it's first-generation technology ... I don't know how this particular implementation would perform in a '10 attempts' 'paper vs electronic' contest, but I expect these methods would improve a lot in the next 10 years:

 

 

".. the Android iteration of Pleco dictionary has today gone gold, and now finds a home in the Android Market. It comes with OCR abilities so that it can scan and ‘read’ Chinese characters using your smartphone’s camera, handwriting support, voice recognition, and numerous dictionary options.

 

Its range of features means that it can be used by the most casual of tourists who might want to scan a menu whilst visiting China, to the most studious of students of the Chinese language who might need to add specialist dictionaries and make flashcards"

 

 

 

- David

 

Michael Beijer

Nov 8, 2012, 6:39:07 PM
to lexico...@googlegroups.com, José Aguirre, eur...@freelists.org, asi...@freelists.org, afr...@freelists.org, DS...@yahoogroups.com, lexicogr...@yahoogroups.com, is...@lists.le.ac.uk, gillesmauric...@ugent.be
Hi Gilles-Maurice,

I am fully in the digital corner myself. 

I am a translator, and the sooner we move everything to digital the better. You have no idea how I wish I could somehow get my entire bookshelf of quickly aging specialist bilingual dictionaries into my CAT tool (translation software). I have piles of dictionaries that still contain very good content, but that content is basically locked away in their paper pages and will soon just be lost. I am thinking of, e.g., my 4-volume Dutch-English 'Jansonius' from the 70s, my Dictionary of Building and Civil Engineering by S.N. Korchomkin et al. (Kluwer, 1985), and my Illustrated Dictionary of Mechanical Engineering by V.V. Schwartz et al. (1984) – all of which I hardly use because they aren't as readily accessible as my computer resources.

On the other hand, I am subscribed to several online dictionaries, and this is where the future of lexicography should be headed if you ask me as a translator. Graham P Oxtoby's amazing Comprehensive Dictionary of Industry & Technology, and Aart van den End's Juridisch-Economisch Lexicon & Onroerend Goed Lexicon can be seen as examples of how to successfully operate a dictionary in the digital age. They are full of great content, are updated daily, and you can email their authors term questions and will almost always receive an answer within 20 minutes. Another success story is the Oxford Dictionaries Pro (formerly Oxford Dictionaries Online). This is another dictionary I am more than happy to pay my annual subscription for, as it has become a one-stop shop for all of my English-language dictionary needs. Incidentally, I might be buying myself a hard copy of Graham's Comprehensive Dictionary of Industry & Technology as a Christmas present when it comes out in December 2012 (2 volumes, 3600 pages!!!), but mostly just for fun. On a daily basis, I will be accessing it online as I translate.

There are also various interesting free online multilingual dictionaries popping up like mushrooms, which allow users to add words, such as Leo, Dict.cc, bab.la, interglot, and the Proz.com KudoZ glossaries/term forums, but what I am waiting for is a site that manages to combine the feeling of community of the Proz.com term forums with the professional approach of, say, the Oxford Dictionaries Pro site. Now that would be a truly modern dictionary!

Michael

Michael Beijer
Translator & Terminologist
(Dutch/Flemish into English)
46 Priory Street, Lewes, 
East Sussex BN7 1HJ, 
United Kingdom.
Mob.+44 (0)797 093 5608
michael@wordbook.nl
Skype/Twitter: michaelbeijer