Note from Phil Parker:
In March 2005, I received a telephone call from a nice gentleman, Andrew
Joscelyne, who wished to learn more about this project. Below is a write up
of the interview, in case others might be interested. Thanks, Andrew, for
posting this!
p.s.
The Tarahumara dictionary, with over 3000 entries, mentioned below exists thanks
to Christen Kramer, who compiled it during her senior year in high school
while she lived with her family in Northern Mexico among the Tarahumara
people. We will be posting the Tarahumara dictionary this summer.
In addition to Andrew’s review,
titled “The Dyslexicographer”, there is a review, below, from Dr. Péter
Jacsó, Professor, Library and Information Science Program, Department of
Information and Computer Sciences, University of Hawai'i at Mānoa. It
appeared in the April issue of “Thomson Gale Reference
Reviews”.
I have also
added a few comments from some friendly users. Many thanks for all of the
kind remarks!
The dyslexicographer
Margaret Marks at Translation Blawg rightly wonders what on earth
the Webster’s Online Dictionary (WOD) is all about. Although there is
quite a lot of background information available on the site, I decided to find out from its creator Phil Parker.
Here’s the score.
A Professor at INSEAD, the
European business school, Philip Parker was born dyslexic.
This meant he found reading dictionaries – lists of words and their
definitions – much easier than sustained prose, which demanded too much time
to decipher. So over the past 30 years he has been collecting dictionaries of
all kinds. Around the year 2000, large dictionaries on the web started
charging for ‘premium’ words of the sort he needed in his research and that
really “pissed him off”. So he decided to leverage the definitions he had collected
from his own store, borrowed the out-of-copyright ‘Webster’ badge, and
started building WOD, which he intends to make the biggest multilingual
dictionary site on the web.
He was lucky since he had
loads of help from academic and other assistants, benefited from donations of
out-of-print dictionaries and word lists, and was able to finance the whole
thing himself. He even uses a firm in Togo
to keyboard in content. This summer he hopes to upgrade the site to feature
dictionaries covering 600 languages (10% of the world’s current language
population), and in the case of existing site languages such as Spanish, he
hopes to increase the entry count from around 100,000 to 600,000 entries.
To give global coverage he is
working in a sequence of passes. The first pass was to work by time zones,
taking a location such as Europe and collecting dictionary materials for all
‘major’ languages. The second pass, now under way, is to include ‘secondary’
languages (say Maltese in Europe). Next year, he plans to start the third
pass by incorporating locally endangered languages, using volunteer help
where necessary. One technique is this: he donates a computer and a small
stipend to missionary children (e.g. for Tarahumara in Mexico) who then create a local
language/English dictionary.
What’s next, once he’s got all
these bilingual word lists? Create a total lexical linker, whereby you can
click from any word to its equivalent in any other language, using English as
the underlying pivot language. An “N-dimensional cube of words in every
language to every language,” as he puts it, that will by this summer be the
world’s largest compilation of language items ever produced. His content
currently weighs in at around one terabyte.
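The pivot idea can be sketched in a few lines. This is only a rough illustration under stated assumptions, not Parker's implementation: the word lists are toy data, and the names `build_pivot_index` and `link` are hypothetical. Each bilingual list maps a foreign word to an English word, and any two languages are joined through the English word they share.

```python
from collections import defaultdict


def build_pivot_index(bilingual_lists):
    """Map each English pivot word to its known equivalents, per language."""
    index = defaultdict(dict)
    for language, pairs in bilingual_lists.items():
        for foreign_word, english_word in pairs:
            index[english_word].setdefault(language, []).append(foreign_word)
    return index


def link(word, source, target, bilingual_lists, index):
    """Find equivalents of `word` in `target` by passing through English."""
    # Step 1: source word -> English pivot word(s).
    pivots = [en for w, en in bilingual_lists.get(source, []) if w == word]
    # Step 2: English pivot word(s) -> target-language word(s).
    results = []
    for english_word in pivots:
        for candidate in index.get(english_word, {}).get(target, []):
            if candidate not in results:
                results.append(candidate)
    return results


# Toy word lists (assumed for illustration): (foreign word, English word).
BILINGUAL_LISTS = {
    "es": [("perro", "dog"), ("casa", "house")],
    "fr": [("chien", "dog"), ("maison", "house")],
    "de": [("Hund", "dog"), ("Haus", "house")],
}
INDEX = build_pivot_index(BILINGUAL_LISTS)

print(link("perro", "es", "fr", BILINGUAL_LISTS, INDEX))
print(link("Haus", "de", "es", BILINGUAL_LISTS, INDEX))
```

With N languages this needs only N word lists rather than one list per language pair. The price is that any ambiguity in the English pivot word carries over into every linked language.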
How useful is Phil’s site
proving? He reckons it is among the top ten sites used to search Arabic words
in Arabic script, since the whole hoard has been programmed for Unicode. And
because the word ‘Webster’ is a synonym for ‘dictionary’ for Americans (as
Kodak once was for cameras or Google is for search engines) WOD ranks between
5 and 7 on, well, Google for ‘Webster’ out of about 150 ‘Webster’ sites on
the web these days. Probably the best way to appreciate the ambition of Phil
Parker’s site is to search the term Webster itself, and see the degree
of encyclopedic potential – words, images, statistical findings from corpora,
sign language versions, et alia multa – that he is trying to pack into what
he calls a hobby. But the definitions don’t include a more recent
decomposition - web + ster (as in napster) - a linguistic peer to peer
resource.
Péter's Digital Reference Shelf – April 2005
Title: Webster's Online Dictionary, Rosetta Edition
Publisher: Philip M. Parker, INSEAD
Cost: Free
URL: http://www.websters-online-dictionary.org/
Tested: March 18-25, 2005
Webster's Online Dictionary, Rosetta
Edition defies my efforts to write a traditional review. I always try to
evaluate and review digital ready-reference sources in this column in a
systematic way. For example, I test general dictionaries using a benchmark of
about 150 terms that represent a mix of contemporary, formal, slang, archaic,
recently coined, foreign, borrowed, technical, medical, scientific, and
everyday words. I give a score for each ranging from 0 (no entry) to 5
(perfect entry) depending on the quality of definitions, sample sentences,
attributions, usage notes, etymology, print and audio pronunciation help and
visual illustrations. I can't do that with this dictionary.
I put the dictionaries in
context, comparing them with alternative sources that I reviewed or at least
used extensively. I determine the hit rate, add up the scores, and calculate
their average, then compare the numbers with those garnered by other
dictionaries in the same league. These scores give me a quantifiable result,
such as 85% hit rate in the American Heritage Dictionary (4th edition, 2003
digital update) with a total score of 541 points for 154 words (3.51 average)
versus Merriam-Webster's 10th College edition (2002 digital update), with a
hit rate of 75%, total score of 380 points for the same 154 words (2.47
average). Then I look at the typography, the layout, and various software
aspects and write my review. It still may not be completely objective, but it
is at least systematic and is based on extensive samples.
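The arithmetic behind these figures is simple enough to sketch. The function name `summarize_benchmark` and the four sample scores are assumptions made for illustration. Only the totals (541 and 380 points over 154 words) come from the review.

```python
def summarize_benchmark(scores):
    """Summarize per-word scores, each from 0 (no entry) to 5 (perfect entry)."""
    if not scores:
        raise ValueError("at least one score is required")
    hits = sum(1 for score in scores if score > 0)  # a score of 0 means no entry
    total = sum(scores)
    return {
        "hit_rate": hits / len(scores),
        "total": total,
        "average": total / len(scores),
    }


# Invented scores for a four-word benchmark, for illustration only.
sample = summarize_benchmark([5, 0, 3, 4])
print(sample)

# Averages implied by the totals quoted in the review.
print(round(541 / 154, 2))  # American Heritage Dictionary
print(round(380 / 154, 2))  # Merriam-Webster's 10th College edition
```

The average is taken over all benchmark words, including those with no entry, so a low hit rate pulls the average down as well.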
I can't follow this process
with the Rosetta edition of the Webster's Online Dictionary. It is as if I
were to try to describe a jam session featuring many of the best musicians,
vocalists, other artists and performers. You must see it, hear it and feel
it.
It is the brainchild of
Professor Philip M. Parker. His very short biography gives a hint of his lexicographic
interest and competence. His affiliation with INSEAD may not impress you as much as it should because
the institute is not well-known in the U.S. Suffice it to say that last year
it was ranked no. 11 among the executive education programs in the worldwide
yearly survey of the Financial Times. Its faculty have published many
unconventional, eye-opening and award-winning scholarly articles and books.
They may not be booked on TV morning shows and afternoon talk fests (a
dubious sign of celebrity in contemporary culture), but this faculty is
certainly very good company for unorthodox and scholarly thinkers,
doers and projects.
The project is based on the
1913 edition of Webster's Unabridged Dictionary, but that is like saying that
New York is based on New Amsterdam. It has been enhanced by millions of
copyright-cleared entries (including images, drawings, book covers, posters
and photographs) from both historical and contemporary sources. That's why
you would find definitions and examples for such neologisms as bling, blog
and wiktionary.
Parker is not only the
instigator, but also the editor-in-chief of the dictionary (although he does
not use this term). He has been assisted by contributors. But this is not one
of the dime-a-dozen free-for-all wiki projects with contributions without attributions
— though list is not yet complete. (Yes, I know, the dictionary does include
entries from Wikipedia.) Definitions and illustrations for the words are
included from a variety of sources, and each entry is meticulously
acknowledged and, if possible, linked to — just as in the splendid
Answers.com service, formerly known as GuruNet, Sling and Atomica, which I
have reviewed more than once in this column and will probably do so again.
The editor seems to bear the
brunt of the intellectually demanding selections and compilations. No matter
how sophisticated the computer technology applied in this project, I doubt
that "[t]he dictionary will soon consist of over 400 modern languages,
and 10 ancestral languages, with some 30 million individual entries across
languages."
As for the Rosetta qualifier,
it's an obvious homage to the Rosetta Stone, the important cultural heritage
from Egypt that included the same decree in three languages and whose
deciphering was crucial for translating hieroglyphic text and for learning
about ancient cultures. For further details and background about the project
check out the About Us page.
Here I only sketch the lay
of the land and pinpoint a few of the landmarks. Don't start by looking up
words such as "love" or "money", as the results will be
overwhelming. Instead go for the more esoteric words, such as, well, Rosetta.
Each word has its own Web
page. Most of the pages are very long, but an excellent
index, which is always at hand, can help you skip quickly to the sections
that interest you the most. The entries start with a traditional definition,
etymology notes when appropriate, and dating of first usage. These are
followed by definitions from special/subject dictionaries and crossword
puzzles, usage
examples from contemporary book and video titles, and even software
titles. For the word Rosetta there is a series of images in slide
show format (it did not work when I tested it), as well as thumbnails
about the object or the person with links to the larger (and sharper) images
(photos, engravings, clip art, etc.).
The next section is the word
usage statistics that reveal how frequently the word appears in the 100
million word subset of the huge British National Corpus, and the word's frequency
rank among the 700,000 words used in English. If the word is also a
personal name, similar statistics (based on U.S. census data) are shown for
its use and popularity
rank as first and/or last name. This may be followed by lists of
derivative names, company
names and compound terms in which the word is used.
The statistical data about the
daily use of the word in queries submitted to the most popular English
language search
engines is very interesting, and a goldmine for Web site optimizers.
Translations of the word in a
variety of languages are then listed (only three for this word, but dozens
for others) along with a list of words for which rosetta is recommended by
spell checking programs as the correct
term. This section is followed by direct anagram(s) for the word (such as
toaster and rotates) and by various Scrabble riddles with some of the letters
in the word. A series of professional photos may follow this section,
which includes images of books, CDs, software and household items whose name,
or author's or performer's first or last name includes the search
term.
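Direct anagrams of this kind are easy to find mechanically. The sketch below is an assumed approach, not a description of how the site computes them: two words are anagrams when their sorted letters match. The helper names and the short word list are hypothetical.

```python
def signature(word):
    """Return the word's letters in sorted order, ignoring case."""
    return "".join(sorted(word.lower()))


def direct_anagrams(word, word_list):
    """Return the entries of word_list that use exactly the letters of word."""
    target = signature(word)
    return [
        candidate
        for candidate in word_list
        if signature(candidate) == target and candidate.lower() != word.lower()
    ]


# A tiny illustrative word list; a real dictionary would supply many more words.
WORDS = ["toaster", "rotates", "roasted", "rosetta", "stone"]
print(direct_anagrams("rosetta", WORDS))
```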
This section concludes with a
few bibliographic citations of primary newspaper and magazine articles from HighBeam
with the search word automatically passed forward. HighBeam is not a free
service, so you can see only a small snippet of the full HighBeam record if
you follow the link. As a nod toward Google, there is a Google search box
with your term already in the search cell, ready for launching.
You will probably not use this
last option too often because you may already be full from the mountain of
well-clustered information about a single word.
And this is only the tip of
the iceberg. For other more common words, there are definitions from many
more dictionaries, encyclopedias and thesauri in zillions of contexts with
example sentences to illustrate the use of the word in a great variety of
sources, such as:
· the Bible (in numerous English versions of different eras
and in many foreign language translations)
· classical literary texts (dramas, novels, poems)
· famous historical and contemporary speeches and talks
· contemporary fiction and non-fiction
The word is often shown in
different notation systems and orthographies ranging from hexadecimal notations to Braille and Morse
code, from sign language to Leonardo's mirror-writing. For some words there
are also animations and sound bites.
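Two of these notations are simple to reproduce. The following sketch is an assumption about the general idea rather than the site's own code. It renders a word as hexadecimal byte values and as International Morse code, and the function names are hypothetical.

```python
# International Morse code for the 26 basic Latin letters.
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}


def to_hex(word):
    """Hexadecimal values of the word's UTF-8 bytes, separated by spaces."""
    return " ".join(f"{byte:02x}" for byte in word.encode("utf-8"))


def to_morse(word):
    """Morse code for each letter, separated by spaces; unknown symbols raise KeyError."""
    return " ".join(MORSE[letter] for letter in word.lower())


print(to_hex("rosetta"))
print(to_morse("rosetta"))
```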
This is a fascinating carnival
of words. It is a very smart and honest project aimed at appreciating and
learning about English, as well as foreign languages and cultures. After all,
Noah Webster was a polyglot and solving the enigma of the Rosetta Stone
depended on understanding foreign languages and cultures.
Sample Praises
Here are a few samples from email or blog praises
(thanks to all for the constructive comments!):
· Hello and thanks for your fantastic dictionary!
Gorgeous! … Bo Bergman, Sweden.
· Just want to drop a note and show my appreciation for
the great dictionary you guys put up on the website. It is very user friendly
and has lots of information that I am looking for. My favorite part: the
floating menu that takes me to each individual section directly. After trying
out so many online dictionaries, my conclusion is: yours is the best. Thank
you! K.
· Hi. I very much like the concept and execution of
your dictionary. Hope things continue to flourish for you. Sincerely, Nora
Miller Clackamas, OR, USA
· I love your site. I think it's a great project. … I'm
an aspiring novelist and your site really comes in handy. I'm going to
mention it to all my colleagues as well. All the best, Renu
· It is great. Oliver White, Arizona.
· I've just been browsing your beautiful site, a
splendid idea! Christine Alba
· Congratulations on the Webster project - very
worthwhile and impressive. Dr. J. Neill Richardson.
· Hi! Great work! … Peter Schoplocher
· First off, I want to say how in love I am with the
goals of your site. I run a comparatively tiny e-book site abiding by the
same principles. … Good luck with your site. I point volunteers to
Distributed Proofreaders on my site (not that more than a few people a day
see it) and will also now point them to yours. … Hope you are well, Ian
· What a comprehensive website, … Nancy Stewart, AZ,
USA
· Hi, First I would like to express my appreciation to
your site. I've just learnt of its existence, and found it very interesting,
enriching and useful. … Best regards, Sigal (from Jerusalem)
· Hi, Do you realise you are a googlewhack with
achillean lynx? By the way a googlewhack is a 1 in 3 billion chance of 2
words leading to one site and these two words lead to one site only yours, so
your a googlewhack. Congratulations! Yours Faithfully, Simon J Green.
· A tremendous piece of work: an amazing dictionary.
The multilingual part increases its usefulness by a huge factor. Keep up the
good work! Regards, Joe Kurleto
· Hi! Great work!… Regards, Peter Schoplocher
· Dear Phil! I received the letter. … I'm gladly giving
you the permission you asked for. There're several Webster's editions I know.
They are great. The online version is amazing too, so the honour is mine.
Best wishes, András Németh.
· A sublime find! Mission: to create the largest
dictionary of modern languages (the equivalent of 500 encyclopedias). The dictionary
covers 30 modern languages and 10 ancient languages, and contains more than 30
million entries. It aims to be freely accessible to everyone in the world
via the Internet. Websters Online Dictionary, The Rosetta Edition. The
interface is in English, but French is (of course) present. If I take a look
at “vamp”, for example, I see the definitions scroll past (for the noun and the
verb), then definitions from specialized fields, where I am told that the word
probably comes from the French “avant-pied” (which the GDT confirms),
a list of acronyms (where applicable), synonyms, synonyms in
context, examples of everyday usage and of commercial usage,
frequency of use, expressions, an inventory of appearances in the
media and in literature (multilingual), anagrams, rhymes and, of course, the
meaning or equivalent in a multitude of natural and artificial languages (including
HTML, sign language, Morse code, binary code, etc.). And that is not all!
Go take a look: the little book that appears on the right of your
screen when you reach the word you searched for is the index (it says so,
in fact), and it offers a contextual menu that makes it easier to navigate
through these rich and endless pages. The contextual menu follows you
everywhere, from top to bottom. A pure marvel! Update: The distinguished
Language Hat (a blogger who deserves respect, it should be said) went to have
a look at the find and was much less happy with it than yours truly! The
translations are sometimes pointless, it seems... (mea culpa, I admit I did
not go poking around in the Hungarian and the Russian).
From a language professional's point of view, he finds the site riddled with
useless material. It is a view I do not share (I believe this is the very first time
that has happened to me with Language Hat...). I sincerely believe that
this young site is promising and that, although it has its flaws, like all its peers,
language-related or not, it is a crossroads of interesting information and
an excellent starting point for anyone who wants to broaden their horizons or add
a resource (even if we must call it “alternative”) to those
already consulted. The author of the blog Semantic Compositions,
for his part, offers a slightly more positive comment.
To be continued on Language Hat... Note from la grande rousse: To dig through
the entrails of the beast in alphabetical order, it's this way. Source: Les coups
de langue de la grande rousse
Our email: webstersedits2@hotmail.com