Universal dictionary system

Basic idea
One universal dictionary instead of 1001 bilingual dictionaries: much less work, and the whole system is much faster and more flexible.

Universal dictionary system V1.3

Languages table
id: 156
Czech: kocour {m}; kočka {f}
Dutch: kat
English: cat
Esperanto: kato
French: chat; chatte
German: Katze {f}; Kater {m}
Italian: gatto; gatta
Japanese: 猫 [ねこ]
Latin: catus
Russian: кот
Slovak: kocur
Spanish: gato
Swedish: katt
Traditional Chinese: 貓 [mao1]


English-wordnet table
id: 156
Note1: n 1:
Note2:
Note3:
Word: cat
Meaning: Feline mammal usually having thick soft fur and being unable to roar
Examples: "black cat"
Synonyms: feline; felid
Antonyms:
(Meanings in other languages can also be added at any time, but not before the system is filled with translations. Dictionary before thesaurus!)

German notes and examples table
word: Katze
notes: all notes for Katze
examples: all examples for Katze
(Many users asked for this table so here it is :-).)

Images
id: 156
Image:
Pronunciation
English: cat
German: Katze
Spanish: gato
Italian: gatto


Entries added automatically
English, German, French and Czech play an important role in the Universal dictionary. There are many bilingual dictionaries for these languages, so entries for other languages can be added very quickly. Relations such as "cat-chatte-Katze" can be created from bilingual dictionaries. Let's call these relations "triplets". With thousands of English, German, French and Czech entries already in the UD, more languages can be filled into the system very quickly.
Universal dictionary: 156-cat-chatte-Katze

Bilingual dictionaries: cat-кошка, chatte-кошка, Katze-кошка

156-cat-chatte-Katze + cat-chatte-Katze-кошка = 156-cat-chatte-Katze-кошка

There is a very high probability that кошка is the correct Russian translation for the record containing the words cat-chatte-Katze. This method can be used for any language that has at least three bilingual dictionaries (three languages are strong enough to bind words in a fourth language). A special program processes many bilingual dictionaries and extracts all translations that are interconnected with the basic triplets (for example, English-German-French triplets).
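The triplet method can be sketched in a few lines of Python. This is only an illustration, not the actual program: the function name extend_record, the min_votes parameter and the toy bilingual data (including the extra cat-кот pair) are assumptions made for the example.

```python
from collections import Counter

def extend_record(record, bilingual, min_votes=3):
    """Return target-language words confirmed by at least min_votes source languages.

    record:    maps language name -> word already stored in the universal dictionary
    bilingual: maps language name -> {source word: set of target-language words}
    """
    votes = Counter()
    for lang, word in record.items():
        # Translations are stored in sets, so each source language
        # votes at most once for each candidate.
        for candidate in bilingual.get(lang, {}).get(word, set()):
            votes[candidate] += 1
    return sorted(w for w, n in votes.items() if n >= min_votes)

# Record 156 from the text, with hypothetical bilingual dictionaries into Russian.
record_156 = {"English": "cat", "French": "chatte", "German": "Katze"}
to_russian = {
    "English": {"cat": {"кошка", "кот"}},
    "French": {"chatte": {"кошка"}},
    "German": {"Katze": {"кошка"}},
}

print(extend_record(record_156, to_russian))  # ['кошка']
```

Here кот is offered by only one dictionary and is left out, while кошка is confirmed by all three and would be attached to record 156.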

However, this method can also add incorrect translations to the system, especially when the same words describe two different meanings in all three languages while the fourth language has a separate word for each meaning. Even so, the Universal dictionary grows quickly thanks to this new method.

Entries added by contributors
Entries added by contributors do not go directly into the UD, and not all of them will be accepted. Every entry:
  • has to pass a spellchecker test - this takes care of "jrewjrekpsj"
  • has to be in the English-xy dictionary - this takes care of "cat-perro"
  • has to be confirmed by at least one translation in another bilingual dictionary - this takes care of, for example, a "noun"-"verb" translation
Adding incorrect entries is therefore pointless: bad entries never make it into the system.
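The three checks above could be chained roughly as follows. This is a minimal sketch under stated assumptions: the spellchecker is approximated by a plain word list, and accept_entry, spanish_words and to_spanish are hypothetical names and data invented for the example.

```python
def accept_entry(record, candidate, word_list, bilingual):
    """Apply the three checks to one contributed translation.

    record:    existing UD words for the record, e.g. {"English": "cat", ...}
    candidate: the contributed word in the target language
    word_list: known words of the target language (stand-in for a spellchecker)
    bilingual: maps source language -> {source word: set of target words}
    """
    # 1. Spellchecker test, approximated here by a plain word list.
    if candidate not in word_list:
        return False
    # 2. The pair must be present in the English-xy dictionary.
    english = record["English"]
    if candidate not in bilingual.get("English", {}).get(english, set()):
        return False
    # 3. At least one other bilingual dictionary must confirm the translation.
    return any(
        candidate in bilingual[lang].get(word, set())
        for lang, word in record.items()
        if lang != "English" and lang in bilingual
    )

record_156 = {"English": "cat", "German": "Katze"}
spanish_words = {"gato", "perro"}
to_spanish = {
    "English": {"cat": {"gato"}, "dog": {"perro"}},
    "German": {"Katze": {"gato"}, "Hund": {"perro"}},
}

for candidate in ("jrewjrekpsj", "perro", "gato"):
    print(candidate, accept_entry(record_156, candidate, spanish_words, to_spanish))
```

With this toy data, "jrewjrekpsj" fails the first check, "perro" fails the second, and only "gato" is accepted.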

Notes
Only synonyms and the most important notes belong in the languages table.
Synonyms are separated with ";": Katze {f}; Kater {m}
Genders go in {} braces: Katze {f}
Pronunciation (only if required) goes in [] brackets: 猫 [ねこ]
All other notes have to be in () parentheses
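As an illustration of this notation, a cell of the languages table could be parsed as shown here. The sketch assumes that the markers always appear in the order gender, pronunciation, note; that ordering, the function name parse_cell and the field names are assumptions, not part of the system's specification.

```python
import re

ENTRY = re.compile(
    r"^(?P<word>[^{\[(]+?)"            # the word itself
    r"(?:\s*\{(?P<gender>[^}]*)\})?"   # optional gender, e.g. {f}
    r"(?:\s*\[(?P<pron>[^\]]*)\])?"    # optional pronunciation, e.g. [mao1]
    r"(?:\s*\((?P<note>[^)]*)\))?$"    # optional free-form note in parentheses
)

def parse_cell(cell):
    """Split one languages-table cell into a list of synonym entries."""
    entries = []
    for part in cell.split(";"):
        match = ENTRY.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised entry: {part!r}")
        # Keep only the fields that are actually present.
        entries.append({k: v for k, v in match.groupdict().items() if v is not None})
    return entries

print(parse_cell("Katze {f}; Kater {m}"))
print(parse_cell("猫 [ねこ]"))
```

The first call yields two entries with a gender each; the second yields one entry with a pronunciation.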

Advantages
The new system has many advantages. Among others:
  • No redundancy (10 English-xy bilingual dictionaries contain the string "cat" at least 10 times)
  • Wordnet meaning definition solves the problem of multiple meanings
  • (14*(14-1))/2=91 translations (bilingual dictionaries) can be created from one record for "cat" when 14 languages are in the system
  • Every new translation you add yields (number of language columns - 1) translations
  • Any language can be added anytime
  • The same word can be listed in several selected languages at once
  • Searching needs just one SQL SELECT query
  • Notes of the translation are stored in the backbone description
  • Bilingual dictionaries can be created online from the universal dictionary
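Two of these points can be checked with a short sketch: the count of 91 language pairs, and a bilingual dictionary produced from the languages table with a single SELECT. The table layout (one column per language, only three columns shown), the column names and the bilingual helper are assumptions made for illustration, using an in-memory SQLite database.

```python
import sqlite3
from itertools import combinations

LANGUAGES = [
    "Czech", "Dutch", "English", "Esperanto", "French", "German", "Italian",
    "Japanese", "Latin", "Russian", "Slovak", "Spanish", "Swedish",
    "Traditional Chinese",
]

# Every unordered pair of language columns is one potential bilingual dictionary.
pairs = list(combinations(LANGUAGES, 2))
print(len(pairs))  # 91

# Hypothetical, reduced languages table with three of the columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE languages (id INTEGER PRIMARY KEY, english TEXT, german TEXT, spanish TEXT)"
)
conn.execute(
    "INSERT INTO languages VALUES (156, 'cat', 'Katze {f}; Kater {m}', 'gato')"
)

def bilingual(conn, source, target):
    """Build a source-target dictionary with one SELECT over the languages table."""
    # Column names cannot be bound as SQL parameters, so check them against a whitelist.
    allowed = {"english", "german", "spanish"}
    if source not in allowed or target not in allowed:
        raise ValueError("unknown language column")
    query = (
        f"SELECT {source}, {target} FROM languages "
        f"WHERE {source} IS NOT NULL AND {target} IS NOT NULL"
    )
    return conn.execute(query).fetchall()

print(bilingual(conn, "english", "spanish"))  # [('cat', 'gato')]
```

Any of the 91 pairs can be produced the same way by choosing two columns, which is what makes the on-the-fly creation of bilingual dictionaries cheap.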