This week's review is for Tatoeba ( 例えば | for example ). I would recommend it for learners of all levels. The website covers most areas of vocabulary through example sentences written by the community, and it is mostly a search engine for example phrases. Many modern Japanese-English language databases and corpora draw on it. I highly recommend looking into the origins and open-source history of these corpora; it is really cool and very interesting to see people working together on something like this.
What
Tatoeba is a multilingual text corpus of example sentences. It is a Creative Commons database of example sentences created by Tatoebans. To start, use the link I provide under Where, as the site can be a bit jumbled otherwise for our purposes. Along the top we have the logo and Browse.
Under Browse we have Show random sentence, which takes you to a random sentence in your main language along with other translations of the same sentence. This may or may not have Japanese in it, so probably avoid it. Browse by language takes you to a selection of languages, including our language of choice, Japanese. Browse by list takes you to public lists of example sentences collated by Tatoebans and, apparently, randos from YouTube. Browse by tag takes you through a list of corpus-related tags, such as company-peddling sentences and other tags we are interested in. Tanaka Corpus is a good one, for example, but you will find your own and hopefully create your own as you go if you like this format.
Using Sentence #325745 as an example, we have the main sentence, then the copy-sentence function, audio for the sentence if there is any, and the link to the sentence page. Dropping down to the first Japanese translation as of writing and following its sentence-page link (the last icon, which looks like a Spanish "!") takes us to Sentence #77972. This page gives us Help, Advanced Search, the regular search bar, the from and to language bars, then previous, random and next, which also let you look up a sentence by number if you ever remember its serial number in the Tanaka Corpus metadata but not the sentence itself. Below these come tags, lists, sentence text and logs, which are the public metadata for the sentence.
Further down the page we have the Quick Start Guide, the Tatoeba Wiki, Help, Developers, Downloads, socials for GitHub and Google Group/X and Doomerbook, another explanation of the project, Contact, Status, the terms of use and the blog, which goes through some juicy copyright legal stuff right off the bat in 2021. The Downloads section includes a custom export tool at the top, random sentence export generators, and the like. It is really most useful if you are making software, as the tools there are more corpus-based. The Wiki is mostly administrative pages and behave-yourself rules.
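If you do want to poke at a downloaded export, here is a minimal Python sketch of filtering it for Japanese. It assumes the sentence export is tab-separated with three columns (sentence ID, language code, text); check the Downloads page for the actual layout of the file you pick. The sample rows and the `sentences_in_language` helper are made up for illustration.

```python
import csv
import io

# Made-up sample rows in the ASSUMED "id<TAB>language code<TAB>text" layout.
SAMPLE_EXPORT = (
    "1\tjpn\t今は午前一時です。\n"
    "2\teng\tIt is 1 a.m. now.\n"
    "3\tjpn\t例えば、これは文です。\n"
)

def sentences_in_language(tsv_text, lang):
    """Return (id, text) pairs for every row whose language code matches."""
    # QUOTE_NONE keeps quotation marks inside sentences as plain characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(int(row[0]), row[2]) for row in reader if len(row) == 3 and row[1] == lang]

japanese = sentences_in_language(SAMPLE_EXPORT, "jpn")
for sentence_id, text in japanese:
    print(sentence_id, text)
```

For a real export you would read the downloaded file's text instead of `SAMPLE_EXPORT`; the filtering stays the same.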
Then there is Community, which gives you a social media wall when you sign up (I have yet to, as it is 午前一時 (1 a.m.) and, well, other stuff) and is full of language nerds. Then comes a list of all the members (80,672 total) and then Languages of members, which has a legend, which is legendary, with around 553 fluent speakers registered in Japanese. Native speakers breaks this fluency listing down for admins, maintainers and contributors as well. #12 if you were interested.
To use the search bar, you may as well just use kanji/hiragana, as the database seems to struggle with other scripts; romaji, to my understanding, is rather hit and miss. A keyword search in quotes, such as "example", will give you a better Boolean search (for those interested, Boolean is named after George Boole, 1815-1864). The Advanced Search functions include keyword/phrase search by language, with options for translation, length, audio, frequency and owner of a sentence. Tags, lists and the fluency of the writer can also be searched, with results in order of relevance or in reverse order.
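A search like this ends up as a plain URL, so you can build one yourself. The sketch below assumes nothing beyond the `from`, `query` and `to` parameters visible in the example search link listed in this review; the `search_url` helper is a name I made up, and any other parameters the site accepts are not covered here.

```python
from urllib.parse import urlencode

# Base path taken from the example search link in this review.
BASE = "https://tatoeba.org/en/sentences/search"

def search_url(query, from_lang="jpn", to_lang=""):
    """Build a search URL; urlencode percent-encodes kanji/kana as UTF-8."""
    params = {"from": from_lang, "query": query, "to": to_lang}
    return BASE + "?" + urlencode(params)

url = search_url("午前")
print(url)
```

Leaving `to_lang` empty matches the example link, where the `to=` parameter has no value.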
Where
Available at https://tatoeba.org/en/sentences/show_all_in/jpn/none .
https://en.wiki.tatoeba.org/articles/show/main for the Wiki.
https://en.wiki.tatoeba.org/articles/show/text-search for advanced search options.
https://tatoeba.org/en/sentences/search?from=jpn&query=%E5%8D%88%E5%89%8D&to= for an example search (Japanese sentences containing 午前, "a.m."), for some reason.
https://en.wiki.tatoeba.org/articles/show/make-anki for how to use Anki with all of this. It may be whatever for you, でもね、から私は午前一時ごじゅういぷん一人です (but hey, it is nearly 2 a.m. and I am on my own here).
Who
All Tatoeba project content belongs to its creator and is licensed under the Creative Commons Attribution 2.0 France license (CC BY 2.0 FR).
Other related Tatoeba content is licensed under Creative Commons 2.0 and belongs to Trang Ho, as Benevolent Dictator for Life. 2022 never happened, shush.
https://tatoeba.org/en/user/profile/Trang
Other respective content belongs to:
- Tanaka Corpus, JMdict, JMnedict and KANJIDIC2 belong to the EDRDG project, founded on the work of Yasuhito Tanaka in 2001, with contributions from Jim Breen at the Electronic Dictionary Research and Development Group.
- Tatoeba Wiki content belongs to the Tatoeba Project and is licensed under Creative Commons licenses; the wiki itself uses C++, cppcms-skeleton and cppcms.
- Kradfile2 & kradfile-u belong to Michael Raine, with contributions by Jim Breen and Jim Rose.
Many, many people are involved in all of these projects, and I would love to highlight them if they wish to be.
When
Available 24/7, though some features might require signing up.
Why
I would recommend Tatoeba because it is a nice study tool with loads of free examples, which can also be downloaded to make nice tools. As a vocabulary and practice corpus it is very handy, especially when you just need examples of things like grammar and similar vocabulary that can be difficult to look up in a dictionary or encyclopaedia (something librarianship and linguistics really need to work on lol).
There is also the option of creating new tools for yourself or others with the tools and corpus provided, if you have coding or software development experience. With all that said, long live Creative Commons!
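As a taste of what such a tool could look like, here is a rough Python sketch that pairs Japanese sentences with their English translations and writes them out as tab-separated flashcard lines, the kind of text file that tools like Anki can import. It assumes two exports: a sentence file with ID, language code and text columns, and a links file with sentence ID and translation ID columns. The sample data and function names are made up, so treat it as a starting point rather than the project's own method.

```python
import csv
import io

# Made-up sample data in the ASSUMED export layouts.
SENTENCES = (
    "1\tjpn\t今は午前一時です。\n"
    "2\teng\tIt is 1 a.m. now.\n"
    "3\tjpn\t例えば、これは文です。\n"
    "4\teng\tFor example, this is a sentence.\n"
)
LINKS = "1\t2\n2\t1\n3\t4\n4\t3\n"  # sentence id <TAB> translation id

def read_rows(text):
    """Split tab-separated text into rows without treating quotes specially."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))

def translation_pairs(sentences_tsv, links_tsv, source="jpn", target="eng"):
    """Return (source text, target text) pairs that are linked as translations."""
    by_id = {row[0]: (row[1], row[2]) for row in read_rows(sentences_tsv) if len(row) == 3}
    pairs = []
    for row in read_rows(links_tsv):
        if len(row) != 2 or row[0] not in by_id or row[1] not in by_id:
            continue  # skip malformed rows and links to sentences we do not have
        src_lang, src_text = by_id[row[0]]
        dst_lang, dst_text = by_id[row[1]]
        if src_lang == source and dst_lang == target:
            pairs.append((src_text, dst_text))
    return pairs

def to_flashcard_tsv(pairs):
    """One card per line: front <TAB> back."""
    return "\n".join(front + "\t" + back for front, back in pairs)

pairs = translation_pairs(SENTENCES, LINKS)
print(to_flashcard_tsv(pairs))
```

Keeping the pairing and the output format in separate functions means you can swap the last step for whatever your own tool needs.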
Socials
Email : learnjapanese43@gmail.com
Wikimedia: https://commons.wikimedia.org/wiki/User:LearnJapanese43
Discord : @learnjapaneseforfree
Tiktok : @learnjapaneseforfree
Youtube: @learnjapaneseforfree /LJ43?
This review is part of the Learn Japanese for free project. I have not, do not and never will derive any profit from this project. Please send any requests, questions or further information about free tools for learning Japanese to learnjapanese43@gmail.com, which is checked every 2 weeks.