Sunday, 19 May 2024

Review #16 | Jisho | Learn Japanese For Free

This week's review is for Jisho.org. I would recommend it for all levels of learners. The website covers most areas of vocabulary, grammar, stroke order, example sentences, kanji and general dictionary functions. Jisho is a dictionary that seems to be having an identity crisis.

漢和辞典 (kanji dictionary) (2013, CC3.0), photo by TASH

What

Jisho is a dictionary website, and the name itself (辞書) means 'dictionary'. The search bar is the main tool for searching Jisho, and it is where you start your journey into the pit of advanced search. That is when we all realise it is a dictionary with an identity crisis: it wants to be an encyclopaedia, but didn't quite grasp the depth of the datasets required to do this. The lemmas are not propositions, after all. Search results return romaji readings and alternative search spellings.

Next to the bar are the Draw, Radicals and Voice options, which let you search by drawing a character, picking radicals or speaking.

Below this, we find the total number of search results, furigana, kanji pictographs and several indexical signposts such as 'common word', JLPT level and WaniKani level. Then come play audio, collocation (similar meaning terms) and Links, which provides further examples and 'Kanji details'.

We then find the type of word, such as noun, and the definition. To the right, names are listed, along with More Names and Other Dictionaries. More words then follow until the end.

Kanji details takes you to a similar page, with the pictograph and its semantic definition. On the left are the stroke number, the radicals used and 'parts', which are closer to what we may call radicals following Heisig's methodology. There are also variants of these, which have more complicated versions of the pictograph. A handy stroke order animation is displayed below these.

Atop, you have the Forum, About, Theme and Login sections. The Forum section is more or less a type of social media feed, the About section gives a list of credits to original content creators and site creatives, Theme gives Night/Day lighting options, and Login lets you sign in with an account to keep track of your way around the site.

After the definition, you get to the Kun/Onyomi, hyperlinks for words starting with, ending with or containing the pictograph involved, and external links to Unihan, Wiktionary and Google Image Search. Next to these are the Joyo, JLPT and frequency gradings of the kanji. Below that are the stroke order and text-to-speech versions of these. Compounds lists the reading variations of the kanji when it is affixed to or collocated with other kanji. Kun reading compounds are then provided. Other Readings, or more accurately translations into other languages, are then provided.

Dictionary Indices provides references to various dictionaries for the pictograph involved. Some of these methodologies may be more suited to your learning needs, and they seem to be pretty much the established references for self-taught learners of the language. Classifications and Codepoints covers the lookup classification codes and character-encoding codepoints used for the kanji in wider datasets and reference works such as Mark Spahn's The Learner's Kanji Dictionary (see https://archive.org/details/learnerskanjidic00mark/page/n5/mode/1up). I do not personally recommend learning the Kun/Onyomi, as they may or may not be important depending on your needs.
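If the word codepoint is new to you, here is a minimal Python sketch of the idea. It assumes the value you are after is the Unicode one (the kind Unihan indexes by), and the helper name `codepoint` is made up for this example.

```python
def codepoint(char: str) -> str:
    """Return the Unicode codepoint of a single character in U+XXXX notation."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return f"U+{ord(char):04X}"


# Print the codepoint of each kanji in 午前 (a.m.)
for kanji in "午前":
    print(kanji, codepoint(kanji))
```

Other encodings give the same character a different number, so treat this only as an illustration of what a codepoint is.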

https://jisho.org/docs has a more robust listing of the search functions, but the search essentially has the not initially user-friendly pitfall of converting romaji/standard English into the kana equivalent. Wildcard searches use an asterisk or little star icon (*). Years and dates also seem to work a bit funky, but they do follow the Japanese system of dates, which works out for the best. Additive search also means you need to use a "find this phrase" rather than a 'find this "phrase"' format. AND or OR may also be useful if you are used to this type of search in Google, as searches are otherwise automatically treated as single words. Hashtags are also prominently used, and a list is provided in the advanced search section, including grammatical searches.[1]
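Because these searches rely on characters such as #, * and quotation marks, a search link is easier to share or bookmark once it is percent-encoded. The sketch below is only an illustration: it assumes searches follow the same https://jisho.org/search/ pattern as the link under Where, the function name `jisho_url` is made up, and the hashtag used is an assumption to be checked against the docs.

```python
from urllib.parse import quote

# Same URL pattern as the link given under "Where"
JISHO_SEARCH = "https://jisho.org/search/"


def jisho_url(*terms: str) -> str:
    """Join search terms with spaces and percent-encode them into a search URL."""
    query = " ".join(terms)
    # safe="" encodes '#', '*' and '"' instead of leaving them as URL syntax
    return JISHO_SEARCH + quote(query, safe="")


print(jisho_url("jisho"))
print(jisho_url("食べ*", "#common"))  # wildcard plus a hashtag filter (tag name assumed)
print(jisho_url('"test"'))            # quoted term, assumed to skip kana conversion
```

The encoding step matters mostly for the hashtag, since an unencoded # would be read by the browser as a page anchor rather than as part of the search.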

Where

Available at https://jisho.org/search/jisho  .

Who

The Jisho site belongs to its creators, Kim Ahlström, Miwa Ahlström and Andrew Plummer. Graphics and other content belong to Brian Takumi and Sophian Bensaou, and are licensed under Creative Commons 3.0 and the GNU Free Documentation License.

Other respective content belongs to:

 - DBpedia content comes from Wikipedia, and is licensed under Creative Commons 3.0 and GNU Free Documentation License.

 - EDRDG Tanaka Corpus, JMDict, JMnedict, KanjiDIC2 and RADKFILE belong to the EDRDG project, with contributions from Jim Breen at the Electronic Dictionary Research and Development Group, and are licensed under the project's various licenses.

 - KanjiVG content belongs to Ulrich Apel and is licensed under Creative Commons 3.0.

 - kanjivg2svg content belongs to the Jisho project and is licensed under Creative Commons 3.0 and GNU Free Documentation License.

 - Kradfile2 & kradfile-u belong to Michael Raine, with contributions by Breen and Jim Rose of the EDRDG project.

 - JMdict Database system belongs to Stuart McGraw and is licensed under Creative Commons 4.0.

 - J-Reibun belongs to Suzuki Tomomi, under copyright with plans to become open source. Contributors include Yoshiba, Junko Asano, Ryoko Ieda Akiko, Yoko, Oyama Yuuri, Oka Yoko, Kajikawa, Katsuya Kato Rie, Shibuya Hiroko, Tamaru Nozomi, Nakamura Ami, Nishijima Eriko, Noda Taishi, Haruna Fujimura, Yasuko Mitani and Kim Ahlström.

 - MeCab belongs to Taku Kudo and the Free Software Foundation, under the FSF 2007 GNU/GPL license.

 - SKIP (System of Kanji Indexing by Patterns) system for ordering kanji content belongs to Jack Halpern (Kanji Dictionary Publishing Society at http://www.kanji.org/), and is licensed under Creative Commons 4.0 International.

 - Tanaka Corpus comes from the work of Yasuhito Tanaka and many university students, and is in the public domain.

 - Tanos content comes from Jonathan Waller and is licensed under Creative Commons 4.0.

 - Tatoeba content belongs to Trang Ho and is licensed under Creative Commons 2.0.

 - Ve content belongs to Kim Ahlström and is licensed under a public domain license/FSF 2007 GNU/GPL license.

 - Wanakana content belongs to Tofugu and is licensed under the Open Source Initiative MIT License.

Many, many people are involved in all of these projects, and I would love to highlight them if they wish to be.

When

Available 24/7. There is also a free app, though some features might require a subscription.

Why

I would recommend Jisho.org as a study tool guide. It can make finding a nice creative commons version a lot easier.

Bibliography

[1] https://jisho.org/docs

Socials

Email :        learnjapanese43@gmail.com

Wikimedia: https://commons.wikimedia.org/wiki/User:LearnJapanese43

Discord :     @learnjapaneseforfree

Tiktok :       @learnjapaneseforfree

Youtube:     @learnjapaneseforfree /LJ43?

This review is part of the Learn Japanese for free project. I have not, do not and never will derive any profit from this project. Please send any requests, questions or further information about free tools for learning Japanese to learnjapanese43@gmail.com, which is checked every 2 weeks.

Sunday, 5 May 2024

Review #15 | Tatoeba | Learn Japanese For Free

This week's review is for Tatoeba ( 例えば | for example ). I would recommend it for all levels of learners. The website covers most areas of vocabulary and example sentences made by the community. It is mostly a search engine of example phrases, and it is what many modern Japanese-English language databases and corpora run on. I highly recommend looking at the origins and open source history of the corpora. It is really cool and very interesting to see people working together on work like this.

Tatoeba English Homepage (2021, CC2.0) Tatoeba

What

Tatoeba is a multilingual text corpus of example sentences. It is a Creative Commons database of example sentences created by Tatoebans. To start, use the link I provided, as it can be a bit jumbled otherwise for our purposes. Along the top we have the logo and Browse. Under Browse we have Show random sentence, which takes you to a random sentence in your main language with other translations of the same example sentence. This may or may not have Japanese in it, so probs avoid. Browse by language takes you to a selection of languages, including our language of choice, Japanese. Browse by list takes you to public lists of example sentences collated by Tatoebans and, apparently, randos on YouTube. Browse by tag takes you through a list of corpus-related tags, such as company-peddling sentences and other tags we are interested in. Tanaka Corpus is a good one, for example, but you will find your own and hopefully create your own as you go, if you like this format.

Using Sentence #325745 as an example, we have the main sentence, then the copy sentence function, audio for the sentence if there is some, and the sentence page. Dropping down to the first Japanese translation as of writing, the sentence page link (the last icon, the Spanish '!' one) takes us to Sentence #77972. This gives us Help, Advanced Search, the regular search bar, the from and to bars, then previous, random and next, which also let you look a sentence up if you ever remember its serial number, but not the sentence itself, from the Tanaka Corpus metadata. These then go into tags, lists, sentence texts and logs, which are the public data on the sentence's metadata metrics.

Further down the page we have the Quick Start Guide, the Tatoeba Wiki, Help, Developers, Downloads, Socials for GitHub and Google Group/X and Doomerbook, another explanation of the project, Contact, Status, terms of use details and the blog, which goes through some juicy copyright legal stuff right off the 2021 bat. The Downloads section includes a custom export files tool at the top, random sentence export generators and the like. It is most useful if you are making software, as it offers more corpus-based tools. The Wiki is mostly administrative, plus the behave-yourself-appropriately rules.

Then there is Community, which uses a social media wall when you sign up (I have yet to, as it is 1 a.m. and, well, other stuff) and is full of language nerds. Then comes a list of all the members (80,672 total), and then the languages of members, which has a legend, which is legendary, with around 553 fluent speakers registered in Japanese. Native speakers breaks this fluency listing down for admins, maintainers and contributors as well. #12 if you were interested.

To use the search bar, you may as well just use kanji/hiragana, as the database seems to struggle with other scripts; romaji, to my understanding, is rather hit and miss. Keyword search or "example" will give you a better Boolean search (for those interested: George Boole, 1815-1864, the man Boolean logic is named after). The Advanced Search functions include keyword/phrase search by language, with options for translation, length, audio, frequency and owner of a sentence. Tags, the fluency of the writer and lists can also be searched, ordered by relevance or in reverse order.

Where

Available at https://tatoeba.org/en/sentences/show_all_in/jpn/none .

https://en.wiki.tatoeba.org/articles/show/main for the Wiki.

https://en.wiki.tatoeba.org/articles/show/text-search for advanced search options.

https://tatoeba.org/en/sentences/search?from=jpn&query=%E5%8D%88%E5%89%8D&to= for some reason.

https://en.wiki.tatoeba.org/articles/show/make-anki for how to use Anki in this whole set of affairs. It may be whatever for you, but hey, it is 1:51 a.m. and I am on my own.
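The search link above looks scarier than it is: the percent-encoded part is just 午前 (a.m.). A minimal Python sketch, assuming only the from, query and to parameters visible in that link, rebuilds it for any search term. The function name `tatoeba_search_url` is made up for this example.

```python
from urllib.parse import urlencode

TATOEBA_SEARCH = "https://tatoeba.org/en/sentences/search"


def tatoeba_search_url(query: str, source_lang: str = "jpn", target_lang: str = "") -> str:
    """Build a search URL from the from/query/to parameters seen in the link above."""
    params = {"from": source_lang, "query": query, "to": target_lang}
    return f"{TATOEBA_SEARCH}?{urlencode(params)}"


# Rebuilds the link listed above for the search term 午前
print(tatoeba_search_url("午前"))
```

Leaving the to value empty matches the original link. Passing a language code such as eng there is an assumption about how the site filters translations, so check the result in the browser.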

Who

All Tatoeba project content belongs to its creator, and is licensed under Creative Commons license 2.0 FR. 

Other related Tatoeba content is licensed under Creative Commons 2.0 and belongs to Trang Ho, as Benevolent Dictator for Life. 2022 never happened, shush.

https://tatoeba.org/en/user/profile/Trang

Other respective content belongs to:

 - Tanaka Corpus, JMDict, JMnedict and KanjiDIC2 belong to the EDRDG project and are founded on the work of Yasuhito Tanaka in 2001, with contributions from Jim Breen at the Electronic Dictionary Research and Development Group.

 - Tatoeba Wiki content belongs to the Tatoeba Project, uses C++, cppcms-skeleton and cppcms, and is licensed under Creative Commons licenses.

 - Kradfile2 & kradfile-u belong to Michael Raine, with contributions by Breen and Jim Rose.

Many, many people are involved in all of these projects, and I would love to highlight them if they wish to be.

When

Available 24/7, though some features might require a subscription.

Why

I would recommend Tatoeba because it is a nice study tool with loads of free examples that can be downloaded to make nice tools. As a vocabulary and practice corpus, it is very handy, especially when you just need examples of things like grammar and similar vocabulary that may be difficult to search for in a dictionary or encyclopaedia, which is something librarianship linguistics really needs to work on lol.

There is also the possibility of creating new tools for yourself or others with the tools and corpus provided, if you have coding or software development experience. With all that said, long live Creative Commons!
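As a taste of what such a tool could look like, here is a minimal Python sketch that pulls the Japanese sentences out of a sentence export. It assumes the export is a tab-separated file with three columns (sentence id, language code, text), which should be checked against the Downloads page. The sample rows and the function name `sentences_in` are made up for this example.

```python
import csv
import io

# Hypothetical sample in the assumed layout: id <TAB> language code <TAB> text
SAMPLE_EXPORT = (
    "1\tjpn\tこれは例文です。\n"
    "2\teng\tThis is an example sentence.\n"
    "3\tjpn\t辞書を使います。\n"
)


def sentences_in(lines, lang="jpn"):
    """Yield (id, text) pairs for rows whose language column matches lang."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip anything that does not have exactly the three assumed columns
        if len(row) == 3 and row[1] == lang:
            yield int(row[0]), row[2]


japanese = list(sentences_in(io.StringIO(SAMPLE_EXPORT)))
for sentence_id, text in japanese:
    print(sentence_id, text)
```

With a real download you would pass an opened file instead of the sample, for example `open(path, encoding="utf-8", newline="")`, where `path` is wherever you saved the export.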


RTK search engine

Hochanh is a website search engine for RTK. It's for all levels and comes in handy for searching through the garbled mess that is the origina...