Korean hangul to hanja converter

Over half of the Korean vocabulary consists of words borrowed from Chinese. For many centuries, Koreans used a mixture of Chinese characters (hanja) and the hangul phonetic alphabet to write their language. However, most writing since the 1990s does not use hanja.

Korean learners who know Chinese or Japanese would find it much easier to understand Korean text with hanja. After much digging around, I found a tool that seems to use some NLP algorithm to convert Korean text into mixed script. The interface is messy, but the conversion seems good from what I can tell as a beginner in Korean. Here it is:

http://203.250.77.242:5900/

If you select 한자 변환, it will convert the appropriate hangul into hanja. If you select 한글(한자) 병기, it will insert hanja in parentheses.
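To make the difference between the two modes concrete, here is a minimal sketch in Python. It is not the tool's actual code: the two-word `HANJA` lexicon, the `convert` function, and the mode names `"replace"` and `"annotate"` are all made up for illustration.

```python
# Hypothetical toy lexicon; the real tool's dictionary is unknown.
HANJA = {"정부": "政府", "정책": "政策"}

def convert(words, mode):
    """Convert a list of hangul words.

    mode="replace"  -> swap hangul for hanja (like 한자 변환)
    mode="annotate" -> keep hangul, add hanja in parentheses (like 한글(한자) 병기)
    """
    out = []
    for word in words:
        hanja = HANJA.get(word)
        if hanja is None:
            out.append(word)              # no entry: leave the word untouched
        elif mode == "replace":
            out.append(hanja)
        else:
            out.append(f"{word}({hanja})")
    return " ".join(out)

print(convert(["정부", "정책"], "replace"))
print(convert(["정부", "정책"], "annotate"))
```

The first call prints the hanja only, the second prints each hangul word followed by its hanja in parentheses.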

Can anyone who knows Korean comment on the accuracy of the conversion? And does anyone know of similar converters?

I checked it and found it to be a decent tool, with some good and bad points.

Good:

  • it seems to have quite a large hanja database to convert even obscure set phrases. I tried uncommon words like 하마평, 주마간산, 일장춘몽 which are mostly used in literature, and it found the correct hanja for them. Its word coverage seems extensive.

  • it can separate hanja words from simple noun+particle phrases, although it seems to give up if something more complicated, such as a verb ending, is attached to the word. For example, it works on things like 정부는, 정책을, etc., but not on 진행했다.

  • when there are multiple hanja candidates, it uses the most likely one based on general frequency of usage. It also lists all the less likely ones, each with a number suggesting the match probability. So while it may get it wrong, say, about 15% of the time, one will at least see the correct one in the list of other choices it provides.
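The behaviour described in the last two points can be sketched roughly as follows. This is only my guess at the general approach, not the tool's real algorithm: the `CANDIDATES` table, its weights, the `PARTICLES` list, and the `lookup` function are all hypothetical.

```python
# Hypothetical candidate table; the weights are made up for illustration.
CANDIDATES = {
    "정부": [("情夫", 0.1), ("政府", 0.9)],
    "정책": [("政策", 1.0)],
}
# A few common particles; a real tool would need a far longer list.
PARTICLES = ("는", "은", "을", "를", "이", "가")

def lookup(token):
    """Return (converted token, list of less likely candidates)."""
    stem, particle = token, ""
    if token not in CANDIDATES:
        # Try to peel off a single trailing particle: 정부는 -> 정부 + 는
        for p in PARTICLES:
            if token.endswith(p) and token[: -len(p)] in CANDIDATES:
                stem, particle = token[: -len(p)], p
                break
    # Rank candidates by their fixed, context-free frequency weight.
    ranked = sorted(CANDIDATES.get(stem, []), key=lambda c: c[1], reverse=True)
    if not ranked:
        return token, []                  # unknown word: left as hangul
    return ranked[0][0] + particle, ranked[1:]

print(lookup("정부는"))
print(lookup("진행했다"))
```

In this sketch 정부는 comes back with the top-ranked hanja plus the remaining candidate, while 진행했다 is returned unchanged because nothing here handles verb endings.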

Bad:

  • as you said, the user interface is not good.

  • from what I saw, it is just a simplistic one-way mapping tool that does not consider context. So no matter what the sentence or surrounding words are, you get the same hanja for a given word, based on a frequency rating that seems to be fixed globally.

I thought about the utility of the tool. It seems it can serve a small niche, perhaps too small a one, of learners of Korean who already know Chinese characters pretty well, mostly Chinese and Japanese people, I suppose. Even for these learners, however, single-word hanja lookup is already done much better by the online dictionaries. And the fact that it doesn’t offer China’s simplified characters (AFAIK), along with the occasionally different meanings of some hanja between the different countries, would further reduce its usefulness.

So I think the only use it has is in text block conversion. You can convert a sentence or even a paragraph to read it with hanja in it. And Koreans who want to brush up on hanja can use it to quickly check the hanja for some words in the text, since we are often stumped as to which hanja character is used for which Korean word.

Chinese people are not a “small niche”. Korean dramas have been very popular these last several years.

I agree the Chinese make up a large (probably the largest) demographic among Korean learners. But the software doesn’t seem to support 简体字, so the utility is further limited there.

Traditional/unsimplified Chinese characters pose no problem for most educated Chinese.

However, most Chinese who watch Korean dramas don’t proceed to learn Korean.

Oh - I was expecting “most Chinese who watch Korean drama ain’t so educated”. Shame on me.
