Language

Korean Hangul

Wednesday, January 2nd, 2008

In the middle of the 15th century, Korea was still using Chinese characters for its written language, despite having a very different spoken language. King Sejong argued that the Chinese script, with its thousands of characters, was too complicated for commoners to learn and was awkward because of the differences between spoken Chinese and Korean. So in 1446 he published a document demonstrating a new writing system, Hangul, which used only 51 characters, making it much easier to learn. Twenty-four of the characters map closely to letters of the Latin alphabet. The most interesting part is that the consonant characters are drawn to show the way the lips and tongue are positioned to form the sound, which helps non-native speakers sound out words without extensive training.


Word Generator

Friday, December 28th, 2007

Indromia, Quard, Conistate, Vercurelince, Quiniferphose!

In an inexplicable fit of word geekery, I wrote a program to generate new words via a statistical analysis of existing words. First I generated a histogram to count the number of times each possible three-letter combination occurs at the beginnings, middles, or ends of existing words. Then, to generate a new word, the program tries random overlapping three-letter combinations until every combination's frequency in the histogram is above a specified threshold. Isn’t that idimogous?
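For the curious, here is a rough Python sketch of that idea. It is not the original program, just one way the description above could be implemented. The names build_histogram, passes, and generate_word are made up for illustration, and the word list is a toy stand-in for a real dictionary.

```python
import random
import string
from collections import Counter


def build_histogram(words):
    """Count each three-letter combination by where it occurs in a word."""
    hist = {"begin": Counter(), "middle": Counter(), "end": Counter()}
    for word in words:
        word = word.lower()
        if len(word) < 3 or not word.isalpha():
            continue
        last = len(word) - 3  # start index of the final combination
        for i in range(last + 1):
            tri = word[i:i + 3]
            if i == 0:
                hist["begin"][tri] += 1
            if i == last:
                hist["end"][tri] += 1
            if 0 < i < last:
                hist["middle"][tri] += 1
    return hist


def passes(hist, tri, i, last, threshold):
    """True if `tri` is frequent enough at every position it would occupy."""
    kinds = []
    if i == 0:
        kinds.append("begin")
    if i == last:
        kinds.append("end")
    if 0 < i < last:
        kinds.append("middle")
    return all(hist[kind][tri] >= threshold for kind in kinds)


def generate_word(hist, length, threshold=1, rng=random, max_restarts=1000):
    """Build a word letter by letter; every overlapping combination must pass.

    Assumption: candidates are tried in random order and the first one that
    passes is kept, which gives the same result as guessing at random until
    something passes. Returns None if no word is found.
    """
    if length < 3:
        raise ValueError("length must be at least 3")
    last = length - 3
    letters = list(string.ascii_lowercase)
    starts = [t for t in hist["begin"] if passes(hist, t, 0, last, threshold)]
    if not starts:
        return None
    for _ in range(max_restarts):
        word = rng.choice(starts)
        while len(word) < length:
            i = len(word) - 2  # start index of the combination being completed
            rng.shuffle(letters)
            for ch in letters:
                if passes(hist, word[-2:] + ch, i, last, threshold):
                    word += ch
                    break
            else:
                break  # dead end, so start over with a new opening
        if len(word) == length:
            return word
    return None


if __name__ == "__main__":
    toy_words = ["banana", "bandana", "cabana", "canal", "banal"]
    print(generate_word(build_histogram(toy_words), length=6))
```

Raising the threshold makes the generator pickier, so the results stay closer to the source words. With a word list this small there are only a handful of possible outputs, so a full dictionary is needed before anything surprising comes out.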

Go ahead and give it a try below! You can specify a “normalness” scaling factor (5 gives very daisewisfasy-sounding words, 95 gives very conistate-sounding words), a maximum word length, a minimum word length, the source text for the statistical analysis (choose from the dictionary, the Bible, the complete works of William Shakespeare, etc.), and an optional “seed” word for the generator to build upon (seeding with “muffin” could yield the wonderful “Muffinetry”).

Oh, and I also made a variation that uses U.S. Census data to generate new baby names (just click on “names”).