
The Pursuit of Patterns in Language

  • ruogu-ling
  • Oct 6
  • 7 min read

I’ve always been the kind of person who looks for patterns. Whenever something happens, my instinct is to find the logic behind it. But when it comes to the Chinese language, I often feel defeated. Once, a foreigner told me, “Gu Sen, your body is excellent.” I immediately said, “No, no, no, you can’t say a body is excellent.” He asked, “Why not?” and I found myself asking the same thing: why not? That was when I realized that language has rules that seem impossible to pin down. So I began paying attention to the oddities of Chinese, and I want to share some of them with you in a slightly unconventional way.


Look at these ten words. I divided them into two groups according to a certain hidden rule. The top five share a feature that the bottom five do not. Can you see it? “反复, 高兴, 磨蹭, 说笑, 许多.” Each of these can be repeated in an AABB pattern: 反反复复, 高高兴兴, 磨磨蹭蹭, 说说笑笑, 许许多多. But “地震, 动静, 金黄, 巨大, 雕刻” cannot. You can’t say 地地震震, 动动静静, or 金金黄黄. Where exactly is the difference? I still haven’t found that rule.


Now let’s test another pattern. All ten of these words are nouns, but the first five share a property the others lack: 鱼, 路, 船, 裙子, and 短信 all take the classifier 条, as in 一条鱼, 一条路, 一条船, 一条裙子, 一条短信. But “山, 剑, 伞, 文章, 水母” don’t work with 条. Why can’t we say 一条山 or 一条伞? Some people assume 条 applies to long, thin things. But that can’t be the whole rule: 一条鱼 makes sense, yet fish aren’t long and thin in the same way as roads or snakes. And what about 一条短信, 一条政策, 一条人命, or 一条好汉? Clearly, the rule isn’t just about shape.


Let’s try another one. These five words, 腿, 门, 气味, 鱼刺, and 笔记本, can all take the 儿 suffix (a process called 儿化, or erhua): 腿儿, 门儿, 气味儿, 鱼刺儿, 笔记本儿. But “手, 电, 建筑, 铅笔, 地球仪” cannot. Why can some words take 儿 while others can’t? People often say 儿化 makes things sound affectionate or diminutive, but that explanation is unreliable: two closely related words can behave completely differently. 笔记本 can become 笔记本儿, yet 铅笔 can’t become 铅笔儿.


Someone once suggested that maybe words ending in the “i” vowel can’t take 儿. That’s wrong: 铅笔 can’t, but 小鸡 can (小鸡儿). Then maybe it depends on the whole final syllable? Also wrong: 小鸡儿 works, yet 手机儿 doesn’t, even though both words end in the syllable “ji.” Maybe it depends only on the final character? Still wrong. The same character sometimes can and sometimes can’t take 儿, and the meaning changes when it does. 盖 (to cover) vs 盖儿 (a lid). 头 (a head) vs 头儿 (a boss).


So imagine learning Chinese as a foreigner — you’d have to memorize which words can take 儿 and which cannot, just like we memorize irregular verbs in English. But this is worse, because there is no pattern.


Differences like these can lead to even stranger phenomena. Take “别” and “甭.” We use both to mean “don’t.” “别理他” can become “甭理他.” “别吃了” can become “甭吃了.” But not always. Some phrases only work with “别.” You can say “别感冒了,” but not “甭感冒了.” You can say “别忘了,” but not “甭忘了.” “别饿了”? Not “甭饿了.”


What’s the difference? Which verbs allow both, and which allow only “别”? To study this, I listed them. “走, 吃, 买, 洗, 讨论, 打扫, 参加” all accept both. “病, 忘, 饿, 怕, 感冒, 看见, 知道” take only “别.” The distinction lies in control: the first set are actions one can consciously perform (volitional verbs); the second set are states or events that happen to a person (non-volitional verbs).


Once that distinction is clear, many puzzles in Chinese grammar fall into place. Only volitional verbs can stand alone as imperatives: “走,” “吃,” “买.” You can’t point to someone and say “病!” or “感冒!” or “看见!” Volitional verbs can reduplicate: “出去走走,” “去餐厅吃吃.” But you can’t say “冬天到了我们感冒感冒,” or “我们病病.”
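
The three tests above (别 vs 甭, bare imperatives, reduplication) can be sketched as lookups on a single volitional feature. This is only a toy illustration: the verb lists are the ones quoted in this post, while the function names and the idea of storing the feature in two sets are my own assumptions, not an established tool.

```python
# Verb lists are taken from the text; all names below are illustrative.
VOLITIONAL = {"走", "吃", "买", "洗", "讨论", "打扫", "参加"}
NON_VOLITIONAL = {"病", "忘", "饿", "怕", "感冒", "看见", "知道"}


def is_volitional(verb: str) -> bool:
    """Look up the volitional feature; verbs outside the toy lists raise KeyError."""
    if verb in VOLITIONAL:
        return True
    if verb in NON_VOLITIONAL:
        return False
    raise KeyError(f"verb not in the toy lexicon: {verb}")


def prohibitives(verb: str) -> list:
    """Return the "don't" markers the verb accepts: both, or only 别."""
    return ["别", "甭"] if is_volitional(verb) else ["别"]


def bare_imperative_ok(verb: str) -> bool:
    """Only volitional verbs can stand alone as a command."""
    return is_volitional(verb)


def reduplicate(verb: str):
    """Return the doubled form (走 -> 走走), or None if the verb is non-volitional."""
    return verb + verb if is_volitional(verb) else None


if __name__ == "__main__":
    for v in ["吃", "讨论", "感冒"]:
        print(v, prohibitives(v), bare_imperative_ok(v), reduplicate(v))
```

The point of the sketch is that one stored feature predicts three separate behaviors, so nothing has to be memorized verb by verb for each construction.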


Why study such things? First, to teach Chinese to foreigners. Without understanding these patterns, how could anyone know which forms are possible? But second — and more crucially — to teach computers how to process Chinese intelligently.


We’ve begun to formalize this. For every word, we record its part of speech, number of syllables, whether it can take 儿, whether it’s volitional, and so on. Then we add rules: a word that is not marked as taking 儿 may not have 儿 attached; a non-volitional verb cannot stand alone as a sentence, nor can it reduplicate.
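
To make the idea concrete, here is a minimal sketch of what such a record and its rules might look like. The `Entry` class, its field names, and the `violations` function are hypothetical, invented for this illustration; the five sample words and their features come from the examples earlier in this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """One hypothetical lexicon record; the field names are assumptions."""
    word: str
    pos: str                           # "n" for noun, "v" for verb
    takes_er: bool = False             # can the word take 儿?
    volitional: Optional[bool] = None  # None for words that are not verbs

    @property
    def syllables(self) -> int:
        # Assumes one character per syllable, which holds for these entries.
        return len(self.word)


LEXICON = {
    e.word: e
    for e in [
        Entry("腿", "n", takes_er=True),
        Entry("笔记本", "n", takes_er=True),
        Entry("铅笔", "n"),
        Entry("走", "v", volitional=True),
        Entry("感冒", "v", volitional=False),
    ]
}


def violations(word: str, *, with_er: bool = False, alone: bool = False,
               reduplicated: bool = False) -> List[str]:
    """List the rules that a proposed use of `word` would break."""
    entry = LEXICON[word]
    found = []
    if with_er and not entry.takes_er:
        found.append("cannot take 儿")
    if entry.pos == "v" and not entry.volitional:
        if alone:
            found.append("non-volitional verb cannot stand alone")
        if reduplicated:
            found.append("non-volitional verb cannot reduplicate")
    return found


if __name__ == "__main__":
    print(violations("铅笔", with_er=True))
    print(violations("感冒", alone=True, reduplicated=True))
```

An empty list means the toy rules have no objection; it does not mean the form is actually good Chinese.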


At first, it looks like we’ve captured the system. But counterexamples are everywhere. Who says a word that rejects 儿 can never be followed by 儿? Look at “这种铅笔儿童不宜使用” (“children should not use this kind of pencil”), where “铅笔” runs straight into “儿童.” Who says non-volitional verbs can’t be doubled? “看了又忘,忘了又看” puts “忘” right next to “忘,” and “不知道知道了会怎么样” contains “知道知道.”


Critics say, “Those are cheating examples.” Why? Because they cross linguistic levels. Words and phrases in Chinese, as in many languages, are built layer by layer. Sentences start simple and expand through recursion. For example, a basic sentence might be “学生学习.” The noun phrase “学生” can expand into “聪明的学生” or “王老师的学生.” Step by step, a sentence grows: “老师逗乐了” becomes “老师被学生逗乐了,” then “老师被迟到的学生逗乐了.”


So we can confine our earlier rules to their own layers. If a noun phrase has the form “名词 + 儿” (noun + 儿), the noun must allow 儿化. If a sentence consists solely of a verb, that verb must be volitional. If a verb phrase is formed by reduplicating a verb, that verb must be volitional. This layering blocks the “cheating” examples, which only seem to break the rules because they run across phrase boundaries.
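
The difference between a flat rule and a layered one can be sketched with the 儿 rule. In this toy version, the flat check scans raw characters, while the layered check only looks at 儿 when it appears as its own suffix token right after a noun. The word segmentation is supplied by hand, and the set name and both function names are assumptions made for this illustration.

```python
# Nouns that the text says cannot take 儿; the set name is illustrative.
NO_ER = {"手", "电", "建筑", "铅笔", "地球仪"}


def flat_check(sentence: str) -> bool:
    """Naive rule on raw characters: reject if a no-儿 word is followed by 儿."""
    return not any(word + "儿" in sentence for word in NO_ER)


def layered_check(tokens: list) -> bool:
    """Layered rule: only a separate 儿 suffix token after a no-儿 noun counts."""
    for previous, token in zip(tokens, tokens[1:]):
        if token == "儿" and previous in NO_ER:
            return False
    return True


if __name__ == "__main__":
    sentence = "这种铅笔儿童不宜使用"
    # Hand-made segmentation (an assumption): 儿 belongs to the word 儿童.
    tokens = ["这种", "铅笔", "儿童", "不宜", "使用"]
    print(flat_check(sentence))           # the flat rule wrongly rejects
    print(layered_check(tokens))          # the layered rule accepts
    print(layered_check(["铅笔", "儿"]))  # a real 铅笔 + 儿 is still rejected
```

The hard part, of course, is producing the segmentation in the first place; the sketch simply assumes it is already available.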


That gives us a general framework for describing many bizarre patterns in Chinese, but it covers only one narrow question: whether a sentence can be said at all. We haven’t even touched meaning.


Semantics is far trickier. Take “我吃完了” and “苹果吃完了.” Both have the same syntax and word order, yet one means the person finished eating and the other means the apples are gone. And “孩子吃完了” is ambiguous: it could mean the child finished eating or, in some dark anthropological context, that the child was eaten.


How does a computer tell? A human knows intuitively — because “我” and “孩子” are human, capable of performing “eat,” whereas “苹果” is an inanimate object, only capable of being eaten. In other words, our brains store semantic categories for words — human, object, abstract, event — and we know what kind of nouns typically appear with what kinds of verbs.


Linguists classify verb–noun relationships into seventeen types. Even among the roles a subject can play, there are subtypes: actor, experiencer, patient, and force.


The actor is the true initiator of an action — in “我吃苹果,” “我” is the actor. The experiencer undergoes a mental or emotional state — “我喜欢她,” “我知道了.” The patient undergoes change — “老王病了,” where “老王” is not acting but changing state. The force, or causer, is an external natural cause — “洪水淹没了房屋,” where “洪水” is the force.


Given a verb, the possible roles of accompanying nouns are largely predetermined. “下雨” needs no noun; “休息” needs one actor — who rests. We never say “老王休息手.” “洗” requires both an actor and a patient — the washer and the washed. “去” requires an actor and a goal. “淹没” takes a force and a patient. “送” involves three: sender, object, and recipient.
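
These role frames can be written down as a small table. The sketch below is a toy, not a real resource: the `FRAMES` dictionary and the `bind_roles` function are hypothetical names, and it makes the simplifying assumption that nouns are supplied in the same order as the roles in the frame.

```python
# Role frames for the verbs discussed above; names are illustrative.
FRAMES = {
    "下雨": [],
    "休息": ["actor"],
    "洗": ["actor", "patient"],
    "去": ["actor", "goal"],
    "淹没": ["force", "patient"],
    "送": ["sender", "object", "recipient"],
}


def bind_roles(verb: str, nouns: list) -> dict:
    """Pair each noun with a role from the verb's frame, in order.

    Fewer nouns than roles is allowed (arguments are often left out);
    more nouns than roles is rejected.
    """
    roles = FRAMES[verb]
    if len(nouns) > len(roles):
        raise ValueError(f"{verb} takes at most {len(roles)} noun(s), got {len(nouns)}")
    return dict(zip(roles, nouns))


if __name__ == "__main__":
    print(bind_roles("淹没", ["洪水", "房屋"]))
    try:
        bind_roles("休息", ["老王", "手"])  # the pattern behind 老王休息手
    except ValueError as error:
        print(error)
```

With this table, “老王休息手” fails for a simple reason: 休息 has one slot and the sentence tries to fill two.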


Now, by encoding such relations, we can make machines understand meaning. For “吃,” we specify: actor = human or animal; patient = food or medicine. That solves “我吃完了” (actor = me), “苹果吃完了” (patient = apples), and “孩子吃完了” (actor = child). For “淹没,” force = natural entity; patient = structure. “洪水淹没了房屋” fits. “洪水淹没了村庄” or “城市”? Fine — we extend “patient” to include structures and spaces. What about “悲伤淹没了我” or “黑暗淹没了我”? We add “abstract” to force and “human” to patient. Step by step, the model captures more of the language.
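
Here is a minimal sketch of that encoding. The semantic categories and restrictions follow the examples in the paragraph above, but the table names, the category labels assigned to each noun, and the `possible_roles` function are assumptions made for illustration.

```python
# Semantic category of each noun (labels are illustrative assumptions).
CATEGORY = {
    "我": "human",
    "孩子": "human",
    "苹果": "food",
    "洪水": "natural",
    "房屋": "structure",
    "悲伤": "abstract",
}

# Which categories may fill each role of a verb, as described in the text.
RESTRICTIONS = {
    "吃": {
        "actor": {"human", "animal"},
        "patient": {"food", "medicine"},
    },
    "淹没": {
        "force": {"natural", "abstract"},
        "patient": {"structure", "space", "human"},
    },
}


def possible_roles(verb: str, noun: str) -> list:
    """Return every role of `verb` whose restriction admits `noun`."""
    category = CATEGORY[noun]
    return [role for role, allowed in RESTRICTIONS[verb].items() if category in allowed]


if __name__ == "__main__":
    for noun in ["我", "苹果", "孩子"]:
        print(noun + "吃完了", possible_roles("吃", noun))
    print("悲伤淹没了我", possible_roles("淹没", "悲伤"), possible_roles("淹没", "我"))
```

Under these restrictions “孩子” can only be the actor of “吃,” so the darker reading of “孩子吃完了” is simply ruled out rather than ranked as unlikely. Whether a model should forbid such readings or merely disprefer them is a design choice this sketch does not settle.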


But does this solve all of Chinese? Not yet. There are still countless quirks that need new models.


Consider these four phrases: “砍光了,” “砍累了,” “砍钝了,” “砍快了.” Each means something different. “砍光了” — all the trees are gone. “砍累了” — the person is tired. “砍钝了” — the axe is dull. “砍快了” — the action was too fast. Our existing models can’t explain how one verb can pair with so many distinct result states.


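Or these: “我答应他明天去” (“I promised him to go tomorrow”) means I go; “我说服他明天去” (“I persuaded him to go tomorrow”) means he goes. What decides that shift? Or when two verbs appear together, how do we capture their hidden relationship? “抓住不放” expresses repetition, in the sense of restatement: “抓住” (holding on) amounts to “不放” (not letting go). “说起来气人” expresses a condition: whenever one “说起来” (brings it up), it “气人” (is infuriating).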


We can show that these two relations are genuinely different. Take “留着没用.” It’s ambiguous. On the repetition reading, it means “It’s been kept all this time and never used.” On the conditional reading, it means “If we keep it, it’ll be useless.”


Such phenomena show we constantly need new models — and perhaps one day a model of models. A universal meta-pattern that explains how all linguistic patterns arise.


And if we go even further, maybe that “pattern of patterns,” that ultimate rule, is exactly what distinguishes the human brain from the brains of all other animals.


Among all the unsolved mysteries of human knowledge, this one — the search for the hidden laws behind language itself — must be one of the most thrilling of all.
