[Herald Interview] Proficient process to get linguistic edge

Linguist emphasizes extensive possibilities of corpus-related English studies

Nov. 20, 2014 - 21:26 By Yoon Min-sik

Living in the land of the oft-touted “English fever,” Koreans are no strangers to working hard to achieve their academic goals. They try a number of techniques, ranging from eating pages torn from English dictionaries (a ritual that supposedly helps you “digest” the words you memorized) to combing bookstores for the “perfect vocabulary list.”

But relying on rote memorization can only get you so far. Studying English as a foreign language requires a more proficient process, which is made possible by what is called a “corpus,” said Goh Gwang-yoon, a professor of English language and literature at Yonsei University and the head of the Center for Corpus-based English Language Studies.

The word “corpus” refers to a large collection of written or spoken texts used for language research.

“The slogan of our center is to ‘popularize and extend the use of corpus,’ which is basically to let people know just how useful corpus is. On top of that, we plan to establish corpuses that the public can easily access,” he said.

In Korea, the Ministry of Education and other educational institutions offer vocabulary lists that one needs to study for specific purposes, such as preparing for the college entrance exam.

But many such vocabulary lists tend to rely heavily on the author’s personal experience or skills and some native speakers’ intuition, Goh said. While these individuals may be competent, corpus analysis can cover far more ground.

“Corpus-based studying analyzes 5,000 to 6,000 books, in which tens of thousands of native speakers are involved. We are essentially borrowing their insights; no amount of brilliance could top that,” he said.

Goh Gwang-yoon, an English professor from Yonsei University, speaks during one of his workshops on how to use corpus in English-related studies.

Identifying key vocabulary considerably lessens students’ workload. Recent studies have shown that the 2,000 most frequently used words make up roughly 80 percent of any English text.

Corpus-based analysis is important since it allows scholars to find out exactly what these words are, Goh explained. “Because if you randomly pick any 2,000 words, you’ll be lucky to cover 1 percent of the text.”
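The coverage figure Goh describes can be illustrated with a small calculation. The following is only a minimal sketch, not the center's actual method: the function names `tokenize`, `top_words` and `coverage` are hypothetical, the tokenizer is deliberately crude, and the two short strings stand in for a real reference corpus and a real text.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(reference_tokens, n):
    """Return the n most frequent words in a reference corpus as a set."""
    return {word for word, _ in Counter(reference_tokens).most_common(n)}

def coverage(text_tokens, known_words):
    """Fraction of the running words in a text that appear in known_words."""
    if not text_tokens:
        return 0.0
    hits = sum(1 for token in text_tokens if token in known_words)
    return hits / len(text_tokens)

# Toy data standing in for a large reference corpus and a text to be read.
reference = tokenize("the cat sat on the mat and the dog sat on the rug")
text = tokenize("the dog sat on the mat")

frequent = top_words(reference, 3)
print(coverage(text, frequent))
```

With a real corpus, `n` would be set to 2,000 and the frequency list would come from millions of words rather than one sentence; the point of the sketch is only that coverage depends on which words are chosen, not merely on how many.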

To read a text fluently without referring to a dictionary, one has to know 98 percent of its vocabulary, Goh said. As it is virtually impossible for non-native speakers to recognize an average of 98 percent of the English words in a given text, what is more important is to narrow the range of keywords to study with the help of a corpus.
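To see why the last stretch toward 98 percent is so costly, one can count how many of the most frequent words are needed to reach a given share of a text. This is a rough sketch under stated assumptions: `words_needed` is a hypothetical helper, the word list is ranked on the same toy sentence it is measured against, and whitespace splitting stands in for proper tokenization.

```python
from collections import Counter

def words_needed(tokens, target):
    """Smallest number of top-ranked distinct words whose combined
    frequency reaches the target share of all running words."""
    total = len(tokens)
    if total == 0:
        return 0
    running = 0
    # Walk down the frequency ranking, accumulating token counts.
    for rank, (_, count) in enumerate(Counter(tokens).most_common(), start=1):
        running += count
        if running / total >= target:
            return rank
    return len(set(tokens))

tokens = "the cat sat on the mat and the dog sat on the rug".split()
print(words_needed(tokens, 0.5))   # words needed for half of the text
print(words_needed(tokens, 0.98))  # words needed for 98 percent of the text
```

Even in this tiny example, a few frequent words account for half of the text, while reaching 98 percent requires every distinct word, including those that occur only once.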

The data analysis allows non-native speakers to overcome other shortfalls as well, such as a lack of reading.

Experts like renowned linguist Stephen Krashen have emphasized the importance of reading. In “Extensive reading in English as a foreign language,” Krashen said that reading is the “only way” to acquire good writing skills, adequate vocabulary, advanced grammar and spelling skills.

Goh’s students proved the power of extensive reading. He said they read thousands of books, which was the main reason why they got near-perfect or perfect TOEIC scores in elementary school.

But realistically speaking, it would be impossible for most students to get the same benefits out of reading, Goh added. “Reading alone is not enough. Comprehensible input of high quality text is crucial, which only a few can provide,” he said.

Since the late 1990s, private education institutes have strived to incorporate real-world texts into their teaching repertoire. But few institutes run extensive reading programs, which require more time and preparation. After all, the majority of students demand quick, visible results in English classes. Education institutes also openly publicize their programs as a path to better grades in a short period of time.

As a result, teachers in charge of English reading classes often ask students to study short excerpts from texts.

This is where corpus enters the picture. “What we need is a modified extensive reading model. This encourages students to read up to a certain level, but uses corpus to make up for the shortcomings from the lack of reading,” Goh said.

Contrary to what many Koreans think, corpus-based research already accounts for the lion’s share of practical English-related works such as dictionaries, Goh said. The trend has been growing in Korea as well.

“Those who have used intuition are now realizing that disregarding corpus would be their downfall. Corpus has nothing to do with one’s major or the framework of his or her work; it is a method, an enormous amount of English data that one can explore.”

By Yoon Min-sik (minsikyoon@heraldcorp.com)
