If you are not familiar with the term computational linguistics, you are not alone.
But there is a group of 12 Culver Academies students preparing to take the first step toward qualifying for an international computational linguistics competition. They met for the first time Sunday afternoon in the Huffington Library.
CGA sophomore Xinran (Olivia) Ma (Shanghai, China) is putting together the group that will compete in the first round of the North American Computational Linguistics Open Competition (NACLO) at the University of Notre Dame on Jan. 23. Notre Dame serves as one of approximately 200 regional sites for the competition.
The competition uses logic skills to solve linguistic puzzles. No knowledge of linguistics or different languages is needed. In fact, people who speak multiple languages can sometimes get bogged down in the details, Ma said.
“It’s real easy to overthink it,” she explained, adding that people who speak multiple languages will dive too deep into how verbs are conjugated and nouns are declined in each language. And, when you’re dealing in Mayan, what good does conjugating a verb really do?
Yes, Mayan does show up on some of the sample questions, as do Choctaw and Latin. Even braille. Some of the group’s sample questions included translating common Japanese words like karate, karaoke, and haiku into braille.
What is important, Ma explained, is finding the patterns or rhythms in each language. Once a person unlocks that, it becomes easier to do the translations. That is why a strong math and science background can be of benefit.
“I call it Sudoku for language,” she said.
The students will be cramming for the competition after they return from winter break. Ma said she will spend her time making notes and study guides for the others, which should help ease the learning curve. Two more group sessions are also planned before they head to Notre Dame.
The competition is designed to teach students about the diversity and consistency of language while exercising logic skills, according to the NACLO website. The competition uses dozens of languages to create problems that represent cutting-edge issues in the fields of linguistics, computational linguistics and language technologies. “It is truly an opportunity for young people to experience a taste of natural-language processing in the 21st century,” the website states.
The open round of the competition features a three-hour test that will determine who moves on to the invitational round involving students from the United States and Canada. This past January, 1,506 students in both countries participated in the open round hosted by the regional sites. The top 10 percent (152 students) moved on to the invitational round in March, which features a more difficult four-hour test.
From the invitational round, the U.S. selected two teams of four students each and five alternates. Canada fielded Anglophone and Francophone teams. Those teams then competed in the international competition at the Hankuk University of Foreign Studies in Yongin, South Korea. The international competition features both individual and team competitions. A total of 209 competitors traveled from 36 countries to participate.
The individual competition featured a six-hour exam with five problems. The featured languages and scripts were Yongom, Yurok, Book Pahlavi script, West Tarangan, and Nooni. The three-hour team competition consisted of just one problem: work out the rules of the notation system used by rhythmic gymnastics judges. The U.S. teams won 10 individual and team medals, including the overall team trophy, and the Anglophone team from Canada collected three medals.
The 2020 international competition will be conducted in Ventspils, Latvia, on July 20-24. The international competition started in 2003. The U.S. has been competing since 2007.