Developing Linguistic Resources for Chinese Deep Linguistic Processing
  • The Open Mandarin Grammar (OMG) project is a research effort that aims to develop computational resources for deep linguistic processing of Mandarin Chinese.

  • The first goal of OMG is to describe central phenomena in Chinese within several grammatical frameworks, including Tree-Adjoining Grammar (TAG), Lexical-Functional Grammar (LFG), Head-Driven Phrase Structure Grammar (HPSG), and (perhaps) Combinatory Categorial Grammar (CCG). Particular attention will be paid to the syntax-semantics interface. Building on these phenomenon-driven analyses, the project also targets the development of a large-scale, hand-crafted HPSG grammar for Mandarin Chinese, exploring the connection between linguistic theory and computational implementation. For this grammar, we will adopt the DELPH-IN methodology.

  • In addition to grammar-oriented research, we are also working on corpus-oriented development for grammar engineering. In particular, relatively deep grammars and annotations will be built on top of existing, relatively shallow treebanks. The goal here is more practical: building usable natural language understanding systems. An example is our ACL 2014 work (Sun et al., 2014): starting from the Chinese TreeBank, we built deep dependency annotations in a short time. These annotations allow us to develop statistical parsers for various Natural Language Processing applications.
  • Publications
    Weiwei Sun, Yantao Du, Xin Kou, Shuoyang Ding and Xiaojun Wan. 2014. Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing. In Proceedings of ACL 2014.
