ERIC Number: ED668817
Record Type: Non-Journal
Publication Date: 2021
Pages: 173
Abstractor: As Provided
ISBN: 979-8-5442-7924-2
ISSN: N/A
EISSN: N/A
Available Date: N/A
Structure and Learning in Natural Language
Jonathan Rawski
ProQuest LLC, Ph.D. Dissertation, State University of New York at Stony Brook
Human language is an incredibly rich yet incredibly constrained system. Learning and generalizing these systematic constraints from small, sparse, and underspecified data presents a fundamental inference problem. The rapidity and ease with which humans learn these constraints have made this a foundational problem in cognitive science, linguistics, and artificial intelligence. Traditional approaches treat this problem as grammar induction, positing structured mental representations in which statistical learning strategies form inductive biases for heuristically privileging some types of constraints over others. This dissertation shows how structural properties of the space of possible grammars themselves enable learning, revealing that the role of statistical heuristics is overrated for a variety of linguistically relevant learning problems. The representational primitives of a grammar -- whatever they may be -- form a partial order, and the dissertation presents a learning algorithm that traverses this space to select a grammar. Since the algorithm is agnostic to the type of representations, the dissertation provides a computational separation between the mental structures learners extract from data and the learning strategy they use to generalize. The dissertation then demonstrates the effectiveness of the algorithm on several well-understood phonological patterns governing the distribution of sounds into words. While the learning algorithm succeeds for typical representations advocated by phonologists, it reveals that the constraint space is not only large but also redundant: the algorithm is guaranteed to find all surface-true grammars. For this reason, induction alone is insufficient for successful learning, and the dissertation describes additional non-statistical abductive principles for selecting particular grammars over others. Finally, while the representations considered by the algorithm are discrete, the dissertation shows how to translate these structures and constraints into the distributed representations characteristic of neural learning systems via tensor algebra. In this way, the thesis addresses fundamental questions about structure and learning. Overall, these results clarify the roles of induction and abduction in grammatical inference, help us understand the role structure and statistics play in these processes, and provide an analytical link between the cognitive issues of structure, learning, and bias in natural language. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
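The abstract describes a learner that traverses a partially ordered space of grammatical primitives and returns the surface-true constraints attested by the data. The following is a minimal illustrative sketch of that general idea, not the dissertation's algorithm: it assumes, purely for illustration, that the primitives are strictly local k-factors (bounded substrings) ordered by containment, and it keeps every factor never observed in the data as a surface-true constraint. The full setting would additionally exploit the partial order to prune redundant (entailed) constraints, which this sketch omits.

    # Illustrative sketch only. Assumptions (not from the dissertation):
    # primitives are k-factors over a small alphabet, "#" marks word boundaries,
    # and a constraint is any k-factor that is never attested in the data.
    from itertools import product

    def k_factors(alphabet, k, boundary="#"):
        """All strings of length k over the alphabet plus the boundary symbol."""
        symbols = list(alphabet) + [boundary]
        return {"".join(p) for p in product(symbols, repeat=k)}

    def attested_factors(words, k, boundary="#"):
        """Factors of length k actually observed in the boundary-padded data."""
        observed = set()
        for w in words:
            padded = boundary * (k - 1) + w + boundary * (k - 1)
            for i in range(len(padded) - k + 1):
                observed.add(padded[i:i + k])
        return observed

    def surface_true_constraints(words, alphabet, k):
        """Every k-factor never seen in the data is kept as a surface-true ban."""
        return k_factors(alphabet, k) - attested_factors(words, k)

    if __name__ == "__main__":
        # Toy data in which vowels never occur adjacent to one another.
        data = ["pata", "tapa", "pat", "tap"]
        print(sorted(surface_true_constraints(data, alphabet="pta", k=2)))
        # Output includes "aa", i.e. a *VV-style ban, along with other unattested bigrams.

As the abstract notes, a learner of this kind returns all surface-true constraints, many of them redundant, so some further (abductive) principle is needed to select a particular grammar from among them.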
Descriptors: Natural Language Processing, Algorithms, Grammar, Computational Linguistics, Schemata (Cognition), Phonology
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A