ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-06032010-151929


Thesis type
Master's thesis (laurea specialistica)
Author
LAPESA, GABRIELLA
URN
etd-06032010-151929
Title
Starting Where the Dictionaries Stop: a Distributional Resource for the Study of Italian Verbs at the Syntax-Semantics Interface
Department
INTERFACOLTA'
Degree programme
LINGUISTICS
Supervisors
supervisor Prof. Lenci, Alessandro
co-supervisor Prof. Marotta, Giovanna
Keywords
  • Manner of Motion Verbs
  • Lexical Sets
  • Selectional Preferences
  • Subcategorization Frames
  • Argument Realization
  • Lexical Semantics
  • Distributional Semantics
Start date of the defense session
23/06/2010
Availability
Not available for consultation
Release date
23/06/2050
Abstract
Verb meanings represent construals of events, lexicalizations of actual happenings in the world: a proper representation of verb meaning has to account both for the representational content to be ascribed to the verb and for the mechanisms underlying the interaction between verb meanings and event participants in argument realization.
Events have associated semantic roles, while syntactic constructions have argument slots to be filled: how do syntax and semantics interact? What are the principles governing argument realization?

This thesis, grounded in the Distributional Semantics framework, is concerned with the automatic extraction of distributional information about the behavior of Italian verbs at the syntax-semantics interface. In the literature this topic is divided into several closely related subtasks: extraction of subcategorization frames, assignment of selectional preferences to verb arguments in terms of both lexical fillers and semantic classes, semantic role labeling, identification of diathesis alternations, and automatic extraction of verb classes.
The goal of this work is to describe the behavior of verbs in terms of both a syntactic and a semantic distributional profile. The former is composed of the syntactic constructions (subcategorization frames) that best characterize the target verb; the latter is articulated in three subsets: the lexical sets of the words that best fill each frame slot, the semantic classes (selectional preferences) abstracted from these words, and the polysemies built by combining these classes.
The "final product" is LexIt, a distributional resource for the study of Italian verbs based on the "La Repubblica" corpus: in this database, syntactic frames, lexical fillers, semantic classes and polysemies are brought together and scored for their statistical salience with respect to the verbs.

Distributional data extracted from "La Repubblica" were used to tackle a case study: the behavior of Italian verbs of Manner of Motion at the syntax-semantics interface. The distributional correlates of an existing classification were analysed to shed light on specific issues: the asymmetry between SOURCE and GOAL; the ability of MANNER verbs to modify telic (and, in particular, boundary-crossing) predicates; the ability of MANNER verbs to be modified by directional adverbs; and the metaphorical uses of these verbs. The linguistic analysis of Manner of Motion verbs is to be considered part of the resource itself, because the methodology applied is a proposal for how LexIt can be used in actual lexical analysis.

The first chapter of this thesis addresses argument realization from the point of view of both theoretical linguistics and lexicographic resources. The second chapter outlines the distributional methodology applied for data extraction: for every subtask, it provides an introduction to the linguistic phenomenon to be computationally modeled, a review of the state of the art, and a detailed description of the algorithms applied, together with examples of the results. The third chapter presents the case study described above.
File