Persistent URL of this record https://hdl.handle.net/1887/3485800
Documents
- Chapter 2 (open access)
- Full text at publisher's site
- Bibliography (open access)
- Summary in Dutch (open access)
- Curriculum Vitae (open access)
- Propositions (open access)
Towards the automatic detection of syntactic differences
This dissertation centers on the question of whether syntactic differences between languages can be detected automatically, and if so, how. With the enormous number of natural languages and dialects, the very high level of variation they exhibit between one another, and the technically infinite number of possible sentences per language or dialect, systematic manual comparison is a hugely daunting task. The field would therefore benefit significantly from the (partial) automation of the process, as it would increase the scale, speed, systematicity and reproducibility of research.
Over the course of five chapters it is shown through case studies involving English, Dutch, German, Czech and Hungarian that correct hypotheses on syntactic differences between languages can be generated automatically from parallel corpora through the use of the minimum description length principle, counting mismatches between part-of-speech pattern occurrences, word alignment and mapping annotation from an annotated language onto another unannotated language. The tools developed for the purposes of this research work well and can aid a linguist significantly in their search for differences or similarities, but do not replace the human researcher.
- All authors: Kroon, M.S.
- Supervisor: Barbiers, L.C.J.; Odijk, J.E.J.M.
- Co-supervisor: Pas, S.L. van der
- Committee: Raaijmakers, S.A.; Nerbonne, J.; Grünwald, P.D.; Noord, G.J.M. van; Prokic, J.
- Qualification: Doctor (dr.)
- Awarding Institution: Leiden University Centre for Linguistics (LUCL), Faculty of Humanities, Leiden University
- Date: 2022-11-10
- Title of host publication: LOT dissertation series
- Publisher: Amsterdam: LOT
- ISBN (print): 9789460934148
Publication Series
- Name: 629