At the start of each research enterprise, historical sociolinguists have to deal with the key issue of the required historical data. In the present world of big data optimism, the idea may arise that the time-consuming compilation of specialised corpora is no longer needed. Can taking a shortcut still lead us to convincing results? In this article I discuss the crucial relationship between specific research questions and appropriate historical data. This methodological issue will be illustrated by concentrating on three historical-sociolinguistic research programmes, conducted at Leiden University: Letters as loot: Towards a non-standard view on the history of Dutch (2008–2013), Going Dutch: The construction of Dutch in policy, practice and discourse (2013–2018) and Pardon my French? Dutch-French language contact in the Netherlands, 1500–1900 (2018–2023). What do we learn from these large-scale projects which address different research questions and focus on different periods in the history of Dutch? The use of specific sources, handwritten material such as ego-documents, a multi-genre approach and details of corpus compilation will be discussed. The various approaches and results are considered against the background of methodological developments and current debates in historical sociolinguistics. I argue and conclude that the careful compilation of specialised corpora remains essential as a solid foundation for historical sociolinguistic research.
This paper seeks to approach the topic of historical language choice from a quantitative perspective, arguing that solid baseline evidence drawn from a substantial dataset is a much-needed complement to the largely qualitative findings of previous research. We propose a methodological framework which enables us to examine the sociolinguistic factors that condition language choice in the private domain. Illustrating the possibilities of our methodology, we present a case study on Dutch-French language choice in the Northern Low Countries (i.e., the present-day Netherlands), focusing on nineteenth-century family correspondence. Our paper shows that a careful selection procedure is crucial in order to achieve a balanced representation of language choice in a large-scale dataset. With respect to our analyses, the role of French in private letters turns out to be relatively small against the prevalence of Dutch. However, interesting patterns become visible when looking at regional differences, gender constellations and familial relationships. These quantitative findings can therefore constitute an interpretational frame for qualitative studies on historical language choice in the Dutch-French context and beyond.