In this resource paper we release ChiSCor, a new corpus containing 619 fantasy stories, told freely by 442 Dutch children aged 4-12. ChiSCor was compiled for studying how children render character perspectives, and unravelling language and cognition in development, with computational tools. Unlike existing resources, ChiSCor’s stories were produced in natural contexts, in line with recent calls for more ecologically valid datasets. ChiSCor hosts text, audio, and annotations for character complexity and linguistic complexity. Additional metadata (e.g. education of caregivers) is available for one third of the Dutch children. ChiSCor also includes a small set of 62 English stories. This paper details how ChiSCor was compiled and shows its potential for future work with three brief case studies: i) we show that the syntactic complexity of stories is strikingly stable across children’s ages; ii) we extend work on Zipfian distributions in free speech and show that ChiSCor obeys Zipf’s law closely, reflecting its social context; iii) we show that even though ChiSCor is relatively small, the corpus is rich enough to train informative lemma vectors that allow us to analyse children’s language use. We end with a reflection on the value of narrative datasets in computational linguistics.
Duijn, M.J. van; Dijk, B.M.A. van; Kouwenhoven, T.; Valk, W. de; Spruit, M.; Putten, P.W.H. van der 2023
To what degree should we ascribe cognitive capacities to Large Language Models (LLMs), such as the ability to reason about intentions and beliefs known as Theory of Mind (ToM)? Here we add to this emerging debate by (i) testing 11 base- and instruction-tuned LLMs on capabilities relevant to ToM beyond the dominant false-belief paradigm, including non-literal language usage and recursive intentionality; (ii) using newly rewritten versions of standardized tests to gauge LLMs’ robustness; (iii) prompting and scoring for open besides closed questions; and (iv) benchmarking LLM performance against that of children aged 7-10 on the same tasks. We find that instruction-tuned LLMs from the GPT family outperform other models, and often also children. Base-LLMs are mostly unable to solve ToM tasks, even with specialized prompting. We suggest that the interlinked evolution and development of language and ToM may help explain what instruction-tuning adds: rewarding cooperative communication that takes into account interlocutor and context. We conclude by arguing for a nuanced perspective on ToM in LLMs.
Dijk, B.M.A. van; Kouwenhoven, T.; Spruit, M.; Duijn, M.J. van 2023
Current Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text. LLMs are appearing rapidly, and debates on LLM capacities have taken off, but reflection is lagging behind. Thus, in this position paper, we first zoom in on the debate and critically assess three points recurring in critiques of LLM capacities: i) that LLMs only parrot statistical patterns in the training data; ii) that LLMs master formal but not functional language competence; and iii) that language learning in LLMs cannot inform human language learning. Drawing on empirical and theoretical arguments, we show that these points need more nuance. Second, we outline a pragmatic perspective on the issue of ‘real’ understanding and intentionality in LLMs. Understanding and intentionality pertain to unobservable mental states we attribute to other humans because they have pragmatic value: they allow us to abstract away from complex underlying mechanics and predict behaviour effectively. We reflect on the circumstances under which it would make sense for humans to similarly attribute mental states to LLMs, thereby outlining a pragmatic philosophical context for LLMs as an increasingly prominent technology in society.
Dijk, B.M.A. van; Spruit, M.R.; Duijn, M.J. van 2023
Children are the focal point for studying the link between language and Theory of Mind (ToM) competence. Language and ToM are often studied with younger children and standardized tests, but as both are social competences, data and methods with higher ecological validity are critical. We leverage a corpus of 442 freely-told stories by Dutch children aged 4-12, recorded in their everyday classroom environments, to study language and ToM with NLP-tools. We labelled stories according to the mental depth of story characters children create, as a proxy for their ToM competence ‘in action’, and built a classifier with features encoding linguistic competences identified in existing work as predictive of ToM. We obtain good and fairly robust results (F1-macro = .71), relative to the complexity of the task for humans. Our results are explainable in that we link specific linguistic features such as lexical complexity and sentential complementation, that are relatively independent of children’s ages, to higher levels of character depth. This confirms and extends earlier work, as our study includes older children and socially embedded data from a different domain. Overall, our results support the idea that language and ToM are strongly interlinked, and that in narratives the former can scaffold the latter.
Story characters not only perform actions, they typically also perceive, feel, think, and communicate. Here we are interested in how children render characters’ perspectives when freely telling a fantasy story. Drawing on a sample of 150 narratives elicited from Dutch children aged 4-12, we provide an inventory of 750 instances of character-perspective representation (CPR), distinguishing fourteen different types. Firstly, we observe that character perspectives are ubiquitous in freely told children’s stories and take more varied forms than traditional frameworks can accommodate. Secondly, we discuss variation in the use of different types of CPR across age groups, finding that character perspectives are being fleshed out in more advanced and diverse ways as children grow older. Thirdly, we explore whether such variation can be meaningfully linked to automatically extracted linguistic features, thereby probing the potential for using automated tools from NLP to extract and classify character perspectives in children’s stories.
Visser, M.; Duijn, M.J. van; Tong, S.; Lamers, M.H. 2022
Studies into the history of mate search show that important mate choice criteria vary through time, influenced by societal circumstances, whereas others seem to remain constant. Nowadays, online dating has become one of the most popular ways to meet a partner, which has changed the mate selection process because this computer-mediated communication facilitates selective self-presentation. Biographies of dating app profiles are space limited and often self-written, providing insight into what users find crucial to mention about themselves and their wishes for a future partner. Here we collected 300 biographies from three online dating platforms and coded them on four content themes: dating intention, personality and appearance of self vs. potential partner, and lifestyle. We present a preliminary comparison of these mate choice criteria between gender and age.
From age 3-4, children are generally capable of telling stories about a topic free of choice. Over the years their stories become more sophisticated in content and structure, reflecting various aspects of cognitive development. Here we focus on children’s ability to construe characters with increasing levels of mental depth, arguably reflecting socio-cognitive capacities including Theory of Mind. Within our sample of 51 stories told by children aged 4-10, characters range from flat “actors” performing simple actions, to “agents” having basic perceptive, emotional, and intentional capacities, to full-blown “persons” with complex inner lives. We argue for the underexplored potential of computationally extracted story-internal factors (e.g. lexical/syntactic complexity) in explaining variance in character depth, as opposed to story-external factors (e.g. age, socioeconomic status) on which existing work has focused. We show that especially lexical richness explains variance in character depth, and this effect is larger than and not moderated by age.
Laakasuo, M.; Rotkirch, A.; Duijn, M.J. van; Berg, V.; Jokela, M.; David-Barrett, T.; ... ; Dunbar, R. 2020
Personality affects dyadic relations and teamwork, yet its role among groups of friends has been little explored. We examine for the first time whether similarity in personality enhances the effectiveness of real-life friendship groups. Using data from a longitudinal study of a European fraternity (10 male and 15 female groups), we investigate how individual Big Five personality traits were associated with group formation and whether personality homophily related to how successful the groups were over 1 year (N = 147–196). Group success was measured as group performance/identification (adoption of group markers) and as group bonding (using the inclusion-of-other-in-self scale). Results show that individuals’ similarity in neuroticism and conscientiousness predicted group formation. Furthermore, personality similarity was associated with group success, even after controlling for individuals’ own personality. In particular, higher group-level similarity in conscientiousness was associated with group performance, and with bonding in male groups.
Emoji, colourful pictographs showing faces, creatures and objects, have seen a surge in popularity and number in recent years. This exploratory study strives to answer the following question: how and why are emoji used on Twitter in the Netherlands and England? By combining quantitative and qualitative methods, we identified three important factors explaining emoji usage: the individual’s purpose on Twitter, the perceived functionality of emoji and the individual’s selection criteria for emoji. Overall, emoji play an important role in online communication and their use is more complex than their light-hearted appearance may suggest.
Coordinating different viewpoints is an essential part of human interaction. Languages have evolved conventional ways of supporting this process: many linguistic items are somehow involved in viewpoint management, ranging from morphological elements and lexical units to grammatical constructions and narrative patterns. In this paper we propose a conceptual model for analysing how particular instances (or combinations) of such linguistic items can be used to coordinate the viewpoints of signallers, addressees, and third parties involved in an interaction event. In essence, our model augments Langacker’s (1987) “viewing arrangement” through the addition of a third dimension to the existing two. We discuss the details of our model using a range of examples from spoken discourse, newspaper articles, and literary fiction, and end by placing it in broader discussions on human social cognition.
Human interaction is characterised by an ongoing polyphony of perspectives and perspectives-on-perspectives. Not only do we share and coordinate our own inner life with that of the people we interact with, but we also constantly make implicit and explicit reference to the intentional states of others who may or may not be present at the time of speaking, or who may even exist only in the imagined worlds of thought and fiction. In the cognitive sciences, this polyphony has generally been conceptualised as a series of embedded 'orders of intentionality': A thinks that B understands that C expects... (etc.). I argue that this conceptualisation stands in stark contrast to how multiple perspectives are handled in actual discourse and interaction. Based on linguistic and narratological analysis of literary texts, newspaper articles, examples from spoken discourse, and stimuli from psychological experiments investigating multiple-order intentionality in the lab, a new perspective is offered on how we deal with networks of embedded and interlinked intentional states. The findings are discussed in the light of current theories about mindreading (a.k.a. ‘theory of mind’), discussing in particular cognitive models, issues of development and learning, and scenarios of how this capacity may have evolved in our lineage.