Cargo ships navigating global waters are required to be sufficiently safe and compliant with international treaties. Governmental inspectorates currently assess in a rule-based manner whether a ship is potentially noncompliant and thus needs inspection. One of the dominant ship characteristics in this assessment is the ‘colour’ of the flag a ship is flying, where countries with a positive reputation have a so-called ‘white flag’. The colour of a flag may disproportionately influence the inspector, causing more frequent and stricter inspections of ships flying a non-white flag, resulting in confirmation bias in historical inspection data. In this paper, we propose an automated approach for the assessment of ship noncompliance, realising two important contributions. First, we reduce confirmation bias by using fair classifiers that decorrelate the flag from the risk classification returned by the model. Second, we extract mobility patterns from a cargo ship network, allowing us to derive meaningful features for ship classification. Crucially, these features model the behaviour of a ship, rather than its static properties. Our approach shows both a higher overall prediction performance and improved fairness with respect to the flag. Ultimately, this work enables inspectorates to better target noncompliant ships, thereby improving overall maritime safety and environmental protection.
Valedictory Address presented in abbreviated form at the public farewell to the office of Professor of Law and Computer Science at eLaw – Center for Law and Digital Technologies of the Faculty of Law of Leiden University on Friday, October 8th, 2021.
Link prediction is a well-studied technique for inferring the missing edges between two nodes in some static representation of a network. In modern day social networks, the timestamps associated with each link can be used to predict future links between so-far unconnected nodes. In these so-called temporal networks, we speak of temporal link prediction. This paper presents a systematic investigation of supervised temporal link prediction on 26 temporal, structurally diverse, real-world networks ranging from thousands to a million nodes and links. We analyse the relation between global structural properties of each network and the obtained temporal link prediction performance, employing a set of well-established topological features commonly used in the link prediction literature. We report on four contributions. First, using temporal information, an improvement of prediction performance is observed. Second, our experiments show that degree disassortative networks perform better in temporal link prediction than assortative networks. Third, we present a new approach to investigate the distinction between networks modelling discrete events and networks modelling persistent relations. Unlike earlier work, our approach utilises information on all past events in a systematic way, resulting in substantially higher link prediction performance. Fourth, we report on the influence of the temporal activity of the node or the edge on the link prediction performance, and show that the performance differs depending on the considered network type. In the studied information networks, temporal information on the node appears most important.
The findings in this paper demonstrate how link prediction can effectively be improved in temporal networks, explicitly taking into account the type of connectivity modelled by the temporal edge. More generally, the findings contribute to a better understanding of the mechanisms behind the evolution of networks.
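The topological features mentioned in the abstract above include classic neighbourhood-based scores. As a minimal illustration (the graph and node names are hypothetical, not from the paper's datasets), the common-neighbours score for a candidate link can be sketched as:

```python
# Hypothetical sketch of a classic topological feature used in link
# prediction: the number of common neighbours of a candidate node pair.
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency structure from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def common_neighbours(adj, u, v):
    """Score a candidate link (u, v) by the overlap of their neighbourhoods."""
    return len(adj[u] & adj[v])

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = build_adjacency(edges)
print(common_neighbours(adj, "a", "d"))  # both neighbour "c", so the score is 1
```

In supervised temporal link prediction, such scores (possibly weighted by the timestamps of past events) become input features for a classifier.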
In this paper we analyse the classification of zoological illustrations. Historically, zoological illustrations were the modus operandi for the documentation of new species, and now serve as crucial sources for long-term ecological and biodiversity research. By employing computational methods for classification, the data can be made amenable to research. Automated species identification is challenging due to the long-tailed nature of the data, and the millions of possible classes in the species taxonomy. Success commonly depends on large training sets with many examples per class, but images from only a subset of classes are digitally available, and many images are unlabelled, since labelling requires domain expertise. We explore zero-shot learning to address the problem, where features are learned from classes with medium to large samples, which are then transferred to recognise classes with few or no training samples. We specifically explore how distributed, multi-modal background knowledge from data providers, such as the Global Biodiversity Information Facility (GBIF), iNaturalist, and the Biodiversity Heritage Library (BHL), can be used to share knowledge between classes for zero-shot learning. We train a prototypical network for zero-shot classification, and introduce fused prototypes (FP) and hierarchical prototype loss (HPL) to optimise the model. Finally, we analyse the performance of the model for use in real-world applications. The experimental results are encouraging, indicating potential for use of such models in an expert support system, but also underline the difficulty of our task, showing a necessity for research into computer vision methods that are able to learn from small samples.
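The core idea of a prototypical network, as referenced in the abstract above, is that each class is represented by the mean of its support embeddings and a query is assigned to the nearest prototype. A minimal sketch (the embeddings and class names are illustrative, not the paper's fused prototypes or hierarchical loss):

```python
# Hypothetical sketch of nearest-prototype classification, the mechanism
# underlying prototypical networks: each class is the mean of its support
# embeddings; a query is assigned to the closest prototype.
import numpy as np

def class_prototypes(support_embeddings):
    """Map each class label to the mean of its support embeddings."""
    return {label: np.mean(vecs, axis=0) for label, vecs in support_embeddings.items()}

def classify(query, prototypes):
    """Assign the query embedding to the class with the nearest prototype."""
    return min(prototypes, key=lambda label: np.linalg.norm(query - prototypes[label]))

support = {
    "species_A": np.array([[0.9, 0.1], [1.1, -0.1]]),
    "species_B": np.array([[-1.0, 0.0], [-0.8, 0.2]]),
}
protos = class_prototypes(support)
print(classify(np.array([1.0, 0.0]), protos))  # → species_A
```

Zero-shot variants replace the visual support embeddings by prototypes derived from background knowledge, so that classes without training images still obtain a prototype.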
In link prediction, the goal is to predict which links will appear in the future of an evolving network. To estimate the performance of these models in a supervised machine learning model, disjoint and independent train and test sets are needed. However, objects in a real-world network are inherently related to each other. Therefore, it is far from trivial to separate candidate links into these disjoint sets. Here we characterize and empirically investigate the two dominant approaches from the literature for creating separate train and test sets in link prediction, referred to as random and temporal splits. Comparing the performance of these two approaches on several large temporal network datasets, we find evidence that random splits may result in too optimistic results, whereas a temporal split may give a more fair and realistic indication of performance. Results appear robust to the selection of temporal intervals. These findings will be of interest to researchers who employ link prediction or other machine learning tasks in networks.
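The random and temporal splits compared in the abstract above can be sketched minimally as follows (edge timestamps, the split fraction, and the seed are illustrative assumptions, not the paper's setup):

```python
# Hypothetical sketch contrasting a random and a temporal train/test split
# for link prediction on a timestamped edge list of (u, v, timestamp) tuples.
import random

def random_split(edges, test_fraction=0.3, seed=42):
    """Randomly assign candidate links to train/test, ignoring time."""
    rng = random.Random(seed)
    shuffled = edges[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def temporal_split(edges, test_fraction=0.3):
    """Train on the earliest links, test on the most recent ones."""
    ordered = sorted(edges, key=lambda e: e[2])
    cut = int(len(ordered) * (1 - test_fraction))
    return ordered[:cut], ordered[cut:]

edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 4), ("b", "d", 5)]
train, test = temporal_split(edges)
# With a temporal split, every training link precedes every test link in time,
# so the model cannot leak information from the future into training.
assert max(t for _, _, t in train) <= min(t for _, _, t in test)
```

The random split offers no such guarantee, which is one intuition for why it can yield overly optimistic performance estimates.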
The goal of this paper is to learn the dynamics of truck co-driving behaviour. Understanding this behaviour is important because co-driving has a potential positive impact on the environment. In the so-called co-driving network, trucks are nodes while links indicate that two trucks frequently drive together. To understand the network’s dynamics, we use a link prediction approach employing a machine learning classifier. The features of the classifier can be categorized into spatio-temporal features, neighbourhood features, path features, and node features. The very different types of features allow us to understand the social processes underlying the co-driving behaviour. Our work is based on spatio-temporal data not studied before. Data is collected from 18 million truck movements in the Netherlands. We find that co-driving behaviour is best described by using neighbourhood features, and to a lesser extent by path and spatio-temporal features. Node features are deemed unimportant. Findings suggest that the dynamics of a truck co-driving network show clear social network effects.
Tracking cookies and similar tracking techniques are nowadays omnipresent on the internet. In fact, many popular online services are made possible due to online advertisements. When you are about to book a last-minute holiday online, you may experience that your purchase intention follows you on other websites. To understand online advertising (or Real-Time Bidding), we provide a literature review of online-tracking technologies and propose a new paradigm for Web Privacy Measurement (WPM).
Computer chess has stimulated human imagination over some two hundred and fifty years. In 1769 Baron Wolfgang von Kempelen promised Empress Maria Theresia in public: “I will invent a machine for a more compelling spectacle [than the magnetism tricks by Pelletier] within half a year.” The idea of an intelligent chess machine was born. In 1770 the first demonstration was given. The real development of artificial intelligence (AI) began in 1950 and contains many well-known names, such as Turing and Shannon. One of the first AI research areas was chess. In 1997, a high point was to be reported: world champion Garry Kasparov had been defeated by Deep Blue. The techniques used included searching, knowledge representation, parallelism, and distributed systems. Adaptivity, machine learning and the recently developed deep learning mechanism were only later on added to the computer chess research techniques. The major breakthrough for games in general (including chess) took place in 2017 when (1) the AlphaGo Zero program defeated the world-champion program AlphaGo by 100-0 and (2) the technique of deep learning also proved applicable to chess. In the autumn of 2017, the Stockfish program was beaten by AlphaZero by 28-0 (with 72 draws, resulting in a 64-36 victory). However, the end of the disruptive advance is not yet in reach. In fact, we have just started. The next milestone will be to determine the theoretical game value of chess (won, draw, or lost). This achievement will certainly be followed by other surprising developments.
Numerical models of chemical transport have been used to simulate the complex processes involved in the formation and transport of air pollutants. Although these models can predict the spatiotemporal variability of a variety of chemical species, the accuracy of these models is often limited. Therefore, in the past two decades, data assimilation methods have been applied to use the available measurements for improving the forecast. Nowadays, machine learning techniques provide new opportunities for improving the air quality forecast. A case study on PM10 concentrations during a dust storm is performed. It is known that the PM10 concentrations are caused by multiple emission sources, e.g., dust from the desert and anthropogenic emissions. Accurate modeling of the PM10 concentration levels owing to the local anthropogenic emissions is essential for an adequate evaluation of the dust level. However, real-time measurement of local emissions is not possible, so no direct data is available. Actually, the lack of in-time emission inventories is one of the main reasons that current numerical chemical transport models cannot produce accurate anthropogenic PM10 simulations. Using machine learning techniques to generate local emissions based on past observations is a promising approach. We report how it can be combined with data assimilation to improve the accuracy of the air quality forecast considerably.
Large collections of historical biodiversity expeditions are housed in natural history museums throughout the world. Potentially they can serve as rich sources of data for cultural historical and biodiversity research. However, they exist as only partially catalogued specimen repositories and images of unstructured, non-standardised, hand-written text and drawings. Although many archival collections have been digitised, disclosing their content is challenging. They refer to historical place names and outdated taxonomic classifications and are written in multiple languages. Efforts to transcribe the hand-written text can make the content accessible, but semantically describing and interlinking the content would further facilitate research. We propose a semantic model that serves to structure the named entities in natural history archival collections. In addition, we present an approach for the semantic annotation of these collections whilst documenting their provenance. This approach serves as an initial step for an adaptive learning approach for semi-automated extraction of named entities from natural history archival collections. The applicability of the semantic model and the annotation approach is demonstrated using image scans from a collection of 8,000 field book pages gathered by the Committee for Natural History of the Netherlands Indies between 1820 and 1850, and evaluated together with domain experts from the field of natural and cultural history.
Krabbenbos, J.; Herik, H.J. van den; Haworth, G. 2018
Valedictory Address presented in abbreviated form at the public farewell to the office of Professor of Computer Science at the Tilburg center for Cognition and Communication (TiCC) of the Faculty of Humanities of Tilburg University on Friday, January 29th, 2016.
Mattheij, R.; Groeneveld, K.; Postma, E.O.; Herik, H.J. van den 2016