The invention of neural networks marks a critical milestone in the pursuit of true artificial intelligence. Despite their impressive performance on various tasks, these networks face limitations in learning efficiency, as they are often trained from scratch. Deep meta-learning is one approach to improving learning efficiency by leveraging prior knowledge and experience. Whilst many successful deep meta-learning techniques have been proposed, our understanding of the performance of these methods remains limited. In this dissertation, we delve deeper into the underlying principles of these algorithms and aim to gain a comprehensive understanding of why certain algorithms succeed while others fall short. This allows us to design enhanced deep meta-learning algorithms and to reason about the impact of specific design choices on the performance of different algorithms. Moreover, we investigate the integration of theoretical principles into meta-learning algorithms to improve their performance. Overall, we make a small step toward a better understanding of deep meta-learning algorithms, paving the way for more robust and principled meta-learning techniques with broader applicability and superior performance.
Transport inspectorates are looking for novel methods to identify dangerous behavior, ultimately to reduce the risks associated with the movements of people and goods. We explore a data-driven approach to arrive at smart inspections of vehicles. Inspections are smart when they are performed in a manner that is (1) accurate, (2) automated, (3) fair, and (4) interpretable. We leverage tools from the network science and machine learning domains to encode vehicles' behavior. Tools used in this thesis include community detection, link prediction, and assortativity. We explore their applicability and provide technical methods. In the final chapter, we also discuss the matter of fairness in machine learning.
Legal professionals spend up to a third of their time doing research. During this research, legal information retrieval (IR) systems help users find information that is relevant for them. These legal IR systems are important because the number of legal documents published online is growing exponentially. This research addresses the question: how can bibliometrics improve common ranking algorithms in legal information retrieval?
Chapter 2 focuses on the users of legal IR systems. Users were surveyed to determine whether legal practitioners (searching for themselves) and information professionals (searching for others) have the same perception of relevance. This was done by comparing the factors of relevance they consider when evaluating search results. We found no reason to distinguish between these user groups. With regard to the distinction between legal scholars and legal practitioners, it was determined in Chapter 3 that the usage of and citations between scholarly and non-scholarly publications give no reason to create separate rankings for users based on their affiliation.
Chapter 3 regards the documents in the legal IR system. The citation and usage analysis provided the theoretical insight that citations in legal documents measure part of a broad scope of impact, or relevance, on the entire legal field. Using this information, a bibliometric-enhanced ranking variable was created.
There are several challenges to evaluating a live, domain-specific IR system. Chapter 4 deals with these challenges and explains why common evaluation methods in IR are not applicable. Finally, in Chapter 5, a cost-based model is used for evaluation, which shows a reduction of cost for the user.
Combining all this information, this thesis shows that a bibliometric-enhanced ranking feature that takes into account both usage and citations (two flavors of impact relevance), and that increases in influence as the reliability of the data grows (in combination with a recency feature that gives new documents the benefit of the doubt and decreases at the same rate as the bibliometric feature increases), can reduce the cost required from legal professionals (whether practitioner, scholar, or legal information professional) to find the point of satisfaction in the completeness-ideal/research-reality trade-off.
In today's volatile market environments, companies must be able to continuously innovate. In this context, innovation does not only refer to the development of new products or business models but often also affects the entire organization, which has to transform its structures, processes, and ways of working.
Corporate entrepreneurship (CE) programs are often used by established companies to address these innovation and transformation challenges. In general, they are understood as formalized entrepreneurial activities to (1) support internal corporate ventures or (2) work with external startups. The organizational design and value creation of CE programs exhibit a high degree of heterogeneity. On the one hand, this heterogeneity makes CE programs a valuable management tool that can be used for many purposes. On the other hand, it can be seen as a reason for the current challenges that companies experience in effectively using and managing CE programs.
By systematically analyzing 54 different cases in established companies in Germany, Switzerland, and Austria, this study contributes to a better understanding of the heterogeneity of CE programs. The taxonomic approach provides clearly defined types of CE programs that are distinguished according to their organizational design and the outputs they generate.
In this work, we attempt to answer the question: "How to learn robust and interpretable rule-based models from data for machine learning and data mining, and define their optimality?"
Rules provide a simple form of storing and sharing information about the world. As humans, we use rules every day, such as a physician who diagnoses someone with the flu, represented by "if a person has either a fever or a sore throat (among others), then she has the flu." Even though an individual rule can only describe simple events, several aggregated rules can represent more complex scenarios, such as the complete set of diagnostic rules employed by a physician.
The use of rules spans many fields in computer science, and in this dissertation we focus on rule-based models for machine learning and data mining. Machine learning focuses on learning the model that best predicts future (previously unseen) events from historical data. Data mining aims to find interesting patterns in the available data.
To answer our question, we use the Minimum Description Length (MDL) principle, which allows us to define the statistical optimality of rule-based models. Furthermore, we empirically show that this formulation is highly competitive for real-world problems.
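To make the two-part MDL idea concrete, the toy sketch below scores a rule list as model bits plus data bits. This is our own illustration under simplifying assumptions, not the encoding actually used in the dissertation; all function names are hypothetical.

```python
import math

def code_length_data(counts):
    """Bits to encode the class labels a rule covers, using the
    log-loss of the rule's empirical class distribution."""
    total = sum(counts)
    bits = 0.0
    for c in counts:
        if c > 0:
            bits += -c * math.log2(c / total)
    return bits

def mdl_score(rules, n_conditions_vocab):
    """Two-part MDL score of a rule list: L(model) + L(data | model).
    `rules` is a list of (n_conditions, class_counts) pairs; a lower
    score means a better (simpler yet accurate) model."""
    model_bits = 0.0
    data_bits = 0.0
    for n_cond, counts in rules:
        # model cost: pick which conditions this rule uses
        model_bits += n_cond * math.log2(n_conditions_vocab)
        # data cost: encode the labels covered by this rule
        data_bits += code_length_data(counts)
    return model_bits + data_bits
```

A pure rule (covering only one class) adds zero data bits, so the score directly trades extra conditions against the labels they help compress.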
Interactive exploration of large volumes of data is increasingly common, as data scientists attempt to extract interesting information from large opaque data sets. This scenario presents a difficult challenge for traditional database systems, as (1) nothing is known about the query workload in advance, (2) the query workload is constantly changing, and (3) the system must provide interactive responses to the issued queries. This environment is challenging for index creation, as traditional database indexes require upfront creation, and hence a priori workload knowledge, to be efficient.
In this work, we introduce Progressive Indexing, a novel performance-driven indexing technique that focuses on automatic index creation while providing interactive response times to incoming queries. Its design allows queries to have a limited budget to spend on index creation. The indexing budget is automatically tuned for each query before query processing. This allows systems to provide interactive answers to queries during index creation while being robust against various workload patterns and data distributions.
We develop progressive algorithms to index one and multiple dimensions. In addition, we introduce Progressive Merges, a robust algorithm that merges appends into our Progressive Indexes without penalizing single queries.
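The budgeted-indexing idea can be illustrated with a deliberately simplified sketch. This is our own toy, not the dissertation's actual progressive algorithms: each query spends a small budget moving values into a sorted prefix, then answers from the partial index plus a scan of the still-unindexed remainder.

```python
import bisect

class ProgressiveIndex:
    """Toy sketch of budgeted, query-driven index construction: each
    query moves at most `budget` values from the unsorted tail into a
    sorted part, spreading indexing cost across the query stream."""

    def __init__(self, values):
        self.sorted_part = []        # incrementally built index
        self.unsorted = list(values) # not yet indexed

    def query(self, low, high, budget=4):
        # 1) spend the indexing budget: index a few more values
        for _ in range(min(budget, len(self.unsorted))):
            bisect.insort(self.sorted_part, self.unsorted.pop())
        # 2) answer from the indexed part via binary search ...
        lo = bisect.bisect_left(self.sorted_part, low)
        hi = bisect.bisect_right(self.sorted_part, high)
        result = self.sorted_part[lo:hi]
        # 3) ... plus a scan of whatever is still unindexed
        result += [v for v in self.unsorted if low <= v <= high]
        return sorted(result)
```

Every query returns a correct answer, and once the unsorted tail is exhausted, queries run entirely on the finished index.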
Today, knowledge is the most crucial element in stimulating organizational competitiveness and economic development. The ability of a firm to quickly recognize, assimilate, and utilize external knowledge is one of the core capabilities that bring organizational competitive advantages. Such an ability is called absorptive capacity (AC). This study focuses on three AC-related topics in the context of Chinese SMEs: 1) How do SMEs absorb external knowledge in terms of its recognition, assimilation, and utilization? 2) What challenges do SMEs face when absorbing external knowledge? And 3) which knowledge assimilation mechanisms have an impact on the performance of SMEs?
A New Technology-Based Firm (NTBF) is a significant enabler of job creation and a driver of the economy through stimulating innovation. In the last two decades, we have seen an enormous development of NTBFs. However, the liabilities of smallness, newness, and weak networking ties are three important obstacles in the early stages of an NTBF's lifecycle. Consequently, there is a high rate of failure among NTBFs.
A remedy to avoid these failures lies in using the support and resources of Business Incubators (BIs). BIs provide supportive services to promote NTBFs' capabilities and to help them address their liabilities.
So far, there is almost no reliable evidence on the effectiveness of BIs on the performance of NTBFs. Therefore, we aim to identify the supportive activities of BIs and to understand to what extent their support has a serious impact on the performance of their NTBFs. Building on qualitative and quantitative research methods, a model to measure the impact of support by BIs on the performance of NTBFs is developed and tested among Dutch and German NTBFs. The research results provide practical guidelines for the management teams of incubators, which can increase the effectiveness of their performance.
In this dissertation, non-parametric Bayesian methods are applied to robotic vision. Robots make use of depth sensors that represent their environment using point clouds. Non-parametric Bayesian methods can (1) determine how well an object is recognized, and (2) determine how many objects a particular scene contains. When there is a model available for the object to be recognized and the nature of the perceptual error is known, a Bayesian method will act optimally.
In this dissertation, Bayesian models are developed to represent geometric objects such as lines and line segments (consisting of points). The infinite line model and the infinite line segment model use a non-parametric Bayesian model, to be precise a Dirichlet process, to represent the number of objects. The line or line segment itself is represented by a probability distribution. The lines can be represented by conjugate distributions, in which case Gibbs sampling can be used. The line segments are not represented by conjugate distributions, and therefore a split-merge sampler is used.
A split-merge sampler fits line segments by assigning points to a hypothetical line segment. It then proposes splits of a single line segment or merges of two line segments. A new sampler, the triadic split-merge sampler, introduces steps that involve three line segments. In this dissertation, the new sampler is compared to a conventional split-merge sampler. The triadic sampler can be applied to other problems as well, not only problems in robotic perception.
The models for objects can also be learned. In the dissertation, this is done for more complex objects, such as cubes, built up of hundreds of points. An auto-encoder then learns to generate a representative object given the data. The auto-encoder uses a newly defined reconstruction distance, called the partitioning earth mover's distance. The object that is learned by the auto-encoder is used in a triadic sampler to (1) identify the point cloud objects and (2) establish multiple occurrences of those objects in the point cloud.
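The Dirichlet process prior on the number of objects can be illustrated through its sequential "Chinese restaurant process" view. The sketch below is our own simplified illustration: it samples only the prior partition, ignoring the line-fit likelihood term that the dissertation's samplers would weigh in when assigning points.

```python
import random

def crp_partition(n_points, alpha, seed=0):
    """Sample a partition of n_points under a Chinese restaurant
    process: point i joins an existing cluster k (a line) with
    probability size_k / (i + alpha), or opens a new cluster with
    probability alpha / (i + alpha). The number of clusters is thus
    unbounded a priori, matching the Dirichlet process."""
    rng = random.Random(seed)
    clusters = []  # sizes of the clusters created so far
    for i in range(n_points):
        weights = clusters + [alpha]  # last entry = new cluster
        r = rng.uniform(0, sum(weights))
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r <= acc:
                break
        if k == len(clusters):
            clusters.append(1)   # open a new cluster (a new line)
        else:
            clusters[k] += 1     # join an existing line
    return clusters
```

Larger `alpha` yields more, smaller clusters; a real sampler would multiply each weight by how well the point fits the corresponding line.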
This dissertation presents the results of research into the importance of creativity for ICT-students at Dutch universities of applied sciences (in Dutch: hogescholen), and highlights the functioning of training courses that aim to promote creative abilities. The ability to generate new and potentially useful ideas and problem-solving skills as a result of creative thinking is an important driver of human evolution. According to many, creativity is a highly valued and sought-after accomplishment for today's society and for the future. In addition, computers, and everything related to them, have become an integral part of society. The 'computer' is one of the most important innovations in the history of mankind. Computers have radically changed our lives. It is hardly conceivable to innovate without ICT. It is therefore logical that ICT-professionals play an extremely prominent role in innovation. This applies in particular to students taking a Bachelor of ICT course at a Dutch university of applied sciences, because they are trained as leading IT-specialists.
These phenomena led to two interrelated research questions: (i) "Is creativity training important for ICT-students at Dutch hogescholen?"; and (ii) "Does creativity training work, as it is integrated in the curriculum of these ICT-students?"
This thesis is part of a bigger project, HEPGAME (High Energy Physics Game). The main objective of HEPGAME is the utilization of AI solutions, particularly by using MCTS for the simplification of HEP calculations. One of the issues is solving mathematical expressions of interest with millions of terms. These calculations can be solved with the FORM program, which is software for symbolic manipulation. Since these calculations are computationally intensive and take a large amount of time, the FORM program was parallelized to solve them in a reasonable amount of time. Therefore, any new algorithm based on MCTS should also be parallelized. This requirement was behind the problem statement of the thesis: "How do we design a structured pattern-based parallel programming approach for efficient parallelism of MCTS for both multi-core and manycore shared-memory machines?"
To answer this question, the thesis approached the MCTS parallelization problem at three levels: (1) the implementation level, (2) the data structure level, and (3) the algorithm level.
At the implementation level, we proposed task-level parallelization over thread-level parallelization. Task-level parallelization provides efficient parallelism for MCTS by utilizing cores on both multi-core and manycore machines.
At the data structure level, we presented a lock-free data structure that guarantees correctness. A lock-free data structure (1) removes the synchronization overhead when a parallel program needs many tasks to feed its cores and (2) improves both performance and scalability.
At the algorithm level, we first explained how to use the pipeline pattern for the parallelization of MCTS to overcome search overhead. Then, through a step-by-step approach, we proposed and detailed a structured parallel programming approach for Monte Carlo Tree Search.
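As a rough illustration of task-level parallelization of the MCTS simulation phase, the sketch below hands each rollout to a task pool rather than pinning work to fixed threads. This is only a conceptual stand-in with hypothetical names; the thesis targets shared-memory multi-core/manycore machines with its own structured patterns, not this Python toy.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def playout(seed):
    """One random rollout from the root; returns an estimated reward
    in [0, 1]. (Stand-in for a real MCTS simulation phase.)"""
    rng = random.Random(seed)
    return sum(rng.choice([0, 1]) for _ in range(10)) / 10.0

def parallel_rollouts(n_tasks, n_workers=4):
    """Task-level parallelism: each rollout is an independent task
    submitted to a pool, so the scheduler keeps all cores busy
    regardless of how many workers the machine provides."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        rewards = list(pool.map(playout, range(n_tasks)))
    # a real search would back-propagate each reward into the tree here
    return sum(rewards) / n_tasks
```

Because the tasks are independent and seeded, the parallel average matches a serial run exactly, which makes the decomposition easy to validate.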
The database research community has made tremendous strides in developing powerful database engines that allow for efficient analytical query processing. However, these powerful systems have gone largely unused by analysts and data scientists. This poor adoption is caused primarily by the state of database-client integration. In this thesis, we attempt to overcome this challenge by investigating how we can facilitate efficient and painless integration of analytical tools and relational database management systems. We focus our investigation on the three primary methods for database-client integration: client-server connections, in-database processing, and embedding the database inside the client application.
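The third integration method, embedding the database inside the client application, can be illustrated with SQLite's in-process API. SQLite is used here purely as a familiar stand-in example; the abstract does not name the systems the thesis actually studies.

```python
import sqlite3

# In-process (embedded) integration: the engine runs inside the client,
# so results reach the analysis tool without crossing a socket, which is
# the main transfer bottleneck of client-server connections.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
con.executemany("INSERT INTO measurements VALUES (?, ?)",
                [("a", 1.0), ("a", 3.0), ("b", 2.0)])
rows = con.execute(
    "SELECT sensor, AVG(value) FROM measurements "
    "GROUP BY sensor ORDER BY sensor"
).fetchall()
# `rows` now lives in the client's own memory, ready for analysis
```

The same script run against a remote server would serialize every result row over a connection, which is exactly the overhead the embedded approach avoids.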
Real-life processes are characterized by dynamics involving time. Examples are walking, sleeping, disease progression during medical treatment, and events in a workflow. To understand complex behavior, one needs expressive models that are nevertheless parsimonious enough to gain insight. Uncertainty is often fundamental to process characterization, e.g., because we can sometimes observe phenomena only partially. This makes probabilistic graphical models a suitable framework for process analysis. In this thesis, new probabilistic graphical models that offer the right balance between expressiveness and interpretability are proposed, inspired by the analysis of complex, real-world problems. We first investigate processes by introducing latent variables, which capture abstract notions from observable data (e.g., intelligence, health status). Such models often provide more accurate descriptions of processes. In medicine, such models can also reveal insight into patient treatment, such as predictive symptoms. The second viewpoint looks at processes by identifying time points in the data where the relationships between observable variables change. This provides an alternative characterization of process change. Finally, we try to better understand processes by identifying subgroups of data that deviate from the whole dataset, e.g., process workflows whose event dynamics differ from the general workflow.
By training with virtual opponents known as computer generated forces (CGFs), trainee fighter pilots can build the experience necessary for air combat operations at a fraction of the cost of training with real aircraft. In practice, however, the variety of CGFs is not as wide as it could be. This is largely due to a lack of behaviour models for the CGFs. In this thesis, we investigate to what extent behaviour models for the CGFs in air combat training simulations can be automatically generated by the use of machine learning.
The domain of air combat is complex, and machine learning methods that operate within this domain must be suited to the challenges posed by the domain. Our research shows that the dynamic scripting algorithm greatly facilitates the automatic generation of air combat behaviour models, while being sufficiently flexible to be moulded into answers to these challenges. However, ensuring the validity of the newly generated behaviour models remains a point of attention for future research.
In this thesis, the subject of our investigation is the use of the first legal copy of the notarised reporting deed ("de notariële proces-verbaalakte") as an enforceable verdict in civil proceedings before a private court (arbitration or a binding third-party ruling) under Dutch law.
We have studied shape with a particular focus on the zebrafish model system. Shape is an essential aspect of the phenotype of a biological specimen, and it can be used to read out a current state or response, or to study gene expression. Accurate shape analysis therefore requires a precise shape description. Moreover, a sufficiently large sample size of specimens is necessary to ensure a justified and unbiased shape analysis. The latter is very important for high throughput in compound screening. Therefore, top performance in zebrafish analysis requires high-throughput imaging (HTI). To deal with HTI, we aim to design an elaborate and well-performing HTI architecture. For the essential operations, we need computational approaches to obtain 2D/3D shape representations that are precise and yet can be acquired fast. The quality of the obtained shape descriptions is validated in a straightforward manner with scalar primitives, i.e., the volume and surface area of a 3D shape. These primitives serve as 3D measurements for a robust primary shape assessment in phenotype characterisation. Using only a shape description is not sufficient, e.g., for high-resolution imaging at the tissue and cellular level, so texture should be considered to complement and enhance the shape analysis.
Collaborative innovation processes in unpredictable environments are a challenge for traditional management. New demands in a global digital society push public and corporate leadership to collaborate ad hoc, without predictable goals and planned working rules. In this study, an actor-network theory (ANT) approach is combined with the critical incident technique (CIT) to elaborate dynamic network principles for a new real-time foresight (RTF). Real-time foresight replaces traditional planning and strategic management in ad hoc multi-sector collaborations. Although ANT originates from science and technology studies, it is here applied to a management problem due to its ability to merge voluntaristic and evolutionary managerial components and micro- and macro-perspectives. The investigation is placed in an exemplary management field of high dynamics: global disaster management. From process analysis and from a comparison of three dynamic innovation networks that emerged around Indian coastal villages after the 2004 tsunami, five dynamic network patterns are obtained which underlie successful collaborative innovation processes. These dynamic structures build the agenda for a new real-time foresight, and for an instrument to evaluate in real time the emergence of dynamic innovation networks (DINs).
Nowadays, there is a continuous need for many corporations to renew their business portfolios strategically in anticipation of changes in the business environment (e.g., technological change). The ongoing boom in founding international start-ups suggests that small entrepreneurial teams are an effective means to develop new businesses. Corporations should be able to benefit from this form of self-organized innovation when entering novel business domains for strategic renewal. However, corporations that establish small entrepreneurial teams (corporate ventures) face two obstacles. First, corporate ventures often fail for reasons that are not well explored. Second, it remains unclear how partial successes may be improved into large successes. Although the key success factors remain ambiguous, there is little hope that corporate ventures will be successful without effective management. Since an empirical model for corporate venture management does not exist so far, this thesis formulates and answers the following problem statement: How can corporate management effectively manage corporate ventures? Building on qualitative and quantitative research methodologies, a model for effective corporate venture management is developed and tested statistically in the German IT consulting industry. The research results reveal some of the essential management principles through which corporate management can increase corporate venture success systematically.