Real-life processes are characterized by dynamics involving time. Examples are walking, sleeping, disease progress in medical treatment, and events in a workflow. To understand complex behavior one needs models that are expressive yet parsimonious enough to yield insight. Uncertainty is often fundamental for process characterization, e.g., because we can sometimes observe phenomena only partially. This makes probabilistic graphical models a suitable framework for process analysis. In this thesis, new probabilistic graphical models that offer the right balance between expressiveness and interpretability are proposed, inspired by the analysis of complex, real-world problems. We first investigate processes by introducing latent variables, which capture abstract notions from observable data (e.g., intelligence, health status). Such models often provide more accurate descriptions of processes. In medicine, such models can also yield insight into patient treatment, such as predictive symptoms. The second viewpoint looks at processes by identifying time points in the data where the relationships between observable variables change. This provides an alternative characterization of process change. Finally, we try to better understand processes by identifying subgroups of data that deviate from the whole dataset, e.g., process workflows whose event dynamics differ from the general workflow.
Today, virtually everything, from natural phenomena to complex artificial and physical systems, can be measured and the resulting information collected, stored and analyzed in order to gain new insight. This thesis shows that complex systems often exhibit diverse behavior at different temporal scales, and that data mining methods should be able to cope with multiple resolutions (scales) at the same time in order to fully understand the data at hand and extract useful information from it. Under these assumptions, we introduce novel data mining and visualization methods for large time series data collected from complex physical systems. In particular, we focus on three fundamental problems: the detection of multi-scale patterns, the recognition of recurrent events, and the interactive visualization of massive time series data. We evaluate our methods on a real-world scenario provided by InfraWatch, a Structural Health Monitoring project centered around the management and analysis of data collected by a large sensor network deployed on a Dutch highway bridge. The application of our methods resulted in the identification of the relevant scales of analysis in the InfraWatch data (and other datasets), the detection of the different recurring motifs, and the interactive visualization of terabytes of time series data.
With the development of sensing and data processing techniques, monitoring physical systems in the field with a sensor network is becoming a feasible option for many domains. Such monitoring systems are referred to as Structural Health Monitoring (SHM) systems. By definition, SHM is the process of implementing a damage detection and characterisation strategy for engineering structures, which involves data collection, damage-sensitive feature extraction and statistical analysis. Most of the SHM process can be addressed by techniques from the Data Mining domain, so I conduct this research by combining these two fields. The monitoring system employed in this research is a sensor network installed on a Dutch highway bridge, which aims to monitor dynamic health aspects of the bridge and its long-term degradation. I have explored the specific focus of each sensor type under multiple scales, and analysed the dependencies between sensor types. Based on landmarks and constraints, I have proposed a novel predefined pattern detection method to select traffic events for modal analysis. I have analysed the influence of temperature and traffic mass on natural frequencies, and verified that natural frequencies decrease as temperature increases, and that the influence of traffic mass is weaker than that of temperature.