Documents
-
- Download
- Title Pages_Contents
- open access
-
- Download
- Chapter 6
- open access
- Full text at publishers site
-
- Download
- Chapter 7
- open access
- Full text at publishers site
-
- Download
- Chapter 8
- open access
- Full text at publishers site
-
- Download
- Chapter 9
- open access
- Full text at publishers site
-
- Download
- Chapter 10
- open access
-
- Download
- References
- open access
-
- Download
- Summary in Dutch
- open access
-
- Download
- Summary in English
- open access
-
- Download
- Acknowledgements_Curriculum Vitae
- open access
-
- Download
- Propositions
- open access
In Collections
This item can be found in the following collections:
Exceptional Model Mining
Finding subsets of a dataset that somehow deviate from the norm, i.e. where something interesting is going on, is a classical Data Mining task. In traditional local pattern mining methods, such deviations are measured in terms of a relatively high occurrence (frequent itemset mining), or an unusual distribution for one designated target attribute (subgroup discovery). These, however, do not encompass all forms of "interesting". To capture a more general notion of interestingness in subsets of a dataset, we develop Exceptional Model Mining (EMM). This is a supervised local pattern mining framework, where several target attributes are selected, and a model over these attributes is chosen to be the target concept. Then, subsets are sought on which this model is substantially different from the model on the whole dataset. For instance, we can find parts of the data where two target attributes have an unusual correlation, a classifier has a...
Show moreFinding subsets of a dataset that somehow deviate from the norm, i.e. where something interesting is going on, is a classical Data Mining task. In traditional local pattern mining methods, such deviations are measured in terms of a relatively high occurrence (frequent itemset mining), or an unusual distribution for one designated target attribute (subgroup discovery). These, however, do not encompass all forms of "interesting". To capture a more general notion of interestingness in subsets of a dataset, we develop Exceptional Model Mining (EMM). This is a supervised local pattern mining framework, where several target attributes are selected, and a model over these attributes is chosen to be the target concept. Then, subsets are sought on which this model is substantially different from the model on the whole dataset. For instance, we can find parts of the data where two target attributes have an unusual correlation, a classifier has a deviating predictive performance, or a Bayesian network fitted on several target attributes has an exceptional structure. We will discuss some real-world applications of EMM instances, including using the Bayesian network model to identify meteorological conditions under which food chains are displaced, and using a regression model to find the subset of households in the Chinese province of Hunan that do not follow the general economic law of demand.
Show less- All authors
- Duivesteijn, W.
- Supervisor
- Kok, J.N.
- Co-supervisor
- Knobbe, A.J.
- Qualification
- Doctor (dr.)
- Awarding Institution
- Leiden Institute of Advanced Computer Science (LIACS), Faculty of Sciences, Leiden University
- Date
- 2013-09-17
Funding
- Sponsorship
- This research is supported by the Netherlands Organisation for Scientific Research (NWO) under project number 612.065.822 (Exceptional Model Mining).