Documents
- Title pages_Contents (open access)
- References (open access)
- Summary in English (open access)
- Summary in Dutch (open access)
- Acknowledgements_Curriculum Vitae (open access)
- Propositions (open access)
Improved strategies for distance based clustering of objects on subsets of attributes in high-dimensional data
This monograph focuses on clustering of objects in high-dimensional data, given the restriction that the objects do not cluster on all the attributes, not even on a single subset of attributes, but often on different subsets of attributes in the data. To reveal such a clustering structure, Friedman and Meulman (2004) proposed a framework and a specific algorithm, called COSA. In this monograph we propose various improvements to the original COSA algorithm. The first improvement targets the optimization strategy for the tuning parameters in COSA. The second, a reformulation of the COSA criterion, reduces the number of tuning parameters from two to one, enables incorporation of pre-specified initial weights for the attribute distances, and allows for a solution that contains zero-valued attribute weights. The third improvement is a new definition of the COSA distances that yields a better separation between objects from different clusters. We compared the original and the improved COSA with other state-of-the-art methods. The comparison is based on simulated and real omics data sets.
- All authors
- Kampert, M.M.D.
- Supervisor
- Meulman, J.J.
- Committee
- Smit, B. de; Vaart, A.J.W. van der; Friedman, J.H.; Hubert, M.; Rooij, M. de; Gill, R.D.
- Qualification
- Doctor (dr.)
- Awarding Institution
- Mathematical Institute, Faculty of Science, Leiden University
- Date
- 2019-07-03