Effective and Trustworthy Dimensionality Reduction Approaches for High Dimensional Data Understanding and Visualization

PhD Thesis

Hajderanj, L. (2022). Effective and Trustworthy Dimensionality Reduction Approaches for High Dimensional Data Understanding and Visualization. PhD Thesis. London South Bank University, School of Engineering. https://doi.org/10.18744/lsbu.92850

Publication dates
Authors	Hajderanj, L.
Type	PhD Thesis
Abstract	In recent years, the rapid expansion of digital technologies has vastly increased the volume of data to be explored. Reducing the dimensionality of data is an essential step in data exploration and visualisation. The integrity of a dimensionality reduction technique depends on how well it maintains the data structure: a visualisation of low-dimensional data that has not captured the structure of the high-dimensional data is untrustworthy. The extent to which a method maintains the data structure depends on several factors, such as the type of data considered and the tuning parameters. Data may be linear or nonlinear, and the tuning parameters include the number of neighbours and the perplexity. In practice, most of the data under consideration are nonlinear, and tuning the parameters can be costly since it depends on the number of data samples considered. Existing dimensionality reduction approaches suffer from the following problems: 1) they only work well with linear data, 2) the extent of the maintained data structure is related to the number of data samples considered, and/or 3) they exhibit the tear problem and the false neighbours problem. To deal with these problems, this research has developed the Same Degree Distribution (SDD), multi-SDD (MSDD) and parameter-free SDD approaches, which 1) save computational time because the tuning parameter does not depend on the number of data samples considered, 2) produce more trustworthy visualisations by using a degree distribution that is smooth enough to capture local and global data structure, and 3) do not suffer from the tear and false neighbours problems, because the same degree distribution is used in the high- and low-dimensional spaces to calculate the similarities between data samples. The developed dimensionality reduction methods are tested on several popular synthetic and real datasets. The extent of the maintained data structure is evaluated using different quality metrics, i.e., Kendall’s Tau coefficient, Trustworthiness, Continuity, LCMC, and the Co-ranking matrix. The theoretical analysis of the impact of the dissimilarity measure on structure capturing is supported by simulation results on two different datasets, evaluated with Kendall’s Tau and the Co-ranking matrix. The SDD, MSDD, and parameter-free SDD methods do not outperform other global methods such as Isomap on data with a large fraction of large pairwise distances; addressing this remains further work. Reducing the computational complexity is another objective for further work.
Year	2022
Publisher	London South Bank University
Digital Object Identifier (DOI)	https://doi.org/10.18744/lsbu.92850
File	thesis -revised.pdf (License: CC BY 4.0; File access level: Open)
Print	07 Jun 2022
Publication process dates
Deposited	14 Nov 2022

Permalink	https://openresearch.lsbu.ac.uk/item/92850

Related outputs

Novel Parameter-Free and Parametric Same Degree Distribution-based Dimensionality Reduction Algorithms for Trustworthy Data Structure Preserving

Hajderanj, L., Chen, D. and Dudley-McEvoy, S. (2023). Novel Parameter-Free and Parametric Same Degree Distribution-based Dimensionality Reduction Algorithms for Trustworthy Data Structure Preserving. Information Sciences. 661, p. 120030. https://doi.org/10.1016/j.ins.2023.120030

Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing

Hajderanj, L., Chen, D., Grisan, E. and Dudley-McEvoy, S. (2020). Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing. IEEE Access. 8, pp. 207141-207155. https://doi.org/10.1109/ACCESS.2020.3038460