Understanding Cancer Patients with Diagnostically Influential Factors using High Dimensional Data Embedding

Book chapter

Syed, A. S., Hajderanj, L., Guo, K. and Chen, D. (2022). Understanding Cancer Patients with Diagnostically Influential Factors using High Dimensional Data Embedding. in: Imoize, A. L., Hemanth, D. J., Do, D.-T. and Sur, S. N. (eds.) Explainable Artificial Intelligence in Medical Decision Support Systems. The Institution of Engineering and Technology (IET).

Publication dates
Authors	Syed, A. S., Hajderanj, L., Guo, K. and Chen, D.
Editors	Imoize, A. L., Hemanth, D. J., Do, D.-T. and Sur, S. N.
Abstract	Analysing breast cancer data is a long-established research topic from both medical diagnosis and data modelling perspectives. Numerous predictive models have been employed in modelling breast cancer data, e.g., predicting a patient’s survival rate given certain medical circumstances and the patient’s demographics. However, these predictive models tend to take a black-box approach and can therefore hardly provide explainable results for diagnostic purposes, particularly when neural network-based models are used. On the other hand, identifying diagnostically influential factors with exploratory descriptive models has proven difficult because of the high dimensionality of the breast cancer data under consideration. For instance, the breast cancer data provided by SEER (the Surveillance, Epidemiology, and End Results Program) typically has more than 100 dimensions of numeric and categorical data types and could expand to about 1000 dimensions for analysis if orthogonal (one-hot) encoding is applied. Hence, effectively interpreting and understanding high-dimensional data is crucial in modelling cancer data. For this reason, dimensionality reduction and manifold learning algorithms have been studied intensively, and many relevant algorithms are available, each with its own pros and cons. In this chapter, a comparative study is presented that aims to provide visualized, explainable insights into breast cancer survival rate analysis and to identify critical influential factors that strongly determine the likelihood of a patient’s survival. Two dimensionality reduction algorithms are compared in this study: one is the popular t-SNE (t-distributed stochastic neighbor embedding) algorithm, and the other is the relatively new SDD (same degree distribution) algorithm.
The experiments demonstrate that, under the same embedding performance assessment metrics, the SDD algorithm achieves much better data embedding results, which would be difficult or impossible to obtain with t-SNE. Furthermore, using the reliable embedding results from SDD, meaningful and explainable factors have been identified that crucially reflect the similarities among patients who survived and the diversity among patients who, unfortunately, died. Clusters of patients who survived are clearly recognizable in the two-dimensional embedding space, whereas the embedded points of patients who died are widely scattered across the space. The complete code package used for the analysis is available for replication.
Keywords	Breast cancer survivability; t-SNE; Same Degree Distribution algorithm; Dimensionality reduction; Visualization; Data embedding; Classification
Year	2022
Book title	Explainable Artificial Intelligence in Medical Decision Support Systems
Publisher	The Institution of Engineering and Technology (IET)
ISBN	9781839536212
Print	22 Nov 2022
Publication process dates
Accepted	10 Aug 2022
Deposited	31 Aug 2022
Digital Object Identifier (DOI)	https://doi.org/10.1049/PBHE050E
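
The dimensionality growth mentioned in the abstract (about 100 raw dimensions expanding to roughly 1000 under one-hot encoding) can be illustrated with a minimal sketch. The records and field names below are hypothetical toy values, not actual SEER fields, and the helper `one_hot_encode` is an illustrative assumption rather than the chapter's code. It shows only how each categorical column is replaced by one indicator column per distinct value, which is what inflates the dimension count before any embedding is computed.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, float, str]


def one_hot_encode(
    records: List[Dict[str, Value]],
) -> Tuple[List[str], List[List[float]]]:
    """One-hot encode string-valued columns; pass numeric columns through."""
    columns = list(records[0].keys())
    # Sorted distinct values of each categorical (string-valued) column.
    categories = {
        col: sorted({str(r[col]) for r in records})
        for col in columns
        if isinstance(records[0][col], str)
    }
    # Build the encoded column names: one per numeric column,
    # one per distinct value of each categorical column.
    names: List[str] = []
    for col in columns:
        if col in categories:
            names.extend(f"{col}={v}" for v in categories[col])
        else:
            names.append(col)
    # Encode every record in the same column order.
    rows: List[List[float]] = []
    for r in records:
        row: List[float] = []
        for col in columns:
            if col in categories:
                row.extend(1.0 if r[col] == v else 0.0 for v in categories[col])
            else:
                row.append(float(r[col]))
        rows.append(row)
    return names, rows


# Hypothetical toy records (illustrative only, not SEER data).
patients = [
    {"age": 52, "tumour_size_mm": 18, "grade": "II", "laterality": "left"},
    {"age": 61, "tumour_size_mm": 25, "grade": "III", "laterality": "right"},
    {"age": 47, "tumour_size_mm": 9, "grade": "I", "laterality": "left"},
]

names, matrix = one_hot_encode(patients)
print(f"{len(patients[0])} raw columns -> {len(names)} encoded columns")
```

Here two numeric columns stay as they are, while the three-valued and two-valued categorical columns become three and two indicator columns respectively. With many high-cardinality categorical fields, the same mechanism accounts for the order-of-magnitude growth described in the abstract.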

Permalink

https://openresearch.lsbu.ac.uk/item/91949

Restricted files

File

Under embargo indefinitely

Related outputs

Photothermal Radiometry Data Analysis by Using Machine Learning

Xiao, P. and Chen, D. (2024). Photothermal Radiometry Data Analysis by Using Machine Learning. Sensors. 24 (10), p. 3015. https://doi.org/10.3390/s24103015

Novel Parameter-Free and Parametric Same Degree Distribution-based Dimensionality Reduction Algorithms for Trustworthy Data Structure Preserving

Hajderanj, L., Chen, D. and Dudley-Mcevoy, S. (2023). Novel Parameter-Free and Parametric Same Degree Distribution-based Dimensionality Reduction Algorithms for Trustworthy Data Structure Preserving. Information Sciences. 661, p. 120030. https://doi.org/10.1016/j.ins.2023.120030

Skin Capacitive Image Stitching and Occlusion Measurements

Ciortea, L. I., Chen, D. and Xiao, P. (2023). Skin Capacitive Image Stitching and Occlusion Measurements. Cosmetics. 10 (1), p. 32. https://doi.org/10.3390/cosmetics10010032

UAV target tracking method based on deep reinforcement learning

Zhang, H., He, P., Zhang, M., Chen, D., Neretin, E. and Li, B. (2022). UAV target tracking method based on deep reinforcement learning. 2022 International Conference on Cyber-physical Social Intelligence. Nanjing, China 21 - 24 Oct 2022

Developing Phoneme-based Lip-reading Sentences System for Silent Speech Recognition

El Bialy, R., Chen, D., Fenghour, S., Hussein, W., Xiao, P., Karam, O. H. and Li, B. (2022). Developing Phoneme-based Lip-reading Sentences System for Silent Speech Recognition. CAAI Transactions on Intelligence Technology. 8 (1), pp. 128-139. https://doi.org/10.1049/cit2.12131

An effective context-focused hierarchical mechanism for task-oriented dialogue response generation

Zhao, M., Wang, L., Jiang, Z., Li, R., Lu, X., Hu, Z. and Chen, D. (2022). An effective context-focused hierarchical mechanism for task-oriented dialogue response generation. Computational Intelligence. 38 (5), pp. 1831-1858. https://doi.org/10.1111/coin.12544

The Effect of Sun Tan Lotion on Skin By Using Skin TEWL and Skin Water Content Measurements

Xiao, P. and Chen, D. (2022). The Effect of Sun Tan Lotion on Skin By Using Skin TEWL and Skin Water Content Measurements. MDPI Sensors. 22. https://doi.org/10.3390/s22093595

Few-shot Object Recognition based on Three-Way Decision and Active Learning

Li, B., Luo, S., Wang, J., Tian, L. and Chen, D. (2022). Few-shot Object Recognition based on Three-Way Decision and Active Learning. Visual Computer. 37.

An Effective Conversion of Visemes to Words for High-Performance Automatic Lipreading

Fenghour, S., Chen, D., Guo, K., Li, B. and Xiao, P. (2021). An Effective Conversion of Visemes to Words for High-Performance Automatic Lipreading. Sensors. 21 (23). https://doi.org/10.3390/s21237890

UAV visual flight control method based on deep reinforcement learning

Bai, S., Li, B., Gan, Z. and Chen, D. (2021). UAV visual flight control method based on deep reinforcement learning. 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI). https://doi.org/10.1109/iccsi53130.2021.9736242

UAV flight control method based on deep reinforcement learning

Bai, S., Li, B., Gan, Z. and Chen, D. (2021). UAV flight control method based on deep reinforcement learning. 2021 International Conference on Cyber-Physical Social Intelligence (ICCSI). Beijing, China 18 Dec 2021 - 20 Mar 2022 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/ICCSI53130.2021.9736242

ME‐MADDPG: An efficient learning‐based motion planning method for multiple agents in complex environments

Wan, K., Wu, D., Li, B., Gao, X., Hu, Z. and Chen, D. (2021). ME‐MADDPG: An efficient learning‐based motion planning method for multiple agents in complex environments. International Journal of Intelligent Systems. 37 (3), pp. 2393-2427. https://doi.org/10.1002/int.22778

Learning the structure of Bayesian networks with ancestral and/or heuristic partition

Tan, X., Gao, X., Wang, Z., Han, H., Liu, X. and Chen, D. (2021). Learning the structure of Bayesian networks with ancestral and/or heuristic partition. Information Sciences. https://doi.org/10.1016/j.ins.2021.10.052

Deep Learning-based Automated Lip-Reading: A Survey

Fenghour, S., Chen, D., Guo, K., Li, B. and Xiao, P. (2021). Deep Learning-based Automated Lip-Reading: A Survey. IEEE Access. https://doi.org/10.1109/ACCESS.2021.3107946

Deep Learning Causal Attributions of Breast Cancer

Chen, D., Hajderanj, L., Mallet, S., Camenen, P., Li, B., Hao, R. and Zhao, E. (2021). Deep Learning Causal Attributions of Breast Cancer. in: Arai, K. (ed.) Intelligent Computing: Proceedings of the 2021 Computing Conference, Lecture Notes in Networks and Systems, Vol 285. Springer.

The Impact of Supervised Manifold Learning on Structure Preserving and Classification Error: A Theoretical Study

Hajderanj, L., Chen, D. and Weheliye, I. (2021). The Impact of Supervised Manifold Learning on Structure Preserving and Classification Error: A Theoretical Study. IEEE Access. 9. https://doi.org/10.1109/ACCESS.2021.3066259

Enhancing Transformer-based language models with Commonsense Representations for Knowledge-driven Machine Comprehension

Li, R., Jiang, Z., Wang, L., Lu, X., Zhao, M. and Chen, D. (2021). Enhancing Transformer-based language models with Commonsense Representations for Knowledge-driven Machine Comprehension. Knowledge-Based Systems. 220, p. 106936. https://doi.org/10.1016/j.knosys.2021.106936

The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms

Xiao, P., Zhang, Xu, Pan, Wei, Ou, Xiang, Bontozoglou, C., Chirikhina, E. and Chen, D. (2020). The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms. Cosmetics. 7 (3), p. e67. https://doi.org/10.3390/cosmetics7030067

Three‐way decision of target threat decision making based on adaptive threshold algorithms

Li, B., Tian, L., Han, Y. and Chen, D. (2020). Three‐way decision of target threat decision making based on adaptive threshold algorithms. The Journal of Engineering. 2020 (13), pp. 293-297. https://doi.org/10.1049/joe.2019.1202

Effectiveness analysis of ship formation air defence based on deep belief network

Li, B., Luo, H., Wang, Y. and Chen, D. (2020). Effectiveness analysis of ship formation air defence based on deep belief network. The Journal of Engineering. 2020 (13), pp. 394-398. https://doi.org/10.1049/joe.2019.1201

Maneuvering target tracking of UAV based on MN-DDPG and transfer learning

Li, B., Yang, Z.P., Chen, D.Q., Liang, S.Y. and Ma, H. (2020). Maneuvering target tracking of UAV based on MN-DDPG and transfer learning. Defence Technology. https://doi.org/10.1016/j.dt.2020.11.014

Lip Reading Sentences Using Deep Learning with Only Visual Cues

Fenghour, S., Chen, D., Guo, K. and Xiao, P. (2020). Lip Reading Sentences Using Deep Learning with Only Visual Cues. IEEE Access. https://doi.org/10.1109/ACCESS.2020.3040906

Deep Learning Causal Attributions of Breast Cancer

Chen, D, Hajderanj, L, Mallet, S, Camenen, P, Li, B, Ren, H and Zhao, E (2020). Deep Learning Causal Attributions of Breast Cancer. Computing 2021. London 15 - 16 Jul 2021 The Science and Information (SAI) Organization. https://doi.org/10.1007/978-3-030-80129-8_10

UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning

Li, B., Gan, Z., Chen, D. and Aleksandrovich, D.S. (2020). UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning. Remote Sensing. 12 (22), p. 3789. https://doi.org/10.3390/rs12223789

Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing

Hajderanj, L., Chen, D., Grisan, E. and Dudley-McEvoy, S (2020). Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing. IEEE Access. 8, pp. 207141 - 207155. https://doi.org/10.1109/ACCESS.2020.3038460

Learning Bayesian Networks based on Order Graph with Ancestral Constraints

Wang, Z., Gao, X., Tian, X., Yang, Y. and Chen, D. (2020). Learning Bayesian Networks based on Order Graph with Ancestral Constraints. Knowledge-Based Systems. https://doi.org/10.1016/j.knosys.2020.106515

An Adaptive Task Scheduling Method for Networked UAV Combat Cloud System Based on Virtual Machine and Task Migration

Li, B., Liang, S., Tian, L., Chen, D. and Zhang, M. (2020). An Adaptive Task Scheduling Method for Networked UAV Combat Cloud System Based on Virtual Machine and Task Migration. Mathematical Problems in Engineering. p. 5391479. https://doi.org/10.1155/2020/5391479

An adaptive dwell time scheduling model for phased array radar based on three-way decision

Li, B., Tian, L., Chen, D. and Liang, S. (2020). An adaptive dwell time scheduling model for phased array radar based on three-way decision. Journal of Systems Engineering and Electronics. pp. 500-509. https://doi.org/10.23919/JSEE.2020.000030

Intelligent Aircraft Maneuvering Decision Based on CNN

Li, B., Liang, S., Tian, L. and Chen, D. (2019). Intelligent Aircraft Maneuvering Decision Based on CNN. Proceedings of the 3rd International Conference on Computer Science and Application Engineering. (138). https://doi.org/10.1145/3331453.3362046

Intelligent Attitude Control of Aircraft Based on LSTM

Li, B, Gao, P, Li, X and Chen, D (2019). Intelligent Attitude Control of Aircraft Based on LSTM. 3rd International Conference on Artificial Intelligence Applications and Technologies. Beijing, China 01 - 03 Aug 2019 IOP Publishing. https://doi.org/10.1088/1757-899X/646/1/012013

A Task Scheduling Algorithm for Phased Array Radar Based on Dynamic Three-way Decision

Li, B., Tian, L., Chen, D. and Han, Y. (2019). A Task Scheduling Algorithm for Phased Array Radar Based on Dynamic Three-way Decision. Sensors. 20 (1). https://doi.org/10.3390/s20010153

Intelligent Flight Control of Combat Aircraft Based on Autoencoder

Li, B., Gao, P., Liang, S. and Chen, D. (2019). Intelligent Flight Control of Combat Aircraft Based on Autoencoder. 2019 The 4th International Conference on Robotics, Control and Automation. GuangZhou 26 - 28 Jul 2019 https://doi.org/10.1145/3351180.3351210

Skin Capacitive Imaging Analysis Using Deep Learning GoogLeNet

Zhang, X., Pan, W., Bontozoglou, C., Chirikhina, E., Chen, D. and Xiao, P. (2019). Skin Capacitive Imaging Analysis Using Deep Learning GoogLeNet. Computing Conference 2020. London, UK 16 - 17 Jul 2019 Springer.

FRS: A Simple Knowledge Graph Embedding Model for Entity Prediction

Wang, L.F., Lu, X., Jiang, Z., Zhang, Z., Li, R., Zhao, M. and Chen, D. (2019). FRS: A Simple Knowledge Graph Embedding Model for Entity Prediction. Mathematical Biosciences and Engineering. 16 (6), pp. 7789-7807. https://doi.org/10.3934/mbe.2019391

Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study

Chen, D., Guo, K. and Li, B. (2019). Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study. 24th Iberoamerican Congress on Pattern Recognition (CIARP 2019). Havana, Cuba 28 - 31 Oct 2019 https://doi.org/10.1007/978-3-030-33904-3_16

Learning Bayesian Networks using the Constrained Maximum a Posteriori Probability Method

Yang, Y, Gao, X, Guo, Z and Chen, D (2019). Learning Bayesian Networks using the Constrained Maximum a Posteriori Probability Method. Pattern Recognition. 91, pp. 123-134. https://doi.org/10.1016/j.patcog.2019.02.006

Learning Bayesian network parameters via minimax algorithm

Gao, X, Gao, G, Ren, H, Chen, D and He, C (2019). Learning Bayesian network parameters via minimax algorithm. International Journal of Approximate Reasoning. 108, pp. 62-75. https://doi.org/10.1016/j.ijar.2019.03.001

Improving Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnosis via RBF Networks trained with EKF models

Adegoke, V, Chen, D and Banissi, E (2019). Improving Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnosis via RBF Networks trained with EKF models. International Journal of Computer Information Systems and Industrial Management. 11, pp. 82-100.

A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification

Hajderanj, L, Weheliye, I and Chen, D (2019). A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification. 2019 8th International Conference on Software and Information Engineering. Cairo 09 - 12 Apr 2019

Recurrent Neural Networks for Decoding Lip Read Speech

Fenghour, S, Chen, D and Xiao, P (2019). Recurrent Neural Networks for Decoding Lip Read Speech. 2019 8th International Conference on Software and Information Engineering (ICSIE 2019). Cairo 09 - 12 Apr 2019

Decoder-Encoder LSTM for Lip Reading

Fenghour, S., Chen, D. and Xiao, P. (2019). Decoder-Encoder LSTM for Lip Reading. Proceedings of the 2019 8th International Conference on Software and Information Engineering. https://doi.org/10.1145/3328833.3328845

Enhancing Ensemble Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnostic using optimized EKF-RBFN trained prototypes, The 10th International Conference on Soft Computing and Pattern Recognition

Adegoke, V, Chen, D, Banissi, E and Barikzai, S (2019). Enhancing Ensemble Prediction Accuracy of Breast Cancer Survivability and Diabetes Diagnostic using optimized EKF-RBFN trained prototypes, The 10th International Conference on Soft Computing and Pattern Recognition. The 10th International Conference on Soft Computing and Pattern Recognition. Porto, Portugal 13 - 15 Dec 2018

Distributed deep networks based on Bagging-Down SGD algorithm

Qin, C, Gao, X and Chen, D (2019). Distributed deep networks based on Bagging-Down SGD algorithm. Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics. 41 (5), pp. 1021-1027. https://doi.org/10.3969/j.issn.1001-506X.2019.05.13

Towards automated cost analysis, benchmarking and estimating in construction: a machine learning approach

Chen, D, Hajderanj, L and Fiske, J (2019). Towards automated cost analysis, benchmarking and estimating in construction: a machine learning approach. 13th Multi Conference on Computer Science and Information Systems (MCCSIS). Porto, Portugal 16 - 18 Jul 2019

Design of a voice control 6DoF grasping robotic arm based on ultrasonic sensor, computer vision and Alexa voice assistance

Wang, Z, Chen, D and Xiao, P (2019). Design of a voice control 6DoF grasping robotic arm based on ultrasonic sensor, computer vision and Alexa voice assistance. International Conference on Information Technology in Medicine and Education. Qingdao, China 23 - 25 Aug 2019 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/ITME.2019.00150

Visual analytics in the public sector: An analysis on diversities and similarities of London’s wards

Chen, D, Sanz, BM and Zhao, E (2018). Visual analytics in the public sector: An analysis on diversities and similarities of London’s wards. International Conference on Big Data Analytics, Data Mining and Computational Intelligence 2018 (BigDaCI 2018). Madrid, Spain 18 - 20 Jul 2018 Bigdaci.

Contour Mapping for Speaker-Independent Lip Reading System

Fenghour, S, Chen, D and Xiao, P (2018). Contour Mapping for Speaker-Independent Lip Reading System. The 11th International Conference on Machine Vision (ICMV 2018). Munich, Germany 01 - 03 Nov 2018

Learning Bayesian Network Parameters from a Small Data Set: A Further Constrained Qualitatively Maximum a Posteriori Method

Guo, Zhi-gao, Gao, Xiao-guang, Hao, Ren, Yang, Yu, Di, Ruo-hai and Chen, D (2017). Learning Bayesian Network Parameters from a Small Data Set: A Further Constrained Qualitatively Maximum a Posteriori Method. International Journal of Approximate Reasoning. 91 (Dec), pp. 22-35. https://doi.org/10.1016/j.ijar.2017.08.009

Feature Extraction and Labelling Large Data Sets Using Deep Learning

Chen, D (2017). Feature Extraction and Labelling Large Data Sets Using Deep Learning. RESEARCHER LINK: Smart Technology for Fighting Virus Epidemics & Bioinformatics. Recife, Pernambuco, Brazil 10 - 13 Sep 2017

Prediction of Breast Cancer Survivability using Ensemble Algorithms

Adegoke, V, Chen, D, Banissi, E and Barikzai, S (2017). Prediction of Breast Cancer Survivability using Ensemble Algorithms. International Conference on Smart System and Technologies 2017 (SST 2017). Osijek, Croatia 18 - 20 Oct 2017

Predictive Ensemble Modelling: An Experimental Comparison of Boosting Implementation Methods

Adegoke, V, Chen, D, Barikzai, S and Banissi, E (2017). Predictive Ensemble Modelling: An Experimental Comparison of Boosting Implementation Methods. 2017 European Modelling Symposium (EMS). Manchester 20 - 21 Nov 2017

Making Better Use of Big Data

Chen, D (2016). Making Better Use of Big Data. LSBU Enterprise Count Event, March 2016. London South Bank University 18 Mar 2016 London South Bank University.

Big Data Analytics In The Public Sector: A Case Study Of NEET Analysis For The London Boroughs

Chen, D, Asaolu, B and Qin, C (2016). Big Data Analytics In The Public Sector: A Case Study Of NEET Analysis For The London Boroughs. International Conference on Big Data Analytics, Data Mining and Computational Intelligence. Funchal, Madeira, Portugal 02 - 04 Jul 2016

On Distributed Deep Network for Processing Large-Scale Sets of Complex Data

Qin, C, Gao, X and Chen, D (2016). On Distributed Deep Network for Processing Large-Scale Sets of Complex Data. 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC). Hangzhou, China. 27 - 28 Aug 2016 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/IHMSC.2016.55

A Bayesian Approach to Learn Bayesian Networks Using Data and Constraints

Gao, X, Yu, Y, Zhi-gao, G and Chen, D (2016). A Bayesian Approach to Learn Bayesian Networks Using Data and Constraints. 23rd International Conference on Pattern Recognition (ICPR 2016). Cancún, México 04 - 08 Dec 2016 Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/ICPR.2016.7900204

Big Data Analytics System for Fact/Data-driven Decision Making

Chen, D (2015). Big Data Analytics System for Fact/Data-driven Decision Making. The Royal Statistical Society, Business and Industry Section. London, UK 18 Nov 2015 Royal Statistical Society.

Determining Key (Predictor) Modules for Early Identification of Students At-Risk

Chen, D and Elliott, G (2013). Determining Key (Predictor) Modules for Early Identification of Students At-Risk. International Conference on Advanced Information Engineering and Education Science (ICAIEES 2013). Beijing, China 19 - 20 Dec 2013 Atlantis Press. https://doi.org/10.2991/icaiees-13.2013.22

Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining

Chen, D (2012). Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining. Journal of Database Marketing and Customer Strategy Management. 19 (3), pp. 197-208. https://doi.org/10.1057/dbm.2012.17