TY - JOUR
T1 - Hierarchical Maximum Likelihood Clustering Approach
AU - Sharma, Alok
AU - Boroevich, Keith A.
AU - Shigemizu, Daichi
AU - Kamatani, Yoichiro
AU - Kubo, Michiaki
AU - Tsunoda, Tatsuhiko
N1 - Publisher Copyright: © 1964-2012 IEEE.
PY - 2017/1
Y1 - 2017/1
N2 - Objective: In this paper, we focus on developing a clustering approach for biological data. In many biological analyses, such as multiomics data analysis and genome-wide association study analysis, it is crucial to find groups of data belonging to subtypes of diseases or tumors. Methods: Conventionally, the k-means clustering algorithm is overwhelmingly applied in many areas, including the biological sciences. There are, however, several alternative clustering algorithms that can be applied, including support vector clustering. In this paper, taking into consideration the nature of biological data, we propose a maximum likelihood clustering scheme based on a hierarchical framework. Results: This method can perform clustering even when the data belonging to different groups overlap. It can also perform clustering when the number of samples is lower than the data dimensionality. Conclusion: The proposed scheme does not require selecting initial settings to begin the search process. In addition, it does not require computing the first and second derivatives of the likelihood function, as many other maximum likelihood-based methods do. Significance: This algorithm uses distribution and centroid information to cluster a sample and was applied to biological data. A MATLAB implementation of this method can be downloaded from http://www.riken.jp/en/research/labs/ims/med-sci-math/.
UR - http://www.scopus.com/inward/record.url?scp=85008481824&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85008481824&partnerID=8YFLogxK
U2 - 10.1109/TBME.2016.2542212
DO - 10.1109/TBME.2016.2542212
M3 - Article
C2 - 27046867
AN - SCOPUS:85008481824
SN - 0018-9294
VL - 64
SP - 112
EP - 122
JO - IEEE Transactions on Biomedical Engineering
JF - IEEE Transactions on Biomedical Engineering
IS - 1
M1 - 7440832
ER -