Mining Order-Preserving Submatrices from Data with Repeated Measurements
Publication in refereed journal


Abstract: Order-preserving submatrices (OPSMs) have been shown to be useful in capturing concurrent patterns in data when the relative magnitudes of data items are more important than their exact values. For instance, in analyzing gene expression profiles obtained from microarray experiments, the relative magnitudes are important both because they represent the change of gene activities across the experiments, and because there is typically a high level of noise in the data that makes the exact values untrustworthy. To cope with data noise, repeated experiments are often conducted to collect multiple measurements. We propose and study a more robust version of OPSM, where each data item is represented by a set of values obtained from replicated experiments. We call the new problem OPSM-RM (OPSM with repeated measurements). We define OPSM-RM based on a number of practical requirements. We discuss the computational challenges of OPSM-RM and propose a generic mining algorithm. We further propose a series of techniques to speed up two time-dominating components of the algorithm. We show the effectiveness and efficiency of our methods through a series of experiments conducted on real microarray data.
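As an illustration of the idea in the abstract, the sketch below checks which rows of a small matrix follow a given column order (the usual notion of an order-preserving submatrix), and then scores a row whose entries are sets of replicate values. The function names, the toy data, and in particular the "fraction of replicate combinations" score are assumptions made for this example; the abstract does not give the paper's formal definition of OPSM-RM, so this should not be read as the authors' method.

```python
from itertools import product


def follows_order(values):
    """Return True if the values are strictly increasing."""
    return all(a < b for a, b in zip(values, values[1:]))


def supporting_rows(matrix, columns):
    """Indices of rows whose values increase along the given column order.

    The returned rows together with `columns` form an order-preserving
    submatrix in the single-measurement setting.
    """
    return [
        i for i, row in enumerate(matrix)
        if follows_order([row[c] for c in columns])
    ]


def replicate_support(row_replicates, columns):
    """Fraction of replicate combinations that follow the column order.

    ASSUMPTION: this score is a hypothetical stand-in for a robust
    support measure; it is not taken from the paper.
    """
    choices = [row_replicates[c] for c in columns]
    combos = list(product(*choices))
    consistent = sum(follows_order(combo) for combo in combos)
    return consistent / len(combos)


# Toy data: 3 genes (rows) x 3 experiments (columns), one value each.
matrix = [
    [1.0, 5.0, 3.0],
    [2.0, 9.0, 4.0],
    [7.0, 1.0, 3.0],
]
rows = supporting_rows(matrix, [0, 2, 1])

# One gene with two replicate measurements per experiment.
replicates = [[1.0, 1.2], [5.0, 2.5], [3.0, 2.8]]
score = replicate_support(replicates, [0, 2, 1])
```

Here `rows` holds the genes whose expression rises in the order experiment 0, experiment 2, experiment 1, and `score` shows how a single noisy replicate can weaken, without fully removing, a row's support for that order.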
Authors: Yip KY, Kao B, Zhu XJ, Chui CK, Lee SD, Cheung DW
Journal: IEEE Transactions on Knowledge and Data Engineering
Detailed description: To ORKTS: I am the first author of the publication.

2012 ISI journal impact factor: 1.892
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 1587 - 1600
Keywords: bioinformatics; Data mining; mining methods and algorithms
Web of Science subject categories: Computer Science; Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering; Engineering, Electrical & Electronic

Last updated: 2020-27-10 at 00:34