Unsupervised clustering for nontextual web document classification
Publication in refereed journal

Times Cited
Web of Science: 6 (as at 07/07/2020)

Other information
Abstract: While the breadth of vocabulary used in long documents may mislead traditional keyword-based retrieval systems, the demand for techniques for nontextual Web classification and retrieval from large document collections is mounting. Only a few prototype systems have attempted to classify hypertext on the basis of nontextual elements in order to locate unfamiliar documents. As a result, a large portion of Web documents that are pictorial in nature is far beyond the reach of most current search engines. In this research, we devise a novel quantitative model of nontextual World Wide Web (WWW) classification based on image information. An intelligent, content-sensitive, attribute-rich image classifier is presented. An image similarity measure is used to deduce the likelihood among images. Different image feature vectors have been constructed and evaluated. Evaluation shows that images judged to be similar by humans form interesting clusters in our unsupervised learning. Comparison with other clustering techniques, such as Hierarchical Agglomerative Clustering (HAC), demonstrates that our approach is useful in content-based image information retrieval. (C) 2003 Elsevier B.V. All rights reserved.
All Author(s) List: Chan SWK, Chong MWC
Journal name: Decision Support Systems
Volume Number: 37
Issue Number: 3
Pages: 377 - 396
Languages: English-United Kingdom
Keywords: image classification; neural networks; unsupervised clustering
Web of Science Subject Categories: Computer Science; Computer Science, Artificial Intelligence; Computer Science, Information Systems; Operations Research & Management Science

Last updated on 2020-08-07 at 03:08