DSpace Repository

An extended k-means++ with mixed attributes for outlier detection

dc.contributor.advisor Guha, Sumanta (Chairperson) en_US
dc.contributor.author Sarunya Kanjanawattana en_US
dc.contributor.other Phan Minh Dung (Member) en_US
dc.contributor.other Dailey, Matthew N. (Member) en_US
dc.date.accessioned 2015-01-12T10:41:32Z
dc.date.available 2015-01-12T10:41:32Z
dc.date.issued 2011-08 en_US
dc.identifier.other AIT Thesis no.CS-11-14 en_US
dc.identifier.uri http://www.cs.ait.ac.th/xmlui/handle/123456789/397
dc.description Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. en_US
dc.description.abstract Fraud detection has become an important task, especially in health insurance. In some developing countries, such as Thailand, fraud in financial statements is detected through human inspection and heuristic rules. This approach is time-consuming and may produce incorrect results, which is why an effective fraud detection system is needed to support an organization's auditing process. We propose a new algorithm, called MixK-means++, which extends K-means++ so that it can be applied to mixed numeric and categorical data. K-means and K-means++ are limited to numeric data, yet most real-world data combine categorical and numeric attributes; MixK-means++ overcomes this limitation. We compared standard K-means with MixK-means++ in terms of clustering speed and performance, and in terms of outlier detection results. MixK-means++ produced favorable results even though it worked with the mixed attributes of the data set, and it was better than K-means in both speed and performance. For outlier detection, MixK-means++ was also better because it achieved a higher detection rate than K-means, although in some cases K-means might offer a higher accuracy rate. en_US
dc.description.sponsorship Royal Thai Government en_US
dc.language.iso eng en_US
dc.publisher Asian Institute of Technology en_US
dc.subject Data mining en_US
dc.subject K-means en_US
dc.subject K-means++ en_US
dc.subject Outlier detection en_US
dc.subject Clustering en_US
dc.subject Fraud detection system en_US
dc.subject Computer algorithms en_US
dc.subject.lcsh Others en_US
dc.title An extended k-means++ with mixed attributes for outlier detection en_US
dc.type Thesis en_US
dc.rights.holder Copyright (C) 2011 by Asian Institute of Technology. en_US

