Mining Imperfect Data. Dealing with Contamination and Incomplete Records
Author(s): R. Pearson
Publisher: SIAM
ISBN: 0898715822
Format: softback
305pp
Price: $70.00
Review Date: 31 May 2005
Review: Data mining is an automated method of extracting useful information from datasets. In practice, these datasets often contain outliers, missing or incomplete data, or subtler anomalies such as misalignments. This book deals with the problems of detecting such data anomalies and overcoming their deleterious effects. It takes two approaches, data pre-treatment and analytic validation, both of which can be used in conjunction with most data mining methods. Examples of pre-treatment and validation methods are given for various situations, including simulation-based examples, in which the ‘correct’ results are known, and real examples that illustrate typical cases.