| Review: |
Data mining is an automated method of extracting useful information from datasets. In practice, these datasets often contain outliers, missing or incomplete data, or subtler anomalies such as misalignments. This book deals with the problems of detecting such data anomalies and overcoming their deleterious effects. The two approaches used here are data pre-treatment and analytic validation, both of which can be used in conjunction with most data mining methods. Examples of pre-treatment and validation methods are given for various situations, including simulation-based examples, in which the ‘correct’ results are known, and real examples that illustrate typical cases. |