Data Cleansing Research Project*


This research aims to define a framework for automated data cleansing: given a large data set, automatically find and correct the errors (semantic and syntactic) within it.  The theoretical foundations of data quality research are being combined with problem-solving methods from software testing, data mining, statistics, knowledge-based systems, clustering, and machine learning to build this framework.  The framework will define an underlying theory to support an accurate set of data quality metrics.  The inherent problems faced by automated data cleansing are being uncovered and investigated in order to develop a basic understanding of them.

Technical Reports:

*This research is supported by the Office of Naval Research under grant N00014-99-1-0730.  All opinions, findings, conclusions, and recommendations in any material resulting from this research are those of the investigators and do not necessarily reflect the views of the Office of Naval Research.

