Data Cleansing Research Project*
This research is aimed at defining a framework for automated data cleansing.
That is, given a large data set, the goal is to automatically find and correct
errors (semantic and syntactic) within the set. The underlying theoretical
aspects of data quality research are being combined with problem solving
methods from software testing, data mining, statistics, knowledge-based
systems, clustering, and machine learning to develop this framework.
The framework will define an underlying theory to support an accurate set
of data quality metrics. The inherent problems faced by automated data
cleansing are being uncovered and investigated to build a basic understanding of them.
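
As a purely illustrative sketch of what "automatically find errors" can mean
in the simplest statistical case, the following flags values in a numeric
field that lie far from the field's mean. This is not the project's method;
the function name flag_outliers, the record layout, the "age" field, and the
threshold are all hypothetical choices made for this example.

```python
from statistics import mean, stdev


def flag_outliers(records, field, threshold=3.0):
    """Return indices of records whose `field` value lies more than
    `threshold` sample standard deviations from the field's mean.

    Hypothetical helper for illustration only: a flagged value is a
    *possible* error, not a confirmed one.
    """
    values = [r[field] for r in records]
    if len(values) < 2:
        return []  # a standard deviation needs at least two values
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up sample data: the last age is a likely keying error.
ages = [30, 31, 29, 32, 28, 30, 31, 29, 30, 300]
records = [{"age": a} for a in ages]
suspects = flag_outliers(records, "age", threshold=2.0)
```

A check like this only identifies candidates. Deciding whether a flagged
value is truly wrong, and what the corrected value should be, is the harder
part of the problem the framework is meant to address.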
Progress Report on Data Cleansing 10-18-1999
Automated Identification of Errors in Data Sets 2-2-2000
Utilizing Association Rules for Identification of Possible Errors in Data
Utilizing Association Rules for the Data Cleansing 5-8-2000
* This research is supported by the Office of Naval Research under
grant N00014-99-1-0730. All opinions, findings, conclusions, and recommendations
in any material resulting from this research are those of the investigators
and do not necessarily reflect the views of the Office of Naval Research.
Published By: J. Maletic
Last Update: 7/26/2000