|
With data profiling tools, database managers can improve a team's ability to verify that source data conforms to its assumptions. The manual alternative is to find a subject matter expert, write new rules to fix bad data, and rerun the process, only to find more data defects that require rewriting extraction, transformation, and loading (ETL) code. An automated data profiling tool instead scans every record in every column and table of a source system. Rather than just generating a list of data values, data profiling tools produce reports full of statistics and charts that make it easier to understand what needs to be known about the data. Such tools are available from data quality vendors including Ascential Software, DataFlux, Evoke Software, Firstlogic, Informatica, and Trillium Software, which recently acquired Avellino Technologies. A data profiling tool is also more likely than manual profiling methods to expose new or unexpected structures and values in the data. The benefits of data profiling tools are substantial: several users describe their experiences, which include time savings measured in weeks and much better accuracy. Topics covered include maintaining accuracy in existing systems; auditing third-party feeds; exposing inconsistent business processes; the data quality process; and evaluating data processing tools.
|