|
Regulatory initiatives such as Basel II, the Sarbanes-Oxley Act (SOX), and the Health Insurance Portability and Accountability Act (HIPAA) have required companies to focus on the quality of their data. Data profiling can answer 'what' and 'where' questions about data quality. It can also shed light on the data's content, structure, and context, and provide information on its completeness and accuracy. An efficient data profiling methodology includes five steps. The first step is to define business requirements such as business rules, data standards, and domain definitions. The second is discovery, which uses statistical methods to verify content and form, including histograms, mean and standard deviation, frequency counts, and maximum and minimum values. The third is to analyze the findings of the discovery phase and compare them with the requirements identified in the definition phase. In the fourth step, the company identifies and executes actions based on the results of the analysis. Lastly, monitoring means examining metrics to identify the current state of data quality and its trends over time. As part of a compliance exercise, data profiling can be a complicated initiative. However, with comprehensive planning and resource commitment, a company can achieve both compliance and a level of data quality that meets regulatory requirements.
|
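The discovery and analysis steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names `profile_column` and `check_completeness`, the sample `ages` column, and the 95% completeness threshold are all hypothetical and chosen only for the example. Missing values are assumed to be represented as `None`.

```python
from collections import Counter
from statistics import mean, pstdev


def profile_column(values):
    """Discovery step: compute basic statistics for one column.

    Missing values are assumed to be represented as None.
    """
    present = [v for v in values if v is not None]
    profile = {
        "count": len(values),
        # Share of non-missing values, a simple completeness measure.
        "completeness": len(present) / len(values) if values else 0.0,
        # Frequency count of each distinct value.
        "frequencies": Counter(present),
    }
    # Numeric summaries only apply to numeric values (bool excluded).
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric:
        profile["min"] = min(numeric)
        profile["max"] = max(numeric)
        profile["mean"] = mean(numeric)
        profile["std_dev"] = pstdev(numeric)  # population standard deviation
    return profile


def check_completeness(profile, required):
    """Analysis step: compare a finding with a defined requirement."""
    return profile["completeness"] >= required


# Hypothetical sample column with one missing value.
ages = [34, 41, None, 29, 41, 52]
age_profile = profile_column(ages)

# Hypothetical business rule: at least 95% of values must be present.
meets_rule = check_completeness(age_profile, 0.95)
print(age_profile["min"], age_profile["max"], meets_rule)
```

Running the same profile on a schedule and storing the results would give the metrics that the monitoring step examines for trends.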