Data verification and validation for clean data
Verifying data before it is entered into a usable form avoids the need for expensive data cleansing later. A good set of data verification processes ensures the integrity of source data and prevents errors when data is transferred from its sources to final storage in a database.
Validation is a set of procedures that check whether the entered data makes business sense.
Some Verification Methods
- Dual Entry: This is a simple yet effective method in which two different data entry operators enter the same record. If both entries are identical, the database accepts one copy. If they differ, the database rejects the write request and the data entry application highlights the differences, which are then resolved by re-checking against the source data.
- On-Screen Prompts: The source data is displayed on screen so that the user or data entry operator can check that the data in the database matches it. The user must then manually confirm that the data has been verified against its source.
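The dual-entry comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are plain dictionaries, and the function names `compare_entries` and `accept_or_reject` are hypothetical, not part of any particular data entry product.

```python
def compare_entries(first: dict, second: dict) -> dict:
    """Return the fields whose values differ between two independent entries."""
    differences = {}
    # Look at every field that appears in either entry.
    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a != b:
            differences[field] = (a, b)
    return differences


def accept_or_reject(first: dict, second: dict):
    """Accept one copy if both entries match; otherwise reject and report.

    Returns (accepted_record, differences). On rejection the record is None
    and the differences are meant to be re-checked against the source data.
    """
    differences = compare_entries(first, second)
    if differences:
        return None, differences
    return dict(first), {}
```

A caller would pass the two operators' versions of the same record and, if the first return value is `None`, show the reported differences to someone who can consult the source document.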
A Few Validation Methods
- Mandatory Fields: Checks that all mandatory fields are filled. For example, if the Date of Birth or Gender field is left blank, a pop-up message is displayed. Some entry systems will not accept the remaining data until the mandatory fields have been completed.
- Type Check: Certain fields accept only one data type. The door number or ZIP code field, for instance, can hold only numeric values. Any deviation is flagged or recorded for later corrective action.
- Length Check: Data elements such as telephone numbers or SSNs have standard lengths. Checking the length of such fields helps detect incorrect entries.
A framework of data verification and validation rules helps define what counts as acceptable or unacceptable data. Verification and validation methods, though numerous, can be automated to a large extent by professional data management companies. How well the automation works depends on how clearly the organization defines its verification and validation rules. Data management companies can also provide useful input when those rules are being defined.
Verification and validation can be conducted on both new and existing data. Prevention is better than cure: proper verification and validation procedures can bring down data cleansing costs substantially.
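As a rough illustration of how the mandatory-field, type, and length checks described earlier might be automated, here is a minimal Python sketch. The field names, the rule tables, and the expected lengths (a 10-digit telephone number and a 9-digit SSN, both stored without separators) are assumptions chosen for the example. A real rule set would come from the organization's own definitions.

```python
# Hypothetical rule tables; real rules would be defined by the organization.
MANDATORY_FIELDS = ("date_of_birth", "gender")
NUMERIC_FIELDS = ("door_number", "zip_code")
EXPECTED_LENGTHS = {"telephone": 10, "ssn": 9}  # assumed: digits only, no separators


def validate_record(record: dict) -> list:
    """Return a list of human-readable validation errors (empty if none)."""
    errors = []

    # Mandatory-field check: the field must be present and not blank.
    for field in MANDATORY_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"{field}: mandatory field is blank")

    # Type check: numeric fields may contain only ASCII digits.
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        if value is not None:
            text = str(value)
            if not (text.isascii() and text.isdigit()):
                errors.append(f"{field}: expected digits only")

    # Length check: standard-length fields must match the expected length.
    for field, length in EXPECTED_LENGTHS.items():
        value = record.get(field)
        if value is not None and len(str(value)) != length:
            errors.append(f"{field}: expected {length} characters, got {len(str(value))}")

    return errors
```

Returning a list of errors rather than stopping at the first failure lets an entry system either block the record or simply log the problems for later correction, matching the two behaviours mentioned in the list above.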