Data Quality Issues in Data Warehouses

Instructor: Temitayo Odugbesan

Temitayo has 11+ years of industry experience in information technology and has a master's degree in computer science.

In this lesson, we will look at what data, data quality and data warehousing are all about. We will also learn how data quality issues affect data warehouses.

What is a Data Warehouse?

Databases are classified based on the number of concurrent users, database size, data location, their use and time sensitivity of the information gathered.

Transactional or operational databases record transactions immediately and accurately, reflecting critical daily operations. An example of this is a bank's customer database.

A data warehouse, however, stores historical information captured from diverse sources for the purpose of aiding tactical or strategic decision making. Its use falls under the database classification of time sensitivity.

Let's talk fashion. The fashion industry thrives on customer loyalty (we love our labels), customers' changing tastes (we like to stay trendy), timing and designer inspirations. Consider Gladys Fashion House (GFH), which keeps abreast of the industry by analysing trends (collated historical data) in all areas, captured from diverse sources.

By analysing these sets of data, GFH is able to predict its clients' tastes and the expected sales surge and, as a result, is able to stock up in anticipation of the next fashion season (critical decisions).

What is Quality Data?

The first thing that comes to mind with poor quality data is incorrect data, which actually forms only part of the poor quality issue. On the flip side, good quality data may not necessarily be free from errors either! Where does that leave us? Quality data can be defined as data that consistently meets the needs of the knowledge worker and the user's requirements. It is data that is usable and applicable to the business requirement.

At GFH, it's about ensuring the perfect outfit is picked out for the perfect occasion. A well-tailored (good quality) workman's overall may not be suitable for dinner (wrong occasion) but is perfect for a construction site (good quality and appropriate needs met).

What Makes Good Quality Data?

Good quality data satisfies the following attributes or characteristics:

  1. Accuracy - The data is an accurate representation or is from a verified source.
  2. Integrity - The data are accurate and consistent without any discrepancies.
  3. Consistency - Data elements are consistently defined and understood.
  4. Completeness - All necessary data are present.
  5. Validity - Data values fall within acceptable ranges defined by the business.
  6. Timeliness - Data are available when needed.
  7. Accessibility - The data is easily accessible, understandable, and usable.

The first five attributes cover the most common shortcomings of poor quality data, and as long as our data satisfies them, it is considered error-free. Error-free data, however, does not necessarily constitute quality data, as we saw in the workman's overall illustration earlier.
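Some of the attributes above, such as completeness and validity, can be checked mechanically. The following is a minimal sketch, not a definitive implementation. The record layout, the field names and the 1 to 1000 quantity range are assumptions made for illustration; they are not GFH's actual rules.

```python
from datetime import date

# Hypothetical sales record layout; the field names are assumptions.
REQUIRED_FIELDS = ("item_code", "quantity", "unit_price", "sale_date")


def check_record(record, today):
    """Return a list of data quality problems found in one record."""
    problems = []

    # Completeness: all necessary data are present.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        problems.append("incomplete: missing " + ", ".join(missing))

    # Validity: values fall within ranges defined by the business
    # (the 1..1000 range is an assumed business rule).
    quantity = record.get("quantity")
    if quantity is not None and not (1 <= quantity <= 1000):
        problems.append("invalid: quantity out of range")

    # Validity: a sale cannot be dated in the future.
    sale_date = record.get("sale_date")
    if sale_date is not None and sale_date > today:
        problems.append("invalid: sale_date is in the future")

    return problems


# Made-up sample records, for illustration only.
today = date(2024, 6, 1)
good = {"item_code": "G2345545", "quantity": 2,
        "unit_price": 49.99, "sale_date": date(2024, 5, 30)}
bad = {"item_code": "", "quantity": 0,
       "unit_price": 49.99, "sale_date": date(2024, 7, 1)}

print(check_record(good, today))
print(check_record(bad, today))
```

A record that passes every check here is only error-free in the sense described above. Whether it is timely and usable for a particular knowledge worker is a separate question that checks like these do not answer.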

Suppose GFH sets a trend that is considered hot (it's creative and clients like it, so it is "error-free") but releases it in the wrong season ("bad timing"). Poor sales figures would result. Data must be timely and usable (as depicted in attributes 6 and 7). The bottom line is that the data must be well suited to where it's needed, when it's needed.

A stylist at GFH can pick a perfect outfit for a complete stranger. How? Historical data analysis and training! She asks about the occasion, estimates your age, assesses your body type, notes what you currently have on, knows what's trending and voila! She picks you the right outfit.

The sales representatives need demographic analysis and gender distribution to make a good sales pitch, while the financial analyst leaves no room for errors. He needs precise records of customer purchases, down to the last cent, to make accurate financial forecasts. We see that each knowledge worker requires a different level of accuracy, completeness and consistency in the data.

When is Poor Quality Data Likely to Be Injected?

Poor quality data are injected in the following processes:

Data Entry Processes

Data entry errors, including misspellings, numerical transpositions, incorrect codes, misplaced data and abbreviations, can occur as companies migrate their businesses to the web and, in the process, allow customers and suppliers to enter data directly into their systems.

Lack of Validation Routines

Validation routines give a system the ability to check data as it is entered. Where they are missing, errors enter the system unchecked. Validation routines are, however, limited: data can be permissible (it passes validation) without being correct.

An example here is a GFH item: a button-front denim skirt with item code G2345545. During data entry, however, item code G2343545 was keyed in. The item code validation criteria require that the first character be a letter, followed by 7 digits. The entry satisfies the criteria, which makes it permissible to the validation routine, but the correct code is G2345545. The entry is therefore incorrect, yet the error goes undetected.
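The item code example can be sketched in a few lines of Python. The format rule (one letter followed by seven digits) comes from the example above. The catalog lookup is a hypothetical extra check added for illustration, and the names `KNOWN_ITEM_CODES`, `is_permissible` and `is_known` are assumptions rather than part of any GFH system.

```python
import re

# Format rule from the example: one letter followed by seven digits.
ITEM_CODE_PATTERN = re.compile(r"[A-Za-z][0-9]{7}")

# Hypothetical catalog of item codes that actually exist.
KNOWN_ITEM_CODES = {"G2345545": "Button-front denim skirt"}


def is_permissible(code):
    """True if the code satisfies the format rule."""
    return ITEM_CODE_PATTERN.fullmatch(code) is not None


def is_known(code):
    """True if the code matches an existing catalog entry."""
    return code in KNOWN_ITEM_CODES


for code in ("G2345545", "G2343545", "2345545G"):
    print(code, is_permissible(code), is_known(code))
# G2345545 True True
# G2343545 True False
# 2345545G False False
```

The mistyped code G2343545 passes the format check, which is exactly the limitation described above. A lookup against known codes catches it in this case, but it would not help if the mistyped code happened to belong to another real item.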

Changes to Source Systems

Organizations frequently change their systems, and as they migrate from one to another, mismatched syntax formats can occur.

For example, the GFH naming convention in the old system is "Button-front denim skirt". In the new system, though, the naming convention is "Denim button-front skirt".
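One possible way to reconcile such names during a migration is to compare them on a normalized key rather than on the raw text. The sketch below is an illustration only. The `name_key` helper is hypothetical, and it assumes that two names refer to the same item whenever they contain the same words in any order, which may not hold for every catalog.

```python
def name_key(item_name):
    """Build an order-insensitive, case-insensitive key from an item name."""
    words = item_name.lower().split()
    return tuple(sorted(words))


old_name = "Button-front denim skirt"
new_name = "Denim button-front skirt"

print(old_name == new_name)                      # False
print(name_key(old_name) == name_key(new_name))  # True
```

A direct string comparison treats the two names as different items, while the normalized keys match.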
