Betsy has a Ph.D. in biomedical engineering from the University of Memphis, an M.S. from the University of Virginia, and a B.S. from Mississippi State University. She has over 10 years of experience developing STEM curriculum and teaching physics, engineering, and biology.
What's Wrong with These Books?
In 1938, a physicist working at General Electric (GE) named Frank Benford was tired of always having to buy new books of logarithm tables. For some reason, it seemed that the first few pages of the book always got worn out and dirty a lot faster than the rest of the book. The first few pages contained logarithms that started with the number one, while those that started with other digits were farther back in the book.
Dr. Benford was pretty sure that the scientists and engineers who worked at GE did not have some particular fondness for the number one, so he was puzzled about why these pages were obviously being used more than all the others in the book. He decided to investigate and find out what was really going on.
He collected over 20,000 numbers from 20 different data sets to see if he could find some general trends. Some of the data sets he looked at were baseball statistics, some were lists of addresses, some were areas of rivers and cities, and some were just seemingly random numbers listed in magazine articles. Even though all this data was obviously very different, Dr. Benford found that in almost every case, the data followed a similar pattern.
In nearly every data set he examined, about 30% of the numbers had a first digit of one. This was surprising: if each of the nine possible first digits (one through nine) were equally likely, you would expect one to turn up as the first digit in one out of every nine numbers, or about 11% of the time. Instead, about 30% started with one. What was going on?
Based on his analysis of these data sets, Dr. Benford was able to show that the probability of the first digit being one was about 30%, while the probability of the first digit being nine was less than 5%. This probability distribution in sets of numerical data is now known as Benford's Law.
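The lesson gives only the two end points of the distribution, but the standard statement of Benford's Law is that a first digit d appears with probability log10(1 + 1/d). The short Python sketch below simply evaluates that formula for each digit; the function name `benford_probability` is made up for this illustration.

```python
import math

def benford_probability(digit: int) -> float:
    """Probability that a number's first digit equals `digit` (one through nine)."""
    if not 1 <= digit <= 9:
        raise ValueError("first digit must be between 1 and 9")
    # Standard form of Benford's Law: P(d) = log10(1 + 1/d)
    return math.log10(1 + 1 / digit)

# Print the expected share of numbers that start with each digit.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

Running it gives about 30.1% for one and about 4.6% for nine, which matches the figures above, and the nine probabilities add up to 100%.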
While Benford's Law predicts the distribution of first digits quite accurately in many large sets of data, especially data that spans several orders of magnitude, it doesn't always work. For example, it can't be used to predict which numbers will help you win the lottery. This is because, in the lottery, each digit is drawn so that every value is equally likely.
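The contrast between lottery-style numbers and data that follows Benford's Law can be seen in a small experiment. The sketch below is only an illustration under stated assumptions: the "lottery" is a hypothetical game that draws four-digit numbers from 1000 to 9999 with equal chance, and the powers of two stand in for data that grows by multiplication. Neither example comes from Dr. Benford's own data, and the function names are invented here.

```python
import random

def first_digit(n: int) -> int:
    """Return the leading digit of a positive whole number."""
    return int(str(n)[0])

def share_starting_with_one(numbers) -> float:
    """Fraction of the given numbers whose first digit is one."""
    numbers = list(numbers)
    ones = sum(1 for n in numbers if first_digit(n) == 1)
    return ones / len(numbers)

# Stand-in for multiplicative data: the powers of two from 2**1 to 2**1000.
powers_of_two = [2 ** k for k in range(1, 1001)]

# Hypothetical lottery: four-digit numbers, each equally likely to be drawn.
rng = random.Random(0)  # fixed seed so the run can be repeated
lottery_draws = [rng.randint(1000, 9999) for _ in range(10_000)]

print(f"powers of two: {share_starting_with_one(powers_of_two):.1%}")
print(f"lottery draws: {share_starting_with_one(lottery_draws):.1%}")
```

The powers of two start with one 30.1% of the time, while the lottery draws should land near the 11% you would expect when every first digit is equally likely.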
Dr. Benford finally had an explanation of why the first pages of his logarithm book kept getting used more than all the others, but it turns out that Benford's Law has a lot more applications in all kinds of fields.
How is Benford's Law Used?
Because Benford's Law has been shown to be true for so many different types of data sets, it's used today in some pretty surprising ways. For example, did you know that tax fraud can often be detected using Benford's Law?
That's right! The numbers submitted on your tax return can be checked to see if they conform to Benford's Law or not. If your return is legitimate, then it is very likely that Benford's Law will accurately predict the distribution of first digits in the data you submitted. However, if the data is made up, it's likely that it will NOT conform to Benford's Law, because people who invent numbers tend to spread the first digits too evenly or to favor certain digits. You can't prove that a tax return is not legitimate using Benford's Law, but you can certainly identify those returns that deserve a little more scrutiny!
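A check like this can be sketched in a few lines of Python. The lesson does not say which statistical test auditors actually use, so this example assumes one common choice, a chi-square goodness-of-fit test, with the usual 5% cutoff of 15.507 for eight degrees of freedom. Both data sets are invented for illustration, the amounts are assumed to be positive whole numbers, and the function names are hypothetical.

```python
import math
from collections import Counter

# Chi-square cutoff for 8 degrees of freedom at the 5% significance level.
CHI_SQUARE_CUTOFF = 15.507

def first_digit(amount: int) -> int:
    """Return the leading digit of a positive whole-number amount."""
    return int(str(amount)[0])

def benford_chi_square(amounts) -> float:
    """Chi-square statistic comparing observed first digits with Benford's Law."""
    counts = Counter(first_digit(a) for a in amounts)
    total = sum(counts.values())
    statistic = 0.0
    for digit in range(1, 10):
        expected = total * math.log10(1 + 1 / digit)
        statistic += (counts[digit] - expected) ** 2 / expected
    return statistic

def needs_scrutiny(amounts) -> bool:
    """Flag a data set whose first digits stray too far from Benford's Law."""
    return benford_chi_square(amounts) > CHI_SQUARE_CUTOFF

# Invented data that follows Benford's Law closely: powers of two.
benford_like = [2 ** k for k in range(1, 201)]

# Invented "made-up" amounts: no repeats, but every one starts with 7, 8, or 9.
invented = [70_000 + 137 * k for k in range(200)]

print("benford_like flagged:", needs_scrutiny(benford_like))
print("invented flagged:", needs_scrutiny(invented))
```

A flag from a test like this is not proof of fraud. It only marks a set of numbers as worth a closer look, which is exactly how the technique is described above.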
In addition to flagging potentially fraudulent tax returns, Benford's Law can be used to detect fraud in all kinds of data, from sales and expense reports to voting records. In one famous court case from Arizona, a man working in the state treasurer's office tried to steal over two million dollars from the state. He set up a fake vendor account and started diverting state funds into this account (which was really just going to him). He thought he was being very careful and taking precautions to avoid being caught. He made sure that he didn't repeat any amounts, and he tried to create numbers that would appear random.
However, he obviously did not know about Benford's Law, and that was exactly why his deception was detected! A majority of the dollar amounts that he made up started with the digits seven, eight, or nine, and most were for amounts just under $100,000. When his fabricated data was compared to the predictions of Benford's Law, it was very obvious that it was not real, and he was convicted of fraud.
Benford's Law says that in most data sets, the first digit of any given number is most likely to be one and is least likely to be nine. It was discovered by Dr. Frank Benford, a physicist at GE, in the 1930s when he began investigating why the first few pages (which contained numbers whose first digit was one) of his book of logarithm tables were used much more than the pages at the end of the book.
Benford's Law is used in accounting to detect mistakes and fraud. If a set of data does not seem to conform to Benford's Law, it may have been fabricated or altered in some way and deserves a closer look. Benford's Law has been used to detect tax fraud, and even to help show that a state employee in Arizona was stealing money from the state treasury!