Mathematics 505D

Data Analysis and Probability

Summer 2011


The Digits of Pi


Pi is the ratio of the circumference of a circle to its diameter.  From the first time that someone noticed that the same ratio occurs for every circle, people have wondered what exactly π is.   Nowadays, everyone seems to know that the number Pi has a decimal expansion that starts 3.14 and never ends. The digits that follow have no obvious pattern.  Your question is: do the digits of Pi have any statistical patterns?


You have been given a spreadsheet that gives 10,000 decimal digits of Pi.  It also has a rearrangement of these digits so that they appear as 5,000 base 100 digits. (For example, consider 42 as a single base 100 digit.)  You are going to investigate the frequency of the digits: the digits base 10 {0,1,2,3,4,5,6,7,8,9} and the digits base 100 {0,1,2,3,…,97,98,99}.  You will need to count the number of times a particular digit appears in the digits of the expansion of π given in this spreadsheet. We look at base 100 digits to learn more about base 10 digits. If we count the number of times the base 100 digit 42 appears, this is somewhat similar to counting how often a 4 is followed by a 2 in base 10.
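
If you would like to see the base 100 rearrangement outside the spreadsheet, here is a minimal Python sketch. It assumes the pairing starts with the first digit after the decimal point (your spreadsheet may start elsewhere), and it uses only the first 50 decimal digits of Pi as a sample. The names SAMPLE, to_base_100 and count_followed_by are made up for this illustration.

```python
# First 50 digits of Pi after the decimal point (a short sample only;
# the spreadsheet holds 10,000).
SAMPLE = "14159265358979323846264338327950288419716939937510"


def to_base_100(digits: str) -> list:
    """Group a string of base 10 digits into consecutive pairs."""
    if len(digits) % 2 != 0:
        raise ValueError("need an even number of base 10 digits")
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


def count_followed_by(digits: str, first: str, second: str) -> int:
    """Count every position where `first` is immediately followed by `second`."""
    return sum(1 for a, b in zip(digits, digits[1:]) if a == first and b == second)


base_100 = to_base_100(SAMPLE)
print(base_100[:5])                        # [14, 15, 92, 65, 35]

# The two counts are only "somewhat similar": a 3 followed by a 3 can
# straddle two base 100 digits (here 43 and 38) and be missed.
print(base_100.count(33))                  # 0
print(count_followed_by(SAMPLE, "3", "3")) # 1
```

The base 100 count only sees pairs that start at an odd position, so it will generally be smaller than the count of all adjacent pairs.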


Here are some questions to investigate (first in base 10 and then in base 100):

1.    Do any digits appear more frequently or less frequently than the others in the full collection?


2.   Suppose you concentrate on one digit, and compute the frequency with which it occurs as you make your way from the first few digits of pi to the 10,000th digit. You can speed things up a bit by considering 10 rows at a time.  To do this for the digit 7, find the percentage of 7's in the first 10 rows of the table, the first 20 rows, the first 30 rows, et cetera.  See if this percentage increases or decreases as the number of rows you include increases.


3.   Does any base 10 digit occur in long consecutive strings? Do some digits appear in consecutive strings that are longer than others?


4.   What if you compare your results above with a table of the same length but composed of random numbers generated by the spreadsheet?


5.   What have you discovered about π from the questions above?




Start with the first spreadsheet of base 10 digits and investigate question 1.  (The COUNTIF function built into your spreadsheet is useful for all these questions.) Work your way through all the base 10 digits, and compare the results.
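
For comparison with your COUNTIF results, the same tally can be sketched in Python. This is only an illustration on the first 50 decimal digits of Pi, not on the full spreadsheet, and the function name digit_frequencies is made up here.

```python
from collections import Counter

# First 50 digits of Pi after the decimal point (a short sample only).
SAMPLE = "14159265358979323846264338327950288419716939937510"


def digit_frequencies(digits: str) -> dict:
    """Return the fraction of the string taken up by each digit 0 through 9."""
    counts = Counter(digits)
    total = len(digits)
    return {d: counts[str(d)] / total for d in range(10)}


for digit, fraction in digit_frequencies(SAMPLE).items():
    print(digit, f"{fraction:.0%}")
```

With only 50 digits the frequencies are quite uneven (for instance, 3 and 9 each appear 8 times while 0 appears twice), which is one reason to look at all 10,000 digits before drawing conclusions.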


For question 2, get started by looking at one digit base 10. You might try the digit 7 or the third digit of your phone number.  Use COUNTIF again, but a bit more selectively.  There are several ways of using the spreadsheet to get the results you need.  Work your way through all the base 10 digits. Compare the results if you can, or postpone this until later when you get a better feel for the question.
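
The running percentage in question 2 can also be sketched in Python. The example below is a rough illustration on 50 digits, checking the frequency of 7 after every 10 digits. The step of 10 digits is an assumption for the sample; in the spreadsheet you would use the number of digits in 10 rows instead. The name running_frequency is made up here.

```python
# First 50 digits of Pi after the decimal point (a short sample only).
SAMPLE = "14159265358979323846264338327950288419716939937510"


def running_frequency(digits: str, target: str, step: int) -> list:
    """Return (digits seen so far, fraction equal to target) after each step."""
    results = []
    count = 0
    for seen, digit in enumerate(digits, start=1):
        if digit == target:
            count += 1
        if seen % step == 0:
            results.append((seen, count / seen))
    return results


for seen, fraction in running_frequency(SAMPLE, "7", step=10):
    print(f"first {seen} digits: {fraction:.1%} are 7s")
```

In this short sample the percentage of 7s starts at 0% and climbs to 8% by the 50th digit. Whether it keeps moving or settles down over 10,000 digits is what the question asks you to find out.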


After you begin to understand how questions 1 and 2 work in base 10, go back and investigate the questions in base 100 with the second spreadsheet.  You don't need to answer the questions for every digit this time, but a few selected ones might be interesting just to get a feel for what is happening.


You may find both spreadsheets helpful in investigating the 3rd question even though it is about base 10 digits.  It may require some hand counts, unless you are quite good at using spreadsheet functions. This time the COUNTIFS function can be helpful in finding long strings, especially if you apply it to rows or pairs of rows in base 100 with digits like 22 or 33.
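
If hand counting gets tedious, one way to check your work is a short Python sketch that finds the longest consecutive string of each digit. This is an alternative to the spreadsheet approach, not a replacement for it, and it is shown on only the first 50 decimal digits of Pi. The name longest_runs is made up here.

```python
from itertools import groupby

# First 50 digits of Pi after the decimal point (a short sample only).
SAMPLE = "14159265358979323846264338327950288419716939937510"


def longest_runs(digits: str) -> dict:
    """Return the length of the longest consecutive string of each digit."""
    best = {str(d): 0 for d in range(10)}
    # groupby collects neighbouring equal digits into one group.
    for digit, group in groupby(digits):
        best[digit] = max(best[digit], len(list(group)))
    return best


runs = longest_runs(SAMPLE)
print([d for d, length in runs.items() if length >= 2])  # ['3', '8', '9']
```

In the sample, only 3, 8 and 9 ever appear twice in a row, and nothing appears three times in a row.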


Question 4 asks you to answer the same questions for a table of random digits that you generate.  You can concentrate on questions 1 and 2 if you like.  To generate random integers mod 10, use INT(RAND()*10).  RAND() gives a random nine digit decimal number between 0 and 1 (e.g. 0.466387456).  RAND()*10 gives a random nine digit decimal number between 0 and 10 (e.g. 6.092719963).  The INT function keeps just the integer part, so the result is a random integer between 0 and 9, inclusive of the 0 and 9.
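
The same recipe can be sketched in Python, where random.random() plays the role of RAND() and int() plays the role of INT. This is only an analogy to the spreadsheet formula; Python's generator is not the spreadsheet's, so the digits will differ. The name random_digits and the seed value are made up for this illustration.

```python
import random
from collections import Counter
from typing import Optional


def random_digits(n: int, seed: Optional[int] = None) -> list:
    """Generate n random base 10 digits, mirroring INT(RAND()*10)."""
    rng = random.Random(seed)  # a fixed seed makes the table repeatable
    # rng.random() is at least 0 and less than 1, so the result is 0 through 9.
    return [int(rng.random() * 10) for _ in range(n)]


table = random_digits(10_000, seed=505)
counts = Counter(table)
for digit in range(10):
    print(digit, counts[digit])
```

You can then put these counts next to your counts for the digits of Pi and see whether the two tables look alike.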