Mathematics 505D

Data Analysis and
Probability

Summer 2011

Projects

The Digits of Pi

Pi is the ratio of the circumference of a circle to its diameter.
From the first time that someone noticed that the same ratio occurred on
every circle, people have wondered what exactly Pi is. Nowadays, everyone seems to know
that the number Pi has a decimal expansion that starts 3.14, but is actually
infinite. The digits that follow show no obvious pattern. Your question is: do the digits of Pi
have any statistical patterns?

You have been given a spreadsheet that gives 10,000 decimal digits of Pi.
It also has a rearrangement of these digits so that they appear as 5,000
base 100 digits. (For example, consider 42 as a single base 100 digit.) You are
going to investigate the frequency of the digits: the digits base 10 {0,1,2,3,4,5,6,7,8,9}
and the digits base 100 {0,1,2,3,...,97,98,99}. You will need to count the number of
times a particular digit appears in the expansion of Pi given in this
spreadsheet. We look at base 100 digits to learn more about base 10 digits. If
we count the number of times the base 100 digit 42 appears, this is somewhat
similar to counting how often a 4 is followed by a 2 in base 10.
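The regrouping into base 100 digits can also be sketched outside the spreadsheet. The Python below is only an illustration: the function names are hypothetical, and it assumes the digits are held as one string and paired off from the start of that string, which may not match how the spreadsheet was built. It also shows why the two counts are only "somewhat similar": the base 100 count sees a 4 followed by a 2 only when the pair does not straddle two base 100 digits.

```python
def to_base_100(digits: str) -> list[int]:
    """Group a string of base 10 digits into consecutive pairs (base 100 digits)."""
    # Drop a trailing unpaired digit if the length is odd.
    usable = len(digits) - (len(digits) % 2)
    return [int(digits[i:i + 2]) for i in range(0, usable, 2)]


def count_adjacent(digits: str, pair: str) -> int:
    """Count every position where the two-character string `pair` occurs."""
    return sum(1 for i in range(len(digits) - 1) if digits[i:i + 2] == pair)


# A made-up string, not digits of Pi, chosen to show the difference.
made_up = "0420"
print(to_base_100(made_up))            # pairs are 04 and 20
print(to_base_100(made_up).count(42))  # the base 100 digit 42 never appears
print(count_adjacent(made_up, "42"))   # but a 4 is followed by a 2 once
```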

Here are some questions to investigate
(first in base 10 and then in base 100):

1. Do any digits appear more frequently or less frequently than the others in the full
collection?

2. Suppose you concentrate on one digit, and compute the frequency with which it occurs as you
make your way from the first few digits of Pi to the 10,000th
digit. You can speed things up a bit by considering 10 rows at a time. To do this for the digit 7, find the
percentage of 7's in the first 10 rows of the table, the first 20
rows, the first 30 rows, et cetera. See if this percentage increases or
decreases as the number of rows you include increases.

3. Does any base 10 digit occur in long consecutive strings? Do
some digits appear in consecutive strings that are longer than others?

4. What if you compare your results above with a table of the same length but composed of
random numbers generated by the spreadsheet?

5. What have you discovered about Pi from the questions above?

Suggestions.

Start with the first spreadsheet of base 10 digits and investigate question 1. (The COUNTIF function built into your
spreadsheet is useful for all these questions.) Work your way through all the
base 10 digits, and compare the results.
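If you want to check your spreadsheet counts another way, the tally for question 1 can be sketched in Python. This is only a sketch under assumptions: the function name is hypothetical, and the ten digits used below are just the familiar start of the expansion (3.1415926535), standing in for the full 10,000 digits from the spreadsheet.

```python
from collections import Counter


def digit_counts(digits: str) -> dict[str, int]:
    """Return how often each base 10 digit occurs, including digits that never occur."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) for d in "0123456789"}


# Stand-in data: the first ten decimal digits of Pi after the "3."
sample = "1415926535"
counts = digit_counts(sample)
expected = len(sample) / 10  # the count each digit would have if all were equally frequent

for digit, count in counts.items():
    print(digit, count, "expected about", expected)
```

This plays the same role as one COUNTIF per digit over the whole table.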

For question 2, get started by
looking at one digit base 10. You might try the digit 7 or the
third digit of your phone number.
Use COUNTIF again, but a bit more selectively. There are several ways of using the
spreadsheet to get the results you need.
Work your way through all the base 10 digits. Compare the results if you
can, or postpone this until later when you get a better feel for the question.
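The running percentage in question 2 can be sketched as follows. The function name and the `step` parameter are hypothetical: `step` stands for the number of digits in 10 rows of the table, which depends on how wide the spreadsheet rows are, and the digit string below is made up for illustration rather than taken from Pi.

```python
def running_percentages(digits: str, target: str, step: int) -> list[tuple[int, float]]:
    """Percentage of `target` among the first step, 2*step, 3*step, ... digits."""
    results = []
    for end in range(step, len(digits) + 1, step):
        count = digits[:end].count(target)
        results.append((end, 100.0 * count / end))
    return results


# Made-up digits, only to show the shape of the output.
made_up = "7170777000"
for end, percent in running_percentages(made_up, "7", step=5):
    print("first", end, "digits:", percent, "percent are 7's")
```

Each step recounts from the first digit, which mirrors applying COUNTIF to a range that grows by 10 rows at a time.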

After you begin to understand
how questions 1 and 2 work in base 10, go back to investigate the questions in base 100 with
the second spreadsheet. You don't
need to answer the questions for every digit this time, but a few selected ones
might be interesting just to get a feel for what is happening.

You may find both
spreadsheets helpful in investigating the third question even though
it is about base 10 digits. It may
require some hand counts, unless you are quite good at using spreadsheet
functions. This time the COUNTIFS function can be helpful in finding long strings,
especially if you apply it to rows or pairs of rows in base 100 with digits like
22 or 33.
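As a cross-check on hand counts for question 3, the longest consecutive string of each digit can be found with a short Python sketch. The function name is hypothetical and the digit string is made up for illustration; it is not taken from Pi.

```python
from itertools import groupby


def longest_runs(digits: str) -> dict[str, int]:
    """Length of the longest consecutive string of each base 10 digit."""
    longest = {d: 0 for d in "0123456789"}
    # groupby splits the string into maximal blocks of equal neighbouring characters.
    for digit, block in groupby(digits):
        length = sum(1 for _ in block)
        if length > longest.get(digit, 0):
            longest[digit] = length
    return longest


# Made-up digits: 1 appears twice in a row, 2 three times, 3 four times.
made_up = "1122233331"
print(longest_runs(made_up))
```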

Question 4 asks you to
answer the same questions for a table of digits that you generate yourself. You can concentrate on questions 1 and 2
if you like. To generate random
integers mod 10, use INT(RAND()*10). RAND() gives a
random decimal number between 0 and 1, shown here to nine decimal places (e.g.
0.466387456). RAND()*10
gives a random decimal number between 0 and 10 (e.g.
6.092719963). The INT function
gives just the integer part, so the result
is a random integer between 0 and 9, inclusive of the 0 and 9.