Frequency Distributions


Part 1.

When you have a list L1 of outcomes of an experiment, you may describe its distribution in several ways:

****

When the number of different outcomes is small (for example, each outcome is the result of a throw of two dice), you put the individual outcomes (here, numbers from 2 to 12) on the x-axis, and their frequencies on the y-axis.
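Away from the calculator, the counting behind such a plot can be sketched in Python. The list `throws` below is made-up sample data, not data from this lesson, and all names are illustrative.

```python
from collections import Counter

# Hypothetical sample: sums of two dice from 12 throws (made-up data).
throws = [7, 5, 8, 7, 2, 10, 7, 6, 9, 12, 6, 8]

counts = Counter(throws)

# Frequency of each possible outcome 2..12 (0 if it never occurred).
frequencies = {outcome: counts[outcome] / len(throws) for outcome in range(2, 13)}

for outcome, freq in frequencies.items():
    print(outcome, round(freq, 3))
```

The outcomes 2 to 12 go on the x-axis and the values of `frequencies` on the y-axis.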

Example 1:

 

****

You may also draw a histogram where individual outcomes are represented by vertical bars of the same width, 1, whose heights (and areas) are equal to the frequencies of individual outcomes.

Example 2 (for the same distribution):

 

 

****

When the number of different outcomes is too big (for example, L1 contains 360 numbers between 0 and 1, such as numbers generated by rand(360)), you may still use a histogram: divide the range of the outcomes (rounded up) into bins of the same width (for example, 10 bins, each of width .1), and draw a histogram where the height and the area of each bar are proportional to the frequency of outcomes in its bin (namely, the number of outcomes in this bin divided by the length of the list).
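The binning step can be sketched in Python as follows. Here `random.random()` stands in for the calculator's rand, and the fixed seed is only an assumption to make the sketch repeatable.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable
outcomes = [random.random() for _ in range(360)]  # stands in for rand(360)

num_bins = 10

counts = [0] * num_bins
for x in outcomes:
    # x lies in [0, 1), so x * num_bins lies in [0, 10); min() guards the top edge.
    index = min(int(x * num_bins), num_bins - 1)
    counts[index] += 1

# Frequency of a bin = number of outcomes in the bin / length of the list.
frequencies = [c / len(outcomes) for c in counts]
print(frequencies)
```

Because all bins have the same width, plotting `frequencies` as bar heights gives bars whose areas are proportional to the frequencies as well.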

Example 3:

****

There are many situations when we want to create bins of different sizes. In school practice this happens when we score a test, for example, on a scale from 0 to 100 points, and we assign grades as follows: A: 91-100, B: 76-90, C: 61-75, D: 51-60, and F: 0-50.

 

When bins are of the same width, the heights of the bars and their areas are either equal or proportional to each other. But with bins of different widths, we have to choose whether to represent frequency by the height of the bar or by its area = width*height. There is general agreement that representing frequency by the height of the bar, as shown in Example 4 below, misrepresents the data (can you say how the data are misrepresented?). So only the second method, representing frequency by area, as shown in Example 5 below, should be used. Such a representation is called a frequency distribution. Thus, a histogram with the same width of all bars is a special case of a frequency distribution.

 

Example 4. (Representing frequencies of grades by the heights of the bars)

In a classroom of 30 students the grade distribution on a test was the following:

A: 5 (17%)       B: 6 (20%)       C: 8 (27%)       D: 7 (23%)       F: 4 (13%)

In the graph below the widths of the bars represent the range of scores for each grade (10, 15, 15, 10, 51), and their heights represent the frequencies. Not a good representation!

 

****

Example 5. (Representing the same frequencies by the areas of the bars)

This graph shows the same data. The widths of the bars remain the same as in the previous example, but now their areas represent frequencies. Therefore their heights are proportional to: 5/10, 6/15, 8/15, 7/10, and 4/51.
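The heights can be checked with a short Python sketch that uses the counts and widths given in Example 4. The variable names are illustrative.

```python
grades = ["A", "B", "C", "D", "F"]
counts = [5, 6, 8, 7, 4]        # students per grade
widths = [10, 15, 15, 10, 51]   # number of scores in each grade's range

total = sum(counts)
frequencies = [c / total for c in counts]

# Area represents frequency, so height = frequency / width.
heights = [f / w for f, w in zip(frequencies, widths)]
areas = [h * w for h, w in zip(heights, widths)]

for g, h, a in zip(grades, heights, areas):
    print(g, round(h, 5), round(a, 5))
```

The areas add up to 1, and the tallest bar belongs to D (7/10), not to C, even though C has the most students.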

Part 2.

Creating a frequency distribution from a cumulative frequency distribution on the TI-84

 

L1 should hold the list of outcomes, and L2 should hold the list (of length D) of bins, entered in the following way:

L2(1) holds the beginning of the first bin. (Note that L2(1) should be smaller than the smallest element of list L1. So for example, in data for rolling three dice [in the example below], start with something less than 3, such as 2.5.)

L2(2) holds the end of the first bin, which is treated as the beginning of the second bin;

L2(3) holds the end of the second bin, which is treated as the beginning of the third bin;

... and so on ...

L2(D) holds the end of the last bin. (In data for rolling three dice below, this element will be 18.5.)

Thus, D is the number of bins plus one.
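For bins of equal width, the list of boundaries can be generated rather than typed in. The Python sketch below assumes 4 bins of width 4 starting at 2.5, which matches the three-dice example later in this lesson. The names are illustrative.

```python
# Assumed values: 4 bins of width 4 starting at 2.5 (three-dice data).
start, width, num_bins = 2.5, 4, 4

# One boundary more than the number of bins: the beginning of the first bin,
# then the end of every bin in turn.
edges = [start + k * width for k in range(num_bins + 1)]
D = len(edges)

print(edges, D)
```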

 

Define

\Y1=sum(L1≤X)/dim(L1)       This function computes a cumulative distribution.

It does not need to be activated when you run FREQ.
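What Y1 computes can be sketched in Python as follows. The function name `cumulative` and the sample list are made up for illustration.

```python
def cumulative(outcomes, x):
    """Fraction of outcomes that are <= x, like sum(L1≤X)/dim(L1)."""
    return sum(1 for value in outcomes if value <= x) / len(outcomes)

# Made-up sample of three-dice totals, for illustration only.
sample = [10, 7, 12, 10, 15, 9, 11, 4]

print(cumulative(sample, 2.5))   # below every outcome
print(cumulative(sample, 10))
print(cumulative(sample, 18.5))  # above every outcome
```

The frequency of outcomes in a bin is then the difference between the values of this function at the end and at the beginning of the bin.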

 

For a frequency distribution separated into bins, where frequencies will be represented by the AREAS of the bars, you will need this program:

 

 

Here is an explanation of the steps.

PROGRAM:FREQ

:Disp "BINS?"
:Input LB

You enter the list of bins, L2. The list LB holds the boundaries of the bins.

:dim(LB)→D
:D→dim(LP)

D holds the length of LB, which is the number of bins plus one.

:For(I,1,D-1)
:((Y1(LB(I+1))-Y1(LB(I)))/(LB(I+1)-LB(I)))→LP(I)
:End:0→LP(D)

For each bin, LP holds the frequency of outcomes in the bin divided by the width of the bin, so that the area of the bar equals the frequency. But LP(D) holds 0.

:"LP(sum(seq(LB(I)≤X,I,1,D)))"→Y2

The function Y2 displaying the frequency graph is defined.

:LB(1)→Xmin
:LB(D)→Xmax
:0→Ymin
:max(LP)→Ymax

And the window is set up.

:DispGraph

The graph is shown, and the program stops. You may use TRACE to look at it.
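The steps of the program can also be followed in a Python sketch. This is not the calculator program itself but a rough translation. The function name `freq` and the sample data are made up, and the value returned to the left of the first bin is an assumption, since the program only graphs from the beginning of the first bin.

```python
def freq(outcomes, edges):
    """Return the bar heights (like LP) and a step function (like Y2)."""
    n = len(outcomes)

    def y1(x):
        # Cumulative distribution, like Y1 = sum(L1≤X)/dim(L1).
        return sum(1 for v in outcomes if v <= x) / n

    d = len(edges)
    heights = []
    for i in range(d - 1):
        frequency = y1(edges[i + 1]) - y1(edges[i])
        heights.append(frequency / (edges[i + 1] - edges[i]))
    heights.append(0.0)  # mirrors 0→LP(D)

    def y2(x):
        # Count the boundaries at or below x, like sum(seq(LB(I)≤X,I,1,D)).
        k = sum(1 for e in edges if e <= x)
        if k == 0:
            return 0.0  # assumption: height 0 to the left of the first bin
        return heights[k - 1]

    return heights, y2


# Made-up data: two outcomes in each of four bins of width 4.
outcomes = [3, 5, 7, 9, 11, 13, 15, 17]
edges = [2.5, 6.5, 10.5, 14.5, 18.5]
heights, y2 = freq(outcomes, edges)
print(heights)
```

With this made-up data every bin has frequency 2/8 and width 4, so every bar has height (2/8)/4 = .0625.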

 

 

Run the program, and when you see "BINS?", enter L2.

 

An example.

We stored 60 throws of three dice in L1 (do you see that outcomes could be from 3 to 18?), and we made 4 equal bins. (Notice that the beginning and end points of the bins are between whole numbers.) Then we ran the program FREQ. The program creates a function Y2 that is graphed.

 

 

 

Our graph looked like this:

 

 

We changed the maximum and minimum Y values in WINDOW so that the graph was easier to see:

 

 

And we used TRACE to look at the four values:

 

 

We also computed how many of the 60 values were in each bin:

 

 

So the four frequencies are 1/60, 32/60, 19/60, and 8/60. And each bin has width 4. The Y2-values given for the four bins, and the areas of the bins, are computed as follows:

bin 1: Y2(1) = (1/60)/4=.00416667; area = (L2(2)-L2(1))*Y2(1) = 1/60

bin 2: Y2(2) = (32/60)/4=.13333333; area = (L2(3)-L2(2))*Y2(2) = 32/60

bin 3: Y2(3) = (19/60)/4=.07916667; area = (L2(4)-L2(3))*Y2(3) = 19/60

bin 4: Y2(4) = (8/60)/4=.03333333; area = (L2(5)-L2(4))*Y2(4) = 8/60

 

The four areas of the bins sum to one.
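The arithmetic above can be rechecked with a short Python sketch. The counts come from the example. The list of boundaries is an assumption that fits four bins of width 4 ending at 18.5.

```python
counts = [1, 32, 19, 8]                 # outcomes per bin, out of 60
edges = [2.5, 6.5, 10.5, 14.5, 18.5]    # assumed contents of L2

total = sum(counts)
heights = []
areas = []
for i, c in enumerate(counts):
    width = edges[i + 1] - edges[i]
    height = (c / total) / width        # the Y2-value on this bin
    heights.append(height)
    areas.append(width * height)        # equals the frequency c / total

for h, a in zip(heights, areas):
    print(round(h, 8), round(a, 8))

print(sum(areas))
```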


Webpage Maintained by Owen Ramsey