An interval, also called a class, is a set of values within which an observation falls.
A frequency distribution is a tabular display of data categorized into a small number of non-overlapping intervals. Note that:
It is important to consider the number of intervals to be used. If too few intervals are used, too much data may be summarized and we may lose important characteristics; if too many intervals are used, we may not summarize enough.
A frequency distribution is constructed by dividing the scores into intervals and counting the number of scores in each interval. The actual number of scores and the percentage of scores in each interval are displayed. This helps in the analysis of large amounts of statistical data, and works with all types of measurement scales.
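The counting described above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_distribution`, its parameters, and the scores are hypothetical, and the sketch assumes equal-width intervals in which each lower bound is included and the last interval also includes its upper bound.

```python
def frequency_distribution(scores, lower, width, n_intervals):
    """Count how many scores fall in each of n_intervals equal-width intervals.

    Interval i covers lower + i*width <= x < lower + (i+1)*width, except that
    the last interval also includes its upper endpoint (an assumed convention).
    """
    upper = lower + width * n_intervals
    counts = [0] * n_intervals
    for x in scores:
        if not lower <= x <= upper:
            raise ValueError(f"{x} lies outside the range [{lower}, {upper}]")
        # Floor division gives the interval index; min() folds the upper
        # endpoint into the last interval.
        i = min(int((x - lower) // width), n_intervals - 1)
        counts[i] += 1
    return counts


# Hypothetical scores, for illustration only.
scores = [1, 4, 5, 9, 10, 12, 15, 19, 20]
counts = frequency_distribution(scores, lower=0, width=5, n_intervals=4)

# Display the actual number and the percentage of scores in each interval.
for i, count in enumerate(counts):
    low, high = 5 * i, 5 * (i + 1)
    percent = 100 * count / len(scores)
    print(f"{low:>2} to {high:<2}  count={count}  percent={percent:.1f}%")
```

Because every score is assigned to exactly one interval, the counts always add up to the number of scores.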
Organizing data into a frequency distribution requires the following steps, given here together with suggestions on constructing the distribution.
Data can be divided into two types: discrete and continuous.
There are two methods that graphically represent continuous data: histograms and frequency polygons.
1. A histogram is a bar chart that displays a frequency distribution. It is constructed as follows:
From a histogram, we can see quickly where most of the observations lie. The shapes of histograms will vary, depending on the choice of the size of the intervals.
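To see how a histogram shows where most of the observations lie, here is a minimal text-only sketch. The function `text_histogram`, the interval labels, and the counts are hypothetical; a real histogram would normally be drawn with a plotting tool, with bar heights equal to the interval frequencies.

```python
def text_histogram(labels, counts):
    """Return one text bar per interval; bar length equals the frequency."""
    pad = max(len(label) for label in labels)
    return [
        f"{label.ljust(pad)} | {'#' * count} ({count})"
        for label, count in zip(labels, counts)
    ]


# Hypothetical intervals and frequencies, for illustration only.
labels = ["0-5", "5-10", "10-15"]
counts = [2, 7, 3]

for row in text_histogram(labels, counts):
    print(row)
```

The longest bar marks the interval with the most observations. Choosing wider or narrower intervals changes the counts, and therefore the shape of the bars.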
2. The frequency polygon is another means of graphically displaying data. It is similar to a histogram, but the bars are replaced by points joined by straight lines. It is constructed in the following manner:
Unlike a histogram, a frequency polygon adds a degree of continuity to the presentation of the distribution.
It is helpful, when drawing a frequency polygon, first to draw a histogram in pencil, then to plot the points and join the lines, and finally to rub out the histogram. In this way, the histogram can be used as an initial guide to drawing the polygon.
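The points of a frequency polygon can also be computed directly. The sketch below assumes the usual convention that each point sits at the midpoint of its interval, at a height equal to that interval's frequency; the function name, interval edges, and counts are hypothetical.

```python
def polygon_points(edges, counts):
    """Pair each interval's midpoint with its frequency.

    edges has one more entry than counts: interval i runs from
    edges[i] to edges[i + 1].
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    return list(zip(midpoints, counts))


# Hypothetical intervals and frequencies, for illustration only.
edges = [0, 5, 10, 15, 20]
counts = [2, 2, 2, 3]
points = polygon_points(edges, counts)
print(points)
```

Joining consecutive points with straight lines gives the polygon; the midpoints are where the tops of the pencilled histogram bars would be centred.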
The relative frequency for a class is calculated by dividing the number of observations in that class by the total number of observations; multiplying the resulting fraction by 100 converts it to a percentage. Put simply, relative frequency tells us, for each class, what proportion (or percentage) of the total observations falls in that class.
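The calculation can be sketched as follows. The function name and the class counts are hypothetical; the sketch also checks that the relative frequencies sum to 1, which is a quick way to catch an arithmetic slip.

```python
import math


def relative_frequencies(counts):
    """Divide each class count by the total number of observations."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    return [count / total for count in counts]


# Hypothetical class counts, for illustration only.
counts = [4, 10, 6]
rel = relative_frequencies(counts)

for count, r in zip(counts, rel):
    print(f"count={count:>2}  relative frequency={r:.2f}  ({100 * r:.0f}%)")

# The relative frequencies must sum to 1 (allowing for rounding error).
assert math.isclose(sum(rel), 1.0)
```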
Let's look at an example.
The following table shows the holding period returns of a portfolio of 40 stocks.
The highest HPR is 32% and the lowest is -27%. Let's use 6 non-overlapping intervals, each with a width of 10%. The first interval starts at -27% and, since 6 × 10% = 60%, the last one ends at 33%. Therefore, the entire range of the HPRs is covered.
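The interval boundaries for this example can be checked with a short sketch. The starting point, width, and number of intervals come from the text; the helper `interval_index` and the convention that each interval includes its lower bound but not its upper bound are assumptions made for illustration.

```python
lowest, width, n_intervals = -27, 10, 6  # values from the example, in percent

# Seven boundaries define six intervals.
edges = [lowest + i * width for i in range(n_intervals + 1)]
print(edges)


def interval_index(hpr):
    """Return the 0-based interval containing hpr (lower bound inclusive)."""
    if not edges[0] <= hpr < edges[-1]:
        raise ValueError(f"{hpr} is outside the covered range")
    return int((hpr - lowest) // width)


# The lowest and highest returns from the text fall in the first and last intervals.
print(interval_index(-27), interval_index(32))
```

Under this convention, a return that falls exactly on a boundary, such as -17%, is counted in the interval that starts there.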
Hint: If, in an examination, your relative frequency column does not sum to 1 (or 100%), you know that you have made a mistake.