11 Frequency Distributions

Jenna Lehmann

In statistics, many tests are run on large numbers of data points, so it’s important to understand how those data are spread out and how individual values compare with one another. A frequency distribution is just that: an outline of what the data look like as a unit. A frequency table is one way to present it. It’s an organized tabulation showing the number of individuals located in each category on the scale of measurement. The table lists each score from highest to lowest (X) and, next to it, the number of times that score appears in the data (f). In other words, it lets you read off which scores appear in a data set and how often each one appears.

 

Organizing Data into a Frequency Distribution

  1. Find the range
  2. Order the table from highest score to lowest score, not skipping scores that might not have shown up in the data set
  3. In the next column, document how many times this score shows up in the data set
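The steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the scores are whole numbers; the function name `frequency_table` and the quiz scores are made up for the example.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) rows ordered from the highest score to the lowest."""
    counts = Counter(scores)
    highest, lowest = max(scores), min(scores)
    # Walk every whole-number score in the range, so a score that never
    # shows up in the data still gets a row with f = 0.
    return [(x, counts[x]) for x in range(highest, lowest - 1, -1)]

scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]  # made-up quiz scores
table = frequency_table(scores)

print("X   f")
for x, f in table:
    print(f"{x:<3} {f}")
```

Notice that the score 5 still gets a row even though nobody earned it, and that the f column adds up to the number of scores in the data set.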

 

Organizing Data into a Grouped Frequency Table

  1. The grouped frequency table should have about 10 intervals. A good strategy is to come up with some widths according to Guideline 2 and divide the total range of numbers by that width to see if there are close to 10 intervals.
  2. The width of the interval should be a relatively simple number (like 2, 5, or 10)
  3. The bottom score in each class interval should be a multiple of the width (0-9, 10-19, 20-29, etc.)
  4. All intervals should be the same width.
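Here is a rough sketch of how these guidelines might look in Python. It assumes non-negative whole-number scores and that you have already picked a width; the function name `grouped_frequency_table` and the exam scores are hypothetical.

```python
from collections import Counter

def grouped_frequency_table(scores, width):
    """Return (low, high, f) rows, highest interval first."""
    # The bottom of each interval is a multiple of the width (Guideline 3).
    bottoms = Counter((x // width) * width for x in scores)
    top = (max(scores) // width) * width
    bottom = (min(scores) // width) * width
    # Every interval has the same width (Guideline 4); empty ones get f = 0.
    return [(low, low + width - 1, bottoms[low])
            for low in range(top, bottom - 1, -width)]

scores = [53, 67, 71, 74, 78, 82, 85, 85, 88, 91, 94, 99]  # made-up exam scores
rows = grouped_frequency_table(scores, width=5)

print("Interval  f")
for low, high, f in rows:
    print(f"{low}-{high}     {f}")
```

With these scores the range runs from 53 to 99, so a width of 5 gives ten intervals (50-54 up through 95-99), which matches Guideline 1. A width of 10 would give only five intervals, and a width of 2 would give far more than ten.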

 

Proportions and Percentages

Proportions measure the fraction of the total group that is associated with each score. They’re also called relative frequencies because they describe the frequency in relation to the total number of scores. For example, if I have 10 pieces of fruit and 3 of them are oranges, the proportion of oranges is 3/10, or 0.30. Percentages express the same relative frequency out of 100. Keeping with our fruit example, 30% of my fruit is oranges.
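The fruit example can be worked out in Python as a quick check. This is just a sketch: the text only says that 3 of the 10 pieces are oranges, so the apples and bananas below are assumptions added to fill out the basket, and `relative_frequencies` is a made-up helper name.

```python
from collections import Counter

def relative_frequencies(items):
    """Map each category to (f, proportion, percentage)."""
    counts = Counter(items)
    n = len(items)
    # proportion = f / N, percentage = proportion * 100
    return {category: (f, f / n, 100 * f / n) for category, f in counts.items()}

# 3 oranges out of 10 pieces of fruit; the other fruit is assumed.
fruit = ["orange"] * 3 + ["apple"] * 5 + ["banana"] * 2
table = relative_frequencies(fruit)

for category, (f, p, pct) in table.items():
    print(f"{category:<7} f={f}  p={p:.2f}  {pct:.0f}%")
```

The proportions always add up to 1 and the percentages to 100, which is a handy way to check a table you built by hand.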

Real Limits

Continuous variables require the calculation of real limits, the boundaries of the interval that a reported score actually represents. They are calculated by taking the apparent limit and separately subtracting and adding half of the smallest unit of measurement shown. For example, say I have a value of 50 and I want the real limits. The smallest digit shown is the ones place, so I subtract half of one (49.5) and add half of one (50.5). Sometimes the ones place isn’t the smallest digit. If I have a value of 34.5, the smallest digit shown is the tenths place (0.1), so I divide 0.1 by 2 to get 0.05. The real limits are therefore 34.45 and 34.55. Finally, sometimes the smallest unit of measurement is given. If the smallest unit a scale can measure is 0.2 pounds and you have a value of 80 pounds, you subtract and add half of 0.2 pounds to get 79.9 and 80.1. This can be a difficult concept to grasp.
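If it helps to see the arithmetic spelled out, here is a small Python sketch of the rule "subtract and add half of the smallest unit." The function name `real_limits` is made up, and you have to tell it the unit of measurement yourself. It uses `Decimal` with string inputs so that halves of small decimals come out exact. The first two calls reuse the 50 and 80-pound examples; the 12.3 value is a hypothetical score measured to the nearest tenth.

```python
from decimal import Decimal

def real_limits(value, unit):
    """Return (lower, upper) real limits for a score measured in steps of `unit`."""
    value, unit = Decimal(value), Decimal(unit)
    half = unit / 2  # half of the smallest unit of measurement
    return value - half, value + half

whole_number = real_limits("50", "1")    # measured to the nearest 1
scale_weight = real_limits("80", "0.2")  # scale measures in 0.2-pound steps
nearest_tenth = real_limits("12.3", "0.1")  # hypothetical value, nearest 0.1

for lower, upper in (whole_number, scale_weight, nearest_tenth):
    print(f"lower = {lower}, upper = {upper}")
```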

Frequency Distribution Graphs

A frequency distribution is often best grasped conceptually through the use of graphs. These graphs represent the same data as the tables but show it in a different way. This can be done with bar graphs (for discrete data), histograms (for continuous data), or polygons (for continuous data).
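To see how a table turns into a picture, here is a very rough text version of a frequency graph in Python. It is only a sketch, not a real plotting tool: the graph is turned on its side (scores run down the page, frequencies run across), it assumes whole-number scores, and the name `text_histogram` and the quiz scores are made up.

```python
from collections import Counter

def text_histogram(scores):
    """Return one line per score, with one asterisk per occurrence."""
    counts = Counter(scores)
    lines = []
    for x in range(min(scores), max(scores) + 1):
        # The bar's length is the frequency f of score x.
        lines.append(f"{x:>3} | " + "*" * counts[x])
    return lines

scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]  # made-up quiz scores
lines = text_histogram(scores)
print("\n".join(lines))
```

Each bar is exactly as long as the f value in the matching row of the frequency table, which is all a bar graph or histogram really shows.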

These graphs can come in a multitude of shapes, but here are just a few important descriptive words generally used in statistics:

  • Symmetrical: When the shape of the distribution is, at least for the most part, mirrored on both sides if you were to view the mean as the flipping point.
  • Asymmetrical: When the shape of the distribution is not mirrored on both sides for whatever reason (usually because of skew).
  • Positively Skewed: This is when there is what looks like a tail of data trailing off to the right. I like to remember this as the P in Positive having fallen on its back.
  • Negatively Skewed: This is when there is what looks like a tail of data trailing off to the left.
  • Unimodal: This literally means having a buildup of data around what looks to be one number, so one mode. Your typical bell curve is unimodal.
  • Bimodal: This is when there is data clustering around two different numbers or spots on the distribution, so having two modes. This can often look like camel humps.
  • Multimodal: When a distribution has two or more “humps” in the graph.
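The unimodal, bimodal, and multimodal labels above come down to counting "humps." Here is a crude Python sketch of that idea, which counts the peaks in a list of frequencies ordered by score. The function names `count_peaks` and `describe_modality` are made up, and this is an assumption-laden simplification: real data are noisy, so every small bump would count as a hump here, and in practice you would judge the overall shape by eye.

```python
def count_peaks(frequencies):
    """Count humps in a list of frequencies ordered by score."""
    padded = [0] + list(frequencies) + [0]
    peaks = 0
    i = 1
    while i < len(padded) - 1:
        if padded[i] > padded[i - 1]:
            # Walk across a flat top so a plateau counts as one hump.
            j = i
            while j < len(padded) - 1 and padded[j + 1] == padded[j]:
                j += 1
            if j + 1 < len(padded) and padded[j + 1] < padded[j]:
                peaks += 1
            i = j + 1
        else:
            i += 1
    return peaks

def describe_modality(frequencies):
    """Label the shape by its number of humps."""
    peaks = count_peaks(frequencies)
    if peaks == 0:
        return "no peak"
    if peaks == 1:
        return "unimodal"
    if peaks == 2:
        return "bimodal"
    return "multimodal"

bell = describe_modality([1, 3, 6, 3, 1])          # one hump
camel = describe_modality([1, 5, 2, 1, 2, 6, 1])   # two humps
print(bell, camel)
```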



This chapter was originally posted to the Math Support Center blog at the University of Baltimore on June 4, 2019.
