Coach Yourself

GMAT Descriptive-Statistics Questions: Why "Simple Average" Isn’t Always So Simple
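This Q&A focuses on the statistical concepts of arithmetic mean (simple average), median, and range. You'll learn how these seemingly simple concepts can make for surprisingly difficult GMAT questions.

Q: Can you briefly define the term descriptive statistics, and describe what aspects of descriptive statistics GMAT test-takers are likely to encounter?

For any set of numerical terms:

Arithmetic mean (simple average): the sum of the terms divided by the number of terms.

Median: the middle term when the terms are ranked by value or, if the set contains an even number of terms, the arithmetic mean of the two middle terms.

Range: the difference between the greatest and least values in the set.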
You should understand each of these terms, because the test-makers won’t provide their definitions during the test. By the way, the same goes for standard deviation, a more advanced statistical concept that inherently makes for a relatively challenging GMAT question.

Q: The three concepts you just defined seem very straightforward. How can the test-makers design challenging questions, or even moderately difficult ones, involving these concepts?

Consider this question (answer choices omitted):

Which of the following expressions represents the arithmetic mean (average) of the five terms p, q, p + q, p – 1, and q + 1 ?

Solving the problem requires not only application of the arithmetic-mean concept but also a bit of algebraic manipulation, which makes the question a bit more complex than simply adding together five numbers and dividing by five. Here are the algebraic steps, plugging the variable expressions into a general equation for arithmetic mean (AM):

AM = [p + q + (p + q) + (p – 1) + (q + 1)] / 5 = (3p + 3q) / 5 = 3(p + q) / 5
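As a quick sanity check on this kind of simplification, the sketch below compares the direct average of the five terms with the closed form 3(p + q)/5 for a few sample values. The function names and the sample values are illustrative assumptions, not part of the original question.

```python
from fractions import Fraction

def mean_of_five(p, q):
    """Average the five terms p, q, p + q, p - 1, q + 1 directly."""
    terms = [p, q, p + q, p - 1, q + 1]
    return sum(terms) / len(terms)

def simplified_mean(p, q):
    """Closed form obtained by collecting like terms: 3(p + q) / 5."""
    return 3 * (p + q) / 5

# Hypothetical sample values; Fractions keep the arithmetic exact.
samples = [
    (Fraction(2), Fraction(3)),
    (Fraction(-1), Fraction(4)),
    (Fraction(7, 2), Fraction(1, 3)),
]
for p, q in samples:
    assert mean_of_five(p, q) == simplified_mean(p, q)
```

The –1 and +1 cancel, which is why the constant terms drop out of the result.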
Still not too difficult, is it? So to further increase the difficulty level of an arithmetic-mean question, the test-makers might provide the arithmetic mean and ask instead for the value of an unknown term in the set. Consider this variation on the problem I just solved (again, I’m omitting answer choices):

Which of the following expressions represents the fifth term of a set that also includes the terms p, q, p + q, and p – 1, if ...

This one’s a bit more challenging, isn’t it? It’s more difficult to understand, and it’s harder to determine how to approach and solve. Moreover, although you apply the same arithmetic-mean formula to solve this problem as for the previous one, you need to perform more algebraic steps along the way.
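The general technique for an unknown-term question can be sketched as follows: multiply the given mean by the number of terms to get the total, then subtract the known terms. For the four known terms here, that amounts to 5 × AM – (3p + 2q – 1). The function name `missing_term` and the sample values of p, q, and the mean are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def missing_term(mean, known_terms):
    """Return the one unknown term of a set, given the set's mean and its other terms."""
    n = len(known_terms) + 1          # total number of terms, including the unknown
    total = mean * n                  # sum of all terms = mean x number of terms
    return total - sum(known_terms)   # what is left over is the unknown term

# Hypothetical values: p = 4, q = 1, and an assumed mean of 3 for the five terms.
p, q = Fraction(4), Fraction(1)
known = [p, q, p + q, p - 1]          # 4, 1, 5, 3
fifth = missing_term(Fraction(3), known)

# The same result from the variable form 5 * AM - (3p + 2q - 1).
assert fifth == 5 * Fraction(3) - (3 * p + 2 * q - 1)
print(fifth)
```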
Q: What about the concepts of median and range? How might the test-makers design a challenging GMAT question involving either of these simple concepts?

Consider this question:

If 0 < q < p, and if the median of the four terms p, q, p + q, and q – p is 2, what is the arithmetic mean (average) of the four terms?

Your first task here is to rank the four terms from least to greatest in value. Given that q < p and that p and q are both positive, q – p must be negative and hence lowest in value among the four terms, while p + q must be greatest in value. Here are the four terms, then, ranked from least to greatest in value:

(q – p) ... q ... p ... (p + q)

The median value, given as 2, is the average (arithmetic mean) of the two middle terms q and p:

(q + p) / 2 = 2, so p + q = 4
To answer the question, you can substitute the value 4 for (p + q) in the arithmetic-mean formula:

AM = [(q – p) + q + p + (p + q)] / 4 = [(q – p) + 4 + 4] / 4 = 2 + (q – p) / 4
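To see the substitution at work with concrete numbers, here is a small check. The values of p and q are hypothetical, chosen so that 0 < q < p and p + q = 4. The sketch confirms that the median is 2 and that the mean equals 2 + (q – p)/4, which is what you get by replacing each occurrence of (p + q) in the sum with 4.

```python
from fractions import Fraction
from statistics import median

def four_terms(p, q):
    """The four terms from the question: p, q, p + q, and q - p."""
    return [p, q, p + q, q - p]

# Hypothetical values satisfying 0 < q < p and p + q = 4.
for p, q in [(Fraction(3), Fraction(1)), (Fraction(5, 2), Fraction(3, 2))]:
    terms = four_terms(p, q)
    assert median(terms) == 2              # average of the two middle terms q and p
    mean = sum(terms) / len(terms)
    assert mean == 2 + (q - p) / 4         # result of substituting 4 for (p + q)
```

Because the mean still contains (q – p), it is an expression in the variables rather than a single number.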
Had the question asked instead for the range of values in the set, once you’ve determined the lowest and highest valued terms, you could express the range as the sum of the greatest term’s value, which you know is positive, and the absolute value of the lowest term’s value, which you know is negative:

Range = (p + q) + |q – p|

Since q – p is negative, |q – p| = p – q, so the range simplifies to 2p.

Q: So far you’ve used examples only in the Problem Solving format. How do the test-makers employ the Data Sufficiency format to cover the concepts of arithmetic mean, median, and range?
An arithmetic-mean question involves three values: the arithmetic mean, the number of terms, and the sum of the terms. If you’re given any two of these, you can determine the third. Thus the correct response to the following Data Sufficiency question would be (C):

How many sweaters does Hritik own?

(1) Hritik paid an average of $25 for each sweater he owns.

(2) Hritik paid a total of $240 for all of the sweaters he owns.

This is a very simple example, of course. Just as with Problem Solving questions, to enhance the difficulty of an arithmetic-mean question in the Data Sufficiency format the test-makers will often incorporate either the median or range concept into the question.

Q: Can you illustrate how an arithmetic-mean question in the Data Sufficiency format can be made more difficult by incorporating the concept of either median or range?

If Hritik paid an average of $25 per sweater for four sweaters, one of which was more expensive than any of the others, how much did he pay for the most expensive sweater?

(1) The amount Hritik paid for the most expensive sweater was $25 more than the lowest amount he paid for a sweater.

(2) Hritik paid an average of $20 per sweater for three of the sweaters.

First consider statement (1) alone, which provides the range of values in the set. Without more information about the price of individual sweaters it is not possible to answer the question.

Next consider statement (2) alone, which establishes that Hritik paid a total of $60 for three of the four sweaters. Given an average price of $25 for each of the four sweaters, the total for all four sweaters was $100. Thus the fourth sweater must have cost $40. But is that $40 sweater necessarily the most expensive one? No. For example, the three sweaters whose total cost was $60 might have cost $45, $10, and $5 individually. Thus statement (2) alone does not suffice to answer the question.

Considered together, however, statements (1) and (2) establish that the most expensive sweater must have cost $40. Why? Assume the contrary: that the $40 sweater was not the most expensive one.
Given this assumption, the most expensive sweater must have cost more than $40, so by statement (1) the least expensive sweater must have cost more than $15. But then the four sweaters would have cost more than $100 in total:

($40) + ($40+) + ($15+) + ($15+) > $100

Since the contrary assumption is impossible, the most expensive sweater must have cost $40, and the correct response to this question is (C).

Q: What other devices might the test-makers use to enhance the difficulty of Data Sufficiency questions involving arithmetic mean, median, and range?

Consider the set S: {p, q, p + q, p – 1, q + 1}

Determining the median value of these five terms requires additional information. The median value would depend on the signs of p and q and on their relative values.
For example, assuming p > q, whether (p + q) is greater or less than p and q depends on the signs of p and q. If q is positive, then (p + q) > p > q. But if q is negative and p is positive, then p > (p + q) > q.

Even if you assume that p and q are both positive, the median value might be p, (q + 1), or (p – 1), depending on the difference between p and q. If the difference is less than 1, then the median is p; if the difference is between 1 and 2, then the median is (q + 1); and if the difference is greater than 2, then the median is (p – 1):

If p – q < 1, then both (p + q) and (q + 1) are greater than p, and p > q > (p – 1).

If 1 < p – q < 2, then (p + q) > p > (q + 1) > (p – 1) > q.

If p – q > 2, then (p + q) > p > (p – 1) > (q + 1) > q.

These sorts of dynamics between variable expressions are great fodder for Data Sufficiency questions, because whether you can determine the relationships between the expressions depends on how much and what type of information you’re provided about them. For example, here’s the scenario we just looked at, transformed into a Data Sufficiency question:

Among the terms p, q, p + q, p – 1, and q + 1, which represents the median value?

(1) p > q

(2) p – q < 1

The correct answer is (E). Even considering both statements (1) and (2) together, the median value depends on the signs of p and q. For instance, if p and q are both positive, the median is p, but if both are negative, the median is q.

Q: In your last example, whether the question was answerable depended on the sign and relative values of the variable expressions. Is this typical of GMAT Data Sufficiency questions? If so, is there a systematic process for ensuring that your analysis accounts for all possible values of the variable expressions?
The question might look something like one of the following:

If..., is x > y ?

If..., does x = y ?

If..., is x > 0 ?

Your immediate reaction to this sort of question should be to consider the following value ranges along the real-number line:

x > 1

0 < x < 1

–1 < x < 0

x < –1
Why these four ranges? Well, when you perform certain operations with variables, the result depends on what range the variable falls into. For instance, when you square a number or take its cube root, whether you end up with a positive number, a negative number, a smaller number, or a larger number depends on which of the four ranges the original number falls into:

If x > 1, then 1 < ∛x < x < x².

If 0 < x < 1, then x < ∛x < 1, and x² < x.

If –1 < x < 0, then –1 < ∛x < x < 0 < x².

If x < –1, then x < ∛x < –1, and x² > 1.

As you prepare for GMAT Data Sufficiency, go through the exercise of applying exponents (odd as well as even) and roots (odd as well as even) to numbers, and note the patterns that result. In fact, any good GMAT-prep book will step you through this exercise and lay out the patterns for you, as I’ve done in "Day 7" of my book 30 Days to the GMAT CAT.
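The exercise of applying exponents and roots across the four ranges can be started with a few lines of code. The sketch below is only an illustration: it prints the square and the cube root of one hypothetical sample value from each range, and the helper `cube_root` is an assumed convenience function for handling negative inputs.

```python
import math

def cube_root(x):
    """Real cube root that also works for negative x."""
    return math.copysign(abs(x) ** (1 / 3), x)

# One hypothetical sample value from each of the four ranges.
samples = {
    "x > 1": 8.0,
    "0 < x < 1": 0.125,
    "-1 < x < 0": -0.125,
    "x < -1": -8.0,
}

for label, x in samples.items():
    print(f"{label:>10}: x = {x:>6}, x^2 = {x * x:.4f}, cube root = {cube_root(x):.4f}")
```

Swapping in other exponents and roots, or other sample values, is a quick way to see which patterns hold across an entire range and which change at –1, 0, and 1.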