How to Exploit the GMAT Computer-Adaptive Test (CAT) Algorithm

This tutorial addresses common questions about the GMAT computer-adaptive test (CAT) algorithm — specifically, how the Quantitative and Verbal sections adapt to your ability level, and how the GMAT scoring process accounts for this feature. The tutorial also explores what these inherent features suggest in terms of GMAT preparation and testing strategies.

Q: The GMAT Quantitative and Verbal sections are computer adaptive, meaning that they adapt to each individual test taker. But how do they do that?

A: During the Quantitative and Verbal sections, each question the test presents to you depends on your responses to earlier questions of the same type (for example, Reading Comprehension, Critical Reasoning, or Problem Solving). For each question type, the first question posed will be average in difficulty level. If you respond correctly to the question, the next question of that type will be more difficult; conversely, if you respond incorrectly, then the next question of that type will be easier. So as you proceed, you'll encounter fewer and fewer questions that are either "gimmes" or, at the other extreme, far too difficult for you.

Thus the CAT can "zero in" on your ability level with fewer questions than a non-adaptive test can. The end result is that the particular GMAT you take will be custom-built for you; no other test taker will encounter the same combination of questions.
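
The up-and-down rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-9 difficulty scale, the starting level of 5, the fixed step of one level, and the function names are all hypothetical and are not taken from the testing service.

```python
def next_difficulty(current, was_correct, lowest=1, highest=9):
    """One level up after a correct response, one level down after a miss.

    The 1-9 scale and the single-level step are assumptions for illustration.
    """
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_section(responses, start=5):
    """Return the difficulty at which each question was presented.

    responses: list of (question_type, was_correct) in the order answered.
    Each question type keeps its own difficulty level, starting at average.
    """
    level = {}
    presented = []
    for question_type, was_correct in responses:
        current = level.get(question_type, start)
        presented.append((question_type, current))
        level[question_type] = next_difficulty(current, was_correct)
    return presented


responses = [
    ("Problem Solving", True),
    ("Problem Solving", True),
    ("Data Sufficiency", False),
    ("Problem Solving", False),
    ("Data Sufficiency", True),
]
for question_type, difficulty in run_section(responses):
    print(f"{question_type}: difficulty {difficulty}")
```

In this toy run the Problem Solving questions appear at levels 5, 6, and 7, while the Data Sufficiency questions appear at levels 5 and 4, because each type is tracked separately.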

Q: Given this adaptive feature, your score must be based on more than just the number of correct responses, right? Otherwise, to maximize your score wouldn't you want to intentionally respond incorrectly to difficult questions, to keep the overall difficulty level of your test down to a level that you can handle comfortably?

A: That's right. And that's why the CAT scoring system determines your GMAT Quantitative and Verbal scores by accounting for not only the number of questions you answer correctly but also the difficulty level of the questions you answered correctly. Your reward — in terms of points — for responding correctly to a difficult question is greater than for an easier question. Of course, the scoring system for a non-adaptive test can also account for difficulty level — simply by assigning greater weight to more difficult questions. But the adaptive feature creates a certain dynamic — a self-adjustment mechanism — that continually homes in on your level of ability in each test area.
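
To see why counting correct responses alone would invite gaming, consider the following sketch. It assumes, purely for illustration, that each correct response is worth as many points as its difficulty level on a hypothetical 1-to-9 scale; the actual point values are not given here.

```python
def raw_count(answered):
    """Number of correct responses, ignoring difficulty."""
    return sum(1 for _, was_correct in answered if was_correct)


def weighted_points(answered):
    """Sum of difficulty levels of the questions answered correctly.

    Using the difficulty level itself as the point value is an assumption.
    """
    return sum(difficulty for difficulty, was_correct in answered if was_correct)


# Each entry is (difficulty, was_correct). The first test taker deliberately
# misses every harder question to keep the difficulty level down.
sandbagger = [(5, False), (4, True), (5, False), (4, True), (5, False), (4, True)]
straight = [(5, True), (6, True), (7, False), (6, True), (7, False), (6, False)]

print(raw_count(sandbagger), weighted_points(sandbagger))  # 3 12
print(raw_count(straight), weighted_points(straight))      # 3 17
```

Both hypothetical test takers answer three questions correctly, but the one who stays at a higher difficulty level earns more points under the weighted scheme.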

Q: Does the scoring system take into account any other factors as well?

A: Yes. The scoring system accounts for a third factor as well: the range of cognitive abilities tested by the questions you answered correctly — within each of these two exam sections. The Quantitative section, for example, embraces a variety of substantive areas: number theory, arithmetical operations, algebra, geometry, statistical reasoning, interpretation of graphical data, and so forth. Also, the Quantitative section employs two distinct question formats: Problem Solving and Data Sufficiency. Problem Solving questions gauge your ability to work to a numerical solution, whereas Data Sufficiency questions stress your ability to reason quantitatively. Proving to the CAT that you can handle a variety of substantive areas in both question formats will boost your GMAT score.

As for how the CAT quantifies this third factor, the calculation involves the statistical concept of standard deviation. The greater the deviation among your areas of ability, the lower your score. In other words, the GMAT rewards generalists — test takers who demonstrate a broad range of competencies — while punishing less versatile test takers who are not as well-rounded in terms of their skill sets. The significance of this third factor should not be overstated, however. The other two — number of correct responses and difficulty level — are the primary determinants of your score.
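
As a rough illustration of what "deviation among your areas of ability" means, the sketch below compares two hypothetical test takers with the same average accuracy. The area names are drawn from the list above, but the accuracy figures are invented, and how the CAT converts such a spread into a score adjustment is not specified here, so the code only measures the spread.

```python
from statistics import mean, pstdev

# Hypothetical fraction of questions answered correctly in each area.
generalist = {"algebra": 0.70, "geometry": 0.70, "statistics": 0.70, "number theory": 0.70}
specialist = {"algebra": 1.00, "geometry": 1.00, "statistics": 0.40, "number theory": 0.40}


def average(accuracy_by_area):
    """Mean accuracy across all areas."""
    return mean(list(accuracy_by_area.values()))


def spread(accuracy_by_area):
    """Population standard deviation of accuracy across areas."""
    return pstdev(list(accuracy_by_area.values()))


for name, profile in (("generalist", generalist), ("specialist", specialist)):
    print(f"{name}: average {average(profile):.2f}, spread {spread(profile):.2f}")
```

Both profiles average 0.70, but the specialist's spread is about 0.30 while the generalist's is zero. According to the description above, the wider spread is the one that would pull a score down.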

Why is the scoring system designed to account for this third factor? Because the GMAC (Graduate Management Admission Council) recognizes that crack mathematicians or grammarians don't necessarily make good business managers. It's people who can put it all together, those with an overall package of quantitative, verbal, and analytical skills, who are most likely to succeed in B-school and beyond.

Q: Given how the adaptive test moves you up and down the difficulty ladder, with point rewards dependent on difficulty level, it would seem that random guessing can do more harm than good to your score, since the odds of guessing correctly are stacked against you. Is this correct?

A: The conventional advice that you should avoid random guessing is generally good advice. The Quantitative and Verbal sections provide only 37 and 41 opportunities, respectively, for you to prove yourself to the CAT. Actually, the number is even lower, since at least a few questions in each section are pretest, or unscored, questions. A random guess will save you a bit of time, of course. But the risks far outweigh the time reward. Your chances of guessing correctly are only one in five. Moreover, incorrect responses move you down the difficulty ladder, which exerts downward pressure on your score. In the meantime, you're wasting precious questions.
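
The one-in-five odds compound quickly when you guess more than once. The following short calculation assumes five answer choices per question and purely random, independent guesses; the function names are illustrative only.

```python
from fractions import Fraction

CHOICES = 5                        # five answer choices per question
P_CORRECT = Fraction(1, CHOICES)   # chance that one random guess is correct


def chance_all_correct(guesses):
    """Probability that every one of several random guesses is correct."""
    return P_CORRECT ** guesses


def expected_correct(guesses):
    """Average number of correct responses from that many random guesses."""
    return guesses * P_CORRECT


print(chance_all_correct(1))  # 1/5
print(chance_all_correct(3))  # 1/125
print(expected_correct(5))    # 1
```

In other words, five random guesses yield one correct response on average, and the other four each push you down the difficulty ladder.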

But this advice should be refined somewhat. When it comes to resorting to guesswork, you should also consider how far along you are in the exam section. An unlucky guess early in a section is far more damaging to your score than one later in the section. Why? Toward the beginning of a section, the computer-adaptive algorithm moves you up and down the ladder of difficulty rather dramatically and quickly. In as few as four questions you can move up to the highest possible level (by responding correctly to all four questions) or down to the lowest possible level (by responding incorrectly to all of them).
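
A toy model makes the early-versus-late difference concrete. The sketch below assumes a 1-to-9 difficulty ladder starting at 5, large steps of one full level for the first four questions, and small steps of a quarter level afterward. All of these numbers are hypothetical, chosen only so that four straight correct or incorrect responses reach the top or bottom of the ladder.

```python
def step_size(question_number, early_questions=4, early_step=1.0, late_step=0.25):
    """Large moves for the first few questions, small moves afterward (assumed)."""
    return early_step if question_number <= early_questions else late_step


def final_level(responses, start=5.0, lowest=1.0, highest=9.0):
    """Walk the difficulty ladder and return the level reached at the end."""
    level = start
    for number, was_correct in enumerate(responses, start=1):
        move = step_size(number)
        level += move if was_correct else -move
        level = max(lowest, min(highest, level))  # stay on the ladder
    return level


print(final_level([True] * 4))    # 9.0  (top of the ladder in four questions)
print(final_level([False] * 4))   # 1.0  (bottom of the ladder in four questions)

early_miss = [False] + [True] * 7  # one miss on the first question
late_miss = [True] * 7 + [False]   # one miss on the eighth question
print(final_level(early_miss))     # 8.0
print(final_level(late_miss))      # 8.75
```

Under these assumptions, a single miss on the first question leaves you at level 8.0 after eight questions, whereas the same single miss on the eighth question leaves you at 8.75.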

Once the test establishes what it thinks is the appropriate difficulty level for you, the algorithm places a heavy burden on you to prove the system wrong: to show that your first few responses, whether incorrect or correct, were flukes and that you're actually quite a bit brighter, or dimmer, than the CAT believes. If you've established a low ability level and have only a few questions remaining in the section, the CAT algorithm is not going to let you take a stab at a few very difficult questions so late in the game so that you can pile up some last-minute points.

Think of a GMAT score like your college GPA. Low grades during your freshman year will establish a very low GPA, and you'll be swimming upstream the next three years to redeem yourself. But low grades during the final semester of your senior year will have almost no impact on your 4-year GPA. The analogy isn't perfect, but it's useful nonetheless in helping you appreciate that guesswork can do far more damage to your score early in a test section.
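
The arithmetic behind the GPA analogy is simple: the more grades already on your record, the less any one new grade can move the running average. The sketch below uses invented grades on a 4.0 scale, and the function name is hypothetical.

```python
def running_average_shift(previous_grades, new_grade):
    """How far one new semester grade moves the running average.

    previous_grades must contain at least one grade.
    """
    before = sum(previous_grades) / len(previous_grades)
    after = (sum(previous_grades) + new_grade) / (len(previous_grades) + 1)
    return after - before


# A 2.0 semester after one 4.0 semester versus after seven 4.0 semesters.
print(running_average_shift([4.0], 2.0))      # -1.0
print(running_average_shift([4.0] * 7, 2.0))  # -0.25
```

The same poor semester knocks a full point off the running average in the second semester but only a quarter point in the eighth.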

Q: You mentioned pretest, or unscored, questions. Why does the testing service include them on the exam, and what do they mean for the test taker?

A: The testing service is continually replacing questions in its database with new ones, if for no other reason than to prevent test-prep companies from hiring sharp test takers with keen memories to take the GMAT again in order to replicate the official test bank. Before a new question is added to the bank of scored questions, it is included in the bank of unscored questions, so that the testing service can determine its difficulty level and its integrity (both as determined by test takers' responses to the question).

Pretest questions will look just like scored questions, and you won't be able to distinguish one type from the other. So there's no sense in trying to guess which ones are unscored so that you can spend more time on scored questions.

Q: Does the computer-adaptive algorithm and scoring system you've described suggest any specific test-taking strategies?

A: Yes. Exercise special care in responding to the initial questions during the Quantitative and Verbal sections. Read very carefully, double-check calculations, and so forth. This advice should be augmented with respect to the Verbal section. During this section, take particular care with the first few questions of each type: Reading Comprehension, Critical Reasoning, and Sentence Correction. Typically, you won't have seen at least one question of each of the three types until you're at least ten questions into the Verbal section. So whenever you see that first question of each type, slow down and take your time with it.

However, I'd caution against taking the foregoing advice to the extreme, for two reasons. First, if you spend too much time on a few questions, you might not have adequate time for reasoned responses to all of the questions in the section. So it's a balancing act in terms of proper pacing. Second, intuition plays a role in multiple-choice testing, and second-guessing yourself can be counterproductive, because changing your initial response to a question more often than not results in an incorrect response.

Q: The testing service claims that the CAT's adaptive feature enables a more accurate measurement of your cognitive abilities relative to other test takers than the old paper-based test, even with fewer questions. How is this possible?

A: The primary advantage, in terms of fairness, of adaptive testing over non-adaptive testing, whether computer-based or paper-based, has to do with distribution of scores. Assume two GMAT test takers X and Y. Suppose that X has great difficulty with every question type at even low difficulty levels, while Y can handle any question type at even the highest difficulty level. Because the GMAT CAT adapts to individual ability, and awards fewer points for correct responses to easy questions than to difficult ones, the difference between GMAT scores for X and Y might be far greater than if they had both answered the same set of questions. In other words, a non-adaptive test does not allow for as wide a distribution of scores.

To the extent that the CAT creates a broader distribution of scores, it is a better means of comparing the cognitive abilities of test takers. This is a statistics concept that's really pretty easy to understand on a non-technical level. Scores for multiple test takers that all cluster closely together are less reliable for the purpose of comparing ability levels than more widely distributed scores are.

Q: Okay, I understand that the adaptive feature leads to a wider score distribution, and in turn to more reliable performance comparisons. Nevertheless, with only 27 scored Quantitative questions and 31 scored Verbal questions, not to mention the wide variety of question types within each section, how can the CAT possibly make a fair assessment of your abilities?

A: You've hit on the most common complaint about the GMAT. But this drawback is not unique to the GMAT; you can say the same about almost any standardized exam. The greater the number of questions, the more accurate the assessment, all else being equal. But all else is not necessarily equal. During a longer test, endurance becomes a factor, and one that can undermine the purpose of the test to begin with. Also, with the inception of the CAT, test takers can take the GMAT far more often than they could under the old paper-based testing system; the more often a test taker takes the GMAT, the more reliable the measurement.

In an ideal world, perhaps a more extensive battery of tests, spread over several weeks or even months and including an oral component, would be fairer. But it comes down to a trade-off between fairness and administrative efficiency. The testing service couldn't provide such a test on an affordable basis, especially considering that hundreds of thousands of GMAT tests are administered every year!

Q: Given the adaptive nature of the Quantitative and Verbal sections and the resulting scoring system, is the best way to prepare for these two sections to use software that simulates the computerized GMAT — rather than GMAT-prep books?

A: The best advice is to take a balanced GMAT-prep approach. Use books to brush up on your math skills, to review rules of grammar, to identify your weak areas, and for exercises and drills that help strengthen those weak areas. Use software to determine your optimal pace, to acclimate yourself to the computer interface, and to measure your performance — so that when you take the actual test you'll have a good idea whether you should cancel your scores and/or retake the exam.

All this is not to suggest that taking paper-based practice tests is not worthwhile. As long as they accurately reflect the style and difficulty level of the actual GMAT, they're quite useful for additional practice. By the same token, you shouldn't assume that any GMAT software product will be a reliable predictor of your performance on the actual GMAT. Some GMAT software products are better than others — both in terms of replicating the style and difficulty level of actual GMAT questions and in terms of forecasting your scores on the actual GMAT. So choose your test-prep software carefully.