Gender bias in tests: Numbers themselves prove sexist

Courtesy of Alberto G./Flickr.

I remember sitting in my high school gym as a junior, surrounded by my classmates, with pencils sharpened and ready. We were about to take the PSAT. I also remember the days I took the SAT (twice) and the ACT. I remember taking practice test after practice test—all by myself, alone in my room, because there’s really no such thing as tutoring in my town. I remember the frustration and disappointment of two subpar SAT scores that no number of practice tests seemed to improve. I remember my bewilderment as to why I was not scoring highly on the math SAT when I had always excelled in math and had even been told by my teachers that I should consider going into a math-related career.

I will never know exactly what factored into my scores on these high-stakes standardized tests—whether it was the temperature of the room, how much sleep I’d gotten, my brain’s test-taking ability or just some other, more systemic factor completely beyond my control. Regardless of my own experience, however, numerous studies have shown that standardized testing favors males over females.

In their book “Still Failing at Fairness,” David Sadker and Karen Zittleman explain, “For decades, boys scored so much higher than girls [on the PSAT] that two out of three Merit semifinalists were male.” This is a huge gap, and it has financial consequences. The inequality was so obvious that in 1989, a New York District Judge barred the state from using the test score alone to award scholarships (Sadker and Zittleman, “Still Failing at Fairness,” 2009).

The PSAT is only the beginning. While the PSAT can lead to important scholarships and honors, the SAT is used for college admissions, and the gender gap is as bad, if not worse. The authors state, “In 1967, boys scored 10 points higher [on the SAT] than girls in mathematics; by 1987, the boys’ lead grew to 24 points; between 1987 and 2006, the boys’ math lead grew again to between 33 and 41 points” (Sadker and Zittleman, 2009). Reading these statistics, I couldn’t help but think: Is there something fundamentally flawed about this test that caused me—a girl who had always done exceptionally well in math—to get a score much lower than I had hoped?

Despite the prevailing misconception, these tests are not accurate indicators of performance or ability. Colleges use the SAT as a predictor of how well students will do in college; however, girls receive better grades than boys do in their first year of college (and in the following years, and in graduate school). As Sadker and Zittleman state, “The SAT Reasoning Test (and the PSAT) consistently underpredicts female performance while over predicting male performance. In short, the PSAT and the SAT are broken” (Sadker and Zittleman, 2009). In fact, studies dating as far back as 1926 show that the test has never accurately predicted performance in college (Silverstein, “Standardized Tests: The Continuation of Gender Bias in Higher Education,” 2000). So if the SAT fails at its one job, we need to ask: Why?

One possible explanation for the gender gap is that most high-stakes tests are composed almost entirely of multiple-choice questions due to cost and time restraints. According to the Stanford Graduate School of Education, “Girls perform better on standardized tests that have more open-ended questions while boys score higher when the tests include more multiple-choice” (Stanford Graduate School of Education, “Question format may impact how boys and girls score on standardized tests, Stanford study finds,” 03.29.2018).

Other reasons for the gender gap include questions that have mostly male characters, because we do better on tests when the questions reflect ourselves; questions that are centered around topics or activities that are usually more “male” in practice such as sports and politics—though we wish such topics were not gendered, we must admit that, in society, they are; time constraints, as girls do better with more time because they are more likely to fully solve problems and think through multiple possible answers; and penalties for guessing—boys are more likely to guess, which, ironically, results in higher scores, whereas girls are more likely to heed the instructions of the test and leave the question blank, which loses them more points than if they had guessed (Sadker and Zittleman, 2009). Many of these aspects of high-stakes testing actually punish girls for traits that are more valuable in school, work and life, leaving them with lower scores and, subsequently, fewer opportunities than boys.

Despite the clear evidence that the gender gap on high-stakes tests like the SAT is due to flaws in the tests themselves rather than the intellectual ability of girls, the score disparity it produces is still used as an excuse for sexist thinking and practices. Instead of questioning why these patterns may exist, or even acknowledging that SAT scores are out of line with girls’ academic performance in math in both high school and college, Mark Perry, in a 2016 article, claims that these scores alone prove an inherent difference in mathematical ability. He states, “[T]he scientific data about gender differences in math performance would seem to present a serious challenge to…frequent claims that there are no gender differences in math performance” (American Enterprise Institute, “2016 SAT Test Results Confirm Pattern That’s Persisted for 50 Years—High School Boys are Better at Math Than Girls,” 09.27.2016).

Statements like this are objectively harmful to girls as a group, but his next claim raises even more alarm: “If there are some inherent gender differences for mathematical ability, as the huge and persistent gender differences for the math SAT test suggests, closing the STEM gender degree and job gaps may be a futile attempt in socially engineering an unnatural and unachievable outcome” (American Enterprise Institute, 09.27.2016). So not only are these high-stakes tests benefiting boys and hurting girls when it comes to scholarships and college acceptances, but they are being used to bar women from access to entire fields. Perry’s claims are not only harmful, but also incorrect; the SAT consistently underpredicts women’s performance in college math and physical science courses (American Physical Society, “Fighting the Gender Gap: Standardized Tests Are Poor Indicators of Ability in Physics,” 1996). This is an excuse to ignore the real, structural issues in a sexist system that prevent women from having equal representation across the STEM field.

I am a woman at a prestigious liberal arts college, receiving a substantial amount of financial aid and on track to graduate with a degree in Political Science. I am luckier than most. I cannot pinpoint the reasons behind my test scores, nor do I know if they would have been different in a system that was not inherently sexist. I was lucky enough to get a PSAT score that made me a National Merit Semifinalist and an ACT score that got me into Vassar. However, thousands of girls like me fall through the cracks every year. I have two incredibly bright, intelligent younger sisters who will be taking these tests in the years to come, and my own days of high-stakes testing are not over. I will likely have to take the GRE or the LSAT after I graduate, and standardized testing for graduate school exhibits the same trends and gender gaps as those for undergrad (Sadker and Zittleman, 2009).

These tests are clearly misrepresentative and flawed, and yet they are still used by almost every institution of higher education in the country to determine college acceptance and financial aid (The College Solution, “How a 1 Point Increase on the ACT can Equal $24,000,” 01.04.2013). It is time we stop using a system that produces extremely harmful consequences for girls, that has proven time and time again to be inaccurate and that reduces human beings to a single number. Vassar should join the growing number of higher education institutions that are choosing to opt out of requiring test scores for college applications. Admissions has already proven that it prioritizes a 40/60 gender ratio on campus over accepting qualified girls (The Miscellany News, “Vassar Admissions exhibits gender bias against women,” 04.10.2019). Given the biases present both structurally and within our own institution, we don’t need yet another inequity working against us.


  1. Helen, this is very well written! You look at the issue from multiple perspectives and ground it all in your personal story. I absolutely agree that the tests are profoundly biased (by gender, of course, but also by race and class at the very least). What makes them all the worse is their claim to be objective. In the end, what these tests measure best is performance on these tests, a solipsism on which a supposed meritocracy is built. It is more than ironic that Vassar, given its history, should fall for this scam.

  2. Back in the day I scored 791 on Math and 609 on Verbal

    In no way is the test sexist, what is sexist is assuming all girls will do worse simply because they are girls and girls have some quirk, some “difference”

    I took the test in 1986. We need these tests!

    BTW I’m happy I avoided essays < my worst area!

    • Jen,

      Did you read the article? It never said that “girls do worse simply because they are girls.” Quite the opposite, in fact. They do worse because there is bias built into the tests.

      Someone as intelligent as yourself should know that one girl — in this case, you — doing very well on the test does not disprove the fact that the test is biased against girls in general. Congrats on your scores, but don’t try to use your personal experience to negate the facts.

  3. Women in other countries score much higher in math than American women; the reason is that women in other countries aren’t constantly told they are worse at math.

    There is a gender bias in America that is beyond belief.
