Technology

How Smart is ChatGPT?

Published

April 26, 2023

Marcus Lu

Graphics/Design:

Rosey Eason

How smart is ChatGPT? We examine exam scores in this infographic


Visualizing ChatGPT’s Performance in Human Exams

ChatGPT, a language model developed by OpenAI, has become incredibly popular over the past year due to its ability to generate human-like responses in a wide range of circumstances.

In fact, ChatGPT has become so competent that students are now using it to help with their homework. This has prompted several U.S. school districts to block devices on their networks from accessing the model.

So, how smart is ChatGPT?

In a technical report released on March 27, 2023, OpenAI provided a comprehensive brief on its most recent model, known as GPT-4. Included in this report was a set of exam results, which we’ve visualized in the graphic above.

GPT-4 vs. GPT-3.5

To benchmark the capabilities of ChatGPT, OpenAI simulated test runs of various professional and academic exams. These include the SAT, the bar examination, and various Advanced Placement (AP) finals.

Performance was measured in percentiles, which were based on the most recently available score distributions for test takers of each exam type.

Percentile scoring is a way of ranking one’s performance relative to the performance of others. For instance, if you placed in the 60th percentile on a test, this means that you scored higher than 60% of test-takers.
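
To make the definition concrete, here is a minimal Python sketch of a percentile rank under the "scored higher than" reading used above. The function name and the ten scores are hypothetical and purely illustrative; real exams publish their own score distributions, and OpenAI's report may define its percentiles differently.

```python
from bisect import bisect_left


def percentile_rank(score, all_scores):
    """Return the percentage of test-takers who scored strictly lower than `score`."""
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)  # number of scores strictly below `score`
    return 100.0 * below / len(ordered)


# Hypothetical scores for ten test-takers (not real exam data).
scores = [48, 52, 55, 61, 64, 70, 73, 79, 85, 92]

print(percentile_rank(73, scores))  # 6 of 10 scored lower -> 60.0
```

A score of 73 in this made-up group sits in the 60th percentile because six of the ten test-takers scored lower.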

The following table lists the results that we visualized in the graphic.

Category	Exam	GPT-4 Percentile	GPT-3.5 Percentile
Law	Uniform Bar Exam	90	10
Law	LSAT	88	40
SAT	Evidence-based Reading & Writing	93	87
SAT	Math	89	70
Graduate Record Examination (GRE)	Quantitative	80	25
Graduate Record Examination (GRE)	Verbal	99	63
Graduate Record Examination (GRE)	Writing	54	54
Advanced Placement (AP)	Biology	85	62
Advanced Placement (AP)	Calculus	43	0
Advanced Placement (AP)	Chemistry	71	22
Advanced Placement (AP)	Physics 2	66	30
Advanced Placement (AP)	Psychology	83	83
Advanced Placement (AP)	Statistics	85	40
Advanced Placement (AP)	English Language	14	14
Advanced Placement (AP)	English Literature	8	8
Competitive Programming	Codeforces Rating	<5	<5

The scores reported above are for GPT-4 with visual inputs enabled. Please see OpenAI’s technical report for more comprehensive results.

As we can see, GPT-4 (released in March 2023) is much more capable than GPT-3.5 (released March 2022) in the majority of these exams. It did not, however, improve on GPT-3.5’s percentile in GRE Writing, AP Psychology, either AP English exam, or competitive programming.
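
The two percentile columns in the table can be compared directly. The following is a rough Python sketch with the figures copied from the table above. Codeforces is left out because it is reported only as "<5" for both models, and the function name is our own, chosen for illustration.

```python
# (exam, GPT-4 percentile, GPT-3.5 percentile), copied from the table above.
# Codeforces is omitted because both models are reported only as "<5".
RESULTS = [
    ("Uniform Bar Exam", 90, 10),
    ("LSAT", 88, 40),
    ("SAT Evidence-based Reading & Writing", 93, 87),
    ("SAT Math", 89, 70),
    ("GRE Quantitative", 80, 25),
    ("GRE Verbal", 99, 63),
    ("GRE Writing", 54, 54),
    ("AP Biology", 85, 62),
    ("AP Calculus", 43, 0),
    ("AP Chemistry", 71, 22),
    ("AP Physics 2", 66, 30),
    ("AP Psychology", 83, 83),
    ("AP Statistics", 85, 40),
    ("AP English Language", 14, 14),
    ("AP English Literature", 8, 8),
]


def percentile_gains(results):
    """Return (exam, GPT-4 minus GPT-3.5) pairs, largest gain first."""
    gains = [(exam, gpt4 - gpt35) for exam, gpt4, gpt35 in results]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


gains = percentile_gains(RESULTS)
improved = [exam for exam, gain in gains if gain > 0]
unchanged = [exam for exam, gain in gains if gain == 0]

print(gains[0])                       # ('Uniform Bar Exam', 80)
print(len(improved), len(unchanged))  # 11 4
```

Of the 15 exams with numeric percentiles, GPT-4 improved on 11 and matched GPT-3.5 on four, with the largest jump on the Uniform Bar Exam.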

Regarding AP English (and other exams where written responses were required), ChatGPT’s submissions were graded by “1-2 qualified third-party contractors with relevant work experience grading those essays”. While ChatGPT is certainly capable of producing adequate essays, it may have struggled to comprehend the exam’s prompts.

For competitive programming, GPT-4 attempted 10 Codeforces contests 100 times each. Codeforces hosts competitive programming contests where participants must solve complex problems. GPT-4’s average Codeforces rating is 392 (below the 5th percentile), while its highest on a single contest was around 1,300. For comparison, according to the Codeforces ratings page, the top-rated user is jiangly from China, with a rating of 3,841.

What’s Changed With GPT-4?

Here are some areas where GPT-4 has improved the user experience over GPT-3.5.

Internet Access and Plugins

A limiting factor with GPT-3.5 was that it didn’t have access to the internet and was only trained on data up to June 2021.

With GPT-4, users will have access to various plugins that empower ChatGPT to access the internet, provide more up-to-date responses, and complete a wider range of tasks. This includes third-party plugins from services such as Expedia, which will enable ChatGPT to book an entire vacation for you.

Visual Inputs

While GPT-3.5 could only accept text inputs, GPT-4 has the ability to also analyze images. Users will be able to ask ChatGPT to describe a photo, analyze a chart, or even explain a meme.

Greater Context Length

Lastly, GPT-4 is able to handle much larger amounts of text and keep conversations going for longer. For reference, GPT-3.5 had a maximum request length of 4,096 tokens, which is equivalent to roughly 3,000 words. GPT-4 has two variants, one that handles up to 8,192 tokens (roughly 6,000 words) and another that handles up to 32,768 tokens (roughly 24,000 words).
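
The word counts quoted here are rounded estimates. As a rough sketch, the Python below applies a rule of thumb of about 0.75 English words per token, which is an assumption on our part and not a figure from the report. The actual ratio varies with the text and the tokenizer, and the function name is hypothetical.

```python
# Assumption: roughly 0.75 English words per token (a rule of thumb, not an exact figure).
WORDS_PER_TOKEN = 0.75


def approx_words(tokens, words_per_token=WORDS_PER_TOKEN):
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


for tokens in (4096, 8192, 32768):
    print(tokens, approx_words(tokens))
# 4096 3072
# 8192 6144
# 32768 24576
```

These estimates land close to the rounded figures of 3,000, 6,000, and 24,000 words.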


Visualizing AI Patents by Country

See which countries have been granted the most AI patents each year, from 2010 to 2022.

Published

April 24, 2024

Marcus Lu

This was originally posted on our Voronoi app.

This infographic shows the number of AI-related patents granted each year from 2010 to 2022 (latest data available). These figures come from the Center for Security and Emerging Technology (CSET), accessed via Stanford University’s 2024 AI Index Report.

From this data, we can see that China first overtook the U.S. in 2013. Since then, the country has seen enormous growth in the number of AI patents granted each year.

Year	China	EU and UK	U.S.	RoW	Global Total
2010	307	137	984	571	1,999
2011	516	129	980	581	2,206
2012	926	112	950	660	2,648
2013	1,035	91	970	627	2,723
2014	1,278	97	1,078	667	3,120
2015	1,721	110	1,135	539	3,505
2016	1,621	128	1,298	714	3,761
2017	2,428	144	1,489	1,075	5,136
2018	4,741	155	1,674	1,574	8,144
2019	9,530	322	3,211	2,720	15,783
2020	13,071	406	5,441	4,455	23,373
2021	21,907	623	8,219	7,519	38,268
2022	35,315	1,173	12,077	13,699	62,264

In 2022, China was granted more patents than every other country combined.
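
Both observations can be checked against the table. The following is a minimal Python sketch with the figures copied from the table above; the function names are our own, chosen for illustration.

```python
# year: (China, EU and UK, U.S., RoW) patents granted, copied from the table above.
PATENTS = {
    2010: (307, 137, 984, 571),
    2011: (516, 129, 980, 581),
    2012: (926, 112, 950, 660),
    2013: (1035, 91, 970, 627),
    2014: (1278, 97, 1078, 667),
    2015: (1721, 110, 1135, 539),
    2016: (1621, 128, 1298, 714),
    2017: (2428, 144, 1489, 1075),
    2018: (4741, 155, 1674, 1574),
    2019: (9530, 322, 3211, 2720),
    2020: (13071, 406, 5441, 4455),
    2021: (21907, 623, 8219, 7519),
    2022: (35315, 1173, 12077, 13699),
}


def first_year_china_led(patents):
    """Return the first year in which China was granted more patents than the U.S."""
    for year in sorted(patents):
        china, _, us, _ = patents[year]
        if china > us:
            return year
    return None


def china_share(patents, year):
    """Return China's fraction of the global total for a given year."""
    return patents[year][0] / sum(patents[year])


print(first_year_china_led(PATENTS))               # 2013
print(round(china_share(PATENTS, 2022) * 100, 1))  # 56.7
```

China's 35,315 patents in 2022 amount to about 56.7% of the global total of 62,264, which is why it exceeds all other regions combined.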

While this suggests that the country is very active in researching the field of artificial intelligence, it doesn’t necessarily mean that China is the most advanced in terms of capability.

Key Facts About AI Patents

According to CSET, AI patents relate to mathematical relationships and algorithms, which are considered abstract ideas under patent law. They can also have different meanings, depending on where they are filed.

In the U.S., AI patenting is concentrated amongst large companies including IBM, Microsoft, and Google. On the other hand, AI patenting in China is more distributed across government organizations, universities, and tech firms (e.g. Tencent).

In terms of focus area, China’s patents are typically related to computer vision, a field of AI that enables computers and systems to interpret visual data and inputs. Meanwhile, America’s efforts are more evenly distributed across research fields.

Learn More About AI From Visual Capitalist

If you want to see more data visualizations on artificial intelligence, check out this graphic that shows which job departments will be impacted by AI the most.