---
headline: "Study of 100,000 People Finds Generative AI Now Outperforms Humans on Creativity Tests"
slug: genai-outperforms-humans-creativity
category: research
story_number: 11
date: 2026-05-27
---
The largest-ever comparison of human and machine creativity reveals AI models like GPT-4 can now surpass the average person on divergent thinking tasks -- but the most imaginative humans still reign supreme.
In what researchers are calling a turning point for artificial intelligence, a landmark study published in Scientific Reports has found that leading AI models now outperform the average human on standardized creativity tests. The study, which compared several large language models against more than 100,000 human participants, represents the largest direct comparison of human and machine creativity ever conducted.
The research was led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, with co-first authors Antoine Bellemare-Pepin and François Lespinasse. The team also included Yoshua Bengio, the Turing Award-winning pioneer of deep learning and founder of Mila, Quebec's AI Institute. Collaborating institutions spanned Université Concordia, the University of Toronto Mississauga, and Google DeepMind.
"Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks," Professor Jerbi explained. "This result may be surprising -- even unsettling -- but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans."
How They Measured Creativity
The primary tool was the Divergent Association Task (DAT), a well-established instrument in cognitive psychology developed by study co-author Jay Olson of the University of Toronto. The DAT asks participants -- human or AI -- to generate ten words with meanings as distinct from one another as possible. Scores are calculated based on the semantic distance between the chosen words, with greater distance indicating higher divergent creativity.
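To make the scoring idea concrete, here is a minimal Python sketch of a DAT-style score. It is an illustration, not the study's code: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up, and averaging the cosine distance over every word pair and scaling by 100 are assumptions chosen for readability. A real scorer would rely on word embeddings learned from a large text corpus.

```python
import itertools
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings):
    """Average pairwise cosine distance between the words, scaled by 100.

    The averaging and the factor of 100 are assumptions for this sketch.
    """
    vectors = [embeddings[w] for w in words]
    pairs = list(itertools.combinations(vectors, 2))
    mean_distance = sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)
    return 100.0 * mean_distance


# Hypothetical toy embeddings: related words point in similar directions.
TOY_EMBEDDINGS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],
    "kitten": [0.95, 0.05, 0.0],
    "galaxy": [0.0, 1.0, 0.0],
    "justice": [0.0, 0.0, 1.0],
}

similar = dat_style_score(["cat", "dog", "kitten"], TOY_EMBEDDINGS)
diverse = dat_style_score(["cat", "galaxy", "justice"], TOY_EMBEDDINGS)
print(f"related words:   {similar:.1f}")
print(f"unrelated words: {diverse:.1f}")
```

In this toy setup, three closely related words score low while three unrelated words score much higher, which is the behavior the task rewards.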
Critically, DAT performance in humans correlates strongly with results on other established creativity assessments used in idea generation, writing, and creative problem solving -- which suggests the task captures genuine creative cognition, not merely vocabulary breadth. The test takes only two to four minutes and can be administered online, which enabled the researchers to amass their sample of more than 100,000 human participants.
The researchers did not stop at word association. They extended the comparison to creative writing challenges including haiku composition, movie plot summaries, and flash fiction. Across these richer tasks, the pattern held: AI systems sometimes exceeded average human performance, but the most skilled human creators consistently delivered stronger and more original work.
The Top Ten Percent Pull Away
The headline finding -- that models like GPT-4 exceeded average human creativity scores -- tells only part of the story. When the researchers looked only at the more creative half of human participants, that group's average score surpassed every AI model tested. Among the top 10 percent of participants, the gap widened dramatically.
The study also revealed that AI creativity is not fixed. Adjusting a model's temperature parameter -- which controls how predictable or adventurous its outputs are -- significantly affected creative performance. Higher temperatures produced more varied and original associations. Prompt design mattered too: instructions based on etymology, encouraging models to draw on word origins and structure, yielded notably higher creativity scores.
These findings underscore that AI creativity depends heavily on human guidance, making the interaction between user and model a central element of the creative process.
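The temperature effect described above can be sketched in a few lines of Python. This assumes the common scheme in which a model's raw scores (logits) are divided by the temperature before a softmax turns them into probabilities; the study does not describe any particular model's internals, and the `LOGITS` values here are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical raw scores for four candidate next words.
LOGITS = [4.0, 2.0, 1.0, 0.5]

low = softmax_with_temperature(LOGITS, 0.5)
high = softmax_with_temperature(LOGITS, 1.5)

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
print("entropy low vs high:", round(entropy(low), 3), round(entropy(high), 3))
```

At the lower temperature nearly all of the probability sits on the top-ranked word; at the higher one the less likely words get a real chance of being picked, which is one plausible reason outputs become more varied.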
Why This Matters
The study arrives at a moment when creative industries are grappling with AI's expanding role in writing, design, music, and visual art. For educators, it raises questions about how creativity tests should be interpreted in a world where machines can pass them. For businesses, it suggests that AI tools can genuinely augment brainstorming and ideation -- but cannot yet replace the spark that distinguishes truly exceptional creative work.
"Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition," Professor Jerbi said. "Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create -- for those who choose to use it."
The research also carries methodological significance. By establishing a rigorous, large-scale framework for comparing human and AI creativity using identical instruments, the team has set a benchmark that future studies can build upon as models continue to improve.
What to Watch Next
As frontier models grow more capable with each generation, AI's lead over the average person on these tests will likely widen. The more consequential question is whether AI can close the distance to peak human creativity, the kind of originality that produces great literature, groundbreaking design, and paradigm-shifting ideas. For now, that territory remains distinctly human. But with researchers now equipped to measure the race precisely, the next round of results may arrive sooner than anyone expects.
The paper "Divergent creativity in humans and large language models" was published in Scientific Reports on January 21, 2026. DOI: 10.1038/s41598-025-25157-3
"Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create."— Karim Jerbi, Professor of Psychology, Universite de Montreal