
AI-generated meme captions are funnier on average, but humans still craft the funniest memes and AI-assisted collaborations yield the most creative, shareable ones.

A new study exploring how AI and humans perform in meme caption creation finds that AI-generated captions on popular meme templates tend to land higher on humor, creativity, and shareability on average than captions crafted by people. Yet the strongest individual memes—those that are funniest or most distinctive—still come from human creativity, sometimes with AI collaboration. The findings present a nuanced view: AI can boost productivity and broaden appeal, but human nuance and personal experience remain crucial for standout content. The research, conducted by an international team, also highlights how context, collaboration, and the specifics of the task shape the effectiveness of AI-assisted humor.

Study overview and research context

Memes have become a core form of online communication, blending cultural references, rapid iteration, and social signaling. As AI tools grow more capable, a natural question emerges: can machine-generated humor match or surpass human creativity when the goal is to produce caption text for familiar meme formats? This study addresses that question by comparing meme-captioning performance under three distinct conditions: individuals working solo, individuals working with large language models (LLMs), and captions generated entirely by an AI model without human input. The research team sought to quantify and compare outcomes across multiple dimensions of humor, creativity, and shareability.

The investigators chose three well-known meme categories—work-related scenarios, food-related jokes, and sports-themed humor—to ground the analysis in everyday contexts that audiences commonly recognize. This contextual framing enabled the researchers to test whether certain domains are more amenable to AI-generated humor than others and whether context influences how audiences perceive funny content. Notably, the study did not use AI-generated images; instead, it relied on widely circulated, pre-existing meme templates. Human participants and AI systems were tasked with generating captions to accompany those images, allowing the team to isolate caption-level creativity and humor from image-level design decisions.

A notable element of the study is its framing around a modern reinterpretation of the Turing test. The researchers and commentators frequently discuss whether machine-generated content can reach or surpass human benchmarks in a domain as subjective as humor. Echoing historical conversations about distinguishing machine output from human work, the study contributes to the ongoing dialogue about when AI content feels convincingly human and when it does not. One prominent voice in the discourse—a well-known AI researcher—joked that the “meme Turing Test” might have been passed, underscoring the surprising strength of AI-generated captions in aggregate metrics while also inviting careful scrutiny of what “success” means in creative tasks.

The international research consortium behind the work includes scholars from three technical institutions in Europe, each bringing expertise in computational creativity, human-computer interaction, and AI-assisted workflows. The collaboration reflects a broader trend in which top universities cross-pollinate approaches to evaluate AI’s impact on everyday creative tasks. The study’s central aim was to produce a nuanced portrait of AI’s capabilities: to show where AI can reliably complement human work and where human insight remains indispensable for high-impact meme content.

Methodology: test design, participants, and evaluation

The study’s design centers on a controlled comparison across three distinct creation modes: human-only captioning, human captioning with AI assistance, and fully AI-generated captions. In the human-with-AI condition, participants collaborated with an advanced language model, specifically a GPT-4o variant, to brainstorm, refine, and produce caption options. In the AI-only condition, captions were produced entirely by the AI model without human curation, to evaluate the model’s autonomous creative potential on familiar formats. These modes were applied to captioning tasks for three widely recognized meme templates, enabling consistent cross-condition comparisons.

To gauge performance, the researchers assembled a diverse pool of crowd evaluators who assessed the memes along three core dimensions: humor, creativity, and shareability. Humor captures the perceived hilarity or comedic quality of the caption in the context of the image; creativity measures novelty, originality, and inventive wordplay; shareability reflects the meme’s potential to spread across networks, driven by relatability, timeliness, and cultural resonance. Shareability, in particular, was defined operationally by the study as the observable likelihood that a meme would be widely circulated, given the interaction of humor, relatability, and topical relevance.

In terms of data collection, the study used widely circulated, pre-existing meme templates rather than creating new images. This choice ensured that the evaluation focused on caption quality and the human-AI interaction in content creation rather than on image-generation capabilities. Captions were produced under all three conditions for the same set of templates, ensuring a fair comparison across modalities. The evaluation phase relied on crowdsourced participants to rate the memes, providing a broad, diverse set of perspectives on humor and appeal.

The research team also captured process-oriented data to understand the dynamics of AI-assisted creation. They tracked the number of meme ideas generated by each approach, the level of effort reported by participants, and subjective measures of ownership over the final product. The intention was not only to assess final scores but also to uncover how AI tools alter the creative workflow, the cognitive load on creators, and the motivational aspects that influence ongoing engagement with AI-assisted tasks.

A critical nuance in the methodology concerns the interpretation of the results. AI-generated memes tended to score higher on average across humor, creativity, and shareability, suggesting that AI’s broad training on vast internet-scale humor patterns enables it to produce broadly appealing captions. However, the researchers emphasize that this does not mean AI captures the full spectrum of human wit or personal experience. In fact, when looking at the best-performing individual memes, humans consistently produced the most humorous content, while human-AI collaborations tended to yield memes with the richest creative expression and greater shareability, albeit not necessarily the highest average scores across all metrics.
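The gap between average performance and peak performance can be made concrete with a small sketch. The ratings below are entirely hypothetical, not the study's data; they simply show how one condition can win on mean score while another still owns the single best meme:

```python
from statistics import mean

# Hypothetical crowd ratings (1-7 scale) for memes in each creation mode.
# Illustrative numbers only, not figures from the study.
ratings = {
    "ai_only":  [5.1, 4.9, 5.0, 5.2, 4.8],   # broadly appealing, consistent
    "human":    [3.5, 6.8, 2.9, 4.0, 3.8],   # uneven, but holds the top meme
    "human_ai": [4.5, 4.9, 5.5, 4.2, 4.7],   # more ideas, middling peaks
}

for mode, scores in ratings.items():
    print(f"{mode:9s}  mean={mean(scores):.2f}  best={max(scores):.1f}")

# The AI-only condition has the highest mean, yet the human condition
# contains the single highest-rated meme: the average-vs-peak pattern
# described in the study.
```

Reporting both the mean and the maximum per condition is what lets the study say "AI wins on average, humans win at the top" without contradiction.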

Findings: performance across categories, averages vs. best memes

A striking finding from the study is that AI-generated captions, when evaluated in aggregate, tended to outperform human-generated captions on three key dimensions—humor, creativity, and shareability. Across the three meme categories, AI-driven captions demonstrated a broad appeal, often aligning with widely recognizable humor patterns and contemporary cultural references. This broad appeal translated into higher average scores when crowdsourced raters assessed the captions’ humor, novelty, and potential to circulate widely in online communities.

The results, however, are not a blanket endorsement of AI supremacy in meme creation. While the AI-only condition produced higher average scores, the strongest individual memes—those that stood out for being exceptionally funny or uniquely clever—most often originated from human creativity, either in solo human generation or in close collaboration with AI. In effect, AI’s strength lies in producing a wide base of content that resonates broadly, but humans maintain the edge when it comes to peak-quality outputs that push the boundaries of humor and originality.

An important nuance emerges when comparing performance by category. The study found that work-related memes frequently ranked higher for humor and shareability than memes centered on food or sports. This suggests that context matters: the work context may be more conducive to shared experiences around productivity, office culture, or workplace frustrations, which AI can tap into through patterns learned from large-scale data. Conversely, food- and sports-themed memes can hinge more on niche or highly personal references, where individual life experiences and insider jokes may deliver outsized laughter but at the cost of broad appeal. The researchers underscore that context—not merely the AI’s technical capability—shapes the effectiveness of meme humor.

In addition to scoring outcomes, the study highlights the practical implications of AI-assisted workflows. Participants using AI assistance reported generating a larger set of meme ideas and described the process as easier and less labor-intensive. Yet, the analysis notes that the productivity gains did not translate into consistently higher quality memes when averaged across all outputs. The researchers state, in their own words, that “the increased productivity of human-AI teams does not lead to better results—just to more results.” This finding points to a key tension in creative collaboration: quantity vs. quality, and the role of human curation in elevating content beyond mass appeal toward standout impact.

Ownership and creative satisfaction also surfaced as important variables. Participants who used AI assistance reported feeling somewhat less ownership over their creations than solo creators did. Ownership, a known driver of intrinsic motivation in creative work, can influence willingness to invest in revisions, experimentation, and long-term engagement with AI-enabled workflows. The study therefore suggests that creators contemplating AI tools should consider how to balance AI assistance with a sense of authorship and personal control, in order to sustain motivation and creative drive.

The research team also presents concrete visualizations from the experiment, illustrating the comparative performance of memes generated by AI, humans, and human-AI collaboration across the three key metrics of humor, creativity, and shareability. These visuals emphasize the overarching pattern: AI provides a strong baseline of broadly appealing content, humans push the boundaries of quality, and collaboration tends to generate the most diverse set of ideas, even if not always the highest aggregate scores.

Best memes vs. AI averages: caveats and interpretive notes

A central takeaway from the study concerns the balance between broad appeal and peak originality. While AI-generated captions demonstrate consistent strength in average performance, the study consistently shows that the top-tier memes—the ones most likely to be celebrated, remembered, and shared in the long term—are predominantly crafted by humans. This distinction matters for content creators who rely on memes for audience engagement, brand voice, and cultural impact. AI’s ability to produce large volumes of content with a high baseline of humor and relatability can fuel growth and experimentation, but human insight remains critical for crafting moments that truly resonate, surprise, or subvert expectations.

The researchers discuss the role of AI as a productivity amplifier, rather than a wholesale replacement for human creativity. AI’s strength lies in identifying patterns of humor that have broad resonance across diverse audiences, enabling creators to explore a wide array of caption options quickly. Human curators, in turn, can select, refine, and push captions toward sharper timing, nuanced irony, or culturally specific references that AI may overlook or underemphasize. This dynamic—AI as a generator of breadth plus humans as curators of depth—appears to be the most effective model for meme-captioning tasks.

The study is careful to specify that AI did not generate the images used in the experiments. Instead, it produced captions for already popular meme templates. This separation between image and caption generation helps isolate caption-level creativity and humor while acknowledging that fully end-to-end AI meme generation (including image synthesis) would introduce other variables and potential biases. As such, conclusions about AI’s capability should be understood within the scope of caption generation for established templates rather than as a verdict on AI-driven image-meme production as a whole.

Another important caveat concerns the evaluation method. Crowdsourced raters offer a broad, democratic perspective on humor but bring with them subjectivity and potential biases toward more mainstream, widely understood humor. This bias toward accessibility may favor AI-generated captions that appeal to a broad audience while underrepresenting more niche or culturally specific humor that could emerge in expert or dedicated communities. The researchers acknowledge this limitation and propose future studies that involve expert panels or demographically targeted cohorts to better capture subtleties of humor and creativity across diverse audiences.

The paper also flags the relatively short duration of meme-caption creation sessions. In real-world settings, creators often iterate over longer timespans, refine prompts, and experiment with more elaborate collaborative workflows. The current study’s design raises questions about how extended engagement with AI tools and strategic prompting might further influence the balance between AI productivity and human-driven quality. Future research could explore longer-term collaborations, different prompt strategies, and more nuanced human-AI roles, including roles that empower domain experts to curate AI-generated options more effectively.

Additionally, the researchers propose exploring scenarios in which an AI model rapidly generates multiple caption ideas, with humans acting as curators who select, refine, and assemble the best content. This curation-centric workflow could reveal whether AI’s capacity to generate breadth, when paired with human refinement, yields higher-impact memes than either approach alone. In the near term, however, the evidence points to humans maintaining a decisive edge in the most entertaining meme captions and in creating content that uniquely resonates with audiences.
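The curation-centric workflow the researchers propose can be sketched minimally as generate-then-select. The generator below is a stub standing in for an LLM call, and the rating function is a dummy proxy for human judgment; names and details are illustrative assumptions, not the study's implementation:

```python
import random

def generate_candidates(template: str, n: int = 8) -> list[str]:
    """Stand-in for an LLM call: returns n rough caption drafts.
    A real workflow would prompt a language model here."""
    stems = ["When the deadline moves up", "Me pretending to listen",
             "That moment the build finally passes", "Monday energy"]
    return [f"{random.choice(stems)} ({template}, draft {i})" for i in range(n)]

def curate(candidates: list[str], rate, k: int = 2) -> list[str]:
    """Human-in-the-loop step: score each draft and keep the top k."""
    return sorted(candidates, key=rate, reverse=True)[:k]

random.seed(0)  # reproducible demo
drafts = generate_candidates("hypothetical-template")
# A human curator would supply real judgments; caption length is a dummy proxy.
shortlist = curate(drafts, rate=len, k=2)
print(f"{len(drafts)} drafts generated, {len(shortlist)} kept after curation")
```

The design choice mirrors the study's division of labor: the model supplies breadth cheaply, while the human selection step is where depth and cultural judgment enter.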

Implications for creative practice, marketing, and platform dynamics

The study’s findings have meaningful implications for the way individuals, teams, and organizations approach meme creation in an age of AI-enabled tools. For content creators and social-media professionals, the results suggest a practical workflow: leverage AI to generate wide-ranging caption options quickly, then apply human judgment to select, refine, and tailor the best candidates to the brand voice, audience sensibilities, and cultural moment. In this model, AI accelerates productivity and helps teams explore diverse tonalities and formats, while human oversight ensures the content remains sharp, emotionally resonant, and contextually appropriate.

From a branding perspective, the results underscore the value of strategic curation. Brands can benefit from AI’s capacity to surface a broad spectrum of ideas that might otherwise go unexplored, but the final selection should reflect the brand’s identity, values, and target audience. The study’s observation about ownership further suggests that marketing and creative teams should design AI-assisted processes that preserve a sense of authorship and creative agency among contributors. When creators feel ownership over the content they produce, they may be more willing to invest in iterative improvements and sustained experimentation with AI-assisted workflows.

For platforms hosting meme content, the findings offer a balanced lens on content discovery, moderation, and trend amplification. If AI can reliably generate broadly appealing captions, platforms could curate or promote AI-assisted creations as a distinct content stream, while still prioritizing human-originated memes that often drive deeper engagement and cultural commentary. The distinction between AI-generated baseline content and human-curated standout memes could shape how platforms design features for collaboration, prompt sharing, and remixing while preserving opportunities for authentic human expression.

The research also has implications for education and research in computational creativity. It highlights the importance of designing rigorous evaluation frameworks that separate content quality from production efficiency and that account for the social and cultural dimensions of humor. The study’s multi-metric approach—assessing humor, creativity, and shareability—serves as a blueprint for future inquiries into AI-assisted creative tasks, offering a way to quantify subjective experiences without reducing creativity to a single dimension.

In terms of AI policy and ethics, the findings invite reflection on authorship, attribution, and the evolving rights of human creators who collaborate with machines. The sense of ownership and motivation linked to creative work may be influenced by how teams structure prompts, feedback loops, and the degree to which AI suggestions are treated as collaborative partners rather than as free-standing producers. As AI systems become embedded more deeply in creative workflows, clear guidelines and organizational practices will be essential to maintain morale, protect intellectual property, and ensure that human creativity remains central to the process.

Limitations and avenues for future research

The study acknowledges several limitations that leave room for further exploration. First, the relatively short duration of caption creation sessions may not fully reflect real-world workflows where teams iterate over longer periods, test multiple rounds of prompts, and refine their approach based on audience feedback. Longer experimental horizons could reveal whether extended use of AI tools enhances or diminishes the overall quality of memes, and whether improved prompting strategies enable humans to extract even higher-value outputs from AI systems.

Second, crowdsourced evaluation introduces subjectivity and potential biases toward mainstream humor, which could skew results in favor of AI-generated captions that align with broad cultural norms. Future work could incorporate expert panels, demographic stratification, or audience-specific cohorts to capture a wider range of humor styles, including niche, culturally specific, or subcultural humor that might not be widely recognized by a general audience.

Third, the study’s design deliberately used established meme templates rather than generating new images. While this approach isolates caption-level creativity, it also omits considerations related to synchronized image and text generation. Subsequent research could examine end-to-end AI meme generation, exploring how integrated image synthesis with caption writing impacts the balance of humor, creativity, and shareability, as well as how such models handle timing, visual gag execution, and cultural context.

Fourth, there is interest in exploring AI models beyond GPT-4o, including other large language model families with varying training data, architectural approaches, and bias profiles. Comparative studies could assess whether certain models excel at particular humor styles or contexts, and how different prompting strategies influence the quality and creativity of AI-generated captions.

Finally, there is value in studying long-term effects of AI collaboration on creative skills and industry dynamics. Longitudinal studies could track how repeated AI-assisted meme creation influences creators’ confidence, originality, and willingness to take creative risks. They could also examine how teams adapt their workflows over time, how ownership perceptions evolve, and how AI-assisted processes shape career trajectories within creative fields.

Practical takeaways for creators, educators, and researchers

  • Use AI to generate broad caption options quickly: Leverage AI to seed a wide array of ideas, explore different tonalities, and map a larger creative space in a shorter period. This approach can help creators jumpstart projects, identify promising directions, and speed up brainstorming sessions.

  • Rely on human curation to elevate quality: Treat AI outputs as raw material that requires thoughtful selection, refinement, and alignment with brand voice and audience expectations. Human judgment remains crucial for high-impact memes that combine timing, irony, and cultural specificity.

  • Balance ownership with collaboration: Cultivate workflows that preserve a sense of authorship and control for individual creators, even when AI assists with generation. This balance helps sustain motivation, encourages experimentation, and fosters a creative culture that values human input.

  • Consider category-specific strategies: Recognize that humor effectiveness may vary by context. Work-related memes might offer more opportunities for broad comedic resonance, while food and sports memes may benefit from targeted references or insider humor that resonates with particular communities.

  • Plan for ethical and legal considerations: As AI-assisted content proliferates, creators should be mindful of copyright, attribution, and the responsible use of AI-generated material. Establish clear guidelines for when and how AI-generated captions are used, cited, or remixed within collaborative projects.

Conclusion

The study provides a layered portrait of AI’s role in meme caption creation. It demonstrates that AI-generated captions can outperform human-generated ones on average in humor, creativity, and shareability when evaluated across broad audiences, underscoring AI’s potential as a powerful productivity tool in creative disciplines. Yet the research also reinforces the enduring value of human originality, showing that the most entertaining and distinctive memes often arise from human effort, whether alone or in collaboration with AI. The results point to a practical, symbiotic model: AI expands the landscape of possible captions, enabling rapid ideation and broad appeal, while humans bring sharper timing, personal experience, and cultural nuance that drive standout meme moments.

The study’s nuanced findings invite creators, brands, and platforms to adopt AI as a strategic ally rather than a wholesale replacement for human creativity. By embracing AI-assisted workflows, preserving a sense of authorship, and prioritizing high-impact, culturally aware humor, teams can blend speed with depth—producing memes that not only entertain widely but also leave a memorable imprint on audiences. As AI tools continue to evolve, ongoing research will be essential to refine collaboration models, expand assessment methods, and deepen our understanding of how machines and humans co-create the memes that define online culture.