{"id":13612,"date":"2025-02-24T10:00:00","date_gmt":"2025-02-24T10:00:00","guid":{"rendered":"https:\/\/modernsciences.org\/staging\/4414\/?p=13612"},"modified":"2025-02-13T17:56:39","modified_gmt":"2025-02-13T17:56:39","slug":"openai-deep-research-agent-ai-tool-limitations-february-2025","status":"publish","type":"post","link":"https:\/\/modernsciences.org\/staging\/4414\/openai-deep-research-agent-ai-tool-limitations-february-2025\/","title":{"rendered":"OpenAI\u2019s new \u2018deep research\u2019 agent is still just a fallible tool \u2013 not a\u00a0human-level\u00a0expert"},"content":{"rendered":"\n<div class=\"theconversation-article-body\">\n    <figure>\n      <img  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/648223\/original\/file-20250211-15-slwsyu.jpg?ixlib=rb-4.1.0&#038;rect=0%2C0%2C3818%2C2674&#038;q=45&#038;auto=format&#038;w=754&#038;fit=clip\" >\n        <figcaption>\n          \n          <span class=\"attribution\"><a class=\"source\" href=\"https:\/\/unsplash.com\/photos\/brown-wooden-drawer-lRoX0shwjUQ\" target=\"_blank\" rel=\"noopener\">Jan Antonin Kolar\/Unsplash<\/a><\/span>\n        <\/figcaption>\n    <\/figure>\n\n  <span><a href=\"https:\/\/theconversation.com\/profiles\/raffaele-f-ciriello-1079723\" target=\"_blank\" rel=\"noopener\">Raffaele F Ciriello<\/a>, <em><a href=\"https:\/\/theconversation.com\/institutions\/university-of-sydney-841\" target=\"_blank\" rel=\"noopener\">University of Sydney<\/a><\/em><\/span>\n\n  <p>OpenAI\u2019s \u201c<a href=\"https:\/\/openai.com\/index\/introducing-deep-research\/\" target=\"_blank\" rel=\"noopener\">deep research<\/a>\u201d is the latest artificial intelligence (AI) tool <a 
href=\"https:\/\/www.forbes.com\/sites\/quickerbettertech\/2025\/02\/09\/business-tech-news-openai-launches-a-powerful-new-ai-research-tool\/\" target=\"_blank\" rel=\"noopener\">making waves<\/a> and promising to do in minutes what would take hours for a human expert to complete.<\/p>\n\n<p>Bundled as a feature in ChatGPT Pro and <a href=\"https:\/\/www.theguardian.com\/technology\/2025\/feb\/03\/openai-deep-research-agent-chatgpt-deepseek\" target=\"_blank\" rel=\"noopener\">marketed<\/a> as a research assistant that can match a trained analyst, it autonomously searches the web, compiles sources and delivers structured reports. It even <a href=\"https:\/\/www.zdnet.com\/article\/openais-new-deep-research-agent-can-do-in-5-minutes-what-might-take-you-hours\/\" target=\"_blank\" rel=\"noopener\">scored<\/a> 26.6% on Humanity\u2019s Last Exam (HLE), a tough AI benchmark, <a href=\"https:\/\/www.techradar.com\/computing\/artificial-intelligence\/openais-deep-research-smashes-records-for-the-worlds-hardest-ai-exam-with-chatgpt-o3-mini-and-deepseek-left-in-its-wake\" target=\"_blank\" rel=\"noopener\">outperforming<\/a> many models.<\/p>\n\n<p>But deep research doesn\u2019t quite live up to the hype. While it produces polished reports, it also has serious flaws. <a href=\"https:\/\/www.theverge.com\/openai\/607587\/chatgpt-deep-research-hands-on-section-230\" target=\"_blank\" rel=\"noopener\">According to journalists<\/a> <a href=\"https:\/\/www.platformer.news\/chatgpt-deep-research-hands-on\/\" target=\"_blank\" rel=\"noopener\">who\u2019ve tried it<\/a>, deep research can miss key details, struggle with recent information and sometimes invent facts.<\/p>\n\n<p>OpenAI flags this when listing the limitations of its tool. 
<a href=\"https:\/\/openai.com\/index\/introducing-deep-research\/\" target=\"_blank\" rel=\"noopener\">The company also says it<\/a> \u201ccan sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations\u201d.<\/p>\n\n<p>It\u2019s no surprise that unreliable data can slip in, since AI models don\u2019t \u201cknow\u201d things in the same way humans do.<\/p>\n\n<p>The idea of an AI \u201cresearch analyst\u201d also raises a slew of questions. Can a machine \u2013 no matter how powerful \u2013 truly replace a trained expert? What would be the implications for knowledge work? And is AI really helping us think better, or just making it easier to stop thinking altogether?<\/p>\n\n<h2 id=\"what-is-deep-research-and-who-is-it-for\">What is \u2018deep research\u2019 and who is it for?<\/h2>\n\n<p>Marketed towards professionals in finance, science, policy, law and engineering, as well as academics, journalists and business strategists, deep research is the latest \u201c<a href=\"https:\/\/openai.com\/index\/introducing-deep-research\/\" target=\"_blank\" rel=\"noopener\">agentic experience<\/a>\u201d OpenAI has rolled out in ChatGPT. It promises to do the heavy lifting of research in minutes.<\/p>\n\n<p>Currently, deep research is only available to ChatGPT Pro users in the United States, at a cost of US$200 per month. OpenAI <a href=\"https:\/\/openai.com\/index\/introducing-deep-research\/\" target=\"_blank\" rel=\"noopener\">says<\/a> it will roll out to Plus, Team and Enterprise users in the coming months, with a more cost-effective version planned for the future.<\/p>\n\n<p>Unlike a standard chatbot that provides quick responses, deep research follows a multi-step process to produce a structured report:<\/p>\n\n<ol>\n<li>The user submits a request. 
This could be anything from a market analysis to a legal case summary.<\/li>\n<li>The AI clarifies the task. It may ask follow-up questions to refine the research scope.<\/li>\n<li>The agent searches the web. It autonomously browses hundreds of sources, including news articles, research papers and online databases.<\/li>\n<li>It synthesises its findings. The AI extracts key points, organises them into a structured report and cites its sources.<\/li>\n<li>The final report is delivered. Within five to 30 minutes, the user receives a multi-page document \u2013 <a href=\"https:\/\/futureofbeinghuman.com\/p\/can-ai-write-your-phd-dissertation\" target=\"_blank\" rel=\"noopener\">potentially even a PhD-level thesis<\/a> \u2013 summarising the findings.<\/li>\n<\/ol>\n\n<p>At first glance, it sounds like a dream tool for knowledge workers. A closer look reveals significant limitations.<\/p>\n\n<p><a href=\"https:\/\/www.theverge.com\/openai\/607587\/chatgpt-deep-research-hands-on-section-230\" target=\"_blank\" rel=\"noopener\">Many<\/a> <a href=\"https:\/\/www.nature.com\/articles\/d41586-025-00377-9\" target=\"_blank\" rel=\"noopener\">early<\/a> <a href=\"https:\/\/www.datacamp.com\/blog\/deep-research-openai\" target=\"_blank\" rel=\"noopener\">tests<\/a> have exposed shortcomings:<\/p>\n\n<ul>\n<li><strong>It lacks context.<\/strong> AI can summarise, but it doesn\u2019t fully understand what\u2019s important.<\/li>\n<li><strong>It ignores new developments.<\/strong> It has missed major legal rulings and scientific updates.<\/li>\n<li><strong>It makes things up.<\/strong> Like other AI models, it can confidently generate false information.<\/li>\n<li><strong>It can\u2019t tell fact from fiction.<\/strong> It doesn\u2019t distinguish authoritative sources from unreliable ones.<\/li>\n<\/ul>\n\n<p>While OpenAI claims its tool rivals human analysts, AI inevitably lacks the judgement, scrutiny and expertise that make good research valuable.<\/p>\n\n<h2 
id=\"what-ai-cant-replace\">What AI can\u2019t replace<\/h2>\n\n<p>ChatGPT isn\u2019t the only AI tool that can scour the web and produce reports with just a few prompts. Notably, a mere <a href=\"https:\/\/arstechnica.com\/ai\/2025\/02\/after-24-hour-hackathon-hugging-faces-ai-research-agent-nearly-matches-openais-solution\/\" target=\"_blank\" rel=\"noopener\">24 hours after OpenAI\u2019s release<\/a>, Hugging Face released a free, open-source version that nearly matches its performance.<\/p>\n\n<p>The biggest risk of deep research and other AI tools marketed for \u201chuman-level\u201d research is the illusion that AI can replace human thinking. AI can summarise information, but it can\u2019t question its own assumptions, highlight knowledge gaps, think creatively or understand different perspectives.<\/p>\n\n<p>And AI-generated summaries don\u2019t match the <a href=\"https:\/\/futureofbeinghuman.com\/p\/can-ai-write-your-phd-dissertation\" target=\"_blank\" rel=\"noopener\">depth<\/a> of a <a href=\"https:\/\/www.tandfonline.com\/doi\/full\/10.1080\/14780887.2024.2311427\" target=\"_blank\" rel=\"noopener\">skilled<\/a> human researcher.<\/p>\n\n<p>Any AI agent, no matter how fast, is still just a tool, not a replacement for human intelligence. For knowledge workers, it\u2019s more important than ever to invest in skills that AI can\u2019t replicate: critical thinking, fact-checking, deep expertise and creativity.<\/p>\n\n<p>If you do want to use AI research tools, there are ways to do so responsibly. Thoughtful use of AI can enhance research without sacrificing accuracy or depth. You might use AI for efficiency, like summarising documents, but retain human judgement for making decisions.<\/p>\n\n<p>Always verify sources, as AI-generated citations can be misleading. Don\u2019t trust conclusions blindly, but apply critical thinking and cross-check information with reputable sources. 
For high-stakes topics \u2013 such as <a href=\"https:\/\/www.theguardian.com\/technology\/2023\/may\/31\/eating-disorder-hotline-union-ai-chatbot-harm\" target=\"_blank\" rel=\"noopener\">health<\/a>, <a href=\"https:\/\/www.theguardian.com\/law\/2025\/feb\/10\/fake-cases-judges-headaches-and-new-limits-australian-courts-grappling-with-lawyers-using-ai-ntwnfb\" target=\"_blank\" rel=\"noopener\">justice<\/a> and <a href=\"https:\/\/www.theguardian.com\/us-news\/2024\/sep\/12\/twitter-ai-bot-grok-election-misinformation\" target=\"_blank\" rel=\"noopener\">democracy<\/a> \u2013 supplement AI findings with expert input.<\/p>\n\n<p>Despite prolific marketing that tries to tell us otherwise, generative AI still has plenty of limitations. Humans who can creatively synthesise information, challenge assumptions and think critically will remain in demand \u2013 AI can\u2019t replace them just yet.<!-- Below is The Conversation's page counter tag. Please DO NOT REMOVE. --><img  loading=\"lazy\"  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"The Conversation\"  width=\"1\"  height=\"1\"  style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important\"  referrerpolicy=\"no-referrer-when-downgrade\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/counter.theconversation.com\/content\/249496\/count.gif?distributor=republish-lightbox-basic\" ><!-- End of code. If you don't see any code above, please get new code from the Advanced tab after you click the republish button. The page counter does not collect any personal data. 
More info: https:\/\/theconversation.com\/republishing-guidelines --><\/p>\n\n  <p><span><a href=\"https:\/\/theconversation.com\/profiles\/raffaele-f-ciriello-1079723\" target=\"_blank\" rel=\"noopener\">Raffaele F Ciriello<\/a>, Senior Lecturer in Business Information Systems, <em><a href=\"https:\/\/theconversation.com\/institutions\/university-of-sydney-841\" target=\"_blank\" rel=\"noopener\">University of Sydney<\/a><\/em><\/span><\/p>\n\n  <p>This article is republished from <a href=\"https:\/\/theconversation.com\" target=\"_blank\" rel=\"noopener\">The Conversation<\/a> under a Creative Commons license. Read the <a href=\"https:\/\/theconversation.com\/openais-new-deep-research-agent-is-still-just-a-fallible-tool-not-a-human-level-expert-249496\" target=\"_blank\" rel=\"noopener\">original article<\/a>.<\/p>\n<\/div>\n\n\n\n\n<p class=\"\"><\/p>\n","protected":false},"excerpt":{"rendered":"Jan Antonin Kolar\/Unsplash Raffaele F Ciriello, University of Sydney OpenAI\u2019s \u201cdeep research\u201d is the latest artificial intelligence 
(AI)&hellip;\n","protected":false},"author":1082,"featured_media":13614,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","fifu_image_url":"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/0\/0b\/Multi-Token_Prediction_%28DeepSeek%29_01.svg\/2560px-Multi-Token_Prediction_%28DeepSeek%29_01.svg.png","fifu_image_alt":"","footnotes":""},"categories":[16],"tags":[4838,4829,4830,4840,4842,4836,4833,4846,4843,4831,4845,4835,4844,4827,4834,4837,4828,4841,4832,4839,474],"class_list":{"0":"post-13612","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech","8":"tag-ai-and-critical-thinking","9":"tag-ai-for-knowledge-workers","10":"tag-ai-hallucinations","11":"tag-ai-in-academia","12":"tag-ai-in-finance","13":"tag-ai-in-journalism","14":"tag-ai-limitations","15":"tag-ai-market-analysis","16":"tag-ai-research-accuracy","17":"tag-ai-research-assistant","18":"tag-ai-tool-evaluation","19":"tag-ai-vs-human-analysts","20":"tag-ai-generated-misinformation","21":"tag-ai-generated-reports","22":"tag-chatgpt-pro","23":"tag-fact-checking-ai","24":"tag-generative-ai-flaws","25":"tag-hugging-face-ai","26":"tag-openai-deep-research","27":"tag-responsible-ai-use","28":"tag-the-conversation","29":"cs-entry","30":"cs-video-wrap"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/13612","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/users\/1082"}],"replies":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/comments?post=13612"}],"version-history":[{"count":1,"href":"https:\/\/modernscie
nces.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/13612\/revisions"}],"predecessor-version":[{"id":13613,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/13612\/revisions\/13613"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media\/13614"}],"wp:attachment":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media?parent=13612"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/categories?post=13612"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/tags?post=13612"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}