{"id":14658,"date":"2025-06-01T22:00:00","date_gmt":"2025-06-01T22:00:00","guid":{"rendered":"https:\/\/modernsciences.org\/staging\/4414\/?p=14658"},"modified":"2025-05-26T18:34:26","modified_gmt":"2025-05-26T18:34:26","slug":"ai-audio-description-accessibility-low-vision-accuracy-may-2025","status":"publish","type":"post","link":"https:\/\/modernsciences.org\/staging\/4414\/ai-audio-description-accessibility-low-vision-accuracy-may-2025\/","title":{"rendered":"AI is now used for audio description. But it should be accurate and actually useful for people with low vision"},"content":{"rendered":"\n\n<div class=\"theconversation-article-body\">\n    <figure>\n      <img  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/668902\/original\/file-20250520-56-dqlxlr.jpg?ixlib=rb-4.1.0&#038;rect=272%2C319%2C3695%2C2504&#038;q=45&#038;auto=format&#038;w=754&#038;fit=clip\" >\n        <figcaption>\n          \n          <span class=\"attribution\"><a class=\"source\" href=\"https:\/\/www.shutterstock.com\/image-photo\/portrait-asian-woman-blindness-disability-enjoy-2184758945\" target=\"_blank\" rel=\"noopener\">Chansom Pantip\/Shutterstock<\/a><\/span>\n        <\/figcaption>\n    <\/figure>\n\n  <span><a href=\"https:\/\/theconversation.com\/profiles\/kathryn-locke-2258383\" target=\"_blank\" rel=\"noopener\">Kathryn Locke<\/a>, <em><a href=\"https:\/\/theconversation.com\/institutions\/curtin-university-873\" target=\"_blank\" rel=\"noopener\">Curtin University<\/a><\/em> and <a href=\"https:\/\/theconversation.com\/profiles\/tama-leaver-1798\" target=\"_blank\" rel=\"noopener\">Tama Leaver<\/a>, <em><a href=\"https:\/\/theconversation.com\/institutions\/curtin-university-873\" target=\"_blank\" 
rel=\"noopener\">Curtin University<\/a><\/em><\/span>\n\n  <p>Since the recent explosion of widely available generative artificial intelligence (AI), it now seems that a new AI tool emerges every week.<\/p>\n\n<p>With varying success, AI offers solutions for productivity, creativity, research, and also accessibility: making products, services and other content more usable for people with disability.<\/p>\n\n<p>The <a href=\"https:\/\/www.designrush.com\/news\/google-pixel-s-super-bowl-spot-wins-kellogg-ad-review-for-2nd-year-straight\" target=\"_blank\" rel=\"noopener\">award-winning 2024 Super Bowl ad for Google Pixel 8<\/a> is a poignant example of how the latest AI tech can intersect with disability.<\/p>\n\n<p>Directed by blind director Adam Morse, it showcases an AI-powered feature that uses audio cues, haptic feedback (where vibrating sensations communicate information to the user) and animations to assist blind and low-vision users in capturing photos and videos.<\/p>\n\n<figure>\n            <iframe loading=\"lazy\" width=\"440\" height=\"260\" src=\"https:\/\/www.youtube.com\/embed\/wYPTZIFQoDQ?wmode=transparent&amp;start=0\" frameborder=\"0\" allowfullscreen=\"\"><\/iframe>\n            <figcaption><span class=\"caption\">Javier in Frame showcases an accessibility feature found on Pixel 8 phones.<\/span><\/figcaption>\n          <\/figure>\n\n<p>The ad was applauded for being disability inclusive and representative. It also demonstrated a growing capacity for \u2013 and interest in \u2013 AI to generate more accessible technology.<\/p>\n\n<p>AI is also poised to challenge how audio description is created and what it may sound like. This is the focus of our research team.<\/p>\n\n<p>Audio description is a track of narration that describes important visual elements of visual media, including television shows, movies and live performances. Synthetic voices and quick, automated visual descriptions might result in more audio description on our screens. 
But will users lose out in other ways?<\/p>\n\n<h2 id=\"ai-as-peoples-eyes\">AI as people\u2019s eyes<\/h2>\n\n<p>AI-powered accessibility tools are proliferating. Among them is Microsoft\u2019s <a href=\"https:\/\/www.microsoft.com\/en-us\/garage\/wall-of-fame\/seeing-ai\/\" target=\"_blank\" rel=\"noopener\">Seeing AI<\/a>, an app that turns your smartphone into a talking camera by reading text and identifying objects. The app <a href=\"https:\/\/www.bemyeyes.com\/blog\/introducing-be-my-ai\/\" target=\"_blank\" rel=\"noopener\">Be My AI<\/a> uses virtual assistants to describe photos taken by blind users; it\u2019s an AI version of the original app Be My Eyes, where the same task was done by human volunteers.<\/p>\n\n<p>A growing number of AI software options are available for text-to-speech and document reading, as well as <a href=\"https:\/\/www.perkins.org\/resource\/audible-sight-a-new-method-for-creating-audio-descriptions\/\" target=\"_blank\" rel=\"noopener\">for producing audio description<\/a>.<\/p>\n\n<p>Audio description is an essential feature to make visual media accessible to blind or vision impaired audiences. But its benefits go beyond that.<\/p>\n\n<p>Increasingly, research shows <a href=\"https:\/\/openresearch.surrey.ac.uk\/esploro\/outputs\/99513568402346\" target=\"_blank\" rel=\"noopener\">audio description benefits other disability groups<\/a> and <a href=\"https:\/\/doi.org\/10.4324\/9781003052968\" target=\"_blank\" rel=\"noopener\">mainstream audiences without disability<\/a>. Audio description can also be a creative way to further <a href=\"https:\/\/www.thespace.org\/resource\/the-creative-potential-of-audio-description\" target=\"_blank\" rel=\"noopener\">develop or enhance a visual text<\/a>.<\/p>\n\n<p>Traditionally, audio description has been created using human voices, script writers and production teams. 
However, in the past year, several international streaming services, including Netflix and <a href=\"https:\/\/www.aboutamazon.com\/news\/entertainment\/artificial-intelligence-prime-video-streaming\" target=\"_blank\" rel=\"noopener\">Amazon Prime<\/a>, have begun offering audio description that\u2019s at least partially generated with AI.<\/p>\n\n<p>Yet there are a number of issues with current AI technologies, including their ability to generate false information. These tools need to be critically appraised and improved.<\/p>\n\n<h2 id=\"is-ai-coming-for-audio-description-jobs\">Is AI coming for audio description jobs?<\/h2>\n\n<p>There are multiple ways in which AI might impact the creation \u2013 and end result \u2013 of audio description.<\/p>\n\n<p>With AI tools, streaming services can get <a href=\"https:\/\/discovery.ucl.ac.uk\/id\/eprint\/10059094\/1\/Fryer_Walczak_Fryer_BJVI.pdf\" target=\"_blank\" rel=\"noopener\">synthetic voices to \u201cread\u201d an audio description script<\/a>. There\u2019s potential for various levels of <a href=\"https:\/\/dl.acm.org\/doi\/full\/10.1145\/3706599.3719966\" target=\"_blank\" rel=\"noopener\">automation<\/a>, while giving <a href=\"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3663548.3675617\" target=\"_blank\" rel=\"noopener\">users the chance to customise audio description<\/a> to suit their specific needs and preferences. Want your cooking show to be narrated in a British accent? 
With AI, you could change that with the press of a button.<\/p>\n\n<p>However, in the audio description industry many are worried AI could undermine the quality, creativity and professionalism humans bring to the equation.<\/p>\n\n<p>The language-learning app Duolingo, for example, recently announced it was moving forward with <a href=\"https:\/\/techcrunch.com\/2025\/04\/30\/duolingo-launches-148-courses-created-with-ai-after-sharing-plans-to-replace-contractors-with-ai\/\" target=\"_blank\" rel=\"noopener\">\u201cAI first\u201d development<\/a>. As a result, many contractors lost jobs that can now purportedly be done by algorithms.<\/p>\n\n<p>On the one hand, AI could help broaden the range of audio descriptions available for a range of media and live experiences.<\/p>\n\n<p>But AI audio description may also cost jobs rather than create them. The worst outcome would be a huge amount of lower-quality audio description, which would undermine the value of creating it at all.<\/p>\n\n<figure class=\"align-center zoomable\">\n            <a href=\"https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" target=\"_blank\" rel=\"noopener\"><img  decoding=\"async\"  alt=\"A young man sits on a park bench with phone and headphones in hand, holding a folded white cane.\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip\"  
data-pk-srcset=\"https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=1 600w, https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=2 1200w, https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=400&amp;fit=crop&amp;dpr=3 1800w, https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=1 754w, https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=2 1508w, https:\/\/images.theconversation.com\/files\/668901\/original\/file-20250520-56-4gp4yc.jpg?ixlib=rb-4.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=503&amp;fit=crop&amp;dpr=3 2262w\" ><\/a>\n            <figcaption>\n              <span class=\"caption\">AI shouldn\u2019t undermine the quality of assistive technologies, including audio description.<\/span>\n              <span class=\"attribution\"><a class=\"source\" href=\"https:\/\/www.shutterstock.com\/image-photo\/young-blind-man-smartphone-sitting-on-1513660814\" target=\"_blank\" rel=\"noopener\">Ground Picture\/Shutterstock<\/a><\/span>\n            <\/figcaption>\n          <\/figure>\n\n<h2 id=\"can-we-trust-ai-to-describe-things-well\">Can we trust AI to describe things well?<\/h2>\n\n<p>Industry impact and the technical details of how AI can be used in audio description are one thing.<\/p>\n\n<p>What\u2019s currently lacking is research that centres the perspectives of users and takes into consideration their experiences and needs for future audio description.<\/p>\n\n<p>Accuracy \u2013 and trust in 
this accuracy \u2013 is vitally important for blind and low-vision audiences.<\/p>\n\n<p>Cheap and often free, AI tools are now widely used to summarise, transcribe and translate. But it\u2019s a well-known problem that <a href=\"https:\/\/theconversation.com\/heres-how-researchers-are-helping-ais-get-their-facts-straight-245463\" target=\"_blank\" rel=\"noopener\">generative AI struggles to stay factual<\/a>. Known as \u201challucinations\u201d, <a href=\"https:\/\/theconversation.com\/what-are-ai-hallucinations-why-ais-sometimes-make-things-up-242896\" target=\"_blank\" rel=\"noopener\">these plausible fabrications<\/a> proliferate even when the AI tools are <a href=\"https:\/\/www.science.org\/content\/article\/ai-transcription-tools-hallucinate-too\" target=\"_blank\" rel=\"noopener\">not asked to create anything new<\/a> \u2013 like doing a simple audio transcription.<\/p>\n\n<p>If AI tools simply fabricate content rather than make existing material accessible, they would further distance and disadvantage blind and low-vision consumers.<\/p>\n\n<h2 id=\"we-can-use-ai-for-accessibility-with-care\">We can use AI for accessibility \u2013 with care<\/h2>\n\n<p>AI is a relatively new technology, and for it to genuinely benefit accessibility, its accuracy and reliability need to be absolute. Blind and low-vision users need to be able to turn on AI tools with confidence.<\/p>\n\n<p>In the current \u201cAI rush\u201d to make audio description cheaper, quicker and more available, it\u2019s vital that the people who need it the most are closely involved in how the tech is deployed.<!-- Below is The Conversation's page counter tag. Please DO NOT REMOVE. 
--><img  loading=\"lazy\"  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"The Conversation\"  width=\"1\"  height=\"1\"  style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important\"  referrerpolicy=\"no-referrer-when-downgrade\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/counter.theconversation.com\/content\/256808\/count.gif?distributor=republish-lightbox-basic\" ><!-- End of code. If you don't see any code above, please get new code from the Advanced tab after you click the republish button. The page counter does not collect any personal data. More info: https:\/\/theconversation.com\/republishing-guidelines --><\/p>\n\n  <p><span><a href=\"https:\/\/theconversation.com\/profiles\/kathryn-locke-2258383\" target=\"_blank\" rel=\"noopener\">Kathryn Locke<\/a>, Associate Researcher in Digital Disability, Centre for Culture and Technology, <em><a href=\"https:\/\/theconversation.com\/institutions\/curtin-university-873\" target=\"_blank\" rel=\"noopener\">Curtin University<\/a><\/em> and <a href=\"https:\/\/theconversation.com\/profiles\/tama-leaver-1798\" target=\"_blank\" rel=\"noopener\">Tama Leaver<\/a>, Professor of Internet Studies, <em><a href=\"https:\/\/theconversation.com\/institutions\/curtin-university-873\" target=\"_blank\" rel=\"noopener\">Curtin University<\/a><\/em><\/span><\/p>\n\n  <p>This article is republished from <a href=\"https:\/\/theconversation.com\" target=\"_blank\" rel=\"noopener\">The Conversation<\/a> under a Creative Commons license. 
Read the <a href=\"https:\/\/theconversation.com\/ai-is-now-used-for-audio-description-but-it-should-be-accurate-and-actually-useful-for-people-with-low-vision-256808\" target=\"_blank\" rel=\"noopener\">original article<\/a>.<\/p>\n<\/div>\n\n","protected":false},"excerpt":{"rendered":"Chansom Pantip\/Shutterstock Kathryn Locke, Curtin University and Tama Leaver, Curtin University Since the recent explosion of widely available&hellip;\n","protected":false},"author":1216,"featured_media":14660,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","fifu_image_url":"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/2\/21\/Google_Pixel_8_Pro.jpg\/2560px-Google_Pixel_8_Pro.jpg","fifu_image_alt":"","footnotes":""},"categories":[16],"tags":[11129,11131,11123,11139,11112,11127,11130,11140,11120,11132,11135,4830,11128,11113,11134,11137,11125,11124,11116,11119,11126,11122,11136,11114,11118,11115,11138,11121,11133,11117],"class_list":{"0":"post-14658","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech","8":"tag-ai-accessibility-research","9":"tag-ai-accessibility-tools","10":"tag-ai-accuracy-issues","11":"tag-ai-and-creativity","12":"tag-ai-and-disability","13":"tag-ai-and-user-trust","14":"tag-ai-audio-description","15":"tag-ai-disability-inclusion","16":"tag-ai-document-reading","17":"tag-ai-for-blind-users","18":"tag-ai-for-vision-impairment","19":"tag-ai-hallucinations","20":"tag-ai-impact-on-jobs","21":"tag-ai-in-streaming-services","22":"tag-ai-in-visual-media-accessibility","23":"tag-ai-reliability-in-accessibility","24":"tag-ai-text-to-speech","25":"tag-ai-generated-audio-description","26":"tag-ai-powered-accessibility","27":"tag-amazon-prime-audio-description","28":"tag-assistive-technology-ai","29":"tag-audio-description-in-media","30":"tag-be-my-ai-app","31":"tag-generative-ai","32":"tag-google-pixel-8-accessibility","33":"tag-hapt
ic-feedback-technology","34":"tag-inclusive-technology","35":"tag-netflix-audio-description-ai","36":"tag-seeing-ai-app","37":"tag-synthetic-voices","38":"cs-entry","39":"cs-video-wrap"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/14658","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/users\/1216"}],"replies":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/comments?post=14658"}],"version-history":[{"count":1,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/14658\/revisions"}],"predecessor-version":[{"id":14659,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/14658\/revisions\/14659"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media\/14660"}],"wp:attachment":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media?parent=14658"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/categories?post=14658"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/tags?post=14658"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}