{"id":6824,"date":"2023-08-07T10:00:00","date_gmt":"2023-08-07T10:00:00","guid":{"rendered":"https:\/\/modernsciences.org\/staging\/4414\/?p=6824"},"modified":"2023-07-28T02:51:30","modified_gmt":"2023-07-28T02:51:30","slug":"your-genetic-code-has-lots-of-words-for-the-same-thing-information-theory-may-help-explain-the-redundancies","status":"publish","type":"post","link":"https:\/\/modernsciences.org\/staging\/4414\/your-genetic-code-has-lots-of-words-for-the-same-thing-information-theory-may-help-explain-the-redundancies\/","title":{"rendered":"Your genetic code has lots of \u2018words\u2019 for the same thing \u2013 information theory may help explain the redundancies"},"content":{"rendered":"\n  <figure>\n    <img  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/538088\/original\/file-20230718-19-gbku0q.jpg?ixlib=rb-1.1.0&#038;rect=136%2C136%2C1677%2C1105&#038;q=45&#038;auto=format&#038;w=754&#038;fit=clip\" >\n      <figcaption>\n        The same amino acid can be encoded by anywhere from one to six different strings of letters in the genetic code.\n        <span class=\"attribution\"><a class=\"source\" href=\"https:\/\/www.gettyimages.com\/detail\/illustration\/and-binary-code-illustration-royalty-free-illustration\/545863911?adppopup=true\" target=\"_blank\" rel=\"noopener\">Andrzej Wojcicki\/Science Photo Library via Getty Images<\/a><\/span>\n      <\/figcaption>\n  <\/figure>\n\n<span><a href=\"https:\/\/theconversation.com\/profiles\/subhash-kak-416870\" target=\"_blank\" rel=\"noopener\">Subhash Kak<\/a>, <em><a href=\"https:\/\/theconversation.com\/institutions\/oklahoma-state-university-2062\" target=\"_blank\" rel=\"noopener\">Oklahoma State University<\/a><\/em><\/span>\n\n<p>Nearly 
all life, from bacteria to humans, uses the same <a href=\"https:\/\/www.genome.gov\/genetics-glossary\/Genetic-Code#\" target=\"_blank\" rel=\"noopener\">genetic code<\/a>. This code acts as a dictionary, translating genes into the amino acids used to build proteins. The <a href=\"https:\/\/doi.org\/10.1146\/annurev-genet-120116-024713\" target=\"_blank\" rel=\"noopener\">universality of the genetic code<\/a> indicates a common ancestry among all living organisms and the essential role this code plays in the structure, function and regulation of biological cells.<\/p>\n\n<p>Understanding how the genetic code works is the foundation of <a href=\"https:\/\/www.sciencedirect.com\/topics\/neuroscience\/genetic-engineering\" target=\"_blank\" rel=\"noopener\">genetic engineering<\/a> and <a href=\"https:\/\/www.genome.gov\/about-genomics\/policy-issues\/Synthetic-Biology\" target=\"_blank\" rel=\"noopener\">synthetic biology<\/a>. But many mysteries remain unsolved, including why the code is important for biological processes such as <a href=\"https:\/\/theconversation.com\/when-researchers-dont-have-the-proteins-they-need-they-can-get-ai-to-hallucinate-new-structures-173209\" target=\"_blank\" rel=\"noopener\">protein folding<\/a>.<\/p>\n\n<p>As a <a href=\"https:\/\/scholar.google.com\/scholar?start=10&amp;q=s+kak+%26+subhash+kak&amp;hl=en&amp;as_sdt=0,37\" target=\"_blank\" rel=\"noopener\">scholar working at the interface of biology and physics<\/a>, I apply information theory \u2013 the mathematics of how information is stored and communicated \u2013 to study some of these intriguing questions. Just as computers need strings of binary code to function, <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/books\/NBK9843\/\" target=\"_blank\" rel=\"noopener\">biological processes<\/a> also rely on bits of information. 
<\/p>\n\n<p>In my <a href=\"https:\/\/doi.org\/10.1007\/s12064-023-00396-y\" target=\"_blank\" rel=\"noopener\">recent research<\/a>, I propose that <a href=\"https:\/\/doi.org\/10.1007\/s00034-020-01583-8\" target=\"_blank\" rel=\"noopener\">optimization theory<\/a> may provide a potential explanation for a long-standing mystery about a certain redundancy in how amino acids are encoded.<\/p>\n\n<h2 id=\"different-words-for-the-same-thing\">Different words for the same thing<\/h2>\n\n<p>The genetic codebook is made of \u201cwords\u201d composed of four letters: A, C, G and U. Each of these letters stands for a different chemical building block <a href=\"https:\/\/www.genome.gov\/genetics-glossary\/Nucleotide\" target=\"_blank\" rel=\"noopener\">called a nucleotide<\/a>: adenine, cytosine, guanine and uracil. A molecular machine <a href=\"https:\/\/www.genome.gov\/genetics-glossary\/Ribosome\" target=\"_blank\" rel=\"noopener\">called a ribosome<\/a> reads the codebook to translate genes into proteins.<\/p>\n\n<figure class=\"align-center zoomable\">\n            <a href=\"https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" target=\"_blank\" rel=\"noopener\"><img  decoding=\"async\"  alt=\"Circular diagram encoding all 64 possible combinations of the letters A, C, G, and U, which are colored red, yellow, blue, and green, respectively. 
Abbreviations for different codons are listed around the outer edge of the circle.\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip\"  data-pk-srcset=\"https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=441&amp;fit=crop&amp;dpr=1 600w, https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=441&amp;fit=crop&amp;dpr=2 1200w, https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=441&amp;fit=crop&amp;dpr=3 1800w, https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=554&amp;fit=crop&amp;dpr=1 754w, https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=554&amp;fit=crop&amp;dpr=2 1508w, https:\/\/images.theconversation.com\/files\/538605\/original\/file-20230720-23211-1do6c1.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=554&amp;fit=crop&amp;dpr=3 2262w\" ><\/a>\n            <figcaption>\n              <span class=\"caption\">The codon sequence is read from the center of the wheel of genetic code.<\/span>\n              <span class=\"attribution\"><span class=\"source\">Mouagip via Wikimedia Commons<\/span><\/span>\n            
<\/figcaption>\n          <\/figure>\n\n<p>Ribosomes read three-letter words <a href=\"https:\/\/www.genome.gov\/genetics-glossary\/Codon\" target=\"_blank\" rel=\"noopener\">called codons<\/a>, and there are 64 different possible combinations of the four letters that make different codons. In this list of 64 words, 61 <a href=\"https:\/\/www.acs.org\/education\/whatischemistry\/landmarks\/geneticcode.html\" target=\"_blank\" rel=\"noopener\">encode amino acids<\/a>, and three signal the ribosome to stop protein synthesis in the cell. For example, \u201cAUG\u201d codes for the amino acid methionine and also indicates the start of a protein.<\/p>\n\n<p>But just as in any other language, there are synonyms \u2013 different codons can encode the same amino acid. In fact, since there are only 20 amino acids but 61 different words to encode them, there is quite a lot of overlap. An amino acid can have anywhere from one to six different codons that encode it. There are only two amino acids that have <a href=\"https:\/\/www.genome.gov\/sites\/default\/files\/media\/images\/tg\/Genetic-code.jpg\" target=\"_blank\" rel=\"noopener\">exactly one codon<\/a>, methionine and tryptophan. This redundancy helps ribosomes perform their tasks correctly even when there\u2019s a <a href=\"https:\/\/doi.org\/10.3389\/fgene.2014.00140\" target=\"_blank\" rel=\"noopener\">typo in the genetic code<\/a>.<\/p>\n\n<h2 id=\"engineering-natures-guidelines\">Engineering nature\u2019s guidelines<\/h2>\n\n<p>Why certain amino acids have more synonyms than others is a mystery that has puzzled scientists for decades. Is there a pattern to this variability, or is it random? To answer this question, scientists study the rules that govern nature\u2019s decision-making.<\/p>\n\n<p>If a human engineer designed the genetic code, they would want to make sure that each amino acid had a similar degree of redundancy to protect against errors and to promote uniformity. 
The mapping of the 61 codons onto the 20 amino acids would be roughly equal, with each amino acid assigned about three codons.<\/p>\n\n<p>But nature has different priorities. <a href=\"https:\/\/theconversation.com\/simulating-evolution-how-close-do-computer-models-come-to-reality-57538\" target=\"_blank\" rel=\"noopener\">Evolutionary models of natural systems<\/a> like bacteria demonstrate that nature is always <a href=\"https:\/\/doi.org\/10.1038\/nature03842\" target=\"_blank\" rel=\"noopener\">striving for optimization<\/a>. Not only does the final form of a protein need to be optimal, but so do its intermediate forms. Optimization ensures that natural systems can adapt to different environments.<\/p>\n\n<p>Scientists understand some of the guidelines that nature follows when engineering the genetic code. For instance, the <a href=\"https:\/\/doi.org\/10.1002\/iub.146\" target=\"_blank\" rel=\"noopener\">spatial arrangement of atoms and molecules<\/a> within and surrounding the genetic code can affect its function, as well as the <a href=\"https:\/\/doi.org\/10.3389\/fgene.2014.00140\" target=\"_blank\" rel=\"noopener\">coevolution of other cellular structures<\/a> involved in creating proteins.<\/p>\n\n<h2 id=\"information-theory-and-genetics\">Information theory and genetics<\/h2>\n\n<p><a href=\"https:\/\/doi.org\/10.1007\/s12064-023-00396-y\" target=\"_blank\" rel=\"noopener\">My research indicates<\/a> that there may be two other significant factors that natural systems consider: the information-theoretic nature of the genetic code and the principle of maximum entropy. <\/p>\n\n<p>Paralleling how the computer processes data consisting of 0s and 1s, life processes the genetic code based on data consisting of the four letters A, C, G and U. 
Mathematically, however, the most energy-efficient way to represent data isn\u2019t binary (or base 2) \u2013 using 0s and 1s, as computers do \u2013 <a href=\"https:\/\/doi.org\/10.1007\/s00034-020-01480-0\" target=\"_blank\" rel=\"noopener\">but rather base e<\/a>. <a href=\"https:\/\/theconversation.com\/pi-gets-all-the-fanfare-but-other-numbers-also-deserve-their-own-math-holidays-200046\" target=\"_blank\" rel=\"noopener\">Short for Euler\u2019s number<\/a>, e is an irrational number \u2013 meaning that there\u2019s no way to write down its exact value using fractions or decimals (although it\u2019s approximately 2.718). <\/p>\n\n<figure class=\"align-center zoomable\">\n            <a href=\"https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=1000&amp;fit=clip\" target=\"_blank\" rel=\"noopener\"><img  decoding=\"async\"  alt=\"The Mandelbrot set, a mathematical fractal, shown in black against a blue background. 
The edges of the fractal are blue and white\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-ls-sizes=\"(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px\"  data-pk-src=\"https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;fit=clip\"  data-pk-srcset=\"https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=600&amp;h=600&amp;fit=crop&amp;dpr=1 600w, https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=600&amp;h=600&amp;fit=crop&amp;dpr=2 1200w, https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=600&amp;h=600&amp;fit=crop&amp;dpr=3 1800w, https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=45&amp;auto=format&amp;w=754&amp;h=754&amp;fit=crop&amp;dpr=1 754w, https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=30&amp;auto=format&amp;w=754&amp;h=754&amp;fit=crop&amp;dpr=2 1508w, https:\/\/images.theconversation.com\/files\/536904\/original\/file-20230711-16-bunaau.png?ixlib=rb-1.1.0&amp;q=15&amp;auto=format&amp;w=754&amp;h=754&amp;fit=crop&amp;dpr=3 2262w\" ><\/a>\n            <figcaption>\n              <span class=\"caption\">The Mandelbrot set is a mathematically generated fractal.<\/span>\n              <span class=\"attribution\"><a class=\"source\" href=\"https:\/\/commons.wikimedia.org\/wiki\/File:Mandelbrot20210909_ABC02_65535x65535.png\" target=\"_blank\" 
rel=\"noopener\">PantheraLeo1359531 via Wikimedia Commons<\/a>, <a class=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by\/4.0\/\" target=\"_blank\" rel=\"noopener\">CC BY<\/a><\/span>\n            <\/figcaption>\n          <\/figure>\n\n<p>Nature\u2019s affinity for optimization using this irrational number is responsible <a href=\"https:\/\/theconversation.com\/mathematics-of-scale-big-small-and-everything-in-between-115890\" target=\"_blank\" rel=\"noopener\">for the infinitely repeating fractals<\/a> seen in <a href=\"https:\/\/fractalfoundation.org\/OFC\/OFC-10-4.html\" target=\"_blank\" rel=\"noopener\">jagged shorelines<\/a>, <a href=\"https:\/\/www.smithsonianmag.com\/innovation\/fractal-patterns-nature-and-art-are-aesthetically-pleasing-and-stress-reducing-180962738\/\" target=\"_blank\" rel=\"noopener\">fern leaves, snowflakes and trees<\/a>. <a href=\"https:\/\/doi.org\/10.1007\/s40819-022-01251-2\" target=\"_blank\" rel=\"noopener\">Beyond biology<\/a>, information optimization using e also has applications in <a href=\"https:\/\/doi.org\/10.1007\/s00034-021-01726-5\" target=\"_blank\" rel=\"noopener\">mathematics<\/a> and <a href=\"https:\/\/doi.org\/10.1038\/s41598-020-77855-9\" target=\"_blank\" rel=\"noopener\">cosmology<\/a>. <\/p>\n\n<p>Another principle operating in the natural world is that of <a href=\"https:\/\/doi.org\/10.1016\/1355-2198(95)00022-4\" target=\"_blank\" rel=\"noopener\">maximum entropy<\/a>. Entropy is a measure of disorder in a system, and the maximum entropy principle states that systems evolve to states of greater disorder. This principle allows researchers to <a href=\"https:\/\/doi.org\/10.1016\/j.heliyon.2018.e00596\" target=\"_blank\" rel=\"noopener\">make inferences<\/a> from limited data and has been used to explain how <a href=\"https:\/\/doi.org\/10.1103\/PhysRevLett.100.078102\" target=\"_blank\" rel=\"noopener\">amino acids interact in proteins<\/a>. 
<\/p>\n\n<p>In the context of codon groupings, the maximum entropy principle implies that nature is scrambling data as much as possible \u2013 meaning the function that describes the distribution of codon groupings should be mathematically difficult to undo. Studying how to maximize the mathematical complexity of this function <a href=\"https:\/\/www.britannica.com\/science\/Fibonacci-number\" target=\"_blank\" rel=\"noopener\">reveals potential patterns<\/a> underlying the codon groupings.<\/p>\n\n<p>I believe these two principles may <a href=\"https:\/\/doi.org\/10.1007\/s12064-023-00396-y\" target=\"_blank\" rel=\"noopener\">help describe<\/a> the distribution of the codon groups in the genetic code and point to the usefulness of mathematics in analyzing natural systems. Although there are many biological mysteries that scientists have yet to solve, information theory can be a powerful tool to help crack the genetic code.<!-- Below is The Conversation's page counter tag. Please DO NOT REMOVE. --><img  loading=\"lazy\"  decoding=\"async\"  src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABAQMAAAAl21bKAAAAA1BMVEUAAP+KeNJXAAAAAXRSTlMAQObYZgAAAAlwSFlzAAAOxAAADsQBlSsOGwAAAApJREFUCNdjYAAAAAIAAeIhvDMAAAAASUVORK5CYII=\"  alt=\"The Conversation\"  width=\"1\"  height=\"1\"  style=\"border: none !important; box-shadow: none !important; margin: 0 !important; max-height: 1px !important; max-width: 1px !important; min-height: 1px !important; min-width: 1px !important; opacity: 0 !important; outline: none !important; padding: 0 !important\"  referrerpolicy=\"no-referrer-when-downgrade\"  class=\" pk-lazyload\"  data-pk-sizes=\"auto\"  data-pk-src=\"https:\/\/counter.theconversation.com\/content\/209471\/count.gif?distributor=republish-lightbox-basic\" ><!-- End of code. If you don't see any code above, please get new code from the Advanced tab after you click the republish button. The page counter does not collect any personal data. 
More info: https:\/\/theconversation.com\/republishing-guidelines --><\/p>\n\n<p><span><a href=\"https:\/\/theconversation.com\/profiles\/subhash-kak-416870\" target=\"_blank\" rel=\"noopener\">Subhash Kak<\/a>, Professor of Electrical and Computer Engineering, <em><a href=\"https:\/\/theconversation.com\/institutions\/oklahoma-state-university-2062\" target=\"_blank\" rel=\"noopener\">Oklahoma State University<\/a><\/em><\/span><\/p>\n\n<p>This article is republished from <a href=\"https:\/\/theconversation.com\" target=\"_blank\" rel=\"noopener\">The Conversation<\/a> under a Creative Commons license. Read the <a href=\"https:\/\/theconversation.com\/your-genetic-code-has-lots-of-words-for-the-same-thing-information-theory-may-help-explain-the-redundancies-209471\" target=\"_blank\" rel=\"noopener\">original article<\/a>.<\/p>\n\n","protected":false},"excerpt":{"rendered":"The same amino acid can be encoded by anywhere from one to six different strings of letters in&hellip;\n","protected":false},"author":544,"featured_media":6815,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[11,16],"tags":[294,88,228,895,474],"class_list":{"0":"post-6824","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-nature","8":"category-tech","9":"tag-dna","10":"tag-gene","11":"tag-genetics","12":"tag-information-theory","13":"tag-the-conversation","14":"cs-entry","15":"cs-video-wrap"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/6824","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/stag
ing\/4414\/wp-json\/wp\/v2\/users\/544"}],"replies":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/comments?post=6824"}],"version-history":[{"count":1,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/6824\/revisions"}],"predecessor-version":[{"id":6825,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/posts\/6824\/revisions\/6825"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media\/6815"}],"wp:attachment":[{"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/media?parent=6824"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/categories?post=6824"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/modernsciences.org\/staging\/4414\/wp-json\/wp\/v2\/tags?post=6824"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}