by Chet Raymo
“The first string of letters above represents an actual sequence of nucleotides on human chromosome 10. The second string is the corresponding sequence for an elephant. I copy the strings from a New Yorker article on Neanderthals by Elizabeth Kolbert. She tosses them in more or less at random just to show what a DNA sequence looks like. Still, they jump off the page. Humans and elephants. A four-letter code.
Four molecules called nucleotides, arranged in pairs along a spiraling ladder, the double helix – adenine, thymine, guanine and cytosine, represented by the letters A, T, G and C. A always pairs with T, G with C. The complete human genome is a string of something like 3 billion As, Ts, Gs and Cs. Ditto for the elephant. Some 30,000 sequences, of variable length, are genes. Most of the strings are apparently non-functional; so-called “junk.” Give the sequence to a genomist and she can tell you if it belongs to a human or an elephant. Or, for that matter, to an Asian elephant, an African elephant, or an extinct woolly mammoth. Or a modern human or a Neanderthal.
There have been some pretty exciting discoveries in science in my lifetime – plate tectonics, for example, or the cosmic microwave background radiation – that have revolutionized our understanding of the Earth and the universe. But to my mind nothing has been more stunning than the recognition that we share with all of life an elegantly simple four-letter code that determines what we are as a species. And not only our species, but the color of our eyes and the dimples in our cheeks. An identical arm’s-length of DNA in every one of the trillions of cells of our bodies (except red blood cells). And somewhere in that sequence of 3 billion As, Ts, Cs and Gs is presumably the variation that let modern humans prosper at the expense of our Neanderthal neighbors.”
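
For readers who like to see the four-letter code in action, here is a minimal Python sketch of the pairing rule the essay describes (A with T, G with C), plus a naive position-by-position comparison of two strings. The function names and the two short sequences are made up for illustration; they are not the human or elephant strings from the Kolbert article, and real species comparison needs proper sequence alignment rather than a simple mismatch count.

```python
# Pairing rule from the essay: A always pairs with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base, using the pairing rule."""
    return "".join(PAIRS[base] for base in strand.upper())


def count_differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical example strings (assumption: invented for this sketch).
seq_one = "ACGTTGCA"
seq_two = "ACGATGCA"

print(complement(seq_one))                  # partner strand of seq_one
print(count_differences(seq_one, seq_two))  # number of mismatched letters
```

Applying `complement` twice returns the original string, which is simply the pairing rule read in both directions.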