''An interesting question to ponder: How many contributors to this page are:''
* ''Biologists (molecular or otherwise), biochemists, chemists, geneticists, physicians, chemical engineers, etc?''
* ''Degree-holders in any of the above fields, or a related field (even if not an active practitioner)?''
* ''Students in any of the above fields, or a related field? (Student means "studying the field for a degree (or at least a minor)", not "I got an A- in freshman biology")''

You're going for appeal to authority? How about just "how many contributors know WTF they're talking about?" The answer may vary depending on whether you ask a contributor or his audience. ;-) More seriously, it's usually pretty obvious from which paragraphs go into technical detail and which do not.

''I'm not an authority myself, though I'm mostly a lurker on this page. One interesting phenomenon I've long observed on Wiki (and elsewhere) is the amazing ability of programmers to expound at length on topics which have little to do with programming. I don't doubt that many of the people here are quite familiar with the material; OTOH I suspect that much of the material here is of similar quality to what you would find were a forum full of biologists to start discussing programming or software design.''
* Quite. It's clear that some of the contributors here really do know the subject in some depth, and others of course do not. But you know, the same thing happens on the pages talking about programming topics. It's not like every programmer knows every aspect of software.
** But most of 'em think they do. :)

''Of course, a good argument can be lots of fun... :)''
----
The building blocks of life: http://en.wikipedia.org/wiki/DNA

DNA uses a very clever built-in redundancy: one strand of DNA (half a double helix) can be used to create the other strand. Many organisms find this level of redundancy to be insufficient and additionally maintain a duplicate molecule as a backup.
It appears that OnceAndOnlyOnce isn't always workable.
* It is roughly equivalent to disk mirroring. Is disk mirroring a "bad" violation of OnceAndOnlyOnce? It is mostly a hardware safeguard rather than conceptual duplication from the user's standpoint.
** IntentionalRedundancyDoesNotViolateOnceAndOnlyOnce

''Unfortunately, your genome has no revision control system.'' (Yet.)

''It may be noted that diploidy, a second redundancy made by keeping two copies of each chromosome carrying the same genes, was probably advantageous because it lets things keep old versions of genes around while still working off new versions. It falls considerably short of a whole revision control system, though.''

Why doesn't DNA have a CheckSum?

''It does, but it is external. If the DNA is wrong, the organism dies and only workable DNA moves on. Otherwise it is just a bug, and the usual unit tests - life, school, work - will probably detect it.''
* Don't use that word for it. It's an insult to common sense. DNA is ''never'' refactored. It's cobbled together, tinkered with, messed up and fucked over, but it is never, ''ever'' refactored!
** It is functionally refactored all the time. The process may be messy, and without deliberate design, but the results are still quite elegant. The most straightforward example of refactoring would be viral genomes. Viruses are under enormous pressure to make their genomes small, so you often see examples of genes that have meaning when read both forwards and backwards, or overlapping coding sequences, or even self-modifying code. There isn't any "new" functionality (although this can also happen), but rather a more efficient expression of existing functionality. Viral code resembles the best results of "hand-coded optimized AssemblyLanguage".

Plus, DNA does have checks on it, whenever it is copied (which is the only time base replacement can reasonably occur). They're reasonably good, in fact. They simply aren't perfect.
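The strand redundancy described above (one strand can be used to create the other) and the idea of reading a sequence "both forwards and backwards" can be sketched in a few lines. This is only an illustration, assuming plain A/T/C/G strings; the function names and the use of "N" to mark an unreadable base are made up for this example and say nothing about how real repair enzymes work.

```python
# Watson-Crick pairing: each base determines its partner on the other strand.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the partner strand, position by position."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Return the partner strand as it would be read in its own direction."""
    return complement(strand)[::-1]

def repair(damaged: str, partner: str) -> str:
    """Fill unreadable positions (marked 'N', an assumption of this sketch)
    using the position-aligned partner strand."""
    return "".join(PAIR[p] if d == "N" else d for d, p in zip(damaged, partner))

strand = "ATGGCCTAA"
partner = complement(strand)
print(partner)                       # TACCGGATT
print(reverse_complement(strand))    # TTAGGCCAT
print(repair("ATGNCCTNA", partner))  # ATGGCCTAA
```

The point of the sketch is only that either strand carries enough information to rebuild the other, which is what makes the "mirroring" comparison apt.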
The checks are actually vastly superior to "reasonably good". Cells make a copying error on average only once per 10^10 bases copied.

''I counted it as only reasonably good because there are other sorts of mistakes which can be made, e.g. in chromosome segregation, and I'm not as sure about those.''

''Unless I'm misreading what you wrote, that speaks to the quality of the copying, not to the goodness of the error detection/correction.''

No, the overall error rate for a round of replication depends on both the error rate during copying and the rate of error detection and correction after copying. Some of the enzymes which do the copying (DNA polymerases) have built-in mechanisms for detecting and correcting their errors, but these are limited; in order to copy the genome in a reasonable amount of time, the enzymes must be error-prone. There is a whole other set of enzymes that are involved in correcting errors in the already-copied genome.
* This is true. The main replicative polymerases (that do the copying) make mistakes at a frequency of about 10^-5. They then have an auto-correcting function (proofreading) that lowers this frequency by two orders of magnitude down to 10^-7. There is then a completely separate system that follows on afterwards to do error detection and subsequent correction, decreasing the error frequency by three more orders of magnitude to a final frequency of 10^-10. These results are for the ''E. coli'' bacterium. Human mileage may vary, but probably doesn't. :)
* Of course, there are other related factors, not all of which are strictly similar, such as simple redundancy of critical genes. Some of the evolutionarily newest human genes are insufficiently redundant, giving rise to severe genetic disease from single base-pair mutations. This sort of thing becomes progressively less likely with evolutionary age of a gene, for obvious reasons.
-- dm
* ''[...in order to copy the genome in a reasonable amount of time...]'' How fast is a reasonable amount of time? How do the individual operations involved in copying the genome compare to the instructions of MachineCode with regard to speed?
* '''Everything''' sequential in biology is glacially slow compared with modern electronics, although parallelism more than makes up the difference in some systems (e.g. the brain). Prokaryotes (e.g. ''E. coli'') copy about 1000 base pairs per second, eukaryotes (e.g. humans) copy about 50 base pairs per second. http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/D/DNAReplication.html
** While this may be true for information processing, it is certainly not true for manufacturing. Can an electronic machine that generates covalent bonds between nucleotide monomers even be constructed?

Most organisms, ''E. coli'' and ''Homo sapiens'' included, have atrocious DNA error-correction. This is proved by the extreme vulnerability of most species to ionizing radiation when compared with ''Deinococcus radiodurans'', which can tolerate doses thousands of times higher than those lethal to your average dumb happy-go-lucky species. The lesson is clear; most species don't WANT tolerable error-correction. 10^-10 is shit.
* Ionizing radiation doesn't kill organisms by making errors in DNA but rather by physically fragmenting chromosomes. There are repair mechanisms for this form of damage as well, but it would be wrong to call this "error correction", in the same way that a computer's error-correction mechanisms aren't going to be much help after you smash the chips with a hammer.
* 10^-10 is perfectly acceptable. At this level of correction, the organism passes all its unit tests (can it live and reproduce?). Higher levels of correction would violate YAGNI.

DNA fragmentation is as much a sort of error to expect in a physical system as base transition in copying is.
It's simply a far less common one, especially in multicellular organisms, so they have extremely careful checks on the latter (especially eukaryotes) and minimal checks on the former (especially eukaryotes). But evolutionary sufficiency isn't the same thing as YAGNI, because it allows the possibility that you might very seriously need it, just not that everyone will all the time. To say people don't need resistance to large doses of radiation is ignoring cases where they were exposed to it.
* Actually, the cellular systems that sense and respond to physical damage like broken chromosomes are also quite sophisticated and efficient, and the spontaneous breakage of chromosomes is not a particularly rare event. It didn't seem particularly germane to the issue of "error correction", since it's a sort of damage orthogonal to simple information-content damage in the form of miscopied bases. Note also that although ''people'' (instances) might on occasion need resistance to large doses of radiation, ''humanity'' (the class) does not.
* The other author neglected to mention that the worst short-term damage comes from ionization-produced free radicals that interfere with basic cell metabolism; damage to DNA is secondary after that.
** Untrue that this is the "worst short-term damage". Actual cell killing by ionizing radiation (the measure the first author mentioned) is directly related to the number of DNA double-strand breaks formed, and not to amounts of other forms of damage.
----
'''Central Dogma'''

What is the Central Dogma of Biology?

That the sequence of amino acids in proteins can be deduced solely from the sequence of nucleotides in the genes which code for them.
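As a toy illustration of that claim, here is a minimal sketch of the DNA to RNA to protein information flow. It assumes the standard genetic code, a coding strand with no introns, and translation starting at the first ATG; the function names are invented for this example, and none of the regulation or exceptions argued about elsewhere on this page are modelled.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering;
# '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA (T becomes U); no splicing is modelled."""
    return coding_strand.replace("T", "U")

def translate(mrna: str) -> str:
    """mRNA -> one-letter amino acid string, from the first AUG to the first stop."""
    dna = mrna.replace("U", "T")
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":  # stop codon ends the coding sequence
            break
        protein.append(amino)
    return "".join(protein)

mrna = transcribe("CCATGGCCTAAGG")
print(mrna)             # CCAUGGCCUAAGG
print(translate(mrna))  # MA
```

Note that the lookup is many-to-one: several codons map to the same amino acid, so the protein can be deduced from the gene but not the other way round.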
Another way to put it is that inheritance of traits is exclusively via the gene, which is a sequence of DNA limited by start and stop codes in the Universal Genetic Code (a single code is used to express the meaning of DNA base pair triplets in essentially all organisms in all 5 taxonomic kingdoms, with the exception of organelles such as mitochondria that became cellular symbiotes prior to a finalization of the universal code - organisms using alternate DNA encodings became symbiotes of universal code users or became extinct). ''Not true; see discussion moved below.''

http://psyche.uthct.edu/shaun/SBlack/geneticd.html
* '''"gene":''' The ''gene'' is the DNA making up the "coding sequence", plus all the control elements responsible for expression and translation of the coding sequence. This would include transcription promoters and terminators, 5' and 3' untranslated regions, introns and various associated splicing signals, poly-adenylation signals, RNA nuclear export and/or degradation signals, etc. Mechanisms of post-translational modification, although clearly vital for correct protein function, are generally not considered to be part of the ''gene'' per se; that is, a gene is everything necessary to get you to the primary sequence of the polypeptide chain, but not beyond that point. Also note that the "gene" generally consists of a linear DNA sequence only. Epigenetic factors such as chromatin structure, methylation status etc. are generally not considered part of the "gene". The inclusion of factors which act in ''trans'', such as micro-RNA translational inhibitors, is somewhat nebulous at this time, although they would also generally not be included in what would be considered the "gene".
* '''"coding sequence":''' The coding sequence is that region of the ''mRNA'' between the start and stop codons which is translated into protein.
There are (almost always) corresponding pieces of DNA from which the mRNA is derived (as asserted by the "central dogma"), sometimes (usually, depending on species) separated by introns, which are transcribed but subsequently "spliced out" and therefore not translated. DNA that directly corresponds to the translated sequence generated from mRNA by reverse-transcription is sometimes called a "cDNA".
** [I've never heard this definition of "gene". I've always heard it defined as "the smallest unit of heredity."] ''The above is a molecular biology definition of "gene", contrasted with "unit of heredity" which is more of a population genetics definition. They are roughly equivalent.''

At its essence, the "central dogma" is about the flow of information: that information stored in DNA is passed to information stored in RNA which is passed to information stored in protein.
----
'''Problems with the "Central Dogma"'''

It might also be a good place to mention that the Central Dogma of Biology is wrong, completely dead wrong. And the various mechanisms responsible for this; methylation, et cetera. Ah, here's a nice summary:

"The central dogma says that information flows in a rigid way within a cell, originating in the DNA, moving to the RNA, which then couples with a ribosome to create proteins out of the naturally occurring amino acids according to the universal genetic code"

And this is how wrong the Central Dogma is: nowadays, serious molecular biologists have given up on the notion of "genes"! There are coding sequences and non-coding sequences, but no "genes" as such.
* Serious molecular biologists talk about "genes" every day, but serious molecular biologists also understand the limitations of the terminology, and understand that the people they talk to will also understand these limitations.

I knew about it 10 years ago. I was laughed at. Just goes to show the difference between the avant-garde and the rear-garde.

''Yep.
But it's one thing to know the truth, it's another to be able to prove it in a reproducible way, which is why a minority of specialists in the field who knew better were largely ignored until recent years.''
* [It's not surprising that someone a decade ago could have guessed our knowledge is incomplete, nor would it have been much help on its own.]

Unfortunately, I don't know exactly how methylation works. Also, there's something I've read recently, to do with RNA being active (working as an enzyme?) which I've completely forgotten.
* There are now many examples of enzymatic RNA (first noted in 1982), and this was big enough news to win Sidney Altman and Tom Cech the NobelPrize in chemistry in 1989. DNA has also been shown to be capable of enzymatic activity in some cases.

''There are several other known ways that the dogma is broken in addition to those, btw; the list keeps growing.''

Details please.

''There's actually a pretty nice summary in the famous Human Genome Project "final report" or whatever it was called that was published in Nature. One area concerns things that are encoded with the help of introns, or exclusively by introns, such as the infant form of haemoglobin. Or genes that are encoded as claimed by the old Dogma, '''but''' that do not get expressed without the help of an "upstream promoter" in the intron.''

''Even less well understood are the cell metabolic machinery's many mechanisms, which control expression of all DNA, but are not wholly coded for by DNA themselves; parts are simply copied from parent to child cell during mitosis, giving rise to the classic (i.e.
I made it up the other day ;-) TheLawOfMutatingBinaryImages problems: you can't reproduce an entire functioning cell purely from DNA, you have to start with a functioning cell (so forget about Jurassic Park; in all likelihood, dinosaur DNA is not quite compatible with any existing cell's metabolic machinery).''

''The heritable methylation that you mention has gotten the most press, I think.''
* The Central Dogma is a useful thing to teach high-school students, kind of like the way physics teaches the Bohr model of the atom to high-school students, or even Newtonian physics. Ok, sure, we all know they're wrong, but they are useful teaching models nonetheless. The "central dogma" that DNA is transcribed to RNA and then translated to protein is correct in the vast majority of cases, with clear exceptions such as retroviruses. For examples of heritability that is not based in DNA, there are RNA viruses (heritable RNA) and prions (heritable protein structure). Heritable methylation patterns and similar phenomena are collectively referred to as "imprinting".
* ''Really bad analogy, unless you intended to self-destruct your argument. The Bohr model of the atom is not only a useless thing to teach to high-school students but it is actually harmful. Teaching Newtonian mechanics is actively harmful to the education of modern physics. It is NOT an acceptable approximation of reality, it is NOT "okay". Establishing concepts of classical reality only to have to knock them down in future is a retarded process. It is degrading, wasteful and contemptuous of the students' intellect.''
* There's a huge difference. Newtonian mechanics is philosophically wrong, accurate only as a way to approximate results, and so valid as an engineering tool but not as a way to understand physics. The processes described in the central dogma are still an enormous part of how a cell functions.

I suppose a way to put it is that it's dead wrong as a dogma, but useful if incomplete as a model.
Would you agree with that? The Central Dogma of Genetics having been invalidated makes it sound like all our information on molecular biology is invalid. It's not, it's just incomplete.

''What do you call it when the missing pieces are more important than what's been discovered so far? Because in many important respects, methylation is more important than the rest of DNA to protein translation. It explains human history.''
* Can you elaborate on the important effects of methylation on protein translation?
* ''Methylation controls gene expression. It's responsible for, among other things, the level of stress response in human beings. And that in turn is responsible for a great deal of human history, especially its otherwise inexplicable decreasing bloodthirstiness.''
** Gene expression is transcription, not translation. Methylation, although an important modulator of transcription in some contexts, is by and large much less important than a wide variety of other mechanisms of transcriptional regulation.
* First, methylation is no more an explanation of human history than quantum mechanics is; it's an explanation of human behavior, from which history is a second-order effect. Second, I don't see why it's so fundamentally opposite to the central dogma. It was definitely always known that conditions in the cell could affect protein construction and expression, and probably always known they could affect translation and transcription. So modulation of these turns out to be more important than thought. Does this remove the central role that translation and transcription have in protein construction? Not at all. Your case seems remarkably over-stated, and worse, this sort of emphasis discourages recognition of the importance of new findings.
** The truly MASSIVE difference between methylation imprinting and mere DNA encoding is what they enable at the higher level of organisms interacting with each other and their environments. DNA encoding allows for Darwinian evolution.
Imprinting of methylation patterns allows for Lamarckian evolution. This is a far cry from "hmmm, this effect is more important than we thought", it's rather "holy shit, what the hell is that???"
** Second, you downgrade the importance of imprinting and the Central Dogma of Biology, so it's not like you are likely to appreciate its being overturned. So let me put this into simpler terms: strict Darwinian evolution has been overturned.

Strict Darwinian evolution was overturned with the discovery of lateral gene transfer in bacteria and via viruses. Incidentally, it didn't completely overturn the way we think about evolution - it was a modification, not a revolution. But how does methylation imprinting help in multicellular organisms, unless improvements made in somatic cells can be passed to reproductive cells, which is not inherent in any of the above? As for the central dogma and its importance, I understand them well enough, and I know full well that it was over-relied upon, but see above.
* Imprinting is by definition that non-sequence information passed to reproductive cells. Generally it doesn't have anything to do with response to environment, but with basic-level control of gene expression. Several known diseases are linked to imprinting or to defects in imprinting. A human imprinting map is at http://genes.uchicago.edu/upd/ but note that sequence-level alteration is orders of magnitude more influential on organismal development than imprinting is (not to devalue imprinting, just to provide some perspective).

Regarding "Central Dogma" being an existing term, not something made up here: Crick invented it in 1957.
A quick Google search turns up 44 hits on nature.com alone: http://www.google.com/search?sourceid=navclient&q=%22central+dogma%22+site%3Awww%2Enature%2Ecom

The Human Genome Project final report was published in Nature, Feb 2001, see http://www.nature.com/cgi-taf/dynapage.taf?file=/nature/journal/v409/n6822/index.html#humang

In particular see the 60 page summary article, "Initial sequencing and analysis of the human genome"
* HTML: http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v409/n6822/full/409860a0_fs.html
* PDF: http://www.nature.com/cgi-taf/DynaPage.taf?file=/nature/journal/v409/n6822/full/409860a0_fs.html&content_filetype=pdf
----
'''Evolutionary taxonomy discussion:'''
* Incidentally, mitochondria and chloroplasts came from bacteria, one of the five kingdoms (''but modern bacteria do not share their DNA encoding scheme with mitochondria and chloroplasts, which makes it quite dubious that they arose from the same kind of "bacteria"'').

Not at all. Chloroplasts ''definitely'' evolved from cyanobacteria (including ''Prochloron'' and such). Variant genetic codes are found within a few groups - for instance the ciliates among the protozoa, and a peculiar species of yeast among the fungi. It's obviously possible, though not common, for the scheme to change ''within'' a group. The general outlines of the code remain the same, however. It was always known, however, that the code doesn't uniquely determine protein composition because some amino acids (most notably cystine) aren't even in the code.
** As a side note, molecular genetics has somewhat obsoleted the "five kingdom" taxonomy. The modern view is three kingdoms (super-kingdoms?) consisting of Bacteria, Archaea and Eukaryota.
** Not true. Carl Woese's 3 Domains are not replacements for the (3/4/5/6/7) kingdoms, they are a higher order of taxonomy. For instance, the domain Eukarya includes both plants and animals, but that doesn't mean that they are in the same kingdom.
The exact number of kingdoms is a subject of ongoing debate; six kingdoms is a fairly popular choice, but not universal.
*** Maybe we should call them "empires"...
*** Empires is the traditional name, but Woese chose domains to identify the uniqueness of his construction. A few authors reject it, arguing that the conventional tree is misrooted and that the Archaea and Eukaryota probably arose within the Bacteria, in which case the Archaea aren't really distinct enough to be a separate group. Exactly how ''obsolete'' the five-kingdom system is depends on how much you agree with those who insist on classifications composed of clades, groups containing an ancestral form and ''all'' its descendants, but it should be noted such classifications can't actually handle ancestral species. This could be discussed elsewhere.
** Molecular genetics has also somewhat obsoleted traditional taxonomies and introduced cladistics.
*** Again, not true. Cladistics is not a method of taxonomy, it is a method of investigating evolutionary relationships, which classifications have been expected to reflect since the idea of evolution became widely accepted. Thus, most changes associated with molecular genetics are simple consequences of new ideas about how the creatures evolved. There is currently an argument over whether taxa must be clades (an ancestor and all descendants) or may be grades (including only some), but practitioners of both may make use of molecular genetics and cladistics to construct their tree. On-line materials often identify cladistics with the former, but they are wrong, and there are some good though rarely acknowledged arguments for the latter (namely every species, genus, etc. must be the descendant of another).
----
This might be a good point to introduce PNA (PeptideNucleicAcid). Because DNA sucks.
* If DNA "sucks", then what are the advantages of PNA?
** PNA sucks. Since the backbone is uncharged, it's too hard to separate the strands of a PNA-PNA duplex.
This makes it difficult for the information in the PNA to be read and/or copied.
----
Discussion about parallels between genetics and memetics moved to MemeticsGenetics.
----
See also: GeneticCode BiologicalDeadlock DnaCancerBasis