PubMed Commons Comments

The NCBI stopped PubMed Commons commenting in March 2018, and links to PubMed abstracts were cut. Although we did not know it, we contributors to PubMed Commons from 2013 to 2018 had been guinea pigs. It had been an "experiment." The comments were stored in NCBI Excel files in a form that was difficult to access. Some comments stored in The Wayback Machine were retrievable. Comment files were also made available by the Hypothesis website and the PubPeer website.

As a backup, I also placed my comments here. The comments below are those made after Sept 7 2016, ending Feb 9 2018; there are links to the Wayback site for earlier comments. My list is not entirely complete. To illustrate the issue, I begin with a posting I made in The Scholarly Kitchen Blog on 7 March 2018.


The pre-publication “peer review initiatives,” which Tim Vines cogently deplores as “a curious blindness,” prompt similar thoughts concerning a notable post-publication peer review initiative that the NCBI began in 2013. Whereas prepublication review provides some index of quality, a more exacting review is made after publication by readers who are influenced by, and may act upon, the information they have obtained. Citations provide one index of this. The other is post-publication peer review as provided by PubMed Commons. Remarkably, the results of the latter may feed back into the pre-publication peer review process.

Those charged with reviewing a new paper, or grant application, or even a Nobel prize suggestion, are confronted with an author’s list of publications. Pasting each title into PubMed Commons, one sees an abstract of the publication, sometimes accompanied by the freely-given post-publication peer-review comments.

The comments were often meant to be constructive, and were monitored for politeness by PubMed staff. Even so, it seems likely that sometimes authors, editors, the original pre-publication peer reviewers, and even publishing houses, were embarrassed by them. Flaws, sometimes of a degree that Leonid Schneider so rightly deplores in his webpages, emerged, not only in works of authors world-wide, but also in the works of NCBI staff, which includes expatriates from Russia and other countries.

Sadly, the NCBI have now declared PubMed Commons an “experiment.” An experiment that failed. The criterion was the number of comments received, not their quality. Despite important feedback when its intention to terminate was announced, the NCBI’s post-publication peer-review ended last week, and with it, a source of invaluable assistance to the pre-publication (and pre-Nobel) peer-review process!



PubMed Commons Comment on “Antigen Identification for Orphan T Cell Receptors Expressed on Tumor-Infiltrating Lymphocytes” Gee et al. 2018 Cell 172:549-563. [7 Feb 2018]


This study reveals the great power of the yeast-library approach in displaying peptide-MHC complexes that can be recognized by T-cells that have infiltrated tumors (TILs). As stated (1), the approach “requires no a priori knowledge regarding the nature of these antigens,” and “is an unbiased interrogation of TCR specificities.” While conceding that “we cannot conclude that any TIL TCR is exclusively present within tumor due to limited sampling of healthy tissue,” the authors express surprise that, of the four receptors identified, three recognized unmutated self-antigens. It can be noted, however, that this had been predicted on theoretical grounds two decades earlier (2). While the underlying theory received subsequent modification (3), and may indeed be entirely incorrect, the correlation of this intriguing experimental observation with theory may be worth noting.

1. Gee et al. (2018) Antigen identification for orphan T cell receptors expressed on tumor-infiltrating lymphocytes. Cell 172:549-563. <PMID:29275860>
2. Forsdyke DR (1999) Heat shock proteins as mediators of aggregation-induced "danger" signals: implications of the slow evolutionary fine-tuning of sequences for the antigenicity of cancer cells. Cell Stress & Chaperones 4:205-210. <PMID:10590834>
3. Forsdyke DR (2001) Adaptive value of polymorphism in intracellular self/not-self discrimination? Journal of Theoretical Biology 210:425-434. <PMID:11403563>

PubMed Comment on “An Evolutionary Perspective on the Systems of Adaptive Immunity” by Muller V, Boer RJ de, Bonhoeffer S, Szathmary E (2018) Biol Rev 93: 505-528 <PMID:28745003> [20 Jan 2018]


The distinction between selective and instructive (Lamarckian) systems of immunity (1) – originating with Paul Ehrlich – was clearly set out in 1957 by Talmage (2) who, with Burnet, can be considered a “father of clonal selection theory” (3, 4). Its historical omissions aside, this bold attempt to place the evolution of immune systems in a broad context raises other concerns.

Although mentioning “the complex adaptation of the immune repertoire to the antigenic environment,” and the need “continuously to acquire and store open-ended information about the antigenic environment,” the coevolution of that antigenic environment (e.g. the coevolution of pathogens) does not seem to have been considered.

While the authors agree with Burnet that “distinguishing tumours from normal self is likely to be the most challenging task for Darwinian immunity,” it is not recognized that the most successful pathogens are those that, through mutation, can come close to self. Whereas tumours represent mutations away from self, successful pathogens represent mutations towards self (by means of which they seek to exploit ‘holes’ in immune repertoires; 5). In both circumstances, this greatly simplifies the evolutionary task of a host. It does not have to depend on “the open-ended nature of the receptor repertoire.” It does not have to “[constitute] a system of ‘unlimited heredity’ within the immune system.” It does not have to “be broad enough to recognize the ‘potential universe of antigens’.” The scope of its task is greatly reduced.

As long ago proposed (6), and increasingly recognized (7, 8), it would be evolutionarily advantageous for organisms to focus their immune cell receptors on ‘near self’ antigenic specificities, rather than to attempt to anticipate the entire universe of antigens. Organisms achieve this, not through negative, but through positive selection of their immune repertoires. From the outset, organisms and their pathogens have coevolved and it would seem incorrect to suppose for the immune system that positive selection “could only be added at advanced stages of its evolution” (9). It is fundamental to immune system evolution.

1. Muller V, Boer RJ de, Bonhoeffer S, Szathmary E (2018) Biol Rev 93: 505-528 <PMID:28745003>
2. Talmage DW (1957) Allergy and immunology. Ann Rev Med 8:239-256 <PMID:13425332>
3. Forsdyke DR (1995) The origins of the clonal selection theory of immunity. FASEB J 9:164-166. <PMID:7781918>
4. Lederberg J (2002) Instructive selection and immunological theory. Immunol Rev 185:50-53. <PMID:12190921>
5. Calis JJA, de Boer RJ, Kesmir C (2012) Degenerate T-cell recognition of peptides on MHC molecules creates large holes in the T-cell repertoire. PLoS Comput Biol 8: e1002412. <PMID:22396638>
6. Forsdyke DR (1975) Further implications of a theory of immunity. J Theoret Biol 52:187-198. <PMID:50501>
7. Vrisekoop N, Monteiro JP, Mandl JN, Germain RN (2014) Revisiting thymic positive selection and the mature T cell repertoire for antigen. Immunity 41: 181-190. <PMID:25148022>
8. Marrack P. et al. (2017) The somatically generated portion of T cell receptor CDR3alpha contributes to the MHC allele specificity of the T cell receptor. eLife 6: e30918. <PMID:29148973>
9. Forsdyke DR (2016) Evolutionary Bioinformatics. 3rd Edition. Springer, New York.

This provoked the following response:
Viktor Müller, 2018 Jan 26, 7:13 p.m.
While it is true that there is considerable overlap in the recognition of self and (possibly pathogenic) non-self epitopes (Calis et al [1] estimated an overlap of around one third for HLA class I alleles), this is likely to make the job of the immune system harder, rather than easier. The overlapping peptides tend to be non-immunogenic, indicating tolerance, and the immune system needs to be able to target epitopes that are distinguishable from self peptides even with the degenerate recognition of T cell receptors [1]. Furthermore, even if the recognition task was indeed reduced to self and similar peptides, this would still vastly exceed the capacity of a fixed germline-encoded receptor repertoire. The number of distinct potential epitopes (for HLA class I) is of the order of magnitude 10^7 in humans [2] and in mice [3]; this exceeds the maximum number of germline immune receptors found in any species by several orders of magnitude.
We still maintain that "Distinguishing tumours from normal self is likely to be the most challenging task for Darwinian immunity that could only be added at advanced stages of its evolution" [4], but have never claimed the same for positive selection. Amphioxus has proto-MHC, and positive selection might indeed be an ancient characteristic of (vertebrate) Darwinian immunity. It will be instructive to elucidate whether and how the divergent adaptive system of jawless fish handles positive selection, or anything analogous to MHC restriction in general.
Finally, we note that the origin of vertebrate adaptive immunity is a notoriously difficult problem. We certainly do not know the whole truth about the complex events that took place more than half a billion years ago -- but we hope that, by surveying the most recent evidence, we have taken a small step in the right direction.

[1] Calis JJA, de Boer RJ, Keşmir C (2012) Degenerate T-cell Recognition of Peptides on MHC Molecules Creates Large Holes in the T-cell Repertoire. PLoS Comput Biol 8(3): e1002412.
[2] Burroughs, N.J., de Boer, R.J. & Keşmir, C. Immunogenetics (2004) 56: 311.
[3] Müller, V. & Bonhoeffer, S. (2003). Quantitative constraints on the scope of negative selection. Trends Immunol 24, 132-5.
[4] Müller V, Boer RJ de, Bonhoeffer S, Szathmáry E (2018) Biol Rev 93:505-528.


PubMed Commons Comment on “Post-translational peptide splicing and T cell responses.” Mishto M, Liepe J. (2017) Trends Immunol 38:904-915 [3 December 2017]
A cell’s altruistic service to the population of cells that comprise its host organism may be compromised by a foreign pathogen or by a mutated driver cancer gene (both deemed “non-self”). Such _intracellular_ compromising agents can first be addressed by _internal_ sensing and auto-destructive mechanisms. Should one of these fail, then _external_ sensing and destructive mechanisms, involving reactions with specific predatory T cells, may come into play. A compromised cell has the option of displaying peptides as pMHC complexes to see if they are recognized by members of T cell populations that, following thymic surveillance and deletion of nascent strongly self-reacting T cells, are programed to eliminate cells displaying non-self markers.
While such markers may arise from foreign proteins or mutated self proteins, Mishto and Liepe note that the scope of markers (“the antigenic landscape”) can be greatly increased by redesignating potential self markers (unspliced peptides in pMHC complexes) as non-self (1). This creation of foreign from self is achieved by splicing and trimming non-contiguous peptides to create novel peptides that would not have passed thymic filters and so would be seen as non-self. Two corollaries of this are that such peptide splicing must _not occur in the thymus_ and that, to militate against autoimmunity, extra-thymic specific splicing of separate protein segments would _not occur randomly_ in uncompromised cells.
Thus, some elements of an _internal_ sensing mechanism within a compromised cell would be needed to foster the extension of the antigenic landscape. The growing evidence for such a mechanism in the antigen presentation pathway (intracellular self/non-self discrimination) is presented elsewhere (2). I agree that “the unexpectedly large frequency and amount of … spliced peptides may … have profound implications for the concept of self/nonself peptide presentation” (3).

1. Mishto M, Liepe J. (2017) Post-translational peptide splicing and T cell responses. _Trends in Immunology_ 38:904-915. <PMID:28830734>
2. Forsdyke DR (2015) Lymphocyte repertoire selection and intracellular self/not-self discrimination: historical overview. _Immunology and Cell Biology_ 93:297-304. <PMID:25385066>
3. Liepe J et al. (2016) A large fraction of HLA class I ligands are proteasome-generated spliced peptides. _Science_ 354:354-358. <PMID:27846572>


PubMed Commons: Transcribed Junk Remains Junk If It Does Not Acquire A Selected Function in Evolution. Sverdlov E. Bioessays. [30 Nov 2017].


A “peculiarity of human thinking” invokes sad head-shaking in some quarters. It is argued, not only that “the vast majority of low abundant transcripts are simply junk,” but also that such junk is “simple” (1). Those led to think that junk DNA serves the organism (i.e. can under some conditions be functional and hence selectively advantageous) are labelled “determinists.” They can scarcely be distinguished from “ID believers”! There is no mention of the two-decade-old view that very low abundance transcripts (VLA RNAs) represent an intracellular antibody-like repertoire, for which much evidence has since accumulated (2-4).

For microorganisms, the CRISPR system provided a clear example of the functionality of the transcription of their spacer “junk DNA.” Ledford notes that the system “adapts to, and remembers, specific genetic invaders in a similar way to how human antibodies provide long-term immunity after an infection” (5). Just as we have germline cascades of V genes that confer immunological specificity on B and T lymphocytes, so microorganisms have their germline spacers that confer a similar specificity on their RNA populations. However, the functionality of an individual spacer “sense” transcript is only tested when a virus with a specific “antisense” sequence enters the cell. Transcription is conditional. The selective advantage can only emerge when the corresponding pathogen attacks.

Thus, the analytical problem is not so “simple” as showing by experimental DNA deletion that the transcript of a specific eukaryotic gene is functional, or as postulating a requirement for “unacceptably high birth rates.” Deletion of a single human V-region gene could show no selective effect if no corresponding pathogens invaded the body. Even if there were such an invasion, other V-regions would likely be able to compensate for the deletion. Similarly, deleting a segment of “junk” DNA is unlikely to impact survival if some of the wide spectrum of alternative “junk” transcripts can compensate for this defect in the RNA antibody-like repertoire.

1. Sverdlov E (2017) Transcribed junk remains junk if it does not acquire a selected function in evolution. BioEssays doi: 10.1002/bies.201700164. <PMID:29071727>

2. Cristillo AD, Mortimer JR, Barrette IH, Lillicrap TP, Forsdyke DR (2001) Double-stranded RNA as a not-self alarm signal: to evade, most viruses purine-load their RNAs, but some (HTLV-1, Epstein-Barr) pyrimidine-load. J Theor Biol 208:475-491. <PMID:11222051>

3. Forsdyke DR, Madill CA, Smith SD (2002) Immunity as a function of the unicellular state: implications of emerging genomic data. Trends Immunol 23:575-579. <PMID:12464568>

4. Forsdyke DR (2016) Evolutionary Bioinformatics. 3rd edition. Springer, New York, pp. 279-303.

5. Ledford H (2017) Five big mysteries about CRISPR’s origins. Nature 541:280-282. <PMID:28102279>


Viral taxonomy: the effect of metagenomics on understanding the diversity and evolution of viruses
(2017) EMBO Reports 18:1693-1696 Philip Hunter (Posted to PubMed Commons Oct 2 2017)

This otherwise admirable article <PMID:28877930> begins with the curious assertion that, “since they depend on their host for replication,” then viruses cannot “be categorized as species on the basis of reproductive isolation.” The latter prevents recombination between organisms and so forms the most generally accepted definition of species. Viral species whose members share a common host cell, and depend on that cell for their replication, are still able to retain their species individuality. Their members do not mutually destroy each other by recombinational blending of their genomes. They are reproductively isolated from each other.
When we compare two viral species that have a common host cell, with two viral species that, even within a common host, do not share a common cell, we would expect to observe a fundamental difference related to their reproductive isolation mechanism. If that fundamental difference is found to apply to other viral pairs that occupy a common host cell, then a fundamental isolation mechanism has been identified.
Such a difference was first related to the base compositions of insect viruses (1), and then to the base composition of herpes viruses (2). A more extreme example arose from studies of retroviruses that share a T lymphocyte host. The AIDS virus (HIV1) and human T cell leukaemia virus (HTLV1) can be assumed to have evolved from a common ancestor. Differentiation of members of that ancestral species within a common host cell into two independent populations would have required some mechanism to prevent their blending. Thus, we see today a wide divergence in base compositions. HIV1 is one of the most AT-rich species known. HTLV1 is one of the most GC-rich species known (3). There is high differentiation of chromosomal nucleic acids.
In these viruses there has been no opportunity for other reproductive isolation mechanisms to supersede chromosomal mechanisms. Diffusible cytoplasmic products make the subsequent evolution of genic incompatibilities less likely, and being in a common host cell there is no equivalent of prezygotic isolation as conventionally understood (4).

1. Wyatt GR (1952) The nucleic acids of some insect viruses. J Gen Physiol 36:201-205. <PMID:13011277>
2. Schachtel GA et al. (1991) Evidence for selective evolution of codon usage in conserved amino acid segments of human alphaherpesvirus proteins. J Mol Evol 33:483-494. <PMID:1663999>
3. Bronson EC, Anderson JN (1994) Nucleotide composition as a driving force in the evolution of retroviruses. J Mol Evol 38:506-532. <PMID:8028030>
4. Forsdyke DR (1996) Different biological species "broadcast" their DNAs at different (G+C)% "wavelengths". J Theoret Biol 178:405-417. <PMID:8733478>


PubMed Commons Commentary on “Effects of thymic selection on T cell recognition of
foreign and tumor antigenic peptides” by George, Kessler and Levine
PNAS 114: E7875–E7881 [27 Sept 2017]


A major conclusion of this elegant modeling study is that “TCR selection against self-peptides has a minimal influence on the recognition of peptides which are ‘close’ to self.” Thus, “TCR negative selection by host peptides has only a weak suppressive effect on detecting peptides which closely resemble self.” This agrees with a somewhat less elegant modeling study which invoked lymphocyte clones selected for anti-“near-self” immune reactivity. These would normally have escaped negative selection (i.e. would have been positively selected; 1). The “near-self” viewpoint contrasted with the then prevailing “altered self” viewpoint (2). However, whereas George et al. (2017) regard their study as “empirical,” the earlier study (1) arose from consideration of alloreactive phenomena and recognized implications for cancer immunotherapy in keeping with an “overall objective of optimizing CRL therapy” (3, 4). Full historical reviews are available (5, 6).

1. Forsdyke DR (1975) Further implications of a theory of immunity. J Theor Biol 52: 187-198. <PMID:50501>
2. Forsdyke DR (2005) “Altered-self” or “near-self” in the positive selection of lymphocyte repertoires? Immunol Lett 100: 103-106. <PMID:15894383>
3. Forsdyke (1977) Grant application:
4. Forsdyke DR (1999) Heat shock proteins as mediators of aggregation-induced "danger" signals: implications of the slow evolutionary fine-tuning of sequences for the antigenicity of cancer cells. Cell Stress Chaperones 4: 205-210. <PMID:10590834>
5. Forsdyke DR (2012) Immunology (1955-1975): The natural selection theory, the two signal hypothesis and positive repertoire selection. J Hist Biol 45: 139-161. <PMID:21336661>
6. Forsdyke DR (2015) Lymphocyte repertoire selection and intracellular self/not-self discrimination: historical overview. Immun Cell Biol 93: 297-304. <PMID:25385066>


MISLEADING BRAIN FIGURE (as detected in Neuroskeptic blog) [7 Sept 2017]


PubMed Commons Comment on Yaseen et al. (2017) FASEB J 31, 2210–2219 “Lectin pathway effector enzyme mannan-binding lectin-associated serine protease-2 can activate native complement C3 in absence of C4 and/or C2” <PMID:28188176> [25 July 2017]


Papers on the lectin pathway (LP) of complement activation in animal sera generally refer to animal mannose-binding lectins (MBLs), with little reference to work with plant MBLs. For example, citing May and Frank (1973), this fine paper states: “Reports of unconventional complement activation in the absence of C4 and/or C2 predate the discovery of LP.” Actually, a case can be made that the discovery of the LP predates May-Frank.
The MASP-binding motif on animal MBL, which is necessary for complement activation, includes the amino acid sequence GKXG (at positions 54-57), where X is often valine. The plant lectin concanavalin-A (Con-A) has this motif at approximately the same position in its sequence (the 237 amino acid subunit of Con-A has the sequence GKVG at positions 45-48). The probability of this being a chance event is very low. Indeed, prior to the discovery of MASP involvement, Milthorp & Forsdyke (1970) reported the dosage-dependent activation of complement by Con-A.
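The chance-event argument above can be gauged roughly with a back-of-envelope sketch. It assumes, purely for illustration, that residues are independent and equally frequent (1 in 20), which real proteins are not, and that "approximately the same position" means a window of candidate start positions whose width is an arbitrary choice. The names `gkxg_chance` and `window_starts` are hypothetical and do not come from the cited papers.

```python
# Rough sketch only: assumes independent, equally frequent residues (1/20 each).
# Real amino acid frequencies differ, so treat the result as an order of magnitude.
UNIFORM = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

def gkxg_chance(window_starts, freqs=UNIFORM):
    """Approximate chance that G-K-x-G starts at one or more of
    `window_starts` candidate positions in a random sequence."""
    p_site = freqs["G"] * freqs["K"] * freqs["G"]  # x may be any residue
    # Start positions are treated as independent, which is an approximation.
    return 1 - (1 - p_site) ** window_starts

# Example: a motif starting within about 10 residues either side of the
# reference position gives 21 candidate start positions.
print(f"{gkxg_chance(21):.4%}")
```

Passing a different `freqs` table (for example, observed residue frequencies) or a wider window shows how sensitive the estimate is to these assumptions.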
As far as I am aware, it has not been formally shown that MASP is involved in the activation of the complement pathway by this plant MBL. Our studies in the 1970s demonstrated that Con-A activates complement through a cluster-based mechanism, which is consistent with molecular studies of animal MBL showing “juxtaposition- and concentration dependent activation” (Degn et al. 2014). References to our several papers on the topic may be found in a review of innate immunity (Forsdyke 2016).

Degn SE et al. (2014) Complement activation by ligand-driven juxtaposition of discrete pattern recognition complexes. Proc Natl Acad Sci USA 111:13445-13450. <PMID:25197071>

Forsdyke DR (2016) Almroth Wright, opsonins, innate immunity and the lectin pathway of complement activation: a historical perspective. Microb Infect 18: 450-459. <PMID:27109231>

May JE, Frank MM (1973) Hemolysis of sheep erythrocytes in guinea pig serum deficient in the fourth component of complement. I. antibody and serum requirements. J Immunol 111: 1671-1677. <PMID:4750864>

Milthorp PM, Forsdyke DR (1970) Inhibition of lymphocyte activation at high ratios of concanavalin A to serum depends on complement. Nature 227:1351-1352 <PMID:5455141>

Yaseen et al. (2017) Lectin pathway effector enzyme mannan-binding lectin-associated serine protease-2 can activate native complement C3 in absence of C4 and/or C2. FASEB J 31: 2210-2219 <PMID:28188176>


PubMed Commons Comment on “Nucleolin directly mediates Epstein-Barr virus immune evasion through binding to G-quadruplexes of EBNA mRNA” Lista et al. (2017) Nature Commun [17 July 2017]


It is good to see the problem of EBV immune evasion focused, not on the translation product of EBNA1 mRNA (1), but on the mRNA itself (2). However, it is puzzling that the sequence encoding the glycine-alanine repeats is enriched not only in guanines (Gs), but also in adenines (As). In such a GC-rich genome (60% GC), there is a scarcity of As, yet they are concentrated in the glycine-alanine repeat-encoding region. In other words, codons have been selected for their general purine-richness, not just for their G-richness (3). While it is conceivable that the As somehow assist the formation of G-quadruplexes by consecutive Gs, consideration might have been given to the hypothesis that the G-quadruplexes may be a helpful by-product of the fundamental need to purine-load the mRNA.
EBV is not alone in this respect. EBV and HTLV-1 share common characters. Both are deeply latent, GC-rich viruses. They persist in their human hosts for long periods often with no obvious detrimental effects. Most of their proteins are encoded by pyrimidine-rich mRNAs. The HTLV-1 provirus encodes its pyrimidine-rich mRNAs in its "top" sense strand. But there is a "bottom" strand transcript. This is heavily R-loaded and is translated into a basic zipper protein (HBZ) which is poorly immunogenic and is increasingly seen, like EBNA-1, as playing a major role in immune evasion (4-6).
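Purine-loading of a transcript can be illustrated with a minimal sketch. The index used here, the excess of purines (A+G) over pyrimidines (C+T/U) per kilobase, is one simple formulation assumed for illustration; the function name `purine_loading_per_kb` and the short example sequence are hypothetical and are not taken from the EBNA-1 mRNA.

```python
def purine_loading_per_kb(seq):
    """Excess of purines (A+G) over pyrimidines (C+T/U) per 1000 bases.

    Positive values indicate purine (R) loading, negative values
    pyrimidine (Y) loading. Characters other than A, C, G, T, U are ignored.
    """
    s = seq.upper().replace("U", "T")
    counts = {base: s.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U bases")
    purines = counts["A"] + counts["G"]
    pyrimidines = counts["C"] + counts["T"]
    return 1000.0 * (purines - pyrimidines) / total

# Hypothetical glycine/alanine-encoding fragment (GGA = Gly, GCA = Ala):
print(purine_loading_per_kb("GGAGCAGGAGGAGCA"))
```

Counting A and G separately in the same way would show whether a region is merely G-rich or purine-rich in general, which is the distinction drawn in the comment.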

1. Levitskaya, J. et al. (1995) Inhibition of antigen processing by the internal repeat region of the Epstein-Barr virus nuclear antigen-1. Nature 375:685–688. <PMID:7540727>
2. Lista MJ et al. (2017) Nucleolin directly mediates Epstein-Barr virus immune evasion through binding to G-quadruplexes of EBNA-1 mRNA. Nature Commun 8:16043. <PMID:28685753>
3. Cristillo AD et al. (2001) Double-stranded RNA as a not-self alarm signal: to evade, most viruses purine-load their RNAs, but some (HTLV-1, Epstein-Barr) pyrimidine-load. J Theor Biol 208:475–491. <PMID:11222051>
4. Cook LB et al. (2013) HTLV-1: Persistence and pathogenesis. Virology 435:131–140. <PMID:23217623>
5. Shiohama et al. (2016) Absolute quantification of HTLV-1 basic leucine zipper factor (HBZ) protein and its plasma antibody in HTLV-1 infected individuals with different clinical status. Retrovirology 13:29 <PMID:27117327>
6. Forsdyke DR. EBV Webpage.


PubMed Commons Comment on: “The CRISPR spacer space is dominated by sequences from the species-specific mobilome,” by Shmakov et al. [19 May 2017] BioRxiv referred to in PubMed comment on the Esposito paper.

The authors mention the virus-host arms race, but not the virus-virus (in common host) arms race (1). Thus, they seek to "improve our understanding of the evolution of the CRISPR spacer space and the virus-host arms race." There was a similar omission in a study of phage-host relationships in mycobacteria (2), upon which I have commented (3).
Apart from this, the slopes of regression plots of phage GC% against host GC% (Figs. 3, 4) indicate relative AT-enrichment in phage. The authors acknowledge our study (4), where we note differences in the pressures on individual codon positions between phage and bacteria. However, it is deemed that "GC-content … of microbial genomes, … and the cognate viral genomes show a nearly perfect correlation and are almost identical" (1).
Such near identity would not be expected from previous arguments (3). Those arguments are instead supported by the observations that "in most cases, there was indeed considerable AT-bias in phages," relative to hosts, although there are cases where "phage genomes had the same composition as the host" (1). The predicted high variance between phages that affect a common host (3), may be reflected in the scatter of points for phages in Fig. 3.
1. Shmakov SA, Sitnik V, Makarova KS, Wolf YI, Severinov KV, Koonin EV (2017) The CRISPR spacer space is dominated by sequences from the species-specific mobilome. BioRxiv preprint (doi: ).
2. Esposito LA, Gupta S, Streiter F, Prasad A, Dennehy JJ (2016). Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias. Microbiol Genom 2(10):e000079. <PMID:28348827>
3. Forsdyke DR (2016) Elusive preferred hosts or nucleic acid level selection? arXiv Preprint ( ).

4. Mortimer JR, Forsdyke DR (2003) Comparison of responses by bacteriophage and bacteria to pressures on the base composition of open reading frames. Appl Bioinformatics 2:47-62. <PMID:15130833> (see 2003 paper)


PubMed Commons Comment: “Heroes of the engram”. Josselyn, Kohler and Frankland. J of Neuroscience 37, 4647-4657 [5 May 2017]


The view that Richard Semon’s work was neglected seems to be based on psychologist Daniel Schacter’s 1982 text (1). This was reissued with a new title and a few changes in 2001, without mention of the profound interim account by historian Laura Otis (2). While the authors cite my 2006 text on Samuel Butler and Ewald Hering, later work corroborates and extends Otis’s study and casts a somewhat different light on the authors’ prime hero (3, 4).
Even if offering a list of heroes that is “entirely personal,” a paper that extolls the “benefits of exploring the history of science” and of acknowledging our “debts … to those scientists who have offered key ideas,” could have mentioned the doubts cast on Semon by Freud and Hertzog, and Semon’s dismissal of Butler’s work as “rather a retrogression than an advance.”

1. Schacter DL (1982) Stranger behind the Engram: Theories of Memory and the Psychology of Science. Hillsdale, NJ: Erlbaum.

2. Otis L (1994) Organic Memory. History and the Body in the Late Nineteenth and Early Twentieth Centuries. Lincoln: University of Nebraska Press.

3. Forsdyke DR (2009) Samuel Butler and human long term memory: is the cupboard bare? J. Theor Biol 258:156-164.<PMID:19490862>

4. Forsdyke DR (2015) "A vehicle of symbols and nothing more." George Romanes, theory of mind, information, and Samuel Butler. History of Psychiatry 26:270-287. <PMID:26254127>



Splendor and misery of adaptation, or the importance of neutral null for
understanding evolution. E. V. Koonin BMC Biology (2016) 14:114 <PMID:28010725>
[Sent to Sandwalk blog and PubMed Commons 3 Jan 2017]


Following a multidisciplinary study of milk production at a dairy farm, a physicist returned to explain the result to the farmer. Drawing a circle she began: “Assume the cow is a sphere … .” (1) This insider math joke may explain Koonin’s puzzlement that “most biologists do not pay much attention to population genetic theory” (2). The bold statement that “nothing in evolution makes sense except in the light of population genetics,” cannot be accepted by biologists when evolution is portrayed in terms of just two variables, a “core theory” that involves “an interplay of selection and random drift.” While mathematical biologists might find it “counterintuitive” that “the last common eukaryotic ancestor had an intron density close to that in extant animals,” this is not necessarily so for their less mathematical counterparts, who are not so readily inclined to believe that an intron “is apparently there just because it can be” (3). While expediently adopting “null models” to make the maths easier, population geneticists are not “refuted by a _new_ theoretical development.” They have long been refuted by _old_ theoretical developments as illustrated by the early twentieth century clash between the Mendelians and the Biometricians (4). It is true that by fiddling with “selection coefficient values” and accepting that “streamlining is still likely to efficiently purge true functionless sequences,” the null models can more closely approximate reality. However, a host of further variables – obvious to many biologists – still await the acknowledgement of our modern Biometricians.

1. Krauss LM (1994) Fear of Physics: A Guide for the Perplexed. Jonathan Cape, London.
2. Koonin EV (2016) Splendor and misery of adaptation, or the importance of neutral null for understanding evolution. BMC Biology 14:114 <PMID:28010725>
3. Forsdyke DR (2013) Introns First. Biological Theory 7, 196-203.
4. Cock AG, Forsdyke DR (2008) “Treasure Your Exceptions.” The Science and Life of William Bateson. Springer, New York.



PubMed Commons Commentary on Klitting R, Gould EA & de Lamballerie X (2016) “G + C content differs in conserved and variable amino acid residues of flaviviruses and other evolutionary groups.” Infection, Genetics and Evolution 45: 332-340. <PMID:27663721> [5 Dec 2016]


The authors correctly note that “the most obvious parameter associated with G + C content is the strength of molecular hybridization of polynucleotide duplexes” (1). Such hybridization controls recombination, which is favored when there is close sequence resemblance between different co-infecting viruses (“complete alignment conserved”), and is impeded when there is less sequence resemblance (“complete alignment variable”). The latter anti-recombination activity can be considered in relation to speciation mechanisms that initiate and retain taxonomic differentiations. As recently noted by Meyer et al., allied species of “viruses that infect the same [host] species and cell types are thought to have evolved mechanisms to limit recombination.” Without such limitations the genomes would blend and co-infectants would lose their independence as distinct viral species. Mechanisms overcoming this selective disadvantage include “divergences in nucleotide composition and RNA structure that are analogous to pre-zygotic barriers in plants and animals” (2).

Thus, a nucleic acid region may be “conserved,” not only because it encodes a protein (i.e. there is “protein pressure” on the sequence), but because it has a specific nucleotide composition (e.g. “GC-pressure”). While protein pressure mainly affects the first and second codon positions, GC-pressure can affect all codon positions. Indeed, at first and second codon positions there may be conflict between pressures, especially when protein pressure is high (i.e. in regions where amino acid conservation is high); then GC-pressure is constrained to vary only at the more flexible third codon position. In contrast, when protein pressure is low (i.e. in regions where amino acid conservation is low), then GC-pressure has greater freedom to affect all codon positions.
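The distinction between codon positions can be made concrete with a minimal sketch that tallies GC% separately at the first, second and third positions of an open reading frame. The function name `gc_by_codon_position` and the toy sequence are assumptions introduced here for illustration; neither comes from the paper under discussion.

```python
def gc_by_codon_position(orf: str) -> dict[int, float]:
    """Return GC% at codon positions 1, 2 and 3 of an in-frame coding sequence."""
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    result = {}
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(base in "GC" for base in bases)
        result[pos + 1] = 100.0 * gc / len(bases) if bases else 0.0
    return result


# Hypothetical toy sequence (codons ATG GCC TGT AAA GGC), not from any real genome.
toy_orf = "ATGGCCTGTAAAGGC"
print(gc_by_codon_position(toy_orf))
```

Applying such a tally separately to codons for conserved and for variable amino acids would be one way to compare how much of a GC% difference resides at the third position.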

If, to avoid recombination, there is selective pressure on one branch of a diverging line to decrease its GC%, then it would be predicted that “the GC% of nucleotides encoding conserved amino acid (AA) residues” would be “consistently higher than that of nucleotides encoding variable AAs,” where the pressure to decrease GC% has fuller rein to encompass all three codon positions (1). Conversely, when there is pressure on a diverging line to increase GC%, it would be predicted that the GC% corresponding to conserved codons would be consistently lower than that of non-conserved codons (e.g. Ebolavirus).

For flavivirus “the mean G% of the core conserved AA residues is higher (35%) than that of the variable AA residues (28%), but the mean G3% of the core conserved AA residues (28%) is similar to that of the variable AA residues (29%).” While consistent with the above views, there is need for a similar breakdown for C3% and for information on relative frequencies of synonymous codons (e.g. the two cysteine codons correspond either to low or high GC%). More details of selective anti-recombination pressures are presented elsewhere (3, 4). Similar considerations may apply to codon biases and GC% among mycobacteriophages (5).

1. Klitting R, Gould EA & de Lamballerie X (2016) G + C content differs in conserved and variable amino acid residues of flaviviruses and other evolutionary groups. _Infection, Genetics and Evolution_ 45: 332-340. <PMID:27663721>
2. Meyer JR, Dobias DT, Medina SJ, Servilio L, Gupta A, Lenski RE (2016) Ecological speciation of bacteriophage lambda in allopatry and sympatry. _Science_ (in press) doi: 10.1126/science.aai8446 <PMID:27884940>
3. Forsdyke DR (2014) Implications of HIV RNA structure for recombination, speciation, and the neutralism-selectionism controversy. _Microbes & Infect_ 16:96-103. <PMID:24211872>
4. Forsdyke DR (2016) _Evolutionary Bioinformatics_, 3rd edition. Springer, New York.
5. Esposito LA, Gupta S, Streiter F, Prasad A, Dennehy JJ (2016) Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias. _Microbial Genomics_ 2(10), doi: 10.1099/mgen.0.000079.


Comments for PubMed Commons on “The RNA World at Thirty: A Look Back with its Author” by Neeraja Sankaran (2016) J Mol Evol. DOI 10.1007/s00239-016-9767-3 Posted 21 Nov 2016


The title of historian Neeraja Sankaran’s paper in a “special historical issue” of the Journal of Molecular Evolution implies that the RNA world idea was formulated 30 years ago (i.e. 1986) by a single author, Walter Gilbert (1). Yet the paper traces the story to authors who wrote at earlier times. Missing from the author list is Darryl Reanney who, like Gilbert, documented a “genes in pieces” hypothesis in February 1978 and went on to explore the RNA world idea with the imperative that error-correcting mechanisms must have evolved at a very early stage (2). Much of his work is now supported (3). However, Sankaran cites the video of a US National Library of Medicine meeting organized by historian Nathaniel Comfort on 17th March 2016 (4). Here W. F. Doolittle, who had consistently cited Reanney, discusses the evolutionary speculation triggered by the discovery of introns in 1977, declaring that “several things came together at that time,” things that “a guy named Darryl Reanney had been articulating before that.” Furthermore, “it occurred to several of us simultaneously and to Darryl Reanney a bit before – before me anyway – that you could just recast the whole theory in terms of the RNA world.” Gilbert himself thought that “most molecular biologists did not seriously read the evolution literature; probably still don’t.” Indeed, contemporary molecular biologists writing on “the origin of the RNA world,” do not mention Reanney (5). Thus, we look to historians to put the record straight.

1. Sankaran N (2016) The RNA world at thirty: a look back with its author. J Mol Evol DOI 10.1007/s00239-016-9767-3 <PMID: 27866234>
2. Reanney DC (1987) Genetic error and genome design. Cold Spring Harb Symp Quant Biol 52:751-757
3. Forsdyke DR (2013) Introns first. Biological Theory 7:196-203
4. Comfort N (2016) The origins of the RNA world. Library of Congress Webcast.
5. Robertson MP, Joyce GF (2012) The origins of the RNA world. Cold Spring Harb Perspect Biol 4:a003608. <PMID: 20739415>


PubMed Commons Commentary on Esposito et al. (2016) “Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias”. Microbial Genomics (in press) [3 Nov 2016]


While confirming Richard Grantham’s view that “viruses do not closely imitate the use of the [host’s] … codon catalogue” (1), Esposito et al. nevertheless consider it a “surprising finding” that “despite having the ability to infect the same host, many mycobacteriophages share little or no genetic similarity” (i.e. similarity in their “GC contents and codon utilization patterns”). Arguing correctly that “efficient translation of a phage’s proteins within a host is optimized by the phage’s ability to match the codon usage patterns of their hosts,” the authors conclude that “the preferred host of many mycobacteriophages is not M. smegmatis, despite their having been isolated on M. smegmatis” (2). Thus, a virus and its elusive preferred hosts would have had similar GC% and codon usages, but the same virus could still infect a less-preferred host (M. smegmatis), where the virus-host similarity would be less evident.

All this rests on the incorrect assumption that efficient translation (protein level selection; 3) is evolutionarily decisive and cannot be overruled by nucleic acid level selection. Another interpretation is that, early in the diversification into distinctive mycobacteriophage species, prototypic phage lines acquired GC% differences that permitted coinfection of a common host cell by eliminating the recombination-dependent blending of sequences (4). Coinfectants either blend or speciate. Selection is primarily at the nucleic acid level. Translation efficiency is secondary. So powerful can be the pressure on genomes to avoid recombination that, in extremis, a virus that needs to translate more rapidly is forced to encode its own tRNAs tailored for this special need (2).

Grantham himself had noted that alpha and beta globin mRNAs are translated within the same cell yet have different GC% values and codon usage patterns (1). A simple evolutionary interpretation is that divergence from a prototypic globin gene had been assisted by early-developing GC% differences. These had impeded the recombinational blending between the emerging alpha and beta genes that would have reversed the divergence process (4). Likewise, Wyatt (5) had found that viruses that could co-infect a common host cell diverged widely in genome GC% (and hence in codon usage pattern), whereas viruses with different hosts differed much less in GC% (and hence in codon usage pattern). Other virus-pair examples include the low GC% HIV and the high GC% HTLV1 that are both hosted by CD4 lymphocytes and are likely derived from the same retroviral ancestor (6). The GC% differences may themselves be an expression of more fundamental oligonucleotide differences that bar recombination (7). Esposito et al. (2) cite work conceding the possibility of nucleic acid level selection (3), but here the emphasis is on selection on RNA secondary structure rather than at the genome-level (i.e. on M. smegmatis DNA).

1. Grantham R, Perrin P, Mouchiroud D (1986) Patterns in codon usage of different kinds of species. Oxford Surveys of Evolutionary Biology 3: 48-81.
2. Esposito LA, Gupta S, Streiter F, Prasad A, Dennehy JJ (2016) Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias. Microbial Genomics 2(10). doi: 10.1099/mgen.0.000079.
3. Ran W, Kristensen DM, Koonin EV (2014) Coupling between protein level selection and codon usage optimization in the evolution of bacteria and archaea. Mbio 5(2):e00956-14 <PMID: 24667707>
4. Forsdyke DR (1996) Different biological species "broadcast" their DNAs at different (G+C)% "wavelengths". Journal of Theoretical Biology 178:405-417. <PMID: 8733478>
5. Wyatt GR (1952) The nucleic acids of some insect viruses. Journal of General Physiology 36:201-205.
6. Forsdyke DR (2014) Implications of HIV RNA structure for recombination, speciation, and the neutralism-selectionism controversy. Microbes & Infection 16:96-103. <PMID: 24211872>
7. Brbic M, Warnecke T, Krisko A, Supek F (2015) Global shifts in genome and proteome composition are very tightly coupled. Genome Biology & Evolution 7:1519-1532. <PMID: 25971281>


PubMed comment on “Evolutionary switches between two serine codon sets are driven by selection” by Rogozin et al. (2016) Reviewed by KH Wolfe (Dublin) and J Zhang (UMich)


The authors set out “to investigate the evolutionary factors that affect serine codon set switches” (i.e. between TCN and AGY). Their “findings imply unexpectedly high levels of selection” (1). Indeed, the data strongly support the conclusion that codon mutations “are driven by selection.” It is conjectured that the codon mutation “switch would involve as an intermediate either threonine ACN or cysteine TGY, amino acid residues with properties substantially different from those of serine, so that such changes are unlikely to be tolerated at critical functional or structural sites of a protein.”
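The intermediates named in the quotation can be checked with a short sketch that uses the standard genetic code. It assumes the switch proceeds one nucleotide at a time with the third base unchanged; the function name `serine_switch_intermediates` is hypothetical and is not taken from the paper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def serine_switch_intermediates(third_base: str) -> dict[str, str]:
    """Map each one-step intermediate codon between TCY and AGY to its amino acid."""
    start, end = "TC" + third_base, "AG" + third_base
    if CODON_TABLE[start] != "S" or CODON_TABLE[end] != "S":
        raise ValueError("both codons must encode serine (third base T or C)")
    intermediates = {}
    for pos in (0, 1):  # change the first or the second position first
        codon = start[:pos] + end[pos] + start[pos + 1:]
        intermediates[codon] = CODON_TABLE[codon]
    return intermediates


for base in "TC":
    print(base, serine_switch_intermediates(base))
```

Changing the first position first passes through ACY (threonine, T), and changing the second position first passes through TGY (cysteine, C), in agreement with the two routes quoted above.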
However, it does not follow that the unsuitability of the interim amino acids drove the rapid tandem substitutions. Choice of “coincident codons” has long been seen as influenced by pressures acting at the nucleic acid level (2-4). These pressures evolve in parallel with, and sometimes dominate, protein pressures. One example is purine-loading pressure (3). If this cannot be satisfied by changes at third codon positions, then sometimes the organism must accept a less favorable amino acid. With serine codons, a change from TCN to AGY (i.e. first and second codon positions) can increase purine-loading pressure without compromising the amino acid that is encoded (

1. Rogozin IB, Belinky F, Pavlenko V, Shabalina SA, Kristensen DM, Koonin EV (2016) Evolutionary switches between two serine codon sets are driven by selection. Proc Natl Acad Sci USA <PMID: 27799560>
2. Bains W. (1987) Codon distribution in vertebrate genes may be used to predict gene length. J Mol Biol. 197:379-388.<PMID: 3441003>
3. Mortimer JR, Forsdyke DR (2003) Comparison of responses by bacteriophage and bacteria to pressures on the base composition of open reading frames. Appl Bioinf 2: 47-62.<PMID: 15130833>
4. Forsdyke DR (2016) Evolutionary Bioinformatics, 3rd edition (Springer, New York).


Please follow the Hypothesis or Wayback links for earlier PubMed Commons comments.


This page was established March 2018 and last updated on 12 March 2018 by Donald Forsdyke.