Immunity as a function of the unicellular state: implications of emerging genomic data

  

Donald R. Forsdyke*, Christopher A. Madill and Scott D. Smith

Trends in Immunology (2002) 23, 575-579

With copyright permission from Elsevier Science Ltd. This version has small differences from the published version, two extra figures, and end-notes.

An elementary immune system

Polymorphism creates unpredictability

Junk DNA is transcribed

Repetitive elements transcribed in infected cells

Double-stranded RNA as an alarm signal

Purine-loading to avoid self-recognition

Intracellular protein "immune receptors"

Conclusion

End Note (2003)

End Note (2006)

End Note (April 2007)

End Note (March 2008)

End Note (Jan 2009)

End Note (April 2009) 

End Note (October 2009)

End Note (Jan 2010)

End Note (October 2011)

End Note (December 2012)

End Note (Mar 2013)

End Note (July 2016)

End Note (Aug 2016)

End Note (March 2019)

End Note (March 2020)

Instead of being greeted as supporting the growing corpus of immunological theory, recent advances in the bioinformatic analysis of genomes have often surprised the discoverers and failed to attract the attention of immunologists.

    The view that multicellular immune systems  are adaptations of already highly evolved unicellular immune systems that are capable of self/not-self discrimination can assist our comprehension of phenomena such as "junk" DNA, genetic polymorphism and the ubiquity of repetitive elements. 

    The "hidden transcriptome," revealed by run-on transcription of genes or repetitive elements, contains a diverse repertoire  of RNA "immune receptors," with the potential to form double-stranded RNA with viral RNA "antigens," thus triggering intracellular alarms.

 Unicellular organisms are likely to have evolved some 800 million years before multicellular organisms. Brucke dubbed single cells "elementary organisms" [1] implying that many multicellular level functions might have prototypic equivalents at the unicellular level. We here explore implications of the postulate that immune systems of multicellular organisms arose as extensions of immune systems preexisting at the unicellular level [2,3].

 Fig. 1. The cell as an elementary immune organism. The left circle (a) represents a multicellular organism with Y-shaped antibodies of various specificities. The right circle (b) represents a unicellular organism with a repertoire of "antibody-like" protein and "antibody-like" RNA molecules (stem-loop structures). These are referred to as "immune receptors," implying that parts of these molecules can interact with intracellular antigens.

An elementary immune system

In a clonal unicellular population where asexual reproduction predominates, self-destruction (i.e. apoptosis) is the simplest mechanism to prevent spread of a pathogen and to promote survival of a "selfish gene". However, even such a primitive defence needs to be coupled to specific and adaptable sensors. We propose that such a sensory system is provided by a multiplicity of structurally distinct macromolecules, of which we emphasize here proteins and RNAs (Fig. 1). Many of these will have distinct properties (e.g. catalytic, structural, transporting, templating, etc.). 

    On the other hand, there is a high probability that, in the crowded cytosol [4], one or more of these molecules will be able to bind an invading virus with sufficient affinity to tag it as "not-self," thus initiating an innate immune response. Such an "immunological repertoire" could develop either over evolutionary or, as in the case of antibodies, over somatic time [5]. Whatever the mechanism and timing of the diversification process, there is a need to eliminate receptors with an affinity for "self" antigens.

     Unfortunately, given the high replication and mutation rates of viruses relative to those of their hosts, it would be highly probable that viruses would pre-adapt to avoid interaction with hostile host macromolecules. What a virus had "learned" (by mutation and selective proliferation) in one host, it would exploit on the next host. New information on host genome polymorphism suggests that this difficulty may now not be so formidable as it once appeared (see below). 

    In an elementary unicellular immune system, viruses that, through mutation, acquired the ability to inactivate host apoptotic mechanisms would preferentially survive. In the ensuing arms race, an intracellular "inflammatory" host response would have evolved to limit viral activities. However, in multicellular organisms apoptosis of the primarily infected cell might limit the opportunity to alert other target cells and cells of the immune system (e.g. for MHC-peptide presentation). Sophistications developed at the multicellular level are considered below.

Polymorphism creates unpredictability

 Fig. 2.  Specific and general functions of a protein as reflected in its structure. Dedicated functions are associated with conserved, internal, hydrophobic, globular domains. Potential immune receptor functions are associated with variable, external, hydrophilic, non-globular domains.  

 On average, the haploid maternal and paternal contributions to your diploid genome are likely to differ from each other at least once every 0.5-2.0 kilobases, and general intraspecies differences may arise at least once every 185 bases [6]. Such polymorphism should decrease the extent to which a pathogen from one host can anticipate the genomic characteristics of its next host. When the polymorphism affects proteins, it probably affects sequences of relatively low complexity that correspond to hydrophilic non-globular domains at the protein surface [7]. Thus, these domains, usually not critical for the specialized function of the protein, are available for interaction with complementary molecular patterns of intracellular pathogens ("not-self;" Fig. 2). These same domains should also have the potential to react with "self" proteins, sometimes to an extent sufficient to trigger adverse responses in the host (intracellular "autoimmune" pathology). Organisms with mutations avoiding this would have been favoured over evolutionary time [8-10].

 

Junk DNA is transcribed

 Had we not known of the existence of an antibody repertoire, the discovery of sets of V-genes would have been greeted with surprise. However, our surprise at learning that 98% of our DNA is non-genic has been somewhat blunted by a facile explanation: "junk" [11,12].

    To be functional it is likely that non-genic DNA would have to be transcribed [13]. Recent investigations of the transcriptional activities of the β-globin region of human chromosome 11 [14], and of entire chromosomes 21 and 22, reveal a "hidden transcriptome," corresponding to a large number of low copy number cytoplasmic RNAs. It is estimated that there is "an order of magnitude" more transcriptionally active DNA than can be accounted for by conventional genes [15]. Can this be dismissed as mere cytoplasmic "junk," an unavoidable consequence of the existence of genomic "junk"?

    To understand its role, if any, in the economy of the organism, we need to know, by analogy with known transcriptional processes, whether there are specific promoters, whether there are dedicated RNA polymerases, whether transcription occurs randomly or under specific conditions, and whether transcripts are diverse and include appreciable non-repetitive DNA.

 

Repetitive elements transcribed in infected cells

 Fig. 3.  Location of Alu elements is likely to permit downstream transcription of variable genomic segments. Alu and other repetitive elements are shown in part of the 100 kilobase segment of human chromosome one containing the two-exon gene, G0S2.

    Horizontal arrows indicate transcription directions of G0S2 (grey boxes), of Alu elements (red boxes demarcated by vertical dashed lines) and of other repetitive elements (cyan boxes). The abbreviated names of repetitive elements are printed vertically. 

    Purine-loading (excess of purines/kb over pyrimidines/kb; grey balls) and CpG frequency (dinucleotides/kb; green continuous line) were evaluated for 400 base windows moving in steps of 25 bases. When purine frequency equals pyrimidine frequency, purine-loading is zero. 

    Values for CpG frequency (plotted on the same scale) are zero or positive. The CpG peak ("CpG island") associated with G0S2 indicates a gene expressed in the germ line. (Note that, if the sequences of a virus and its host are known, then it should be possible to locate host segments complementary to virus segments and, from displays such as this, determine the feasibility of their transcription.)
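The window statistics described in this legend can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used to prepare the figure: the function name `scan_windows` and the `WindowStats` record are hypothetical, the input is assumed to be a plain DNA string for the strand of interest, and only complete 400-base windows are scored.

```python
from typing import Iterator, NamedTuple


class WindowStats(NamedTuple):
    start: int             # 0-based start position of the window
    purine_loading: float  # (purines - pyrimidines) per kilobase
    cpg_per_kb: float      # CpG dinucleotides per kilobase


def scan_windows(seq: str, window: int = 400, step: int = 25) -> Iterator[WindowStats]:
    """Yield purine-loading and CpG frequency for each complete window of seq.

    Hypothetical helper: window and step default to the values quoted in the
    Fig. 3 legend (400-base windows moved in steps of 25 bases).
    """
    seq = seq.upper()
    scale = 1000.0 / window  # converts per-window counts to per-kilobase values
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        purines = chunk.count("A") + chunk.count("G")
        pyrimidines = chunk.count("C") + chunk.count("T")
        # "CG" cannot overlap itself, so str.count finds every CpG in the window.
        cpg = chunk.count("CG")
        yield WindowStats(start, (purines - pyrimidines) * scale, cpg * scale)
```

A window with equal numbers of purines and pyrimidines scores zero, a purine-rich window scores positive, and a pyrimidine-rich window scores negative, matching the sign convention of the legend. CpG values are never negative.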

Much non-genic DNA consists of repetitive elements, the most prominent of which in humans are the 1,090,000 Alu elements [16]. Both conventional genes and repetitive elements can provide promoters for the transcription of non-genic DNA. Some gene transcripts have been found to be longer than expected due to a failure of transcriptional termination ("run-on" transcription; [17,18]). Some classes of repetitive element contain promoters from which transcription can initiate and extend beyond the bounds of an element into neighbouring genomic regions [19-22].

    Are such extended transcripts generated randomly in time? In the case of Alu elements, transcription (by RNA polymerase III) has been observed to increase at times of cell stress (e.g. viral infection, heat shock). Indeed, viral infection can trigger the heat shock response with the induction of heat shock proteins (for Refs. see [23]). Thus, it is possible that Alu transcription reflects an adaptive response to virus infection (for Refs. see [24]).

    Is the location of Alu elements likely to permit downstream transcription of variable genomic segments? Figure 3 shows a segment of human chromosome one containing the G0/G1 switch gene 2 (G0S2), which is upregulated in activated lymphocytes [25]. The gene demonstrates the general phenomenon of "purine-loading" (more purines than pyrimidines), which is characteristic of most RNAs of most organisms. Thus, when transcription is to the right of the promoter, exons are purine-loaded, and when transcription is to the left of the promoter, exons are pyrimidine-loaded (i.e. negatively purine-loaded). In both circumstances the RNAs end up being purine-loaded [26], for reasons discussed below.

     The transcription direction of G0S2 being to the right (indicated by the horizontal arrow in Fig. 3a), the gene and the corresponding mRNA are purine-loaded. This purine-loading extends for about a kilobase downstream of G0S2 into a region with no repetitive elements. Thus, if there were conditions such that transcription did not terminate, then the extended transcript would itself be purine-loaded and contain non-repetitive DNA. 

     Also shown in Figure 3 are various repetitive elements with assigned potential transcription directions. Although within a class of repetitive element there is some variability, by definition the repetitive elements themselves tend to diminish genome variability. However, the regions downstream of Alu elements are often devoid of other repetitive elements. For example, the pyrimidine-loaded leftward-transcribing Alu element downstream of G0S2 has a clear downstream region that retains the pyrimidine-loading of the original transcript. On the other hand, several kilobases upstream of G0S2 are two leftward-transcribing Alu elements, one of which transcribes into a region that is purine-loaded and contains repetitive elements of the L2 family. These results are illustrative of the general features of this genomic region. A parallel study of the region of a much smaller human chromosome containing the FOSB/G0S3 gene (chromosome 19; [27]), revealed a much tighter packing of repetitive elements (data not shown). 

Fig. 4. Two RNA molecules (blue and red) meeting, "kissing", and forming dsRNA. [For space reasons this figure was omitted from the final paper.]

  

Double-stranded RNA as an alarm signal

Although protein molecules can recognize specific nucleic acids (and the converse), it is convenient here to consider proteins recognizing proteins and RNAs recognizing RNAs. In the cytosol RNA molecules adopt characteristic stem-loop configurations (Fig. 1b), and RNA-RNA interactions can initiate by way of a "kissing" homology-search between bases at the tips of loops. If sequence complementarity is found (e.g. G pairing with C, and A pairing with U), then two RNA species can pair, partially or completely, to generate a length of double-stranded RNA (dsRNA) that in some circumstances can play a regulatory role [Fig. 4; see ref. 28].

     If a virus introduced its own RNA into a cell, would there be sufficient variability among host RNA species for a host "immune receptor" RNA to form a segment of dsRNA with the "not-self" RNA of the virus? Calculations made elsewhere [29] show this to be feasible, especially if the entire genome were available for transcription. Would the dsRNA be able to initiate an adaptive intracellular "inflammatory" response? How would the host cell prevent generation of "self" dsRNAs?

     Formation of dsRNA has long been recognized as an early cellular response to viral entry. Protein synthesis can be inhibited non-specifically by very low concentrations of dsRNA [30].  This involves activation of dsRNA-dependent protein kinase (PKR), which inhibits a protein involved in the initiation of protein synthesis. Evasive viral strategies would include the acceptance of mutations to avoid formation of dsRNA (see below), and inhibition of cell components required for the formation of, or the response to, dsRNA [31,32].

      Virus-infected cells produce interferons, which can be considered part of the inflammatory response. The interferons induce a general anti-viral state spreading together with various chemokines from the cell of origin to other cells [33]. Their production is powerfully stimulated by dsRNA [34]. There is now growing evidence that, both in animals and plants, another more sequence specific "inflammatory" response to dsRNA arises as part of an intracellular mechanism for self/not-self discrimination [35]. Just as in the antibody response there is amplification of the production of specific antibody, so, courtesy of enzymes such as RNA-dependent-RNA polymerase and "dicer," there is amplification of the production of specific "immune receptor" RNA (for Refs. see accompanying paper of Martinez et al. [36]).

Fig. 5. Run-on transcription reveals the "hidden transcriptome." (For space reasons this figure was omitted from the original paper.)

 

Purine-loading to avoid self-recognition

Although it is currently believed that host cells detect dsRNA of virus origin [35], given the functioning of dsRNA as an alarm signal (Fig. 5), viruses should have evolved to avoid the formation of dsRNA replicative intermediates. Indeed, viruses with dsRNA genomes have adaptations that would appear to conceal their genomes from host cell surveillance mechanisms [37]. More than twenty base pairs are needed to activate PKR in vitro [38], or to silence specific genes [39]. 

    Among the RNA species of a cell there might be two whose members, by chance, happened to have enough base complementarity for formation of a mutual duplex of a length sufficient to trigger alarms. Thus, there would need to have been an evolutionary selection pressure favouring mutations in host RNAs that decrease the possibility of their interaction with other "self" RNAs in the same cell. In many cases mutations to a purine would assist this, since purines do not pair with purines. Indeed, interaction with "self" RNAs seems to have been avoided by "purine-loading" the loop regions of these RNAs, thus avoiding the initial loop-loop "kissing" reactions which precede more complete formation of dsRNA. The above-mentioned excess of purines, observed both at RNA and at DNA levels (in mRNA-synonymous DNA strands), is found in a wide variety of organisms and their viruses [26,40].          
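The screening problem described above can be made concrete with a small Python sketch. It is an illustration only, with several simplifying assumptions that are not part of the argument in the text: only perfect, uninterrupted Watson-Crick pairing is counted (no G-U wobble pairs, mismatches or bulges), secondary structure is ignored, and the names `longest_duplex` and `triggers_alarm` are hypothetical. The default threshold of 21 base pairs simply encodes "more than twenty base pairs" from the preceding paragraph.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def longest_duplex(rna_a: str, rna_b: str) -> int:
    """Length of the longest perfect antiparallel duplex two RNAs could form."""
    a = rna_a.upper().replace("T", "U")
    b = rna_b.upper().replace("T", "U")
    # A stretch of a can pair with a stretch of b exactly when it equals the
    # reverse complement of that stretch, so the task reduces to finding the
    # longest common substring of a and reverse-complement(b).
    b_rc = "".join(COMPLEMENT.get(base, "N") for base in reversed(b))
    best = 0
    prev = [0] * (len(b_rc) + 1)
    for base_a in a:
        curr = [0] * (len(b_rc) + 1)
        for j, base_b in enumerate(b_rc, start=1):
            # Unknown bases (anything outside A, C, G, U) never count as paired.
            if base_a == base_b and base_a in COMPLEMENT:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def triggers_alarm(rna_a: str, rna_b: str, min_bp: int = 21) -> bool:
    """True if the two RNAs could form a perfect duplex of at least min_bp pairs."""
    return longest_duplex(rna_a, rna_b) >= min_bp
```

Under these assumptions, two RNAs composed entirely of purines give a duplex length of zero, since neither contains a base that can pair with the other. This is the simplest case of the purine-loading argument made in this section.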

     Exploratory "kissing" interactions between hybridizing nucleic acids involve transient base stacking interactions [28] with the exclusion of structured water. Such reactions have a strong entropy-driven component, and so might increase as temperature increases (i.e. fever [4]). Accordingly, purine-loading should be high in thermophiles, as is indeed found [41; R. Lambros, J. Mortimer and D. Forsdyke, unpublished work]. 

    Furthermore, proteins with a tendency to become involved in autoimmune reactions have acquired runs of charged amino acids with no known function at the protein level [42,43]. Charged amino acids correspond to codons rich in purines, which should countermand formation of dsRNA. Thus, the presence of runs of charged amino acids may be a consequence of the need to purine-load RNA, and not vice-versa.
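The statement that charged amino acids correspond to purine-rich codons can be checked directly against the standard genetic code. The sketch below is illustrative; the function name `mean_purines_per_codon` is hypothetical, and the calculation weights all synonymous codons equally, which ignores the codon-usage bias of any real organism.

```python
from collections import defaultdict
from typing import Dict

BASES = "UCAG"
# Standard genetic code, with codons ordered U, C, A, G at each position
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def mean_purines_per_codon() -> Dict[str, float]:
    """Mean number of purines (A or G) per codon for each amino acid."""
    counts = defaultdict(list)
    index = 0
    for b1 in BASES:
        for b2 in BASES:
            for b3 in BASES:
                codon = b1 + b2 + b3
                counts[AMINO_ACIDS[index]].append(sum(base in "AG" for base in codon))
                index += 1
    return {aa: sum(values) / len(values) for aa, values in counts.items()}
```

With equal weighting, lysine (K) and glutamate (E) average 3.0 purines per codon, and aspartate (D) and arginine (R) average 2.0, against 1.5 for a codon drawn at random from the 64. Phenylalanine (F), by contrast, averages 0.0.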

    A general increase in transcription in cells exposed to "stress" (simulating virus invasion [44]), would dictate a period of preincubation without stress before testing for specific transcription. This has indeed been found as a requirement for studies with freshly explanted human lymphocytes [27].

 

Intracellular protein "immune receptors"

Amino acids in proteins do not pair on a one-to-one basis, like bases in nucleic acids. Nevertheless, similar considerations might apply in the case of protein molecules (Figs. 1b, 2). These would form heteroaggregates (aggregates of self-proteins with pathogen proteins), and "not-self" homoaggregates (aggregates of individual pathogen protein species) by mechanisms discussed elsewhere [4,8-10, 45,46]. Recent observations of diseases associated with protein aggregation suggest an interconnection between protein "self" and RNA "self" homoaggregates, which may both be required for disease progression [38, 46-47].

Conclusion

While the existence of an intracellular immune system remains unproven, a growing number of disparate observations appear comprehensible from this perspective. Non-genic "junk" DNA can be viewed in much the same way as we view the diverse genes encoding the variable regions of immunoglobulin antibodies. Just as B-cells capable of synthesizing a unique anti-self antibody would be eliminated during somatic time to prevent self-reactivity, so junk DNA would have been screened over evolutionary time (by positive selection of individuals in which favourable mutations had been collected together by recombination) to decrease the probability of two complementary "self" transcripts interacting to form dsRNA segments of more than 20 bases. High polymorphism of non-genic DNA would make it difficult for viruses to anticipate the "immune receptor" RNA repertoire of future hosts. Since viruses can be enriched for either purines or pyrimidines [29], the repertoire should include both purine-rich and pyrimidine-rich segments (Fig. 3). The initiating event is one of self/not-self discrimination, be it between two RNA species or between two protein species, and be it extracellular or intracellular.

 Acknowledgements. We thank Jim Gerlach for assistance with computer configuration, and Jerzy Jurka and coworkers for access to Repbase. The Canadian Bioinformatics Resource (Halifax) provided access to the GCG program suite. Andrew Reynolds kindly provided the Brucke text. Queen's University hosts DRF's web pages where full texts of several of the references may be found.

1 Brucke, E. (1861) Die Elementarorganismen. Sitzungsber. Math-Nat. Cl. K. Akad. Wiss. 44, 381-406

2 Forsdyke, D.R. (1991) Early evolution of MHC polymorphism. J. Theor. Biol. 150, 451-456

3 Forsdyke, D.R. (1992) Two signal model of self/not-self discrimination: an update. J. Theor. Biol. 154, 109-118

4 Forsdyke, D.R. (1995) Entropy-driven protein self-aggregation as the basis for self/not-self discrimination in the crowded cytosol.  J. Biol. Sys. 3, 273-287

5 Lewis, S.M. (1994) The mechanism of V(D)J joining. Adv. Immunol. 56, 27-150

6 Stephens, J.C. et al. (2001) Haplotype variation and linkage disequilibrium in 313 human genes. Science 293, 489-493

7 Bustamante, C.D. et al. (2000) Solvent accessibility and purifying selection within proteins of Escherichia coli and Salmonella enterica. Mol. Biol. Evol. 17, 301-308

8 Forsdyke, D.R. (2001a) Adaptive value of polymorphism in intracellular self/not-self discrimination. J. Theor. Biol. 210, 425-434

9 Forsdyke, D.R. (2001b) The Origin of Species, Revisited. McGill-Queen's University Press

10 Forsdyke, D.R. (2001c) Functional constraint and molecular evolution. In Encyclopedia of Life Sciences, vol. 7, pp. 396-403, Nature Publishing Group, London

11 Ohno, S. (1972) So much junk DNA in our genome. Brookhaven Symp. Biol. 23, 366-370

12 Pennisi, E. (2002) Charting a genome's hills and valleys. Science 296, 1601-1603

13 Mattick, J. S. (2001) Non-coding RNAs: the architects of eukaryotic complexity. EMBO Reps. 2, 986-991

14 Plant, K.E. et al. (2001) Intergenic transcription in the human β-globin gene cluster. Mol. Cell. Biol. 21, 6507-6514

15 Kapranov, P. et al. (2002) Large-scale transcriptional activity in chromosomes 21 and 22. Science 296, 916-919

16 Jurka, J. et al. (1996) CENSOR: a program for identification and elimination of repetitive elements from DNA sequences. Comput. Chem. 20, 119-122

17 Heximer, S.P. et al. (1998) Expression and processing of G0/G1 Switch Gene 24 (G0S24/TIS11/TTP/NUP475) RNA in cultured human blood mononuclear cells. DNA Cell Biol. 17, 249-263

18 Iseli, C. et al. (2002) Long range heterogeneity at the 3' ends of human mRNAs. Genome Res. 12, 1068-1074

19 Manley, J.L. and Colozzo, M.T. (1982) Synthesis in vitro of an exceptionally long RNA transcript promoted by an AluI sequence. Nature 300, 376-379

20 Feuchter, A.E. et al. (1992) Strategy for detecting cellular transcripts promoted by human endogenous long terminal repeats. Genomics 13, 1237-1246

21 Ferrigno, O. et al. (2001) Transposable B2 SINE elements can provide mobile RNA polymerase II promoters. Nature Genet. 28, 77-81

22 Nigumann, P. et al. (2002) Many human genes are transcribed from the antisense promoter of L1 retroposon. Genomics 79, 628-634

23 Forsdyke, D.R. (1985) Heat shock proteins defend against intracellular pathogens. J. Theor. Biol. 115, 471-473

24 Kim, C. et al. (2001) Genome-wide chromatin remodelling modulates the Alu heat shock response. Gene 276, 127-133

25 Russell, L. and Forsdyke, D.R. (1991) A human putative lymphocyte G0/G1 switch gene containing a CpG-rich island encodes a small basic protein with the potential to be phosphorylated. DNA Cell Biol. 10, 581-591

26 Forsdyke, D.R. and Mortimer, J.R. (2000) Chargaff's legacy. Gene 261, 127-137

27 Heximer, S.P. et al. (1996) Sequence analysis and expression in cultured lymphocytes of the human FOSB gene (G0S3). DNA Cell Biol. 12, 1025-1038

28 Eguchi, Y. et al. (1991) Antisense RNA.  Annu. Rev. Biochem. 60, 631-652

29 Cristillo, A.D. et al. (2001) Double-stranded RNA as a not-self alarm signal. J. Theor. Biol. 208, 475-491

30 Ehrenfeld, E. and Hunt, T. (1971) Double-stranded poliovirus RNA inhibits initiation of protein synthesis by reticulocyte lysates. Proc. Natl. Acad. Sci. USA 68, 1075-1078

31 Elia, A. et al. (1996) Regulation of the double-stranded RNA-dependent protein kinase PKR by RNAs encoded by a repeated sequence of the Epstein-Barr virus genome. Nucleic Acids Res. 24, 4471-4478

32 Mittelsten Scheid, O. (1999) New tool for Swiss army knife. Nature 397, 25

33 Suzuki, K. et al. (1999) Activation of target-tissue immune-recognition molecules by double-strand polynucleotides. Proc. Natl. Acad. Sci. USA 96, 2285-2290

34 Marcus, P. (1983) Interferon induction by viruses: one molecule of dsRNA as the threshold for induction. Interferon 5, 115-180

35 Plasterk, R.H.A. (2002) RNA silencing; The Genome's Immune System. Science 296, 1263-1265

36 Martinez, M.A. et al. (2002) RNA interference of HIV replication. Trends Immunol. 23, 559-561

37 Bamford, D.H. (2002) Those magnificent molecular machines: logistics in dsRNA virus transcription. EMBO Reports 3, 317-318

38 Tian, B. et al. (2000) Expanded CUG repeat RNAs form hairpins that activate the double-stranded RNA-dependent protein kinase PKR. RNA 6, 79-87

39 Elbashir, S.M. et al. (2001) RNA interference is mediated by 21- and 22-nucleotide RNAs. Genes Devel. 15, 188-200

40 Saul, A. and Battistutta, D. (1988) Codon usage in Plasmodium falciparum. Mol. Biochem. Parasitol. 27, 35-42

41 Lao, P.J. and Forsdyke, D.R. (2000) Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine. Genome Res. 10, 228-236

42 Brendel, V. et al. (1991) Very long charge runs in systemic lupus erythematosus-associated autoantigens. Proc. Natl. Acad. Sci. USA 88, 1536-1540

43 Dohlman, J.G. et al. (1993) Long charge-rich alpha-helices in systemic autoantigens. Biochem. Biophys. Res. Comm. 195, 686-696

44 Suzuki, T. et al. (2000) Control selection for RNA quantitation. Biotechniques 29, 332-337

45 Forsdyke, D.R. (1999) Heat shock proteins as mediators of "danger" signals: implications of the slow evolutionary fine-tuning of sequences for the antigenicity of cancer cells. Cell Stress Chaperones 4, 205-210 

46 Forsdyke, D.R. (2000) Double-stranded RNA and/or heat-shock as initiators of chaperone mode switches in diseases associated with protein aggregation. Cell Stress Chaperones 5, 375-376

47 Peel, A.L. et al. (2001) Double-stranded RNA-dependent protein kinase, PKR, binds preferentially to Huntington's disease (HD) transcripts and is activated in HD tissue. Hum. Mol. Genet. 10, 1531-1538

End Note (27 Dec 2003)

The above view of the role of "junk DNA" predicted that large sets of low abundance "non-coding transcripts" would be a feature of many eukaryotic genomes and that, in view of the postulated role in intracellular aspects of immunological defenses, they would not be evolutionarily conserved. This was greatly supported by the discovery of multiple "non-coding" transcripts in cDNA libraries prepared from humans and mice. In a paper entitled "Complete Sequencing and Characterization of 21243 full-length human cDNAs", Toshio Ota and coworkers noted:

"It is interesting to note this type of "non-coding" transcripts was also found in mouse cDNA collections. - - What was significant was that [the] majority of the examined cDNAs were not evolutionally conserved. In this dataset of mouse genes, identification of 11665  similar transcripts (which would be categorized as "unclassified" according to our scheme) has also been reported. This suggests that there are little conservation for these "unclassified" transcripts and/or that there are huge numbers of such transcripts (at least in the order of 100000). Interestingly, - - we have recently examined the promoter activities of randomly isolated genomic DNA fragments on a large scale and observed that there are cryptic promoter activities throughout the genomic DNA (unpublished data). It may be possible that those cryptic promoters may act at low frequency to produce aberrant (or sporadic) transcripts."

Ota et al. (2004) Nature Genetics 36, 40-45.

End Note (April 2006)

The correspondence of Alu elements with min-CpG islands, as seen with the CpG-island-containing G0S2 gene in Figure 3, is supported by new work of Brohede and Rand (2006 Human Genetics 119, 457-458). This suggests that:

"CpG island-associated Alus are frequently unmethylated" and hence are probably expressed, in the germ line.

End Note (April 2007)

Bacteria appear to have a defence system analogous to that outlined above (and expanded on in my text Evolutionary Bioinformatics, 2006, pp. 270-2). Bacteria have "Clustered Regularly Interspaced Short Palindromic Repeats" (CRISPR) between which are variable spacer sequences that resemble sequences from the viruses that infect bacteria (bacteriophages). It appears that in the course of a "primary" infection these spacers acquire a sequence from the pathogen. When there is a "secondary" infection, this "memory" can be called upon in the bacterium and its progeny, which transcribe it in an orientation such as to generate interfering RNAs which hybridize with the corresponding nucleic acid sequences of the infecting virus, so inactivating the virus.

Makarova et al., (2006) A putative RNA-interference-based immune system in prokaryotes. Biology Direct 1, 7.

Barrangou et al., (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709-1712.

End Note (March 2008)

In the heat-shock response transcription is generally repressed, but Alu transcription (by RNA polymerase III) increases. Mariner and colleagues now show that human Alu sequences (and similar sequences in mice) participate in the general repression of genes by binding to RNA polymerase II (the major polymerase for mRNA synthesis).

    Transcription, as indicated by [3H]-uridine labelling, is usually high in freshly cultured peripheral blood mononuclear cells. Thus, a response to the lectin Concanavalin-A is best observed after leaving the cells to "rest" for a day. We note above: "A general increase in transcription in cells exposed to "stress" - - would dictate a period of preincubation without stress before testing for specific transcription. This has indeed been found as a requirement for studies with freshly explanted human lymphocytes [27]." Indeed, Baechler et al. report that "hundreds of genes are sensitive to ex vivo handling of blood." By the criterion of decline during rest phase, there appear to be increases in mRNAs corresponding to G0S2 (Fig. 3 above), FosB (G0S3), Fos (G0S7),  RGS2 (G0S8), TIS11 (G0S24 ) and EGR1 (G0S30). But, in keeping with the observations of Mariner et al., many mRNAs such as G0S19 (CCL3) and RGS1, decrease (Heximer et al. 1997).

Baechler, E. C. et al., (2004) Expression levels of many genes in human peripheral blood cells are highly sensitive to ex vivo incubation. Genes & Immunity 5, 347-353.

Heximer, S. P., Cristillo, A. D. & Forsdyke, D. R. (1997) Comparison of mRNA expression of two regulators of G-protein signaling, RGS1/BL34/IR20 and RGS2/G0S8, in cultured human blood mononuclear cells. DNA Cell Biology 16, 589-598.

Mariner, P. D. et al. (2008) Human Alu RNA is a modular transacting repressor of mRNA transcription during heat shock. Molecular Cell 29, 499-509.

End Note (Jan 2009)

Intriguingly, the phage sequences targeted by the transcripts of the CRISPR repeats (see above) may have, for a particular class of CRISPR, a common purine-rich flanking sequence. By mutating the target sequence a phage can evade the CRISPR host defence system. This is as one might expect, since it is likely that homologous base pairing between host transcript and the phage target sequence is required. However, evasion can also be brought about by mutating the non-targeted purine-rich flanking sequence (Deveau et al. 2008). This is consistent with a requirement for a "kissing" interaction between nucleic acid secondary structures (see Fig. 4 above) prior to hybridization (Forsdyke 2007). A flanking mutation could change the secondary structure and hence hybridization would not occur.

Deveau, H. et al. (2008) Journal of Bacteriology 190, 1401-1412. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus.

Forsdyke, D. R. (2007) Journal of Theoretical Biology 249, 325-330. Molecular sex: the importance of base composition rather than homology when nucleic acids hybridize.

End Note (April 2009)

Faulkner et al. (2009) have shown that many human and mouse transcripts initiate within repetitive elements (LINES, SINES and other retrotransposons). But "retroposon transcripts appear to be less expressed on average than protein-coding mRNAs." They conclude that: "The ultimate function, perhaps after further processing, of transcripts associated with novel retrotransposon promoters deserves future study."

Faulkner, G. J. et al. (2009) Nature Genetics 41, 563-571. The regulated retrotransposon transcriptome of mammalian cells.

End Note (October 2009)

Flegel (2009) proposed a CRISP-R like mechanism involving the conferral of specific resistance to viral pathogens by "immunospecific RNA (imRNA)" in crustaceans and insects. Some problems with this were considered by Forsdyke in the comment section of the paper (Click Here).

Flegel T. W. (2009) Biology Direct 4, 32. Hypothesis for hereditable, antiviral immunity in crustaceans and insects.

End Note (Jan 2010)

Since we are all heterozygotes for many alleles, mass transcription as part of the "stress" response to foreign intracellular invasion (Fig. 5), could generate proteins encoded by our two parental genomes that, apart from bringing about in utero "phenocopy" effects (Click Here), might also interact. The discovery of biallelic "promiscuous" thymic expression of certain self antigens during T cell education (under control of the AIRE transcription factor; Kyewski & Derbinski 2004), suggests a central mechanism by which such chance interactions (normal self recognizing normal self) could avoid future triggering of T cells in the periphery. 

    This hypothesis supposes that the aim of AIRE-induced transcription is not the display of self peptides from all AIRE-dependent transcribed genes, but merely those corresponding to genes whose products interact in a common cytosol, so generating hetero-aggregates for proteasome processing. Failure to eliminate T cells responding to peptides from such aggregates (negative selection) would be sufficient to account for the autoimmune disease seen in AIRE mutants. Under normal circumstances, having eliminated T cells responding to pMHCs corresponding to such interacting promiscuously expressed self-proteins (negative selection), subsequent "promiscuous" expression in the periphery would be part of the response to foreign invasion (Fig. 5). Then the intracellular "antibody" repertoire seeks out protein (or RNA) corresponding to the intruder (normal self recognizing non-self; Gardner & Anderson 2009), and generates a novel coaggregate, peptides from which (as pMHC) activate T cells (positive selection).

    Why does each thymic medullary epithelial cell (MEC), under the influence of the AIRE transcription complex, transcribe only a small proportion of the total number of AIRE-transcribable genes? Thus, if the entire set be represented as A-Z, one MEC might transcribe A-C and another D-F, the choice being seemingly random. Thus, if A interacts with Q, this will be evident only in a cell that, by chance, transcribes, say, AQW. The 5-day sojourn of T cells in the medulla should suffice for the deletion of those T cells of high affinity/avidity for any MHC-presented A and Q peptides. 
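    The chance element in this argument can be made concrete with a minimal sketch. It assumes, purely for illustration, that the A-Z set has 26 genes, that each MEC transcribes a uniformly random subset of fixed size, and that cells choose independently; the function names and the numbers are hypothetical and are not taken from any of the studies cited here.

```python
from math import comb


def prob_cotranscribed(n_genes: int, subset_size: int) -> float:
    """Probability that one cell's random subset contains two given genes.

    Assumption: the cell transcribes a uniformly random subset of
    `subset_size` genes out of `n_genes` (e.g. A and Q within A-Z).
    """
    if subset_size < 2:
        return 0.0
    # Fix the two genes of interest, then choose the rest of the subset freely.
    return comb(n_genes - 2, subset_size - 2) / comb(n_genes, subset_size)


def prob_seen_by_some_cell(n_genes: int, subset_size: int, n_cells: int) -> float:
    """Probability that at least one of `n_cells` independent cells
    co-transcribes the two given genes."""
    p = prob_cotranscribed(n_genes, subset_size)
    return 1.0 - (1.0 - p) ** n_cells


if __name__ == "__main__":
    # Illustrative values only: 26 genes (A-Z), 3 per cell (e.g. AQW).
    print(prob_cotranscribed(26, 3))
    print(prob_seen_by_some_cell(26, 3, 1000))
```

    Under these assumptions a single cell rarely co-transcribes a given pair, yet across a large population of MECs the pair is almost certain to appear together in some cell, which is all that the argument requires.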

    At the time of this writing, the conventional wisdom is that all thymic AIRE transcripts are translated into proteins which are then, by some mechanism, displayed as MHC peptide complexes without overloading the mechanism. So peptides from A and Q and W - all three - would be displayed as pMHC on an AQW transcribing MEC, apparently without the necessity for prior protein aggregation. This would somehow suffice to eliminate high affinity/avidity T cells. But this negative selection is likely to require higher pMHC concentrations at the cell surface than positive selection. How are such high concentrations to be achieved?

    By restricting a MEC to displaying only part of the A-Z spectrum, the cell is more likely to be able to achieve the required high, specific, pMHC concentration (since there is less competition from other pMHCs). By restricting the display only to members of a part of the A-Z spectrum (e.g. AQW), some members of which can interact (e.g. A + Q), the cell is even more likely to be able to achieve the high pMHC concentration that is needed for negative selection. Subsequently, peripheral promiscuous expression could generate a wide range of proteins (A-Z) in the hope that at least one would generate a novel coaggregate with a pathogen protein. This would lead to pMHC display and peripheral positive selection of T cells as part of the normal immune response (Click Here).

Gardner, J. M. & Anderson, M. S. (2009) Nature Immunol 10, 934-936. The sickness unto Deaf.

Kyewski, B. & Derbinski, J. (2004) Nature Revs Immunol 4, 688-698. Self representation in the thymus: an extended view.

End Note (October 2011)

Just as there is diversification of the antibody repertoire to confront potentially harmful extracellular agents, so, in an emergency ("stress"), it might be predicted that there would be short-term diversification of the "RNA antibody" repertoire to confront potentially harmful intracellular agents (e.g. the nucleic acid thereof). It would be more important, in the short term, to protect the cell, than to optimally perform an RNA's usual function. Thus, we should not be surprised (although we are!) that Carmi et al. (2011) report widespread ultra-editing of RNAs that is likely to be mediated by ADARs (adenosine deaminases acting on double-stranded RNAs). Interestingly, most ultra-editing is seen with RNAs extracted from a liver, part of which had been subjected to the stress of partial hepatectomy. Referring to Samuel (2011), the authors suggest: "The extreme number of ultra-edited RNAs from a regenerating liver library may also indicate induction of ADAR1 due to stress, possibly a viral infection." As an additional benefit, since most mRNAs have much secondary structure (probably because the corresponding gene requires such structure), ADARs that convert the purine A (adenine) to the purine I (inosine) would also decrease RNA secondary structure, making it easier to hybridize when the sequence of the RNA was complementary to the nucleic acid sequence of a potentially pathogenic agent. Carmi et al. also note: "Ultra-edited RNAs exhibit the known sequence motif of ADARs and tend to localize in sense strand Alu elements," and that "ultra-editing occurs primarily in Alu-rich regions."

Carmi S, Borukhov I, Levanon EY (2011) PLOS Genetics 7, e1002317. Identification of widespread ultra-edited human RNAs.

Samuel CE (2011) Virology 411, 180-193. Adenosine deaminases acting on RNA (ADARs) are both antiviral and proviral.

 

End Note (December 2012)

Leonova et al. (2013) report "massive transcription" of repetitive elements when the tumor suppressor p53 is inactivated and cultured mouse cells are treated with a demethylating agent likely to remove the (generally inhibitory) 5-methyl group from DNA cytosine residues. Noting that under "stress" conditions such repetitive elements are normally transcribed, so increasing the probability of dsRNA formation, they suggest that the inhibitory effects of p53 and DNA methylation are overcome under such conditions. The interferon response follows.

Leonova KI et al. (2013) Proc. Natl. Acad. Sci. USA 110: E89-E98. p53 cooperates with DNA methylation and a suicidal interferon response to maintain epigenetic silencing of repeats and noncoding RNAs.

End Note (Mar 2013)

Zabolotneva et al. (2010) have also suggested an RNA-based "intracellular 'immune system'," where alarms begin ringing when an RNA of viral origin forms dsRNA with an 'RNA antibody' of host origin. Like us, they suspect that it may partly explain the variable quantities of 'junk DNA' found in genomes. Furthermore, they propose, and present bioinformatic analyses supporting, the idea that "Casual [random] combinations of nucleotides in - - the genome might create new DNA motifs that theoretically, after being transcribed, could be used by the host organism as [a] tool for recognition and targeting of intracellular pathogen transcripts. Novel transcribed [host] DNA motifs that would target the host genes [i.e. 'self'] would be eliminated from the genome, whereas those that complementarily match the pathogen RNAs would be positively selected. Neutral motifs [yet to find a pathogen target but not interacting with 'self'] could be 'stored' in the genomes as ordinary non-coding DNA."

Zabolotneva A, Tkachev V, Filatov F, Buzdin A (2010) Biology Direct 5, 62. How many antiviral small interfering RNAs may be encoded by the mammalian genome?

End Note (July 2016)

Enard et al. (2016) "conservatively estimate" that "viruses have driven close to 30% of all adaptive amino acid changes in the part of the human proteome conserved within mammals." Such "virus interacting proteins" vastly exceed the known proteins that regularly engage in immune responses to viruses (e.g. protein kinase R). This is consistent with our above suggestion of intracellular protein "immune receptors." Thus, over evolutionary time a protein that primarily evolved for a distinct function, but also happened to cross-react with some virus component, would in addition be selected by virtue of the latter function.

Enard D, Cai L, Gwennap C, Petrov DA (2016) eLife 5:e12469. Viruses are a dominant driver of protein adaptation in mammals.


End Note (Aug 2016)

Several studies suggest that ADAR hyperediting is primarily aimed to prevent formation of self-dsRNAs with strong secondary structure, thus assisting discrimination from not-self RNAs that might have more G-C pairs (i.e. more stable structures; 1-3). It is suggested above that more open (i.e. weaker) self-RNA structures will expand the repertoire of "antibody RNAs" with the potential to recognize and target (form dsRNA with) intracellular pathogen transcripts.  

1. Mannion et al. (2014) The RNA-editing enzyme ADAR1 controls innate immune responses to RNA. Cell Reports 9: 1482-94.

2. Liddicoat et al. (2015) RNA editing by ADAR1 prevents MDA5 sensing of endogenous dsRNA as nonself. Science 349: 1115-20.

3. Savva et al. (2016) Reprogramming, circular reasoning and self versus non-self: one-stop shopping with RNA editing. Frontiers in Genetics 7, article 100.

 

End Note (March 2019)

The above report of Leonova et al. (2013), documenting the unleashing of innate immune mechanisms following the removal of epigenetic repression, has been extended by the studies of Brocks et al. (2017) and Jones et al. (2019). Thus the latter note: "DNA methylation inhibitors - - induce the expression of thousands of transposable elements including endogenous retroviruses and latent cancer testis antigens normally silenced by DNA methylation in most somatic cells. This results in a state of viral mimicry in which treated cells mount an innate immune response by turning on viral defence genes and potentially expressing neoantigens."

Leonova KI et al. (2013) Proc. Natl. Acad. Sci. USA 110: E89-E98. p53 cooperates with DNA methylation and a suicidal interferon response to maintain epigenetic silencing of repeats and noncoding RNAs.

Brocks D et al. (2017) Nature Genetics 49:1052-1060. DNMT and HDAC inhibitors induce cryptic transcription start sites encoded in long terminal repeats.

Jones PA et al. (2019) Nature Reviews Cancer 19:151-161. Epigenetic therapy in immuno-oncology.

 

End Note (March 2020)

Miao Wang et al. (2020) report that endogenous retroviral elements (HERVs) are transcribed increasingly following infection with dengue virus and this leads to local transcription of host genes that assist immune defences.

Wang, M. et al. (2020) Virology 544, 21-30. Transcription profile of endogenous retroviruses in response to dengue virus serotype 2 infection.

 

Update on Heat-Shock Response and Self/Not-Self Discrimination (2004 abstract and slides) (Click Here)

Placed here in 2002 and last edited on 13 Nov 2020 by Donald Forsdyke