Microbiology
Microbiology 152 (2006), 1951-1968; DOI  10.1099/mic.0.28861-0
© 2006 Society for General Microbiology

Two novel conjugative plasmids from a single strain of Sulfolobus

Gaël Erauso1,2, Kenneth M. Stedman3,4, Harmen J. G. van de Werken1, Wolfram Zillig3,† and John van der Oost1

1 Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands
2 UMR CNRS 6539, IUEM, Université de Bretagne Occidentale, Technopôle Brest-Iroise, Place Copernic, 29280 Plouzané, France
3 Max-Planck-Institut für Biochemie, Martinsried, Germany
4 Biology Department, Portland State University, Portland, OR 97207, USA

Correspondence
Gaël Erauso
gael.erauso@univ-brest.fr


    ABSTRACT
Two conjugative plasmids (CPs) were isolated and characterized from the same ‘Sulfolobus islandicus’ strain, SOG2/4. The plasmids were separated from each other and transferred into Sulfolobus solfataricus. One has a high copy number and is not stable (pSOG1) whereas the other has a low copy number and is stably maintained (pSOG2). Plasmid pSOG2 is the first Sulfolobus CP found to have these characteristics. The genomes of both pSOG plasmids have been sequenced and were compared to each other and the available Sulfolobus CPs. Interestingly, apart from a very well-conserved core, 70 % of the pSOG1 and pSOG2 genomes is largely different and composed of a mixture of genes that often resemble counterparts in previously described Sulfolobus CPs. However, about 20 % of the predicted genes do not have known homologues, not even in other CPs. Unlike pSOG1, pSOG2 does not contain a gene for the highly conserved PlrA protein nor for obvious homologues of partitioning proteins. Unlike pNOB8 and pKEF9, both pSOG plasmids lack the so-called clustered regularly interspaced short palindrome repeats (CRISPRs). The sites of recombination between the two genomes can be explained by the presence of recombination motifs previously identified in other Sulfolobus CPs. Like other Sulfolobus CPs, the pSOG plasmids possess a gene encoding an integrase of the tyrosine recombinase family. This integrase probably mediates plasmid site-specific integration into the host chromosome at the highly conserved tRNAGlu loci.


Abbreviations: CP, conjugative plasmid

The GenBank/EMBL/DDBJ accession numbers for the sequences of the pSOG plasmids are DQ335583 (pSOG1) and DQ335584 (pSOG2).

An alignment of the Sulfolobus CP integrases with representative members of the tyrosine recombinases is available as supplementary data with the online version of this paper.

†Deceased.


    INTRODUCTION
Sulfolobus solfataricus was one of the first organisms to be recognized as a member of the Archaea (Zillig et al., 1980). Due to this early identification, S. solfataricus and its relatives have become model organisms for fundamental studies of Archaea. Studies of the genus Sulfolobus have been instrumental in understanding archaeal mechanisms of transposition (Martusewitsch et al., 2000), transfection (Schleper et al., 1992), transformation (Aravalli & Garrett, 1997; Cannio et al., 1998; Elferink et al., 1996; Stedman et al., 1999) and conjugation (Reilly & Grogan, 2001; Schleper et al., 1995). An impressive variety of mobile genetic elements has recently been discovered in Archaea in general, and in Sulfolobus in particular: viruses, autonomous insertion sequence (IS) elements, non-autonomous miniature inverted repeat transposable elements (MITEs), small non-conjugative plasmids and large conjugative plasmids (Brugger et al., 2002; Prangishvili et al., 2001; Rice et al., 2001; Zillig et al., 1998). Although there have been impressive recent developments in Sulfolobus genetics, genetic manipulation of these organisms remains a bottleneck (Albers et al., 2006; Bartolucci et al., 2003; Jonuscheit et al., 2003; Stedman et al., 1999; Worthington et al., 2003).

The first archaeal conjugative plasmid (CP), pNOB8, was isolated from a Japanese Sulfolobus isolate (Schleper et al., 1995). Since then, several other CPs have been isolated from colony-cloned strains of ‘Sulfolobus islandicus’, and subsequently characterized (Greve et al., 2004; Stedman et al., 2000). Sequence comparison of all Sulfolobus CPs revealed three distinct sequence domains. One well-conserved cluster of genes covering approximately 12 kbp of the plasmids' genomes apparently contains the conjugative functions. A second is the putative origin of replication. Finally there is a region proposed to encode replication proteins (Greve et al., 2004). Only a few distant homologues to bacterial proteins involved in conjugative transfer (TraG, TrbE) and partitioning (ParA, ParB) have been found. In the case of the pNOB8 and pING plasmids, derived variant plasmids were detected upon propagation. These occur as a result of deletion and recombination (She et al., 1998; Stedman et al., 2000). Comparing the conserved sequences of CPs with some non-conjugative derivatives has provided insight into proteins and DNA sequence motifs putatively involved in conjugation in Archaea.

A single strain of ‘S. islandicus’ SOG2/4 was found to harbour two very different but related plasmids. One of these had a stable low copy number in the well-characterized S. solfataricus P1 strain, so it was of interest for the development of genetic tools. The two plasmids were separated and characterized. Here we present the complete sequences of these two archaeal CPs (pSOG1 and pSOG2). Comparison of these novel CPs with the available counterparts has been used to further identify plasmid features that play key roles in conjugative transfer in Archaea.


    METHODS
Sulfolobus growth, DNA isolation and analysis.
Single-colony isolates of strains containing the pSOG plasmids were obtained and grown in standard Sulfolobus medium as described previously (Zillig et al., 1994). Plasmid DNAs were prepared from 4 litres of freshly conjugated cells (1 : 10 000 donor to recipient followed by growth for 48 h) by using a variation of the alkaline lysis method of Birnboim & Doly (1979) as described previously (Arnold et al., 1999). Total DNA, i.e. chromosomal plus plasmid DNA, was isolated as described by Arnold et al. (1999). For electrophoretic analysis, about 3 µg total DNA or 1 µg plasmid DNA was digested with the appropriate restriction enzyme and separated on 0.6–1.0 % agarose gels (Sambrook et al., 1989). Southern hybridizations were performed by using the DIG labelling and detection kit from Roche Diagnostics, according to the manufacturer's instructions. The copy number of the pSOG plasmids was determined by estimation of the ratio of single-copy chromosomal fragments to plasmid fragments in restriction digests of total DNA as described previously (Schleper et al., 1995).
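The band-ratio estimate of copy number described above can be illustrated with a short calculation. This is a minimal sketch, not the procedure of Schleper et al. (1995): it assumes that band signal on a stained gel is proportional to DNA mass, so dividing each signal by its fragment length gives a value proportional to the number of molecules. The function name and the example numbers are hypothetical.

```python
def estimate_copy_number(plasmid_signal, plasmid_fragment_bp,
                         chromosome_signal, chromosome_fragment_bp):
    """Estimate plasmid copies per chromosome from two bands in one lane.

    Assumption: band signal is proportional to DNA mass, so
    signal / fragment length is proportional to the molar amount.
    """
    if plasmid_fragment_bp <= 0 or chromosome_fragment_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    if chromosome_signal <= 0:
        raise ValueError("chromosomal band signal must be positive")
    plasmid_molar = plasmid_signal / plasmid_fragment_bp
    chromosome_molar = chromosome_signal / chromosome_fragment_bp
    return plasmid_molar / chromosome_molar


# Hypothetical densitometry readings (arbitrary units) and fragment sizes.
copies = estimate_copy_number(plasmid_signal=400, plasmid_fragment_bp=2000,
                              chromosome_signal=50, chromosome_fragment_bp=5000)
print(f"approx. {copies:.0f} plasmid copies per chromosome")
```

The chromosomal fragment must be single-copy for the ratio to be meaningful, which is why the text specifies single-copy chromosomal fragments.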

Cloning and sequencing.
Prior to cloning, plasmid DNA preparations were purified by ultracentrifugation in a caesium chloride gradient in the presence of ethidium bromide (1 mg ml⁻¹) (Sambrook et al., 1989). Digestion of both plasmids with EcoRI produced 11 bands for pSOG1 and 10 bands for pSOG2, ranging from 0.3 to 7.2 kbp. All of these fragments were cloned in the EcoRI site of pUC28 (Benes et al., 1993). Fragments obtained by digestion with BamHI, HindIII, PstI and XbaI in the size range 0.8–4.5 kb were also cloned in the corresponding sites of pUC28 to obtain an overlapping clone library for pSOG1 and pSOG2. Sequencing reactions were carried out on a LiCor DNA sequencer 4000L with a Thermo Sequenase fluorescent-labelled primer cycle sequencing kit (Amersham Biosciences) and infrared-labelled primers M13 forward and M13 reverse (MWG-Biotech). Gaps in the sequence were filled by using specific primers either directly for sequencing on library clones or to sequence PCR amplicons obtained with native pSOG DNA as template. The sequences were trimmed and assembled using the SeqMan II program (Lasergene package), with both strands completely sequenced and with a minimum threefold coverage.
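Once a circular sequence has been assembled, the expected restriction pattern can be checked against the gel by an in silico digest. The sketch below is illustrative only and is not part of the published workflow; the function name is hypothetical and the toy sequence is invented. It uses the known EcoRI recognition site GAATTC, which is cut after the first G.

```python
def circular_digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths (largest first) for a circular sequence.

    cut_offset is the position within the site after which the enzyme
    cuts (1 for EcoRI, G^AATTC). Sites spanning the origin are found by
    appending the first len(site) - 1 bases to the end of the sequence.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return []
    extended = seq + seq[:len(site) - 1]
    cuts = sorted({(i + cut_offset) % n
                   for i in range(n) if extended.startswith(site, i)})
    if not cuts:
        return [n]  # uncut circle
    # k cuts on a circle give k fragments; a single cut linearizes it.
    fragments = [((cuts[(j + 1) % len(cuts)] - cuts[j]) % n) or n
                 for j in range(len(cuts))]
    return sorted(fragments, reverse=True)


# Toy 26 bp circle with two EcoRI sites (invented for illustration).
toy = "GAATTC" + "A" * 10 + "GAATTC" + "C" * 4
print(circular_digest(toy))
```

Applied to the deposited sequences, the number of fragments returned could be compared with the 11 and 10 EcoRI bands reported for pSOG1 and pSOG2; co-migrating or very small fragments may make the gel count lower than the computed one.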

Computer analysis.
DNA sequences were analysed using Vector NTI software (version 9, Informax). Direct and inverted sequence repeats were detected by using the GeneQuest program (Lasergene). Cumulative GC skews were computed with the Genskew software (http://mips.gsf.de/services/analysis/genskew) and the Z-curve program (http://tubic.tju.edu.cn/zcurve/). Analyses were done with a window size of 30 nt. Identification of putative genes and operons was performed using the FGENESB pattern/Markov chain-based prediction program from Softberry (http://softberry.com/berry.phtml) and the pre-trained parameters of Sulfolobus solfataricus and S. tokodaii. Putative promoters (TATA box), Shine–Dalgarno sequences and terminators were identified with a window size of 12, 6 and 11 nt, respectively, in the 50 nt sequences upstream or downstream of the predicted gene start and stop codon. The nucleotide sequences were analysed using the Gibbs sampler algorithm (Thompson et al., 2003). Sequence logos were generated using WebLogo (Crooks et al., 2004). Homology searches were performed with a range of BLAST tools at the NCBI server (http://www.ncbi.nlm.nih.gov/blast). Identities were calculated with the program LALIGN at the Swiss EMBnet node server (http://www.ch.embnet.org/). Combined searches of a number of databases of protein families, domains and functional sites were performed using SMART (http://smart.embl-heidelberg.de/) and the CDD tool (NCBI). The program COILS (EMBnet) was used for finding α-helical coiled-coil domains. Transmembrane domains were predicted by the programs PSORT (http://psort.nibb.ac.jp/), TMPRED (EMBnet) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM). Identification of potential signal peptides was done with SIGNALP (http://www.cbs.dtu.dk/services/SignalP). For phylogenetic analyses, the deduced amino-acid sequences of the largest conserved ORFs in each Sulfolobus CP were aligned using MUSCLE (Edgar, 2004) and revised manually.
Trees were generated from each individual alignment and for concatenated alignments of several ORFs, using the neighbour-joining method (Saitou & Nei, 1987) of the MEGA 3.1 program (Kumar et al., 2004). Distances were calculated using the Poisson correction (PC) distance model (Nei & Kumar, 2000). Tree significance was assessed by bootstrapping 1000 times.
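For readers unfamiliar with the cumulative GC skew analysis mentioned in this section, the following sketch shows the general idea. It is not the Genskew or Z-curve implementation: it assumes non-overlapping windows, the usual per-window skew (G − C)/(G + C), and a linear treatment of the sequence. The function name is hypothetical; the default window of 30 nt is the value given in the text.

```python
def cumulative_gc_skew(sequence, window=30):
    """Return the running sum of (G - C) / (G + C) over successive windows.

    Assumptions: non-overlapping windows, trailing partial window ignored,
    windows without G or C contribute a skew of 0.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    totals = []
    running = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running += skew
        totals.append(running)
    return totals


# Invented toy sequence: a G-rich window, a C-rich window, an AT-only window.
toy = "G" * 30 + "C" * 30 + "AT" * 15
print(cumulative_gc_skew(toy))
```

In such plots the extremes of the cumulative curve are commonly taken as candidate origin and terminus positions, which is how the approximate Ori and Ter locations in a circular map can be proposed; the assignment remains a prediction until tested experimentally.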


    RESULTS AND DISCUSSION
Origin of the pSOG2/4 plasmids
Strain SOG2/4 harbouring the conjugative plasmid (CP) pSOG2/4 was isolated from samples collected in the Sogasel Icelandic solfataric field and belongs to a species provisionally called ‘Sulfolobus islandicus’, closely related to S. solfataricus (Zillig et al., 1994). The conjugative nature of this plasmid was shown by its capacity to be directly transferred from donor into recipient cells, resulting in complete spread through the recipient culture (Prangishvili et al., 1998). Upon conjugation into a foreign host the plasmid was amplified to high copy number (more than 35 copies per chromosome), as observed for the other Sulfolobus CPs. Unlike other Sulfolobus CPs, which were mostly stable immediately after conjugation, the plasmid from SOG2/4 appeared to be very unstable when transferred into S. solfataricus strain P1 but remained indistinguishable from the original when another ‘S. islandicus’ strain, HVE10/4, was the recipient (Prangishvili et al., 1998) (Fig. 1). However, in the former case, as often observed for other Sulfolobus CPs, prolonged growth of transcipients resulted in plasmid variant formation and eventual curing (Schleper et al., 1995; She et al., 1998). Plasmid pSOG1 (previously named pSOG2/4 clone 1; Prangishvili et al., 1998) was isolated from a single colony obtained by plating transcipient cultures from a third successive conjugative transfer in S. solfataricus strain P1 (Prangishvili et al., 1998). A stable clone, named pSOG2 (previously named pSOG2/4 clone A), was obtained by conjugative transfer first into strain HVE10/4 and subsequently into S. solfataricus P1; pSOG2 appeared to be indistinguishable by restriction endonuclease digestion from the original pSOG2/4 CP. See Fig. 1 for an overview of the isolation procedure.
Comparison of the restriction endonuclease digestion patterns and Southern blotting of pSOG1 and pSOG2 showed that the two plasmids differ dramatically (not shown; and see Prangishvili et al., 1998). Only one-third of the sequence of pSOG2 is conserved in pSOG1 (Fig. 2). The rest of the pSOG1 genome was likely acquired via recombination from another low-copy-number plasmid in the parent strain SOG2/4 that was lost upon passage through the ‘S. islandicus’ HVE10/4 strain. The ‘pSOG1’ plasmid must have been present in the original strain at very low copy number because it was not detected by Southern blotting that detected low-copy-number plasmids. After long exposures, background hybridization to genomic DNA was detected that may indicate an integrated plasmid (not shown). After passage, pSOG2 can be stably propagated in S. solfataricus P1. Copy number control appeared to be lost or damaged in the pSOG1 variant, as indicated by its extremely high copy number in S. solfataricus P1 as compared to the low copy number of pSOG2.


Fig. 1. Generation of plasmids pSOG1 and pSOG2. Plasmid hosts are shown as large ovals. Plasmids are shown as circles; a dashed-lined circle indicates plasmid loss. Black arrows represent plasmid transfer by conjugation. The broad white arrow represents single colony isolation. ‘lc’ indicates low copy number; ‘hc’ indicates high copy number; ‘vlc’ indicates very low copy number, which may be an integrated copy. ‘pSOG1’ is the precursor of the plasmid pSOG1, which contains putative conjugation genes and origin of the pKEF family of Sulfolobus CPs. ‘pSOG2/4’ is the originally observed plasmid and precursor of plasmid pSOG2.

 

Fig. 2. Comparison of the pSOG1 and pSOG2 sequences. This diagram shows the circular genomes of pSOG1 on the outside and pSOG2 on the inside. ORFs are shown as arrows. Similar ORFs in the two plasmids are filled in grey; identical ORFs are filled in black; ORFs not conserved between the two plasmids are not filled. ORFs with predicted functions are labelled and ORFs discussed in the text are in bold. Insertions and gene replacements are indicated by dashed lines between the two genomes. ORF names are shown next to the corresponding arrows. The recombination motif TAAACTGGGGAGTTTA is represented by a small disk, coloured green when present on the direct DNA strand and light blue when located on the complementary strand. Blue disks indicate the two larger tandem repeats, and a red disk indicates larger inverted repeats. The violet oval represents the putative site of integration attP. The approximate locations of the origin (Ori) and terminus (Ter) of replication as predicted by cumulative GC skew and Z-curve analyses are also indicated.

 
Nucleotide sequence of the pSOG1 and pSOG2 plasmids
The assembled circular sequences are 29 000 bp in length for pSOG1 and 26 960 bp for pSOG2 (Fig. 2). Their overall G+C contents were 35.8 mol% and 36.7 mol% respectively. The corresponding value for the chromosome of ‘S. islandicus’ is not known but the value determined for S. solfataricus (35.8 %) (She et al., 2001) is identical to that of pSOG1. As previously deduced from their EcoRI restriction patterns and Southern hybridizations, the two genomes share a large 100 % identical region of 9842 bp (nucleotides 14 696–21 466 and 24 217–27 129 for pSOG1, 15 815–25 657 for pSOG2). This region in pSOG1 is interrupted by a non-homologous sequence of 2756 bp. As expected from previous studies (Greve et al., 2004; Stedman et al., 2000), pSOG2 shares extensive nucleotide sequence similarity (long stretches of sequences up to ~3 kbp with more than 95 % identity) with other Sulfolobus CPs of the pKEF group (nomenclature according to Greve et al., 2004), whereas pSOG1 has more similarities with the pARN group of plasmids, which also contains plasmids integrated into the genomes of Sulfolobus acidocaldarius and S. tokodaii (Chen et al., 2005; She et al., 2004).

The G+C content of pSOG CPs is not evenly distributed, displaying a number of peaks and troughs (not shown). Five regions of more than 2000 bp have a higher G+C content (>36 mol%); these fragments roughly correspond to parts of the genome that encode the most-conserved ORFs in Sulfolobus CPs (Fig. 2, Table 1). In contrast, lower G+C regions are less extended and contain less-conserved ORFs. The latter fragments may encode functional units, such as partitioning and additional elements involved in conjugation (see below), also indicating that pSOG2/4 plasmids have a mosaic structure composed of elements of diverse origin. A clear minimum, corresponding to several successive short poly(A) stretches, is located just in front of ORF175, present in both plasmids (ORFs present in only one plasmid are listed as ORF1- for pSOG1 and ORF2- for pSOG2).
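A G+C profile of the kind summarized above can be produced with a simple sliding window. The sketch below is an illustration under stated assumptions rather than the analysis used for the paper: the window and step sizes are arbitrary choices, the sequence is treated as linear, and the function names are hypothetical. Only the 36 mol% threshold is taken from the text.

```python
def windowed_gc(sequence, window=2000, step=500):
    """Return (start, G+C percent) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc / window))
    return results


def windows_above(results, threshold=36.0):
    """Return the start positions of windows whose G+C exceeds the threshold."""
    return [start for start, percent in results if percent > threshold]


# Invented toy sequence with one GC-only and one AT-only window.
profile = windowed_gc("GGCCAATT", window=4, step=4)
print(profile, windows_above(profile))
```

Merging adjacent windows that exceed the threshold would give candidate high-G+C regions that could then be compared with the positions of the conserved ORFs.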


Table 1. Properties of ORFs and operons of plasmids pSOG1 and pSOG2

 
ORF distribution
Forty-six ORFs encoding a product at least 50 aa in length were identified in the genome of pSOG1 and 41 ORFs in the genome of pSOG2 (Fig. 2, Table 1). These putative genes are generally closely spaced (mean density 1.55 ORFs per kb). However, two regions with larger intergenic spaces are located at the borders between the conserved and the variable regions of the two pSOG plasmids: between ORFs 1-76 and ORF87 and from ORF87 to ORF175a on one side, and between 1-68a and 1-125c on the other side. The putative genes are almost equally distributed on both DNA strands and apart from a few clusters of genes that are probably co-transcribed (see below) the genes are evenly dispersed. This ORF distribution is also observed for the other Sulfolobus CPs (Greve et al., 2004) (see below). Another common feature of the plasmids is that all of the larger ORFs (encoding >500 aa) are located in the same conserved region of their genome (Fig. 2, Fig. 4).
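The mean ORF density quoted above can be reproduced from the ORF counts in this section and the plasmid lengths reported in this paper (29 000 bp for pSOG1, 26 960 bp for pSOG2). The short calculation below is only a worked check; the variable and function names are hypothetical, and averaging the two per-plasmid densities is an assumption about how the mean was obtained.

```python
# ORF counts and plasmid lengths (bp) as reported in the text.
PLASMIDS = {"pSOG1": (46, 29000), "pSOG2": (41, 26960)}


def orfs_per_kb(orf_count, length_bp):
    """Return the number of ORFs per kilobase of sequence."""
    return orf_count / (length_bp / 1000.0)


densities = {name: orfs_per_kb(count, length)
             for name, (count, length) in PLASMIDS.items()}
mean_density = sum(densities.values()) / len(densities)

for name, value in densities.items():
    print(f"{name}: {value:.2f} ORFs per kb")
print(f"mean: {mean_density:.2f} ORFs per kb")
```

Pooling the two plasmids instead (87 ORFs over 55 960 bp) gives the same value to two decimal places, so the reported figure does not depend on which averaging is used.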


Fig. 4. Comparison between pSOG1 and pSOG2 and other Sulfolobus CPs. Genome maps are shown for all of the published Sulfolobus CPs and for the presumed defective plasmid pTC (see text for details). Homologous ORFs in different plasmids can be identified by colour and pattern. ORFs represented by white arrows have no homologues in the other Sulfolobus CPs. Predicted regions encoding conjugative functions are shown as region A, the putative replication origin in region B is delimited by a short red horizontal bar and the putative replication region is labelled C (figure modified from Greve et al., 2004, with permission). ORFs discussed in the text are labelled.

 
Operons and putative transcriptional and translational signals
The pSOG plasmid ORFs start at ATG (79.5 %), TTG (17.5 %) or GTG (3 %) and terminate at TAA (44 %), TGA (39 %) or TAG (17 %). This distribution of start and stop codons resembles that of the S. solfataricus chromosome (Garcia-Vallve et al., 2003). Except for the above-mentioned two regions containing larger intergenic sequences, in 83 % of the cases an ORF is found within 50 nt of the previous ORF's stop codon. Moreover, for 31 % of the collinear ORFs this distance is less than 20 nt (75 % of these overlap), and the latter have been considered to be part of an operon. Sequence logos derived from alignment of the 50 nt upstream sequences of pSOG genes allowed us to identify putative translational and transcriptional signals (Fig. 3). The consensus ribosome-binding site (GGTGA) was found in all but a few genes that are assumed to be part of an operon (Table 1, Fig. 3). It is optimally located at positions –10 to –7 bp upstream of the putative start codon. This sequence is the reverse complement of part (UCACC) of the 3' end of the 16S rRNA sequence from S. solfataricus (GGAUCACCUCA-3'). However, such a sequence was not detected for single genes or first genes of a candidate operon, confirming the results of previous analyses done on Sulfolobus (Tolstrup et al., 2000) and later on a large set of archaeal genomes (Torarinsson et al., 2005). Accordingly, for this class of genes, we also found a 7–8 nt A+T-rich sequence centred between positions –25 and –27 from the start codon, fulfilling the criteria for the Sulfolobus TATA box of Soppa (1999). The promoter sequences of S. solfataricus generally contain a transcription factor B responsive element (BRE) with two to four A(T)s generally located 2 nt upstream of the TATA box sequence (Bell & Jackson, 2000); such a conserved BRE could generally be identified in both pSOG plasmids (Table 1, Fig. 3).
The distance between the predicted TATA boxes and the putative start codon coincides with the mean interval found experimentally between the TATA box and the transcriptional start in mapped Sulfolobus promoters (Dalgaard & Garrett, 1993; Reiter et al., 1988). This means that there is little or no room for a ribosome-binding site and explains why this signal was not found in our analysis of single genes and first genes of an operon. It also implies that translation initiation for this class of genes must depend on a mechanism other than a Shine–Dalgarno sequence (Condo et al., 1999). Table 1 summarizes the information obtained from transcriptional and translational signal searches: pSOG plasmids appear to be organized in 30 transcription units (TU) for pSOG1 (5 operons and 25 single genes) and 26 TU (6 operons and 20 single genes) for pSOG2. Additional support for the co-transcription of the proposed TUs is provided by the identification, downstream of the last gene of the TU, of potential transcriptional terminators identical to those found in the virus SSV1 (Palm et al., 1991) and in the Sulfolobus chromosome (She et al., 2001) e.g. 5'-TTTTTT or 5'-TTTTCTT or 5'-TTTATTTT. The fraction of single genes (69 %) is quite high compared to that found in other Sulfolobus extrachromosomal elements (e.g. 27 % for the genome of virus SSV1; Palm et al., 1991). This mosaic character may reflect the need for fine-tuning the expression of each gene separately and/or the modularity of these plasmids.
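Two of the signal checks described in this section are easy to reproduce on any sequence: whether the GGTGA ribosome-binding consensus is complementary to the 3' end of the 16S rRNA, and whether one of the listed T-rich terminator motifs occurs downstream of a gene. The sketch below is illustrative only; it uses exact string matching rather than the Gibbs sampler and logo approach of the paper, and the function names and the downstream test sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Terminator motifs quoted in the text (written 5' to 3').
TERMINATOR_MOTIFS = ("TTTTTT", "TTTTCTT", "TTTATTTT")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def rbs_pairs_with_16s(rbs_dna, rrna_3prime):
    """True if the reverse complement of the motif occurs in the rRNA 3' end."""
    target = reverse_complement(rbs_dna).replace("T", "U")
    return target in rrna_3prime.upper()


def find_terminators(downstream_dna):
    """Return sorted (position, motif) hits, allowing overlapping matches."""
    seq = downstream_dna.upper()
    hits = []
    for motif in TERMINATOR_MOTIFS:
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)


print(reverse_complement("GGTGA"))
print(rbs_pairs_with_16s("GGTGA", "GGAUCACCUCA"))
print(find_terminators("ACGTTTTCTTGA"))  # invented downstream sequence
```

Exact matching is a simplification: real signals tolerate mismatches, which is why the paper reports them as sequence logos rather than fixed strings.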


Fig. 3. Sequence logos of putative promoters, ribosome-binding sites (RBS) and terminators of pSOG plasmids. (a) Upstream pattern sequence of single genes and putative first genes of an operon (46 sites); the approximate locations of the BRE motif and TATA box are indicated by horizontal bars. (b) Putative ribosome-binding site of genes within an operon (22 sites). (c) Putative terminator pattern (46 sites).

 
Overall genome comparison with the other Sulfolobus CPs
Similarity searches showed that 53 of the 65 unique ORFs (80 %) of pSOG1 and pSOG2 had significant matches (BLASTP E-value <10⁻⁴) to proteins in public databases. Most of the hits were to hypothetical proteins encoded by other Sulfolobus CPs, showing from 26 % and up to 100 % amino acid sequence identity (Table 1). Ten homologous ORFs are shared by the eight CPs, while over 80 % of the other ORFs are common to two or more CPs. As illustrated in Fig. 4, the conserved ORFs are clustered in two genomic regions separated by a larger intergenic section. These three genomic sections, named A, B and C according to Greve et al. (2004), appear to be functionally distinct. The largest one, section A, also contains the highly conserved large ORFs: 1-668 (TrbE), 1-609, 1-734 and 1-1063 (TraG) (pSOG1 numbering) which are most likely involved in conjugation. Section B carries the putative origin of replication. Section C corresponds to a cluster of closely packed genes, including the six other genes common to all CPs: a putative relaxase (211), an operon containing genes implicated in plasmid replication, 106 (RepA), 62 (CopG), 421 (integrase), 84 and 93b (two hypothetical proteins). The sequence of an apparently defective CP, pTC, from Sulfolobus tengchongensis (Xiang et al., 2003) has been deposited in GenBank (NC_005969). It is missing a number of conserved ORFs from the other CPs, specifically a homologue of the highly conserved pSOG ORF2-779 (which appeared to be partitioned in three ORFs) and putative replication ORFs including all of sections B and C (Fig. 4). We are looking forward to publication of details on its isolation and physical characteristics.

To better evaluate the relationship between the eight self-transmissible plasmids of Sulfolobus, we used the most representative genes as phylogenetic markers. Unrooted trees were obtained from alignment of the eight homologous genes for each of the four largest ORFs of section A, namely TrbE, 2-779, 2-610 and TraG, and for a concatenated alignment of these four genes. The topology obtained for each individual gene (not shown) was very similar to that obtained for the concatenated tree (Fig. 5a) and not dependent on the method used for tree construction (neighbour joining, minimal evolution, parsimony). The concatenated tree clearly shows two distinct groups: (i) the pKEF9 group (Greve et al., 2004), which includes pSOG2 and the more divergent pNOB8 branch, and (ii) the pARN group, to which pSOG1 may belong. However, this clustering does not agree with the integrase tree, where pARN, pHVE and pKEF appear closely related whilst the pING integrase is the most divergent. This difference in tree topology suggests a distinct origin for the respective genomic fragments coinciding with their functional modularity, section A being devoted to conjugation and section C to replication and recombination.
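The two steps that precede tree building here, concatenating per-gene alignments and computing Poisson-corrected distances, can be sketched in a few lines. This is not the MEGA 3.1 implementation: the function names are hypothetical, the toy alignments are invented, and ignoring columns with a gap in either sequence (pairwise deletion) is an assumption. The Poisson correction itself is the standard d = −ln(1 − p), where p is the proportion of differing sites.

```python
import math


def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) end to end."""
    combined = {taxon: "" for taxon in taxa}
    for alignment in alignments:
        lengths = {len(alignment[taxon]) for taxon in taxa}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        for taxon in taxa:
            combined[taxon] += alignment[taxon]
    return combined


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino-acid distance, d = -ln(1 - p)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are skipped.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


# Invented two-gene toy alignment for two plasmids.
genes = [{"pX": "MK", "pY": "M-"}, {"pX": "VLA", "pY": "ILA"}]
joined = concatenate_alignments(genes, ["pX", "pY"])
print(joined, round(poisson_distance(joined["pX"], joined["pY"]), 4))
```

A matrix of such pairwise distances is the input to neighbour joining; bootstrapping resamples alignment columns with replacement and repeats the whole procedure.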


Fig. 5. Phylogenetic relationships between large conserved ORFs of Sulfolobus CPs. (a) Unrooted tree obtained from an alignment of the concatenated four most conserved ORFs of Sulfolobus CPs, presumably involved in conjugation: TrbE, TraG, ORF734 and ORF609 (pSOG1 numbering). (b) Unrooted tree of the integrase gene of Sulfolobus CPs. Trees were constructed using the neighbour-joining method in MEGA 3.1 (Kumar et al., 2004). Branches are labelled with their corresponding bootstrap values (only those greater than 50 % are indicated).

 
Conjugative transfer function
At least three of the largest ORFs in the pSOG CPs are probably involved in the conjugation process. As previously reported for their homologues in other Sulfolobus CPs (Greve et al., 2004; She et al., 1998; Stedman et al., 2000), ORFs 1-1023/2-1082 showed significant similarities with the TraG/VirD4 [cluster of orthologous groups (COG) 3505], and ORFs 1-668/2-615 with TrbE/VirB4 (COG 0433). Both TraG and TrbE represent families of ATPases that are involved in conjugation in bacteria (http://www.ncbi.nlm.nih.gov/COG/) (Grohmann et al., 2003). These two proteins aligned with each other around the type I ATP-binding site (Walker A motif), which occurs at a similar position in each protein (not shown). The TraG and TrbE proteins have been proposed to be coupling proteins (Grohmann et al., 2003) connecting the relaxosome, a DNA-binding protein-complex encoded by both the CP and the host chromosome at the plasmid transfer origin oriT (Lanka & Wilkins, 1995), to the mating-pair formation (mpf) system, a plasmid-encoded multi-protein complex that is involved in the transfer of the donor DNA to the recipient cell (Llosa et al., 2003). In the current model, TraG is a membrane-anchored, multimeric protein forming a pore-like structure that actively exports the transferred DNA (T-DNA) via envelope-spanning mpf components (Llosa et al., 2002, 2003; Llosa & de la Cruz, 2005; Schroder et al., 2002). Accordingly, we found that both TraG-like ORFs 1023/1082 and TrbE-like ORFs 668/615 possess a predicted N-terminal transmembrane domain that could serve as an anchor. They also have the same predicted topology as their bacterial homologues (Schroder & Lanka, 2003). A third large ORF in the pSOG plasmids is also highly conserved in other Sulfolobus CPs (Table 1). These ORFs 1-734 and 2-610 share significant similarities with permeases of the major facilitator superfamily (PSI BLAST with E-value 10⁻⁴⁷ after 5th iteration).
Its product possesses up to 12 putative transmembrane segments covering two-thirds of the protein sequence. It also contains a type I ATP-binding site located roughly in the same region as the TraG-like and TrbE-like ORFs. This motif is part of a conserved domain (COG1196) typical of motor ATPases, including the Smc proteins, involved in chromosome segregation or compaction (Elie et al., 1997).
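The type I ATP-binding site (Walker A motif, or P-loop) discussed above has a well-known consensus, commonly written [AG]-x(4)-G-K-[ST]. The sketch below scans a protein sequence for that pattern with a regular expression. It is an illustration only and not the domain search used in the paper (SMART and CDD); the function name is hypothetical and the test peptide is invented. A pattern this short matches by chance fairly often, so a hit is a hint, not evidence of ATP binding.

```python
import re

# Walker A / P-loop consensus [AG]-x(4)-G-K-[ST]; the lookahead lets
# overlapping candidates be reported as well.
WALKER_A = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_walker_a(protein):
    """Return (0-based start, matched segment) for each Walker A candidate."""
    return [(match.start(1), match.group(1))
            for match in WALKER_A.finditer(protein.upper())]


# Invented peptide containing one candidate motif.
print(find_walker_a("MSTGPSGSGKTLLA"))
```

Comparing the positions of such hits across the TraG-like, TrbE-like and permease-like ORFs would be one simple way to see whether the motif sits in a similar region of each protein, as the text describes.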

By analogy with the model recently proposed for bacteria (Grahn et al., 2000), we assume that in Sulfolobus conjugation, the TraG-like and the TrbE-like proteins form a heteromultimeric complex associated with the cytoplasmic membrane, and pump the DNA through a membrane-spanning channel constituted of at least the permease-like component, which may contribute actively to the T-DNA translocation. Several other pSOG ORFs, including the fourth largest, 1-609 and 2-615, contain putative membrane helices with predicted inner and outer segments (Table 1), and may also be involved in mating pair formation (mpf). In bacterial CPs, genes encoding conjugative transfer functions are generally clustered in one or two tra regions (Grohmann et al., 2003; Pansegrau et al., 1994), one encoding the relaxosome and the other the mpf system. The remarkable conservation of the almost contiguous cluster of ORFs (including the putative TraG, TrbE, ORFs 1-734/2-610 and 1-609/2-615) in all the Sulfolobus CPs suggests that these genes are involved in the same function, probably the mpf system.

Homology searches failed to detect putative components of the relaxosome in the pSOG sequences. Previous comparison of the genome sequence of functionally defective pING variants that had lost their capacity for self-transfer but were still transmissible led to the proposal that some of the conserved ORFs encode mobilization (mob) functions (Stedman et al., 2000). Among the four candidate mob genes only one, ORF211, is present in all self-transmissible plasmids of Sulfolobus CPs (Fig. 4). It is therefore tempting to speculate that, as in bacteria (Francia et al., 2004), archaeal mobile DNA elements carry the information necessary for relaxosome formation. The small pING plasmids of Sulfolobus should contain a gene encoding a functional analogue of the bacterial relaxases as well as a cis-acting transfer origin (oriT). Previous attempts to locate oriT in Sulfolobus CPs showed that six conserved sequence motifs could potentially play that role (Stedman et al., 2000). We found that only one of these sequence elements (‘motif 2’) is conserved in the pSOG plasmids and generally in the Sulfolobus CPs of the pKEF9 family (but not in the pARN plasmids), as well as in mobilizable pING derivatives, and that this motif is always located immediately upstream of ORF211. In a typical bacterial mobilization region, oriT is located upstream of the gene encoding the relaxase (Francia et al., 2004). The genomic context suggests a hypothetical function of relaxase for ORF211. However, the lack of any detectable sequence similarities with bacterial Mob proteins makes this assumption questionable.

Partitioning and plasmid maintenance
The N-terminal half (amino acids 1–160) of ORF1-349 in pSOG1 is similar to the highly conserved ParB/SopB protein family involved in partitioning of bacterial plasmids and chromosomes (Table 1). The sequence includes two conserved motifs that are proposed to be involved in interaction with ParA/SopA and unknown host factors in bacteria (Hanai et al., 1996), and a helix–turn–helix DNA-binding domain typical of the ParB family. The three motifs aligned well with a set of divergent bacterial ParB proteins but poorly with ORF470 and ORF422 of the Sulfolobus plasmid pNOB8 (She et al., 1998), and not at all with ORFs in other Sulfolobus CPs (not shown). Due to (i) the relatively high similarity of ORF1-349 to bacterial rather than archaeal homologues, (ii) the significant difference in codon usage compared to the other pSOG ORFs (data not shown), and (iii) the genomic context (Fig. 2), it is suspected that ORF1-349 and surrounding DNA are the result of lateral gene transfer. Thus this parB homologue has most likely been acquired from a bacterial plasmid. The C-terminal region of ORF1-349 showed similarity to COG 5483, for which no function has yet been established. Surprisingly, a homologue of ParA/SopA appears not to be encoded by pSOG1. In bacteria, plasmid partitioning during cell division proceeds via the so-called segrosome, a protein complex consisting at least of ParA and ParB, which are encoded by the plasmid-borne par locus (Gerdes et al., 2000; reviewed by Hayes & Barilla, 2006). ParA is a membrane-associated ATPase that forms a complex with the DNA-binding ParB (Bignell & Thomas, 2001); the binding site of ParB is the cis-acting partition site parS. The archaeal plasmid pNOB8, which is stably maintained at a low copy number in its natural host (~5 copies per chromosome), contains one ParA and two ParB homologues, as well as a putative parS element.
These elements are missing in the unstable, high-copy-number variant pNOB8-33, formed after conjugative transfer in the foreign host S. solfataricus P1 (She et al., 1998). Previously, it was reported that Par-like components are also absent in the other Sulfolobus CPs, which led to the conclusion that they may lack copy number control and partitioning (Greve et al., 2004). Similarly, no homologues of parA or parS appear to be present on the genomes of the pSOG plasmids. Maintenance of pSOG1 probably does not require a partitioning system, most likely due to its high copy number (40–50 copies per chromosome) in S. solfataricus P1. In the case of pSOG2, however, the stable low copy number does suggest some partitioning system that is different from the Par system. One such alternative has been suggested to be the so-called clustered regularly interspaced short palindrome repeats (CRISPRs, previously referred to as SRSR). These repeats are present in many prokaryotic genomes (Jansen et al., 2002), and also in pNOB8 and pKEF9 (Greve et al., 2004; Peng et al., 2003). An overexpression study in Haloferax initially suggested involvement in replicon partitioning (Mojica et al., 1995). However, recent comparative analyses suggest a role of the repeats in a host-defence mechanism against extrachromosomal elements (viruses and plasmids) (Bolotin et al., 2005; Mojica et al., 2005). No CRISPRs were found in the pSOG plasmids. Hence, the molecular basis for partitioning of low-copy-number archaeal CPs, including pSOG2, remains to be identified.

Plasmid replication
The pSOG plasmids contain an operon of nine short genes (ORF113 to ORF96) presumably involved in plasmid replication. A similar operon is present in each Sulfolobus CP and located in conserved region C of their genome (Greve et al., 2004). Five of the genes of the pSOG operon, including the first and the last two, occur in the same order in all Sulfolobus CPs (Fig. 4). For two of the putative proteins, searches in databases provide indirect evidence for their role in DNA replication.

Sequence similarity indicates that ORF62 belongs to the CopG family, a copy number control protein used by numerous bacterial plasmids (del Solar et al., 2002). In these bacterial plasmids, the copG gene is located upstream of a gene encoding a replication initiator protein and the two genes are expressed from a common promoter. The CopG protein binds to this promoter and represses the expression of both proteins, thus controlling the replication of the plasmid (del Solar et al., 2002). A similar organization exists in the Sulfolobus cryptic plasmid pRN1, where the copG gene precedes the gene for a RepA homologue (Keeling et al., 1998). It was shown that pRN1 CopG binds to a double-stranded DNA inverted repeat located within the cop-rep promoter and thus could downregulate the expression of the RepA protein (Lipps et al., 2001b). Such a set of inverted repeats was also identified in the promoter region of the ‘replication’ operon of Sulfolobus CPs (Greve et al., 2004). In pSOG plasmids as well, a set of 8 bp inverted repeats separated by only 1 bp is found immediately downstream of the TATA box, resembling the pRN1 CopG binding site and also similar to the palindromic binding site of CopG of the bacterial plasmid pMV158 (Gomis-Ruth et al., 1998) (see Fig. 4 of Greve et al., 2004).
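
An arrangement of this kind (two 8 bp arms separated by a 1 bp spacer, the second arm being the reverse complement of the first) can be searched for with a short script. The Python sketch below is an illustration only and is not the procedure used in this study; the function names are hypothetical, only perfect repeats are reported, and the input is assumed to contain A, C, G and T only.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_inverted_repeats(seq, arm=8, spacer=1):
    """Return every perfect inverted repeat in seq.

    Each hit is a tuple (start, left_arm, spacer_seq, right_arm), where
    start is the 0-based index of the first base of the left arm and
    right_arm equals the reverse complement of left_arm.
    """
    seq = seq.upper()
    span = 2 * arm + spacer  # total length of arm + spacer + arm
    hits = []
    for i in range(len(seq) - span + 1):
        left = seq[i:i + arm]
        gap = seq[i + arm:i + arm + spacer]
        right = seq[i + arm + spacer:i + span]
        if right == reverse_complement(left):
            hits.append((i, left, gap, right))
    return hits
```

Applied to a promoter region, a hit lying just downstream of the TATA box would be a candidate CopG-type binding site; imperfect repeats would require a mismatch-tolerant comparison, which this sketch does not attempt.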

A function as replication initiator protein (RepA) was proposed for one of the most conserved ORFs of the ‘replication’ operon (Greve et al., 2004). Indeed ORF99 of pING1 shows weak but significant similarity to a putative chromosomal replication initiator of Haemophilus ducreyi (236 aa) and to RepA (346 aa) of plasmid pRUM from Enterococcus faecium. These similarities can be detected only when using the iterative PSI-BLAST tool. However, although similar in size and location in the rep operon, pSOG ORF106 is not homologous to the otherwise highly conserved RepA of Sulfolobus CPs. In fact, ORF106 of pSOG is homologous only to ORF107 of pHVE14, which is also present in the rep operon. Nevertheless, extensive search for a motif or conserved domain failed. Therefore no putative function could be attributed to this ORF. Since pSOG plasmids do not contain a homologue of the putative RepA-encoding gene, either it is not necessary for replication or another pSOG ORF serves that function, perhaps ORF106.

A highly conserved ORF in the ‘replication’ operon, ORF84, exhibits a leucine-zipper motif from position 22 to 57. This motif facilitates protein dimerization and is common to a class of DNA-binding proteins mostly found in eukaryotic transcription factors such as GCN4. A few examples of this class are known in prokaryotes, including the RepA protein of bacterial plasmids and two archaeal transcriptional regulators, GvpE of Halobacterium salinarium (Kruger et al., 1998) and PlrA of Sulfolobus (discussed below).

Plasmid replication origin
There are several reasons to assume that the origin of replication (oriV) may be located in the region extending from ORF1-76 (and partly overlapping this ORF) to ORF87, spanning about 700 bp. First, in the Z-curve of a cumulative GC-skew analysis of pSOG plasmids (not shown) both the X and Y components show a sharp peak centred within ORF1-153 and ORF2-150 in pSOG1 and pSOG2 respectively, indicating a possible replication origin (Chen et al., 2005; Zhang & Zhang, 2004). Second, that region contains a block of predicted genes, most of which are conserved in a similar gene context in other Sulfolobus conjugative and mobilizable plasmids. Third, this region is immediately preceded by a conspicuous AT-rich region (partly overlapping this conserved region) (Fig. 2), which may facilitate opening of the DNA strands. This region is also relatively rich in short repeated motifs that could serve as binding sites for replication factors, even though these repeats are not regularly spaced like the so-called iterons which serve as binding sites for RepA in bacterial plasmid origins (del Solar et al., 1998). A putative origin of replication has been proposed for Sulfolobus CPs that contains a specific direct repeat 5'-TCTATACCCCC-3' with 34–35 nt spacing in the context of a highly conserved 170 nt region (Greve et al., 2004). This direct repeat with appropriate spacing is found in both pSOG1 and pSOG2 (Fig. 6), but the remainder of the sequence is not well conserved and there are substantial differences between pSOG1 and pSOG2 in both the intervening and flanking sequences. The pSOG1 sequence resembles the pKEF9 putative origin whereas the pSOG2 sequence is more similar to the pNOB8 putative origin. This may be critical for the simultaneous occurrence of both plasmids (or of their precursors) in the original SOG2/4 strain (Fig. 1). This sequence partly overlaps with ORF1-76.
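
For readers unfamiliar with the Z-curve representation mentioned above, its three cumulative components can be computed directly from a sequence. The following Python sketch uses the standard definitions (x: purine versus pyrimidine, y: amino versus keto, z: weak versus strong hydrogen bonding); it is a minimal illustration under these assumptions and not the software used for the analysis reported here. The names STEP, z_curve and peak_position are hypothetical.

```python
# Contribution of each base to the (x, y, z) Z-curve components:
#   x: purine (A, G) versus pyrimidine (C, T)
#   y: amino (A, C) versus keto (G, T)
#   z: weak (A, T) versus strong (G, C) hydrogen bonding
STEP = {
    "A": (1, 1, 1),
    "G": (1, -1, -1),
    "C": (-1, 1, -1),
    "T": (-1, -1, 1),
}


def z_curve(seq):
    """Return three lists (xs, ys, zs) of cumulative Z-curve components.

    Element n-1 of each list describes the first n bases of seq.
    Characters other than A, C, G and T leave all components unchanged.
    """
    x = y = z = 0
    xs, ys, zs = [], [], []
    for base in seq.upper():
        dx, dy, dz = STEP.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


def peak_position(values):
    """Return the 1-based position of the first maximum in a non-empty list."""
    return max(range(len(values)), key=values.__getitem__) + 1
```

For a circular plasmid the curves depend on the arbitrary start coordinate of the deposited sequence, so an extremum located in this way is only a candidate position to be weighed against the other lines of evidence.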


Fig. 6. Comparison of the putative oriV locus in pSOG1 and pSOG2 and other Sulfolobus CPs. Yellow background indicates blocks of nucleotides conserved in all Sulfolobus CPs oriV loci; grey background indicates those which are conserved only in a group of CPs. Black arrows indicate highly conserved 11 bp direct repeats. Red nucleotides indicate imperfect 13–18 direct repeats found in most CPs. The portion of the plasmid genome shown corresponds to the following positions: pSOG1, 13123–13307; pSOG2, 13193–13382; pARN3, 12858–13035; pING1, 13408–13585; pKEF9, 13565–13748; pNOB8, 19595–19751; pHVE14, 14997–15175.

 
IS elements and transposases
Unlike the other Sulfolobus CPs, except the pARN family, pSOG plasmids do not encode any protein with homology to transposases or ORFs known to be associated with insertion sequences (Fig. 4).

Putative transcriptional regulators
Bacterial CPs have evolved systems of regulation that minimize the metabolic load on the host exerted by the maintenance of a conjugative transfer apparatus while optimizing the adaptive advantages of self-transmission. Such systems also seem to operate in Sulfolobus CPs such as pSOG2. Upon conjugation, pSOG2 actively replicates to a high copy number, but subsequently replication appears to be strongly down-regulated to reduce the copy number and to maintain the plasmid stably in its new host. In bacterial CPs, regulatory circuits involving specialized transcriptional regulators have been described (Zatyka & Thomas, 2002). There are as many as six ORFs that potentially play similar roles in the pSOG plasmids. The first one, ORF62 or CopG, was discussed above. ORF132 and ORF1-159 belong to a superfamily of proteins containing a winged-helix–turn–helix (wHTH) DNA-binding domain. Genomic studies have shown that this class of HTH proteins is predominant in Archaea and that its diversity is comparable to that of bacteria (for a recent review see Aravind et al., 2005). Although the wHTH domain in Archaea combines with a variety of other domains including components of the replication or translation systems or in metabolic enzymes, most of the archaeal wHTH-containing proteins are predicted to be gene/operon-specific transcriptional regulators (Aravind & Koonin, 1999). This seems to be the case for ORF132 and ORF1-159, which are related to the MarR-like family (Pfam 01047). Homologues of ORF132 are present in the Sulfolobus CP pHVE14, and several very closely related ORFs were identified in the chromosome of S. acidocaldarius and S. tokodaii (Table 1). Much weaker similarity to ORF132 was found with genes residing on pNOB8 and the S. solfataricus genome; no apparent homologues were found in other CPs.
Interestingly, ORF1-159 has no significant similarities with other putative regulators identified in other Sulfolobus replicons nor with any proteins in the public databases, and its wHTH domain aligns only poorly with that of ORF132. The clear difference between the two DNA-binding proteins suggests that they have distinct functions in pSOG-related regulation. Since ORF1-159 and the closely associated ORF1-349 (ParB) were found only in pSOG1, both appear to be dispensable for Sulfolobus plasmids. Both ORF132 and ORF1-159 constitute a single-gene transcription unit; it is therefore difficult to infer which genes or operon they may control.

ORF78 encodes a member of the novel family of Sulfolobus plasmid regulatory proteins (Pfam 05584) also known as PlrA (Table 1). So far representatives of this family have been found only in plasmids from the crenarchaeal genera Sulfolobus and Acidianus (Kletzin et al., 1999; Peng et al., 2000). This family is related to the DeoR family of bacterial transcriptional activators (Pfam 00455). ORF78 is almost identical (98 % amino acid identity) to the PlrA homologues in pARN3 and pKEF9 (Greve et al., 2004), but less so (~50 % identical) to other PlrA proteins. One member of this new family, ORF80 of the small cryptic plasmid pRN1 from ‘S. islandicus’, has been characterized (Lipps et al., 2001a). It has been shown experimentally that this basic protein binds in a highly specific manner to double-stranded DNA sequences upstream of ORF80. These sequences are conserved in the region upstream of other family members including ORF78 of pSOG1. ORF80 binds DNA as a dimer. Sequence analysis suggested that this dimerization is mediated by a leucine-zipper motif, the location of which is inverted with respect to the basic domain of the protein as compared to all other known leucine-zipper proteins. ORF80 has thus been proposed to be the first representative of a novel class of leucine-zipper proteins (Lipps et al., 2001a). Since the binding site of ORF80 partly overlaps with the putative archaeal TATA box, it was suggested that ORF80 represses its own transcription in an autoregulatory manner. It was also suggested that ORF80 could form a complex with the replication initiation machinery (Lipps et al., 2001a). Moreover, it has been proposed that the region upstream of ORF80 contains the double-stranded origin of replication in pRN1, and that ORF80 could be involved in the regulation of plasmid copy number (Kletzin et al., 1999; Peng et al., 2000). Experimental evidence supporting these hypotheses is still lacking.
All other plasmids of Sulfolobus, with the exception of pORA1 and pSOG2, contain PlrA homologues (Greve et al., 2004). This indicates an important but not essential role for PlrA in Sulfolobus plasmid function.

Interestingly, ORF93a and ORF175a, which form a putative operon with ORF211 (in the order 93a–211–175a), are also wHTH-containing proteins. ORF175a belongs to the TrmB family (Pfam 01978), of which two members have recently been characterized: TrmB is a sugar-specific transcriptional regulator of the operon encoding the trehalose/maltose ABC transporter in the hyperthermophilic euryarchaea Thermococcus litoralis and Pyrococcus furiosus (Lee et al., 2003). ORF93a belongs to a small family of predicted transcriptional regulator proteins of euryarchaeotes. ORF175a is not conserved in other described Sulfolobus CPs but is present in the sequence deposited in GenBank (NC_005969) for plasmid pTC of ‘Sulfolobus tengchongensis’ and in the integrated plasmid-related element SA3 found in the chromosome of S. acidocaldarius (Chen et al., 2005). ORF93a homologues are found only in pARN3 and pHVE14. However, in the other CPs (pKEF9, pING1 and pNOB8) an ORF of about the same size (encoding 92–95 aa) is found upstream of the ORF211 homologues (conserved in all the CPs), forming a putative operon. All of these ORFs are predicted to encode archaeal transcriptional regulators of the wHTH clan related to the MarR, LysR or ArsR families. Thus, all these small regulators seem to be interchangeable as long as they ensure the same function in the same genomic context in different plasmids.

The integrase of pSOG plasmids and its integration site
Like other CPs of Sulfolobus, both pSOG plasmids contain a homologue (ORF421) of a new family of integrase proteins, the pNOB8-type integrase, originally identified by comparing plasmid-like sequences in the S. tokodaii genome (Kawarabayasi et al., 2001) with the pNOB8 CP (She et al., 2002). These integrases are assumed to facilitate reversible integration of the CPs into the host chromosome by site-specific recombination between a plasmid attachment site attP and the corresponding chromosomal attB site. The pNOB8-type integrase belongs to the superfamily of tyrosine recombinases, which play several crucial roles in prokaryotes and eukaryotes (for a review see Van Duyne, 2002). One striking feature of this family is the lack of global homology among its more than 150 members. Nevertheless, a conserved signature is found in the C-terminal part of all the proteins. All members of the family harbour two short regions of similarity, box I and box II, sharing four nearly invariant amino acid residues, R...HxxR...Y, directly involved in catalysis of the DNA strand cleavage and exchange (Esposito & Scocca, 1997; Nünes-Duby et al., 1998). Sequence alignments revealed that both the motif found for the integrase of the Sulfolobus virus SSV1 (the SSV-type, R...Kxxx...Y) and that found for the pNOB8-type integrase (R...Yxxx...Y) differ from the consensus; together with the consensus they may represent three major classes of integrase (She et al., 2004).

The overall sequence of pSOG integrase is similar to that of other Sulfolobus CPs and to the related integrase genes found in the Sulfolobus and Aeropyrum chromosomes (an alignment of the Sulfolobus CPs integrases with representative members of the tyrosine recombinases is available as supplementary data with the online version of this paper). Among these, however, pSOG integrase is more closely related to the integrases of pKEF9, pARN3 and pHVE14, with which it shares up to 63 % identity. These integrase sequences clearly form a homogeneous distinct group. As expected from the general features of tyrosine recombinases, the differences from the other aligned sequences are more pronounced in their N-terminal halves. Interestingly, search for conserved protein domains revealed that pSOG Int and the five other closely related integrases harbour a typical HTH-XRE domain (smart: SM00530, pfam: PF01381) in their N-termini (amino acid positions 17–69 in pSOG Int). This protein domain is found in a large family of DNA-binding proteins that include a bacterial plasmid copy control protein, bacterial methylases, and various bacteriophage transcription control proteins like the Cro and cI repressors of bacteriophage λ. In Archaea, this motif is also well represented, with 106 entries in the Smart database. Most of them are small proteins (less than 200 aa) with no other conserved domain and are predicted to be transcriptional regulators. Several of the archaeal HTH-XRE-containing proteins possess additional enzymic or protein–protein interaction domains. In a few cases, a HTH-XRE domain is fused to a metabolic enzyme (e.g. purine phosphoribosyltransferase of Pyrococcus abyssi, PAB2035). These proteins might combine the catalytic function with that of transcription regulation of the biosynthetic genes in response to the respective metabolite in the environment. Another type of association is found in archaeal inteins (e.g. those of the replication factor C from Methanococcus jannaschii), where the HTH-XRE domain is sandwiched between the N-terminal part of the intein module and the inserted homing endonuclease domain (Gogarten et al., 2002). This association with the endonuclease domain suggests that the HTH might play a role in the recognition of target sequences by the endonuclease in the process of homing.

By analogy with the previous examples, we envisage two general roles for the HTH domain in the pSOG integrase. First, the HTH domain could contribute to the integrase DNA-binding at the attP site. In the prototype integrase, the λ integrase, the 356 aa sequence can be split into two domains by limited proteolysis. The N-terminal domain includes residues 1–64 and is responsible for binding the so-called arm-type sites of attP [adjacent direct repeat sites that flank the core region where crossing-over occurs (Groth & Calos, 2004)] while the C-terminal domain binds the lower-affinity core-type sites and contains the catalytic site (Groth & Calos, 2004). That the HTH domain of pSOG integrase could serve in binding of arm-type sites seems unlikely since such a domain is absent in the closely related pNOB8 integrase, which has recently been shown to be active in S. solfataricus (She et al., 2004), indicating that this domain is not essential for the activity of the protein. Moreover, none of the identified prokaryotic or eukaryotic integrases possesses a HTH domain in its N-terminus. We therefore infer that the HTH-XRE motif is somehow involved in transcriptional regulation of the integration/excision of pSOG plasmid.

The putative integration site attP used by the pSOG plasmids corresponds to a 43 bp invariant sequence which is identical to the 3' end of two glutamyl-tRNA genes in the S. solfataricus P2 genome. In virus SSV1, the only well-studied archaeal integration system, a conserved 44 bp sequence, identical to the 3' half of an arginyl-tRNA gene, was found in the genome of the host Sulfolobus shibatae flanking the provirus as direct repeats (Muskhelishvili et al., 1993). Recent studies showed that the SSV1 integrase cleaves both DNA strands at the att sites and that the cleavage positions are localized on each side of the anticodon loop of the tRNA where SSV1 integration takes place (Serre et al., 2002). This situation occurs quite frequently in the prokaryotic integrases, where some subfamilies recognize the flanking symmetry of the anticodon stem–loop structure and use exclusively this tRNA sublocation as integration site (Williams, 2002). Alignment of the Sulfolobus CP attP sequences with those of the corresponding tRNA genes in S. solfataricus (Fig. 7) strongly suggests that all the Sulfolobus CP integrases also use the anticodon stem–loop sublocation as integration site and therefore belong to the same class of integrases as SSV1 Int. Interestingly, several Sulfolobus CP integrases, including pSOG and pNOB8, apparently share the same putative integration sites in Sulfolobus, the two tRNAGlu genes. However, pSOG integrase may use preferentially the tRNAGlu with a CTC anticodon (Ssot26) that shows a perfect match with the pSOG attP, while the second (Ssot32), with TTC as anticodon, has one mismatch in the anticodon. This situation is the exact opposite of that in pNOB8 integrase (She et al., 2004). Surprisingly, no attP site was detected in pING1 plasmids, which therefore can no longer integrate into the host chromosome.
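
The comparison underlying this preference (an attP candidate matched against the 3' end of each candidate tRNA gene, with mismatches counted) can be expressed compactly. The Python sketch below is purely illustrative: the function names are hypothetical, the sequences must be supplied by the user, and no real attP or tRNA sequence is reproduced here.

```python
def three_prime_mismatches(att_p, trna_gene):
    """Compare att_p with the 3' end of trna_gene.

    Returns the 0-based positions within att_p at which the two
    sequences differ; an empty list means a perfect match.
    """
    att_p = att_p.upper()
    trna_gene = trna_gene.upper()
    if not att_p or len(att_p) > len(trna_gene):
        raise ValueError("attP candidate must be non-empty and no longer than the gene")
    tail = trna_gene[-len(att_p):]  # last len(att_p) bases of the gene
    return [i for i, (a, b) in enumerate(zip(att_p, tail)) if a != b]


def rank_targets(att_p, trna_genes):
    """Rank tRNA genes by number of mismatches to att_p, fewest first.

    trna_genes maps a gene name to its sequence; the result is a list
    of (name, mismatch_count) tuples.
    """
    scores = {
        name: len(three_prime_mismatches(att_p, gene))
        for name, gene in trna_genes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

With the pSOG attP and the two tRNAGlu genes as input, the gene carrying the CTC anticodon would be expected to rank first with zero mismatches and the TTC gene second with one, as described in the text.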


Fig. 7. Alignment of the Sulfolobus CP integration sites. Conserved sequence positions are indicated on a black (completely conserved) or grey (partly conserved) background. Sequences in the attP region of each CP are aligned with their cognate tRNA sequence from the S. solfataricus genome (no site found for pING1). The boxed sequence corresponds to the tRNA anticodon loop and the flanking vertical arrows indicate the putative integrase cleavage positions. The core site symmetrical elements of attP (P, P') and attB (B, B') are indicated following the conventions used by Campbell (1992) for bacterial integration sites. The portion of the plasmid genome shown corresponds to the following positions: pSOG1, 21418–21516; pSOG2, 22791–22889; pARN3, 23442–23540; pKEF9, 25448–25546; pNOB8, 31203–31109; pHVE14, 27790–27867. The portions of sequences shown are the reverse complement of that deposited in GenBank.

 
Comparison of the sites of recombination in the pSOG1 and pSOG2 genomes
A putative recombination motif was previously described to explain the variation of the pING family of Sulfolobus CPs (Stedman et al., 2000). This motif, 5'-TAAACTGGGGAGTTA-3', was also found in regions of sequence divergence in the Sulfolobus CPs pHVE14, pKEF9, pARN3, pARN4 and pNOB8 (Greve et al., 2004). Strikingly, this motif is found flanking all but one of the locations in the pSOG1 and pSOG2 genomes in which they differ (Fig. 2, Fig. 4). Two of these sequences flank the conserved block of ORFs predicted to have conjugative functions (Fig. 4, section A) and presumably allowed the recombination event producing pSOG1 and pSOG2. Other recombination motifs flank the region near the putative replication origin and the identical sequences in pSOG1 and pSOG2. The only major gene insertion in pSOG1 that does not contain this flanking motif is between pSOG1 ORFs 1-421 and 1-288. This is the region that encompasses the PlrA and ParB homologues in pSOG1, leading us to speculate that this region was deleted from pSOG2 rather than inserted in pSOG1. There are no other sequences surrounding this region that indicate recombination, other than that of the flanking integrase gene ORF421.
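
Locating exact copies of the motif 5'-TAAACTGGGGAGTTA-3' on either strand of a plasmid sequence is straightforward. The following Python sketch shows one way to do so; it is an illustration rather than the method used in this study, the function names are hypothetical, and only exact matches are reported, so degenerate copies of the motif would be missed.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_both_strands(seq, motif="TAAACTGGGGAGTTA"):
    """Return sorted (start, strand) tuples for exact motif matches.

    start is the 0-based index, on the given strand of seq, of the
    first base of the match. Strand "+" means the motif itself occurs
    in seq; strand "-" means its reverse complement occurs in seq.
    """
    seq = seq.upper()
    motif = motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting a palindromic motif twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)
```

Because the plasmids are circular, a copy of the motif spanning the coordinate origin of the deposited sequence would only be detected if the first few bases of the sequence were appended to its end before searching.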

Plasmid copy number and stability
The development of genetic tools for hyperthermophilic Archaea in general and Sulfolobus in particular has been hampered by the relative lack of stable plasmids with controlled copy number. The fuselloviruses of Sulfolobus replicate as double-stranded circular DNA and have been used as self-spreading plasmids (Jonuscheit et al., 2003; Schleper et al., 1995; Stedman et al., 1999). However, with larger insertions these plasmids are not very stable (Jonuscheit et al., 2003). Their copy number control is also not well understood. The large Sulfolobus CP pNOB8 was used as a vector for the first successful transformation of Sulfolobus (Elferink et al., 1996) but is also not stable.

This study was initiated in order to determine the genetic basis for stability and copy number control of the pSOG plasmids. Plasmid pSOG2 is very attractive as a potential vector as it has a stable low copy number in S. solfataricus P1 and can be transferred from cell to cell by conjugation. Unfortunately the molecular basis of this control is not clear. Perhaps the presence of the PlrA protein in pSOG1 causes a higher copy number or the origin of replication of pSOG1 is more active. Stability may be directly related to plasmid copy number, as strains containing CPs grow much more slowly than those that do not (Prangishvili et al., 1998; Schleper et al., 1995).

Plasmid compatibility and use as vectors
Plasmid compatibility has not been well studied in Sulfolobus. Compatible replicons are critical for sophisticated genetic experiments. The integration sites of different SSV viruses are different, indicating that they may be compatible with each other, but this has yet to be demonstrated (Wiedenheft et al., 2004). It is possible to co-infect certain ‘S. islandicus’ strains with both the SIRV virus and SSV1 (Prangishvili et al., 1999). Small plasmids can occur in the presence of either larger plasmids or virus genomes, as has been shown for SSV2 and the virus/plasmid hybrid pSSVx (Arnold et al., 1999). The non-conjugative plasmids pRN1 and pRN2 were both found in the same strain of ‘S. islandicus’, REN1H1 (Zillig et al., 1994), but contain no selectable markers. The conjugation proteins of pSOG1 and its putative replication origin are clearly related to the pKEF family of Sulfolobus CPs, whereas the conjugation proteins and the origin of pSOG2 are clearly related to counterparts of the pARN family of Sulfolobus CPs. These two plasmids or precursors thereof are found in the SOG2/4 strain (‘pSOG1’ and ‘pSOG2/4’) (Fig. 1). These two plasmids are the first example of two different families of Sulfolobus CPs to be found in the same Sulfolobus strain. Compatibility between different families of Sulfolobus CPs was previously demonstrated in laboratory conjugation experiments but had not previously been shown to exist in naturally occurring strains (Prangishvili et al., 1998). It remains to be determined if these CPs are also compatible with Sulfolobus viruses and small plasmids. In any case pSOG2 should be a useful addition to the Sulfolobus genetics tool-kit as a low-copy-number stable CP. pSOG1 may be useful as a ‘Trojan horse’ for the introduction of manipulated genes into host chromosomes by homologous recombination or transient expression.


    ACKNOWLEDGEMENTS
 
G. E. was supported by EU grant ERBIO4-CT96-0270; K. S. was supported by a Marie Curie Fellowship from the European Commission, a NSF-NATO Fellowship and Portland State University. Thanks to members of the Zillig lab for Southern blots on ‘S. islandicus’ strains.


    REFERENCES
 
Albers, S.-V., Jonuscheit, M., Dinkelaker, S., Urich, T., Kletzin, A., Tampe, R., Driessen, A. J. M. & Schleper, C. (2006). Production of recombinant and tagged proteins in the hyperthermophilic archaeon Sulfolobus solfataricus. Appl Environ Microbiol 72, 102–111.

Aravalli, R. N. & Garrett, R. A. (1997). Shuttle vectors for hyperthermophilic archaea. Extremophiles 1, 183–191.

Aravind, L. & Koonin, E. V. (1999). DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res 27, 4658–4670.

Aravind, L., Anantharaman, V., Balaji, S., Babu, M. M. & Iyer, L. M. (2005). The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol Rev 29, 231–262.

Arnold, H. P., She, Q., Phan, H., Stedman, K., Prangishvili, D., Holz, I., Kristjansson, J. K., Garrett, R. & Zillig, W. (1999). The genetic element pSSVx of the extremely thermophilic crenarchaeon Sulfolobus is a hybrid between a plasmid and a virus. Mol Microbiol 34, 217–226.

Bartolucci, S., Rossi, M. & Cannio, R. (2003). Characterization and functional complementation of a nonlethal deletion in the chromosome of a beta-glycosidase mutant of Sulfolobus solfataricus. J Bacteriol 185, 3948–3957.