
1 Laboratory of Microbiology, Wageningen University, Wageningen, The Netherlands
2 UMR CNRS 6539, IUEM, Université de Bretagne Occidentale, Technopôle Brest-Iroise, Place Copernic, 29280 Plouzané, France
3 Max-Planck-Institut für Biochemie, Martinsried, Germany
4 Biology Department, Portland State University, Portland, OR 97207, USA
Correspondence
Gaël Erauso
gael.erauso{at}univ-brest.fr
| ABSTRACT |
|---|
The GenBank/EMBL/DDBJ accession numbers for the sequences of the pSOG plasmids are DQ335583 (pSOG1) and DQ335584 (pSOG2).
An alignment of the Sulfolobus CP integrases with representative members of the tyrosine recombinases is available as supplementary data with the online version of this paper.
| INTRODUCTION |
|---|
The first archaeal conjugative plasmid (CP), pNOB8, was isolated from a Japanese Sulfolobus isolate (Schleper et al., 1995). Since then, several other CPs have been isolated from colony-cloned strains of Sulfolobus islandicus, and subsequently characterized (Greve et al., 2004; Stedman et al., 2000). Sequence comparison of all Sulfolobus CPs revealed three distinct sequence domains. One well-conserved cluster of genes covering approximately 12 kbp of the plasmids' genomes apparently contains the conjugative functions. A second is the putative origin of replication. Finally there is a region proposed to encode replication proteins (Greve et al., 2004). Only a few distant homologues to bacterial proteins involved in conjugative transfer (TraG, TrbE) and partitioning (ParA, ParB) have been found. In the case of the pNOB8 and pING plasmids, derived variant plasmids were detected upon propagation. These occur as a result of deletion and recombination (She et al., 1998; Stedman et al., 2000). Comparing the conserved sequences of CPs with some non-conjugative derivatives has provided insight into proteins and DNA sequence motifs putatively involved in conjugation in Archaea.
A single strain of S. islandicus SOG2/4 was found to harbour two very different but related plasmids. One of these had a stable low copy number in the well-characterized S. solfataricus P1 strain, so it was of interest for the development of genetic tools. The two plasmids were separated and characterized. Here we present the complete sequences of these two archaeal CPs (pSOG1 and pSOG2). Comparison of these novel CPs with the available counterparts has been used to further identify plasmid features that play key roles in conjugative transfer in Archaea.
| METHODS |
|---|
Cloning and sequencing.
Prior to cloning, plasmid DNA preparations were purified by ultracentrifugation in a caesium chloride gradient in the presence of ethidium bromide (1 mg ml⁻¹) (Sambrook et al., 1989). Digestion of both plasmids with EcoRI produced 11 bands for pSOG1 and 10 bands for pSOG2, ranging from 0.3 to 7.2 kbp. All of these fragments were cloned in the EcoRI site of pUC28 (Benes et al., 1993). Fragments obtained by digestion with BamHI, HindIII, PstI and XbaI in the size range 0.8–4.5 kb were also cloned in the corresponding sites of pUC28 to obtain an overlapping clone library for pSOG1 and pSOG2. Sequencing reactions were carried out on a LiCor DNA sequencer 4000L with a Thermo Sequenase fluorescent-labelled primer cycle sequencing kit (Amersham Biosciences) and infrared-labelled primers M13 forward and M13 reverse (MWG-Biotech). Gaps in the sequence were filled by using specific primers either directly for sequencing on library clones or to sequence PCR amplicons obtained with native pSOG DNA as template. The sequences were trimmed and assembled using the SeqMan II program (Lasergene package), with both strands completely sequenced and with a minimum threefold coverage.
Computer analysis.
DNA sequences were analysed using Vector NTI software (version 9, Informax). Direct and inverted sequence repeats were detected by using the GeneQuest program (Lasergene). Cumulative GC skews were calculated with the Genskew software (http://mips.gsf.de/services/analysis/genskew) and the Z-curve program (http://tubic.tju.edu.cn/zcurve/). Analyses were done with a window size of 30 nt. Identification of putative genes and operons was performed using the FGENESB pattern/Markov chain-based prediction program from Softberry (http://softberry.com/berry.phtml) and the pre-trained parameters of Sulfolobus solfataricus and S. tokodaii. Putative promoters (TATA box), Shine–Dalgarno sequences and terminators were identified with a window size of 12, 6 and 11 nt, respectively, in the 50 nt sequences upstream or downstream of the predicted gene start and stop codon. The nucleotide sequences were analysed using the Gibbs sampler algorithm (Thompson et al., 2003). Sequence logos were generated using WebLogo (Crooks et al., 2004). Homology searches were performed with a range of BLAST tools at the NCBI server (http://www.ncbi.nlm.nih.gov/blast). Identities were calculated with the program LALIGN at the Swiss EMBnet node server (http://www.ch.embnet.org/). Combined searches of a number of databases of protein families, domains and functional sites were performed using SMART (http://smart.embl-heidelberg.de/) and the CDD tool (NCBI). The program COILS (EMBnet) was used for finding α-helical coiled-coil domains. Transmembrane domains were predicted by the programs PSORT (http://psort.nibb.ac.jp/), TMPRED (EMBnet) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM). Identification of potential signal peptides was done with SIGNALP (http://www.cbs.dtu.dk/services/SignalP). For phylogenetic analyses, the deduced amino-acid sequences of the largest conserved ORFs in each Sulfolobus CP were aligned using MUSCLE (Edgar, 2004) and revised manually. Trees were generated from each individual alignment and for concatenated alignments of several ORFs, using the neighbour-joining method (Saitou & Nei, 1987) of the MEGA 3.1 program (Kumar et al., 2004). Distances were calculated using the Poisson correction (PC) distance model (Nei & Kumar, 2000). Tree significance was assessed by bootstrapping 1000 times.
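To make the cumulative GC-skew analysis mentioned above more concrete, the following Python sketch computes a windowed skew, (G - C)/(G + C), and its running sum. It is an illustration only: the function name is hypothetical, and the use of non-overlapping 30 nt windows is an assumption, since the exact windowing scheme of the Genskew and Z-curve tools is not described here.

```python
def cumulative_gc_skew(seq: str, window: int = 30) -> list[float]:
    """Return the running sum of (G - C) / (G + C) over consecutive windows.

    Assumption: windows are non-overlapping; a trailing partial window
    is ignored. Windows without G or C contribute a skew of 0.
    """
    seq = seq.upper()
    running_total = 0.0
    cumulative = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running_total += skew
        cumulative.append(running_total)
    return cumulative


if __name__ == "__main__":
    # Hypothetical sequence: a G-rich window, a C-rich window, then A only.
    print(cumulative_gc_skew("G" * 30 + "C" * 30 + "A" * 30))
```

Extrema of such a cumulative curve are the features typically inspected when looking for candidate replication origins or termini.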
| RESULTS AND DISCUSSION |
|---|
3 kbp with more than 95 % identity) with other Sulfolobus CPs of the pKEF group (nomenclature according to Greve et al., 2004
The G+C content of pSOG CPs is not evenly distributed, displaying a number of peaks and troughs (not shown). Five regions of more than 2000 bp have a higher G+C content (>36 mol%); these fragments roughly correspond to parts of the genome that encode the most-conserved ORFs in Sulfolobus CPs (Fig. 2, Table 1). In contrast, lower G+C regions are less extended and contain less-conserved ORFs. The latter fragments may encode functional units, such as partitioning and additional elements involved in conjugation (see below), also indicating that pSOG2/4 plasmids have a mosaic structure composed of elements of diverse origin. A clear minimum, corresponding to several successive short poly(A) stretches, is located just in front of ORF175, present in both plasmids (ORFs present in only one plasmid are listed as ORF1- for pSOG1 and ORF2- for pSOG2).
To better evaluate the relationship between the eight self-transmissible plasmids of Sulfolobus, we used the most representative genes as phylogenetic markers. Unrooted trees were obtained from alignment of the eight homologous genes for each of the four largest ORFs of section A, namely TrbE, 2-779, 2-610 and TraG, and for a concatenated alignment of these four genes. The topology obtained for each individual gene (not shown) was very similar to that obtained for the concatenated tree (Fig. 5a) and not dependent on the method used for tree construction (neighbour joining, minimum evolution, parsimony). The concatenated tree clearly shows two distinct groups: (i) the pKEF9 group (Greve et al., 2004), which includes pSOG2 and the more divergent pNOB8 branch, and (ii) the pARN group, to which pSOG1 may belong. However, this clustering does not agree with the integrase tree, where pARN, pHVE and pKEF appear closely related whilst the pING integrase is the most divergent. This difference in tree topology suggests a distinct origin for the respective genomic fragments coinciding with their functional modularity, section A being devoted to conjugation and section C to replication and recombination.
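For readers unfamiliar with the distance model named in the Methods, the Poisson-correction distance between two aligned protein sequences is d = -ln(1 - p), where p is the proportion of differing sites. The Python sketch below shows this calculation together with a simple concatenation of per-gene alignments. The function names and the toy alignments are hypothetical, and skipping any column that contains a gap is an assumption (pairwise deletion); this is not the MEGA implementation.

```python
import math


def concatenate_alignments(alignments: list[dict[str, str]]) -> dict[str, str]:
    """Join per-gene alignments (name -> aligned sequence) end to end.

    Assumes every alignment contains the same set of names.
    """
    names = alignments[0].keys()
    return {name: "".join(aln[name] for aln in alignments) for name in names}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-correction distance d = -ln(1 - p) for two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (assumption).
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("sequences differ at every site; distance undefined")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical two-gene alignment for two plasmids.
    genes = [{"p1": "AC", "p2": "AG"}, {"p1": "T-", "p2": "TT"}]
    merged = concatenate_alignments(genes)
    print(merged, poisson_distance(merged["p1"], merged["p2"]))
```

A matrix of such pairwise distances is the input that a neighbour-joining procedure would then turn into a tree.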
By analogy with the model recently proposed for bacteria (Grahn et al., 2000), we assume that in Sulfolobus conjugation, the TraG-like and the TrbE-like proteins form a heteromultimeric complex associated with the cytoplasmic membrane, and pump the DNA through a membrane-spanning channel constituted of at least the permease-like component, which may contribute actively to the T-DNA translocation. Several other pSOG ORFs, including the fourth largest, 1-609 and 2-615, contain putative membrane helices with predicted inner and outer segments (Table 1), and may also be involved in mating pair formation (mpf). In bacterial CPs, genes encoding conjugative transfer functions are generally clustered in one or two tra regions (Grohmann et al., 2003; Pansegrau et al., 1994), one encoding the relaxosome and the other the mpf system. The remarkable conservation of the almost contiguous cluster of ORFs (including the putative TraG, TrbE, ORFs 1-734/2-610 and 1-609/2-615) in all the Sulfolobus CPs suggests that these genes are involved in the same function, probably the mpf system.
Homology searches failed to detect putative components of the relaxosome in the pSOG sequences. Previous comparison of the genome sequence of functionally defective pING variants that had lost their capacity for self-transfer but were still transmissible led to the proposal that some of the conserved ORFs encode mobilization (mob) functions (Stedman et al., 2000). Among the four candidate mob genes only one, ORF211, is present in all self-transmissible Sulfolobus CPs (Fig. 4). It is therefore tempting to speculate that, as in bacteria (Francia et al., 2004), archaeal mobile DNA elements carry the information necessary for relaxosome formation. The small pING plasmids of Sulfolobus should contain a gene encoding a functional analogue of the bacterial relaxases as well as a cis-acting transfer origin (oriT). Previous attempts to locate oriT in Sulfolobus CPs showed that six conserved sequence motifs could potentially play that role (Stedman et al., 2000). We found that only one of these sequence elements (motif 2) is conserved in the pSOG plasmids and generally in the Sulfolobus CPs of the pKEF9 family (but not in the pARN plasmids), as well as in mobilizable pING derivatives, and that this motif is always located immediately upstream of ORF211. In a typical bacterial mobilization region, oriT is located upstream of the gene encoding the relaxase (Francia et al., 2004). The genomic context therefore suggests a hypothetical relaxase function for ORF211. However, the lack of any detectable sequence similarity with bacterial Mob proteins makes this assumption questionable.
Partitioning and plasmid maintenance
The N-terminal half (amino acids 1–160) of ORF1-349 in pSOG1 is similar to the highly conserved ParB/SpoB protein family involved in partitioning of bacterial plasmids and chromosomes (Table 1). The sequence includes two conserved motifs that are proposed to be involved in interaction with ParA/SopA and unknown host factors in bacteria (Hanai et al., 1996), and a helix–turn–helix DNA-binding domain typical of the ParB family. The three motifs aligned well with a set of divergent bacterial ParB proteins but poorly with ORF470 and ORF422 of the Sulfolobus plasmid pNOB8 (She et al., 1998), and not at all to ORFs in other Sulfolobus CPs (not shown). Due to (i) the relatively high similarity of ORF1-349 to bacterial rather than archaeal homologues, (ii) the significant difference in codon usage compared to the other pSOG ORFs (data not shown), and (iii) the genomic context (Fig. 2), it is suspected that ORF1-349 and surrounding DNA are the result of lateral gene transfer. Thus this parB homologue has most likely been acquired from a bacterial plasmid. The C-terminal region of ORF1-349 showed similarity to COG 5483, for which no function has yet been established. Surprisingly, a homologue of ParA/SopA appears not to be encoded by pSOG1. In bacteria, plasmid partitioning during cell division proceeds via the so-called segrosome, a protein complex consisting of at least ParA and ParB, which are encoded by the plasmid-borne par locus (Gerdes et al., 2000; reviewed by Hayes & Barilla, 2006). ParA is a membrane-associated ATPase that forms a complex with the DNA-binding ParB (Bignell & Thomas, 2001
); the binding site of ParB is the cis-acting partition site parS. The archaeal plasmid pNOB8, which is stably maintained at a low copy number in its natural host (
5 copies per chromosome), contains one ParA and two ParB homologues, as well as a putative parS element. These elements are missing in the unstable, high-copy-number variant pNOB8-33, formed after conjugative transfer in the foreign host S. solfataricus P1 (She et al., 1998
). Previously, it was reported that Par-like components are also absent in the other Sulfolobus CPs, which led to the conclusion that they may lack copy number control and partitioning (Greve et al., 2004
). Similarly, no homologues of parA or parS appear to be present on the genomes of the pSOG plasmids. Maintenance of pSOG1 probably does not require a partitioning system, most likely due to its high copy number (40–50 copies per chromosome) in S. solfataricus P1. In the case of pSOG2, however, the stable low copy number does suggest some partitioning system that is different from the Par system. One such alternative has been suggested to be the so-called clustered regularly interspaced short palindrome repeats (CRISPRs, previously referred to as SRSR). These repeats are present in many prokaryotic genomes (Jansen et al., 2002), and also in pNOB8 and pKEF9 (Greve et al., 2004; Peng et al., 2003). An overexpression study in Haloferax initially suggested involvement in replicon partitioning (Mojica et al., 1995). However, recent comparative analyses suggest a role of the repeats in a host-defence mechanism against extrachromosomal elements (viruses and plasmids) (Bolotin et al., 2005; Mojica et al., 2005). No CRISPRs were found in the pSOG plasmids. Hence, the molecular basis for partitioning of low-copy-number archaeal CPs, including pSOG2, remains to be identified.
Plasmid replication
The pSOG plasmids contain an operon of nine short genes (ORF113 to ORF96) presumably involved in plasmid replication. A similar operon is present in each Sulfolobus CP and located in conserved region C of their genome (Greve et al., 2004). Five of the genes of the pSOG operon, including the first and the last two, occur in the same order in all Sulfolobus CPs (Fig. 4). For two of the putative proteins, searches in databases provide indirect evidence for their role in DNA replication.
Sequence similarity indicates that ORF62 belongs to the CopG family, a copy number control protein used by numerous bacterial plasmids (del Solar et al., 2002). In these bacterial plasmids, the copG gene is located upstream of a gene encoding a replication initiator protein and the two genes are expressed from a common promoter. The CopG protein binds to this promoter and represses the expression of both proteins, thus controlling the replication of the plasmid (del Solar et al., 2002). A similar organization exists in the Sulfolobus cryptic plasmid pRN1, where the copG gene precedes the gene for a RepA homologue (Keeling et al., 1998). It was shown that pRN1 CopG binds to a double-stranded DNA inverted repeat located within the cop-rep promoter and thus could downregulate the expression of the RepA protein (Lipps et al., 2001b). Such a set of inverted repeats was also identified in the promoter region of the replication operon of Sulfolobus CPs (Greve et al., 2004). In pSOG plasmids as well, a set of 8 bp inverted repeats separated by only 1 bp is found immediately downstream of the TATA box, resembling the pRN1 CopG binding site and also similar to the palindromic binding site of CopG of the bacterial plasmid pMV158 (Gomis-Ruth et al., 1998) (see Fig. 4 of Greve et al., 2004).
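The kind of element described above, two 8 bp arms that are reverse complements of each other and separated by a 1 bp spacer, can be located with a simple scan. The Python sketch below is illustrative only: the function names and the test sequence are hypothetical, exact matching of the two arms is an assumption, and it does not reproduce the GeneQuest program used in this study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 8, spacer: int = 1) -> list[tuple[int, str, str]]:
    """Find perfect inverted repeats: left arm, spacer, reverse-complement arm.

    Returns (0-based start of left arm, left arm, right arm) for each hit.
    Only exact matches are reported (assumption; real sites may be imperfect).
    """
    seq = seq.upper()
    span = 2 * arm + spacer
    hits = []
    for i in range(len(seq) - span + 1):
        left = seq[i:i + arm]
        right = seq[i + arm + spacer:i + span]
        if right == reverse_complement(left):
            hits.append((i, left, right))
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment containing one 8-1-8 inverted repeat.
    toy = "TT" + "AAAACCCC" + "A" + "GGGGTTTT" + "CC"
    print(find_inverted_repeats(toy))
```

Restricting the scan to the region just downstream of a predicted TATA box would mirror the genomic context discussed here.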
A function as replication initiator protein (RepA) was proposed for one of the most conserved ORFs of the replication operon (Greve et al., 2004). Indeed ORF99 of pING1 shows weak but significant similarity to a putative chromosomal replication initiator of Haemophilus ducreyi (236 aa) and to RepA (346 aa) of plasmid pRUM from Enterococcus faecium. These similarities can be detected only when using the iterative PSI-BLAST tool. However, although similar in size and location in the rep operon, pSOG ORF106 is not homologous to the otherwise highly conserved RepA of Sulfolobus CPs. In fact, ORF106 of pSOG is homologous only to ORF107 of pHVE14, which is also present in the rep operon. Nevertheless, extensive search for a motif or conserved domain failed. Therefore no putative function could be attributed to this ORF. Since pSOG plasmids do not contain a homologue of the putative RepA-encoding gene, either it is not necessary for replication or another pSOG ORF serves that function, perhaps ORF106.
A highly conserved ORF in the replication operon, ORF84, exhibits a leucine-zipper motif from position 22 to 57. This motif facilitates protein dimerization and is common to a class of DNA-binding proteins mostly found in eukaryotic transcription factors such as GCN4. A few examples of this class are known in prokaryotes, including the RepA protein of bacterial plasmids and two archaeal transcriptional regulators, GvpE of Halobacterium salinarium (Kruger et al., 1998) and PlrA of Sulfolobus (discussed below).
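A leucine-zipper motif spanning positions 22 to 57 is consistent with six leucines spaced seven residues apart (22, 29, 36, 43, 50, 57). The Python sketch below scans a protein for such heptad-spaced leucine runs. It is a simplified illustration, not the method used in this study: the function name, the minimum of four leucines and the toy sequence are assumptions, and a genuine prediction would also assess coiled-coil propensity.

```python
def heptad_leucine_runs(protein: str, min_leucines: int = 4) -> list[tuple[int, int, int]]:
    """Report runs of leucines spaced exactly seven residues apart.

    Returns (first position, last position, number of leucines) using
    1-based positions. Only maximal runs are reported.
    """
    protein = protein.upper()
    runs = []
    for start, residue in enumerate(protein):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun seven residues earlier.
        if start >= 7 and protein[start - 7] == "L":
            continue
        end = start
        while end + 7 < len(protein) and protein[end + 7] == "L":
            end += 7
        count = (end - start) // 7 + 1
        if count >= min_leucines:
            runs.append((start + 1, end + 1, count))
    return runs


if __name__ == "__main__":
    # Hypothetical protein with leucines at positions 22, 29, ..., 57.
    toy = "A" * 21 + ("L" + "A" * 6) * 5 + "L" + "A" * 10
    print(heptad_leucine_runs(toy))
```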
Plasmid replication origin
There are several reasons to assume that the origin of replication (oriV) may be located in the region extending from ORF1-76 (and partly overlapping this ORF) to ORF87 spanning about 700 bp. First, in the Z-curve of a cumulative GC-skew analysis of pSOG plasmids (not shown) both the X and Y components show a sharp peak centred within ORF1-153 and ORF2-150 in pSOG1 and pSOG2 respectively, indicating a possible replication origin (Chen et al., 2005; Zhang & Zhang, 2004). Second, that region contains a block of predicted genes, most of which are conserved in a similar gene context in other Sulfolobus conjugative and mobilizable plasmids. Third, this region is immediately preceded by a conspicuous AT-rich region (partly overlapping this conserved region) (Fig. 2), which may facilitate opening of the DNA strands. This region is also relatively rich in short repeated motifs that could serve as binding sites for replication factors, even though these repeats are not regularly spaced like the so-called iterons which serve as binding sites for RepA in bacterial plasmid origins (del Solar et al., 1998). A putative origin of replication has been proposed for Sulfolobus CPs that contains a specific direct repeat 5'-TCTATACCCCC-3' with 34–35 nt spacing in the context of a highly conserved 170 nt region (Greve et al., 2004). This direct repeat with appropriate spacing is found in both pSOG1 and pSOG2 (Fig. 6), but the remainder of the sequence is not well conserved and there are substantial differences between pSOG1 and pSOG2 in both the intervening and flanking sequences. The pSOG1 sequence resembles the pKEF9 putative origin whereas the pSOG2 sequence is more similar to the pNOB8 putative origin. This may be critical for the simultaneous occurrence of both plasmids (or of their precursors) in the original SOG2/4 strain (Fig. 1). This sequence partly overlaps with ORF1-76.
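A search for the direct repeat 5'-TCTATACCCCC-3' with the stated spacing can be sketched in a few lines of Python. The function name and the toy sequence below are hypothetical. Note one explicit assumption: "spacing" is interpreted here as the number of nucleotides between the end of the first copy and the start of the second, which the text does not define.

```python
def find_spaced_direct_repeats(
    seq: str,
    motif: str = "TCTATACCCCC",
    min_gap: int = 34,
    max_gap: int = 35,
) -> list[tuple[int, int, int]]:
    """Find pairs of exact motif copies separated by min_gap..max_gap nt.

    Returns (start of first copy, start of second copy, gap), 0-based.
    The gap is counted from the end of the first copy to the start of
    the second (assumption about what 'spacing' means).
    """
    seq = seq.upper()
    starts = [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]
    pairs = []
    for first in starts:
        for second in starts:
            gap = second - (first + len(motif))
            if min_gap <= gap <= max_gap:
                pairs.append((first, second, gap))
    return pairs


if __name__ == "__main__":
    # Hypothetical origin fragment with two copies 34 nt apart.
    motif = "TCTATACCCCC"
    toy = "GG" + motif + "A" * 34 + motif + "GG"
    print(find_spaced_direct_repeats(toy))
```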
Putative transcriptional regulators
Bacterial CPs have evolved systems of regulation that minimize the metabolic load on the host exerted by the maintenance of a conjugative transfer apparatus while optimizing the adaptive advantages of self-transmission. Such systems also seem to operate in Sulfolobus CPs, like pSOG2. Upon conjugation, pSOG2 actively replicates to a high copy number, but subsequently replication appears to be strongly down-regulated to reduce the copy number and to maintain the plasmid stably in its new host. In bacterial CPs, regulatory circuits involving specialized transcriptional regulators have been described (Zatyka & Thomas, 2002). There are as many as six ORFs that potentially play similar roles in the pSOG plasmids. The first one, ORF62 or CopG, was discussed above. ORF132 and ORF1-159 belong to a superfamily of proteins containing a winged-helix–turn–helix (wHTH) DNA-binding domain. Genomic studies have shown that this class of HTH proteins is predominant in Archaea and that its diversity is comparable to that of bacteria (for a recent review see Aravind et al., 2005). Although the wHTH domain in Archaea combines with a variety of other domains including components of the replication or translation systems or in metabolic enzymes, most of the archaeal wHTH-containing proteins are predicted to be gene/operon-specific transcriptional regulators (Aravind & Koonin, 1999). This seems to be the case for ORF132 and ORF1-159, which are related to the MarR-like family (Pfam 01047). Homologues of ORF132 are present in the Sulfolobus CP pHVE14, and several very closely related ORFs were identified in the chromosome of S. acidocaldarius and S. tokodaii (Table 1). Much weaker similarity to ORF132 was found with genes residing on pNOB8 and the S. solfataricus genome; no apparent homologues were found in other CPs. Interestingly, ORF1-159 has no significant similarities with other putative regulators identified in other Sulfolobus replicons nor with any proteins in the public databases, and its wHTH domain aligns only poorly with that of ORF132. The clear difference between the two DNA-binding proteins suggests that they have distinct functions in pSOG-related regulation. Since ORF1-159 and the closely associated ORF1-349 (ParB) were found only in pSOG1, both appear to be dispensable for Sulfolobus plasmids. Both ORF132 and ORF1-159 constitute a single-gene transcription unit; it is therefore difficult to infer which genes or operon they may control.
ORF78 encodes a member of the novel family of Sulfolobus plasmid regulatory proteins (pfam 05584) also known as PlrA (Table 1). So far representatives of this family have been found only in plasmids from the crenarchaeal genera Sulfolobus and Acidianus (Kletzin et al., 1999; Peng et al., 2000). This family is related to the DeoR family of bacterial transcriptional activators (Pfam 00455). It is almost identical (98 % amino acid identity) to the PlrA homologues in pARN3 and pKEF9 (Greve et al., 2004
), but less so (
50 % identical) to other PlrA proteins. One member of this new family, ORF80 of the small cryptic plasmid pRN1 from S. islandicus, has been characterized (Lipps et al., 2001a
). It has been shown experimentally that this basic protein binds in a highly specific manner to double-stranded DNA sequences upstream of ORF80. These sequences are conserved in the region upstream of other family members including ORF78 of pSOG1. ORF80 binds DNA as a dimer. Sequence analysis suggested that this dimerization is mediated by a leucine-zipper motif, the location of which is inverted with respect to the basic domain of the protein as compared to all other known leucine-zipper proteins. ORF80 has thus been proposed to be the first representative of a novel class of leucine-zipper proteins (Lipps et al., 2001a). Since the binding site of ORF80 partly overlaps with the putative archaeal TATA box, it was suggested that ORF80 represses its own transcription in an autoregulatory manner. It was suggested that ORF80 could form a complex with the replication initiation machinery (Lipps et al., 2001a). Moreover, it has been proposed that the region upstream of ORF80 contains the double-stranded origin of replication in pRN1, and that ORF80 could be involved in the regulation of plasmid copy number (Kletzin et al., 1999; Peng et al., 2000). Experimental evidence supporting these hypotheses is still lacking. All other plasmids of Sulfolobus, with the exception of pORA1 and pSOG2, contain PlrA homologues (Greve et al., 2004). This indicates an important but not essential role for PlrA for Sulfolobus plasmid function.
Interestingly, ORF93a and ORF175a, which form a putative operon with ORF211 (in the order 93a–211–175a), are also wHTH-containing proteins. ORF175a belongs to the TrmB family (Pfam 01978), of which two members have recently been characterized: TrmB is a sugar-specific transcriptional regulator of the operon encoding the trehalose/maltose ABC transporter in the hyperthermophilic euryarchaea Thermococcus litoralis and Pyrococcus furiosus (Lee et al., 2003). ORF93a belongs to a small family of predicted transcriptional regulator proteins of euryarchaeotes. ORF175a is not conserved in other described Sulfolobus CPs but is present in the sequence deposited in GenBank (NC_005969) for plasmid pTC of Sulfolobus tengchongensis and in the integrated plasmid-related element SA3 found in the chromosome of S. acidocaldarius (Chen et al., 2005). ORF93a homologues are found only in pARN3 and pHVE14. However, in the other CPs (pKEF9, pING1 and pNOB8) an ORF of about the same size (encoding 92–95 aa) is found upstream of the ORF211 homologues (conserved in all the CPs), forming a putative operon. All of these ORFs are predicted to encode archaeal transcriptional regulators of the wHTH clan related to the MarR, LysR or ArsR families. Thus, all these small regulators seem to be interchangeable as long as they ensure the same function in the same genomic context in different plasmids.
The integrase of pSOG plasmids and its integration site
Like other CPs of Sulfolobus, both pSOG plasmids contain a homologue (ORF421) of a new family of integrase proteins, the pNOB8-type integrase, originally identified by comparing plasmid-like sequences in the S. tokodaii genome (Kawarabayasi et al., 2001) with the pNOB8 CP (She et al., 2002). These integrases are assumed to facilitate reversible integration of the CPs into the host chromosome by site-specific recombination between a plasmid attachment site attP and the corresponding chromosomal attB site. The pNOB8-type integrase belongs to the superfamily of tyrosine recombinases, which play several crucial roles in prokaryotes and eukaryotes (for a review see Van Duyne, 2002). One striking feature of this family is the lack of global homology among its more than 150 members. Nevertheless, a conserved signature is found in the C-terminal part of all the proteins. All members of the family harbour two short regions of similarity, box I and box II, sharing four nearly invariant amino acid residues, R...HxxR...Y, directly involved in catalysis of the DNA strand cleavage and exchange (Esposito & Scocca, 1997; Nunes-Düby et al., 1998). Sequence alignments revealed that the motifs found for the integrase of the Sulfolobus virus SSV1 (the SSV-type, R...Kxxx...Y) and for the pNOB8-type integrase (R...Yxxx...Y) both differ from the consensus; together with the consensus type, they may represent three major classes of integrase (She et al., 2004).
The overall sequence of pSOG integrase is similar to that of other Sulfolobus CPs and to the related integrase genes found in the Sulfolobus and Aeropyrum chromosomes (an alignment of the Sulfolobus CP integrases with representative members of the tyrosine recombinases is available as supplementary data with the online version of this paper). Among these, however, pSOG integrase is more closely related to the integrases of pKEF9, pARN3 and pHVE14, with which it shares up to 63 % identity. These integrase sequences clearly form a homogeneous distinct group. As expected from the general features of tyrosine recombinases, the differences from the other aligned sequences are more pronounced in their N-terminal halves. Interestingly, a search for conserved protein domains revealed that pSOG Int and the five other closely related integrases harbour a typical HTH-XRE domain (smart: SM00530, pfam: PF01381) in their N-termini (amino acid positions 17–69 in pSOG Int). This protein domain is found in a large family of DNA-binding proteins that include a bacterial plasmid copy control protein, bacterial methylases, and various bacteriophage transcription control proteins like the Cro and cI repressors of bacteriophage λ. In Archaea, this motif is also well represented, with 106 entries in the Smart database. Most of them are small proteins (less than 200 aa) with no other conserved domain and are predicted to be transcriptional regulators. Several of the archaeal HTH-XRE-containing proteins possess additional enzymic or protein–protein interaction domains. In a few cases, a HTH-XRE domain is fused to a metabolic enzyme (e.g. purine phosphoribosyltransferase of Pyrococcus abyssi, PAB2035). These proteins might combine the catalytic function with that of transcription regulation of the biosynthetic genes in response to the respective metabolite in the environment. Another type of association is found in archaeal inteins (e.g. those of the replication factor C from Methanococcus jannaschii), where the HTH-XRE domain is sandwiched between the N-terminal part of the intein module and the inserted homing endonuclease domain (Gogarten et al., 2002). This association with the endonuclease domain suggests that the HTH might play a role in the recognition of target sequences by the endonuclease in the process of homing.
By analogy with the previous examples, we envisage two general roles for the HTH domain in the pSOG integrase. First, the HTH domain could contribute to the integrase DNA-binding at the attP site. In the prototype integrase, the λ integrase, the 356 aa sequence can be split into two domains by limited proteolysis. The N-terminal domain includes residues 1–64 and is responsible for binding the so-called arm-type sites of attP [adjacent direct repeat sites that flank the core region where crossing-over occurs (Groth & Calos, 2004)] while the C-terminal domain binds the lower-affinity core-type sites and contains the catalytic site (Groth & Calos, 2004). That the HTH domain of pSOG integrase could serve in binding of arm-type sites seems unlikely since such a domain is absent in the closely related pNOB8 integrase, which has recently been proved active in S. solfataricus (She et al., 2004), indicating that this domain is not essential for the activity of the protein. Moreover, none of the identified prokaryotic or eukaryotic integrases possesses a HTH domain in its N-terminus. We therefore infer that the HTH-XRE motif is somehow involved in transcriptional regulation of the integration/excision of pSOG plasmid.
The putative integration site attP used by the pSOG plasmids corresponds to a 43 bp invariant sequence which is identical to the 3' end of two glutamyl-tRNA genes in the S. solfataricus P2 genome. In virus SSV1, the only well-studied archaeal integration system, a conserved 44 bp sequence, identical to the 3' half of an arginyl-tRNA gene, was found in the genome of the host Sulfolobus shibatae flanking the provirus as direct repeats (Muskhelishvili et al., 1993). Recent studies showed that the SSV1 integrase cleaves both DNA strands at the att sites and that the cleavage positions are localized on each side of the anti-codon loop of the tRNA where SSV1 integration takes place (Serre et al., 2002). This situation occurs quite frequently in the prokaryotic integrases, where some subfamilies recognize the flanking symmetry of the anti-codon stem–loop structure and use exclusively this tRNA sublocation as integration site (Williams, 2002). Alignment of the Sulfolobus CP attP sequences with those of the corresponding tRNA genes in S. solfataricus (Fig. 7) strongly suggests that all the Sulfolobus CP integrases also use the anti-codon stem–loop sublocation as integration site and therefore belong to the same class of integrases as SSV1 Int. Interestingly, several Sulfolobus CP integrases, including pSOG and pNOB8, apparently share the same putative integration sites in Sulfolobus, the two tRNAGlu genes. However, pSOG integrase may preferentially use the tRNAGlu with a CTC anti-codon (Ssot26), which shows a perfect match with the pSOG attP, while the second (Ssot32), with TTC as anti-codon, has one mismatch in the anti-codon. This situation is the exact opposite of that in pNOB8 integrase (She et al., 2004). Surprisingly, no attP site was detected in pING1 plasmids, which therefore can no longer integrate into the host chromosome.
Plasmid copy number and stability
The development of genetic tools for hyperthermophilic Archaea in general and Sulfolobus in particular has been hampered by the relative lack of stable plasmids with controlled copy number. The fuselloviruses of Sulfolobus replicate as double-stranded circular DNA and have been used as self-spreading plasmids (Jonuscheit et al., 2003
; Schleper et al., 1995
; Stedman et al., 1999
). However, with larger insertions these plasmids are not very stable (Jonuscheit et al., 2003
). Their copy number control is also not well understood. The large Sulfolobus CP pNOB8 was used as a vector for the first successful transformation of Sulfolobus (Elferink et al., 1996
) but is also not stable.
This study was initiated to determine the genetic basis for stability and copy number control of the pSOG plasmids. Plasmid pSOG2 is very attractive as a potential vector because it is maintained stably at low copy number in S. solfataricus P1 and can be transferred from cell to cell by conjugation. Unfortunately, the molecular basis of this control is not clear. Perhaps the presence of the PlrA protein in pSOG1 causes a higher copy number, or the origin of replication of pSOG1 is more active. Stability may be directly related to plasmid copy number, as strains containing CPs grow much more slowly than those that do not (Prangishvili et al., 1998
; Schleper et al., 1995
).
Plasmid compatibility and use as vectors
Plasmid compatibility has not been well studied in Sulfolobus, although compatible replicons are critical for sophisticated genetic experiments. Different SSV viruses use different integration sites, indicating that they may be compatible with each other, but this has yet to be demonstrated (Wiedenheft et al., 2004
). It is possible to co-infect certain S. islandicus strains with both the SIRV virus and SSV1 (Prangishvili et al., 1999
). Small plasmids can occur in the presence of either larger plasmids or virus genomes, as has been shown for SSV2 and the virus/plasmid hybrid pSSVx (Arnold et al., 1999
). The non-conjugative plasmids pRN1 and pRN2 were both found in the same strain of S. islandicus, REN1H1 (Zillig et al., 1994
), but contain no selectable markers. The conjugation proteins of pSOG1 and its putative replication origin are clearly related to the pKEF family of Sulfolobus CPs, whereas the conjugation proteins and the origin of pSOG2 are clearly related to counterparts of the pARN family of Sulfolobus CPs. These two plasmids or precursors thereof are found in the SOG2/4 strain (pSOG1 and pSOG2/4) (Fig. 1
). These two plasmids are the first example of CPs from two different families of Sulfolobus CPs found in the same Sulfolobus strain. Compatibility between different families of Sulfolobus CPs was previously demonstrated in laboratory conjugation experiments but had not been shown to exist in naturally occurring strains (Prangishvili et al., 1998
). It remains to be determined if these CPs are also compatible with Sulfolobus viruses and small plasmids. In any case pSOG2 should be a useful addition to the Sulfolobus genetics tool-kit as a low-copy-number stable CP. pSOG1 may be useful as a Trojan horse for the introduction of manipulated genes into host chromosomes by homologous recombination or transient expression.
| ACKNOWLEDGEMENTS |
|---|
| REFERENCES |
|---|
Aravalli, R. N. & Garrett, R. A. (1997). Shuttle vectors for hyperthermophilic archaea. Extremophiles 1, 183–191.
Aravind, L. & Koonin, E. V. (1999). DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res 27, 4658–4670.
Aravind, L., Anantharaman, V., Balaji, S., Babu, M. M. & Iyer, L. M. (2005). The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol Rev 29, 231–262.
Arnold, H. P., She, Q., Phan, H., Stedman, K., Prangishvili, D., Holz, I., Kristjansson, J. K., Garrett, R. & Zillig, W. (1999). The genetic element pSSVx of the extremely thermophilic crenarchaeon Sulfolobus is a hybrid between a plasmid and a virus. Mol Microbiol 34, 217–226.
Bartolucci, S., Rossi, M. & Cannio, R. (2003). Characterization and functional complementation of a nonlethal deletion in the chromosome of a β-glycosidase mutant of Sulfolobus solfataricus. J Bacteriol 185, 3948–3957.