1 Department of Biology, Texas A&M University, College Station, TX 77843, USA
2 Department of Biology, Indiana University, Bloomington, IN 47405, USA
Correspondence
Jin Xiong
jxiong{at}mail.bio.tamu.edu
| ABSTRACT |
|---|
The GenBank/EMBL/DDBJ accession numbers for the sequences determined in this study are EU052681 and EU068732.
| INTRODUCTION |
|---|
Haem d1 is related to the denitrification process that converts nitrate to gaseous nitrogen as part of the anaerobic respiration of bacteria and archaea. Among the denitrifying enzymes is nitrite reductase, which converts nitrite to nitric oxide as an intermediate step of denitrification. Two types of nitrite reductase are known, copper-containing nitrite reductase and cytochrome cd1. The latter contains a unique tetrapyrrole, haem d1, as one of the prosthetic groups (e.g. Timkovich, 2003). Little is known regarding the biosynthesis of haem d1 except that it may utilize uroporphyrinogen III, precorrins, sirohydrochlorin and porphyrindione d1 as intermediates (Yap-Bondoc et al., 1990; Youn et al., 2004; von Mering et al., 2005) (Fig. 1a).
Insertional mutagenesis analysis of Pseudomonas stutzeri has identified a nir locus that is necessary for haem d1 biosynthesis (de Boer et al., 1994; Palmedo et al., 1995; Glockner & Zumft, 1996; Kawasaki et al., 1997). In this locus, there are two nir operons, one containing nirJ, nirE and nirN genes, and the other nirC, nirF, nirD, nirL, nirG and nirH genes. NirN is homologous to NirS, the known structural polypeptide of cytochrome cd1, and shares regional homology with NirC and NirF (Timkovich, 2003). The nirD, nirL, nirG and nirH genes are all strongly similar to each other at the sequence level and are proposed to have arisen from gene duplication events, although they do not have clearly defined functions. NirJ is a member of the radical S-adenosylmethionine (SAM) protein family, and does not have a clearly defined function in haem d1 biosynthesis. NirE is a SAM-dependent uroporphyrinogen methylase homologous to sirohaem synthase CysGA. This is the only enzyme that has clearly been suggested to catalyse the sequential methylation at C2 and C7 of the porphyrin to produce precorrin-1 and precorrin-2 during haem d1 biosynthesis (Kawasaki et al., 1997). Except for NirE, the precise roles of other Nir proteins in the haem d1 biosynthetic pathway remain undefined.
The photosynthetic bacteria heliobacteria (Heliobacteriaceae) were first discovered in the early 1980s (Gest & Favinger, 1983; Gest, 1994) and have now been expanded to about a dozen strains encompassing five different genera (NCBI Taxonomy Database; http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Taxonomy). Heliobacteria, which belong phylogenetically to the low-GC Gram-positive group, are a unique group of photosynthetic bacteria in that they contain a bacteriochlorophyll g pigment and a simplified type I photosynthetic reaction centre (Madigan & Ormerod, 1995). They are also known to be able to fix nitrogen and perform ammonia assimilation (Kimble & Madigan, 1992). No other aspects of nitrogen metabolism are known for heliobacteria, nor is there any indication that they may catalyse haem d1 biosynthesis.
We report here the discovery of a gene cluster related to haem d1 biosynthesis in two heliobacterial species, Heliobacillus mobilis and Heliophilum fasciatum. Subsequent bioinformatics analysis of the genes encoding the haem d1 biosynthesis enzymes yielded a significant insight into the biochemical pathway for the synthesis of this unique tetrapyrrole molecule.
| METHODS |
|---|
General DNA manipulation.
The analysis with Hb. mobilis began by first identifying an evolutionarily conserved segment of the hemB gene sequence among a group of Gram-positive bacteria through database searching using BLAST (Altschul et al., 1997) and sequence alignment using CLUSTAL (Thompson et al., 1994) and T-Coffee (Notredame et al., 2000). The conserved region allowed the design of a pair of degenerate PCR primers with the aid of Oligo software (National Biosciences). The forward primer (TCKGCYTTYTAYGGACCHTTYC) and reverse primer (AYTCACCGSASACATTATA) used in degenerate PCR were synthesized by Integrated DNA Technologies. The analysis with Hp. fasciatum, which began after the entire Hb. mobilis sequence was obtained, was facilitated by the availability of the Hb. mobilis sequence information. It began by obtaining partial sequences from hemB, hemA2, hemD, hemL and hep2 using degenerate PCR (for hemA2, forward primer TCMACRTGCAAYCGDACGGA and reverse primer CACCTGYCCRAGAATTTGBGT; for hemL, forward primer TGGGGYCCICTKATYYTRGG and reverse primer GGTYAGIGCKCCIGAACC; for hep2, forward primer GGAAAAMGWYTVMGICCGGC and reverse primer ARWARRRRGCKGTYTTICG; and for hemD, forward primer AARGGMGGVGAYCCCTTYGT and reverse primer TSCCBGGHATCACYTCRGC).
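The degenerate primers listed above use IUPAC ambiguity codes, so each primer is really a pool of related oligonucleotides. As a minimal sketch (not part of the original methods), the pool size can be counted as below. The function name `degeneracy` is hypothetical, and inosine (I) is treated as a single universal base that adds no pool complexity, which is an assumption of this example.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "I",  # assumption: inosine is one universal base, not a mixture
}

def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())

hemB_forward = "TCKGCYTTYTAYGGACCHTTYC"
hemB_reverse = "AYTCACCGSASACATTATA"
print(degeneracy(hemB_forward), degeneracy(hemB_reverse))
```

Under these assumptions the hemB forward primer is a pool of 96 sequences (one K, four Y and one H position) and the reverse primer a pool of 8.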
The PCR products were cloned into the pUC19 vector with the PCR-Script Cloning kit (Stratagene). Small-scale plasmid DNA preparations were made by using the Qiaprep Spin Miniprep kit (Qiagen). DNA sequencing of the clones was performed with the universal primers for the pUC19 plasmid (forward primer CGCCAGGGTTTTCCCAGTCACGAC and reverse primer TCACACAGGAAACAGCTATGAC). Nucleotide sequences were determined by the dideoxy chain-termination method (Sanger et al., 1977) using the BigDye Sequencing kit v3.1 (Applied Biosystems).
Once the partial hemB gene of Hb. mobilis was sequenced, the upstream and downstream flanking DNA was obtained by using the inverse PCR technique (Ochman et al., 1988) repeatedly. For Hp. fasciatum, the partial gene fragments resulting from degenerate PCR were first joined using regular PCR and subsequently sequenced. Further upstream and downstream sequences were obtained using a novel genome-walking technique developed by Guo & Xiong (2006). The novel technique was necessary in this case because inverse PCR required a substantial quantity of genomic DNA that was not available for Hp. fasciatum. The novel method had the advantage of consuming only minute amounts of starting DNA.
Sequence analysis.
Sequencing was performed on both strands of DNA for cross-verification. The final sequence contigs were assembled by matching and removing overlapping regions of individual fragments and joining the remainder of the fragments. ORFs of the final sequences were determined using multiple hidden Markov model (HMM)-based gene-prediction programs: GeneMark.hmm (Lukashin & Borodovsky, 1998), GeneMark frame-by-frame (Shmatkov et al., 1999), AMIgene (Bocs et al., 2003) and FrameD (Schiex et al., 2003). The predictions were made with the HMMs of each program trained for a closely related low-GC Gram-positive bacterium such as Bacillus subtilis. To confirm the gene prediction, the putative ORFs were checked for the presence of RBSs immediately upstream of the start codons. Only the predicted frames that were preceded by the canonical RBS were accepted.
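The RBS filter described above can be illustrated with a small sketch. The text does not give the exact motif or spacing that was accepted, so everything here is an assumption made for illustration: the core `AGGAGG`, the minimum match of four bases, the spacer window of 4 to 12 bases, and the function name `has_rbs` itself.

```python
def has_rbs(genome: str, start: int, min_spacer: int = 4, max_spacer: int = 12,
            core: str = "AGGAGG", min_match: int = 4) -> bool:
    """Return True if a Shine-Dalgarno-like stretch lies upstream of `start`.

    `start` is the 0-based index of the first base of the start codon.
    A hit is any substring of `core` at least `min_match` bases long whose
    3' end sits `min_spacer`..`max_spacer` bases before the start codon.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    motifs = {core[i:j] for i in range(len(core))
              for j in range(i + min_match, len(core) + 1)}
    for spacer in range(min_spacer, max_spacer + 1):
        end = start - spacer  # index just past the motif's 3' end
        for motif in motifs:
            begin = end - len(motif)
            if begin >= 0 and genome[begin:end] == motif:
                return True
    return False

# Toy example: AGGAGG ends 7 bases before the ATG at index 17.
print(has_rbs("TTTTAGGAGGTTTTTTTATGAAA", 17))
```

A predicted ORF would be kept only when such a check succeeds for its start codon; a real pipeline would also need to handle genes on the reverse strand.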
Once the genes and gene boundaries were determined, sets of genes that might be transcriptionally linked to form operons were predicted using the rule developed by Wang et al. (2004). The method, which has been shown to be 91 % accurate, required three pieces of information: gene orientation, intergenic distance and gene linkage conservation. To obtain the gene linkage information in other genomes, cross-genome comparison was performed with the aid of the STRING server (http://string.embl.de/), which compiled gene neighbourhood information of 179 completely sequenced genomes (von Mering et al., 2005). To determine whether a pair of adjacent genes belonged to a common operon, a scoring scheme was used with the operon assignment threshold set at 2.
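The operon rule combines three kinds of evidence and compares a score against a threshold of 2. The actual weights of the Wang et al. (2004) scheme are not given in this text, so the sketch below uses a purely hypothetical additive score (one point per supporting observation, with a 50 bp gap limit chosen arbitrarily) simply to show the shape of such a decision rule.

```python
from dataclasses import dataclass

@dataclass
class GenePair:
    same_strand: bool        # both genes transcribed in the same direction
    intergenic_bp: int       # gap between the genes (negative if overlapping)
    linkage_conserved: bool  # neighbours in other genomes (e.g. from STRING)

def operon_score(pair: GenePair, max_gap: int = 50) -> int:
    """Hypothetical additive score; the weights are NOT from Wang et al."""
    if not pair.same_strand:
        return 0             # opposite strands cannot share a transcript
    score = 1                # same orientation
    if pair.intergenic_bp <= max_gap:
        score += 1           # short intergenic distance
    if pair.linkage_conserved:
        score += 1           # linkage conserved across genomes
    return score

def same_operon(pair: GenePair, threshold: int = 2) -> bool:
    """Assign the pair to one operon when the score reaches the threshold."""
    return operon_score(pair) >= threshold

print(same_operon(GenePair(True, 10, True)))
```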
Gene functional annotation was based on a combined approach: (1) direct BLAST searches against the non-redundant GenBank database for translated proteins (Altschul et al., 1997); (2) searches against the protein classification database Protonet (Sasson et al., 2003), which annotates protein functions using a hierarchical tree-based approach with the aid of gene ontology, and provides information on the biological process, molecular function and cellular localization of each protein (e.g. Azuaje et al., 2006; Thomas et al., 2007); and (3) structural and functional feature prediction using Phylofacts (Krishnamurthy et al., 2006) and Phobius (Kall et al., 2004).
The statistical significance of pairwise sequence similarities was evaluated using the probability of random shuffles (PRSS) test (Pearson & Lipman, 1988), which calculates the probability of similarities of randomly shuffled and unshuffled sequences using a distance matrix Monte Carlo procedure. The test was performed with 1000 global shuffles with the gap-opening penalty set at 12 and the gap-extending penalty at 2 by using the BLOSUM50 scoring matrix.
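The idea behind a shuffle-based significance test can be sketched in a few lines. This is a simplified stand-in rather than the PRSS program: it scores alignments with a plain identity scheme and a linear gap penalty instead of BLOSUM50 with separate gap-opening and gap-extending penalties, and it reports an empirical p-value from the shuffles. The function names and default parameters are assumptions of this example.

```python
import random

def global_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def shuffle_pvalue(a: str, b: str, n_shuffles: int = 1000, seed: int = 0) -> float:
    """Fraction of shuffled versions of `b` scoring at least as well as `b`."""
    rng = random.Random(seed)
    observed = global_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # shuffle the whole sequence, keeping composition
        if global_score(a, "".join(letters)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)  # add-one correction avoids p = 0
```

Shuffling preserves amino acid composition and length, so a low p-value indicates that the observed similarity depends on residue order rather than on composition alone.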
Phylogenetic analysis.
Phylogenetic analysis was carried out for several of the proteins encoded in the gene cluster. The sequence homologues of the heliobacterial proteins were retrieved from searching sequence databases using BLAST (Altschul et al., 1997) with an E value cutoff of 10⁻²⁰. After removing redundant and nearly redundant homologues, the sequences were aligned using a profile-based approach (Simossis et al., 2005), followed by manual refinement. The final sequence alignments were used to construct phylogenetic trees based on maximum-likelihood with the aid of the PHYML program (Guindon & Gascuel, 2003) under the Whelan and Goldman (WAG) substitution model (Whelan & Goldman, 2001) with four substitution rate categories. Nonparametric bootstrapping was subsequently performed with 100 replicates of the datasets.
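Nonparametric bootstrapping resamples alignment columns with replacement, and a tree is then rebuilt from each pseudo-alignment. The following minimal sketch shows only the resampling step; tree building is left to the phylogeny program. The function names and the toy alignment are hypothetical.

```python
import random

def bootstrap_alignment(alignment: list[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in picks) for row in alignment]

def bootstrap_replicates(alignment: list[str], n_replicates: int = 100,
                         seed: int = 0) -> list[list[str]]:
    """Generate `n_replicates` pseudo-alignments of the original width."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]

toy = ["MKT-A", "MRTLA", "MKSLA"]  # rows are aligned sequences of equal length
replicates = bootstrap_replicates(toy)
print(len(replicates), replicates[0])
```

The bootstrap support of a branch is then the percentage of replicate trees that contain it.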
Molecular modelling.
3D protein structures of a number of proteins encoded by the gene cluster were constructed based on the principle of homology modelling. The homology models could be built because of the extremely conserved nature of protein structures given the small number of protein folds available (<800) against the huge number of protein sequences in nature (>1×10⁶ individual sequences). The practical boundaries of sequence identity for proteins adopting the same structures were defined by Rost (1999) as a function of sequence length in pairwise alignment, e.g. a sequence identity of 20 % for an alignment of 150 aa can fall within the safe zone for protein homology modelling. Below the safe zone is the twilight zone, where identical structure can still be found (sometimes as low as 12–15 %), although statistical tests such as the PRSS test have to be used to differentiate random matching from truly related sequences. The sequence alignments used in this study were well within the range suitable for homology model building.
The structural templates for the modelling were chosen from the Protein Data Bank (PDB) using an HMM-based approach, HHPred (Soding et al., 2005). The resulting statistically most significant alignment was used as a basis for manual refinement. The refined alignment was used as input for the modelling software Modeller (Sali et al., 1995), which was able to model both conserved regions and loops to generate a raw model that was subsequently refined with built-in energy-minimization features. The quality of the protein model was evaluated using Verify3D (Eisenberg et al., 1997). The protein cofactors were subsequently modelled by transferring the coordinates directly from the template to the protein model. For NirL, quaternary modelling involving a complex structure of a NirL dimer and dsDNA was also performed. The NirL dimer was modelled by superimposing two monomers upon an Lrp dimer unit from the octameric structure generated by Ren et al. (2007). The dimer was then manually docked onto a 22 bp DNA structure (PDB code 1CGP) in the Quanta (Accelrys) molecular-modelling environment. The final modelling result was rendered using Pymol (DeLano Scientific LLC).
NirL expression and purification.
To test the hypothesis that NirL is a transcription factor, the protein was purified to homogeneity and its DNA-binding activity characterized. Briefly, the nirL gene was amplified using PCR with the primers CGCATATGTGGACTGAAAAAGACAAAGAG and CGGAATTCCGCTTCTTTTTCCATGAAG. The PCR product was subsequently cloned into an expression construct pTYB1 (New England Biolabs) between the NdeI and EcoRI restriction sites. The cloned gene was resequenced to verify the absence of mutations and was subsequently used for heterologous expression in Escherichia coli ER2566.
NirL was expressed as a C-terminal fusion protein to an intein (an inducible protein self-splicing element) and a chitin-binding domain. The strain with the NirL expression construct (pTYB1::nirL) was grown at 37 °C in Terrific Broth (TB) medium containing ampicillin (100 µg ml⁻¹) to OD600 0.6, when IPTG was added to a final concentration of 0.5 mM. The cells were incubated at room temperature (22 °C) overnight before being harvested.
The cells were harvested by centrifugation at 5000 g for 10 min at 4 °C. The cell pellet was resuspended in 5 ml cell lysis buffer (20 mM Tris/HCl, pH 8.0, 500 mM NaCl, 1 mM EDTA, 0.1 % Triton X-100, 20 µM PMSF) and lysed by agitation in fine glass beads (0.1 mm diameter) using a mini-BeadBeater (Glen Mills). The lysed cell suspension was centrifuged at 1500 g for 10 min to remove the cell debris and glass beads. The cell lysate was centrifuged at 20 000 g for 30 min at 4 °C. The supernatant was subsequently loaded onto a chitin column equilibrated with column buffer (20 mM Tris/HCl, pH 8.0, 500 mM NaCl, 1 mM EDTA). The column was washed with 5 vols column buffer followed by 1 vol. cleavage buffer (20 mM Tris/HCl, pH 8.0, 500 mM NaCl, 1 mM EDTA, 20 µM PMSF, 50 mM DTT). The on-column protein cleavage was performed by incubating the fusion protein in the cleavage buffer at room temperature in an anaerobic chamber (Coy Laboratory) overnight (18 h). The column was then eluted with 2 vols of elution buffer (50 mM Tris, pH 8.0, 150 mM KCl, 5 mM DTT, 5 %, v/v, glycerol). The eluate was collected and concentrated using a Centricon-10 concentrator (Millipore). Protein samples were taken and analysed by SDS-PAGE on 12.5 % gels that were subsequently stained with Coomassie brilliant blue (R-250) dye.
DNA mobility shift assay.
The DNA fragment used for the mobility shift assay was a PCR-amplified 200 bp region immediately upstream of nirJ2 in Hp. fasciatum that contains the putative promoter for the nir operon. The PCR product was purified using the Qiaquick Gel Extraction kit (Qiagen). For the DNA-binding assay, 50 ng DNA was added to the binding buffer (10 mM Tris, pH 7.5, 50 mM KCl, 1 mM DTT, 2.5 %, v/v, glycerol, 5 mM MgCl2, 0.05 % Nonidet P-40) in a final volume of 20 µl, either with or without 10 µg purified NirL protein. The reaction was carried out at room temperature for 30 min. The reaction mixture was subjected to electrophoresis in 5 % polyacrylamide gels in native TBE buffer (45 mM Tris, 45 mM boric acid, 1 mM EDTA, pH 8.3) at 100 V for 1 h. Following electrophoresis, the gel was stained with 50 ml 0.001 % SYBR-Gold (Invitrogen) for 30 min, and visualized using the EpiChemi3 Imaging System (UVP).
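For orientation, the amount of probe in the binding reaction can be converted to a molar concentration. The sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded DNA, a common rule of thumb that is not stated in the methods; the function names are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; rule-of-thumb assumption

def dsdna_picomoles(mass_ng: float, length_bp: int) -> float:
    """Picomoles of a double-stranded DNA fragment of given mass and length."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_BP_MASS)
    return moles * 1e12

def nanomolar(pmol: float, volume_ul: float) -> float:
    """Concentration in nM for `pmol` dissolved in `volume_ul` microlitres."""
    return pmol / volume_ul * 1e3  # pmol per µl equals µM; ×1000 gives nM

probe_pmol = dsdna_picomoles(50, 200)   # 50 ng of the 200 bp fragment
print(round(probe_pmol, 3), round(nanomolar(probe_pmol, 20), 1))
```

With these assumptions, 50 ng of a 200 bp fragment is roughly 0.38 pmol, or about 19 nM in the 20 µl reaction.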
| RESULTS AND DISCUSSION |
|---|
The sequencing of the gene cluster was initiated by obtaining a number of conserved gene fragments through degenerate PCR. The flanking sequences of the segments were subsequently obtained by using two different genome-walking techniques: inverse PCR (Ochman et al., 1988) and a method newly developed by Guo & Xiong (2006). The final nucleotide sequence length for the Hb. mobilis gene cluster was 16 361 bp, and for Hp. fasciatum, 17 398 bp. The locations and boundaries of the ORFs were determined based on a combination of de novo gene prediction programs and the presence of RBS in the immediate vicinity of the predicted ORFs to minimize errors. We found 17 protein-encoding genes, including partial ones at both ends, in the Hb. mobilis sequence and 16 genes in the Hp. fasciatum sequence (Fig. 2).
As part of the sequence annotation, we performed operon prediction with the newly predicted genes using the Wang et al. (2004) method, which determines operons by the combined information of inter-gene distances and gene linkage conservation among genomes, and has been shown to be highly accurate (91 % accuracy). Two operons are predicted in the given sequences (Fig. 2), with ccs1, ccsA, cysGB, hemA2, hemC and cysGA–hemD constituting the first operon, and nirJ2, nirD, nirL and hemL forming the second operon. The operon structure for the two heliobacterial strains is well conserved. The first transcriptional unit appears to be mainly involved in the early stage of tetrapyrrole biosynthesis and haem transport, with the exception of cysGA and cysGB. The second operon may be more specific for haem d1 biosynthesis, with the exception of hemL. In between the two operons are nirJ1 and hemB, which appear to be monocistronic.
Of particular interest is the presence of cysGB and cysGA in the first operon along with most of the hem and ccs genes, and of hemL in the second operon along with the nir genes. The hemL gene (encoding glutamate semialdehyde aminotransferase) is involved in the early steps of uroporphyrinogen III biosynthesis, whereas cysGA and cysGB, as illustrated below, may be involved in the late steps of haem d1 biosynthesis. The mixed arrangement of these genes in two different operons appears to indicate that the two stages of the haem d1 biosynthesis pathway as well as the final assembly of cytochrome cd1 are tightly co-regulated at the functional level.
The linkage of the hem genes responsible for the biosynthesis of uroporphyrinogen III appears to be consistent among Gram-positive bacteria such as B. subtilis, Staphylococcus aureus and Paenibacillus macerans (Hansson et al., 1991; Kafala & Sasarman, 1997; Johansson & Hederstedt, 1999). The reported linkage patterns are in some ways similar to that in heliobacteria. It remains to be investigated whether the consistent clustering indicates possible physical interactions at the protein level or simply an evolutionary pressure for coexpression of the functionally related genes.
The discovery of the haem d1 biosynthesis genes was in fact a matter of serendipity as a result of genome walking. The analysis of the haem d1 biosynthesis genes turned out to be most interesting in filling the knowledge gaps for the enzymic involvement in the haem d1 biosynthesis pathway. The following sections concentrate on the proteins encoded by the cluster that are specifically related to haem d1 biosynthesis and its transport for cytochrome maturation.
CysGA
The database search analysis for the translated ORF downstream of hemC revealed a fusion gene of cysGA and hemD (Fig. 2) (BLAST E value 0). The CysGA domain of the fusion product is on the N terminus (amino acids 1–251). Its homologues in other species have been annotated as sirohaem synthase, which is a SAM-dependent uroporphyrinogen III methylase catalysing the first two steps of sirohaem synthesis, namely methylation at rings I and II of uroporphyrinogen III to produce precorrin-1 and precorrin-2. The HemD domain on the C terminus (amino acids 252–512) is a uroporphyrinogen III synthase known to catalyse the cyclization of the linear tetrapyrrole 1-hydroxymethylbilane to produce the macrocyclic uroporphyrinogen III. The fusion of CysGA and HemD appears to be rather common in Gram-positive bacteria, as observed in Bacillus, Paenibacillus and Clostridium species (Johansson & Hederstedt, 1999; Fujino et al., 1995). The genetic fusion apparently generates an efficient mechanism to produce precorrin-2 from 1-hydroxymethylbilane, with three consecutive steps of catalysis being carried out by the same polypeptide.
Sirohaem is a similar compound to haem d1. It has been suggested that the initial methylation steps leading to the synthesis of precorrin-2 should be shared between sirohaem biosynthesis and haem d1 biosynthesis (Zumft, 1997). In Pseudomonas, NirE has been shown by genetic analysis to be necessary to catalyse the conversion of uroporphyrinogen III to precorrin-2 (de Boer et al., 1994; Kawasaki et al., 1997) during haem d1 biosynthesis. NirE in fact shares 60 % sequence identity with CysGA from E. coli (Warren et al., 1994), which confirms that NirE can essentially be treated as CysGA and that the latter can be directly involved in these reactions. In addition, it has been shown that there is an absolute requirement for SAM in the initial steps of haem d1 biosynthesis (Yap-Bondoc et al., 1990).
To provide a structural basis for CysGA, we applied a comparative modelling approach. The CysGA template used for the modelling was obtained by searching PDB using an HMM-based approach to produce a high-quality alignment with a significantly related homologous sequence in the database (Soding et al., 2005). The search identified the CysGA domain of CysG from Salmonella enterica as the closest homologue (1PJS) (Stroupe et al., 2003). The full-length match between the CysGA domain of Hb. mobilis and that of S. enterica was 49 % in sequence identity (Fig. 4a). A homology model was subsequently built based on a refined alignment with a bound cofactor S-adenosyl homocysteine (SAH) (Fig. 4b), which is demethylated SAM. The model was evaluated using a statistical profile-based approach (Eisenberg et al., 1997) and was shown to be of high quality (results not shown). There are two structural domains in the modelled structure, domain I (Fig. 4b, left) and domain II (Fig. 4b, right), both consisting of a β-sheet surrounded by α-helices. The two domains are arranged in a V shape with the SAH/SAM cofactor bound to domain II near the centre. CysGA is thought to be able to transfer a methyl group from SAM to C2 or C7 of the macrocyclic ring via a stereochemical inversion of the reactive carbon on the porphyrin substrate (Stroupe et al., 2003). Since the closely related Salmonella methylase carries out the catalysis with a homodimeric quaternary structure, it is reasonable to postulate that heliobacterial CysGA achieves the same functionality through a similar architecture.
70 % accuracy).
The structural analysis of CysGB was similarly carried out using homology modelling with the CysGB domain of the same CysG protein from S. enterica serving as template (1PJS) (Stroupe et al., 2003). The full-length alignment between Hb. mobilis CysGB and the CysGB domain of the template protein was 31 % by identity (Fig. 5a). A homology model was subsequently built based on the refined alignment with the bound cofactor NAD (Fig. 4b). From the structural model, it is clear that the dual function of CysGB is realized by two distinct structural domains in the protein, the dehydrogenase domain (Fig. 5b, right) on the N terminus (residues 1–146), in which the cofactor NAD is bound, and the ferrochelatase domain (Fig. 5b, left) on the C terminus (residues 147–208) (the metal ions are not bound to the protein but presumably exist in the aqueous environment).
NirJ
The ORFs immediately upstream and downstream of the hemB frame were both identified as nirJ, encoding a haem d1 biosynthesis protein (Fig. 2). They were differentiated as nirJ1 and nirJ2 (BLAST E values 8×10⁻¹⁴⁷ for NirJ1 and 7×10⁻¹²⁵ for NirJ2). The two gene products were 32 % identical to one another at the amino acid level and were apparently the result of gene duplication. Our phylogenetic analysis of the NirJ family indicated that the duplication event was quite ancient, with the two versions of NirJ branching before the separation of bacteria and archaea (Fig. 6a).
-lysine and L-β-lysine, and coproporphyrinogen III oxidase (HemN), which converts coproporphyrinogen III to protoporphyrinogen IX. None of these catalytic reactions, except that of HemN, is obviously related to haem d1 biosynthesis. HemN is the only member of the radical SAM protein family involved in tetrapyrrole biosynthesis, and catalyses two successive oxidative decarboxylation reactions on the propionate sidechains of coproporphyrinogen III with the aid of two SAM cofactors and one Fe4S4 centre (Layer et al., 2003). In our HMM-based database search using NirJ2 of Hb. mobilis as query, HemN from E. coli indeed turned out to be one of the top hits in the search result. More detailed pairwise comparison between the two proteins showed a regional alignment covering 51 % of the total length with identity 12 %, similarity 57 % and P<0.05 from the PRSS test. This supports a remote homology between NirJ and HemN, which allowed us to propose that NirJ functions similarly to HemN. Since HemN catalyses decarboxylation of the propionate groups, this may be considered similar to the decarboxylation of the acetate groups that is required for haem d1 biosynthesis.
The homology model of NirJ2 was constructed based on the alignment with MoaA from Staph. aureus with a bound SAM and two Fe4S4 centres (Fig. 6c). The overall protein model resembles a triosephosphate isomerase (TIM) barrel with an eight-stranded β-sheet wrapped around by eight α-helices. One of the bound Fe4S4 centres is thought to be able to transfer an electron to the SAM molecule and induce its cleavage, producing methionine and a 5'-deoxyadenosyl radical. The highly oxidizing radical then abstracts a hydrogen from a carbon atom on the substrate to induce a glycyl radical that catalyses a subsequent bond cleavage reaction on the substrate (Hänzelmann & Schindelin, 2004). This mode of reaction is considered common among SAM radical enzymes with Fe4S4 centres, and may provide a mechanistic clue to the presumed bond breakage reaction of NirJ.
NirD and NirL
The heliobacterial nir operon contains two other nir genes, nirD and nirL (Fig. 2). These two gene products share 24 % identity with each other at the translated amino acid level. They can be considered to be the result of gene duplication from a common ancestor. As shown in Fig. 7(a), the gene duplication appears to be very ancient, and may have occurred before the separation of bacteria and archaea. In fact, in the Pseudomonas lineage, an additional gene-duplication event appears to have occurred with this pair of gene homologues, giving rise to the four similar genes nirD, nirL, nirG and nirH. Deletion of any of these genes abolishes the production of haem d1 in Pseudomonas (Palmedo et al., 1995; Kawasaki et al., 1997).
An HTH motif was identified at the N terminus (residues 3–49) of heliobacterial NirD and NirL, and was strongly similar to the one in most Lrp proteins, supporting their putative role as transcription regulators. No enzymic functions were identified through the bioinformatics analysis. Furthermore, a palindromic sequence TTT(N)AT(N5–7)AT(N)AAA was found in the upstream region (–47.5±8.5 bp from gene start sites) of both nirJ1 and nirJ2, and matched well with the known DNA-binding motif, which is an AT-rich inverted repeat, of many Lrp proteins (Koike et al., 2004). Thus, we suggest that NirD/L serve as transcription factors that regulate the expression of nirJ1 and the nir operon, including the hemL gene. Therefore, they can be considered to be indirectly involved in the biosynthesis of haem d1.
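The palindromic pattern TTT(N)AT(N5–7)AT(N)AAA translates directly into a regular expression, which is one simple way to scan an upstream region for candidate sites. This is an illustrative sketch only: it reads each (N) as exactly one arbitrary base, and the function name and test sequence are hypothetical.

```python
import re

# TTT(N)AT(N5-7)AT(N)AAA, reading each N as any one of A, C, G or T.
LRP_LIKE = re.compile(r"TTT[ACGT]AT[ACGT]{5,7}AT[ACGT]AAA")

def find_lrp_like_sites(upstream: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in LRP_LIKE.finditer(upstream.upper())]

# Synthetic sequence containing one site with a 5-base spacer.
print(find_lrp_like_sites("GGCCTTTAATCCGGCATGAAAGG"))
```

Because `finditer` reports non-overlapping matches, overlapping candidate sites would need a lookahead-based pattern instead.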
To verify that NirD/L are indeed DNA-binding proteins, we cloned and expressed the nirL gene from Hp. fasciatum and purified the NirL protein using an intein-mediated approach (Fig. 7b). Its DNA-binding characteristics were determined using a gel mobility shift assay with a DNA probe that included 200 bp upstream from the nirJ2 gene, encompassing the putative promoter for the nir operon. DNA band shifts were clearly observed with the addition of partially purified NirL (Fig. 7c). This result thus supports the above proposal that NirL, and likely NirD as well, plays a role in regulation of expression of the nir operon.
We further constructed a 3D model of NirL based on the strong full-length sequence similarity to a closely related Lrp transcription factor from Pyrococcus sp. (Koike et al., 2004; PDB code 1RI7). The pairwise alignment had an identity level of 23 %. Based on the knowledge that all known Lrp transcription factors form an octamer consisting of four dimer units, a dimer of NirL (Fig. 7d) was modelled along with its DNA ligand according to Koike et al. (2004), showing the N-terminal HTH motif of NirL interacting closely with the major groove of the DNA.
It should be pointed out that this proposal is novel and contradicts the current belief that the NirD/L proteins are directly involved in haem d1 synthesis (Zumft, 1997; Timkovich, 2003). Youn et al. (2004) overexpressed a Pseudomonas nirFDLGH operon and obtained an unusual tetrapyrrole termed compound 800 that had some features related to haem d1. It is not clear whether the result was due to the expression of the five gene products encoded in the operon or to upregulation/downregulation of other nir genes in Pseudomonas as an indirect result of overexpression of the transcription regulators.
Ccs proteins
Also of interest are the two genes at the beginning of the haem biosynthesis gene cluster. They encode two transmembrane proteins related to cytochrome c biosynthesis. Sequence database searching identified them as Ccs1 and CcsA, responsible for the transmembrane delivery of haem c during the biogenesis of cytochrome c holoproteins (Nakamoto et al., 2000) (BLAST E values 7×10⁻⁵⁵ for Ccs1 and 4×10⁻⁷⁵ for CcsA). This function could be significant, because cytochrome cd1 is known to carry out its catalysis in the periplasmic space (for Gram-positive bacteria, it is the space between the plasma membrane and the cell wall) (Suharti & de Vries, 2005). The transport of the newly synthesized haem d1 across the membrane is thus a necessary step for the final assembly and maturation of cytochrome cd1 (Zumft, 1997). The very existence of the ccs genes in an operon related to haem d1 biosynthesis gives important hints that they may be involved in the transport of haem d1 in addition to haem c for the generation of cytochrome cd1 in the mature form in the periplasm.
CcsA and Ccs1 of cyanobacteria and algal chloroplasts have been shown to function as a closely associated complex in delivering haem to an apocytochrome, with CcsA binding to haem through its tryptophan-rich domain, and Ccs1 interacting with the apocytochrome and anchoring it for haem insertion (Hamel et al., 2003). The tryptophan-rich domain for haem binding has indeed been identified in heliobacterial CcsA. In addition to transport, the CcsA–Ccs1 complex in cyanobacteria and chloroplasts is also able to perform haem ligation to covalently attach a haem to a c-type apocytochrome (Hamel et al., 2003). The latter function, if conserved in heliobacteria, should be confined to the incorporation of haem c into cytochrome cd1, since haem d1 is non-covalently bound to the cytochrome protein.
Working hypothesis on haem d1 biosynthesis
To summarize the above sequence and structural analyses, we propose a working hypothesis for the enzymes involved in the haem d1 biosynthesis pathway. The strong sequence similarity of heliobacterial CysGA to well-characterized SAM-dependent uroporphyrinogen III methyltransferases supports the idea that the CysGA domain of the CysGA–HemD fusion protein methylates uroporphyrinogen III at C2 and C7 in two consecutive steps to produce precorrin-2. CysGB, which contains a dehydrogenase domain, is proposed to catalyse the oxidation of the single bond between C15 and C16 to a double bond, leading to the formation of sirohydrochlorin. NirJ, which belongs to the same protein family as HemN, an enzyme that modifies tetrapyrrole sidechains through decarboxylation, is proposed to decarboxylate the acetate sidechains at C12 and C18 to produce methyl groups at rings III and IV. The final step of haem d1 synthesis, iron insertion into porphyrindione d1, is proposed to be carried out by the ferrochelatase domain of CysGB. The newly synthesized haem d1 may be transported across the membrane and subsequently inserted into an apocytochrome through the combined action of CcsA and Ccs1 during the biogenesis of cytochrome cd1 (Fig. 8a, b).
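For readers who wish to track these assignments computationally, the proposed steps can be written down as a small ordered data structure. The following Python fragment is only an illustrative sketch: the `Step` class, the `PROPOSED_STEPS` list and the `describe` function are hypothetical names introduced here, not part of any analysis performed in this study. It records only the steps named in the paragraph above, leaves the product blank where the text does not name one, and treats every enzyme assignment as a proposal rather than a demonstrated function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    enzyme: str    # proposed catalyst, as named in the text
    reaction: str  # transformation, as described in the text
    product: str   # product named in the text, or "" if none is named


# Order follows the working hypothesis; all assignments are proposals only.
PROPOSED_STEPS = [
    Step("CysGA domain of CysGA-HemD",
         "methylation of uroporphyrinogen III at C2 and C7",
         "precorrin-2"),
    Step("CysGB (dehydrogenase domain)",
         "oxidation of the C15-C16 single bond to a double bond",
         "sirohydrochlorin"),
    Step("NirJ",
         "decarboxylation of the acetate sidechains at C12 and C18",
         ""),
    Step("CysGB (ferrochelatase domain)",
         "iron insertion into porphyrindione d1",
         "haem d1"),
    Step("CcsA and Ccs1",
         "transport across the membrane and insertion into an apocytochrome",
         "cytochrome cd1"),
]


def describe(steps):
    """Return one numbered line per proposed step, in pathway order."""
    lines = []
    for number, step in enumerate(steps, start=1):
        product = step.product or "(product not named in the text)"
        lines.append(f"{number}. {step.enzyme}: {step.reaction} -> {product}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(PROPOSED_STEPS))
```

Keeping the list ordered makes it straightforward to add or reassign steps as experimental evidence becomes available.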
The formation of the acrylate group on ring IV is possibly catalysed by CysGB, since the dehydrogenation reaction is similar to that at the neighbouring C15 and C16, resulting in a double bond conjugated with the macrocyclic ring. Whether this minimalist view can be sustained remains to be seen until the full genome data become available. Nevertheless, in Gram-positive bacteria, and especially heliobacteria, a complete set of genes for a biosynthetic pathway tends to be arranged in one operon or superoperon, as is the case for the photosynthesis gene cluster (Xiong et al., 1998). This form of arrangement may ensure the tight gene regulation that is important for anaerobic metabolism. The working hypothesis for the haem d1 biosynthesis pathway offers many clues to be tested by experimental investigation.
| ACKNOWLEDGEMENTS |
|---|
Edited by: P. Cornelis
| REFERENCES |
|---|
Azuaje, F., Al-Shahrour, F. & Dopazo, J. (2006). Ontology-driven approaches to analyzing data in functional genomics. Methods Mol Biol 316, 67–86.
Beale, S. I. (1995). Biosynthesis and structures of porphyrins and hemes. In Anoxygenic Photosynthetic Bacteria, pp. 153–177. Edited by R. E. Blankenship, M. T. Madigan & C. E. Bauer. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Beale, S. I. (2000). Tetrapyrrole biosynthesis in bacteria. In Encyclopedia of Microbiology, 2nd edn, vol. 4, pp. 558–570. Edited by J. Lederberg. San Diego, CA: Academic Press.
Beer-Romero, P. & Gest, H. (1987). Heliobacillus mobilis, a peritrichously flagellated anoxyphototroph containing bacteriochlorophyll g. FEMS Microbiol Lett 41, 109–114.
Bocs, S., Cruveiller, S., Vallenet, D., Nuel, G. & Médigue, C. (2003). AMIGENE: annotation of microbial genes. Nucleic Acids Res 31, 3723–3726.
Brinkman, A. B., Ettema, T. J., de Vos, W. M. & van der Oost, J. (2003). The Lrp family of transcriptional regulators. Mol Microbiol 48, 287–294.
de Boer, A. P., Reijnders, W. N., Kuenen, J. G., Stouthamer, A. H. & van Spanning, R. J. (1994). Isolation, sequencing and mutational analysis of a gene cluster involved in nitrite reduction in Paracoccus denitrificans. Antonie Van Leeuwenhoek 66, 111–127.
Eisenberg, D., Luthy, R. & Bowie, J. U. (1997). VERIFY3D: assessment of protein models with three-dimensional profiles. Methods Enzymol 277, 396–404.
Enright, A. J., Iliopoulos, I., Kyrpides, N. C. & Ouzounis, C. A. (1999). Protein interaction maps for complete genomes based on gene fusion events. Nature 402, 86–90.
Frankenberg, N., Schobert, M., Moser, J., Raux, E., Graham, R., Warren, M. J. & Jahn, D. (2004). The biosynthesis of hemes, siroheme, vitamin B12 and linear tetrapyrroles in Pseudomonas. In Pseudomonas, pp. 111–146. Edited by J.-L. Ramos. New York: Kluwer Academic/Plenum Publishers.
Fujino, E., Fujino, T., Karita, S., Sakka, K. & Ohmiya, K. (1995). Cloning and sequencing of some genes responsible for porphyrin biosynthesis from the anaerobic bacterium Clostridium josui. J Bacteriol 177, 5169–5175.
Gest, H. (1994). Discovery of heliobacteria. Photosynth Res 41, 17–21.
Gest, H. & Favinger, J. L. (1983). Heliobacterium chlorum, an anoxygenic brownish-green photosynthetic bacterium containing a 'new' form of bacteriochlorophyll. Arch Microbiol 136, 11–16.
Glockner, A. B. & Zumft, W. G. (1996). Sequence analysis of an internal 9.72-kb segment from the 30-kb denitrification gene cluster of Pseudomonas stutzeri. Biochim Biophys Acta 1277, 6–12.
Guindon, S. & Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52, 696–704.
Guo, H. & Xiong, J. (2006). A specific and versatile genome walking technique. Gene 381, 18–23.
Hamel, P. P., Dreyfuss, B. W., Xie, Z., Gabilly, S. T. & Merchant, S. (2003). Essential histidine and tryptophan residues in CcsA, a system II polytopic cytochrome c biogenesis protein. J Biol Chem 278, 2593–2603.
Hansson, M., Rutberg, L., Schröder, I. & Hederstedt, L. (1991). The Bacillus subtilis hemAXCDBL gene cluster, which encodes enzymes of the biosynthetic pathway from glutamate to uroporphyrinogen III. J Bacteriol 173, 2590–2599.
Hänzelmann, P. & Schindelin, H. (2004). Crystal structure of the S-adenosylmethionine-dependent enzyme MoaA and its implications for molybdenum cofactor deficiency in humans. Proc Natl Acad Sci U S A 101, 12870–12875.
Johansson, P. & Hederstedt, L. (1999). Organization of genes for tetrapyrrole biosynthesis in Gram-positive bacteria. Microbiology 145, 529–538.
Kafala, B. & Sasarman, A. (1997). Isolation of the Staphylococcus aureus hemCDBL gene cluster coding for early steps in heme biosynthesis. Gene 199, 231–239.
Kall, L., Krogh, A. & Sonnhammer, E. L. (2004). A combined transmembrane topology and signal peptide prediction method. J Mol Biol 338, 1027–1036.
Kawasaki, S., Arai, H., Kodama, T. & Igarashi, Y. (1997). Gene cluster for dissimilatory nitrite reductase (nir) from Pseudomonas aeruginosa: sequencing and identification of a locus for heme d1 biosynthesis. J Bacteriol 179, 235–242.
Kimble, L. K. & Madigan, M. T. (1992). Nitrogen fixation and nitrogen metabolism in heliobacteria. Arch Microbiol 158, 155–161.
Koike, H., Ishijima, S. A., Clowney, L. & Suzuki, M. (2004). The archaeal feast/famine regulatory protein: potential roles of its assembly forms for regulating transcription. Proc Natl Acad Sci U S A 101, 2840–2845.
Krishnamurthy, N., Brown, D. P., Kirshner, D. & Sjolander, K. (2006). PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol 7, R83.
Layer, G., Moser, J., Heinz, D. W., Jahn, D. & Schubert, W. D. (2003). Crystal structure of coproporphyrinogen III oxidase reveals cofactor geometry of radical SAM enzymes. EMBO J 22, 6214–6224.
Lukashin, A. V. & Borodovsky, M. (1998). GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26, 1107–1115.
Madigan, M. T. & Ormerod, J. G. (1995). Taxonomy, physiology and ecology of heliobacteria. In Anoxygenic Photosynthetic Bacteria, pp. 17–30. Edited by R. E. Blankenship, M. T. Madigan & C. E. Bauer. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Marcotte, E. M., Pellegrini, M., Ng, H. L., Rice, D. W., Yeates, T. O. & Eisenberg, D. (1999). Detecting protein function and protein–protein interactions from genome sequences. Science 285, 751–753.
Nakamoto, S. S., Hamel, P. & Merchant, S. (2000). Assembly of chloroplast cytochromes b and c. Biochimie 82, 603–614.
Notredame, C., Higgins, D. G. & Heringa, J. (2000). T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302, 205–217.
Ochman, H., Gerber, A. S. & Hartl, D. L. (1988). Genetic applications of an inverse polymerase chain reaction. Genetics 120, 621–623.
Ormerod, J. G., Kimble, L. K., Nesbakken, T., Torgersen, Y. A., Woese, C. R. & Madigan, M. T. (1996). Heliophilum fasciatum gen. nov. sp. nov. and Heliobacterium gestii sp. nov.: endospore-forming heliobacteria from rice field soils. Arch Microbiol 165, 226–234.
Palmedo, G., Seither, P., Korner, H., Matthews, J. C., Burkhalter, R. S., Timkovich, R. & Zumft, W. G. (1995). Resolution of the nirD locus for heme d1 synthesis of cytochrome cd1 (respiratory nitrite reductase) from Pseudomonas stutzeri. Eur J Biochem 232, 737–746.
Pearson, W. R. & Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 85, 2444–2448.
Pospiech, A. & Neumann, B. (1995). A versatile quick-prep of genomic DNA from Gram-positive bacteria. Trends Genet 11, 217–218.
Ren, J., Sainsbury, S., Combs, S. E., Capper, R. G., Jordan, P. W., Berrow, N. S., Stammers, D. K., Saunders, N. J. & Owens, R. J. (2007). The structure and transcriptional analysis of a global regulator from Neisseria meningitidis. J Biol Chem 282, 14655–14664.
Rost, B. (1999). Twilight zone of protein sequence alignments. Protein Eng 12, 85–94.
Sali, A., Potterton, L., Yuan, F., van Vlijmen, H. & Karplus, M. (1995). Evaluation of comparative protein modelling by MODELLER. Proteins 23, 318–326.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 74, 5463–5467.
Sasson, O., Vaaknin, A., Fleischer, H., Portugaly, E., Bilu, Y., Linial, N. & Linial, M. (2003). ProtoNet: hierarchical classification of the protein space. Nucleic Acids Res 31, 348–352.
Schiex, T., Gouzy, J., Moisan, A. & de Oliveira, Y. (2003). FrameD: a flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences. Nucleic Acids Res 31, 3738–3741.
Schultz, S. C., Shields, G. C. & Steitz, T. A. (1991). Crystal structure of a CAP–DNA complex: the DNA is bent by 90 degrees. Science 253, 1001–1007.
Shmatkov, A. M., Melikyan, A. M., Chernousko, F. L. & Borodovsky, M. (1999). Finding prokaryotic genes by the "frame-by-frame" algorithm: targeting gene starts and overlapping genes. Bioinformatics 15, 874–886.
Simossis, V. A., Kleinjung, J. & Heringa, J. (2005). Homology-extended sequence alignment. Nucleic Acids Res 33, 816–824.
Soding, J., Biegert, A. & Lupas, A. N. (2005). The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33, W244–W248.
Stroupe, M. E., Leech, H. K., Daniels, D. S., Warren, M. J. & Getzoff, E. D. (2003). CysG structure reveals tetrapyrrole-binding features and novel regulation of siroheme biosynthesis. Nat Struct Biol 10, 1064–1073.
Suharti & de Vries, S. (2005). Membrane-bound denitrification in the Gram-positive bacterium Bacillus azotoformans. Biochem Soc Trans 33, 130–133.
Thaw, P., Sedelnikova, S. E., Muranova, T., Wiese, S., Ayora, S., Alonso, J. C., Brinkman, A. B., Akerboom, J., van der Oost, J. & Rafferty, J. B. (2006). Structural insight into gene transcriptional regulation and effector binding by the Lrp/AsnC family. Nucleic Acids Res 34, 1439–1449.
Thomas, P. D., Mi, H. & Lewis, S. (2007). Ontology annotation: mapping genomic regions to biological function. Curr Opin Chem Biol 11, 4–11.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
Timkovich, R. (2003). The family of d-type hemes: tetrapyrroles with unusual substituents. In The Porphyrin Handbook, pp. 123–156. Edited by K. M. Kadish, K. M. Smith & R. Guilard. San Diego, CA: Academic Press.
von Mering, C., Jensen, L. J., Snel, B., Hooper, S. D., Krupp, M., Foglierini, M., Jouffre, N., Huynen, M. A. & Bork, P. (2005). STRING: known and predicted protein–protein associations, integrated and transferred across organisms. Nucleic Acids Res 33, D433–D437.
Wang, L., Trawick, J. D., Yamamoto, R. & Zamudio, C. (2004). Genome-wide operon prediction in Staphylococcus aureus. Nucleic Acids Res 32, 3689–3702.
Warren, M. J., Bolt, E. L., Roessner, C. A., Scott, A. I., Spencer, J. B. & Woodcock, S. C. (1994). Gene dissection demonstrates that the Escherichia coli cysG gene encodes a multifunctional protein. Biochem J 302, 837–844.
Whelan, S. & Goldman, N. (2001). A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18, 691–699.
Woodcock, S. C., Raux, E., Levillayer, F., Thermes, C., Rambach, A. & Warren, M. J. (1998). Effect of mutations in the transmethylase and dehydrogenase/chelatase domains of sirohaem synthase (CysG) on sirohaem and cobalamin biosynthesis. Biochem J 330, 121–129.
Xiong, J., Inoue, K. & Bauer, C. E. (1998). Tracking molecular evolution of photosynthesis by characterization of a major photosynthesis gene cluster from Heliobacillus mobilis. Proc Natl Acad Sci U S A 95, 14851–14856.
Yap-Bondoc, F., Bondoc, L. L., Timkovich, R., Baker, D. C. & Hebbler, A. (1990). C-methylation occurs during the biosynthesis of heme d1. J Biol Chem 265, 13498–13500.
Youn, H.-S., Liang, Q., Cha, J. K., Cai, M. & Timkovich, R. (2004). Compound 800, a natural product isolated from genetically engineered Pseudomonas: proposed structure, reactivity, and putative relation to heme d1. Biochemistry 43, 10730–10738.
Zumft, W. G. (1997). Cell biology and molecular basis of denitrification. Microbiol Mol Biol Rev 61, 533–616.
Received 8 March 2007; revised 26 June 2007; accepted 4 July 2007.