Microbiology
Microbiology 154 (2008), 42-53; DOI  10.1099/mic.0.2007/010611-0
© 2008 Society for General Microbiology

The phagosomal nutrient transporter (Pht) family

Derek E. Chen1, Sheila Podell2, John-Demian Sauer3, Michele S. Swanson4 and Milton H. Saier1

1 Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, USA
2 Scripps Genome Center, Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093-0202, USA
3 Department of Biochemistry, University of California at Berkeley, Berkeley, CA, USA
4 Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI 48109, USA

Correspondence
Milton H. Saier, Jr
msaier{at}ucsd.edu


    ABSTRACT
Phagosomal transporters (Phts), required for intracellular growth of Legionella pneumophila, comprise a novel family of multispanning α-helical proteins within the major facilitator superfamily (MFS). The members of this family derive exclusively from bacteria. Multiple paralogues are present in a restricted group of Alpha- and Gammaproteobacteria, but single members were also found in Chlamydia and Cyanobacteria. Their protein sequences were aligned, yielding a phylogenetic tree showing the relations of the proteins to each other. Topological analyses revealed a probable 12 α-helical transmembrane segment (TMS) topology. Motif identification and statistical analyses provided convincing evidence that these proteins arose from a six TMS precursor by intragenic duplication. The phylogenetic tree revealed some potential orthologous relationships, suggestive of common function. However, several probable examples of lateral transfer of the encoding genetic material between bacteria were identified and analysed. The Pht family most closely resembles a smaller MFS family (the UMF9 family) with no functionally characterized members. However, the UMF9 family occurs in a broader range of prokaryotic organism types, including Archaea. These two families differ in that organisms bearing members of the Pht family often have numerous paralogues, whereas organisms bearing members of the UMF9 family never have more than two. This work serves to characterize two novel families within the MFS and provides compelling evidence for horizontal transfer of some of the family members.


Abbreviations: LPI, lineage probability index; MFS, major facilitator superfamily; Pht, phagosomal transporter; TMS, transmembrane segment; UMF9, unknown major facilitator-9

Supplementary figures showing multiple sequence alignments, and hydropathy, amphipathicity and similarity plots for members of the Pht and UMF9 families, supplementary tables listing the proteins of the Pht and UMF9 families, and a series of codon usage tables are available with the online version of this paper.


    INTRODUCTION
Animal pathogens have evolved mechanisms to acquire nutrients from their hosts. Bactericidal phagocytes serve as one intracellular site that virulent bacteria use as a protected replication niche (Greub & Raoult, 2004). Intracellular pathogens have acquired mechanisms both to protect themselves from the killing actions of their hosts and to compete with their hosts for nutrient and energy sources (Leung & Finlay, 1991; O'Riordan et al., 2003; Sauer et al., 2005b; Wieland et al., 2005). Legionella pneumophila is an excellent model organism to study many aspects of intracellular parasitism, including nutrient acquisition (Molofsky & Swanson, 2004).

L. pneumophila is a Gram-negative, facultative intracellular pathogen found ubiquitously in nature as a parasite of freshwater protozoans (Fliermans et al., 1981). When immunocompromised humans or smokers come into contact with aerosols containing L. pneumophila, the bacterium can cause opportunistic infections (Marston et al., 1994). The most severe disease caused by L. pneumophila is an often fatal pneumonia called Legionnaires' Disease. Through selective pressures exerted by its protozoan hosts, L. pneumophila has acquired mechanisms to survive in and exploit the normally bactericidal cells of the human lung and thereby cause debilitating disease (Fields et al., 2002).

L. pneumophila survives inside amoebae and macrophages due to its ability to establish a unique, protected replication vacuole that is separate from the canonical endocytic pathway. A type IV secretion system, the dot/icm apparatus, is required to establish this niche. Within this protected compartment, the bacteria apparently gauge the nutrient supply; when adequate, the cells differentiate and begin to replicate (Sauer et al., 2005b). The process of bacterial differentiation, between transmissive and replicative forms, is a central paradigm of Legionella pathogenesis, as mutations in a number of loci that regulate this process cause significant growth defects (Molofsky & Swanson, 2004). In broth cultures, differentiation to the transmissive form is governed in large part by a stringent response-like regulatory cascade which is activated upon amino acid starvation. The nutrients sensed within the vacuole that trigger differentiation and promote replication are largely unknown.

The nutrient requirements of Legionella have been studied for many years. Early work indicated that Legionella species require only amino acids as sources of carbon, nitrogen and energy. In fact, in broth cultures, Legionella are unable to metabolize sugars as energy sources (Tesh et al., 1983). Arginine, cysteine, methionine, serine, threonine and valine are essential amino acids required for replication, and glutamine, glutamate and serine are the preferred energy sources (George et al., 1980; Tesh & Miller, 1981; Tesh et al., 1983; Warren & Miller, 1979). Since this early work, few advances have been made concerning the nutrient requirements of Legionella or the triggers that promote differentiation.

Recently, it was established that Legionella pneumophila requires the phtA locus for growth and differentiation within macrophages. The growth defect displayed by a phtA mutant was rescued by the addition of excess threonine in either peptide or free amino acid form (Sauer et al., 2005b). This led to the identification of PhtA as a threonine transporter. PhtJ, which is also required for normal differentiation and replication within macrophages, was likewise identified as a valine transporter (Sauer et al., 2005a, b; J. D. Sauer, unpublished results; Gao et al., 1998; Harb & Abu Kwaik, 2000). Nucleotide sequence analysis predicted that these proteins are members of a moderately sized family of transporters within the major facilitator superfamily (MFS). These transporters may generally be utilized to scavenge sparse nutrients from the host cell, thus exploiting these cells as replicative niches.

In this communication, we identify all sequenced members of this new family, the Pht family, as of October 2006. Their conserved motifs and uniform topological characteristics are identified, and their phylogenetic relationships are defined. This family most closely resembles another MFS family [TC #2.A.1.54; the unknown major facilitator-9 (UMF9) family] with no functionally characterized members. Our work also bears on the evolutionary origins of these proteins and provides convincing evidence for horizontal transfer of some of the encoding genes between bacteria.


    METHODS
Protein multiple alignments for both the Pht family and the UMF9 family were generated using the CLUSTAL_X program (Thompson et al., 1997). Related proteins were obtained using PSI-BLAST (NCBI; Altschul et al., 1997). Redundancies were manually removed: when multiple strains of a species were represented, all paralogues of only one strain were retained. Where an orthologue was not available in the chosen strain, the homologue from a different strain was included (Table 1). Phylogenetic trees were generated with the CLUSTAL_X neighbour-joining tree creation function (Saitou & Nei, 1987). The protein and 16S rRNA trees were drawn using the TreeView program (Zhai et al., 2002). Binary alignments between proteins (e.g. Wen1 and PhtH) were generated using the GAP program (Devereux et al., 1984). The AveHas program (Zhai & Saier, 2001) was used to generate the mean hydropathy, amphipathicity and similarity plots. The Kazusa codon usage database (www.kazusa.or.jp/codon) provided codon usage information for the genomes of the species whose proteins were examined. The codon usage for the genes of the specified proteins was calculated using the Countcodon program (www.kazusa.or.jp/codon).
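The per-gene codon usage calculation can be illustrated with a minimal sketch. This is not the Countcodon program itself: the function names are hypothetical, and reporting frequencies per 1000 codons is an assumption made here so that gene values can be set beside genome-wide codon usage tables.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count codons in a coding sequence, reading in frame from position 0."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def per_thousand(counts: Counter) -> dict:
    """Convert raw codon counts to frequencies per 1000 codons (assumed convention)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}
```

Applied to a gene and to all protein-coding genes of its genome, the two frequency tables can then be compared codon by codon.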


Table 1. Members of the Pht family presented by phylogenetic cluster (see Fig. 1a)

Multiple strains of L. pneumophila, C. burnetii, F. tularensis and Wolbachia endosymbionts had been fully sequenced at the time this study was initiated. In all cases, we selected just one strain with the most complete set of paralogues for presentation. Only when an additional homologue was found in another strain of the same species did we include it in the table and our study. Four strains of Legionella pneumophila had been fully sequenced: Corby, Lens, Paris and subsp. pneumophila Philadelphia 1. L. pneumophila Philadelphia 1 possesses all Pht family members present in these strains except PhtL, which was found only in strain Lens. Therefore, PhtL from strain Lens, the only L. pneumophila homologue not found in strain Philadelphia 1, is included in the table. Two strains of Coxiella burnetii had fully sequenced genomes. In addition to the homologues present in strain RSA 493, strain Dugway 7E9-12 has one homologue that is not present in RSA 493; Cbu10 from C. burnetii Dugway 7E9-12 is therefore included in the table. Six strains of Francisella tularensis had been fully sequenced: three subsp. tularensis, two subsp. holarctica and one subsp. novicida. All F. tularensis homologues of the Pht family proved to be present in the substrain labelled subsp. tularensis. Two Wolbachia endosymbiont strains had been fully sequenced. The Pht homologues in these two strains are the same except for Wen4, which is from strain TRS of Brugia malayi. This homologue, in addition to those from the Wolbachia endosymbiont of Drosophila simulans, was included in the table.

 
Predicted protein sequences for test genomes were downloaded from the Integrated Microbial Genomes (IMG) website of the Joint Genome Institute, version 2.0 (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi; Markowitz et al., 2006). Lineage Probability Index (LPI) scores were obtained using the DarkHorse algorithm to predict the likelihood of horizontal gene transfer, as described by Podell & Gaasterland (2007). Briefly, this algorithm identifies protein sequence matches from a diverse reference database (such as GenBank) with a genome-wide set of predicted proteins, then combines this information with taxonomic data about the matched sequences. The result is a numeric value for each test protein sequence (LPI score), which estimates the likelihood that the phylogeny of its database match is typical or atypical compared to the rest of the test genome. The input parameters for this method are a complete set of predicted genomic protein sequences from the test organism, a set of self-definition keywords that define phylogenetic granularity, and a numerical value between zero and one, called the ‘filter threshold’, which modulates selectivity of the calculation. For this study, the self-definition keyword for each species consisted of the genus name, and the filter threshold value was set at 0.1.


    RESULTS
Members of the Pht family
All sequenced members of the Pht family were retrieved from the NCBI non-redundant protein database using PhtJ, the valine uptake system of L. pneumophila, as the query sequence in PSI-BLAST searches without iterations, first in June 2006 and again in October 2006. The same homologues were retrieved when other Pht proteins of L. pneumophila were used as the query sequence. The members of the family are presented in Table 1, listed according to phylogenetic cluster (see below) and, in Table S1 (available with the online version of this paper and at the authors' website: www.biology.ucsd.edu/~msaier/supmat/Pht), listed alphabetically by protein abbreviation. The majority of the members of the Pht family are from gammaproteobacteria, but several are from alphaproteobacteria, one is from a cyanobacterium and one is from a chlamydial species. Many bacterial kingdoms are not represented, and no member proved to be from an archaeon or a eukaryote.

The numbers of paralogues in the different organisms represented vary tremendously. There are 11 Pht paralogues in one strain of Legionella pneumophila plus one more in a different strain of L. pneumophila (see Table 1), 9 in a strain of Coxiella burnetii plus one more in another strain of C. burnetii, 7 in Rickettsiella grylli, 6 in a strain of Francisella tularensis and 3 in one strain of Wolbachia, with another in a second strain (see Table 1). The presence of family members in some but not all strains of a species suggests the recent gain or loss of genes encoding these proteins.

All of the bacteria with multiple Pht paralogues are gammaproteobacteria except for Wolbachia, which is an alphaproteobacterium. None of the other bacteria possessing Pht family members has more than one sequenced member of this transporter family. Strikingly, a gammaproteobacterium or an alphaproteobacterium can have multiple paralogues in this family, whereas none is found when the identical search criteria are applied to over 100 other fully sequenced proteobacteria of all subdivisions. However, the occurrence of transporter types encoded within the genome of a bacterium appears to be related to (1) the organism's phylogeny and (2) its lifestyle (Ren & Paulsen, 2007). As noted below, all of the organisms with four or more paralogues cluster together in the 16S rRNA tree. In addition, the proteins in this family are found almost exclusively in species that are either obligate or facultative intracellular parasites. Thus, our phylogenetic analysis predicts that the Pht transporters may serve a unique type of function that is not required by most bacteria.

Phylogeny of Pht family members
A phylogenetic tree for the Pht family, based on the multiple alignment shown in Fig. S1 (available with the online version of this paper and at the authors' website: www.biology.ucsd.edu/~msaier/supmat/Pht), is shown in Fig. 1(a). There are 11 clusters, containing 46 sequences from 13 different species. Cluster 1 includes two proteins each from L. pneumophila, C. burnetii and R. grylli. These proteins cluster as expected for two sets of orthologues, based on the 16S rRNA tree shown in Fig. 1(b). For example, Cbu2 and Rgr1 cluster most closely together on both trees (Fig. 1a, b) and therefore could be orthologous to PhtC since the rRNAs of Coxiella and Rickettsiella are more closely related to each other than they are to that of Legionella. Similarly, Cbu1 and Rgr2 are more closely related to each other than they are to PhtD. These three proteins thus show the same relative positions in Fig. 1(a) as do the corresponding rRNAs in Fig. 1(b), consistent with orthology. The results are therefore consistent with the conclusion that cluster 1 contains two sets of orthologues present in C. burnetii, R. grylli and L. pneumophila but lacking in F. tularensis.


Fig. 1. Phylogenetic tree of members of the Pht family (a) as well as the 16S rRNAs from the represented organisms (b). Genes outlined by boxes indicate the most likely candidates for horizontal gene transfer. These trees are based on the CLUSTAL_X multiple alignment (Thompson et al., 1997) shown in Fig. S1. They were drawn with the TreeView program (Zhai et al., 2002). Protein abbreviations are as indicated in Table 1 (by cluster) and Table S1 (alphabetically).

 
Cluster 2 contains three proteins from C. burnetii, two from L. pneumophila, and one each from R. grylli and F. tularensis. Of these proteins, Cbu3 is most closely related to Ftu1, while PhtE is most closely related to Rgr3. Since the Coxiella rRNA is most closely related to that of Rickettsiella, and most distantly related to that of Francisella, we conclude that these proteins cannot be assigned to orthologous clusters based on the 16S rRNA tree. Cluster 3 includes just two sequence-divergent proteins, PhtI and Cbu6, which could be orthologues if Rickettsiella lacks the corresponding protein. Cluster 4 has three proteins that may be orthologous. If so, R. grylli lacks the corresponding orthologue. Clusters 5, 7 and 8 consist of five very distantly related proteins. Two of the three proteins of cluster 7 as well as the cluster 5 protein are from F. tularensis.

Clusters 7 and 8 contain the only two sequences derived from free-living bacterial species, Pma1 from Prochlorococcus marinus and Zmo1 from Zymomonas mobilis. As documented below, we believe the genes encoding these two proteins were acquired by horizontal transfer.

Cluster 6 is the largest cluster represented in the tree in Fig. 1(a). Two L. pneumophila paralogues are present, one (PhtB) distant from all other members of this cluster; the other (PhtK) closely related to Pam1 from the chlamydial species Candidatus ‘Protochlamydia amoebophila’ UWE25. As noted below, all three of these proteins were probably acquired by horizontal transfer. The two proteins from two different Ehrlichia species cluster together, suggesting that they are orthologues. The two Anaplasma proteins also cluster together as expected for orthologues. Finally, all four Wolbachia proteins cluster together, suggesting that they arose by recent gene duplication events. Wen1 and Wen4, from two different strains of Wolbachia, are probably orthologues. Wen1, Wen2 and Wen3 are from the same strain of Wolbachia and therefore probably arose by gene duplication events in this organism.

Cluster 9 has two paralogues each from L. pneumophila and C. burnetii as well as a single homologue each from F. tularensis and R. grylli. The relationships of proteins Ftu6, Cbu7 or Cbu8 and Rgr4 are consistent with orthology. If so, the pair of paralogues, Cbu7 and Cbu8, arose by an extragenic duplication event that occurred at about the same time as Coxiella and Rickettsiella diverged from each other. The two L. pneumophila and two C. burnetii paralogues could have arisen by early gene duplication events. Finally, the distances between the homologues in clusters 10 and 11 are not consistent with orthology, but the two paralogues in cluster 11 from R. grylli could have arisen by a recent gene duplication event.

In summary, we find a poor correlation between clustering patterns of the 16S rRNAs and the proteins, suggesting the occurrence of horizontal gene transfer. There are few well-conserved sets of potential orthologues. However, clusters 1–4 and 9–11 consist only of gammaproteobacterial homologues, suggesting either that lateral transfer did not occur or that it occurred between relatively closely related organisms. All other homologues, from alphaproteobacteria, cyanobacteria and chlamydiae, are in clusters 6–8 which also contain gammaproteobacterial representatives. Based on the phylogenetic analyses, both gain (through horizontal transfer) and loss (by gene deletion) appear to have occurred among the genes encoding members of this family.

Further evidence for horizontal gene transfer in the Pht family
The possibility of horizontal gene transfer was examined further using several independent methods. The methods used included genome-wide phylogenetic analysis of database matches, G+C contents, codon usage patterns and conservation of gene order (synteny) among closely related species. Genome-wide phylogenetic analysis was performed for all species containing Pht family members using the DarkHorse method (Podell & Gaasterland, 2007; see Methods), to obtain LPI scores. Genes encoding proteins where phylogenetic tree relationships suggested horizontal transfer were also examined by other methods (Table 2 and Tables S3–S17, available with the online version of this paper and at the authors' website: www.biology.ucsd.edu/~msaier/supmat/Pht). For the genes in clusters 1–5 and 9–11, no significant differences were seen in G+C content, relative codon usage frequencies between organismal and gene values, LPI values from the DarkHorse program or synteny with closely related organisms; however, significant differences were found in clusters 6, 7 and 8.


Table 2. R² values, G+C content differences and LPI values ascertaining the evidence for horizontal gene transfer

Genes and values strongly indicative of horizontal transfer are indicated in bold type.

 
The results using these different methods are summarized in Table 2. This table records (1) R² values for codon usage correlation where low values are indicative of horizontal transfer, (2) differences in G+C content for the gene and the coding regions within the entire protein-coding parts of the genomes, and (3) results from the LPI analysis. Details of the LPI analysis are shown graphically in Fig. 2. Using these methods, five genes appear likely to have undergone lateral transfer. These genes are pam1, phtB, phtK, pma1 and zmo1. Two of these genes showed low R² values, and one showed a significant G+C content difference between gene and genome. The phtB gene also showed the lowest R² value and the greatest G+C content difference from the G+C content of all other protein-coding genes in the genome. Fig. 2 illustrates that LPI scores for pam1, phtB, phtK, pma1 and zmo1 are substantially lower than for other genes from the same host genome, while scores for other members of the Pht family, with the possible exception of Pam1, are not.
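The first two measures can be sketched in a few lines. The sketch assumes that the R² value is the squared Pearson correlation between gene and genome codon frequencies, which the text does not state explicitly; the function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt


def r_squared(gene_freq: dict, genome_freq: dict) -> float:
    """Squared Pearson correlation between two codon frequency tables.

    Codons missing from one table are treated as having frequency 0.
    """
    codons = sorted(set(gene_freq) | set(genome_freq))
    x = [genome_freq.get(c, 0.0) for c in codons]
    y = [gene_freq.get(c, 0.0) for c in codons]
    n = len(codons)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)
```

A gene whose codon preferences track those of its genome gives a value near 1, whereas a recently acquired gene that still reflects the codon bias of its donor would be expected to give a lower value.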


 
Fig. 2. LPI score distribution histograms for (a) L. pneumophila Philadelphia, (b) P. marinus MIT9313, (c) Z. mobilis, (d) Candidatus ‘Protochlamydia amoebophila’ UWE25, (e) C. burnetii and (f) F. tularensis. LPI scores are normalized values obtained using the DarkHorse algorithm (Podell & Gaasterland, 2007), reflecting phylogenetic similarity of individual proteins from the test organism to database protein sequence matches from other species. Scores are inversely proportional to the likelihood of acquisition via horizontal transfer. Organisms at similar phylogenetic distances receive similar scores, regardless of database abundance. For each genome, scores for all predicted proteins having orthologues in the non-redundant GenBank database were collected into bins of 0.02 units. Arrows indicate relative positions of scores for Pht family members.
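The binning described in the legend can be reproduced with a short sketch. It assumes that LPI scores lie between 0 and 1 and that a score of exactly 1 belongs to the top bin; the function names are hypothetical and the code is independent of the DarkHorse software.

```python
import math
from collections import Counter

BIN_WIDTH = 0.02  # bin size stated in the figure legend


def bin_index(score: float, width: float = BIN_WIDTH) -> int:
    """Index of the histogram bin holding a score in the range 0 to 1."""
    last_bin = round(1.0 / width) - 1
    # The small epsilon keeps scores on a bin edge from slipping into the
    # lower bin through floating-point rounding.
    return min(math.floor(score / width + 1e-9), last_bin)


def lpi_histogram(scores, width: float = BIN_WIDTH) -> Counter:
    """Number of proteins falling into each bin."""
    return Counter(bin_index(s, width) for s in scores)
```

The bin index of a single Pht protein then gives the position at which its arrow would sit relative to the genome-wide distribution.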

 
Six strains of P. marinus have been sequenced. Synteny analysis revealed that of these six strains, only strain MIT9313 had the pma1 gene, although all of these six strains showed identical flanking regions (see Fig. 3). For the flanking homologues, synteny was only observed for members of the same species when available, but not across species lines. Thus, no clear evidence for horizontal transfer was obtained for these latter genes, although the lack of synteny with the nearest sequenced relatives is consistent with horizontal transfer. Thus, both characteristics of these genes proved similar to those of their source organisms (e.g. ≤1 % difference for G+C content and R² values >0.66). For phtG, the only minor codon usage difference was for glycine codons (Table S12), and for phtK, the differences were for the 6 leucine codons and possibly the 4 valine codons (Table S13). For pam1 there were no significant differences (Table S11), but for phtB and pma1, the codon usage differences were highly significant. In summary, the evidence for horizontal transfer was best for phtB and pma1, but also substantial for phtK and zmo1.


 
Fig. 3. Gene order of equivalent genomic regions in six strains of P. marinus. The region flanking the pma1 gene in strain MIT9313 was compared to syntenic regions in five closely related Prochlorococcus strains using the gene neighbourhood browser function of the Integrated Microbial Genomes (IMG) website, version 2.0 (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi; Markowitz et al., 2006). Abbreviations: ftsJ, gene encoding the cell division protein (putative RNA methyltransferase); purB, the adenylosuccinate lyase gene; fumC, the fumarate lyase gene; hc, putative DNA helicase gene; hp, hypothetical protein gene; bioF, putative 8-amino-7-oxononanoate synthase gene; bioD, putative dethiobiotin synthase gene.

 
Topology of Pht family proteins
Mean hydropathy, amphipathicity and similarity plots were generated using the AveHas program (Zhai & Saier, 2001). As recorded in Table 1, all of the putative Pht proteins are about the same size, and there are no significant size differences between members of the various phylogenetic clusters. Indeed, the AveHas plots (Fig. S2) revealed a typical MFS 12-TMS topology with a 6+6 arrangement. The two halves look similar with peaks 3 and 4 as well as peaks 9 and 10 being closer to each other than to the others. Also, in both halves, the first 4 peaks are best conserved, while the last peak is the most hydrophobic. These observations led us to examine sequence motif similarities between the two halves as an approach to reveal their homology.
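The idea behind a hydropathy plot can be shown with a single-sequence sketch. This is not the AveHas program, which averages over an alignment and whose parameters are not given here; the Kyte-Doolittle scale and the 19-residue window are assumptions chosen because they are commonly used to locate transmembrane segments.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Mean hydropathy of each full window along a protein sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

In such a profile, a protein with the 6+6 arrangement described above would be expected to show twelve hydrophobic peaks, six in each half.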

Conserved motifs between the first and second halves of Pht family members
The multiple alignment (Fig. S1) showed that in all homologues, the initiation codon occurs at about the same place (±35 alignment positions). One homologue, Ftu3, was C-terminally truncated, losing TMSs 11 and 12 due to a database gene-model error. This protein was reconstituted to its full size following translation of the downstream DNA in the three reading frames using the NCBI ORF finder program (www.ncbi.nlm.nih.gov/gorf/gorf.html).

The most conserved regions of these proteins occur overlapping and between TMSs 2 and 3, and 8 and 9, as well as between TMSs 4 and 5, and 10 and 11. These two sets of sequences show striking sequence similarities between the two halves of these proteins.

The first motifs of these two regions (TMSs 2–3 and 8–9) show well-conserved, similar consensus sequences (Fig. 4a). The second motifs of these regions (TMSs 4–5 and 10–11) show less well-conserved, but significantly similar consensus sequences (Fig. 4b). The similarities of these two regions (which are less than the similarities observed for the same halves of the different family members) probably reflect an ancient intragenic duplication event that generated the 12-TMS MFS permeases from their 6-TMS precursors. These protein halves were therefore examined for the statistical significance of their sequence similarities using the GAP (Devereux et al., 1984) and IC (Zhai & Saier, 2002) programs.


 
Fig. 4. Alignment of conserved motifs in the first (top) and second (bottom) halves of members of the Pht family. (a) Motifs in the TMS 2–3 and 8–9 regions of the two halves. Vertical lines indicate identities; colons indicate similarities; alternative residues at a particular position are indicated in parentheses; alignment position is indicated at the beginning of each motif. (b) Motifs in the TMS 4–5 and 10–11 regions of the two halves. Colons represent similar residue types between the two sequences; alternative dominant residues at a single position are in parentheses; *, full conservation in one of the two motifs; –, a gap occurs in the aligned motifs.

 
Homology of the two 6-TMS repeat units with each other
As noted above, the two halves of each protein in the Pht family (see Fig. S2) show similar sequence motifs. Furthermore, members of the MFS, derived from some but not other families within this superfamily, have been shown to have arisen by intragenic duplication events in which a precursor gene encoding a 6-TMS protein duplicated to give full-length MFS permeases of 12 TMSs (Pao et al., 1998; Saier et al., 1999).

We examined members of the Pht family to see if the occurrence of this evolutionary event could be established using statistical means. Representative results are shown in Fig. 5. When the first half (TMSs 1–6) of Wen1 from the Wolbachia endosymbiont strain TRS, from the roundworm Brugia malayi, was compared with the second half (TMSs 7–12) of PhtH from L. pneumophila, a comparison score of 10.9 SD was obtained with 38.2 % similarity and 22.5 % identity. This comparison score indicates that the degree of sequence similarity observed for the two halves of the proteins could not have occurred by chance except with a probability of less than 10⁻²⁹. Based on criteria presented previously (Saier, 1994) and the superfamily principle (Doolittle, 1986), this value is considered sufficient to establish homology. We therefore conclude that members of the Pht family arose by an intragenic duplication event.
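A comparison score expressed in SD units can be understood through the following sketch. It assumes that the score is the number of standard deviations by which the alignment score of the real sequences exceeds the mean score obtained after shuffling one of them; whether the GAP and IC programs compute it in exactly this way is not stated here, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev


def shuffled(seq: str, rng: random.Random) -> str:
    """Random permutation of a sequence, preserving its residue composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def comparison_score_sd(real_score: float, shuffled_scores: list) -> float:
    """Real alignment score in standard deviations above the shuffled mean."""
    return (real_score - mean(shuffled_scores)) / stdev(shuffled_scores)
```

The alignment scores themselves would come from an alignment program run once on the real pair and many times on shuffled versions; because shuffling preserves composition, a high score reflects residue order rather than amino acid content alone.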


 
Fig. 5. Binary alignment of the first half (TMSs 1–6) of Wen1 with the second half (TMSs 7–12) of PhtH (see Table 1). The GAP (Devereux et al., 1984) and IC (Zhai & Saier, 2002) programs were used to derive the alignment shown and the comparison score of 10.9 SD. Residue numbers are provided at the beginning and end of each line. Vertical lines, identities; colons, close similarities; dots, more distant similarities as defined by the GAP program. Dots above the alignment indicate the position of every tenth residue.

 
Members of the UMF9 family
BLAST searches and phylogenetic analyses revealed that the closest relatives of the Pht family are members of another small MFS family, no member of which has been functionally characterized. We designated this family the UMF9 family (see www.tcdb.org/). This family proved to have different organismal sources and numbers of paralogues per organism (Table 3).


Table 3. Members of the UMF9 family presented by phylogenetic cluster (see Fig. 6a)

 
After removing redundancies as described above for the Pht family, this family comprises 27 sequenced proteins. These proteins derive from a variety of bacterial kingdoms as well as from Euryarchaeota (Table 3 and Table S2). The bacterial divisions include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Firmicutes, Actinobacteria and Acidobacteria. The organismal distribution is therefore broader than that of the Pht family, even though the UMF9 family is smaller. No organism has more than two paralogues within this family. The four organisms that have two paralogues are from four different prokaryotic subdivisions (Alpha- and Gammaproteobacteria, a firmicute and a euryarchaeon). All other organisms represented have just one.

Phylogeny of UMF9 family members
The phylogenetic tree for the UMF9 family, based on the multiple alignment shown in Fig. S3, is shown in Fig. 6(a). There are 7 clusters containing 27 sequences from 23 different species. Cluster 1 consists of alpha-, gamma- and deltaproteobacterial homologues as well as one from a firmicute. The relative branch lengths are consistent with orthology (see Fig. 6b). Cluster 2 includes two close homologues, Sfu1 from a deltaproteobacterium and Pth1 from a firmicute, clearly indicating the occurrence of lateral gene transfer. Other organisms in cluster 2 are all from Euryarchaeota, and their phylogenetic relationships to each other are suggestive of orthology. Surprisingly, the sequence of the Archaeoglobus fulgidus gene afu1 appears to be more closely related to bacterial database matches than to archaeal ones, with an LPI score of 0.09 (data not shown). All of the organisms in this cluster live in anaerobic, methanogenic environments characterized by close association of archaeal and bacterial species, providing ample opportunity for horizontal gene transfer to occur.



 
Fig. 6. Phylogenies of (a) members of the UMF9 family and (b) 16S rRNAs from the represented organisms. Tree construction and format of presentation are as described in the legend to Fig. 1. The tree shown in Fig. 6(a) was based on the multiple alignment shown in Fig. S3. Protein abbreviations are as indicated in Table 3 (by cluster) and Table S2 (alphabetically).

 
Cluster 3 includes four proteins, a gamma- and two betaproteobacterial homologues as well as one from an acidobacterium. These proteins clearly are not orthologues. Cluster 4 firmicute proteins could be orthologous to each other (compare Fig. 6a and b). Cluster 5 consists of a single protein from Bradyrhizobium. Cluster 6 consists of proteins only from actinobacteria, and their phylogenetic distances may be consistent with orthology within experimental error. Cluster 7 includes two paralogues from a euryarchaeon, Natronomonas. Based on these observations, the occurrence of horizontal transfer within some clusters of the UMF9 family but not others seems likely.

Further evidence for lateral gene transfer within the UMF9 family
Within the UMF9 family, we also identified genes whose G+C content and codon usage differed significantly from those of their genomes (data not shown). Most striking was the pth1 gene, which differed from its genome by 4.7 mol% in G+C content and showed major codon usage frequency differences for the Arg, Gln, Glu, Gly, Ile, Phe, Pro and Thr codons (R^2 value of 0.47; Table S14). Several others, such as aeh1, showed no significant differences. These results suggest that horizontal gene transfer may also have been a characteristic of the UMF9 family.
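The two quantities discussed above, the G+C difference between a gene and its genome and the agreement between their codon usage frequencies, can be sketched as follows. This is an illustration under stated assumptions: R^2 is taken here to be the squared Pearson correlation between gene and genome codon frequencies, which may differ from the calculation behind Table S14, and all function names are hypothetical.

```python
from collections import Counter


def gc_mol_percent(dna):
    """G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    dna = dna.upper()
    gc = dna.count("G") + dna.count("C")
    acgt = gc + dna.count("A") + dna.count("T")
    return 100.0 * gc / acgt


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("codon frequencies have zero variance")
    return sxy * sxy / (sxx * syy)


def codon_usage_r2(gene_cds, genome_cds):
    """R^2 between codon frequencies of one gene and of all genomic CDSs.

    genome_cds is assumed to be the in-frame concatenation of the
    genome's coding sequences.
    """
    gene = codon_frequencies(gene_cds)
    genome = codon_frequencies(genome_cds)
    codons = sorted(set(gene) | set(genome))
    return r_squared([gene.get(c, 0.0) for c in codons],
                     [genome.get(c, 0.0) for c in codons])


def gc_difference(gene_dna, genome_dna):
    """Absolute difference in G+C mol% between a gene and its genome."""
    return abs(gc_mol_percent(gene_dna) - gc_mol_percent(genome_dna))
```

A low R^2 together with a large G+C difference is the kind of signal treated above as suggestive of foreign origin; neither measure is conclusive on its own.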

Topology of UMF9 family members
An AveHas plot for the UMF9 family was derived (see Fig. S4). It does not differ appreciably from that of the Pht family. Thus, a 6+6-TMS topology with similar hydropathy characteristics and recognizable motif similarities between TMSs 2 and 3, and TMSs 8 and 9 was apparent.
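An average hydropathy profile of this kind can be sketched as a per-column mean over a multiple alignment followed by sliding-window smoothing. The sketch below assumes the Kyte-Doolittle hydropathy scale and a 19-residue window; the scale and window actually used by the AveHas program are not stated here, so both are assumptions, as is the function name.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(alignment, window=19):
    """Smoothed mean hydropathy per column of a multiple alignment.

    alignment: list of equal-length aligned sequences; gap characters and
    unknown residues are skipped when averaging a column.
    """
    ncols = len(alignment[0])
    if any(len(seq) != ncols for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    column_means = []
    for col in range(ncols):
        values = [KYTE_DOOLITTLE[seq[col]] for seq in alignment
                  if seq[col] in KYTE_DOOLITTLE]
        column_means.append(sum(values) / len(values) if values else 0.0)

    half = window // 2
    smoothed = []
    for i in range(ncols):
        lo = max(0, i - half)
        hi = min(ncols, i + half + 1)  # window is truncated at the ends
        smoothed.append(sum(column_means[lo:hi]) / (hi - lo))
    return smoothed
```

Peaks in the smoothed profile correspond to candidate TMSs, so a 6+6 topology would appear as two groups of six hydrophobic peaks.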

Conserved motifs between the first and second halves of UMF9 family members
The most conserved motifs for the UMF9 family are very similar to those of the Pht family. These motifs are shown in Fig. 7. The conserved motif in the first halves of the proteins is better conserved than the corresponding motif in the second halves, but the C-terminal portions of these motifs are nevertheless sufficiently similar to strongly suggest a common origin.



 
Fig. 7. The best conserved motifs between TMSs 2 and 3 in the first halves and TMSs 8 and 9 in the second halves of members of the UMF9 family. The convention of presentation is as described in the legend to Fig. 4.

 

    DISCUSSION
 
We have characterized two distinct families within the MFS, the Pht family of phagosomal nutrient transporters and the UMF9 family of unknown function. These two families are more closely related to each other than to any of the other currently recognized families within this superfamily. This became apparent first as a result of BLAST search analyses, second when motif analyses were conducted, and third when phylogenetic analyses between MFS families were conducted, for example, as reported by Pao et al. (1998) (data not shown).

The Pht and UMF9 families are similar in that both appear to be characterized by horizontal transfer. They differ, however, in that whereas the Pht family has many paralogues in a single organism, the UMF9 family does not. Moreover, while the Pht family is very restricted with respect to distribution of organismal types and lifestyles, the UMF9 family is much more widely distributed. The latter family is represented not only in a greater range of bacterial kingdoms, but also in the Archaea. This is surprising in view of the fact that the UMF9 family has fewer sequenced members than the Pht family. These differences presumably reflect the evolutionary pressure for different organisms to acquire and retain the genes encoding these transporters. While Pht family members may function primarily in acquiring nutrients from a host phagosome, the organismal distribution of the UMF9 family suggests that this cannot be true for this family. In fact, all organisms possessing more than one paralogue of the former family (but not the latter family) are intracellular animal parasites. Recent phenotypic analyses suggest that in addition to the amino acid transporters PhtA and PhtJ, the L. pneumophila homologues in phylogenetic cluster 1 of Fig. 1(a), PhtC and PhtD, contribute to nucleoside assimilation (Fonseca et al., 2007). Further elucidation of the substrates of members of both families will undoubtedly prove illuminating.

All functionally characterized members of the MFS have at least 12 TMSs and two internally repeated segments (Pao et al., 1998; Saier et al., 1999). None identified so far contains a single 6-TMS unit. It is presumed that during evolution, the two homologous halves of these proteins have assumed different functional roles in the transport cycle (Abramson et al., 2004; Lemieux et al., 2004; Tamura et al., 2003). A carrier mechanism, in contrast to a channel mechanism, may depend on conformational constraints. These constraints may require that the transporter exists as a single polypeptide chain, rather than as an oligomer of small protein subunits (Saier, 2003). These considerations have been discussed previously (Saier, 2003; van Veen, 2001).

The bioinformatic analyses reported here pose a number of questions for future study. Why has lateral transfer been a characteristic of both the Pht and UMF9 families when this seems not to have been the case for many other transporter families? What were the selective pressures for horizontal transfer? What were the source organisms from which the horizontally transferred genes found in present-day organisms came? Why do Pht family members have so many paralogues in some organisms when this is not true of the UMF9 family? Is the presence of these paralogues a consequence of the intracellular lifestyles of these bacteria? If so, does gene transfer occur outside of the parasitic environment of these facultative pathogens? Why are members of the Pht family so restricted in organismal distribution when the members of the smaller UMF9 family are so much more broadly distributed? This last question is of particular interest in view of the fact that these two families are much more closely related to each other than they are to any of the other >50 currently recognized MFS families. Further bioinformatic and functional analyses are likely to provide answers to these questions.


    ACKNOWLEDGEMENTS
 
The authors would like to thank Terry Gaasterland and the Scripps Genome Center for providing computational resources and infrastructure used in the bioinformatics analysis. S. P. was supported in this work by NSF grant number EF-0412090, the Gordon and Betty Moore Foundation, and the Rancho Santa Fe Foundation. D. E. C. and M. H. S. were supported by NIH grant GM077402 from the National Institute of General Medical Sciences. We thank Mary Beth Hiller and Natasha Weaver for their assistance in the preparation of this manuscript.

Edited by: J. M. Becker


    REFERENCES
 
Abramson, J., Iwata, S. & Kaback, H. R. (2004). Lactose permease as a paradigm for membrane transport proteins. Mol Membr Biol 21, 227–236.

Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–3402.

Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res 12, 387–395.

Doolittle, R. F. (1986). Of URFs and ORFs: A Primer on How to Analyze Derived Amino Acid Sequences. Mill Valley, CA: University Science Books.

Fields, B. S., Benson, R. F. & Besser, R. E. (2002). Legionella and Legionnaires' disease: 25 years of investigation. Clin Microbiol Rev 15, 506–526.

Fliermans, C. B., Cherry, W. B., Orrison, L. H., Smith, S. J., Tison, D. L. & Pope, D. H. (1981). Ecological distribution of Legionella pneumophila. Appl Environ Microbiol 41, 9–16.

Fonseca, M. V., Sauer, J.-D., Byrne, B. G. & Swanson, M. (2007). Thymidine salvage in L. pneumophila: a link between metabolism and cellular differentiation. Second American Society for Microbiology Conference on Integrating Metabolism and Genomics (IMAGE2), Montreal, Quebec, Canada, 30 April–3 May 2007.

Gao, L. Y., Harb, O. S. & Kwaik, Y. A. (1998). Identification of macrophage-specific infectivity loci (mil) of Legionella pneumophila that are not required for infectivity of protozoa. Infect Immun 66, 883–892.

George, J. R., Pine, L., Reeves, M. W. & Harrell, W. K. (1980). Amino acid requirements of Legionella pneumophila. J Clin Microbiol 11, 286–291.

Greub, G. & Raoult, D. (2004). Microorganisms resistant to free-living amoebae. Clin Microbiol Rev 17, 413–433.

Harb, O. S. & Abu Kwaik, Y. (2000). Characterization of a macrophage-specific infectivity locus (milA) of Legionella pneumophila. Infect Immun 68, 368–376.

Lemieux, M. J., Huang, Y. & Wang, D. N. (2004). The structural basis of substrate translocation by the Escherichia coli glycerol-3-phosphate transporter: a member of the major facilitator superfamily. Curr Opin Struct Biol 14, 405–412.

Leung, K. Y. & Finlay, B. B. (1991). Intracellular replication is essential for the virulence of Salmonella typhimurium. Proc Natl Acad Sci U S A 88, 11470–11474.

Markowitz, V. M., Korzeniewski, F., Palaniappan, K., Szeto, E., Werner, G., Padki, A., Zhao, X., Dubchak, I., Hugenholtz, P. & other authors (2006). The integrated microbial genomes (IMG) system. Nucleic Acids Res 34, D344–D348.

Marston, B. J., Lipman, H. B. & Breiman, R. F. (1994). Surveillance for Legionnaires' disease. Risk factors for morbidity and mortality. Arch Intern Med 154, 2417–2422.

Molofsky, A. B. & Swanson, M. S. (2004). Differentiate to thrive: lessons from the Legionella pneumophila life cycle. Mol Microbiol 53, 29–40.

O'Riordan, M., Moors, M. A. & Portnoy, D. A. (2003). Listeria intracellular growth and virulence require host-derived lipoic acid. Science 302, 462–464.

Pao, S. S., Paulsen, I. T. & Saier, M. H., Jr (1998). The major facilitator superfamily. Microbiol Mol Biol Rev 62, 1–32.

Podell, S. & Gaasterland, T. (2007). DarkHorse: a method for genome-wide prediction of horizontal gene transfer. Genome Biol 8, R16.

Ren, Q. & Paulsen, I. T. (2007). Large-scale comparative genomic analyses of cytoplasmic membrane transport systems in prokaryotes. J Mol Microbiol Biotechnol 12, 165–179.

Saier, M. H., Jr (1994). Computer-aided analyses of transport protein sequences: gleaning evidence concerning function, structure, biogenesis, and evolution. Microbiol Rev 58, 71–93.

Saier, M. H., Jr (2003). Tracing pathways of transport protein evolution. Mol Microbiol 48, 1145–1156.

Saier, M. H., Jr, Beatty, J. T., Goffeau, A., Harley, K. T., Heijne, W. H. M., Huang, S.-C., Jack, D. L., Jahn, P. S., Lew, K. & other authors (1999). The major facilitator superfamily. J Mol Microbiol Biotechnol 1, 257–279.

Saitou, N. & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–425.

Sauer, J. D., Shannon, J. G., Howe, D., Hayes, S. F., Swanson, M. S. & Heinzen, R. A. (2005a). Specificity of Legionella pneumophila and Coxiella burnetii vacuoles and versatility of Legionella pneumophila revealed by coinfection. Infect Immun 73, 4494–4504.

Sauer, J. D., Bachman, M. A. & Swanson, M. S. (2005b). The phagosomal transporter A couples threonine acquisition to differentiation and replication of Legionella pneumophila in macrophages. Proc Natl Acad Sci U S A 102, 9924–9929.

Tamura, N., Konishi, S. & Yamaguchi, A. (2003). Mechanisms of drug/H+ antiport: complete cysteine-scanning mutagenesis and the protein engineering approach. Curr Opin Chem Biol 7, 570–579.

Tesh, M. J. & Miller, R. D. (1981). Amino acid requirements for Legionella pneumophila growth. J Clin Microbiol 13, 865–869.

Tesh, M. J., Morse, S. A. & Miller, R. D. (1983). Intermediary metabolism in Legionella pneumophila: utilization of amino acids and other compounds as energy sources. J Bacteriol 154, 1104–1109.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

van Veen, H. W. (2001). Towards the molecular mechanism of prokaryotic and eukaryotic multidrug transporters. Semin Cell Dev Biol 12, 239–245.

Warren, W. J. & Miller, R. D. (1979). Growth of Legionnaires' disease bacterium (Legionella pneumophila) in chemically defined medium. J Clin Microbiol 10, 50–55.

Wieland, H., Ullrich, S., Lang, F. & Neumeister, B. (2005). Intracellular multiplication of Legionella pneumophila depends on host cell amino acid transporter SLC1A5. Mol Microbiol 55, 1528–1537.

Zhai, Y. & Saier, M. H., Jr (2001). A web-based program for the prediction of average hydropathy, average amphipathicity and average similarity of multiply aligned homologous proteins. J Mol Microbiol Biotechnol 3, 285–286.

Zhai, Y. & Saier, M. H., Jr (2002). A simple sensitive program for detecting internal repeats in sets of multiply aligned homologous proteins. J Mol Microbiol Biotechnol 4, 375–377.

Zhai, Y., Tchieu, J. & Saier, M. H., Jr (2002). A web-based TreeView (TV) program for the visualization of phylogenetic trees. J Mol Microbiol Biotechnol 4, 69–70.

Received 12 July 2007; revised 28 September 2007; accepted 1 October 2007.



Copyright © 2008 Society for General Microbiology.