Microbiology
Microbiology 154 (2008), 2559-2661; DOI  10.1099/mic.0.2008/021360-0
© 2008 Society for General Microbiology

An analysis of initiation codon utilization in the Domain Bacteria – concerns about the quality of bacterial genome annotation

Andre Villegas1 and Andrew M. Kropinski1,2

1 Public Health Agency of Canada, Laboratory for Foodborne Zoonoses, Guelph, ON N1G 3W4, Canada
2 Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON N1G 2A3, Canada

Correspondence
Andrew M. Kropinski
Andrew_Kropinski{at}phac-aspc.gc.ca


    ABSTRACT
Using custom software (Inidon) we have examined the initiation codon utilization in 620 complete bacterial genomes downloaded from the National Center for Biotechnology Information (NCBI). The mean utilization of ATG, GTG and TTG codons is 80.1, 11.6 and 7.8 %, respectively. In most cases in which similar species or strains have been analysed the utilization percentages of the three initiation codons are remarkably similar, but in certain cases the results exhibit significant differences.


A table of results exported from Inidon for 620 bacterial chromosomes is available with the online version of this paper. Access to Inidon software can be obtained at http://molbiol-tools.ca/InitCodon/Supplementary_TableS1.xls.

While numerous programs are available for the analysis and optimization of codon usage [e.g. GCUA (McInerney, 1998), OPTIMIZER (Puigbo et al., 2007), ACUA (Vetrivel et al., 2007), E-CAI (Puigbo et al., 2008), CAI Analyser (Ramazzotti et al., 2007), CodonW (http://codonw.sourceforge.net/culong.html) and Countcodon (Nakamura et al., 2000; http://www.kazusa.or.jp/codon/countcodon.html)], little has been published on initiation codon usage in bacteria. Surprisingly, the only paper to address this topic specifically was published well before the genomic era (Rudd & Schneider, 1992). From that paper, one is left with the impression that, at least in the case of Escherichia coli, ATG (AUG) is the predominant initiation codon (92.0 % of characterized genes), while GTG (GUG) functions in 6.7 % of translational starts and TTG (UUG) accounts for only 1.2 %.

We have reexamined the status of initiation codon usage using data from 620 bacterial chromosomes (GenBank data to February 14, 2008). Data on bacterial genomes were downloaded from the GenBank ftp site at the National Center for Biotechnology Information (NCBI; ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/). Information on the microbial phylum and genome mass was extracted from the genomic flat file (*.gbk), and on mol%G+C from the FASTA nucleotide (*.fna) files. Initiation codon usage was determined using Inidon (Initiation Codon; http://molbiol-tools.ca/Inidon/) software. This program was written as a Java applet and is easily accessible via a Java-enabled web browser. This tool accepts FASTA-formatted gene files (*.ffn – nucleotide coding regions). Inidon goes through each gene in these files and records the number of occurrences for every initiation codon encountered. The program reports the total number of genes in the input file, the number of occurrences and the frequency of each encountered initiation codon in decreasing order. The results were then exported into MS Excel (the complete database can be viewed as Supplementary Table S1).
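The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Inidon source (which was written as a Java applet); the function names `read_fasta`, `count_initiation_codons` and `gc_percent` are hypothetical. It assumes that each FASTA record in a *.ffn file is one annotated coding region whose first three nucleotides are its initiation codon, and that mol%G+C is computed over unambiguous A, C, G and T bases only.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_initiation_codons(lines: Iterable[str]):
    """Return (total genes, [(codon, count, percent), ...]).

    The table is sorted by decreasing count. Assumption: the first
    three nucleotides of each record are the annotated start codon.
    """
    counts: Counter = Counter()
    total = 0
    for _header, seq in read_fasta(lines):
        if len(seq) < 3:
            continue  # too short to contain a codon; skipped
        total += 1
        counts[seq[:3]] += 1
    table = [(codon, n, 100.0 * n / total) for codon, n in counts.most_common()]
    return total, table


def gc_percent(lines: Iterable[str]) -> float:
    """Return mol%G+C over all records, ignoring ambiguous bases."""
    gc = acgt = 0
    for _header, seq in read_fasta(lines):
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0
```

Both functions accept any iterable of lines, so an open file handle for a *.ffn or *.fna file can be passed directly. Because the codon is read straight from the annotated start of each record, the resulting percentages describe the annotation rather than experimentally verified translational starts.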

Although the total data show considerable scatter, Lowess spline analysis (GraphPad Software) suggests that the overall base composition of the host genome has little impact on initiation codon utilization between 30 and 65 mol%G+C (Fig. 1).



 
Fig. 1. Utilization of ATG, GTG and TTG initiation codons in 620 bacterial chromosomes as a function of their mol%G+C content.

 
Bacterial genomes with extremely low G+C contents tended to utilize ATG exclusively, while at high mol%G+C the utilization of ATG and TTG decreased and that of GTG increased. The mean utilization of these three codons was 80.1 % for ATG, 11.6 % for GTG and 7.8 % for TTG, values considerably different from those reported by Rudd & Schneider (1992).

In most cases where similar species or strains were analysed, the utilization percentages of the three initiation codons were remarkably similar. In certain cases (see Table 1) the results exhibited significant differences.



 
Table 1. Closely related genomes with significantly differing percentages of ATG initiation codon utilization

CDS, coding sequence.

 
In part these differences may reflect how coding regions were recognized, since many of these pairs also differed significantly in the total number of genes. They were not due to the massive presence of pseudogenes, which only had an impact on ‘initiation codon’ percentages in Mycobacterium leprae and Rickettsia massiliae (see Supplementary Table S1, column M).

In addition, in the case of Mycoplasma gallisepticum, ATT and ATC contributed significantly to initiation codon utilization, yet apparently are not utilized in related species. Examination of the data further suggests that the major discrepancies are due to over- or underestimation of the role of TTG codons and, to a lesser extent, of GTG codons. These results clearly indicate that the identification of initiation codons in bacterial genes is still far from precise, and therefore warrants new software development plus the reexamination, and possible third-party correction, of existing GenBank genomic data.

Edited by: C. J. Dorman


    REFERENCES
McInerney, J. O. (1998). GCUA: general codon usage analysis. Bioinformatics 14, 372–373.

Nakamura, Y., Gojobori, T. & Ikemura, T. (2000). Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28, 292.

Puigbo, P., Guzman, E., Romeu, A. & Garcia-Vallve, S. (2007). OPTIMIZER: a web server for optimizing the codon usage of DNA sequences. Nucleic Acids Res 35, W126–W131.

Puigbo, P., Bravo, I. G. & Garcia-Vallve, S. (2008). E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI). BMC Bioinformatics 9, 65.

Ramazzotti, M., Brilli, M., Fani, R., Manao, G. & Degl'innocenti, D. (2007). The CAI Analyser Package: inferring gene expressivity from raw genomic data. In Silico Biol 7, 507–526.

Rudd, K. E. & Schneider, T. D. (1992). Compilation of E. coli ribosome binding sites. In A Short Course in Bacterial Genetics, pp. 17.19–17.45. Edited by J. H. Miller. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.

Vetrivel, U., Arunkumar, V. & Dorairaj, S. (2007). ACUA: a software tool for automated codon usage analysis. Bioinformation 2, 62–63.

Received 13 June 2008; revised 27 June 2008; accepted 1 July 2008.


