1 Public Health Agency of Canada, Laboratory for Foodborne Zoonoses, Guelph, ON N1G 3W4, Canada
2 Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON N1G 2A3, Canada
Correspondence
Andrew M. Kropinski
Andrew_Kropinski{at}phac-aspc.gc.ca
| ABSTRACT |
|---|
While numerous programs are available for the analysis and optimization of codon usage [e.g. GCUA (McInerney, 1998), OPTIMIZER (Puigbo et al., 2007), ACUA (Vetrivel et al., 2007), E-CAI (Puigbo et al., 2008), CAI Analyser (Ramazzotti et al., 2007), CodonW (http://codonw.sourceforge.net/culong.html) and Countcodon (Nakamura et al., 2000; http://www.kazusa.or.jp/codon/countcodon.html)], little has been published on initiation codon usage in bacteria. Surprisingly, the only paper to address this topic specifically was published well before the genomic era (Rudd & Schneider, 1992). From that paper, one is left with the impression that, at least in the case of Escherichia coli, ATG (AUG) is the predominant initiation codon (92.0 % of characterized genes), while GTG (GUG) functions in 6.7 % of translational starts and TTG (UUG) accounts for only 1.2 %.
We have reexamined the status of initiation codon usage using data from 620 bacterial chromosomes (GenBank data to February 14, 2008). Data on bacterial genomes were downloaded from the GenBank ftp site at the National Center for Biotechnology Information (NCBI; ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/). Information on the microbial phylum and genome mass was extracted from the genomic flat file (*.gbk), and on mol%G+C from the FASTA nucleotide (*.fna) files. Initiation codon usage was determined using Inidon (Initiation Codon; http://molbiol-tools.ca/Inidon/) software. This program was written as a Java applet and is easily accessible via a Java-enabled web browser. This tool accepts FASTA-formatted gene files (*.ffn – nucleotide coding regions). Inidon goes through each gene in these files and records the number of occurrences for every initiation codon encountered. The program reports the total number of genes in the input file, the number of occurrences and the frequency of each encountered initiation codon in decreasing order. The results were then exported into MS Excel (the complete database can be viewed as Supplementary Table S1).
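The counting step described above can be illustrated with a minimal Python sketch. This is not the Inidon source code, which is a Java applet; the function names `read_fasta` and `initiation_codon_usage` are hypothetical, and the sketch assumes that each record in a *.ffn file is a coding region whose first three nucleotides are the annotated initiation codon.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def initiation_codon_usage(handle):
    """Return (gene_count, rows) for a FASTA stream of coding regions.

    Each row is (codon, occurrences, percent), sorted by decreasing
    occurrences. Assumption: the first three nucleotides of every
    record are the annotated initiation codon; records shorter than
    three nucleotides are skipped.
    """
    counts = Counter()
    for _, seq in read_fasta(handle):
        if len(seq) >= 3:
            counts[seq[:3].upper()] += 1
    total = sum(counts.values())
    rows = [(codon, n, 100.0 * n / total) for codon, n in counts.most_common()]
    return total, rows
```

To try it on a real file, one might call `initiation_codon_usage(open("genome.ffn"))`, where `genome.ffn` is a placeholder file name, and print each row as codon, count and percentage. Passing an `io.StringIO` object containing FASTA text works equally well for small tests.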
Although the data as a whole show considerable scatter (Fig. 1), Lowess spline analysis (GraphPad Software) suggests that the overall base composition of the host genome has little impact on initiation codon utilization between 30 and 65 mol%G+C.
In most cases where similar species or strains were analysed, the utilization percentages of the three initiation codons were remarkably similar. In certain cases, however, the results differed significantly (see Table 1).
In addition, in the case of Mycoplasma gallisepticum, ATT and ATC contributed significantly to initiation codon utilization, yet they are apparently not utilized in related species. Examination of the data further suggests that the major discrepancies are due to over- or underestimation of the role of TTG codons and, to a lesser extent, of GTG codons. These results clearly indicate that the identification of initiation codons in bacterial genes is still far from precise, and that it warrants new software development as well as the reexamination, and possible third-party correction, of existing GenBank genomic data.
Edited by: C. J. Dorman
| REFERENCES |
|---|
Nakamura, Y., Gojobori, T. & Ikemura, T. (2000). Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28, 292.
Puigbo, P., Guzman, E., Romeu, A. & Garcia-Vallve, S. (2007). OPTIMIZER: a web server for optimizing the codon usage of DNA sequences. Nucleic Acids Res 35, W126–W131.
Puigbo, P., Bravo, I. G. & Garcia-Vallve, S. (2008). E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI). BMC Bioinformatics 9, 65.
Ramazzotti, M., Brilli, M., Fani, R., Manao, G. & Degl'innocenti, D. (2007). The CAI Analyser Package: inferring gene expressivity from raw genomic data. In Silico Biol 7, 507–526.
Rudd, K. E. & Schneider, T. D. (1992). Compilation of E. coli ribosome binding sites. In A Short Course in Bacterial Genetics, pp. 17.19–17.45. Edited by J. H. Miller. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Vetrivel, U., Arunkumar, V. & Dorairaj, S. (2007). ACUA: a software tool for automated codon usage analysis. Bioinformation 2, 62–63.
Received 13 June 2008; revised 27 June 2008; accepted 1 July 2008.