NAPP in a nutshell
NAPP (Nucleic Acids Phylogenetic Profiling [1]) is a clustering method that efficiently identifies noncoding RNA (ncRNA) elements in a bacterial genome. In short, the intergenic regions of a reference genome are tiled into overlapping 50-nt segments, and all tiles and coding sequences are then classified according to their occurrence profiles across 1000 other genomes. Tiles corresponding to actual ncRNAs tend to cluster together, along with certain types of protein-coding genes. We term these "RNA-rich clusters". Any non-annotated tile in such a cluster can be considered a strong ncRNA candidate (sRNA, cis-acting RNA or other ncRNA). Furthermore, certain clusters are enriched for genes in specific functional classes, which makes it possible to formulate hypotheses about the function of the associated ncRNAs [2].
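The tiling step can be illustrated with a minimal sketch. This is not the NAPP implementation: the function name `tile_region`, the 25-nt step between tile starts, and the choice to drop a trailing fragment shorter than one tile are all assumptions made for illustration. Only the 50-nt tile length comes from the description above.

```python
def tile_region(start, end, tile_len=50, step=25):
    """Cut the intergenic region [start, end] into overlapping tiles.

    Coordinates are 1-based and inclusive. tile_len=50 follows the text;
    step=25 is an assumed overlap, not a documented NAPP parameter.
    """
    tiles = []
    pos = start
    # Keep only full-length tiles; a shorter trailing fragment is dropped
    # (an illustrative choice, not necessarily what NAPP does).
    while pos + tile_len - 1 <= end:
        tiles.append((pos, pos + tile_len - 1))
        pos += step
    return tiles


if __name__ == "__main__":
    # A hypothetical 120-nt intergenic region.
    for tile in tile_region(1, 120):
        print(tile)
```

Each resulting tile would then be searched against the other genomes to build its occurrence profile, which is the input to the classification step.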
This web server enables users to retrieve RNA-rich clusters for any genome in a list of 1000+ sequenced bacterial genomes. RNA-rich clusters can be viewed individually. Alternatively, all tiles from RNA-rich clusters can be merged into larger contiguous elements and retrieved at once as a CSV or GFF file, for use in a genome browser or for comparison with other predictions or RNA-seq experiments.
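As a rough sketch of what merging tiles into larger elements might look like, the snippet below joins overlapping or adjacent tiles and formats the result as nine-column, tab-separated GFF lines. The function names, the `NAPP` source label, the `ncRNA` feature type and the `candidate_N` identifiers are hypothetical; the files actually produced by the server may use different columns and attributes.

```python
def merge_tiles(tiles):
    """Merge overlapping or directly adjacent (start, end) tiles.

    Coordinates are 1-based and inclusive.
    """
    merged = []
    for start, end in sorted(tiles):
        if merged and start <= merged[-1][1] + 1:
            # Tile overlaps or touches the previous element: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def to_gff_lines(seqid, elements, source="NAPP", feature="ncRNA"):
    """Format merged elements as tab-separated GFF lines.

    Score, strand and phase are left as '.' because this sketch has no
    information about them; source and feature are assumed labels.
    """
    lines = []
    for i, (start, end) in enumerate(elements, 1):
        fields = [seqid, source, feature, str(start), str(end),
                  ".", ".", ".", f"ID=candidate_{i}"]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    tiles = [(26, 75), (1, 50), (200, 249)]  # hypothetical tile hits
    for line in to_gff_lines("genome1", merge_tiles(tiles)):
        print(line)
```

The merged intervals can be compared directly with coordinates from other predictions or RNA-seq experiments.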
Brief documentation for this interface is available here.
References
[1] Ott A, Idali A, Marchais A & Gautheret D. (2011) NAPP: the nucleic acid phylogenetic profile database. Nucl. Acids Res. [Epub ahead of print]
[2] Marchais A, Naville M, Bohn C, Bouloc P & Gautheret D. (2009) Single-Pass Classification of all Non-Coding Sequences in a Bacterial Genome Using Phylogenetic Profiles. Genome Res. 19:1084-92.
[3] Marchais A, Duperrier S, Durand S, Gautheret D & Stragier P. (2011) CsfG, a sporulation-specific, small non-coding RNA highly conserved in endospore formers. RNA Biol. 8(3).
Contact: napp.biologie@u-psud.fr