get.ncbi {seqinr}    R Documentation

Complete bacterial genome data from the NCBI FTP site

Description

Tries to connect to the NCBI FTP site and retrieve a list of complete bacterial genomes.

Usage

get.ncbi(repository = "ftp://ftp.ncbi.nih.gov/genomes/Bacteria/")

Arguments

repository Where to look for data. The default value is the location of the complete bacterial genome sequences in the NCBI FTP repository.

Value

Returns a data frame which contains the following columns:

species The species name as given by the corresponding folder name in the repository (e.g. Yersinia_pestis_KIM).
accession The accession number as given by the common prefix of file names in the repository (e.g. NC_004088).
size.bp The size of the sequence in bp (e.g. 4600755).
type A factor with two levels (plasmid or chromosome), tentatively deduced from the description of the sequence.
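
As an illustration of working with a result of this shape, here is a minimal sketch that runs without any network access. It builds a mock data frame with the four documented columns. Only the first row reuses the example values quoted above, and its type (chromosome) is an assumption; the second row is entirely invented for the illustration.

```r
# Mock data frame mimicking the documented columns of the returned value.
# Row 1 reuses the example values from this page (type is assumed);
# row 2 is invented and does not correspond to a real sequence.
bacteria <- data.frame(
  species   = c("Yersinia_pestis_KIM", "Made_up_species"),
  accession = c("NC_004088", "NC_000000"),
  size.bp   = c(4600755, 100000),
  type      = factor(c("chromosome", "plasmid"),
                     levels = c("plasmid", "chromosome")),
  stringsAsFactors = FALSE
)

# Keep chromosomes only, then total the sequence length per species.
chromosomes <- bacteria[bacteria$type == "chromosome", ]
total.bp <- tapply(chromosomes$size.bp, chromosomes$species, sum)
```

The same subsetting and tapply() calls should apply unchanged to a data frame returned by get.ncbi(), provided the column names match those listed above.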

WARNING

This function depends heavily on the conventions of the NCBI FTP site, over which we have no control. The FTP connection apparently does not work through a proxy; this problem is circumvented here in a rather crude way.
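
Because the call may fail for reasons outside the package's control, it can be wrapped defensively. The sketch below assumes that a failed connection surfaces as an ordinary R error, which this page does not state; safe.fetch is a hypothetical helper and not part of seqinr.

```r
# Hypothetical helper (not part of seqinr): call a fetching function and
# return NULL with a message, instead of stopping, if it raises an error.
safe.fetch <- function(fetch, ...) {
  tryCatch(
    fetch(...),
    error = function(e) {
      message("Could not retrieve the genome list: ", conditionMessage(e))
      NULL
    }
  )
}

# Intended use, assuming seqinr is loaded and the FTP site is reachable:
# bacteria <- safe.fetch(get.ncbi)
```

A NULL result then signals that the list could not be retrieved, so calling code can test it with is.null() before proceeding.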

Author(s)

J.R. Lobry

References

  citation("seqinr")

Examples

## Not run: bacteria <- get.ncbi()
## Not run: summary(bacteria)

[Package seqinr version 1.0-4 Index]