query {seqinr} | R Documentation |
This is a major command of the package. It executes all sequence retrievals using any selection criteria the database allows. The sequences come from an ACNUC database located on the web and are transferred over a socket. The command produces the list of all sequence names that fit the required criteria. The sequence names belong to the class SeqAcnucWeb.
query(listname, query, socket = "auto", invisible = TRUE, verbose = FALSE, virtual = FALSE)
listname |
The name of the list as a quoted string of chars |
query |
A quoted string of chars containing the request with the syntax given in the details section |
socket |
a socket of class connection and sockconn returned by choosebank. The default value ("auto") means that the socket is set to the socket component of the banknameSocket variable. |
invisible |
if FALSE, the result is returned visibly. |
verbose |
if TRUE, verbose mode is on. |
virtual |
if TRUE, no attempt is made to retrieve the information about all the elements of the list. In this case, the req component of the list is set to NA. |
Selection criteria are combined using the following syntax:
criterion1 ET criterion2 : logical AND (sequences that fit criteria 1 and 2 simultaneously).
criterion1 OU criterion2 : logical OR (sequences that fit at least one of the two criteria).
NO criterion1 : logical negation (sequences that do not fit criterion 1).
Parentheses can be used to delimit the range of operations. Lists of sequences can be reused at will, which is very convenient for breaking complex requests into simple ones. For instance, here are two equivalent ways to get all coding sequences from Escherichia coli that are not partial:
# First way: a single request
choosebank("genbank")
query("final", "sp=escherichia coli ET t=cds ET NO k=partial")

# Second way: the same request split into reusable intermediate lists
choosebank("genbank")
query("eco", "sp=escherichia coli")
query("ecocds", "eco ET t=cds")
query("final", "ecocds ET NO k=partial")
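As a rough sketch of how such requests could be assembled programmatically, the helpers below build request strings from the ET, OU and NO operators described above. The function names (q_and, q_or, q_not) are hypothetical and not part of seqinr; they only paste strings together and never contact the server.

```r
# Hypothetical helpers (not part of seqinr): they only build request strings.
q_and <- function(...) paste(c(...), collapse = " ET ")    # logical AND
q_or <- function(...) {
  # logical OR, wrapped in parentheses to delimit the operation
  paste0("(", paste(c(...), collapse = " OU "), ")")
}
q_not <- function(criterion) paste("NO", criterion)        # logical negation

# Rebuild the single-request example: non-partial E. coli coding sequences.
request <- q_and("sp=escherichia coli", "t=cds", q_not("k=partial"))
print(request)

# The string could then be sent to the server (needs an internet connection):
# choosebank("genbank")
# query("final", request)
```

Building the string separately makes it easy to inspect a request before sending it, since the server call itself cannot be checked offline.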
Document | Format | Example |
Journal article | journal_code/volume/1st_page | jme/34/17 |
Book | book/year/1st_author | book/1980/broker |
Thesis | thesis/year/1st_author | thesis/1984/wildgruber |
Patent | patent/patent_coded_number | patent/ep0238993 |
Unpublished, or submitted | unpubl/year/1st_author | unpubl/1993/cho |
ID | Locus entry | (EMBL, SWISS-PROT, NRSub) |
LOCUS | Locus entry | (GenBank, Hovergen, EMGLib) |
CDS | .PE protein coding region | (all) |
RRNA | .RR mature ribosomal RNA | (all) |
TRNA | .TR mature transfer RNA | (all) |
MISC_RNA | .RN other structural RNA coding region | (EMBL, GenBank, Hovergen, NRSub, EMGLib) |
SNRNA | .SN small nuclear RNA | (EMBL, GenBank, Hovergen, EMGLib) |
SCRNA | .SC small cytoplasmic RNA | (EMBL, GenBank, Hovergen, NRSub, EMGLib) |
3'INT | .3I 3' intron | (Hovergen) |
3'NCR | .3F 3' non-coding region | (Hovergen) |
5'INT | .5I 5' intron | (Hovergen) |
5'NCR | .5F 5' non-coding region | (Hovergen) |
CPG | .CG CpGobs/CpGexp>0.5 | (Hovergen) |
INT_INT | .IN internal intron | (Hovergen) |
Each entry of a FEATURE TABLE describing a coding region of a DNA fragment gives rise to a subsequence equal to the fragments described in the location of the feature. The type of the resulting subsequence equals the key of the corresponding feature table entry. The name of the resulting subsequence is built by adding to the parent sequence's name an extension uniquely identifying this particular feature.
Sequences of a given type are generally subsequences, i.e., fragments of parent sequences, unless the coding region covers the whole parent sequence, in which case ACNUC does not create a subsequence.
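To illustrate the naming rule, the sketch below splits a subsequence name into the parent name and the feature extension. It assumes, for illustration only, that the extension follows the last dot in the name and consists of a two-character code from the table above followed by a number, as in "ECTRGA.PE1". Both that example name and the helper split_subseq_name are assumptions, not something defined by seqinr.

```r
# Hypothetical helper (not part of seqinr). Assumes names of the form
# <parent>.<two-character code><number>, e.g. "ECTRGA.PE1".
split_subseq_name <- function(name) {
  pattern <- "^(.*)\\.([A-Z0-9]{2})([0-9]+)$"
  if (!grepl(pattern, name)) {
    # No recognisable extension: treat the name as a parent sequence.
    return(list(parent = name, code = NA_character_, rank = NA_integer_))
  }
  list(
    parent = sub(pattern, "\\1", name),              # name of the parent sequence
    code   = sub(pattern, "\\2", name),              # e.g. PE for a protein coding region
    rank   = as.integer(sub(pattern, "\\3", name))   # number that makes the name unique
  )
}

str(split_subseq_name("ECTRGA.PE1"))
str(split_subseq_name("ECTRGA"))
```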
CHLOROPLAST | Chloroplast genome | (EMBL, GenBank, NBRF, Hovergen) |
MITOCHONDRION | Mitochondrial genome | (EMBL, GenBank, NBRF, Hovergen) |
KINETOPLAST | Kinetoplast genome | (EMBL, GenBank, Hovergen) |
NUCLEAR | Nuclear genome | (all) |
DNA | Sequenced molecule is DNA | (all) |
RNA | Sequenced molecule is RNA | (all) |
MRNA | Sequenced molecule is mRNA | (GenBank, Hovergen) |
RRNA | Sequenced molecule is rRNA | (GenBank, Hovergen) |
TRNA | Sequenced molecule is tRNA | (GenBank, Hovergen) |
URNA | Sequenced molecule is snRNA | (GenBank, Hovergen) |
Use crelistfromclientdata with type = "SQ" for this purpose.
Use crelistfromclientdata with type = "AC" for this purpose.
A list with the following components:
bank |
the name of the bank that has been chosen by choosebank.socket |
call |
original call |
name |
list name |
nelem |
number of elements in the list on the server |
typelist |
the type of the elements of the list. Could be SQ for a list of sequence names, KW for a list of keywords, SP for a list of species names. |
req |
a list of sequence names that fit the required criteria, or NA when called with virtual = TRUE |
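The components listed above can be inspected like those of any R list. The sketch below uses hand-built mock objects with the same component names rather than a real server reply, so all values are invented for illustration (in particular, the real req component holds SeqAcnucWeb objects, not plain strings). The helper describe_list is hypothetical and not a seqinr function.

```r
# Mock objects with the documented component names; all values are made up.
mock_full <- list(
  bank = "genbank",
  call = quote(query("bb", "sp=Borrelia burgdorferi")),
  name = "bb", nelem = 2L, typelist = "SQ",
  req = list("SEQ1", "SEQ2")   # plain strings standing in for sequence names
)
mock_virtual <- mock_full
mock_virtual$req <- NA         # what is documented for virtual = TRUE

describe_list <- function(x) {
  # req is a single NA when the list was created with virtual = TRUE
  is_virtual <- length(x$req) == 1 && isTRUE(is.na(x$req[[1]]))
  sprintf("list '%s' from bank '%s': %d element(s) of type %s%s",
          x$name, x$bank, x$nelem, x$typelist,
          if (is_virtual) " (virtual, names not retrieved)" else "")
}

cat(describe_list(mock_full), "\n")
cat(describe_list(mock_virtual), "\n")
```

Checking req before indexing into it avoids errors when a list was created with virtual = TRUE, where nelem is still available but the names are not.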
Most of the documentation was imported from ACNUC help files written by Manolo Gouy.
J.R. Lobry & D. Charif
To get the release date and content of all the databases located at PBIL, see the following URL: http://pbil.univ-lyon1.fr/search/releases.php
Gouy, M., Milleret, F., Mugnier, C., Jacobzone, M., Gautier, C. (1984) ACNUC: a nucleic acid sequence data base and analysis system. Nucl. Acids Res., 12:121-127.
Gouy, M., Gautier, C., Attimonelli, M., Lanave, C., Di Paola, G. (1985) ACNUC - a portable retrieval system for nucleic acid sequence databases: logical and physical designs and usage. Comput. Appl. Biosci., 3:167-172.
Gouy, M., Gautier, C., Milleret, F. (1985) System analysis and nucleic acid sequence banks. Biochimie, 67:433-436.
citation("seqinr")
choosebank, getSequence, getName, crelistfromclientdata
## Not run:
# Need internet connection
choosebank("genbank")
query("bb", "sp=Borrelia burgdorferi")
# To get the names of the 4 first sequences:
sapply(bb$req[1:4], getName)
# To get the 4 first sequences:
sapply(bb$req[1:4], getSequence, as.string = TRUE)
## End(Not run)