How do I download local BLAST for Windows from the NCBI FTP site?
Asked: 2022-04-10 03:34
1 answer
Answered: 2022-04-10 05:04
This document describes the "BLAST" databases available on the NCBI
FTP site under the /blast/db directory. The direct URL is:
ftp://ftp.ncbi.nih.gov/blast/db (Note: this is the download location for the local BLAST databases.)
1. General Introduction
NCBI BLAST home pages (http://www.ncbi.nih.gov/BLAST/) use a standard
set of BLAST databases for Nucleotide, Protein, and Translated BLAST
searches. These databases are made available in the /blast/db directory as
pre-formatted compressed archives (ftp://ftp.ncbi.nih.gov/blast/db/).
(Note: these databases have already been built with the makeblastdb command,
so they can be used directly after download.)
The FASTA databases reside under the /blast/db/FASTA directory.
The pre-formatted databases offer the following advantages:
* The pre-formatted databases are smaller in size and therefore are
faster to download;
* Sequences in FASTA format can be generated from the pre-formatted
databases by the fastacmd utility (note: FASTA files can be exported from
these database files);
* A convenient script (update_blastdb.pl) is available to download
the pre-formatted databases from the NCBI ftp site (note: the same script
can be used to update the databases);
* Pre-formatting removes the need to run formatdb (note: the
database-building command no longer has to be run);
* Taxonomy ids are available for each database entry.
Pre-formatted databases must be downloaded using the update_blastdb.pl
script or via FTP in binary mode. Documentation for the update_blastdb.pl
script can be obtained by running the script without any arguments (perl is
required). (Note: to download the databases, use either the perl script
update_blastdb.pl or an FTP download tool.)
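For readers who prefer a scripted FTP transfer, here is a minimal Python sketch of a binary-mode download using the standard ftplib module. It is not the update_blastdb.pl script and does not reproduce its behavior. The host and directory come from the URL above; the function names download_binary and fetch_archive are made up for this illustration.

```python
import os
from ftplib import FTP

HOST = "ftp.ncbi.nih.gov"   # host taken from the URL quoted in the README
REMOTE_DIR = "/blast/db"    # directory that holds the pre-formatted archives


def download_binary(ftp, remote_name, local_dir="."):
    """Fetch one file over an open FTP connection and return the local path.

    retrbinary() performs the transfer in binary mode, which is what the
    README requires for the pre-formatted archives.
    """
    local_path = os.path.join(local_dir, remote_name)
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_name, out.write)
    return local_path


def fetch_archive(remote_name, local_dir="."):
    """Log in anonymously, change to /blast/db, and download one archive."""
    with FTP(HOST) as ftp:
        ftp.login()          # anonymous login
        ftp.cwd(REMOTE_DIR)
        return download_binary(ftp, remote_name, local_dir)
```

Calling fetch_archive with an archive name taken from the directory listing would save that file into the current directory. Nothing is downloaded until the function is called.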
The downloaded compressed files must be inflated with gzip or another
decompression tool. The BLAST database files can then be extracted from the
resulting tar file using the tar program on Unix/Linux, or WinZip and StuffIt
Expander on Windows and Macintosh platforms, respectively. (Note: the
downloaded databases are compressed archives and must be unpacked before use.)
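If none of those tools is at hand, the same two steps (inflate, then untar) can be sketched with Python's standard tarfile module, which handles both in one pass. This is only an illustration; the function name extract_blast_archive is made up here, and the archive path is whatever file you downloaded.

```python
import tarfile


def extract_blast_archive(archive_path, dest_dir="."):
    """Inflate a .tar.gz archive, unpack it into dest_dir, and return
    the names of the files it contained."""
    # Mode "r:gz" reads the tar stream through gzip decompression.
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names
```

The returned list is a quick way to see which database files and alias files (.nal or .pal) the archive contained.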
Large databases are formatted in multiple 1-gigabyte volumes, which
are named using the database.##.tar.gz convention. All relevant volumes
are required. An alias file is provided so that the database can be called
using the alias name without the extension (.nal or .pal). For example,
to call the est database, simply use the "-d est" option on the command line
(without the quotes). (Note: large databases are usually split into several
archives; for example, the nr database has 11. All of the related archives
must be downloaded and extracted. Extraction produces the corresponding
database files along with an nr.pal file. To search the nr database, simply
pass -d nr.)
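Because every volume is required, it can help to check a download folder for gaps before extracting. The sketch below parses the database.##.tar.gz naming convention and reports missing volume numbers. It assumes that volumes are numbered consecutively starting at 00, which the README does not state explicitly. The function name missing_volumes is made up for this illustration.

```python
import re

# Matches names such as "nr.00.tar.gz": database name, volume number, suffix.
VOLUME_RE = re.compile(r"^(?P<db>.+)\.(?P<vol>\d+)\.tar\.gz$")


def missing_volumes(filenames, db):
    """Return the volume numbers between 0 and the highest one seen that
    are absent from filenames for the given database.

    Assumption: volumes are numbered consecutively from 00.
    """
    found = set()
    for name in filenames:
        match = VOLUME_RE.match(name)
        if match and match.group("db") == db:
            found.add(int(match.group("vol")))
    if not found:
        return []
    return [n for n in range(max(found) + 1) if n not in found]
```

This check only detects gaps below the highest volume present. It cannot tell whether later volumes exist on the server, so compare against the FTP directory listing as well.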
Certain databases are subsets of a larger parental database. For those
databases, alias and mask files, rather than actual databases, are provided.
The mask file needs the parent database to function properly. The parent
databases should be generated on the same day as the mask file. For
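example, to use the pre-formatted swissprot database, swissprot.tar.gz, one
will need to get the nr.tar.gz with the same date stamp. (Note: some databases
are subsets of a larger database; to use such a subset, the parent database
with the same date must be downloaded as well.)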
Additional BLAST databases that are not provided in pre-formatted
formats are available in the FASTA subdirectory. (Note: some BLAST databases
have no pre-built files; these can be downloaded from the FASTA folder.)
For genomic BLAST databases, please check the genomes ftp directory at:
ftp://ftp.ncbi.nih.gov/genomes/ (Note: genomic BLAST databases are downloaded here.)
2. Contents of the /blast/db/ directory
The pre-formatted BLAST databases are archived in this directory. The
names of these databases and their contents are listed below.
+----------------------+-----------------------------------------------+
|File Name             | Content Description                           |
+----------------------+-----------------------------------------------+
/FASTA                 | subdirectory for FASTA formatted sequences
README                 | README for this subdirectory (this file)
env_nr.*tar.gz         | Environmental protein sequences
env_nt.*tar.gz         | Environmental nucleotide sequences
est.*tar.gz            | volumes of the formatted est database
                       | from the EST division of GenBank, EMBL,
                       | and DDBJ.