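Biological Databases and Online Tools
Biological Databases
Biological databases are life-science information resources built from scientific experiments, published literature, high-throughput experimental technologies, and computational analysis. They hold core data and analysis results from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics.
Biological databases fall into three main classes: sequence, structure, and functional databases. Sequence databases store nucleic acid and protein sequences. Structure databases store the three-dimensional structures of RNA molecules and proteins. Functional databases describe the physiological roles of gene products, for example the steps of an enzyme-catalyzed reaction, the effect of a mutation on the function of the gene product, and gene regulatory pathways.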
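Database Types
Biological databases are commonly divided into two types: primary and secondary. A primary database stores data obtained directly from experiments. A secondary database builds on information from other databases (such as primary ones) and processes or analyzes it to produce derived results for a specific purpose.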
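The relationship between the two types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record classes, the accession strings, and the choice of GC content as the derived value are all hypothetical and do not come from any real database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryRecord:
    """An experimentally obtained entry, as a primary database might store it."""
    accession: str  # hypothetical identifier, not a real accession
    sequence: str   # nucleotide sequence as submitted


@dataclass(frozen=True)
class SecondaryRecord:
    """A derived entry, computed from a primary record."""
    accession: str
    length: int
    gc_content: float  # fraction of G and C bases, between 0.0 and 1.0


def derive(record: PrimaryRecord) -> SecondaryRecord:
    """Process one primary record into a derived (secondary) record."""
    seq = record.sequence.upper()
    gc_count = sum(1 for base in seq if base in "GC")
    gc_fraction = gc_count / len(seq) if seq else 0.0
    return SecondaryRecord(record.accession, len(seq), gc_fraction)


# Toy "primary database": made-up sequences for illustration only.
primary_db = [
    PrimaryRecord("DEMO0001", "ATGCGC"),
    PrimaryRecord("DEMO0002", "ATATAT"),
]

# Toy "secondary database": derived values keyed by accession.
secondary_db = {rec.accession: derive(rec) for rec in primary_db}
```

The point of the sketch is the direction of the dependency: the secondary entries hold no new experimental data, only values computed from the primary entries.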
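Finding Databases
A key source for finding biological databases is the annual Database Issue of Nucleic Acids Research (NAR), which catalogs online databases relevant to biology and bioinformatics. In 2018 the collection listed 1737 databases.
NAR groups these databases into 15 categories, which helps researchers locate and integrate data across disciplines.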
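Online Tools
Besides cataloging biological databases, NAR also publishes an annual Web Server issue covering web resources for the analysis and visualization of molecular biology data.
Table 1. Web resources, 2017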
| Web server name | URL | Brief description | 
|---|---|---|
| agriGO v2 | http://systemsbiology.cau.edu.cn/agriGOv2/ | GO analysis for agricultural species | 
| AMMOS2 | http://drugmod.rpbs.univ-paris-diderot.fr/ammosHome.php | Energy minimization of protein–ligand complexes | 
| antiSMASH | http://antismash.secondarymetabolites.org/ | Secondary metabolite biosynthetic gene cluster mining in bacterial and fungal genomes | 
| ARTS | http://arts.ziemertlab.com | Biosynthetic gene cluster mining for novel antibiotics | 
| BAR 3.0 | http://bar.biocomp.unibo.it/bar3 | Protein structure and function annotation | 
| BepiPred-2.0 | http://www.cbs.dtu.dk/services/BepiPred-2.0/ | B-cell epitope prediction from a protein sequence | 
| BioAtlas | http://bioatlas.compbio.sdu.dk | Visualization of microbiome and metagenome locations | 
| BIS2Analyzer | http://www.lcqb.upmc.fr/BIS2Analyzer/ | Analysis of coevolving amino-acid pairs in protein sequences | 
| BusyBee | https://ccb-microbe.cs.uni-saarland.de/busybee | Metagenome binning | 
| CAFE | https://github.com/younglululu/CAFE | Stand-alone program for alignment-free comparison of metagenome data | 
| Cancer PanorOmics | http://panoromics.irbbarcelona.org | Mapping of cancer mutations to 3D protein–protein interaction sites | 
| COFACTOR | http://zhanglab.ccmb.med.umich.edu/COFACTOR/ | Structure-based protein function annotation | 
| compleXView | http://xvis.genzentrum.lmu.de/compleXView | Protein-protein interaction based on affinity purification mass spectrometry | 
| ConTra v3 | http://bioit2.irc.ugent.be/contra/v3 | Transcription factor binding sites analysis | 
| CPC2 | http://cpc2.cbi.pku.edu.cn | Protein coding potential of RNA transcripts | 
| CSPADE | http://cspade.fimm.fi/ | Chemoinformatics bioactivity assay visualization | 
| CSTEA | http://comp-sysbio.org/cstea/ | Analysis of time-series gene expression data on cell state transitions | 
| DEOGEN2 | http://deogen2.mutaframe.com/ | Prediction of deleterious mutations in proteins | 
| DNAproDB | http://dnaprodb.usc.edu | Structural analysis of DNA–protein complexes | 
| DSSR | http://jmol.x3dna.org | DNA and RNA structure visualization | 
| DynOmics | http://dyn.life.nthu.edu.tw/oENM/ | Protein molecular dynamics using elastic network models | 
| EBISearch | http://www.ebi.ac.uk/ebisearch | Web services text search in EMBL-EBI data | 
| FireProt | http://loschmidt.chemi.muni.cz/fireprot | Design of thermostable proteins | 
| GalaxyHomomer | http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=HOMOMER | Prediction of protein homo-oligomer structure | 
| GASS-WEB | http://gass.unifei.edu.br/ | Identification of enzyme active sites | 
| GeMSTONE | http://gemstone.yulab.org/ | Genetic variant prioritization in human disease | 
| Gene ORGANizer | http://geneorganizer.huji.ac.il | Linkage of human genes to their affected body organs | 
| GenProBiS | http://genprobis.insilab.org | Mapping of SNPs to protein binding sites | 
| GEPIA | http://gepia.cancer-pku.cn/ | Analysis of differential gene expression in cancer | 
| GeSeq | https://chlorobox.mpimp-golm.mpg.de/geseq.html | Annotation of chloroplast genomes | 
| GibbsCluster | http://www.cbs.dtu.dk/services/GibbsCluster-2.0 | Detection of protein short linear motifs | 
| GPCR-SSFE 2.0 | http://www.ssfa-7tmr.de/ssfe2/ | Homology modeling of G-protein coupled receptors | 
| GWAB | http://www.inetbio.org/gwab/ | Network-based genome wide association analysis | 
| HDOCK | http://hdock.phys.hust.edu.cn/ | Protein–protein and protein–DNA/RNA docking | 
| HGVA | http://bioinfodev.hpc.cam.ac.uk/web-apps/hgva | Archive of human genetic variant annotations | 
| HH-MOTiF | http://chimborazo.biochem.mpg.de/ | Detection of protein short linear motifs | 
| I-TASSER-MR | http://zhanglab.ccmb.med.umich.edu/I-TASSER-MR/ | Protein structure modeling for X-ray crystallography | 
| INTAA | http://bioinfo.uochb.cas.cz/INTAA/ | Analysis of amino acid interaction energies | 
| IntaRNA 2.0 | http://rna.informatik.uni-freiburg.de/IntaRNA/Input.jsp | Prediction of interactions between RNA molecules | 
| IslandViewer 4.0 | http://www.pathogenomics.sfu.ca/islandviewer4/ | Prediction of bacterial genomic islands (horizontal gene transfer) | 
| kpLogo | http://kplogo.wi.mit.edu/ | Detection and visualization of short sequence motifs | 
| LigParGen | http://jorgensenresearch.com/ligpargen | Force field parameters for molecular dynamics | 
| LimTox | http://limtox.bioinfo.cnio.es | Text mining for compound toxicity | 
| mCSM-NA | http://structure.bioc.cam.ac.uk/mcsm_na | Prediction of protein mutation effect on nucleic acid binding affinity | 
| MicrobiomeAnalyst | http://microbiomeanalyst.ca | Analysis of microbiome data | 
| MinePath | http://www.minepath.org | Differential expression analysis for regulatory network subpaths | 
| ModFOLD6 | http://www.reading.ac.uk/bioinf/ModFOLD/ | Protein structure quality assessment | 
| mTCTScan | http://jjwanglab.org/mTCTScan | Mutation prioritization for cancer drug response | 
| MutaGene | https://www.ncbi.nlm.nih.gov/projects/mutagene/ | Visualization and analysis of mutational profiles in cancer | 
| NNAlign-2.0 | http://www.cbs.dtu.dk/services/NNAlign-2.0 | Detection of ligand motifs for receptor–ligand interactions | 
| NOREVA | http://server.idrb.cqu.edu.cn/noreva/ | Evaluation of data normalization methods for mass spectrometry based metabolomics data | 
| Olelo | http://www.hpi.de/plattner/olelo | Text mining in PubMed | 
| OmicSeq | http://www.omicseq.org | Search for omics data in major repositories | 
| P4P | http://sing.ei.uvigo.es/p4p | Bacterial strain classification based on peptide datasets | 
| Pathview | http://pathview.uncc.edu/ | Visualization and annotation of metabolic pathways | 
| pepATTRACT | http://bioserv.rpbs.univ-paris-diderot.fr/services/pepATTRACT | Prediction of protein–peptide docking | 
| PharmMapper | http://lilab.ecust.edu.cn/pharmmapper | Drug target search using pharmacophore mapping | 
| PhD-SNPg | http://snps.biofold.org/phd-snpg | Deleterious SNP classification | 
| PIGSPro | http://cassandra.med.uniroma1.it/AbPrediction/web/pigs.php | Modeling of immunoglobulin variable domains | 
| plantiSMASH | http://plantismash.secondarymetabolites.org | Detection of biosynthetic gene clusters in plants | 
| PMut | http://mmb.irbbarcelona.org/PMut/ | Prediction of disease potential for protein mutations | 
| Prism3 | http://prism3.magarveylab.ca/prism | Prediction of natural product structures from biosynthetic gene clusters | 
| ProteinsAPI | http://www.ebi.ac.uk/proteins/api | Web service for protein data from UniProtKB | 
| ProteinsPlus | http://proteins.plus | Structure-based modeling of proteins | 
| ProteoSign | http://bioinformatics.med.uoc.gr/ProteoSign | Protein differential abundance analysis | 
| ReFOLD | http://www.reading.ac.uk/bioinf/ReFOLD/ | Protein structure refinement | 
| RegulatorTrail | https://regulatortrail.bioinf.uni-sb.de | Analysis of transcription factors and target genes | 
| RiPPMiner | http://www.nii.ac.in/rippminer.html | Prediction of chemical structures for ribosomally synthesized and post translationally modified peptides | 
| RNA workbench | https://github.com/bgruening/galaxy-rna-workbench | Stand-alone collection of tools for analyzing RNAseq and RNA sequence data | 
| RNA-MoIP | http://rnamoip.cs.mcgill.ca/ | Prediction of RNA 2D and 3D structure | 
| SBSPKSv2 | http://www.nii.ac.in/sbspks2.html | Analysis of polyketide synthases | 
| SCENERY | http://mensxmachina.org/en/software/ | Network reconstruction from cytometry data | 
| SDM | http://structure.bioc.cam.ac.uk/sdm2 | Prediction of stability in protein mutants | 
| SeMPI | http://www.pharmaceutical-bioinformatics.de/sempi/ | Prediction of polyketide synthase products from biosynthetic gene clusters | 
| SLiMSearch | http://slim.ucd.ie/slimsearch/ | Detection of protein short linear motifs | 
| SODA | http://protein.bio.unipd.it/soda/ | Prediction of solubility in protein mutants | 
| SpartaABC | http://spartaabc.tau.ac.il/webserver | Sequence simulation with indels | 
| ThreaDomEx | http://zhanglab.ccmb.med.umich.edu/ThreaDomEx | Prediction of protein domains and domain boundaries | 
| Tools at EMBL-EBI | http://www.ebi.ac.uk/Tools/webservices/ | Web service tools from EMBL-EBI | 
| TraitRateProp | http://traitrate.tau.ac.il/prop | Test of sequence evolution association with phenotype | 
| TRAPP | http://trapp.h-its.org | Analysis of protein binding site dynamics | 
| VCF.Filter | https://biomedical-sequencing.at/VCFFilter/ | Stand-alone program for filtering and annotating genetic variants in vcf files | 
| Web3DMol | http://web3dmol.duapp.com/ | Protein structure visualization | 
| WebGestalt | http://www.webgestalt.org | Gene set functional enrichment analysis | 
| WoPPER | http://WoPPER.ba.itb.cnr.it/ | Detection of bacterial genome regions with coordinated gene expression changes | 
| XSuLT | http://structure.bioc.cam.ac.uk/xsult | Annotation and visualization of protein multiple sequence alignment | 
Table 2. Web resources, 2018
| Web server name | URL | Brief description | 
|---|---|---|
| AAI-profiler | http://ekhidna2.biocenter.helsinki.fi/AAI | proteome average amino acid identity comparison | 
| AlloFinder | http://mdl.shsmu.edu.cn/ALF/ | allosteric modulator identification | 
| ArDock | http://ardock.ibcp.fr | protein–protein interaction region prediction | 
| BAGEL4 | http://bagel4.molgenrug.nl | secondary metabolite gene clusters (RIPPs, bacteriocins) | 
| BaMM | https://bammmotif.mpibpc.mpg.de | nucleotide binding motifs | 
| BeStSel | http://bestsel.elte.hu | circular dichroism spectroscopy based protein secondary structure analysis | 
| BRepertoire | http://mabra.biomed.kcl.ac.uk/BRepertoire | antibody repertoire analysis | 
| BUSCA | http://busca.biocomp.unibo.it | protein subcellular localization prediction | 
| CABS-flex 2.0 | http://biocomp.chem.uw.edu.pl/CABSflex2 | simulation of protein structure flexibility | 
| CalFitter | https://loschmidt.chemi.muni.cz/calfitter/ | protein thermal denaturation analysis | 
| CASTp 3.0 | http://sts.bioe.uic.edu/castp/ | topology of protein pockets, cavities and channels | 
| CavityPlus | http://www.pkumdl.cn/cavityplus | protein binding site cavities | 
| CellAtlasSearch | http://www.cellatlassearch.com | single cell gene expression data search | 
| cgDNAweb | http://cgDNAweb.epfl.ch | double-stranded DNA coarse-grain models | 
| CircadiOmics | http://circadiomics.ics.uci.edu | circadian rhythm dataset analysis and repository | 
| COACH-D | http://yanglab.nankai.edu.cn/COACH-D/ | protein–ligand binding site prediction | 
| Coloc-stats | https://hyperbrowser.uio.no/coloc-stats/ | genomic location enrichment analysis | 
| ComplexContact | http://raptorx2.uchicago.edu/ComplexContact/ | protein heterodimer complex residue–residue contact prediction | 
| CoNekT-Plants | http://conekt.plant.tools | comparative analyses of plant gene co-expression | 
| CRISPOR | http://crispor.org | guide sequences for CRISPR/Cas9 genome editing | 
| CRISPRCasFinder | https://crisprcas.i2bc.paris-saclay.fr | CRISPR array and Cas gene detection | 
| CSAR-web | http://genome.cs.nthu.edu.tw/CSAR-web | contig scaffolding | 
| dbCAN2 | http://cys.bios.niu.edu/dbCAN2 | carbohydrate-active enzyme annotation | 
| DynaMut | http://biosig.unimelb.edu.au/dynamut/ | point mutation effects on protein stability and dynamics | 
| easyFRAP-web | https://easyfrap.vmnet.upatras.gr/ | protein mobility analysis with fluorescence recovery after photobleaching data | 
| EviNet | https://www.evinet.org/ | gene set network enrichment analysis | 
| ezTag | http://eztag.bioqrator.org | biomedical concept annotation | 
| FragFit | http://proteinformatics.de/FragFit | protein segment modeling of cryo-EM density maps | 
| Freiburg RNA tools | http://rna.informatik.uni-freiburg.de | RNA analysis | 
| GADGET | http://gadget.biosci.gatech.edu | population-based distributions of genetic variants | 
| Galaxy | https://usegalaxy.org | biomedical data analysis workflows | 
| Galaxy HiCExplorer | https://hicexplorer.usegalaxy.eu | chromatin 3D conformation analysis | 
| GDA | http://gda.unimore.it/ | integration of drug response, gene expression profiles and mutations for cancer | 
| GeneMANIA | http://genemania.org | gene function prediction | 
| geno2pheno[ngs-freq] | http://ngs.geno2pheno.org | viral drug resistance prediction | 
| GIANT 2.0 | http://giant-v2.princeton.edu | human tissue-specific gene functional relationships | 
| GPCRM | http://gpcrm.biomodellab.eu/ | G protein-coupled receptors structure modeling | 
| gRINN | http://grinn.readthedocs.io | protein molecular dynamics residue interaction energies | 
| GWAS4D | http://mulinlab.org/gwas4d | prioritization of regulatory variants from GWAS data | 
| HMMER | http://www.ebi.ac.uk/Tools/hmmer | profile hidden Markov models homology search | 
| HotSpot Wizard 3.0 | http://loschmidt.chemi.muni.cz/hotspotwizard3 | protein engineering directed mutation | 
| HPEPDOCK | http://huanglab.phys.hust.edu.cn/hpepdock/ | peptide–protein docking | 
| HSYMDOCK | http://huanglab.phys.hust.edu.cn/hsymdock/ | symmetric protein complex docking | 
| InterEvDock2 | http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock2/ | protein–protein docking | 
| INTERSPIA | http://bioinfo.konkuk.ac.kr/INTERSPIA/ | protein–protein interactions in multiple species | 
| iPath3.0 | http://pathways.embl.de | metabolic pathway visualization and customization | 
| IUPred2A | http://iupred2a.elte.hu | intrinsically disordered protein regions | 
| Kinact | http://biosig.unimelb.edu.au/kinact/ | kinase activating missense mutations prediction | 
| KnotGenome | http://knotgenom.cent.uw.edu.pl/ | topological analysis of chromosome knots and links | 
| LitVar | https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar | genetic variant information retrieval from PubMed | 
| LOLAweb | http://lolaweb.databio.org | genomic region enrichment analysis | 
| MetaboAnalyst 4.0 | http://metaboanalyst.ca | metabolomics data analysis | 
| MetExplore | https://metexplore.toulouse.inra.fr/metexplore2/ | metabolic network analysis | 
| MiGA | http://microbial-genomes.org/ | prokaryotic genome and metagenome classification | 
| MISTIC2 | https://mistic2.leloir.org.ar | residue pair covariation in protein families | 
| MOLEonline | https://mole.upol.cz | biomolecule channels, tunnels, and pores | 
| mTM-align | http://yanglab.nankai.edu.cn/mTM-align/ | protein structure multiple alignment and database search | 
| Mutalisk | http://mutalisk.org | somatic mutations correlation with genomic, transcriptional and epigenomic features | 
| Ocean Gene Atlas | http://tara-oceans.mio.osupytheas.fr/ocean-gene-atlas/ | marine plankton gene geolocation and abundance | 
| oli2go | http://oli2go.ait.ac.at/ | PCR primer and hybridization probe design for non-human DNA | 
| OmicsNet | http://www.omicsnet.ca | molecular interactions networks visualization | 
| oriTfinder | http://bioinfo-mml.sjtu.edu.cn/oriTfinder | origin of transfer sites in bacterial mobile genetic elements | 
| PaintOmics 3 | http://bioinfo.cipf.es/paintomics/ | visualization of omics data on KEGG pathways | 
| PANNZER2 | http://ekhidna2.biocenter.helsinki.fi/sanspanz/ | protein function prediction | 
| PatScanUI | https://patscan.secondarymetabolites.org/ | DNA and protein sequence pattern search | 
| PhytoNet | http://www.gene2function.de | phytoplankton gene expression profiles | 
| pirScan | http://cosbi4.ee.ncku.edu.tw/pirScan/ | piRNA target prediction | 
| ProTox-II | http://tox.charite.de/protox_II | chemical toxicity prediction | 
| psRNATarget | http://plantgrn.noble.org/psRNATarget/ | plant small RNA target prediction | 
| PSSMSearch | http://slim.ucd.ie/pssmsearch/ | protein motifs for binding and post-translational modification | 
| PUG-REST | https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest | PubChem cheminformatics programmatic access | 
| RepeatsDB-lite | http://protein.bio.unipd.it/repeatsdb-lite | tandem repeats in proteins | 
| RNApdbee 2.0 | http://lepus.cs.put.poznan.pl/rnapdbee-2.0/ | RNA secondary structure annotation | 
| RSAT | http://www.rsat.eu/ | DNA regulatory motifs | 
| SMARTIV | http://smartiv.technion.ac.il/ | RNA sequence and structure motifs for RNA binding proteins | 
| SNPnexus | http://www.snp-nexus.org | SNP functional annotation | 
| SPAR | https://www.lisanwanglab.org/SPAR | analysis of small RNA sequencing data | 
| SWISS-MODEL | https://swissmodel.expasy.org | structure homology modeling for proteins and protein complexes | 
| TAM 2.0 | http://www.scse.hebut.edu.cn/tam/ | microRNA set enrichment analysis | 
| TCRmodel | http://tcrmodel.ibbr.umd.edu/ | T cell receptor structure modeling | 
| UNRES | http://unres-server.chem.ug.edu.pl | coarse-grained simulation of protein structure | 
| VarAFT | http://varaft.eu | disease-causing variants annotation | 
| WEGO 2.0 | http://wego.genomics.org.cn | Gene Ontology visualization | 
| X2K Web | http://X2K.cloud | kinase enrichment analysis for differentially expressed gene signatures | 
| xiSPEC | http://spectrumviewer.org | proteomics mass spectrometry data analysis | 
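
Some entries in the tables above, such as PUG-REST, are programmatic interfaces rather than interactive pages. The sketch below shows roughly how such a service might be called from Python. It assumes the PUG-REST URL layout `compound/name/<name>/property/<properties>/JSON` under `https://pubchem.ncbi.nlm.nih.gov/rest/pug`; the function names `build_property_url` and `fetch_properties` are hypothetical helpers written for this example, and the response format should be checked against the PubChem documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the PubChem PUG-REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(compound_name: str, properties: list[str]) -> str:
    """Build a PUG-REST URL requesting properties of a compound by name."""
    # Percent-encode the name so spaces and slashes are safe in the URL path.
    name = urllib.parse.quote(compound_name, safe="")
    props = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{name}/property/{props}/JSON"


def fetch_properties(compound_name: str, properties: list[str]) -> dict:
    """Send the request and decode the JSON body (requires network access)."""
    url = build_property_url(compound_name, properties)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Only the URL is built here; no request is sent at import time.
example_url = build_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
```

Calling `fetch_properties("aspirin", ["MolecularFormula"])` would send the request. Keeping URL construction separate from the network call makes the former easy to check offline.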
References
https://en.wikipedia.org/wiki/Biological_database
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection (https://doi.org/10.1093/nar/gkx1235)
The 15th annual Nucleic Acids Research Web Server issue, 2017.
The 16th annual Nucleic Acids Research Web Server issue, 2018.
