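Conclusion - Discussion and Conclusion - Building a Gene-Name Navigation System Across Gene Databases and an Annotation System for Microarray Platforms: DIPLEX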

Chapter 5 Discussion and Conclusion

5.2 Conclusion

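The development of microarray technology has allowed biomedical researchers to accumulate gene expression data rapidly. These data are highly reusable and can support a wide range of analyses by other researchers. At the same time, this explosion of information has created new needs: after an experiment, a researcher may be left with a long list of candidate genes, and annotating those genes accurately and in detail becomes an important task. In the past, a complex experiment would yield only a handful of candidate genes, and looking them up by hand was simple. With microarrays, the number of candidate genes has grown sharply, and annotating them has become correspondingly more complex. In addition, cross-referencing microarray data produced by different technologies is a further difficulty. DIPLEX helps us generate gene annotation data automatically, integrates pathway information, and provides a tool for matching genes across different microarray platforms, so that a variety of comparative analyses can be carried out with the relevant gene expression data available for reference.

DIPLEX nevertheless has considerable room for improvement. The current data cover mainly human genes; future work will focus on adding more species, incorporating more gene expression data and pathway information, and improving system performance. The user interface is another focus of development: a good website has to keep refining its interface so that users can find the information they need quickly and the data are presented in a way that is clear at a glance. We hope that DIPLEX can develop into an important bioinformatics website.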


Appendix

CREATE TABLE `Acc2UG` (
  `Acc` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `UniGene` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  KEY `Acc` (`Acc`,`UniGene`),
  KEY `UniGene` (`UniGene`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `ArrayExperiment`

CREATE TABLE `ArrayExperiment` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `type` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `abstract` text collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=4 ;

-- --- --

-- Table structure for `Clone2Acc` //Clone Image ID to GenBank Accession No.

CREATE TABLE `Clone2Acc` (
  `CloneId` varchar(30) NOT NULL default '',
  `Collection` varchar(10) NOT NULL default '',
  `PlateNo` varchar(10) NOT NULL default '',
  `Row` varchar(10) NOT NULL default '',
  `Column` varchar(10) NOT NULL default '',
  `LibraryId` varchar(10) NOT NULL default '',
  `Species` varchar(10) NOT NULL default '',
  `Acc` varchar(255) NOT NULL default '',
  KEY `CloneId` (`CloneId`),
  KEY `Acc` (`Acc`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='Clone ID 2 GeneBankACC';

-- --- --

-- Table structure for `Gene2Kegg` //Gene ID to KEGG Pathway maps --

CREATE TABLE `Gene2Kegg` (
  `LocusLink` int(10) NOT NULL default '0',
  `KeggMap` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `LocusLink` (`LocusLink`),
  KEY `KeggMap` (`KeggMap`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `GoTerm` //Gene Ontology Terms --

CREATE TABLE `GoTerm` (
  `id` int(11) NOT NULL default '0',
  `name` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `term_type` varchar(55) collate utf8_unicode_ci NOT NULL default '',
  `go_id` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `GoNum` (`go_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `HGNC` //HGNC table --

CREATE TABLE `HGNC` (
  `HGNC_ID` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `Symbol` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Name` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Status` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Locus_Type` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Previous_Symbols` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Previous_Names` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Aliases` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Chromosome` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Date_Approved` varchar(5) collate utf8_unicode_ci NOT NULL default '',
  `Date_Modified` varchar(5) collate utf8_unicode_ci NOT NULL default '',
  `Date_Name_Changed` varchar(5) collate utf8_unicode_ci NOT NULL default '',
  `Accession_Numbers` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Enzyme_ID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `LocusLink_ID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `MGD_ID` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `Misc_IDs` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `Pubmed_IDs` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `RefSeq_IDs` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Gene_Family_Name` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `GDB` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `LocusLink` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `OMIM` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `RefSeq` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `SwissProt` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `Approved_Symbol` (`Symbol`),
  KEY `Approved_Name` (`Name`),
  KEY `Aliases` (`Aliases`),
  KEY `Chromosome` (`Chromosome`),
  KEY `Enzyme_ID` (`Enzyme_ID`),
  KEY `LocusLink_ID` (`LocusLink_ID`),
  KEY `LocusLink` (`LocusLink`),
  KEY `OMIM` (`OMIM`),
  KEY `RefSeq` (`RefSeq`),
  KEY `SwissProt` (`SwissProt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `KeggTerm` //KEGG pathway map terms --

CREATE TABLE `KeggTerm` (
  `keggid` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `keggterm` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `subclass` varchar(255) collate utf8_unicode_ci NOT NULL default 'Neurodegenerative Disorders',
  `class` varchar(255) collate utf8_unicode_ci NOT NULL default 'Human Diseases',
  KEY `keggid` (`keggid`,`class`),
  KEY `class` (`class`),
  KEY `subclass` (`subclass`),
  KEY `keggterm` (`keggterm`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
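
As an illustration of how the `Gene2Kegg` and `KeggTerm` tables might be combined to attach pathway information to a gene list, the following is a minimal sketch rather than the query DIPLEX itself runs. It uses an in-memory SQLite database in place of MySQL, with simplified column types, and it assumes that `Gene2Kegg.KeggMap` holds the same identifier as `KeggTerm.keggid`. The function name `pathways_for_genes` and all row values are hypothetical placeholders.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Gene2Kegg (LocusLink INTEGER, KeggMap TEXT);
CREATE TABLE KeggTerm (keggid TEXT, keggterm TEXT, subclass TEXT, class TEXT);
""")

# Placeholder rows for illustration only; these are not real identifiers.
conn.executemany(
    "INSERT INTO Gene2Kegg VALUES (?, ?)",
    [(1001, "MAP_X"), (1002, "MAP_X"), (1002, "MAP_Y")],
)
conn.executemany(
    "INSERT INTO KeggTerm VALUES (?, ?, ?, ?)",
    [
        ("MAP_X", "Example pathway X", "Subclass 1", "Class 1"),
        ("MAP_Y", "Example pathway Y", "Subclass 2", "Class 1"),
    ],
)


def pathways_for_genes(conn, locus_ids):
    """Count how many of the given genes fall on each pathway map.

    Assumes Gene2Kegg.KeggMap and KeggTerm.keggid share one identifier format.
    """
    placeholders = ",".join("?" for _ in locus_ids)
    return conn.execute(
        f"""
        SELECT t.keggid, t.keggterm, COUNT(DISTINCT g.LocusLink) AS n_genes
        FROM Gene2Kegg AS g
        JOIN KeggTerm AS t ON t.keggid = g.KeggMap
        WHERE g.LocusLink IN ({placeholders})
        GROUP BY t.keggid, t.keggterm
        ORDER BY n_genes DESC, t.keggid
        """,
        list(locus_ids),
    ).fetchall()


hits = pathways_for_genes(conn, [1001, 1002])
for keggid, keggterm, n_genes in hits:
    print(keggid, keggterm, n_genes)
```

Sorting by the number of matching genes puts the pathways shared by most of the candidate genes first, which is one plausible way to present pathway annotation for a long gene list.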

-- --- --

-- Table structure for `Loc2UG` //LocusLink (Entrez Gene ID) to UniGene ID

CREATE TABLE `Loc2UG` (
  `LocusLink` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  `UniGene` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  KEY `LocusLink` (`LocusLink`),
  KEY `UniGene` (`UniGene`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='LocusLink to UniGene';
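
The mapping tables `Clone2Acc`, `Acc2UG`, and `Loc2UG` suggest a chain from a cDNA clone ID through a GenBank accession and a UniGene cluster to a LocusLink ID. The sketch below shows one way such a chain could be queried; it is an assumption about how the tables fit together, not the documented DIPLEX implementation. It runs on in-memory SQLite with only the columns needed for the join, and the function name `clone_to_locuslink` and every row value are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in; only the columns needed for the join are kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clone2Acc (CloneId TEXT, Acc TEXT);
CREATE TABLE Acc2UG (Acc TEXT, UniGene TEXT);
CREATE TABLE Loc2UG (LocusLink TEXT, UniGene TEXT);
""")

# Placeholder rows for illustration only; these are not real identifiers.
conn.executemany(
    "INSERT INTO Clone2Acc VALUES (?, ?)",
    [("CLONE_1", "ACC_1"), ("CLONE_2", "ACC_2"), ("CLONE_3", "ACC_3")],
)
conn.executemany(
    "INSERT INTO Acc2UG VALUES (?, ?)",
    [("ACC_1", "UG_A"), ("ACC_2", "UG_B")],
)
conn.executemany(
    "INSERT INTO Loc2UG VALUES (?, ?)",
    [("1001", "UG_A"), ("1002", "UG_B")],
)


def clone_to_locuslink(conn, clone_ids):
    """Map each clone ID to a sorted list of LocusLink IDs via Acc and UniGene.

    Clones that cannot be resolved map to an empty list.
    """
    placeholders = ",".join("?" for _ in clone_ids)
    rows = conn.execute(
        f"""
        SELECT c.CloneId, l.LocusLink
        FROM Clone2Acc AS c
        JOIN Acc2UG AS a ON a.Acc = c.Acc
        JOIN Loc2UG AS l ON l.UniGene = a.UniGene
        WHERE c.CloneId IN ({placeholders})
        """,
        list(clone_ids),
    ).fetchall()
    result = {clone_id: set() for clone_id in clone_ids}
    for clone_id, locus in rows:
        result[clone_id].add(locus)
    return {clone_id: sorted(ids) for clone_id, ids in result.items()}


mapping = clone_to_locuslink(conn, ["CLONE_1", "CLONE_2", "CLONE_3"])
print(mapping)
```

Returning an empty list for unresolved clones, instead of dropping them, keeps the output aligned with the input list so that unannotated entries remain visible.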

-- --- --

-- Table structure for `Loc2acc` // LocusLink (Entrez Gene ID) to GenBank Accession --

CREATE TABLE `Loc2acc` (
  `LocusLink` int(10) NOT NULL default '0',
  `Acc` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `gi_num` varchar(20) collate utf8_unicode_ci NOT NULL default '0',
  `sequence_type` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `protein_accession` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `tax_id` int(10) NOT NULL default '0',
  KEY `LocusLink` (`LocusLink`),
  KEY `Acc` (`Acc`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `Loc2ref` //LocusLink (Entrez Gene ID) to RefSeq ID --

CREATE TABLE `Loc2ref` (
  `LocusLink` int(10) NOT NULL default '0',
  `Refseq` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `gi_num` varchar(20) collate utf8_unicode_ci NOT NULL default '0',
  `review_status` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `protein_accession` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `tax_id` int(10) NOT NULL default '0',
  KEY `LocusLink` (`LocusLink`),
  KEY `Refseq` (`Refseq`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `SAGELongTag2UG`

CREATE TABLE `SAGELongTag2UG` (
  `UniGene` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  `SAGELongTag` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  PRIMARY KEY (`SAGELongTag`),
  KEY `UniGene` (`UniGene`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0;

-- --- --

-- Table structure for `SAGEShortTag2UG`

CREATE TABLE `SAGEShortTag2UG` (
  `UniGene` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  `SAGEShortTag` varchar(20) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  PRIMARY KEY (`SAGEShortTag`),
  KEY `UniGene` (`UniGene`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
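
Because both the SAGE tag tables and `Acc2UG` carry a `UniGene` column, entries from different platforms can in principle be matched through a shared UniGene cluster. The following is a hedged sketch of that idea using in-memory SQLite with simplified types; whether DIPLEX performs the comparison exactly this way is an assumption. The function name `match_tags_to_accessions` and all row values are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SAGEShortTag2UG (UniGene TEXT, SAGEShortTag TEXT PRIMARY KEY);
CREATE TABLE Acc2UG (Acc TEXT, UniGene TEXT);
""")

# Placeholder rows for illustration only; these are not real identifiers.
conn.executemany(
    "INSERT INTO SAGEShortTag2UG VALUES (?, ?)",
    [("UG_A", "TAG_1"), ("UG_B", "TAG_2"), ("UG_C", "TAG_3")],
)
conn.executemany(
    "INSERT INTO Acc2UG VALUES (?, ?)",
    [("ACC_1", "UG_A"), ("ACC_2", "UG_A"), ("ACC_3", "UG_B")],
)


def match_tags_to_accessions(conn):
    """Pair every SAGE short tag with the accessions in the same UniGene cluster."""
    return conn.execute(
        """
        SELECT s.SAGEShortTag, a.Acc, s.UniGene
        FROM SAGEShortTag2UG AS s
        JOIN Acc2UG AS a ON a.UniGene = s.UniGene
        ORDER BY s.SAGEShortTag, a.Acc
        """
    ).fetchall()


pairs = match_tags_to_accessions(conn)
for tag, acc, unigene in pairs:
    print(tag, acc, unigene)
```

An inner join keeps only the tags that have a counterpart on the other platform; a `LEFT JOIN` would be the natural variant when unmatched tags should also be listed.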

-- --- --

-- Table structure for `biocarta_name` //Biocarta Pathway maps --

CREATE TABLE `biocarta_name` (
  `PathwayID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `PathwayName` text collate utf8_unicode_ci NOT NULL,
  `WebAddress` text collate utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`PathwayID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `gene2biocarta` //Gene ID to BioCarta pathway maps --

CREATE TABLE `gene2biocarta` (
  `PathwayID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `GeneID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `GeneID` (`GeneID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `gene2go` //Gene ID to Gene Ontology --

CREATE TABLE `gene2go` (
  `tax_id` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `gene_id` int(10) NOT NULL default '0',
  `go_id` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `evidence` varchar(10) collate utf8_unicode_ci NOT NULL default '',
  `go_qualifier` varchar(20) collate utf8_unicode_ci NOT NULL default '',
  `go_description` text collate utf8_unicode_ci NOT NULL,
  KEY `gene_id` (`gene_id`),
  KEY `go_id` (`go_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
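
To show how Gene Ontology annotation could be assembled from `gene2go` and `GoTerm`, here is a minimal sketch that joins the two tables on `go_id` and groups term names by `term_type`. It uses in-memory SQLite with simplified types and assumes both tables store `go_id` in the same format. The function name `go_annotations`, the GO identifiers, and the term names are hypothetical placeholders; the tax_id `9606` is the NCBI Taxonomy ID for human.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GoTerm (id INTEGER, name TEXT, term_type TEXT, go_id TEXT);
CREATE TABLE gene2go (tax_id TEXT, gene_id INTEGER, go_id TEXT,
                      evidence TEXT, go_qualifier TEXT, go_description TEXT);
""")

# Placeholder rows for illustration only; GO IDs and names are not real.
conn.executemany(
    "INSERT INTO GoTerm VALUES (?, ?, ?, ?)",
    [
        (1, "example process", "biological_process", "GO_P1"),
        (2, "example function", "molecular_function", "GO_F1"),
    ],
)
conn.executemany(
    "INSERT INTO gene2go VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("9606", 1001, "GO_P1", "IEA", "", "example process"),
        ("9606", 1001, "GO_F1", "TAS", "", "example function"),
        ("9606", 1002, "GO_P1", "IEA", "", "example process"),
    ],
)


def go_annotations(conn, gene_ids, tax_id="9606"):
    """Return {gene_id: {term_type: [term names]}} for the given genes."""
    placeholders = ",".join("?" for _ in gene_ids)
    rows = conn.execute(
        f"""
        SELECT g.gene_id, t.term_type, t.name
        FROM gene2go AS g
        JOIN GoTerm AS t ON t.go_id = g.go_id
        WHERE g.tax_id = ? AND g.gene_id IN ({placeholders})
        ORDER BY g.gene_id, t.term_type, t.name
        """,
        [tax_id, *gene_ids],
    ).fetchall()
    grouped = {}
    for gene_id, term_type, name in rows:
        grouped.setdefault(gene_id, {}).setdefault(term_type, []).append(name)
    return grouped


annotations = go_annotations(conn, [1001, 1002])
print(annotations)
```

Filtering on `tax_id` reflects the current human-only scope and leaves room for other species to be selected by passing a different value.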

-- --- --

-- Table structure for `gene_info`

CREATE TABLE `gene_info` (
  `tax_id` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `GeneID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Symbol` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `LocusTag` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Synonyms` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `dbXrefs` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `chromosome` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `map_location` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `description` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `type_of_gene` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Symbol_authority` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Name_authority` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `Nomenclature_status` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `tax_id` (`tax_id`),
  KEY `GeneID` (`GeneID`),
  KEY `Symbol` (`Symbol`),
  KEY `Synonyms` (`Synonyms`),
  KEY `dbXrefs` (`dbXrefs`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --- --

-- Table structure for `mim2gene`

CREATE TABLE `mim2gene` (
  `OMIM` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `GeneID` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  `type` varchar(255) collate utf8_unicode_ci NOT NULL default '',
  KEY `GeneID` (`GeneID`),
  KEY `OMIM` (`OMIM`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
