


3.2 Classification of Cancer-Related Genes from OMIM Database

OMIM contains a large collection of records on genetic diseases and inherited genes, each of which has been read and summarized in a few sentences by scientists. In other words, OMIM serves as a compact reading environment in which readers can view many article sources at once. OMIM is also a high-quality information source and is considered a key reference database by the genetics community. We therefore chose OMIM as our main resource for deriving the cancer-related gene list.

We limited our search to the keyword “cancer” in the OMIM search engine to extract all cancer-related gene records. To restrict the search to the textual portion of each record, we considered only the information contained in the “title”, “text” and “clinical synopsis” fields (Table 1).
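As an illustration, the restriction to three fields can be written as a fielded query using the qualifiers listed in Table 1. This is a minimal sketch: the function name `build_query` and the choice to join the three fields with OR are assumptions, since the text does not give the exact query string that was submitted.

```python
# Field qualifiers taken from Table 1 ([TI], [TXT], [CS]).
# Combining them with OR is an assumption, not the documented query.
FIELD_QUALIFIERS = {
    "title": "TI",
    "text": "TXT",
    "clinical synopsis": "CS",
}


def build_query(keyword, fields=("title", "text", "clinical synopsis")):
    """Return a query matching the keyword in any of the given fields."""
    clauses = [f"{keyword}[{FIELD_QUALIFIERS[name]}]" for name in fields]
    return " OR ".join(clauses)


print(build_query("cancer"))  # cancer[TI] OR cancer[TXT] OR cancer[CS]
```

Keeping the qualifiers in one dictionary makes it easy to rerun the same search with a different keyword or a different subset of fields.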

Table 1: Name, description and search qualifier of each OMIM search field.

(http://www.ncbi.nlm.nih.gov/Omim/omimhelp.html#SearchFields)

Search Field | Description | Qualifier
All Fields | Contains all terms from all searchable database fields in the database. | [ALL]
Allelic Variant | Describes a subset of disease-producing mutations. | [AV] or [VAR]
Chromosome | The chromosome onto which a gene or disorder has been mapped, as reported in the OMIM Gene or Morbid Map. | [CH] or [CHR]
Clinical Synopsis | Clinical features of a disorder and the mode of inheritance (e.g., autosomal dominant, autosomal recessive, x-linked), if known. | [CS] or [CLIN]
Contributor | Contributor to an OMIM record. Names are in the format of lastname followed by one or more initials (with no periods), e.g., Smith AB | [AU] or [CTRB]
Creation Date | The date on which an OMIM record was created, in the format YYYY/MM/DD. | [CD] or [CDAT]
EC/RN Number | Number assigned by the Enzyme Commission or Chemical Abstract Service (CAS) to designate a particular enzyme or chemical, respectively. | [EC] or [ECNO]
Editor | Editor of OMIM record. Names are in the format of lastname followed by one or more initials (with no periods), e.g., Smith AB | [ED] or [EDTR]
Filter | Primarily used to retrieve subsets of records that contain crosslinks to other Entrez databases, and LinkOuts to external (non-Entrez) resources. There is a separate LinkOut Overview document which provides more detail about that service. | [FI] or [FILT]
Gene Map | Cytogenetic map location represented in the OMIM Gene Map. | [GM] or [MAP]
Gene Map Disorder | Text words appearing in the Disorder column of the OMIM Gene Map. | [DIS] or [DI]
Gene Name | The official gene symbol, and alternate gene symbols, associated with a record. Currently limited to gene symbols present on the OMIM Gene Map. All gene symbols represented in OMIM (mapped or unmapped) can be searched in the Title Word field, described below. | [GN] or [GENE]
MIM Number | For information on the numbering system, see the OMIM |
Modification Date | Date on which the record was last modified, in the format YYYY/MM/DD. | [MD] or [MDAT]
Modification History | All dates on which an OMIM record was updated, in the format YYYY/MM/DD. | [MDH] or [HIST]
Properties | An index containing various properties of OMIM records, identifying those which have attributes such as Allelic Variants, Clinical Synopsis, or Gene Map locus. The most commonly used attributes are presented as check boxes on the Limits page. To see a complete list of attributes, you can browse the index of the Properties field by using the Index option. | [PR] or [PROP]
Reference | Contains author names and title words from the articles cited in an OMIM entry. Names are in the format of lastname followed by one or more initials (with no periods), e.g., Smith AB | [RE] or [REF]
Text Word | Contains terms from the main text-containing section of a record, which begins under the title of a record and ends above the Allelic Variants section (if present), or above the References section (if no Allelic Variants are described). | [TXT] or [WORD]
Title Word | Words in title of an OMIM record. Includes words in the primary title, alternative titles, and included titles. | [TI] or [TITL]

In other words, if the keyword “cancer” did not appear in any of these three sections, we assumed that the record contained no cancer-relevant information. The next step was to review each gene’s OMIM record to confirm its role in different cancer types, which were defined as the ten leading causes of cancer mortality in the Taiwanese population, as reported by the Department of Health for 2004. The ten cancer types are lung cancer, hepatocellular carcinoma (HCC), colorectal carcinoma, female breast cancer, gastric carcinoma, oral cancer, cervical cancer, prostate cancer, esophageal cancer and pancreatic cancer.
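The exclusion rule described above can be sketched as a simple filter over records. The record layout (a dictionary with `title`, `text` and `clinical_synopsis` keys), the function name `mentions_keyword` and the sample records are all hypothetical. They only illustrate the rule and do not reflect OMIM's actual export format.

```python
# The three sections considered in the search; other sections are ignored.
SEARCHED_SECTIONS = ("title", "text", "clinical_synopsis")


def mentions_keyword(record, keyword="cancer"):
    """True if the keyword occurs in any of the three searched sections."""
    needle = keyword.lower()
    return any(
        needle in record.get(section, "").lower()
        for section in SEARCHED_SECTIONS
    )


# Hypothetical records for illustration only.
records = [
    {"id": "rec-1", "title": "Example gene A",
     "text": "Mutations reported in breast cancer.", "clinical_synopsis": ""},
    {"id": "rec-2", "title": "Example gene B",
     "text": "Involved in lipid metabolism.", "clinical_synopsis": ""},
    # The keyword appears only in the references, so the record is excluded.
    {"id": "rec-3", "title": "Example gene C", "text": "",
     "references": "A study of cancer families."},
]

kept = [r["id"] for r in records if mentions_keyword(r)]
print(kept)  # ['rec-1']
```

Note that this sketch uses substring matching, so a word such as "cancerous" would also count as a hit. Whether the original search behaved that way is not stated.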

In addition to the keyword “cancer” search in the OMIM database, we also searched for each of the ten cancer types individually to obtain a more comprehensive cancer-related gene list. Because many different terms can refer to the same cancer type, all of them had to be included in the search. For example, breast cancer can be described as breast carcinoma, mammary gland neoplasm, etc. We therefore used both the synonyms for each cancer type given by the International Classification of Diseases for Oncology (ICD-O) and the synonyms, near-synonyms and closely related concepts for cancers defined by the Medical Subject Headings (MeSH). ICD-O is used mainly by cancer and tumour registries for coding the site and histology of neoplasms (ICD-O Website, 2005). Table 2 summarizes the ICD-O synonyms for the ten cancer types, and Table 3 lists all related MeSH terms for the same cancer types.

Table 2: Summary of the Synonyms for Ten Different Cancer Types Defined by ICD-O

Words | Synonyms by ICD-O
Cancer | cancer//carcinoma//leukaemia//leukemia//lymphoma//malignancy//melanoma//myeloma//neoplasm//tumor//tumour//
Lung | bronchiole//bronchogenic//bronchus//carina//hilus//lingula//lung//pulmonary//
Liver | liver//hepatocellular//hepatoma//
Colorectal | bowel//cecum//colon//colorectal//ileocecal//intestine//pelvirectal//rectal//rectosigmoid//rectum//sigmoid//
Female Breast | areola//breast//mammary//nipple//
Stomach | antrum//cardia//cardioesophageal//esophagogastric//fundus//gastric//"nos"//prepylorus//pyloric//pylorus//stomach//
Oral | alveolar//alveolus//buccal//cheek//frenulum//gingiva//"gum"//labial//linguae//molar//mouth//oral//palate//periodontal//retromolar//salivary//tongue//tonsil//tooth//uvula//
Cervical | cervical//cervix//endocervical//endocervix//exocervical//exocervix//internal os//nabothian//
Prostate | prostate//prostatic//
Esophageal | esophageal//esophagus//
Pancreatic | langerhans//pancreas//pancreatic//santorini//wirsung//
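The "//"-delimited cells of Table 2 can be turned into search patterns along the following lines. This is a sketch under a stated assumption: a text is treated as referring to a site-specific cancer when it contains at least one site term and at least one general cancer term. The source does not spell out how the two term groups were combined, and the helper names (`parse_terms`, `word_pattern`, `mentions_site_cancer`) are hypothetical.

```python
import re


def parse_terms(cell):
    """Split a '//'-delimited table cell into clean terms."""
    return [t.strip().strip('"') for t in cell.split("//") if t.strip()]


# Term lists copied from the "Cancer" and "Female Breast" rows of Table 2.
CANCER_TERMS = parse_terms(
    "cancer//carcinoma//leukaemia//leukemia//lymphoma//malignancy//"
    "melanoma//myeloma//neoplasm//tumor//tumour//"
)
BREAST_TERMS = parse_terms("areola//breast//mammary//nipple//")


def word_pattern(terms):
    """Case-insensitive whole-word pattern with an optional plural 's'."""
    alternation = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    return re.compile(rf"\b(?:{alternation})s?\b", re.IGNORECASE)


def mentions_site_cancer(text, site_terms, cancer_terms=CANCER_TERMS):
    """Assumed rule: the text needs one site term and one cancer term."""
    has_site = word_pattern(site_terms).search(text) is not None
    has_cancer = word_pattern(cancer_terms).search(text) is not None
    return has_site and has_cancer


print(mentions_site_cancer("Mammary gland neoplasm was observed.", BREAST_TERMS))  # True
print(mentions_site_cancer("Breast development is normal.", BREAST_TERMS))  # False
```

Whole-word matching avoids hits inside unrelated words. The price is that irregular plurals or hyphenated variants must be added to the term list explicitly.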

Table 3: Summary of the Synonyms for Ten Different Cancer Types Defined by MeSH

Cancer Type | MeSH Term | Synonyms by MeSH
Lung Cancer | Lung Neoplasms | Lung Neoplasms//Cancer of Lung//Lung Cancer//Pulmonary Cancer//Pulmonary Neoplasms//Cancer of the Lung//Neoplasms, Lung//Neoplasms, Pulmonary//Non-Small-Cell Lung Carcinoma//Carcinoma, Non-Small Cell Lung//
Liver Cancer | Liver Neoplasms | Liver Neoplasms//Cancer of Liver//Hepatic Cancer//Liver Cancer//Cancer of the Liver//Hepatic Neoplasms//Neoplasms, Hepatic//Neoplasms, Liver//Carcinoma, Hepatocellular//Hepatocellular Carcinoma//Hepatoma//
Colorectal Cancer | Colorectal Neoplasms | Colonic Neoplasms//Cancer of Colon//Colon Cancer//Cancer of the Colon//Colon Neoplasms//Colonic Cancer//Neoplasms, Colonic//Colorectal Neoplasms, Hereditary Nonpolyposis//Hereditary Nonpolyposis Colorectal Cancer//Hereditary Nonpolyposis Colorectal Neoplasms//Lynch Syndrome//Colon Cancer, Familial Nonpolyposis//Lynch Cancer Family Syndrome I//Lynch Syndrome I//Lynch Syndrome II//
Breast Cancer | Breast Neoplasms | Breast Neoplasms//Breast Cancer//Breast Tumors//Cancer of Breast//Cancer of the Breast//Human Mammary Carcinoma//Mammary Carcinoma, Human//Mammary Neoplasm, Human//Mammary Neoplasms, Human//Neoplasms, Breast//Tumors, Breast//
Stomach Cancer | Stomach Neoplasms | Stomach Neoplasms//Cancer of Stomach//Gastric Cancer//Gastric Neoplasms//Stomach Cancer//Cancer of the Stomach//Neoplasms, Gastric//Neoplasms, Stomach//
Oral Cancer | Mouth Neoplasms | Mouth Neoplasms//Cancer of Mouth//Mouth Cancer//Oral Cancer//Oral Neoplasms//Cancer of the Mouth//Neoplasms, Mouth//Neoplasms, Oral//Oral Cavity//Cavitas Oris//Cavitas oris propria//Mouth Cavity Proper//Oral Cavity Proper//Vestibule Oris//Vestibule of the Mouth//
Cervical Cancer | Cervix Neoplasms | Cervix Neoplasms//Cancer of Cervix//Cervical Cancer//Cancer of the Cervix//Cervical Neoplasms//Cervix Cancer//Neoplasms, Cervical//Neoplasms, Cervix//Cervical Intraepithelial Neoplasia//Neoplasia, Cervical Intraepithelial//Cervical Intraepithelial Neoplasia, Grade III//Cervical Intraepithelial Neoplasms//Intraepithelial Neoplasia, Cervical//
Prostate Cancer | Prostatic Neoplasms | Prostatic Neoplasms//Cancer of Prostate//Prostate Cancer//Cancer of the Prostate//Neoplasms, Prostate//Neoplasms, Prostatic//Prostate Neoplasms//Prostatic Cancer//Prostatic Hyperplasia//Adenoma, Prostatic//Benign Prostatic Hyperplasia//Prostatic Adenoma//Prostatic Hyperplasia, Benign//Prostatic Hypertrophy//Prostatic Hypertrophy, Benign//Prostatism//
Esophageal Cancer | Esophageal Neoplasms | Esophageal Neoplasms//Cancer of Esophagus//Esophageal Cancer//Cancer of the Esophagus//Esophagus Cancer//Esophagus Neoplasm//Neoplasms, Esophageal//
Pancreatic Cancer | Pancreatic Neoplasms | Pancreatic Ductal Carcinoma//Duct-Cell Carcinoma of the Pancreas//Duct-Cell Carcinoma, Pancreas//Ductal Carcinoma of the Pancreas//Pancreatic Duct Cell Carcinoma//Pancreatic Neoplasms//Cancer of Pancreas//Pancreatic Cancer//Cancer of the Pancreas//Neoplasms, Pancreatic//Pancreas Cancer//Pancreas Neoplasms//Carcinoma, Pancreatic Ductal//

Reviewing and categorizing each OMIM record in this way yielded ten gene lists, one for each cancer type. We defined every gene in a cancer-specific list as a cancer-related gene, since OMIM confirms that the gene is related to that particular cancer type. In other words, we obtained a total of ten cancer-specific gene lists from the OMIM database.
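The categorizing step can be pictured as grouping reviewed (gene, cancer type) pairs into one set per cancer type. The following is only a sketch: the gene symbols `GENE_A` and `GENE_B`, the input layout and the function name `build_gene_lists` are hypothetical placeholders, since the review itself was a manual reading of each record.

```python
from collections import defaultdict

# Hypothetical review results: (gene symbol, cancer type confirmed on review).
reviewed = [
    ("GENE_A", "breast"),
    ("GENE_A", "prostate"),
    ("GENE_B", "breast"),
    ("GENE_A", "breast"),  # a repeated confirmation is stored only once
]


def build_gene_lists(pairs):
    """Group gene symbols into one set per cancer type."""
    gene_lists = defaultdict(set)
    for symbol, cancer_type in pairs:
        gene_lists[cancer_type].add(symbol)
    return dict(gene_lists)


gene_lists = build_gene_lists(reviewed)
print(sorted(gene_lists["breast"]))    # ['GENE_A', 'GENE_B']
print(sorted(gene_lists["prostate"]))  # ['GENE_A']
```

Using sets means a gene confirmed by several records for the same cancer type appears only once in that list.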

With the ten cancer-related gene lists identified, we then computed intersections among the lists to look for common genes that are present across the ten cancer types.

The first step was to use Microsoft Access to create a separate table for each of the ten cancer types. Each table was created with an SQL statement. For example, the statement for creating the breast cancer table is as follows:

    CREATE TABLE Breast (GeneSymbol VARCHAR(15));

After the ten tables were created, the next step was to compute the intersections of the cancer-gene lists, again using SQL. For example, the following query returns the genes common to the breast, cervical and prostate lists:

    SELECT Breast.GeneSymbol
    FROM (Breast INNER JOIN Cervical
          ON Breast.GeneSymbol = Cervical.GeneSymbol)
    INNER JOIN Prostate
          ON Cervical.GeneSymbol = Prostate.GeneSymbol;

The process flow for obtaining the ten cancer-related gene lists and the common cancer-related genes is summarized in Figure 9.

Figure 9: Flowchart for extracting cancer-related gene lists from the OMIM database.
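For readers without Microsoft Access, the breast, cervical and prostate join shown earlier can be tried with Python's built-in `sqlite3` module. This is a sketch, not the original workflow: SQLite stands in for Access, the Access-style parentheses around the first join are dropped, an ORDER BY is added to make the output deterministic, and the gene symbols and the function name `common_genes` are hypothetical.

```python
import sqlite3

# Same three-table join as in the text, written without Access parentheses.
JOIN_QUERY = (
    "SELECT Breast.GeneSymbol "
    "FROM Breast "
    "INNER JOIN Cervical ON Breast.GeneSymbol = Cervical.GeneSymbol "
    "INNER JOIN Prostate ON Cervical.GeneSymbol = Prostate.GeneSymbol "
    "ORDER BY Breast.GeneSymbol"
)


def common_genes(gene_lists):
    """Load the Breast, Cervical and Prostate lists and return shared genes."""
    conn = sqlite3.connect(":memory:")
    try:
        for table in ("Breast", "Cervical", "Prostate"):
            # Table names are fixed literals here, so the f-string is safe.
            conn.execute(f"CREATE TABLE {table} (GeneSymbol VARCHAR(15))")
            conn.executemany(
                f"INSERT INTO {table} VALUES (?)",
                [(symbol,) for symbol in sorted(gene_lists[table])],
            )
        return [row[0] for row in conn.execute(JOIN_QUERY)]
    finally:
        conn.close()


# Hypothetical gene symbols for illustration only.
example = {
    "Breast": {"GENE_A", "GENE_B"},
    "Cervical": {"GENE_A", "GENE_C"},
    "Prostate": {"GENE_A", "GENE_B"},
}
print(common_genes(example))  # ['GENE_A']

# Cross-check with plain set intersection, which extends to all ten lists.
print(sorted(set.intersection(*example.values())))  # ['GENE_A']
```

The set-based cross-check gives the same answer and generalizes to any number of lists without writing a longer join.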
