The Impact of Biobanks on the Medical Industry
Fu-Tong Liu (劉扶東), MD, PhD
Academia Sinica (中央研究院), Taiwan
Symposium on "Industry Trends and Business Opportunities in Biotech Healthcare and Smart Healthcare"
September 18, 2019
Biobank
A biorepository that stores biological samples for use in research
An important resource supporting contemporary medical research like genomics and personalized medicine
Gives researchers access to data representing a large number of people
Instrumental in identifying disease biomarkers, promoting drug discovery and development
Time magazine, 2009
Types of Biobank
[Figure: types of biobank. Modified from Paskal, et al., Path & Onco Res, 24: 771 (2018)]
Biobank types: population biobanks; epidemiology-driven biobanks; disease-oriented biobanks (tumor banks)
Sources: population individuals; patients
Collected: lifestyle questionnaires; biological samples; clinical information; clinical samples
Findings: determinants of individual susceptibility; determinants of exposure/lifestyle and biologic effect; determinants of disease; disease/tumor biomarkers; new drug targets
Outcome: implementation of personalized/precision medicine
UK Biobank (1)
Established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government, and the Northwest Regional Development Agency
Supported by the National Health Service (NHS) (initial funding of ~ £62 million)
500,000 volunteer participants, aged 40-69 years (recruited in 2006-2010)
UK Biobank (2)
Data/information
– Blood biochemistry, all participants
– Questionnaires
– Images (brain, heart, abdomen, bones & carotid artery) of 100,000 participants
– Genotyping, all 500,000
– A 24-hour activity monitor for a week, 100,000
Linked to electronic health records
Provides health information to approved researchers in the UK and overseas, from academia and industry
UK Biobank (3)
Researchers have been able to access the database since 2017
The first descriptions of the full cohort, including genome-wide genetic data
– HLA alleles show the expected associations with disease (C. Bycroft et al. Nature 562, 203–209; 2018)
Brain imaging of 10,000 individuals: genetic influences on brain structure and function, and correlations with neurodegenerative, psychiatric and personality traits (L. T. Elliott et al. Nature 562, 210–216; 2018)
All of Us (US) (1)
Created in 2015; launched in 2018
Run by the National Institutes of Health (NIH)
Funding: FY2016, $130 million; FY2017, $230 million; FY2018, $290 million
Aims to collect genetic and health data from one million volunteers
More than 100 partners (Mayo Clinic, Vanderbilt, Broad Institute, many medical centers, and the Google life sciences startup Verily Life Sciences)
All of Us (US) (2)
Data/information
– Blood and urine samples and physical measurements
– Wearable devices (Fitbit)
– Survey on lifestyle and environment
– Participants give researchers access to their electronic health records (EHR)
– Genomic data captured from select participants
As of May 2019, enrollment numbers: 187,000+ participants
BioVU (Vanderbilt University)
De-identified DNA samples (from leftover plasma donated by patients during their clinic visits)
More than 250,000 DNA samples collected over the last decade
Linked to electronic health information
VUMC has formed Nashville Biosciences
Pharmaceutical companies can gain access to BioVU to accelerate new drug discovery and development
Taiwan Biobank
A National Project: to establish an infrastructure for personalized medicine and prevention
Timeline: 2012 to 2024
A. Recruiting participants, follow-up, collecting information/specimens (recruitment promotion by TWB staff; referral by doctors/medical center biobanks)
B. Banking/upgrading information/specimens
C. Profiling genomic, epigenomic, metabolomic, and environmental features
D. Making information/specimens publicly available
-General population
General population (30-70 years old): 200,000
As of 2019/4/30: recruitment 112,326; follow-up 22,500
Follow-up: success rate >75%
-Disease patients
Multi-center, single standardized case recruitment
Patients, 100,000
12 disease cases: breast CA, lung CA, colon CA, liver CA, gastric CA, head/neck CA, CVD, stroke, DM, Alzheimer’s disease, chronic kidney disease, asthma, endometriosis
Whole genome sequencing
- 2,021 genomes at the end of 2018
Reference for imputation
Detection of rare/novel variants
Exploration of clinical relevance
Whole genome genotyping (TWB1.0 / TWB2.0)
TWB1.0 - ~650,000 SNPs based on genetic studies of the HapMap Project, the 1000 Genomes Project, the CHB array plate, and pharmacogenomic markers (Axiom Biobank Genotyping Arrays)
TWB2.0 - ~750,000 SNPs covering specific SNPs associated with diseases, drug metabolism, and drug response in the Taiwan population
The overlap between TWB1.0 and TWB2.0 is ~100,000 SNPs
Banking information/specimens
*TWB has been collecting tumor tissues since February 2016
-Biologic specimens
~2,680,000 tubes of different kinds of specimens
-- Data release since September 2014
Data/Specimens               2014     2015     2016     2017     2018  2019/02      Total
Questionnaires             16,471  400,256  473,487  538,762  776,085   90,355  2,295,416
Physical data               3,392   67,027  123,791  150,805  209,076   17,280    571,371
Blood and urine test        3,492   59,439   80,533  103,416  326,266   16,087    589,233
DNA sample (g)              3,584   21,880   24,096   25,518   21,264    8,490    104,832
Plasma (tube)               1,792    2,124    1,000    6,184    6,724      331     18,155
Urine (tube)                    0    1,900      600    5,194    4,480        0     12,174
Whole genome genotyping     3,392   76,809  207,437  184,466  337,609   51,100    860,813
Whole genome sequencing         0    1,229   17,958   22,199   31,487    4,905     77,778
DNA methylation                 0        0      550    6,855   15,853    2,942     26,200
Plasma metabolome               0        0        0      400    3,628      409      4,437
HLA genotyping                  0        0        0        0   18,210    2,120     20,330
Sum 4,386,720
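The totals in the data release table can be cross-checked with a few lines of Python. This is only a sketch: the counts are re-entered by hand from the table, and the variable names are illustrative rather than part of any TWB tool. With these figures, the row totals add up to 4,580,739, while the stated sum of 4,386,720 matches the releases through 2018 only (excluding the 2019/02 column), which may be how that sum was computed.

```python
# Released counts per period, copied by hand from the data release table.
# Columns: 2014, 2015, 2016, 2017, 2018, 2019/02
releases = {
    "Questionnaires":          [16471, 400256, 473487, 538762, 776085, 90355],
    "Physical data":           [3392, 67027, 123791, 150805, 209076, 17280],
    "Blood and urine test":    [3492, 59439, 80533, 103416, 326266, 16087],
    "DNA sample":              [3584, 21880, 24096, 25518, 21264, 8490],
    "Plasma (tube)":           [1792, 2124, 1000, 6184, 6724, 331],
    "Urine (tube)":            [0, 1900, 600, 5194, 4480, 0],
    "Whole genome genotyping": [3392, 76809, 207437, 184466, 337609, 51100],
    "Whole genome sequencing": [0, 1229, 17958, 22199, 31487, 4905],
    "DNA methylation":         [0, 0, 550, 6855, 15853, 2942],
    "Plasma metabolome":       [0, 0, 0, 400, 3628, 409],
    "HLA genotyping":          [0, 0, 0, 0, 18210, 2120],
}

# Total per data type across all periods (the "Total" column).
row_totals = {name: sum(counts) for name, counts in releases.items()}

# Sum over every data type and every period.
grand_total = sum(row_totals.values())

# Same sum, but leaving out the last column (2019/02).
through_2018 = sum(sum(counts[:-1]) for counts in releases.values())

print(f"All periods:  {grand_total:,}")
print(f"Through 2018: {through_2018:,}")
```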
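Research Benefits
An example: whole genome sequencing
Individuals completed: 2,021
Digital data released: 77,778 person-times
Cost to TWB: ~NT$120 million
Cost to users: ~NT$4.6 billion
~38-fold benefit; the more the data are used, the greater the benefit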
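The ~38-fold figure can be reproduced from the numbers on this slide. The sketch below rests on two assumptions that the slide does not state explicitly: that the benefit is the ratio of data releases (person-times) to genomes sequenced, and that the cost to users is estimated by applying TWB's per-genome cost to every release. The variable names are illustrative.

```python
genomes_sequenced = 2_021   # individuals with whole genome sequencing
data_releases = 77_778      # person-times of sequencing data released
twb_cost_ntd = 1.2e8        # ~NT$120 million spent by TWB

# How many times, on average, each sequenced genome has been released.
reuse_factor = data_releases / genomes_sequenced

# Assumed: users would have paid TWB's per-genome cost for each release.
cost_per_genome = twb_cost_ntd / genomes_sequenced
user_cost_ntd = cost_per_genome * data_releases

print(f"Reuse factor: {reuse_factor:.1f}x")
print(f"Estimated cost to users: NT${user_cost_ntd / 1e9:.2f} billion")
```

Under these assumptions the ratio comes out near 38 and the estimated cost to users near NT$4.6 billion, in line with the figures quoted above.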
Taiwan Cancer Precision Medicine Program
[Figure: program overview]
Cancer types: lung cancer; breast cancer; colorectal cancer
Research and clinical teams: Taiwan; NCI, USA
Participants: Academia Sinica; NBRP; medical centers
Resources: clinical samples; tissue banks; omics big data (proteomics, genomics, microbiota); Taiwan Cancer Knowledge base
Integration by AI
Goal: next-generation cancer therapeutics
Biobank and drug discovery
Identify associations between genetic variants and disease phenotypes to prioritize drug targets based on genetic evidence
Investigate the effects of genetic variants in genes encoding drug targets to predict possible drug effects
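As a rough illustration of what "identifying an association" between a variant and a disease can mean, the sketch below computes an odds ratio and a Pearson chi-square test from a 2x2 table of carriers versus non-carriers and cases versus controls. The counts are hypothetical, and the function name is illustrative. Real biobank analyses typically use regression models that adjust for covariates such as age, sex, and ancestry, so this is a simplified stand-in rather than the method used by any study cited here.

```python
import math

def odds_ratio_and_chi2(carrier_cases, carrier_controls,
                        noncarrier_cases, noncarrier_controls):
    """Odds ratio and Pearson chi-square (1 degree of freedom) for a 2x2 table."""
    a, b, c, d = carrier_cases, carrier_controls, noncarrier_cases, noncarrier_controls
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical counts: 1,000 variant carriers and 1,000 non-carriers.
odds_ratio, chi2, p_value = odds_ratio_and_chi2(120, 880, 80, 920)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

An odds ratio above 1 with a small p-value would suggest that carriers are over-represented among cases, which is the kind of genetic evidence the slide refers to when prioritizing drug targets.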
“New genetic data can lead to new medicines and improved outcomes” (March 19, 2019)
Regeneron and U.K. Biobank completed exome sequencing of 50,000 U.K. Biobank participants
Data are available to health researchers to aid in therapeutic discovery and enhance the understanding of human biology
Regeneron
– has advanced multiple new targets and development programs
– is leading a consortium of biopharma companies to complete exome sequencing of all U.K. Biobank participants by 2020
– predicts a great deal of actionable information will be generated
Extracting phenotypes from electronic health records to support precision medicine
Accurate phenotyping requires extracting information from billing codes, prescriptions, laboratory tests and clinical notes
Wei and Denny, Genome Medicine (2015)
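To make the idea of combining several EHR sources concrete, here is a minimal sketch of a rule-based phenotype check for type 2 diabetes. Everything in it is an assumption for illustration: the `PatientRecord` fields, the choice of evidence (an ICD-10 code starting with E11, a metformin prescription, an HbA1c result of at least 6.5%, a keyword in the notes), and the "at least two sources" rule. It is not the algorithm from the cited paper, and a keyword match on notes ignores negation such as "no type 2 diabetes".

```python
from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    billing_codes: list = field(default_factory=list)   # e.g. ICD-10 codes
    medications: list = field(default_factory=list)     # prescribed drug names
    hba1c_percent: list = field(default_factory=list)   # laboratory results
    notes: str = ""                                      # free-text clinical notes

def t2d_evidence(rec):
    """Return which of the four sources contain evidence of type 2 diabetes."""
    return {
        "billing": any(code.startswith("E11") for code in rec.billing_codes),
        "medication": any(m.lower() == "metformin" for m in rec.medications),
        "lab": any(value >= 6.5 for value in rec.hba1c_percent),
        # Naive keyword match; does not handle negated statements.
        "notes": "type 2 diabetes" in rec.notes.lower(),
    }

def is_probable_case(rec, min_sources=2):
    """Illustrative rule: require evidence from at least `min_sources` sources."""
    return sum(t2d_evidence(rec).values()) >= min_sources

patient = PatientRecord(billing_codes=["E11.9"], hba1c_percent=[7.1])
print(is_probable_case(patient))
```

Requiring agreement between sources is one common way to trade sensitivity for accuracy, since any single source (a billing code alone, for example) can be wrong.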
Extracting phenotypes from electronic health records to support precision medicine
EHRs are primarily designed for clinical care, not research
EHR data are highly complex
Clinical data are often fragmented across healthcare systems
Many data in clinical records are not computable
Accuracy is one issue limiting the repurposing of EHRs for research
Wei and Denny, Genome Medicine (2015)
Extracting phenotypes from electronic health records to support precision medicine
Identifying populations with high accuracy takes domain knowledge
Leveraging EHRs for phenotyping involves collaboration across disciplines (e.g., domain experts work with clinical informaticians)
Validation is an important part of the process
An algorithm may be revised and validated iteratively until its performance achieves a desired goal
Wei and Denny, Genome Medicine (2015)
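The validation step can be sketched as a comparison between an algorithm's case labels and labels from manual chart review. The example below computes positive predictive value (PPV) and sensitivity and checks them against a target. The labels, the function name, and the 0.95 target are hypothetical; the slides do not specify which metrics or thresholds are used.

```python
def validation_metrics(predicted, reviewed):
    """PPV and sensitivity of algorithm labels against chart-review labels."""
    tp = sum(p and r for p, r in zip(predicted, reviewed))
    fp = sum(p and not r for p, r in zip(predicted, reviewed))
    fn = sum(r and not p for p, r in zip(predicted, reviewed))
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity

# Hypothetical labels for six patients (True = case).
algorithm_labels = [True, True, True, False, False, True]
chart_review_labels = [True, True, False, True, False, True]

ppv, sensitivity = validation_metrics(algorithm_labels, chart_review_labels)
target_ppv = 0.95  # hypothetical performance goal
needs_revision = ppv < target_ppv

print(f"PPV = {ppv:.2f}, sensitivity = {sensitivity:.2f}, revise: {needs_revision}")
```

In an iterative workflow, a result below the target would send the algorithm back for revision, followed by another round of review on a fresh sample of records.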
eMERGE network
A consortium funded by the National Human Genome Research Institute (NHGRI)
Initially included five medical research biobanks in 2007
To develop methods and best practices for utilization of EHRs for genetic research
The ‘best practice’ is an iterative paradigm of algorithm design followed by physician review of cases and controls
An iterative process to enable precision medicine
[Figure: cycle linking biomedical research, biobank, electronic health record, and knowledge network. Modified from Galli, S. J Allergy Clin Immunol 2016]
Figure labels: Biomedical research; Biobank; Electronic Health Record; Knowledge network; Validated; Health; Disease; Monitoring; Risk Prediction; Diagnosis; Prevention; Treatment; Health Outcomes