The Impact of Biobanks on the Healthcare Industry

(1)


劉扶東

Fu-Tong Liu, MD, PhD

中央研究院

Academia Sinica, Taiwan

Symposium on "Industry Trends and Business Opportunities in Biomedical Technology and Smart Healthcare"

September 18, 2019


(2)

Biobank

• A biorepository that stores biological samples for use in research

• An important resource supporting contemporary medical research, such as genomics and personalized medicine

• Gives researchers access to data representing a large number of people

• Instrumental in identifying disease biomarkers and promoting drug discovery and development

(3)

Biobank

Time magazine, 2009

(4)

Types of Biobank

[Diagram] Types of biobank: population biobanks, epidemiology-driven biobanks, and disease-oriented biobanks (tumor banks).

• Sources: individuals from the population; patients

• Data and materials: lifestyle questionnaires, biological samples, clinical information, clinical samples

• Research outputs: determinants of individual susceptibility; determinants of exposure/lifestyle and biologic effect; determinants of disease; disease/tumor biomarkers; new drug targets

• End goal: implementation of personalized/precision medicine

Modified from Paskal, et al., Path & Onco Res, 24: 771 (2018)

(5)

UK Biobank (1)

• Established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government, and the Northwest Regional Development Agency

• Supported by the National Health Service (NHS) (initial funding of ~£62 million)

• 500,000 volunteer participants, aged 40-69 years (recruited in 2006-2010)

(6)

UK Biobank (2)

• Data/information

– Blood biochemistry, all participants

– Questionnaires

– Images (brain, heart, abdomen, bones & carotid artery) of 100,000 participants

– Genotyping, all 500,000

– A 24-hour activity monitor for a week, 100,000

• Linked to electronic health records

• Provides health information to approved researchers in the UK and overseas, from academia and industry

(7)

UK Biobank (3)

• Starting in 2017, researchers can access the database

• The first descriptions of the full cohort, including genome-wide genetic data

– HLA alleles show the expected associations with disease (C. Bycroft et al. Nature 562, 203–209; 2018)

• Brain imaging of 10,000 individuals: genetic influences on brain structure and function, and correlations with neurodegenerative, psychiatric and personality traits (L. T. Elliott et al. Nature 562, 210–216; 2018)

(8)

All of Us (US) (1)

• Created in 2015; launched in 2018

• Run by the National Institutes of Health (NIH)

• Funding: FY2016 - $130 million; FY2017 - $230 million; FY2018 - $290 million

• Aims to collect genetic and health data from one million volunteers

• More than 100 partners (Mayo Clinic, Vanderbilt, Broad Institute, many medical centers, and Google's life sciences startup Verily Life Sciences)

(9)

All of Us (US) (2)

• Data/information

– Blood and urine samples and physical measurements

– Wearable devices (Fitbit)

– Survey on lifestyle and environment

– Participants give researchers access to their electronic health records (EHR)

– Genomic data captured from select participants

• As of May 2019, enrollment numbers: 187,000+ participants

(10)

BioVU (Vanderbilt University)

• De-identified DNA samples (from leftover plasma donated by patients during their clinic visits)

• More than 250,000 DNA samples collected over the last decade

• Linked to electronic health information

• Vanderbilt University Medical Center (VUMC) has formed Nashville Biosciences

• Pharmaceutical companies can gain access to BioVU to accelerate new drug discovery and development

(11)

Taiwan Biobank

A National Project – To establish an infrastructure for personalized medicine and prevention

[Diagram] Timeline from 2012 to 2024. Recruitment routes: recruitment promotion by TWB staff; referral by doctors/medical center biobanks.

(12)

A. Recruiting Participants, Follow-up, Collecting Information/Specimens

B. Banking/Upgrading Information/Specimens

C. Profiling Genomic, Epigenomic, Metabolomic, and Environmental Features

D. Making Information/Specimens Publicly Available

－General population

(13)

－General population

General population (30-70 years old): 200,000

As of 2019/4/30: recruitment 112,326; follow-up 22,500

Follow-up: success rate >75%

(14)

－Disease patients

Multi-center, single standardized case recruitment

Patients: 100,000

12 disease cases: breast CA, lung CA, colon CA, liver CA, gastric CA, head/neck CA, CVD, stroke, DM, Alzheimer’s disease, chronic kidney disease, asthma, endometriosis

(15)



Whole genome sequencing

- 2,021 genomes at the end of 2018

• Reference for imputation

• Detection of rare/novel variants

• Exploration of clinical relevance

(16)

Whole genome genotyping: TWB1.0 / TWB2.0

TWB1.0 - ~650,000 SNPs based on genetic studies of the HapMap Project, the 1000 Genomes Project, the CHB array plate, and pharmacogenomic markers (Axiom Biobank Genotyping Arrays)

TWB2.0 - ~750,000 SNPs covering specific SNPs associated with diseases, drug metabolism, and drug response in the Taiwan population

The overlap between TWB1.0 and TWB2.0 is ~100,000 SNPs

(17)

Banking information/specimens

－Biologic specimens

~2,680,000 tubes of different kinds of specimens

*TWB has been collecting tumor tissues since Feb 2016

(18)

(19)

-- Data release since September 2014

Data/Specimens               2014     2015     2016     2017     2018  2019/02      Total
Questionnaires             16,471  400,256  473,487  538,762  776,085   90,355  2,295,416
Physical data               3,392   67,027  123,791  150,805  209,076   17,280    571,371
Blood and urine test        3,492   59,439   80,533  103,416  326,266   16,087    589,233
DNA sample (g)              3,584   21,880   24,096   25,518   21,264    8,490    104,832
Plasma (tube)               1,792    2,124    1,000    6,184    6,724      331     18,155
Urine (tube)                    0    1,900      600    5,194    4,480        0     12,174
Whole genome genotyping     3,392   76,809  207,437  184,466  337,609   51,100    860,813
Whole genome sequencing         0    1,229   17,958   22,199   31,487    4,905     77,778
DNA methylation                 0        0      550    6,855   15,853    2,942     26,200
Plasma metabolome               0        0        0      400    3,628      409      4,437
HLA genotyping                  0        0        0        0   18,210    2,120     20,330
Sum                                                                             4,386,720

(20)

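Research benefits

An example: whole genome sequencing

Item                      Individuals completed   Digital data released
Whole genome sequencing   2,021                   77,778 person-times

TWB cost: ~NT$120 million

User cost: ~NT$4.6 billion

~38-fold benefit (user cost divided by TWB cost); the more the data are used, the greater the benefit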
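The ~38-fold figure appears to be the ratio of the user cost to the TWB cost. A minimal sketch of that arithmetic follows, assuming the rounded values on the slide (1 億 = NT$100 million); the variable names are illustrative only and the reading of "user cost" is an interpretation, not something the slide states.

```python
# Approximate figures as shown on the slide, in NT$.
twb_cost = 1.2e8    # ~NT$120 million: TWB cost of whole genome sequencing
user_cost = 46e8    # ~NT$4.6 billion: cost attributed to users on the slide

# Benefit expressed as user cost per NT$ spent by TWB.
benefit_ratio = user_cost / twb_cost
print(f"benefit ratio: ~{benefit_ratio:.0f}-fold")

# Assumption: the ratio tracks how often each sequenced genome is reused.
genomes_sequenced = 2_021
data_releases = 77_778
reuse_per_genome = data_releases / genomes_sequenced
print(f"releases per sequenced genome: ~{reuse_per_genome:.1f}")
```

Both numbers come out close to 38, which is consistent with the claim that the benefit grows as the released data are used more often.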

(21)

Taiwan Cancer Precision Medicine Program

Lung Cancer

Breast Cancer Colorectal

Cancer

Research and Clinical Teams, Taiwan NCI, USA

Proteomics

Genomics

Microbiota

Next-generation cancer

therapeutics Academia NBRP

Sinica

Medical Centers

Clinical Samples

Taiwan Cancer Knowledge base

Omics big data Tissue banks

Integration by AI

(22)

Biobank and drug discovery

• Identify associations between genetic variants and disease phenotypes to prioritize drug targets based on genetic evidence

• Investigate the effect of genetic variants in genes encoding drug targets to predict possible drug effects
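At its simplest, an association between a genetic variant and a disease phenotype can be checked with a case-control comparison of allele counts. The sketch below is one such test (odds ratio plus a Pearson chi-square on a 2x2 table); it is not the method of any particular biobank study, and the allele counts are hypothetical values invented for illustration.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi_square) for a 2x2 table of allele counts."""
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    # Pearson chi-square for a 2x2 table (1 degree of freedom, no continuity correction).
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    return odds_ratio, numerator / denominator

# Hypothetical allele counts (not taken from any biobank).
odds_ratio, chi2 = allelic_association(case_alt=300, case_ref=700,
                                       ctrl_alt=200, ctrl_ref=800)

# Upper-tail p-value of a chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"OR={odds_ratio:.2f}, chi2={chi2:.2f}, p={p_value:.2e}")
```

Real genome-wide analyses additionally adjust for covariates and population structure and apply a multiple-testing threshold, which this sketch omits.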

(23)

“New genetic data can lead to new medicines and improved outcomes” (March 19, 2019)

• Regeneron and U.K. Biobank completed exome sequencing of 50,000 U.K. Biobank participants

• Data are available to health researchers to aid in therapeutic discovery and enhance the understanding of human biology

• Regeneron

– has advanced multiple new targets and development programs

– is leading a consortium of biopharma companies to complete exome sequencing of all U.K. Biobank participants by 2020

– predicts a great deal of actionable information will be generated

(24)

Extracting phenotypes from electronic health records to support precision medicine

• Accurate phenotyping requires extracting information from billing codes, prescriptions, laboratory tests and clinical notes

Wei and Denny, Genome Medicine (2015)
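A rule-based phenotype algorithm that combines the four EHR sources named above can be sketched as follows. This is an illustration under stated assumptions, not a validated or published algorithm: the `PatientRecord` structure, the type 2 diabetes rules, and the two-source cutoff are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    billing_codes: list = field(default_factory=list)
    prescriptions: list = field(default_factory=list)
    lab_results: dict = field(default_factory=dict)
    clinical_notes: str = ""

def evidence_count(rec):
    """Count how many of the four EHR sources support a type 2 diabetes phenotype."""
    has_code = any(code.startswith("E11") for code in rec.billing_codes)  # ICD-10 E11.x
    has_drug = any(drug.lower() == "metformin" for drug in rec.prescriptions)
    has_lab = rec.lab_results.get("HbA1c", 0.0) >= 6.5  # HbA1c in percent
    # Naive keyword match: ignores negation ("no type 2 diabetes"), a known pitfall.
    has_note = "type 2 diabetes" in rec.clinical_notes.lower()
    return sum([has_code, has_drug, has_lab, has_note])

def is_case(rec, min_sources=2):
    """Call a patient a case when at least `min_sources` sources agree (assumed cutoff)."""
    return evidence_count(rec) >= min_sources

patient_a = PatientRecord(["E11.9"], ["Metformin"], {"HbA1c": 7.2},
                          "History of type 2 diabetes.")
patient_b = PatientRecord([], ["metformin"], {"HbA1c": 5.4}, "Seen for PCOS.")
print(is_case(patient_a), is_case(patient_b))
```

Requiring agreement between sources is one way to trade sensitivity for accuracy; a single source, such as a metformin prescription alone, is often not specific enough.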

(25)

Extracting phenotypes from electronic health records to support precision medicine

• EHRs are primarily designed for clinical care, not research

• EHR data are highly complex

• Clinical data are often fragmented across healthcare systems

• Many data in clinical records are not computable

• An issue limiting repurposing EHRs for research is accuracy

(26)

Extracting phenotypes from electronic health records to support precision medicine

• To identify populations with high accuracy takes domain knowledge

• Leveraging EHRs for phenotyping involves collaboration across disciplines (e.g., domain experts work with clinical informaticians)

• Validation is an important part of the process

• An algorithm may be revised and validated iteratively until its performance achieves a desired goal
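The revise-and-validate loop can be sketched as tightening a phenotype rule until its positive predictive value (PPV) against reviewed charts reaches a target. The scores, review labels, and target below are hypothetical, and PPV is only one of several metrics a real validation would report.

```python
def positive_predictive_value(predicted, reviewed):
    """Fraction of algorithm-called cases that the chart review confirmed."""
    true_pos = sum(p and r for p, r in zip(predicted, reviewed))
    false_pos = sum(p and not r for p, r in zip(predicted, reviewed))
    called = true_pos + false_pos
    return true_pos / called if called else 0.0

def tune_threshold(scores, reviewed, target_ppv, thresholds):
    """Try stricter and stricter thresholds until the target PPV is met."""
    for threshold in sorted(thresholds):
        predicted = [score >= threshold for score in scores]
        ppv = positive_predictive_value(predicted, reviewed)
        if ppv >= target_ppv:
            return threshold, ppv
    return None  # no candidate threshold reached the target

# Hypothetical evidence scores and physician review results for eight patients.
scores = [4, 3, 3, 2, 2, 1, 1, 0]
reviewed = [True, True, True, True, False, False, False, False]

result = tune_threshold(scores, reviewed, target_ppv=0.95, thresholds=[1, 2, 3])
print(result)
```

With these toy values the loop rejects thresholds 1 and 2 and settles on 3, at the cost of missing one true case, which illustrates why sensitivity should be tracked alongside PPV.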

(27)

eMERGE network

• A consortium funded by the National Human Genome Research Institute (NHGRI)

• Initially included five medical research biobanks in 2007

• Aims to develop methods and best practices for utilization of EHRs for genetic research

• The ‘best practice’ is an iterative paradigm of algorithm design followed by physician review of cases and controls

(28)

An iterative process to enable precision medicine

[Diagram] Elements shown: biobank; electronic health record; biomedical research; knowledge network; risk prediction; diagnosis; prevention; treatment; health/disease monitoring; health outcomes. The label "Validated" appears twice in the diagram.

Modified from Galli, S. J Allergy Clin Immunol 2016