Source of data/ethical concerns
Wei-Chu Chie
Steps of a research study
• Conceiving the research question
• Choosing the study subjects
• Planning the measurement
• Conducting the study
• Analyzing the data
• Writing the report
A good research question
• Feasible
• Interesting
• Novel
• Ethical
• Relevant
Data: subjects and measurement
– Subjects
• target population (research question)
• accessible population (study plan)
• intended sample (study plan)
• actual subjects (actual study)
– Measurement
• phenomena of interest (research question)
• intended variables (study plan)
Research validity
– Internal validity
• how well the findings of the study reflect the truth in the study (study plan)
– actual subjects to intended sample
– actual measurement to intended variables
– External validity
• how well the truth in the study reflects the truth in the universe (research question)
– intended sample to accessible and to target population
Source of data
• Primary
– original data collected by the researchers themselves
• Secondary
– use existing data
• Tertiary
Primary data: subjects
– Specification: target to accessible population
• target population: well suited to the research question
• accessible population: representative of the target population and easy to study
– Sampling: accessible population to sample
• intended sample: representative of the accessible population and easy to study
– Question of external validity (generalizability)
• usually less strict in epidemiologic and clinical outcome research
Primary data: subjects
• Inclusion criteria: be specific
– specify the characteristics that define populations relevant to the research question and easy to study
• target population: demographic and clinical characteristics
• accessible population: geographic and temporal characteristics
Primary data: subjects
• Exclusion criteria: be parsimonious
– high likelihood of being lost to follow-up
– inability to provide good data
– ethical barriers
– refusal
Primary data: subjects
• Sampling
– probability sampling (often used in public health studies)
• simple random
• systematic
• probability proportional to size (PPS)
• stratified random
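Three of the probability sampling designs listed above can be sketched in a few lines. This is a minimal illustration, not a survey-grade implementation: the sampling frame of 100 subject IDs, the urban/rural split, and all function names are hypothetical. PPS sampling is left out because it needs a size measure for each unit.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Take every k-th unit after a random start within the first interval."""
    k = len(frame) // n              # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)         # random start in 0..k-1
    return [frame[start + i * k] for i in range(n)]

def stratified_random_sample(frame, stratum_of, fraction, seed=0):
    """Draw the same sampling fraction at random within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def area(uid):
    """Hypothetical stratum: the first 40 IDs are urban, the rest rural."""
    return "urban" if uid < 40 else "rural"

# Hypothetical sampling frame: the accessible population as 100 subject IDs.
frame = list(range(100))

srs = simple_random_sample(frame, 10)
sys_sample = systematic_sample(frame, 10)
strat = stratified_random_sample(frame, area, 0.1)
```

With a 10% fraction, the stratified sample keeps the 40:60 urban to rural ratio of the frame, which a simple random sample of 10 only achieves on average.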
Primary data: subjects
• Sampling
– non-probability sampling
• consecutive (most often used in clinical studies)
• convenience
• judgmental
• Actual subjects
– non-response ‘bias’ and its prevention
– systematic error/internal validity
Primary data: measurement
• Measurement scale
– categorical (nominal) including binary
– ordinal or rank
– interval or continuous
Primary data: measurement
• Precision
– free of random error
• Accuracy
– free of systematic error
• Validity of the instrument vs. internal
validity of the study
Primary data: measurement
• Precision: free of random error
– the degree to which a variable has
nearly the same value when measured
several times
– coefficient of variation (C.V.)
– reliability
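The coefficient of variation can be computed from repeated measurements of the same quantity. The sketch below assumes the usual definition (sample standard deviation divided by the mean, expressed as a percentage); the readings are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100

# Hypothetical repeated readings of one variable on the same subject.
readings = [98, 100, 102, 100, 100]
cv = coefficient_of_variation(readings)
print(f"C.V. = {cv:.2f}%")
```

A smaller C.V. means the repeated values cluster more tightly around their mean, that is, the measurement is more precise.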
Primary data: measurement
• Accuracy: free of systematic error
– the degree to which a variable actually represents what it is supposed to represent
– validity
– with gold standard
• sensitivity
• specificity
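When a gold standard is available, sensitivity and specificity follow directly from the 2×2 table of instrument result against gold standard. The sketch below uses hypothetical results for ten subjects; the function name is illustrative.

```python
def sensitivity_specificity(test_results, gold_standard):
    """Return (sensitivity, specificity) of a binary test against a gold standard."""
    tp = fn = tn = fp = 0
    for test, truth in zip(test_results, gold_standard):
        if truth and test:
            tp += 1      # diseased, test positive
        elif truth and not test:
            fn += 1      # diseased, test negative
        elif not truth and not test:
            tn += 1      # not diseased, test negative
        else:
            fp += 1      # not diseased, test positive
    sensitivity = tp / (tp + fn)   # proportion of true cases detected
    specificity = tn / (tn + fp)   # proportion of non-cases correctly ruled out
    return sensitivity, specificity

# Hypothetical data: 4 subjects with the condition, 6 without.
gold = [True] * 4 + [False] * 6
test = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(test, gold)
```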
Primary data: measurement
• Accuracy: free of systematic error
– without gold standard
• face validity & content validity
• criterion-related validity
– convergent validity
– divergent validity
Primary data: measurement
• Choice of proper instrument
– Status or time to event:
• binary/nominal
• registry/clinical or other records
• classified from interval scale
– Surrogate endpoint:
• nominal/ordinal/interval
Primary data: measurement
• Choice of proper instrument
– quality of life
– functional status
– satisfaction
– cost
• interval scale
• questionnaires
Primary data: measurement
• General rules of increasing precision
– standardization of methods
– training and certifying observers
– refining the instrument
– automating the instrument
– repeating the measurements
Primary data: measurement
• General rules of increasing accuracy
– the same as for precision, except repeating the measurements
– making unobtrusive measurements
– blinding
Primary data: measurement
• Questionnaire
– use existing vs. self-designed ones
– copyright and translation rights
– standardized translation procedure
• forward
• backward
Primary data: measurement
• Questionnaire design
– questions: open vs. closed-ended
– format/wording
• clarity
• simplicity
• neutrality
• specificity
Primary data: measurement
• Questionnaire design
– Scale
• summative (Likert)
• cumulative (Guttman)
– Draft and content or face validity examination
– Coding/precoding
– Pretest and revision, reliability and validity
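A summative (Likert) scale scores each respondent by adding up the item responses. The sketch below also computes Cronbach's alpha, one common internal-consistency reliability statistic; the slides do not name a specific statistic, so that choice is an assumption, as are the pretest data (five respondents, three items scored 1 to 5).

```python
from statistics import variance

def summative_scores(responses):
    """Total score per respondent: the sum of that respondent's item responses."""
    return [sum(person) for person in responses]

def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])                                  # number of items
    item_vars = [variance([p[i] for p in responses]) for i in range(k)]
    total_var = variance(summative_scores(responses))
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical pretest data: rows are respondents, columns are items.
pretest = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]

scores = summative_scores(pretest)
alpha = cronbach_alpha(pretest)
```

Negatively worded items would need to be reverse-coded before summing; that step is omitted here.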
Primary data: measurement
• Questionnaire use
– Administration
• interview: face-to-face, group interview, telephone
• self-administered: concurrent, mailed, internet
– Quality control
Primary data: ethical concerns
• Subjects
– four principles of medical ethics:
• autonomy
• beneficence
• non-maleficence
• justice
– informed consent: not limited to experimental studies
– IRB: institutional review board
Primary data: ethical concerns
• Subjects source
– institutions/physicians
• Cooperation/collaboration
– right and duty
– authorship
• Instruments
Secondary data: overview
• Strengths
– speed: especially easy in the e-era
– economy
• Weaknesses
– data quality uncertain
Secondary data: types
– Aggregate: group as unit
• vital statistics
• disease incidence/prevalence of geographic area
• economic, demographic, … data/census
– ecological correlation study/ecological fallacy
– Individual: individual as unit
• government statistics: mortality, cancer registry, ...
• hospital discharge data/health insurance data, ...
• previous studies of different purposes
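The ecological fallacy mentioned under aggregate data can be shown with a small constructed example: the correlation between area-level means need not match the correlation among individuals within each area. All values below are hypothetical and were chosen to make the contrast extreme.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical individual data per area: (exposure values, outcome values).
areas = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Aggregate (ecological) analysis: one mean exposure and one mean outcome per area.
ecological_r = pearson_r(
    [mean(x) for x, _ in areas.values()],
    [mean(y) for _, y in areas.values()],
)

# Individual-level analysis within each area.
within_r = {name: pearson_r(x, y) for name, (x, y) in areas.items()}
```

Here the area means are perfectly positively correlated, while within every area the individual-level correlation is perfectly negative, so an inference about individuals drawn from the aggregate result would be wrong.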
Secondary data: how to start
– Find databases to fit a research question
• choose a research question/literature review
• list predictors/outcome variables
• identify proper databases that might include the variables
• be familiar with the databases/consultation
• choose the best one/application
• formulate hypotheses and statistical methods
• data analysis
Secondary data: how to start
– Find research questions to fit existing data sets:
• the reverse of the usual research design
• choose (a) database(s)/application
• be familiar with the databases/make a flow sheet of variables/identify pairs/groups of variables of interest
• literature review/expert consultation
Secondary data: data linkage
• Making use of more than one database
• Key linkage variable:
– a variable that all the databases possess
– e.g. individual citizen's ID
– easy in electronic databases
• Enrich the 'utility' of data
– e.g. hospital discharge data + mortality registry → survival of a certain disease
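The discharge-plus-mortality example can be sketched as a join on the key linkage variable. The records, field names, IDs, and end-of-follow-up date below are all hypothetical, and real registries will have their own formats.

```python
from datetime import date

# Hypothetical hospital discharge records and mortality registry entries.
discharges = [
    {"id": "A001", "discharge_date": date(2001, 3, 1)},
    {"id": "A002", "discharge_date": date(2001, 5, 10)},
    {"id": "A003", "discharge_date": date(2001, 7, 20)},
]
deaths = [
    {"id": "A002", "death_date": date(2002, 5, 10)},
    {"id": "A999", "death_date": date(2001, 1, 1)},   # not in the discharge data
]

def link_survival(discharges, deaths, end_of_follow_up):
    """Link the two sources on 'id' and derive vital status and survival time."""
    death_by_id = {d["id"]: d["death_date"] for d in deaths}
    linked = []
    for rec in discharges:
        death_date = death_by_id.get(rec["id"])
        died = death_date is not None and death_date <= end_of_follow_up
        # Subjects with no death record are censored at the end of follow-up.
        last_seen = death_date if died else end_of_follow_up
        linked.append({
            "id": rec["id"],
            "died": died,
            "survival_days": (last_seen - rec["discharge_date"]).days,
        })
    return linked

linked = link_survival(discharges, deaths, date(2002, 12, 31))
```

Neither source alone contains survival time; it only exists after linkage, which is what enriching the utility of the data means here.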
Secondary data: publicly accessible secondary databases in Taiwan
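– aggregate data: national and local
• Ministry of the Interior population statistics for the Taiwan-Fukien area (內政部台閩地區人口統計)
• Department of Health health statistics (衛生署衛生統計)
• aggregate statistics in other government publications
– individual data:
• mortality (death certificate) registry: the death file (original ID)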
Secondary data: publicly accessible secondary databases in Taiwan
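– individual data: previous studies
• Academia Sinica Office of Survey Research academic research database (中研院調查研究工作室學術研究資料庫)
– Taiwan Social Change Survey (台灣地區社會變遷基本調查)
– National Chengchi University Election Study Center election surveys (政大選舉中心選舉調查)
– Nutrition and Health Survey in Taiwan (國民營養狀況變遷調查), etc.
• Individual-based government surveys
– fertility survey
– income and expenditure survey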
Secondary data: publicly inaccessible secondary databases in Taiwan
• NHI original data (ID not scrambled)
• Household registry data
• Health station-household data: PHIS
• Military service data
Secondary data: hospital-based data ready for clinical research
• Data kept by separate hospitals
– computerized
• hospital cancer registry
• hospital NHI claims data
• other computerized records
– not computerized
• written form on medical records
• special tests/examinations/studies
Secondary data: other countries
– US
• government-owned
• separate health insurance, managed care, HMOs
– Scandinavian countries
• large and detailed medical and national files with original ID, ready to link
Secondary data: ethical concerns and related limitations
– Protection of privacy/confidentiality
• personal electronic data protection law
• scrambled ID: NHI database
– no access to the original record/individual; unable to examine the validity of data or improve quality
– database linkage impossible: poverty of contents
• third party linkage/removal of ID before use: cancer registry
• researcher/investigator agreement: hospitals
• informed consent: certain studies/hospitals
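One way to picture ID scrambling and third-party linkage is a keyed hash. This is only an illustrative assumption, not the method actually used for the NHI database or the cancer registry; the IDs, keys, and function name are hypothetical.

```python
import hashlib
import hmac

def scramble_id(original_id: str, secret_key: bytes) -> str:
    """Replace an ID with a keyed hash; without the key it cannot be recomputed."""
    return hmac.new(secret_key, original_id.encode("utf-8"), hashlib.sha256).hexdigest()

# A trusted third party holds one key and scrambles both databases with it,
# so records still link on the scrambled ID after the original ID is removed.
third_party_key = b"hypothetical-key-held-by-third-party"
registry_pseudonym = scramble_id("ID-0001", third_party_key)
claims_pseudonym = scramble_id("ID-0001", third_party_key)
linkable = registry_pseudonym == claims_pseudonym

# Databases scrambled independently with different keys no longer share a
# linkage variable, which is why linkage becomes impossible.
other_key = b"hypothetical-key-of-another-database"
unlinkable = scramble_id("ID-0001", other_key) != registry_pseudonym
```

The same property that protects privacy also blocks researchers from returning to the original record to check data validity.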
Secondary data: ethical concerns and related limitations
– Ownership and intellectual property
• government and public databases
• individual researchers/investigators
– data donation
– data sharing
– authorship