
Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm


Academic year: 2022


(1)


Speakers: 樂正、張耿健、張馭荃、葉光哲

Authors: Ha Minh Lam, Oliver Ratmann, Maciej F. Boni

(2)

Outline

• Introduction
• The proposed algorithm
• Improvement
• Experimental results and discussion

(3)

Outline

• Introduction
• The proposed algorithm
• Improvement
• Experimental results and discussion

(4)

Recombination

Three nucleotide sequences: two parental sequences and one child sequence (the candidate recombinant).

(5)

Informative Sites

A site is informative when the nucleotide in the child sequence is identical to one parental sequence but different from the other.

Any other site is not informative.

(6)

Double-breakpoint Recombinant

The right-side is the most likely recombination breakpoint.

(7)

Graph Representation

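Each informative site becomes one step of ±1:

+1 +1 +1 -1 -1 -1 +1 +1 +1 +1 -1

• $m$ up-steps (+1); in this example $m = 7$

• $n$ down-steps (-1); in this example $n = 4$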

(8)

Maximum descent

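i:  0  1  2  3  4  5  6  7  8  9 10
A: +1 +1 +1 -1 -1 -1 +1 +1 +1 +1 -1
B: +1 +2 +3 +2 +1 +0 +1 +2 +3 +4 +3

Here A is the step sequence and B is its running sum. The walk starts at height 0 before the first step, and that starting point also counts as a possible $l$.

Maximum descent = $\max_{l < r} (B_l - B_r)$

In this case, Maximum descent = 3 at $l = 2$, $r = 5$.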
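The maximum descent of a ±1 sequence can be found in one pass by tracking the highest point reached so far. The sketch below is only an illustration of the definition on this slide; the function name `max_descent` is hypothetical, and it assumes the walk starts at height 0 before the first step.

```python
from typing import Sequence


def max_descent(steps: Sequence[int]) -> int:
    """Largest drop from an earlier height to a later height of a +1/-1 walk.

    Assumption: the walk starts at height 0 before the first step, and a walk
    that never goes down has maximum descent 0.
    """
    height = 0  # current height of the walk
    peak = 0    # highest height seen so far (the origin included)
    best = 0    # largest peak-to-current drop seen so far
    for s in steps:
        if s not in (1, -1):
            raise ValueError("steps must be +1 or -1")
        height += s
        peak = max(peak, height)
        best = max(best, peak - height)
    return best


# The step sequence A from the slide.
example = [+1, +1, +1, -1, -1, -1, +1, +1, +1, +1, -1]
print(max_descent(example))  # 3
```

Keeping only the running peak is enough because the best drop ending at any position always starts from the highest earlier point.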


(9)

Statistical Tests

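Given three sequences (two parents and one child):

Null hypothesis: the child is not a recombinant of the two parents.

Under the null hypothesis, the informative sites form a random permutation of the up-steps and down-steps.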

(10)

Random Permutation

Maximum descent = 3

(11)

Statistical Tests

Suppose that the maximum descent of the observed sequence is $k$.

The p-value is the probability that the maximum descent for a random arrangement of the same steps is at least $k$.
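This probability can be approximated by shuffling the observed steps many times. The sketch below is a Monte Carlo stand-in, not the exact computation discussed in the rest of the talk; the names `max_descent` and `permutation_p_value`, the number of trials and the seed are all assumptions made for illustration.

```python
import random
from typing import Sequence


def max_descent(steps: Sequence[int]) -> int:
    """Largest drop of the +1/-1 walk, assuming it starts at height 0."""
    height = peak = best = 0
    for s in steps:
        height += s
        peak = max(peak, height)
        best = max(best, peak - height)
    return best


def permutation_p_value(steps: Sequence[int], trials: int = 20000, seed: int = 0) -> float:
    """Estimate P(max descent of a random arrangement >= observed max descent)."""
    rng = random.Random(seed)      # fixed seed so the estimate is reproducible
    observed = max_descent(steps)
    shuffled = list(steps)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)      # uniform random arrangement of the same steps
        if max_descent(shuffled) >= observed:
            hits += 1
    return hits / trials


print(permutation_p_value([+1, +1, +1, -1, -1, -1, +1, +1, +1, +1, -1]))
```

A sampled estimate cannot resolve very small p-values, which is one reason an exact computation is attractive when many triplets are tested.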


(12)

Outline

• Introduction
• The proposed algorithm
• Improvement
• Experimental results and discussion

(13)

Some Notations

$x_{m,n,k}$: the probability that a random arrangement of $m$ up-steps and $n$ down-steps has maximum descent exactly $k$.

Example: $m = 8$, $n = 5$, $k = 4$

(14)

Example

For $m = 1$, $n = 2$ there are 3 permutations.

Md = 2, Md = 1, Md = 2

$x_{1,2,0} = 0$, $x_{1,2,1} = \frac{1}{3}$, $x_{1,2,2} = \frac{2}{3}$, $x_{1,2,3} = 0$
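For small cases like this one, the distribution of the maximum descent can be checked by listing every arrangement of the up-steps and down-steps. The sketch below does exactly that with exact fractions; the function name `max_descent_distribution` is hypothetical, and the walk is assumed to start at height 0. It is a brute-force check, not the dynamic program of the talk.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def max_descent_distribution(m: int, n: int) -> dict:
    """Return {k: probability that the maximum descent is exactly k}
    over all arrangements of m up-steps and n down-steps."""
    counts = Counter()
    total = 0
    # Choosing which positions hold the down-steps fixes the arrangement.
    for down_positions in combinations(range(m + n), n):
        downs = set(down_positions)
        height = peak = best = 0
        for i in range(m + n):
            height += -1 if i in downs else 1
            peak = max(peak, height)
            best = max(best, peak - height)
        counts[best] += 1
        total += 1
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


# One up-step and two down-steps: maximum descent 1 with probability 1/3,
# maximum descent 2 with probability 2/3.
print(max_descent_distribution(1, 2))
```

The number of arrangements grows as a binomial coefficient, so this is only usable for tiny inputs.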


(15)

Some Notations

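The p-value: the probability that a random arrangement of $m$ up-steps and $n$ down-steps has maximum descent at least $k$.

It can be computed from $x$ by summing the tail: $\sum_{i \ge k} x_{m,n,i}$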

(16)

Some Notations

$y_{m,n,k,j}$: the probability that a random arrangement of $m$ up-steps and $n$ down-steps has maximum descent exactly $k$ and a minimum value exactly $j$ units below the origin.

Example: $m = 8$, $n = 5$, $k = 4$, $j = 1$

(17)

Some Notations

$x_{m,n,k}$ can be computed from $y_{m,n,k,j}$ by summing over $j$,

so we only need to concentrate on how to compute $y_{m,n,k,j}$.

(18)

Dynamic Programming

Each transition between states takes O(…) time.

The equations of the recurrence are labelled (1), (2), (3) and (4).

(19)

Dynamic Programming

Each transition between states takes O(…) time.

• O(…) states

Time complexity: O(…). Space complexity: O(…).
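A four-index dynamic program of this kind can be sketched by conditioning on the first step of the walk. The recurrence below was derived directly from the definitions (maximum descent exactly k, minimum exactly j units below the origin, walk starting at height 0); it is an assumption that it matches the slide's equations (1) to (4), whose exact form is not recoverable here. The names `y` and `x` are chosen only to mirror the notation.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def y(m: int, n: int, k: int, j: int) -> Fraction:
    """P(max descent == k and minimum == j below the origin) for a random
    arrangement of m up-steps and n down-steps."""
    if min(m, n, k, j) < 0 or j > k:
        return Fraction(0)           # impossible state
    if m == 0 and n == 0:
        return Fraction(1) if k == 0 and j == 0 else Fraction(0)
    total = m + n
    # First step up: the rest starts one unit higher, so its own minimum
    # must be one unit deeper (or it never returns below the origin).
    if j == 0:
        up = y(m - 1, n, k, 0) + y(m - 1, n, k, 1)
    else:
        up = y(m - 1, n, k, j + 1)
    # First step down: the rest starts one unit lower.
    if j == 0:
        down = Fraction(0)           # a down-step first forces j >= 1
    elif j < k:
        down = y(m, n - 1, k, j - 1)
    else:
        # j == k: the drop from the origin to the minimum is itself the
        # maximum descent, so the rest may have descent k or k - 1.
        down = y(m, n - 1, k, k - 1) + y(m, n - 1, k - 1, k - 1)
    return Fraction(m, total) * up + Fraction(n, total) * down


def x(m: int, n: int, k: int) -> Fraction:
    """P(max descent == k), obtained by summing y over j = 0..k."""
    return sum((y(m, n, k, j) for j in range(k + 1)), Fraction(0))


print(x(1, 2, 1), x(1, 2, 2))  # 1/3 2/3
```

Each state looks at a constant number of earlier states, so the cost is driven by the number of (m, n, k, j) combinations that have to be filled in.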

(20)

Outline

• Introduction
• The proposed algorithm
• Improvement
• Experimental results and discussion

(21)

Rewritten

$$x_{m,n,k} = \sum_{j=0}^{k} y_{m,n,k,j} = y_{m,n,k,0} + \sum_{j=1}^{k-1} y_{m,n,k,j} + y_{m,n,k,k}$$

The terms are then expanded by equations (1), (4) and (3).

(22)

Recursive Relation

$x_{m,n,k} = \dots$

This equation can be seen as a recursive function of $x$. We only need to know the $y_{m,n,k,j}$ for which $j = k$.
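One way to see why only the boundary terms survive is to write a first-step recurrence and sum it over $j$. The derivation below is a sketch worked out from the definitions ($x_{m,n,k}$: maximum descent exactly $k$; $y_{m,n,k,j}$: additionally a minimum exactly $j$ units below the origin; $m$ up-steps, $n$ down-steps, $k > 0$); it is an assumption that it agrees in form with the equation shown on the slide.

```latex
% First-step recurrence (derived from the definitions, not copied from the slide)
\begin{align*}
y_{m,n,k,0} &= \frac{m}{m+n}\left(y_{m-1,n,k,0} + y_{m-1,n,k,1}\right) \\
y_{m,n,k,j} &= \frac{m}{m+n}\, y_{m-1,n,k,j+1} + \frac{n}{m+n}\, y_{m,n-1,k,j-1}
  \qquad (0 < j < k) \\
y_{m,n,k,k} &= \frac{n}{m+n}\left(y_{m,n-1,k,k-1} + y_{m,n-1,k-1,k-1}\right)
\end{align*}

% Summing over j = 0, ..., k: the up-step terms add up to x_{m-1,n,k};
% the down-step terms add up to x_{m,n-1,k} except for two boundary terms.
\begin{equation*}
x_{m,n,k} = \frac{m}{m+n}\, x_{m-1,n,k}
  + \frac{n}{m+n}\left(x_{m,n-1,k} - y_{m,n-1,k,k} + y_{m,n-1,k-1,k-1}\right)
\end{equation*}
```

As a check against the earlier example, $x_{1,2,2} = \frac{1}{3}\cdot 1 + \frac{2}{3}\left(0 - 0 + \frac{1}{2}\right) = \frac{2}{3}$, and both $y$ terms on the right have their last two indices equal.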


(23)

Improvement

Building the DP table for $x_{m,n,k}$:

• Original method

1. Time complexity: O(…)

2. Space complexity: O(…)

• Improved method

1. Time complexity: O(…)

2. Space complexity: O(…)

For $x_{m,n,k}$ we only need to memorize the $y_{m,n,k,j}$ with $j = k$.

(24)

Complexity Analysis

For computing $x_{m,n,k}$:

• Each transition between states takes O(…) time

• O(…) states

Time complexity: O(…). Space complexity: O(…).

(25)

Complexity Analysis

Overall complexity of the computation:

Time complexity: O(…). Space complexity: O(…).

(26)

Outline

• Introduction
• The proposed algorithm
• Improvement
• Experimental results and discussion

(27)

New Applications

• The 3SEQ maximum descent statistic describes clustering patterns in sequences of binary outcomes, and is therefore not confined to recombination analysis.

(28)

New Applications (1) : Seasonality

• A particular population behavior or climatic characteristic can be noted to occur or not occur every day.

  • E.g., rain or no rain

• An ordered sequence of the days in the year will show if the occurrence of one of the behaviors is clustered and thus if this feature was seasonal in that one year.
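To make the seasonality idea concrete, the sketch below encodes each day as a step and reports where the steepest drop of the walk sits, which is the stretch of days where the event is most concentrated. The encoding (event day = down-step), the function name `max_descent_window` and the 20-day `rain` record are all hypothetical choices for illustration, not data or conventions from the talk.

```python
from typing import Sequence, Tuple


def max_descent_window(occurred: Sequence[bool]) -> Tuple[int, int, int]:
    """Return (descent, start, end); days start..end-1 hold the steepest drop.

    Assumed encoding: a day with the event is a down-step (-1), a day
    without it is an up-step (+1); the walk starts at height 0.
    """
    height = 0
    peak, peak_day = 0, 0  # highest height so far and the first day after it
    best = (0, 0, 0)
    for day, flag in enumerate(occurred):
        height += -1 if flag else 1
        if height > peak:
            peak, peak_day = height, day + 1
        if peak - height > best[0]:
            best = (peak - height, peak_day, day + 1)
    return best


# Hypothetical record: a wet spell on days 5..10 and one isolated wet day.
rain = [False] * 5 + [True] * 6 + [False] * 4 + [True] + [False] * 4
print(max_descent_window(rain))  # (6, 5, 11)
```

Whether a drop of this size is unusual would still need the p-value for the same numbers of up-steps and down-steps.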

(29)

New Applications (2) : Disease Severity

• When a process is expected to behave at an intermediate range, or when an observation is expected to be made at intermediate values only, this pattern can be tested.

  • E.g., dengue virus does not cause severe disease equally at all ages.

    • First infection (childhood) is typically nonsevere; secondary infections (older children or teenagers) have a higher chance of severity.

(30)

New Applications (2) : Disease Severity

• Disease severity in a surveillance system should be seen in the intermediate age ranges.

• This can be tested by checking whether each age band is overrepresented or underrepresented in the pool of hospital patients experiencing dengue-like severe disease.

(31)

Discussion

(32)

Discussion

• In general, when recombinants are identified by a mosaicism statistic like the one used by 3SEQ, a phylogenetic analysis should be performed to ensure that the recombination signal is preserved when the entire evolutionary history of the sample is taken into account.
