Establishing Case-Based Reasoning for the Diagnosis and Treatment of Gastric Cancer Using Document and Image Retrieval

Academic year: 2021


Zhan Huang Qian a detached a Sun Lei Ming Yang Liu ab a

a Taipei Medical University, b Institute of Medical Informatics, Taipei Medical University Hospital

E-mail: jhcxxx@yahoo.com.tw

Summary

With the rapid development of information technology, it has become a key tool in medical care and biomedical research. Using digital medical information to help clinicians solve patients' problems is therefore an important issue.

Case-based reasoning (CBR) solves newly encountered problems by drawing on previous experience. This closely resembles how people solve problems by imitating their predecessors, and it can be applied in a wide range of areas. Its advantage is that it avoids the effort of complex rule-based reasoning (RBR) and can therefore produce results quickly. Most current CBR systems work only on documents. In the medical field, however, image data provide far more value than in other areas of expertise, and combining documents and images would make the overall diagnosis more effective. This paper combines document retrieval with image retrieval to enhance the capability of a case-based reasoning system, and applies the system to the diagnosis of gastric cancer.

Keywords: case-based reasoning, document retrieval, image retrieval, gastric cancer

Preface

In recent years, medical technology and bioinformatics have made significant breakthroughs, yet cancer remains the greatest threat to human health and life. According to the Department of Health's cancer statistics report, in ROC year 91 (2002) there were 56,323 new cancer cases nationwide, of which 2,446 were cancer deaths. Although gastric cancer has fallen in the incidence ranking, it still ranks fifth among the ten leading causes of cancer death.

Early gastric cancer has a very high cure rate, with a five-year survival rate of 95%. Once gastric cancer is found at a late stage, however, the cure rate is almost zero. With early diagnosis and treatment, the outcome is therefore almost no different from that of healthy people. How to establish a mechanism for the early diagnosis of gastric cancer is indeed an important topic.

In artificial intelligence, case-based reasoning is particularly effective when the problem domain lacks a clear, concise knowledge representation, or when cases are complex, hard to decompose, tied to experience, and recurring.

Most current case-based reasoning systems work only on documents. In the medical field, however, image data also provide important information. In diagnosing gastric cancer, besides assessing clinical symptoms externally, the interpretation of the patient's gastroscopy images is also of considerable importance. This research project therefore uses medical record documents as indexes and also tries to incorporate gastroscopy image data into the index, and then uses substitutional or transformational case adaptation to solve the case adaptation problem in case-based reasoning. Case retrieval and case similarity calculation use the built-in functions of the CBR tool. The similarity results between the case base and the training set, for both documents and images, are then used to derive common index weights and to find the attribute weights that best represent the cases.

1. Case Based Reasoning

CBR was proposed by Schank & Abelson in 1977 as a new set of theories and research methods branching out from artificial intelligence. It is a methodology that handles the problem at hand through inference from prior experience, where that experience is stored as cases in a case database (1). It operates mainly by imitating human reasoning: when a problem is encountered, the most similar case is found from previous experience, and its contents are modified to solve the current problem (2).

Figure: CBR architecture diagram

2. Document Retrieval

The objects of document retrieval are unstructured or semi-structured documents whose content is composed of words and phrases. In this study, document retrieval does not mean analyzing or integrating the information in a document in some special way. It means finding the rules implicit in documents that describe specific information, and applying those rules to the search for and analysis of related information.

To predict correlations among gastric symptoms, text mining is used to find the association rules implicit in documents that describe the symptoms of gastric cancer, and predictions are then made from those association rules (3).
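As a rough illustration of what an association rule over symptom terms looks like, the sketch below computes the support and confidence of one rule. The function name `rule_metrics`, the term sets, and the choice of support and confidence as the rule measures are assumptions for illustration; the paper does not say which measures its text mining tool uses.

```python
def rule_metrics(documents, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    documents: list of sets of terms found in each document (hypothetical input).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [d for d in documents if antecedent <= d]
    with_both = [d for d in with_antecedent if consequent <= d]
    support = len(with_both) / len(documents)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Made-up term sets, for illustration only.
documents = [
    {"epigastric pain", "weight loss", "gastric cancer"},
    {"epigastric pain", "gastric cancer"},
    {"epigastric pain"},
    {"weight loss", "gastric cancer"},
]
support, confidence = rule_metrics(documents, {"epigastric pain"}, {"gastric cancer"})
```

Here the rule holds in two of the four documents, and in two of the three documents that mention the antecedent term.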

3. Image Retrieval

Content-Based Image Retrieval (CBIR) is a form of query in which the image content itself is the object of the query. In general, people most often describe image content from three angles: color, texture, and shape. Unlike traditional text queries, CBIR lets users query by the characteristics of the image itself, not just by words.
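To make the idea of querying by image characteristics concrete, here is a minimal sketch that compares two images by color alone, using a quantized color histogram and histogram intersection. This is one common CBIR feature, chosen here as an assumption; it is not a description of the features the GIFT system actually extracts, and the pixel lists are made up.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels (0-255 per channel) into a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


# Tiny made-up "images" given as lists of RGB pixels.
reddish = [(200, 30, 30)] * 3 + [(250, 250, 250)]
similar = [(210, 20, 40)] * 3 + [(10, 10, 10)]
bluish = [(20, 20, 220)] * 4

sim_close = histogram_intersection(color_histogram(reddish), color_histogram(similar))
sim_far = histogram_intersection(color_histogram(reddish), color_histogram(bluish))
```

Texture and shape features would be added in the same way, each contributing its own similarity score.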

Materials and methods

Data pre-processing

1. Data Collection

The study collected data on patients who received cancer treatment at a medical center over the most recent five years. After a complete screening of medical records and endoscopy reports, there were 340 patients in the five-year period; excluding those who did not meet the conditions left 206 complete cases. Of these, 150 were randomly selected as the case base, and the other 56 were fed into the system to assess its accuracy.
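The random split of the 206 complete cases into a 150-case case base and 56 evaluation cases can be sketched as follows. The function name `split_cases`, the integer case identifiers, and the fixed seed are assumptions added for reproducibility; the paper does not describe how the random selection was carried out.

```python
import random


def split_cases(case_ids, n_case_base, seed=0):
    """Shuffle the case ids and split them into (case base, evaluation set)."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_case_base], shuffled[n_case_base:]


# 206 complete cases: 150 for the case base, the remaining 56 for evaluation.
case_base, test_set = split_cases(range(206), 150)
```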

For the selected cases, medical document files and medical image files were set up separately. A text mining tool and the GIFT CBIR system were used to create the document and image databases, which were then indexed and built into the case library by the CBR tool.

2. Document Retrieval

For the selection of case records, this study takes pre-operative diagnosis as the predictive model, so the case content and processing steps are as follows:

1. Chief complaint, admission note, present illness, family history, and past history.

2. Laboratory test data.

3. The gastric sections of several medical textbooks.

4. About two hundred medical references on stomach cancer from the past few years, included in document retrieval along with the content above.

5. A database of common medical vocabulary is established, drawing as far as possible on medical texts related to the digestive system and cancer, to increase the retrieval rate.

6. The vocabulary database is loaded into the text mining tool for retrieval. After a patient's file is entered, the frequency of the relevant words or phrases is calculated, and the resulting percentages serve as the basis for selecting indexes and assessing weights.

7. Indexes and weights (such as gender, blood type, and age) are also established with reference to books, periodicals, and other literature, and are loaded into the CBR tool to build a complete case base for comparison.
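Step 6 above, turning term frequencies into candidate index weights, can be sketched roughly as below. The records, the vocabulary, and the normalization of frequencies into weights that sum to 1 are all illustrative assumptions; the paper does not give the exact formula used by its text mining tool.

```python
from collections import Counter


def term_document_frequency(records, vocabulary):
    """Fraction of records in which each vocabulary term appears."""
    counts = Counter()
    for text in records:
        lowered = text.lower()
        for term in vocabulary:
            if term.lower() in lowered:
                counts[term] += 1
    return {term: counts[term] / len(records) for term in vocabulary}


def normalize(frequencies):
    """Scale frequencies so they sum to 1 and can serve as candidate weights."""
    total = sum(frequencies.values())
    return {term: value / total for term, value in frequencies.items()}


# Made-up record snippets and vocabulary, for illustration only.
records = [
    "Chief complaint: epigastric pain and weight loss",
    "Present illness: epigastric pain, tarry stool",
    "Family history of gastric cancer, weight loss",
    "Epigastric pain for two months",
]
vocabulary = ["epigastric pain", "weight loss", "tarry stool"]
freq = term_document_frequency(records, vocabulary)
weights = normalize(freq)
```

Terms that appear in a larger share of records receive a larger candidate weight; a real system would also need to filter out terms that are frequent but uninformative.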

3. Image Retrieval

This research project attempts to include gastroscopy image data in the index. For image selection, images with the lesion at the center were preferred, and images that were too small or in which the endoscope itself is visible were avoided as far as possible, so as not to affect retrieval accuracy.

The images were evaluated with an existing open source CBIR tool, which was used to build the image case base and calculate similarity. Once the image case base is established, new images are entered in turn, the similarity between each new image and the images in the case base is calculated, and the result is returned to the CBR tool. This provides an index for the CBR system, improves its accuracy, and increases the scalability of this study for future development.

Research Methods

1. Gastric cancer patient data collection and creation of a database

The gastric cancer patient data in this study were obtained directly from the hospital information system. The criteria for including a patient record were admission to the hospital, an upper gastrointestinal endoscopy examination, and a diagnosis confirmed by the pathology report after surgical resection. Gastric cancer patients were classified as follows: early gastric cancer is divided into types I, IIa, IIb, IIc, and III; advanced gastric cancer into types I, II, III, and IV; together with malignant lymphoma, this gives ten categories. The patient database was established with the text mining tool.

2. Using the document retrieval tool to build the patient text database index

In CBR, building the index is very important, because the choice of index directly affects the quality of the inference results, and a good index also has a great influence on case adaptation. A good index must have predictiveness, availability (usefulness), specificity (concreteness), and helpfulness (usefulness).

A specific-term vocabulary plays a very important role in the retrieval of medical documents. Because the system mainly deals with medical research literature, specific biomedical terms, gene names, and the like often appear in the documents. Background knowledge is therefore needed to support the subsequent analysis. The specific-term thesaurus is divided into six sections: disease names, clinical symptoms and signs, past medical history, related words, meaningless keywords, and antonym keywords.

3. Using the image retrieval tool to build the patient gastroscopy image database index

The main purpose of an index is to improve execution efficiency. When the efficiency of a general query system is analyzed, it is usually divided into two aspects: the efficiency of building the information (off-line efficiency) and the efficiency of executing user queries (on-line efficiency). Information building includes image segmentation and feature extraction. User queries either look for images with a certain feature (query by feature) or search for images similar to a specified image (query by example).

Indexing technology for image information queries needs to solve two major problems:

1. The dimension of the feature space is usually large. The feature space is the space formed by all the image features (image descriptions) represented in the database. In an image query system the feature space typically has hundreds of dimensions.

2. The distance measure in the feature space is usually not the Euclidean distance measure (the L2 metric). For most feature representations, the most appropriate distance measure is not the L2 metric.

To solve these two problems, the most common approach is first to apply dimension reduction to the image descriptions, and then to apply multidimensional indexing techniques that support non-Euclidean similarity measures.
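The two-step approach just described, reducing the dimension first and then comparing with a non-Euclidean measure, can be illustrated with a deliberately crude sketch. Averaging adjacent features stands in for a real dimension reduction method, and the L1 (city-block) distance stands in for whichever non-Euclidean measure suits the feature; both choices and the feature vectors are assumptions for illustration, not the technique used in the study.

```python
def reduce_dimension(vector, group_size):
    """Average each run of `group_size` adjacent features into one value."""
    groups = [vector[i:i + group_size] for i in range(0, len(vector), group_size)]
    return [sum(group) / len(group) for group in groups]


def l1_distance(a, b):
    """City-block (L1) distance, a simple non-Euclidean measure."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Made-up 8-dimensional feature vectors.
query = [0.0, 2.0, 4.0, 4.0, 1.0, 1.0, 0.0, 0.0]
stored = [1.0, 1.0, 4.0, 2.0, 1.0, 3.0, 0.0, 0.0]

q_small = reduce_dimension(query, 2)   # 8 features -> 4
s_small = reduce_dimension(stored, 2)
distance = l1_distance(q_small, s_small)
```

The reduced vectors are cheaper to index and compare, at the cost of losing some detail: the distance shrinks from 6.0 on the full vectors to 2.0 on the reduced ones.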

4. Using substitutional or transformational case adaptation for patient case-based reasoning

In CBR practice, the case base may be incomplete, or the solution space may be too large for every typical case to be included in the case base. When a problem that has not been encountered before arises, case adaptation is therefore needed so that the retrieved cases can effectively solve it.

Besides case retrieval, case adaptation is the other most important component of case-based reasoning, and it makes for a more comprehensive question-answering system. Some case-based reasoning systems are even divided into just two blocks, case retrieval and case adaptation, which can operate independently without interfering with each other. This is why some systems that do not mention case adaptation can still work.


In general, there are the following four types of adaptation methods:

Substitutional adaptation adjusts a case by replacing the value of a single feature; it does not involve adding, removing, or restructuring features. This is the most basic adaptation method, and most case-based reasoning systems include it. It works well when the problem and the retrieved case are very similar.

Transformational adaptation achieves adaptation by adding, deleting, or rearranging certain features.

When these two methods are not applicable and the problem is complex enough to need new adaptation steps, derivational adaptation (also called generative adaptation) is one option. This method derives from analogy: the system refers to the trace of a previous, similar adaptation and replays those adaptation steps on the new problem.

More complex case-based reasoning systems usually have several adaptation methods, so compositional adaptation, which combines the three techniques above, is also common practice.
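The difference between the first two adaptation types can be sketched on a case stored as a dictionary of features. The feature names and values below are hypothetical, and the two functions are only a minimal reading of the definitions above, not the adaptation logic of the CBR tool used in the study.

```python
import copy


def substitutional_adaptation(case, substitutions):
    """Replace values of existing features; never add or remove features."""
    adapted = copy.deepcopy(case)
    for name, value in substitutions.items():
        if name not in adapted:
            raise KeyError(f"substitution cannot add a new feature: {name}")
        adapted[name] = value
    return adapted


def transformational_adaptation(case, add=None, remove=()):
    """Change the structure of the case by adding or deleting features."""
    adapted = copy.deepcopy(case)
    for name in remove:
        adapted.pop(name, None)
    adapted.update(add or {})
    return adapted


# Hypothetical retrieved case.
retrieved = {"age": 62, "sex": "M", "classification": "IIc", "stage": "I"}

substituted = substitutional_adaptation(retrieved, {"age": 70})
transformed = transformational_adaptation(
    retrieved, add={"lymph_node_involvement": True}, remove=["sex"]
)
```

Both functions return a new case and leave the retrieved case untouched, so the case base itself is only changed when an adapted case is explicitly stored.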

5. Calculating case similarity with K-Nearest Neighbor

The k-nearest neighbor method (K-Nearest Neighbor, K-NN) is distance based. It sorts the distance matrix to retrieve the k cases closest to the case being predicted, assessing the similarity of each attribute variable between the problem and the cases in the case base, with a weighting factor for each attribute.

The formula for the overall similarity is expressed as follows:

$$\mathrm{Similarity}(T, S) = \sum_{i=1}^{n} f(T_i, S_i) \times W_i \qquad (1)$$

T is the target case, S is a case in the case base, n is the number of attributes of each case, i indexes each attribute, f is the similarity function for the i-th attribute of the target case and the case-base case, and W is the importance weight of the i-th attribute. After normalization the similarity value falls between 0 and 1: 0 means completely dissimilar and 1 means 100% similar. Case-based reasoning often uses K-NN classification, and its sensitivity depends on the similarity functions, on how well the classes are separated from each other, on noisy attributes, and on other factors.
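A weighted attribute similarity of this kind can be sketched as follows. The attribute names, the weights, the two per-attribute similarity functions, and the normalization by the sum of weights are assumptions chosen so that the result lies between 0 and 1; the paper does not specify these details.

```python
def numeric_similarity(a, b, value_range):
    """1.0 when equal, falling linearly to 0.0 at a gap of `value_range`."""
    return 1.0 - min(abs(a - b) / value_range, 1.0)


def symbolic_similarity(a, b):
    """1.0 for matching symbols, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def case_similarity(target, source, attributes):
    """Weighted sum of per-attribute similarities, normalized to [0, 1].

    attributes: name -> (weight W_i, similarity function f).
    """
    total_weight = sum(weight for weight, _ in attributes.values())
    score = sum(
        weight * f(target[name], source[name])
        for name, (weight, f) in attributes.items()
    )
    return score / total_weight


# Hypothetical attributes and weights.
attributes = {
    "age": (1.0, lambda a, b: numeric_similarity(a, b, 100)),
    "sex": (1.0, symbolic_similarity),
    "blood_type": (2.0, symbolic_similarity),
}
target = {"age": 60, "sex": "M", "blood_type": "A"}
source = {"age": 50, "sex": "M", "blood_type": "O"}
similarity = case_similarity(target, source, attributes)
```

In this example the mismatch on the heavily weighted `blood_type` attribute pulls the overall similarity below one half, even though the other two attributes agree closely.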


Similarity is therefore very important in case-based reasoning. Because K-NN encounters the same noise during simulation, several attempts are needed to find the best grouping, so that homogeneity within each classification group is greatest and variability between groups is greatest. Attributes are also given different weights to improve accuracy.

Different attribute weights are often used when K-NN classification retrieves cases from the case base. K-NN assumes that each case X = (X1, X2, ..., Xn) is defined by a set of n attributes, whose values may be numeric or symbolic, and that Xc is the class value of X.

Given a query q and a case base L, K-NN retrieves from L the k cases most similar to q and predicts the class of q as the weighted majority class among them, with k greater than or equal to 1. The formula is defined as follows:

(2)

and W_f ≧ 0 for all f (3)

The Euclidean distance is applied to both continuous and symbolic values, which are handled as in formula (3). In equation (1), when all weights are equal to 1, redundant, irrelevant, or noisy attributes directly affect the distance calculation. This is a weakness of K-NN: when such attributes are present, its performance degrades. Equation (2) is what K-NN uses to retrieve cases from the case base.
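The retrieval step can be sketched as a weighted K-NN over cases with mixed numeric and symbolic attributes. This assumes a weighted Euclidean distance with a 0/1 mismatch for symbolic values and numeric attributes already scaled to the range 0 to 1; the function names, the case layout, and the data are hypothetical, and the sketch is not taken from the CBR tool used in the study.

```python
import math
from collections import Counter


def difference(a, b):
    """Numeric gap for numbers, 0/1 mismatch for symbolic values."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b)
    return 0.0 if a == b else 1.0


def weighted_distance(x, q, weights):
    """Weighted Euclidean distance over the attributes named in `weights`."""
    return math.sqrt(sum(w * difference(x[f], q[f]) ** 2 for f, w in weights.items()))


def knn_predict(case_base, query, weights, k=3):
    """Return the majority label among the k nearest cases, and those cases."""
    ranked = sorted(
        case_base, key=lambda c: weighted_distance(c["features"], query, weights)
    )
    neighbors = ranked[:k]
    votes = Counter(c["label"] for c in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Made-up cases; "age" is assumed to be pre-scaled to [0, 1].
cases = [
    {"features": {"age": 0.60, "sex": "M"}, "label": "early"},
    {"features": {"age": 0.65, "sex": "M"}, "label": "early"},
    {"features": {"age": 0.30, "sex": "F"}, "label": "advanced"},
    {"features": {"age": 0.70, "sex": "F"}, "label": "advanced"},
]
query = {"age": 0.62, "sex": "M"}
label, neighbors = knn_predict(cases, query, {"age": 1.0, "sex": 1.0}, k=3)
```

Setting an attribute's weight to 0 removes it from the distance entirely, which is one way to keep an irrelevant or noisy attribute from distorting retrieval.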

Experimental results

Accuracy assessment

CBR systems are commonly assessed with the P value (precision value):

If P(10) > 90%, the result is highly perfect.

If 80% < P(10) < 90%, the result is very good.

If 70% < P(10) < 80%, the result is good.
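One way to compute such a value, assuming P(k) means the share of query cases for which a case with the correct label appears among the first k retrieved cases, is sketched below. That reading is inferred from how the results are reported and is an assumption, as are the function names, the labels, and the treatment of the exact boundary values 90%, 80%, and 70%, which the bands above leave open.

```python
def precision_at(k, ranked_labels_per_query, true_labels):
    """Share of queries whose true label appears in the first k retrieved labels."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_labels_per_query, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def rate(p10):
    """Map P(10), given as a fraction, to the bands listed above."""
    if p10 > 0.90:
        return "highly perfect"
    if p10 > 0.80:  # assumption: exactly 90% falls in this band
        return "very good"
    if p10 > 0.70:  # assumption: exactly 80% falls in this band
        return "good"
    return "below the listed bands"


# Made-up retrieval results for four query cases.
retrieved = [
    ["IIc", "IIa", "III"],
    ["IIa", "IIa", "IIc"],
    ["III", "IIb", "IIb"],
    ["IIa", "IIc", "IIc"],
]
truth = ["IIc", "IIc", "IIa", "IIc"]
p1 = precision_at(1, retrieved, truth)
p3 = precision_at(3, retrieved, truth)
```

With 56 evaluation cases, a value such as 98.21% corresponds to 55 of the 56 queries finding a match within the cutoff.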

In this study, similarity was assessed for gastric cancer classification (Classification), stage (Stage), and the two combined (Classification + Stage). For each query, the 20 cases with the highest similarity in the case library were retrieved and arranged in order.

Medical document retrieval results

For classification (Classification), 46.43% of the query cases found a case of the same category as the first similar case, and 98.21% found one within the first 10 similar cases, so P(10) = 98.21%. In other words, more than ninety percent of cases can be found within the first 10 query results. The cases whose similar case was found only at the seventeenth position were early gastric cancer E2a cases. Because this category is relatively rare in the case base, some retrieval error is likely, which is worth exploring.

For stage (Stage), 32.14% of the query cases found a matching case as the first similar case, and 100% found one within the first nine similar cases, so P(10) = 100%.

For the combined query (Classification + Stage), 23.21% of the query cases found a matching case as the first similar case, and 98.21% found one within the first 10 similar cases, so P(10) = 98.21%.

From this perspective, future queries for new cases can be limited to the first ten cases, which speeds up queries and the subsequent processing of the system.

Medical image retrieval results

A. For classification (Classification): the general classification of gastric cancer is based largely on appearance. Although the accuracy of image retrieval is slightly behind that of document retrieval, 37.50% of the query cases still found a case of the same category as the first similar case, and 92.85% found one within the first 10 similar cases, so P(10) = 92.85%. In other words, about ninety percent of cases can be found within the first 10 query results. There is also an outlier: case 158, whose similar case was found at 125, is a gastric lymphoma. Because this category is relatively rare in the case base, part of the retrieval may be in error; CBR case adaptation can be applied and the adapted case added to the case base.

B. For stage (Stage): staging depends mainly on the depth of tumor invasion rather than on appearance, which makes it more difficult for image retrieval, so accuracy drops further. Even so, 25% of the query cases found a matching case as the first similar case, and 82.14% found one within the first 10 similar cases, so P(10) = 82.14%. In other words, about eighty percent of cases can be found within the first 10 query results.

C. For the combined query (Classification + Stage): only 16.07% of the query cases found a matching case as the first similar case, and 67.86% found one within the first 10 similar cases, so P(10) = 67.86%. Nearly seventy percent of cases can be found within the first 10 query results.

Integrated query results for medical documents and medical images

Comparing the document retrieval and image retrieval results above, document retrieval achieves higher similarity than image retrieval. This study therefore assigns different weights to the medical document retrieval result (T) and the image retrieval result (G), and feeds the weighted results into the CBR system to find the optimal solution.

The weights of the two are divided into nine groups, T1G9, T2G8, ... to T9G1, and each is fed into the CBR system in turn, giving the results in the following table.

Table: All results after merging document retrieval and image retrieval

According to the results above, search results are best when the weights of document and image retrieval are set to 6:4 (6T4G).

1. Classification (Classification): P(10) = 87.50%.

2. Stage (Stage): P(10) = 92.86%.

3. Combined query (Classification + Stage): P(10) = 87.50%.

When document and image retrieval are combined, the accuracy of the first retrieved case, P(1), improves noticeably:

1. Classification (Classification): P(1) = 62.50% (Text 46.43%, Image 37.50%).

2. Stage (Stage): P(1) = 33.93% (Text 32.14%, Image 25.00%).

3. Combined query (Classification + Stage): P(1) = 30.36% (Text 23.21%, Image 16.07%).

For the first retrieved case, combining document and image retrieval thus gives better results than either retrieval on its own.
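The weighting of document (T) and image (G) results described in this section can be sketched as a weighted average of two similarity scores per case, evaluated over the nine weightings from T1G9 to T9G1. The linear combination, the function names, and the similarity values are assumptions for illustration; the paper does not state how the CBR system merges the two scores internally.

```python
def combined_similarity(text_sim, image_sim, text_weight, image_weight):
    """Weighted average of the document and image similarity scores."""
    total = text_weight + image_weight
    return (text_weight * text_sim + image_weight * image_sim) / total


def weight_grid():
    """The nine weightings T1G9, T2G8, ..., T9G1 as (label, T, G) tuples."""
    return [(f"T{t}G{10 - t}", t, 10 - t) for t in range(1, 10)]


def rank_cases(text_sims, image_sims, t, g):
    """Case ids sorted from most to least similar under the weighting t:g."""
    scores = {
        case_id: combined_similarity(text_sims[case_id], image_sims[case_id], t, g)
        for case_id in text_sims
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up similarity scores for three stored cases.
text_sims = {"A": 0.9, "B": 0.6, "C": 0.3}
image_sims = {"A": 0.2, "B": 0.8, "C": 0.9}

ranking_6_4 = rank_cases(text_sims, image_sims, 6, 4)
rankings = {label: rank_cases(text_sims, image_sims, t, g) for label, t, g in weight_grid()}
```

Each weighting can then be scored with P(1) and P(10) on the evaluation cases, and the weighting with the best scores kept.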

Discussion and conclusion

In the case-based reasoning systems developed so far, most of the experience used is simple text: documentary information is taken as the previous experience, and it is analyzed, indexed, and weighted to create a case base. Case-based reasoning on images is still rare. In clinical diagnosis and treatment, however, the association between documents and images is more important than in any other field. Integrating documents and images is therefore a very important task in building medical case-based reasoning.

The hospital's PACS contains a wealth of medical images. If the approach of this study is properly applied to assist doctors in the early diagnosis of gastric cancer, it can increase the chance of detecting early gastric cancer and help improve the quality of health care. As more cases are accumulated in the case base, follow-up work may also explore similarity algorithms other than K-NN and other similarity inference approaches to improve the quality of the system.

References

English literature

1. Snell: Clinical Anatomy for Medical Student; 5th edition, 1995.

2. Jeng, BC and TP Liang (1995), "Fuzzy Indexing and Retrieval in Case-Based Systems," Expert Systems with Application, Vol. 8, No. 1, pp.135-142.

3. Aha, DW (1998), "The Omnipresence of Case-Based Reasoning in Science and
