
Semantic Learning from User’s Relevance Feedback

The semantic learning process at each round is performed on the accumulated training data.

The training data comprises two databases: the relevant one (MDBP) and the irrelevant one (MDBN). Each database contains the relevant/irrelevant music objects accumulated from previous rounds, so the number of samples in MDBP and MDBN grows during the session. The concept can be learned by first mining the common properties of MDBP and MDBN respectively and then discovering the discrimination between these properties. Table 4.1 is an example of MDBP containing four music objects, while Table 4.2 is an example of MDBN with three music objects. For convenience of explanation, in these examples a music object is modeled as a two-attribute global feature (G, H) and a set of three-attribute local features (A, B, C), where each three-attribute local feature corresponds to an SMP. One example of a common property of MDBP is (A=1, C=1), and one example of a discriminative property between MDBP and MDBN is (B=2) & (H=2) & (A=1, C=1), which appears frequently in MDBP but seldom or never appears in MDBN.

The semantic learning process for capturing the user's concept proceeds first by a frequent pattern mining algorithm, followed by an associative classification algorithm. Details are given in the following sections.

4.1 Frequent Pattern Mining

Relevance/irrelevance is usually defined by a characteristic that is shared by the relevant/irrelevant music objects. To capture the characteristic shared by a class of music objects, we employ data mining techniques. Before the mining process, each (attribute, value) pair of the global and local features is transformed into an item. For example, the (attribute, value) pair (B, 2) is transformed into the item "B2". Therefore, the global feature is represented as an itemset of six items, while the local feature of an SMP is represented as an itemset of five items. A music object is thus treated as a set of itemsets. Before presenting the algorithm, we introduce some formal definitions.
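This transformation can be sketched as follows (a minimal illustration using the simplified (G, H)/(A, B, C) attributes of the running example; the function and variable names are ours, not the system's):

```python
def to_itemset(pairs):
    """Turn (attribute, value) pairs into an itemset of items like 'B2'."""
    return frozenset(f"{attr}{val}" for attr, val in pairs)

# Music object M1 from Example 1: one global feature plus three SMP
# local features, treated together as a set of itemsets.
global_feature = to_itemset([("G", 4), ("H", 2)])
local_features = [
    to_itemset([("A", 1), ("B", 2), ("C", 1)]),
    to_itemset([("A", 2), ("B", 2), ("C", 1)]),
    to_itemset([("A", 1), ("B", 1), ("C", 2)]),
]
music_object = [global_feature] + local_features
```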

Definition 1:

Let I be the set of possible items and Y = {X | X ⊆ I, X is the itemset corresponding to a local feature or a global feature}. Let MDBP/MDBN be a music database in which each object T is a set of itemsets, T = {T1, T2, …, Tx | Ti ∈ Y}; that is, T ⊆ Y.

Example 1:

Take M1 in Table 4.1 as an example. The object M1 is represented as {{G4, H2}, {A1, B2, C1}, {A2, B2, C1}, {A1, B1, C2}}.

The common property found in this mining stage is called a frequent pattern.

Definition 2:

Let X be an itemset corresponding to a local feature or a global feature. The common property, called a pattern, found in the mining stage is a set of itemsets P = {P1, P2, …, Pv | Pj ⊆ X}.

Example 2:

An example of a pattern in Table 4.1 is {{H2}, {A1, C1}}, where the itemsets {H2} and {A1, C1} are each a subset of a local feature or a global feature.

Definition 3:

We say that an object T contains the pattern P if there is a one-to-one mapping from P to T such that for each Pi there exists a Ti ∈ T with Pi ⊆ Ti.

Example 3:

Take the pattern {{A2}, {C2}} as an example. If an object contains {{A2}, {C2}}, there must exist two distinct itemsets containing {A2} and {C2} respectively. For instance, in Table 4.1, M1 and M2 contain {{A2}, {C2}}, while M3 does not.
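Definition 3's one-to-one mapping can be checked directly, as in this sketch (a brute-force search over injective mappings, which is fine for the small patterns used here):

```python
from itertools import permutations

def contains(obj, pattern):
    """True if each pattern itemset is a subset of a distinct itemset
    of the object, i.e. a one-to-one mapping exists (Definition 3)."""
    return any(
        all(p <= obj[i] for p, i in zip(pattern, mapping))
        for mapping in permutations(range(len(obj)), len(pattern))
    )

# M1 from Example 1: it contains {{A2},{C2}} because {A2} maps to the
# 3rd itemset and {C2} to the 4th.
M1 = [{"G4", "H2"}, {"A1", "B2", "C1"}, {"A2", "B2", "C1"}, {"A1", "B1", "C2"}]
```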

Definition 4:

Given a pattern P, the support count of P, supCount(P), is the number of objects in MDBP/MDBN that contain P, and its support in an object database is sup(P) = (supCount(P) / |MDBP/MDBN|) × 100%.

We call P a frequent pattern if sup(P) is no less than a given minimum support threshold, minsup.

Example 4:

An example of a frequent pattern with support 100% in MDBP is {{B2}, {H2}, {A1, C1}}, which is contained in all objects in MDBP.
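Support as in Definition 4 can then be computed by counting the containing objects, e.g. (a sketch that repeats the containment check of Definition 3; the toy database below is ours, not Table 4.1):

```python
from itertools import permutations

def contains(obj, pattern):
    # One-to-one mapping check from Definition 3 (brute force).
    return any(all(p <= obj[i] for p, i in zip(pattern, m))
               for m in permutations(range(len(obj)), len(pattern)))

def support(db, pattern):
    """sup(P) = supCount(P) / |DB|, as a fraction in [0, 1]."""
    return sum(contains(obj, pattern) for obj in db) / len(db)

# A toy positive database: every object contains {{B2},{H2},{A1,C1}}.
db = [
    [{"G4", "H2"}, {"A1", "B2", "C1"}, {"A1", "B1", "C1"}],
    [{"G1", "H2"}, {"A1", "C1", "B3"}, {"A2", "B2", "C2"}],
]
```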

Table 4.1 An example of MDBP.

Table 4.2 An example of MDBN.

Figure 4.1 An example of classifier.

The task of frequent pattern mining is to find all frequent patterns with support no less than the minimum support threshold minsup. The frequent patterns found in MDBP and MDBN are called positive frequent patterns and negative frequent patterns respectively. Both are in the form of a set of itemsets.

A well-known approach for mining frequent patterns is the Apriori algorithm [1]. Apriori is a data mining technique originally developed to discover frequent itemsets from a database of itemsets. However, in our work, MDBP/MDBN is a database of sets of itemsets, and a frequent pattern is also a set of itemsets. Therefore, we propose a two-phase mining algorithm modified from Apriori to discover the frequent patterns. The first phase finds the frequent itemsets, and the second phase discovers the frequent patterns constituted by the frequent itemsets found in the first phase. Note that the itemsets found in the first phase correspond to the music segment (SMP) level, while the patterns (sets of itemsets) found in the second phase correspond to the music object level. The mining process proceeds on both MDBP and MDBN respectively.

[Figure 4.1 shows frequent patterns mined from MDBP, e.g. (A=1), (B=2), (H=2), …, (A=1,C=1), …, (H=2)&(B=2)&(A=1,C=1); frequent patterns mined from MDBN, e.g. (A=1), (B=3), (C=1), …, (A=1,C=1), …, (B=3)&(A=1,C=1); and the resulting rules (H=2)&(B=2)&(A=1,C=1)→Positive (conf=100%) and (H=2)&(A=1,C=1)→Positive (conf=100%).]

1st phase: mining frequent itemsets

We employ the Apriori algorithm to discover all frequent itemsets, where all items of a frequent itemset must appear together in the same itemset of an object. The classic Apriori algorithm makes multiple passes over the database. In the first pass, the support of each individual item is calculated, and those above minsup are kept as a seed set. In each subsequent pass, the seed set is used to generate new potentially frequent itemsets, called candidate itemsets. The support of each candidate itemset is then calculated by scanning the database. Those candidates with support no less than minsup are the frequent itemsets and are fed into the seed set used for the next pass. The process continues until no new frequent itemsets are found.

In our work, only the support calculation step differs from the classic Apriori algorithm, since each of our objects is a set of itemsets rather than a single itemset. In the example of Table 4.1, the support count of the frequent itemset {A2, C2} is two: {A2, C2} appears in M2 and M3, but not in M1 and M4.
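The first phase can be sketched as a small level-wise miner (an Apriori-style sketch under our reading of the modified support count; not the authors' exact implementation, and without Apriori's pruning refinements):

```python
from itertools import combinations

def frequent_itemsets(db, minsup):
    """Level-wise mining over a database where each object is a set of
    itemsets; an itemset counts once per object if any itemset of the
    object is a superset of it."""
    def sup(itemset):
        return sum(any(itemset <= s for s in obj) for obj in db) / len(db)

    items = sorted({i for obj in db for s in obj for i in s})
    level = [frozenset([i]) for i in items if sup(frozenset([i])) >= minsup]
    result, k = list(level), 2
    while level:
        # Candidate generation: unions of two frequent sets of size k.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        level = [c for c in candidates if sup(c) >= minsup]
        result += level
        k += 1
    return result
```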

2nd phase: mining frequent patterns at the object level

The second phase discovers the patterns constituted by the frequent itemsets found in the first phase. Similar to the first phase, the algorithm makes multiple passes over the database MDBP/MDBN. In the k-th pass, the set of candidate patterns of k itemsets is generated by joining two frequent patterns of k−1 itemsets found in the previous pass. The support of each candidate pattern is then calculated by scanning the database. Those candidates with support no less than minsup are the frequent patterns and are fed into the seed set used for the next pass. The process continues until no new frequent patterns are found. The only exception is the first pass, in which the seeds are the frequent itemsets generated in the first phase.

To improve the efficiency of the above mining process, we introduce the pattern canonical form, which makes candidate generation more efficient. The definitions and examples follow.

Definition 5:

Let Pk be a pattern containing k itemsets. Assume that the items in an itemset are sorted by a lexicographic ordering ≺l. The pattern canonical form of Pk is defined as the sequence of its itemsets sorted by the ordering ≺pcf, which we define as follows. If α = {s1, s2, …, sm} and β = {t1, t2, …, tn} are two itemsets in Pk, then α ≺pcf β iff one of the following is true:

(i) m < n, or

(ii) m = n and there exists j, 1 ≤ j ≤ m, such that sk = tk for 1 ≤ k < j and sj ≺l tj.

Example 5:

For example, the canonical form of {{A1,C1}, {A1}, {B2,C1}} is {{A1},{A1,C1},{B2,C1}}.
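Definition 5's ordering amounts to sorting itemsets by length and then lexicographically, as in this sketch (Python's tuple comparison supplies the tie-breaking of case (ii)):

```python
def canonical_form(pattern):
    """Sort a pattern's itemsets: shorter itemsets first; equal lengths
    are ordered lexicographically on their sorted item lists (Def. 5)."""
    return [sorted(s) for s in sorted(pattern, key=lambda s: (len(s), sorted(s)))]
```

For instance, `canonical_form([{"A1","C1"}, {"A1"}, {"B2","C1"}])` reproduces Example 5.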

Definition 6:

Given two frequent k-patterns in pattern canonical form, {P1, P2, …, Pk} and {Q1, Q2, …, Qk}, they are joinable if P2 = Q1, P3 = Q2, …, and Pk = Q(k−1). The (k+1)-candidate pattern {P1, P2, …, Pk, Qk} is then generated, in canonical form as well.

Example 6:

Given two frequent 2-patterns in canonical form, {{A1}, {B2}} and {{B2}, {A1, C1}}, joining them generates the 3-candidate pattern {{A1}, {B2}, {A1, C1}}.
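The join of Definition 6 can be sketched as follows (patterns represented as lists of sorted item lists, assumed to be in canonical form already):

```python
def join(p, q):
    """Join two k-patterns into a (k+1)-candidate when p's tail equals
    q's head (Definition 6); return None if they are not joinable."""
    return p + [q[-1]] if p[1:] == q[:-1] else None
```

For instance, `join([["A1"], ["B2"]], [["B2"], ["A1", "C1"]])` reproduces Example 6.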

Moreover, in order to count the support of patterns more efficiently, we maintain a table, called the occurrence table, for each pattern.

Definition 7:

Given a frequent k-pattern P in pattern canonical form, {P1, P2, …, Pk}, an occurrence of P in an object T is {LP1, LP2, …, LPk}, where LPi, 1 ≤ i ≤ k, indicates the location of itemset Pi in T. There may be more than one occurrence in an object. The table recording all occurrences of P in each object in the database is called the occurrence table.

Example 7:

The occurrence tables for the patterns {{B2}, {H2}} and {{H2}, {A1, C1}} are shown in Table 4.3(a) and (b) respectively. In Table 4.3(a), the first occurrence of the pattern {{B2}, {H2}} in music object M1 is (2,1), where {B2} appears in the 2nd itemset and {H2} appears in the 1st itemset of M1.

We use this data structure to store the positions where a pattern appears; each pattern is associated with an occurrence table. Moreover, we also derive the occurrence table for each candidate during candidate generation.

Table 4.3 Examples of occurrence tables.

(a) {{B2},{H2}}        (b) {{H2},{A1,C1}}     (c) {{B2},{H2},{A1,C1}}
M1  (2,1), (3,1)       M1  (1,2)              M1  (3,1,2)
M2  (2,1), (3,1)       M2  (1,3)              M2  (2,1,3)
M3  (3,1)              M3  (1,2)              M3  (3,1,2)
M4  (3,1)              M4  (1,2)              M4  (3,1,2)

Definition 8:

Given two joinable k-patterns along with their occurrence tables, suppose that an occurrence of the first pattern in a specific music object is (u1, u2, …, uk) while an occurrence of the second pattern in the same music object is (v1, v2, …, vk). An occurrence (u1, u2, …, uk, vk) for this object is generated if u2 = v1, u3 = v2, …, uk = v(k−1), and u1 ≠ vk.

Example 8:

Given the two occurrence tables in Table 4.3(a) and 4.3(b), the occurrence table for the candidate {{B2}, {H2}, {A1, C1}} is presented in Table 4.3(c). In Table 4.3(c), the occurrence (3,1,2) in object M1 is generated from (3,1) in Table 4.3(a) and (1,2) in Table 4.3(b) for M1.

By utilizing the occurrence table, the support count of each candidate pattern can be checked efficiently without scanning the music database.
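Definition 8's occurrence join can be sketched as follows (occurrences represented as tuples of 1-based itemset positions within one object):

```python
def join_occurrences(occ1, occ2):
    """Combine (u1..uk) with (v1..vk) into (u1..uk, vk) when
    u2..uk == v1..v(k-1) and u1 != vk (Definition 8)."""
    return [u + (v[-1],)
            for u in occ1 for v in occ2
            if u[1:] == v[:-1] and u[0] != v[-1]]
```

For M1 in Table 4.3, `join_occurrences([(2, 1), (3, 1)], [(1, 2)])` yields only `(3, 1, 2)`: the pair ((2,1), (1,2)) is rejected because u1 = vk = 2.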

4.2 Associative Classification

After the two-level mining process is performed on MDBP and MDBN, we obtain a collection of positive and negative frequent patterns representing the common properties of the music objects relevant and irrelevant to the concept respectively. In order to discriminate the concept of relevant music from that of irrelevant music, this step finds the discrimination between the characteristics of MDBP and MDBN. The result of this step is a classifier consisting of rules. Figure 4.1 is an example of a classifier learned from the common properties discovered from Tables 4.1 and 4.2. One rule is "(B=2) & (A=1,C=1) → Positive", which classifies a music object containing the attributes (B=2) & (A=1,C=1) as the positive class. This rule comes from the fact that (B=2) & (A=1,C=1) appears frequently in MDBP but seldom appears in MDBN.

We employ an associative classification algorithm [16] to generate a binary classifier learned from the positive and negative frequent patterns. The algorithm eventually generates a classifier containing a set of ranked rules of the form <r1, r2, …, rn, default_class>. Each rule ri is of the form l → y, where l ∈ F, F is the collection of positive and negative frequent patterns, and y is a class label. The confidence of a rule is defined as the percentage of the training music objects containing l that belong to class y.

A naïve version of the algorithm first sorts the set of rules according to a defined precedence order, and then, following the sorted sequence, selects each rule that correctly classifies at least one music object as a potential rule of our classifier. Different from the original rule type, the frequent pattern on the left-hand side of a rule in our work is a set of itemsets.

We say that a music object is covered by a rule if it contains the frequent pattern of the rule.

Take the rule {{B2},{A1,C1}} → Positive in Figure 4.1 as an example: if a music object has two itemsets containing {B2} and {A1,C1} respectively, then the music object is covered by this rule. If the music object is in MDBP, we say that it is correctly classified by the rule. A default class, referring to the majority class of the remaining music objects in the database, is determined. Finally, the algorithm discards those rules that do not improve the accuracy of the classifier. The first rule in the classifier that yields the least recorded error is the cut-off rule; rules after the cut-off rule are discarded since they only produce more errors.
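Applying the ranked rule list can be sketched as follows (a minimal sketch; the rule list and default class below are illustrative, and containment is the brute-force check from Definition 3):

```python
from itertools import permutations

def contains(obj, pattern):
    # One-to-one mapping check from Definition 3 (brute force).
    return any(all(p <= obj[i] for p, i in zip(pattern, m))
               for m in permutations(range(len(obj)), len(pattern)))

def classify(obj, rules, default_class):
    """Return the class of the first (highest-ranked) rule whose
    pattern the object contains; fall back to the default class."""
    for pattern, label in rules:
        if contains(obj, pattern):
            return label
    return default_class

# A one-rule classifier with the rule {{B2},{A1,C1}} -> Positive.
rules = [([{"B2"}, {"A1", "C1"}], "Positive")]
M1 = [{"G4", "H2"}, {"A1", "B2", "C1"}, {"A2", "B2", "C1"}, {"A1", "B1", "C2"}]
```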

4.3 S2U Feedback Strategy

Once the classifier is constructed, the system then produces a ranked list of music objects.

As mentioned in Section 1, what the system returns to the user determines the potential information gained from the user. We present three types of feedback strategies: most-positive, most-informative, and hybrid. In general, the most-informative music objects will not coincide with the most-positive music objects. The different strategies, along with their corresponding scoring functions, are described as follows.

(1) Most-Positive strategy (MP)

If the user is impatient, the system should present the most positive music objects (i.e., those marked as relevant by the system) learned so far. The most-positive result is a list of music objects m ordered by a score function related to the confidence of the matched rules, where Rp/Rn stands for the set of positive/negative-class rules that each music object m satisfies.

(2) Most-Informative strategy (MI)

If we sacrifice the performance at this round to maximize the information obtained for the next round, a better result can be expected later in the process. The most-informative strategy selects a set of music objects such that the user's judgment of them provides more information for labeling uncertain music objects, i.e., those whose class labels the system is uncertain about. These objects are the most-informative objects. In other words, a system using the MI strategy displays a collection of most-informative objects at each round until the user asks to see what the system can currently retrieve; the system then switches to the MP strategy and returns the most positive music objects.

In the associative classification algorithm, an object that matches no rule belongs to the default class. We define the objects belonging to the default class as the most-informative music objects.

If the user is willing to interact with the system, our system displays a number of most-informative music objects for the user's feedback.

(3) Hybrid strategy (HB)

HB is a compromise between the MP and MI strategies. A system applying the HB strategy returns equal numbers of most-positive and most-informative objects each round. The score of each music object m is defined as follows:

score_HB(m) = 0.5                if m ∈ default_class
score_HB(m) = score_MP(m)        otherwise          (4)
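Under our reading of Equation (4), the HB score can be sketched as follows (assuming, as reconstructed above, that uncertain default-class objects receive a fixed score of 0.5):

```python
def score_hb(m, score_mp, default_class):
    """HB score: 0.5 for uncertain (default-class) objects, otherwise
    the object's MP score (Equation 4, as read here)."""
    return 0.5 if m in default_class else score_mp(m)
```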

CHAPTER 5

Experimental Result and Analysis

5.1 Dataset

The dataset contains 215 MIDI music objects collected from the Internet. Each music object belongs to western pop music, including the rock, jazz, and country genres. The subjects involved in the experiment are unfamiliar with some of the music objects; this helps avoid the noisy and inconsistent judgments that a user's familiarity may cause.

An automatic melody extraction process is performed on each MIDI file by the all-mono algorithm. The raw feature representation and the quantized version are fed to the system separately for performance evaluation.

5.2 Experiment Setup

In order to evaluate the segment-based relevance feedback algorithm, we designed an on-line CBMR system with a relevance feedback mechanism. The relevance feedback information of users is essential for system evaluation, so we invited eight subjects to use our system to create relevance feedback data.

The retrieval process begins by randomly selecting 20 music objects for the user's labeling. An on-line training process derives a classifier based on this initial U2S feedback. The classifier labels all music objects in the database, along with a scoring function that defines the relevance degree of each one. According to the specified S2U feedback strategy, at most 20 music objects are returned and judged by the user. If the user is not satisfied with the current retrieval result, the next round proceeds. The training samples are accumulated over the relevance feedback rounds, and the classifier is expected to be refined based on the accumulated training samples via the relevance feedback mechanism.

To compare the performance of experiments with different strategies and parameter settings, the user would in reality have to go through many experiments and provide relevance feedback for each of them, which wastes the user's time and is a tiring job. To reduce the user's burden, we collect the user's relevance feedback data in advance. Once the user determines the concept in mind, the user labels each music object in the database as either relevant or irrelevant. The relevance feedback data made by the user is regarded as the ground truth. After that, a series of experiments for each user is conducted: each experiment, corresponding to a query session containing many rounds, is simulated, and each returned music object is automatically labeled based on the user's ground truth.

5.3 Effectiveness Analysis

To evaluate the results of our experiments, the performance measure employed is the average precision, defined as the ratio of the number of relevant music objects among the returned music objects to the total number of returned music objects n, averaged over all users.
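This measure can be sketched as follows (a sketch of our reading of the definition; the function and variable names are ours):

```python
def precision(returned, relevant):
    """Fraction of the n returned objects that are relevant."""
    return sum(1 for m in returned if m in relevant) / len(returned)

def average_precision(per_user):
    """Average the per-user precision over all users; per_user is a
    list of (returned, relevant) pairs, one pair per user."""
    return sum(precision(r, rel) for r, rel in per_user) / len(per_user)
```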

Note that each experiment is intended to evaluate the effectiveness of the refinement framework on the music retrieval system. Music objects that the system returned in previous rounds are not removed from the music database.

We conducted four sets of experiments for performance comparison. The first evaluates the different feedback strategies of the system. The second measures the effect of the number of music objects accumulated from the user's feedback (top K). Subsequently, we discuss the experimental results evaluating the effect of the number of rounds (N) for which the most-informative (MI) strategy is applied. Finally, we show the effect of the motive threshold on performance.

5.3.1 Effectiveness of System Feedback Strategy

We perform three different experiments to compare the effectiveness of the system feedback strategy applied at each round during a query session. As mentioned in Section 4.3, the system can employ the MP, MI, or HB strategy at each round. The three experiments are described as follows:

S2U feedback strategy (MP): the system applies the MP feedback strategy each round and only uses the top K returned music objects for further refinement.

S2U feedback strategy (MI): the system applies the MI feedback strategy for N consecutive rounds and then evaluates the final result by applying the MP feedback strategy for the remaining rounds. By examining the precision at round N+1 among the three S2U feedback strategies, we can evaluate how well the MI strategy works.

S2U feedback strategy (HB): the system applies the HB feedback strategy each round during the query session.

Figure 5.1 illustrates the performance comparison of the three S2U feedback strategies performed on the raw feature representation. The motive threshold is set to 0.4 for all three strategies, and minsup is set to 0.2. The factor N is set to 4, i.e., the first four rounds use the most-informative strategy and the remaining rounds adopt the most-positive strategy. The number of music objects used for relevance feedback, K, is set to 10. As the number of rounds increases, precision grows for all S2U feedback strategies, as expected. The randomly selected initial training samples may limit the initial knowledge learned in the first training round, so we fix the initial query examples as the seed set for each user to ensure a fair comparison among experiments under different parameter settings.

Figure 5.2 shows the performance comparison of the three strategies performed on the quantized feature representation, with the same parameter settings as the raw one. In the HB feedback strategy, some uncertain music objects are contained in the retrieval results, so the precision at each round is bounded, since the uncertain music objects are not necessarily positive objects. On the other hand, the uncertain objects are counted on to improve the precision in later rounds.

However, there is no clear benefit gained by the HB strategy, as shown in Figures 5.1 and 5.2. In the first N rounds, the HB and MP feedback strategies perform better than the MI feedback strategy. With the help of active learning, the MI feedback strategy gradually improves the system

