Feature Alignment - Knowledge Transfer for Activity Recognition in Smart Home Environments

After the feature reformulation procedure, we compute the feature mapping with the feature alignment procedure. This procedure finds pairs of similar features (f_i, g_l), where f_i and g_l come from different datasets. Its purpose is not only to make the feature sets of the source and target domain datasets the same size, but also to find the correspondence between features in the two domains so that knowledge can be transferred in the best way.

3.3.1 Graph Matching Algorithms

Given a similarity measure, we formulate the problem by defining a weighted graph and apply graph matching algorithms to compute the mapping. Specifically, we reduce our problem to either the minimum cost perfect matching problem or the stable marriage problem in graph theory. In the following sections, we first introduce the two matching problems and then show how to reduce our mapping problem to them.

Matching

In graph theory, a matching is a set of edges in which no two edges share a vertex.

Given a graph G(V, E), a matching of G is a subset of E such that if two distinct edges (v_i, v_j) and (v_r, v_s) are in the subset, then i ≠ r, s and j ≠ r, s. A perfect matching E^p is a matching that leaves no vertex uncovered. That is, for each v_i ∈ V, there is one and only one edge e ∈ E^p such that e = (v_i, v_j) or e = (v_j, v_i).
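The two definitions above can be checked mechanically. The following is a minimal sketch, assuming edges are given as pairs of hashable vertex labels; the function names `is_matching` and `is_perfect_matching` are illustrative and do not come from the thesis.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def is_matching(edges: Iterable[Edge]) -> bool:
    """Return True if no two edges in `edges` share a vertex."""
    seen = set()
    for u, v in edges:
        # A self-loop or a vertex already used by an earlier edge
        # violates the definition of a matching.
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def is_perfect_matching(vertices: Set[Hashable], edges: Iterable[Edge]) -> bool:
    """Return True if `edges` is a matching that covers every vertex."""
    edges = list(edges)
    covered = {x for edge in edges for x in edge}
    return is_matching(edges) and covered == set(vertices)


# Illustrative vertex labels only.
V = {"v1", "v2", "v3", "v4"}
print(is_matching([("v1", "v2"), ("v2", "v3")]))             # False
print(is_perfect_matching(V, [("v1", "v2"), ("v3", "v4")]))  # True
```

The first call fails because both edges touch v2. The second succeeds because each of the four vertices appears in exactly one edge.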

Minimum Cost Perfect Matching

A minimum cost perfect matching [14] is a perfect matching with minimum cost. That is, given a weighted graph G, it is a perfect matching of G for which the sum of the weights of its edges is minimized. Formally, let w_i be the weight of edge e_i. The minimum cost perfect matching E^p is the perfect matching, among all perfect matchings E_k of G, with the property:

\[
E^p = \operatorname*{arg\,min}_{E_k} \sum_{e_i \in E_k} w_i \tag{3.11}
\]
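Equation 3.11 can be read directly as a search over all perfect matchings. The sketch below makes that reading concrete. It assumes a complete bipartite graph with n vertices on each side, represented by an n-by-n weight matrix, and uses exhaustive search rather than the algorithm of [14]. The name `min_cost_perfect_matching` and the example weights are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def min_cost_perfect_matching(
    weights: Sequence[Sequence[float]],
) -> Tuple[List[Tuple[int, int]], float]:
    """Brute-force search over all perfect matchings of K_{n,n}.

    weights[i][j] is the weight of the edge between left vertex i and
    right vertex j. Runs in O(n!) time, so it is only for tiny examples.
    """
    n = len(weights)
    if any(len(row) != n for row in weights):
        raise ValueError("a perfect matching needs a square weight matrix")

    def cost(perm: Tuple[int, ...]) -> float:
        # perm[i] is the right vertex matched to left vertex i.
        return sum(weights[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=cost)
    return [(i, best[i]) for i in range(n)], cost(best)


# Hypothetical weights, chosen only for illustration.
example_weights = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
matching, total = min_cost_perfect_matching(example_weights)
print(matching, total)  # [(0, 1), (1, 0), (2, 2)] 5
```

Each permutation corresponds to one candidate E_k, and `min` plays the role of the arg min. For realistic sizes, a polynomial-time method such as the Hungarian algorithm would replace the exhaustive search.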

Stable Marriage

The stable marriage problem [18] in graph theory is the problem of finding a stable matching between two sets of vertices, V_a and V_b. A matching is stable if, whenever an edge (a_i, b_i) is in the matching, there is no edge (a_j, b_j) in the matching such that a_i prefers b_j to b_i and b_j also prefers a_i to a_j. Note that a stable matching may not have minimum total cost, as the example in figure 3.4 shows.

Figure 3.4: A stable marriage matching {(A, D), (B, C)} with cost 105, which is not a minimum cost matching.
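The gap between a stable matching and a minimum cost matching can be reproduced on a small example. The sketch below assumes that preferences are induced by edge weights (a lower weight means a more preferred partner), and it uses the Gale-Shapley proposal procedure, a standard method for this problem. The weights are hypothetical and are not the values of figure 3.4.

```python
from typing import Dict, Sequence


def stable_matching(weights: Sequence[Sequence[float]]) -> Dict[int, int]:
    """Gale-Shapley with preferences induced by weights (lower is better).

    weights[i][j] is the weight of edge (a_i, b_j). Returns a dict that
    maps each index i of side A to its matched index j of side B.
    """
    n = len(weights)
    # Each a_i ranks the b_j by increasing weight (ties broken by index).
    prefs = [sorted(range(n), key=lambda j, i=i: weights[i][j]) for i in range(n)]
    next_choice = [0] * n            # next position in prefs[i] to propose to
    partner_of_b: Dict[int, int] = {}
    free = list(range(n))
    while free:
        a = free.pop()
        b = prefs[a][next_choice[a]]
        next_choice[a] += 1
        if b not in partner_of_b:
            partner_of_b[b] = a
        elif weights[a][b] < weights[partner_of_b[b]][b]:
            # b prefers the new proposer; its old partner becomes free.
            free.append(partner_of_b[b])
            partner_of_b[b] = a
        else:
            free.append(a)           # b rejects a; a proposes again later
    return {a: b for b, a in partner_of_b.items()}


# Hypothetical weights: a_0 and b_0 are each other's first choice.
weights = [
    [1, 2],
    [2, 100],
]
stable = stable_matching(weights)
stable_cost = sum(weights[a][b] for a, b in stable.items())
other_cost = weights[0][1] + weights[1][0]
print(stable, stable_cost, other_cost)  # {0: 0, 1: 1} 101 4
```

Here the stable matching keeps the most preferred pair (a_0, b_0) and pays for it with the expensive edge (a_1, b_1), while the other perfect matching has a much lower total cost.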

3.3.2 Feature Mapping by Graph Matching

We can reduce our feature mapping problem to the graph matching problem. Assume we have two sets of features A and B, with m features in A and n features in B. We define a complete bipartite graph K_{m,n} = (U, V, E) with |U| = m and |V| = n. Vertices u_i ∈ U and v_j ∈ V represent features a_i ∈ A and b_j ∈ B, respectively. We assign a weight to every edge in E: the weight of edge (u_i, v_j) is the divergence between feature a_i and feature b_j. If u_r and v_s are matched in K_{m,n} by the algorithm, then features a_r and b_s are mapped together, as defined in definition 1. By this reduction, we can solve our feature mapping problem by solving the graph matching problem.
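As an illustration of this reduction, the sketch below builds the m-by-n weight matrix of K_{m,n}. It rests on two assumptions that this section does not state: each feature is summarized as a discrete probability distribution, and the Jensen-Shannon divergence stands in for the divergence measure. The names `js_divergence` and `build_weight_matrix` and all values are hypothetical.

```python
import math
from typing import List, Sequence


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (base 2) of two discrete distributions."""

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def build_weight_matrix(
    features_a: Sequence[Sequence[float]],
    features_b: Sequence[Sequence[float]],
) -> List[List[float]]:
    """weights[i][j] is the divergence between feature a_i and feature b_j."""
    return [[js_divergence(a, b) for b in features_b] for a in features_a]


# Hypothetical feature profiles: m = 2 features in A, n = 3 features in B.
A = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
B = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [1 / 3, 1 / 3, 1 / 3]]
weights = build_weight_matrix(A, B)
for row in weights:
    print([round(w, 3) for w in row])
```

Any other divergence can be substituted for `js_divergence` without changing the rest of the reduction, since the matching algorithms only see the weight matrix.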

Different graph matching algorithms carry different meanings for knowledge transfer, as figure 3.4 illustrates. If the algorithm for the minimum cost perfect matching problem is applied, we find a mapping from a global view; namely, the total divergence between the two feature sets after mapping is minimized. The stable marriage algorithm, on the other hand, aligns the most similar features in the two datasets first. In this case, the total divergence may not be optimal, but the most preferred pairs of features will not be sacrificed.

Note that in our method, some features in the datasets may be ignored, for two reasons. First, if m ≠ n, the matching computed by the graph matching algorithm is not perfect. If u_i is not covered by the matching, its corresponding feature a_i is ignored when we transfer knowledge. Second, some edges in the matching may have high weights, which means the corresponding features are in fact not similar. In this case, it is better to ignore these high-weight edges in the matching.
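Both reasons for ignoring features can be seen in a small sketch. It assumes an exhaustive minimum cost matching that covers the smaller side of K_{m,n}, followed by a cut-off on edge weights. The threshold `max_weight` is a hypothetical parameter: the text does not say how high a weight must be before an edge is dropped.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def min_cost_matching(weights: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Brute-force minimum cost matching covering the smaller side of K_{m,n}."""
    m, n = len(weights), len(weights[0])
    if m <= n:
        best = min(
            permutations(range(n), m),
            key=lambda p: sum(weights[i][p[i]] for i in range(m)),
        )
        return [(i, best[i]) for i in range(m)]
    best = min(
        permutations(range(m), n),
        key=lambda p: sum(weights[p[j]][j] for j in range(n)),
    )
    return sorted((best[j], j) for j in range(n))


def align_features(weights: Sequence[Sequence[float]], max_weight: float):
    """Return (kept pairs, ignored indices in A, ignored indices in B)."""
    m, n = len(weights), len(weights[0])
    # Drop matched pairs whose divergence exceeds the assumed threshold.
    kept = [(i, j) for i, j in min_cost_matching(weights) if weights[i][j] <= max_weight]
    ignored_a = sorted(set(range(m)) - {i for i, _ in kept})
    ignored_b = sorted(set(range(n)) - {j for _, j in kept})
    return kept, ignored_a, ignored_b


# Hypothetical divergences for m = 2 features in A and n = 3 features in B.
weights = [
    [0.1, 0.8, 0.5],
    [0.7, 0.9, 0.6],
]
print(min_cost_matching(weights))    # [(0, 0), (1, 2)]
print(align_features(weights, 0.5))  # ([(0, 0)], [1], [1, 2])
```

In this example b_1 is ignored because m ≠ n leaves it unmatched, while a_1 and b_2 are ignored because their matched edge has weight 0.6, above the assumed threshold of 0.5.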

3.3.3 Measuring Divergence of Datasets

If more than one source domain is available, intuitively we should transfer knowledge between two “similar” domains instead of two highly divergent domains. The relationship between the divergence of two datasets and the performance of models has been studied in [2], [48], and [9], which show that the divergence of two datasets and the model performance of knowledge transfer are related. However, there is currently no standard method for evaluating the divergence between two different domains to decide how to transfer knowledge.

There is also no known criterion for choosing a method to estimate the divergence of two datasets in transfer learning.

It may be possible to extend the feature similarity measure to estimate the divergence between datasets in two domains according to our feature-based knowledge framework, as proposed in [9]. Let F and G be the feature sets of two domains. From definition 1, we can use the following equation to estimate the divergence of the two datasets:

\[
D_{F,G}(F; T) \approx \sum_{(f_i, g_j) \in M(F,G)} D_{\{f_i\},\{g_j\}}(f_i; T)
\]

where T is the task of the domains and D_{F,G}(F; T) is the estimated distance between F and G under T. That is, the total distance between the two domains is the sum of the distances between the features that are mapped together under the task.

Therefore, assume that the divergence between features in the two datasets is given or computable. We can compute a one-to-one correspondence between the features of the source and target domain datasets by applying the graph matching algorithms. The sum of the feature divergences in the mapping may then be used to estimate the distance between the source and target datasets. We may use the minimum cost perfect matching algorithm to compute the best matching between nodes in a bipartite graph. Since no other one-to-one mapping has a smaller total divergence, the result of this matching gives a lower bound on the divergence of the two datasets.
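The estimate described above can be sketched as follows, assuming the feature divergences for each candidate source domain are already available as a weight matrix (rows for source features, columns for target features). The function name `estimate_dataset_divergence`, the domain names, and all values are hypothetical, and the exhaustive search is only suitable for tiny examples.

```python
from itertools import permutations
from typing import Dict, List, Sequence


def estimate_dataset_divergence(weights: Sequence[Sequence[float]]) -> float:
    """Sum of feature divergences over a minimum cost matching (brute force)."""
    m, n = len(weights), len(weights[0])
    if m > n:
        # Transpose so that the rows index the smaller feature set.
        weights = [list(col) for col in zip(*weights)]
        m, n = n, m
    return min(
        sum(weights[i][p[i]] for i in range(m))
        for p in permutations(range(n), m)
    )


# Hypothetical feature divergences between each source domain and the target.
candidates: Dict[str, List[List[float]]] = {
    "source_1": [[0.2, 0.9], [0.8, 0.3]],
    "source_2": [[0.6, 0.7], [0.5, 0.9]],
}
estimates = {name: estimate_dataset_divergence(w) for name, w in candidates.items()}
closest = min(estimates, key=estimates.get)
print({name: round(d, 3) for name, d in estimates.items()})  # {'source_1': 0.5, 'source_2': 1.2}
print(closest)                                               # source_1
```

Under these assumed values, source_1 would be the preferred domain to transfer from, because its estimated divergence from the target is smaller.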

Chapter 4 Experiments

In this chapter, we describe the knowledge transfer experiments conducted according to our framework under two scenarios. We give the details of the activity recognition datasets we used, the data preprocessing and feature mapping algorithms, the parameters, and the results. The feature reformulation procedures for the two scenarios are those described in section 3.2. We evaluate our knowledge transfer framework by the accuracy of the models.

Recall the two scenarios we gave in section 1.1.1.

1. A dataset collected in a laboratory environment is available, and we are preparing to deploy sensors to the target domain environment.

2. A dataset collected in a laboratory environment is available, and we have also collected and labelled some samples in the target environment.

In these two scenarios, the data samples collected in the laboratory environment are our source domain data. We propose the following solutions for the two scenarios, respectively, and apply our framework to run the experiments:

1. We transfer knowledge by sensor profiles.

2. We use labelled source and target domain data samples to transfer knowledge.

In our experiments, each feature is extracted from only one sensor, so feature profiles and sensor profiles are identical. We therefore use the terms sensor profile and feature profile interchangeably in the following sections.

Table 4.1: The number of features and activities in the datasets

Dataset            MAS S1    MAS S2
# of sensors       76        70
# of activities    23        25
