
國立政治大學 National Chengchi University

The first issue is caused by active users who have rated many items. Over time, the variances of their latent variables (σ²αj, σ²βj) and of the corresponding items' latent variables (σ²θi) can converge toward zero (σ² → 0), at which point subsequent updates to the users' latent variables (αj, βj) and the items' latent variable (θi) halt. If instead each variance is allowed to grow back slightly to a reasonably small value rather than vanish, the stopping-learning problem might be alleviated.

The second issue is caused by inactive users who have barely any data to update with. Once their data do arrive, large updates are possible. Compared with the first issue, this one concerns the mean of a latent variable (µ) more. That is, unless inactive users become active, their ratings of some items can greatly shift the central tendency of those items' latent quality (µθi). If instead the mean decays back toward the prior as the time since the last update grows, in the sense that we gradually lose an inactive user's information, the trouble can be mitigated.

Before giving a numerical example to reinforce the ideas above, we first elaborate on the term dynamical.

Damped dynamical system

Given a sequence of unit time steps, say one second or one minute, update every user's and item's information at every step, whether or not any datum has arrived.

This is exactly where the term dynamical comes in. Take the damped pendulum for example, a basic dynamical system in physics (Figure 4.1).

Figure 4.1: Damped pendulum (from Shane Mac, 2016)


Figure 4.2: Phase portrait of a damped pendulum system (from Strogatz, 1994, p. 173)

At any moment, if there are no external forces (no data), the black ball stays at its stationary point (the prior). Once it gets a push (one data point arrives), it starts swinging back and forth. Meanwhile, the downward gravitational force g (the parameter ξ) slows it down. Without any further external force, it uses up its kinetic energy and eventually comes to rest at the stationary position.

To make a strong connection between this dynamical system and the dynamical learning property we are about to demonstrate, the whole mechanism can be illustrated more explicitly by the phase portrait of the damped pendulum system (Figure 4.2). It is clear from this figure that the system has stable fixed (stationary) points at (kπ, 0) and saddle points at ((k − 1)π, 0), where k = 0, ±2, ···. The origin (0, 0) is asymptotically stable, giving rise to the stable spiral: all solutions starting near it spiral in toward it. In other words, if an external force leaves |θ| < π, the black ball will swing around the lowest position for a while and then come to rest near it. Furthermore, the figure also highlights the serious consequences of cold-start effects: if the external force is so powerful (a large potential update) that |θ| > π, then although it looks harmless in this picture, the final outcome becomes unpredictable and the resting point may no longer be (0, 0). (A real ball always returns to the lowest position as long as the external force does not break the link between the ball and the pivot; in a recommender system, however, the overall preference or quality can be distorted in this manner!)
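The swing-over behaviour described above can be checked numerically. The sketch below Euler-integrates a damped pendulum; all parameter values (g/L, the damping coefficient, the push strengths) are illustrative assumptions, not values from the thesis:

```python
import math

def simulate_pendulum(theta0, omega0, g_over_l=1.0, damping=0.1,
                      dt=0.01, steps=40000):
    """Euler-integrate theta'' = -(g/L)*sin(theta) - b*theta' and
    return the final angle after the motion has damped out."""
    theta, omega = theta0, omega0
    for _ in range(steps):
        accel = -g_over_l * math.sin(theta) - damping * omega
        omega += accel * dt
        theta += omega * dt
    return theta

# A gentle push: the energy stays below the saddle level, so |theta| < pi
# throughout and the ball settles back at the origin (0, 0).
theta_small = simulate_pendulum(theta0=0.0, omega0=1.0)

# A strong push: the ball swings over the top (|theta| > pi) and comes to
# rest in a different well, i.e. near 2*k*pi with k != 0.
theta_large = simulate_pendulum(theta0=0.0, omega0=3.0)
```

This is the pendulum analogue of a distorted preference: the strong push leaves the system at a resting point other than (0, 0), just as an overly large update can leave a latent quality far from its prior.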

With the explanation above, we have the following conclusion:


If there are no extra rating data (external forces) from users or items to update with for a long period of time, we may regard it as gradually losing their information (losing kinetic energy), and their preferences/qualities will eventually converge back to their priors (the stationary point).

To make this conclusion more precise, assume that

µprior = 0, σprior = 1

and that one data point Dt = (j, i, c) arrives at time t. Through Algorithm 1,

µinter = µ̃statical = 3, σinter = σ̃statical = 0.5

We then apply (4.19)–(4.20) to these four values (µprior, σprior, µinter, σinter) to generate a pair of actual update values (µ̃dynamical, σ̃dynamical). Without any further data, we can observe the behaviour of µ̃dynamical(t) and σ̃dynamical(t) by applying (4.19)–(4.20) to (µprior, σprior, µinter = µ̃dynamical(t − 1), σinter = σ̃dynamical(t − 1)) repeatedly, with a different ξ in each run (Figure 4.3).

Figure 4.3: Dynamical behaviour of µ̃dynamical(t) and σ̃dynamical(t) with different ξ. The horizontal lines represent the prior's information (µprior, σprior).
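The iteration just described can be sketched numerically. Since the exact forms of (4.19)–(4.20) are not reproduced in this excerpt, the `dynamical` function below is an assumed exponential-relaxation stand-in (not the thesis's actual update) that shows the same qualitative behaviour as Figure 4.3: with no new data, each step pulls (µ, σ) back toward the prior at a rate governed by ξ:

```python
import math

def dynamical(mu_prior, sigma_prior, mu_inter, sigma_inter, xi):
    """Hypothetical stand-in for (4.19)-(4.20): relax the current
    (mu, sigma) toward the prior; larger xi means faster relaxation."""
    d = math.exp(-xi)
    mu = mu_prior + d * (mu_inter - mu_prior)
    sigma = sigma_prior + d * (sigma_inter - sigma_prior)
    return mu, sigma

mu_prior, sigma_prior = 0.0, 1.0
mu, sigma = 3.0, 0.5      # the values right after the single update at time t

# No further data arrives: feed the previous output back in at each step.
for _ in range(50):
    mu, sigma = dynamical(mu_prior, sigma_prior, mu, sigma, xi=0.1)

# (mu, sigma) has relaxed back toward the prior (0, 1); a larger xi
# would reach the horizontal lines of Figure 4.3 in fewer steps.
```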


Just like the damped pendulum, µ̃dynamical(t) and σ̃dynamical(t) will ultimately converge back to where they started (the horizontal lines). Figure 4.3 also shows that the greater ξ (the gravitational force), the faster the convergence (the stop). Exactly what value ξ should take, however, is still undetermined; we give a range of sensible values through experimental results in Chapter 5.

The dynamical learning algorithm (Algorithm 2) is presented on the next page. Recall the two issues, stopping learning and cold start: both can be mitigated through the gradual-forgetting mechanism, in the sense that we gradually lose a user's or an item's information when the period between the present and the last update is long enough. This mechanism may help reduce biases introduced by the latent variables of inactive users who have rated few items and of unpopular items that have few raters.


Algorithm 2 Dynamical Learning Algorithm

Setup ξUser, ξItem, ξγ
Setup prior:
    µα(0), µβ(0), µθ(0), σα(0), σβ(0), σθ(0), µγ(0), σγ(0)
Set current:
    µα = µα(0), µβ = µβ(0), µθ = µθ(0), µγ = µγ(0)
    σα = σα(0), σβ = σβ(0), σθ = σθ(0), σγ = σγ(0)
dynamical(prior, current, ξ) { apply (4.19)–(4.20) }

At time t (t denotes clock time here):
(1) Collect all rating data received within the time interval (t − 1, t]:
    Dt = (···, (j, i, c), ···)′, an Nt-by-3 matrix, where Nt is the size of the data collected within the interval.
(2) Set up update flags (to indicate which users/items have been updated):
    flagU = {FALSE}1:J, flagI = {FALSE}1:I
(3) Check the amount of data and update the users'/items' information:
if Nt == 0 then
    user_current = dynamical(user_prior, user_current, ξUser)
    item_current = dynamical(item_prior, item_current, ξItem)
    cutpoints_current = dynamical(cutpoints_prior, cutpoints_current, ξγ)
else
    for k = 1:Nt do
        Dk = Dt[k, ]
        inter = statical(Dk)
        user_current[Dk[1]] = dynamical(user_prior[Dk[1]], inter[1:4], ξUser)
        item_current[Dk[2]] = dynamical(item_prior[Dk[2]], inter[5:6], ξItem)
        cutpoints_current[7:10] = dynamical(cutpoints_prior[7:10], inter[7:10], ξγ)
        flagU[Dk[1]] = TRUE
        flagI[Dk[2]] = TRUE
    end for
    user_current[!flagU] = dynamical(user_prior[!flagU], user_current[!flagU], ξUser)
    item_current[!flagI] = dynamical(item_prior[!flagI], item_current[!flagI], ξItem)
end if
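Algorithm 2's time loop can be sketched as follows. Everything here is an illustrative assumption: `statical` is a stub standing in for Algorithm 1, `dynamical` uses the same assumed exponential-relaxation form as the toy example above rather than the actual (4.19)–(4.20), the sizes and ξ values are arbitrary, and the cutpoint (γ) updates are omitted for brevity:

```python
import math

J, I = 5, 4                         # toy numbers of users and items
XI_USER, XI_ITEM = 0.1, 0.1         # illustrative damping parameters

# Each entity's state is a (mu, sigma) pair; priors are N(0, 1) here.
user_prior = [(0.0, 1.0)] * J
item_prior = [(0.0, 1.0)] * I
user_curr = list(user_prior)
item_curr = list(item_prior)

def dynamical(prior, current, xi):
    """Assumed stand-in for (4.19)-(4.20): damp `current` toward `prior`."""
    d = math.exp(-xi)
    return (prior[0] + d * (current[0] - prior[0]),
            prior[1] + d * (current[1] - prior[1]))

def statical(mu, sigma, rating):
    """Placeholder for Algorithm 1's static update (not the real one):
    shift the mean toward the rating and shrink the variance a little."""
    return (mu + 0.5 * (rating - mu), max(0.1, sigma * 0.9))

def step(data):
    """One tick of Algorithm 2. `data` lists the (j, i, c) ratings
    received in the interval (t-1, t]."""
    flag_u = [False] * J
    flag_i = [False] * I
    for j, i, c in data:
        inter_u = statical(*user_curr[j], c)
        user_curr[j] = dynamical(user_prior[j], inter_u, XI_USER)
        inter_i = statical(*item_curr[i], c)
        item_curr[i] = dynamical(item_prior[i], inter_i, XI_ITEM)
        flag_u[j], flag_i[i] = True, True
    # Entities with no data this tick are damped toward their priors,
    # which is the gradual-forgetting mechanism.
    for j in range(J):
        if not flag_u[j]:
            user_curr[j] = dynamical(user_prior[j], user_curr[j], XI_USER)
    for i in range(I):
        if not flag_i[i]:
            item_curr[i] = dynamical(item_prior[i], item_curr[i], XI_ITEM)

step([(0, 1, 4)])   # one rating arrives: user 0 rates item 1 with c = 4
step([])            # an empty tick: everyone relaxes toward the prior
```

Repeated empty ticks drive every (µ, σ) back to its prior, mirroring the Nt == 0 branch of Algorithm 2.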
