4. EMPIRICAL ANALYSIS
4.3 M ODEL R ESULTS
4.3.1 Expected Number of Reorders (Conditional Expectation)
The expected number of reorders is computed using Equation (1). However, there is a value created from a complex function we need to estimate. This complicated function is called the Gaussian hyper-geometric function. In Literature Review, we have briefly mentioned this kind of function. Usually, we could have two methods to compute this function. One is numerical integration and the other is the algorithms.
We will choose the method of numerical integration to compute it as the authors of the BG/NBD model. Therefore, before forecasting the expected number of reorders, we will introduce this function simply and the process to estimates its value in Microsoft Excel.
>1
z ; if z =1, the series converges for c−b−a>0.
And we could have the following recursive expression for each term of the series:
where u0 =1.
B. The Numerical Integration Method Employed in Microsoft Excel
It’s easy and simple for us to estimate2
F
1( ) ⋅
by utilizing the above series.We just need to continue adding terms to the series untilujis less than
“machine epsilon” (the smallest number that a specific computer recognizes as being bigger than zero). In Microsoft Excel, it’s easier to compute the series to a fixed number of terms, as a result, we will evaluate the first terms (j = 0, 1, 2…150).
In terminating the adding process of the series at j=150, whether have we evaluated too few terms? Because the speed with whichujÆ0 depends on the magnitude of z and the observed number of reorders X in Equation (1), where sampling rule, which is choosing the customers who have ever made their first purchase in the first quarter of 2006, the smallest T equals to 40.14.
Thus the biggest z equals to 0.52. However, there is no point in going beyond j = 40 for z<0.5 (Fader et al., 2005 b). As a result, based on the biggest z value, it’s feasible for us to terminating the series at j = 150.
As to X, the biggest X in our dataset equals 50. We have tried to continue adding the terms to the biggest extent that j = 240, and we found the final result is the same as while j = 150.
(2) Computing conditional expectation of number of reorders
After estimating the parameters and the value of2
F
1( ) ⋅
, we could also get the conditional expectation of number of reorders in Microsoft Excel. Figure 3 shows the screenshot of excel worksheet of conditional expectation.Figure3. Screenshot of Excel Worksheet of Conditional Expectation
In this worksheet, we first place the four model parameters we have
estimated in cells B1:B4. Then we place the purchase history
(
X =x,tx,T)
of a particular customer in cells B6:B9. For example, we choose the customer whose ID=34, X=20, tx=48.285 and T=49.142. For all customers, t equals to 52 because the length of time over which we wish to make the conditional forecast is a year. Furthermore, we use the method outlined above to compute2F1( )
⋅ which is central to Equation (1). Corresponding to the prototype of2F1( )
⋅ , the function parameters(
a ,,b c)
are given in cells E2:E4 and the function argument( )
z are setting in cell E5. In last, we evaluate the first 151 terms of the series (cell E7:E157) and these terms are summed in cell E1. Then we couldfinally get the valueE
(
Y( )
t X =x, tx,T,γ,α,a,b)
in cell C11. For this customer, we expect that he might have almost 19 reorders in next year 2007 conditioned on his purchase behavior in 2006.(3) Use- friendly worksheet
The original worksheet developed by the authors of the BG/NBD is easily operated even for marketing practitioners. However, looking at figure 3, we could only get one conditional expectation of a customer once at a time. Thus, based on Equation (1), we redesign a worksheet which is more user-friendly for customer base analysers. This worksheet is shown in Figure 4.
Figure4. Screenshot of Excel Worksheet of Conditional Expectation—New Version
In this new worksheet, the four parameters we have estimated are still placed in cell B2:B4. And the three parameters
(
a ,,b c)
and the argument( )
z of( )
⋅1
2F are given in column I to column L corresponding to each customers.
Then we equally evaluate the 151 terms of series from column P to column FJ.
And the value of 2F1
( )
⋅ are summed in column N. In this way, as long as we put into the purchase information of each customer (column E to column G) and set the value of t over which we want to forecast, we could get the value(
Y( )
t X x t T a b)
E = , x, ,γ,α, , in column M respectively.
4.3.2 Expected Active Probability
The expected active probability is computed using Equation (6) and (7) that we have derived. In coding these equations in Microsoft Excel, we luckily find that the elements of Equation (7) are similar to the log-likelihood function (Equation (13)) of the BG/NBD). In Equation (7), we could rewrite the Equation (7) with the
Beta-Gamma transformation functionB
( ) ( ) ( ) (
a,b =Γ a Γ b Γ a+b)
and A1, A2, and A3 of Equation (10) to(12) To combine Equation (6) and (12), we get Equation (13):
(
activeatT X x t T a b) (
L Aa bAX Ax t T)
With Equation (13), we could also easily code it in Excel by taking “log” first, summing these elements together and then taking “exponent” to get the final expected active probability. In order to explain the coding process completely, we have the following figure to show the screenshot worksheet.
( )
Figure5. Screenshot of Excel Worksheet of Expected Active Probability
Likewise, we place the four estimated parameters in B1:B4. We could also observe that this worksheet is almost identical to the one of Parameter Estimation (Figure 2) till Column I. And the formula of Column J
is
[
ln(
A_1) ]
+[
ln(
A_2) ]
+[
ln(
A_3) ]
−[
ln( )
all]
. Finally, taking “exponent” for all cells of Column J in Column K respectively, we could get the expected active probability of each customer.There is one thing we must pay attention to. When some customers have zero reorder in their observation time period (X=0, tx=0, T), their active probabilities are expected to 1. Some people may argue that whether it’s reasonable or dependable.
However, in the model development of BG/NBD, the concept of the active probability has been simultaneously considered in deriving the conditional
expectation of number of reorders (BG/NBD, Equation (A3)). And comparing the performance of the BG/NBD and Pareto/NBD on conditional expectation for the customers who have made zero reorder in T (called “zero class”), the forecasting capability of BG/NBD is better, especially for the zero class (Fader et al., 2005).
Therefore, it could raise us more confidence on the expected active probability for the zero class.
On the other hand, because the zero class could be easily recognized, we also
could focus these customers who at least have made one reorder in T and then take appropriate response to these expected active probabilities while implementing one-to-one marketing afterward.
4.3.3 Expected Dollar Volume per Reorder
The expected dollar volume per reorder is computed using Equation (8) and (10).
As the BG/NBD, we have already coded these equations in Microsoft Excel. The following figure shows the screenshot.
Figure6. Screenshot of Excel Worksheet of Expected Dollar Volume
There are two cases while computing the expected dollar volume per reorder. If X≦1, we use Equation (8); if X>1, we use Equation (10). One of the differences between them is the formulas of reliability coefficient ρ (Equation (9) and (12)). The other one is Z vs.z. After estimating parameters, the values of weighted average
2
σ (cell D2), W σ (cell E2), 2 σ (cell F2), 2A E
[ ]
θ (H2),z
(column B), ρ (cell G2), 1 and ρ (column J) have been also computed. Finally, we could get the expected xdollar volume per reorder in Column I respectively.