Time Coupling - Following the Trail of Data

Having developed methods for predicting both the criminal location and where the next crime will occur, we now extend these methods by adding in the concept of time.

As described so far, the algorithm neglects all chronological properties of the killings, discarding one of the most important pieces of information available to us. Ideally, we want to be able to quickly adapt to the criminal’s motion if they were to move to a different location or otherwise alter behavior over time.

To account for these possible changes, we weight events more heavily the more recently they occurred. Recent changes in behavior are then reflected more strongly in future predictions, which keeps the predictions better calibrated to the criminal’s latest motives. To keep the algorithm as general as before, we modify it to take a weighting function w as an input; the value of w is multiplied by f(dist) when we calculate Prob[g].

Although the weighting function w can take any form, we notice that

$$w(t) = \frac{(d - c)\,a}{\sqrt{t} + a} + c$$

yields reasonable results. Due to time constraints, this w was the only reasonable weighting function that we managed to study in depth, and therefore all later time-based models use this w as the time weighting function.

Program 3 The killer-finding and future-predicting algorithm with time-based weighting

G is the set of grid points after the search area is partitioned by the mesh
G′ is G augmented with a 30% boundary buffer zone
C is the set of past crimes, each of which has a location and date
f(d) is the distance decay function, which returns a higher number for more probable locations
w(t) is the weight function, which returns a higher number for smaller inputs
Prob is the probability of the killer location based on distance decay effects for the grid
∆Prob measures the change in distribution of Prob when adding a new crime scene

Function criminalLocationProb(crimes C, decay function f, weight function w):
    Let Prob be a mapping from G′ into R
    for g ∈ G′ do
        /* Calculate cumulative distance effect from all c ∈ C */
        Prob[g] = 0
        for c ∈ C do
            Prob[g] = Prob[g] + w(time since c_time) · f(distance between c and g)
    Normalize the values of Prob so that the sum of the values is 1
    return Prob

Function futureCrimeProb(crimes C, decay function f, weight function w):
    killer_prob_distrib = criminalLocationProb(C, f, w)
    Let ∆Prob be a mapping from G′ into R
    for g ∈ G′ do
        Let C′ be C augmented with a “virtual crime” at g
        V = criminalLocationProb(C′, f, w)
        /* Calculate sum of squared differences between killer_prob_distrib and V */
        ∆Prob[g] = 0
        for h ∈ G′ do
            ∆Prob[g] = ∆Prob[g] + (V[h] − killer_prob_distrib[h])²
    Normalize the values of ∆Prob so that the sum of the values is 1
    return ∆Prob

[Figure 5: A decay plot. Title: Weight Function. Horizontal axis: Time (Days), 0 to 100. Vertical axis: Weight, 2 to 10.]
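The time-weighted algorithm of Program 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter values a, c and d, the decay function `inverse_decay`, the reference time `now`, and the choice to date each virtual crime at `now` are all hypothetical, since the text does not specify them. The weight function follows the formula for w(t) as reconstructed here, which equals d at t = 0 and tends to c as t grows.

```python
import math

def sqrt_weight(t, a=5.0, c=1.0, d=10.0):
    """Time weight w(t) = (d - c) * a / (sqrt(t) + a) + c.

    The defaults for a, c and d are illustrative assumptions.
    """
    return (d - c) * a / (math.sqrt(t) + a) + c

def inverse_decay(dist):
    """Hypothetical distance decay f: larger for closer locations."""
    return 1.0 / (1.0 + dist)

def normalize(values):
    """Scale values so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize: all values are zero")
    return [v / total for v in values]

def criminal_location_prob(grid, crimes, f, w, now):
    """grid: list of (x, y); crimes: list of (x, y, time)."""
    prob = []
    for gx, gy in grid:
        total = 0.0
        for cx, cy, ct in crimes:
            # Weight each crime by its age, then by distance decay.
            total += w(now - ct) * f(math.hypot(gx - cx, gy - cy))
        prob.append(total)
    return normalize(prob)

def future_crime_prob(grid, crimes, f, w, now):
    base = criminal_location_prob(grid, crimes, f, w, now)
    delta = []
    for gx, gy in grid:
        # Assumption: the virtual crime happens at the current time.
        virtual = crimes + [(gx, gy, now)]
        v = criminal_location_prob(grid, virtual, f, w, now)
        # Sum of squared differences between the two distributions.
        delta.append(sum((vh - bh) ** 2 for vh, bh in zip(v, base)))
    return normalize(delta)

# Small example: an old crime in the west, a recent crime in the east.
grid = [(x, y) for x in range(5) for y in range(5)]
crimes = [(0, 2, 0.0), (4, 2, 100.0)]
now = 100.0
weighted = criminal_location_prob(grid, crimes, inverse_decay, sqrt_weight, now)
unweighted = criminal_location_prob(grid, crimes, inverse_decay, lambda t: 1.0, now)
next_crime = future_crime_prob(grid, crimes, inverse_decay, sqrt_weight, now)
```

In this toy example the unweighted distribution treats both crime sites symmetrically, while the weighted one favors the site of the more recent crime. Passing a constant weight function recovers the earlier, time-independent algorithm.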

5.4.1 Why It Helps

To emphasize the benefits of time coupling, we consider a randomly generated scenario. The points in the following graphs were automatically generated to be close to linear, and a set of times generated by a Poisson process was matched to these points in one of two ways:

• In the “correlated” case, the times were chosen to increase monotonically with x. That is, the larger the value of x, the larger the time that was assigned to it.

• In the “uncorrelated” case, the various times were assigned randomly to the points.
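The two cases above could be generated along the following lines. This is only a sketch: the function name `make_scenario`, the line y = x, the noise level, the x-range and the Poisson rate are all assumptions chosen for illustration, as the text does not give the actual generator settings.

```python
import random

def make_scenario(n, correlated, rate=0.1, noise=0.5, seed=0):
    """Return n synthetic crimes as (x, y, time) tuples.

    Points lie near the line y = x (an assumed shape); times are the
    arrival times of a Poisson process with the given rate.
    """
    rng = random.Random(seed)
    # Close-to-linear points, ordered by increasing x.
    xs = sorted(rng.uniform(0.0, 10.0) for _ in range(n))
    points = [(x, x + rng.gauss(0.0, noise)) for x in xs]
    # Poisson process: cumulative sums of exponential waiting times.
    times, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(rate)
        times.append(t)
    if not correlated:
        # Uncorrelated case: assign the same times in random order.
        rng.shuffle(times)
    # Correlated case: times stay sorted, so time increases with x.
    return [(x, y, t) for (x, y), t in zip(points, times)]

correlated_crimes = make_scenario(20, correlated=True)
uncorrelated_crimes = make_scenario(20, correlated=False)
```

Because both calls use the same seed, the two scenarios share the same points and the same set of times; only the pairing between them differs, which isolates the effect of the correlation.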

[Figure panels]
(a) Next Crime Probability Plot with unweighted time, uncorrelated points.
(b) Next Crime Probability Plot with weighted time, uncorrelated points.
(d) Next Crime Probability Plot with weighted time, correlated points.
(e) Criminal Location Probability Plot with unweighted time, correlated points.
(f) Criminal Location Probability Plot with weighted time, correlated points.

Notice how the addition of time-based weighting in the uncorrelated case had almost no effect when compared to the correlated case, in which the effect of the weighting was pronounced. As there was no correlation in the former case, any small perturbations will essentially cancel each other out, leaving each point with no expected change. Thus, the correctness of the plot is not drastically affected, if at all. In the latter case, however, the correlation was captured by the algorithm and the expected location of the next killing shifted rather drastically to accommodate this change.

Intuitively, if a serial killer has been moving north one mile every day for a while, it is reasonable to assume that his or her future kills will be farther north than the average of the past kills.

Perhaps even more interesting, the location of the next crime adapted much more quickly than the estimated location of the criminal. Again, this is sensible: a gradual shift in the location of crimes is much more indicative of a behavior-based shift than one that is residence-based.

6 Result Discussion

To evaluate our models, including various choices for our distance decay function, we look at case study data (see Appendix A) for actual apprehended serial killers: Peter Sutcliffe and Chester Turner.

6.1 Metrics

We use two metrics in quantifying the quality of our results. The first is Error Distance and the second is Search Cost.

Error Distance

This measures the straight-line (“as the crow flies”) distance between the calculated most probable spot and the actual spot. When searching for the criminal, the error distance tells us how far away our best guess was from the actual location. When predicting the next crime, it tells us how far the crime happened from the location where we allocated the most resources.
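As a minimal sketch, the Error Distance can be computed from a probability map as shown here. The function name `error_distance` and the representation of the grid as a list of (x, y) points with a parallel list of probabilities are assumptions made for illustration.

```python
import math

def error_distance(grid, prob, actual):
    """Straight-line distance from the most probable grid point to the actual spot.

    grid: list of (x, y); prob: probabilities in the same order;
    actual: (x, y) of the true location.
    """
    best = max(range(len(grid)), key=lambda i: prob[i])
    bx, by = grid[best]
    return math.hypot(bx - actual[0], by - actual[1])

# Example: the best guess is (0, 0) but the actual spot is (3, 4).
example_error = error_distance([(0, 0), (3, 4)], [0.9, 0.1], (3, 4))
```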

Search Cost

This measures how many grid squares, out of the total, would need to be searched before the correct spot is found if we go through the squares in order of decreasing probability. When searching for the criminal, this tells us how much area would need to be searched to find the criminal. We assume no preference among locations of the same color, so the cost may differ in an actual police search, where the initial search direction may vary. When predicting the location of the next crime, it tells us what percentage of our resources were wasted on locations that had higher priority than the actual location but saw no actual crime. This metric provides a much more realistic assessment of the quality of a prediction; however, in some instances the Error Distance still provides useful information.
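The Search Cost can be sketched in the same style. Two choices here are assumptions, not taken from the text: the correct spot is mapped to its nearest grid point, and ties are resolved optimistically by counting only squares with strictly higher probability than the correct one.

```python
import math

def search_cost(grid, prob, actual):
    """Fraction of grid squares searched, in decreasing probability order,
    up to and including the square containing the actual spot.
    """
    # Assumption: the correct square is the grid point nearest to the actual spot.
    target = min(
        range(len(grid)),
        key=lambda i: math.hypot(grid[i][0] - actual[0], grid[i][1] - actual[1]),
    )
    # Assumption: squares of equal probability are not counted ahead of the target.
    higher = sum(1 for p in prob if p > prob[target])
    return (higher + 1) / len(grid)

# Example: the actual spot lies in the third most probable of four squares.
example_cost = search_cost(
    [(0, 0), (1, 0), (2, 0), (3, 0)], [0.4, 0.3, 0.2, 0.1], (2, 0)
)
```

A value near 0 means the correct square was among the first searched; a value of 1 means every square had to be searched.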
