On August 2, 2007, journalist and Oakland Post editor-in-chief Chauncey Bailey was assassinated in broad daylight just a few blocks from my downtown apartment. Although the as-yet open case seemed an example of a political murder by a group Bailey was investigating, it refocused national attention on violent crime in Oakland. Around the same time, the Oakland Tribune published Sean Connelly and Katy Newton’s award-winning Not Just A Number, an interactive map of Oakland homicides (http://www.bayareanewsgroup.com/multimedia/iba/njn/). Connelly and Newton were particularly interested in the stories behind the city’s murder statistics. Where a majority of victims had previously been identified by mug shots, Not Just A Number made a special effort to contact surviving family members, friends, and neighbors to put a face on the names in the news. We were interested in publishing a service complementary to these stories that offered hard facts and current data.
We budgeted two weeks of rapid development across three people: I transformed the collected data into a web-ready service, Stamen interaction designer Tom Carden developed an immersive visual interface using Flash, and creative director Eric Rodenbeck oversaw the visual direction and accompanying language.
Our first priority for publishing information is to show everything. The home page of the Crimespotting site is a map (Figure 11-6), and the map shows all crime reports from the past week. The map is positioned over most of West Oakland and downtown, with the iconic Lake Merritt included for visual recognition. Familiar “slippy map” pan and zoom controls make the rest of town immediately available, northwest toward Berkeley/Emeryville, northeast toward the affluent hills and Piedmont, and southeast toward San Leandro and beyond. This presentation is in sharp contrast to the existing wizard approach currently published by the City of Oakland. Where the existing application requires some prior knowledge of Oakland and assumes that the visitor is looking for crime information about some specific place, the Crimespotting slippy map requires no existing knowledge or particular search agenda, instead supporting a more exploratory, meandering form of search behavior.
Peter Morville describes “findability,” a newly emerging concept covering orientation in an information space and the ways in which data is made self-evident through interface and description. The dynamic web-based map has come a long way in the past four years. In 2005, one national newspaper experimenting with Google Maps found that test subjects didn’t know they could move a map; now organizations such as the New York Times routinely push the boundaries of information design and presentation online. With our crime database, we felt it was important to make the information more findable by creating a data-first user interface. Data first means that it’s possible to start with a broad visual overview, and narrow down search results by type, time, or geography. We implemented the concept of “scented” widgets, introduced by UC Berkeley researchers Wesley Willett, Jeffrey Heer, and Maneesh Agrawala in a 2007 paper on embedded visualization (http://vis.berkeley.edu/papers/scented_widgets/2007-ScentedWidgets-InfoVis.pdf):
While effective information scent cues may be based upon the underlying information content (e.g., when the text in a web hyperlink describes the content of the linked document, it serves as a scent), others may involve various forms of metadata, including usage patterns. In the physical world, we often navigate in response to the activity of others. When a crowd forms we may join in to see what the source of interest is. Alternatively, we may intentionally avoid crowds or well-worn thoroughfares, taking “the road less travelled” to uncover lesser-known places of interest. In the context of information spaces, such social navigation can direct our attention to hot spots of interest or to under-explored regions.
Figure 11-6. The Oakland Crimespotting home page shows a map of crime reports from the past week. (See Color Plate 32.)
The date selector interface at the bottom-left corner of the main Crimespotting map interface shows a bar chart of reported crime over time (Figure 11-7), while the type selector at the lower right includes discreet tooltips showing the total numbers of each report type in the currently selected time span (Figure 11-8). Both serve dual functions: filtering and feedback. The date selector in particular was inspired by a similar feature in blog statistics package Measure Map (http://measuremap.com/), designed by Jeffrey Veen at Adaptive Path, and later rolled into Google’s own Analytics product. Measure Map’s date slider in turn was inspired by interface features on Flickr, so this is truly a case of imitation being a sincere form of flattery. Our own enhancement is a color differentiation between bars showing days with already loaded data (dark) and those without (light).
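As a rough illustration of the filtering-and-feedback role of the date selector, the data behind such a bar chart might be prepared as follows. This is a minimal Python sketch rather than the Flash code used on the site; the function name `daily_bars`, its arguments, and the "dark"/"light" labels are assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def daily_bars(report_dates, start, end, loaded_days):
    """Return one (day, count, shade) tuple per day from start to end inclusive.

    report_dates: iterable of datetime.date, one entry per crime report.
    loaded_days:  set of days whose full report data is already loaded
                  (hypothetical bookkeeping, not taken from the source).
    """
    counts = Counter(report_dates)
    bars = []
    day = start
    while day <= end:
        # Dark bars mark days with loaded data, light bars the rest.
        shade = "dark" if day in loaded_days else "light"
        bars.append((day, counts.get(day, 0), shade))
        day += timedelta(days=1)
    return bars


if __name__ == "__main__":
    reports = [date(2009, 1, 9), date(2009, 1, 9), date(2009, 1, 11)]
    for bar in daily_bars(reports, date(2009, 1, 9), date(2009, 1, 11),
                          {date(2009, 1, 9)}):
        print(bar)
```

Each tuple carries both the bar height (feedback) and the day it selects (filtering), which is the dual function described above.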
There’s a flip side to showing everything, and that’s information overload. We’ve introduced one form of visual report type filtering that’s inspired by Apple’s Spotlight feature in the Mac OS X System Preferences dialog box: when the mouse hovers over a particular crime report on the map or in the type selector for an extra few seconds, the interface darkens, leaving brightly lit areas around mapped reports matching that type. A robbery may be covered up by a different type of crime and thus be invisible on the map, but it can be surfaced and accessed through the spotlight display.
One unexpected dividend of the design process was a clearer understanding of where data specialization could become an interface commodity. We chose Flash as an implementation environment for its visual sophistication, and early on realized that it would be necessary to implement our own slippy map interaction code rather than rely on one of the many available JavaScript implementations, like OpenLayers (http://www.openlayers.org).
Figure 11-7. The date selector interface on the main Crimespotting map. (See Color Plate 33.)
Figure 11-8. The type selector shows the total numbers of each report type in the selected time span. (See Color Plate 34.)
Panning, zooming map interactions seemed like a useful feature to apply to other projects, so early work on crime data display resulted in a separate BSD-licensed software library called Modest Maps (http://www.modestmaps.com/). Modest Maps made it possible to see a clean break in functionality between data display and interaction metaphor, and the separation of the map-specific code library has assisted in rapid development for a significant number of unrelated projects, some from Stamen but many from outside designers and developers.
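To give a sense of the arithmetic that slippy map code has to perform, here is a minimal Python sketch of the widely used Web Mercator tiling convention, in which zoom level z divides the world into 2^z by 2^z tiles. It is not taken from the Modest Maps source; the function names are assumptions for illustration, and the Oakland coordinates in the usage lines are approximate.

```python
import math


def lat_lon_to_tile(lat, lon, zoom):
    """Convert latitude/longitude in degrees to fractional tile coordinates.

    Uses the common Web Mercator tiling convention: x grows eastward from
    longitude -180, y grows southward from roughly latitude 85.05 north.
    """
    n = 2 ** zoom
    x = (lon + 180.0) / 360.0 * n
    lat_rad = math.radians(lat)
    y = (1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n
    return x, y


def tile_index(lat, lon, zoom):
    """Return the integer (column, row) of the tile containing the point."""
    x, y = lat_lon_to_tile(lat, lon, zoom)
    return int(math.floor(x)), int(math.floor(y))


if __name__ == "__main__":
    # Approximate location of downtown Oakland (assumed for the example).
    print(tile_index(37.8044, -122.2712, 12))
```

Panning then amounts to shifting the fractional tile coordinates of the map center, and zooming to recomputing them at a different zoom level, which is why this logic separates cleanly from whatever data is drawn on top.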
Our second priority was to introduce a public, shareable address space for the data we collect. Generally, there are just a few flavors of URL in Crimespotting:
• The map view, http://oakland.crimespotting.org, and a larger one at http://oakland.crimespotting.org/map
• The report list view, e.g., http://oakland.crimespotting.org/crimes, http://oakland.crimespotting.org/crimes/Robbery, http://oakland.crimespotting.org/crimes/2009-01-09, and http://oakland.crimespotting.org/crimes/2009-01-09/Robbery
• The individual report view, e.g., http://oakland.crimespotting.org/crime/2009-01-09/Robbery/113569
• The police beat view, e.g., http://oakland.crimespotting.org/beat/04X
Most of these URLs were designed before their associated content. In particular, they had to conform to the ideals described in Matt Biddulph’s 2005 presentation, “Designing Data For Reuse” (http://www.hackdiary.com/slides/xtech2005/): human-readable, suggestive, hackable, opaque, permanent, and canonical. We have a hierarchy of addresses that makes sense when read aloud: “robberies on January 9th,” “police beat 04X,” and so on. Where there is potential ambiguity (for example, date-first “/crimes/2009-01-09/Robbery” versus type-first “/crimes/Robbery/2009-01-09”, or singular “/crime/Robbery” versus plural “/crimes/Robbery”), we introduce an HTTP redirect to the proper, canonical form. The redirect makes the URL more shareable by ensuring that my list of thefts on a given day matches yours. One aspect of the individual report URLs that’s an unfortunate compromise is the presence of a numeric primary key at the end of the address. PostgreSQL developer Josh Berkus has a special distaste for such keys, described in detail in his series on “Primary Keyvil” (http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327):
It didn’t take long (about 2 months) to discover that there was a serious problem with having ‘id’ as the only unique column. We got multiple hearings scheduled on the calendar, in the same docket, on the same date or in the same place. Were these duplicates or two different hearings? We couldn’t tell…. The essential problem is that an auto-number ‘id’ column contains no information about the record to which it’s connected, and tells you nothing about that record. It could be a duplicate, it could be unique, it could have ceased to exist if some idiot deleted the foreign key constraint.
Our excuse for including such keys is connected to a fairly loose understanding of how the Oakland Police Department keeps its records. Although every report has a case number, case numbers are frequently shared between different reports, and appear to link clusters of individual charges into a single broader incident. An extreme example is case number 08-056061 (http://oakland.crimespotting.org/crime/2008-08-01/murder/93014), a combination of nine murder, theft, and aggravated assault reports from one night in August 2008.
We’ve settled on the use of case number and text description (e.g., “ASSAULT W/SEMI-AUTOMATIC FIREARM ON PEACE OFFICER/FIREFIGHTER”) as a unique identifier, but that combination is too long for a comfortably readable URL, so the numeric ID acts as a surrogate.
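The relationship between the natural key and its numeric stand-in can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not the site's database schema: the class name `ReportRegistry` and its method are hypothetical, and the second description in the usage lines is a placeholder.

```python
class ReportRegistry:
    """Map a natural key (case number, description) to a short numeric ID."""

    def __init__(self):
        self._ids = {}       # (case_number, description) -> surrogate ID
        self._next_id = 1

    def surrogate_id(self, case_number, description):
        """Return the existing ID for this pair, or assign a new one."""
        key = (case_number, description)
        if key not in self._ids:
            self._ids[key] = self._next_id
            self._next_id += 1
        return self._ids[key]


if __name__ == "__main__":
    registry = ReportRegistry()
    assault = "ASSAULT W/SEMI-AUTOMATIC FIREARM ON PEACE OFFICER/FIREFIGHTER"
    # Two reports sharing one case number still get distinct IDs.
    print(registry.surrogate_id("08-056061", assault))
    print(registry.surrogate_id("08-056061", "PLACEHOLDER DESCRIPTION"))
    # Asking again for the same pair returns the same ID.
    print(registry.surrogate_id("08-056061", assault))
```

Because uniqueness is enforced on the natural key, the number in the URL never has to answer the "duplicate or different record?" question by itself; it only abbreviates an identity established elsewhere.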
The outcome of this attention to URLs is to turn online crime information into a social object. With CrimeWatch, referring to a report entails a procedural description of actions to take: go to the wizard, select this, press that, click over here, and so on; finding specific information about a particular report requires approximately a dozen separate clicks. With exposed URLs, the address itself is a complete description of the crime information.
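The redirect-to-canonical behavior described earlier for list URLs might look something like the following Python sketch. It assumes, based on the example addresses, that the canonical list form is plural and date-first; the function names are hypothetical, and case normalization and the actual HTTP response are left out.

```python
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def canonical_list_path(path):
    """Return the canonical form of a report list path, or None if the
    path is not a list view (for example, an individual report URL)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] not in ("crime", "crimes") or len(parts) > 3:
        return None
    dates = [p for p in parts[1:] if DATE_RE.match(p)]
    types = [p for p in parts[1:] if not DATE_RE.match(p)]
    if len(dates) > 1 or len(types) > 1:
        return None
    # Assumed canonical order: plural prefix, then date, then type.
    return "/" + "/".join(["crimes"] + dates + types)


def redirect_target(path):
    """Return the path to redirect to, or None if no redirect is needed."""
    canonical = canonical_list_path(path)
    if canonical is None or canonical == path:
        return None
    return canonical


if __name__ == "__main__":
    for p in ("/crimes/Robbery/2009-01-09", "/crime/Robbery",
              "/crimes/2009-01-09/Robbery"):
        print(p, "->", redirect_target(p))
```

A web framework would turn a non-None result into a permanent redirect, so that every variant a visitor might type or guess converges on one shareable address.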
Leonard Richardson identifies the address or URI as the primary technology that led to the WWW’s supplantation of other popular 1990s Internet protocols. In his excellent 2008 talk “Justice Will Take Us Millions Of Intricate Moves” (http://www.crummy.com/writing/speaking/2008-QCon/), Richardson argues that a triangle of technologies makes up what we know as the Web: the URI to address things, HTTP to move them around, and HTML to help client software understand what to do with them. All three are critical components.
URI design in particular is enjoying a flow of popular attention, but it’s lowly old HTML with its links and forms that makes a connected web truly possible. This explanation is a crucial elaboration on Roy T. Fielding’s 2000 PhD thesis introducing the idea of Representational State Transfer (REST) as an architectural style (http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm). Where possible, we try to follow these concepts by keeping the interactive flashy parts of Crimespotting firmly grounded in a supporting matrix of basic, 1993-vintage web pages. Our API outputs XML for Flash, RSS and Atom for feed readers, and CSV for spreadsheets, all vital uses of information that constitute a complete API.
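As one illustration of serving the same reports in a spreadsheet-friendly form, the following Python sketch writes a list of report records as CSV using the standard library. The field names and the sample record are assumptions made for the example; the actual column layout of the Crimespotting CSV output is not described here.

```python
import csv
import io

# Hypothetical column layout for the example.
FIELDS = ["date", "type", "case_number", "description"]


def reports_to_csv(reports):
    """Serialize a list of report dictionaries to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for report in reports:
        writer.writerow(report)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{
        "date": "2009-01-09",
        "type": "Robbery",
        "case_number": "00-000000",          # placeholder value
        "description": "EXAMPLE, WITH COMMA",  # placeholder value
    }]
    print(reports_to_csv(sample), end="")
```

The same list of records could be handed to an XML or feed serializer instead, which is what lets one address space serve Flash, feed readers, and spreadsheets alike.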
What launched in August 2007 included all the concepts described here, and relied on an expensive nightly scrape of CrimeWatch. We were fairly certain that someone in city government would eventually notice and complain, but we were lulled into a false sense of security by eight months of smooth sailing.
Revisiting
A short time after launch, our scraping bot began running into a wall. It was seemingly impossible to access the CrimeWatch site for any extended length of time, even with a regular browser. Conversations with the city information technology department suggested that once our access was publicly noted, it was considered unwelcome. The city offered some hints of an official method of accessing the data, but the wheels of bureaucracy grind slowly and nothing was forthcoming immediately. We regretfully took the site down, and spent a few months considering enhancements and strategies for bringing it back. There were two ideas we worked on during this time that ultimately never saw the light of day, and one new feature that we made public. An outcome of the revision process has been a more focused, pragmatic final data display.
In thinking about how best to represent the impact of local crime on a place, conversations with Adam Greenfield led to the idea of “violence as a force acting on a place.” One way to envision the long-term impact of a murder or robbery on the surrounding neighborhood is as an aura (see Figure 11-9). My initial mental model of this was a space-time sphere, perhaps a quarter mile in space radius and a week in time radius. The visual display would be a small spot that grows into a large stain as a time slider is moved closer to the actual event time. Greenfield suggested inverting the sphere into a pair of cones: the actual crime is a point, with “light cones” spreading forward and backward through time. The visual display would be a large, diffuse circle that clarifies and focuses into a tiny point as the time slider is moved closer to the exact event time. This display concept might be a better fit for showing causality. The potential for a crime might be broad, spread throughout a neighborhood. As events unfold, the malignant potential collapses to a point where a neighbor is victimized and then subsequently spreads out again as news gets around and a feeling of personal safety falls away.
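The difference between the two models comes down to how the drawn radius depends on the time slider's distance from the event. The Python sketch below is one possible reading of the description above, not the code from our experiments: it uses the quarter-mile and one-week figures as defaults, and the function names and the choice to cap the cone at the one-week mark are assumptions.

```python
import math


def sphere_radius(dt_days, max_radius=0.25, time_radius=7.0):
    """Radius in miles of the spot for the space-time sphere model.

    dt_days is the slider's offset from the event time. The spot is
    largest at the event itself and vanishes a week before or after.
    """
    if abs(dt_days) >= time_radius:
        return 0.0
    return max_radius * math.sqrt(1.0 - (dt_days / time_radius) ** 2)


def cone_radius(dt_days, max_radius=0.25, time_radius=7.0):
    """Radius in miles of the circle for the paired light-cone model.

    The circle shrinks to a point at the event time and widens linearly
    with distance in time, capped at max_radius (an assumed cutoff).
    """
    return max_radius * min(abs(dt_days), time_radius) / time_radius


if __name__ == "__main__":
    for dt in (-7.0, -3.5, 0.0, 3.5, 7.0):
        print(dt, sphere_radius(dt), cone_radius(dt))
```

Sweeping the slider through the event time makes the sphere swell and fade while the cones pinch to a point and reopen, which is the collapse-and-spread behavior the cone metaphor is meant to convey.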
We made a number of interactive maps exploring the cone metaphor, and discovered a few interesting things. One thing we noticed was that certain report types have unique visual signatures that depend on their enforcement patterns. Prostitution in particular is a special case. Where most of the reports we display are driven by the event (a victim calling in), prostitution is driven by police department decisions and scheduled crackdowns.
We routinely see weeks of quiet on the prostitution front interrupted by rapid, concentrated sweeps along San Pablo Avenue or International Boulevard. The cone display metaphor was unfortunately too esoteric for use on the primary website. The idea of time navigation on maps is fairly novel, and it was important for us to make the relationship between report display and time control as unambiguous as possible. Cones would have to remain in the experimental bin.
Figure 11-9. Two methods of envisioning the long-term impact of a murder or robbery on the surrounding neighborhood. (Diagram axes: time and space, with the crime marked in each.)
Another possible enhancement that received a great deal of serious attention during our downtime was the concept of distributed page scraping. The reason our normal collection process was vulnerable to interruption was that all requests had to originate from the same Internet address, making them trivially easy to block when needed. We experimented with a distributed model implemented as a Firefox browser add-on, executed in JavaScript and controlled centrally. We hoped that a sufficient number of our technically savvy visitors would be willing to download a browser toolbar icon and help collect data when indicated. Requests to the CrimeWatch server would be spread over a large number of visiting IP addresses, at unpredictable hours of the day: a pattern effectively indistinguishable from normal site use. An added benefit of this process was the promise of human error correction at the end. The final screen in the mediated scraping process included an overview of all the reports the user had just assisted in collecting, with the possibility of marking certain matches as incorrect.
One feature developed during this time was beat-specific pages, such as this one for the commercial and residential area between downtown and Lake Merritt: http://oakland.crimespotting.org/beat/04X (see Figure 11-10). When we initially developed the service, we consciously decided to ignore the administrative divisions present in CrimeWatch. Police service areas, city council districts, zip codes, and beats all seemed to us a distraction from location and proximity. After our launch, we quickly learned that we were wrong about police beats. Our users informed us that citizen communication with the department occurred via beat officers, who had specific geographic patrols and regular meetings with local residents. The division of reports into beats was important, because it matched the area of concern and responsibility for any given officer. Furthermore, beat boundaries frequently follow obvious physical features of the city: major streets, creeks, freeways, and railroad tracks all serve to impart a sense of neighborhood self-identification. Beat pages are now home to a static overview map of the area showing its borders, as well as portions of the API likely to be maximally useful to nontechnical users comfortable with common spreadsheet software. The eventual feedback we received on this feature was invaluable.
One resident said, “We have a Beat1X NCPC (Neighborhood Crime Prevention Council) meeting next week…I’ll be able to show up more prepared than OPD…our experience has been that they seldom if ever have current statistics to share with us.”
The Firefox browser plug-in and associated web service controller were completed and planned for limited, experimental rollout around the same time that the City of Oakland informed us that we would be provided with a nightly spreadsheet of complete citywide crime report information, along with street addresses or intersections where appropriate.
Starting in January 2008 and lasting until the present day, our data collection process has evolved from a lengthy, error-prone affair to a rapid one blessed by the municipal creators and stewards of the data we were working with.
Eventually, through the gracious assistance of Oakland CTO Bob Glaze, Program Manager Ahsan Baig, and the City’s Julian Ware and Andrew Wang, we were granted a nightly Microsoft Excel spreadsheet of official crime report data. The difference was like night and day: where before it required hours of data processing and time lag to collect information, now it was a matter of just a few quick minutes. Location information also became significantly more reliable, featuring block-level street addresses and intersections in place of colored icons.
Conclusion
Have we been successful in maintaining a data service that conforms to the ideals of beauty we began with? The crime report data featured in Crimespotting is interesting, regularly eliciting mail from concerned residents and supporting a population of email alert subscribers several hundred strong. Crime is a serious issue for any urban resident, but it is especially relevant in a city with Oakland’s reputation for trouble. Is our published data useful? We regularly hear from residents who use our news feeds and email alerts to stay abreast of neighborhood events or research new places to live. Are we sufficiently free or public?
Figure 11-10. A beat-specific page allows citizens to provide feedback to the officers who patrol their local areas. (See Color Plate 35.)
All site information is made available in a variety of forms suitable for a wide range of technical proficiency, from the simple daily mail subscription or spreadsheet to the more advanced news feed or XML-based API. The project has been a productive success, resulting in what we believe is a data service maximally useful to local residents.