
Chapter 9: Firebird: Predicting Fire Risk and Prioritizing Fire Inspections in

9.3 Data Description

An essential step before identifying and prioritizing potential properties to inspect is to join the data about commercial properties from multiple sources. This was done to construct as complete a picture as possible for the properties in Atlanta needing inspection, as required by the Atlanta Code of Ordinances. After the data joining, we identified 19,397 new potential commercial properties to inspect, through a process of property discovery that utilized AFRD and City of Atlanta fire code criteria. See Table 9.1 for a summary of the different lists of total commercial property inspections and commercial buildings we will be referring to throughout this chapter.

Name                                                Count
Current annual inspections                          2,573
Long list of potential new inspections²             19,397
Short list of potential new inspections             6,096
Current + short list inspections                    8,669
Current + short list inspections with risk score    5,022
Properties for building predictive model            8,223

Table 9.1: Summary of inspection and building lists

9.3.1 Data Sources

Firebird uses data from multiple sources, as tabulated in Table 9.2. AFRD provided us with a dataset of 2,543 historical fire incidents from July 2011 to March 2015, of which 34.3% were commercial fires. This dataset includes information about each fire incident, such as time, location, type, and cause of fire. AFRD also provided a dataset of fire inspections, with 32,488 inspection permit records from 2012 to 2015. The inspection data includes information such as inspected property types, address, and time of inspections. We also obtained structural information about commercial properties from a dataset purchased by AFRD from the CoStar Group, a commercial real estate agency. This dataset includes building-level features such as year built, building material, number of floors and units, building condition, and other information. A total of 8,223 commercial properties are documented by the CoStar Group in the City of Atlanta.

² We provided AFRD with two lists of potential properties: one longer list that was the most extensive that we could provide, and another shorter list that was more manageable to display on a map, refined using the most frequently inspected property usage types.

Source                          Name                               Description
Atlanta Fire Rescue Department  Fire Incidents                     Fire incidents from 2011 - 201
                                Fire Permits                       All permits filed by AFRD in 2012-2015
City of Atlanta                 Parcel                             Basic information for each parcel in Atlanta
                                Strategic Community Investigation  Information regarding parcel conditions
                                Business Licenses                  All the business licenses issued in Atlanta
Atlanta Police Department       Crime                              2014 crime in Atlanta
                                Liquor Licenses                    All filed liquor licenses by Police Department
Atlanta Regional Commission     Neighborhood Planning Unit         Boundary data for each Atlanta neighborhood
U.S. Census Bureau              Demographic                        Household number, population by race and age
                                Socioeconomic                      Household median income
CoStar Group, Inc               CoStar Properties                  Commercial property information
Google Place APIs               Google Place                       Information regarding places from Google Maps

Table 9.2: Firebird Data Sources Summary

While CoStar offers building-level information, parcel data from Atlanta’s Office of Buildings provides parcel-level information, such as property value, square footage, address, and other information about each parcel (a unit of land surrounding one or more buildings). The business license dataset obtained from the City of Atlanta’s Office of Revenue provides information about businesses that own commercial properties. This dataset has 20,020 records with over 20 features, including business type, business name, address, and owner. For non-business commercial properties (e.g., schools, churches, daycare centers), we obtained data from the Google Places API and the State of Georgia government.

To offer more information about properties for building a predictive risk model, we also obtained socioeconomic and demographic data from the U.S. Census Bureau, liquor license and 2014 crime data from the Atlanta Police Department, and Certificate of Occupancy (CO) data from the Atlanta Office of Buildings. All of these data sources contributed to discovering new inspections and developing our predictive model for commercial fire risk estimation.

9.3.2 Data Joining

A critical step of this study was to join different datasets together so that data from different sources about the same building or property could be unified to create the most complete picture of a given property. For instance, by joining fire incident and commercial property data, we can see which commercial properties caught fire in the past five years. Furthermore, by joining commercial property data with data from commercial real estate reports, such as those from the CoStar Group or the SCI Report, we can build a more comprehensive view of specific building characteristics, such as structure, parcel condition, and even vacancy information.
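As a minimal sketch of the first kind of join mentioned above, the following Python snippet attaches a count of past fire incidents to each property record. It assumes both datasets already share a cleaned key; the field names (`address`, `year_built`, `date`), the helper `flag_past_fires`, and all records are hypothetical and are not taken from the Firebird data.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
properties = [
    {"address": "100 MAIN STREET", "year_built": 1975},
    {"address": "200 OAK AVENUE", "year_built": 1998},
    {"address": "300 PINE ROAD", "year_built": 2005},
]
fire_incidents = [
    {"address": "100 MAIN STREET", "date": "2013-02-11"},
    {"address": "100 MAIN STREET", "date": "2014-08-30"},
    {"address": "300 PINE ROAD", "date": "2012-05-04"},
]


def flag_past_fires(properties, fire_incidents, key="address"):
    """Left-join incident counts onto properties using a shared key."""
    counts = Counter(incident[key] for incident in fire_incidents)
    joined = []
    for prop in properties:
        n = counts.get(prop[key], 0)  # 0 when the property has no incidents
        joined.append({**prop, "fire_count": n, "had_fire": n > 0})
    return joined


joined = flag_past_fires(properties, fire_incidents)
for row in joined:
    print(row["address"], row["fire_count"], row["had_fire"])
```

Every property is kept in the output, including those with no recorded incident, which is what a later modeling step would need in order to see both positive and negative examples.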

We joined the datasets together based primarily on spatial location information. There are three types of spatial or location information in our datasets: longitude and latitude, address information, and the parcel identification number, which is a unique ID number created by Fulton and DeKalb counties³ for tax purposes. We performed a location join based on these three types of location information. The variety of spatial information types and our method for joining them are illustrated in Figure 9.2. One obstacle we encountered was that spatial information had different formatting standards across the datasets. For example, the addresses from the CoStar Group were all in lowercase, with road names abbreviated instead of fully spelled out, while datasets from the multiple departments of the City of Atlanta tend to use a more consistent address format. Therefore, we cleaned the spatial information before joining the datasets. The address information from each dataset was first validated using the Google Geocoding API, which can auto-correct some misspellings in addresses. After validation, addresses were reformatted using the US Postal Service’s address validation API. The coordinate information was processed in ESRI ArcGIS software to filter out data points falling outside of the City of Atlanta. The cleaned datasets were then joined together based on the formatted addresses from the USPS API and the coordinate information from ArcGIS.

³ The City of Atlanta spans two separate counties, Fulton and DeKalb. Although both county governments provided building information, their parcel ID numbering schemes were not consistent. Thus, building information had to be joined using addresses and coordinates.
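The clean-then-join idea described above can be sketched in plain Python. This is not the pipeline used in the study, which relied on the Google Geocoding API, the USPS address validation API, and ArcGIS. Here a small rule-based normalizer stands in for the two address APIs, and a rectangular bounding box stands in for the city boundary filter. The suffix table, the bounding-box values, the field names, and the sample records are all assumptions made for illustration.

```python
import re

# Illustrative suffix table; a real cleaner would need a far larger one
# and would have to handle ambiguous tokens (e.g. "st" for "saint").
SUFFIXES = {"st": "STREET", "ave": "AVENUE", "rd": "ROAD",
            "dr": "DRIVE", "blvd": "BOULEVARD"}

# Rough, made-up rectangle standing in for the real city boundary polygon.
BBOX = {"min_lat": 33.64, "max_lat": 33.89, "min_lon": -84.56, "max_lon": -84.28}


def normalize_address(raw):
    """Strip punctuation, expand known suffixes, and upper-case the rest."""
    tokens = re.sub(r"[^\w\s]", " ", raw).lower().split()
    return " ".join(SUFFIXES.get(tok, tok.upper()) for tok in tokens)


def in_bbox(lat, lon, bbox=BBOX):
    """Return True when the point lies inside the bounding rectangle."""
    return (bbox["min_lat"] <= lat <= bbox["max_lat"]
            and bbox["min_lon"] <= lon <= bbox["max_lon"])


def join_on_address(left, right):
    """Inner-join two record lists on the normalized address.

    Records in `left` whose coordinates fall outside the box are dropped.
    """
    index = {normalize_address(rec["address"]): rec for rec in right}
    merged = []
    for rec in left:
        if not in_bbox(rec["lat"], rec["lon"]):
            continue
        key = normalize_address(rec["address"])
        match = index.get(key)
        if match is not None:
            merged.append({**match, **rec, "address": key})
    return merged


# Hypothetical building-level and parcel-level records.
buildings = [
    {"address": "123 example st.", "lat": 33.76, "lon": -84.39, "floors": 12},
    {"address": "9 far away rd", "lat": 34.50, "lon": -83.00, "floors": 2},
]
parcels = [
    {"address": "123 Example Street", "parcel_value": 1000000},
    {"address": "9 Far Away Road", "parcel_value": 5},
]

merged = join_on_address(buildings, parcels)
print(merged)
```

In this sketch the two spellings of the first address collapse to the same key, so the building and parcel records merge, while the second building is discarded because its coordinates fall outside the box. A production version would replace the rectangle with a point-in-polygon test against the actual boundary.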