Positional Accuracy and Geographic Bias of Four Methods of Geocoding in Epidemiologic Research

Mario Schootman, David Sterling, James Struthers, Yan Yan, Ted Laboube, Brett Emo, Gary Higgs

Research output: Contribution to journal › Article › Research › peer-review

51 Citations (Scopus)

Abstract

Purpose: We examined the geographic bias of four methods of geocoding addresses using ArcGIS, commercial firm, SAS/GIS, and aerial photography. We compared "point-in-polygon" (ArcGIS, commercial firm, and aerial photography) and the "look-up table" method (SAS/GIS) to allocate addresses to census geography, particularly as it relates to census-based poverty rates.

Methods: We randomly selected 299 addresses of children treated for asthma at an urban emergency department (1999-2001). The coordinates of the building address side door were obtained by constant offset based on ArcGIS and a commercial firm and true ground location based on aerial photography.

Results: Coordinates were available for 261 addresses across all methods. For 24% to 30% of geocoded road/door coordinates the positional error was 51 meters or greater, which was similar across geocoding methods. The mean bearing was -26.8 degrees for the vector of coordinates based on aerial photography and ArcGIS and 8.5 degrees for the vector based on aerial photography and the commercial firm (p < 0.0001). ArcGIS and the commercial firm performed very well relative to SAS/GIS in terms of allocation to census geography. For 20%, the door location based on aerial photography was assigned to a different block group compared to SAS/GIS. The block group poverty rate varied at least two standard deviations for 6% to 7% of addresses.

Conclusion: We found important differences in distance and bearing between geocoding relative to aerial photography. Allocation of locations based on aerial photography to census-based geographic areas could lead to substantial errors.
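
As an illustration only, the two quantities the abstract reports (positional error as a distance and bearing between a geocoded point and a reference point) and the "point-in-polygon" allocation can be sketched in a few lines of Python. This is not the authors' procedure, which relied on ArcGIS, a commercial firm, SAS/GIS, and aerial photography. The function names, the spherical-Earth radius, and the toy square polygon are assumptions made for this sketch.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius (spherical approximation)


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (distance in meters, initial bearing in degrees, -180..180)
    for the vector from point 1 to point 2, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)

    # Haversine great-circle distance
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    dist = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing, measured clockwise from north
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing = math.degrees(math.atan2(y, x))
    return dist, bearing


def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a list of (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical usage: a toy square standing in for a census block group.
block_group = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(point_in_polygon(1, 1, block_group))
print(distance_and_bearing(0.0, 0.0, 0.0, 1.0))
```

A geocoded point and its reference point that fall on opposite sides of a block group boundary would return different results from `point_in_polygon`, which is the kind of misallocation the abstract quantifies.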

Original language: English
Pages (from-to): 464-470
Number of pages: 7
Journal: Annals of Epidemiology
Volume: 17
Issue number: 6
DOI: 10.1016/j.annepidem.2006.10.015
State: Published - 1 Jun 2007

Fingerprint

Geographic Mapping
Photography
Censuses
Research
Geography
Poverty
Hospital Emergency Service
Asthma

Keywords

  • Bias
  • GPS
  • Geocoding
  • Geography

Cite this

Schootman, Mario ; Sterling, David ; Struthers, James ; Yan, Yan ; Laboube, Ted ; Emo, Brett ; Higgs, Gary. / Positional Accuracy and Geographic Bias of Four Methods of Geocoding in Epidemiologic Research. In: Annals of Epidemiology. 2007 ; Vol. 17, No. 6. pp. 464-470.
@article{1742eced11384ac68dcf075ea6625f62,
title = "Positional Accuracy and Geographic Bias of Four Methods of Geocoding in Epidemiologic Research",
abstract = "Purpose: We examined the geographic bias of four methods of geocoding addresses using ArcGIS, commercial firm, SAS/GIS, and aerial photography. We compared {"}point-in-polygon{"} (ArcGIS, commercial firm, and aerial photography) and the {"}look-up table{"} method (SAS/GIS) to allocate addresses to census geography, particularly as it relates to census-based poverty rates. Methods: We randomly selected 299 addresses of children treated for asthma at an urban emergency department (1999-2001). The coordinates of the building address side door were obtained by constant offset based on ArcGIS and a commercial firm and true ground location based on aerial photography. Results: Coordinates were available for 261 addresses across all methods. For 24{\%} to 30{\%} of geocoded road/door coordinates the positional error was 51 meters or greater, which was similar across geocoding methods. The mean bearing was -26.8 degrees for the vector of coordinates based on aerial photography and ArcGIS and 8.5 degrees for the vector based on aerial photography and the commercial firm (p < 0.0001). ArcGIS and the commercial firm performed very well relative to SAS/GIS in terms of allocation to census geography. For 20{\%}, the door location based on aerial photography was assigned to a different block group compared to SAS/GIS. The block group poverty rate varied at least two standard deviations for 6{\%} to 7{\%} of addresses. Conclusion: We found important differences in distance and bearing between geocoding relative to aerial photography. Allocation of locations based on aerial photography to census-based geographic areas could lead to substantial errors.",
keywords = "Bias, GPS, Geocoding, Geography",
author = "Mario Schootman and David Sterling and James Struthers and Yan Yan and Ted Laboube and Brett Emo and Gary Higgs",
year = "2007",
month = "6",
day = "1",
doi = "10.1016/j.annepidem.2006.10.015",
language = "English",
volume = "17",
pages = "464--470",
journal = "Annals of Epidemiology",
issn = "1047-2797",
publisher = "Elsevier Inc.",
number = "6",

}

Positional Accuracy and Geographic Bias of Four Methods of Geocoding in Epidemiologic Research. / Schootman, Mario; Sterling, David; Struthers, James; Yan, Yan; Laboube, Ted; Emo, Brett; Higgs, Gary.

In: Annals of Epidemiology, Vol. 17, No. 6, 01.06.2007, p. 464-470.


TY - JOUR
T1 - Positional Accuracy and Geographic Bias of Four Methods of Geocoding in Epidemiologic Research
AU - Schootman, Mario
AU - Sterling, David
AU - Struthers, James
AU - Yan, Yan
AU - Laboube, Ted
AU - Emo, Brett
AU - Higgs, Gary
PY - 2007/6/1
Y1 - 2007/6/1

N2 - Purpose: We examined the geographic bias of four methods of geocoding addresses using ArcGIS, commercial firm, SAS/GIS, and aerial photography. We compared "point-in-polygon" (ArcGIS, commercial firm, and aerial photography) and the "look-up table" method (SAS/GIS) to allocate addresses to census geography, particularly as it relates to census-based poverty rates. Methods: We randomly selected 299 addresses of children treated for asthma at an urban emergency department (1999-2001). The coordinates of the building address side door were obtained by constant offset based on ArcGIS and a commercial firm and true ground location based on aerial photography. Results: Coordinates were available for 261 addresses across all methods. For 24% to 30% of geocoded road/door coordinates the positional error was 51 meters or greater, which was similar across geocoding methods. The mean bearing was -26.8 degrees for the vector of coordinates based on aerial photography and ArcGIS and 8.5 degrees for the vector based on aerial photography and the commercial firm (p < 0.0001). ArcGIS and the commercial firm performed very well relative to SAS/GIS in terms of allocation to census geography. For 20%, the door location based on aerial photography was assigned to a different block group compared to SAS/GIS. The block group poverty rate varied at least two standard deviations for 6% to 7% of addresses. Conclusion: We found important differences in distance and bearing between geocoding relative to aerial photography. Allocation of locations based on aerial photography to census-based geographic areas could lead to substantial errors.

AB - Purpose: We examined the geographic bias of four methods of geocoding addresses using ArcGIS, commercial firm, SAS/GIS, and aerial photography. We compared "point-in-polygon" (ArcGIS, commercial firm, and aerial photography) and the "look-up table" method (SAS/GIS) to allocate addresses to census geography, particularly as it relates to census-based poverty rates. Methods: We randomly selected 299 addresses of children treated for asthma at an urban emergency department (1999-2001). The coordinates of the building address side door were obtained by constant offset based on ArcGIS and a commercial firm and true ground location based on aerial photography. Results: Coordinates were available for 261 addresses across all methods. For 24% to 30% of geocoded road/door coordinates the positional error was 51 meters or greater, which was similar across geocoding methods. The mean bearing was -26.8 degrees for the vector of coordinates based on aerial photography and ArcGIS and 8.5 degrees for the vector based on aerial photography and the commercial firm (p < 0.0001). ArcGIS and the commercial firm performed very well relative to SAS/GIS in terms of allocation to census geography. For 20%, the door location based on aerial photography was assigned to a different block group compared to SAS/GIS. The block group poverty rate varied at least two standard deviations for 6% to 7% of addresses. Conclusion: We found important differences in distance and bearing between geocoding relative to aerial photography. Allocation of locations based on aerial photography to census-based geographic areas could lead to substantial errors.

KW - Bias
KW - GPS
KW - Geocoding
KW - Geography
UR - http://www.scopus.com/inward/record.url?scp=34249102650&partnerID=8YFLogxK
U2 - 10.1016/j.annepidem.2006.10.015
DO - 10.1016/j.annepidem.2006.10.015
M3 - Article
VL - 17
SP - 464
EP - 470
JO - Annals of Epidemiology
JF - Annals of Epidemiology
SN - 1047-2797
IS - 6
ER -