January 19, 2024 | Zhenlong Li*, Huan Ning, Fengrui Jing*, M. Naser Lessani
This article examines the biases associated with mobile location data, specifically using SafeGraph Patterns data in the United States. The study investigates biases from multiple dimensions, including spatial, temporal, urbanization, demographic, and socioeconomic factors, over a five-year period from 2018 to 2022 across various geographic levels (state, county, census tract, and census block group). Key findings include:
1. **Spatial and Temporal Dynamics**: An average sampling rate of 7.5% is observed, with notable temporal dynamics and geographic disparities. The number of sampled devices is strongly correlated with the census population at the county level, but less so at the census tract and block group levels.
2. **Demographic and Socioeconomic Bias**: Minor biases are observed across gender, age, and moderate-income groups. However, groups such as Hispanic populations, low-income households, and individuals with low education levels are more strongly underrepresented, and the degree of underrepresentation varies over space, time, urbanization, and geographic levels.
3. **Urban-Rural Differences**: Sampling rates for rural areas increased significantly after 2019, whereas urban areas, which initially had higher sampling rates, became underrepresented.
4. **Geographic Distribution**: The data is more concentrated in the Deep South and Midwest states, with lower sampling rates in the Northeast and West.
5. **Temporal Trends**: The pandemic exacerbated disparities, with low-socioeconomic groups underrepresented during that period.
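The two quantities behind the first finding, the sampling rate and the correlation between sampled devices and census population, can be illustrated with a minimal Python sketch. It assumes the sampling rate is the number of sampled devices divided by the census population of the same geographic unit, and it uses Pearson's r as the correlation measure; both are assumptions for illustration rather than the study's confirmed procedure. The county names and counts are hypothetical, not values from the SafeGraph data.

```python
from math import sqrt


def sampling_rate(devices, population):
    """Sampled devices as a share of the census population (assumed definition)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return devices / population


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical county-level records: (name, sampled devices, census population).
counties = [
    ("County A", 7_500, 100_000),
    ("County B", 3_000, 50_000),
    ("County C", 18_000, 200_000),
    ("County D", 600, 10_000),
]

# Per-county sampling rate, e.g. 7,500 / 100,000 = 7.5% for County A.
rates = {name: sampling_rate(dev, pop) for name, dev, pop in counties}

# Correlation between device counts and population across counties.
r = pearson_r(
    [dev for _, dev, _ in counties],
    [pop for _, _, pop in counties],
)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"Pearson r (devices vs. population): {r:.3f}")
```

Running the same calculation separately on records for each geographic level (county, census tract, block group) is one way to compare how closely device counts track population at each scale.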
The study highlights the need for thorough evaluation of the spatiotemporal dynamics of bias when using mobile location datasets, providing valuable insights for future research and applications.
Understanding the bias of mobile location data across spatial scales and over time: A comprehensive analysis of SafeGraph data in the United States