A Big Data Driven Framework for Duplicate Device Detection from Multi-sourced Mobile Device Location Data
Authors:
Aliakbar Kabiri,
Aref Darzi,
Saeed Saleh Namadi,
Yixuan Pan,
Guangchen Zhao,
Qianqian Sun,
Mofeng Yang,
Mohammad Ashoori
Abstract:
Mobile Device Location Data (MDLD) is widely used in various fields. Yet its large-scale applications are limited by the biased or insufficient spatial coverage of data from individual vendors. One way to improve coverage is to integrate data from multiple vendors into a more representative dataset. Such integration requires further treatment of the multi-sourced dataset for two reasons. First, a person may carry more than one device, which can produce duplicated observations from the same data subject. Second, the same device may be captured by more than one data provider. This paper proposes a data integration methodology for multi-sourced data and investigates the feasibility of integrating data from several sources without introducing additional biases. Duplicate devices are identified by leveraging the uniqueness of each device's travel pattern. The proposed methodology is shown to be cost-effective while achieving the desired accuracy level. Our findings suggest that devices sharing the same imputed home location and the same top five most-visited locations during a month can represent the same user in the MDLD. More than 99.6% of the sample devices that share these attributes are observed at the same location simultaneously. Finally, the proposed algorithm has been successfully applied to the national-level MDLD of 2020 to produce the national passenger origin-destination data for the NextGeneration National Household Travel Survey (NextGen NHTS) program.
Submitted 28 February, 2023;
originally announced February 2023.
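The matching rule stated in the duplicate-device abstract above (same imputed home location plus the same top five most-visited locations in a month) can be illustrated with a minimal sketch. This is not the authors' implementation: the input layout, the function names `top_locations` and `find_duplicate_groups`, and the choice to skip devices with fewer than five distinct locations are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_locations(visits, k=5):
    """Return the k most-visited location IDs as an order-free set."""
    return frozenset(loc for loc, _ in Counter(visits).most_common(k))


def find_duplicate_groups(devices, k=5):
    """Group device IDs that share a home location and top-k locations.

    `devices` is a hypothetical mapping:
        device_id -> (imputed_home_location, list_of_visited_location_ids)
    covering one month of observations.
    """
    groups = defaultdict(list)
    for device_id, (home, visits) in devices.items():
        top = top_locations(visits, k)
        # Assumption: devices with fewer than k distinct locations carry
        # too little signal to match on, so they are skipped.
        if len(top) < k:
            continue
        groups[(home, top)].append(device_id)
    # Only keys shared by two or more devices indicate candidate duplicates.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

Grouping on a hashable key keeps the pass linear in the number of devices, which matters at national scale. A real pipeline would also need a rule for ties in visit counts, which this sketch leaves to `Counter.most_common`.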
A Big-Data Driven Framework to Estimating Vehicle Volume based on Mobile Device Location Data
Authors:
Mofeng Yang,
Weiyu Luo,
Mohammad Ashoori,
**a Mahmoudi,
Chenfeng Xiong,
Jiawei Lu,
Guangchen Zhao,
Saeed Saleh Namadi,
Songhua Hu,
Aliakbar Kabiri
Abstract:
Vehicle volume serves as a critical metric and the fundamental basis for traffic signal control, transportation project prioritization, road maintenance plans, and more. Traditional methods of quantifying vehicle volume rely on manual counting, video cameras, and loop detectors at a limited number of locations, and expanding them requires significant labor and cost. Researchers and private sector companies have also explored alternative solutions such as probe vehicle data, but these still suffer from a low penetration rate. In recent years, with technological advances in mobile sensors and mobile networks, Mobile Device Location Data (MDLD) has grown dramatically in its spatiotemporal coverage of the population and its mobility. This paper presents a big-data driven framework that can ingest terabytes of MDLD and estimate vehicle volume over a larger geographical area and with a larger sample size. The proposed framework first employs a series of cloud-based computational algorithms to extract multimodal trajectories and trip rosters. A scalable map matching and routing algorithm is then applied to snap and route vehicle trajectories to the roadway network. The observed vehicle counts on each roadway segment are weighted and calibrated against ground truth control totals, i.e., Annual Vehicle-Miles of Travel (AVMT) and Annual Average Daily Traffic (AADT). The proposed framework is implemented on the all-street network in the state of Maryland using MDLD for the entire year of 2019. Results indicate that the proposed framework produces reliable vehicle volume estimates and demonstrate its transferability and generalization ability.
Submitted 24 January, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
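The calibration step mentioned in the vehicle-volume abstract above (scaling observed segment counts to match a control total such as AVMT) can be sketched in its simplest form as a single ratio adjustment. This is an illustrative assumption, not the weighting scheme the paper uses: the function name `calibrate_to_vmt`, the tuple layout, and the use of one global factor are all hypothetical.

```python
def calibrate_to_vmt(segments, vmt_control):
    """Scale observed segment counts so total vehicle-miles match a control.

    `segments` is a hypothetical list of tuples:
        (segment_id, observed_vehicle_count, segment_length_miles)
    `vmt_control` is the ground-truth vehicle-miles total for the same
    network and period.
    """
    # Vehicle-miles implied by the raw (unweighted) observations.
    observed_vmt = sum(count * length for _, count, length in segments)
    if observed_vmt <= 0:
        raise ValueError("observed vehicle-miles must be positive")
    # Assumption: one global expansion factor for every segment.
    factor = vmt_control / observed_vmt
    return {seg_id: count * factor for seg_id, count, _ in segments}
```

A single factor preserves the relative volumes between segments, so it corrects only the overall sampling rate. Factors that vary by area or road class, or a second adjustment against AADT at count stations, would be natural extensions.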