Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/2211
Title: Multitask Learning for Crash Analysis: A Fine-Tuned LLM Framework Using Twitter Data
Authors: Jaradat, Shadi$Other$Palestinian
Nayak, Richi$Other$Other
Paz, Alexander$Other$Other
Ashqar, Huthaifa$AAUP$Palestinian
Elhenawy, Mohammed$Other$Other
Issue Date: 1-Sep-2024
Publisher: MDPI
Citation: Jaradat, S.; Nayak, R.; Paz, A.; Ashqar, H.I.; Elhenawy, M. Multitask Learning for Crash Analysis: A Fine-Tuned LLM Framework Using Twitter Data. Smart Cities 2024, 7, 2422-2465. https://doi.org/10.3390/smartcities7050095
Abstract: Road traffic crashes (RTCs) are a global public health issue, with traditional analysis methods often hindered by delays and incomplete data. Leveraging social media for real-time traffic safety analysis offers a promising alternative, yet effective frameworks for this integration are scarce. This study introduces a novel multitask learning (MTL) framework utilizing large language models (LLMs) to analyze RTC-related tweets from Australia. We collected 26,226 traffic-related tweets from May 2022 to May 2023. Using GPT-3.5, we extracted fifteen distinct features categorized into six classification tasks and nine information retrieval tasks. These features were then used to fine-tune GPT-2 for language modeling, which outperformed baseline models, including GPT-4o mini in zero-shot mode and XGBoost, across most tasks. Unlike traditional single-task classifiers that may miss critical details, our MTL approach simultaneously classifies RTC-related tweets and extracts detailed information in natural language. Our fine-tunedGPT-2 model achieved an average accuracy of 85% across the six classification tasks, surpassing the baseline GPT-4o mini model’s 64% and XGBoost’s 83.5%. In information retrieval tasks, our fine-tuned GPT-2 model achieved a BLEU-4 score of 0.22, a ROUGE-I score of 0.78, and a WER of 0.30, significantly outperforming the baseline GPT-4 mini model’s BLEU-4 score of 0.0674, ROUGE-I score of 0.2992, and WER of 2.0715. These results demonstrate the efficacy of our fine-tuned GPT-2 model in enhancing both classification and information retrieval, offering valuable insights for data-driven decision-making to improve road safety. This study is the first to explicitly apply social media data and LLMs within an MTL framework to enhance traffic safety.
URI: http://repository.aaup.edu/jspui/handle/123456789/2211
ISSN: https://doi.org/10.3390/smartcities7050095
Appears in Collections:Faculty & Staff Scientific Research publications

Files in This Item:
File Description SizeFormat 
smartcities-07-00095.pdf4.29 MBAdobe PDFThumbnail
View/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Admin Tools