Real-Time Food Allergen Detection Using OCR-Enhanced Machine Learning Techniques
Date
2025
Publisher
PeerJ Inc
Abstract
Food allergies are a significant public health concern, which makes precise and comprehensive identification of allergens in food products essential. Existing allergen datasets and detection approaches nevertheless have notable limitations, including small dataset sizes and low accuracy, particularly in real-time scenarios. To address these challenges, this study proposes a novel machine learning-based system that is evaluated in both real-time and offline environments. The system analyzes ingredient lists extracted from scanned product labels. Optical Character Recognition (OCR) retrieves the ingredient text in real time, and feature extraction techniques such as Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and Global Vectors for Word Representation (GloVe) are then applied to it. The resulting features are used to train various machine learning and deep learning models. Among the tested models, Logistic Regression (LR) performed best, achieving an accuracy of 0.99 at a computational cost of 13 milliseconds in offline testing. In real-time testing, where product images are captured and processed through the full pipeline, the system achieved an accuracy of 0.90.
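The feature extraction and classification stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ingredient strings and labels below are invented toy data standing in for OCR output, the function names (`tokenize`, `fit_idf`, `tfidf_vector`, `train_logreg`, `predict`) are hypothetical, and the TF-IDF weighting and plain gradient-descent logistic regression are common textbook variants chosen so that the example needs only the Python standard library.

```python
import math
import re
from collections import Counter

# Toy stand-ins for OCR-extracted ingredient lists (invented for illustration).
DOCS = [
    "wheat flour sugar milk powder",
    "peanut oil salt",
    "water sugar salt",
    "rice water salt",
    "milk chocolate cocoa butter sugar",
    "corn starch water sugar",
]
LABELS = [1, 1, 0, 0, 1, 0]  # 1 = contains an allergen (toy labels)

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def fit_idf(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf_vector(text, idf, vocab):
    """Length-normalised term frequency times IDF, in vocabulary order."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[tok] / total * idf[tok] for tok in vocab]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=3000):
    """Batch gradient descent on the mean logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(text, idf, vocab, w, b):
    """Return 1 if the model scores the ingredient text as allergenic."""
    x = tfidf_vector(text, idf, vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0

IDF = fit_idf(DOCS)
VOCAB = sorted(IDF)
X = [tfidf_vector(doc, IDF, VOCAB) for doc in DOCS]
W, B = train_logreg(X, LABELS)
TRAIN_PREDICTIONS = [predict(doc, IDF, VOCAB, W, B) for doc in DOCS]
```

In a real-time setting, the string passed to `predict` would come from an OCR step applied to a captured label image. That step is omitted here, and OCR noise such as misread characters would likely reduce accuracy relative to clean text, which is consistent with the gap between the offline and real-time scores reported above.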
Keywords
Food Allergy, Food Reaction, Machine Learning, OCR, Feature Extraction
WoS Q
Q2
Scopus Q
Q1
Source
PeerJ Computer Science
Volume
11