A Comprehensive Hybrid Approach for Indoor Scene Recognition Combining CNNs and Text-Based Features

Uckan, Taner; Aslan, Cengiz; Hark, Cengiz

dc.contributor.author	Uckan, Taner
dc.contributor.author	Aslan, Cengiz
dc.contributor.author	Hark, Cengiz
dc.date.accessioned	2025-09-30T16:36:07Z
dc.date.available	2025-09-30T16:36:07Z
dc.date.issued	2025
dc.description.abstract	Highlights. What are the main findings? Proposed an innovative two-channel hybrid model by integrating convolutional neural networks (CNNs) with a text-based classifier. Leveraged an extended dataset derived from multiple object recognition models, increasing input data diversity and achieving a text-based classifier accuracy of 73.30%. Achieved a significant improvement of 8.33% in accuracy compared to CNN-only models, with the hybrid model attaining an accuracy of 90.46%. What is the implication of the main finding? Efficient and Scalable Methodology: Utilized EfficientNet for CNN-based feature extraction and Bag-of-Words for text representation, ensuring computational efficiency and scalability. Application Potential: Addressed challenges in indoor scene recognition, such as complex backgrounds and object diversity, demonstrating significant potential for applications in robotics, intelligent surveillance, and assistive systems. Abstract. Indoor scene recognition is a computer vision task that identifies various indoor environments, such as offices, libraries, kitchens, and restaurants. This research area is particularly significant for applications in robotics, security, and assistance for individuals with disabilities, as it enables the categorization of spaces and the provision of contextual information. Convolutional Neural Networks (CNNs) are commonly employed in this field. While CNNs perform well in outdoor scene recognition by focusing on global features such as mountains and skies, they often struggle with indoor scenes, where local features like furniture and objects are more critical. In this study, the "MIT 67 Indoor Scene" dataset is used to extract and combine features from both a CNN and a text-based model utilizing object recognition outputs, resulting in a two-channel hybrid model. The experimental results demonstrate that this hybrid approach, which integrates natural language processing and image processing techniques, improves the test accuracy of the image processing model by 8.3%, achieving a notable success rate. Furthermore, this study offers contributions to new application areas in remote sensing, particularly in indoor scene understanding and indoor mapping.	en_US
dc.identifier.doi	10.3390/s25175350
dc.identifier.issn	1424-8220
dc.identifier.scopus	2-s2.0-105015894592
dc.identifier.uri	https://doi.org/10.3390/s25175350
dc.language.iso	en	en_US
dc.publisher	MDPI	en_US
dc.relation.ispartof	Sensors	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Indoor Scene Recognition	en_US
dc.subject	EfficientNet	en_US
dc.subject	Object Recognition	en_US
dc.subject	Deep Learning	en_US
dc.subject	Text Classification	en_US
dc.title	A Comprehensive Hybrid Approach for Indoor Scene Recognition Combining CNNs and Text-Based Features	en_US
dc.type	Article	en_US
dspace.entity.type	Publication
gdc.author.wosid	Uckan, Taner/Izp-9705-2023
gdc.coar.access	open access
gdc.coar.type	text::journal::journal article
gdc.description.department	T.C. Van Yüzüncü Yıl Üniversitesi	en_US
gdc.description.departmenttemp	[Uckan, Taner] Van Yuzuncu Yil Univ, Fac Engn, Dept Comp Engn, TR-65080 Van, Turkiye; [Aslan, Cengiz] Van Yuzuncu Yil Univ, Dept Artificial Intelligence & Robot, TR-65080 Van, Turkiye; [Hark, Cengiz] Inonu Univ, Fac Engn, Dept Comp Engn, TR-44050 Malatya, Turkiye	en_US
gdc.description.issue	17	en_US
gdc.description.publicationcategory	Article - International Refereed Journal - Institutional Faculty Member	en_US
gdc.description.scopusquality	Q2
gdc.description.volume	25	en_US
gdc.description.woscitationindex	Science Citation Index Expanded
gdc.description.wosquality	Q2
gdc.identifier.pmid	40942779
gdc.identifier.wos	WOS:001570119300001
gdc.index.type	WoS
gdc.index.type	Scopus
gdc.index.type	PubMed
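
Illustrative note on the abstract: the two-channel idea it describes (a Bag-of-Words vector built from object-recognition labels, combined with CNN image features) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the example labels, the placeholder CNN feature values, and the use of plain concatenation as the fusion step are all hypothetical; the record does not state how the two channels are combined, and a real pipeline would take image features from EfficientNet rather than hard-coded numbers.

```python
from collections import Counter


def build_vocabulary(label_lists):
    """Map each distinct object label to a column index (sorted for determinism)."""
    distinct = sorted({label for labels in label_lists for label in labels})
    return {label: index for index, label in enumerate(distinct)}


def bag_of_words(labels, vocabulary):
    """Count how often each vocabulary label occurs in one image's detections."""
    counts = Counter(labels)
    vector = [0.0] * len(vocabulary)
    for label, n in counts.items():
        if label in vocabulary:  # labels outside the vocabulary are ignored
            vector[vocabulary[label]] = float(n)
    return vector


def fuse(cnn_features, text_features):
    """Assumed fusion step: concatenate the image and text feature vectors."""
    return list(cnn_features) + list(text_features)


# Hypothetical object-detector outputs for two images (not from the dataset).
detections = [
    ["bed", "lamp", "pillow", "pillow"],
    ["oven", "sink", "lamp"],
]

# Placeholder values standing in for CNN (e.g. EfficientNet) feature vectors.
cnn_features = [
    [0.12, 0.80, 0.05],
    [0.67, 0.10, 0.33],
]

vocab = build_vocabulary(detections)
bow = [bag_of_words(labels, vocab) for labels in detections]
fused = [fuse(c, t) for c, t in zip(cnn_features, bow)]

for row in fused:
    print(row)
```

Each fused row could then be passed to any downstream classifier; which classifier the authors used is not stated in this record.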

Collections

WoS Indexed Publications Collection
PubMed Indexed Publications Collection
Scopus Indexed Publications Collection
