A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section “Mathematics and Computer Science”.
Deadline for manuscript submissions: 1 June 2024
https://www.mdpi.com/journal/mathematics/special_issues/X5FIAF3YZ8
Object detection and recognition are central tasks in computer vision, comprising the localization of object boundaries and the classification of the objects found. They have become essential in many applications, such as search and rescue, warehouse logistics, video surveillance or monitoring using UAVs, where the captured images are often low-resolution or blurred due to camera motion. Additionally, conditions differ from one scenario to another, which makes general solutions hard to achieve; thus, fine-tuning is essential in new scenarios.
The computer vision community has adopted deep-learning models in the last decade due to their superior performance compared with classical methods. These models require high processing power (GPUs) to train on large datasets and to provide inference in real time; typically, they employ convolutional neural networks (CNNs). Detectors are subdivided into two types: two-stage detectors, which aim at maximum accuracy at the potential cost of inference time; and one-stage (single-shot) detectors, which aim at minimum inference time for real-time applications. Two-stage detectors are dominated by the R-CNN family (region-based CNNs), such as Fast R-CNN, Faster R-CNN or Cascade R-CNN, while the YOLO family dominates one-stage detectors, with SSD and RetinaNet being other popular algorithms in this category. Additionally, in recent years, Vision Transformers (ViTs) have also been applied to object detection and recognition tasks. ViT-based algorithms, such as DETR or YOLOS, rely on a self-attention mechanism that learns the relationships between elements of a sequence, applying the transformer architecture to image grids. Several of these detectors, DETR among them, use a CNN as a backbone for feature extraction, given its ability to automatically extract relevant features.
In addition, object detection is closely related to other open challenges in machine vision such as Multi-Object Tracking (MOT), which involves both the detection and tracking of objects of interest appearing in a video sequence. The goal in this case is not only to identify and locate the objects contained in each frame, but also to associate them across frames to maintain track continuity and follow their dynamics over time. This task is usually solved by combining object detection and data association algorithms; relevant examples in the SORT family (Simple Online and Real-time Tracking) include DeepSORT, StrongSORT and OC-SORT.
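As an illustration of the data association step mentioned above, the following minimal sketch matches existing track boxes to new detections by intersection over union (IoU). It is a simplification under stated assumptions: SORT itself combines Kalman-filter prediction with Hungarian assignment, whereas this sketch uses a greedy matching and no motion model. The function names `iou` and `associate`, the `(x1, y1, x2, y2)` box format and the threshold value are hypothetical choices made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match track boxes to detection boxes by descending IoU.

    Assumption: greedy matching stands in for the Hungarian assignment
    used by SORT. Returns (matches, unmatched_tracks, unmatched_detections),
    where matches is a list of (track_index, detection_index) pairs.
    """
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if ti in used_t or di in used_d:
            continue  # each track and each detection is matched at most once
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_t = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_d = [i for i in range(len(detections)) if i not in used_d]
    return matches, unmatched_t, unmatched_d
```

In a tracker built along these lines, unmatched detections would typically start new tracks, and tracks left unmatched for several frames would be dropped.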
Regarding evaluation, developing fair comparisons among different solutions is complex, since they must account for the balance between accuracy and speed, the resolution of the input images, the configuration of the evaluation parameters, etc. Analyses are based on the available benchmarks and datasets, which are necessary to evaluate the performance of different architectures and configurations. Many authors have identified class imbalance as an additional challenge to achieving high accuracy. In this context, other deep-learning architectures, such as GANs or autoencoders, can be combined with detectors to enhance the training phase by increasing the size and variety of the datasets, for instance, to improve the detection of very small objects. Additionally, learning in imbalanced situations can be improved by adapting the loss function to focus on hard examples and avoid a bias towards the numerous negative examples.
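One well-known instance of such a loss adaptation is the focal loss introduced with RetinaNet, which scales the cross-entropy term by (1 - p_t)^gamma so that well-classified examples contribute little. The sketch below computes it for a single binary prediction. It is only an illustration: the function name `focal_loss` and the clamping constant `eps` are assumptions of this example, and the defaults alpha = 0.25 and gamma = 2 are the values reported in the RetinaNet paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    Computes -alpha_t * (1 - p_t)**gamma * log(p_t).
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 the expression reduces to an alpha-weighted cross-entropy, while larger values of gamma shrink the contribution of easy negatives, such as a background region predicted with p = 0.01, relative to hard ones.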
This Special Issue is aimed at contributions focused on these topics, showing the capability of novel mathematical algorithms, architectures and methods to improve object detection and recognition tasks, possibly including multi-object tracking, with an emphasis on new solutions and the analysis of their performance under challenging conditions in relevant applications.
Prof. Dr. Jesús García-Herrero
Prof. Dr. Johan Debayle
Guest Editors