Special Issue: Advanced Methods and Applications with Deep Learning in Object Recognition, 2nd Edition

Cesar Astudillo

Apr 18, 2026, 1:10:15 PM
to Capítulo IEEE Inteligencia Computacional Chile
Colleagues in the AI and Computer Vision community: I invite you to review the Call for Papers for the 2nd Edition of the Special Issue on "Advanced Methods and Applications with Deep Learning in Object Recognition" in the Mathematics Journal (MDPI, JCR Q1).

Guest Editors: Prof. Dr. Jesús García-Herrero & Prof. Dr. Sergio A Velastin.

This issue focuses on the latest advancements and challenges in object detection, including:
• Transformer-based architectures (ViT, DETR) vs. classical CNNs (YOLO, RCNN).
• Multi-Object Tracking (MOT) architectures.
• Edge AI, class imbalance, and model generalization.

Researchers interested in contributing mathematical algorithms and novel architectures to solve complex detection scenarios are encouraged to submit.

Contact the Guest Editors directly to submit a manuscript proposal (title and authors). See the full scope details below.

#DeepLearning #ComputerVision #ObjectRecognition #PatternRecognition #AI #MachineLearning

---------------------------------------------------------------
Special Issue: Advanced Methods and Applications with Deep Learning in Object Recognition, 2nd Edition
Mathematics Journal: https://www.mdpi.com/journal/mathematics
Journal Rank: JCR - Q1 (Mathematics) / CiteScore - Q1 (General Mathematics)

Guest Editors

Prof. Dr. Jesús García-Herrero
Affiliation: Professor, Computer Science Department, Universidad Carlos III de Madrid, Colmenarejo, Spain
E-Mail: jghe...@inf.uc3m.es
Personal Website: https://giaa.uc3m.es/

Interests: information fusion; artificial intelligence; machine vision; autonomous vehicles



Prof. Dr. Sergio A Velastin
Affiliation: Visiting Professor, Computer Science Department, Universidad Carlos III de Madrid, Colmenarejo, Spain
E-Mail: sergio....@ieee.org
Personal Website: https://scholar.google.es/citations?user=FsE86kwAAAAJ&hl=en

Interests: Object detection; Computer Vision; Action Recognition

Please contact the Guest Editors if you are interested in sending a manuscript for review, indicating title and list of authors/affiliations.


Dear Colleagues,

We are pleased to announce a Second Edition of the Special Issue entitled “Advanced Methods and Applications with Deep Learning in Object Recognition”.

This second edition keeps the focus on object detection and recognition as central tasks in computer vision, essential in most applications of this technology, such as video surveillance, warehouse logistics, search and rescue missions or video monitoring using UAVs. Detection conditions differ across situations, for example low resolution or varying perspectives in aerial images, which makes it challenging to achieve general solutions that require minimal fine-tuning to adapt to new scenarios.

Deep Learning has become the reference technique in the field of computer vision for object detection, with a wide range of detection models, generally classified into three main categories: two-stage detectors, one-stage detectors and detectors based on attention mechanisms (transformers).

The first two are the classical deep-learning detectors, built on CNN architectures. Two-stage methods first generate a set of region proposals that may contain objects and then classify these proposals, aiming for high accuracy at the cost of increased computational requirements. Representative solutions of this class are RCNN (Region-based Convolutional Neural Network), Fast-RCNN and Faster-RCNN. In contrast, one-stage methods apply a predefined grid to the image and directly compute predictions for each grid cell. Models like Single-Shot Detector (SSD) and You Only Look Once (YOLO) stand out as one-stage detectors, achieving a good trade-off between accuracy and computational efficiency.

Transformer-based models rely on attention-based architectures imported from the natural language processing (NLP) domain and adapted to computer vision tasks. ViT (Vision Transformer) was the first such approach designed for image classification; other alternatives appeared thereafter, such as the Detection Transformer (DETR) and You Only Look at One Sequence (YOLOS), which use attention mechanisms for object detection and classification as an alternative to region proposal generation and postprocessing. Such detectors often use a CNN backbone for relevant feature extraction, and use transformer layers to learn contextualized representations.
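
In many classical pipelines, the postprocessing that attention-based detectors aim to avoid is non-maximum suppression (NMS). The text above does not name it, so treating it as the intended step is an assumption. A minimal Python sketch of greedy NMS follows; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format and the 0.5 threshold are illustrative choices, not taken from any particular detector:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, highest score first."""
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

Set-prediction detectors such as DETR are trained to emit one box per object, which is why a duplicate-removal step like this one is not needed at inference time.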

Most classical detectors have been well studied and highly optimized, and their performance remains very competitive with newly developed models like DETR and its variants. Evaluation is therefore a fundamental task, using metrics like IoU or mAP to make fair comparisons among different solutions, considering the balance between accuracy and speed, the resolution of the input images, the configuration of the evaluation parameters, etc. Relevant challenges include learning models under class imbalance while avoiding biases against minority classes, and detecting small targets such as those in images captured by UAVs, which typically cover large areas with small targets, sometimes with a high density of objects, as in crowds or heavy traffic scenarios.
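
To make the role of the evaluation parameters concrete, here is a minimal Python sketch of IoU and of precision and recall at a single IoU threshold, the building block from which mAP is computed. The function names, the `(x1, y1, x2, y2)` box format and the matching rule (each prediction takes the best still-unmatched ground truth) are simplifying assumptions; benchmark protocols differ in their details:

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """predictions: list of (score, box); ground_truths: list of boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    # Higher-scoring predictions get the first chance to claim a ground truth.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(ground_truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall


preds = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.6, (40, 40, 50, 50))]
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
precision, recall = precision_recall(preds, gts)
```

Changing `iou_threshold` changes which predictions count as correct, which is one reason why reported scores are only comparable under the same evaluation configuration.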

Finally, object detection is closely related to other open challenges in machine vision like Multi-Object Tracking (MOT), which involves both the detection and tracking of objects of interest appearing in a video sequence. The goal in this case is not only to identify and locate the objects contained in each frame, but also to associate them across frames to maintain track continuity and follow their dynamics over time. This task is usually solved by combining algorithms for object detection and data association, with architectures like ByteTrack or the SORT family (Simple Online and Real-time Tracking), including DeepSORT, StrongSORT or OC-SORT. Typical MOT evaluation covers detection, localisation, and association over time, with metrics like MOTA, IDF1 or HOTA assessing object association and detection accuracy in complex scenarios such as sets of interacting object trajectories.
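
As an illustration of the data-association step, the Python sketch below matches existing tracks to new detections by IoU and computes MOTA from error counts. It is a simplification under stated assumptions: the matching is greedy, whereas SORT-style trackers typically solve an optimal assignment on motion-predicted boxes, and the names `associate` and `mota` and the 0.3 threshold are hypothetical:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match tracks {track_id: box} to a list of detection boxes."""
    # Score every track/detection pair, best overlap first.
    pairs = sorted(
        ((iou(t_box, d_box), t_id, d_idx)
         for t_id, t_box in tracks.items()
         for d_idx, d_box in enumerate(detections)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, t_id, d_idx in pairs:
        if score < iou_threshold:
            break  # all remaining pairs overlap even less
        if t_id in matches or d_idx in used:
            continue
        matches[t_id] = d_idx
        used.add(d_idx)
    unmatched_tracks = [t_id for t_id in tracks if t_id not in matches]
    unmatched_dets = [i for i in range(len(detections)) if i not in used]
    return matches, unmatched_tracks, unmatched_dets


def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches, lost_tracks, new_dets = associate(tracks, detections)
```

Unmatched detections would normally start new tracks and unmatched tracks would be kept alive for a few frames before deletion; both policies are left out of this sketch.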

This Special Issue welcomes contributions focused on these topics that show the capability of novel mathematical algorithms, architectures and methods to improve object detection and recognition tasks, possibly including multi-object tracking, with an emphasis on new solutions and analysis of their performance under challenging conditions in relevant applications.


Keywords:

object detection and classification
multi-object tracking
deep-learning architectures
transformers for object detection and segmentation
loss functions in learning
class imbalance
model generalization and domain shift
evaluation metrics and datasets
applications of object detection and object tracking
aerial object identification
edge AI for object detection
multi-modal object detection



-- 
Prof Sergio A Velastin PhD MSc SMIEEE
Professor of Applied Computer Vision
https://scholar.google.es/citations?user=FsE86kwAAAAJ&hl=en
sergio....@ieee.org
ORCID: 0000-0001-6775-7137

Honorary Professor, Universidad Carlos III de Madrid, Spain
