Re: [FULL] Windows 7 Ultimate Extreme Edition R1 64 Bit Edition AMIT

Tommye Hope

Jul 12, 2024, 4:35:01 AM

to theovladmenmi

A computer views all kinds of visual media as arrays of numerical values. As a consequence, it requires image processing algorithms to inspect the contents of images. This project compares three major image processing algorithms, Single Shot Detection (SSD), Faster Region-based Convolutional Neural Networks (Faster R-CNN), and You Only Look Once (YOLO), to find the fastest and most efficient of the three. In this comparative analysis, the performance of the three algorithms is evaluated on the Microsoft COCO (Common Objects in Context) dataset, and their strengths and limitations are analysed based on parameters such as accuracy, precision, and F1 score. From the results of the analysis, it can be concluded that the suitability of any one algorithm over the other two is dictated to a great extent by the use case it is applied to. In an identical testing environment, YOLO-v3 outperforms SSD and Faster R-CNN, making it the best of the three algorithms.

In recent times, industry has come to rely on computer vision. Automation, robotics, medicine, and surveillance make extensive use of deep learning [1]. Deep learning has become the most talked-about technology owing to its results, which are mainly obtained in applications involving language processing, object detection, and image classification. Market forecasts predict outstanding growth in the coming years. The main reasons cited for this are the accessibility of both powerful Graphics Processing Units (GPUs) and many datasets [1]. Both of these requirements are now readily available [1].

Image classification and detection are the most important pillars of object detection. There is a plethora of datasets available. Microsoft COCO is one such widely used dataset and serves as a benchmark for object detection. It provides a large-scale dataset for image detection and classification [2].

This review article aims to make a comparative analysis of SSD, Faster R-CNN, and YOLO. The first algorithm in the comparison is SSD, which adds several feature layers to the end of the network to facilitate detection [3]. Faster R-CNN is a unified, faster, and accurate method of object detection that uses a convolutional neural network. YOLO, developed by Joseph Redmon, offers an end-to-end network [3].

In this paper, the Microsoft COCO dataset is used as a common basis for the analysis and the same metrics are measured across all implementations, which makes the performance of the three algorithms comparable despite their different architectures. Comparing the effectiveness of these algorithms on the same dataset can give insight into the unique attributes of each algorithm, show how they differ from one another, and help determine which method of object recognition is most effective for a given scenario.
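As a rough illustration of the metrics mentioned above, precision, recall, and F1 score can be computed from counts of true positives, false positives, and false negatives. This is a minimal sketch: the function name and the counts are hypothetical and are not results from the work described here.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from detection counts."""
    predicted = true_positives + false_positives   # everything the detector reported
    actual = true_positives + false_negatives      # everything that was really there
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one detector on one class (illustrative only).
precision, recall, f1 = precision_recall_f1(
    true_positives=80, false_positives=20, false_negatives=20
)
print(precision, recall, f1)
```

Measuring the same three numbers for every detector on the same images is what makes the comparison fair.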

Object detection has been an important topic of research in recent times. With powerful learning tools available, deeper features can be easily detected and studied. This work is an attempt to compile information on various object detection tools and algorithms used by different researchers so that a comparative analysis can be done and meaningful conclusions can be drawn for applying them to object detection. The literature survey below provides the background for our work.

Another research work, by Kim et al., is discussed here. It uses a CNN with background subtraction to build a framework that detects and recognizes moving objects using CCTV (Closed Circuit Television) cameras. It applies a background subtraction algorithm to each frame [5]. An architecture similar to the one in that paper was used in our work.
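To make the idea of per-frame background subtraction concrete, here is a minimal sketch using a running-average background model and a fixed threshold. It is a generic illustration, not the method of Kim et al.; the function names, `alpha`, and `threshold` are assumptions chosen for the example.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels (1) whose absolute difference from the background exceeds the threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Tiny 3x3 grayscale example: a static scene, then a bright object appears in the centre.
background = [[10.0] * 3 for _ in range(3)]
frame = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]

mask = foreground_mask(background, frame)          # compare before updating the model
background = update_background(background, frame)  # then let the model adapt slowly
print(mask)
```

In a pipeline of the kind described, the masked regions would then be passed to a CNN for recognition.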

Tanvir Ahmed et al. have proposed a modified method based on an advanced YOLO v1 network model. It optimizes the loss function of YOLO v1, has a new inception model structure and a specialized pooling pyramid layer, and performs better. The advanced application of YOLO is taken from this research paper. It is also an end-to-end process, evaluated in an extensive experiment on the PASCAL VOC (Visual Object Classes) dataset. The improved network shows high effectiveness [7]. The YOLO model in our work was trained on PASCAL VOC using the technique proposed in that paper.

Wei Liu et al. came up with a new method of detecting objects in images using a single deep neural network. They named this procedure the Single Shot MultiBox Detector (SSD). According to the team, SSD is simple relative to methods that require object proposals because it completely eliminates the process that generates proposals. It also eliminates the subsequent pixel or feature resampling stages, so it combines everything into a single step. SSD is also easy to train and straightforward to integrate into a system, which makes detection easier. The primary feature of SSD is its use of multiscale convolutional bounding box outputs that are attached to several feature maps [8]. Training and model analysis of the SSD model in our work was inspired by the work discussed here.
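The idea of attaching box outputs to several feature maps can be illustrated by counting the default boxes they produce. The sketch below assumes the SSD300 configuration reported in the SSD paper (six feature maps with 4 or 6 boxes per location) and the paper's linear scale rule with s_min = 0.2 and s_max = 0.9; these values are not given in the text above, the function names are hypothetical, and released implementations may deviate from the rule.

```python
# (feature map side length, default boxes per location), assumed SSD300 configuration.
SSD300_FEATURE_MAPS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map, relative to the image size."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


def count_default_boxes(feature_maps):
    """Every cell of every feature map predicts a fixed number of default boxes."""
    return sum(side * side * per_location for side, per_location in feature_maps)


total = count_default_boxes(SSD300_FEATURE_MAPS)
scales = default_box_scales(len(SSD300_FEATURE_MAPS))
print(total)   # number of candidate boxes scored in a single forward pass
print(scales)  # small scales for the fine maps, large scales for the coarse ones
```

Fine feature maps contribute many small boxes and coarse maps a few large ones, which is how a single pass covers objects of different sizes.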

Another paper is based on an advanced type of SSD. In this paper, the authors introduce Tiny SSD, a single-shot detection deep convolutional neural network. Tiny SSD aims to ease real-time embedded object detection. It comprises a greatly enhanced stack of non-uniform Fire subnetworks and a stack of non-uniform subnetworks of SSD-based auxiliary convolutional feature layers. The best feature of Tiny SSD is its size of 2.3 MB, which is even smaller than Tiny YOLO. The results of this work show that Tiny SSD is well suited for embedded detection [9]. A similar SSD model was used for the purpose of comparison.

The paper by Pathak et al. describes the role of deep learning techniques that use CNNs for object detection. The paper also assesses some deep learning techniques for object detection systems. It states that deep CNNs work on the principle of weight sharing, and it gives information about some crucial points in CNNs.

In a recent research work, Chen et al. used anchor boxes better suited to face detection and a more exact regression loss function. They proposed a face detector termed YOLO face, based on YOLOv3, that aims to resolve the problem of detecting faces at varying scales. The authors concluded that their algorithm outperformed previous YOLO versions and their variants [10]. YOLOv3 was used in our work for comparison with other models.

In their research work, Fan et al. proposed an improved system for pedestrian detection based on the SSD model of object detection. In this multi-layered system, they introduced the Squeeze-and-Excitation model as an additional layer of the SSD model. The improved model employed self-learning, which further enhanced the accuracy of the system for small-scale pedestrian detection. Experiments on the INRIA dataset showed high accuracy [11]. This paper was used for the purpose of understanding the SSD model.

In a recent survey, Mittal et al. discussed Faster R-CNN, Cascade R-CNN, R-FCN, YOLO and its variants, SSD, RetinaNet, CornerNet, and Objects as Points as advanced phases in detectors based on deep learning. The paper provides a comprehensive summary of low-altitude datasets and the algorithms used for the respective work [12]. Our comparison was done using COCO metrics, similar to the comparison in that paper. The paper also discusses several other comparison techniques, which were considered in our work.

Artificial Intelligence (AI): It is a system's ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation [13].

Convolutional Neural Network (CNN): It is a type of artificial neural network that is mainly used to analyse images. It was inspired by the neurological experiments conducted by Hubel and Wiesel on the visual cortex [17], the primary region of the brain that processes visual sensory information. A CNN extracts features from images and detects patterns and structures in order to detect objects in the images. Its distinct feature is the presence of hidden convolutional layers. These layers apply filters to extract patterns from images. Each filter moves over the image to generate the output, and different filters recognize different patterns. Initial layers have filters that recognize simple patterns; filters in deeper layers recognize progressively more complex ones.
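The way a filter moves over an image can be sketched in a few lines. The example below is a minimal illustration with stride 1 and no padding; the function name, the 4x4 image, and the 2x2 vertical-edge filter are made up for the example, and, as in most deep learning frameworks, the filter is applied without flipping.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            value = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kernel_h)
                for dj in range(kernel_w)
            )
            row.append(value)
        output.append(row)
    return output


# 4x4 image with a vertical edge between the dark left half and the bright right half.
image = [[0, 0, 9, 9] for _ in range(4)]
vertical_edge = [[-1, 1], [-1, 1]]  # responds where brightness increases left to right

feature_map = convolve2d(image, vertical_edge)
print(feature_map)
```

The output is large only where the window straddles the edge, which is the sense in which a filter "recognizes" a pattern.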

SSD does not resample pixels or features for bounding box hypotheses and is as accurate as models that do. In addition, it is quite straightforward compared to methods that require object proposals, because it completely eliminates proposal generation and the subsequent pixel or feature resampling stages by encompassing all computation in a single network. Therefore, SSD is simple to train and can be easily integrated into systems that perform detection as one of their functions [8].

SSD is built on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and, for each box, scores for the presence of object class instances. After score generation, non-maximum suppression is used to produce the final detection results. The preliminary network layers are based on a standard architecture used for high-quality image classification, a VGG-16 network, truncated before any classification layers. An auxiliary structure, such as the conv6 layer, is added to the truncated base network to produce detections.
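The non-maximum suppression step mentioned above can be sketched as a greedy loop: keep the highest-scoring box, drop any remaining box that overlaps it too much, and repeat. This is a minimal single-class sketch; the function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are assumptions for the example rather than SSD's exact settings.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one object plus one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

In a multi-class detector this loop would typically be run once per class after discarding low-confidence boxes.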

Hard negative mining: After the matching step, almost all of the default boxes are negatives, especially when the total number of possible default boxes is high. This causes a large imbalance between the positive and negative training examples. Rather than using all the negative examples, SSD sorts them by the highest confidence loss for each default box and picks the top ones so that the ratio of negatives to positives is at most 3:1. This leads to faster optimization and more stable training [8].
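The selection step of hard negative mining can be sketched as follows, reading the 3:1 cap as at most three negatives per positive, as in the SSD paper. The function name, the per-box losses, and the labels are hypothetical values chosen for illustration.

```python
def select_hard_negatives(confidence_losses, is_positive, neg_pos_ratio=3):
    """Return indices of the negatives with the highest confidence loss,
    capped at neg_pos_ratio negatives per positive default box."""
    num_positives = sum(1 for positive in is_positive if positive)
    negatives = [i for i, positive in enumerate(is_positive) if not positive]
    # Hardest negatives first: the ones the model is most wrong about.
    negatives.sort(key=lambda i: confidence_losses[i], reverse=True)
    return sorted(negatives[: neg_pos_ratio * num_positives])


# Eight default boxes: one matched to an object, seven background (illustrative values).
losses = [0.1, 2.0, 0.5, 0.05, 1.5, 0.9, 0.3, 0.7]
is_positive = [False, True, False, False, False, False, False, False]

hard_negatives = select_hard_negatives(losses, is_positive)
print(hard_negatives)
```

Only the selected negatives and the positives would then contribute to the confidence loss for that training step.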