Vehicle Re-Identification Based on the Authenticity of Orthographic Projection

Authors: Qiang Lu; Fengwei Quan; Mingkai Qiu; Xiying Li
DIN: IJOER-NOV-2020-2
Abstract

Vehicle re-identification remains a problem that has not received much attention in the multimedia and vision communities. Most existing approaches focus mainly on the overall vehicle appearance and do not consider the changes in the visual appearance of the individual sides of a vehicle, called local deformation. In this paper, we propose a vehicle re-identification method based on the authenticity of orthographic projection, in which three sides of the vehicle are extracted and the local deformation is explicitly minimized by scaling each pair of corresponding sides to a uniform size before computing similarity. To compute the similarity between two vehicle images, we 1) construct 3D bounding boxes around the vehicles, 2) extract sub-images of the three sides of each vehicle, as in a three-view drawing, 3) compute the similarity between each pair of corresponding side sub-images, and 4) use their weighted mean as the final measure of similarity. After computing the similarity between the query vehicle and all candidate vehicles, we rank these similarities and take the vehicle with the maximum similarity as the best match. To evaluate this approach, we use a dataset of 240 pairs of vehicle images extracted from surveillance videos shot at seven locations in different directions. The experimental results show that the proposed method achieves 75.83% matching accuracy for the top-1 ranked vehicle and 91.25% accuracy for the top-5 vehicles.
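
Steps 3 and 4 and the ranking stage can be illustrated with a minimal Python sketch. It assumes the three per-side similarities have already been computed for each candidate. The function names, the example scores and the weights are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def weighted_similarity(side_sims: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of the per-side similarities of one query/candidate pair."""
    if len(side_sims) != len(weights):
        raise ValueError("need exactly one weight per side")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(side_sims, weights)) / total


def rank_candidates(
    per_candidate_sims: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Score every candidate and sort from most to least similar."""
    scored = [
        (cand_id, weighted_similarity(sims, weights))
        for cand_id, sims in per_candidate_sims.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top_k_hit(ranking: List[Tuple[str, float]], true_id: str, k: int) -> bool:
    """True if the correct vehicle appears among the k best-ranked candidates."""
    return true_id in [cand_id for cand_id, _ in ranking[:k]]


# Hypothetical per-side similarities for three candidates (three sides each).
example_sims = {
    "car_a": (0.9, 0.4, 0.5),
    "car_b": (0.6, 0.8, 0.7),
    "car_c": (0.2, 0.3, 0.1),
}
example_weights = (1.0, 2.0, 1.0)  # assumed weights, for illustration only
example_ranking = rank_candidates(example_sims, example_weights)
best_match = example_ranking[0][0]  # candidate with the maximum similarity
```

Top-1 and top-5 accuracy are then the fractions of queries for which `top_k_hit` returns `True` with `k=1` and `k=5`. If each of the 240 pairs counts as one query (an assumption, since this section does not say so), the reported 75.83% and 91.25% would correspond to 182 and 219 correct queries.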

Keywords
3D bounding boxes, local deformation, vehicle re-identification, weighted mean.
Introduction

Vehicle re-identification refers to the problem of identifying a query vehicle in a gallery of candidates captured by non-overlapping cameras. With the development of smart cities, how to search for a given vehicle in large-scale surveillance video data is an emerging and important problem that deserves more attention. Unlike person re-identification [1,2,3], which has attracted widespread attention, research on vehicle re-identification is still limited. In vehicle-related research, most work in the multimedia and computer vision fields focuses on vehicle detection [4], fine-grained recognition [5] and driver behavior modeling [6]. Unlike fine-grained vehicle recognition, which aims to recognize the model of a given vehicle, vehicle re-identification is a more challenging task, since many vehicles share the same model yet must be identified as different classes.

As each vehicle’s license plate number is unique, vehicle re-identification may not be difficult if the license plate number can be read [7,8,9]. However, in real-world applications, especially in most surveillance videos, license plates are often not clear enough to identify a vehicle because of camera distance and resolution. Therefore, license plate number matching is not a reliable method of re-identification. Instead, achieving high re-identification accuracy based on vehicle appearance is desired.

Existing re-identification approaches [10,11,12,13] focus on learning an embedding space in which similar images are pulled closer together and dissimilar images are pushed farther apart; the embedding space is optimized with a triplet loss [14], circle loss [15] or other improved triplet loss functions. These methods all reduce the intra-class variance between images of the same vehicle only implicitly; here we aim to construct a method that reduces the intra-class variance explicitly.
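
For readers unfamiliar with the embedding-based approach, the following sketch shows a generic margin-based triplet loss on plain Python vectors. It is the common textbook form rather than the exact formulation of any cited work; the margin value is an arbitrary assumption, and some variants use squared distances instead.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.3,  # assumed value, chosen only for illustration
) -> float:
    """Zero once the negative is at least `margin` farther from the anchor
    than the positive is; otherwise the size of the violation."""
    d_pos = euclidean(anchor, positive)  # same vehicle: should be small
    d_neg = euclidean(anchor, negative)  # different vehicle: should be large
    return max(0.0, d_pos - d_neg + margin)
```

The loss only constrains relative distances in the embedding space, which is why the reduction in intra-class variance is implicit: nothing in it models how the appearance of a vehicle changes between camera views.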

Conclusion

In actual situations, the main differences between images captured from different camera views are differences in scale and image deformation. To eliminate these differences, we need a method that can deal with the local deformation of each side of a vehicle, rather than its overall deformation. In this paper, we proposed a method for vehicle re-identification based on the authenticity of orthographic projection. This means that by making three-view drawings according to orthographic projection, we can obtain the real shape of all three sides of an object. This minimizes scale differences and image deformation.
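
The idea of bringing two corresponding side sub-images to a uniform size before comparing them can be sketched as follows. This is a simplified illustration under stated assumptions: the sub-images are taken to be already rectified to rectangles, nearest-neighbour resampling stands in for whatever scaling the method uses, and the mean absolute difference is a placeholder similarity that is not the measure used in the paper. All names and the default size are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image: rows of intensities in [0, 1]


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resampling of an image to out_h x out_w pixels."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def side_similarity(a: Image, b: Image, size: Tuple[int, int] = (32, 32)) -> float:
    """Scale both side sub-images to the same size, then compare pixel-wise.

    Returns 1.0 for identical images and 0.0 for maximally different ones.
    Placeholder measure only; the paper's own similarity is not reproduced.
    """
    h, w = size
    ra, rb = resize_nearest(a, h, w), resize_nearest(b, h, w)
    total_diff = sum(
        abs(x - y) for row_a, row_b in zip(ra, rb) for x, y in zip(row_a, row_b)
    )
    return 1.0 - total_diff / (h * w)
```

Because both inputs are resampled to the same grid, a side seen at two different scales yields the same comparison as if both views had been captured at one scale.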

By splitting vehicle information into three views and flipping the images as required, we can solve the problem presented by images being shot from different sides. This may provide a feasible way to perform vehicle re-identification when images are shot from different directions, including anterior and posterior views.
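
The flipping step can be pictured with a short sketch. It assumes that a side view taken from the left of a vehicle is roughly the mirror image of the view taken from the right, and that the viewing side is known in advance; the flag `seen_from_left` and the choice of the right-hand view as the canonical one are hypothetical conventions, not details from the paper.

```python
from typing import List

Image = List[List[float]]  # grayscale image: rows of pixel intensities


def flip_horizontal(img: Image) -> Image:
    """Mirror an image left to right, returning a new image."""
    return [list(reversed(row)) for row in img]


def align_side_view(img: Image, seen_from_left: bool) -> Image:
    """Bring a side sub-image into an assumed canonical orientation.

    Left-hand views are mirrored so that they can be compared directly
    with right-hand views; right-hand views are copied unchanged.
    """
    if seen_from_left:
        return flip_horizontal(img)
    return [list(row) for row in img]
```

After alignment, the side sub-images of a query and a candidate can be compared regardless of which side of the road each camera stood on.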

A comparison between our method and the direct DDIS method demonstrates the potential of our approach as an auxiliary method for improving the performance of some existing vehicle re-identification methods. Future work will expand this approach to other vehicle re-identification methods, ultimately aiming to build a system suitable for application to expressways.
