Semi-Supervised Structured Subspace Learning for Multi-View Clustering
Multi-view clustering aims to simultaneously obtain a consensus underlying subspace across multiple views and to cluster on the learned consensus subspace, and it has attracted considerable interest in image processing. In this paper, we propose ...
A Novel Hybrid Level Set Model for Non-Rigid Object Contour Tracking
Most existing trackers use bounding boxes for object tracking. However, the background contained in the bounding box inevitably decreases the accuracy of the target model, which affects the performance of the tracker and is particularly pronounced for non-...
Spatio-Temporal Correlation Guided Geometric Partitioning for Versatile Video Coding
Geometric partitioning has attracted increasing attention for its remarkable motion field description capability in the hybrid video coding framework. However, the existing geometric partitioning (GEO) scheme in Versatile Video Coding (VVC) causes a non-...
AVLSM: Adaptive Variational Level Set Model for Image Segmentation in the Presence of Severe Intensity Inhomogeneity and High Noise
Intensity inhomogeneity and noise are two common issues in images that inevitably pose significant challenges for image segmentation, and these challenges are particularly pronounced when the two issues appear simultaneously in one image. As a result, most existing level ...
View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering?
Weighted multi-view clustering (MVC) aims to combine the complementary information of multi-view data (such as image data with different types of features) in a weighted manner to obtain a consistent clustering result. However, when the cluster-wise ...
A Domain Gap Aware Generative Adversarial Network for Multi-Domain Image Translation
Recent image-to-image translation models have shown great success in mapping local textures between two domains. Existing approaches rely on a cycle-consistency constraint that supervises the generators to learn an inverse mapping. However, learning the ...
M⁵L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
Classifying hard samples during RGBT tracking is quite a challenging problem. Existing methods focus only on enlarging the boundary between positive and negative samples, but ignore the relations of multilevel hard samples, which are crucial for ...
Remote Sensing Scene Classification via Multi-Branch Local Attention Network
Remote sensing scene classification (RSSC) has been a hotspot in recent years and plays a very important role in the field of remote sensing image interpretation. With the recent development of convolutional neural networks, a significant breakthrough has been ...
Passive Non-Line-of-Sight Imaging Using Optimal Transport
Passive non-line-of-sight (NLOS) imaging has drawn great attention in recent years. However, all existing methods are limited to simple hidden scenes, low-quality reconstruction, and small-scale datasets. In this paper, we propose NLOS-OT, a ...
Euclidean Distance Approximations From Replacement Product Graphs
We introduce a new chamfering paradigm, locally connecting pixels to produce path distances that approximate Euclidean space by building a small network (a replacement product) inside each pixel. These “$...
Edge Tracing Using Gaussian Process Regression
We introduce a novel edge tracing algorithm using Gaussian process regression. Our edge-based segmentation algorithm models an edge of interest using Gaussian process regression and iteratively searches the image for edge pixels in a recursive Bayesian ...
A Prototypical Knowledge Oriented Adaptation Framework for Semantic Segmentation
A prevalent family of fully convolutional networks is capable of learning discriminative representations and producing structural predictions in semantic segmentation tasks. However, such supervised learning methods require a large amount of labeled data ...
Defocus Image Deblurring Network With Defocus Map Estimation as Auxiliary Task
Unlike object motion blur, defocus blur is caused by the limited depth of field of the camera. The defocus amount can be characterized by the parameter of the point spread function and thus forms a defocus map. In this paper, we ...
Loss Re-Scaling VQA: Revisiting the Language Prior Problem From a Class-Imbalance View
Recent studies have pointed out that many well-developed Visual Question Answering (VQA) models are heavily affected by the language prior problem. It refers to making predictions based on the co-occurrence pattern between textual questions and answers ...
Triple-Level Model Inferred Collaborative Network Architecture for Video Deraining
Video deraining is an important issue for outdoor vision systems and has been investigated extensively. However, designing optimal architectures by aggregating model formation and data distribution is a challenging task for video deraining. In this ...
Efficient and Accurate Stitching for 360° Dual-Fisheye Images and Videos
Back-to-back dual-fisheye cameras are the most cost-effective devices to capture 360° visual content. However, image and video stitching for such cameras often suffers from the effect of fisheye distortion, photometric inconsistency between the two ...
Completely Blind Quality Assessment of User Generated Video Content
In this work, we address the challenging problem of completely blind video quality assessment (BVQA) of user generated content (UGC). The challenge is twofold since the quality prediction model is oblivious of human opinion scores, and there are no well-...
Variational Abnormal Behavior Detection With Motion Consistency
Abnormal crowd behavior detection has recently attracted increasing attention due to its wide applications in computer vision research areas. However, it is still an extremely challenging task due to the great variability of abnormal behavior coupled with ...
Dynamic Facial Expression Recognition Under Partial Occlusion With Optical Flow Reconstruction
- Delphine Poux,
- Benjamin Allaert,
- Nacim Ihaddadene,
- Ioan Marius Bilasco,
- Chaabane Djeraba,
- Mohammed Bennamoun
Video facial expression recognition is useful for many applications and has received much interest lately. Although some methods give good results in controlled environments (no occlusion), recognition in the presence of partial facial occlusion remains a ...
Contrastive Self-Supervised Pre-Training for Video Quality Assessment
The video quality assessment (VQA) task remains a small-sample learning problem due to the costly effort required for manual annotation. Since existing VQA datasets are of limited scale, prior research tries to leverage models pre-trained on ImageNet to ...
Spectral-Spatial Boundary Detection in Hyperspectral Images
In this paper, we propose a novel method for boundary detection in close-range hyperspectral images. This method can effectively predict the boundaries of objects of similar colour but different materials. To effectively extract the material information ...
JigsawGAN: Auxiliary Learning for Solving Jigsaw Puzzles With Generative Adversarial Networks
The paper proposes a solution based on a Generative Adversarial Network (GAN) for solving jigsaw puzzles. The problem assumes that an image is divided into equal square pieces, and asks to recover the image according to information provided by the pieces. ...
Toward Scalable and Unified Example-Based Explanation and Outlier Detection
When neural networks are employed for high-stakes decision-making, it is desirable that they provide explanations for their predictions so that we can understand the features that have contributed to the decision. At the same time, it is important to ...
Two-Stage Copy-Move Forgery Detection With Self Deep Matching and Proposal SuperGlue
Copy-move forgery detection identifies a tampered image by detecting pasted and source regions in the same image. In this paper, we propose a novel two-stage framework specifically for copy-move forgery detection. The first stage is a backbone self deep ...
Fast Parameter-Free Multi-View Subspace Clustering With Consensus Anchor Guidance
Multi-view subspace clustering has attracted intensive attention for effectively fusing multi-view information by exploring appropriate graph structures. Although existing works have made impressive progress in clustering performance, most of them suffer ...
Dynamic Neural Network for Lossy-to-Lossless Image Coding
Lifting-based wavelet transform has been extensively used for efficient compression of various types of visual data. Generally, the performance of such coding schemes strongly depends on the lifting operators used, namely the prediction and update ...
Person Foreground Segmentation by Learning Multi-Domain Networks
Separating the dominant person from a complex background is important for human-related research and photo-editing applications. Existing segmentation algorithms are either too general to separate the person region accurately, or not capable ...
Universal Adversarial Patch Attack for Automatic Checkout Using Perceptual and Attentional Bias
Adversarial examples are inputs with imperceptible perturbations that easily mislead deep neural networks (DNNs). Recently, the adversarial patch, with noise confined to a small, localized region, has emerged for its feasibility in real-world ...
Adaptive Affinity for Associations in Multi-Target Multi-Camera Tracking
Data associations in multi-target multi-camera tracking (MTMCT) usually estimate affinity directly from re-identification (re-ID) feature distances. However, we argue that it might not be the best choice given the difference in matching scopes between re-...
Learning From Pixel-Level Label Noise: A New Perspective for Semi-Supervised Semantic Segmentation
This paper addresses semi-supervised semantic segmentation by exploiting a small set of images with pixel-level annotations (strong supervisions) and a large set of images with only image-level annotations (weak supervisions). Most existing approaches aim ...