ITPM: Vol 45, No 12

Volume 45, Issue 12, Dec. 2023

Publisher:

IEEE Computer Society
1730 Massachusetts Ave., NW Washington, DC
United States

ISSN: 0162-8828


research-article

A Parametrical Model for Instance-Dependent Label Noise

Pages 14055–14068 https://doi.org/10.1109/TPAMI.2023.3301876

In label-noise learning, estimating the transition matrix is a hot topic as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean labels to noisy ...

research-article

Open Access

A Survey of Vectorization Methods in Topological Data Analysis

Pages 14069–14080 https://doi.org/10.1109/TPAMI.2023.3308391

Attempts to incorporate topological information in supervised learning tasks have resulted in the creation of several techniques for vectorizing persistent homology barcodes. In this paper, we study thirteen such methods. Besides describing an ...

research-article

Action Recognition and Benchmark Using Event Cameras

Pages 14081–14097 https://doi.org/10.1109/TPAMI.2023.3300741

Recent years have witnessed remarkable achievements in video-based action recognition. Apart from traditional frame-based cameras, event cameras are bio-inspired vision sensors that only record pixel-wise brightness changes rather than the brightness ...

research-article

ActiveZero++: Mixed Domain Learning Stereo and Confidence-Based Depth Completion With Zero Annotation

Pages 14098–14113 https://doi.org/10.1109/TPAMI.2023.3305399

Learning-based stereo methods usually require a large-scale dataset with depth. Obtaining accurate depth in the real domain is difficult, whereas ground-truth depth is readily available in the simulation domain. In this article we propose a new ...

research-article

AdaPoinTr: Diverse Point Cloud Completion With Adaptive Geometry-Aware Transformers

Pages 14114–14130 https://doi.org/10.1109/TPAMI.2023.3309253

In this paper, we propose a Transformer encoder-decoder architecture, called PoinTr, which reformulates point cloud completion as a set-to-set translation problem and employs a geometry-aware block to model local geometric relationships explicitly. The ...

research-article

Open Access

Adversarial Data Augmentation for HMM-Based Anomaly Detection

Pages 14131–14143 https://doi.org/10.1109/TPAMI.2023.3303099

In this work, we concentrate on the detection of anomalous behaviors in systems operating in the physical world and for which it is usually not possible to have a complete set of all possible anomalies in advance. We present a data augmentation and ...

research-article

Attribute-Guided Collaborative Learning for Partial Person Re-Identification

Pages 14144–14160 https://doi.org/10.1109/TPAMI.2023.3312302

Partial person re-identification (ReID) aims to solve the problem of image spatial misalignment due to occlusions or out-of-views. Despite significant progress through the introduction of additional information, such as human pose landmarks, mask maps, ...

research-article

AUC-Oriented Domain Adaptation: From Theory to Algorithm

Pages 14161–14174 https://doi.org/10.1109/TPAMI.2023.3303943

The Area Under the ROC curve (AUC) is a crucial metric for machine learning, which is often a reasonable choice for applications like disease prediction and fraud detection where the datasets often exhibit a long-tail nature. However, most of the existing ...

research-article

Background-Aware Classification Activation Map for Weakly Supervised Object Localization

Pages 14175–14191 https://doi.org/10.1109/TPAMI.2023.3309621

Weakly supervised object localization (WSOL) relaxes the requirement of dense annotations for object localization by using image-level annotation to supervise the learning process. However, most WSOL methods only focus on forcing the object classifier to ...

research-article

Bailando++: 3D Dance GPT With Choreographic Memory

Pages 14192–14207 https://doi.org/10.1109/TPAMI.2023.3319435

Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. ...

research-article

CALDA: Improving Multi-Source Time Series Domain Adaptation With Contrastive Adversarial Learning

Pages 14208–14221 https://doi.org/10.1109/TPAMI.2023.3298346

Unsupervised domain adaptation (UDA) provides a strategy for improving machine learning performance in data-rich (target) domains where ground truth labels are inaccessible but can be found in related (source) domains. In cases where meta-domain ...

research-article

Coarse-to-Fine Multi-Scene Pose Regression With Transformers

Pages 14222–14233 https://doi.org/10.1109/TPAMI.2023.3310929

Absolute camera pose regressors estimate the position and orientation of a camera given the captured image alone. Typically, a convolutional backbone with a multi-layer perceptron (MLP) head is trained using images and pose labels to embed a single ...

research-article

Open Access

Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation

Pages 14234–14247 https://doi.org/10.1109/TPAMI.2023.3310261

Deep-learning models for 3D point cloud semantic segmentation exhibit limited generalization capabilities when trained and tested on data captured with different sensors or in varying environments due to domain shift. Domain adaptation methods can be ...

research-article

Open Access

Comprehensive Vulnerability Evaluation of Face Recognition Systems to Template Inversion Attacks via 3D Face Reconstruction

Pages 14248–14265 https://doi.org/10.1109/TPAMI.2023.3312123

In this article, we comprehensively evaluate the vulnerability of state-of-the-art face recognition systems to template inversion attacks using 3D face reconstruction. We propose a new method (called GaFaR) to reconstruct 3D faces from facial templates ...

research-article

Correlation Recurrent Units: A Novel Neural Architecture for Improving the Predictive Performance of Time-Series Data

Pages 14266–14283 https://doi.org/10.1109/TPAMI.2023.3319557

Time-series forecasting (TSF) is a traditional problem in the field of artificial intelligence, and models such as recurrent neural networks, long short-term memory, and gated recurrent units have contributed to improving its predictive accuracy. ...

research-article

CycleMLP: A MLP-Like Architecture for Dense Visual Predictions

Pages 14284–14300 https://doi.org/10.1109/TPAMI.2023.3303397

This article presents a simple yet effective multilayer perceptron (MLP) architecture, namely CycleMLP, which is a versatile neural backbone network capable of solving various tasks of dense visual predictions such as object detection, segmentation, and ...

research-article

Digging Into Uncertainty-Based Pseudo-Label for Robust Stereo Matching

Pages 14301–14320 https://doi.org/10.1109/TPAMI.2023.3300976

Due to the domain differences and unbalanced disparity distribution across multiple datasets, current stereo matching approaches are commonly limited to a specific dataset and generalize poorly to others. Such domain shift issue is usually addressed by ...

research-article

Discrete and Balanced Spectral Clustering With Scalability

Pages 14321–14336 https://doi.org/10.1109/TPAMI.2023.3311828

Spectral Clustering (SC) has been the main subject of intensive research due to its remarkable clustering performance. Despite its successes, most existing SC methods suffer from several critical issues. First, they typically involve two independent ...

research-article

Distributionally Robust Memory Evolution With Generalized Divergence for Continual Learning

Pages 14337–14352 https://doi.org/10.1109/TPAMI.2023.3317874

Continual learning (CL) aims to learn a non-stationary data distribution and not forget previous knowledge. The effectiveness of existing approaches that rely on memory replay can decrease over time as the model tends to overfit the stored examples. As a ...

research-article

Domain Adaptive Object Detection via Balancing Between Self-Training and Adversarial Learning

Pages 14353–14365 https://doi.org/10.1109/TPAMI.2023.3290135

Deep learning based object detectors struggle to generalize to a new target domain bearing significant variations in object and background. Most current methods align domains by using image or instance-level adversarial feature alignment. This often ...

research-article

DPCN++: Differentiable Phase Correlation Network for Versatile Pose Registration

Pages 14366–14384 https://doi.org/10.1109/TPAMI.2023.3317501

Pose registration is critical in vision and robotics. This article focuses on the challenging task of initialization-free pose registration up to 7DoF for homogeneous and heterogeneous measurements. While recent learning-based methods show promise using ...

research-article

DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape Generation

Pages 14385–14403 https://doi.org/10.1109/TPAMI.2023.3321329

This paper presents a new text-guided 3D shape generation approach DreamStone that uses images as a stepping stone to bridge the gap between the text and shape modalities for generating 3D shapes without requiring paired text and 3D data. The core of our ...

research-article

Dynamic Keypoint Detection Network for Image Matching

Pages 14404–14419 https://doi.org/10.1109/TPAMI.2023.3307889

Establishing effective correspondences between a pair of images is difficult due to real-world challenges such as illumination, viewpoint and scale variations. Modern detector-based methods typically learn fixed detectors from a given dataset, which is ...

research-article

Dynamic Loss for Robust Learning

Pages 14420–14434 https://doi.org/10.1109/TPAMI.2023.3311636

Label noise and class imbalance are common challenges encountered in real-world datasets. Existing approaches for robust learning often focus on addressing either label noise or class imbalance individually, resulting in suboptimal performance when both ...

research-article

Edge Guided GANs With Multi-Scale Contrastive Learning for Semantic Image Synthesis

Pages 14435–14452 https://doi.org/10.1109/TPAMI.2023.3298721

We propose a novel edge guided generative adversarial network with contrastive learning (ECGAN) for the challenging semantic image synthesis ...

research-article

Efficient Federated Learning Via Local Adaptive Amended Optimizer With Linear Speedup

Pages 14453–14464 https://doi.org/10.1109/TPAMI.2023.3300886

Adaptive optimization has achieved notable success in distributed learning, while extending adaptive optimizers to federated learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in global ...

research-article

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

Pages 14465–14480 https://doi.org/10.1109/TPAMI.2023.3316020

During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present ...

research-article

End-to-End One-Shot Human Parsing

Pages 14481–14496 https://doi.org/10.1109/TPAMI.2023.3301672

Previous human parsing methods are limited to parsing humans into pre-defined classes, which is inflexible for practical fashion applications that often have new fashion item classes. In this paper, we define a novel one-shot human parsing (OSHP) task ...

research-article

Evaluating the Generalization Ability of Super-Resolution Networks

Pages 14497–14513 https://doi.org/10.1109/TPAMI.2023.3312313

Performance and generalization ability are two important aspects in evaluating deep learning models. However, research on the generalization ability of Super-Resolution (SR) networks is currently absent. Assessing the generalization ability of deep ...

research-article

Evolving Domain Generalization via Latent Structure-Aware Sequential Autoencoder

Pages 14514–14527 https://doi.org/10.1109/TPAMI.2023.3319984

Domain generalization (DG) refers to the problem of generalizing machine learning systems to out-of-distribution (OOD) data with knowledge learned from several provided source domains. Most prior works confine themselves to stationary and discrete ...


IEEE Transactions on Pattern Analysis and Machine Intelligence
