1CFCS, Peking University
2School of EECS, Peking University
3Beijing Institute for General Artificial Intelligence
4Tsinghua University
5University of California, Los Angeles
* equal contributions
† corresponding author
We propose to learn generalizable object perception and manipulation skills via Generalizable and Actionable Parts (GAParts), and present GAPartNet, a large-scale interactive dataset with rich part annotations. We also propose a domain-generalization method for cross-category part segmentation and pose estimation. Our GAPart definition boosts cross-category object manipulation and transfers well to the real world.
For years, researchers have been devoted to generalizable object perception and manipulation, where cross-category generalizability is highly desired yet underexplored. In this work, we propose to learn such cross-category skills via Generalizable and Actionable Parts (GAParts). By identifying and defining 9 GAPart classes (lids, handles, etc.) across 27 object categories, we construct a large-scale part-centric interactive dataset, GAPartNet, which provides rich part-level annotations (semantics, poses) for 8,489 part instances on 1,166 objects. Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation. Given the significant domain gaps between seen and unseen object categories, we propose a robust 3D segmentation method from the perspective of domain generalization, integrating adversarial learning techniques. Our method outperforms all existing methods by a large margin on both seen and unseen categories. Furthermore, leveraging the part segmentation and pose estimation results together with the GAPart pose definition, we design part-based manipulation heuristics that generalize well to unseen object categories in both the simulator and the real world.
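To make the part-based manipulation idea concrete, the sketch below shows how a grasp heuristic might be derived from an estimated GAPart pose for a handle. The function name, the axis conventions (gripper approaching along the part frame's negative z axis), and the standoff distance are illustrative assumptions, not the paper's actual pose definition or heuristics.

# Hypothetical sketch of a part-pose-based grasp heuristic for a handle.
# The axis conventions and standoff are assumptions for illustration.
import numpy as np

def handle_grasp_pose(R: np.ndarray, t: np.ndarray, standoff: float = 0.05):
    """Given a 3x3 part rotation R and a 3-vector part center t (in the
    camera/world frame), return a 4x4 pre-grasp gripper pose that
    approaches the handle head-on along the part frame's -z axis."""
    approach = -R[:, 2]                      # approach direction (assumed)
    grasp = np.eye(4)
    grasp[:3, :3] = R                        # align gripper with part frame
    grasp[:3, 3] = t - standoff * approach   # back off before closing
    return grasp

Because the pose is defined per GAPart class rather than per object category, the same heuristic can, in principle, be reused on handles of unseen categories.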
An Overview of Our Domain-Generalizable Part Segmentation and Pose Estimation Method. We introduce a part-oriented domain-adversarial training strategy that handles multi-resolution features and distribution imbalance to extract domain-invariant GAPart features, significantly improving the generalizability of our method for part segmentation and pose estimation.
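As one way to picture the adversarial component, below is a minimal sketch of gradient-reversal-based domain-adversarial training in PyTorch. The DomainClassifier architecture, feature dimensions, and loss combination are assumptions for illustration, not the paper's exact part-oriented strategy.

# Minimal sketch of gradient-reversal-based domain-adversarial training.
# DomainClassifier and the training-step snippet are hypothetical.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lambda backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class DomainClassifier(nn.Module):
    """Predicts which (category) domain a per-part feature came from."""
    def __init__(self, feat_dim: int, num_domains: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_domains),
        )

    def forward(self, part_feats, lambd=1.0):
        # Reversed gradients push the extractor toward domain-invariant features.
        return self.net(grad_reverse(part_feats, lambd))

# Inside a training step (hypothetical tensors and criteria):
#   seg_loss = seg_criterion(seg_logits, seg_labels)
#   dom_loss = ce(domain_clf(part_feats, lambd), domain_labels)
#   (seg_loss + dom_loss).backward()

Training the domain classifier to succeed while reversing its gradients into the feature extractor encourages features that segment parts well but carry little category-specific signal, which is the intuition behind the cross-category generalization.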
@article{geng2022gapartnet,
  title={GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts},
  author={Geng, Haoran and Xu, Helin and Zhao, Chengyang and Xu, Chao and Yi, Li and Huang, Siyuan and Wang, He},
  journal={arXiv preprint arXiv:2211.05272},
  year={2022}
}
If you have any questions, please feel free to contact us: