Front Matter
Light-Dark: A Novel Lightweight Self-supervised Monocular Depth Estimation in the Dark
Self-supervised monocular depth estimation has been widely studied in recent years. In nighttime scenes where the photometric consistency assumption is not met, several solutions have emerged to address this challenge. However, existing monocular ...
Non-homogeneous Image Dehazing with Edge Attention Based on Relative Haze Density
Image dehazing is a widely used technology for recovering clear images from hazy inputs. However, most dehazing methods are designed to target a specific haze concentration, without considering the varying degrees of image degradation. Removing ...
Contrastive Learning for Silent Face Liveness Detection Based on A Hybrid Framework
Face liveness detection is essential to ensuring the security of face recognition systems. Most current models rely on convolutional neural networks to achieve domain generalization through complete representations on common modules. The ...
BS2CL: Balanced Self-supervised Contrastive Learning for Thyroid Cytology Whole Slide Image Multi-classification
Thyroid cytology whole slide images (WSIs) hold vital information essential for precise diagnosis. Given the huge size of WSIs, multiple instance learning (MIL) is an effective solution for the WSI classification task when only slide-level labels ...
Unsupervised Domain Adaptation Method for Medical Image Segmentation Using Fourier Feature Decoupling and Multi-scale Feature Fusion
Unsupervised Domain Adaptation (UDA) is an effective technique for utilizing labeled data from a source domain alongside unlabeled data from a target domain, aiming to mitigate the impact of domain shift on model performance. Feature decoupling-...
LVMUM: Toward Open-World Object Detection with Large Vision Models and Unsupervised Modeling
Open-world object detection (OWOD), as an emerging and challenging task in object detection, requires the model to have the ability to detect known and unknown objects in dynamic environments. Furthermore, it should have the capability to perform ...
Implementation and Application of Violence Detection System Based on Multi-head Attention and LSTM
The extensive expansion of surveillance has enabled the identification of numerous threats in advance. By examining surveillance footage, violent activities can be identified in time to prevent disastrous repercussions. In this paper, a method for ...
GFFNet: An Efficient Image Denoising Network with Group Feature Fusion
Image denoising is a critical pre-processing step for a wide range of image processing and computer vision applications, where the primary goal is to remove noise interference from corrupted images while preserving the essential features of the ...
End-to-End Object Detection with YOLOF
Within the field of computer vision, object detection is a core issue. A technique extensively utilized in convolution-oriented detectors is Non-Maximum Suppression (NMS), designed to suppress redundant predictions. However, the sequential nature ...
BiRGAN: Bi-directional Deep Image Retargeting
Current single retargeting operators perform poorly on diverse images and varying target sizes, rendering them unsuitable for both image reduction and expansion simultaneously. In this paper, we present a deep bi-directional image retargeting ...
MulTIR: Deep Multi-Target Image Retargeting
Image retargeting aims to resize images to fit various devices while maintaining a good viewing experience. Multi-operator image retargeting normally outperforms single-operator strategies; however, there is still no single method ...
PAAM (Parameter-free Attentional Aggregation Model)
The channel attention mechanism and spatial attention mechanism are crucial in enhancing the performance of convolutional neural networks. However, most existing methods focus on developing more intricate attention modules to improve performance, ...
FRFT Domain Watermarking Algorithm Based on GA Adaptive Optimization
In the field of digital communication and copyright protection, digital watermark technology is extremely critical to ensure the security of secret communications and copyrighted information. This paper proposes an improved hybrid watermark ...
Joint Semantic Feature and Optical Flow Learning for Automatic Echocardiography Segmentation
The left ventricle ejection fraction is an important index for assessing cardiac function and diagnosing cardiac diseases. At present, the EchoNet-Dynamic dataset is the only large-scale resource for studying ejection fraction estimation by ...
FMUnet: Frequency Feature Enhancement Multi-level U-Net for Low-Dose CT Denoising with a Real Collected LDCT Image Dataset
The widespread use of CT systems in medical diagnostics has highlighted concerns about the health risks associated with X-ray radiation exposure. Despite reducing the use of X-rays, low-dose computed tomography (LDCT) as a method to ...
Research on Intelligent Recognition Algorithm of Container Numbers in Ports Based on Deep Learning
Container number identification has important application value in logistics and cargo transportation. This paper proposes a new container number recognition algorithm to solve difficult problems such as different ...
Dr-SAM: U-Shape Structure Segment Anything Model for Generalizable Medical Image Segmentation
Medical image segmentation plays a pivotal role in computer-assisted medical diagnosis, contributing to precise diagnostics, treatment strategizing, and disease tracking. However, the availability of annotated data for medical image segmentation ...
Aerial Multi-object Tracking via Information Weighting
Multi-object tracking from an aerial perspective often faces typical challenges such as small objects, dual-source motion, and appearance similarity. This often results in low tracking accuracy. In this paper, we propose an Aerial multi-object ...
Optimization Method for Fractal Image Compression Based on Self-similarity Evaluation and Gradient Bisection Algorithm
Fractal Image Compression (FIC) is a spatial domain compression technique with high compression ratio and good image quality. It is widely used in the fields of image restoration, denoising and watermarking. However, in terms of coding time, ...
DiffGIC: Diffusion Prior Based Null-Space Correction for High Resolution Grayscale Image Colorization
Diffusion models have demonstrated exceptional abilities in colorizing grayscale images. To colorize high-resolution images, current methods use a strategy that combines super-resolution with hierarchical image processing (SR-HIPS). This approach ...
Chinese Character Image Inpainting with Skeleton Extraction and Adversarial Learning
Chinese character image inpainting aims to restore missing textual regions with realistic content. Existing algorithms for text image inpainting are primarily designed for English characters; however, their performance is suboptimal when ...
The Weakly Supervised Network of Hierarchical Attention Mechanism for Fine-Grained Classification
Fine-grained classification is a challenging task that requires discriminating subtle, local differences between sub-categories. Many works improve accuracy by relying heavily on object or part annotations of images, which are costly to label. ...
CS-KD: Confused Sample Knowledge Distillation for Semantic Segmentation of Aerial Imagery
Currently, semantic segmentation methods based on knowledge distillation (KD) mainly focus on transferring various structured knowledge to the student network and designing corresponding optimization goals to encourage the student network to ...
CD-Font: One-Shot Font Generation via Conditional Diffusion Model with Disentangled Guidance
One-shot font generation aims to create a new font library by extracting style information from the reference font. Most existing font generation methods rely on GAN-based image-to-image translation frameworks, which still suffer from unstable ...
Image Super-Resolution Reconstruction Based on Dual-Branch Channel Attention
Image super-resolution reconstruction is an important technique for converting low resolution images into high-resolution images. High resolution images can provide more information and are crucial for advanced visual tasks. However, traditional ...
A Flipped Reversible Information Hiding Method Based on AMP
Adaptive MSB Prediction is an effective technique to achieve Reversible Data Hiding in Encrypted Images (RDHEI). Specifically, the image is divided into 2 × 2 pixel blocks and encrypted to preserve pixel correlation, the shared MSB of pixels in ...
Decoupling Control in Text-to-Image Diffusion Models
Large text-to-image models allow for high-quality and diverse synthesis of images from a given text prompt. However, many scenarios require that the content creation be controllable. Recent methods add image-level controls, e.g., edge and depth ...
Index Terms
- Advanced Intelligent Computing Technology and Applications: 20th International Conference, ICIC 2024, Tianjin, China, August 5–8, 2024, Proceedings, Part VII