MUME: Vol 30, No 5

Volume 30, Issue 5, Oct 2024

Publisher:

Springer-Verlag
Berlin, Heidelberg

ISSN: 0942-4962


research-article

Multi-level pyramid fusion for efficient stereo matching

https://doi.org/10.1007/s00530-024-01419-4

Abstract

Stereo matching is a key technology for many autonomous driving and robotics applications. Recently, methods based on Convolutional Neural Network have achieved huge progress. However, it is still difficult to find accurate matching points in ...

research-article

Propagating prior information with transformer for robust visual object tracking

https://doi.org/10.1007/s00530-024-01423-8

Abstract

In recent years, the domain of visual object tracking has witnessed considerable advancements with the advent of deep learning methodologies. Siamese-based trackers have been pivotal, establishing a new architecture with a weight-shared backbone. ...

research-article

Underwater image enhancement based on weighted guided filter image fusion

https://doi.org/10.1007/s00530-024-01432-7

Abstract

An underwater image enhancement technique based on weighted guided filter image fusion is proposed to address challenges, including optical absorption and scattering, color distortion, and uneven illumination. The method consists of three stages: ...

research-article

Exploring multi-dimensional interests for session-based recommendation

https://doi.org/10.1007/s00530-024-01437-2

Abstract

Session-based recommendation (SBR) aims to recommend the next clicked item to users by mining the user’s interaction sequences in the current session. It has received widespread attention recently due to its excellent privacy protection ...

research-article

Gateinst: instance segmentation with multi-scale gated-enhanced queries in transformer decoder

https://doi.org/10.1007/s00530-024-01438-1

Abstract

Recently, a popular query-based end-to-end framework has been used for instance segmentation. However, queries update based on individual layers or scales of feature maps at each stage of Transformer decoding, which makes queries unable to gather ...

research-article

Discrete codebook collaborating with transformer for thangka image inpainting

https://doi.org/10.1007/s00530-024-01439-0

Abstract

Thangka, as a precious heritage of painting art, holds irreplaceable research value due to its richness in Tibetan history, religious beliefs, and folk culture. However, it is susceptible to partial damage and form distortion due to natural ...

research-article

HNQA: histogram-based descriptors for fast night-time image quality assessment

https://doi.org/10.1007/s00530-024-01440-7

Abstract

Taking high quality images at night is a challenging issue for many applications. Therefore, assessing the quality of night-time images (NTIs) is a significant area of research. Since there is no reference image for such images, night-time image ...

research-article

3D human pose estimation method based on multi-constrained dilated convolutions

https://doi.org/10.1007/s00530-024-01441-6

Abstract

In recent years, research on 2D to 3D human pose estimation methods has gained increasing attention. However, problems with these methods, such as depth ambiguity and self-occlusion, still need to be addressed. To address these problems, we propose a 3D human ...

research-article

Remote sensing image cloud removal based on multi-scale spatial information perception

https://doi.org/10.1007/s00530-024-01442-5

Abstract

Remote sensing imagery is indispensable in diverse domains, including geographic information systems, climate monitoring, agricultural planning, and disaster management. Nonetheless, cloud cover can drastically degrade the utility and quality of ...

research-article

Anomaly detection in surveillance videos using Transformer with margin learning

https://doi.org/10.1007/s00530-024-01443-4

Abstract

Weakly supervised video anomaly detection (WSVAD) constitutes a highly research-oriented and challenging project within the domains of image and video processing. In prior studies of WSVAD, it has typically been formulated as a multiple-instance ...

research-article

Exploiting multi-level consistency learning for source-free domain adaptation

https://doi.org/10.1007/s00530-024-01444-3

Abstract

Due to data privacy concerns, a more practical task known as Source-free Unsupervised Domain Adaptation (SFUDA) has gained significant attention recently. SFUDA adapts a pre-trained source model to the target domain without access to the source ...

research-article

Weakly-supervised temporal action localization using multi-branch attention weighting

https://doi.org/10.1007/s00530-024-01445-2

Abstract

Weakly-supervised temporal action localization aims to train an accurate and robust localization model using only video-level labels. Due to the lack of frame-level temporal annotations, existing weakly-supervised temporal action localization ...

research-article

MT-ASM: a multi-task attention strengthening model for fine-grained object recognition

https://doi.org/10.1007/s00530-024-01446-1

Abstract

Fine-Grained Object Recognition (FGOR) equips intelligent systems with recognition capabilities at or even beyond the level of human experts, making it a core technology for numerous applications such as biodiversity monitoring systems and ...

research-article

PS-YOLO: a small object detector based on efficient convolution and multi-scale feature fusion

https://doi.org/10.1007/s00530-024-01447-0

Abstract

Compared to generalized object detection, research on small object detection has been slow, mainly due to the need to learn appropriate features from limited information about small objects. This is coupled with difficulties such as information ...

research-article

Multimodal recommender system based on multi-channel counterfactual learning networks

https://doi.org/10.1007/s00530-024-01448-z

Abstract

Most multimodal recommender systems utilize multimodal content of user-interacted items as supplemental information to capture user preferences based on historical interactions without considering user-uninteracted items. In contrast, multimodal ...

research-article

Integrate encryption of multiple images based on a new hyperchaotic system and Baker map

Xingbin Liu

https://doi.org/10.1007/s00530-024-01449-y

Abstract

Image encryption serves as a crucial means to safeguard information against unauthorized access during both transmission and storage phases. This paper introduces an integrated encryption algorithm tailored for multiple images, leveraging a novel ...

research-article

SiamS3C: spatial-channel cross-correlation for visual tracking with centerness-guided regression

https://doi.org/10.1007/s00530-024-01450-5

Abstract

Visual object tracking can be divided into the object classification and bounding-box regression tasks, but only one sharing correlation map leads to inaccuracy. Siamese trackers compute correlation map by cross-correlation operation with high ...

research-article

Exploring multi-level transformers with feature frame padding network for 3D human pose estimation

https://doi.org/10.1007/s00530-024-01451-4

Abstract

Recently, transformer-based architecture achieved remarkable performance in 2D to 3D lifting pose estimation. Despite advancements in transformer-based architecture they still struggle to handle depth ambiguity, limited temporal information, ...

research-article

Adaptive B-spline curve fitting with minimal control points using an improved sparrow search algorithm for geometric modeling of aero-engine blades

https://doi.org/10.1007/s00530-024-01452-3

Abstract

In Industry 4.0 and advanced manufacturing, producing high-precision, complex products such as aero-engine blades involves sophisticated processes. Digital twin technology enables the creation of high-precision, real-time 3D models, optimizing ...

research-article

BSP-Net: automatic skin lesion segmentation improved by boundary enhancement and progressive decoding methods

https://doi.org/10.1007/s00530-024-01453-2

Abstract

Automatic skin lesion segmentation from dermoscopy images is of great significance in the early treatment of skin cancers, which is yet challenging even for dermatologists due to the inherent issues, i.e., considerable size, shape and color ...

research-article

Wacml: based on graph neural network for imbalanced node classification algorithm

https://doi.org/10.1007/s00530-024-01454-1

Abstract

The presence of a large number of robot accounts on social media has led to negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, resulting in insufficient representativeness and poor ...

research-article

3D model watermarking using surface integrals of generated random vector fields

https://doi.org/10.1007/s00530-024-01455-0

Abstract

We propose a new semi-blind semi-fragile watermarking algorithm for authenticating triangulated 3D models using the surface integrals of generated random vector fields. Watermark data is embedded into the flux of a vector field across the model’s ...

research-article

Contour-assistance-based video matting localization

https://doi.org/10.1007/s00530-024-01456-z

Abstract

Video matting is a technique used to replace foreground objects in video frames by predicting their alpha matte. Originally developed for film special effects, advertisements, and live streaming, video matting can also be exploited for malicious ...

research-article

Cvstgan: A Controllable Generative Adversarial Network for Video Style Transfer of Chinese Painting

https://doi.org/10.1007/s00530-024-01457-y

Abstract

Style transfer aims to apply the stylistic characteristics of a reference image onto a target image or video. Existing studies on style transfer suffer from either fixed style without adjustability or unclear stylistic patterns in output results. ...

research-article

Triple fusion and feature pyramid decoder for RGB-D semantic segmentation

https://doi.org/10.1007/s00530-024-01459-w

Abstract

Current RGB-D semantic segmentation networks incorporate depth information as an extra modality and merge RGB and depth features using methods such as equal-weighted concatenation or simple fusion strategies. However, these methods hinder the ...

research-article

Large scale multimodal fashion care recognition

https://doi.org/10.1007/s00530-024-01461-2

Abstract

Smart Fashion is reshaping people’s lives, and affects people’s choices and outfits. Existing computer-vision-enabled fashion technology has covered many aspects, such as fashion detection, fashion recognition, fashion segmentation, virtual ...

research-article

Physical-prior-guided single image dehazing network via unpaired contrastive learning

https://doi.org/10.1007/s00530-024-01462-1

Abstract

Image dehazing aims to restore high fidelity clear images from hazy ones. It has wide applications on many intelligent image analysis systems in computer vision area. Many prior-based and learning-based methods have already made significant ...

research-article

Multi-scale motion contrastive learning for self-supervised skeleton-based action recognition

https://doi.org/10.1007/s00530-024-01463-0

Abstract

People process things and express feelings through actions. Action recognition has been widely studied, yet remains under-explored. Traditional self-supervised skeleton-based action recognition focuses on joint point features, ignoring the ...

research-article

LLR-MVSNet: a lightweight network for low-texture scene reconstruction

https://doi.org/10.1007/s00530-024-01464-z

Abstract

In recent years, learning-based MVS methods have achieved excellent performance compared with traditional methods. However, these methods still have notable shortcomings, such as the low efficiency of traditional convolutional networks and simple ...

research-article

Automatic lymph node segmentation using deep parallel squeeze & excitation and attention Unet

https://doi.org/10.1007/s00530-024-01465-y

Abstract

Automatic segmentation and lymph node (LN) detection for cancer staging are critical. In clinical practice, computed tomography (CT) and positron emission tomography (PET) imaging detect abnormal LNs. Yet, it is still a difficult task due to the ...


Multimedia Systems
