VISC: Vol 40, No 9

Volume 40, Issue 9, Sep 2024 (Current Issue)


Publisher:

Springer-Verlag
Berlin, Heidelberg

ISSN: 0178-2789


research-article

Stereo-RSSF: stereo robust sparse scene-flow estimation

Pages 5901–5919, https://doi.org/10.1007/s00371-023-03143-y

Abstract

Scene-flow (SF) estimation is considered to be one of the most fundamental problems in scene understanding and autonomous control. The majority of the existing methods adopted for SF estimation suffer from a lack of robustness in some environments and ...

research-article

Boundary-aware small object detection with attention and interaction

Pages 5921–5934, https://doi.org/10.1007/s00371-023-03144-x

Abstract

Object detection is a critical technology for the intelligent analytical processing of images captured by drones. The objects usually come in various scales and can be extremely small. Existing detection methods are inherently based on pyramid ...

research-article

TSNet: Task-specific network for joint diabetic retinopathy grading and lesion segmentation of ultra-wide optical coherence tomography angiography images

Pages 5935–5946, https://doi.org/10.1007/s00371-023-03145-w

Abstract

Diabetic retinopathy (DR) is a common complication of diabetes which may lead to blindness. Early diagnosis can effectively prevent the deterioration of the disease and enable timely treatment. Ophthalmologists diagnose DR by observing ultra-wide ...

research-article

Cluster-based two-branch framework for point cloud attribute compression

Pages 5947–5960, https://doi.org/10.1007/s00371-023-03146-9

Abstract

Owing to the irregular distribution of point clouds in 3D space, effectively compressing the point cloud is still challenging. Recently, numerous compression methods have been developed with outstanding performance for the compression of geometry ...

research-article

End-to-end learning for joint depth and image reconstruction from diffracted rotation

Pages 5961–5977, https://doi.org/10.1007/s00371-023-03147-8

Abstract

Monocular depth estimation is an open challenge due to the ill-posed nature of the problem at hand. Deep learning techniques proved capable of producing acceptable depth estimation accuracy but the lack of robust depth cues within RGB images ...

research-article

Multi-scale color constancy based on salient varying local spatial statistics

Pages 5979–5995, https://doi.org/10.1007/s00371-023-03148-7

Abstract

The human visual system unconsciously determines the color of the objects by “discounting” the effects of the illumination, whereas machine vision systems have difficulty performing this task. Color constancy algorithms assist computer vision ...

research-article

A self-attention model for viewport prediction based on distance constraint

Pages 5997–6014, https://doi.org/10.1007/s00371-023-03149-6

Abstract

Panoramic video multimedia technology has made significant advancements in recent years, providing users with an immersive experience by displaying the entire 360° spherical scene centered around their virtual location. However, due to its larger ...

research-article

Wall segmentation in house plans: fusion of deep learning and traditional methods

Pages 6015–6031, https://doi.org/10.1007/s00371-023-03150-z

Abstract

Recognition and extraction of elements from house plans present significant challenges in the construction, decoration and interior design industries. To address this issue, this paper proposes a wall segmentation system for house plans that ...

research-article

Masked-attention diffusion guidance for spatially controlling text-to-image generation

Yuki Endo

Pages 6033–6045, https://doi.org/10.1007/s00371-023-03151-y

Abstract

Text-to-image synthesis has achieved high-quality results with recent advances in diffusion models. However, text input alone has high spatial ambiguity and limited user controllability. Most existing methods allow spatial control through ...

research-article

MFOGCN: multi-feature-based orthogonal graph convolutional network for 3D human motion prediction

Pages 6047–6062, https://doi.org/10.1007/s00371-023-03152-x

Abstract

Human motion prediction in various motion capture applications, e.g., optical and inertial, is challenging because of the complexity of human motion sequences. Current studies on this issue have insufficient analysis on the latent motion ...

research-article

Modeling and realization of image-based garment texture transfer

Pages 6063–6079, https://doi.org/10.1007/s00371-023-03153-w

Abstract

We present an automated framework founded on texture transfer, facilitating the substitution of textures in garment images with specified ones for applications in garment design and online presentation. In contrast to previous methodologies, our ...

research-article

Interpolating meshes of arbitrary topology by Catmull–Clark surfaces with energy constraint

Pages 6081–6092, https://doi.org/10.1007/s00371-023-03154-9

Abstract

We propose an efficient method with energy constraints for constructing a Catmull–Clark surface that interpolates a given mesh. We approximate the surface energy of Catmull–Clark surfaces near extraordinary points by summing their finite ...

research-article

A novel deformable B-spline curve model based on elasticity

Pages 6093–6110, https://doi.org/10.1007/s00371-023-03155-8

Abstract

The physically based deformable curve models are widely used to simulate thin one-dimensional objects in computer graphics, interactive simulation, and surgery simulation. These models consider objects to be rods described by an adapted frame ...

research-article

Answer sheet layout analysis based on YOLOv5s-DC and MSER

Pages 6111–6122, https://doi.org/10.1007/s00371-023-03156-7

Abstract

Layout analysis is the first step in automatic grading and other OCR tasks. Although various layout analysis technologies have been developed for different application scenarios, existing approaches still have difficulty in achieving high accuracy ...

research-article

Fast continuous patch-based artistic style transfer for videos

Pages 6123–6136, https://doi.org/10.1007/s00371-023-03157-6

Abstract

Convolutional neural network-based image style transfer models often suffer from temporal inconsistency when applied to video. Although several video style transfer models have been proposed to improve temporal consistency, they often trade off ...

research-article

3D point cloud denoising method based on global feature guidance

Pages 6137–6153, https://doi.org/10.1007/s00371-023-03158-5

Abstract

Raw point cloud (PC) data acquired by 3D sensors or reconstruction algorithms inevitably contain noise and outliers, which can seriously impact downstream tasks, such as surface reconstruction and target detection. To address this problem, this ...

research-article

Ni-DehazeNet: representation learning via bilevel optimized architecture search for nighttime dehazing

Pages 6155–6170, https://doi.org/10.1007/s00371-023-03159-4

Abstract

Nighttime dehazing is a challenging ill-posed problem due to the severe haze pollution and color attenuation. Since available daytime dehazing approaches cannot be consistently adapted to the nighttime case, this paper specifically designs ...

research-article

Survey on vision-based dynamic hand gesture recognition

Pages 6171–6199, https://doi.org/10.1007/s00371-023-03160-x

Abstract

To communicate with one another, hand gesture is very important. The task of using the hand gesture in technology is influenced by a very common way humans communicate with the natural environment. The recognizing and finding pose estimation of ...

research-article

Segmentation-driven feature-preserving mesh denoising

Pages 6201–6217, https://doi.org/10.1007/s00371-023-03161-w

Abstract

Feature-preserving mesh denoising has received noticeable attention in visual media, with the aim of recovering high-fidelity, clean mesh shapes from the ones that are contaminated by noise. Existing denoising methods often design smaller weights ...

research-article

Scene representation using a new two-branch neural network model

Pages 6219–6244, https://doi.org/10.1007/s00371-023-03162-9

Abstract

Scene classification and recognition have always been one of the most challenging tasks of scene understanding due to the inherent ambiguity in visual scenes. The core of scene classification and recognition tasks is scene representation. Deep ...

research-article

Automated barcodeless product classifier for food retail self-checkout images

Pages 6245–6259, https://doi.org/10.1007/s00371-023-03163-8

Abstract

The growing popularity of self-service in retail stores and the increasing associated shrinkage present an urgent need for computer-vision-based product recognition in the area of self-checkouts. The article focuses on individual product recognition ...

research-article

Bidirectional feature enhancement transformer for unsupervised domain adaptation

Pages 6261–6277, https://doi.org/10.1007/s00371-023-03164-7

Abstract

Unsupervised domain adaptation (UDA) aims to generalize knowledge learned from one labeled source domain to another unlabeled target domain. To extract domain-invariant feature representations, most existing UDA approaches leverage convolution ...

research-article

STAM: a spatio-temporal adaptive module for improving static convolutions in action recognition

Pages 6279–6293, https://doi.org/10.1007/s00371-023-03165-6

Abstract

Temporal adaptive convolution has demonstrated superior performance over static convolution techniques in video understanding. However, it needs to be improved in long-time series modeling and multi-scale feature-map adaptation. To address these ...

research-article

Mixture autoregressive and spectral attention network for multispectral image compression based on variational autoencoder

Pages 6295–6318, https://doi.org/10.1007/s00371-023-03166-5

Abstract

Multispectral images, with their unique three-dimensional characteristics, require specialized spatial-spectral feature extraction modules to achieve superior compression results. Current end-to-end compression frameworks underperform compared to ...

research-article

Real-scene-constrained virtual scene layout synthesis for mixed reality

Pages 6319–6339, https://doi.org/10.1007/s00371-023-03167-4

Abstract

Given a real source scene and a virtual target scene, the real-scene-constrained virtual scene layout synthesis problem is defined as how to re-synthesize the layout of the virtual furniture in the virtual scene to form a new virtual scene such ...

research-article

A self-attention-based fusion framework for facial expression recognition in wavelet domain

Pages 6341–6357, https://doi.org/10.1007/s00371-023-03168-3

Abstract

Facial expression recognition (FER) plays a vital role for applications based on human–computer interaction. In the past few years, many deep learning models have been proposed for FER, but their performance is limited due to challenges such as ...

research-article

Residual network-based ocean wave modelling from satellite images using ensemble Kalman filter

Pages 6359–6368, https://doi.org/10.1007/s00371-023-03169-2

Abstract

Nonlinear ocean waves have a significant impact on the functioning of several offshore activities. Predicting the internal ocean waves plays a crucial role in submarine and ship operations. Data assimilation is a mechanism in which data observed ...

research-article

Obtaining the user-defined polygons inside a closed contour with holes

Pages 6369–6387, https://doi.org/10.1007/s00371-023-03170-9

Abstract

In image processing, computer vision algorithms are applied to regions bounded by closed contours. These contours are often irregular, poorly defined, and contain holes or unavailable areas inside. A common problem in computational geometry ...

research-article

TMGAN: two-stage multi-domain generative adversarial network for landscape image translation

Pages 6389–6405, https://doi.org/10.1007/s00371-023-03171-8

Abstract

Chinese landscape paintings, realistic landscape photographs, and oil paintings each possess unique artistic characteristics and painting features. Image-to-image translation between these three domains is an extremely challenging task. Existing ...

correction

Correction: A coupling method of learning structured support correlation filters for visual tracking

Page 6407, https://doi.org/10.1007/s00371-023-03172-7


The Visual Computer: International Journal of Computer Graphics
