- Article, February 2023
Initialization and Alignment for Adversarial Texture Optimization
Abstract: While recovery of geometry from image and video data has received a lot of attention in computer vision, methods to capture the texture for a given geometry are less mature. Specifically, classical methods for texture generation often assume clean ...
- Article, October 2022
Initialization and Alignment for Adversarial Texture Optimization
Abstract: While recovery of geometry from image and video data has received a lot of attention in computer vision, methods to capture the texture for a given geometry are less mature. Specifically, classical methods for texture generation often assume clean ...
- Article, March 2023
Ultra-Fast Lidar Scene Analysis Using Convolutional Neural Network
Abstract: This work introduces an ultra-fast object detection method named FLA-CNN for detecting objects in a scene from a planar LIDAR signal, using convolutional neural networks (CNN). Compared with recent methods using CNN on 2D/3D lidar scene ...
- Article, June 2022
Learning Scale-Invariant Object Representations with a Single-Shot Convolutional Generative Model
Abstract: Contemporary machine learning literature highlights learning object-centric image representations’ benefits, i.e. interpretability, and the improved generalization performance. In the current work, we develop a neural network architecture that ...
- review-article, May 2022
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction
EURASIP Journal on Audio, Speech, and Music Processing (EJASMP), Volume 2022, Issue 1, https://doi.org/10.1186/s13636-022-00242-x
Abstract: The domain of spatial audio comprises methods for capturing, processing, and reproducing audio content that contains spatial information. Data-based methods are those that operate directly on the spatial information carried by audio signals. This ...
- research-article, November 2021
RGB-D scene analysis in the NICU
Computers in Biology and Medicine (CBIM), Volume 138, Issue C, https://doi.org/10.1016/j.compbiomed.2021.104873
Abstract: Continuity of care is achieved in the neonatal intensive care unit (NICU) through careful documentation of all events of clinical significance, including clinical interventions and routine care events (e.g., feeding, diaper change, weighing, etc.)...
Highlights:
  - Multi-faceted and multi-modal computer vision for complex scene understanding in the neonatal intensive care unit (NICU).
  - Proof-of-concept textual summary generation of a clinical NICU scene as a step towards semi-automated nurse ...
- research-article, May 2021
A survey of recent 3D scene analysis and processing methods
Multimedia Tools and Applications (MTAA), Volume 80, Issue 13, Pages 19491–19511, https://doi.org/10.1007/s11042-021-10615-7
Abstract: With ubiquitous cameras and popular 3D scanning and capturing devices to help us capture 2D/3D scene data, there are many scene understanding related applications, as well as quite a few important and interesting research problems in processing, ...
- research-article, June 2020
Design of visual communication based on deep learning approaches
Soft Computing - A Fusion of Foundations, Methodologies and Applications (SOFC), Volume 24, Issue 11, Pages 7861–7872, https://doi.org/10.1007/s00500-019-03954-z
Abstract: Aiming at the problem of object recognition caused by small object scale, multi-interaction (occlusion), and strong hiding characteristics in the scene analysis task, an object-region-enhanced network based on deep learning was proposed. The ...
- research-article, February 2020
Building hierarchical structures for 3D scenes with repeated elements
The Visual Computer: International Journal of Computer Graphics (VISC), Volume 36, Issue 2, Pages 361–374, https://doi.org/10.1007/s00371-018-01625-y
Abstract: We propose a novel hierarchy construction algorithm for 3D scenes with repeated elements, such as classrooms with multiple desk–chair pairs. Most existing algorithms focus on scenes such as bedrooms or living rooms, which rarely contain repeated ...
- research-article, January 2020
Objects and scenes classification with selective use of central and peripheral image content
Journal of Visual Communication and Image Representation (JVCIR), Volume 66, Issue C, https://doi.org/10.1016/j.jvcir.2019.102698
Highlights:
  - En-HMAX adopts a relative order of importance depending on the image category.
  - ...
The human visual recognition system is more efficient than any current robotic vision setting. One reason for this superiority is that humans utilize different fields of vision, depending on the recognition task. For instance, ...
- article, February 2019
Fractal dimension of bag-of-visual words
Lucas Correia Ribas, Diogo Nunes Gonçalves, Jonathan Silva, Amaury Castro, Odemir Martinez Bruno, Wesley Nunes Gonçalves
Pattern Analysis & Applications (PAAS), Volume 22, Issue 1, Pages 89–98, https://doi.org/10.1007/s10044-018-0736-x
Scene recognition is an important and challenging problem in computer vision. One of the most used scene recognition methods is the bag-of-visual words. Despite the interesting results, this approach does not capture the detail richness of spatial ...
- short-paper, November 2018
Driving in unknown areas: From UAV images to map for autonomous vehicles
IWCTS'18: Proceedings of the 11th ACM SIGSPATIAL International Workshop on Computational Transportation Science, Pages 39–42, https://doi.org/10.1145/3283207.3283211
Along with the rapid development of autonomous vehicles and driving assistance systems, suitable maps have been intensively studied in recent years. Besides the improvement of conventional maps, new types such as high-density (HD) maps, have been ...
- article, November 2018
3D scene reconstruction using a texture probabilistic grammar
Multimedia Tools and Applications (MTAA), Volume 77, Issue 21, Pages 28417–28440, https://doi.org/10.1007/s11042-018-6052-z
In this paper, texture probabilistic grammar is defined for the first time. We have developed an algorithm to obtain the 3D information in a 2D scene by training the texture probabilistic grammar from the prebuilt model library. The well-trained texture ...
- article, August 2018
Exploiting visual saliency for assessing the impact of car commercials upon viewers
F. Fernández-Martínez, A. Hernández-García, M. A. Fernández-Torres, I. González-Díaz, Á. García-Faura, F. Díaz María
Multimedia Tools and Applications (MTAA), Volume 77, Issue 15, Pages 18903–18933, https://doi.org/10.1007/s11042-017-5339-9
Content based video indexing and retrieval (CBVIR) is a lively area of research which focuses on automating the indexing, retrieval and management of videos. This area has a wide spectrum of promising applications where assessing the impact of ...
- article, March 2017
Scalable video summarization via sparse dictionary learning and selection simultaneously
Multimedia Tools and Applications (MTAA), Volume 76, Issue 6, Pages 7947–7971, https://doi.org/10.1007/s11042-016-3433-z
Every day, a huge amount of video data is generated worldwide and processing this kind of data requires powerful resources in terms of time, manpower, and hardware. Therefore, to help quickly understand the content of video data, video summarization ...
- Article, October 2015
Automatic Recovery of Networks of Thin Structures
3DV '15: Proceedings of the 2015 International Conference on 3D Vision, Pages 37–45, https://doi.org/10.1109/3DV.2015.12
Applications, such as construction monitoring and planning for renovations, require the accurate recovery of existing conditions of structures. Many types of infrastructure are primarily comprised of arbitrarily-shaped thin structures (e.g., Truss ...
- research-article, October 2015
Scene analysis by mid-level attribute learning using 2D LSTM networks and an application to web-image tagging
Pattern Recognition Letters (PTRL), Volume 63, Issue C, Pages 23–29, https://doi.org/10.1016/j.patrec.2015.06.003
Efficient 2D LSTM attribute learning without pre-/post-processing of the data. 2D LSTM networks with only a small amount of parameters. Raw noisy web-images for training without manual annotation. Automatic web-image analysis (unknown number of attribute ...
- research-article, June 2015
Table-top scene analysis using knowledge-supervised MCMC
Robotics and Computer-Integrated Manufacturing (RCIM), Volume 33, Issue C, Pages 110–123, https://doi.org/10.1016/j.rcim.2014.08.009
In this paper, we propose a probabilistic approach to generate abstract scene graphs from uncertain 6D pose estimates. We focus on generating a semantic understanding of the perceived scenes that well explains the composition of the scene and the inter-...
- article, June 2014
Effective mobile mapping of multi-room indoor structures
The Visual Computer: International Journal of Computer Graphics (VISC), Volume 30, Issue 6-8, Pages 707–716, https://doi.org/10.1007/s00371-014-0947-0
We present a system to easily capture building interiors and automatically generate floor plans scaled to their metric dimensions. The proposed approach is able to manage scenes not necessarily limited to the Manhattan World assumption, exploiting the ...