Search Results (155)

Search Parameters:
Keywords = target-guided fusion

18 pages, 2639 KiB  
Systematic Review
Evaluating the Accuracy and Efficiency of Imaging Modalities in Guiding Ablation for Metastatic Spinal Column Tumors: A Systematic Review
by Siran Aslan, Mohammad Walid Al-Smadi, Murtadha Qais Al-Khafaji, András Gati, Mustafa Qais Al-Khafaji, Réka Viola, Yousif Qais Al-Khafaji, Ákos Viola, Thaer Alnofal and Árpád Viola
Cancers 2024, 16(23), 3946; https://doi.org/10.3390/cancers16233946 - 25 Nov 2024
Abstract
Background/Objectives: Spinal metastases are a frequent and serious complication in cancer patients, often causing severe pain, instability, and neurological deficits. Thermal ablation techniques such as radiofrequency ablation (RFA), microwave ablation (MWA), and cryoablation (CA) have emerged as minimally invasive treatments. These techniques rely on precise imaging guidance to effectively target lesions while minimizing complications. This systematic review aims to compare the efficacy of different imaging modalities—computed tomography (CT), magnetic resonance imaging (MRI), fluoroscopy, and mixed techniques—in guiding thermal ablation for spinal metastases, focusing on success rates and complications. Methods: A systematic literature search was conducted across PubMed, OVID, Google Scholar, and Web of Science databases, yielding 3733 studies. After screening, 51 studies met the eligibility criteria. Data on success rates, tumor recurrence, complications, and patient outcomes were extracted. Success was defined as no procedure-related mortality, tumor recurrence or expansion, or nerve injury. This systematic review followed PRISMA guidelines and was registered with PROSPERO (ID: CRD42024567174). Results: CT-guided thermal ablation demonstrated high success rates, especially with RFA (75% complete success). Although less frequently employed, MRI guidance showed lower complication rates and improved soft-tissue contrast. Fluoroscopy-guided procedures were effective but had a higher incidence of nerve injury and incomplete tumor control. Mixed imaging techniques, such as CBCT-MRI fusion, showed potential for reducing complications and improving targeting accuracy. Conclusions: CT remains the most reliable imaging modality for guiding thermal ablation in spinal metastases, while MRI provides enhanced safety in complex cases. Fluoroscopy, although effective for real-time guidance, presents limitations in soft-tissue contrast. Mixed imaging techniques like CBCT-MRI fusion offer promising solutions by combining the advantages of both CT and MRI, warranting further exploration in future studies. Full article
(This article belongs to the Special Issue Bone and Spine Metastases)
Figures:
Figure 1: Schematic representation of study selection based on PRISMA.
Figure 2: An overview of all patients undergoing image-guided TA; parentheses indicate the number of articles.
Figure 3: An overview of all patients undergoing fluoroscopy-guided techniques; parentheses indicate the number of articles.
Figure 4: An overview of all patients undergoing CT-guided techniques; parentheses indicate the number of articles.
Figure 5: An overview of all patients undergoing MRI-guided techniques; parentheses indicate the number of articles.
Figure 6: An overview of all patients undergoing mixed image-guided techniques; parentheses indicate the number of articles.
Figure 7: An overview of all patients undergoing fluoroscopy- and CT-guided techniques; parentheses indicate the number of articles.
Figure 8: An overview of all patients undergoing X-ray-, CT-, and MRI-guided techniques; parentheses indicate the number of articles.
Figure 9: An overview of all patients undergoing fluoroscopy- and MRI-guided techniques; parentheses indicate the number of articles.
Figure 10: An overview of all patients undergoing CT- and MRI-guided techniques; parentheses indicate the number of articles.
45 pages, 24880 KiB  
Article
Future Low-Cost Urban Air Quality Monitoring Networks: Insights from the EU’s AirHeritage Project
by Saverio De Vito, Antonio Del Giudice, Gerardo D’Elia, Elena Esposito, Grazia Fattoruso, Sergio Ferlito, Fabrizio Formisano, Giuseppe Loffredo, Ettore Massera, Paolo D’Auria and Girolamo Di Francia
Atmosphere 2024, 15(11), 1351; https://doi.org/10.3390/atmos15111351 - 10 Nov 2024
Viewed by 627
Abstract
The last decade has seen a significant growth in the adoption of low-cost air quality monitoring systems (LCAQMSs), mostly driven by the need to overcome the spatial density limitations of traditional regulatory grade networks. However, urban air quality monitoring scenarios have proved extremely challenging for their operative deployment. In fact, these scenarios need pervasive, accurate, personalized monitoring solutions along with powerful data management technologies and targeted communications tools; otherwise, these scenarios can lead to a lack of stakeholder trust, awareness, and, consequently, environmental inequalities. The AirHeritage project, funded by the EU’s Urban Innovative Action (UIA) program, addressed these issues by integrating intelligent LCAQMSs with conventional monitoring systems and engaging the local community in multi-year measurement strategies. Its implementation allowed us to explore the benefits and limitations of citizen science approaches, the logistic and functional impacts of IoT infrastructures and calibration methodologies, and the integration of AI and geostatistical sensor fusion algorithms for mobile and opportunistic air quality measurements and reporting. Similar research or operative projects have been implemented in the recent past, often focusing on a limited set of the involved challenges. Unfortunately, detailed reports as well as recorded and/or cured data are often not publicly available, thus limiting the development of the field. This work openly reports on the lessons learned and experiences from the AirHeritage project, including device accuracy variance, field recording assessments, and high-resolution mapping outcomes, aiming to guide future implementations in similar contexts and support repeatability as well as further research by delivering an open datalake. By sharing these insights along with the gathered datalake, we aim to inform stakeholders, including researchers, citizens, public authorities, and agencies, about effective strategies for deploying and utilizing LCAQMSs to enhance air quality monitoring and public awareness on this challenging urban environment issue. Full article
(This article belongs to the Special Issue Air Quality and Energy Transition: Interactions and Impacts)
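The MLR-based data-driven calibration referred to in this article's figures can be illustrated with a short sketch: a multiple linear regression is fitted on co-location data, mapping raw low-cost sensor responses (plus meteorological covariates) to the reference analyzer's readings. This is a minimal sketch under stated assumptions, not the project's actual pipeline; the file name and column names (raw_no2, raw_ox, temp, rh, ref_no2) are hypothetical.

```python
# Minimal sketch of MLR-based data-driven calibration of a low-cost NO2 sensor
# against a co-located reference analyzer. File and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

# Co-location data: raw sensor signals plus meteorology, and the reference reading.
df = pd.read_csv("colocation_period1.csv")
X = df[["raw_no2", "raw_ox", "temp", "rh"]].values   # predictors
y = df["ref_no2"].values                             # reference NO2 concentration

# Fit on the first part of the co-location period, validate on the rest.
split = int(0.7 * len(df))
model = LinearRegression().fit(X[:split], y[:split])

pred = model.predict(X[split:])
print("R2 :", r2_score(y[split:], pred))
print("MAE:", mean_absolute_error(y[split:], pred))

# The fitted coefficients would then be deployed (on-device or server side)
# to convert raw field responses into calibrated concentration estimates.
```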
Figures:
Figure 1: The path from the goals to the selection of architectural design and technology for LCAQMS network deployment projects; connections illustrate possible routes throughout the project design choices.
Figure 2: MONICA™ node diagram.
Figure 3: Front and back pictures of the MONICA node.
Figure 4: Synthetic schema of the complete software architecture of the AirHeritage project.
Figure 5: Status of air quality from fixed stations.
Figure 6: Interactive map for a MONICA registered session. Mobility paths are highlighted using a color code based on the European Air Quality Index (EAQI).
Figure 7: The positions on a map of the three co-location campaigns performed in the AirHeritage project, and details of the assembly and the USB-based multiple-device power supply unit.
Figure 8: Scheme of the IoT architecture in the stationary setup.
Figure 9: The seven fixed stations deployed near the reference mobile station during calibration data gathering (co-location periods).
Figure 10: Lognormal-fitted pollutant concentrations recorded in the first co-location period by the mobile ARPAC air quality monitoring laboratory, which provides the reference values for data-driven calibration.
Figure 11: Lognormal-fitted concentrations of CO recorded during the first co-location period by the mobile ARPAC air quality monitoring laboratory.
Figure 12: Distribution, across the 30 MONICA™ devices, of R² (first column) and MAE (second column) short-term performance values for NO2 (first row), O3 (second row), and CO (third row), as estimated by MLR-based data-driven calibration in deployment period 1. The distributions appear to be skewed by a few outliers; checks show that the anomalously low performance is due to transients in raw sensor responses when they were first switched on.
Figure 13: R² (first column) and MAE (second column) short-term performance for PM2.5 (first row) and PM10 (second row), as estimated by MLR-based data-driven calibration in deployment period 1, across the 30 MONICA devices.
Figure 14: Histogram of the PM2.5 R² accuracy index (violet) along with a Gaussian distribution fit (blue); three different device performance clusters are observable, each corresponding to a co-location batch.
Figure 15: Time series of PM10 and PM2.5 concentrations, as measured by the mobile laboratory, during the initial co-location period.
Figure 16: Trend in the measured hourly mean NO2 concentrations.
Figure 17: The hourly mean concentration of ozone (O3) (black line) and the 8 h moving average (yellow line).
Figure 18: Hourly average carbon monoxide (CO) concentration time series.
Figure 19: Time series of PM10 and PM2.5 concentrations, as measured by the mobile laboratory, during the 2nd co-location period.
Figure 20: Trend in the measured hourly mean NO2 concentrations during the 2nd co-location.
Figure 21: The hourly mean concentration of ozone (O3) (black line) and the 8 h moving average (yellow line), presented along with the daily average temperature graph.
Figure 22: Hourly average carbon monoxide (CO) concentration time series in the 2nd co-location.
Figure 23: Time series of PM10 and PM2.5 concentrations, as measured by the mobile laboratory, during the 3rd co-location period.
Figure 24: NO2 hourly average concentrations during the 3rd co-location.
Figure 25: Hourly average concentration of ozone (O3) (black line) and the 8 h moving average (yellow line) (top), with the daily average temperature plot (bottom).
Figure 26: CO hourly average concentrations recorded by the mobile station during the 3rd co-location period.
Figure 27: (a) An illustrative example of a user session as displayed on the webpage, with an indication of the location and the pollutant levels. (b) The same user session as displayed on the MONICA app.
Figure 28: A schematic representation of the data flow in a mobile application scenario.
Figure 29: The workflow performed in the AirHeritage project.
Figure 30: Site suitability map for networks of low-cost traffic-orientated stations for air pollutant monitoring across the city of Portici.
Figure 31: Map of one of the optimal locations (red triangle within the red circle), with the related geographical coordinates (marked in red in the table) and an image of the mounted pole where the NOx and PM2.5 sensors are to be installed.
Figure 32: Maps of the mobile monitoring campaigns along the selected monitoring route.
Figure 33: Comparison between MONICA (blue line) and SIRANE (orange line) for the CO pollutant on 5 and 21 June. Triangles are street canyons and circles are open roads. The ID receptors are grouped by monitoring road segments. Graphs (a,c,e) show the comparisons at 9 a.m., 1 p.m., and 5 p.m. on 5 June, while graphs (b,d,f) show the comparisons at 9 a.m., 1 p.m., and 5 p.m. on 21 June.
Figure 34: Maps of the PM2.5 measurement density for each 25 m bin in the summer (a) and winter (b) campaigns.
Figure 35: Maps of the distribution (median value) of the recorded PM2.5 concentrations within the 25 m bins in the summer (a) and winter (b) campaigns.
18 pages, 10244 KiB  
Article
Research on Closed-Loop Control of Screen-Based Guidance Operations in High-Speed Railway Passenger Stations Based on Visual Detection Model
by Chunjie Xu, Chenao Du, Mengkun Li, Tianyun Shi, Yitian Sun and Qian Wang
Electronics 2024, 13(22), 4400; https://doi.org/10.3390/electronics13224400 - 10 Nov 2024
Viewed by 371
Abstract
Due to adjustments to the operation plan of guided trains at high-speed railway stations, a large amount of information is inevitably displayed, sometimes with delays, omissions, and misalignments. The effective management of guidance information can provide important support for the personnel flow operation of high-speed railway stations. Aiming to meet the requirements of high real-time and high accuracy of guided job control, a closed-loop control method based on a guided job is proposed, which provides enhanced text detection and recognition in a target area. Firstly, using the introduction of the triplet attention mechanism in YOLOv5 and the addition of fusion modules, the feature pyramid network is used to enhance the effective feature and feature interactions between the modules to improve the detection speed of the display. Then, the text on the guide screen is recognized and extracted in combination with the PaddleOCR model, and then, the results are proofread against the original plan to adjust the screen information. Finally, the effectiveness and feasibility of the method are verified by experimental data, with the accuracy of the improved model reaching 90.6% and the speed reaching 1 ms, which meets the requirement of real-time closed-loop control of Screen-Based Guidance Operations. Full article
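The closed-loop step described above, recognizing the text shown on a guidance screen and proofreading it against the planned schedule, can be sketched as follows. The detection and OCR calls are stubbed out: `detect_screen_region` and `recognize_text` are hypothetical placeholders standing in for the improved YOLOv5 detector and the PaddleOCR recognizer named in the abstract, and the comparison logic is only an illustration of the idea.

```python
# Sketch of the closed-loop check: compare recognized screen text with the plan.
# detect_screen_region() and recognize_text() are hypothetical stand-ins for the
# improved YOLOv5 detector and the PaddleOCR recognizer described in the abstract.
from dataclasses import dataclass

@dataclass
class PlannedEntry:
    train_no: str
    platform: str
    status: str  # e.g. "On time", "Delayed"

def detect_screen_region(frame):
    """Placeholder: return the cropped guidance-screen image from a camera frame."""
    raise NotImplementedError

def recognize_text(screen_crop):
    """Placeholder: return a list of (train_no, platform, status) tuples via OCR."""
    raise NotImplementedError

def proofread(frame, plan: dict[str, PlannedEntry]):
    """Return correction commands for rows that disagree with the operation plan."""
    corrections = []
    shown = recognize_text(detect_screen_region(frame))
    seen = set()
    for train_no, platform, status in shown:
        seen.add(train_no)
        planned = plan.get(train_no)
        if planned is None:
            corrections.append(("remove", train_no))           # stale row
        elif (platform, status) != (planned.platform, planned.status):
            corrections.append(("update", train_no, planned))  # misaligned row
    for train_no in plan.keys() - seen:
        corrections.append(("add", train_no, plan[train_no]))  # omitted row
    return corrections  # fed back to the display controller to close the loop
```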
Figures:
Figure 1: Closed-loop control framework for Screen-Based Guidance Operations.
Figure 2: Improved YOLOv5 network structure diagram.
Figure 3: Principle of the triplet attention mechanism.
Figure 4: Structure of the triplet attention mechanism.
Figure 5: Structure of the feature pyramid horizontally connected network.
Figure 6: Partial data from the guide screen with yellow characters on a black background.
Figure 7: Partial data from the guide screen with white characters on a blue background.
Figure 8: Box loss, classification loss, and depth feature loss curves.
Figure 9: P-value, R-value, mAP50, and mAP50-95 curves.
Figure 10: Detection effect of the SSD algorithm.
Figure 11: Detection effect of the YOLOv5 algorithm.
Figure 12: Detection effect of the YOLOv6 algorithm.
Figure 13: Detection effect of the YOLOv8 algorithm.
Figure 14: Detection effect of the improved YOLOv5.
Figure 15: PaddleOCR algorithm recognition effect (screen viewed head-on).
Figure 16: PaddleOCR algorithm recognition effect (screen viewed from the right oblique side).
Figure 17: PaddleOCR algorithm recognition effect (screen viewed from the left oblique side).
23 pages, 9966 KiB  
Article
SFFNet: Shallow Feature Fusion Network Based on Detection Framework for Infrared Small Target Detection
by Zhihui Yu, Nian Pan and Jin Zhou
Remote Sens. 2024, 16(22), 4160; https://doi.org/10.3390/rs16224160 - 8 Nov 2024
Viewed by 415
Abstract
Infrared small target detection (IRSTD) is the process of recognizing and distinguishing small targets from infrared images that are obstructed by crowded backgrounds. This technique is used in various areas, including ground monitoring, flight navigation, and so on. However, due to complex backgrounds and the loss of information in deep networks, infrared small target detection remains a difficult undertaking. To solve the above problems, we present a shallow feature fusion network (SFFNet) based on detection framework. Specifically, we design the shallow-layer-guided feature enhancement (SLGFE) module, which guides multi-scale feature fusion with shallow layer information, effectively mitigating the loss of information in deep networks. Then, we design the visual-Mamba-based global information extension (VMamba-GIE) module, which leverages a multi-branch structure combining the capability of convolutional layers to extract features in local space with the advantages of state space models in the exploration of long-distance information. The design significantly extends the network’s capacity to acquire global contextual information, enhancing its capability to handle complex backgrounds. And through the effective fusion of the SLGFE and VMamba-GIE modules, the exorbitant computation brought by the SLGFE module is substantially reduced. The experimental results on two publicly available infrared small target datasets demonstrate that the SFFNet surpasses other state-of-the-art algorithms. Full article
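The shallow-layer-guided fusion idea, using early, high-resolution features to steer the fusion of deeper, coarser ones, can be illustrated with a small PyTorch module. This is an illustrative sketch of the general mechanism, not the authors' SLGFE implementation; the module name, gating scheme, and channel sizes are assumptions made for the example.

```python
# Illustrative sketch of shallow-layer-guided multi-scale fusion (not the exact SLGFE).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowGuidedFusion(nn.Module):
    def __init__(self, shallow_ch, deep_ch, out_ch):
        super().__init__()
        self.gate = nn.Sequential(                 # spatial gate built from shallow features
            nn.Conv2d(shallow_ch, out_ch, 3, padding=1),
            nn.Sigmoid(),
        )
        self.proj_deep = nn.Conv2d(deep_ch, out_ch, 1)
        self.proj_shallow = nn.Conv2d(shallow_ch, out_ch, 1)
        self.refine = nn.Conv2d(out_ch, out_ch, 3, padding=1)

    def forward(self, shallow, deep):
        # Upsample deep features to the shallow resolution, then let the shallow
        # layer decide where deep semantics are injected (preserving small targets).
        deep_up = F.interpolate(self.proj_deep(deep), size=shallow.shape[-2:],
                                mode="bilinear", align_corners=False)
        fused = self.proj_shallow(shallow) + self.gate(shallow) * deep_up
        return self.refine(fused)

# Example: a 64-ch shallow map at 128x128 guides a 256-ch deep map at 32x32.
m = ShallowGuidedFusion(64, 256, 128)
out = m(torch.randn(1, 64, 128, 128), torch.randn(1, 256, 32, 32))
print(out.shape)  # torch.Size([1, 128, 128, 128])
```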
Figures:
Figure 1: Illustration of 2D-Selective-Scan (SS2D).
Figure 2: (a) Overall architecture of the SFFNet. (b) Architecture of the SLGFE module. (c) Architecture of the VMamba-GIE module.
Figure 3: Illustration of the backbone network architecture.
Figure 4: Illustration of the detection head architecture.
Figure 5: Comparison of feature maps at different scales extracted by the backbone network; the target position is highlighted by the red dotted box. (a) Original image. (b) F1. (c) F2. (d) F3. (e) F4.
Figure 6: Partial visualization results obtained by different infrared small target detection methods on the NUAA-SIRST dataset.
Figure 7: Partial visualization results obtained by different object detection networks on the NUAA-SIRST dataset.
Figure 8: Partial visualization results obtained by different infrared small target detection methods on the IRSTD-1K dataset.
Figure 9: Partial visualization results obtained by different object detection networks on the IRSTD-1K dataset.
Figure 10: Partial feature maps at different stages; the positions of targets are highlighted with red dashed circles.
15 pages, 1240 KiB  
Article
Position-Guided Multi-Head Alignment and Fusion for Video Super-Resolution
by Yanbo Gao, Xun Cai, Shuai Li, Jiajing Chai and Chuankun Li
Electronics 2024, 13(22), 4372; https://doi.org/10.3390/electronics13224372 - 7 Nov 2024
Viewed by 442
Abstract
Video super-resolution (VSR), which takes advantage of multiple low-resolution (LR) video frames to reconstruct corresponding high-resolution (HR) frames in a video, has raised increasing interest. To upsample an LR frame (denoted by a reference frame), VSR methods usually align multiple neighboring frames (denoted by supporting frames) to the reference frame first in order to provide more relevant information. The existing VSR methods usually employ deformable convolution to conduct the frame alignment, where the whole supporting frame is aligned to the reference frame without a specific target and without supervision. Thus, the aligned features are not explicitly learned to provide the HR frame information and cannot fully explore the supporting frames. To address this problem, in this work, we propose a novel video super-resolution framework with Position-Guided Multi-Head Alignment, termed as PGMH-A, to explicitly align the supporting frames to different spatial positions of the HR frame (denoted by different heads). It injects explicit position information to obtain multi-head-aligned features of supporting frames to better formulate the HR frame. PGMH-A can be trained individually or end-to-end with the ground-truth HR frames. Moreover, a Position-Guided Multi-Head Fusion, termed as PGMH-F, is developed based on the attention mechanism to further fuse the spatial–temporal information across temporal supporting frames, across multiple heads corresponding to the different spatial positions of an HR frame, and across multiple channels. Together, the proposed Position-Guided Multi-Head Alignment and Fusion (PGMH-AF) can provide VSR with better local details and temporal coherence. The experimental results demonstrate that the proposed method outperforms the state-of-the-art VSR networks. Ablation studies have also been conducted to verify the effectiveness of the proposed modules. Full article
(This article belongs to the Special Issue Challenges and Applications in Multimedia and Visual Computing)
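The position-guided multi-head idea, aligning supporting frames separately for each sub-pixel position of the HR frame and then assembling the heads with a pixel shuffle, can be sketched roughly as below. This is a simplified toy module for the ×2 case using plain convolutions and constant position masks; the paper's PGMH-A uses learned, optionally supervised alignment rather than this sketch, so treat every design detail here as an assumption.

```python
# Rough sketch of position-guided multi-head alignment for x2 VSR (4 heads, one per
# sub-pixel position of the HR frame). Simplified; not the paper's actual module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionGuidedHeads(nn.Module):
    def __init__(self, ch, scale=2):
        super().__init__()
        self.scale = scale
        self.num_heads = scale * scale
        # Shared alignment network; the position mask is injected as an extra channel.
        self.align = nn.Sequential(
            nn.Conv2d(2 * ch + 1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.to_pix = nn.Conv2d(ch, 1, 3, padding=1)  # one value per head and pixel

    def forward(self, ref_feat, sup_feat):
        n, _, h, w = ref_feat.shape
        heads = []
        for k in range(self.num_heads):
            # Constant mask identifying which HR sub-pixel position this head targets.
            mask = torch.full((n, 1, h, w), k / (self.num_heads - 1),
                              device=ref_feat.device)
            aligned = self.align(torch.cat([ref_feat, sup_feat, mask], dim=1))
            heads.append(self.to_pix(aligned))
        # Stack heads as channels and rearrange them into HR spatial positions.
        return F.pixel_shuffle(torch.cat(heads, dim=1), self.scale)  # (n, 1, 2h, 2w)

ref, sup = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
print(PositionGuidedHeads(32)(ref, sup).shape)  # torch.Size([1, 1, 128, 128])
```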
Figures:
Figure 1: Framework of the proposed PGMH-AF method. Three input frames are used as an illustrative example.
Figure 2: Illustration of the different procedures of the proposed Position-Guided Multi-Head Alignment and of conventional temporal-alignment-based methods.
Figure 3: The Position-Guided Multi-Head Alignment (PGMH-A) module. Taking the ×2 VSR task as an example, four multi-head features are first generated to perform alignment via a shared network with different position masks.
Figure 4: Illustration of the Position-Guided Multi-Head Fusion, containing the pixel-wise-attention-based temporal fusion and the channel-attention-based head fusion.
Figure 5: Qualitative comparison for VSR with upscale factor 4 on the Vid4 dataset against the results of Bicubic and EDVR [10]. Zoom in for best view.
18 pages, 4356 KiB  
Article
Hierarchical Fusion of Infrared and Visible Images Based on Channel Attention Mechanism and Generative Adversarial Networks
by Jie Wu, Shuai Yang, Xiaoming Wang, Yu Pei, Shuai Wang and Congcong Song
Sensors 2024, 24(21), 6916; https://doi.org/10.3390/s24216916 - 28 Oct 2024
Viewed by 492
Abstract
In order to solve the problem that existing visible and infrared image fusion methods rely only on the original local or global information representation, which has the problem of edge blurring and non-protrusion of salient targets, this paper proposes a layered fusion method based on channel attention mechanism and improved Generative Adversarial Network (HFCA_GAN). Firstly, the infrared image and visible image are decomposed into a base layer and fine layer, respectively, by a guiding filter. Secondly, the visible light base layer is fused with the infrared image base layer by histogram mapping enhancement to improve the contour effect. Thirdly, the improved GAN algorithm is used to fuse the infrared and visible image refinement layer, and the depth transferable module and guided fusion network are added to enrich the detailed information of the fused image. Finally, the multilayer convolutional fusion network with channel attention mechanism is used to correlate the local information of the layered fusion image, and the final fusion image containing contour gradient information and useful details is obtained. TNO and RoadSence datasets are selected for training and testing. The results show that the proposed algorithm retains the global structure features of multilayer images and has obvious advantages in fusion performance, model generalization and computational efficiency. Full article
(This article belongs to the Special Issue Multi-Modal Image Processing Methods, Systems, and Applications)
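The first step described above, splitting each source image into a base layer and a fine (detail) layer with a guiding filter, can be sketched with the classic box-filter guided filter used as its own guidance image. This is a generic implementation of that standard filter, not the paper's exact configuration; the radius and epsilon values are illustrative.

```python
# Sketch: guided-filter decomposition of an image into base + detail layers.
# Classic guided filter (He et al.), with the image used as its own guide.
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=8, eps=1e-2):
    """Edge-preserving smoothing of p guided by I (both float arrays in [0, 1])."""
    size = 2 * radius + 1
    mean_I = uniform_filter(I, size)
    mean_p = uniform_filter(p, size)
    corr_Ip = uniform_filter(I * p, size)
    var_I = uniform_filter(I * I, size) - mean_I ** 2
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return uniform_filter(a, size) * I + uniform_filter(b, size)

def decompose(img):
    """Return (base, detail) layers; base + detail reconstructs the input."""
    base = guided_filter(img, img)
    return base, img - base

# Example with a random "infrared" image; real inputs would be IR/visible frames.
ir = np.random.rand(256, 256)
ir_base, ir_detail = decompose(ir)
print(np.allclose(ir, ir_base + ir_detail))  # True
```

The base layers would then be fused with the histogram-mapping enhancement and the detail layers with the improved GAN, as the abstract describes.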
Figures:
Figure 1: Framework of the proposed fusion method for infrared and visible image fusion.
Figure 2: Enhanced mapping of base layer histograms.
Figure 3: Framework for fusion of the base layer of infrared and visible images.
Figure 4: Framework for fusion of the refinement layer of infrared and visible images.
Figure 5: Dense Channel Attention Mechanism Network (dCAMN).
Figure 6: Ablation analysis of our method on the TNO dataset.
Figure 7: Analysis of the effect of the channel attention mechanism network.
Figure 8: Qualitative fusion results of selected Streamboat scenes in the dataset.
Figure 9: Qualitative fusion results of selected Street scenes in the dataset.
Figure 10: Histogram of indicators of six mainstream algorithms on the Streamboat test set images.
Figure 11: Histogram of indicators of six mainstream algorithms on the Street test set images.
Figure 12: Trends of the HFCA_GAN training loss function on the RoadSence and TNO datasets. (a) Loss function under RoadSence; (b) loss function under TNO.
16 pages, 37586 KiB  
Article
Driver Distraction Detection Based on Fusion Enhancement and Global Saliency Optimization
by Xueda Huang, Shuangshuang Gu, Yuanyuan Li, Guanqiu Qi, Zhiqin Zhu and Yiyao An
Mathematics 2024, 12(20), 3289; https://doi.org/10.3390/math12203289 - 20 Oct 2024
Viewed by 693
Abstract
Driver distraction detection not only effectively prevents traffic accidents but also promotes the development of intelligent transportation systems. In recent years, thanks to the powerful feature learning capabilities of deep learning algorithms, driver distraction detection methods based on deep learning have increased significantly. However, for resource-constrained onboard devices, real-time lightweight models are crucial. Most existing methods tend to focus solely on lightweight model design, neglecting the loss in detection performance for small targets. To achieve a balance between detection accuracy and network lightweighting, this paper proposes a driver distraction detection method that combines enhancement and global saliency optimization. The method mainly consists of three modules: context fusion enhancement module (CFEM), channel optimization feedback module (COFM), and channel saliency distillation module (CSDM). In the CFEM module, one-dimensional convolution is used to capture information between distant pixels, and an injection mechanism is adopted to further integrate high-level semantic information with low-level detail information, enhancing feature fusion capabilities. The COFM module incorporates a feedback mechanism to consider the impact of inter-layer and intra-layer channel relationships on model compression performance, achieving joint pruning of global channels. The CSDM module guides the student network to learn the salient feature information from the teacher network, effectively balancing the model’s real-time performance and accuracy. Experimental results show that this method outperforms the state-of-the-art methods in driver distraction detection tasks, demonstrating good performance and potential application prospects. Full article
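The channel saliency distillation module (CSDM) guides a lightweight student with the teacher's per-channel saliency. A common way to express this, sketched below under the assumption of a channel-wise knowledge-distillation loss, is to normalize each channel's spatial activation into a distribution and minimize the KL divergence between teacher and student; this is a generic formulation for illustration, not necessarily the exact CSDM loss.

```python
# Generic channel-wise distillation loss sketch (assumed formulation, not the exact CSDM).
import torch
import torch.nn.functional as F

def channel_distill_loss(student_feat, teacher_feat, tau=4.0):
    """KL divergence between per-channel spatial distributions of teacher and student.

    Both tensors are (N, C, H, W) with matching shapes (project the student with a
    1x1 conv first if the channel counts differ).
    """
    n, c, h, w = student_feat.shape
    s = F.log_softmax(student_feat.view(n, c, -1) / tau, dim=-1)
    t = F.softmax(teacher_feat.view(n, c, -1) / tau, dim=-1)
    # Average the per-channel KL terms; tau**2 keeps the gradient scale comparable.
    return F.kl_div(s, t, reduction="batchmean") * (tau ** 2) / c

student = torch.randn(2, 64, 20, 20, requires_grad=True)
teacher = torch.randn(2, 64, 20, 20)
loss = channel_distill_loss(student, teacher)
loss.backward()
print(float(loss))
```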
Figures:
Figure 1: The framework of the proposed driver distraction detection method. It consists of three main modules: the context fusion enhancement module, the global channel optimization feedback module, and the channel-saliency-aware distillation module.
Figure 2: Details of the CFEM, specifically designed for the fusion enhancement framework. It captures long-range contextual information through 1D depth-wise separable convolutions and uses a feature injection mechanism to explicitly fuse high-level rich semantic information with low-level fine detail information.
Figure 3: The COFM architecture, specifically designed for model compression; given an original well-trained multitasking model, all target layers are sorted in a new order based on each layer's contribution to the total FLOP reduction and sequentially fed back in this order until all layers are compressed.
Figure 4: The CSDM architecture, designed specifically for the distillation module. On the left is the feature map of the teacher–student network used for channel knowledge distillation; on the right are the activation areas corresponding to different categories.
Figure 5: Six categories of driving behaviors in the LDDB dataset.
Figure 6: Ten categories of driving behaviors in the StateFarm dataset.
Figure 7: Comparison of experimental indicators on the LDDB dataset. The best results are marked with a star.
Figure 8: Comparison of experimental indicators on the StateFarm dataset. The best results are marked with a star.
Figure 9: Evaluation metrics for ablation experiments on the LDDB dataset. The best results are marked with a star.
Figure 10: Detection heatmap comparison.
24 pages, 13098 KiB  
Article
A Multi-Scale Feature Fusion Based Lightweight Vehicle Target Detection Network on Aerial Optical Images
by Chengrui Yu, Xiaonan Jiang, Fanlu Wu, Yao Fu, Junyan Pei, Yu Zhang, Xiangzhi Li and Tianjiao Fu
Remote Sens. 2024, 16(19), 3637; https://doi.org/10.3390/rs16193637 - 29 Sep 2024
Viewed by 1391
Abstract
Vehicle detection with optical remote sensing images has become widely applied in recent years. However, the following challenges have remained unsolved during remote sensing vehicle target detection. These challenges include the dense and arbitrary angles at which vehicles are distributed and which make it difficult to detect them; the extensive model parameter (Param) that blocks real-time detection; the large differences between larger vehicles in terms of their features, which lead to a reduced detection precision; and the way in which the distribution in vehicle datasets is unbalanced and thus not conducive to training. First, this paper constructs a small dataset of vehicles, MiVehicle. This dataset includes 3000 corresponding infrared and visible image pairs, offering a more balanced distribution. In the infrared part of the dataset, the proportions of different vehicle types are as follows: cars, 48%; buses, 19%; trucks, 15%; freight, cars 10%; and vans, 8%. Second, we choose the rotated box mechanism for detection with the model and we build a new vehicle detector, ML-Det, with a novel multi-scale feature fusion triple cross-criss FPN (TCFPN), which can effectively capture the vehicle features in three different positions with an mAP improvement of 1.97%. Moreover, we propose LKC–INVO, which allows involution to couple the structure of multiple large kernel convolutions, resulting in an mAP increase of 2.86%. We also introduce a novel C2F_ContextGuided module with global context perception, which enhances the perception ability of the model in the global scope and minimizes model Params. Eventually, we propose an assemble–disperse attention module to aggregate local features so as to improve the performance. Overall, ML-Det achieved a 3.22% improvement in accuracy while keeping Params almost unchanged. In the self-built small MiVehicle dataset, we achieved 70.44% on visible images and 79.12% on infrared images with 20.1 GFLOPS, 78.8 FPS, and 7.91 M. Additionally, we trained and tested our model on the following public datasets: UAS-AOD and DOTA. ML-Det was found to be ahead of many other advanced target detection algorithms. Full article
(This article belongs to the Section AI Remote Sensing)
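The C2F_ContextGuided module mentioned in the abstract pairs local features with surrounding context and a global-context channel reweighting. A rough sketch of a context-guided bottleneck in that spirit is shown below; it follows the general CGNet-style design (local conv, dilated surrounding-context conv, global pooling gate) rather than the exact block in the paper, so the structure and hyperparameters are assumptions.

```python
# Rough sketch of a context-guided bottleneck (CGNet-style), illustrating the idea
# behind C2F_ContextGuided; not the paper's exact block.
import torch
import torch.nn as nn

class ContextGuidedBottleneck(nn.Module):
    def __init__(self, ch, dilation=2, reduction=8):
        super().__init__()
        half = ch // 2
        self.reduce = nn.Conv2d(ch, half, 1)
        self.local = nn.Conv2d(half, half, 3, padding=1, groups=half)      # local detail
        self.surround = nn.Conv2d(half, half, 3, padding=dilation,
                                  dilation=dilation, groups=half)          # wider context
        self.bn_act = nn.Sequential(nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.global_ctx = nn.Sequential(                                   # channel reweighting
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.reduce(x)
        joint = self.bn_act(torch.cat([self.local(y), self.surround(y)], dim=1))
        return x + joint * self.global_ctx(joint)   # residual, globally reweighted

block = ContextGuidedBottleneck(64)
print(block(torch.randn(1, 64, 40, 40)).shape)  # torch.Size([1, 64, 40, 40])
```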
Figures:
Figure 1: Comparison of horizontal and rotated boxes for arbitrarily rotated vehicle target selection: (a) horizontal box and (b) rotated box.
Figure 2: General architecture of the ML-Det model.
Figure 3: The structure of the SPPX model. "*" denotes multiplication, and the following number represents the number of modules.
Figure 4: Schematic diagram of LKC–INVO convolution.
Figure 5: Comparison of model features before (a) and after (b) adding LKC–INVO convolution.
Figure 6: C2F_ContextGuided schematic diagram. (a) C2F_ContextGuided and (b) CG_Bottleneck block.
Figure 7: Assemble–disperse schematic diagram.
Figure 8: Examples from the MiVehicle dataset.
Figure 9: Comparison of the share of each vehicle type in the dataset.
Figure 10: Examples of vehicles in the UCAS_AOD dataset.
Figure 11: Examples of vehicles in the DOTA dataset.
Figure 12: Comparison of the detection differences of YOLOv5s-obb (left) and ML-Det (right).
Figure 13: Comparison of the confusion matrices of YOLOv5s-obb (left) and ML-Det (right).
Figure 14: Comparison of the detection effectiveness of five advanced target detection algorithms on different modalities of the MiVehicle dataset.
Figure 15: Comparison of the detection effectiveness of five advanced target detection algorithms on the UCAS_AOD dataset.
Figure 16: Comparison of the detection effectiveness of various advanced target detection algorithms on the DOTA dataset.
24 pages, 10901 KiB  
Article
Regulating Modality Utilization within Multimodal Fusion Networks
by Saurav Singh, Eli Saber, Panos P. Markopoulos and Jamison Heard
Sensors 2024, 24(18), 6054; https://doi.org/10.3390/s24186054 - 19 Sep 2024
Viewed by 756
Abstract
Multimodal fusion networks play a pivotal role in leveraging diverse sources of information for enhanced machine learning applications in aerial imagery. However, current approaches often suffer from a bias towards certain modalities, diminishing the potential benefits of multimodal data. This paper addresses this issue by proposing a novel modality utilization-based training method for multimodal fusion networks. The method aims to guide the network’s utilization on its input modalities, ensuring a balanced integration of complementary information streams, effectively mitigating the overutilization of dominant modalities. The method is validated on multimodal aerial imagery classification and image segmentation tasks, effectively maintaining modality utilization within ±10% of the user-defined target utilization and demonstrating the versatility and efficacy of the proposed method across various applications. Furthermore, the study explores the robustness of the fusion networks against noise in input modalities, a crucial aspect in real-world scenarios. The method showcases better noise robustness by maintaining performance amidst environmental changes affecting different aerial imagery sensing modalities. The network trained with 75.0% EO utilization achieves significantly better accuracy (81.4%) in noisy conditions (noise variance = 0.12) compared to traditional training methods with 99.59% EO utilization (73.7%). Additionally, it maintains an average accuracy of 85.0% across different noise levels, outperforming the traditional method’s average accuracy of 81.9%. Overall, the proposed approach presents a significant step towards harnessing the full potential of multimodal data fusion in diverse machine learning applications such as robotics, healthcare, satellite imagery, and defense applications. Full article
(This article belongs to the Special Issue Deep Learning Methods for Aerial Imagery)
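Figure 1 below describes modality utilization as the effect of randomly shuffling one modality across the dataset, breaking its association with the labels and measuring how much performance drops. A short sketch of that idea follows; the normalization of the drop into a fraction is an assumption made for illustration, since the paper defines its own MU metric.

```python
# Sketch: estimate how much a fusion model relies on one modality by shuffling it
# across the dataset. The exact MU definition in the paper may differ.
import torch

@torch.no_grad()
def accuracy(model, eo, sar, labels):
    pred = model(eo, sar).argmax(dim=1)
    return (pred == labels).float().mean().item()

@torch.no_grad()
def modality_utilization(model, eo, sar, labels, modality="sar"):
    base = accuracy(model, eo, sar, labels)
    perm = torch.randperm(eo.shape[0])            # break sample<->modality association
    if modality == "sar":
        shuffled = accuracy(model, eo, sar[perm], labels)
    else:
        shuffled = accuracy(model, eo[perm], sar, labels)
    drop = max(base - shuffled, 0.0)
    return drop / max(base, 1e-8)                 # fraction of performance attributable

# model is any two-input fusion classifier: model(eo_batch, sar_batch) -> logits.
```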
Figures:
Figure 1: Computing modality utilization by randomly shuffling a modality Mi within the dataset to break the association between the input modality Mi and the output Y.
Figure 2: Modality utilization-based training targets the decision layers while using pre-trained feature extractors with frozen weights.
Figure 3: Visualization of the multimodal fusion network's gradient descent on the loss surface of the fusion task. Optimizing L_mu from the very beginning can push the network into a local minimum.
Figure 4: Clipped exponential function-based loss factor warm-up for MU-based training.
Figure 5: NTIRE 2021 Multimodal Aerial View Object Classification Challenge dataset [15].
Figure 6: NTIRE 2021 Multimodal Aerial View Object Classification Network with ResNet18 as the backbone.
Figure 7: Visualization of NTIRE21 dataset classification using the Multimodal Aerial View Object Classification Network.
Figure 8: MCubeS Multimodal Material Segmentation dataset [16].
Figure 9: MCubeS Multimodal Material Segmentation Network with UNet as the backbone.
Figure 10: Visualization of MCubeS dataset image segmentation using the Multimodal Material Segmentation Network.
Figure 11: Effects of different target utilizations MU_target on modality utilization and classification accuracy with the modality utilization-based training method on the NTIRE21 dataset. Loss factor λ_L = 100.0, with SAR as the focus modality.
Figure 12: Effects of the loss factor λ_L on modality utilization and classification accuracy with the modality utilization-based training method on the NTIRE21 dataset. Target utilization MU_target = 50%, with SAR as the focus modality.
Figure 13: Effects of the loss factor buildup rate β on modality utilization and classification accuracy with the modality utilization-based training method on the NTIRE21 dataset. Target utilization MU_target = 50%, maximum loss factor λ_L_max = 100, and buildup delay δ = 0, with SAR as the focus modality.
Figure 14: Effects of Gaussian noise with mean = 0 and variance = {0.06, 0.09, 0.12} in the EO modality, the SAR modality, and both modalities during inference on networks trained with different levels of SAR utilization.
21 pages, 35632 KiB  
Article
AgeDETR: Attention-Guided Efficient DETR for Space Target Detection
by Xiaojuan Wang, Bobo Xi, Haitao Xu, Tie Zheng and Changbin Xue
Remote Sens. 2024, 16(18), 3452; https://doi.org/10.3390/rs16183452 - 18 Sep 2024
Viewed by 807
Abstract
Recent advancements in space exploration technology have significantly increased the number of diverse satellites in orbit. This surge in space-related information has posed considerable challenges in developing space target surveillance and situational awareness systems. However, existing detection algorithms face obstacles such as complex space backgrounds, varying illumination conditions, and diverse target sizes. To address these challenges, we propose an innovative end-to-end Attention-Guided Encoder DETR (AgeDETR) model, since artificial intelligence technology has progressed swiftly in recent years. Specifically, AgeDETR integrates Efficient Multi-Scale Attention (EMA) Enhanced FasterNet block (EF-Block) within a ResNet18 (EF-ResNet18) backbone. This integration enhances feature extraction and computational efficiency, providing a robust foundation for accurately identifying space targets. Additionally, we introduce the Attention-Guided Feature Enhancement (AGFE) module, which leverages self-attention and channel attention mechanisms to effectively extract and reinforce salient target features. Furthermore, the Attention-Guided Feature Fusion (AGFF) module optimizes multi-scale feature integration and produces highly expressive feature representations, which significantly improves recognition accuracy. The proposed AgeDETR framework achieves outstanding performance metrics, i.e., 97.9% in mAP0.5 and 85.2% in mAP0.5:0.95, on the SPARK2022 dataset, outperforming existing detectors and demonstrating superior performance in space target detection. Full article
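The FasterNet block used inside EF-Block builds on Partial Convolution (see Figure 3 below): a regular convolution is applied to only a fraction of the channels while the remaining channels pass through untouched, cutting FLOPs and memory access. A small sketch of that idea follows; the 1/4 ratio is the common FasterNet default, used here as an assumption rather than the paper's setting.

```python
# Sketch of Partial Convolution (PConv) as popularized by FasterNet: convolve only a
# fraction of the channels and pass the rest through unchanged.
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    def __init__(self, channels, ratio=0.25, kernel_size=3):
        super().__init__()
        self.conv_ch = max(1, int(channels * ratio))     # channels that get convolved
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        head, tail = torch.split(x, [self.conv_ch, x.shape[1] - self.conv_ch], dim=1)
        return torch.cat([self.conv(head), tail], dim=1)  # untouched channels ride along

pconv = PartialConv(64)
print(pconv(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```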
Figures:
Figure 1: Overview of the proposed AgeDETR.
Figure 2: (a) Overview of the overall framework of the proposed EF-ResNet18 architecture. (b) The improved residual connection structure incorporating the EF-Block module. (c) A detailed structural diagram of the EF-Block module.
Figure 3: Schematic of the principle of Partial Convolution.
Figure 4: Examples from the SPARK2022 dataset. (a) Proba 2, (b) Cheops, (c) Debris, (d) Double star, (e) Lisa Pathfinder, (f) Smart 1, (g) Soho, (h) Proba 3 CSC, (i) Proba 3 OCS, (j) Xmm newton, and (k) Earth Observation Sat 1.
Figure 5: Information about the manual labeling of the objects in the SPARK2022 dataset.
Figure 6: Comparison of the recognition effectiveness of different detection models on the SPARK2022 dataset.
Figure 7: Visualization of feature maps with different models. (a) Input image, (b) YOLOv5s, (c) YOLOv6s, (d) YOLOv8s, (e) YOLOv9c, (f) RT-DETR, and (g) AgeDETR.
Figure 8: Visualization of output feature maps with randomly initialized weights for different network architectures. (a) Input images, (b) ResNet18, and (c) EF-ResNet18.
Figure 9: Prediction results on the SPARK2022 dataset: the upper section shows the ground-truth labels for space targets, while the lower section shows the predictions by AgeDETR.
24 pages, 16483 KiB  
Article
Semi-Supervised Remote Sensing Building Change Detection with Joint Perturbation and Feature Complementation
by Zhanlong Chen, Rui Wang and Yongyang Xu
Remote Sens. 2024, 16(18), 3424; https://doi.org/10.3390/rs16183424 - 14 Sep 2024
Viewed by 941
Abstract
The timely updating of the spatial distribution of buildings is essential to understanding a city’s development. Deep learning methods have remarkable benefits in quickly and accurately recognizing these changes. Current semi-supervised change detection (SSCD) methods have effectively reduced the reliance on labeled data. However, these methods primarily focus on utilizing unlabeled data through various training strategies, neglecting the impact of pseudo-changes and learning bias in models. When dealing with limited labeled data, abundant low-quality pseudo-labels generated by poorly performing models can hinder effective performance improvement, leading to the incomplete recognition results of changes to buildings. To address this issue, we propose a feature multi-scale information interaction and complementation semi-supervised method based on consistency regularization (MSFG-SemiCD), which includes a multi-scale feature fusion-guided change detection network (MSFGNet) and a semi-supervised update method. Among them, the network facilitates the generation of multi-scale change features, integrates features, and captures multi-scale change targets through the temporal difference guidance module, the full-scale feature fusion module, and the depth feature guidance fusion module. Moreover, this enables the fusion and complementation of information between features, resulting in more complete change features. The semi-supervised update method employs a weak-to-strong consistency framework to achieve model parameter updates while maintaining perturbation invariance of unlabeled data at both input and encoder output features. Experimental results on the WHU-CD and LEVIR-CD datasets confirm the efficacy of the proposed method. There is a notable improvement in performance at both the 1% and 5% levels. The IOU in the WHU-CD dataset increased by 5.72% and 6.84%, respectively, while in the LEVIR-CD dataset, it improved by 18.44% and 5.52%, respectively. Full article
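The weak-to-strong consistency framework mentioned in the abstract can be sketched as follows: pseudo-labels produced from weakly perturbed unlabeled image pairs supervise the predictions on strongly perturbed versions, with a confidence threshold filtering out unreliable pixels. The threshold value, loss choice, and augmentation handling below are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of weak-to-strong consistency for semi-supervised change detection.
# Threshold and loss choices are illustrative, not the paper's exact settings.
import torch
import torch.nn.functional as F

def unsupervised_consistency_loss(model, img_a, img_b, weak_aug, strong_aug,
                                  conf_thresh=0.95):
    with torch.no_grad():
        # Pseudo-labels from the weakly perturbed pair (change / no-change per pixel).
        logits_w = model(weak_aug(img_a), weak_aug(img_b))
        probs_w = torch.softmax(logits_w, dim=1)
        conf, pseudo = probs_w.max(dim=1)

    # Predictions on the strongly perturbed pair must match the pseudo-labels.
    # (A real implementation applies the same random perturbation to both images.)
    logits_s = model(strong_aug(img_a), strong_aug(img_b))
    loss = F.cross_entropy(logits_s, pseudo, reduction="none")

    mask = (conf >= conf_thresh).float()          # keep only confident pixels
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)

# model(img_a, img_b) -> (N, 2, H, W) change logits; weak_aug/strong_aug are callables.
```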
Figures:
Figure 1: Examples of low-quality pseudo-labels when pseudo-changes are present and only a small number of labels is available.
Figure 2: Diagram of the method. (a) The supervised stage. (b) The unsupervised stage.
Figure 3: Flowchart of the input perturbation applied to unlabeled data; the red box shows the result after the random mix-mask operation.
Figure 4: Network framework diagram. The network consists of an encoder, an adaptive change feature perception module, and a decoder. The two-branch encoder consists of four residual blocks of ResNet-18. The adaptive difference feature learning module consists of four TDG modules. The decoder consists of the full-scale feature fusion module and the deep semantic guidance module.
Figure 5: Adaptive difference feature learning module.
Figure 6: Multi-scale feature fusion and guidance module.
Figure 7: Visualization results of different semi-supervised methods at a 5% labeling rate on the WHU-CD dataset.
Figure 8: Visualization results of MSFGNet under different labeling rates on the WHU-CD dataset.
Figure 9: Visualization results of different semi-supervised methods at a 5% labeling rate on the LEVIR-CD dataset.
Figure 10: Visualization results of MSFGNet under different labeling rates on the LEVIR-CD dataset.
Figure 11: Visualization results of the ablation experiments.
Figure 12: Experimental results for the participation weights of each part of the unsupervised loss function. (a) Training results on WHU-CD with 5% of the data; (b) training results on LEVIR-CD with 5% of the data.
Figure 13: Feature visualization. (a) Initial features of the image pairs. (b,c) Change features. (d,e) Different levels of change features. (f) Output of the FSF module. (g–i) Three inputs of the DSG module. (j) Output of the DSG module. (k) Change probability map, where red indicates higher attention values and blue indicates lower values.
14 pages, 665 KiB  
Review
Therapeutic Opportunities for Biomarkers in Metastatic Spine Tumors
by Christian Schroeder, Beatrice Campilan, Owen P. Leary, Jonathan Arditi, Madison J. Michles, Rafael De La Garza Ramos, Oluwaseun O. Akinduro, Ziya L. Gokaslan, Margot Martinez Moreno and Patricia L. Zadnik Sullivan
Cancers 2024, 16(18), 3152; https://doi.org/10.3390/cancers16183152 - 14 Sep 2024
Viewed by 755
Abstract
For many spine surgeons, patients with metastatic cancer are often present in an emergent situation with rapidly progressive neurological dysfunction. Since the Patchell trial, scoring systems such as NOMS and SINS have emerged to guide the extent of surgical excision and fusion in the context of chemotherapy and radiation therapy. Yet, while multidisciplinary decision-making is the gold standard of cancer care, in the middle of the night, when a patient needs spinal surgery, the wealth of chemotherapy data, clinical trials, and other medical advances can feel overwhelming. The goal of this review is to provide an overview of the relevant molecular biomarkers and therapies driving patient survival in lung, breast, prostate, and renal cell cancer. We highlight the molecular differences between primary tumors (i.e., the patient’s original lung cancer) and the subsequent spinal metastasis. This distinction is crucial, as there are limited data investigating how metastases respond to their primary tumor’s targeted molecular therapies. Integrating information from primary and metastatic markers allows for a more comprehensive and personalized approach to cancer treatment. Full article
(This article belongs to the Special Issue Surgical Treatment of Spinal Tumors)
Figures:
Figure 1: Genetic mutations in primary lung, breast, prostate, and renal cancers and their subsequent mutations upon metastasis to the spine. Mutations in italics represent known therapeutic targets discussed in this study.
17 pages, 3340 KiB  
Article
GMS-YOLO: An Algorithm for Multi-Scale Object Detection in Complex Environments in Confined Compartments
by Qixiang Ding, Weichao Li, Chengcheng Xu, Mingyuan Zhang, Changchong Sheng, Min He and Nanliang Shan
Sensors 2024, 24(17), 5789; https://doi.org/10.3390/s24175789 - 5 Sep 2024
Cited by 1 | Viewed by 1086
Abstract
Many compartments are prone to pose safety hazards such as loose fasteners or object intrusion due to their confined space, making manual inspection challenging. To address the challenges of complex inspection environments, diverse target categories, and variable scales in confined compartments, this paper proposes a novel GMS-YOLO network, based on the improved YOLOv8 framework. In addition to the lightweight design, this network accurately detects targets by leveraging more precise high-level and low-level feature representations obtained from GhostHGNetv2, which enhances feature-extraction capabilities. To handle the issue of complex environments, the backbone employs GhostHGNetv2 to capture more accurate high-level and low-level feature representations, facilitating better distinction between background and targets. In addition, this network significantly reduces both network parameter size and computational complexity. To address the issue of varying target scales, the first layer of the feature fusion module introduces Multi-Scale Convolutional Attention (MSCA) to capture multi-scale contextual information and guide the feature fusion process. A new lightweight detection head, Shared Convolutional Detection Head (SCDH), is designed to enable the model to achieve higher accuracy while being lighter. To evaluate the performance of this algorithm, a dataset for object detection in this scenario was constructed. The experiment results indicate that compared to the original model, the parameter number of the improved model decreased by 37.8%, the GFLOPs decreased by 27.7%, and the average accuracy increased from 82.7% to 85.0%. This validates the accuracy and applicability of the proposed GMS-YOLO network. Full article
(This article belongs to the Special Issue Compressed Sensing and Imaging Processing—2nd Edition)
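The GhostHGNetv2 backbone described in the abstract above builds on Ghost convolutions, which generate part of the output channels with cheap depthwise operations instead of full convolutions. Below is a minimal PyTorch sketch of such a Ghost convolution block, included for orientation only: the class name, channel ratio, and SiLU activation are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Ghost convolution block (assumed names and hyperparameters).
import torch
import torch.nn as nn


class GhostConv(nn.Module):
    """Produce part of the output channels with a regular conv ("primary")
    and the rest with a cheap depthwise conv applied to those primary features."""

    def __init__(self, in_channels: int, out_channels: int,
                 kernel_size: int = 1, ratio: int = 2, dw_size: int = 3):
        super().__init__()
        init_channels = out_channels // ratio          # "primary" features
        cheap_channels = out_channels - init_channels  # "ghost" features

        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, init_channels, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.SiLU(inplace=True),
        )
        # Depthwise conv: one filter per primary channel, far fewer FLOPs.
        self.cheap = nn.Sequential(
            nn.Conv2d(init_channels, cheap_channels, dw_size,
                      padding=dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(cheap_channels),
            nn.SiLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        primary = self.primary(x)
        ghost = self.cheap(primary)
        return torch.cat([primary, ghost], dim=1)


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)
    block = GhostConv(64, 128)
    print(block(x).shape)  # torch.Size([1, 128, 80, 80])
```

With ratio = 2, half the output channels come from the primary convolution and half from the cheap depthwise branch, which is where this family of modules gets its parameter and FLOP savings.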
Figure 1: Overview of the GMS-YOLO network framework.
Figure 2: The network structure of HGNetv2.
Figure 3: The network structure of GhostHGNetv2.
Figure 4: The structure of the Ghost Convolution module.
Figure 5: The structure of the GhostHGBlock module.
Figure 6: The structure of the MSCA module.
Figure 7: The structure of the SCDH module.
Figure 8: Sample presentation of the dataset. These figures showcase typical scenes and targets within the dataset, focusing on the equipment and environment of the confined cabin setting and covering all categories that need to be detected. Red boxes mark bolt, orange e-plug, yellow a-plug, dark green fod, light green a-bind, pink n-bind, light blue flaw, and dark blue water.
Figure 9: Comparison of loss curves and metric curves.
Figure 10: Validation set detection results comparison.
20 pages, 27367 KiB  
Article
MCG-RTDETR: Multi-Convolution and Context-Guided Network with Cascaded Group Attention for Object Detection in Unmanned Aerial Vehicle Imagery
by Chushi Yu and Yoan Shin
Remote Sens. 2024, 16(17), 3169; https://doi.org/10.3390/rs16173169 - 27 Aug 2024
Cited by 1 | Viewed by 1107
Abstract
In recent years, object detection in unmanned aerial vehicle (UAV) imagery has become a prominent and crucial task with advancements in drone and remote sensing technologies. However, detecting targets in UAV images poses challenges such as complex backgrounds, severe occlusion, dense small targets, and variable lighting conditions. Despite the notable progress of deep-learning-based object detection algorithms, they still struggle with missed detections and false alarms. In this work, we introduce MCG-RTDETR, an approach based on the real-time detection transformer (RT-DETR) with dual and deformable convolution modules, a cascaded group attention module, a context-guided feature fusion structure with context-guided downsampling, and a more flexible prediction head for precise object detection in UAV imagery. Experimental results on the VisDrone2019 dataset show that our approach achieves the highest AP of 29.7% and AP50 of 58.2%, surpassing several cutting-edge algorithms. Visual results further validate the model’s robustness and capability in complex environments. Full article
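The context-guided downsampling mentioned in the abstract pairs local features with surrounding and global context while reducing spatial resolution. The following PyTorch sketch shows one common way such a block can be assembled (CGNet-style); the channel split, dilation rate, and squeeze-and-excitation gate are assumptions for illustration, not the MCG-RTDETR implementation.

```python
# Minimal sketch of a context-guided downsampling block (assumed design choices).
import torch
import torch.nn as nn


class ContextGuidedDown(nn.Module):
    def __init__(self, in_channels: int, out_channels: int,
                 dilation: int = 2, reduction: int = 16):
        super().__init__()
        half = out_channels // 2
        # Stride-2 projection halves the spatial size before context extraction.
        self.proj = nn.Sequential(
            nn.Conv2d(in_channels, half, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(half),
            nn.PReLU(half),
        )
        # Local feature: plain 3x3 depthwise conv.
        self.local = nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False)
        # Surrounding context: dilated 3x3 depthwise conv with a larger receptive field.
        self.context = nn.Conv2d(half, half, 3, padding=dilation,
                                 dilation=dilation, groups=half, bias=False)
        self.fuse = nn.Sequential(nn.BatchNorm2d(2 * half), nn.PReLU(2 * half))
        # Global context: squeeze-and-excitation style channel gate.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * half, max(2 * half // reduction, 4), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(2 * half // reduction, 4), 2 * half, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.proj(x)
        joint = self.fuse(torch.cat([self.local(x), self.context(x)], dim=1))
        return joint * self.gate(joint)  # reweight channels with global context


if __name__ == "__main__":
    feat = torch.randn(1, 256, 40, 40)
    block = ContextGuidedDown(256, 256)
    print(block(feat).shape)  # torch.Size([1, 256, 20, 20])
```

The gate multiplies the fused local/surrounding features by a per-channel weight computed from the globally pooled response, so downsampling keeps some scene-level context rather than relying on the local window alone.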
Figure 1: The architecture of the proposed MCG-RTDETR.
Figure 2: The structure of the dual convolutional filter. M is the input channel count, N denotes the number of output channels and convolution filters, and G is the group count within dual convolution.
Figure 3: The structure of the 3 × 3 deformable convolution.
Figure 4: Diagram of the cascaded group attention module.
Figure 5: Diagram of the context-guided downsampling block.
Figure 6: Diagram of the prediction heads.
Figure 7: Illustrative samples of the VisDrone2019 dataset.
Figure 8: Visible object detection results of the proposed MCG-RTDETR and some state-of-the-art detectors on complex detection scenes from the VisDrone2019 dataset: (a) occlusion and complex environmental factors, (b) vertical shooting angle during daylight, (c) cloudy weather with an intricate background, (d) very small targets, (e) low-light night scene, (f) dynamic objects such as vehicles at night.
13 pages, 632 KiB  
Systematic Review
Targeted Prostate Biopsy: How, When, and Why? A Systematic Review
by Giacomo Rebez, Maria Barbiero, Franco Alchiede Simonato, Francesco Claps, Salvatore Siracusano, Rosa Giaimo, Gabriele Tulone, Fabio Vianello, Alchiede Simonato and Nicola Pavan
Diagnostics 2024, 14(17), 1864; https://doi.org/10.3390/diagnostics14171864 - 26 Aug 2024
Viewed by 1021
Abstract
Objective: Prostate cancer, the second most diagnosed cancer among men, requires precise diagnostic techniques to ensure effective treatment. This review explores the technological advancements, optimal application conditions, and benefits of targeted prostate biopsies facilitated by multiparametric magnetic resonance imaging (mpMRI). Methods: A systematic literature review was conducted to compare traditional 12-core systematic biopsies guided by transrectal ultrasound with targeted biopsy techniques using mpMRI. We searched electronic databases including PubMed, Scopus, and Web of Science from January 2015 to December 2024 using keywords such as “targeted prostate biopsy”, “fusion prostate biopsy”, “cognitive prostate biopsy”, “MRI-guided biopsy”, and “transrectal ultrasound prostate biopsy”. Studies comparing various biopsy methods were included, and data extraction focused on study characteristics, patient demographics, biopsy techniques, diagnostic outcomes, and complications. Conclusion: mpMRI-guided targeted biopsies enhance the detection of clinically significant prostate cancer while reducing unnecessary biopsies and the detection of insignificant cancers. These targeted approaches preserve or improve diagnostic accuracy and patient outcomes, minimizing the risks associated with overdiagnosis and overtreatment. By utilizing mpMRI, targeted biopsies allow for precise targeting of suspicious regions within the prostate, providing a cost-effective method that reduces the number of biopsies performed. This review highlights the importance of integrating advanced imaging techniques into prostate cancer diagnosis to improve patient outcomes and quality of life. Full article
Figure 1: PRISMA flowchart.