https://elcvia.cvc.uab.cat/issue/feed ELCVIA Electronic Letters on Computer Vision and Image Analysis 2024-11-07T01:16:20+00:00 Electronic Letters on Computer Vision and Image Analysis elcvia@cvc.uab.cat Open Journal Systems Electronic Journal on Computer Vision and Image Analysis https://elcvia.cvc.uab.cat/article/view/1833 Deep Learning based-framework for Math Formulas Understanding 2024-11-07T01:16:20+00:00 Kawther Khazri Ayeb kawther.khazri@yahoo.fr Afef Kacem afef.kacem@ensit.u-tunis.tn Takwa Ben Aïcha Gader takwa.ben.aichaa@gmail.com <p>Extracting mathematical formulas from images of scientific documents and converting them into structured data for storage in a database is essential for their further use. However, recognizing and extracting math formulas automatically, rapidly, and effectively can be challenging. To address this problem, we propose a deep learning system that uses formula combination features to train a YOLOv8 model. This system detects and classifies formulas both inside and outside the text. Once the formulas are extracted, a robust end-to-end math formula recognition system automatically identifies and classifies math symbols using Faster R-CNN object detection, then applies a Convolutional Graph Neural Network (ConvGNN) to analyze the formula layout, since a formula is better represented as a graph with complex relationships and object interdependencies. ConvGNN can predict formula linkages without resorting to laborious feature engineering. Experimental results on the IBEM and CROHME 2019 datasets show that the proposed approach can accurately extract isolated formulas with an mAP of 99.3%, extract embedded formulas with an mAP of 80.3%, detect symbols with an mAP of 87.3%, and analyze formula layout with an accuracy of 92%. 
We also show that our system is competitive with related work.</p> 2024-09-06T00:00:00+00:00 Copyright (c) 2024 Afef Kacem https://elcvia.cvc.uab.cat/article/view/1755 Dr DAH-Unet: A modified UNet for Semantic Segmentation of MRI images for brain tumour detection 2024-08-16T14:45:46+00:00 Mohankrishna Potnuru mpotnuru@gitam.in B. Suribabu Naick sbhukya@gitam.edu <p>Applying sophisticated image processing techniques to brain MR images for medical image segmentation significantly improves the ability to detect tumors. Manually segmenting a brain tumor is time-consuming and requires a doctor's training and experience. To address this issue, we propose a modification of the UNet architecture, called DAH-Unet, that combines residual blocks, a rebuilt atrous spatial pyramid pooling (ASPP) module, and depth-wise convolutions. We also propose a hybrid loss function that is explicitly aware of boundaries. Experiments on two publicly available datasets show that it outperforms existing semantic segmentation models on some metrics.</p> 2024-11-12T00:00:00+00:00 Copyright (c) 2024 Mohankrishna Potnuru, B. 
Suribabu Naick https://elcvia.cvc.uab.cat/article/view/1917 An Efficient Deep Learning based License Plate Recognition for Smart Cities 2024-08-16T17:31:42+00:00 Swati d20ec012@eced.svnit.ac.in Shubh Dinesh Kawa shubhkawa11@gmail.com Shubham Kamble shubhamkamble200431@gmail.com Darshit Desai darshitdesai962@gmail.com Pratik Himanshu Karelia pratik25h.k@gmail.com Pinalkumar Engineer pje@eced.svnit.ac.in <p>Combining computer vision algorithms with deep learning has opened up a wide range of applications. With today's heavy vehicle traffic, it is very difficult to trace and capture vehicle information through traffic surveillance on roads and in parking areas, or for safety purposes. Here, we explore a use case in which a deep learning model is trained to detect and recognize the license plate of a vehicle. In the proposed method, an object detection model, EfficientDet-D0, is trained on a custom dataset for license plate detection, and the Tesseract optical character recognition model is used for character recognition. We use a novel license plate extraction algorithm that reduces false localization, followed by character recognition, in a pipeline. We also explore model quantization to compress the model to reduced precision for efficient edge-based deployment in an end application. Our study focuses on Indian vehicles; we evaluate performance on standard datasets such as CCPD and UFPR, achieving 97.9% in license plate localization and 95.15% in end-to-end detection and recognition. We implemented the system on Raspberry Pi 3 and NVIDIA Jetson Nano devices with improved performance. 
Compared with the state of the art, we achieve gains of 2×, 3.8×, and 2.5× on CPU, GPU, and edge platforms, respectively.</p> 2024-11-12T00:00:00+00:00 Copyright (c) 2024 Swati, Shubh Dinesh Kawa, Shubham Kamble, Darshit Desai, Pratik Himanshu Karelia, Pinalkumar Engineer https://elcvia.cvc.uab.cat/article/view/1941 A Labeled Array Distance Metric for Measuring Image Segmentation Quality 2024-09-20T09:50:41+00:00 Maryam Berijanian berijani@msu.edu Katrina Gensterblum katrina.gensterblum@kellanova.com Doruk Alp Mutlu mutludor@msu.edu Katelyn Reagan kreagan@smith.edu Andrew Hart hartand9@msu.edu Dirk Colbry colbrydi@msu.edu <p>This work introduces two new distance metrics for comparing labeled arrays, which are common outputs of image segmentation algorithms. Each pixel in an image is assigned a label, with binary segmentation providing only two labels ('foreground' and 'background'). These can be represented by a simple binary matrix and compared using pixel differences. However, many segmentation algorithms output multiple regions in a labeled array. We propose two distance metrics, named LAD and MADLAD, that calculate the distance between two labeled images. By doing so, the accuracy of different image segmentation algorithms can be evaluated by measuring their outputs against a 'ground truth' labeling. Both proposed metrics, operating with a complexity of <em>O(N)</em> for images with <em>N</em> pixels, are designed to quickly identify similar labeled arrays, even when different labeling methods are used. Comparisons are made between images labeled manually and those labeled by segmentation algorithms. This evaluation is crucial when searching through a space of segmentation algorithms and their hyperparameters via a genetic algorithm to identify the optimal solution for automated segmentation, which is the goal in our lab, SEE-Insight. 
By measuring the distance from the ground truth, these metrics help determine which algorithm provides the most accurate segmentation.</p> 2024-11-12T00:00:00+00:00 Copyright (c) 2024 Maryam Berijanian, Katrina Gensterblum, Doruk Alp Mutlu, Katelyn Reagan, Andrew Hart, Dirk Colbry
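To illustrate the kind of comparison the last abstract describes, here is a minimal sketch of a label-name-invariant disagreement score between two labeled arrays, computed in a single pass over the pixels. It is not the LAD or MADLAD metric, whose definitions are not given in the abstract; the function name `label_invariant_disagreement` and the greedy "largest overlap" relabeling are assumptions made purely for illustration.

```python
from collections import Counter


def label_invariant_disagreement(pred, truth):
    """Fraction of pixels that disagree after relabeling `pred` to best match `truth`.

    Hypothetical illustration only; NOT the LAD/MADLAD definition.
    `pred` and `truth` are 2D lists (rows of integer labels).
    """
    flat_pred = [label for row in pred for label in row]
    flat_truth = [label for row in truth for label in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("labeled arrays must have the same number of pixels")
    if not flat_pred:
        return 0.0

    # One pass over the pixels: count how often each (pred, truth) label pair co-occurs.
    overlap = Counter(zip(flat_pred, flat_truth))

    # For each predicted label, keep the size of its largest overlap with any truth label.
    best = {}
    for (p, _t), count in overlap.items():
        if count > best.get(p, 0):
            best[p] = count

    # Pixels covered by those best overlaps count as agreeing; the rest disagree.
    matched = sum(best.values())
    return 1.0 - matched / len(flat_pred)


truth = [[0, 0, 1],
         [0, 1, 1]]
same_regions_other_names = [[5, 5, 9],
                            [5, 9, 9]]
one_pixel_off = [[5, 5, 9],
                 [5, 5, 9]]

print(label_invariant_disagreement(same_regions_other_names, truth))
print(label_invariant_disagreement(one_pixel_off, truth))
```

The first call compares two arrays that describe the same regions under different label names, so the score is zero; the second differs in one of six pixels. Note that this sketch is asymmetric and does not penalize over-segmentation (a prediction with one label per pixel would also score zero), which is one reason a purpose-built metric is needed in practice.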