Selected Papers from the 9th International Conference on Imaging for Crime Detection and Prevention (ICDP-19)

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (10 October 2020) | Viewed by 15987

Special Issue Editors


Prof. Dr. Sergio A. Velastin
Guest Editor
School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
Interests: computer vision; object location; human action recognition; deep learning

Dr. Epaminondas Kapetanios
Guest Editor
University of Westminster, London, UK
Interests: explainable AI; semantic computing; information retrieval and search

Dr. Anastasia Angelopoulou
Guest Editor
School of Computer Science and Engineering, University of Westminster, London, UK
Interests: computer vision; neural networks; computational intelligence; machine learning; multimodal human computer interaction; robotics

Prof. Izzet Kale
Guest Editor
University of Westminster, London, UK
Interests: signal processing; sensors and sensor networks; image processing

Special Issue Information

Dear Colleagues,

The 9th International Conference on Imaging for Crime Detection and Prevention (ICDP-19), taking place 16–18 December 2019 in London, UK, seeks contributions on the development of automated surveillance systems, adaptive behaviours and machine learning methods that harness data generated from very different sources and with many different features, such as social networks and smart cities, to tackle the vulnerability of public spaces and individuals.

The aim of this Special Issue is to collect selected papers from the 2019 9th International Conference on Imaging for Crime Detection and Prevention (ICDP-19) describing research from both academia and industry on recent advances in the theory, application and implementation of crime detection and prevention concepts, technologies and applications. Authors of selected high-quality papers presented at the conference will be invited to submit extended versions of their original papers (extending the conference paper's content by at least 50%, with a significantly different title, abstract and contents) for full peer review. Contributions are invited under the following conference topics:

  • Surveillance systems and solutions (system architecture aspects, operational procedures, usability, scalability)
  • Multicamera systems
  • Information fusion (e.g., from visible and infrared cameras, microphone arrays, etc.)
  • Learning systems, cognitive systems engineering and video mining
  • Robust computer vision algorithms (24/7 operation under variable conditions, object tracking, multicamera algorithms, behaviour analysis and learning, scene segmentation)
  • Human machine interfaces, human systems engineering and human factors
  • Wireless communications and networks for video surveillance, video coding, compression, authentication, watermarking, location-dependent services
  • Metadata generation, video database indexing, searching and browsing
  • Embedded systems, surveillance middleware
  • Gesture and posture analysis and recognition
  • Biometrics (including face recognition)
  • Forensics and crime scene reconstruction
  • X-Ray and terahertz scanning
  • Case studies, practical systems and testbeds
  • Data protection, civil liberties and social exclusion issues
  • Algorithmic bias and transparency for machine learning
  • AI ethics
  • Custom FPGA-based approximate computing

Accepted papers (after peer review) will be published immediately.

Prof. Dr. Sergio A. Velastin
Dr. Epaminondas Kapetanios
Dr. Anastasia Angelopoulou
Prof. Izzet Kale
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Surveillance systems
  • Wireless communications and networks for video surveillance
  • Machine learning
  • Artificial Intelligence
  • Computer vision
  • Security and privacy

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

16 pages, 2558 KiB  
Article
Directed Gaze Trajectories for Biometric Presentation Attack Detection
by Asad Ali, Sanaul Hoque and Farzin Deravi
Sensors 2021, 21(4), 1394; https://doi.org/10.3390/s21041394 - 17 Feb 2021
Cited by 4 | Viewed by 2188
Abstract
Presentation attack artefacts can be used to subvert the operation of biometric systems by being presented to the sensors of such systems. In this work, we propose the use of visual stimuli with randomised trajectories to stimulate eye movements for the detection of such spoofing attacks. The presentation of a moving visual challenge is used to ensure that some pupillary motion is stimulated and then captured with a camera. Various types of challenge trajectories are explored on different planar geometries representing prospective devices where the challenge could be presented to users. To evaluate the system, photo, 2D mask and 3D mask attack artefacts were used and pupillary movement data were captured from 80 volunteers performing genuine and spoofing attempts. The results support the potential of the proposed features for the detection of biometric presentation attacks.
Figures
Figure 1: The block diagram of the proposed presentation attack detection (PAD) system (RPD = Relative Pupillary Displacement).
Figure 2: Samples of the random trajectory used as a challenge during the data collection: (a) Lines; and (b) Curves (the "start" and "end" labels are added for the clarity of illustration only).
Figure 3: Detected landmarks and corresponding distances from the pupil centres used for feature extraction. Features from the two eyes were treated independently for attack detection in this implementation.
Figure 4: Data collection process: (a) genuine attempt; (b) photo attack; (c) 2D mask attack; (d) 3D mask attack; and (e) the setup used for data acquisition.
Figure 5: ROC curves for the photo, 2D mask and 3D mask. Stimulus trajectory: Lines. Form factors: (a) tablet format; and (b) phone format.
Figure 6: ROC curves for the photo, 2D mask and 3D mask. Stimulus trajectory: Curves. Form factors: (a) tablet format; and (b) phone format.
Figure 7: ROC curves for the photo, 2D mask and 3D mask. Stimulus trajectory: composite. Form factors: tablet format (left) and phone format (right). The plot titles indicate the device format as well as the attack artefact.
Figure 8: ROC curves for mixed attack types for both device formats. Stimulus trajectory: (a) Lines or Curves challenge for 5 s; (b) composite challenge for 6 s.
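
For readers who want a concrete feel for the challenge-response idea in the abstract above, the following is a minimal, hypothetical sketch that scores liveness by correlating frame-to-frame pupil displacement with the displacement of the on-screen stimulus. The feature definition, threshold and decision rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: a correlation-based liveness score in the spirit of the
# directed-gaze approach described in the abstract. Features, classifier and
# threshold used in the paper may differ; the values here are illustrative.
import numpy as np

def liveness_score(stimulus_xy: np.ndarray, pupil_xy: np.ndarray) -> float:
    """Correlate pupil displacement with the on-screen challenge trajectory.

    stimulus_xy, pupil_xy: arrays of shape (T, 2) sampled at the same rate.
    Returns a score in [-1, 1]; genuine attempts are expected to score high
    because the pupils follow the moving stimulus.
    """
    # Frame-to-frame deltas (relative displacement) remove static offsets
    # caused by head position or artefact placement.
    ds = np.diff(stimulus_xy, axis=0)
    dp = np.diff(pupil_xy, axis=0)
    # Average the per-axis Pearson correlations between stimulus and pupil motion.
    corrs = [np.corrcoef(ds[:, k], dp[:, k])[0, 1] for k in range(2)]
    return float(np.nanmean(corrs))

def is_genuine(stimulus_xy: np.ndarray, pupil_xy: np.ndarray,
               threshold: float = 0.4) -> bool:
    # The threshold would be tuned on genuine/attack data, e.g. via ROC analysis.
    return liveness_score(stimulus_xy, pupil_xy) >= threshold
```
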
17 pages, 5638 KiB  
Article
Rectification and Super-Resolution Enhancements for Forensic Text Recognition
by Pablo Blanco-Medina, Eduardo Fidalgo, Enrique Alegre, Rocío Alaiz-Rodríguez, Francisco Jáñez-Martino and Alexandra Bonnici
Sensors 2020, 20(20), 5850; https://doi.org/10.3390/s20205850 - 16 Oct 2020
Cited by 4 | Viewed by 3374
Abstract
Retrieving text embedded within images is a challenging task in real-world settings. Multiple problems such as low-resolution and the orientation of the text can hinder the extraction of information. These problems are common in environments such as Tor Darknet and Child Sexual Abuse images, where text extraction is crucial in the prevention of illegal activities. In this work, we evaluate eight text recognizers and, to increase the performance of text transcription, we combine these recognizers with rectification networks and super-resolution algorithms. We test our approach on four state-of-the-art and two custom datasets (TOICO-1K and Child Sexual Abuse (CSA)-text, based on text retrieved from Tor Darknet and Child Sexual Exploitation Material, respectively). We obtained a 0.3170 score of correctly recognized words in the TOICO-1K dataset when we combined Deep Convolutional Neural Networks (CNN) and rectification-based recognizers. For the CSA-text dataset, applying resolution enhancements achieved a final score of 0.6960. The highest performance increase was achieved on the ICDAR 2015 dataset, with an improvement of 4.83% when combining the MORAN recognizer and the Residual Dense resolution approach. We conclude that rectification outperforms super-resolution when applied separately, while their combination achieves the best average improvements in the chosen datasets.
Figures
Figure 1: Images crawled from Tor darknet. Samples from (a) dismantled weapon, (b) fake id, (c) fake money and (d) credit cards.
Figure 2: Common problems found in Tor images: orientation (left, middle-top and right) and low-resolution (middle).
Figure 3: Resolution and orientation issues in state-of-the-art datasets.
Figure 4: TOICO-1K sample images: (a) the original image, (b) cropped text regions and (c) labelling examples.
Figure 5: Child Sexual Abuse (CSA)-text dataset sample images.
Figure 6: Proposed methodology. Images that were not correctly recognized are enhanced by super-resolution and rectification techniques, standalone and in combination.
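
The evaluation described in the abstract above can be pictured with the rough sketch below: each cropped text image is optionally passed through super-resolution and rectification before recognition, and the score is the fraction of exactly matched words. The enhancement order and the rectify, super_resolve and recognize callables are placeholders assumed here for illustration, not the paper's interfaces.

```python
# Hedged sketch of a word-accuracy evaluation with optional image enhancement.
# The callables stand in for whichever models are plugged in (e.g. MORAN for
# rectification-based recognition, a Residual Dense network for super-resolution).
from typing import Callable, Iterable, Optional, Tuple

def word_accuracy(samples: Iterable[Tuple[object, str]],
                  recognize: Callable[[object], str],
                  rectify: Optional[Callable[[object], object]] = None,
                  super_resolve: Optional[Callable[[object], object]] = None) -> float:
    """Fraction of cropped text images whose transcription exactly matches."""
    correct = total = 0
    for image, ground_truth in samples:
        if super_resolve is not None:
            image = super_resolve(image)   # upscale low-resolution crops
        if rectify is not None:
            image = rectify(image)         # undo rotation / perspective distortion
        prediction = recognize(image)
        correct += int(prediction.strip().lower() == ground_truth.strip().lower())
        total += 1
    return correct / max(total, 1)
```

Keeping the enhancers as plain callables makes it easy to compare recognizer and enhancement combinations (none, rectification only, super-resolution only, or both) on the same sample set.
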
38 pages, 130648 KiB  
Article
Toward Mass Video Data Analysis: Interactive and Immersive 4D Scene Reconstruction
by Matthias Kraus, Thomas Pollok, Matthias Miller, Timon Kilian, Tobias Moritz, Daniel Schweitzer, Jürgen Beyerer, Daniel Keim, Chengchao Qu and Wolfgang Jentner
Sensors 2020, 20(18), 5426; https://doi.org/10.3390/s20185426 - 22 Sep 2020
Cited by 6 | Viewed by 4168
Abstract
The technical progress in the last decades makes photo and video recording devices omnipresent. This change has a significant impact, among others, on police work. It is no longer unusual that a myriad of digital data accumulates after a criminal act, which must be reviewed by criminal investigators to collect evidence or solve the crime. This paper presents the VICTORIA Interactive 4D Scene Reconstruction and Analysis Framework (“ISRA-4D” 1.0), an approach for the visual consolidation of heterogeneous video and image data in a 3D reconstruction of the corresponding environment. First, by reconstructing the environment in which the materials were created, a shared spatial context of all available materials is established. Second, all footage is spatially and temporally registered within this 3D reconstruction. Third, a visualization of the hereby created 4D reconstruction (3D scene + time) is provided, which can be analyzed interactively. Additional information on video and image content is also extracted and displayed and can be analyzed with supporting visualizations. The presented approach facilitates the process of filtering, annotating, analyzing, and getting an overview of large amounts of multimedia material. The framework is evaluated using four case studies which demonstrate its broad applicability. Furthermore, the framework allows the user to immerse themselves in the analysis by entering the scenario in virtual reality. This feature is qualitatively evaluated by means of interviews of criminal investigators and outlines potential benefits such as improved spatial understanding and the initiation of new fields of application.
Figures
Figure 1: Processing pipeline of the crime scene analysis framework. Multimedia input data are processed in three main steps: first, a static reconstruction of the crime scene is created using a structure-from-motion approach; second, dynamic elements are extracted as dynamic point clouds; third, tracks of persons and objects are extracted using machine learning models.
Figure 2: 3D reconstruction after being manually geo-registered onto satellite-imagery-based map data.
Figure 3: (Left) Point cloud reconstructed from a stereo camera using classical stereo block matching. (Right) Point cloud reconstructed with the authors' geometrically based monocular depth reconstruction.
Figure 4: (Top left) Input image. (Top right) Result of the authors' method, in which people are segmented and placed upright on the ground. (Bottom left) Resulting depth map using Monodepth2. (Bottom right) Embedded point cloud generated using Monodepth2.
Figure 5: Dynamic objects can be displayed differently in the static 3D reconstruction. (Top) Detected bounding boxes of persons are embedded upright. (Center) The complete depth map of the segmented image is superimposed. (Bottom) People reconstructed with PIFuHD are embedded.
Figure 6: (Left) OpenPose annotation key points; the red silhouette represents the segmented instance boundary when using Mask R-CNN. (Top right) Exemplary OpenPose result on an image with several persons and partial occlusions. (Bottom right) Neural-network-based automatic foreground segmentation of people; this foreground is the dynamic part of the image that has to be placed in the scene as dynamic content.
Figure 7: 3D models of different persons reconstructed from single images using PIFuHD. The reconstruction time was approximately 10 seconds per person, and the size of the image patches varied between 260 × 330 and 440 × 960 pixels. Most models were successfully reconstructed from all sides; only the kneeling man opening a suitcase (lowest resolution) could not be reconstructed from the back.
Figure 8: Frame taken from the feature detection preprocessing procedure. During processing, the original video is played back while detected objects are highlighted.
Figure 9: For each person recognized in a video frame (left), OpenPose is applied for skeleton extraction. The extracted skeletons can later be displayed in the scene as connected points (right).
Figure 10: A camera's extrinsic parameters define its world coordinates in the 3D scene (camera icon). Based on the intrinsic parameters, the pixel coordinates of an object can be transformed into a world position through raycasting: a ray (red line) is emitted through the image at the lower edge of a detection's bounding box (red rectangle in the camera frame), and the intersection of the ray with the mesh provides the related 3D world coordinate.
Figure 11: Main elements of the analysis application.
Figure 12: Multiple data sources are bundled and displayed simultaneously in a shared context. On the left side, three frames from static surveillance cameras are displayed; their locations are indicated by small camera icons in the 3D scene (orange, blue, and red). Detections from all cameras are displayed simultaneously in the scene (dashed lines), as well as static material such as photos (light green) and panoramic images (teal).
Figure 13: The graphical user interface of the presented demonstrator consists of four main parts: a menu at the top, a minimap at the top right, a bottom panel, and the main window as a view of the inspected scene.
Figure 14: Minimap depicting a top-down view of the reconstructed environment. The locations of the cameras recording the investigated incident are displayed as small camera icons (three static cameras: blue, green, and magenta; two moving cameras: red and yellow). The current location of the user is shown as a small dot, with a red frustum indicating the viewing direction (center) and field of view.
Figure 15: 3D scene that can be inspected by flying around in it, which interactively changes the perspective.
Figure 16: Panoramas (left) are displayed as spheres in the scene (center). By opening a sphere, the user "enters" the photosphere to inspect it (right).
Figure 17: Static user annotations can be manually added to the scene.
Figure 18: The camera frustums displayed in the scene and minimap can be customized: either as semi-transparent objects (left) or using additive (top right) or subtractive (bottom right) lighting.
Figure 19: The user can configure the display of moving cameras in the scene. The location of the camera at the currently selected time is highlighted with a red halo; in this example, the camera locations of the last four time steps are also shown with increasing opacity.
Figure 20: A detection can be displayed as (a) a bounding box, (b) the best shot of its track, (c) the corresponding snippet from its frame, (d) a combination of bounding box and best shot or frame snippet, or, if available, (e) its skeleton.
Figure 21: The trajectory of a selected detection is visualized as a directed path within the scene. A menu allows the user to change the displayed title of a detection and to leave notes.
Figure 22: Faces of displayed persons within the 4D reconstruction can be anonymized for privacy reasons. A face detection algorithm detects the bounding boxes of faces (center), which are subsequently blurred in the displayed content throughout the visual analysis (right).
Figure 23: Dynamic point clouds displayed in the static scene from the current perspective (left). As the user navigates through space, the perspective changes and point clouds generated from different cameras can be perceived: (1) the top-right point cloud snippet is seen from the direction indicated by the orange camera, and (2) the bottom-right one from the direction indicated by the blue camera.
Figure 24: To animate annotations, waypoints can be set and arranged on a timeline that temporarily replaces the bottom panel. Waypoints determine the location of an annotation at a particular time.
Figure 25: The bottom panel consists of three elements: at the top, a frame preview of all selected cameras; in the center, the class distributions of the detections visualized as horizon charts; at the bottom, a chart depicting the appearances of all detections as lines.
Figure 26: A heatmap visualization of selected detections can be projected onto the environment, providing an overview of where objects or persons were detected in the analyzed scene.
Figure 27: Interactive tool for measuring distances and object sizes in the reconstruction.
Figure 28: (Left) View of an exemplary scene in VR. (Right) Set-up with an immersed investigator.
Figure 29: (Left) A radial menu can be opened on the right controller to bring up various menus, displayed on the left controller, for configuring the visualized scene. (Right) A minimap and a distance-measuring tool can be activated on demand.
Figure 30: Collaborative setup with multiple monitors connected to the system. One monitor shows the usual view of the 4D reconstruction (left) and the other a view from the simultaneous observer's perspective in VR (right). An avatar of the VR observer is displayed in the desktop interface (left).
Figure 31: Reconstruction of an airport in which multiple surveillance cameras are spatially registered. Video streams of the cameras are fed into the system, and automatically extracted detections are depicted in the 3D reconstruction in real time.
Figure 32: Exemplary reconstruction of the environment for strategy planning in police operations. The demonstrator creates a static mesh from drone recordings; the planned movement of police forces can be sketched in it.
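
The Figure 10 caption describes how detections are lifted from image space into the reconstructed scene by raycasting. The sketch below illustrates that back-projection under the simplifying assumption that the scene geometry is a flat ground plane at z = 0; the actual framework intersects the ray with the reconstructed mesh.

```python
# Hedged sketch of the raycasting step described for Figure 10: a pixel at the
# lower edge of a detection's bounding box is back-projected through the camera
# intrinsics K and pose (R, t) and intersected with the scene. The ground-plane
# intersection here is a simplification assumed only for brevity.
import numpy as np

def pixel_to_world(u: float, v: float,
                   K: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Map pixel (u, v) to a 3D point on the ground plane z = 0.

    K: 3x3 intrinsic matrix; R, t: world-to-camera rotation and translation,
    i.e. x_cam = R @ x_world + t.
    """
    cam_center = -R.T @ t                        # camera position in world coordinates
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R.T @ ray_cam                    # rotate the viewing ray into the world frame
    ray_world /= np.linalg.norm(ray_world)
    # Intersect with the plane z = 0: cam_center.z + s * ray_world.z = 0
    s = -cam_center[2] / ray_world[2]
    return cam_center + s * ray_world
```
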
21 pages, 3366 KiB  
Article
Assessment and Estimation of Face Detection Performance Based on Deep Learning for Forensic Applications
by Deisy Chaves, Eduardo Fidalgo, Enrique Alegre, Rocío Alaiz-Rodríguez, Francisco Jáñez-Martino and George Azzopardi
Sensors 2020, 20(16), 4491; https://doi.org/10.3390/s20164491 - 11 Aug 2020
Cited by 18 | Viewed by 5289
Abstract
Face recognition is a valuable forensic tool for criminal investigators since it certainly helps in identifying individuals in scenarios of criminal activity like fugitives or child sexual abuse. It is, however, a very challenging task as it must be able to handle low-quality images of real world settings and fulfill real time requirements. Deep learning approaches for face detection have proven to be very successful but they require large computation power and processing time. In this work, we evaluate the speed–accuracy tradeoff of three popular deep-learning-based face detectors on the WIDER Face and UFDD data sets in several CPUs and GPUs. We also develop a regression model capable of estimating the performance, both in terms of processing time and accuracy. We expect this to become a very useful tool for the end user in forensic laboratories in order to estimate the performance for different face detection options. Experimental results showed that the best speed–accuracy tradeoff is achieved with images resized to 50% of the original size in GPUs and images resized to 25% of the original size in CPUs. Moreover, performance can be estimated using multiple linear regression models with a Mean Absolute Error (MAE) of 0.113, which is very promising for the forensic field.
Figures
Figure 1: Strategy for predicting the face detection performance for an input image.
Figure 2: Pipeline for detecting faces after resizing.
Figure 3: Precision-recall curves on the WIDER Face data set for the MTCNN, PyramidBox and DSFD face detection methods using four different image resolutions.
Figure 4: Average CPU and GPU computation time (s) on the WIDER Face data set for the MTCNN, PyramidBox and DSFD face detection methods using four different image resolutions.
Figure 5: Detected faces using the MTCNN, PyramidBox and DSFD methods with four image resolutions.
Figure 6: Precision-recall curves on the UFDD data set for the MTCNN, PyramidBox and DSFD face detection methods using four different image resolutions.
Figure 7: Average CPU and GPU computation time (s) on the UFDD data set for the MTCNN, PyramidBox and DSFD face detection methods using four different image resolutions.
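
As a rough illustration of the regression-based estimator mentioned in the abstract, the sketch below fits a multiple linear regression from simple job descriptors (image scale, detector choice, CPU vs. GPU) to a detection-quality target using scikit-learn. The feature set, target and numbers are toy assumptions made for illustration, not the paper's data or exact model.

```python
# Hedged sketch: estimate face-detection performance for a configuration with a
# multiple linear regression, in the spirit of the abstract's estimator.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Toy training data: [scale_factor, detector_id, is_gpu] -> observed accuracy.
# These values are invented purely to make the example runnable.
X = np.array([[1.00, 0, 1], [0.50, 0, 1], [0.25, 0, 0],
              [1.00, 1, 1], [0.50, 1, 0], [0.25, 2, 0]])
y = np.array([0.85, 0.82, 0.61, 0.90, 0.74, 0.55])

model = LinearRegression().fit(X, y)
pred = model.predict(X)
print("MAE on toy data:", mean_absolute_error(y, pred))

# A forensic analyst could then query the fitted model for an untried configuration,
# e.g. images resized to 50% with detector 2 on a GPU:
print("Predicted accuracy:", model.predict([[0.50, 2, 1]])[0])
```
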