CN114637870A - Image data processing method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114637870A
CN114637870A CN202210246863.1A CN202210246863A CN114637870A CN 114637870 A CN114637870 A CN 114637870A CN 202210246863 A CN202210246863 A CN 202210246863A CN 114637870 A CN114637870 A CN 114637870A
Authority
CN
China
Prior art keywords
image
block
original
detected
redundant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210246863.1A
Other languages
Chinese (zh)
Other versions
CN114637870B (en)
Inventor
谭玉娟
肖丹
晏志超
江泓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202210246863.1A
Publication of CN114637870A
Application granted
Publication of CN114637870B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/0007Image acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The invention discloses an image data processing method, device, equipment and storage medium. The method comprises: acquiring an image to be detected and an original image, the original image being composed of a plurality of original blocks; based on an image alignment algorithm, partitioning the image to be detected according to the original image to obtain a plurality of blocks to be detected, and establishing a mapping relation between the blocks to be detected and the original blocks; calculating the hash values of each block to be detected and of its corresponding original block, and comparing the hash values; and, when the comparison result of the hash values meets a preset redundancy condition, determining that the block to be detected is a redundant block and deleting the redundant block. In the embodiment of the invention, the image to be detected is partitioned into blocks, a mapping relation is established between the blocks to be detected and the original blocks of the original image, and redundant blocks are determined and deleted according to the hash value comparison between each block to be detected and its corresponding original block, so that perceptually redundant content is deleted and the storage of image data is optimized.

Description

Image data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a method, an apparatus, a device, and a storage medium for processing image data.
Background
With the development of computer technology, ubiquitous smart mobile devices can capture electronic images anytime and anywhere, and these images spread widely across social media applications. In the course of spreading and sharing, a single original image easily gives rise to a large number of similar images. First, users tend to modify images to meet their expectations or to express their feelings. Second, one image can be compressed by dozens of popular compression algorithms at different compression levels, producing a large number of compressed copies with the same perceptual content but completely different code streams. In addition, mainstream picture editing tools driven by image processing or artificial intelligence algorithms also generate large numbers of highly similar images that share much redundant perceptual content but have completely different bitstreams. Existing deduplication tools, however, operate on bitstreams and cannot efficiently identify perceptually redundant images whose bitstreams differ.
Disclosure of Invention
Embodiments of the present invention provide an image data processing method, apparatus, device, and storage medium. By partitioning an image to be detected into blocks and establishing a mapping relation between the blocks to be detected and the original blocks of an original image, redundant blocks can be determined and deleted according to the hash value comparison between each block to be detected and its corresponding original block, thereby deleting perceptually redundant content and optimizing the storage of image data.
In order to achieve the above object, an embodiment of the present invention provides an image data processing method, including:
acquiring an image to be detected and an original image; the original image is composed of a plurality of original blocks;
based on an image alignment algorithm, partitioning the image to be detected according to the original image to obtain a plurality of blocks to be detected, and establishing a mapping relation between the blocks to be detected and the original blocks;
respectively calculating hash values of the block to be detected and an original block corresponding to the block to be detected, and comparing the hash values;
and when the comparison result of the hash value meets a preset redundancy condition, the block to be detected is a redundant block, and the redundant block is deleted.
As an improvement of the above scheme, the method further comprises the following steps:
when the comparison result of the hash value does not meet the redundancy condition, the block to be detected is a non-redundant block, and the non-redundant block is stored in an image memory; wherein the redundancy condition is:
and the Hamming distance between the hash value of the block to be detected and the hash value of the original block corresponding to the block to be detected is smaller than a preset distance threshold value.
As an improvement of the above solution, the original image is acquired by:
extracting identification information of the image to be detected;
and selecting an image with the identification information same as that of the image to be detected from an image memory as an original image.
As an improvement of the above solution, before the original image is stored in the image memory, the method further includes:
based on a preset image blocking rule, blocking the original image to obtain a plurality of original blocks;
acquiring original identification information;
adding the original identification information to each of the original blocks.
As an improvement of the above solution, the original identification information is generated by: acquiring description information of the original image, and generating original identification information according to the description information; wherein the description information includes at least one of a user, a device, an application, a date, a location, or a resolution associated with the original image generation.
As an improvement of the above scheme, before the deleting the redundant block, the method further includes:
extracting the image information of the redundant block and extracting the image information of an original block corresponding to the redundant block;
obtaining the image operation information of the redundant block according to the image information of the redundant block and the image information of the original block corresponding to the redundant block, and storing the image operation information in an image description memory;
the saving the non-redundant block to the image memory specifically includes:
adding the image content near the non-redundant block into the non-redundant block, and storing the non-redundant block added with the image content into the image memory.
As an improvement of the above scheme, after the deleting the redundant block, the method further includes:
responding to an image recovery instruction, and generating a redundant image according to the image operation information and the original image;
and recovering the image to be detected according to the redundant image and the non-redundant block.
In order to achieve the above object, an embodiment of the present invention further provides an image data processing apparatus, including:
the image acquisition module is used for acquiring an image to be detected and an original image; the original image is composed of a plurality of original blocks;
the image blocking module is used for blocking the image to be detected according to the original image based on an image alignment algorithm to obtain a plurality of blocks to be detected and establishing a mapping relation between the blocks to be detected and the original blocks;
the image comparison module is used for respectively calculating the hash values of the block to be detected and the original block corresponding to the block to be detected and comparing the hash values;
and the image deleting module is used for determining that the block to be detected is a redundant block when the comparison result of the hash value meets the preset redundancy condition, and for deleting the redundant block.
To achieve the above object, an embodiment of the present invention further provides an image data processing device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, and the processor implements the image data processing method according to any of the above embodiments when executing the computer program.
To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, and when the processor executes the computer program, the processor implements the image data processing method according to any one of the above embodiments.
Compared with the prior art, the image data processing method, apparatus, device and computer-readable storage medium disclosed by the embodiments of the invention work as follows. First, an image to be detected and an original image are acquired, the original image being composed of a plurality of original blocks. Second, based on an image alignment algorithm, the image to be detected is partitioned according to the original image to obtain a plurality of blocks to be detected, and a mapping relation between the blocks to be detected and the original blocks is established. Then, the hash values of each block to be detected and of its corresponding original block are calculated and compared. Finally, when the comparison result of the hash values meets a preset redundancy condition, the block to be detected is a redundant block, and the redundant block is deleted. By partitioning the image to be detected and establishing the mapping relation between the blocks to be detected and the original blocks of the original image, the embodiments can determine and delete redundant blocks according to the hash value comparison between each block to be detected and its corresponding original block, so that perceptually redundant content is deleted and the storage of image data is optimized.
Drawings
Fig. 1 is a flowchart of an image data processing method according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of an original image and an image to be detected according to an embodiment of the present invention;
Fig. 3 is a histogram of image storage space requirements according to an embodiment of the present invention;
Fig. 4 is a block diagram of an image data processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, which is a flowchart of an image data processing method according to an embodiment of the present invention, the image data processing method includes steps S1 to S4:
S1, acquiring an image to be detected and an original image; the original image is composed of a plurality of original blocks;
S2, based on an image alignment algorithm, partitioning the image to be detected according to the original image to obtain a plurality of blocks to be detected, and establishing a mapping relation between the blocks to be detected and the original blocks;
S3, respectively calculating hash values of the block to be detected and the original block corresponding to the block to be detected, and comparing the hash values;
S4, when the comparison result of the hash values meets the preset redundancy condition, the block to be detected is a redundant block, and the redundant block is deleted.
It should be noted that the image data processing method may be executed by a user side, and the user side may be a user terminal device such as a computer, a mobile phone, a tablet, and the like.
In step S1, the image to be detected and the original image may be acquired, for example, in response to an image deduplication instruction. The instruction may be input by a user through a keyboard, a mouse or a touch screen, which is not limited herein. The instruction may also be stored in the user terminal in advance: it may be triggered when the generation or receipt of a new image is detected, or a deduplication program may be scheduled in advance so that the deduplication operation is performed at preset intervals.
It is worth mentioning that, in step S1, the image to be detected is obtained by performing an image modification operation on the original image, and the original image is composed of a plurality of original blocks. Therefore, in step S2, an image alignment algorithm is applied. For example, features are extracted from the image to be detected and from the original image and then matched, and an alignment matrix is solved using the RANSAC (random sample consensus) algorithm. Combined with the fact that the original image is composed of a plurality of original blocks, the image to be detected is then partitioned into a plurality of blocks to be detected, and the mapping relation between the blocks to be detected and the original blocks is established.
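The block-mapping part of step S2 can be illustrated with a minimal sketch. It assumes the alignment has already been solved and reduces to a pure translation (as in a crop), and that the original image uses a fixed-size square block grid. The function name `map_blocks` and its parameters are hypothetical and do not come from the patent; a general alignment matrix would need a full projective mapping instead.

```python
from typing import Dict, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def map_blocks(orig_size: Tuple[int, int], block: int,
               offset: Tuple[int, int],
               test_size: Tuple[int, int]) -> Dict[Tuple[int, int], Box]:
    """Project the original block grid into the image to be detected.

    orig_size: (width, height) of the original image.
    block:     side length of each square original block.
    offset:    (dx, dy) position of the test image's top-left corner in
               original-image coordinates (assumed translation-only alignment).
    test_size: (width, height) of the image to be detected.

    Returns {(row, col) of original block: box in test-image coordinates}
    for every original block that lies fully inside the test image.
    """
    orig_w, orig_h = orig_size
    dx, dy = offset
    test_w, test_h = test_size
    mapping: Dict[Tuple[int, int], Box] = {}
    for row in range(orig_h // block):
        for col in range(orig_w // block):
            left, top = col * block - dx, row * block - dy
            right, bottom = left + block, top + block
            if left >= 0 and top >= 0 and right <= test_w and bottom <= test_h:
                mapping[(row, col)] = (left, top, right, bottom)
    return mapping
```

Each key identifies an original block and each value locates the corresponding block to be detected, which gives the one-to-one mapping that the later hash comparison relies on.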
In step S3, because the image has been partitioned into blocks, the whole image does not need to be analyzed as a single duplicate-data object; instead, a smaller unit of analysis, the block to be detected, is used. A perceptual hash algorithm is used to calculate the hash values of the blocks to be detected and of the original blocks, and, based on the mapping relation between them, the hash value of each block to be detected is compared with that of its corresponding original block. Illustratively, the hash values of the blocks to be detected and of the original blocks are calculated using discrete-cosine-transform-based perceptual hashing (pHash), and the Hamming distance between the hash value of each block to be detected and the hash value of its corresponding original block is calculated.
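The hash computation in step S3 can be sketched as follows. This is a simplified DCT-based perceptual hash written in pure Python for illustration only; the patent does not specify the hash length, the block size, or the thresholding rule, so the 8 x 8 low-frequency window, the median threshold, and the names `dct_2d`, `phash` and `hamming` are all assumptions.

```python
import math
from typing import List


def dct_2d(pixels: List[List[float]], size: int = 8) -> List[List[float]]:
    """Return the size x size lowest-frequency (unnormalized) DCT-II
    coefficients of a square grayscale block with side length >= size."""
    n = len(pixels)
    coeffs = []
    for u in range(size):
        row = []
        for v in range(size):
            total = 0.0
            for x in range(n):
                cos_x = math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for y in range(n):
                    cos_y = math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    total += pixels[x][y] * cos_x * cos_y
            row.append(total)
        coeffs.append(row)
    return coeffs


def phash(pixels: List[List[float]], size: int = 8) -> int:
    """Pack one bit per low-frequency AC coefficient: 1 if above the median."""
    flat = [c for row in dct_2d(pixels, size) for c in row]
    ac = flat[1:]  # drop the DC term so a uniform brightness shift is ignored
    median = sorted(ac)[len(ac) // 2]
    bits = 0
    for c in ac:
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash values differ."""
    return bin(a ^ b).count("1")
```

A small Hamming distance between the two hashes indicates that the block to be detected and its original block are perceptually similar even when their bitstreams differ.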
In step S4, specifically, the comparison result of the hash value of each block to be detected and the corresponding original block is analyzed, and whether each comparison result satisfies the predetermined redundancy condition is determined, and when the comparison result satisfies the predetermined redundancy condition, it indicates that the block to be detected and the corresponding original block are similar blocks, that is, the block to be detected is a redundant block, so that the redundant block is deleted to achieve the purpose of optimizing the storage space.
Compared with the prior art, the embodiment of the invention aligns the acquired image to be detected with the original image based on the image alignment algorithm; partitions the image to be detected into blocks according to how the original image is partitioned; and establishes a one-to-one mapping relation between the original blocks of the original image and the blocks to be detected. The hash values of each block to be detected and its original block are compared based on this mapping relation, and redundant blocks are determined from the comparison result and deleted. Perceptually redundant content is thus deleted at the block level, which reduces the computation and latency incurred when the whole image to be detected is used as the redundancy detection object. At the same time, redundant parts shared by two images whose overall similarity is low can still be deleted effectively, which reduces wasted storage space and optimizes the storage of image data.
In one embodiment, the method further includes step S5:
S5, when the comparison result of the hash values does not meet the redundancy condition, the block to be detected is a non-redundant block, and the non-redundant block is stored in an image memory; wherein the redundancy condition is:
and the Hamming distance between the hash value of the block to be detected and the hash value of the original block corresponding to the block to be detected is smaller than a preset distance threshold value.
Exemplarily, assume that the preset distance threshold is 5, the image to be detected comprises n blocks to be detected, the original image comprises n original blocks, and the blocks to be detected correspond to the original blocks one to one. If the Hamming distance between the hash value of the i-th block to be detected and that of its corresponding original block is 4, which is smaller than the preset distance threshold of 5, the redundancy condition is satisfied: the i-th block to be detected is similar to its original block, is therefore a redundant block, and is deleted. If the Hamming distance between the hash value of the j-th block to be detected and that of its corresponding original block is 10, which is greater than the preset distance threshold of 5, the redundancy condition is not satisfied: the j-th block to be detected is not similar to its original block, is therefore a non-redundant block, and is stored in the image memory. Through this redundancy detection, redundant blocks are deleted and non-redundant blocks are stored, which optimizes the storage of image data.
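The redundancy condition in this example can be written as a short sketch. The function name `classify_blocks` and the dictionary layout are hypothetical; the threshold of 5 and the strict "smaller than" comparison follow the text above.

```python
from typing import Dict, List, Tuple


def classify_blocks(test_hashes: Dict[str, int],
                    orig_hashes: Dict[str, int],
                    threshold: int = 5) -> Tuple[List[str], List[str]]:
    """Split blocks into (redundant, non_redundant) lists.

    Both dictionaries are keyed by the same block identifier, which stands in
    for the mapping between a block to be detected and its original block.
    """
    redundant, non_redundant = [], []
    for block_id, test_hash in test_hashes.items():
        distance = bin(test_hash ^ orig_hashes[block_id]).count("1")
        if distance < threshold:       # strictly smaller than the threshold
            redundant.append(block_id)
        else:
            non_redundant.append(block_id)
    return redundant, non_redundant
```

With a threshold of 5, a block at distance 4 is classified as redundant, while blocks at distance 5 or 10 are kept as non-redundant.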
In one embodiment, the original image is obtained by:
S11, extracting the identification information of the image to be detected;
S12, selecting an image with the identification information being the same as that of the image to be detected from the image memory as an original image.
Specifically, since the image to be detected is obtained by modifying the original image, when the original image carries identification information, the image to be detected must carry the same identification information. To obtain the original image corresponding to the image to be detected accurately and quickly, the image memory is queried with the identification information extracted from the image to be detected, and the stored image that has the same identification information is taken as the original image.
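The lookup in steps S11 and S12 amounts to an index keyed by identification information. The sketch below is a minimal in-memory stand-in for the image memory; the class `ImageStore` and its method names are hypothetical, and extracting the identification information from an image is assumed to have been done elsewhere.

```python
from typing import Dict, Optional


class ImageStore:
    """Toy image memory that indexes original images by identification info."""

    def __init__(self) -> None:
        self._by_id: Dict[str, object] = {}

    def add_original(self, identification: str, image: object) -> None:
        # Keep the first image stored under an identifier as the original.
        self._by_id.setdefault(identification, image)

    def find_original(self, identification: str) -> Optional[object]:
        # Returns None when no image with the same identification exists.
        return self._by_id.get(identification)
```

When `find_original` returns `None`, the incoming image has no stored original and would itself be treated as a new original image.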
In one embodiment, before the original image is stored in the image memory, the method further includes steps S01 to S03:
S01, based on a preset image blocking rule, blocking the original image to obtain a plurality of original blocks;
S02, acquiring original identification information;
and S03, adding the original identification information into each original block.
Illustratively, assume that the identification information is a robust digital watermark and that the original identification information is the steganographic watermark metadata "User ID @ Device ID @ San Jose, CA, US 1:23:45am 01/01/2020 3024 x 4032". Considering factors such as blocking complexity and computational overhead, the original image is divided into a plurality of fixed-size blocks when it is generated. Because of the conjugate symmetry of the Fourier transform, the spectrum is centrally symmetric in the frequency domain, so the metadata is encoded as a mirrored bitmap image. The metadata image and the original block are then transformed into a specific feature space, such as the cosine transform domain, the Fourier transform domain or the wavelet transform domain, the two signals are added in that feature space, and the result is finally transformed back to the original domain, yielding the original image with the steganographic watermark. Further, to prevent malicious software from forging watermark information in order to forge an image, the metadata is encrypted before the mirrored bitmap image is generated, and it is decrypted after the watermark is extracted. The encryption scheme may be a Caesar cipher, which hides the plaintext image description information by replacing each character of the metadata with another printable character. The overhead of this encryption scheme is negligible, because only a few shuffle instructions are added. It should be noted that other encryption schemes may also be used, which is not limited herein.
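The character-substitution encryption described above can be sketched as a Caesar-style shift over the printable ASCII range. The function name `caesar`, the shift value, and the choice of the 95-character printable range (codes 32 to 126) are assumptions made for illustration; the patent does not fix these details.

```python
FIRST_PRINTABLE = 32   # space
LAST_PRINTABLE = 126   # tilde
SPAN = LAST_PRINTABLE - FIRST_PRINTABLE + 1  # 95 printable characters


def caesar(text: str, shift: int) -> str:
    """Replace each printable ASCII character with another printable one.

    Characters outside the printable ASCII range are left unchanged.
    Decryption is the same call with the negated shift.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if FIRST_PRINTABLE <= code <= LAST_PRINTABLE:
            code = FIRST_PRINTABLE + (code - FIRST_PRINTABLE + shift) % SPAN
        out.append(chr(code))
    return "".join(out)
```

Encrypting the metadata with `caesar(metadata, 7)` before it is rendered as a bitmap, and calling `caesar(extracted, -7)` after watermark extraction, recovers the plaintext description.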
Preferably, the process of blocking the original image and embedding the identification information is combined with the generation of the image, for example, by adding this function to the firmware of the digital camera or to a specific application program, so that the blocking and identification information addition can be performed at the time of the generation of the original image.
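The blocking of steps S01 to S03 can be sketched as below. For simplicity, the sketch attaches the identification information to each block as a plain tag instead of embedding it as a steganographic watermark; the function name `split_into_blocks` and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def split_into_blocks(pixels: List[List[int]], block: int,
                      identification: str) -> List[Dict]:
    """Cut a 2-D grayscale image into fixed-size blocks and tag each block.

    Blocks on the right and bottom edges may be smaller when the image size
    is not a multiple of the block size.
    """
    height, width = len(pixels), len(pixels[0])
    blocks = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            tile = [row[left:left + block] for row in pixels[top:top + block]]
            blocks.append({
                "row": top // block,
                "col": left // block,
                "pixels": tile,
                "id": identification,  # stands in for the embedded watermark
            })
    return blocks
```

Performing this step at image generation time means every original block already carries the identification information when it first reaches the image memory.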
In one embodiment, the original identification information is generated by: obtaining description information of the original image, and generating original identification information according to the description information; wherein the description information includes at least one of a user, a device, an application, a date, a location, or a resolution associated with the original image generation.
Illustratively, the original identification information is a steganographic watermark, namely a bitmap image composed of description information related to image generation, such as the user, device or application, date, location and resolution. Assuming that the steganographic watermark metadata is "User ID @ Device ID @ San Jose, CA, US 1:23:45am 01/01/2020 3024 x 4032", the description information includes the user ID, the device ID, the location (San Jose, CA, USA), the date and time (1:23:45 am on January 1, 2020), and the resolution (3024 x 4032).
In one embodiment, before the deleting the redundant block in step S4, the method further includes S41 to S42:
S41, extracting the image information of the redundant block, and extracting the image information of the original block corresponding to the redundant block;
S42, obtaining image operation information of the redundant block according to the image information of the redundant block and the image information of the original block corresponding to the redundant block, and storing the image operation information in an image description memory;
the saving the non-redundant block to the image memory in step S5 specifically includes:
adding the image content near the non-redundant block into the non-redundant block, and storing the non-redundant block added with the image content into the image memory.
Illustratively, referring to Fig. 2, the image B to be detected is obtained by cropping the original image A. Image A is composed of original blocks a1 to a24. After alignment with image A, image B is partitioned into blocks to be detected, and each block to be detected is mapped to its corresponding original block. Comparing the hash values identifies redundant blocks in image B, including b6, b7, b10, b11, b14 and b15. Because these redundant blocks are identical to their corresponding original blocks, the image operation information is a simple replacement operation. The image operation information is stored in the image description memory, and a deleted image file can be restored by calling the corresponding image operation information. To facilitate subsequent image restoration, the redundant content near each non-redundant block is stored together with the block, which helps the blocks to be reassembled into the corresponding image during restoration.
It is worth noting that the amount of redundant content added to a non-redundant block can be set according to actual requirements. The modification applied to the original image is not limited to cropping; it may also be scaling, rotation, adding a filter, adding text, changing colors, and so on, which is not limited herein. In these cases the image operation information is no longer a simple replacement operation, and it may be divided into block-level description information and file-level description information. For example, assuming that the brightness of redundant block x is 10 and the brightness of its corresponding original block y is 50, the image operation information includes "reduce the brightness of original block y by 40 to obtain redundant block x"; this is block-level description information. As another example, assume that the image to be detected comprises three blocks a1, b1 and c1 with brightness 1, 2 and 3, all of which are redundant blocks, and that the original image comprises three original blocks a2, b2 and c2 with brightness 4, 5 and 6, where a1 corresponds to a2, b1 to b2, and c1 to c2. Calculation shows that reducing the brightness of a2 by 3 gives a1, reducing the brightness of b2 by 3 gives b1, and reducing the brightness of c2 by 3 gives c1; that is, the image to be detected is obtained by reducing the brightness of the whole original image by 3. The image operation information then includes "reduce the brightness of the original image by 3 to obtain the image to be detected"; this is file-level description information.
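The brightness examples above can be expressed as a small sketch that derives block-level operation information and collapses it to file-level information when every block shares the same change. The function name `brightness_operations` and the returned dictionary layout are hypothetical, and brightness is the only operation modeled.

```python
from typing import Dict


def brightness_operations(test_brightness: Dict[str, int],
                          orig_brightness: Dict[str, int],
                          mapping: Dict[str, str]) -> Dict:
    """Describe how each redundant block is obtained from its original block.

    mapping: {redundant block name: corresponding original block name}.
    Returns file-level information when all blocks share one brightness
    change, and block-level information otherwise.
    """
    block_ops = {test: test_brightness[test] - orig_brightness[orig]
                 for test, orig in mapping.items()}
    deltas = set(block_ops.values())
    if len(deltas) == 1:
        return {"level": "file", "delta": deltas.pop()}
    return {"level": "block", "deltas": block_ops}
```

For the three-block example, every block has a change of -3, so a single file-level record replaces three block-level records in the image description memory.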
In one embodiment, after the deleting the redundant block in the step S4, the method further includes steps S6 to S7:
S6, responding to an image restoration instruction, and generating a redundant image according to the image operation information and the original image;
and S7, restoring the image to be detected according to the redundant image and the non-redundant block.
Specifically, image restoration refers to the process of restoring a perceptually equivalent version of an image that was replaced by a redundant copy during the image deduplication phase. Restoring an image from the information in the image memory and the image description memory generally comprises three stages: conversion, stitching and enhancement. The conversion stage applies a series of operations to recover the blocks; if the image was deduplicated at the file level, it can bypass this stage and go directly to stitching. In the stitching stage, all the blocks can be stitched quickly into a candidate image by using the extra redundant content stored around the boundary of each non-redundant block. Finally, in the enhancement stage, a super-resolution restoration model is used to further improve the quality of the candidate image, particularly at block boundaries. The loss of image quality comes from the embedded steganographic watermark, but this loss is not perceptually noticeable. A super-resolution optimization method is used to generate an optimal interpolation that reduces the mean squared error, so that the PSNR (peak signal-to-noise ratio) of the restored image is almost the same as that of the image with the embedded steganographic watermark.
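The conversion and stitching stages can be sketched as follows. The sketch assumes that blocks sit on a regular grid with identical positions in both images, that the only stored operation is a per-block brightness change, and that blocks in one grid row share the same height; the enhancement stage with a super-resolution model is omitted. The function name `restore_image` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

Pos = Tuple[int, int]          # (row, col) of a block in the grid
Tile = List[List[int]]         # 2-D list of grayscale pixel values


def restore_image(orig_blocks: Dict[Pos, Tile],
                  operations: Dict[Pos, int],
                  stored_blocks: Dict[Pos, Tile],
                  grid: Tuple[int, int]) -> Tile:
    """Rebuild an image from original blocks, operations and stored blocks.

    operations:    brightness change for each deleted redundant block.
    stored_blocks: non-redundant blocks kept in the image memory.
    """
    rows, cols = grid
    tiles: Dict[Pos, Tile] = {}
    for row in range(rows):
        for col in range(cols):
            pos = (row, col)
            if pos in stored_blocks:
                tiles[pos] = stored_blocks[pos]
            else:
                # Conversion: regenerate the deleted block from its original.
                delta = operations[pos]
                tiles[pos] = [[min(255, max(0, p + delta)) for p in line]
                              for line in orig_blocks[pos]]
    # Stitching: concatenate tiles row by row into one image.
    image: Tile = []
    for row in range(rows):
        for y in range(len(tiles[(row, 0)])):
            line: List[int] = []
            for col in range(cols):
                line.extend(tiles[(row, col)][y])
            image.append(line)
    return image
```

Redundant blocks are regenerated from the original image, while non-redundant blocks are read back from storage, which mirrors steps S6 and S7.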
Compared with the prior art, the embodiment of the invention deletes perceptually redundant content at the block level, which reduces the computation and latency relative to detecting redundancy directly with the whole image to be detected as the detection object, and can effectively delete the redundant parts of two images whose overall similarity is low, thereby reducing wasted storage space and optimizing image data storage.
In order to better illustrate the advantages of the image processing method according to the embodiment of the present invention, the image processing results of the practical application example are compared as follows.
In a practical application example, the original images in the image database form a data set of 1.3 GB, and the copies total 362.5 GB, of which the repeated copies account for 114.2 GB. After the distinguishable copies are filtered out by a similarity-based deduplication (SIM-dedup) method, 22.8 GB remains, whereas the image data processing method of the embodiment of the present invention further reduces the data to 6.7 GB, saving 70.6% of the storage space compared with SIM-dedup. The redundant image magnification factor, defined as the size of the actual image data divided by the size of the original image data, is reduced from 17.5 (22.8 ÷ 1.3 = 17.5) for SIM-dedup to 5.2 (6.7 ÷ 1.3 = 5.2). In the image storage space bar chart of fig. 3, the vertical axis is the size of the storage space, and the horizontal axis lists the compared quantities from left to right, beginning with the size of all copies in the data set. As can be seen from fig. 3, the embodiment of the present invention can effectively delete the redundant portions of the images, reduce wasted storage space, and lower the storage space requirement.
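The storage figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the paragraph; the variable names are illustrative.

```python
original_gb = 1.3    # size of the original image data set
sim_dedup_gb = 22.8  # data remaining after SIM-dedup
method_gb = 6.7      # data remaining after the method of this embodiment

# Storage saved relative to SIM-dedup, in percent.
saving_pct = (sim_dedup_gb - method_gb) / sim_dedup_gb * 100

# Redundant image magnification factor: stored size / original size.
sim_factor = sim_dedup_gb / original_gb
method_factor = method_gb / original_gb

print(f"saving vs SIM-dedup: {saving_pct:.1f}%")
print(f"magnification factor: {sim_factor:.1f} -> {method_factor:.1f}")
```

Rounded to one decimal place, these values agree with the 70.6% saving and the factors of 17.5 and 5.2 given in the text.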
In terms of execution time, SIM-dedup spends most of its time extracting high-dimensional feature vectors based on VGG16, while WM-dedup takes much less time for feature extraction (watermark extraction). The present scheme spends most of its time on image restoration: it recombines the image from perceptually equivalent blocks and applies super-resolution optimization on all boundaries between any two adjacent blocks to restore the deduplicated image as faithfully as possible. The restoration operation of SIM-dedup is very simple, and its delay mainly comes from selecting and loading a near-duplicate image, but its image restoration quality is poor. Specifically, feature extraction in the present method is 5.7 times as fast as in SIM-dedup, and deduplication is 4.5 times as fast. Although the block-level deduplication used by WM-dedup increases the number of queries, each query takes only microseconds, so system performance is not noticeably affected. Although the method takes much more time than SIM-dedup in the image restoration stage, its overall execution time is still nearly 60% shorter than that of SIM-dedup. The detailed execution time of each stage is compared in the following table.
Method | Feature extraction | Image hashing | Query (searching for the original image) | Deduplication | Restoration
SIM-dedup | 482 ms | 4 ms | 25 µs/file | 502 ms | 8 ms
This method (WM-dedup) | 84 ms | 4 ms | 12 µs/block | 112 ms | 195 ms
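The speed-up and overall-time claims can be reproduced from the table values. The sketch below is a plain arithmetic check; it leaves out the query column because that column is reported per file or per block in microseconds and the number of queries is not stated.

```python
# Per-stage times in milliseconds, copied from the table (query column excluded).
sim_dedup = {"feature_extraction": 482, "image_hashing": 4,
             "deduplication": 502, "restoration": 8}
this_method = {"feature_extraction": 84, "image_hashing": 4,
               "deduplication": 112, "restoration": 195}

feature_speedup = sim_dedup["feature_extraction"] / this_method["feature_extraction"]
dedup_speedup = sim_dedup["deduplication"] / this_method["deduplication"]

sim_total = sum(sim_dedup.values())
method_total = sum(this_method.values())
reduction_pct = (1 - method_total / sim_total) * 100

print(f"feature extraction speed-up: {feature_speedup:.1f}x")
print(f"deduplication speed-up: {dedup_speedup:.1f}x")
print(f"total: {sim_total} ms -> {method_total} ms ({reduction_pct:.1f}% shorter)")
```

With the query time ignored, the totals are 996 ms and 395 ms, which is in line with the reduction of roughly 60% stated in the text.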
Referring to fig. 4, an embodiment of the present invention further provides an image data processing apparatus, including:
an image acquisition module 11, configured to acquire an image to be detected and an original image, where the original image is composed of a plurality of original blocks;
an image blocking module 12, configured to block the image to be detected according to the original image based on an image alignment algorithm to obtain a plurality of blocks to be detected, and to establish a mapping relationship between the blocks to be detected and the original blocks;
an image comparison module 13, configured to calculate hash values of the block to be detected and of the original block corresponding to the block to be detected, respectively, and to compare the hash values;
and an image deleting module 14, configured to determine that the block to be detected is a redundant block when the comparison result of the hash values meets a preset redundancy condition, and to delete the redundant block.
It should be noted that, for a specific working process of the image data processing apparatus, reference may be made to the working process of the image data processing method in the foregoing embodiment, and details are not repeated here.
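The cooperation of the comparison and deletion modules can be sketched as follows. The redundancy condition used here is the one stated in claim 2 (the Hamming distance between the two hash values is smaller than a preset threshold). The hash function is an assumption: a simple average hash stands in for whichever image hash the embodiment actually uses, and the function names are hypothetical.

```python
def average_hash(block):
    """Assumed stand-in hash: one bit per pixel, set when the pixel >= block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two integer hash values."""
    return bin(a ^ b).count("1")


def split_blocks(detected_blocks, original_blocks, mapping, threshold):
    """Classify each block to be detected as redundant or non-redundant."""
    redundant, non_redundant = [], []
    for name, block in detected_blocks.items():
        original = original_blocks[mapping[name]]
        distance = hamming_distance(average_hash(block), average_hash(original))
        # Redundancy condition: distance smaller than the preset threshold.
        (redundant if distance < threshold else non_redundant).append(name)
    return redundant, non_redundant


original_blocks = {"o1": [[10, 200], [10, 200]], "o2": [[10, 200], [10, 200]]}
detected_blocks = {
    "d1": [[5, 160], [5, 160]],    # darker version of o1, same structure
    "d2": [[200, 10], [200, 10]],  # different content from o2
}
redundant, non_redundant = split_blocks(
    detected_blocks, original_blocks, {"d1": "o1", "d2": "o2"}, threshold=2
)
```

In this toy example `d1` is classified as redundant and would be deleted, while `d2` is non-redundant and would be stored.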
Compared with the prior art, the apparatus provided by the embodiment of the invention aligns the acquired image to be detected with the original image based on an image alignment algorithm; blocks the image to be detected according to the blocking of the original image to obtain the blocks to be detected, and establishes a one-to-one mapping relationship between the original blocks of the original image and the blocks to be detected of the image to be detected; and compares the hash values of each block to be detected and its original block based on the mapping relationship, determining the redundant blocks from the comparison result so that they can be deleted. Perceptually redundant content is thereby deleted at the block level, which reduces the computation and latency relative to detecting redundancy directly with the whole image to be detected as the detection object; meanwhile, the redundant parts of two images whose overall similarity is low can be effectively deleted, which reduces wasted storage space and optimizes image data storage.
Embodiments of the present invention further provide an image data processing apparatus, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor, when executing the computer program, implements the steps in the above image data processing method embodiments, such as steps S1 to S4 described in fig. 1; alternatively, the processor, when executing the computer program, implements the functions of the modules in the above apparatus embodiments, for example, the image acquisition module 11.
Illustratively, the computer program may be partitioned into one or more modules, stored in the memory and executed by the processor, to implement the invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program in the image data processing apparatus. For example, the computer program may be divided into a plurality of modules, each module having the following specific functions:
an image acquisition module 11, configured to acquire an image to be detected and an original image, where the original image is composed of a plurality of original blocks;
an image blocking module 12, configured to block the image to be detected according to the original image based on an image alignment algorithm to obtain a plurality of blocks to be detected, and to establish a mapping relationship between the blocks to be detected and the original blocks;
an image comparison module 13, configured to calculate hash values of the block to be detected and of the original block corresponding to the block to be detected, respectively, and to compare the hash values;
and an image deleting module 14, configured to determine that the block to be detected is a redundant block when the comparison result of the hash values meets a preset redundancy condition, and to delete the redundant block.
The specific working process of each module may refer to the working process of the image data processing apparatus described in the above embodiment, and is not described herein again.
The image data processing apparatus may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The image data processing apparatus may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of an image data processing apparatus and does not constitute a limitation thereof; the apparatus may include more or fewer components than those shown, combine some components, or use different components. For example, the image data processing apparatus may also include an input-output device, a network access device, a bus, and the like.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the image data processing apparatus and connects the parts of the entire image data processing apparatus through various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor implements various functions of the image data processing apparatus by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the image data processing apparatus, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.
The modules integrated in the image data processing apparatus may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as a separate product. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. An image data processing method characterized by comprising:
acquiring an image to be detected and an original image; the original image is composed of a plurality of original blocks;
based on an image alignment algorithm, blocking the image to be detected according to the original image to obtain a plurality of blocks to be detected, and establishing a mapping relation between the blocks to be detected and the original blocks;
respectively calculating hash values of the block to be detected and of the original block corresponding to the block to be detected, and comparing the hash values;
and when the comparison result of the hash values meets a preset redundancy condition, determining that the block to be detected is a redundant block, and deleting the redundant block.
2. The image data processing method according to claim 1, further comprising:
when the comparison result of the hash values does not meet the redundancy condition, determining that the block to be detected is a non-redundant block, and storing the non-redundant block in an image memory; wherein the redundancy condition is:
and the Hamming distance between the hash value of the block to be detected and the hash value of the original block corresponding to the block to be detected is smaller than a preset distance threshold value.
3. The image data processing method according to claim 1, wherein the original image is obtained by:
extracting identification information of the image to be detected;
and selecting an image with the identification information same as that of the image to be detected from an image memory as an original image.
4. The image data processing method of claim 3, before the storing of the original image in the image memory, further comprising:
based on a preset image blocking rule, blocking the original image to obtain a plurality of original blocks;
acquiring original identification information;
adding the original identification information to each of the original blocks.
5. The image data processing method according to claim 4, wherein the original identification information is generated by: obtaining description information of the original image, and generating original identification information according to the description information; wherein the description information includes at least one of a user, a device, an application, a date, a location, or a resolution associated with the original image generation.
6. The image data processing method according to claim 2, further comprising, before said deleting said redundant block:
extracting the image information of the redundant block and extracting the image information of an original block corresponding to the redundant block;
obtaining the image operation information of the redundant block according to the image information of the redundant block and the image information of the original block corresponding to the redundant block, and storing the image operation information in an image description memory;
the saving the non-redundant block to the image memory specifically includes:
adding the image content near the non-redundant block into the non-redundant block, and storing the non-redundant block added with the image content into the image memory.
7. The image data processing method according to claim 6, further comprising, after said deleting said redundant block:
responding to an image recovery instruction, and generating a redundant image according to the image operation information and the original image;
and recovering the image to be detected according to the redundant image and the non-redundant block.
8. An image data processing apparatus characterized by comprising:
the image acquisition module is used for acquiring an image to be detected and an original image; the original image is composed of a plurality of original blocks;
the image blocking module is used for blocking the image to be detected according to the original image based on an image alignment algorithm to obtain a plurality of blocks to be detected and establishing a mapping relation between the blocks to be detected and the original blocks;
the image comparison module is used for respectively calculating the hash values of the block to be detected and the original block corresponding to the block to be detected and comparing the hash values;
and the image deleting module is used for determining that the block to be detected is a redundant block when the comparison result of the hash values meets a preset redundancy condition, and for deleting the redundant block.
9. An image data processing apparatus comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the image data processing method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the image data processing method according to any one of claims 1 to 7 when executing the computer program.
CN202210246863.1A 2022-03-14 2022-03-14 Image data processing method, device, equipment and storage medium Active CN114637870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210246863.1A CN114637870B (en) 2022-03-14 2022-03-14 Image data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114637870A true CN114637870A (en) 2022-06-17
CN114637870B CN114637870B (en) 2023-03-24

Family

ID=81948867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210246863.1A Active CN114637870B (en) 2022-03-14 2022-03-14 Image data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114637870B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114817230A (en) * 2022-06-29 2022-07-29 深圳市乐易网络股份有限公司 Data stream filtering method and system
CN117372933A (en) * 2023-12-06 2024-01-09 南京智绘星图信息科技有限公司 Image redundancy removing method and device and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809732A (en) * 2015-05-07 2015-07-29 山东鲁能智能技术有限公司 Electrical equipment appearance abnormity detection method based on image comparison
CN105095903A (en) * 2015-07-16 2015-11-25 努比亚技术有限公司 Electronic equipment and image processing method
CN106155592A (en) * 2016-07-26 2016-11-23 深圳天珑无线科技有限公司 A kind of photo processing method and terminal
US20170161271A1 (en) * 2015-12-04 2017-06-08 Intel Corporation Hybrid nearest neighbor search tree with hashing table
CN107566826A (en) * 2017-01-12 2018-01-09 北京大学 The method of testing and device of grating image processor
CN110297680A (en) * 2019-06-03 2019-10-01 北京星网锐捷网络技术有限公司 A kind of method and device of transfer of virtual desktop picture
CN112200740A (en) * 2020-10-08 2021-01-08 华中科技大学 Image blocking and de-duplication method and system based on image edge detection
CN112261388A (en) * 2020-09-07 2021-01-22 中国电影器材有限责任公司 Redundancy recovery method, device and system for satellite transmission digital film packet
CN113516601A (en) * 2021-06-17 2021-10-19 西南大学 Image restoration technology based on deep convolutional neural network and compressed sensing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUJUAN TAN 等: "SAM: A Semantic-Aware Multi-tiered Source De-duplication Framework for Cloud Backup", 《2010 39TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING》 *
江小平 等: "基于分块DCT的图像去重算法", 《中南民族大学学报(自然科学版)》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114817230A (en) * 2022-06-29 2022-07-29 深圳市乐易网络股份有限公司 Data stream filtering method and system
CN117372933A (en) * 2023-12-06 2024-01-09 南京智绘星图信息科技有限公司 Image redundancy removing method and device and electronic equipment
CN117372933B (en) * 2023-12-06 2024-02-20 南京智绘星图信息科技有限公司 Image redundancy removing method and device and electronic equipment

Also Published As

Publication number Publication date
CN114637870B (en) 2023-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant