Copy-Move Document Image Forgery Detection and Localization Based On JPEG Clues
ABSTRACT
The number of image forgeries has grown sharply in recent years. There are different ways to fake an image; one of the most
common is copy-move manipulation. Numerous methods exist for detecting copy-move manipulations in natural images.
However, they are difficult to adapt to document images because of the specific features of such images. This work proposes an algorithm for
detecting and localizing copy-move manipulations on digital images of documents. The main idea is to use JPEG artifacts
in order to find the target region area and then localize the source and target regions precisely. For the efficient application
of the proposed method, firstly, the original image must have been subjected to JPEG compression, and secondly, after the
manipulation the image must have been saved in a lossless format. The experiments were carried out on an open set of
document images, CMID. In the detection task, the recall was 0.992 and the specificity was 1.0; in the localization task, the
recall was 0.923 and the false discovery rate was 0.021. In other words, the proposed algorithm detects more
than 99% of copy-move manipulations of the kind found in CMID and produces no false positives on this set.
Keywords: copy-move, image tampering, image forgery, document forgery, JPEG artifacts.
1. INTRODUCTION
A digital image, once acquired by a capture device, can be modified using photo editors or other software. Such
modifications are called digital image manipulations [1,2], and they are often used for data forgery [3]. In this paper, the
initial image obtained after being captured is referred to as the original image, and the image obtained after manipulation
is referred to as the manipulated image.
Figure 1. Sample images from the COVERAGE [6], H2020 [7], CASIA2.0 [8], respectively. The top row shows original
images, and the bottom row shows falsifications of copy-move, erase-fill and splicing (from left to right).
Figure 2. Copy-move example from the CMID [16] (tampered/0_FRA_TS_N_2.png). The forged image is on the
left, the source and target regions are on the right, marked green and red respectively.
Sometimes manipulations are used in good faith to improve a digital image without changing its semantic content,
for example, to enhance the contrast of a photograph. Conversely, manipulations can be used to change the semantic
content of a digital image with malicious intent; such manipulations are called image forgery. A special place in research on
image forgery is held by manipulations in which part of the final image remains authentic and part is modified; such
manipulations are called image tampering. Among them, three main types are distinguished: copy-move, splicing, and
erase-fill [1,4] (Fig. 1). These types of manipulations can be viewed as filling the modified zone with new content. If the source
for filling the zone is taken from the original image and represents a whole part, this is a copy-move manipulation. If the
source for filling is taken from another image and also represents a whole part, this is a splicing manipulation. If the zone
Sixteenth International Conference on Machine Vision (ICMV 2023), edited by Wolfgang Osten,
Proc. of SPIE Vol. 13072, 130720K · © 2024 SPIE · 0277-786X
doi: 10.1117/12.3023365
where 𝒩(𝑎, 𝜎²) is a normal distribution with mean 𝑎 and variance 𝜎².
The article [22] shows that the distribution of DCT coefficients for images that have not been subjected to JPEG
compression can be modeled using a Laplace distribution.
Proposition 2. The distribution of discrete cosine transform coefficients at any fixed frequency 𝜓 ≠ (0,0) is a Laplace
distribution with zero mean:
D(\psi_w + 8x,\ \psi_h + 8y) \sim \rho(t) = \frac{\lambda_\psi}{2}\, e^{-\lambda_\psi |t|}, \quad t \in \mathbb{R}. \qquad (2)
Histograms of DCT coefficients calculated on images that have and have not been subjected to JPEG compression
are shown schematically in Figure 3.
Figure 3. Visual representation of histograms of the calculated DCT coefficients of frequency 𝜓 of a non-JPEG-compressed image
(left) and a JPEG-compressed image, excluding blocks where truncation occurred with 𝑄(𝜓) = 𝑞 (right).
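As a rough illustration of Proposition 2, the sketch below collects the DCT coefficient at one fixed frequency 𝜓 over all 8×8 blocks of a grayscale image and fits the Laplace parameter 𝜆𝜓 by maximum likelihood. The function names, the list-of-rows image layout, the orthonormal DCT scaling, and the block indexing convention (x horizontal, y vertical) are assumptions made for this example and are not taken from the paper.

```python
import math

N = 8  # JPEG block size


def dct_row(k, n=N):
    """k-th row of the orthonormal DCT-II matrix."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]


def block_coefficient(image, x, y, psi_w, psi_h):
    """DCT coefficient at frequency (psi_w, psi_h) of the block with index (x, y).

    `image` is a list of rows of grayscale values; x counts blocks
    horizontally and y vertically (an assumed indexing convention).
    """
    cw, ch = dct_row(psi_w), dct_row(psi_h)
    return sum(
        ch[i] * cw[j] * image[N * y + i][N * x + j]
        for i in range(N)
        for j in range(N)
    )


def coefficients_at(image, psi_w, psi_h):
    """Collect the coefficient at one fixed frequency over all whole blocks."""
    rows, cols = len(image) // N, len(image[0]) // N
    return [
        block_coefficient(image, x, y, psi_w, psi_h)
        for y in range(rows)
        for x in range(cols)
    ]


def laplace_mle(samples):
    """Maximum-likelihood lambda for a zero-mean Laplace density.

    Raises ZeroDivisionError if every sample is exactly zero.
    """
    return len(samples) / sum(abs(t) for t in samples)


def laplace_pdf(t, lam):
    """rho(t) = lam / 2 * exp(-lam * |t|), the density in Eq. (2)."""
    return 0.5 * lam * math.exp(-lam * abs(t))
```

Comparing a histogram of `coefficients_at(image, psi_w, psi_h)` with `laplace_pdf` at the fitted 𝜆𝜓 gives a quick visual check of how well the model fits a particular image.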
3. PROPOSED ALGORITHM
Overall, our algorithm consists of two stages. The first stage determines the focus area using JPEG compression
artifacts. The second stage detects the source and target regions given the focus area. We will assume that an
Figure 5. The input image with manipulation is illustrated in A; up-scaled calculated images 𝐼₁, 𝐼₂, 𝐼₃, 𝐼₄ are depicted in B, C, D, E.
The brightnesses of both modified zones equalize at 𝐼₃.
we set the upper limit for 𝐴𝑑(𝑡⁄𝑠) as 2.5 for more stable behavior (Fig. 6D).
To determine the relative position of source and target regions within 𝐴𝑡 and 𝐴𝑠 , we find a rectangle 𝑟̂ in 𝐻 (Fig. 6E)
that minimizes the intra-class variance:
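\hat{r} = \operatorname*{argmin}_{r} \left\{
  \left( \frac{\sum_{u \in r} (H(u))^2}{\sum_{u \in r} 1} - \left( \frac{\sum_{u \in r} H(u)}{\sum_{u \in r} 1} \right)^2 \right) \frac{\sum_{u \in r} 1}{\sum_{u \in H} 1}
  + \left( \frac{\sum_{u \notin r} (H(u))^2}{\sum_{u \notin r} 1} - \left( \frac{\sum_{u \notin r} H(u)}{\sum_{u \notin r} 1} \right)^2 \right) \frac{\sum_{u \notin r} 1}{\sum_{u \in H} 1}
  \;\middle|\; \frac{\sum_{u \in r} H(u)}{\sum_{u \in r} 1} \le \frac{\sum_{u \notin r} H(u)}{\sum_{u \notin r} 1}
\right\}. \qquad (12)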
We extend 𝑟̂ by 2 pixels in each direction to compensate for 𝛻1 operations (Fig. 6E). Then we obtain the absolute positions
of the found source and target regions using the absolute positions of 𝐴𝑡 , 𝐴𝑠 and the found rectangle.
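The rectangle search in Eq. (12) can be sketched as an exhaustive scan over all axis-aligned rectangles, using summed-area tables so that each candidate is scored in constant time. This is only a minimal sketch: the brute-force strategy, the function names, and the half-open `(top, left, bottom, right)` convention are assumptions for illustration, and the paper does not state how the minimization is actually carried out.

```python
def integral(values):
    """Summed-area table with a zero border row and column."""
    h, w = len(values), len(values[0])
    table = [[0.0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        for j in range(w):
            table[i + 1][j + 1] = (
                values[i][j] + table[i][j + 1] + table[i + 1][j] - table[i][j]
            )
    return table


def box_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1."""
    return (
        table[bottom][right] - table[top][right]
        - table[bottom][left] + table[top][left]
    )


def best_rectangle(H):
    """Rectangle minimizing the weighted intra-class variance of Eq. (12).

    Only rectangles whose mean does not exceed the mean of the remaining
    pixels are considered, as required by the constraint in Eq. (12).
    Returns (top, left, bottom, right), or None if no rectangle qualifies.
    """
    h, w = len(H), len(H[0])
    s1 = integral(H)
    s2 = integral([[v * v for v in row] for row in H])
    total1, total2, n = s1[h][w], s2[h][w], h * w
    best, best_cost = None, float("inf")
    for top in range(h):
        for bottom in range(top + 1, h + 1):
            for left in range(w):
                for right in range(left + 1, w + 1):
                    n_in = (bottom - top) * (right - left)
                    n_out = n - n_in
                    if n_out == 0:
                        continue  # the complement must be non-empty
                    in1 = box_sum(s1, top, left, bottom, right)
                    in2 = box_sum(s2, top, left, bottom, right)
                    mean_in = in1 / n_in
                    mean_out = (total1 - in1) / n_out
                    if mean_in > mean_out:
                        continue  # violates the constraint of Eq. (12)
                    var_in = in2 / n_in - mean_in ** 2
                    var_out = (total2 - in2) / n_out - mean_out ** 2
                    cost = (var_in * n_in + var_out * n_out) / n
                    if cost < best_cost:
                        best, best_cost = (top, left, bottom, right), cost
    return best
```

The scan visits O(h²w²) rectangles, which is acceptable only for small 𝐻; a faster search would be needed for large regions.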
Figure 6. Elements of the algorithm operation for the tampered/0_FRA_TS_N_2.png from the CMID set [16] (see Fig. 2). A is the
area of the source image. B is the area of the binary mask, where the focus area is white. C shows 𝐴𝑠 (green) and 𝐴𝑡 (red) before and after
extension. D is the image 𝐴𝑑(𝑡⁄𝑠) clipped from above at 2.5. E is the image 𝐻 with 𝑟̂ before and after expansion.
4. RESULTS
The CMID set is dedicated to two tasks: detecting the presence of falsification (image-level task) and localizing
modified regions (pixel-level task). The following quality indicators are provided in the tables: True Positive Rate (TPR),
False Positive Rate (FPR), Matthews Correlation Coefficient (MCC), False Discovery Rate (FDR), 𝐹1 score (F1).
To calculate the TPR, FPR, MCC for our method in the image-level task, we mark the images of the CMID set [16]
from the folder ‘tampered’ as positive, and the images from the folder ‘ref’ as negative. Our algorithm marks the image as
manipulated (positive) if the focus area is not empty, otherwise it marks the image as original (negative). We have TPR =
0.9922, FPR = 0, MCC = 0.9848. The indicators in the image-level task for methods [9, 17, 18, 19] from the table in the
work [16], supplemented by the indicators of our method, are presented in Table 1 (columns labeled “image”).
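For reference, the image-level indicators can be computed from per-image decisions as in the following sketch. The function names are hypothetical and the labels and predictions are toy values chosen for illustration; they are not the CMID results.

```python
import math


def confusion(labels, predictions):
    """Count (tp, fn, fp, tn) from parallel lists of booleans."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    return tp, fn, fp, tn


def image_level_metrics(tp, fn, fp, tn):
    """Return (TPR, FPR, MCC); MCC is set to 0 when its denominator is 0."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tpr, fpr, mcc


# Toy data: True means "manipulated". A prediction is True when the
# focus area found for that image is not empty.
labels = [True] * 10 + [False] * 5
predictions = [True] * 9 + [False] * 6
tpr, fpr, mcc = image_level_metrics(*confusion(labels, predictions))
```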
In the CMID set, the folder ‘gt’ contains three-channel mask images corresponding to the images from the folder
‘tampered’. In the green channel of images from the folder ‘gt’, values of 255 indicate pixels of the source area (the
remaining pixels in the channel are 0); in the red channel, pixels of the target area are indicated, but not only with values of
Table 1. Results of testing the proposed algorithm and comparison with others.
                     Image-level                Pixel-level
Method               TPR ↑    FPR ↓    MCC ↑    TPR ↑    FDR ↓    F1 ↑     MCC ↑
SURF [9]             0.7919   0.7697   0.0236   0.2155   0.9792   0.0378   0.0606
SIFT [9]             0.9676   0.9145   0.1104   0.6004   0.9610   0.0731   0.1471
BusterNet [17]       0.1601   0.1607   -0.0006  0.0016   0.9979   0.0018   0.0000
FE-CMFD [18]         0.0246   0.0066   0.0561   0.0341   0.3114   0.0650   0.1530
SIFT-LDM [19]        0.7917   0.0197   0.6847   0.2555   0.0541   0.4024   0.4912
Proposed algorithm   0.9922   0.0000   0.9848   0.9234   0.0209   0.9504   0.9507
Our method outperforms methods [9, 17, 18, 19] on every performance indicator in both tasks.
In the image-level task, our MCC = 0.9848 versus MCC = 0.6847 for the second-best method by this
indicator [19]. The pixel-level task is similar: our MCC = 0.9507 versus MCC =
0.4912 for the second-best method [19]. Interpreting the TPR and FPR in the image-level task, we can state
that our method detects more than 99% of copy-move manipulations of the kind found in the CMID
[16] and produces no false positives on this set.
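The pixel-level indicators TPR, FDR and F1 can be obtained from a predicted mask and a ground-truth mask as sketched below. This example assumes that both masks are equally sized lists of rows in which a truthy value marks a modified pixel, and that the indicators are computed for a single image pair; how the source and target regions are combined into one mask, and how the scores are aggregated over the set, are not specified here and are left open.

```python
def pixel_counts(predicted, truth):
    """Count (tp, fp, fn, tn) over two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def pixel_level_metrics(predicted, truth):
    """Return (TPR, FDR, F1); a ratio with an empty denominator is set to 0."""
    tp, fp, fn, _ = pixel_counts(predicted, truth)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fdr = fp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return tpr, fdr, f1
```

The pixel-level MCC follows from the same four counts returned by `pixel_counts`.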
5. CONCLUSION
In this paper, we proposed a method for detecting and localizing copy-move manipulations on the CMID set [16]. This
method uses the properties of the JPEG format and assumptions about the model of transforming the source region into
the target region, which allows the algorithm to be robust to the large number of SGOs and small modification areas which
occur in the CMID images [16]. The stability of the method is confirmed by experiments. In future work, we plan to make
the method resistant to JPEG re-compression by using a different strategy for identifying the focus area.
REFERENCES
[1] Zheng, Lilei, Ying Zhang, and Vrizlynn LL Thing. "A survey on image tampering and its detection in real-world
photos." Journal of Visual Communication and Image Representation 58 (2019): 380-399.
[2] Thakur, Rahul, and Rajesh Rohilla. "Recent advances in digital image manipulation detection techniques: A brief
review." Forensic science international 312 (2020): 110311.
[3] Garfinkel, Simson L. "Digital forensics research: The next 10 years." Digital Investigation 7 (2010): S64-S73.
[4] Meena, Kunj Bihari, and Vipin Tyagi. "Image forgery detection: survey and future directions." Data, Engineering and
Applications: Volume 2 (2019): 163-194.
[5] Castillo Camacho, Ivan, and Kai Wang. "A comprehensive review of deep-learning-based methods for image
forensics." Journal of imaging 7.4 (2021): 69.
[6] Wen, B., Y. Zhu, et al. "COVERAGE: A novel database for copy-move forgery detection." Proceedings of ICIP,
IEEE (2016): 161-165.
[7] Hossain, Faria, Asim Gul, Rameez Raja, Tasos Dagiuklas, and Chathura Galkandage. "Forgery Image
Dataset." IEEE Dataport (January 13, 2022). doi: https://dx.doi.org/10.21227/9dmj-yn86.