Abstract
Document forgery has become common owing to the availability of low-cost scanners and printers. Important documents such as certificates, passports and identification cards are protected with watermarks or signatures, that is, with extrinsic fingerprints embedded by a secure printing mechanism, which makes them easy to authenticate. Other documents require a passive approach to authentication. Such approaches look for inconsistencies in the document that may indicate modification, and some attempt to identify the source of the printed document. This paper proposes a classifier-based model that identifies the source printer by assigning the questioned document to one of the printer classes. A novel approach using Speeded Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB) feature descriptors is proposed for printer attribution. Naive Bayes, k-NN, random forest and different combinations of these classifiers have been evaluated for classification. The proposed model can efficiently classify questioned documents into their respective printer classes. An accuracy of 86.5% has been achieved using a combination of Naive Bayes, k-NN and random forest classifiers with a simple majority voting scheme and adaptive boosting methodology.
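As an illustration of the simple majority voting scheme mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes that each of the three classifiers has already produced one printer label per questioned document. The function names, the printer labels and the tie-breaking rule (the first classifier's vote wins a tie) are hypothetical, since the abstract does not specify them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[str]) -> str:
    """Return the label chosen by the most classifiers.

    Ties go to the tied label that appears first in `votes`
    (an assumption; the abstract states no tie-breaking rule).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    best = max(counts.values())
    # Walk the votes in order so the earliest top-scoring label wins a tie.
    for label in votes:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def combine_predictions(per_classifier: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier prediction lists into one label per document."""
    # zip(*...) regroups the lists so each tuple holds all votes for one document.
    return [majority_vote(doc_votes) for doc_votes in zip(*per_classifier)]


# Hypothetical predictions for four questioned documents.
naive_bayes = ["printer_A", "printer_B", "printer_C", "printer_A"]
knn = ["printer_A", "printer_C", "printer_C", "printer_B"]
random_forest = ["printer_B", "printer_C", "printer_A", "printer_C"]

combined = combine_predictions([naive_bayes, knn, random_forest])
print(combined)
# → ['printer_A', 'printer_C', 'printer_C', 'printer_A']
```

In the last document all three classifiers disagree, so the result falls back to the first classifier's vote. A real system would need an explicit policy for such ties, for example weighting each vote by classifier confidence.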
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Kumar, M., Gupta, S. & Mohan, N. A computational approach for printed document forensics using SURF and ORB features. Soft Comput 24, 13197–13208 (2020). https://doi.org/10.1007/s00500-020-04733-x
DOI: https://doi.org/10.1007/s00500-020-04733-x