Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach

Bin Li, Julia Yu, Jie Zhang, Bin Ke
Asian Conference on Machine Learning, PMLR 45:173-188, 2016.

Abstract

This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v45-Li15, title = {Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach}, author = {Li, Bin and Yu, Julia and Zhang, Jie and Ke, Bin}, booktitle = {Asian Conference on Machine Learning}, pages = {173--188}, year = {2016}, editor = {Holmes, Geoffrey and Liu, Tie-Yan}, volume = {45}, series = {Proceedings of Machine Learning Research}, address = {Hong Kong}, month = {20--22 Nov}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v45/Li15.pdf}, url = {https://proceedings.mlr.press/v45/Li15.html}, abstract = {This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.} }
Endnote
%0 Conference Paper %T Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach %A Bin Li %A Julia Yu %A Jie Zhang %A Bin Ke %B Asian Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2016 %E Geoffrey Holmes %E Tie-Yan Liu %F pmlr-v45-Li15 %I PMLR %P 173--188 %U https://proceedings.mlr.press/v45/Li15.html %V 45 %X This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment.
RIS
TY - CPAPER TI - Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach AU - Bin Li AU - Julia Yu AU - Jie Zhang AU - Bin Ke BT - Asian Conference on Machine Learning DA - 2016/02/25 ED - Geoffrey Holmes ED - Tie-Yan Liu ID - pmlr-v45-Li15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 45 SP - 173 EP - 188 L1 - http://proceedings.mlr.press/v45/Li15.pdf UR - https://proceedings.mlr.press/v45/Li15.html AB - This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and employ the financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statement and thereby can be easily applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue of the dataset is that the data is extremely imbalanced, in which the fraud firm-years are often less than one percent. Without re-sampling the data, we further propose to tackle the imbalance issue by adopting the techniques of imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and the Ensemble Methods, both of which have been proved to successfully handle the imbalance issue in the machine learning literatures. We finally evaluate our approach by conducting extensive empirical studies. Empirical results show that the proposed schema can achieve much better performance, in terms of balanced accuracy, than the state of the art. Besides the performance, our approaches can also compute very fast, which further supports their practical deployment. ER -
APA
Li, B., Yu, J., Zhang, J. & Ke, B.. (2016). Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:173-188 Available from https://proceedings.mlr.press/v45/Li15.html.

Related Material