research-article

Shapley Value Estimation based on Differential Matrix

Published: 11 February 2025

Abstract

The Shapley value has been used extensively in many fields as the unique metric for fairly evaluating player contributions in cooperative settings. Because exact computation of Shapley values is #P-hard in the task-agnostic setting, many studies use the Monte Carlo method to estimate them. Existing methods estimate the Shapley values directly. In this paper, we explore a novel idea: inferring the Shapley values by estimating the differences between them. Technically, we estimate a differential matrix consisting of pairwise Shapley value differences to reduce the variance of the estimated Shapley values. We develop a least-squares optimization solution that derives the Shapley values from the differential matrix while minimizing the estimator variances. Additionally, we devise a Monte Carlo method for efficient estimation of the differential matrix and introduce two stratified Monte Carlo methods for further variance reduction. Our experimental results on real and synthetic data sets demonstrate the effectiveness and efficiency of the differential-matrix-based sampling approaches.
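
The idea of recovering Shapley values from pairwise differences can be illustrated with a minimal sketch. It is not the paper's algorithm: the pairwise sampler relies on a standard identity (the difference between two players' Shapley values is the expectation of v(S ∪ {i}) − v(S ∪ {j}) when the size of S is uniform on 0..n−2 and S is a uniform subset of the other players), and the recovery step is plain unweighted least squares under the efficiency constraint, which may differ from the variance-minimizing solution the abstract describes. The function names and the additive toy game are hypothetical.

```python
import random
from typing import Callable, FrozenSet, List

Utility = Callable[[FrozenSet[int]], float]


def estimate_differential_matrix(
    v: Utility, n: int, samples_per_pair: int, rng: random.Random
) -> List[List[float]]:
    """Monte Carlo estimate of D[i][j] ~ phi_i - phi_j (assumed estimator)."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            others = [p for p in range(n) if p != i and p != j]
            total = 0.0
            for _ in range(samples_per_pair):
                # Uniform coalition size, then a uniform subset of that size.
                size = rng.randint(0, n - 2)
                subset = frozenset(rng.sample(others, size))
                total += v(subset | {i}) - v(subset | {j})
            D[i][j] = total / samples_per_pair
            D[j][i] = -D[i][j]  # keep the matrix antisymmetric
    return D


def shapley_from_differences(
    D: List[List[float]], grand_total: float
) -> List[float]:
    """Unweighted least-squares fit of phi to an antisymmetric D.

    Minimizing sum_{i<j} (phi_i - phi_j - D[i][j])^2 subject to
    sum(phi) == grand_total gives phi_i = (grand_total + sum_j D[i][j]) / n.
    """
    n = len(D)
    return [(grand_total + sum(D[i])) / n for i in range(n)]


# Toy additive game (hypothetical): each player's Shapley value is its weight.
weights = [1.0, 2.0, 4.0]


def additive(coalition: FrozenSet[int]) -> float:
    return sum(weights[p] for p in coalition)


rng = random.Random(0)
D = estimate_differential_matrix(additive, len(weights), 50, rng)
everyone = frozenset(range(len(weights)))
phi = shapley_from_differences(D, additive(everyone) - additive(frozenset()))
print(phi)
```

In the additive game every sampled difference equals the weight gap exactly, so the recovered values match the weights. For a general utility function the entries of D are noisy, and the row-sum formula averages that noise over all pairs involving each player.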


    Published In

    Proceedings of the ACM on Management of Data, Volume 3, Issue 1
    SIGMOD
    February 2025
    2261 pages
    EISSN: 2836-6573
    DOI: 10.1145/3717614
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. sampling
    2. Shapley value
