Computer Science (计算机科学), 2019, Vol. 46, Issue 7: 38-49. doi: 10.11896/j.issn.1002-137X.2019.07.006
刘梦娟1,曾贵川1,岳威1,仇笠舟1,王加昌2
LIU Meng-juan1,ZENG Gui-chuan1,YUE Wei1,QIU Li-zhou1,WANG Jia-chang2
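Abstract: Research on click-through rate (CTR) prediction models has drawn wide attention from both academia and industry in recent years. Focusing on CTR prediction models for targeted display advertising, this paper surveys preprocessing techniques for sample features, CTR prediction schemes based on traditional machine learning models, CTR prediction schemes based on recent deep learning models, and the main performance metrics for CTR prediction models. It then compares and quantitatively analyzes representative schemes on an open dataset, and finally discusses the open problems and future trends in CTR prediction research for display advertising.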