DOI: 10.1145/3459637.3481921
research-article

From Limited Annotated Raw Material Data to Quality Production Data: A Case Study in the Milk Industry

Published: 30 October 2021

Abstract

Industry 4.0 offers opportunities to combine multiple sensor data sources using IoT technologies for better utilization of raw material in production lines. The common belief that data is readily available (the big data phenomenon) is often challenged by the need to acquire quality data effectively under severe constraints. In this paper, we propose a design methodology that uses active learning to enhance learning capabilities when building a model of production outcome from a constrained amount of raw-material training data. The proposed methodology extends existing active learning methods to effectively solve regression-based learning problems and may serve settings where data acquisition requires excessive resources in the physical world. We further suggest a set of qualitative measures to analyze learners' performance. The proposed methodology is demonstrated using an actual application in the milk industry, where milk is gathered from multiple small milk farms and brought to a dairy production plant to be processed into cottage cheese.
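To make the idea of active learning for regression concrete, the following is a minimal sketch of a generic pool-based loop, not the methodology proposed in the paper (the abstract does not specify its query strategy). It assumes a query-by-committee heuristic: a committee of linear models is fit on bootstrap resamples of the labeled set, and the unlabeled sample on which the committee disagrees most is sent for labeling. All names (`fit_linear`, `committee_variance`, `active_learning_loop`, `oracle`) are hypothetical, and `oracle` stands in for the costly physical measurement.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit with an intercept; returns the weight vector."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict_linear(w, X):
    """Predict with weights produced by fit_linear."""
    return np.column_stack([X, np.ones(len(X))]) @ w


def committee_variance(X_lab, y_lab, X_pool, n_members, rng):
    """Per-sample variance of predictions from a bootstrap committee."""
    preds = []
    for _ in range(n_members):
        # Each committee member sees a bootstrap resample of the labeled set.
        idx = rng.integers(0, len(X_lab), size=len(X_lab))
        w = fit_linear(X_lab[idx], y_lab[idx])
        preds.append(predict_linear(w, X_pool))
    return np.var(np.stack(preds), axis=0)


def active_learning_loop(X, oracle, n_init=5, budget=10, n_members=8, seed=0):
    """Label n_init random samples, then spend `budget` queries on the
    pool samples where the committee disagrees most (an assumed heuristic)."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    y = {i: oracle(X[i]) for i in labeled}
    for _ in range(budget):
        pool = np.array([i for i in range(len(X)) if i not in y])
        X_lab = X[labeled]
        y_lab = np.array([y[i] for i in labeled])
        scores = committee_variance(X_lab, y_lab, X[pool], n_members, rng)
        pick = int(pool[np.argmax(scores)])
        y[pick] = oracle(X[pick])  # the expensive labeling step
        labeled.append(pick)
    w = fit_linear(X[labeled], np.array([y[i] for i in labeled]))
    return w, labeled


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 3))
    # Synthetic stand-in for a measured production outcome.
    w, labeled = active_learning_loop(X, lambda x: float(x @ [2.0, -1.0, 0.5] + 1.0))
    print("labeled samples:", len(labeled), "weights:", w)
```

The loop labels only `n_init + budget` samples out of the full pool, which is the point of the setting described above: each label is expensive, so the learner chooses which ones to request.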


Cited By

  • (2023) Empowerment of Atypical Viewers via Low-Effort Personalized Modeling of Video Streaming Quality. Proceedings of the ACM on Networking 1:CoNEXT3 (1-27). DOI: 10.1145/3629139. Online publication date: 28-Nov-2023
  • (2023) The Battleship Approach to the Low Resource Entity Matching Problem. Proceedings of the ACM on Management of Data 1:4 (1-25). DOI: 10.1145/3626711. Online publication date: 12-Dec-2023
  • (2023) Using Active Learning for the Computational Design of Polymer Molecular Weight Distributions. ACS Engineering Au 4:2 (231-240). DOI: 10.1021/acsengineeringau.3c00056. Online publication date: 25-Dec-2023

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. dairy industry
  3. industry 4.0

Qualifiers

  • Research-article

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
