DOI: 10.1145/3581641.3584059

Research article · Open access

ScatterShot: Interactive In-context Example Curation for Text Transformation

Published: 27 March 2023

Abstract

The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples, resulting in underspecified in-context functions that fall short on unseen cases. Further, it is hard to know when “enough” examples have been included even for known patterns. In this work, we present ScatterShot, an interactive system for building high-quality demonstration sets for in-context learning. ScatterShot iteratively slices unlabeled data into task-specific patterns, samples informative inputs from underexplored or not-yet-saturated slices in an active learning manner, and helps users label more efficiently with the help of an LLM and the current example set. In simulation studies on two text perturbation scenarios, ScatterShot sampling improves the resulting few-shot functions by 4-5 percentage points over random sampling, with less variance as more examples are added. In a user study, ScatterShot greatly helps users in covering different patterns in the input space and labeling in-context examples more efficiently, resulting in better in-context learning and less user effort.
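The loop the abstract describes (slice the unlabeled pool into patterns, then sample the next input to label from whichever slice the current example set covers least) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the slicing heuristic and all function names here are hypothetical stand-ins, and the real system uses richer task-specific slices plus an LLM to propose candidate labels for the user to verify.

```python
import random
from collections import defaultdict

def slice_key(text):
    # Toy stand-in for task-specific slicing: bucket inputs by a crude
    # surface pattern (contains a digit or not, short vs. long).
    has_digit = any(c.isdigit() for c in text)
    length = "long" if len(text.split()) > 5 else "short"
    return ("digit" if has_digit else "nodigit", length)

def sample_next(unlabeled, labeled, rng=None):
    """Pick the next input to annotate from the slice least covered by
    the current in-context example set (the active-learning flavor the
    abstract describes)."""
    rng = rng or random.Random(0)
    # Count how many labeled (input, output) examples fall in each slice.
    coverage = defaultdict(int)
    for text, _ in labeled:
        coverage[slice_key(text)] += 1
    # Group the unlabeled pool by slice.
    pools = defaultdict(list)
    for text in unlabeled:
        pools[slice_key(text)].append(text)
    # Prefer the slice with the fewest labeled examples so far.
    target = min(pools, key=lambda k: coverage[k])
    return rng.choice(pools[target])
```

In a full system the returned input would be sent to the LLM with the current example set to get a proposed label, and the user-corrected pair would be appended to `labeled` before the next iteration.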


Cited By

  • (2024) A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness. ACM Computing Surveys. https://doi.org/10.1145/3691339. Online publication date: 2-Sep-2024.
  • (2024) PromptInfuser: How Tightly Coupling AI and UI Design Impacts Designers' Workflows. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 743–756. https://doi.org/10.1145/3643834.3661613. Online publication date: 1-Jul-2024.
  • (2024) ConstitutionMaker: Interactively Critiquing Large Language Models by Converting Feedback into Principles. Proceedings of the 29th International Conference on Intelligent User Interfaces, 853–868. https://doi.org/10.1145/3640543.3645144. Online publication date: 18-Mar-2024.
  • (2024) A Taxonomy for Human-LLM Interaction Modes: An Initial Exploration. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, 1–11. https://doi.org/10.1145/3613905.3650786. Online publication date: 11-May-2024.
  • (2024) EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–21. https://doi.org/10.1145/3613904.3642216. Online publication date: 11-May-2024.
  • (2024) Foundation models meet visualizations: Challenges and opportunities. Computational Visual Media 10(3), 399–424. https://doi.org/10.1007/s41095-023-0393-x. Online publication date: 2-May-2024.

Published In

IUI '23: Proceedings of the 28th International Conference on Intelligent User Interfaces
March 2023
972 pages
ISBN: 9798400701061
DOI: 10.1145/3581641
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Publisher

Association for Computing Machinery
New York, NY, United States

      Badges

      • Honorable Mention

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

Conference

IUI '23

Acceptance Rates

Overall acceptance rate: 746 of 2,811 submissions, 27%

Article Metrics

  • Downloads (last 12 months): 758
  • Downloads (last 6 weeks): 60

Reflects downloads up to 22 Sep 2024.
