DOI: 10.1145/3613904.3642462
research-article

DirectGPT: A Direct Manipulation Interface to Interact with Large Language Models

Published: 11 May 2024

Abstract

We characterize and demonstrate how the principles of direct manipulation can improve interaction with large language models. These principles include: continuous representation of generated objects of interest; reuse of prompt syntax in a toolbar of commands; manipulable outputs to compose or control the effect of prompts; and undo mechanisms. This idea is exemplified in DirectGPT, a user interface layer on top of ChatGPT that works by transforming direct manipulation actions into engineered prompts. A study shows participants were 50% faster and relied on 50% fewer and 72% shorter prompts to edit text, code, and vector images compared to baseline ChatGPT. Our work contributes a validated approach to integrate LLMs into traditional software using direct manipulation. Data, code, and demo are available at https://osf.io/3wt6s.
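
The abstract describes a layer that turns direct manipulation actions into engineered prompts and supports undo. The following is a minimal Python sketch of that general idea, not the authors' implementation: the `Selection` type, the `build_prompt` template wording, and the `History` class are all hypothetical names introduced here for illustration. The actual system is available from the OSF link above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """A span of the displayed object chosen by the user (hypothetical type)."""
    start: int
    end: int


def build_prompt(document: str, selection: Selection, command: str) -> str:
    """Combine a selection and a toolbar command into one engineered prompt.

    The template text is an illustrative assumption, not DirectGPT's prompt.
    """
    if not (0 <= selection.start < selection.end <= len(document)):
        raise ValueError("selection is outside the document")
    target = document[selection.start:selection.end]
    return (
        "You are editing the document below.\n"
        f"Apply this command only to the selected part: {command}\n"
        f"Selected part: <<{target}>>\n"
        "Return the full updated document and nothing else.\n"
        "---\n"
        f"{document}"
    )


class History:
    """Undo stack over document states, so a prompt's effect can be reverted."""

    def __init__(self, initial: str) -> None:
        self._states = [initial]

    @property
    def current(self) -> str:
        return self._states[-1]

    def commit(self, new_state: str) -> None:
        # Record the state produced by applying a model response.
        self._states.append(new_state)

    def undo(self) -> str:
        # Never pop the initial state; undo on it is a no-op.
        if len(self._states) > 1:
            self._states.pop()
        return self.current
```

In this sketch the user never types the full prompt: selecting a span and picking a command is enough to produce it, and the returned string would be sent to whatever chat model the surrounding application uses. Each model response would be stored with `commit`, so `undo` restores the previous state without another request.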

Supplemental Material

MP4 File - Video Preview
MP4 File - Video Presentation
MP4 File - DirectGPT Video Figure
ZIP File - Compiled version of DirectGPT
This archive contains the compiled version of the DirectGPT system as well as the system to run the study comparing DirectGPT to ChatGPT.

    Published In

    CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
    May 2024
    18961 pages
    ISBN:9798400703300
    DOI:10.1145/3613904
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. direct manipulation
    2. large language models
    3. prompt engineering

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • NSERC Discovery Grant
    • LAI Réapp
    • Canada Foundation for Innovation Infrastructure

    Conference

    CHI '24

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
