Abstract.
Our goal is to help automate the capture and broadcast of lectures to online audiences. Such systems have two interrelated design components. The technology component comprises the hardware and its associated software. The aesthetic component comprises the rules and idioms that human videographers follow to make a video visually engaging; these rules guide both hardware placement and software algorithms. We report the design of a complete system that captures and broadcasts lectures automatically and that has been deployed in our organization for two years. We also report a user study and a detailed set of video-production rules obtained from professional videographers who critiqued the system. We describe how the system can be generalized to lecture rooms that differ in size and number of cameras, and we discuss gaps between what professional videographers do and what is technologically feasible today.
Cite this article
Rui, Y., Gupta, A., Grudin, J. et al. Automating lecture capture and broadcast: technology and videography. Multimedia Systems 10, 3–15 (2004). https://doi.org/10.1007/s00530-004-0132-9