Building a smart lecture-recording system using MK-CPN network for heterogeneous data sources

Chiung-Yao Fang¹,
An-Chun Luo²,
Yu-Shan Deng²,
Chia-Ju Lu¹ &
…
Sei-Wang Chen¹

Abstract

Lecture-recording systems now play a vital role in collecting spoken discourse for e-learning. However, as e-learning continues to grow, the lack of content is becoming a problem. This research presents a smart lecture-recording (SLR) system that can record orations at the same level of quality as a human team, but with less human involvement. The proposed SLR system is composed of two subsystems, referred to as the virtual cameraman (VC) and the virtual director (VD). All cameraman components of the VC subsystem are automatic and can perform actions that include target and event detection, tracking, and view searching. The videos taken by these three components are forwarded to the VD subsystem, in which the representative shot is chosen for recording or direct broadcasting. We refer to this function of the VD subsystem as shot selection, which is based on content analysis. The capability of shot selection is pre-trained through a machine-learning process characterized by the counter-propagation neural (CPN) network. However, the CPN network yielded poor results when the input data were heterogeneous. To increase the accuracy of shot selection, we applied multiple kernel learning (MKL) techniques to the CPN network, yielding a network called MK-CPN, to transform the heterogeneous data from the different content analysis methods into a unified space. A series of experiments on real lectures has been conducted. The results showed that the proposed SLR system can provide oration records that are, to some extent, close to those taken by real human teams. We believe that the proposed system need not be limited to live speeches if it is configured with appropriate training materials.
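
The idea of mapping heterogeneous features into a unified space can be illustrated with one common MKL formulation: a weighted sum of per-source kernels. The sketch below is only an illustration under stated assumptions. The abstract does not say which kernels or combination rule MK-CPN uses, so the RBF kernel, the function names, the feature values, the bandwidths, and the weights are all hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(sample_a, sample_b, gammas, weights):
    """Weighted sum of one RBF kernel per feature source.

    sample_a, sample_b: lists of feature vectors, one per source.
    gammas, weights: per-source bandwidths and non-negative weights.
    Both are fixed by hand here (assumption); an MKL method would
    learn the weights from training data.
    """
    return sum(
        w * rbf_kernel(xa, xb, g)
        for xa, xb, g, w in zip(sample_a, sample_b, gammas, weights)
    )

# Two hypothetical shots, each described by two heterogeneous sources:
# a 3-D descriptor on a small scale and a 1-D value on a much larger scale.
shot_1 = [[0.1, 0.2, 0.3], [120.0]]
shot_2 = [[0.1, 0.25, 0.3], [118.0]]
gammas = [1.0, 0.01]   # assumed per-source bandwidths
weights = [0.5, 0.5]   # assumed convex combination weights

# A single similarity score that accounts for both sources.
similarity = combined_kernel(shot_1, shot_2, gammas, weights)
```

Giving each source its own bandwidth lets features on very different scales contribute comparably to one similarity score, which a downstream classifier can then consume.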

Acknowledgements

This article was written as part of research Grant No. NSC-102-2221-E-003-013, financed by the Ministry of Science and Technology (MOST), Taiwan, R.O.C.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan Normal University, No.88, Sec. 4, Tingzhou Rd, Wenshan Dist, Taipei City, 11677, Taiwan, ROC
Chiung-Yao Fang, Chia-Ju Lu & Sei-Wang Chen
Industrial Technology Research Institute, No.195, Sec.4, Chung Hsing Rd, Chutung, Hsinchu, 31040, Taiwan, ROC
An-Chun Luo & Yu-Shan Deng

Corresponding author

Correspondence to An-Chun Luo.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Fang, CY., Luo, AC., Deng, YS. et al. Building a smart lecture-recording system using MK-CPN network for heterogeneous data sources. Neural Comput & Applic 31, 3759–3777 (2019). https://doi.org/10.1007/s00521-017-3328-6

Received: 25 May 2017
Accepted: 28 December 2017
Published: 06 January 2018
Issue Date: August 2019
DOI: https://doi.org/10.1007/s00521-017-3328-6
