Collecting Semantic Information for Locations in the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System

Masoud Rouhizadeh,
Bob Coyne &
Richard Sproat

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6884)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems


Abstract

WordsEye is a system for automatically converting a text description of a scene into a 3D image. In converting a text description into a corresponding 3D scene, it is necessary to map objects and locations specified in the text into the actual 3D objects. Individual objects typically correspond to single 3D models, but locations (e.g. a living room) are typically an ensemble of objects. Prototypical mappings from locations to objects and their relations are called location vignettes, which are not present in existing lexical resources. In this paper we propose a new methodology using Amazon’s Mechanical Turk to collect semantic information for location vignettes. Our preliminary results show that this is a promising approach.
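As an illustration only, a location vignette as described above (a location mapped to a set of objects and the relations among them) might be represented along the following lines. This is a minimal sketch, not the representation used in WordsEye: the class names `Relation` and `LocationVignette`, the helper `build_living_room`, and the example objects and relations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Relation:
    """A spatial relation between two objects (hypothetical schema)."""
    figure: str    # the object being placed, e.g. "lamp"
    relation: str  # the spatial relation, e.g. "on"
    ground: str    # the reference object, e.g. "coffee table"


@dataclass
class LocationVignette:
    """A location together with its typical objects and their relations."""
    location: str
    objects: List[str] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def add_relation(self, figure: str, relation: str, ground: str) -> None:
        # Register both objects (once each), then record the relation.
        for name in (figure, ground):
            if name not in self.objects:
                self.objects.append(name)
        self.relations.append(Relation(figure, relation, ground))


def build_living_room() -> LocationVignette:
    # Invented example content; not data from the paper.
    vignette = LocationVignette("living room")
    vignette.add_relation("sofa", "against", "wall")
    vignette.add_relation("coffee table", "in front of", "sofa")
    vignette.add_relation("lamp", "on", "coffee table")
    return vignette


if __name__ == "__main__":
    living_room = build_living_room()
    print(living_room.location, living_room.objects)
    for r in living_room.relations:
        print(f"{r.figure} {r.relation} {r.ground}")
```

Keeping objects and relations in separate lists lets a scene generator first choose 3D models for the objects and then apply the relations as placement constraints.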



Author information

Authors and Affiliations

Oregon Health & Science University, Portland, OR, USA
Masoud Rouhizadeh & Richard Sproat
Columbia University, New York, NY, USA
Bob Coyne

Editor information

Editors and Affiliations

Institute of Integrated Sensor Systems, University of Kaiserslautern, Erwin-Schroedinger-str. 12, 67663, Kaiserslautern, Germany
Andreas König
Knowledge-Based Systems Group, Department of Computer Science, University of Kaiserslautern, P.O. Box 3049, 67653, Kaiserslautern, Germany
Andreas Dengel
School of Business, University of Applied Sciences Northwestern Switzerland, Riggenbachstr. 16, 4600, Olten, Switzerland
Knut Hinkelmann
Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuen-cho, 599-8531, Sakai, Osaka, Japan
Koichi Kise
KES International, P.O. Box 2115, BN43 9AF, Shoreham-by-sea, UK
Robert J. Howlett
University of South Australia, Adelaide, 5095, Mawson Lakes, SA, Australia
Lakhmi C. Jain

About this paper

Cite this paper

Rouhizadeh, M., Coyne, B., Sproat, R. (2011). Collecting Semantic Information for Locations in the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System. In: König, A., Dengel, A., Hinkelmann, K., Kise, K., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2011. Lecture Notes in Computer Science, vol 6884. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23866-6_40

DOI: https://doi.org/10.1007/978-3-642-23866-6_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23865-9
Online ISBN: 978-3-642-23866-6
eBook Packages: Computer Science, Computer Science (R0)
