WO2007058517A1 - Method and apparatus for synchronizing visual and voice data in dab/dmb service system - Google Patents
Method and apparatus for synchronizing visual and voice data in DAB/DMB service system
- Publication number: WO2007058517A1
- Application: PCT/KR2006/004901 (KR2006004901W)
- Authority: WO (WIPO/PCT)
- Prior art keywords: synchronization, web document, speech, document, visual
Classifications
- H04H60/82—Arrangements characterised by transmission systems other than for broadcast, e.g. the Internet, the transmission system being the Internet
- H04N21/242—Synchronization processes, e.g. processing of PCR [Program Clock References]
- H04N21/43072—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen, of multiple content streams on the same device
- H04N21/234318—Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements, by decomposing into objects, e.g. MPEG-4 objects
- H04N21/4782—Web browsing, e.g. WebTV
- H04N21/8106—Monomedia components thereof involving special audio data, e.g. different tracks for different languages
- H04N21/8126—Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
- H04N21/8543—Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]
- H04N21/8547—Content authoring involving timestamps for synchronizing content
- H04N5/04—Synchronising (details of television systems)
- H04H2201/20—Aspects of broadcast communication characterised by the type of broadcast system: digital audio broadcasting [DAB]
Abstract
Provided is a digital broadcasting service, more particularly, a method for providing a Web service that can simultaneously input/output speech data along with visual data by integrating speech Web data with broadcasting Web sites (BWS) provided for multimedia broadcasting in a digital multimedia broadcasting (DMB) system configured based on digital audio broadcasting (DAB), and an apparatus thereof. The method for synchronizing visual data with speech data includes the steps of: a) generating a visual Web document; b) generating a speech Web document including synchronization tags related to the visual Web document; and c) identifying the speech Web document and the visual Web document based on a sub-channel or a directory and transmitting the speech Web document and the visual Web document independently.
Description
METHOD AND APPARATUS FOR SYNCHRONIZING VISUAL AND VOICE DATA IN DAB/DMB SERVICE SYSTEM
Technical Field
The present invention relates to a digital data broadcasting service; and, more particularly, to a method for providing a Web service that can simultaneously input/output speech data along with visual data by integrating speech Web data with broadcasting Web sites (BWS) provided for multimedia broadcasting in a Digital Multimedia Broadcasting (DMB) system configured based on Digital Audio Broadcasting (DAB), and an apparatus thereof.
Background Art
Conventional broadcasting Web sites (BWS) providing methods turn Hyper Text Markup Language (HTML) contents, which follow a Web specification, into data and transmit the data to provide a Web service on a screen by using a multimedia object transfer (MOT) method through a Digital Multimedia Broadcasting (DMB) network configured based on Digital Audio Broadcasting (DAB). Such a method, however, can simply output the Web data defined by the HTML onto the screen. Therefore, the method cannot sufficiently transfer data in a broadcasting system for a mobile environment, such as a DAB-based DMB. Also, an X+V method is under standardization and development to provide a multi-modal Web service. It is a method directed through a screen, and it can provide a multi-modal Web service that can input/output speech data by combining the host language XHTML with forms in charge of the speech interface of VoiceXML. However, this method, too, operates based on a visual interface with the XHTML as a host language, and it is somewhat inappropriate for a mobile environment.
The present invention provides a method for synchronizing visual and speech Web data that can overcome the aforementioned drawbacks and provide users with a speech-directed Web service in a mobile environment or a fixed location environment, instead of a visual-directed Web service, and an apparatus thereof.
Disclosure

Technical Problem
It is, therefore, an object of the present invention to provide a broadcasting Web sites (BWS) service that can overcome the limitation of inputting/outputting only visual Web data in a conventional Digital Audio Broadcasting (DAB) BWS service or a conventional Digital Multimedia Broadcasting (DMB) BWS service, and also offer a speech-enabled Web data service.
In the first place, an embodiment of the present invention defines a speech-directed Web language to provide a speech-directed Web service in consideration of a mobile environment, instead of a screen-directed Web service.
Secondly, another embodiment of the present invention provides a service capable of inputting/outputting speech data by integrating a conventional Web service framework, e.g., a BWS service, with a speech input/output module.
Thirdly, yet another embodiment of the present invention provides a technology of synchronizing a content following a visual Web specification, e.g., HTML, with a VoiceXML content capable of providing a speech Web service, that is, a technology of synchronizing visual data with speech data. For this, the processing of the documents should be synchronized, and a user input device should be synchronized for one document. It is an object of the present invention to provide a method and apparatus for these synchronizations.
Technical Solution
In accordance with one aspect of the present invention, there is provided a method for synchronizing visual data with speech data to provide a broadcasting Web sites (BWS) service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, the method which includes the steps of: a) generating a visual Web document; b) generating a speech Web document including synchronization tags related to the visual Web document; and c) identifying the speech Web document and the visual Web document based on a sub-channel or a directory and transmitting the speech Web document and the visual Web document independently.
In accordance with another aspect of the present invention, there is provided an apparatus for synchronizing visual data with speech data to provide a BWS service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, the apparatus which includes: a) a content data generator for generating a visual Web document and a speech Web document including synchronization tags related to the visual Web document; b) a multimedia object transfer (MOT) server for transforming both the generated visual Web document and the speech Web document into an MOT protocol; and c) a transmitting system for identifying the speech Web
document and the visual Web document of the MOT protocol based on a sub-channel or a directory and transmitting the speech Web document and the visual Web document independently.

In accordance with another aspect of the present invention, there is provided a method for synchronizing visual data with speech data to provide a BWS service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, the method which includes the steps of: a) receiving and loading a visual Web document and a speech Web document including synchronization tags related to the visual Web document, the visual Web document and the speech Web document being identified based on a sub-channel or a directory and transmitted independently; and b) analyzing the synchronization tags when a synchronization event occurs and performing a corresponding synchronization operation.
In accordance with yet another aspect of the present invention, there is provided an apparatus for synchronizing visual data with speech data to provide a BWS service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, the apparatus which includes: a) a baseband receiver for receiving broadcasting signals through a multimedia broadcasting network and performing channel decoding; b) a multimedia object transfer (MOT) decoder for decoding channel-decoded packets and restoring a visual Web document and a speech Web document including synchronization tags related to the visual Web document; and c) an integrated Web browser for analyzing the synchronization tag when a synchronization event occurs and executing a corresponding synchronization operation.
Advantageous Effects
When an HTML document, which is a visual Web document, is synchronized with a VoiceXML content, which is a speech Web document, by using the synchronization tags so that synchronized input/output can be performed, a multimedia broadcasting service user can conveniently access corresponding information by receiving both screen output and speech output for a Web data service and, if necessary, issuing a command by speech even in a mobile environment.
In other words, the present invention has the advantage that it can ensure backward compatibility with lower-ranked services by individually authoring and transmitting the data to provide an integrated synchronization service, instead of integrating the markup languages and transmitting them as a single kind of data, which is the generally used approach. To synchronize the two Web documents, the technology of the present invention adds synchronization-related elements to a host markup language and thereby maintains the conventional service framework. Thus, users can receive a conventional broadcasting Web site and, at the same time, access the Web by speech, listen to information, and control the Web by speech.
Description of Drawings
The above and other objects and features of the present invention will become apparent from the following description of the preferred embodiments given in conjunction with the accompanying drawings, in which:
Fig. 1 is an exemplary view illustrating how broadcasting Web site documents are authored to be synchronized and capable of speech input/output in accordance with an embodiment of the present invention;
Fig. 2 is a view describing broadcasting Web site documents capable of speech input/output and a data transmitting method in accordance with an embodiment of the present invention; Fig. 3 is an exemplary view showing broadcasting Web site documents capable of speech input/output when a synchronization document is separately provided in accordance with an embodiment of the present invention;
Fig. 4 is a block view describing a Digital Multimedia Broadcasting (DMB) system which is configured based on Digital Audio Broadcasting (DAB) and provides a broadcasting Web sites (BWS) service capable of simultaneous speech input/output; and
Fig. 5 is a block view illustrating an integrated Web browser of Fig. 4.
Best Mode for the Invention
To have a look at conventional technologies, broadcasting Web sites (BWS), defined to provide a Web service in a multimedia broadcasting service such as Digital Audio Broadcasting (DAB) and Digital Multimedia Broadcasting (DMB), is a specification for providing users with a Web site by authoring a Web content based on Hyper Text Markup Language (HTML) and transmitting the Web content through a multimedia broadcasting network. The Web language that becomes the basis for providing the service includes a basic profile, which adopts HTML 3.2 as its Web specification in consideration of a terminal with a relatively low specification, and a non-restrictive profile, which has no restriction, in consideration of a high-specification terminal such as a personal computer (PC). Since the profiles are based on the HTML, which is a Web representation language, a Web browser is required to provide a terminal with a BWS service. The browser
may be called a BWS browser, and it provides a Web service by receiving and decoding Web contents of txt, html, jpg, and png formats transmitted as objects through a multimedia object transfer (MOT). Generally, the output is provided in the visual form. That is, texts or still images are displayed on a screen with a hyperlink function, and they transit into the other contents transmitted together through the MOT to thereby provide a visual-based local Web service. Of course, when the specification includes a function of recovering a speech file or other multimedia files, it is possible to provide the output not only on the screen but also by speech. However, with the current specification, which is the basic profile, it is possible to provide only a local Web service capable of visual input/output based on the Graphical User Interface (GUI).
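For illustration, a BWS page under the basic profile is plain HTML 3.2 of the following kind; this is a sketch only, and the file names, link target, and text are illustrative assumptions rather than values from the specification:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"
    "http://www.w3.org/TR/HTML32.dtd">
<HTML>
  <HEAD><TITLE>Local news</TITLE></HEAD>
  <BODY>
    <H1>Local news</H1>
    <P>Headline and breaking news for your area.</P>
    <!-- the hyperlink transits to another object carried in the same MOT carousel -->
    <A HREF="weather.html">Weather</A>
    <IMG SRC="map.png" ALT="area map">
  </BODY>
</HTML>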
Particularly, there are increasing demands for services that can additionally provide a data service while securing mobility in a mobile multimedia broadcasting such as the DAB and DMB, and that allow access to multi-modal information instead of single-modal information. The World Wide Web Consortium (W3C) has completed developing the VoiceXML and SALT specifications capable of providing speech-based Web services, and it is expected to additionally embark on standardization of a multi-modal Web specification.
VoiceXML is a Web language devised for an interactive speech response service of the Interactive Voice Response (IVR) type. When it is actually mounted on a terminal, it can provide a speech-enabled Web service. The technology defines a markup language that can transit to another application, document, or dialogue based on a dialogue obtained by modeling a conversation between a human being and a machine. Differently from the conventional visual-based Web service, the VoiceXML can provide a Web service that can input/output data by speech. Also, Web information can be delivered by speech by applying Text-To-Speech (TTS) technology, which transforms text data into speech data, and Automatic Speech Recognition (ASR) technology, which performs speech recognition, to an input/output module; user input data are received by speech to process a corresponding command or execute a corresponding application. The VoiceXML is effective in a mobile environment: users can listen to a Web service without any visual output on the screen and navigate by inputting speech at the desired information. However, there is a limitation in delivering Web information by speech only; when speech input/output is made together with visual data on the screen, it is more convenient and diverse additional data services can be provided. For this, the present invention provides a transmission and synchronization method for providing a multi-modal Web service by integrating the conventional BWS Web specification, i.e., HTML, with a speech Web language, i.e., VoiceXML. Hereinafter, the transmission and synchronization method will be described.
The basic principle of the present invention is to generate a speech Web document including synchronization information related to a visual Web document and transmit the visual Web document and the speech Web document through another sub-channel or another directory of the same sub-channel.
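As a sketch of the two transport options just described (the sub-channel labels, directory names, and file names below are illustrative assumptions, not values mandated by the invention):

Option 1 - separate sub-channels:
    sub-channel A (conventional BWS):  index.html, hotnews.html, map.png
    sub-channel B (speech module):     index.vxml, main.esync

Option 2 - one sub-channel, separate directories:
    /bws/    index.html, hotnews.html, map.png
    /voice/  index.vxml, main.esync

In either layout, a legacy BWS terminal decodes only the conventional part and ignores the speech module, which is how backward compatibility is preserved.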
Fig. 1 is an exemplary view illustrating how broadcasting Web site documents are authored to be synchronized and capable of speech input/output in accordance with an embodiment of the present invention.
As shown in Fig. 1, a visual Web document and a speech Web document are separately created in the embodiment of the present invention. The visual Web document is an HTML or an xHTML content defined in the BWS, whereas the speech Web document is a document integrating elements or tags in charge of synchronization between the VoiceXML and the visual Web documents, a speech recognition module, and a component-related module such as a speech combiner and a receiver. Fig. 2 is a view describing broadcasting Web site documents capable of speech input/output and a data transmitting method in accordance with an embodiment of the present invention.
Referring to Fig. 2, the visual Web document and the speech Web document are transmitted and signaled through different sub-channels or, when they share the same sub-channel, they are transmitted using different directories. This is to make a terminal capable of receiving an existing BWS service still receive the conventional service even when the BWS is combined with a speech Web document. The signaling for the speech BWS is additionally processed in the speech Web document, i.e., the speech module.
The synchronization between the visual Web document and the speech Web document is processed by using the synchronization tags <esync>, <isync> and <fsync>. The synchronization tags are described in the speech Web document without exception. Also, the synchronization tags are identified by the following namespace.
<va version="1.0" xmlns:va="http://www.worlddab.org/schemas/va"
    xsi:schemaLocation="http://www.worlddab.org/schemas/va va.xsd">
Also, an HTML forming a BWS document uses the following namespace.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"
    "http://www.w3.org/TR/HTML32.dtd">
A VoiceXML forming a speech Web document has the following namespace.
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/vxml
        http://www.w3.org/TR/voicexml20/vxml.xsd">
To take an example, the entire namespace including the visual Web document and the speech Web document may be designated as follows:

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:va="http://www.worlddab.org/schemas/va"
    xsi:schemaLocation="http://www.w3.org/2001/vxml
        http://www.w3.org/TR/voicexml20/vxml.xsd
        http://www.worlddab.org/schemas/va
        http://www.worlddab.org/schemas/va/va.xsd">
The synchronization tags for processing synchronization between the visual Web document, i.e., the HTML, and the speech Web document, i.e., the VoiceXML, should describe synchronization between an application, documents, and forms within the document. The synchronization tags used for this purpose are <esync>, <isync> and <fsync>. The tags <esync> and <isync> are in charge of synchronization between an application and documents, whereas the <fsync> tag is in charge of synchronization between forms. Herein, the synchronization between the application and the documents signifies that they should be simultaneously loaded, interpreted and rendered in the initial period when the application starts. The synchronization between the forms signifies that user input data are simultaneously inputted to a counterpart form.
The <esync> tag is used to describe synchronization information between applications or between documents, when synchronization related information, i.e., <esync>, and related attributes do not exist in the speech Web document but exist in an independent external document, e.g., a document with an extension name of '.sync'. The <esync> tag supports the synchronization function based on the attributes shown in the following Table 1.
Table 1

Attribute: vadoc
Function: This designates a URI for the speech part document to be synchronized. It also specifies a form of the VoiceXML, e.g., "./ensemble.vxml#sndform".

Attribute: bwsdoc
Function: This is a URI for the BWS part document to be synchronized.
The external document synchronization using the <esync> tag requires metadata which provide the speech Web document with information on the external synchronization document. For the metadata, a <metasync> tag is used, having the attributes defined as shown in Table 2.
Table 2
The <metasync> tag should be positioned in the speech Web document and it provides metadata to the <esync> tags stored in the external document. The entire operation mechanism is as shown in Fig. 3. That is, the synchronization document and the related <esync> tags are interpreted through the <metasync> tag described in the speech Web document and then the related BWS document is simultaneously loaded and rendered.
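As an illustrative sketch of this mechanism (only <metasync>, <esync>, and the vadoc/bwsdoc attributes are taken from this description; the root element of the '.sync' file and the file names are assumptions):

<!-- in the speech Web document (VoiceXML): reference to the external synchronization document -->
<va:metasync doc="main.esync" syncid="#service_main" />

<!-- main.esync: the external synchronization document (hypothetical layout) -->
<va:sync xmlns:va="http://www.worlddab.org/schemas/va">
    <va:esync id="service_main"
              vadoc="./service_main.vxml#service_main_intro"
              bwsdoc="./service_main.html" />
</va:sync>

When the speech Web document is loaded, the browser resolves the <metasync> reference, interprets the matching <esync> entry, and loads and renders the designated VoiceXML form and BWS page together.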
The <isync> tag indicates a synchronization method described inside a document. Differently from the <esync> tag, it is not authored in a separate document but is formed by directly describing the related synchronization tags within a predetermined form. Herein, the form includes a <form> tag and a <menu> tag of the VoiceXML in a speech Web document. This is to support the synchronization occurring when a predetermined form of the speech Web document should be synchronized with a BWS Web document and when a predetermined document needs to be transited.
According to an example, when there are a plurality of forms in one speech Web document and each form requires BWS documents having multiple pages and a synchronized operation with the BWS documents, this can be resolved by describing a related <isync> tag in each form. Actually, the <isync> tag may be described in the <link> or <goto> tags of the VoiceXML to secure a synchronized transit.
Therefore, when a specific speech Web document is loaded and no specific form is designated, synchronization is processed with reference to the synchronization document providing the initial <esync>. When a related <isync> tag is designated for a specific form, the <isync> tag has priority to be synchronized with the BWS Web document. For the synchronization, another <isync> tag is used, as shown in the following Table. When the <isync> tag is defined in the <link> or <goto> tag, the BWS Web document should be transited according to the definition of the <isync> tag. When the form has a designated synchronization, there should be only one <isync> related to this form. However, when synchronization is specified in a form, there may be a plurality of transit tags. In other words, when synchronization is described with the <goto> or <link> tag, there may be a plurality of <isync> tags. The <isync> tags should necessarily be described in the <goto> or <link>, and their synchronization information only affects the transit process. For this, a 'type' attribute is supported.

The attributes of the <isync> tag for realizing synchronization are as shown in the following Table 3.
Table 3
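As a non-normative recap of the <isync> attributes, the combinations of id, type, and next that appear in this document's own examples are as follows:

<!-- inside a <form>: form-scoped synchronization designating the BWS page for this form -->
<va:isync id="hotnews_intro_sync" type="form" next="hotnews_intro.html" />

<!-- inside <goto> or <link>: transit-scoped synchronization -->
<goto next="#initial_dialog_for_news">
    <va:isync type="transit" next="../../hotnews.html" />
</goto>

<!-- document-scoped synchronization -->
<va:isync id="id_gas" type="document" next="lbs_service.html" />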
The following example shows the synchronization of an application document authored by using the tags <esync> and <isync>.
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:va="http://www.worlddab.org/schemas/va"
    xsi:schemaLocation="http://www.w3.org/2001/vxml
        http://www.w3.org/TR/voicexml20/vxml.xsd
        http://www.worlddab.org/schemas/va
        http://www.worlddab.org/schemas/va/va.xsd">

  <va:metasync doc="main.esync" syncid="#service_main" />

  <form id="service_main_intro">
    <block bargein="false">
      In this data service, you can get various information dedicated to life in general
      such as local hot news, local transportation, shopping and so on.
      Are you ready for surfing this service?
    </block>
  </form>

  <form id="hotnews_intro">
    <va:isync id="hotnews_intro_sync" type="form" next="hotnews_intro.html" />
    <field name="move_to_news_page">
      <grammar mode="voice" version="1.0" root="command">
        <rule id="command" scope="dialog">
          <one-of>
            <item> news </item>
            <item> local news </item>
            <item> hot news </item>
            <item> headline </item>
          </one-of>
        </rule>
      </grammar>
      <prompt>
        In this service, headline news and breaking news are provided. Do you want to move?
      </prompt>
      <filled>
        <if cond="move_to_news_page == 'news'">
          <goto next="#initial_dialog_for_news">
            <va:isync type="transit" next="../../hotnews.html" />
          </goto>
        </if>
      </filled>
      <catch event="noinput">
        <goto next="#game_intro" />
      </catch>
    </field>
  </form>

  <form id="game_intro">
    <va:isync id="game_intro_sync" type="form" next="game_intro.html" />
    <field name="move_to_game_page">
      <grammar mode="voice" version="1.0" root="command">
        <rule id="command" scope="dialog"> game </rule>
      </grammar>
      <prompt>
        In this service, voice quiz and multimodal games are provided. Do you want to play?
      </prompt>
      <filled>
        <if cond="move_to_game_page == 'game'">
          <goto next="#dialog_for_games">
            <va:isync type="transit" next="../../games.html" />
          </goto>
        </if>
      </filled>
    </field>
  </form>
</vxml>
When the document is executed in the above example, the "service_main_intro" dialogue is outputted by speech and, at the same time, a corresponding html page which affects the entire document, for example, a main page of the entire service, is synchronized based on the <metasync> tag and rendered onto a screen. Herein, since barge-in is not permitted, user input data are not processed and the screen is maintained until the next dialogue is executed. When the "hotnews_intro" dialogue is executed, a BWS Web document corresponding to the "hotnews_intro" is automatically synchronized based on the "hotnews_intro_sync" <isync>, and eventually loaded and rendered.
The <fsync> tag is needed for the inter-form synchronization between the speech Web document and the BWS Web document, and it processes the user input data. The concept of the <fsync> tag is similar to document synchronization. It signifies that, when user input data are processed through speech recognition in the speech Web document, the processed user input data are transferred to and reflected in an <input> tag of the BWS. Conversely, when data are inputted by a user on the BWS, the content of the user input data is reflected in a <field> tag of the speech Web document. For this, an input function should be provided to both modules, and a synchronization mechanism should accompany it. The <fsync> tag is a sort of executable content. It may be positioned within the <form> of the VoiceXML or it may exist independently. By default, the scope of the <fsync> tag is limited to a document. When the <fsync> is to have a global scope over all documents, it should be specified in a root document. Then, it may be activated in all documents.
When the 'field' attribute is not specified, the <field> of the corresponding form is the object to be synchronized. In this case, the form should have only one <field> tag. If there are a plurality of <field> tags in one form, the 'field' attribute must be specified. Also, each field should have a unique name.
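A small sketch of this resolution rule is given below, assuming a hypothetical parsed-form representation; the function and parameter names are illustrative only.

```python
# Resolve which <field> an <fsync> tag targets, per the rule above.

def resolve_sync_field(form_fields, fsync_field_attr=None):
    """Pick the <field> an <fsync> tag refers to."""
    if fsync_field_attr is not None:
        return fsync_field_attr            # an explicit 'field' attribute wins
    if len(form_fields) != 1:
        # With several fields in the form, the attribute must be specified.
        raise ValueError("ambiguous <fsync>: form has multiple <field> tags")
    return form_fields[0]                  # the single field is the target

assert resolve_sync_field(["your_location"]) == "your_location"
```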
The <fsync> tag has the attributes shown in the following Table 4, and it is in charge of synchronization between forms.
Table 4
Also, the following conditions should be satisfied to achieve the form synchronization.
Speech data input from the user should be updated in the <field> tag of VoiceXML and <input> tag of the BWS HTML.
Visual data inputted from the user, such as data input through a keyboard or a pen, should be updated in the <input> tag of HTML and the <field> tag of VoiceXML simultaneously.
Visual data inputted from the user should satisfy a guard condition of the <field> tag of VoiceXML.
The <field> tag of VoiceXML should be matched one-to-one with the <input> tag of HTML at the moment when the inputted data are about to be reflected.
The form synchronization should be carried out in parallel with the document synchronization. That is, the <field> or <input> tag to be synchronized may be validly updated only in a document which is already synchronized. In short, the tags of the two modules which should receive the inputted data should both exist in the synchronized document. When they are described in an external document, only the 'root' document is allowed for general synchronization. The data should be mutually inputted only into a form of an activated speech Web document. In other words, when there are a plurality of <input> tags in one BWS document and data are inputted into an <input> tag which is not linked with the <form> tag activated in the speech Web document, the update into the <form> currently activated in the speech Web document is prohibited. For example, when a form corresponding to a speech output dictating the input of a card number is executed and, instead, a valid-date field is requested to be filled into an <input> of the BWS Web document, this mixed-initiative form operation is prohibited. An example of the form synchronization is as follows.
<!-- Voice Part -->
<form id="gasstation_sync">
  <va:isync id="id_gas" type="document" next="lbs_service.html" />
  <field name="your_location">
    <grammar src="yourlocation.grxml" type="application/srgs+xml" />
    <prompt>
      If you say your city name, you can get your gas station information.
    </prompt>
    <catch event="nomatch help">
      <prompt>
        This service is only available in Seoul, Daejeon, Kwangju, Taegu, Busan, Incheon.
        So please say one of the above cities.
      </prompt>
    </catch>
  </field>
</form>

<!-- BWS Part "dab://lbs_service.html" -->
<head>
  <title> Location based service </title>
</head>
<body>
  <h1> Local gas station information </h1>
  <p> If you type the name of your city, you can get some gas station information </p>
  <form id="local_gas_station" method="post" action="cgi/hotel.pl">
    <input name="your_location" type="text" />
    <input type="submit" value="my_location" />
  </form>
</body>
When the <fsync> tag is specified as shown in the above example, values inputted through the speech module are simultaneously reflected in the <input> tags of the BWS Web document.

Fig. 4 is a block view describing a Digital Multimedia Broadcasting (DMB) system providing a broadcasting Web sites (BWS) service capable of simultaneous speech input/output.
Referring to Fig. 4, the DMB system for providing a speech-based BWS service capable of simultaneous speech input/output can be divided into a DMB transmission part and a DMB reception part based on a DAB system. The DMB transmission part includes a content data generator 110, a multimedia object transfer (MOT) server 120, and a DMB transmitting system 130. The content data generator 110 generates speech contents (speech Web documents) and BWS contents (visual Web documents). The MOT server 120 transforms the directory and file objects of the speech contents and BWS contents into the MOT protocol before they are transmitted.
The DMB transmitting system 130 multiplexes the respective MOT data of the transformed MOT protocol, which include both speech Web documents and visual Web documents, into different directories of the same sub-channel or into different sub-channels, and broadcasts them through a DMB broadcasting network. The present invention, however, is not limited thereto; a speech Web document and a visual Web document may be generated in an external device and transmitted from the external device.
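For illustration, a minimal sketch of this packaging step is given below, assuming hypothetical helper types (MotDirectory, package_service); no actual MOT encoder API is implied.

```python
# Illustrative sketch: speech and visual Web documents are kept in separate
# directories so the receiver can identify them independently, as described
# above. All names here are assumptions made for this example.

from dataclasses import dataclass, field

@dataclass
class MotDirectory:
    name: str                        # e.g. "speech/" or "bws/"
    files: dict = field(default_factory=dict)

    def add(self, path, content):
        self.files[path] = content

def package_service(speech_docs, visual_docs):
    speech_dir = MotDirectory("speech/")
    visual_dir = MotDirectory("bws/")
    for path, doc in speech_docs.items():
        speech_dir.add(path, doc)
    for path, doc in visual_docs.items():
        visual_dir.add(path, doc)
    # Each directory would then be mapped to its own directory of a
    # sub-channel, or to its own sub-channel, before broadcast.
    return [speech_dir, visual_dir]

dirs = package_service({"main.vxml": "<vxml ...>"}, {"main.html": "<html ...>"})
```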
The DMB broadcasting reception part, i.e., the DMB receiving block 200, includes a DMB baseband receiver 210, an MOT decoder 220, and a DMB integrated Web browser 230. The DMB baseband receiver 210 receives DMB broadcasting signals from the DMB broadcasting network based on the DAB system, performs decoding for the corresponding sub-channels, and outputs data of the respective sub-channels. The MOT decoder 220 decodes packets transmitted from the DMB baseband receiver 210 and restores MOT objects. The DMB integrated Web browser 230 executes the restored MOT objects, which include directories and files, independently or based on a corresponding synchronization method. In the present invention, the restored objects include visual Web documents and speech Web documents related to the visual Web documents, and the DMB integrated Web browser 230 analyzes the aforementioned synchronization tags when a synchronization event is generated and executes the synchronization function based on the synchronization tags. Fig. 5 is a block view illustrating the integrated Web browser of Fig. 4.
Referring to Fig. 5, the integrated Web browser 230 includes a speech Web browser 233, a BWS browser 235, and a synchronization management module 231. The speech Web browser 233 drives a speech markup language extended from VoiceXML. The BWS browser 235 drives Web pages based on HTML. The synchronization management module 231 manages synchronization between the speech Web browser 233 and the BWS browser 235. The speech Web browser 233 sequentially drives Web pages authored in the VoiceXML-based speech markup language, outputs speech to the user, and processes user input data through a speech device. The BWS browser 235 drives the Web pages authored in the HTML language defined in the DAB/DMB specifications and displays input/output on a screen, just as commercial browsers do. The synchronization management module 231 receives synchronization events generated in the speech Web browser 233 and the BWS browser 235 and synchronizes corresponding pages, and the forms of each page, based on the pre-defined synchronization protocol (synchronization tags).
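The dispatch role of the synchronization management module can be sketched as follows; the event format and browser interfaces are assumptions made for this example, while only the tag names (metasync, isync, fsync) come from the text above.

```python
# Illustrative sketch of synchronization-event dispatch in the
# synchronization management module. Interfaces are hypothetical.

class StubBwsBrowser:
    def render(self, page):
        print(f"rendering {page}")            # load and render the paired page

    def set_input(self, name, value):
        print(f"<input name={name}> := {value}")

class SyncManager:
    def __init__(self, bws_browser):
        self.bws = bws_browser

    def on_event(self, event):
        tag = event["tag"]
        if tag in ("metasync", "isync"):
            # Document-level synchronization: load the related BWS page.
            self.bws.render(event["next"])
        elif tag == "fsync":
            # Form-level synchronization: mirror the filled field value.
            self.bws.set_input(event["name"], event["value"])

manager = SyncManager(StubBwsBrowser())
manager.on_event({"tag": "isync", "next": "hotnews_intro.html"})
manager.on_event({"tag": "fsync", "name": "your_location", "value": "Seoul"})
```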
Examples of the DMB receiving block 200 include a personal digital assistant (PDA), a mobile communication terminal, and a set-top box for a vehicle that can receive and restore the DAB and DMB services.
As described above, the method of the present invention can be realized in a computer-readable recording medium, such as CD-ROM, RAM, ROM, floppy disks, hard disks, magneto-optical disks and the like. Since the process can be easily implemented by those skilled in the art to which the present invention pertains, further description will not be provided herein.
While the present invention has been described with respect to certain preferred embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the scope of the invention as defined in the following claims.
Claims
1. A method for synchronizing visual data with speech data to provide a broadcasting Web sites (BWS) service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, comprising the steps of: a) generating a visual Web document; b) generating a speech Web document including synchronization tags related to the visual Web document; and c) identifying the speech Web document and the visual Web document based on a sub-channel or a directory and transmitting the speech Web document and the visual Web document independently.
2. The method as recited in claim 1, wherein the synchronization tags include an external document synchronization tag <esync> that provides synchronization information with the visual Web document which should be simultaneously loaded and rendered along with the speech Web document when an application is rendered and driven.
3. The method as recited in claim 2, wherein the synchronization tags further include a metadata tag
<metasync> that provides metadata on the external document synchronization tags <esync>.
4. The method as recited in claim 3, wherein the external document synchronization tag <esync> has attribute information defined as the following table:
Attribute | Function |
---|---|
id | This is an identifier indicating an <esync> tag and referred to in a <metasync> tag. |
5. The method as recited in claim 3, wherein the metadata tag <metasync> has attribute information defined as the following table:
6. The method as recited in claim 1, wherein the synchronization tags include an internal document synchronization tag <isync> that provides synchronization information with the visual Web document related to a corresponding form among the forms of the speech Web document.
7. The method as recited in claim 6, wherein the internal document synchronization tag <isync> has attribute information defined as the following table:
8. The method as recited in claim 1, wherein the synchronization tags include an inter-form synchronization tag <fsync> that provides synchronization information between a form of the speech Web document and a form of the visual Web document to secure mutual input synchronized between the form of the speech Web document and the form of the visual Web document.
9. The method as recited in claim 8, wherein the inter-form synchronization tag <fsync> has attribute information defined as the following table:
10. An apparatus for synchronizing visual data with speech data to provide a broadcasting Web sites (BWS) service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, comprising: a) a content data generator for generating a visual Web document and a speech Web document including synchronization tags related to the visual Web document; b) a multimedia object transfer (MOT) server for transforming both the generated visual Web document and the speech Web document into an MOT protocol; and c) a transmitting system for identifying the speech Web document and the visual Web document of the MOT protocol based on a sub-channel or a directory and transmitting the speech Web document and the visual Web document independently.
11. The apparatus as recited in claim 10, wherein the synchronization tags include an external document synchronization tag <esync> that provides synchronization information with the visual Web document which should be simultaneously loaded and rendered along with the speech Web document when an application is authored and driven.
12. The apparatus as recited in claim 10, wherein the synchronization tags further include a metadata tag <metasync> that provides metadata on the external document synchronization tags <esync>.
13. The apparatus as recited in claim 10, wherein the synchronization tags include an internal document synchronization tag <isync> that provides synchronization information with the visual Web document related to a corresponding form among the forms of the speech Web document.
14. The apparatus as recited in claim 10, wherein the synchronization tags include an inter-form synchronization tag <fsync> that provides synchronization information between a form of the speech Web document and a form of the visual Web document to secure mutual input synchronized between the form of the speech Web document and the form of the visual Web document.
15. A method for synchronizing visual data with speech data to provide a broadcasting Web sites (BWS) service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, comprising the steps of: a) receiving and loading a visual Web document and a speech Web document including synchronization tags related to the visual Web document, the visual Web document and the speech Web document being identified based on a sub-channel or a directory and transmitted independently; and b) analyzing the synchronization tags when a synchronization event occurs and performing a corresponding synchronization operation.
16. The method as recited in claim 15, wherein a synchronization tag <metasync> that provides metadata on an external document synchronization tag <esync>, which provides the synchronization information with the visual Web document, is analyzed and the related visual Web document is loaded and rendered in the step b).
17. The method as recited in claim 16, wherein when an internal document synchronization tag <isync> that provides synchronization information with the visual Web document related to a corresponding form among the forms of the speech Web document during execution of the speech input/output is designated, the internal document synchronization tag <isync> is analyzed and the related visual Web document is loaded and rendered.
18. The method as recited in claim 15, wherein when user data are inputted while the application on the speech input/output is driven, synchronization is performed by using an inter-form synchronization tag <fsync> that provides synchronization information between a form of the speech Web document and a form of the visual Web document.
19. An apparatus for synchronizing visual data with speech data to provide a broadcasting Web sites (BWS) service capable of simultaneously inputting/outputting speech data in a multimedia broadcasting service, comprising: a) a baseband receiver for receiving broadcasting signals through a multimedia broadcasting network and performing channel decoding; b) a multimedia object transfer (MOT) decoder for decoding channel-decoded packets and restoring a visual Web document and a speech Web document including synchronization tags related to the visual Web document; and c) an integrated Web browser for analyzing the synchronization tag when a synchronization event occurs and executing a corresponding synchronization operation.
20. The apparatus as recited in claim 19, wherein the integrated Web browser analyzes a metadata tag
<metasync> that provides metadata on an external document synchronization tag <esync> providing synchronization information with the visual Web document, and loads and renders the related visual Web document.
21. The apparatus as recited in claim 20, wherein when an internal document synchronization tag <isync> that provides synchronization information with the visual Web document related to a corresponding form among the forms of the speech Web document during execution of the speech Web document is designated, the integrated Web browser analyzes the internal document synchronization tag <isync>, and loads and renders the related visual Web document.
22. The apparatus as recited in claim 19, wherein when user data are inputted while the application on the speech input/output is driven, the integrated Web browser performs synchronization by using an inter-form synchronization tag <fsync> that provides synchronization information between a form of the speech Web document and a form of the visual Web document.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06823659A EP1952629A4 (en) | 2005-11-21 | 2006-11-21 | Method and apparatus for synchronizing visual and voice data in dab/dmb service system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2005-0111238 | 2005-11-21 | ||
KR20050111238 | 2005-11-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2007058517A1 true WO2007058517A1 (en) | 2007-05-24 |
Family
ID=38048864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2006/004901 WO2007058517A1 (en) | 2005-11-21 | 2006-11-21 | Method and apparatus for synchronizing visual and voice data in dab/dmb service system |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1952629A4 (en) |
KR (1) | KR100862611B1 (en) |
WO (1) | WO2007058517A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100902732B1 (en) * | 2007-11-30 | 2009-06-15 | 주식회사 케이티 | Proxy, Terminal, Method for processing the Document Object Model Events for modalities |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11215490A (en) * | 1998-01-27 | 1999-08-06 | Sony Corp | Satellite broadcast receiver and its method |
US6357042B2 (en) * | 1998-09-16 | 2002-03-12 | Anand Srinivasan | Method and apparatus for multiplexing separately-authored metadata for insertion into a video data stream |
US7028306B2 (en) * | 2000-12-04 | 2006-04-11 | International Business Machines Corporation | Systems and methods for implementing modular DOM (Document Object Model)-based multi-modal browsers |
EP1483654B1 (en) * | 2002-02-07 | 2008-03-26 | Sap Ag | Multi-modal synchronization |
WO2003071422A1 (en) * | 2002-02-18 | 2003-08-28 | Kirusa, Inc. | A technique for synchronizing visual and voice browsers to enable multi-modal browsing |
US20030187944A1 (en) * | 2002-02-27 | 2003-10-02 | Greg Johnson | System and method for concurrent multimodal communication using concurrent multimodal tags |
US20040128342A1 (en) * | 2002-12-31 | 2004-07-01 | International Business Machines Corporation | System and method for providing multi-modal interactive streaming media applications |
KR20040063373A (en) * | 2003-01-07 | 2004-07-14 | 예상후 | Method of Implementing Web Page Using VoiceXML and Its Voice Web Browser |
KR100561228B1 (en) * | 2003-12-23 | 2006-03-15 | 한국전자통신연구원 | Method for VoiceXML to XHTML+Voice Conversion and Multimodal Service System using the same |
KR100862611B1 (en) * | 2005-11-21 | 2008-10-09 | 한국전자통신연구원 | Method and Apparatus for synchronizing visual and voice data in DAB/DMB service system |
- 2006-11-20 KR KR1020060114402A patent/KR100862611B1/en not_active IP Right Cessation
- 2006-11-21 EP EP06823659A patent/EP1952629A4/en not_active Ceased
- 2006-11-21 WO PCT/KR2006/004901 patent/WO2007058517A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001008173A (en) * | 1999-06-21 | 2001-01-12 | Mitsubishi Electric Corp | Data transmission equipment |
JP2001267945A (en) * | 2000-03-21 | 2001-09-28 | Clarion Co Ltd | Multiplex broadcast receiver |
KR20050103105A (en) * | 2004-04-24 | 2005-10-27 | 한국전자통신연구원 | Apparatus and method for processing multimodal web-based data broadcasting, and system and method for receiving multimadal web-based data broadcasting |
Non-Patent Citations (1)
Title |
---|
See also references of EP1952629A4 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100862611B1 (en) * | 2005-11-21 | 2008-10-09 | 한국전자통신연구원 | Method and Apparatus for synchronizing visual and voice data in DAB/DMB service system |
CN111125065A (en) * | 2019-12-24 | 2020-05-08 | 阳光人寿保险股份有限公司 | Visual data synchronization method, system, terminal and computer readable storage medium |
CN111125065B (en) * | 2019-12-24 | 2023-09-12 | 阳光人寿保险股份有限公司 | Visual data synchronization method, system, terminal and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
KR100862611B1 (en) | 2008-10-09 |
KR20070053627A (en) | 2007-05-25 |
EP1952629A1 (en) | 2008-08-06 |
EP1952629A4 (en) | 2011-11-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| WWE | Wipo information: entry into national phase | Ref document number: 2006823659; Country of ref document: EP |
| NENP | Non-entry into the national phase | Ref country code: DE |