KR101008753B1

KR101008753B1 - Multimedia data streaming system

Info

Publication number: KR101008753B1
Application number: KR1020100075125A
Authority: KR
Inventors: 임대용
Original assignee: 주식회사 씨앤드디큐브
Priority date: 2010-01-14
Filing date: 2010-08-04
Publication date: 2011-01-14

Abstract

PURPOSE: A multimedia data streaming system is provided that lets viewers watch HD (High Definition) quality broadcasts immediately, without buffering time, thereby contributing to domestic IPTV broadcast services. CONSTITUTION: An encoder composes the RTP header of an RTP (Real-time Transport Protocol) packet from a standard fixed header and a standard extension header. An instant-on header is placed in front of the standard extension header. The instant-on header includes an identifier, a size, and metadata. The size is the size of the metadata. The metadata is compressed data of the main digital data.

Description

Multimedia Data Streaming System

The present invention relates to a method of file compression, transmission, and playback for delivering large-volume multimedia content over IP (Internet Protocol). In this technical field, MPEG-4 (H.264) compression can digitally compress (encode) video, audio, and text so that HD-quality multimedia can be transmitted over IP. For the transport itself there is RTP (Real-time Transport Protocol), which performs better than TCP and UDP for this purpose; SDP (Session Description Protocol) for session setup and management; RTSP (Real Time Streaming Protocol) for implementing real-time streaming and VCR functions while keeping the data reliability of RTP at the TCP level; and RTCP (RTP Control Protocol) for congestion control, monitoring, and related control. These protocols were developed, standardized by the IETF (Internet Engineering Task Force), and adopted as international IPTV technical standards in 2009. This means that the RTP, SDP, RTSP, and RTCP specifications are the most recent existing technologies for achieving QoE (Quality of Experience) by streaming HD-quality multimedia over IP in real time, and for achieving QoS (Quality of Service) through VCR functions, monitoring functions, and real-time control functions.

Among the current international IPTV standards, the items relevant to the present invention are H.264 and the MPEG-4 series, adopted as multimedia compression codecs. The detailed specifications are ISO/IEC 14496-1 (MPEG-4 Part 1, Systems), 14496-2 (MPEG-4 Part 2, Visual), 14496-3 (MPEG-4 Part 3, Audio), 14496-10 (MPEG-4 Part 10, AVC/H.264), 14496-12 (MPEG-4 Part 12, ISO Media File Format), and 14496-14 (MPEG-4 Part 14, MPEG-4 File Format).

The Internet protocols adopted are RTP, SDP, RTSP, and RTCP. Their detailed specifications are as follows. RTP: IETF RFC 3550 / 3551 / 5506 / 2862 / 3016 / 3095 / 3243 / 3409 / 3759 / 4362 / 4815 / 5225 / 3558 / 4788 / 5188 / 3640 / 3711 / 3984 / 4867 / 4103 / 4351 / 4170 / 4348 / 4424 / 4352 / 4383 / 5219.

SDP: IETF RFC 3388 / 3407 / 3556 / 3605 / 4566 / 4568 / 4570 / 4572 / 4574 / 4583 / 4796 / 5159 / 5432 / 5547.

RTSP: IETF RFC 2326 / 4567.

RTCP: IETF RFC 3550 / 3611 / 3711 / 4571 / 4585 / 4586 / 4961 / 5093 / 5124 / 5506.

However, even with these technologies, when IPTV broadcasting is served over public networks (ADSL, VDSL), a user must wait through several seconds to tens of seconds of buffering and playback-preparation time between selecting a digital multimedia content item or channel and actually watching it. Unlike conventional TV, the user cannot watch immediately upon changing or selecting a live channel, which makes IPTV viewing inconvenient.

With the worldwide arrival of the IPTV era, the ITU-T, an international standardization body, has placed an IPTV international technical standardization group under its umbrella to ensure interoperability between countries and to improve service. It receives technical proposals in the various fields needed for IPTV broadcast service from countries, organizations, and individuals, and adopts the useful and superior ones as international IPTV technical standards. In 2009 it published international IPTV technical standards for the video (H.264) and audio (MPEG-4 AAC) codecs used to compress digital multimedia and for the protocols used for IP transport (RTP/SDP/RTSP/RTCP). This shows worldwide recognition that the IPTV industry will keep growing for a very long time, and the market is forecast at 65 trillion won per year.

Western Europe and North America have introduced products that satisfy the international IPTV technical standards and provide IPTV broadcast services over ADSL/VDSL, the commonly used public Internet network (also called the public network). In France, about 15 million subscribers were receiving the service as of November 2009.

In Korea, the pre-IPTV service Hana TV (now SKT's Broad & TV) launched in 2006, followed by KT's QOOK TV (쿡TV) and LGT's myLGtv (마이엘지TV). With the passage of the broadcasting-telecommunications convergence law (통방법) in 2008, a full-fledged IPTV era allowing live broadcasting began. However, the IPTV broadcast platforms of these three operators were built from products that fall short of the international IPTV technical standards, and they could not deliver live broadcasts. As a workaround, the operators are laying new premium networks, with an added core that provides management functions within the network itself, in order to offer full IPTV service. As of December 2009, a premium Internet network had been built in parts of Seoul, where IPTV service including live broadcasting is offered, while the provinces still receive only pre-IPTV service without it.

Accordingly, in an IPTV industry expected to grow continuously and explosively for more than ten years, Korea, regarded as an advanced IT nation, risks falling behind other countries, and this has become a matter of serious concern. Academia, industry, and national research institutes have raised the need for an "open IPTV broadcast service on the public Internet network"; the three terrestrial broadcasters are already considering offering such a service; and the Korea Radio Promotion Agency (한국전파진흥원), under the Korea Communications Commission, is examining in depth a plan to support industry by building an "open IPTV broadcast transmission center on the public Internet network" to foster the broadcasting-telecommunications platform and content industries. This is emerging as a very important issue that would also contribute to balanced national development, as an alternative that lets the roughly 100 domestic cable broadcasters whose survival is threatened by IPTV service grow together in the era of broadcasting-telecommunications convergence.

In this domestic and international market, only three companies worldwide, including the applicant, hold products that conform to the international IPTV technical standards in the fields of digital multimedia compression and IP transport. Even with conforming products, selecting content or switching channels requires a buffering wait and a playback wait of five seconds or more, so viewers cannot watch a broadcast at the moment of a channel change as they can with conventional analog TV.

Because of these problems, current IPTV broadcast services do not satisfy viewers. Typically, on an ordinary TV a channel can be watched the moment it is selected, whereas on current IPTV services a user cannot watch immediately upon selecting a channel or content.

This occurs because, when existing technology transmits large volumes of data over IP, a user request must pass through the following sequence before the streaming engine responds at the application level: network kernel response → network library response → network application response → library kernel response for starting the streaming engine → streaming library response → streaming application response. This chain introduces a response delay of one second (1,000 ms) or more.

Next, in the existing ISO/IEC file format approach, the transmission system payloads 13 MB to 30 MB of data before packetizing it into IP packets and sending it, which causes a transmission delay of 3 to 5 seconds.

Next, when transmitting large volumes of data, the existing HTTP approach carries 14,900 bytes (15 KB) per second and the RTP approach 43,690 bytes (43 KB) per second. Neither can deliver 192 KB (SD) to 1,152 KB (1080P HD) of data within one second, so the buffering wait needed to transfer complete video is 13 to 77 seconds for HTTP and 5 to 27 seconds for RTP.
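The buffering ranges quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming the waits are obtained by dividing the data size by the rounded per-second rate (15 KB and 43 KB) and rounding up to whole seconds; the function and constant names are illustrative and do not come from the patent.

```python
import math

# Rates and sizes as quoted in the text, in whole KB.
HTTP_KB_PER_SEC = 15    # 14,900 bytes/s, rounded as in the text
RTP_KB_PER_SEC = 43     # 43,690 bytes/s, rounded as in the text
SD_KB = 192             # SD-class data size
HD_1080P_KB = 1152      # 1080P HD-class data size


def buffering_seconds(size_kb: int, rate_kb_per_sec: int) -> int:
    """Whole seconds needed to move size_kb at rate_kb_per_sec (rounded up)."""
    return math.ceil(size_kb / rate_kb_per_sec)


http_range = (
    buffering_seconds(SD_KB, HTTP_KB_PER_SEC),
    buffering_seconds(HD_1080P_KB, HTTP_KB_PER_SEC),
)
rtp_range = (
    buffering_seconds(SD_KB, RTP_KB_PER_SEC),
    buffering_seconds(HD_1080P_KB, RTP_KB_PER_SEC),
)

print("HTTP buffering wait (s):", http_range)
print("RTP buffering wait (s):", rtp_range)
```

Under this reading, the lower bound of each range corresponds to SD content and the upper bound to 1080P HD content.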

Playback of multimedia files is also delayed in the existing approach: the player receives and decodes 13 MB to 30 MB of multimedia data, then loads the necessary resources before playing. The time to receive the data and the time to decode it are added on top, so multimedia playback typically requires tens of seconds of buffering and playback waiting time.

Accordingly, for a user of an IPTV broadcast service to watch immediately without a buffering wait, the transmission system must respond immediately, the time the transmission system takes to payload multimedia data must be shortened, the time the playback device (player, set-top box) takes to decode multimedia data must be shortened, and a protocol and method capable of transmitting large volumes of data must be provided. Providing these is an object of the present invention.

Meanwhile, in IPTV on-demand service, a user must search sequentially using fast playback to find a desired scene, and cannot search in real time (jog-shuttle) as with home DVD players already on the market. The user has to wait for as long as the search takes, and the required network bandwidth increases in proportion to the playback speed, so a personal Internet connection cannot escape its speed limit. An alternative to fast playback is to guess the position of the desired scene on the timeline and move the position bar there, but guessing accurately is difficult, and even an accurate guess incurs a wait of at least five seconds: the time for the client to request the video and the server to respond, the buffering wait, and the playback wait.

Because of these problems, current IPTV on-demand services cannot offer real-time forward and backward search (jog-shuttle), whether on public-network IPTV or on premium-network IPTV. Home DVD players already in wide use let the user search forward and backward in real time through a jog-shuttle function, but on current public-network and premium-network IPTV services a user cannot jog-shuttle through the video to find a desired scene.

This is because existing fast-playback technology either sends all of the video's data on the streaming transmission system's hard disk across the network, or locates the data the client requests in the hard disk's sectors and payloads it again for transmission. Both take time, to which are added the buffering time until the client receives the data and the wait before playback. Solving this so that real-time forward and backward search (jog-shuttle) is possible in IPTV on-demand service as well is another object of the present invention.

To make a response within 45 ms possible even on the public Internet, the present invention optimizes the application-layer transmission program into the kernel layer of a Linux operating system based on a monolithic kernel, so that the transmission system can respond immediately. To shorten the time the transmission system takes to payload a multimedia file, the file is formatted with an instant-on track when it is compressed, so that only 3,072 KB (3,145,728 bytes) of data is payloaded, within 50 ms. This shortens the response delay and the transmission delay of the transmission system, and a large volume of 1,152 KB is transmitted in 805 ms. In place of the existing approach, in which the multimedia player (the device for IPTV broadcasting) receives 13 MB to 30 MB of multimedia data and prepares to decode it, the player receives and decodes a 1 KB information file in SDP format, generated when the transmission system payloads the data, and can play within 50 ms. As a result, viewing starts within 1,000 ms of channel selection.
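One way to read these figures is as a sequential latency budget from channel selection to first picture. The sketch below simply adds up the four stage times quoted above and compares the total with the 1,000 ms target. Treating the stages as strictly sequential is an assumption made here for illustration; the patent only states the individual figures and the overall bound.

```python
# Stage budgets in milliseconds, taken from the paragraph above.
# Summing them as strictly sequential stages is an assumption.
STAGES_MS = {
    "server response": 45,
    "payloading (3,072 KB)": 50,
    "transmission (1,152 KB)": 805,
    "client playback preparation (1 KB SDP file)": 50,
}
TARGET_MS = 1000  # "within 1,000 ms" of channel selection


def total_latency_ms(stages: dict) -> int:
    """Sum of all stage budgets, in milliseconds."""
    return sum(stages.values())


total = total_latency_ms(STAGES_MS)
headroom = TARGET_MS - total

for name, ms in STAGES_MS.items():
    print(f"{name:45s} {ms:4d} ms")
print(f"{'total':45s} {total:4d} ms (headroom {headroom} ms)")
```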

To this end, the present invention provides a multimedia data streaming system comprising: an encoder that converts multimedia data into main digital data and generates RTP packets from the main digital data; a streaming server that receives the RTP packets from the encoder and has a hinted module that interprets the header information of the RTP packets; and a client that receives the RTP packets from the streaming server, restores them, and plays them. The encoder composes the RTP header of each RTP packet from a standard fixed header and a standard extension header, and an instant-on header consisting of an identifier (ID), a length (Length), and metadata is provided in front of the standard extension header. The length is the size of the metadata, and the metadata is data obtained by compressing the main digital data.

The metadata is preferably a compressed form of the main digital data in the field immediately following the name (Name) and length (Length) in the standard extension header.

The identifier (ID), length (Length), and metadata of the instant-on header are preferably composed as a single field.

The encoding system may generate the identifier (ID), which corresponds to the name (Name) of the standard extension header, with a size of 8 bits.

The first 4 bits of the identifier (ID) are preferably a track ID that is assigned uniformly according to the type of data: video, audio, or text.
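The instant-on header fields described so far (an 8-bit ID whose first 4 bits are a track ID, a length equal to the metadata size, and compressed metadata) can be sketched as a small pack/unpack pair. Several details below are assumptions made only for illustration and are not stated in the patent: the track ID values (1 = video, 2 = audio, 3 = text), the meaning of the low 4 bits, a 16-bit big-endian length field, and zlib as the compression method.

```python
import struct
import zlib

# Hypothetical track ID values; the text only says the first 4 bits
# identify the data type (video, audio, or text).
TRACK_IDS = {"video": 1, "audio": 2, "text": 3}


def make_id(kind: str, low_bits: int = 0) -> int:
    """Build the 8-bit ID: track ID in the high 4 bits, low 4 bits unspecified."""
    if not 0 <= low_bits <= 0x0F:
        raise ValueError("low_bits must fit in 4 bits")
    return (TRACK_IDS[kind] << 4) | low_bits


def pack_instant_on(kind: str, main_data: bytes, low_bits: int = 0) -> bytes:
    """Serialize ID (1 byte), length (2 bytes, assumed), and compressed metadata."""
    metadata = zlib.compress(main_data)  # compression method is an assumption
    if len(metadata) > 0xFFFF:
        raise ValueError("metadata too large for the assumed 16-bit length")
    return struct.pack("!BH", make_id(kind, low_bits), len(metadata)) + metadata


def unpack_instant_on(buf: bytes):
    """Return (track_id, low_bits, main_data) parsed from a packed header."""
    ident, length = struct.unpack_from("!BH", buf, 0)
    metadata = buf[3:3 + length]
    if len(metadata) != length:
        raise ValueError("truncated instant-on header")
    return ident >> 4, ident & 0x0F, zlib.decompress(metadata)


header = pack_instant_on("video", b"frame-bytes" * 10, low_bits=5)
print(unpack_instant_on(header)[:2])
```

Keeping ID, length, and metadata contiguous mirrors the statement that the three are composed as a single field, so a receiver can read the fixed-size prefix first and then exactly `length` bytes of metadata.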

The streaming server preferably generates an SDP file in which the track ID is specified, and transmits to a separate port for each track ID.
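As a rough illustration of an SDP description that names a track ID and a separate port for each track, the sketch below builds a minimal session description with one `m=` line per track. The port numbers, payload types, addresses, and the `a=control:trackID=N` convention are assumptions chosen for the example; the patent's own SDP file (FIG. 32) is not reproduced here and may differ.

```python
def build_sdp(session_name: str, tracks: list) -> str:
    """Build a minimal SDP text with one media section per track.

    Each track is a dict with the hypothetical keys
    'media', 'port', 'payload_type', and 'track_id'.
    """
    lines = [
        "v=0",
        "o=- 0 0 IN IP4 0.0.0.0",
        f"s={session_name}",
        "c=IN IP4 0.0.0.0",
        "t=0 0",
    ]
    for t in tracks:
        # One media line per track, each on its own port.
        lines.append(f"m={t['media']} {t['port']} RTP/AVP {t['payload_type']}")
        lines.append(f"a=control:trackID={t['track_id']}")
    return "\r\n".join(lines) + "\r\n"


sdp = build_sdp(
    "demo",
    [
        {"media": "video", "port": 5004, "payload_type": 96, "track_id": 1},
        {"media": "audio", "port": 5006, "payload_type": 97, "track_id": 2},
    ],
)
print(sdp)
```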

For an immediate response of the streaming system to a client's request, the streaming engine is preferably embedded in the network kernel of the Linux operating system.

The client preferably receives the SDP file, obtains from it information about the streaming media sent by the streaming system, completes its preparation for playback, and plays immediately.

The streaming server may store the addresses of the I-frames of the multimedia data held on its hard disk in the SDP file and transmit the file to the client.

The instant-on header is provided repeatedly between the standard extension headers. When the position bar is moved on the client, the streaming server preferably uses the I-frame addresses stored in the SDP file to payload, from the hard disk, the I-frame of the instant-on header corresponding to the position of the position bar, and transmits it to the client.
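The seek behavior described above amounts to a lookup from a position-bar time to a stored I-frame address. A minimal sketch follows, assuming the I-frame index is a list of (timestamp in seconds, byte offset) pairs sorted by timestamp and that the server picks the nearest I-frame at or before the requested position; the index layout and the function name `find_iframe` are hypothetical.

```python
from bisect import bisect_right


def find_iframe(index: list, position_sec: float) -> tuple:
    """Return the (timestamp_sec, byte_offset) entry at or before position_sec.

    index must be sorted by timestamp. Positions before the first I-frame
    fall back to the first entry.
    """
    if not index:
        raise ValueError("empty I-frame index")
    times = [t for t, _ in index]
    i = bisect_right(times, position_sec) - 1
    return index[max(i, 0)]


# Hypothetical index: an I-frame every 2 seconds with made-up offsets.
iframe_index = [(0.0, 0), (2.0, 40960), (4.0, 83000)]
print(find_iframe(iframe_index, 3.1))
```

Because the lookup is a binary search over a precomputed index, the server can jump straight to the needed I-frame on disk instead of scanning or resending the whole file.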

According to the present invention, viewers of IPTV broadcasts on the current public Internet can watch even HD-quality broadcasts as soon as they select content or a channel, without a buffering wait. This contributes to domestic IPTV broadcast service and provides a comfortable IPTV viewing environment even where a premium network cannot be built.

Meanwhile, IPTV viewers, having experienced analog TV and DVD players, expect IPTV to offer a jog-shuttle function for real-time forward and backward search like that of existing DVD players. No IPTV on-demand service in Korea or abroad currently supports it, which inconveniences users, and technical limits prevent the distinctive features of IPTV from being fully realized, constraining the range of services that can be offered. The present invention overcomes these technical limits, and thereby contributes to the growth of domestic IPTV broadcasting and enables exclusive export to the overseas IPTV broadcast platform market.

FIG. 1 is a block diagram showing a video encoder;
FIG. 2 is a configuration diagram showing an H.264 encoder;
FIG. 3 is a configuration diagram showing an H.264 decoder;
FIG. 4 is an explanatory diagram showing the H.264 Baseline, Main, and Extended profiles;
FIG. 5 is a conceptual diagram showing a 4:2:0 field configuration;
FIG. 6 is a flowchart of NAL units;
FIG. 7 is an explanatory diagram showing the RTSP operation process;
FIG. 8 is a configuration diagram showing the TCP/UDP/RTP header formats;
FIG. 9 is an explanatory diagram showing the SDP session description method;
FIG. 10 is a configuration diagram showing a digital multimedia real-time compression, streaming, and playback system;
FIG. 11 is an explanatory diagram showing a method of implementing a real-time streaming system;
FIG. 12 is a structural diagram showing a real-time streaming server;
FIG. 13 is a flowchart showing Start-Up/ShutDown of the RTSP server;
FIG. 14 is a flowchart showing the processing procedure of the RTSP server;
FIG. 15 is a flowchart showing the processing procedure of the RTSP processor;
FIG. 16 is an exemplary diagram showing the RTSP Filter role;
FIG. 17 is an exemplary diagram showing the RTSP Route role;
FIG. 18 is an exemplary diagram showing the RTSP Preprocessor role;
FIG. 19 is an exemplary diagram showing the RTSP Request role;
FIG. 20 is an exemplary diagram showing the RTSP Postprocessor role;
FIG. 21 is an exemplary diagram showing the RTSP Send Packet role;
FIG. 22 is an exemplary diagram showing the RTSP Processor role;
FIG. 23 is an exemplary diagram showing the SDP Preocessor role;
FIG. 24 is an explanatory diagram showing the RTP extension header;
FIG. 25 is an explanatory diagram showing RTP standard-format data;
FIG. 26 is an explanatory diagram showing name field values;
FIG. 27 is an explanatory diagram showing the instant-on extension header;
FIG. 28 is an explanatory diagram showing the instant-on extension header and the standard extension header;
FIG. 29 is an exemplary diagram showing the instant-on Hinted role;
FIG. 30 is a configuration diagram showing the configuration of a multimedia data streaming system according to the present invention;
FIG. 31 is an explanatory diagram showing an embodiment using the instant-on extension header according to the present invention;
FIG. 32 is an exemplary diagram showing an SDP file according to the present invention.

The terms used in the present invention were chosen, as far as possible, from general terms in wide current use, with their function in the invention in mind; they may vary with the intent or practice of those working in the field or with the emergence of new technology. In certain cases the applicant has chosen a term arbitrarily, in which case its meaning is described in detail in the relevant part of the description. The terms used here should therefore be understood from the meaning they carry and from the overall content of the invention, not merely from their names.

Among the terms used in the present invention, IP (Internet Protocol) refers to the protocol used to transmit data over a network. Typically FTP (File Transfer Protocol) is used to download files and HTTP (Hyper Text Transfer Protocol) is used to browse websites.

Among the terms used in the present invention, digital multimedia content differs from analog multimedia, which is usually stored on film or similar media and transmitted over cable or over the air as differences in frequency. Digital multimedia content is usually stored on a hard disk or similar medium (CD, DVD, USB memory) and transmitted as data, and it includes films, animations, graphics, images, and the like in which audio, video, text, and graphics are combined.

Among the terms used in the present invention, an encoder is equipment that takes an analog signal as input, compresses it into digital multimedia, and stores it; encoding is the work of compressing analog multimedia into digital multimedia. Decoding is the work of playing back digital multimedia, for which players, set-top boxes, and the like are used. Transcoding is the work of recompressing digital multimedia with a different codec or converting it to a different format.

Among the terms used in the present invention, streaming is a way of transmitting data continuously, like flowing water, so that the viewer can begin watching in real time as soon as the first picture of the video has been received. A broadcaster is equipment that takes an analog signal as input, compresses it into digital multimedia in real time, and transmits it as IP packets. A streaming server is equipment that relays streaming data received from the broadcaster to connected users in real time, and that streams digital multimedia files received from the encoder in real time at a user's request.

Among the terms used in the present invention, IPTV is a form of broadcasting in which digital multimedia is transmitted over an IP network and viewed using a set-top box and a TV receiver.

In calculating the amount of data to be transmitted for immediate viewing in the present invention, SD (Standard Definition) has the same picture quality as analog TV, with a size of 740×480 or 640×480 pixels at 24 fps (frames per second) or 30 fps. When digital multimedia video is compressed with the H.264 video codec and the MPEG-4 AAC audio codec, guaranteeing SD picture quality requires 1.5 Mbps (192 KB/sec); at 30 fps, one frame corresponds to a data rate of 6,554 bytes per 33.33 ms.

D1 is DVD picture quality, with a size of 720×480 at 24 fps or 30 fps. When digital multimedia video is compressed with the H.264 video codec and the MPEG-4 AAC audio codec, guaranteeing D1 picture quality requires 2 Mbps (256 KB/sec); at 30 fps, one frame corresponds to a data rate of 8,738 bytes per 33.33 ms.

480P denotes ED-TV picture quality, with a size of 848×480 pixels at 24 fps or 30 fps. When digital multimedia video is compressed using the H.264 video codec and the MPEG-4 AAC audio codec, guaranteeing 480P picture quality requires 3 Mbps (384 KB/sec); at 30 fps, one frame corresponds to a data rate of 13,107 bytes per 33.33 ms.

720P denotes HD (High Definition) picture quality, with a size of 1280×720 pixels at 24 fps, 30 fps, or 60 fps. When digital multimedia video is compressed using the H.264 video codec and the MPEG-4 AAC audio codec, guaranteeing 720P picture quality requires 6 Mbps (768 KB/sec); at 30 fps, one frame corresponds to a data rate of 26,214 bytes per 33.33 ms.

1080P denotes HD picture quality, with a size of 1920×1080 pixels at 24 fps or 30 fps. When digital multimedia video is compressed using the H.264 video codec and the MPEG-4 AAC audio codec, guaranteeing 1080P picture quality requires 9 Mbps (1,152 KB/sec); at 30 fps, one frame corresponds to a data rate of 39,322 bytes per 33.33 ms.
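The per-frame figures quoted for each quality class can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes binary prefixes (1 Mbps = 1024 Kbps, 1 KB = 1024 bytes), which is the convention that matches the quoted numbers, and the function and table names are hypothetical.

```python
# Illustrative sketch: per-frame byte budget for each quality class.
# Assumption: 1 Mbps = 1024 Kbps and 1 KB = 1024 bytes (binary prefixes),
# because that convention reproduces the figures quoted in the text.

BITRATES_MBPS = {"SD": 1.5, "D1": 2, "480P": 3, "720P": 6, "1080P": 9}


def bytes_per_frame(mbps: float, fps: int = 30) -> int:
    """Return the average number of bytes available for one frame."""
    kilobytes_per_second = mbps * 1024 / 8      # e.g. 1.5 Mbps -> 192 KB/sec
    return round(kilobytes_per_second * 1024 / fps)


if __name__ == "__main__":
    for name, mbps in BITRATES_MBPS.items():
        print(f"{name}: {bytes_per_frame(mbps)} bytes per frame at 30 fps")
```

These are average budgets; an actual encoder produces frames of varying size (intra-coded frames are much larger than predicted ones), so the figures are best read as a sizing guide.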

Regarding the transport capacity of the present invention, RTP (Real-time Transport Protocol) data can be extended to a maximum of 1,466 bytes. The transport capacity of RTP is therefore 1,466 bytes per 1 ms, or 48,867 bytes per 33.33 ms, so a single picture of 1080P multimedia at 30 or 60 frames per second can be transmitted in real time.
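The capacity claim can be checked against the per-frame budget with a short calculation. This is a rough sketch under the text's own assumption of one 1,466-byte RTP payload per millisecond; the constant and function names are hypothetical, and real networks add header overhead, jitter, and loss that are ignored here.

```python
# Illustrative check of the RTP transport capacity quoted in the text.
# Assumption (from the text): one RTP payload of up to 1,466 bytes per millisecond.

RTP_MAX_PAYLOAD_BYTES = 1466


def capacity_per_frame(fps: float) -> float:
    """Bytes that can be carried during one frame interval."""
    frame_interval_ms = 1000 / fps
    return RTP_MAX_PAYLOAD_BYTES * frame_interval_ms


def fits_in_frame_interval(frame_bytes: int, fps: float) -> bool:
    """True if a frame of the given size can be sent within one frame interval."""
    return frame_bytes <= capacity_per_frame(fps)


if __name__ == "__main__":
    print(round(capacity_per_frame(30)))            # capacity at 30 fps
    print(fits_in_frame_interval(39322, 30))        # 1080P frame at 9 Mbps, 30 fps
    print(fits_in_frame_interval(19661, 60))        # same bit rate spread over 60 fps
```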

As general background on the codec (CODEC) used for multimedia compression in the present invention, compression is the process of representing data with a smaller number of bits. Raw, or uncompressed, digital video generally requires a very high bit rate (about 216 Mbits for one second of uncompressed TV-quality video), so compression is necessary for storing and transmitting digital video.

Compression requires a pair of complementary systems: a compressor (encoder) and a decompressor (decoder). The encoder converts the source data into a compressed form that uses fewer bits before it is stored or transmitted, and the decoder converts the compressed form back into video data. Such an encoder/decoder pair is called a codec.

Network bit rates continue to increase, high-speed network connections in the home have become commonplace, and the storage capacity of hard disks, flash memory, and optical media is very large. Even though the cost per bit transmitted or stored keeps falling, video compression is what makes it possible to use digital video in transmission and storage environments that cannot support uncompressed (raw) video. For example, current Internet throughput is not sufficient to handle uncompressed video in real time.

A DVD (Digital Versatile Disk) can store only a few seconds of raw video at TV-quality resolution and frame rate, so DVD video storage would be impractical without video and audio compression. Compression allows transmission and storage media to be used efficiently, and it remains a key element of multimedia even as storage and transmission capacities improve.

The purpose of a compression algorithm is to achieve efficient compression while minimizing the distortion introduced by the compression process. Compression is achieved by removing redundancy from the signal in the temporal, spatial, and frequency domains. For example, even if a low-pass filter is applied to the background region of a frame to remove high-frequency information, the image remains recognizable because the human eye and brain (the Human Visual System) are more sensitive to low frequencies.

A video codec encodes a source image or video sequence into a compressed form and decodes it to produce a copy of, or an approximation to, the source. If the decoded video is identical to the original, the coding method is lossless; if the decoded video differs from the original, the coding method is lossy.

The codec represents the source video using a model, meaning an efficient coded representation from which the video data can be reconstructed. Ideally, the model represents the sequence with as few bits as possible and with as much fidelity to the original as possible. These two goals (compression efficiency and picture quality) generally conflict, because a lower compressed bit rate yields lower picture quality at the decoder. As shown in FIG. 1, a video encoder consists of three main functional units: a temporal model, a spatial model, and an entropy encoder.

The input to the temporal model is the uncompressed video sequence. The temporal model removes temporal redundancy by exploiting the similarity between neighboring video frames, usually by constructing a prediction of the current video frame.

In MPEG-4 Visual and H.264, the prediction is formed from one or more previous or future frames and is improved by compensating for the differences between frames (motion compensation). The output of the temporal model is a residual frame together with a set of model parameters, typically motion vectors that describe how the motion was compensated.

The residual frame is the input to the spatial model, which removes spatial redundancy by exploiting the similarity between neighboring samples within the residual frame. In MPEG-4 Visual and H.264, this is done by applying a transform to the residual samples and quantizing the result.

The transform converts the samples into another domain, in which they are represented by transform coefficients. The coefficients are quantized to remove insignificant values, leaving a small number of significant coefficients that give a more compact representation of the residual frame. The output of the spatial model is a set of quantized transform coefficients.

The parameters of the temporal model (typically motion vectors) and of the spatial model (coefficients) are compressed by the entropy encoder. The entropy encoder removes statistical redundancy present in the data and produces a compressed bitstream or file. The compressed data consists of coded motion vector parameters, coded residual coefficients, and header information. The video decoder reconstructs video frames from the compressed bitstream.

The coefficients and motion vectors are decoded by the entropy decoder, and the spatial model is then decoded to reconstruct the residual frame. The decoder uses the motion vector parameters, together with one or more previously decoded frames, to create a prediction of the current frame. The current frame is reconstructed by adding the residual frame to this prediction.

The H.264 codec used in the present invention is defined by the syntax of its bitstream and by the method of decoding that bitstream. An encoder and decoder generally include the functional elements shown in FIGS. 2 and 3.

With the exception of the deblocking filter, most of the functional elements (prediction, transform, quantization, entropy coding) are also present in earlier standards (MPEG-1, MPEG-2, MPEG-4, H.261, H.263); the important changes in H.264 occur in the details of each functional block. The encoder of FIG. 2 includes two data flow paths: a forward path (left to right) and a reconstruction path (right to left). An input frame or field Fn is processed in units of macroblocks. Each macroblock is encoded in intra mode or inter mode, and for each block in the macroblock a prediction block PRED (marked 'P' in FIG. 2) is formed from reconstructed picture samples. In intra mode, PRED is formed from samples in the current slice that have previously been encoded, decoded, and reconstructed.

In inter mode, PRED is formed by motion-compensated prediction from one or two reference pictures selected from the List 0 and/or List 1 reference pictures. In the figure, the reference picture is shown as the previously encoded picture F'n-1, but the prediction reference for each macroblock partition (in inter mode) may be chosen from past or future pictures (in display order) that have already been encoded, reconstructed, and filtered.

The prediction block PRED is subtracted from the current block to produce a residual (difference) block Dn, which is transformed (using a block transform) and quantized to give X, as labeled in the figure. The entropy-encoded coefficients, together with the side information required to decode each block in the macroblock (prediction modes, quantization parameter, motion vector information, and so on), form the compressed bitstream, which is passed to the Network Abstraction Layer (NAL) for transmission or storage.

The data flow path of the decoder in FIG. 3 shows the similarity between the encoder and the decoder. As well as encoding and transmitting each block in a macroblock, the encoder (on its reconstruction path) decodes (reconstructs) the block to provide reference data for subsequent predictions. The coefficients X are inverse-quantized (Q⁻¹) and inverse-transformed (T⁻¹) to produce the residual block D'n.

The prediction block PRED is added to D'n to produce the reconstructed block uF'n (a decoded version of the original block, where u indicates that it is unfiltered). A filter is applied to reduce blocking distortion, and the reconstructed reference picture is created from the blocks F'n.

The decoder of FIG. 3 receives the compressed bitstream from the NAL and entropy-decodes the data elements to produce the quantized coefficients X. These are inverse-quantized and inverse-transformed to give D'n (identical to the D'n shown in the encoder). Using the header information decoded from the bitstream, the decoder creates a prediction block PRED identical to the original prediction PRED formed in the encoder. PRED is added to D'n to produce uF'n, which is filtered to produce each decoded block F'n.
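The reason the encoder carries its own reconstruction path can be illustrated with a toy example: both sides must predict from the same lossy reconstruction, not from the original samples. The sketch below is a simplification under stated assumptions, not the H.264 algorithm itself. The block transform and the deblocking filter are omitted, a plain uniform quantizer with a hypothetical step size `qstep` stands in for H.264 quantization, and all function names are illustrative.

```python
# Toy sketch of the forward path (Dn -> X) and the reconstruction path
# (X -> D'n -> uF'n = PRED + D'n). Assumptions: no transform, no deblocking
# filter, and a simple uniform quantizer in place of the H.264 quantizer.

def encode_block(current, pred, qstep):
    """Forward path: residual Dn = current - PRED, then quantize to levels X."""
    return [round((c - p) / qstep) for c, p in zip(current, pred)]


def reconstruct_block(levels, pred, qstep):
    """Reconstruction path: rescale X to D'n, add PRED, clip to 8-bit samples."""
    return [max(0, min(255, p + x * qstep)) for x, p in zip(levels, pred)]


if __name__ == "__main__":
    current = [99, 105, 97, 121]    # samples of the block being coded
    pred = [98, 100, 100, 110]      # prediction block PRED
    levels = encode_block(current, pred, qstep=4)
    # The encoder and the decoder both run this step on the same inputs,
    # so they obtain the same reference samples for later predictions.
    print(levels, reconstruct_block(levels, pred, qstep=4))
```

Because quantization is lossy, the reconstructed samples differ slightly from the originals; predicting from the reconstruction on both sides keeps that error from accumulating as drift.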

The H.264 structure used in the present invention defines three profiles, each supporting a particular set of coding functions and each specifying what is required of encoders and decoders that comply with the profile. The Baseline profile supports intra and inter coding using I (Intra) slices, which contain only I macroblocks, and P (Predicted) slices, which contain P macroblocks and/or I macroblocks, together with entropy coding using context-adaptive variable-length codes (CAVLC).

The Main profile supports interlaced video, inter coding using B (Bi-predictive) slices, which contain B macroblocks and/or I macroblocks, inter coding using weighted prediction, and entropy coding using context-based adaptive binary arithmetic coding (CABAC).

The Extended profile does not support interlaced video or CABAC, but it adds modes that enable efficient switching between coded bitstreams and improved error resilience (Data Partitioning). Switching is supported by SP (Switching P) slices, which contain P and/or I macroblocks and facilitate switching between coded bitstreams, and by SI (Switching I) slices, which contain a special type of intra-coded macroblock that likewise facilitates switching between coded bitstreams.

In terms of applications, the Baseline profile is suited to video telephony, video conferencing, and wireless communication; the Main profile to television broadcasting and video storage; and the Extended profile to streaming media.

FIG. 4 shows the relationship between the three profiles and the coding tools. The Baseline profile is a subset of the Extended profile but not of the Main profile. The performance limits of a codec are defined by levels, each of which places limits on parameters such as sample processing rate, picture size, coded bit rate, and memory requirements.

H.264 supports the encoding and decoding of 4:2:0 progressive or interlaced video. The default sampling format for 4:2:0 progressive frames is shown in Figure 8: the chroma (Cb and Cr) samples are located at every second luma sample horizontally and midway between two luma samples vertically. An interlaced frame consists of two fields (a top field and a bottom field) separated in time, with the default sampling format shown in FIG. 5.

The H.264 coded data format distinguishes between a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL). The output of the encoding process is VCL data (a sequence of bits representing the coded video data), which is mapped to NAL units before transmission or storage.

As shown in FIG. 6, each NAL unit contains a Raw Byte Sequence Payload (RBSP), which is data corresponding to coded video data or header information. A coded video sequence is represented by a sequence of NAL units, which can be transmitted over a packet-based network or a bitstream transmission link, or stored in a file.

The purpose of specifying the VCL and NAL separately is to distinguish between the compression-specific features of the VCL and the transport-specific features of the NAL. An H.264 encoder may use one or two of a number of previously encoded pictures as references for motion-compensated prediction of each inter-coded macroblock or macroblock partition. This allows the encoder to search for the best match for the current macroblock partition among a wider set of pictures than just the most recently encoded picture.

The encoder and decoder each hold one or two previously encoded and decoded reference pictures (pictures that occur before and/or after the current picture in display order).

A video picture consists of one or more slices. Each slice contains an integer number of macroblocks, ranging from 1 (one macroblock per slice) to the total number of macroblocks in the picture (one slice per picture).

The number of macroblocks per slice need not be constant within a picture. There is minimal interdependency between coded slices, which helps to limit the propagation of errors.

There are five types of coded slice, as mentioned above (I, P, B, SP, and SI), and a coded picture may be composed of slices of different types. For example, a Baseline profile coded picture may contain a mixture of I and P slices, and a Main or Extended profile picture may contain a mixture of I, P, and B slices.

Table 1
Syntax element | Description
mb_type | Determines whether the macroblock is coded in intra mode or inter (P or B) mode; determines the macroblock partition size.
mb_pred | Determines the intra prediction modes.
sub_mb_pred | Determines the sub-macroblock partition size for each sub-macroblock.
coded_block_pattern | Identifies which 8×8 blocks (luma and chroma) contain coded transform coefficients.
mb_qp_delta | Changes the quantization parameter.
residual | Coded transform coefficients corresponding to the residual image samples after prediction.

A macroblock contains the coded data corresponding to a 16×16 sample region of a video frame (16×16 luma samples, 8×8 Cb samples, and 8×8 Cr samples), together with the syntax elements listed in Table 1. Macroblocks are numbered in raster scan order within a frame.
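The macroblock geometry translates directly into simple bookkeeping. The following sketch is illustrative, with hypothetical function names: it counts the macroblocks in a frame, the samples in a 4:2:0 macroblock, and the position of a macroblock given its raster scan number. It assumes the frame is padded up to a whole number of macroblocks when its dimensions are not multiples of 16.

```python
# Illustrative macroblock bookkeeping for 4:2:0 video.
# Assumption: frame dimensions are rounded up to a multiple of 16.

MB_SIZE = 16


def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Return (macroblocks per row, macroblock rows), rounding up."""
    return -(-width // MB_SIZE), -(-height // MB_SIZE)


def macroblocks_per_frame(width: int, height: int) -> int:
    cols, rows = macroblock_grid(width, height)
    return cols * rows


def samples_per_macroblock() -> int:
    """16x16 luma samples plus 8x8 Cb and 8x8 Cr samples."""
    return 16 * 16 + 8 * 8 + 8 * 8


def macroblock_origin(mb_number: int, width: int) -> tuple[int, int]:
    """Top-left luma sample (x, y) of a macroblock numbered in raster scan order."""
    cols = -(-width // MB_SIZE)
    return (mb_number % cols) * MB_SIZE, (mb_number // cols) * MB_SIZE


if __name__ == "__main__":
    print(macroblocks_per_frame(1280, 720))     # 720P frame
    print(macroblocks_per_frame(1920, 1080))    # 1080P frame, height padded
    print(macroblock_origin(81, 1280))          # second macroblock of the second row
```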

For real-time streaming transmission of H.264 in the present invention, an encoded H.264 video sequence consists of a series of NAL units, each containing an RBSP (Table 2). Coded slices (including data-partitioned slices and Instantaneous Decoder Refresh slices) and the End of Sequence RBSP are defined as VCL NAL units; all other elements are simply NAL units. Each RBSP unit is typically transmitted in a separate NAL unit.

Table 2
RBSP type | Description
Parameter Set | 'Global' parameters for a sequence, such as picture resolution, video format, and macroblock allocation map.
Supplemental Enhancement Information | Side messages that are not essential for correct decoding of the video sequence.
Picture Delimiter | Boundary between video pictures. If the Picture Delimiter is not used, the decoder infers the boundary from the frame number contained in each slice header.
Coded slice | Header and data for a slice. (This RBSP unit contains the actual coded video data.)
Data Partition A, B or C | Three units containing data-partitioned slice layer data (useful for error recovery). Partition A: header data for all macroblocks in the slice. Partition B: intra-coded data. Partition C: inter-coded data.
End of sequence | Indicates that the next picture (in decoding order) is an IDR picture. (Not essential for correct decoding.)
End of stream | Indicates that there are no more pictures in the bitstream.
Filler data | 'Dummy' data that may be used to increase the number of bytes in the sequence.

The header of a NAL unit indicates the type of RBSP it carries, and the remainder of the NAL unit consists of the RBSP data. The method of transmitting NAL units differs in several respects between packet-based transport (packet networks) and transmission as a continuous data stream (circuit-switched channels).

In a packet-based network, each NAL unit may be carried in a separate packet, and the units must be arranged in the correct order before decoding. In a continuous data stream environment, a start code prefix (a uniquely identifiable delimiter) is placed before each NAL unit to form a byte stream prior to transmission. This enables the decoder to search the stream for the start code prefix that marks the beginning of each NAL unit.
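A minimal sketch of how a receiver might locate NAL units in such a byte stream is shown below. It relies on two details of the H.264 byte stream format that the text does not spell out: the start code prefix is the three-byte sequence 0x000001 (optionally preceded by extra zero bytes), and the one-byte NAL header packs forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), and nal_unit_type (5 bits). The function names are hypothetical, and removal of emulation prevention bytes inside the payload is deliberately left out.

```python
# Illustrative sketch: split an H.264 byte stream into NAL units and read
# each one-byte NAL header. Emulation prevention bytes are not removed here.
import re

START_CODE = b"\x00\x00\x01"

NAL_TYPE_NAMES = {
    1: "Coded slice (non-IDR)",
    2: "Data Partition A",
    3: "Data Partition B",
    4: "Data Partition C",
    5: "Coded slice (IDR)",
    6: "Supplemental Enhancement Information",
    7: "Sequence Parameter Set",
    8: "Picture Parameter Set",
    9: "Picture (access unit) Delimiter",
    10: "End of sequence",
    11: "End of stream",
    12: "Filler data",
}


def split_byte_stream(stream: bytes) -> list[bytes]:
    """Return the NAL units found after each start code prefix."""
    starts = [m.end() for m in re.finditer(re.escape(START_CODE), stream)]
    units = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] - len(START_CODE) if i + 1 < len(starts) else len(stream)
        # Zero bytes before the next start code are padding, not payload.
        units.append(stream[begin:end].rstrip(b"\x00"))
    return units


def parse_nal_header(nal: bytes) -> dict:
    """Decode the three fields of the first byte of a NAL unit."""
    first = nal[0]
    return {
        "forbidden_zero_bit": first >> 7,
        "nal_ref_idc": (first >> 5) & 0x03,
        "nal_unit_type": first & 0x1F,
    }


if __name__ == "__main__":
    demo = (b"\x00\x00\x00\x01\x67\x42\x00\x1e"
            b"\x00\x00\x01\x68\xce\x38\x80"
            b"\x00\x00\x01\x65\x88\x84")
    for unit in split_byte_stream(demo):
        header = parse_nal_header(unit)
        print(header, NAL_TYPE_NAMES.get(header["nal_unit_type"], "other"))
```

In a packet-based transport the start codes are unnecessary, because the packet boundaries already delimit the NAL units; only the header parsing step applies.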

In most applications, coded video must be transmitted or stored together with the associated audio tracks and side information. Real-time transport protocols such as RTP (Real-time Transport Protocol) and UDP (User Datagram Protocol), introduced below, can be used for this purpose. H.264 provides a video coding method that is optimized for compression efficiency and aims to meet the needs of practical multimedia communication applications. The range of available coding tools is narrower than that of MPEG-4 Visual, but a wide choice of coding parameters and strategies remains. H.264 digital video compression has accordingly become a core technology of recent multimedia applications: it has been adopted as the video compression algorithm for mobile phone video, VOD services, satellite and terrestrial DMB services, and IPTV services, is establishing itself as the standard for next-generation video compression, and has been adopted as the multimedia compression specification of the international IPTV technical standard. The present invention therefore applies H.264 to real-time, high-quality video transmission.

The transmission of digital multimedia over an IP network in the present invention derives from the streaming multimedia technology developed by RealNetworks. The principle of this technology is that even very large multimedia content is divided into small pieces of one to two seconds that can each be played individually, and the data is sent continuously, like a flowing stream, so that the receiver can play each piece immediately without waiting for the entire content to arrive. In other words, the continuous media data of the application layer is cut into short segments, packetized, and transmitted; the receiver decodes and plays continuously each time a certain unit of data arrives, preserving real-time behavior to some degree, instead of decoding and playing only after all the data has been received.
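The segment-and-play idea can be sketched in a few lines. This is an illustration only, not the patented method: the payload size of 1,466 bytes is borrowed from the RTP figure given earlier in the description, the function names are hypothetical, and decoding is replaced by simply appending the bytes.

```python
# Illustrative sketch: cut media data into packets and consume each packet
# as soon as it arrives, instead of waiting for the whole file.
from typing import Iterable, Iterator

PAYLOAD_SIZE = 1466  # assumption: the maximum RTP data size quoted in the description


def packetize(data: bytes, size: int = PAYLOAD_SIZE) -> Iterator[bytes]:
    """Yield consecutive payloads of at most `size` bytes."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def play_as_received(packets: Iterable[bytes]) -> bytes:
    """Consume packets one at a time; a real player would decode and render here."""
    played = bytearray()
    for payload in packets:
        played += payload
    return bytes(played)


if __name__ == "__main__":
    media = bytes(range(256)) * 12              # 3,072 bytes of stand-in media data
    print([len(p) for p in packetize(media)])
    print(play_as_received(packetize(media)) == media)
```

Because `packetize` is a generator, the consumer starts working on the first payload before the later ones exist, which mirrors the way a streaming receiver starts playback before the transfer completes.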

The term streaming currently refers to broadcasting over the Internet, and streaming can be regarded as one form of broadcasting. Streaming is divided by delivery mode into on-demand and live, and technically into unicast and multicast. The standard protocol used by streaming technology is RTSP (Real Time Streaming Protocol).

The first mode is on-demand streaming, in which a media file prepared in advance is streamed. To view the media, the user clicks the file and must then wait until the file's start signal is received.

The second mode is live streaming. When the user clicks the relevant link, a player that supports streaming opens immediately and delivers the video or audio. Until now, transmission speed has strongly affected sound and picture quality in this mode: when more data arrives than is needed, it is temporarily stored in a buffer, and when less data arrives than is needed, sound or picture quality degrades. The present invention, however, achieves real-time streaming by addressing the compression scheme and the transmission protocol in addition to transmission speed.

For real-time streaming, the present invention uses RTSP (Real Time Streaming Protocol), an application-layer protocol for on-demand delivery of real-time media. RTSP handles service requests and responses, service connection setup, and the various controls related to stream playback in Internet streaming services. It was developed jointly by RealNetworks, Netscape Communications, and Columbia University, and was standardized as RFC 2326 in April 1998. Streaming uses RTSP together with RTP, through which the media is packetized and the streaming service is delivered.

RTSP is an application-layer protocol intended to provide a robust protocol for streaming multimedia in multipoint applications over both unicast and multicast. It establishes and controls time-synchronized streams such as audio and video. RTSP does not itself carry the continuous media; it acts as a network remote control for the streaming server.

In the MMS/HTTP protocols, by contrast, media data is packetized over TCP and delivered reliably regardless of packet order. Because this destroys the timing relationship between media packets, the receiver has to buffer before it can restore and play the media data. In addition, when retransmission is requested after a frame loss, jitter or skew appears in the continuous media data.

In its basic operation, an RTSP client requests video or audio information with real-time characteristics from the server, and the server transmits the information in response to that request. Streaming here means that the server cuts a compressed, continuous message into packets and transmits them, and the receiver does not wait for the whole message before decoding and playing it but decodes each unit of the message as it arrives. This enables continuous playback while preserving real-time characteristics to a reasonable degree.

RTSP can control several media streams at the same time in a unicast or multicast environment, can run over a variety of transport-layer protocols including TCP and UDP, and is used together with RTP and RTCP.

RTSP uses reliable TCP to carry its control messages, sets up the RTP/RTCP channels, and then lets the RTP/RTCP packets flow. That is, session setup and release are controlled by RTSP, while the actual A/V data is carried by RTP. RTSP has syntax and operations similar to HTTP, but both the server and the client can send requests and receive responses, and a session has Init, Ready and Playing states. The protocol rules for the commands used in session setup and release are handled through SDP.

As shown in FIG. 7, during RTSP operation the client's player makes its connection request over RTSP using SDP (Session Description Protocol), and the streaming server's response also uses SDP. SDP carries the session and the information the session establishes, and it is used because the information it carries maps well onto RTSP.

The player's control commands are delivered over RTSP, and the responses are also returned over RTSP. Using the session information, the player can send a Setup request to the streaming server. Once a Play request is accepted, the actual data is sent to the client using RTP/RTCP. The Pause method temporarily stops playback, and a Teardown message closes the session completely.
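The request sequence and the Init, Ready and Playing states described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the invention: the class name `RtspSession`, the transition table `ALLOWED` and the URL are hypothetical, and the sketch assumes every request is accepted, whereas a real client would change state only after a successful response from the server.

```python
# Hypothetical sketch of the RTSP client state machine (Init, Ready, Playing).
# Assumption: every request succeeds, so the state advances immediately.
ALLOWED = {
    ("Init", "DESCRIBE"): "Init",      # fetch the SDP description
    ("Init", "SETUP"): "Ready",        # allocate the RTP/RTCP channels
    ("Ready", "PLAY"): "Playing",      # media starts flowing over RTP
    ("Playing", "PAUSE"): "Ready",     # temporary stop
    ("Ready", "TEARDOWN"): "Init",     # close the session
    ("Playing", "TEARDOWN"): "Init",
}


class RtspSession:
    def __init__(self, url):
        self.url = url
        self.state = "Init"
        self.cseq = 0

    def request(self, method):
        """Return the request text for `method` and advance the state."""
        key = (self.state, method)
        if key not in ALLOWED:
            raise ValueError(f"{method} not allowed in state {self.state}")
        self.cseq += 1
        self.state = ALLOWED[key]
        # RTSP requests are text lines in an HTTP-like syntax.
        return f"{method} {self.url} RTSP/1.0\r\nCSeq: {self.cseq}\r\n\r\n"


session = RtspSession("rtsp://example.com/demo")  # hypothetical URL
for method in ("DESCRIBE", "SETUP", "PLAY", "PAUSE", "TEARDOWN"):
    session.request(method)
```

Keeping the transitions in a table makes it easy to see that Pause returns the session to Ready while Teardown returns it to Init.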

In the present invention, RTP (Real-time Transport Protocol) is used for compression and transmission in real-time streaming. RTP was approved as an Internet Proposed Standard by the IESG (Internet Engineering Steering Group) in November 1995 and was published as RFC 1889 (Request For Comments 1889) and RFC 1890 (RTP Profile for Audio and Video Conferences with Minimal Control).

RTP provides end-to-end transport functions suited to applications that transmit real-time data such as audio, video or simulation data over multicast or unicast. RTP does not address resource reservation, however, and in particular does not provide timely delivery, QoS guarantees or protection against out-of-order delivery. "Transport" here therefore means a standard drawn up with the characteristics of real-time data in mind. RTP packets are carried over UDP. Multiplexing in RTP is provided by the destination transport address: in a conference that uses several media, each medium is carried in its own RTP session with a different destination address. Multiple media are therefore not sent together in a single RTP session and then demultiplexed by payload type or SSRC field value.

RTP has four main characteristics. First, it follows the ALF (Application Layer Framing) architecture and uses a simple frame that applications can work with directly. Real-time applications do not need the services TCP offers, so they do not need a protocol as complex as TCP. TCP retransmits when data is lost, which gives ordered, reliable delivery but introduces delay. For real-time programs, delay is a more serious problem than data loss.

ALF is the way most network software is structured, and its advantage is better performance than a strictly layered design. With separate layers, protocols have to interact with other layers through inefficient mechanisms such as inter-process communication, whereas ALF can use efficient mechanisms such as direct function calls. RTP defines the frame in terms of its format, operations and basic roles, and an application sends its data in RTP frames together with the information the data needs, such as encoding headers.

The second characteristic is that real-time data can be timestamped. A TCP/IP network does not preserve a fixed timing relationship between messages, so real-time data can arrive at irregular intervals. It can arrive out of order, or not arrive at all.

The receiving application has to replay the received data at the same intervals at which it was originally generated, and it does so using the timestamps of the received data. Because the timestamp needed for this reconstruction is common to all real-time applications, it is part of the RTP frame.
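As a rough illustration of how a receiver can recover the original spacing from timestamps, the sketch below converts RTP timestamps into playout offsets in seconds. The function name `playout_offsets` is hypothetical, and the sketch assumes the first timestamp in the list belongs to the earliest packet and that the clock rate of the payload is known.

```python
def playout_offsets(timestamps, clock_rate):
    """Seconds after the first packet at which each packet should be played.

    Assumptions: timestamps[0] is the earliest packet, and clock_rate is the
    payload's sampling clock in Hz (for example 8000 for telephone audio).
    """
    base = timestamps[0]
    # RTP timestamps are 32-bit counters that wrap around, hence the modulo.
    return [((ts - base) % 2**32) / clock_rate for ts in timestamps]


# Packets generated every 160 ticks of an 8000 Hz clock are 20 ms apart,
# however irregularly they arrived on the network.
offsets = playout_offsets([1000, 1160, 1320], 8000)
```

The arrival times never enter the calculation; only the sender's timestamps determine the playout spacing.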

The third characteristic is support for both multicast and unicast. Applications such as video conferencing usually have many participants, so RTP was designed to support multicast. When a participant speaks in a conference, multicasting delivers the information to all participants.

Not only the RTP packets but also the feedback is multicast. Multicasting the feedback may look like a waste of network bandwidth, but it reveals the number of participants and hence the bandwidth in use. A party that merely joins the session without sending can still receive all the feedback, which makes monitoring easy for network administrators and also shows whether a transmission fault is local or global.

The fourth characteristic is the translator and mixer functions. In real-time transmission the sender and the receiver play the main roles. RTP adds two further roles, so that systems using RTP can act as translators and mixers. Translators and mixers are of course not required in every case.

Sometimes, however, they are the only way a network can support real-time traffic, as follows. Translators and mixers sit between the sender and the receiver, and a translator simply converts the stream to another format. For example, when users with different bandwidths join the same video conference, everyone would otherwise have to match the participant with the lowest bandwidth. With a translator converting the stream into a format suited to each bandwidth, each user receives data at a rate appropriate to their own connection.

A mixer does not change the format; it combines several streams into one while keeping the original format. This is very efficient for audio data, which can be combined.
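A mixer's combining step can be sketched for audio as follows. This is only an illustration under stated assumptions: the streams are already decoded to time-aligned 16-bit linear PCM samples, and the function name `mix_pcm16` is hypothetical. A real mixer would also rewrite the RTP headers of the combined stream.

```python
def mix_pcm16(streams):
    """Combine several aligned 16-bit PCM sample lists into one stream.

    Assumption: every list holds signed 16-bit samples at the same rate.
    """
    length = min(len(s) for s in streams)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in streams)
        # Clip to the int16 range so that loud passages do not overflow.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


combined = mix_pcm16([[100, -200, 30000], [50, -100, 10000]])
```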

FIG. 8 compares the TCP, UDP and RTP header formats. In the RTP packet, the header has a fixed size, and specific information and data follow the header depending on the multimedia content. V is the version field; the current version is 2. P is used to build packets in 32-bit units: when P is set, padding octets that are not part of the payload are included at the end of the packet. When the X bit is 1, exactly one extension header follows the fixed header. CC gives the number of CSRC identifiers that follow the fixed header. A CSRC is a source of an RTP packet stream that contributed to the combined stream produced by an RTP mixer. That is, as RTP packets travel through the network, an intermediate system receives RTP packets from several sources, combines them appropriately into a new RTP packet and forwards it to the next system; an intermediate system that performs this function is called an RTP mixer.

M marks frame-related events for the multimedia information, chiefly frame boundaries; that is, it is used to distinguish audio and video information and the like within a packet. The PT field identifies the RTP payload format according to the profile defined in RFC 1890 and is interpreted by the application. A profile specifies a fixed mapping from payload type codes to payload formats. For example, PT 0 denotes PCMU-encoded audio with an 8000 Hz clock rate and one audio channel. Thirty-three payload types are currently defined.

The Sequence Number increases by one for each RTP packet sent. The receiver uses this field to detect packet loss and to restore packet order. The Timestamp field gives the instant at which the first octet of the RTP packet was sampled. The sampling instant is derived from a clock that increments steadily, and it is used for synchronizing real-time data and for jitter calculation. The initial value is chosen at random.
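The use of the sequence number for reordering and loss detection can be sketched as below. The helper names (`signed_distance`, `restore_order`, `missing`) are hypothetical, and the sketch assumes the packets in one batch lie within half of the 16-bit sequence space of each other, which is what makes the wrap-around comparison unambiguous.

```python
def signed_distance(seq, ref):
    """Distance from ref to seq in 16-bit sequence space, in -32768..32767."""
    return ((seq - ref + 32768) % 65536) - 32768


def restore_order(packets):
    """Sort (seq, payload) pairs into sending order, allowing for wrap-around."""
    ref = packets[0][0]
    return sorted(packets, key=lambda p: signed_distance(p[0], ref))


def missing(packets):
    """Sequence numbers that never arrived between the received packets."""
    ordered = restore_order(packets)
    lost = []
    for (a, _), (b, _) in zip(ordered, ordered[1:]):
        for k in range(1, signed_distance(b, a)):
            lost.append((a + k) % 65536)
    return lost


received = [(65534, "a"), (0, "c"), (65535, "b"), (3, "e")]
ordered = restore_order(received)
lost = missing(received)
```

In this example the counter wraps from 65535 to 0, and the packets numbered 1 and 2 are the ones reported as lost.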

The SSRC field identifies the origin of the data, such as a camera or a microphone; that is, it identifies the synchronization source. The identifier should be a random number that differs from those of the other sources in the RTP session, but because collisions are possible, RTP includes a mechanism to detect and resolve them. The CSRC field holds identifiers that distinguish the sources when RTP packets have been mixed in an intermediate system. It consists of 0 to 15 CSRC entries of 32 bits each, identifying the contributing sources for the payload in the packet; the mixer fills the CSRC list with the SSRCs of those sources. The number of identifiers is given by the CC field described above.
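The header fields described above (V, P, X, CC, M, PT, sequence number, timestamp, SSRC and the CSRC list) can be read from a packet as in the following sketch. It follows the 12-byte fixed header layout of RFC 1889; the function name `parse_rtp_header` and the dictionary keys are hypothetical, and an extension header, if present, is not parsed.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the RTP fixed header and CSRC list (extension header not parsed)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    csrc_end = 12 + 4 * cc
    if len(packet) < csrc_end:
        raise ValueError("packet shorter than its CSRC list")
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": list(struct.unpack(f"!{cc}I", packet[12:csrc_end])),
        "header_len": csrc_end,
    }


# Example packet: version 2, one CSRC, marker set, payload type 0.
example = struct.pack("!BBHIII", 0x81, 0x80, 1234, 160, 0xDEADBEEF, 0x11223344) + b"data"
header = parse_rtp_header(example)
```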

When audio is packed into RTP packets, no special parsing is required; the stream is simply cut into pieces of a fixed length. Audio only has to be sent one way to the user, so unlike connection requests and responses to control commands, it needs no binding step. A video stream, however, does not produce the same number of bits per frame, so it cannot be cut into RTP packets of a fixed length the way audio can. A parsing step is therefore carried out for video.
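Cutting audio into fixed-length pieces can be sketched as below. The function name `packetize_audio` and its defaults are assumptions chosen for illustration: 160 one-byte samples per packet corresponds to 20 ms of 8 kHz audio. Each piece is paired with the sequence number and timestamp it would carry; building the actual RTP header is left out.

```python
def packetize_audio(data, samples_per_packet=160, bytes_per_sample=1,
                    first_seq=0, first_ts=0):
    """Cut an audio byte string into fixed-length (seq, timestamp, payload) pieces."""
    size = samples_per_packet * bytes_per_sample
    packets = []
    for n, start in enumerate(range(0, len(data), size)):
        seq = (first_seq + n) % 65536                      # 16-bit sequence number
        ts = (first_ts + n * samples_per_packet) % 2**32   # 32-bit timestamp
        packets.append((seq, ts, data[start:start + size]))
    return packets


pieces = packetize_audio(bytes(400))  # 400 samples of silence as sample input
```

Only the last piece can be shorter than the others; no knowledge of the audio content is needed, which is the contrast with video drawn above.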

In the present invention, SDP (Session Description Protocol) is used for real-time streaming. SDP was standardized as RFC 2327 by the IETF MMUSIC (Multiparty Multimedia Session Control) working group in order to advertise the information users need to join a multimedia session on the Internet and to define multimedia sessions in real time.

To define a multimedia session, SDP carries information about session creation, session invitation and session description, including the address and port of the media control server, the media type and the media server address. SDP is used to describe a session: it describes multimedia sessions and is used to initiate sessions of various kinds.

The MBone, built on top of the Internet using IP multicast, is widely used for multiparty multimedia applications. To join a session currently open on the MBone, however, a user has to know the content of the session and the multicast address and port number each session uses.

SDP is a protocol for describing multimedia sessions. It is used mainly to describe the payload when a session is announced by periodically sending packets to a well-known multicast address and port. The purpose of SDP is to convey information about the media streams of a multimedia session to recipients who want to join it. SDP defines a multimedia session as a set of media streams that exist for some period of time. It advertises the existence of a session and conveys enough information to join it.

On the Internet, SDP provides two important functions: announcing that a session exists, and conveying enough information to join it. To this end, an SDP description includes the session name and purpose, the time the session is active, the media that make up the session, and the information needed to receive the media (address, port, format). It gives the media type (video, audio), the transport protocol (RTP/UDP/IP), the media format (MPEG-4, H.264), the multicast address and transport port for the media, and bandwidth information. It also provides the bandwidth the client needs to join the session and contact information for the person responsible for the session. An SDP session description is written as text, and each description consists of several lines of text of the form <type>=<value>.

<type> is a single character. <value> is a structured text string whose format depends on <type>. There must be no whitespace on either side of the '=' sign. In general, <value> consists of several fields separated by spaces. An SDP description consists of a session-level part and several media-level parts. The session-level part starts with a 'v=' line, and each media-level part starts with an 'm=' line. Some items are required and others are optional, but the order of the lines must be followed.
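The line format and the split into a session-level part and media-level parts can be sketched with a small parser. This is an illustrative sketch only: the function name `parse_sdp` is hypothetical, the sample text is made up for the example, and the sketch checks the '=' placement and the leading 'v=' line but not the full ordering rules.

```python
def parse_sdp(text):
    """Split an SDP description into session-level and media-level (type, value) pairs."""
    session, media = [], []
    current = session
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # <type> is one character, immediately followed by '=' with no spaces.
        if len(line) < 2 or line[1] != "=":
            raise ValueError(f"malformed SDP line: {line!r}")
        kind, value = line[0], line[2:]
        if kind == "m":
            # Each 'm=' line opens a new media-level part.
            current = []
            media.append(current)
        current.append((kind, value))
    if not session or session[0][0] != "v":
        raise ValueError("the session-level part must start with 'v='")
    return session, media


sample = "v=0\ns=Demo\nt=0 0\nm=audio 4640 RTP/AVP 0\nc=IN IP4 224.2.1.1/127\n"
session_part, media_parts = parse_sdp(sample)
```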

FIG. 9 shows the session description format in detail. Items marked with * in the figure are optional, and a parser ignores any <type> letter it does not understand. The following is an example of a session described in SDP.

v=0

o=Administrator 743235897 866589328 IN IP4 203.1.118.230

s=ICAST Demo

i=Demo

u=http://www.icast.com

e=sales@icast.com

p=+01 408 8740701

t=3075578128 3076222528

r=565200 39600 0

a=recvonly

m=audio 4640 RTP/AVP PCM

c=IN IP4 218.3.3.10/127

a=orient:portrait

a=framerate:30

a=datarate:407

a=quality:6

a=grayed:0

A newline terminates each record. The details of each item are as follows.

· Protocol Version (v=0): indicates the SDP version.

· Origin

o=<user name> <session ID> <version> <network type> <address type> <address>

The 'o' field identifies the originator of the session together with a session ID and a session version number. <user name> is the user's login on the originating host. <session ID> provides a globally unique identifier for the session. <version> is the version number of this announcement and is increased each time the description is modified. <network type> is a text string giving the type of network; the string 'IN' means 'Internet'. <address> is the unique address of the host that created the session.

· Session name

s=<session name> gives the name of the session.

· Session and media information

i=<session description> is used to record the purpose of the session and other detailed information about it.

· URI, email address and telephone number

u=<URI> e=<email address> p=<phone number>

These fields point to additional information about the conference.

· Connection data

c=<network type> <address type> <connection address>/<ttl>

The 'c' field contains the connection data. The first sub-field is the network type, where 'IN' means Internet. The address type is defined for IPv4 (written IP4). The connection address is usually a class-D IP multicast group address. TTL (Time To Live) defines the scope within which the multicast packets are sent. For the MBone, several conventional TTL values are in use.

· Bandwidth

b=<modifier>:<bandwidth value>

The modifier is either 'CT' (Conference Total) or 'AS' (Application-Specific Maximum). CT gives the total bandwidth for all media, and AS gives the bandwidth for a single medium.

· Times and repeat times

t=<start time> <end time>

The 't' field gives the start and end times of the conference session. Times are NTP values. If the end time is 0, the session continues indefinitely. If the start time is also 0, the session is regarded as permanent.

r=<repeat interval> <active duration> <offsets from start time>

The 'r' field describes the repeat times for a session. For example, if a session is active at 10 o'clock on Monday and 11 o'clock on Tuesday for one hour each week over three months, the <start time> in the 't' field is the NTP representation of 10 o'clock on the first Monday, the <repeat interval> is one week, the <active duration> is one hour, the offsets are zero and 25 hours, and the end time in the 't' field is the NTP representation of the end of the last session.

An example of the fields described above is as follows.

t=3034423619 3042462419

r=604800 3600 0 90000
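To make the arithmetic behind the 't' and 'r' fields concrete, the sketch below expands such a pair of lines into individual session times. The function names `ntp_to_datetime` and `occurrences` are hypothetical; the only external fact used is that NTP seconds count from 1900, which is 2208988800 seconds before the Unix epoch. The sketch assumes a non-zero end time and plain second counts in the 'r' line.

```python
from datetime import datetime, timezone

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def ntp_to_datetime(ntp_seconds):
    """Convert an NTP seconds value, as used in 't=', to a UTC datetime."""
    return datetime.fromtimestamp(ntp_seconds - NTP_UNIX_OFFSET, tz=timezone.utc)


def occurrences(t_line, r_line):
    """List (begin, end) NTP times of every repetition described by t= and r=."""
    start, stop = (int(x) for x in t_line.split("=", 1)[1].split())
    interval, duration, *offsets = (int(x) for x in r_line.split("=", 1)[1].split())
    result = []
    base = start
    while base < stop:
        for offset in offsets:
            begin = base + offset
            if begin < stop:
                result.append((begin, begin + duration))
        base += interval
    return result


sessions = occurrences("t=3034423619 3042462419", "r=604800 3600 0 90000")
first_begin = ntp_to_datetime(sessions[0][0])
```

Here 604800 seconds is one week, 3600 seconds is one hour, and the offsets 0 and 90000 place the two weekly sessions 25 hours apart.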

· Media announcement

m=<media> <port> <transport protocol> <format list>

The <media> field is one of 'audio', 'video', 'whiteboard', 'text' or 'data'. The second field is the transport port to which the media stream is sent. The meaning of the port depends on the transport protocol. When the protocol is RTP, an even port (3456) is used for the data and the next higher port (3457) is used for control information. The third field is the transport protocol. For example, media described as 'm=video 3456/2 RTP/AVP 31' is an RTP media stream running under the RTP Audio/Video Profile. If the protocol field were 'RTP/XYZ', it would denote RTP running under a profile named 'XYZ'. The remaining fields give the media format. For audio and video these are the media payload types defined in the RTP Audio/Video Profile. As an example of a static payload type, u-law PCM sampled at 8 kHz is defined as payload type 0 and can be written as follows.

m=audio 3456 RTP/AVP 0
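The port convention for RTP media lines can be sketched as follows: an even port for the data and the next port for control, repeated for each stream when a count such as '/2' is given. The function name `parse_media_line` and the dictionary keys are hypothetical, and the pairing logic is only meaningful for RTP-based transport protocols.

```python
def parse_media_line(line):
    """Split an 'm=' line and expand its port field into (RTP, RTCP) port pairs."""
    if not line.startswith("m="):
        raise ValueError("not a media line")
    media, port_field, proto, *formats = line[2:].split()
    port_text, _, count_text = port_field.partition("/")
    base = int(port_text)
    count = int(count_text) if count_text else 1
    # Assumption: RTP transport, so each stream uses an even data port
    # and the next higher port for control information.
    pairs = [(base + 2 * i, base + 2 * i + 1) for i in range(count)]
    return {"media": media, "proto": proto, "formats": formats, "port_pairs": pairs}


video = parse_media_line("m=video 3456/2 RTP/AVP 31")
audio = parse_media_line("m=audio 3456 RTP/AVP 0")
```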

An example of a dynamic payload type is 16-bit stereo audio sampled at 16 kHz, here using payload type 98. To use this payload type, the following additional information is required.

m=audio 3456 RTP/AVP 98

a=rtpmap:98 L16/16000/2

The general form of rtpmap is as follows.

a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
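An rtpmap attribute can be taken apart as in the sketch below. The function name `parse_rtpmap` and the dictionary keys are hypothetical; the sketch assumes the attribute is written with a colon after 'rtpmap' and a single space before the encoding, and it treats the encoding parameters as optional.

```python
def parse_rtpmap(line):
    """Parse 'a=rtpmap:<payload type> <encoding>/<clock rate>[/<parameters>]'."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    payload_text, _, encoding = line[len(prefix):].partition(" ")
    parts = encoding.split("/")
    if len(parts) not in (2, 3):
        raise ValueError("expected <encoding>/<clock rate>[/<parameters>]")
    return {
        "payload_type": int(payload_text),
        "encoding": parts[0],
        "clock_rate": int(parts[1]),
        # For audio, the optional parameter is the number of channels.
        "parameters": parts[2] if len(parts) == 3 else None,
    }


stereo = parse_rtpmap("a=rtpmap:98 L16/16000/2")
```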

The system configuration for real-time streaming in the present invention is shown in FIG. 10, which depicts a real-time digital multimedia transmission system. The Caster compresses the input into digital multimedia and is responsible for packetized transmission and for storage as files. It uses the MPEG-4 AVC (H.264) video and AAC audio codecs and supports both unicast and multicast transmission. It runs on Windows XP or Linux; the hardware used was a Pentium 4 2.8 GHz CPU with Hyper-Threading, 256 MB of memory, a sound card, an Ospray or DRC capture board, and a camera with autofocus (1CCD for indoor use, 3CCD for outdoor use).

The Streaming Server (real-time streaming transmission system) streams the packets and files received from the Caster to the player in real time. It carries out the procedures of RTSP authentication, RTP reception, RTSP authentication, SDP connection and RTP transmission, and performs unicast and multicast transmission. It runs on Linux with a Pentium 3 1 GHz CPU and 512 MB of memory, and is equipped with RTP/UDP unicast and multicast modules.

Finally, the real-time player runs on a Pentium 3 1 GHz CPU with 128 MB of memory, at least 8 MB of video memory and a sound card, and performs RTSP/RTP, SDP/HTTP tunneling and RTSP authentication.

As shown in FIG. 11, three ways of building a real-time, high-quality video transmission system using RTP were implemented, distinguished by whether the RTSP server stores media data and by the transmission mode. The streaming server (real-time streaming transmission system) can be configured for on-demand broadcasting or live broadcasting, depending on whether the media data is held as files or as packets, and the transmission mode is either multicast or unicast. In on-demand broadcasting, when the client requests media data, the server immediately loads the media data into payloads and sends it to the client in real time over RTP/UDP; this mode uses unicast transmission with RTP.

In live broadcasting, media data is not stored. Multi-source (audio and video) input is encoded to H.264 in real time on the Caster, which is fitted with a capture card, and sent to the client directly without a separate Streaming Server. This configuration suits multicast transmission, which uses little bandwidth. For unicast transmission, a Streaming Server is added so that limited bandwidth is used efficiently: the encoded multimedia packets are sent to the Streaming Server, which relays them to the clients.

The configuration of the real-time streaming server (real-time streaming transmission system) in the present invention is shown in FIG. 12. The server consists of a GUI program management unit that manages the server control, administration and access tool programs; a streaming control unit responsible for setting up and releasing sessions between server and client and for transmitting real-time media data; and a player (parsing) unit that loads and interprets multimedia files.

As for the transport port numbers of the real-time video transmission protocols, RTSP uses ports 554 and 7070. Whereas TCP has only two ports, 80 and 8080, RTP uses 50 ports, 6950 through 6999, so parallel ports can be set up when transmitting media data. RTP/UDP/IP packet transmission can therefore carry continuous media data more efficiently than TCP/IP packet transmission. Digital multimedia that has been compressed and then packetized or stored as files in RTP format is served in response to the client's RTSP request, by the following procedure: the player unit (interpreter) of FIG. 12 loads the multimedia file requested by the client; the SDP function of the streaming control unit of FIG. 12 generates an SDP file and, at the same time, a session is created; the SDP file is sent to the client player via RTSP; and the main data of the digital multimedia is then streamed to the client in real time via RTP.

The processing procedures of the real-time streaming server (RTSP server), namely start-up, shut-down, the RTSP authentication sequence, and the RTSP preprocessor procedure, are shown in the flowcharts of FIGS. 13 to 15.

Among the protocol roles applied to RTSP, RTP, and SDP to implement the real-time streaming server of the present invention, the RTSP Filter role is shown in FIG. 16. For an RTSP request from the client, the server calls the RTSP Filter module. The filter responds to the processed video data packet by packet, but because it reacts actively to changes in RTSP, it responds immediately whichever point in the video is requested.

The RTSP Route role is shown in FIG. 17. After the server has called all packets in the RTSP Filter module, it calls the RTSP Route role, whose job is to call the root directory for each RTSP request.

The RTSP Preprocessor role is shown in FIG. 18. After the server has called every module registered for the RTSP Route role, it calls the RTSP Preprocessor role and sends an appropriate RTSP response to the client.

The RTSP Request role is shown in FIG. 19. If the RTSP preprocessor does not respond, the server calls RTSP again. Only one RTSP module is called: the one registered first when the server starts.

The RTSP Postprocessor role is shown in FIG. 20. Once an RTSP request has been registered, the server calls the RTSP Postprocessor in response to that request; it is used to record statistical information.

The RTP Send Packet role is shown in FIG. 21. When a packet is destined for the client's player, the server sends it via RTP. Responsibility for delivering data to the client is handed over to RTP, which continuously fetches data from the server and sends it to the client.

The RTSP Processor role is shown in FIG. 22. The server receives RTCP signals from the client and stands by so that it can send the requested unit packet at any time (for example, when the client selects and requests an arbitrary point in the video). This lets the server answer any query from the client. It uses the audio frames per second, together with the data loss rate, as its reference.

For the SDP Processor role, as shown in FIG. 23, the transmission server address is 218.233.155.XX, the broadcast control protocol is RTSP, the media attributes use RTP for the video and audio formats, and the compression codec is H.264.

Instant-on for immediate viewing builds on the real-time streaming foundation described above, in which RTSP/RTP can send all the data of one frame at a time. So that the streaming transmission system (server) can respond immediately, the technique loads and transmits only the small header data (3% of the original media data) instead of loading the multimedia main data file. When the client (player) requests digital multimedia from the server, the streaming server does not load the whole multimedia file but sends the head data to the client. To this end, information about the multimedia data (media attributes) is added to the head of the multimedia file, and a Hinted module that can interpret the compressed head is added to the streaming control unit of FIG. 12. The server loads only the head data, uses the SDP Parsing roll to write the media attribute information into a text SDP file, and sends that file to the client at once. The technique applies from the compression stage of the multimedia through the transmission and reception stages.

The Instant-on RTP Format for immediate viewing compresses the head of the standard RTP payload structure (FIGS. 24 and 25) to produce the Instant-on RTP payload meta-information head (FIG. 27) and the Instant-on Mixed RTP payload meta-information data (FIG. 28). Writing the sequence number and the amount of data per timestamp into the header of the video data provides RTP meta-information, so that RTP can stream the video by loading only the head rather than all of the video data. To give the RTP client this information, the RTP data must carry the following meta-information.

Transmission Time: the server sends the transmission time of the RTP packet as a 4-octet integer in milliseconds. The transmission time is always an offset from the start of the media. For example, if the RTSP or SDP response for the requested URL covers the range 0-729.45 and the client asks to play the range 100-729.45, this lets the server respond immediately by finding the time of the nearest frame.

Frame Type: video consists of key frames, B-frames, and P-frames. The key frame holds the complete picture content (including the background). The frame type is defined as a 16-bit integer and sent to the client.

Packet Number: the streaming server sends the packet number as a plain 64-bit integer. For example, if the response for the URL covers the range 0-729.45 and the client plays that same range, the first packet is numbered 0 and the number increases by 1 for each subsequent packet. If 1000 packets fall within the first 60 seconds and the client requests the range 60-729.45, the first packet sent is numbered 1001, again increasing by 1 for each subsequent packet.

Packet Position: the streaming server sends the packet position as a plain 64-bit integer. If the whole video covers the range 0-729.45 and the client plays the range 100-729.45, the position of the first RTP packet sent is the total size in bytes of the RTP packets between 0 and 100. The position of each packet is calculated by subtracting the packet total for the requested position from the total number of bytes of RTP packets transmitted, so that the desired scene can be viewed immediately.

Media Data: the main data of the multimedia, which the server delivers via RTP. Sequence Number: the streaming server (real-time streaming transmission system) sends the 2-octet RTP sequence number; on transmission, the sequence number is used to indicate the RTP meta-information for the amount of video data carried.
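The meta-information fields listed above can be sketched as a fixed binary layout. The field widths below come from the description (4-octet transmission time, 16-bit frame type, 64-bit packet number, 64-bit packet position, 2-octet sequence number); the field order, the network byte order, the frame-type codes, and the function names are assumptions made only for illustration and are not taken from the figures.

```python
import struct

# Widths follow the description; order and byte order are assumptions.
# I = 4-octet transmission time (ms), H = 16-bit frame type,
# Q = 64-bit packet number, Q = 64-bit packet position,
# H = 2-octet RTP sequence number.
META_LAYOUT = struct.Struct("!IHQQH")

# Hypothetical frame-type codes; the description only says "16-bit integer".
FRAME_KEY, FRAME_B, FRAME_P = 0, 1, 2


def pack_meta(transmit_ms, frame_type, packet_number, packet_position, sequence_number):
    """Serialize one set of meta-information fields into 24 bytes."""
    return META_LAYOUT.pack(
        transmit_ms, frame_type, packet_number, packet_position, sequence_number
    )


def unpack_meta(blob):
    """Read the fields back from the start of a byte string."""
    return META_LAYOUT.unpack(blob[:META_LAYOUT.size])


def first_packet_position(packets, start_s):
    """Sum the sizes (bytes) of all packets whose media time precedes start_s.

    `packets` is a list of (media_time_seconds, size_bytes) pairs.
    """
    return sum(size for t, size in packets if t < start_s)
```

For a play request starting at 100 seconds, `first_packet_position` returns the byte total of the packets between 0 and 100, which is the value the description assigns to the first packet sent.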

Standard RTP format (FIGS. 24 and 25): in the real-time streaming transmission system, the data fields are built by loading the RTP meta-information and consist of the head and the data of each packet. When such data is sent in the standard format, the first bit of the head is 0 (that is, uncompressed) and is followed by a 15-bit name field. This 15-bit name field must hold, as two ASCII characters, the list of packets contained in the RTP data. As a result, the head in the standard RTP format accounts for 12% of the total multimedia data, and the streaming transmission system incurs a delay of 1 to 3 seconds while it loads and parses an SD to HD multimedia file to create the SDP file. The head is therefore compressed by applying the Instant-on roll (FIG. 29) and hinted into the Instant-on RTP payload meta-information form shown in FIG. 27, so that the Instant-on RTP Format head is less than 3% (3,072 KB) of the total multimedia data. The real-time streaming transmission system then responds to a client request within 45 ms. The Hinted module added to the streaming control unit of FIG. 12 interprets the Instant-on RTP Format and completes SDP parsing within 50 ms to generate the SDP file, which is sent to the client. The client's player receives the SDP file and decodes it within 50 ms, so that the multimedia main data, transmitted at up to 1,152 KB per 805 ms, can be played within 1,000 ms.
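The timing figures above can be checked with a short calculation. The sketch below assumes that the four stages run one after another and that their stated upper bounds simply add; the description gives the individual figures and the 1,000 ms target but does not state how they combine, so the summation is an assumption.

```python
# Upper bounds in milliseconds, taken from the description.
STAGES_MS = {
    "server responds to client request": 45,
    "server parses head and builds SDP file": 50,
    "client decodes SDP file": 50,
    "first 1,152 KB of main data transmitted": 805,
}

# Assumption: the stages are sequential, so the bounds add up.
total_ms = sum(STAGES_MS.values())
within_budget = total_ms <= 1000

print(f"worst-case start-up: {total_ms} ms, within 1,000 ms: {within_budget}")
```

Under that assumption the worst case is 950 ms, which leaves 50 ms of margin against the 1,000 ms target.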

The RTP ports of the present invention can use 50 ports, 6950 through 6999. Video, audio, and text data are sent simultaneously over separate ports, which enables real-time streaming and achieves immediate viewing with no wait before playback.

Meanwhile, in the present invention, movement of the player's position bar on the client is signalled to the streaming transmission system via RTCP (Real Time Control Protocol) on port 554. To load the data for the requested scene, the streaming transmission system does not need to search the hard disk. It goes directly to the address stored in the SDP file, loads only the I-frames, and sends only the I-frame data over an unused port among the 50 ports 6950 through 6999. This implements real-time seeking (Jog-Shuttle) without increasing network bandwidth.

<Embodiment>

The basic configuration of the multimedia data streaming system according to the present invention is described below with reference to FIG. 30.

The multimedia data streaming system according to this embodiment comprises an analog multimedia data input means (200), an encoder (300), a streaming server (100), and a client (400).

The analog multimedia data input means (200) is a video and audio input device, such as a camera or tape, into which video and audio data arising from physical phenomena are input.

The video and audio data input through the analog multimedia data input means (200) are sent to the encoder (300). The encoder (300) converts the analog data into main digital data and performs RTP encoding, prepending an RTP header to the main digital data to generate RTP packets.

The header of the RTP packet comprises a standard fixed head (310), a standard extension head (330), and an instant-on head (320).

The standard fixed head (310) and the standard extension head (330) have been described above, so their description is omitted here.

Referring to FIG. 31, the instant-on head (320) is placed in front of the standard extension head (330) and holds an instant identifier (322), an instant size (324), and metadata (326) in a single field.

The instant identifier (322) is a compressed form of the Name field of the standard extension head (330). Whereas the Name field of the standard extension head (330) is 16 bits, the instant identifier (322) is 8 bits.

The encoder (300) generates the instant identifier (322) from arbitrary digits and letters, using a search step to ensure that it does not collide with any other identifier.

Here, the first 4 bits of the instant identifier (322) are set according to the data type and are the same for all data of a given type. For example, the first 4 bits may be set to 4324 for video data, 4326 for audio data, and 4328 for text data.

The instant size (324) indicates the size of the metadata (326). The metadata (326) is a compressed form of the data in the field immediately following the Name and Length fields of the standard extension head (330). The compression is preferably lossless compression that removes empty memory space.
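The instant-on head described above (identifier, size, metadata in one field) can be sketched as a small length-prefixed record. The 8-bit identifier and the use of its first 4 bits for the data type come from the description. The 16-bit width of the size field, the choice of the high-order nibble as the "first 4 bits", the type codes 0x1 to 0x3, and the function names are assumptions for illustration only; the metadata is treated as an opaque, already-compressed byte string.

```python
import struct

# Hypothetical 4-bit type codes; the description does not give 4-bit values.
TRACK_VIDEO, TRACK_AUDIO, TRACK_TEXT = 0x1, 0x2, 0x3

# B = 8-bit instant identifier, H = instant size (16-bit width is assumed).
_PREFIX = struct.Struct("!BH")


def build_instant_on_head(track_type, unique_bits, metadata):
    """Pack identifier, size, and metadata into one contiguous field."""
    identifier = ((track_type & 0xF) << 4) | (unique_bits & 0xF)
    return _PREFIX.pack(identifier, len(metadata)) + metadata


def parse_instant_on_head(buf):
    """Return the parsed head and the bytes that follow it."""
    identifier, size = _PREFIX.unpack_from(buf, 0)
    start = _PREFIX.size
    metadata = buf[start:start + size]
    if len(metadata) != size:
        raise ValueError("buffer is shorter than the declared instant size")
    head = {
        "identifier": identifier,
        "track_type": identifier >> 4,  # first 4 bits select the data type
        "size": size,
        "metadata": metadata,
    }
    return head, buf[start + size:]
```

Because the size precedes the metadata, a reader can skip from one instant-on head to the bytes after it without inspecting the metadata itself.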

With this instant-on header (320), when the viewer changes channels or selects a new menu item on IPTV, viewing begins at once with no buffering time. Specifically, when a transmission request is received from the client, the streaming server generates an SDP file based on the contents of the instant identifier (322) and the instant size (324) and sends it to the client. Because the metadata (326), which is mainly data for the first scene, is in the same field as the instant identifier and the instant size, the metadata (326) is transmitted together with the SDP file. After that, the original main data (328), which was not compressed into the metadata (326), is transmitted.

That is, conventionally playback begins only after all of the head information, up to 30 MB, has been transmitted, so buffering takes a long time. With the instant-on header (320) of the present invention, immediate viewing without buffering time is possible. The instant-on header (320) is less than 3% of the total multimedia data. SDP parsing completes within 50 ms to generate the SDP file, which is sent to the client, and the client receives the SDP file and decodes it within 50 ms.

In this embodiment, the RTP ports can use 50 ports, 6950 through 6999. In particular, video, audio, and text data are separated by type and sent simultaneously over separate ports, achieving immediate viewing with no wait before playback.

Specifically, transmission over separate ports was conventionally hard to commercialize because increased congestion between ports made it very unstable. In the present invention, however, a source IP (520), track ID 1 (530), and track ID 2 (540) are specified when the SDP file is generated, as shown in FIG. 32. Each track ID consists of the first 4 bits of the instant identifier (322) in the instant-on header, so video, audio, and text data are distinguished automatically by track ID. In this example, track ID 1 (530) denotes video and track ID 2 (540) denotes audio.

Because the track IDs are delivered to and recognized by the client through the SDP file, no congestion occurs even when video, audio, and text data are sent over separate ports. The data types preferably use port numbers that are spaced apart rather than adjacent.
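The port arrangement described above can be sketched as a simple allocator. The port range 6950 to 6999 and the preference for non-adjacent ports come from the description; the spacing of 10 and the function name are assumptions chosen only to make the idea concrete.

```python
RTP_PORT_MIN, RTP_PORT_MAX = 6950, 6999


def assign_ports(track_ids, spacing=10):
    """Give each track its own port, spaced apart within the RTP range.

    `spacing` is a hypothetical gap; the description only says the ports
    should not be adjacent.
    """
    if spacing < 2:
        raise ValueError("spacing must leave at least one port between tracks")
    ports = {}
    for i, track_id in enumerate(track_ids):
        port = RTP_PORT_MIN + i * spacing
        if port > RTP_PORT_MAX:
            raise ValueError("not enough ports in the range for this spacing")
        ports[track_id] = port
    return ports


print(assign_ports(["video", "audio", "text"]))
```

With the assumed spacing, video, audio, and text land on ports 6950, 6960, and 6970, leaving the ports in between free for uses such as the I-frame seek stream.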

The transport protocol used in the present invention is RTP (Real-time Transport Protocol), which carries 1,466 bytes per 1 ms, or 48,867 bytes per 33.33 ms, so I-frames for real-time seeking can be transmitted even for 1080p HD IPTV broadcasts.
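The per-frame figure above follows from the per-millisecond rate. The sketch below assumes that 33.33 ms stands for one frame interval at 30 frames per second; the frame rate is inferred from the interval and is not stated in the description.

```python
BYTES_PER_MS = 1466          # rate stated in the description
FRAMES_PER_SECOND = 30       # assumption: 33.33 ms is one frame at 30 fps

frame_interval_ms = 1000 / FRAMES_PER_SECOND
bytes_per_frame = round(BYTES_PER_MS * frame_interval_ms)

print(f"{frame_interval_ms:.2f} ms per frame, {bytes_per_frame:,} bytes per frame")
```

The result, 48,867 bytes per frame interval, matches the figure given in the description.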

When the streaming server loads a multimedia file, the streaming server's En-De Parser API produces the following SDP roll, based on the IETF standard.

v=0
o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
s=SDP Seminar
i=A Seminar on the session description protocol
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
m=audio 49170 RTP/AVP 0
b=AS:64
b=RS:800
b=RR:2400
m=video 51372 RTP/AVP 31
b=AS:256
b=RS:800
b=RR:2400
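To illustrate how a client might read such a description, the following minimal sketch splits the listing into session-level fields and per-media sections with their bandwidth lines. It is not the En-De Parser of the invention; the function name and the returned structure are assumptions, and media-level lines other than `b=` are ignored for brevity.

```python
EXAMPLE_SDP = """\
v=0
o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
s=SDP Seminar
i=A Seminar on the session description protocol
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
m=audio 49170 RTP/AVP 0
b=AS:64
b=RS:800
b=RR:2400
m=video 51372 RTP/AVP 31
b=AS:256
b=RS:800
b=RR:2400
"""


def parse_sdp(text):
    """Return (session_fields, media_sections) for a simple SDP description."""
    session, media = {}, []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        if key == "m":
            # m=<media> <port> <proto> <fmt list>
            kind, port, proto, fmt = value.split(maxsplit=3)
            media.append({"type": kind, "port": int(port), "proto": proto,
                          "fmt": fmt, "bandwidth": {}})
        elif key == "b":
            # b=<bwtype>:<bandwidth>, attached to the current media section
            bw_type, _, amount = value.partition(":")
            target = media[-1]["bandwidth"] if media else session.setdefault("bandwidth", {})
            target[bw_type] = int(amount)
        elif not media:
            session[key] = value
    return session, media


session, media = parse_sdp(EXAMPLE_SDP)
for section in media:
    print(section["type"], section["port"], section["bandwidth"])
```

Each `m=` line opens a new media section, which is how the audio and video streams end up with separate ports and separate bandwidth settings.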

The IETF SDP specification provides the 'a' option, which lets users apply separately developed techniques. Using this option in conjunction with the 't' option, which carries the time information of the standard SDP file format above, the hard-disk sectors that store the I-frames of the multimedia file and their associated audio are encrypted and added as shown in FIG. 32(A). The server's En-De Parser is programmed to make this addition to the SDP roll.

The SDP file stores the I-frame addresses of the multimedia data on the hard disk, and it is sent to the client when the multimedia data is first transmitted. Thereafter, when the player's position bar is moved on the client, the server can go directly to the address stored in the SDP file and load only the I-frames, without searching the hard disk for the data of the requested scene. The address is linked to the instant identifier of the instant-on head, through which the I-frame is loaded.
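The direct access described above amounts to an index lookup. The sketch below assumes the I-frame addresses from the SDP file have already been decoded into a list of (media time in seconds, disk address) pairs sorted by time; the actual attribute format is shown only in FIG. 32 and is not reproduced here, so this structure and the function name are hypothetical.

```python
import bisect


def nearest_iframe(index, seek_s):
    """Return the (time, address) entry of the last I-frame at or before seek_s.

    `index` must be sorted by time. Seeks before the first entry return the
    first I-frame.
    """
    if not index:
        raise ValueError("I-frame index is empty")
    times = [t for t, _ in index]
    i = bisect.bisect_right(times, seek_s) - 1
    return index[max(i, 0)]


# Hypothetical index: one I-frame every 2 seconds.
iframe_index = [(0.0, 0), (2.0, 4096), (4.0, 9000)]
print(nearest_iframe(iframe_index, 3.1))
```

A binary search over the index replaces a scan of the media file, which is the effect the description attributes to storing the addresses in the SDP file.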

Referring to FIG. 31, the instant-on head of this embodiment is provided repeatedly between standard extension heads. This arrangement can be varied so that an instant-on head is placed between every two or three standard extension heads. The addresses stored in the SDP file are matched to the instant identifiers of the instant-on heads, so that as the user moves the position bar, the corresponding instant-on head can be accessed directly.

When the position bar is moved on the client, a signal is sent via RTCP (Real Time Control Protocol) on port 554. Using the address stored in the SDP file, the streaming server goes directly to the data without searching the hard disk. It also uses the address to go directly to the instant-on head, among the repeated instant-on heads, that corresponds to the position of the position bar, and it loads and transmits only the I-frames in that head's metadata.

In this way, the multimedia data is skipped and only I-frames are transmitted by way of the instant-on heads, so real-time seeking (Jog-Shuttle) is possible without increasing bandwidth.

The present invention adds instant-on technology to compression and transmission products built on an implementation of the international IPTV technical standards for compression and transmission. It supports immediate viewing, so that, unlike before, viewers do not have to wait 5 seconds or more when selecting content or changing channels, even over public networks. Because it removes this inconvenience for viewers, its industrial applicability is very high.

100: streaming server 200: analog multimedia data input means
300: encoder 310: standard fixed head
320: instant-on head 330: standard extension head
400: client

Claims

A multimedia data streaming system comprising an encoder that converts multimedia data into main digital data and generates an RTP packet using the main digital data; a streaming server that receives the RTP packet from the encoder and has a Hinted module for interpreting head information of the RTP packet; and a client that receives the RTP packet from the streaming server and restores and plays it, wherein
the encoder constructs the RTP head of the RTP packet from a standard fixed head and a standard extension head,
an instant-on head consisting of an instant identifier, an instant size, and metadata is provided in front of the standard extension head, and
the instant size is the size of the metadata, and the metadata is compressed data of the main digital data.

The system of claim 1,
wherein the metadata is the main digital data of the field immediately following the Name and Length fields in the standard extension head.

The system of claim 1,
wherein the instant identifier, the instant size, and the metadata of the instant-on head are composed in a single field.

The system of claim 1,
wherein the encoder generates an 8-bit instant identifier corresponding to the Name field of the standard extension head.

The system of claim 1,
wherein the first 4 bits of the instant identifier are a track ID determined according to whether the data type is video, audio, or text.

The system of claim 5,
wherein the streaming server generates an SDP file in which the track ID is specified and transmits the data over separate ports according to the track ID.

The system of claim 6,
wherein the streaming system is embedded in the network kernel of the Linux operating system so that it responds immediately to a request from the client.

The system of claim 6,
wherein the client receives the SDP file, obtains information about the streaming media transmitted from the streaming system, prepares for playback, and plays the media immediately.

The system of claim 1,
wherein the streaming server stores, in an SDP file, the addresses of the I-frames of the multimedia data stored on the hard disk and transmits the SDP file to the client.

The system of claim 9,
wherein the instant-on head is provided repeatedly between standard extension heads, and
when a position-bar movement signal is transmitted from the client, the streaming server uses the I-frame address stored in the SDP file to load, from the hard disk, the I-frame of the instant-on head corresponding to the position of the position bar and transmits it to the client.