JP4596693B2 - Streaming method and system for executing the same

Info

Publication number: JP4596693B2
Application number: JP2001202147A
Authority: JP
Inventors: 英明春元; 優希堀内; 隆久藤田
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2000-07-06
Filing date: 2001-07-03
Publication date: 2010-12-08
Anticipated expiration: 2021-07-03
Also published as: JP2002084339A

Description

【０００１】
【発明の属する技術分野】
本発明は、ストリーミング方法に関し、より特定的には、サーバが端末へインターネットを通じてマルチメディアデータを送信し、かつ端末がそのデータを受信しつつ再生するためのストリーミング方法に関する。
【０００２】
【従来の技術】
（マルチメディアデータの符号化圧縮方式およびバッファモデルの説明）
インターネットでの伝送に使用されるマルチメディアデータには、動画、静止画、音声、テキスト、およびそれらが多重化されたデータ等、さまざまな種類がある。動画では、Ｈ．２６３やＭＰＥＧ１、２、４といった符号化圧縮方式が著名であるし、静止画としてはＪＰＥＧ、音声では、ＭＰＥＧオーディオ、Ｇ．７２９など枚挙にいとまがない。
【０００３】
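Since the present invention focuses on streaming playback, video and audio are the data to be transmitted. This section describes MPEG video, the representative video compression scheme, and in particular MPEG-1 (ISO/IEC 11172) video, whose mechanism is comparatively simple, and MPEG-2 (ISO/IEC 13818) video.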
【０００４】
ＭＰＥＧビデオは、高効率なデータ圧縮を実現するために、主に次の２つの特徴を有している。一つ目は、動画像データの圧縮において、従来から行われていた空間周波数特性を用いた圧縮方式の他に、フレーム間での時間相関特性を用いた圧縮方式を取り入れたことである。ＭＰＥＧでは、ストリームを構成している各フレーム（ピクチャとも呼ぶ）を、Ｉフレーム（フレーム内符号化ピクチャ）、Ｐフレーム（フレーム内符号化と過去からの参照関係を使用したピクチャ）、Ｂフレーム（フレーム内符号化と過去および未来からの参照関係を使用したピクチャ）の３種類に分類してデータ圧縮を行う。これらの３種では、Ｉフレームが最も大きく（つまり情報量が最も多く）、次いでＰ、Ｂの順である。圧縮アルゴリズムにも大きく依存するが、情報量の比は、おおよそＩ：Ｐ：Ｂ＝４：２：１程度となる。また一般的に、ＭＰＥＧビデオストリームは、１５フレーム（＝１ＧＯＰ）を単位として、１ＧＯＰについてＩフレームが１枚、Ｐフレームが４枚、Ｂフレームが１０枚の割合で含まれている。
【０００５】
ＭＰＥＧビデオの二つ目の特徴は、画像の複雑さに応じた動的な符号量割り当てをピクチャ単位で行える点である。ＭＰＥＧのデコーダは、デコーダバッファを備え、このデコーダバッファにデータを蓄積してからデコードを行うことで、圧縮の難しい複雑な画像に対して大量の符号量を割り当てることが可能になっている。ＭＰＥＧに限らず動画圧縮では、標準的なデコーダバッファの容量を規格で定義する場合が殆どである。ＭＰＥＧ１やＭＰＥＧ２の場合、標準デコーダバッファは、規格で容量が２２４ＫＢｙｔｅと定義されており、ＭＰＥＧエンコーダは、この容量の範囲内でデコーダバッファ占有量が遷移するようにピクチャデータを生成しなければならない。
【０００６】
図１９（Ａ）〜（Ｃ）は、従来のストリーミング方法を説明するための図である。図１９（Ａ）は、ビデオフレームを示す図、図１９（Ｂ）は、バッファ占有量の遷移を模式的に示した図、図１９（Ｃ）は、従来端末の構成例を示す図である。図１９（Ｃ）において、端末は、ビデオバッファと、ビデオデコーダと、Ｉ，Ｐ並べ替えバッファと、スイッチとを備えている。ビデオバッファが、前述のデコーダバッファに相当し、転送されてくるデータは、ビデオバッファに蓄積された後、ビデオデコーダによってデコードされる。デコードされたデータは、Ｉ，Ｐ並べ替えバッファおよびスイッチを通じて再生時刻順に並べ替えられる。
【０００７】
図１９（Ｂ）において、縦軸はバッファ占有量（ビデオバッファのデータ蓄積量）を、横軸は時間を示し、図中の太線がバッファ占有量の時間的遷移を示している。また、太線の傾きは、ビデオのビットレートに相当し、一定のレートでデータがバッファに入力されていることを示している。また、一定間隔（３３．３６６７ｍｓｅｃ）でバッファ占有量の削減が起こっているが、これは、一定周期で各フレームのデータがデコードされていくことによる。また、斜め点線と時間軸との交点は、各ビデオフレーム内のデータがビデオバッファへ向けて転送開始される時刻を示している。従って、図１９（Ａ）に示されたフレームＸの転送開始時刻はｔ１、フレームＹの転送開始時刻はｔ２となる。
【０００８】
図１９（Ａ），（Ｂ）において、ビデオの先頭フレームＸがビデオバッファに入力開始される時刻ｔ１から、最初にデコードが実行される時刻（図中、太線の第１の立ち下がり位置）までの時間を一般に、ｖｂｖ＿ｄｅｌａｙ時間と呼ぶ。最初のデコードは、ビデオバッファが満杯になった瞬間に実行されるので、ｖｂｖ＿ｄｅｌａｙ時間は、通常、データ入力開始から容量２２４ＫＢｙｔｅのビデオバッファが満杯になるまでの時間であり、従って、ビデオの入力が開始されてから、デコーダを通じてビデオ再生が開始されるまでの初期遅延時間（頭出し時の待ち時間）ということになる。
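
The relationship described above (the initial delay is the time needed to fill the decoder buffer at the input rate) can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function name is hypothetical, the input rate is taken to be constant, and 224 KByte is read as 224 × 1024 bytes.

```python
def vbv_delay_seconds(buffer_bytes: int, bitrate_bps: float) -> float:
    """Time to fill an initially empty buffer at a constant input bit rate.

    Assumption: the first decode happens at the instant the buffer is full,
    as in the conventional model described in the text.
    """
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    return buffer_bytes * 8 / bitrate_bps


# Illustrative values only: a 224 KByte buffer fed by a 1.5 Mbit/s stream.
delay = vbv_delay_seconds(224 * 1024, 1_500_000)
print(f"initial delay: {delay:.2f} s")
```

Adding a jitter-absorbing receive buffer increases `buffer_bytes` in this model, which is why the start-up wait grows with the amount of extra buffering.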
【０００９】
図１９（Ａ）のフレームＹが複雑な画像である場合、図１９（Ｂ）に示されているように、その符号量が大量なので、フレームＹのデコード時刻（図中のｔ３）よりも早い時刻（図中のｔ２）から、ビデオバッファへのデータ転送を開始しなければならない。ただし、どんなに複雑な画像でも、バッファを占有するピクチャ量は、２２４ＫＢｙｔｅの許容範囲内である。
【００１０】
図１９（Ｂ）に示したバッファ遷移がきちんと保たれるようにビデオバッファにデータが転送されるならば、ビデオバッファのアンダーフローやオーバーフローによるストリーミング破綻が起こらないことは、ＭＰＥＧの規格で保証されている。
【００１１】
（ネットワーク転送ジッタ吸収用の受信バッファの説明）
ところが、図２０に示すように、サーバ２０１と端末２０２とをネットワーク２０３で接続し、ストレージ２１０中のＭＰＥＧデータを配信する場合、生成モジュール２１１でパケットを生成する時間や、ネットワーク機器２０４，２０５における転送手続き時間、ネットワーク２０３の混雑などに伴なう伝送遅延時間などのために、データの転送レートに揺れが生じる。従って、実際には、図１９（Ｂ）に示したバッファ遷移が保たれないのが実情である。このような転送レートの揺れ（ジッタ）を緩和吸収する方法としては、まず、ネットワークの帯域に比べ十分小さい符号化レートのコンテンツを流すことが考えられる。しかし、ネットワーク資源をできる限り有効に使って高品位な映像や音声を提供する必要があるので、この方法は適切ではない。そこで、一般には、ネットワーク機器２０４，２０５に、それぞれ適当な容量の送受信バッファ２０６，２０７を設け、普段からデータを多少先送り気味に転送しておいて、データ転送に遅延が発生した時の不足を補う方法が採用される。
【００１２】
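Here, providing the receive buffer 207 on the terminal 202 side is, in the end, roughly equivalent to raising the upper limit of the buffer occupancy in the buffer transition of FIG. 19(B) from 224 KByte, the standard size of the decoder buffer 208, by the amount that the receive buffer 207 can store. FIGS. 21(A) and 21(B) show the buffer occupancy before and after the receive buffer 207 is added, side by side. FIG. 21(A) shows the same buffer transition as FIG. 19(B).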
【００１３】
受信バッファ２０７の追加によって、バッファ遷移の許容範囲が拡がり、その結果、図１９（Ｂ）のバッファ遷移、すなわち図２１（Ａ）のバッファ遷移は、図２１（Ｂ）のようになって、ネットワークの転送レートが低下しても、アンダーフローを回避することが可能となる。反面、ｖｂｖ＿ｄｅｌａｙ時間が、受信バッファ２０７による蓄積量に相当する時間だけ長くなり、デコーダ２０９でのデコード開始および再生装置２１２での再生開始が遅れる。つまり、頭出し時間が、受信バッファ２０７へのデータ蓄積にかかる時間の分だけ長くなる。
【００１４】
【発明が解決しようとする課題】
以上から明らかなように、小規模ＬＡＮなどの信頼性や伝送速度の保証されたネットワーク環境において、ＭＰＥＧ等のマルチメディアデータをストリーミング再生する場合には、基本的に、コーデックの規格で定められた再生初期遅延時間（ｖｂｖ＿ｄｅｌａｙ）やデコーダバッファ遷移をきちんと遵守するようなシステム設計になってさえいれば、デコーダバッファのアンダーフローやオーバーフローが起こってストリーミング再生が破綻をきたすことはない。
【００１５】
しかしながら、インターネットなどの広域ネットワーク環境においては、通信経路の伝送特性変動に伴なう転送ジッタが無視できないほど大きいため、従来の端末２０２は、コーデックの規格で定められたデコーダバッファ（ｖｂｖバッファ）に加えて、図２０の受信バッファ２０７のような、転送ジッタ吸収のためのバッファを持つ場合が多い。このとき、次のような課題が存在する。
【００１６】
端末に搭載されるジッタ吸収用のバッファの容量は、一般に、機種によって様々である。そのため、同じデータを同じ条件下で配信しても、バッファ容量の多い機種ではストリーミング再生を破綻なく行えるが、少ない機種では、ジッタを吸収しきれずに破綻する場合があった。
【００１７】
この課題を解決するには、例えば、端末の搭載メモリ量を増やして、ジッタ吸収用のバッファ容量を十分確保すればよい。しかしながら、搭載メモリ量は、端末の価格を決める主な要因の一つであり、可能な限り少なく抑えたい要求がある。加えて、ジッタ吸収用のバッファ容量が多すぎると、再生開始までの頭出し時間が長くなって、ユーザにいらだちを感じさせてしまうという新たな問題が発生する。
【００１８】
それゆえに、本発明の目的は、端末のバッファ容量が機種によって異なっていても、ネットワークの伝送能力が変動しても、バッファのアンダーフローやオーバーフローによるストリーミング再生の破綻を回避することが可能であり、しかも、ストリーミング再生の破綻回避と、頭出し時の待ち時間短縮とを互いに両立させることができるようなストリーミング方法を提供することである。
【００１９】
【課題を解決するための手段および発明の効果】
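A first invention is a streaming method in which a server transmits stream data to a terminal over a network and the terminal plays back the stream data while receiving it, the method comprising:
a target amount determination step in which the terminal determines, in relation to its own buffer capacity and the transmission capability of the network, a target amount of stream data to be stored in its buffer;
a delay time determination step in which the terminal arbitrarily determines, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, a delay time from when the head data of the stream is written to its buffer until that data is read out and playback starts;
a step in which the terminal notifies the server of the determined target amount and delay time; and
a control step in which the server, when transmitting the stream data to the terminal over the network, controls the transmission speed on the basis of the notified target amount and delay time.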
【００２０】
上記第１の発明では、端末が、自身のバッファ容量とネットワークの伝送能力とに応じた目標量を決定し、さらに、バッファ容量を伝送能力で除して得られる値を超えない範囲内で、遅延時間を決定する。サーバは、こうして端末が決定した目標量および遅延時間に基づいて送信速度を制御するので、端末のバッファ容量が機種によって異なっていても、ネットワークの伝送能力が変動しても、バッファ量および伝送能力に応じた送信速度制御が行え、その結果、バッファのアンダーフローやオーバーフローによるストリーミング再生の破綻を回避することが可能となる。しかも、目標量とは独立に遅延時間が決定されるので、ストリーミング再生の破綻回避と、頭出し時の待ち時間短縮とを互いに両立させることができる。
【００２１】
ここで、遅延時間が、バッファ容量を伝送能力で除して得られる値以下に制限されるのは、遅延時間がこの値を超えると、ストリーミング再生の破綻が起こる恐れがあるためである。この値を超えない範囲であれば、遅延時間をどのような値に決めてもよい。ただし、値を決める際には、伝送能力の変動に対する耐性と、頭出し時の待ち時間との間のバランスが考慮される。
【００２２】
第２の発明は、第１の発明において、
制御ステップにおいて、サーバは、
端末のバッファに蓄積されているストリームデータの量が、当該目標量の近傍において当該目標量を超えることなく遷移するように、当該送信速度を制御することを特徴とする。
【００２３】
上記第２の発明では、蓄積量が目標量の近傍において目標量を超えることなく遷移するので、バッファのアンダーフローやオーバーフローが起こりにくい。
【００２４】
第３の発明は、第２の発明において、
制御ステップにおいて、サーバは、当該送信速度と、当該遅延時間と、端末がストリームデータをデコードする速度とに基づいて、端末のバッファに蓄積されるストリームデータの量を予測算出することを特徴とする。
【００２５】
上記第３の発明では、サーバが蓄積量を予測算出して、その量に基づいて送信速度制御を行うので、蓄積量を目標量の近傍で目標量を超えないように遷移させることができる。
【００２６】
ここで、端末が現在の蓄積量をサーバに通知し、サーバは、通知に基づいて送信速度制御を行ってもよい。しかし、この場合、端末からサーバへの情報伝達に時間がかかるので、サーバは、過去の蓄積量に基づいて送信速度制御を行うことになる。そのため、蓄積量を目標量の近傍で目標量を超えないように遷移させることができるとは限らない。
【００２７】
第４の発明は、第１の発明において、
端末が、ネットワークの伝送能力が所定の閾値を跨いで変化したことを検出する検出ステップ、
検出ステップでの検出結果に応じて、端末が当該目標量を変更する目標量変更ステップ、および
変更後の目標量を、端末がサーバに通知するステップをさらに備え、
制御ステップにおいて、サーバは、変更後の目標量の通知を受けると、端末のバッファに蓄積されるストリームデータの量が、当該変更後の目標量の近傍において当該変更後の目標量を超えることなく遷移するように、当該送信速度を制御することを特徴とする。
【００２８】
上記第４の発明では、伝送能力が閾値を跨いで変化すると、端末によって目標量が変更される。サーバは、変更後の目標量の近傍において変更後の目標量を超えることなく遷移するように送信速度を制御して、目標量の変更に追従する。
【００２９】
第５の発明は、第４の発明において、
検出ステップでネットワークの伝送能力が第１の閾値を跨いで低下したことを検出すると、端末は、目標量変更ステップにおいて、当該目標量を増加させる向きに変更し、
制御ステップにおいて、サーバは、当該目標量が増加されたのに応じて、当該送信速度を上昇させる向きに制御することを特徴とする。
【００３０】
上記第５の発明では、伝送能力が第１の閾値を跨いで変化すると、端末によって目標量が増加される。サーバは、送信速度を上昇させることにより、目標量の増加に追従する。
【００３１】
第６の発明は、第５の発明において、
当該第１の閾値は、実現可能な最大の伝送能力と、ストリームデータの転送ロスが発生し始めるような伝送能力との略中間の値であることを特徴とする。
【００３２】
上記第６の発明では、伝送能力が低下しつつあるとき、ストリームデータの転送ロスが発生し始める前に、送信速度を上昇させて蓄積量を増やしておく。これにより、伝送能力の低下が進行したときに、ストリーミング再生が破綻するのを防ぐことができる。
【００３３】
第７の発明は、第４の発明において、
検出ステップでネットワークの伝送能力が当該第１の閾値より小さい第２の閾値を跨いで低下したことを検出すると、端末は、目標量変更ステップにおいて、当該目標量を減少させる向きに変更し、
制御ステップにおいて、サーバは、当該目標量が減少方向に変更されたのに応じて、当該送信速度を低下させる向きに制御することを特徴とする。
【００３４】
上記第７の発明では、伝送能力が第２の閾値を跨いで変化すると、端末によって目標量が減少される。サーバは、送信速度を低下させることにより、目標量の減少に追従する。
【００３５】
第８の発明は、第７の発明において、
当該第２の閾値は、ストリームデータの転送ロスが発生し始めるような伝送能力と対応する値であることを特徴とする。
【００３６】
上記第８の発明では、伝送能力の低下が進行して、ストリームデータの転送ロスが発生し始めると、一転、送信速度を低下させる。失われたデータの再送処理を妨害しないためである。
【００３７】
ここで、送信速度を低下させる場合、サーバは、低下幅に応じた頻度でフレームの送信をスキップしなければならない。フレームがスキップされると、端末が再生して得られる映像や音声の品位劣化が起こる。この品位劣化を抑えるために、下記第９の発明では、スキップされるフレームとして、再生時刻に間に合わないフレームが選択される。下記第１０の発明では、スキップされるフレームとして、重要度の低いフレームと、重要度は高いが再生時刻に間に合わないようなフレームとが選択される。
【００３８】
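A ninth invention is the eighth invention, wherein,
when the terminal changes the target amount in the decreasing direction in the target amount changing step, the server, in the control step, successively compares the playback time of each frame of the stream to be transmitted with the current time and skips transmission of any frame whose playback time is earlier than the current time, thereby controlling the transmission speed in the decreasing direction.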
【００３９】
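In the ninth invention, frames that would not arrive in time for their playback time are skipped selectively, so the quality degradation caused by lowering the transmission speed is smaller than with random skipping.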
【００４０】
第１０の発明は、第８の発明において、
目標量変更ステップにおいて、端末が当該目標量を減少させる向きに変更すると、制御ステップにおいて、サーバは、送信しようとするストリームを構成する各フレームの重要度を基準値と逐次比較して、
重要度が基準値未満であるフレームについては、全て送信をスキップし、
重要度が基準値以上であるフレームについては、それぞれの再生時刻を現在時刻と逐次比較して、再生時刻が現在時刻よりも古いものだけ送信をスキップし、それによって当該送信速度を低下方向に制御することを特徴とする。
【００４１】
上記第１０の発明では、重要度の低いフレームと、重要度は高いが再生時刻に間に合わないようなフレームとが選択的にスキップされるので、無作為的にスキップするのと比べて、送信速度の低下による品位劣化を少なく抑えることができる。
【００４２】
ここで、第１０の発明のような、スキップすべきフレームを選択する際に再生時刻に間に合うか否かに加えて重要度をも考慮する方法は、典型的には、ＭＰＥＧによる映像フレームに対して用いられる。この場合、送信速度を低下させるとき、ＰやＢのフレームが重要度の低いフレームとしてスキップされる一方、Ｉフレームは、重要度の高いフレームとして、再生時刻に間に合わない場合を除いてスキップされることがないので、送信速度低下による再生画像の品位劣化が最小限に抑えられる。なお、ＭＰＥＧによる音声フレームの場合、フレーム間に重要度の違いがないので、再生時刻に間に合うか否かだけを考慮すればよい。
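
The selection rule of the tenth invention can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Frame` type, the numeric importance ranking, and the reference value are all assumptions introduced here; the text only states that I frames count as more important than P and B frames.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str             # "I", "P" or "B"
    playback_time: float  # seconds from the start of the stream


# Assumed ranking: only I frames reach the reference value.
IMPORTANCE = {"I": 2, "P": 1, "B": 0}


def should_skip(frame: Frame, now: float, reference: int = 2) -> bool:
    """Decide whether to skip a frame while the sending rate is being lowered."""
    if IMPORTANCE[frame.kind] < reference:
        return True                    # low importance: always skipped
    return frame.playback_time < now   # high importance: skipped only if late


frames = [Frame("I", 4.0), Frame("B", 6.0), Frame("P", 7.0), Frame("I", 8.0)]
to_send = [f for f in frames if not should_skip(f, now=5.0)]
```

For audio, where all frames are equally important, the same function applies if every frame is given an importance at or above the reference value, so that only the lateness test remains.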
【００４３】
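An eleventh invention is a system comprising a server that transmits stream data over a network and a terminal that plays back the stream data while receiving it, wherein
the terminal comprises:
target amount determination means for determining, in relation to its own buffer capacity and the transmission capability of the network, a target amount of stream data to be stored in its buffer;
delay time determination means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, a delay time from when the head data of the stream is written to its buffer until that data is read out and playback starts; and
means for notifying the server of the determined target amount and delay time, and
the server comprises control means for controlling the transmission speed, when transmitting the stream data to the terminal over the network, on the basis of the notified target amount and delay time.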
【００４４】
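A twelfth invention is a terminal that is used together with a server transmitting stream data over a network and that plays back the stream data while receiving it, wherein
the server is provided with control means for controlling the transmission speed, when transmitting the stream data to the terminal over the network, on the basis of a notified target amount and delay time, the terminal comprising:
target amount determination means for determining, in relation to its own buffer capacity and the transmission capability of the network, a target amount of stream data to be stored in its buffer;
delay time determination means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, a delay time from when the head data of the stream is written to its buffer until that data is read out and playback starts; and
means for notifying the server of the determined target amount and delay time.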
【００４５】
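A thirteenth invention is a server that is used together with a terminal playing back stream data while receiving it and that transmits the stream data over a network, wherein
the terminal is provided with:
target amount determination means for determining, in relation to its own buffer capacity and the transmission capability of the network, a target amount of stream data to be stored in its buffer;
delay time determination means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, a delay time from when the head data of the stream is written to its buffer until that data is read out and playback starts; and
means for notifying the server of the determined target amount and delay time,
the server comprising control means for controlling the transmission speed, when transmitting the stream data to the terminal over the network, on the basis of the notified target amount and delay time, and
the control means controlling the transmission speed so that the amount of stream data stored in the buffer of the terminal stays in the vicinity of the target amount without exceeding it.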
【００４６】
第１４の発明は、上記第１の発明のようなストリーミング方法を記述したプログラムである。
【００４７】
第１５の発明は、上記第１４の発明のようなプログラムを記録した記録媒体である。
【００４８】
【発明の実施の形態】
以下、本発明の実施形態について、図面を参照しながら説明する。図１は、本発明の一実施形態に係るストリーミング方法を実行するサーバ・クライアント・システムの構成例を示すブロック図である。図１において、本システムは、サーバ１０１と、そのクライアントとして動作する端末１０２とを備えている。サーバ１０１側には、映像や音声のデータが蓄積されている。このデータは、ＭＰＥＧによって符号化圧縮されたデータである。サーバ１０１は、端末１０２からの要求に応じ、蓄積しているデータをパケット化してストリームを生成する。そして、ネットワーク１０３を通じ、生成したストリームを端末１０２に送信する。端末１０２は、サーバ１０１から送信されるストリームを受信してデコードし、得られた映像や音声を表示出力する。
【００４９】
図２は、図１のサーバ１０１の構成を示すブロック図である。図２において、サーバ１０１は、蓄積デバイス４１１と、送受信モジュール４０２と、生成モジュール４０５と、ＲＡＭ４０４と、ＣＰＵ４１２と、ＲＯＭ４１３とを備えている。蓄積デバイス４１１には、映像や音声のデータが蓄積されている。この蓄積デバイス４１１内のデータが生成モジュール４０５に与えられる。生成モジュール４０５は、読み出しバッファ４０７と、パケット生成回路４０６と、パケット生成バッファ４０８とを含み、与えられるデータをパケット化してストリームを生成する。
【００５０】
送受信モジュール４０２は、ネットワークコントローラ４１０と、送信バッファ４０９とを含み、生成モジュール４０５によって生成されたストリームを端末１０２へ、ネットワーク１０３経由で送信する。また、端末１０２からネットワーク１０３経由で送信されてくる情報を受信する。
【００５１】
送受信モジュール４０２によって受信された端末１０２からの情報は、ＲＡＭ４０４に書き込まれる。ＲＯＭ４１３には、サーバ制御プログラムが格納されており、ＣＰＵ４１２は、ＲＡＭ４０４に記憶されている端末からの情報を参照しつつＲＯＭ４１３内のプログラムを実行し、それによって、送受信モジュール４０２および生成モジュール４０５の制御を行う。なお、ここではプログラムがＲＯＭ４１３に格納されているとしたが、ＲＯＭ以外の記憶媒体、例えばハードディスクやＣＤ−ＲＯＭ等に格納されていてもよい。
【００５２】
図３は、図１の端末１０２の構成を示すブロック図である。図３において、端末１０２は、送受信モジュール５０７と、再生モジュール５１０と、表示デバイス５１１と、ＲＯＭ５０２と、ＣＰＵ５０３とを備えている。送受信モジュール５０７は、ネットワークコントローラ５０６と、受信バッファ５０５とを含み、サーバ１０１からネットワーク１０３経由で送信されてくるストリームを受信する。また、ＣＰＵ５０３からの情報をサーバ１０１へ、ネットワーク１０３経由で送信する。
【００５３】
送受信モジュール５０７によって受信されたストリームが、再生モジュール５１０に入力される。再生モジュール５１０は、デコーダバッファ５０８と、デコーダ５０９とを含み、入力されるストリームをデコードして再生する。再生モジュール５１０により再生されたデータが表示デバイス５１１に与えられ、表示デバイス５１１は、そのデータを映像に変換して表示する。
【００５４】
ＲＯＭ５０２には、端末制御プログラムが格納されており、ＣＰＵ５０３は、ＲＯＭ５０２内のプログラムを実行し、それによって、送受信モジュール５０７、再生モジュール５１０および表示デバイス５１１の制御を行う。
【００５５】
以上のように構成されたシステムの動作を、以下に説明する。図４は、図１のシステムの全体動作を説明するためのシーケンス図である。図４には、サーバ１０１側の送受信層および制御層と、端末１０２側の送受信層および制御層とが示されており、これら各層の間でやりとりされるコマンドやストリームが時系列的に並べられている。
【００５６】
最初、本システムの全体的な動作について、図４を用いて説明する。図４において、最初、端末１０２からサーバ１０１へ、コマンド”ＳＥＴＵＰ”が送信される。サーバ１０１では、”ＳＥＴＵＰ”に応じて初期設定が行われ、設定が完了すると、サーバ１０１から端末１０２へ、”ＯＫ”が応答される。
【００５７】
サーバ１０１から”ＯＫ”が返ってくると、端末１０２からサーバ１０１へ、コマンド”ＰＬＡＹ”が送信される。サーバ１０１では、送信準備が行われ、準備が完了すると、サーバ１０１から端末１０２へ、”ＯＫ”が応答される。
【００５８】
サーバ１０１から”ＯＫ”が返ってくると、端末１０２は、ストリームを待ち受ける態勢へと移行する。この”ＯＫ”の応答に引き続いて、サーバ１０１は、ストリームの送信を開始する。
【００５９】
その後、端末１０２からサーバ１０１へ、コマンド”ＴＥＡＲＤＯＷＮ”が送信され、サーバ１０１は、”ＴＥＡＲＤＯＷＮ”に応じてストリーム送信を終了する。送信が終了されると、サーバ１０１から端末１０２へ、”ＯＫ”が応答される。
サーバ１０１から”ＯＫ”が返ってくると、端末１０２は、ストリーム待ち受け態勢から脱する。
【００６０】
以上が本システムの全体的な動作の概要であり、上で説明した限りでは、本システムの動作は、従来と同様である。本システムの動作が従来と異なるのは、次の（１）および（２）の２点である。
（１）端末１０２からサーバ１０１へのコマンド”ＳＥＴＵＰ”にパラメータ”Ｓ＿ｔａｒｇｅｔ”および”Ｔ＿ｄｅｌａｙ”が添付されており、サーバ１０１は、ストリームを送信する際、これらのパラメータに基づいて送信速度を制御する。
【００６１】
上記（１）において、”Ｓ＿ｔａｒｇｅｔ”は、端末１０２がバッファに蓄積するデータ量の目標値であり、その端末１０２に備わるバッファ（図３の例では、受信バッファ５０５およびデコーダバッファ５０８）の総容量（”Ｓ＿ｍａｘ”）と、ネットワーク１０３の伝送能力とに基づいて決定される。従って、”Ｓ＿ｔａｒｇｅｔ”は、一般に、端末１０２の機種によって値が異なる。
【００６２】
また、”Ｔ＿ｄｅｌａｙ”は、端末１０２が先頭データをバッファに書き込んでから、そのデータを読み出してデコードを開始するまでの時間（つまり頭出し遅延時間）であり、”Ｓ＿ｔａｒｇｅｔ”を送信速度（後述）で除して得られる値を最大値として、その最大値を超えない範囲内で任意の値に決定される。すなわち、「”Ｓ＿ｔａｒｇｅｔ”を送信速度で除して得られる値を超えない範囲内で」という条件が付くものの、端末１０２は、”Ｓ＿ｔａｒｇｅｔ”とは独立に”Ｔ＿ｄｅｌａｙ”を決めることができる。
【００６３】
また、「送信速度」は、単位時間に送信する情報の量をいい、例えば、単位時間に送信するパケットの個数が決められている場合、各パケットに詰め込むデータの量を増減させることによって、送信速度を制御することができる。また、各パケットに詰め込むデータの量が決められている場合、パケットと次のパケットとの時間的な間隔を伸縮させることによって、送信速度を制御することができる。あるいは、両方を同時に行う（すなわち、各パケットに詰め込むデータの量を増減させ、かつパケットと次のパケットとの時間的な間隔を伸縮させる）ことによっても、送信速度を制御することができる。本実施形態では、各パケットに詰め込むデータの量を増減させることにより速度制御を行うものとする。
【００６４】
（２）端末１０２は、ストリームの配信中、必要に応じて”Ｓ＿ｔａｒｇｅｔ”を変更することができる。この場合、変更後の”Ｓ＿ｔａｒｇｅｔ”が端末１０２からサーバ１０１に通知され、以降、サーバ１０１は、変更後の”Ｓ＿ｔａｒｇｅｔ”に基づいて送信速度を制御する。
【００６５】
上記（２）において、”Ｓ＿ｔａｒｇｅｔ”の変更は、ネットワーク１０３の伝送能力の変動に応じて行われる。具体的には、端末１０２が携帯電話の場合、電界強度（例えば「強・中・弱・圏外」の４段階の強度）を検知することができるので、この電界強度の変化を「ネットワーク１０３の伝送能力の変動」と見なして、”Ｓ＿ｔａｒｇｅｔ”を変化させる。例えば、電界強度が「強」から「中」に変化すると、端末１０２は、”Ｓ＿ｔａｒｇｅｔ”の値をより大きな値に変更し、「中」から「弱」に変化すると、”Ｓ＿ｔａｒｇｅｔ”の値をより小さな値に変更する。
本システムの動作が従来と異なるのは、主として上の２点である。
【００６６】
次に、本システムの全体動作の具体例を詳細に説明する。図４において、端末１０２は、ストリーム再生を開始するのに先立ち、端末制御プログラムに従ってＣＰＵ５０３がＲＯＭ５０２より端末に固有のパラメータ群を抽出する。このパラメータ群中には、受信バッファ５０５とデコーダバッファ５０８とを合わせた総容量（端末１０２が実際に蓄積できる最大データ量）Ｓ＿ｍａｘが含まれる。一方、ＣＰＵ５０３は、ストリーム再生補助データなどの事前入手の手続きによって、受信したいストリームデータの符号化圧縮レートＶｒや、ビデオないしオーディオのフレーム発生周期Ｔｆｒｍを知っているものとする。また、ＣＰＵ５０３は、ネットワークインターフェースを通じ、ネットワーク１０３の伝送能力、たとえば携帯電話における受信電波強度や、通信速度（ＰＨＳの場合では６４Ｋｂｐｓ接続ないし３２Ｋｂｐｓ接続などの情報）も検出しているものとする。
【００６７】
ＣＰＵ５０３は、これらＳ＿ｍａｘ、Ｖｒ、Ｔｆｒｍ、ネットワーク１０３の伝送能力（例えば有効転送速度＝ｎｅｔｗｏｒｋＲａｔｅ）などをもとに、端末１０２内のバッファにどれだけのデータを蓄積するかを示す目標量Ｓ＿ｔａｒｇｅｔ、およびストリーム再生を始めるまでのプレバッファリング時間（すなわち頭出し遅延時間）Ｔ＿ｄｅｌａｙを決定する。
【００６８】
ここで、Ｓ＿ｔａｒｇｅｔの本質的な意味は、今から開始するストリーミング再生において、Ｓ＿ｔａｒｇｅｔ近傍かつそれを越えないように端末の蓄積バッファ量が遷移すれば、途切れなく正常にストリーミング再生が持続できるような基準値のことである。前述のように、Ｔ＿ｄｅｌａｙが大きいと頭出し遅延時間が長くなるが、ネットワーク１０３の転送ジッタに対しては強くなる。しかし、遅延時間があまり長いとサービス仕様として不適切なので、Ｔ＿ｄｅｌａｙを決める際には、転送ジッタに対する耐性と、頭出し時の待ち時間との間のバランスをとる配慮が必要である。
【００６９】
なお、Ｔ＿ｄｅｌａｙの代わり、もしくはＴ＿ｄｅｌａｙと併せて、端末１０２内のバッファにデータを何バイトまで充填したらデコードを開始するかという充填量Ｓ＿ｄｅｌａｙを決定してもよい。端末１０２がＳ＿ｄｅｌａｙのみを決定してサーバ１０１に通知する場合は、サーバ１０１側でＴ＿ｄｅｌａｙ＝Ｓ＿ｄｅｌａｙ／ｎｅｔｗｏｒｋＲａｔｅなる式を用いて、Ｓ＿ｄｅｌａｙをＴ＿ｄｅｌａｙに換算することが可能である。また、Ｓ＿ｄｅｌａｙの値は、バッファ総量Ｓ＿ｍａｘに対する充填率ｒＳ（％）であってもよい（この場合、換算式は、Ｓ＿ｄｅｌａｙ＝Ｓ＿ｍａｘ＊ｒＳ／１００となる）。
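
The two conversions given above (S_delay = S_max * rS / 100 and T_delay = S_delay / networkRate) can be written out as below. The function names are hypothetical, and the units are an assumption for illustration: sizes in KB and the network rate in KB/s.

```python
def s_delay_from_fill_ratio(s_max_kb: float, r_s_percent: float) -> float:
    """S_delay = S_max * rS / 100 (fill level at which decoding starts)."""
    if not 0 <= r_s_percent <= 100:
        raise ValueError("fill ratio must be between 0 and 100 percent")
    return s_max_kb * r_s_percent / 100


def t_delay_from_s_delay(s_delay_kb: float, network_rate_kb_per_s: float) -> float:
    """T_delay = S_delay / networkRate (server-side conversion)."""
    if network_rate_kb_per_s <= 0:
        raise ValueError("network rate must be positive")
    return s_delay_kb / network_rate_kb_per_s


# Illustrative values: 512 KB total buffer, start decoding at 25 % fill,
# effective transfer rate of 48 KB/s.
s_delay = s_delay_from_fill_ratio(512, 25)   # 128 KB
t_delay = t_delay_from_s_delay(s_delay, 48)  # about 2.67 s
```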
【００７０】
端末１０２は、Ｓ＿ｔａｒｇｅｔと、Ｔ＿ｄｅｌａｙ（および／またはＳ＿ｄｅｌａｙ）とを準備すると、図４に示されているように、サーバ１０１に対し、ストリーム配信の準備を促すＳＥＴＵＰコマンドを発行する。ＳＥＴＵＰコマンド中には、引数としてＳ＿ｔａｒｇｅｔと、Ｔ＿ｄｅｌａｙ（および／またはＳ＿ｄｅｌａｙ）とが含まれている。サーバ１０１は、ＳＥＴＵＰを受信すると、引数をＲＡＭ４０４に記憶して、ストリーム配信のための初期設定を行う。具体的には、サーバ１０１のＣＰＵ４１２が、最初ＲＡＭ４０４から引数を取り出し、次いで、たとえばストリームのソースファイルを蓄積デバイス４１１から読み出してバッファ４０７に書き込む操作と、読み出したデータをパケット化するパケット生成回路４０６のパラメータ設定とを行う。なお、パケット生成回路４０６は、必ずしも専用のハードウエアである必要はなく、サーバ１０１（例えばワークステーションなどで実現される）のＣＰＵ４１２に同様のパケット化処理を実行させるプログラム（ソフトウエアアルゴリズム）であってもよい。
【００７１】
前述のＳ＿ｔａｒｇｅｔと、Ｔ＿ｄｅｌａｙ（および／またはＳ＿ｄｅｌａｙ）との２つの値も、パケット生成回路４０６に引き渡される。パケット生成回路４０６では、これらの値を用いて最適なレート制御パラメータの算出が行われ、その結果、端末１０２へのストリーム配信に適したレートでパケットが生成、送出されるようになる。ネットワーク１０３中にパケットを送出する準備が正常に完了すると、図４のように、送受信層から制御層にＯＫが返り、次いで、端末１０２へ向けてＳＥＴＵＰコマンドに対するＯＫが返る。こうして、本システムにおいて、ストリーム配信準備が完了する。
【００７２】
次いで、端末１０２がサーバ１０１に対し、ストリーム配信の開始を促すＰＬＡＹコマンドを発行する。サーバ１０１は、ＰＬＡＹを受信すると、ストリームデータの配信を開始する。端末１０２は、サーバ１０１からのストリームデータを受信して蓄積する。そして、蓄積開始から前述のプレバッファリング時間（Ｔ＿ｄｅｌａｙ）が経過したのちに、ストリームデータのデコード、再生を開始する。このときストリーム配信は、ＳＥＴＵＰ時に設定された適切なレート制御パラメータに基づいてなされていることはいうまでもない。
【００７３】
ストリーム再生の終了時には、端末１０２よりサーバ１０１に対し、ＴＥＡＲＤＯＷＮコマンドが発行される。サーバ１０１は、ＴＥＡＲＤＯＷＮを受信すると、ストリーム配信の終了処理を行い、全手続きを完了させる。以上が、本システムの具体的な動作例である。
【００７４】
以下には、端末１０２の動作について詳細に説明する。端末１０２は、インターネットに接続可能な携帯電話であり、電界強度（受信電波強度）を検知する機能を持っているとする。図５は、図１の端末１０２の動作を示すフローチャートである。図５において、最初、端末１０２は、２つのパラメータＳ＿ｔａｒｇｅｔおよびＴ＿ｄｅｌａｙの値を決定する（ステップＳ１０１）。
【００７５】
ここで、上記ステップＳ１０１の処理内容を具体的に説明する。図６は、図３のＲＯＭ５０２の記憶内容を示す図である。図６に示すように、ＲＯＭ５０２内には、端末制御プログラムと、電界強度およびＳ＿ｔａｒｇｅｔが互いに対応付けて記載されたテーブル６０１と、パラメータＴ＿ｄｅｌａｙの値とが記憶される。パラメータＳ＿ｔａｒｇｅｔの値としては、電界強度「強」と対応するＳ＿ｔａｒｇｅｔ１、「中」と対応するＳ＿ｔａｒｇｅｔ２、「弱／圏外」と対応するＳ＿ｔａｒｇｅｔ３の３つの値が記憶されている。一方、パラメータＴ＿ｄｅｌａｙの値は、１つだけが記憶されている。
【００７６】
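The three values S_target1 to S_target3 are determined so as to satisfy the following relationship.
S_target3 < S_target1 < S_target2 ≤ S_max
The value T_delay, on the other hand, is determined so as not to exceed the value obtained by dividing S_max by the effective transmission capability of the network 103.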
【００７７】
一例として、Ｓ＿ｍａｘが５１２（ＫＢ）であれば、Ｓ＿ｔａｒｇｅｔ１＝２５６（ＫＢ），Ｓ＿ｔａｒｇｅｔ２＝３８４（ＫＢ），Ｓ＿ｔａｒｇｅｔ３＝１２８（ＫＢ）などのように決められる。また、ネットワーク１０３の実効的な伝送能力を３８４（Ｋｂｐｓ）すなわち４８（ＫＢ／ｓｅｃ）とすると、Ｔ＿ｄｅｌａｙは、５１２÷４８≒１０．７（秒）を超えない範囲で、任意の値（例えば４秒や３秒など）に決定される。
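
The arithmetic of this example can be checked with a short sketch. The helper names are hypothetical; the only rule taken from the text is that T_delay must not exceed S_max divided by the effective transmission capability.

```python
def max_t_delay(s_max_kb: float, rate_kbps: float) -> float:
    """Upper bound on T_delay: S_max divided by the effective rate in KB/s."""
    rate_kb_per_s = rate_kbps / 8  # 384 Kbps -> 48 KB/s
    return s_max_kb / rate_kb_per_s


def choose_t_delay(preferred_s: float, s_max_kb: float, rate_kbps: float) -> float:
    """Clamp a preferred start-up delay to the allowed maximum."""
    return min(preferred_s, max_t_delay(s_max_kb, rate_kbps))


bound = max_t_delay(512, 384)           # 512 / 48, roughly 10.7 seconds
t_delay = choose_t_delay(4, 512, 384)   # 4 seconds is within the bound
```

A smaller T_delay shortens the start-up wait; a larger one leaves more data buffered before playback begins and so tolerates more jitter.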
【００７８】
上記ステップＳ１０１では、ＲＯＭ５０２から、初期値としてのＳ＿ｔａｒｇｅｔ１と、値Ｔ＿ｄｅｌａｙとが読み出される。
【００７９】
なお、ここでは、Ｓ＿ｔａｒｇｅｔ１〜３と、Ｔ＿ｄｅｌａｙの値とが予め計算されてＲＯＭ５０２に記憶されており、ＣＰＵ５０３は、必要な値をＲＯＭ５０２から読み出すようにしているが、代わりに、バッファの総容量と、ネットワーク１０３の実効的な伝送能力と、Ｓ＿ｔａｒｇｅｔおよびＴ＿ｄｅｌａｙの値を計算するためのプログラムとをＲＯＭ５０２に記憶しておいてもよい。この場合、ＣＰＵ５０３は、必要があれば、その都度、ＲＯＭ５０２から容量、速度およびプログラムを読み出して、Ｓ＿ｔａｒｇｅｔおよびＴ＿ｄｅｌａｙの値を計算する。また、ここでは、Ｔ＿ｄｅｌａｙの値が１つだけであるが、複数の値を準備しておいて、その中から選択してもよい。以上がステップＳ１０１の処理内容である。
【００８０】
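Returning to FIG. 5, the terminal 102 attaches the S_target and T_delay determined in step S101 to the SETUP command and transmits it to the server 101 (step S102). In response, the server 101 sends the stream. When transmitting the stream, the server 101 performs transmission speed control based on the S_target and T_delay notified by the terminal (the server-side operation is described later).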
【００８１】
次に、端末１０２は、サーバ１０１から送られてくるストリームを受信して、バッファに書き込む動作を開始する（ステップＳ１０３）。具体的には、図３に示されているように、ネットワーク１０３を通じて送られてくるストリームは、まずネットワークコントローラ５０６を経由して受信バッファ５０５に書き込まれる。時間が経過して受信バッファ５０５が満杯になると、受信バッファ５０５内のストリームが先頭データから順番に読み出されて、デコーダバッファ５０８へと書き込まれていく。
【００８２】
次に、端末１０２は、バッファリング開始から時間がＴ＿ｄｅｌａｙだけ経過したか否かを判定し（ステップＳ１０４）、その判定結果が否定であれば、肯定となるまで待機する。ステップＳ１０４の判定結果が肯定となると、端末１０２は、バッファからストリームを読み出してデコード・再生する動作を開始する（ステップＳ１０５）。具体的には、図３において、ＣＰＵ５０３がバッファリング開始からの経過時間を計測しており、その計測結果がＲＯＭ５０２内のＴ＿ｄｅｌａｙと一致した瞬間、再生モジュール５１０に命じて、デコーダバッファ５０８内のストリームを先頭データから順番に読み出してデコーダ５０９に入力する処理を開始させる。
【００８３】
次に、端末１０２は、ネットワーク１０３の伝送能力が閾値を跨いで変化したか否かを判定する（ステップＳ１０６）。この判定は、具体的には、次のようにして行われる。例えば、ネットワーク１０３を管理するホストコンピュータ（図示せず）が、ネットワーク１０３の伝送能力に関する情報をネットワーク１０３経由で端末１０２に随時配信するようにし、端末１０２は、ホストコンピュータからの情報をもとに変化の有無を判定する。
【００８４】
この場合、具体的には、図３に示すように、伝送能力に関する情報が送受信モジュール５０７を通じてＣＰＵ５０３へと送られる。ＲＯＭ５０２には、予め閾値が格納されており、ＣＰＵ５０３は、送られてきた情報と、保持している前回の情報と、ＲＯＭ５０２内の閾値とを互いに比較することにより、伝送能力が閾値を跨いで変化したか否かを判定することができる。
【００８５】
または、ネットワーク１０３を管理するホストコンピュータがその伝送能力に関する情報を端末１０２に配信する機能を持たない場合、端末１０２は、例えば、次のようにして自ら判定を行うことができる。すなわち、端末１０２が携帯電話の場合、図７（後述）に示すように、周囲の電界強度を検知して、検知結果を「強・中・弱・圏外」のように表示する機能を持っている。この電界強度の変化をネットワーク１０３の伝送能力の変化と見なせば、端末１０２は、検出を簡単に行えることになる。
【００８６】
ステップＳ１０６の判定結果が肯定の場合、端末１０２は、新たなＳ＿ｔａｒｇｅｔを決定し（ステップＳ１０７）、それをサーバ１０１へ向けて送信する（ステップＳ１０８）。一方、ステップＳ１０６の判定結果が否定の場合、ステップＳ１０７，Ｓ１０８をスキップして、ステップＳ１０９（後述）を実行する。
【００８７】
ここで、上記ステップＳ１０６およびステップＳ１０７の処理内容について、詳しく説明する。以下では、端末１０２が携帯電話であり、電界強度の変化に応じてＳ＿ｔａｒｇｅｔを変更する場合を説明する。図７は、あるエリアにおける電界強度の分布と、端末の移動に伴う伝送能力の変化とを示す模式図である。図７（Ａ）には、３つの中継局Ｂ１〜Ｂ３を含むエリアにおける電界強度分布が示されている。図７（Ａ）において、各中継局Ｂ１〜Ｂ３を中心とする同心円が、互いに等しい電界強度の点を繋いでできる等電界曲線である。
【００８８】
例えば、中継局に最も近い同心円７０３内では、電界強度が「強」であり、この同心円７０３から次の同心円７０４までの間の領域では「中」となる。さらに、同心円７０４から同心円７０５までの間では「弱」、同心円７０５の外側の領域では「圏外」となる。ただし、各中継局を中心とする同心円は、一部が互いに交差しており、電界強度が「圏外」となる領域は、わずかしかない。
【００８９】
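Now, the terminal 102 is about to move along the route indicated by the arrow 702, from the vicinity of the relay station B1 to the vicinity of the relay station B2. FIG. 7(B) shows the field strength along the arrow 702 of FIG. 7(A) (which can be regarded as the transmission capability of the network 103). As shown in FIG. 7(B), the field strength is "strong" while the terminal 102 is near a relay station and weakens progressively to "medium", "weak", and "out of range" as the terminal moves away from the relay station B1. Immediately after leaving the range of the relay station B1, the terminal 102 enters the range of the relay station B2, and the field strength increases progressively to "weak", "medium", and "strong".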
【００９０】
上記のように移動する端末１０２は、電界強度が「強」から「中」へと変化した瞬間、ネットワーク１０３の伝送能力が閾値Ａを跨いで変化したと判定して、新たなＳ＿ｔａｒｇｅｔを決定し、「中」から「弱」へと変化した瞬間、伝送能力が閾値Ｂを跨いで変化したと判定して、新たなＳ＿ｔａｒｇｅｔを決定する。逆に、「弱」から「中」へと変化した瞬間、伝送能力が閾値Ｂを跨いで変化したと判定して、新たなＳ＿ｔａｒｇｅｔを決定し、「中」から「強」へと変化した瞬間、伝送能力が閾値Ａを跨いで変化したと判定して、新たなＳ＿ｔａｒｇｅｔを決定する。
【００９１】
なお、一般的には、閾値Ａは、ネットワーク１０３において実現可能な最大の伝送能力と、ストリームの転送ロスが発生し始めるような伝送能力との略中間の値である。閾値Ｂは、ストリームの転送ロスが発生し始めるような伝送能力と対応する値である。
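
Detecting that the transmission capability has crossed threshold A or B, using the four-level field strength as a proxy, might look like the sketch below. The numeric ranks and boundary values are assumptions made for illustration; the text only fixes that A lies between "strong" and "medium" and B between "medium" and "weak".

```python
# Assumed ordering of the four field-strength levels.
RANK = {"out_of_range": 0, "weak": 1, "medium": 2, "strong": 3}

# Threshold A sits between strong and medium, B between medium and weak.
BOUNDARY = {"A": 2.5, "B": 1.5}


def crossed_thresholds(prev: str, curr: str) -> list[str]:
    """Return the thresholds lying between the previous and current level."""
    low, high = sorted((RANK[prev], RANK[curr]))
    return [name for name, b in BOUNDARY.items() if low < b < high]


print(crossed_thresholds("strong", "medium"))      # ['A']
print(crossed_thresholds("medium", "weak"))        # ['B']
print(crossed_thresholds("weak", "out_of_range"))  # []
```

A non-empty result corresponds to an affirmative outcome of step S106, after which a new S_target is chosen.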
【００９２】
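The new S_target is determined by referring to the table 601 in the ROM 502 (FIG. 6), as follows. FIG. 8 is a flowchart showing the details of step S107 of FIG. 5. In FIG. 8, the terminal 102 first determines whether the field strength after the change is "strong" (step S201) and, if so, sets the new S_target to S_target1 (step S202). If the result of step S201 is negative, step S202 is skipped and processing proceeds to step S203.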
【００９３】
次に、端末１０２は、変化後の電界強度が「中」か否かを判定し（ステップＳ２０３）、判定結果が肯定であれば、新たなＳ＿ｔａｒｇｅｔをＳ＿ｔａｒｇｅｔ２に決定する（ステップＳ２０４）。ステップＳ２０３の判定結果が否定であれば、ステップＳ２０４をジャンプして、ステップＳ２０５に進む。
【００９４】
次に、端末１０２は、変化後の電界強度が「弱／圏外」か否かを判定し（ステップＳ２０５）、判定結果が肯定であれば、新たなＳ＿ｔａｒｇｅｔをＳ＿ｔａｒｇｅｔ３に決定し（ステップＳ２０６）、その後、図５のフローに戻る。ステップＳ２０５の判定結果が否定であれば、ステップＳ２０６をジャンプして、図５のフローに戻る。
【００９５】
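Accordingly, when the terminal 102 moves along the arrow 702 of FIG. 7(A), it changes the value of the parameter S_target, following the change in field strength, as S_target1 → S_target2 → S_target3 → S_target2 → S_target1. With the example values given earlier, this is 256 (KB) → 384 (KB) → 128 (KB) → 384 (KB) → 256 (KB). This concludes the specific processing example for steps S106 and S107.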
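
The table lookup of FIG. 8 reduces to a mapping from field strength to S_target. The sketch below uses the example values S_target1 = 256 KB, S_target2 = 384 KB, and S_target3 = 128 KB; the dictionary layout and the function name are assumptions, standing in for the table 601 held in ROM.

```python
# Stand-in for table 601: field strength -> S_target in KB.
S_TARGET_TABLE = {
    "strong": 256,        # S_target1
    "medium": 384,        # S_target2
    "weak": 128,          # S_target3
    "out_of_range": 128,  # S_target3 (weak and out of range share one entry)
}


def new_s_target(field_strength: str) -> int:
    """Steps S201 to S206 collapsed into a single lookup."""
    return S_TARGET_TABLE[field_strength]


# Field strength seen while moving from relay station B1 to B2.
path = ["strong", "medium", "weak", "medium", "strong"]
print([new_s_target(level) for level in path])
```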
【００９６】
再び図５において、ステップＳ１０８で端末１０２が新たなＳ＿ｔａｒｇｅｔをサーバ１０１へ向けて送信すると、それに応じて、サーバ１０１は、パラメータＳ＿ｔａｒｇｅｔの値を、端末１０２から新たに通知された値に変更して、送信速度制御を続行する。
【００９７】
次に、端末１０２は、ストリーミング再生を終了するか否かを判断し（ステップＳ１０９）、終了する場合は、サーバ１０１へコマンドＴＥＡＲＤＯＷＮを送信すると共に、ストリームの受信およびバッファリングを停止し（ステップＳ１１０）、次いで、再生処理を停止する（ステップＳ１１１）。一方、ストリーミング再生を継続する場合には、端末１０２は、ステップＳ１０６に戻って、上記と同様の処理を繰り返す。以上が、端末１０２の動作である。
【００９８】
次に、サーバ１０１の動作について詳細に説明する。なお、ここでは説明を簡単にするために、サーバ１０１は、ＭＰＥＧ１ビデオ（ＩＳＯ／ＩＥＣ１１１７２−２）、ＭＰＥＧ２ビデオ（ＩＳＯ／ＩＥＣ１３８１８−２）、あるいはＭＰＥＧ２−ＡＡＣオーディオ（ＩＳＯ／ＩＥＣ１３８１８−７）のような、固定周期Ｔｆｒｍでフレームを発生させる符号化圧縮アルゴリズムを用いて符号化を行い、かつ、固定周期Ｔｓで符号化データのパケット化を行うものとする。
このパケット化は、フレーム単位で行われるものとする。
【００９９】
最初、サーバ１０１が行うストリーム送信速度制御の概要について、図９〜図１１を用いて説明する。図９〜図１１は、サーバ１０１が行うストリーム送信速度制御によって、端末１０２のバッファに蓄積されているデータ量（バッファ占有量）がどのように遷移するかを示す図である。サーバ１０１は、送信先の端末１０２において、バッファ占有量が図９〜図１１に示されているごとく遷移するように、ストリームの送信速度を制御する。
【０１００】
図９には、バッファ占有量がＳ＿ｔａｒｇｅｔに近づいていく様子が示されている。図１０には、バッファ占有量がＳ＿ｔａｒｇｅｔの近傍で遷移している状態で、Ｓ＿ｔａｒｇｅｔの値がより大きな値（Ｓ＿ｔａｒｇｅｔ２）に変更された場合に、バッファ占有量がＳ＿ｔａｒｇｅｔ２に近づいていく様子が示されている。図１１には、バッファ占有量がＳ＿ｔａｒｇｅｔの近傍で遷移している状態で、Ｓ＿ｔａｒｇｅｔの値がより小さな値（Ｓ＿ｔａｒｇｅｔ３）に変更された場合に、バッファ占有量がＳ＿ｔａｒｇｅｔ３に近づいていく様子が示されている。
【０１０１】
図９〜図１１に共通して、”Ｓ＿ｍａｘ”は、端末１０２のバッファの総容量であり、”Ｓｕｍ”が、バッファ占有量である。”ｄｅｌｔａ（０，１，２，…）”は、サーバ１０１が単位時間Ｔｓあたりに送信するデータの量（すなわち、１つのパケットに詰め込まれているデータの量）を示す。ここで、単位時間Ｔｓは、サーバ１０１がパケットを送信する周期であり、固定値である。”Ｌ（０，１，２，…）”は、１つのフレームあたりのデータ量である。
【０１０２】
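Once the terminal 102 has notified the server 101 of the value of S_target (together with T_delay), the server 101 controls the transmission speed of the stream on the basis of the notified values. This speed control is performed by changing the amount of data packed into one packet.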
【０１０３】
図９において、サーバ１０１が最初に送信したパケット（ｉ＝０）には、量ｄｅｌｔａ０のデータが詰め込まれており、時刻ｔ＝０では、バッファ占有量Ｓｕｍは、ｄｅｌｔａ０となる。単位時間Ｔｓが経過すると、次のパケット（ｉ＝１）が送られてくるが、そこには、量ｄｅｌｔａ１のデータが詰め込まれている。従って、時刻ｔ＝Ｔｓでは、Ｓｕｍは、｛ｄｅｌｔａ０＋ｄｅｌｔａ１｝となる。以降、単位時間Ｔｓが経過する毎に、次々とパケット（ｉ＝２，３，…）が送られてきて、Ｓｕｍにｄｅｌｔａ２，ｄｅｌｔａ３，…が加算されていく。
【０１０４】
一方、３つ目のパケット（ｉ＝２）が送られてくる以前である時刻ｔ＝Ｔ＿ｄｅｌａｙに、バッファからデータを読み出してデコードする処理が開始される。デコードはフレーム単位で行われるので、ｔ＝Ｔ＿ｄｅｌａｙ以降、固定周期Ｔｆｒｍ毎に、ＳｕｍからＬ０，Ｌ１，Ｌ２…が減算されていく。
【０１０５】
すなわち、バッファ占有量Ｓｕｍは、時刻ｔ＝０以降、周期Ｔｓ毎に、ｄｅｌｔａ０，ｄｅｌｔａ１，…が加算されて、だんだん増加していく。その一方で、時刻ｔ＝Ｔ＿ｄｅｌａｙ以降、周期Ｔｆｒｍ毎にＬ０，Ｌ１，Ｌ２…が減算されていく。従って、Ｓｕｍが目標値Ｓ＿ｔａｒｇｅｔに達する直前までの期間は、１つのパケットに詰め込むデータ量を標準よりも多くして（より一般的には送信速度を速くして）、バッファへの書き込み速度がバッファからの読み出し速度を上回るようにし、それ以降は、１つのパケットに詰め込むデータ量を標準に戻して、書き込み速度と読み出し速度とを均衡させれば、Ｓｕｍを目標値Ｓ＿ｔａｒｇｅｔの近傍で遷移させることが可能となる。
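
The buffer model just described (add delta_i every Ts, subtract L_k every Tfrm from T_delay onward) can be simulated directly. This is a simplified sketch: packets are assumed to arrive instantaneously, and the function name and the numbers in the example are illustrative only.

```python
def simulate_sum(deltas, frame_sizes, ts, tfrm, t_delay):
    """Return (time, Sum) after every packet arrival and every frame decode."""
    # Packet i arrives at t = i * Ts and adds delta_i.
    events = [(i * ts, +d) for i, d in enumerate(deltas)]
    # Frame k is decoded at t = T_delay + k * Tfrm and removes L_k.
    events += [(t_delay + k * tfrm, -size) for k, size in enumerate(frame_sizes)]
    # Stable sort by time: at equal times an arrival is applied before a decode.
    events.sort(key=lambda e: e[0])

    total = 0
    trace = []
    for t, change in events:
        total += change
        trace.append((t, total))
    return trace


# Three packets, two decoded frames, decoding starts before packet i = 2.
trace = simulate_sum(deltas=[30, 30, 10], frame_sizes=[10, 10],
                     ts=1.0, tfrm=0.5, t_delay=1.5)
for t, occupancy in trace:
    print(f"t={t:.1f}  Sum={occupancy}")
```

In this toy run the first two packets carry more data than is consumed per period, and the third is smaller so that arrivals and decodes balance.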
【０１０６】
このような送信速度制御を行えば、図１０，図１１のように、目標値Ｓ＿ｔａｒｇｅｔが途中で新たな目標値（Ｓ＿ｔａｒｇｅｔ２，３）に変更された場合にも、Ｓｕｍを新たな目標値（Ｓ＿ｔａｒｇｅｔ２，３）の近傍で遷移させることが可能となる。
【０１０７】
すなわち、図１０において、Ｓｕｍが目標値Ｓ＿ｔａｒｇｅｔの近傍で遷移している状態で、Ｓ＿ｔａｒｇｅｔがより大きな値（Ｓ＿ｔａｒｇｅｔ２）に変更されると、サーバ１０１は、以降のパケット（ｉ＝３，４）に詰めるデータの量を増やすことによって、バッファへの書き込み速度がバッファからの読み出し速度を上回るようにする。Ｓｕｍが新たな目標値Ｓ＿ｔａｒｇｅｔ２に達して以降は、１つのパケットに詰め込むデータ量を標準に戻して、書き込み速度と読み出し速度とを均衡させればよい。
【０１０８】
また、図１１において、Ｓｕｍが目標値Ｓ＿ｔａｒｇｅｔの近傍で遷移している状態で、Ｓ＿ｔａｒｇｅｔがより小さな値（Ｓ＿ｔａｒｇｅｔ３）に変更されると、サーバ１０１は、以降のパケット（ｉ＝３，４）に詰めるデータの量を減らすことによって、バッファへの書き込み速度がバッファからの読み出し速度を下回るようにする。Ｓｕｍが新たな目標値Ｓ＿ｔａｒｇｅｔ３に達して以降は、１つのパケットに詰め込むデータ量を標準に戻して、書き込み速度と読み出し速度とを均衡させればよい。
【０１０９】
次に、上で説明したようなサーバ１０１による送信速度制御について詳細に説明する。図１２は、サーバ１０１が行う送信速度制御アルゴリズムの一例を示すフローチャートである。図１２において、最初、端末１０２が自身のバッファ占有量（Ｓｕｍ）を検出し、サーバ１０１は、端末１０２からバッファ占有量Ｓｕｍの通知を受ける（ステップＳ３０１）。次に、サーバ１０１は、ステップＳ３０１で通知されたバッファ占有量Ｓｕｍが、端末１０２から指定された目標値Ｓ＿ｔａｒｇｅｔの近傍で遷移しているか否かを判定する（ステップＳ３０２）。その判定結果が肯定であれば、現在の送信速度が維持される。
【０１１０】
ステップＳ３０２の判定結果が否定の場合、サーバ１０１は、ステップＳ３０１で通知されたバッファ占有量Ｓｕｍが目標値Ｓ＿ｔａｒｇｅｔよりも大きいか否かを判定する（ステップＳ３０３）。そして、判定結果が否定であれば、送信速度を現在よりも速い速度に変更し（ステップＳ３０４）、その後、ステップＳ３０６に進む。一方、ステップＳ３０３の判定結果が肯定であれば、送信速度を現在よりも遅い速度に変更し（ステップＳ３０５）、その後、ステップＳ３０６に進む。
【０１１１】
ステップＳ３０６では、速度制御動作を継続するか否かが判断され、判断結果が肯定の場合、ステップＳ３０１に戻って、上記と同様の動作が繰り返される。一方、判断結果が否定の場合、動作が終了される。以上が、サーバ１０１が行う送信速度制御の一例である。
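
One pass of the feedback loop in FIG. 12 can be sketched as a single decision function. The `tolerance` band that defines "in the vicinity of S_target" and the fixed `step` size are assumptions; the text does not specify either.

```python
def adjust_rate(sum_reported: float, s_target: float, rate: float,
                tolerance: float, step: float) -> float:
    """Return the new sending rate for one pass of steps S302 to S305."""
    # S302: occupancy already near the target, keep the current rate.
    if abs(sum_reported - s_target) <= tolerance:
        return rate
    # S303 affirmative -> S305: above the target, slow down (never below zero).
    if sum_reported > s_target:
        return max(rate - step, 0.0)
    # S303 negative -> S304: below the target, speed up.
    return rate + step


rate = 48.0  # KB/s, illustrative
rate = adjust_rate(sum_reported=120, s_target=256, rate=rate,
                   tolerance=16, step=4)  # occupancy is low, so rate rises
```

Because `sum_reported` is measured at the terminal and reaches the server only after a transfer delay, the decision is always based on a stale value.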
【０１１２】
ところで、図１２の例では、端末１０２が自身のバッファ占有量を検出して、サーバ１０１に通知している。しかしその場合、端末１０２が検出するのは、現在時刻におけるバッファ占有量である。その上、端末１０２からサーバ１０１への情報伝達に時間がかかるので、サーバ１０１は、伝達遅延時間の分だけ過去のバッファ占有量に基づいて送信速度制御を行うことになり、バッファ占有量をＳ＿ｔａｒｇｅｔの近傍で遷移させるのは、現実には困難である。
【０１１３】
これに対して、以下に説明する別の例（図１３，１４参照）では、未来のある時点でのバッファ占有量に基づいて送信速度制御を行うことによって、バッファ占有量をＳ＿ｔａｒｇｅｔの近傍で遷移させることが可能となる。この場合、サーバ１０１は、端末１０２からＳｕｍの通知を受けるのでなく、未来のある時刻における端末１０２側のバッファ占有量Ｓｕｍを予測算出する。この予測算出は、次のようにして行われる。
【０１１４】
すなわち、図２において、ＲＯＭ４１３には、パケット送信周期Ｔｓ（固定値）と、デコード周期Ｔｆｒｍ（固定値）とが予め記憶されている。ＣＰＵ４１２は、パケットが生成される際、そのパケットに詰め込まれたデータの量（ｄｅｌｔａ０，ｄｅｌｔａ１，ｄｅｌｔａ２，…）をＲＡＭ４０４に記憶させておく。さらに、ストリームが送信される際、各フレームのデータ量（Ｌ０，Ｌ１，Ｌ２，…）をＲＡＭ４０４に記憶させておく。
【０１１５】
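The RAM 404 also stores the T_delay notified earlier by the terminal 102. By performing a predetermined calculation with reference to Ts and Tfrm in the ROM 413 and to delta (0, 1, 2, ...), L (0, 1, 2, ...), and T_delay in the RAM 404, the CPU 412 can calculate the buffer occupancy Sum at a future time. Through this calculation, the server 101 can predict how the buffer occupancy Sum will change on the terminal 102 side (see FIGS. 9 to 11).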
【０１１６】
以下、サーバ１０１が端末１０２のバッファ占有量Ｓｕｍを予測算出して送信速度制御を行う具体的な動作例について、図９のバッファ遷移図、図１３および図１４のフローチャート、図１５のパケット構成図を用いて説明する。
【０１１７】
図９において、Ｓ＿ｍａｘは、端末１０２内バッファの有効蓄積量の最大値（これを簡単に「バッファの総容量」と呼んでいる）であり、Ｓ＿ｔａｒｇｅｔは、今回のストリーミングにおいて端末１０２内バッファに蓄積しようとする目標量であり、Ｔ＿ｄｅｌａｙは、頭出し遅延時間の設定値である。これら各パラメータの意味については、既に説明したとおりである。以下では、端末１０２よりＳ＿ｔａｒｇｅｔとＴ＿ｄｅｌａｙが通知されたものとして説明を行う。
【０１１８】
本実施の形態では、理解を簡単にするために、固定時間周期Ｔｓ毎にパケットの生成・配信を行う例（ｉ＝ｎに相当する時刻でパケット配信：ｎは整数）を示している。また、ｉ＝ｎに相当する時刻（ｔ＝ｉ＊Ｔｓ）でパケットの配信がなされた際に、端末１０２の受信バッファ５０５およびデコーダバッファ５０８内の蓄積量Ｓｕｍは、数フレームに相当するデータ量が瞬時に増加しているが、これは、図１５（Ａ）に示されているように、１パケットに複数フレームを挿入するパターンでパケットを生成して端末１０２に配信しているためである。実際には、パケット配信には転送時間がかかり、図のように瞬時にバッファ占有量が増加する訳ではない（傾き＝ｎｅｔｗｏｒｋＲａｔｅの斜線になる）が、あくまでモデルとして単純化して取り扱うものとする。また、時刻ｔ＝Ｔ＿ｄｅｌａｙ以降、階段状にバッファ占有量が減じられていくのは、その時刻に端末１０２でストリーム再生が始まったことを示している。すなわち、フレームの再生周期Ｔｆｒｍ毎に、各々のフレーム長Ｌ＝Ｌ［ｋ］（ｋは整数）ずつデコーダ５０９でデータが処理される。
【０１１９】
図１３および図１４は、図９のバッファ遷移を実現するためにサーバ１０１によって行われる送信制御アルゴリズムの別の例を示したフローチャートである。図１３がアルゴリズムの全体像であり、図１４は、図１３のステップＳ４０４中の関数ｍｋＰａｃｋｅｔの一例を示すフローチャートである。このようなアルゴリズムを記述したプログラムがＲＯＭ４１３（図２参照）に格納されており、ＣＰＵ４１２は、このプログラムに従って各種の演算や制御を行い、その結果、図９のバッファ遷移が実現される。なお、説明を簡単にするために、ストリーム途中での配信停止などは、今回考えないものとする。以下、各ステップについて順次説明を行う。
【０１２０】
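In FIG. 13, the server 101 receives and stores the S_target and T_delay sent from the terminal 102 (step S401). Specifically, in FIG. 2, the values of S_target and T_delay sent from the terminal 102 via the network 103 are written to the RAM 404 through the network controller 410.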
【０１２１】
なお、ここでは、端末１０２がパラメータＳ＿ｔａｒｇｅｔおよびＴ＿ｄｅｌａｙの値を決定して、結果をサーバ１０１に通知しているが、代わりに、サーバ１０１がそれらの値を予め記憶しておいてもよく、あるいは、端末１０２の機種情報（バッファの総容量等）を記憶しておいて、この機種情報をもとにサーバ１０１がパラメータの値を計算してもよい。
【０１２２】
次に、各変数の初期化が行われる（ステップＳ４０２，Ｓ４０３）。各変数の意味は、図１４の説明にて後述する。初期化が完了すると、ステップＳ４０４以降の処理、すなわち関数ｍｋＰａｃｋｅｔにてパケットを生成してネットワーク１０３中に送出する処理が開始される。生成されたパケットは、この例では固定周期Ｔｓで端末１０２に配信されるので、サーバ１０１は、ステップＳ４０５にてタイミング調整を行ったのち、ステップＳ４０６にて送出を行う。この一連の処理が完了すると、ＣＰＵ４１２は、関数ｍｋＰａｃｋｅｔの実行カウンタｉを更新し、ステップＳ４０４に戻ってループする。ストリームデータの読み出しおよびパケット化が全て完了すると、ＣＰＵ４１２は、関数ｍｋＰａｃｋｅｔを抜けて、ステップＳ４０４に判定結果ＦＡＬＳＥでｒｅｔｕｒｎする。ＣＰＵ４１２は、このとき配信が完了したと見なし、アルゴリズムを完了する。以上が、本送信制御アルゴリズムの概要である。
【０１２３】
次に、ステップＳ４０４に示された関数ｍｋＰａｃｋｅｔの詳細なアルゴリズムであるが、まず各変数について説明を行う。Ｓｕｍは、端末１０２内の受信バッファ５０５およびデコーダバッファ５０８に蓄積されているデータ量の総和であり、Ｌは、フレームのデータ量であり、ｄｅｌｔａは、関数ｍｋＰａｃｋｅｔが今回呼ばれてからパケット化したデータ量の総和（すなわち１つのパケットに詰め込んだデータの量）であり、ｉｎは、蓄積デバイス４１１から読み出したストリームソースのフレーム数を示すカウンタであり、ｏｕｔは、端末１０２内のデコーダ５０９でデコードされたフレーム数を示すカウンタであり、ｄｔｓは、デコーダ５０９にてフレームがデコードされる時刻であり、ｇｒｉｄは、前回の関数ｍｋＰａｃｋｅｔの１ループを処理する際に進んだｄｔｓの上限値である。
【０１２４】
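In FIG. 14, the function mkPacket is broadly divided into two algorithms: a packet generation algorithm A1 and a decode amount calculation algorithm A2. In the former, in the first step (S501), the CPU 412 clears delta. In the following step S502, the CPU 412 determines whether the frame of length L = L[in] (already read out) may be used for the current packet. There are two criteria: (a) adding L to Sum does not exceed S_target, and (b) adding L to delta, the amount of data packetized in the current function call (that is, the amount of data packed into the current packet), does not exceed deltaMax, the upper limit on the amount of data that can be packed into one packet.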
【０１２５】
ここでｄｅｌｔａＭａｘは、図１５（Ａ）に示される不等式
（ｄｅｌｔａＭａｘ＋ｈｄｒ）／Ｔｓ＜ＮｅｔｗｏｒｋＲａｔｅ
を満たす値であって、周期Ｔｓ以内に端末に配信可能なデータ量の最大値であり、ネットワーク１０３の実効転送レート（伝送能力）から算出が可能である。ステップＳ５０２にて真と判定されると、ＣＰＵ４１２は、ステップＳ５０３に進み、Ｌ＝Ｌ［ｉｎ］のフレームをパケット化する。続くステップＳ５０４では、ＣＰＵ４１２は、パケット化の実行に伴い、Ｓｕｍおよびｄｅｌｔａを更新する。続くステップＳ５０５では、ＣＰＵ４１２は、次のフレームのデータを読み出しバッファ４０７から、フレーム長ＬをＲＡＭ４０４から、それぞれ読み出す。
そして、Ｌが０よりも大きいか否かを判定する。
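
The inequality (deltaMax + hdr) / Ts < NetworkRate rearranges to deltaMax < NetworkRate * Ts - hdr. A minimal sketch of deriving an integer deltaMax from it follows; the function name, the byte units, and the example header size are assumptions.

```python
import math


def delta_max(network_rate: float, ts: float, hdr: int) -> int:
    """Largest whole number of payload bytes with (deltaMax + hdr) / Ts < rate.

    network_rate is in bytes per second, ts in seconds, hdr in bytes.
    """
    limit = network_rate * ts - hdr
    # Strict inequality: take the largest integer strictly below the limit.
    return max(math.ceil(limit) - 1, 0)


# Illustrative: 48,000 bytes/s effective rate, 0.5 s packet period, 40-byte header.
d = delta_max(48_000, 0.5, 40)
```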
【０１２６】
ステップＳ５０５の判定結果が否定、すなわちＬ＝０であれば、ＣＰＵ４１２は、全データの読み出しが完了した（ＥｎｄｏｆＦｉｌｅ検出）とみなし、関数を抜ける。そして、メインフロー（図１３）のステップＳ４０４に判定結果ＦＡＬＳＥでＲＥＴＵＲＮする。一方、判定結果が肯定、すなわちＬ＞０であれば、ＣＰＵ４１２は、次のステップ（Ｓ５０６）に進み、Ｌ［ｉｎ］を配列ｌｅｎｇ内に加える（ＲＡＭ４０４に記憶させる）。これは後ほど説明するが、デコード量算出アルゴリズムＡ２で用いるためである。次に、ＣＰＵ４１２は、ステップＳ５０７に進み、読み出したフレーム数カウンタｉｎを更新して、ステップＳ５０２にループする。
【０１２７】
上記のループによるパケット生成を繰り返すうち、Ｓｕｍおよびｄｅｌｔａの値がだんだん大きくなっていく。そして、ステップＳ５０２にてＳｕｍまたはｄｅｌｔａが十分大きくなったと判定されると、ＣＰＵ４１２は、このループを抜けて、デコード量算出アルゴリズムＡ２に入る。
【０１２８】
デコード量算出アルゴリズムＡ２において、最初のステップ（Ｓ５０８）では、ｉ＊Ｔｓがｇｒｉｄ以上であるか否かが判定される。このステップＳ５０８は、端末１０２においてデコードが開始される時刻になったかどうかを判定することが目的である。具体的には、ｇｒｉｄが最初Ｔ＿ｄｅｌａｙに設定されているため、関数呼び出しカウンタ数ｉが小さくてｔ＝ｉ＊Ｔｓがｇｒｉｄ未満の間は、端末１０２でのデコードがまだ始まっていないものと判定される。図９では、ｉ＝０およびｉ＝１と対応する時刻がこれに相当する。
【０１２９】
ステップＳ５０８の判定結果が否定の場合、ＣＰＵ４１２は、デコードによるフレームデータの減算を行わずに関数を抜ける。一方、ｉが十分大きくなってパケット生成時刻ｔ＝ｉ＊Ｔｓがｇｒｉｄ以上になると、ＣＰＵ４１２は、端末１０２でのデコードが既に始まっているとみなし、フレームデータの減算処理を行う。図９では、ｉが２以上の時刻がこれに相当する。続くステップＳ５０９からステップＳ５１２までのループにおいて、現在のｇｒｉｄ時刻から次のｇｒｉｄ時刻（＝ｇｒｉｄ＋Ｔｓ）に挟まれた時間内にデコード処理されるフレームデータの量ｌｅｎｇ［ｏｕｔ］をＳｕｍから減算し、かつデコードしたフレーム数ｏｕｔをカウントアップする。
【０１３０】
上記ループ内のステップＳ５１１において、ｄｔｓには、フレームを１つデコードするたびにＴｆｒｍずつ加算されるが、これは、本実施形態が固定時間間隔Ｔｆｒｍのフレーム発生を行う符号化方式を用いていることに由来する。ステップＳ５１２では、ＣＰＵ４１２は、今回の時間間隔Ｔｓでデコードされるべきフレームの有無を判定している。ステップＳ５１２の判定結果が否定、すなわち、もはや今回の時間間隔Ｔｓでデコードされるフレームは無いと判定されると、ＣＰＵ４１２は、上記のループ（ステップＳ５０９〜Ｓ５１２）から抜け、ステップＳ５１３に進む。ステップＳ５１３では、ＣＰＵ４１２は、変数ｇｒｉｄを次のｇｒｉｄ時刻に更新する。そして、関数を抜け、メインフロー（図１３）のステップＳ４０４に判定結果ＴＲＵＥでＲＥＴＵＲＮする。
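
The two halves of mkPacket (packet generation A1 and decode amount calculation A2) can be sketched as below. This is an illustration under assumptions, not the flowchart itself: the class and attribute names are hypothetical, frame lengths are supplied as a list rather than read from storage, and the extra guard `out < inp` (never subtract a frame that has not been sent) is a simplification added here.

```python
class Sender:
    def __init__(self, frames, s_target, t_delay, ts, tfrm, delta_max):
        self.frames = frames          # frame lengths L[0], L[1], ...
        self.s_target = s_target
        self.ts, self.tfrm, self.delta_max = ts, tfrm, delta_max
        self.sum = 0                  # predicted occupancy of the terminal buffer
        self.peak = 0                 # highest predicted occupancy so far
        self.i = 0                    # call counter (packet index)
        self.inp = 0                  # next frame to packetize ("in")
        self.out = 0                  # frames assumed decoded so far
        self.dts = t_delay            # decode time of frame `out`
        self.grid = t_delay           # start of the next decode window

    def mk_packet(self):
        """Build one packet; return (delta, more_data_remaining)."""
        delta = 0
        # A1: add frames while neither S_target nor deltaMax is exceeded (S502).
        while self.inp < len(self.frames):
            length = self.frames[self.inp]
            if self.sum + length > self.s_target or delta + length > self.delta_max:
                break
            self.sum += length
            delta += length
            self.inp += 1
        self.peak = max(self.peak, self.sum)
        # A2: once decoding has started (S508), subtract frames decoded
        # between grid and grid + Ts, then advance grid (S513).
        if self.i * self.ts >= self.grid:
            while self.out < self.inp and self.dts < self.grid + self.ts:
                self.sum -= self.frames[self.out]
                self.out += 1
                self.dts += self.tfrm
            self.grid += self.ts
        self.i += 1
        return delta, self.inp < len(self.frames)


sender = Sender([10] * 20, s_target=50, t_delay=1.5, ts=1.0, tfrm=0.5, delta_max=40)
deltas = []
for _ in range(100):                  # bounded loop for safety
    delta, more = sender.mk_packet()
    deltas.append(delta)
    if not more:
        break
```

In this run the first packets fill the buffer toward S_target, one packet is empty while the target is reached, and afterwards each packet carries about as much as is decoded per period.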
【０１３１】
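With the above algorithm, as shown in FIG. 9, the buffer occupancy Sum in the terminal 102 can be made to stay near S_target at all times without exceeding it. Therefore, even when there are multiple models of the terminal 102 and the total buffer capacity S_max differs by model, setting S_target to a value appropriate to the S_max of each terminal 102 prevents both overflow and underflow of the buffer.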
【０１３２】
なお、今回の例では、図１５（Ａ）のように１パケットに複数フレームを挿入するパターンでパケットを生成したが、代わりに、図１５（Ｂ）のように、１パケットに１フレームのフレームを挿入するパターンでパケットを生成することも可能である。この場合は、図１４のステップＳ５０２において、後半の不等式を
ｄｅｌｔａ＋（Ｌ＋ｈｄｒ）＜＝ｄｅｌｔａＭａｘ
とし、ステップＳ５０４の後半の等式を
ｄｅｌｔａ＋＝（Ｌ＋ｈｄｒ）
とするだけでよい。
【０１３３】
また、本実施形態では、説明を簡単にするために、固定時間間隔Ｔｆｒｍでフレーム発生を行う符号化方式を用いたが、使用する符号化方式―たとえばＭＰＥＧ４ビデオ（ＩＳＯ／ＩＥＣ１４４９６−２）―に合わせてデコード量算出アルゴリズムＡ２を設計すれば、必ずしも固定時間間隔のフレーム発生を行わなくても構わないことはいうまでもない。また、必ずしもフレーム単位でデータを扱うアルゴリズムでなくてもよく、例えばスライス単位、あるいはＭＰＥＧ１やＭＰＥＧ２システムストリームのパック単位でデータを扱うアルゴリズムであってもよい。
【０１３４】
一方、図１４のステップＳ５０２において、Ｓ＿ｔａｒｇｅｔの値が途中で変更されると、本アルゴリズムは瞬時に、変更された新しいＳ＿ｔａｒｇｅｔをターゲットとしてパケットの生成を行うようになる。Ｓ＿ｔａｒｇｅｔの値が途中で変更された場合のバッファ遷移の様子が、図１０および図１１に示されている。図１０において、ｉ＝３の時刻にＳ＿ｔａｒｇｅｔがＳ＿ｔａｒｇｅｔ２に変更（Ｓ＿ｔａｒｇｅｔ＜Ｓ＿ｔａｒｇｅｔ２≦Ｓ＿ｍａｘ）されると、変更後しばらくは多量のフレームデータがパケット化され（図中、ｄｅｌｔａ３やｄｅｌｔａ４）、その結果、Ｓｕｍが新しい目標値Ｓ＿ｔａｒｇｅｔ２の近傍に到達するようになる。
【０１３５】
また、図１１のように、ｉ＝２の時刻にＳ＿ｔａｒｇｅｔがＳ＿ｔａｒｇｅｔ３に変更（Ｓ＿ｔａｒｇｅｔ３＜Ｓ＿ｔａｒｇｅｔ）されると、少量（ｄｅｌｔａ４）または０（ｄｅｌｔａ３）のフレームデータがパケット化される。その一方、デコードによりＳｕｍが消費されるので、やはりＳｕｍが新しい目標値Ｓ＿ｔａｒｇｅｔ３の近傍に到達するようになる。このような仕組みを利用すると、ネットワーク１０３の伝送能力（あるいは端末１０２の電波受信状態）に応じて、動的に端末１０２内のバッファ占有量Ｓｕｍを増減させることが可能となり、以下に説明するような応用が可能となる。
【０１３６】
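In FIG. 7(A), consider a user carrying a mobile phone 701 (corresponding to the terminal 102 of FIG. 1) who moves from the range of the relay station B1 into the range of the relay station B2, as indicated by the arrow 702. As the user moves, the task of accepting calls from the mobile phone 701 is handed over from the relay station B1 to the relay station B2. During this time, the received field strength of the mobile phone 701 changes roughly as in the graph of FIG. 7(B). To simplify the explanation, this model defines threshold A for the transmission capability of the network 103 as the point where the field strength changes from strong to medium (or medium to strong), threshold B as the point where it changes from medium to weak (or weak to medium), and threshold C as the point where it changes from weak to out of range (or out of range to weak).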
【０１３７】
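In FIG. 7(B), suppose the user carrying the mobile phone 701 has moved a distance d1 and the transmission capability has fallen below threshold A. At this point the mobile phone 701 changes S_target to a larger value (S_target2), as shown in FIG. 10, and notifies the server 101 of that value. The purpose is to prepare for the further decline in transmission capability that is expected to follow: the server 101 is prompted to generate and send new packets faster, so that data covering as long a time as possible (call it Δt) is stored in the buffer of the mobile phone 701. Even after the transmission capability falls below threshold A, no packet transfer loss occurs as long as it remains at or above threshold B, so the transmission speed can be raised in this way.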
【０１３８】
ユーザが移動して距離ｄ２に達すると、伝送能力が閾値Ｂを下回って、パケットの転送ロスが発生し始める。このとき、携帯電話７０１は、図１１に示されるように、Ｓ＿ｔａｒｇｅｔを小さい値（Ｓ＿ｔａｒｇｅｔ３）に変更して、その値をサーバ１０１に通知する。これは、その後もさらに進むと予想される伝送能力低下に備えて、できるだけサーバ１０１による新たなパケット生成および送出を抑制させるためである。パケット生成および送出を抑制するのは、次の理由による。
【０１３９】
たとえば、携帯電話７０１が通信方式としてＰＨＳのＰＩＡＦＳ方式を採用している場合、パケットの伝送ロスが発生すると、リンクレイヤであるＰＩＡＦＳ層のプロトコルに基づくデータ再送処理が行われる。再送処理中に、新たなパケット生成および送出が行われると、それが再送処理の邪魔をする結果となり、かえって好ましくないからである。
【０１４０】
ユーザが移動して距離ｄ３に達すると、伝送能力が閾値Ｃを下回って、一瞬、パケット転送が困難となる。しかし、ユーザがさらに移動して距離ｄ４に達すると、伝送能力が閾値Ｂを上回り、かつ呼を受け付ける業務の引き渡し（ハンドオーバー）も完了しているので、携帯電話７０１は、今度はＳ＿ｔａｒｇｅｔ３を元の値Ｓ＿ｔａｒｇｅｔに戻して、その値をサーバ１０１に通知し、それによって、データの蓄積量（すなわちバッファ占有量Ｓｕｍ）を増加させる。なお、ＰＨＳ等のハンドオーバー時間は、普通に人が歩く速さでもおおよそ２〜３秒程度で完了するため、上記のΔｔをおおよそ３〜４秒程度確保しておけば、ハンドオーバーが起こっても携帯電話７０１でのストリーミング再生を滞りなく継続することができる。
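[0141]
Incidentally, if the setting of S_target is changed to a smaller value in the middle of stream delivery, as in FIG. 11, the condition in step S502 of the algorithm of FIG. 14 may not become true for a long time, so that the data of the next frame cannot be sent. If this happens often, a packet that finally reaches the terminal 102 may arrive after the time at which its frame data should have been played (PresentationTime), and the data is wasted. In such a case it is more efficient to skip frame data whose playback time has already passed, because no useless data is put onto the network 103.
[0142]
FIG. 16 is a flowchart showing another example of the function mkPacket in step S404 of FIG. 13. The function mkPacket of FIG. 16 includes steps (S601 and S602) for skipping the sending of data whose playback time has passed when the server 101 lowers the transmission speed. That is, the algorithm of FIG. 16 differs from FIG. 14 only in the addition of steps S601 and S602. The other steps are exactly the same as in FIG. 14 and carry the same reference numbers. In step S601, the CPU 412 determines whether the in-th frame data about to be sent is not the data of the 0th frame and has a playback time later than that of the out-th frame data, which is regarded as already decoded at the terminal 102.
[0143]
If this determination is true, the CPU 412 regards the data of the in-th frame as being in time for its playback at the terminal, packetizes it in step S503, and sends it to the terminal 102. If it is false, the in-th frame data is treated as nonexistent and L is set to 0 in step S602. As a result, step S502 always evaluates to true, and in the packetization of step S503 the frame to send can be advanced to the next one without copying unneeded frame data. When such a frame skip occurs, playback in the decoder 509 jumps by the time Tfrm, so information notifying the terminal 102 of this is assumed to be written in the packets of FIGS. 15A and 15B. For example, an area for such playback time information may be provided in the header.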
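The late-frame skip of steps S601 and S602 can be illustrated with a minimal sketch. The `Frame` type and the function `payload_sizes` are hypothetical names, and the sketch simplifies the flowchart: it only compares each frame's playback time with that of the frame assumed to be already decoded, and it leaves out the special handling of the 0th frame.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    index: int
    presentation_time: float  # seconds
    size: int                 # bytes


def payload_sizes(frames: List[Frame], decoded_time: float) -> List[int]:
    """Return the byte count L to packetize for each frame; 0 means skipped."""
    sizes = []
    for frame in frames:
        # S601: is this frame still later than what the terminal has decoded?
        on_time = frame.presentation_time > decoded_time
        # S602: a late frame is treated as empty (L = 0), so nothing is copied.
        sizes.append(frame.size if on_time else 0)
    return sizes
```

With four 100-byte frames at 0.0, 0.1, 0.2 and 0.3 seconds and `decoded_time = 0.1`, the first two frames get L = 0 and the last two are sent in full.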
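[0144]
The algorithm shown in FIG. 16 is sufficiently effective where there is no difference in priority (importance) between frames, as in MPEG audio. In MPEG video, however, as already explained in the description of the prior art, an I frame alone can reconstruct a meaningful image, whereas a P or B frame cannot without its temporally preceding or following reference frames. In this case, when frames are thinned out in the algorithm of FIG. 16, preferentially sending I frames that are in time for playback while skipping all P and B frames makes it possible to provide higher-quality video to the terminal 102 even when the transfer speed of the network 103 is low.
[0145]
FIG. 17 is a flowchart showing yet another example of the function mkPacket in step S404 of FIG. 13. The function mkPacket of FIG. 17 includes steps (S505', S601, S602, S701 and S702) for skipping, when the server 101 lowers the transmission speed, both low-priority data and high-priority data whose playback time has passed. That is, compared with FIG. 14, the algorithm of FIG. 17 adds steps S601, S602, S701 and S702 and replaces step S505 with step S505'. Step S505' is step S505 with a function for detecting the priority pri added to the function NexTfrm. The other steps are exactly the same as in FIGS. 14 and 16 and carry the same reference numbers.
[0146]
Accordingly, compared with FIG. 16, the algorithm of FIG. 17 adds steps S701 and S702 and replaces step S505 with S505'.
[0147]
Executing the algorithm of FIG. 17 requires a function by which information indicating the reception state detected by the terminal 102 (reception state information) is notified from the terminal 102 to the server 101. FIG. 18 shows a configuration example of a server-client system having such a function. In FIG. 18, the terminal 102 includes a detection unit 801 that detects the reception state. Between the terminal 102 and the server 101, a notification unit 802 is provided that notifies the server 101 of the detected reception state information. The server 101 includes a holding unit 803 that holds the notified reception state information.
[0148]
Returning to FIG. 17, when the mkPacket function is called, step S701 is executed before step S501. In step S701, the server 101 (its CPU 412) refers to the information in the holding unit 803 (the reception state on the terminal 102 side) and determines whether the transmission capability of the network 103 is below threshold B. If it is below threshold B, slowflag is set to true; otherwise it is set to false.
[0149]
In step S505', the priority of the next frame is detected, and in the following step S702 it is determined whether the data of that frame has a high priority and slowflag is true. If the result is affirmative, that is, slowflag (which indicates that the transfer speed of the network 103 is low) is true and the frame has a high priority, the process proceeds to step S601, where it is determined whether the playback time of the frame has already passed. If the result is negative, the process proceeds to step S602 and L is set to 0 (that is, the frame is skipped even if it would be in time for playback). The subsequent processing is exactly the same as in FIGS. 14 and 16.
[0150]
As described above, according to this embodiment, the terminal 102 determines a target amount according to its own buffer capacity and the transmission capability of the network 103, and further determines a delay time within a range not exceeding the value obtained by dividing the target amount by the transmission capability. Because the server 101 controls the transmission speed based on the target amount and delay time thus determined by the terminal 102, transmission speed control suited to the buffer capacity and the transmission capability is possible even if the buffer capacity of the terminal 102 differs by model or the transmission capability of the network 103 fluctuates. As a result, failure of streaming playback due to buffer underflow or overflow can be avoided. Moreover, because the delay time is determined independently of the target amount, avoiding playback failure and shortening the wait at cueing can both be achieved.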
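[Brief description of the drawings]
[FIG. 1] A block diagram showing a configuration example of a server-client system that executes a streaming method according to an embodiment of the present invention.
[FIG. 2] A block diagram showing the configuration of the server 101 of FIG. 1.
[FIG. 3] A block diagram showing the configuration of the terminal 102 of FIG. 1.
[FIG. 4] A sequence diagram for explaining the overall operation of the system of FIG. 1.
[FIG. 5] A flowchart showing the operation of the terminal 102 of FIG. 1.
[FIG. 6] A diagram showing the contents stored in the ROM 502 of FIG. 3.
[FIG. 7] A schematic diagram showing the distribution of electric field strength in a certain area (A) and the change in transmission capability as the terminal moves (B).
[FIG. 8] A flowchart showing the details of step S107 of FIG. 5.
[FIG. 9] A diagram showing how the buffer occupancy of the terminal 102 transitions (approaching S_target) under the transmission speed control performed by the server 101 of FIG. 1.
[FIG. 10] A diagram showing how the buffer occupancy of the terminal 102 transitions under the transmission speed control performed by the server 101 of FIG. 1 when, with the buffer occupancy transitioning near S_target, the value of S_target is changed to a larger value (S_target2).
[FIG. 11] A diagram showing how the buffer occupancy of the terminal 102 transitions under the transmission speed control performed by the server 101 of FIG. 1 when, with the buffer occupancy transitioning near S_target, the value of S_target is changed to a smaller value (S_target3).
[FIG. 12] A flowchart showing an example of the transmission speed control algorithm performed by the server 101 of FIG. 1.
[FIG. 13] A flowchart showing another example of the transmission speed control algorithm performed by the server 101 to realize the buffer transitions of FIGS. 9 to 11.
[FIG. 14] A flowchart showing an example of the function mkPacket in step S404 of FIG. 13.
[FIG. 15] A diagram showing configuration examples of packets generated by the server 101 of FIG. 1 (A: multiple frames inserted in one packet; B: one frame inserted in one packet).
[FIG. 16] A flowchart showing another example of the function mkPacket in step S404 of FIG. 13.
[FIG. 17] A flowchart showing yet another example of the function mkPacket in step S404 of FIG. 13.
[FIG. 18] A block diagram showing another configuration example of a server-client system that executes a streaming method according to an embodiment of the present invention.
[FIG. 19] Diagrams for explaining a conventional streaming method (A: video frames; B: transition of buffer occupancy; C: configuration example of a conventional terminal).
[FIG. 20] A block diagram showing a configuration example of a server-client system that executes a conventional streaming method.
[FIG. 21] Diagrams for explaining how the transition of buffer occupancy changes when a reception buffer is added (A: before the addition; B: after the addition).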
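[Explanation of symbols]
101 Server
102 Terminal
103 Network
402, 507 Transmission/reception module
404 RAM
405 Generation module
406 Packet generation circuit
407 Read buffer
408 Packet generation buffer
409 Transmission buffer
410, 506 Network controller
411 Storage device
412, 503 CPU
413, 502 ROM
505 Reception buffer
508 Decoder buffer
509 Decoder
510 Playback module
511 Display device
[0001]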
[Technical field of the invention]
The present invention relates to a streaming method, and more particularly to a streaming method in which a server transmits multimedia data to a terminal through the Internet, and the terminal reproduces the data while receiving the data.
[0002]
[Prior art]
(Description of multimedia data encoding and compression method and buffer model)
There are various types of multimedia data transmitted over the Internet, such as moving images, still images, audio, text, and data in which these are multiplexed. For video, coding and compression schemes such as H.263 and MPEG1, 2, and 4 are well known; for still images there is JPEG; and for audio there are MPEG audio, G.729, and too many others to list.
[0003]
In the present invention, since the focus is on streaming reproduction, moving images and audio are targets for transmission. Here, an MPEG video that is a representative of the moving image compression method, particularly an MPEG1 (ISO / IEC 11172) video and an MPEG2 (ISO / IEC 13818) video that have a relatively simple mechanism will be described.
[0004]
MPEG video has two main features that enable highly efficient data compression. The first is that, in compressing moving image data, it adopts a compression method that exploits temporal correlation between frames in addition to the conventional method that exploits spatial frequency characteristics. In MPEG, each frame (also called a picture) constituting a stream is classified into one of three types for compression: an I frame (an intra-frame coded picture), a P frame (a picture using intra-frame coding and references to the past), and a B frame (a picture using intra-frame coding and references to both the past and the future). Of these three types, the I frame is the largest (that is, it carries the most information), followed by P and then B. Although it depends heavily on the compression algorithm, the ratio of information amounts is roughly I:P:B = 4:2:1. In general, an MPEG video stream is organized in units of 15 frames (= 1 GOP), and each GOP contains 1 I frame, 4 P frames, and 10 B frames.
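As a quick worked example of these figures, the following sketch computes what share of a GOP's data each frame type would account for if the 4:2:1 ratio and the 1/4/10 frame counts held exactly. The function name `gop_shares` is hypothetical, and real streams will deviate from these rough ratios.

```python
from fractions import Fraction

# Approximate per-frame information ratio I:P:B = 4:2:1 (from the text).
RATIO = {"I": 4, "P": 2, "B": 1}
# Frames of each type in one 15-frame GOP (from the text).
COUNT = {"I": 1, "P": 4, "B": 10}


def gop_shares():
    """Return the fraction of one GOP's data carried by each frame type."""
    units = {kind: RATIO[kind] * COUNT[kind] for kind in RATIO}
    total = sum(units.values())  # 4 + 8 + 10 = 22 relative units
    return {kind: Fraction(value, total) for kind, value in units.items()}
```

Under these assumptions the single I frame carries 2/11 of the GOP's data, the four P frames 4/11, and the ten B frames 5/11.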
[0005]
The second feature of MPEG video is that dynamic code amount allocation according to image complexity can be performed in units of pictures. An MPEG decoder includes a decoder buffer. By decoding data after storing the data in the decoder buffer, a large amount of code can be assigned to a complex image that is difficult to compress. In moving image compression, not limited to MPEG, the standard decoder buffer capacity is mostly defined by the standard. In the case of MPEG1 or MPEG2, the standard decoder buffer has a capacity defined as 224 Kbytes in the standard, and the MPEG encoder must generate picture data so that the decoder buffer occupancy transitions within this capacity.
[0006]
FIGS. 19A to 19C are diagrams for explaining a conventional streaming method. FIG. 19A shows video frames, FIG. 19B schematically shows the transition of buffer occupancy, and FIG. 19C shows a configuration example of a conventional terminal. In FIG. 19C, the terminal includes a video buffer, a video decoder, an I, P rearrangement buffer, and a switch. The video buffer corresponds to the decoder buffer described above; transferred data is stored in the video buffer and then decoded by the video decoder. The decoded data is rearranged into playback-time order through the I, P rearrangement buffer and the switch.
[0007]
In FIG. 19B, the vertical axis represents the buffer occupation amount (data accumulation amount of the video buffer), the horizontal axis represents time, and the thick line in the figure represents the temporal transition of the buffer occupation amount. The slope of the thick line corresponds to the video bit rate and indicates that data is input to the buffer at a constant rate. Further, the buffer occupancy is reduced at regular intervals (33.3667 msec). This is because the data of each frame is decoded at regular intervals. In addition, the intersection of the oblique dotted line and the time axis indicates the time when the data in each video frame starts to be transferred to the video buffer. Accordingly, the transfer start time of frame X shown in FIG. 19A is t1, and the transfer start time of frame Y is t2.
[0008]
In FIGS. 19A and 19B, the time from t1, when the first frame X of the video starts to be input to the video buffer, until decoding is first executed (the first falling edge of the thick line in the figure) is generally called the vbv_delay time. Since the first decoding is usually performed at the moment the video buffer becomes full, the vbv_delay time is normally the time from the start of data input until the 224 Kbyte video buffer is full. This is the initial delay time (the wait at cueing) from the start of data input until video playback through the decoder begins.
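The relationship between buffer size and initial delay can be made concrete with a short calculation. This sketch assumes a constant input bit rate and reads "224 Kbytes" as 224 × 1024 bytes; the function name and the 4 Mbit/s example rate are hypothetical and do not come from the text.

```python
def vbv_delay_seconds(buffer_bytes: int, bitrate_bps: float) -> float:
    """Time to fill an empty buffer at a constant input bit rate."""
    if bitrate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return buffer_bytes * 8 / bitrate_bps


# Example: a 224 Kbyte buffer fed at an assumed 4 Mbit/s.
delay = vbv_delay_seconds(224 * 1024, 4_000_000)
print(f"{delay:.3f} s")
```

The delay scales linearly with buffer size, which is why enlarging the buffer lengthens the wait before playback starts.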
[0009]
When frame Y in FIG. 19A is a complex image, its code amount is large, so, as shown in FIG. 19B, data transfer to the video buffer must start at a time (t2 in the figure) well before the decoding time of frame Y (t3 in the figure). However complex the image, though, the amount of the buffer occupied by any picture stays within the permitted 224 Kbytes.
[0010]
As long as data is transferred to the video buffer so that the buffer transition shown in FIG. 19B is properly maintained, the MPEG standard guarantees that streaming playback will not fail because of underflow or overflow of the video buffer.
[0011]
(Description of network transfer jitter absorption receive buffer)
However, as shown in FIG. 20, when the server 201 and the terminal 202 are connected via the network 203 and MPEG data in the storage 210 is distributed, the data transfer rate fluctuates because of the time the generation module 211 takes to generate packets, the time the network devices 204 and 205 take for transfer procedures, the transmission delay caused by congestion of the network 203, and so on. In practice, therefore, the buffer transition shown in FIG. 19B is not maintained. One conceivable way to reduce and absorb such fluctuations (jitter) in the transfer rate is to deliver, from the outset, content whose coding rate is sufficiently lower than the network bandwidth. However, this is not appropriate, because high-quality video and audio should be provided by using network resources as effectively as possible. In general, therefore, the network devices 204 and 205 are provided with a transmission buffer 206 and a reception buffer 207 of appropriate capacity, respectively, so that data is normally transferred somewhat ahead of time and the stored data covers any shortfall when transfer is delayed.
[0012]
Here, providing the reception buffer 207 on the terminal 202 side is roughly equivalent to raising the upper limit of the buffer occupancy in the buffer transition of FIG. 19B from 224 Kbytes, the standard for the decoder buffer 208, by the amount the reception buffer 207 can store. FIGS. 21A and 21B show the buffer occupancy before and after the addition of the reception buffer 207 side by side. Note that FIG. 21A shows the same buffer transition as FIG. 19B.
[0013]
Adding the reception buffer 207 widens the allowable range of the buffer transition. As a result, the buffer transition of FIG. 19B, that is, of FIG. 21A, becomes as shown in FIG. 21B, and underflow can be avoided even if the transfer rate drops. On the other hand, the vbv_delay time increases by the time corresponding to the amount stored in the reception buffer 207, so the start of decoding by the decoder 209 and the start of playback by the playback device 212 are delayed. That is, the cueing time becomes longer by the time required to store data in the reception buffer 207.
[0014]
[Problems to be solved by the invention]
As is apparent from the above, when multimedia data such as MPEG is streamed in a network environment where reliability and transmission speed are guaranteed, such as a small LAN, then as long as the system is designed to properly observe the initial playback delay time (vbv_delay) and the decoder buffer transition defined by the codec standard, the decoder will not underflow or overflow and streaming playback will not fail.
[0015]
However, in a wide area network environment such as the Internet, transfer jitter caused by variations in the transmission characteristics of the communication path is too large to ignore. For this reason, the conventional terminal 202 is in many cases equipped with a buffer for absorbing transfer jitter, such as the reception buffer 207 of FIG. 20, in addition to the decoder buffer (vbv buffer) defined by the codec standard. In this case, the following problems arise.
[0016]
In general, the capacity of the jitter-absorbing buffer mounted on a terminal varies by model. For this reason, even if the same data is distributed under the same conditions, streaming playback may succeed on a model with a large buffer capacity, whereas on a model with a small buffer capacity the jitter may not be fully absorbed and playback may fail.
[0017]
In order to solve this problem, for example, the amount of memory installed in the terminal may be increased to ensure a sufficient buffer capacity for absorbing jitter. However, the amount of installed memory is one of the main factors that determine the price of a terminal, and there is a demand to suppress it as much as possible. In addition, if the buffer capacity for absorbing jitter is too large, a new problem arises that the cue time until the start of reproduction becomes long and the user is frustrated.
[0018]
Therefore, the object of the present invention is to avoid the failure of streaming playback due to buffer underflow or overflow even if the buffer capacity of the terminal varies depending on the model or the transmission capacity of the network fluctuates. Furthermore, it is to provide a streaming method capable of achieving both the avoidance of failure of streaming reproduction and the reduction of waiting time at the time of cueing.
[0019]
[Means for Solving the Problems and Effects of the Invention]
A first invention is a streaming method in which a server transmits stream data to a terminal through a network and the terminal reproduces the stream data while receiving it, the method including:
A target amount determination step in which the terminal determines a target amount of stream data to be stored in its buffer in relation to its buffer capacity and the transmission capability of the network;
A delay time determination step in which the terminal determines, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from when it writes the head data of the stream to its own buffer until it reads that data and starts playback;
A step in which the terminal notifies the server of the determined target amount and delay time; and
A control step in which the server, when transmitting the stream data to the terminal through the network, controls the transmission speed based on the notified target amount and delay time.
[0020]
In the first invention, the terminal determines a target amount according to its own buffer capacity and the transmission capability of the network, and further determines a delay time within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability. Because the server controls the transmission speed based on the target amount and delay time thus determined by the terminal, transmission speed control suited to the buffer capacity and the transmission capability can be performed even if the buffer capacity of the terminal varies by model or the transmission capability of the network fluctuates. As a result, failure of streaming playback due to buffer underflow or overflow can be avoided. In addition, since the delay time is determined independently of the target amount, avoiding failure of streaming playback and shortening the wait at cueing can both be achieved.
[0021]
Here, the reason why the delay time is limited to a value obtained by dividing the buffer capacity by the transmission capability is that there is a risk that streaming playback may fail if the delay time exceeds this value. As long as the value does not exceed this value, the delay time may be determined to any value. However, when determining the value, the balance between the tolerance to fluctuations in transmission capability and the waiting time at the time of cueing is considered.
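The upper bound on the delay time can be sketched as a simple clamp. The function names, the byte and bit-per-second units, and the idea of a "preferred" delay supplied by the terminal are assumptions made for illustration; the text only states that the delay must not exceed buffer capacity divided by transmission capability.

```python
def max_delay_seconds(buffer_capacity_bytes: int, transmission_bps: float) -> float:
    """Upper bound on the delay: buffer capacity divided by transmission capability."""
    if transmission_bps <= 0:
        raise ValueError("transmission capability must be positive")
    return buffer_capacity_bytes * 8 / transmission_bps


def choose_delay(preferred_seconds: float,
                 buffer_capacity_bytes: int,
                 transmission_bps: float) -> float:
    """Pick the preferred delay, clamped so that it never exceeds the bound."""
    bound = max_delay_seconds(buffer_capacity_bytes, transmission_bps)
    return min(preferred_seconds, bound)
```

A terminal that prefers a short cueing wait would pass a small `preferred_seconds`; one that prefers tolerance to capability fluctuations would pass a larger value and accept the clamp.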
[0022]
According to a second invention, in the first invention,
In the control step, the server
The transmission speed is controlled so that the amount of stream data stored in the buffer of the terminal changes without exceeding the target amount in the vicinity of the target amount.
[0023]
In the second aspect of the invention, since the accumulated amount transitions in the vicinity of the target amount without exceeding the target amount, the buffer underflow and overflow are unlikely to occur.
[0024]
According to a third invention, in the second invention,
In the control step, the server predicts and calculates the amount of stream data stored in the buffer of the terminal based on the transmission speed, the delay time, and the speed at which the terminal decodes the stream data.
[0025]
In the third aspect, since the server predicts and calculates the accumulation amount and performs transmission speed control based on the amount, the accumulation amount can be shifted in the vicinity of the target amount so as not to exceed the target amount.
[0026]
Here, the terminal may notify the server of the current accumulation amount, and the server may perform transmission speed control based on the notification. However, in this case, since it takes time to transmit information from the terminal to the server, the server performs transmission speed control based on the past accumulation amount. Therefore, it is not always possible to make a transition so that the accumulated amount does not exceed the target amount in the vicinity of the target amount.
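One way to picture server-side prediction is a tick-based simulation in which the server tracks what it has sent and what the terminal is assumed to have decoded. This is a rough sketch under strong assumptions that are not in the text: one frame is decoded per tick once the delay has elapsed, the network delivers frames instantly, and the name `simulate_occupancy` is hypothetical.

```python
from typing import List


def simulate_occupancy(frame_sizes: List[int], s_target: int,
                       delay_ticks: int) -> List[int]:
    """Return the predicted buffer occupancy at the end of each tick."""
    if frame_sizes and max(frame_sizes) > s_target:
        raise ValueError("a frame larger than the target can never be sent")
    occupancy = 0
    next_in = 0   # next frame the server will send
    next_out = 0  # next frame the terminal is assumed to decode
    history = []
    tick = 0
    while next_out < len(frame_sizes):
        # Send frames only while the predicted occupancy stays within the target.
        while (next_in < len(frame_sizes)
               and occupancy + frame_sizes[next_in] <= s_target):
            occupancy += frame_sizes[next_in]
            next_in += 1
        # After the delay has elapsed, one frame per tick leaves the buffer.
        if tick >= delay_ticks and next_out < next_in:
            occupancy -= frame_sizes[next_out]
            next_out += 1
        history.append(occupancy)
        tick += 1
    return history
```

Because the server updates its own estimate every tick instead of waiting for reports from the terminal, the estimate is not stale in the way a notified occupancy would be.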
[0027]
According to a fourth invention, in the first invention,
A detection step in which the terminal detects that the transmission capability of the network has changed across a predetermined threshold;
A target amount changing step in which the terminal changes the target amount according to the detection result in the detecting step; and
The terminal further includes a step of notifying the server of the target amount after the change,
In the control step, when the server receives the notification of the changed target amount, the amount of stream data accumulated in the terminal buffer does not exceed the changed target amount in the vicinity of the changed target amount. The transmission speed is controlled so as to transit.
[0028]
In the fourth aspect, when the transmission capacity changes across the threshold, the target amount is changed by the terminal. The server follows the change in the target amount by controlling the transmission speed so that the change does not exceed the target amount after the change in the vicinity of the target amount after the change.
[0029]
A fifth invention is the fourth invention,
When the terminal detects in the detection step that the transmission capability of the network has fallen across a first threshold, the terminal changes the target amount in the increasing direction in the target amount changing step,
In the control step, the server raises the transmission speed in response to the target amount being changed in the increasing direction.
[0030]
In the fifth aspect, when the transmission capacity changes across the first threshold, the target amount is increased by the terminal. The server follows the increase in the target amount by increasing the transmission speed.
[0031]
According to a sixth invention, in the fifth invention,
The first threshold is a value approximately in the middle between the maximum realizable transmission capability and the transmission capability at which stream data transfer loss begins to occur.
[0032]
In the sixth aspect of the invention, when the transmission capability is decreasing, the transmission rate is increased to increase the accumulation amount before the transfer loss of stream data starts to occur. As a result, it is possible to prevent the streaming reproduction from failing when the transmission capability is reduced.
[0033]
According to a seventh invention, in the fourth invention,
When the terminal detects in the detection step that the transmission capability of the network has fallen across a second threshold smaller than the first threshold, the terminal changes the target amount in the decreasing direction in the target amount changing step,
In the control step, the server lowers the transmission speed in response to the target amount being changed in the decreasing direction.
[0034]
In the seventh aspect, when the transmission capacity changes across the second threshold, the target amount is decreased by the terminal. The server follows the decrease in the target amount by reducing the transmission rate.
[0035]
In an eighth aspect based on the seventh aspect,
The second threshold value is a value corresponding to a transmission capability at which a transfer loss of stream data starts to occur.
[0036]
In the eighth aspect of the invention, when the transmission capacity declines and stream data transfer loss begins to occur, the transmission speed is reduced. This is to prevent the lost data from being retransmitted.
[0037]
Here, when reducing the transmission rate, the server must skip frame transmission at a frequency corresponding to the decrease rate. When a frame is skipped, the quality of video and audio obtained by playing back the terminal is degraded. In order to suppress this deterioration in quality, in the ninth invention, a frame that is not in time for the reproduction time is selected as a skipped frame. In the following tenth invention, as a frame to be skipped, a frame with low importance and a frame with high importance but not in time for the reproduction time are selected.
[0038]
In a ninth aspect based on the eighth aspect,
When the terminal changes the target amount in the decreasing direction in the target amount changing step, the server, in the control step, sequentially compares the playback time of each frame constituting the stream to be transmitted with the current time and skips transmission of frames whose playback time is older than the current time, thereby controlling the transmission speed in the decreasing direction.
[0039]
In the ninth aspect of the invention, frames that will not be in time for their playback time are selectively skipped, so the quality degradation that accompanies a lower transmission speed can be kept smaller than with random skipping.
[0040]
In a tenth aspect based on the eighth aspect,
When the terminal changes the target amount in the decreasing direction in the target amount changing step, the server, in the control step, sequentially compares the importance of each frame constituting the stream to be transmitted with a reference value,
For frames whose importance is less than the reference value, transmission is always skipped,
For frames whose importance is greater than or equal to the reference value, each playback time is sequentially compared with the current time, and transmission is skipped only when the playback time is older than the current time, thereby controlling the transmission speed in the decreasing direction.
[0041]
In the tenth aspect of the invention, frames of low importance and frames of high importance that will not be in time for their playback time are selectively skipped, so the quality degradation that accompanies a lower transmission speed can be kept smaller than with random skipping.
[0042]
The method of the tenth aspect, which considers importance in addition to whether a frame will be in time for its playback time when selecting frames to skip, is typically applied to MPEG video frames. In this case, when the transmission speed is lowered, P and B frames are skipped as low-importance frames, while I frames, being high-importance frames, are not skipped unless they would miss their playback time. Quality degradation of the reproduced image due to the lower transmission speed can therefore be minimized. For MPEG audio frames, there is no difference in importance between frames, so only whether a frame will be in time for its playback time needs to be considered.
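The selection rule of the tenth aspect can be sketched as a single predicate applied while the transmission speed is being lowered. The `Frame` type, the numeric importance values, and the reference value of 2 are hypothetical choices that merely place I frames above P and B frames.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    kind: str                 # "I", "P" or "B"
    presentation_time: float  # seconds

# Hypothetical importance scale: only I frames reach the reference value.
IMPORTANCE = {"I": 2, "P": 1, "B": 1}


def should_send(frame: Frame, now: float, reference: int = 2) -> bool:
    """Decide whether to send a frame while the transmission speed is lowered."""
    # Frames below the reference importance are always skipped.
    if IMPORTANCE[frame.kind] < reference:
        return False
    # Important frames are skipped only if their playback time has passed.
    return frame.presentation_time >= now
```

For MPEG audio, where all frames are equally important, the same predicate reduces to the playback-time check alone if every frame is given an importance at or above the reference value.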
[0043]
An eleventh invention is a system comprising a server that transmits stream data through a network and a terminal that reproduces the stream data while receiving it.
The terminal includes:
Target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
Delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading that data and starting playback; and
Means for notifying the server of the determined target amount and delay time.
The server includes control means for controlling the transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network.
[0044]
A twelfth aspect of the invention is a terminal that is used together with a server that transmits stream data through a network, and that reproduces the stream data while receiving it.
The server includes control means for controlling the transmission speed based on a notified target amount and delay time when transmitting stream data to the terminal through the network.
The terminal includes:
Target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
Delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading that data and starting playback; and
Means for notifying the server of the determined target amount and delay time.
[0045]
A thirteenth aspect of the invention is a server that is used together with a terminal that reproduces stream data while receiving it, and that transmits the stream data to the terminal through a network.
The terminal includes:
Target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
Delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading that data and starting playback; and
Means for notifying the server of the determined target amount and delay time.
The server includes control means for controlling the transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network,
The control means controls the transmission speed so that the amount of stream data accumulated in the buffer of the terminal stays near the target amount without exceeding it.
[0046]
A fourteenth invention is a program describing a streaming method as in the first invention.
[0047]
A fifteenth aspect of the invention is a recording medium on which a program like the fourteenth aspect of the invention is recorded.
[0048]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of a server client system that executes a streaming method according to an embodiment of the present invention. In FIG. 1, the system includes a server 101 and a terminal 102 that operates as a client thereof. Video and audio data are accumulated on the server 101 side. This data is data encoded and compressed by MPEG. In response to a request from the terminal 102, the server 101 packetizes the accumulated data and generates a stream. Then, the generated stream is transmitted to the terminal 102 through the network 103. The terminal 102 receives and decodes the stream transmitted from the server 101, and displays and outputs the obtained video and audio.
[0049]
FIG. 2 is a block diagram showing the configuration of the server 101 in FIG. In FIG. 2, the server 101 includes a storage device 411, a transmission / reception module 402, a generation module 405, a RAM 404, a CPU 412, and a ROM 413. The storage device 411 stores video and audio data. Data in the storage device 411 is given to the generation module 405. The generation module 405 includes a read buffer 407, a packet generation circuit 406, and a packet generation buffer 408. The generation module 405 packetizes given data to generate a stream.
[0050]
The transmission / reception module 402 includes a network controller 410 and a transmission buffer 409, and transmits the stream generated by the generation module 405 to the terminal 102 via the network 103. Also, information transmitted from the terminal 102 via the network 103 is received.
[0051]
Information from the terminal 102 received by the transmission / reception module 402 is written in the RAM 404. A server control program is stored in the ROM 413, and the CPU 412 controls the transmission / reception module 402 and the generation module 405 by executing the program in the ROM 413 while referring to the information from the terminal stored in the RAM 404. Although the program is stored in the ROM 413 here, it may instead be stored in a storage medium other than the ROM, such as a hard disk or a CD-ROM.
[0052]
FIG. 3 is a block diagram showing a configuration of the terminal 102 of FIG. In FIG. 3, the terminal 102 includes a transmission / reception module 507, a playback module 510, a display device 511, a ROM 502, and a CPU 503. The transmission / reception module 507 includes a network controller 506 and a reception buffer 505, and receives a stream transmitted from the server 101 via the network 103. Information from the CPU 503 is transmitted to the server 101 via the network 103.
[0053]
A stream received by the transmission / reception module 507 is input to the reproduction module 510. The reproduction module 510 includes a decoder buffer 508 and a decoder 509, and decodes and reproduces an input stream. The data reproduced by the reproduction module 510 is given to the display device 511, and the display device 511 converts the data into video and displays it.
[0054]
A terminal control program is stored in the ROM 502, and the CPU 503 executes the program in the ROM 502, thereby controlling the transmission / reception module 507, the reproduction module 510, and the display device 511.
[0055]
The operation of the system configured as described above will be described below. FIG. 4 is a sequence diagram for explaining the overall operation of the system of FIG. 1. FIG. 4 shows a transmission / reception layer and a control layer on the server 101 side and a transmission / reception layer and a control layer on the terminal 102 side, with the commands and streams exchanged between these layers arranged in time series.
[0056]
First, the overall operation of this system will be described with reference to FIG. In FIG. 4, the command “SETUP” is first transmitted from the terminal 102 to the server 101. The server 101 performs initial setting according to “SETUP”. When the setting is completed, “OK” is returned from the server 101 to the terminal 102.
[0057]
When “OK” is returned from the server 101, the command “PLAY” is transmitted from the terminal 102 to the server 101. The server 101 prepares for transmission, and when the preparation is completed, “OK” is returned from the server 101 to the terminal 102.
[0058]
When “OK” is returned from the server 101, the terminal 102 shifts to a state of waiting for a stream. Following the response of “OK”, the server 101 starts transmitting a stream.
[0059]
Thereafter, the command “TEARDOWN” is transmitted from the terminal 102 to the server 101, and the server 101 ends the stream transmission in response to “TEARDOWN”. When the transmission is completed, “OK” is returned from the server 101 to the terminal 102.
When “OK” is returned from the server 101, the terminal 102 leaves the stream standby state.
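The command exchange described above can be sketched as a small state machine. This is an illustrative sketch only: the state names, the class name ServerSession, and the "ERROR" reply to an out-of-order command are assumptions introduced here and do not appear in the description.

```python
# Allowed transitions: (current state, command) -> next state.
# The state names are hypothetical labels for the phases in FIG. 4.
TRANSITIONS = {
    ("idle", "SETUP"): "ready",
    ("ready", "PLAY"): "streaming",
    ("streaming", "TEARDOWN"): "idle",
}


class ServerSession:
    """Tracks which command the server expects next from the terminal."""

    def __init__(self):
        self.state = "idle"

    def handle(self, command):
        """Return "OK" for a command valid in the current state."""
        next_state = TRANSITIONS.get((self.state, command))
        if next_state is None:
            return "ERROR"  # hypothetical reply, not defined in the text
        self.state = next_state
        return "OK"


session = ServerSession()
replies = [session.handle(c) for c in ("SETUP", "PLAY", "TEARDOWN")]
```

Each "OK" in the sequence of FIG. 4 corresponds to one successful transition in this sketch.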
[0060]
The above is an overview of the overall operation of the system; as far as described so far, the operation is the same as that of a conventional system. The operation of this system differs from the conventional one in the following two points (1) and (2).
(1) Parameters “S_target” and “T_delay” are attached to the command “SETUP” from the terminal 102 to the server 101, and the server 101 controls the transmission speed based on these parameters when transmitting the stream.
[0061]
In (1) above, “S_target” is a target value of the amount of data that the terminal 102 stores in its buffer, and is determined based on the total capacity (“S_max”) of the buffers provided in the terminal 102 (the reception buffer 505 and the decoder buffer 508 in the example of FIG. 3) and the transmission capability of the network 103. Therefore, “S_target” generally varies depending on the model of the terminal 102.
[0062]
Further, “T_delay” is the time from when the terminal 102 writes the head data to the buffer until that data is read and decoding is started (that is, the cue delay time), and is determined as an arbitrary value within a range whose upper limit is the value obtained by dividing “S_target” by the transmission speed (described later). That is, subject to the condition “within a range not exceeding the value obtained by dividing “S_target” by the transmission speed”, the terminal 102 can determine “T_delay” independently of “S_target”.
[0063]
The “transmission speed” refers to the amount of information transmitted per unit time. For example, when the number of packets transmitted per unit time is fixed, the transmission speed can be controlled by increasing or decreasing the amount of data packed in each packet. Conversely, when the amount of data packed in each packet is fixed, the transmission speed can be controlled by lengthening or shortening the time interval between one packet and the next. The transmission speed can also be controlled by doing both at the same time (that is, increasing or decreasing the amount of data packed in each packet while lengthening or shortening the interval between packets). In the present embodiment, the speed control is performed by increasing or decreasing the amount of data packed in each packet.
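The relationship among transmission speed, packet size, and packet interval can be illustrated as follows. This is a minimal sketch: the function names and the numeric values are assumptions chosen for illustration, and transfer overhead such as packet headers is ignored.

```python
def transmission_rate(bytes_per_packet, interval_s):
    """Amount of information sent per unit time (bytes per second)."""
    return bytes_per_packet / interval_s


def packet_size_for_rate(target_rate, interval_s):
    """Fixed interval: vary the data packed in each packet."""
    return target_rate * interval_s


def interval_for_rate(target_rate, bytes_per_packet):
    """Fixed packet size: vary the interval between packets."""
    return bytes_per_packet / target_rate


# Illustrative values: 48,000 bytes/s sent as one packet every 0.5 s.
size = packet_size_for_rate(48_000, 0.5)
interval = interval_for_rate(48_000, 12_000)
```

The present embodiment corresponds to the first approach, in which the interval is fixed and the packet size is varied.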
[0064]
(2) The terminal 102 can change “S_target” as needed during the delivery of the stream. In this case, the changed “S_target” is notified from the terminal 102 to the server 101, and thereafter, the server 101 controls the transmission speed based on the changed “S_target”.
[0065]
In (2) above, “S_target” is changed in accordance with changes in the transmission capability of the network 103. Specifically, when the terminal 102 is a mobile phone, it can detect the electric field strength (for example, at four levels: strong, medium, weak, and out of service area), so “S_target” is changed on the assumption that the transmission capability varies with the electric field strength. For example, when the electric field strength changes from “strong” to “medium”, the terminal 102 changes “S_target” to a larger value, and when it changes from “medium” to “weak”, the terminal 102 changes “S_target” to a smaller value.
The operation of this system differs from the conventional one mainly in the above two points.
[0066]
Next, a specific example of the overall operation of this system will be described in detail. In FIG. 4, before starting the stream reproduction, the terminal 102 extracts a parameter group specific to the terminal from the ROM 502 according to the terminal control program. This parameter group includes the total capacity (maximum data amount that can be actually stored in the terminal 102) S_max, which is the sum of the reception buffer 505 and the decoder buffer 508. On the other hand, it is assumed that the CPU 503 knows the encoding / compression rate Vr of the stream data to be received and the video or audio frame generation period Tfrm by the procedure for obtaining the stream reproduction auxiliary data in advance. Further, it is assumed that the CPU 503 detects the transmission capability of the network 103, for example, the received radio wave intensity and the communication speed (information such as 64 Kbps connection or 32 Kbps connection in the case of PHS) through the network interface.
[0067]
Using S_max, Vr, Tfrm, the transmission capability of the network 103 (for example, effective transfer rate = networkRate), and the like, the CPU 503 determines a target amount S_target indicating how much data is to be stored in the buffer in the terminal 102, and the pre-buffering time (that is, the cue delay time) T_delay until stream playback is started.
[0068]
Here, the essential meaning of S_target is that it is a reference value: if the amount of data accumulated in the buffer of the terminal stays in the vicinity of S_target during the streaming playback about to be started, playback can continue normally without interruption. As described above, a large T_delay lengthens the cue delay time but makes playback more robust against transfer jitter in the network 103. However, an excessively long delay is inappropriate as a service specification. Therefore, when determining T_delay, it is necessary to balance the tolerance against transfer jitter and the waiting time at the time of cueing.
[0069]
In addition, instead of T_delay or in combination with T_delay, a filling amount S_delay that determines how many bytes of data are filled in the buffer in terminal 102 to start decoding may be determined. When the terminal 102 determines only S_delay and notifies it to the server 101, it is possible to convert S_delay to T_delay on the server 101 side using an expression of T_delay = S_delay / networkRate. Further, the value of S_delay may be a filling rate rS (%) with respect to the total buffer amount S_max (in this case, the conversion formula is S_delay = S_max * rS / 100).
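The two conversion formulas given above can be written out as follows. This is a minimal sketch: the function names and the numeric values (S_max = 512 KB, rS = 25 %, networkRate = 32 KB/s) are assumptions chosen for illustration.

```python
def s_delay_from_fill_rate(s_max, r_s):
    """S_delay = S_max * rS / 100, with rS given in percent."""
    return s_max * r_s / 100.0


def t_delay_from_s_delay(s_delay, network_rate):
    """T_delay = S_delay / networkRate (server-side conversion)."""
    if network_rate <= 0:
        raise ValueError("network_rate must be positive")
    return s_delay / network_rate


# Illustrative values only: sizes in KB, rate in KB/s.
s_delay = s_delay_from_fill_rate(s_max=512.0, r_s=25.0)
t_delay = t_delay_from_s_delay(s_delay, network_rate=32.0)
```

With these illustrative values, a filling rate of 25 % corresponds to 128 KB, which takes 4 seconds to fill at 32 KB/s.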
[0070]
Once the terminal 102 has prepared S_target and T_delay (and / or S_delay), it issues a SETUP command that prompts the server 101 to prepare for stream delivery, as shown in FIG. 4. The SETUP command includes S_target and T_delay (and / or S_delay) as arguments. When the server 101 receives the SETUP command, it stores the arguments in the RAM 404 and performs initial setting for stream delivery. Specifically, the CPU 412 of the server 101 extracts the arguments from the RAM 404 and then, for example, reads the source file of the stream from the storage device 411 and writes it to the buffer 407, and sets the parameters of the packet generation circuit 406, which packetizes the read data. The packet generation circuit 406 need not be dedicated hardware; it may be a program (software algorithm) that causes the CPU 412 of the server 101 (realized, for example, by a workstation) to execute equivalent packetization processing.
[0071]
The two values of S_target and T_delay (and / or S_delay) are also passed to the packet generation circuit 406. The packet generation circuit 406 calculates an optimum rate control parameter using these values, and as a result, packets are generated and transmitted at a rate suitable for stream delivery to the terminal 102. When the preparation for transmitting the packet into the network 103 is completed normally, OK is returned from the transmission / reception layer to the control layer as shown in FIG. 4, and then OK is returned to the terminal 102 for the SETUP command. Thus, the stream distribution preparation is completed in this system.
[0072]
Next, the terminal 102 issues a PLAY command that prompts the server 101 to start stream distribution. When receiving the PLAY, the server 101 starts distributing stream data. The terminal 102 receives and accumulates stream data from the server 101. Then, after the pre-buffering time (T_delay) has elapsed from the start of accumulation, decoding and reproduction of stream data is started. At this time, it goes without saying that the stream distribution is performed based on an appropriate rate control parameter set at the time of SETUP.
[0073]
At the end of stream reproduction, a TEARDOWN command is issued from the terminal 102 to the server 101. When the server 101 receives TEARDOWN, it performs a stream distribution end process and completes all procedures. The above is a specific operation example of this system.
[0074]
Hereinafter, the operation of the terminal 102 will be described in detail. The terminal 102 is a mobile phone that can be connected to the Internet, and has a function of detecting electric field strength (received radio wave strength). FIG. 5 is a flowchart showing the operation of the terminal 102 of FIG. In FIG. 5, first, the terminal 102 determines values of two parameters S_target and T_delay (step S101).
[0075]
Here, the processing in step S101 will be described concretely. FIG. 6 is a diagram showing the contents stored in the ROM 502 of FIG. As illustrated in FIG. 6, the ROM 502 stores a terminal control program, a table 601 in which electric field strength and S_target are described in association with each other, and a value of the parameter T_delay. As values of the parameter S_target, three values are stored: S_target1 corresponding to the electric field strength “strong”, S_target2 corresponding to “medium”, and S_target3 corresponding to “weak / out of range”. On the other hand, only one value of the parameter T_delay is stored.
[0076]
The three values S_target1 to S_target3 are determined so as to satisfy the following relationship.
S_target3 < S_target1 < S_target2 ≦ S_max
On the other hand, the value T_delay is determined so as not to exceed the value obtained by dividing the value S_max by the effective transmission capability of the network 103.
[0077]
As an example, if S_max is 512 (KB), then S_target1 = 256 (KB), S_target2 = 384 (KB), S_target3 = 128 (KB), and so on. Further, assuming that the effective transmission capability of the network 103 is 384 (Kbps), that is, 48 (KB / sec), T_delay is an arbitrary value (for example, 4 seconds or 3 seconds) within a range not exceeding 512 ÷ 48 ≈ 10.7 (seconds).
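The arithmetic in this example can be checked with a short calculation. This is a sketch using the numbers given above; the constant names and the helper is_valid_t_delay are introduced here for illustration only.

```python
S_MAX_KB = 512
S_TARGET1_KB, S_TARGET2_KB, S_TARGET3_KB = 256, 384, 128
NETWORK_KBPS = 384  # effective transmission capability in kilobits/s

# 384 kilobits per second corresponds to 48 kilobytes per second.
rate_kb_per_s = NETWORK_KBPS / 8

# Upper limit on T_delay: time needed to fill the whole buffer.
t_delay_upper_s = S_MAX_KB / rate_kb_per_s

# The ordering required of the three target values.
ordering_ok = S_TARGET3_KB < S_TARGET1_KB < S_TARGET2_KB <= S_MAX_KB


def is_valid_t_delay(t_delay_s):
    """True if the candidate cue delay lies within the permitted range."""
    return 0 <= t_delay_s <= t_delay_upper_s
```

Both candidate values mentioned in the text, 4 seconds and 3 seconds, fall within the limit of about 10.7 seconds.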
[0078]
In step S101, S_target1 as an initial value and a value T_delay are read from the ROM 502.
[0079]
Here, the values of S_target1 to S_target3 and T_delay are calculated in advance and stored in the ROM 502, and the CPU 503 reads out the necessary values from the ROM 502. Instead, the total capacity of the buffer, the effective transmission capability of the network 103, and a program for calculating the values of S_target and T_delay may be stored in the ROM 502. In this case, the CPU 503 reads the capacity, the transmission capability, and the program from the ROM 502 whenever necessary, and calculates the values of S_target and T_delay. Further, although there is only one value of T_delay here, a plurality of values may be prepared and one selected from among them. The above is the processing content of step S101.
[0080]
In FIG. 5 again, the terminal 102 attaches the S_target and T_delay determined in step S101 to the SETUP command and transmits them to the server 101 (step S102). In response, a stream is sent from the server 101. During stream transmission, the server 101 performs transmission speed control based on S_target and T_delay notified from the terminal (the operation on the server side will be described later).
[0081]
Next, the terminal 102 receives a stream sent from the server 101 and starts an operation of writing to the buffer (step S103). Specifically, as shown in FIG. 3, the stream sent through the network 103 is first written into the reception buffer 505 via the network controller 506. When time elapses and the reception buffer 505 becomes full, the stream in the reception buffer 505 is sequentially read from the head data and written to the decoder buffer 508.
[0082]
Next, the terminal 102 determines whether or not the time T_delay has elapsed from the start of buffering (step S104). If the determination result is negative, the terminal 102 waits until it becomes affirmative. If the determination result in step S104 is affirmative, the terminal 102 starts an operation of reading the stream from the buffer and decoding / reproducing it (step S105). Specifically, in FIG. 3, the CPU 503 measures the elapsed time from the start of buffering, and at the moment when the measurement result coincides with T_delay in the ROM 502, it instructs the reproduction module 510 to start sequentially reading the stream in the decoder buffer 508 from the head data and inputting it to the decoder 509.
[0083]
Next, the terminal 102 determines whether or not the transmission capability of the network 103 has changed across a threshold (step S106). Specifically, this determination is performed as follows. For example, a host computer (not shown) that manages the network 103 distributes information regarding the transmission capability of the network 103 to the terminal 102 as needed via the network 103, and the terminal 102 determines whether there has been a change based on the information from the host computer.
[0084]
In this case, specifically, as shown in FIG. 3, information regarding transmission capability is sent to the CPU 503 through the transmission / reception module 507. The ROM 502 stores a threshold value in advance, and the CPU 503 compares the sent information, the previous information held, and the threshold value in the ROM 502 with each other, so that the transmission capability crosses the threshold value. It can be determined whether or not it has changed.
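The comparison of the newly received information, the previously held information, and the stored thresholds can be sketched as follows. This is an illustrative sketch: the function name, the numeric thresholds, and the convention that a value equal to a threshold counts as being at or above it are assumptions not stated in the description.

```python
def crossed_thresholds(previous, current, thresholds):
    """Return the thresholds lying between the previous and current values.

    A threshold counts as crossed when one of the two values is below it
    and the other is at or above it (an assumed convention).
    """
    low, high = sorted((previous, current))
    return [th for th in thresholds if low < th <= high]


# Hypothetical thresholds A and B, in the same unit as the capability.
THRESHOLD_A, THRESHOLD_B = 256, 128

changed = crossed_thresholds(300, 200, [THRESHOLD_A, THRESHOLD_B])
unchanged = crossed_thresholds(300, 280, [THRESHOLD_A, THRESHOLD_B])
```

A non-empty result corresponds to an affirmative determination in step S106.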
[0085]
Alternatively, when the host computer that manages the network 103 does not have a function of distributing information related to the transmission capability to the terminal 102, the terminal 102 can make the determination, for example, as follows. That is, when the terminal 102 is a mobile phone, it has a function of detecting the surrounding electric field strength and displaying the detection result as “strong / medium / weak / out of service area”, as shown in FIG. 7 (described later). If a change in this electric field strength is regarded as a change in the transmission capability of the network 103, the terminal 102 can easily detect the change.
[0086]
If the determination result of step S106 is affirmative, the terminal 102 determines a new S_target (step S107) and transmits it to the server 101 (step S108). On the other hand, if the determination result of step S106 is negative, steps S107 and S108 are skipped and step S109 (described later) is executed.
[0087]
Here, the processing contents of steps S106 and S107 will be described in detail. Hereinafter, a case where the terminal 102 is a mobile phone and S_target is changed according to a change in electric field strength will be described. FIG. 7 is a schematic diagram showing a distribution of electric field strength in a certain area and a change in transmission capability accompanying the movement of the terminal. FIG. 7A shows an electric field strength distribution in an area including three relay stations B1 to B3. In FIG. 7A, concentric circles centering on the relay stations B1 to B3 are equi-electric field curves that can connect points having the same electric field strength.
[0088]
For example, the electric field strength is “strong” in the concentric circle 703 closest to the relay station, and “medium” in the region between the concentric circle 703 and the next concentric circle 704. Furthermore, the region between the concentric circle 704 and the concentric circle 705 is “weak”, and the region outside the concentric circle 705 is “out of range”. However, concentric circles centering on each relay station partially intersect each other, and there are only a few regions where the electric field strength is “out of range”.
[0089]
Suppose now that the terminal 102 moves from the vicinity of the relay station B1 to the vicinity of the relay station B2 along the route indicated by the arrow 702. FIG. 7B shows the electric field strength along the arrow 702 in FIG. 7A (this can be regarded as the transmission capability of the network 103). As shown in FIG. 7B, the electric field strength is “strong” when the terminal 102 is in the vicinity of the relay station B1, and gradually weakens to “medium”, “weak”, and “out of service area” as the terminal 102 moves away from the relay station B1. Immediately after leaving the service area of the relay station B1, the terminal 102 enters the service area of the relay station B2, and the radio wave intensity gradually increases to “weak”, “medium”, and “strong”.
[0090]
The terminal 102 moving as described above determines that the transmission capability of the network 103 has changed across the threshold A at the moment when the electric field strength changes from “strong” to “medium”, and determines a new S_target. At the moment when the electric field strength changes from “medium” to “weak”, it determines that the transmission capability has changed across the threshold B, and determines a new S_target. Conversely, at the moment when the electric field strength changes from “weak” to “medium”, it determines that the transmission capability has changed across the threshold B and determines a new S_target, and at the moment when the electric field strength changes from “medium” to “strong”, it determines that the transmission capability has changed across the threshold A and determines a new S_target.
[0091]
In general, the threshold A is a value approximately between the maximum transmission capability that can be realized in the network 103 and the transmission capability at which a stream transfer loss starts to occur. The threshold value B is a value corresponding to a transmission capability at which a stream transfer loss starts to occur.
[0092]
The new S_target is determined as follows by referring to the table 601 (FIG. 6) in the ROM 502. FIG. 8 is a flowchart showing details of step S107 in FIG. 5. In FIG. 8, the terminal 102 first determines whether or not the electric field strength after the change is “strong” (step S201). If the determination result is affirmative, the terminal 102 sets the new S_target to S_target1 (step S202). If the determination result of step S201 is negative, step S202 is skipped and the process proceeds to step S203.
[0093]
Next, the terminal 102 determines whether or not the electric field strength after the change is “medium” (step S203). If the determination result is affirmative, the terminal 102 sets the new S_target to S_target2 (step S204). If the determination result of step S203 is negative, step S204 is skipped and the process proceeds to step S205.
[0094]
Next, the terminal 102 determines whether or not the electric field strength after the change is “weak / out of range” (step S205). If the determination result is affirmative, the terminal 102 sets the new S_target to S_target3 (step S206). Thereafter, the process returns to the flow of FIG. If the determination result of step S205 is negative, step S206 is skipped and the process returns to the flow of FIG.
[0095]
Therefore, when the terminal 102 moves along the arrow 702 in FIG. 7A, the terminal 102 changes the value of the parameter S_target in the order S_target1 → S_target2 → S_target3 → S_target2 → S_target1 as the electric field strength changes. As a specific example, using the values given earlier, it is changed in the order of 256 (KB) → 384 (KB) → 128 (KB) → 384 (KB) → 256 (KB). The above is a specific processing example of step S106 and step S107.
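The selection in steps S201 to S206 amounts to a table lookup, which can be sketched as follows. The dictionary below stands in for the table 601, using the example values S_target1 = 256, S_target2 = 384, and S_target3 = 128 (KB) given earlier; the key strings and the function name are assumptions introduced for illustration.

```python
# Stand-in for table 601: electric field strength -> S_target in KB.
S_TARGET_TABLE = {
    "strong": 256,        # S_target1
    "medium": 384,        # S_target2
    "weak": 128,          # S_target3
    "out_of_range": 128,  # S_target3 ("weak / out of range")
}


def new_s_target(field_strength, current_s_target):
    """Return the S_target for the new field strength.

    If no entry matches, the current value is kept, mirroring the case
    where steps S202, S204, and S206 are all skipped.
    """
    return S_TARGET_TABLE.get(field_strength, current_s_target)


# Field strength observed along the route from B1 to B2.
route = ["strong", "medium", "weak", "medium", "strong"]
sequence = []
current = S_TARGET_TABLE["strong"]
for level in route:
    current = new_s_target(level, current)
    sequence.append(current)
```

Along this route, the lookup yields S_target1, S_target2, S_target3, S_target2, and S_target1 in turn.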
[0096]
Returning to FIG. 5, when the terminal 102 transmits the new S_target to the server 101 in step S108, the server 101 changes the value of the parameter S_target to the value newly notified from the terminal 102 and continues the transmission speed control.
[0097]
Next, the terminal 102 determines whether or not to end the streaming reproduction (step S109). When ending, the terminal 102 transmits the command TEARDOWN to the server 101 and stops receiving and buffering the stream (step S110). Then, the reproduction process is stopped (step S111). On the other hand, when the streaming reproduction is to be continued, the terminal 102 returns to step S106 and repeats the same processing as described above. The above is the operation of the terminal 102.
[0098]
Next, the operation of the server 101 will be described in detail. To simplify the description, it is assumed that the server 101 performs encoding using an encoding compression algorithm that generates frames at a fixed period Tfrm, such as MPEG1 video (ISO / IEC 11172-2), MPEG2 video (ISO / IEC 13818-2), or MPEG2-AAC audio (ISO / IEC 13818-7), and packetizes the encoded data at a fixed period Ts. This packetization is performed in units of frames.
[0099]
First, an overview of the stream transmission speed control performed by the server 101 will be described with reference to FIGS. 9 to 11. FIGS. 9 to 11 are diagrams illustrating how the amount of data accumulated in the buffer of the terminal 102 (buffer occupancy) changes under the stream transmission speed control performed by the server 101. The server 101 controls the transmission speed of the stream so that the buffer occupancy in the destination terminal 102 changes as shown in FIGS. 9 to 11.
[0100]
FIG. 9 shows how the buffer occupancy approaches S_target. FIG. 10 shows how the buffer occupancy approaches S_target2 when the value of S_target is changed to a larger value (S_target2) while the buffer occupancy is in the vicinity of S_target. FIG. 11 shows how the buffer occupancy approaches S_target3 when the value of S_target is changed to a smaller value (S_target3) while the buffer occupancy is in the vicinity of S_target.
[0101]
In FIGS. 9 to 11, “S_max” is the total buffer capacity of the terminal 102, and “Sum” is the buffer occupancy. “delta (0, 1, 2, ...)” indicates the amount of data that the server 101 transmits per unit time Ts (that is, the amount of data packed in one packet). Here, the unit time Ts is the period at which the server 101 transmits packets, and is a fixed value. “L (0, 1, 2, ...)” is the data amount of each frame.
[0102]
When the server 101 receives a notification of the value of T_delay from the terminal 102, the server 101 controls the transmission speed of the stream based on the value. This speed control is performed by changing the amount of data packed in one packet.
[0103]
In FIG. 9, the packet (i = 0) transmitted first by the server 101 is packed with data of the amount delta0, and the buffer occupation amount Sum is delta0 at time t = 0. When the unit time Ts elapses, the next packet (i = 1) is sent, which is filled with data of the quantity delta1. Therefore, at time t = Ts, Sum is {delta0 + delta1}. Thereafter, every time the unit time Ts elapses, packets (i = 2, 3,...) Are sent one after another, and delta2, delta3,... Are added to Sum.
[0104]
On the other hand, at time t = T_delay before the third packet (i = 2) is sent, a process of reading and decoding data from the buffer is started. Since decoding is performed on a frame basis, L0, L1, L2,... Are subtracted from Sum every fixed period Tfrm after t = T_delay.
[0105]
That is, from time t = 0, the buffer occupancy Sum increases as delta0, delta1, ... are added every period Ts, while from time t = T_delay, L0, L1, L2, ... are subtracted every period Tfrm. Accordingly, until Sum reaches the target value S_target, the amount of data packed in one packet is made larger than the standard (more generally, the transmission speed is increased) so that the writing speed to the buffer is increased. Thereafter, if the amount of data packed in one packet is returned to the standard so that the writing speed and the reading speed are balanced, Sum can be kept in the vicinity of the target value S_target.
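The behavior of Sum described above can be reproduced with a small simulation. This is a simplified sketch, not the algorithm of the embodiment: it assumes frames of constant size, instantaneous packet arrival, decoding of the first frame exactly at t = T_delay, and a hypothetical rule that sends a packet of boost times the standard size whenever Sum is below S_target. All names and numeric values are illustrative.

```python
def simulate(s_target, t_delay_ms, ts_ms, tfrm_ms, frame_size, boost, steps):
    """Return Sum sampled just after each packet (i = 0, 1, ...) arrives."""
    # Standard packet size: balances writing against reading per period Ts.
    standard = frame_size * ts_ms // tfrm_ms
    total = 0      # buffer occupancy Sum
    decoded = 0    # number of frames already removed by the decoder
    history = []
    for i in range(steps):
        t = i * ts_ms
        # Frames decoded up to and including time t (none before T_delay).
        due = 0 if t < t_delay_ms else (t - t_delay_ms) // tfrm_ms + 1
        total -= (due - decoded) * frame_size
        decoded = due
        # Hypothetical rule: larger packets while below the target.
        delta = standard * boost if total < s_target else standard
        total += delta
        history.append(total)
    return history


history = simulate(s_target=30000, t_delay_ms=1000, ts_ms=500,
                   tfrm_ms=100, frame_size=1000, boost=2, steps=10)
```

In this run, Sum rises with the enlarged packets, passes S_target, and then settles at a constant level once the packet size returns to the standard.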
[0106]
When such transmission speed control is performed, even if the target value S_target is changed to a new target value (S_target2 or S_target3) partway through, as shown in FIGS. 10 and 11, Sum can be kept in the vicinity of the new target value (S_target2 or S_target3).
[0107]
That is, in FIG. 10, when S_target is changed to a larger value (S_target2) while Sum is in the vicinity of the target value S_target, the server 101 increases the amount of data packed in the subsequent packets (i = 3, 4), thereby making the writing speed to the buffer higher than the reading speed from the buffer. After Sum reaches the new target value S_target2, the amount of data packed in one packet is returned to the standard so that the writing speed and the reading speed are balanced.
[0108]
In FIG. 11, when S_target is changed to a smaller value (S_target3) while Sum is in the vicinity of the target value S_target, the server 101 reduces the amount of data packed in the subsequent packets (i = 3, 4), thereby making the writing speed to the buffer lower than the reading speed from the buffer. After Sum reaches the new target value S_target3, the amount of data packed in one packet is returned to the standard so that the writing speed and the reading speed are balanced.
[0109]
Next, the transmission rate control by the server 101 described above will be explained in detail. FIG. 12 is a flowchart illustrating an example of a transmission rate control algorithm performed by the server 101. In FIG. 12, first, the terminal 102 detects its own buffer occupancy (Sum), and the server 101 receives a notification of the buffer occupancy Sum from the terminal 102 (step S301). Next, the server 101 determines whether or not the buffer occupancy Sum notified in step S301 is in the vicinity of the target value S_target specified by the terminal 102 (step S302). If the determination result is affirmative, the current transmission rate is maintained.
[0110]
When the determination result of step S302 is negative, the server 101 determines whether or not the buffer occupation amount Sum notified in step S301 is larger than the target value S_target (step S303). If the determination result is negative, the transmission speed is changed to a speed higher than the current speed (step S304), and then the process proceeds to step S306. On the other hand, if the determination result of step S303 is affirmative, the transmission speed is changed to a speed slower than the current speed (step S305), and then the process proceeds to step S306.
[0111]
In step S306, it is determined whether or not to continue the speed control operation. If the determination result is affirmative, the process returns to step S301 and the same operation as described above is repeated. On the other hand, if the determination result is negative, the operation is terminated. The above is an example of transmission speed control performed by the server 101.
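One pass through steps S302 to S305 can be sketched as a single function. This is an illustrative sketch: the description does not specify how wide “the vicinity” of S_target is or by how much the speed is changed, so the tolerance and step parameters are assumptions, as is the function name.

```python
def adjust_rate(rate, buffer_sum, s_target, tolerance, step):
    """Return the transmission rate to use after one control pass.

    tolerance: half-width of the band treated as "near S_target" (assumed).
    step: amount by which the rate is raised or lowered (assumed).
    """
    if abs(buffer_sum - s_target) <= tolerance:  # S302: near the target
        return rate                              # keep the current rate
    if buffer_sum > s_target:                    # S303: above the target
        return max(rate - step, 0)               # S305: slow down
    return rate + step                           # S304: speed up


# Illustrative values: rate in KB/s, occupancy and target in KB.
kept = adjust_rate(48, 250, 256, tolerance=16, step=4)
faster = adjust_rate(48, 100, 256, tolerance=16, step=4)
slower = adjust_rate(48, 400, 256, tolerance=16, step=4)
```

Repeating this pass for each notification from the terminal corresponds to the loop closed by step S306.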
[0112]
In the example of FIG. 12, the terminal 102 detects its own buffer occupancy and notifies the server 101 of it. In that case, however, what the terminal 102 detects is the buffer occupancy at the current time. Moreover, since it takes time to transmit the information from the terminal 102 to the server 101, the server 101 performs transmission rate control based on a buffer occupancy that is out of date by the transmission delay time, and in practice it is difficult to keep the buffer occupancy in the vicinity of S_target.
[0113]
In contrast, in another example described below (see FIGS. 13 and 14), transmission rate control is performed based on the buffer occupancy at a certain time in the future, which makes it possible to keep the buffer occupancy in the vicinity of S_target. In this case, the server 101 does not receive the Sum notification from the terminal 102 but instead predicts, by calculation, the buffer occupancy Sum on the terminal 102 side at a future time. This prediction calculation is performed as follows.
[0114]
That is, in FIG. 2, the ROM 413 stores a packet transmission cycle Ts (fixed value) and a decoding cycle Tfrm (fixed value) in advance. When generating a packet, the CPU 412 stores the amount of data (delta 0, delta 1, delta 2,...) Packed in the packet in the RAM 404. Further, when the stream is transmitted, the data amount (L0, L1, L2,...) Of each frame is stored in the RAM 404.
[0115]
The RAM 404 also stores T_delay previously notified from the terminal 102. By performing a predetermined calculation based on Ts and Tfrm in the ROM 413 and on delta (0, 1, 2, ...), L (0, 1, 2, ...), and T_delay in the RAM 404, the CPU 412 can calculate the buffer occupancy Sum at a future time. Through such calculation processing, the server 101 can predict how the buffer occupancy Sum changes on the terminal 102 side (see FIGS. 9 to 11).
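One way such a prediction could be computed from Ts, Tfrm, delta, L, and T_delay is sketched below. The description only says that “a predetermined calculation” is performed, so this is an assumed formulation rather than the calculation of the embodiment: packet i is taken to arrive instantaneously at t = i * Ts, and frame k is taken to be removed at t = T_delay + k * Tfrm. The function name and values are illustrative.

```python
def predicted_sum(t, ts, tfrm, t_delay, deltas, frame_sizes):
    """Predicted buffer occupancy Sum at time t (one time unit throughout)."""
    if t < 0:
        return 0
    # Packets i = 0 .. floor(t / Ts) have been written to the buffer.
    sent = min(int(t // ts) + 1, len(deltas))
    # Frames k = 0 .. floor((t - T_delay) / Tfrm) have been read out.
    if t < t_delay:
        decoded = 0
    else:
        decoded = min(int((t - t_delay) // tfrm) + 1, len(frame_sizes))
    return sum(deltas[:sent]) - sum(frame_sizes[:decoded])


# Illustrative values: times in milliseconds, sizes in bytes.
deltas = [10000] * 4        # delta0 .. delta3
frame_sizes = [1000] * 20   # L0 .. L19
at_start = predicted_sum(0, 500, 100, 1000, deltas, frame_sizes)
at_delay = predicted_sum(1000, 500, 100, 1000, deltas, frame_sizes)
later = predicted_sum(1499, 500, 100, 1000, deltas, frame_sizes)
```

Because every input is known on the server side, the server can evaluate this for a future time without waiting for a notification from the terminal.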
[0116]
Hereinafter, a specific operation example in which the server 101 predicts the buffer occupancy Sum of the terminal 102 by calculation and performs transmission rate control will be described with reference to the buffer transition diagram of FIG. 9, the flowcharts of FIGS. 13 and 14, and the packet configuration diagram of FIG.
[0117]
In FIG. 9, S_max is the maximum value of the effective accumulation amount of the buffer in the terminal 102 (simply called the “total capacity of the buffer”), S_target is the target value of the amount of data accumulated in the buffer in the terminal 102 in the current streaming, and T_delay is the set value of the cue delay time. The meaning of each parameter has already been described. In the following description, it is assumed that S_target and T_delay are notified from the terminal 102.
[0118]
In the present embodiment, to facilitate understanding, an example is shown in which packets are generated and distributed at a fixed period Ts (a packet is distributed at each time corresponding to i = n, where n is an integer). When a packet is delivered at the time corresponding to i = n (t = i * Ts), the accumulation amount Sum in the reception buffer 505 and the decoder buffer 508 of the terminal 102 increases instantaneously by an amount of data corresponding to several frames. This is because packets are generated and distributed to the terminal 102 in a pattern in which a plurality of frames are inserted into one packet, as shown in FIG. In reality, packet delivery takes a transfer time, so the buffer occupancy does not increase instantaneously as drawn but rises along a sloped line (slope = networkRate); here, however, it is treated as a simplified model. In addition, the buffer occupancy decreases stepwise after time t = T_delay, indicating that stream playback has started at the terminal 102 at that time. That is, in each frame reproduction period Tfrm, data of one frame length L = L [k] (k is an integer) is processed by the decoder 509.
[0119]
FIG. 13 and FIG. 14 are flowcharts showing another example of the transmission control algorithm performed by the server 101 to realize the buffer transition of FIG. 9. FIG. 13 is an overview of the algorithm, and FIG. 14 is a flowchart showing an example of the function mkPacket in step S404 of FIG. A program describing this algorithm is stored in the ROM 413 (see FIG. 2), and the CPU 412 performs various calculations and controls according to this program, whereby the buffer transition of FIG. 9 is realized. To simplify the explanation, stopping distribution partway through the stream is not considered here. Hereinafter, each step will be described in order.
[0120]
In FIG. 13, the server 101 receives and stores S_target and T_delay sent from the terminal 102 (step S401). Specifically, in FIG. 2, the values of S_target and T_delay sent from the terminal 102 via the network 103 are written into the RAM 404 through the network controller 410.
[0121]
Here, the terminal 102 determines the values of the parameters S_target and T_delay and notifies the server 101 of the result. Alternatively, the server 101 may store these values in advance, or the server 101 may store model information of the terminal 102 (total buffer capacity, etc.) and calculate the parameter values based on that model information.
[0122]
Next, each variable is initialized (steps S402 and S403). The meaning of each variable will be described later together with the function mkPacket. When the initialization is completed, the processing from step S404 onward is started, that is, the processing of generating a packet with the function mkPacket and sending it to the network 103. In this example, the generated packets are delivered to the terminal 102 at a fixed period Ts. Therefore, the server 101 adjusts the timing in step S405 and then sends the packet out in step S406. When this series of processing is completed, the CPU 412 updates the execution counter i of the function mkPacket, returns to step S404, and loops. When all of the stream data has been read and packetized, the CPU 412 exits the function mkPacket and returns to step S404 with the determination result FALSE. At this point, the CPU 412 regards the distribution as completed and ends the algorithm. The above is the outline of this transmission control algorithm.
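The outer loop of FIG. 13 can be sketched as follows. This is only an illustrative sketch in Python: the names run_distribution, mk_packet, send and sleep are hypothetical, a return value of None stands in for the determination result FALSE, and the timing adjustment of step S405 is reduced to a plain wait of one period Ts.

```python
import time
from typing import Callable, Optional


def run_distribution(
    mk_packet: Callable[[int], Optional[bytes]],
    send: Callable[[bytes], None],
    ts: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call mk_packet once per period Ts until it reports the end of the stream.

    Returns the number of packets handed to send().
    """
    i = 0  # execution counter of mkPacket
    while True:
        packet = mk_packet(i)   # step S404
        if packet is None:      # stands in for the determination result FALSE
            return i
        sleep(ts)               # step S405, simplified timing adjustment
        send(packet)            # step S406
        i += 1                  # update the counter and loop back to S404
```

Passing the wait function as a parameter is a design choice of this sketch so that the loop can be exercised without real delays.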
[0123]
Next, a detailed algorithm of the function mkPacket shown in step S404 will be described. First, each variable will be described. Sum is the total amount of data stored in the reception buffer 505 and the decoder buffer 508 in the terminal 102. L is the data amount of a frame. delta is the total amount of data packetized since the function mkPacket was called this time (that is, the amount of data packed in one packet). in is a counter indicating the number of frames of the stream source read from the storage device 411. out is a counter indicating the number of frames that have been decoded by the decoder 509 in the terminal 102. dts is the time at which the decoder 509 decodes a frame. grid is the upper limit up to which dts was advanced in the previous pass through the function mkPacket.
[0124]
In FIG. 14, the function mkPacket is roughly divided into two algorithms: a packet generation algorithm A1 and a decoding amount calculation algorithm A2. In the former, in the first step (S501), the CPU 412 clears delta. In the subsequent step S502, the CPU 412 determines whether or not the (already read) frame of L = L[in] can be used for the current packet generation. The criteria are (a) that Sum does not exceed S_target even if L is added to it, and (b) that delta, the amount of data packetized by the current function call (the amount of data packed in one packet this time), does not exceed deltaMax, the upper limit of the amount of data that can be packed in one packet, even if L is added to it.
[0125]
Here, deltaMax is the maximum amount of data that can be distributed to the terminal within the period Ts. It satisfies the inequality shown in the figure,
(deltaMax + hdr) / Ts < networkRate
and can be calculated from the effective transfer rate (transmission capability) of the network 103. If the determination in step S502 is true, the CPU 412 proceeds to step S503 and packetizes the frame of L = L[in]. In the subsequent step S504, the CPU 412 updates Sum and delta to reflect the packetization. In the subsequent step S505, the CPU 412 reads the data of the next frame from the read buffer 407 and its frame length L from the RAM 404, and then determines whether or not L is greater than zero.
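As a worked illustration of the inequality, the largest integer deltaMax can be computed from networkRate, Ts and hdr as sketched below. The function name delta_max and the choice of bytes and seconds as units are assumptions of this sketch, and hdr is taken here to be the header size of one packet.

```python
import math


def delta_max(network_rate: float, ts: float, hdr: int) -> int:
    """Largest integer d such that (d + hdr) / ts < network_rate.

    network_rate is in bytes per second, ts in seconds, hdr in bytes
    (units assumed for this sketch).
    """
    budget = network_rate * ts - hdr
    d = math.ceil(budget) - 1  # largest integer strictly below the budget
    return max(d, 0)
```

For example, with networkRate = 8000 bytes per second, Ts = 0.5 s and hdr = 40 bytes, the budget is 3960 bytes, so the strict inequality allows at most 3959 bytes of frame data per packet.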
[0126]
If the determination result in step S505 is negative, that is, if L = 0, the CPU 412 regards all data as having been read (end-of-file detection), exits the function, and returns to step S404 of the main flow (FIG. 13) with the determination result FALSE. On the other hand, if the determination result is affirmative, that is, if L > 0, the CPU 412 proceeds to the next step (S506) and stores L in the array element length[in] (in the RAM 404). As will be described later, this value is used in the decoding amount calculation algorithm A2. Next, the CPU 412 proceeds to step S507, updates the read frame counter in, and loops back to step S502.
[0127]
As the packet generation by the above loop is repeated, the values of Sum and delta increase gradually. If it is determined in step S502 that Sum or delta has become sufficiently large, the CPU 412 exits this loop and enters the decoding amount calculation algorithm A2.
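The packet generation algorithm A1 can be sketched as below. This is a simplified sketch, not the flowchart itself: the frame lengths are given as a Python list, running out of list entries stands in for the detection of L = 0, and the names SenderState and generate_packet are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SenderState:
    s_target: int      # target occupancy notified by the terminal
    delta_max: int     # upper limit of the data amount per packet
    sum_: int = 0      # Sum: estimated occupancy of the terminal buffers
    in_: int = 0       # in: index of the next frame to packetize
    length: List[int] = field(default_factory=list)  # lengths kept for A2


def generate_packet(state: SenderState, frame_sizes: List[int]) -> List[int]:
    """Return the indices of the frames packed into the current packet."""
    delta = 0                                   # S501: clear delta
    packed: List[int] = []
    while state.in_ < len(frame_sizes):         # end of list stands in for L = 0
        frame_len = frame_sizes[state.in_]
        fits_buffer = state.sum_ + frame_len <= state.s_target   # S502 (a)
        fits_packet = delta + frame_len <= state.delta_max       # S502 (b)
        if not (fits_buffer and fits_packet):
            break                               # leave the loop, go on to A2
        packed.append(state.in_)                # S503: packetize the frame
        state.sum_ += frame_len                 # S504: update Sum and delta
        delta += frame_len
        state.length.append(frame_len)          # S506: remember the length
        state.in_ += 1                          # S507: advance the counter
    return packed
```

With S_target = 10, deltaMax = 6 and frame lengths 4, 2, 3, 5, the first call packs frames 0 and 1 (stopped by deltaMax) and the second call packs only frame 2 (stopped by S_target).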
[0128]
In the decoding amount calculation algorithm A2, in the first step (S508), it is determined whether i * Ts is equal to or greater than grid. The purpose of step S508 is to determine whether or not it is time for decoding to start in the terminal 102. Specifically, since grid is initially set to T_delay, it is determined that decoding at the terminal 102 has not yet started while the function call counter i is small and t = i * Ts is less than grid. In FIG. 9, the times corresponding to i = 0 and i = 1 correspond to this.
[0129]
If the determination result of step S508 is negative, the CPU 412 exits the function without subtracting any frame data for decoding. On the other hand, when i becomes sufficiently large and the packet generation time t = i * Ts becomes equal to or greater than grid, the CPU 412 regards the terminal 102 as having already started decoding and performs the subtraction of frame data. In FIG. 9, the times at which i is 2 or more correspond to this. In the subsequent loop from step S509 to step S512, the amount of frame data length[out] to be decoded between the current grid time and the next grid time (= grid + Ts) is subtracted from Sum, and the number of decoded frames out is counted up.
[0130]
In step S511 in the above loop, Tfrm is added to dts every time one frame is decoded. This is because this embodiment uses an encoding method that generates frames at a fixed time interval Tfrm. In step S512, the CPU 412 determines whether there is still a frame to be decoded within the current time interval Ts. If the determination result in step S512 is negative, that is, if there is no longer a frame to be decoded within the current time interval Ts, the CPU 412 exits the loop (steps S509 to S512) and proceeds to step S513. In step S513, the CPU 412 updates the variable grid to the next grid time. The CPU 412 then exits the function and returns to step S404 of the main flow (FIG. 13) with the determination result TRUE.
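The decoding amount calculation algorithm A2 can be sketched as follows. Several points are assumptions of this sketch rather than statements of the flowchart: times are integers in milliseconds, dts is initialised to T_delay like grid, the names DecodeState and account_decoding are hypothetical, and a guard is added so that no frame is subtracted before its length has been recorded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecodeState:
    grid: int          # next grid time in ms, initially T_delay
    dts: int           # decode time of the next frame in ms (assumed initial value T_delay)
    sum_: int          # Sum: estimated occupancy of the terminal buffers
    out: int = 0       # out: number of frames regarded as decoded
    length: List[int] = field(default_factory=list)  # lengths recorded by A1


def account_decoding(i: int, ts: int, tfrm: int, st: DecodeState) -> bool:
    """Subtract from Sum the frames decoded in [grid, grid + Ts).

    Returns False while decoding is regarded as not yet started.
    """
    if i * ts < st.grid:                         # S508
        return False
    # S509 to S512: one pass per frame decoded before the next grid time
    while st.dts < st.grid + ts and st.out < len(st.length):
        st.sum_ -= st.length[st.out]             # remove the decoded frame
        st.out += 1                              # count the decoded frame
        st.dts += tfrm                           # S511: fixed frame interval
    st.grid += ts                                # S513: move to the next grid time
    return True
```

With Ts = 1000 ms, Tfrm = 400 ms and T_delay = 2000 ms, nothing is subtracted for i = 0 and i = 1, three frames are subtracted for i = 2 (decode times 2000, 2400 and 2800 ms), and two frames for i = 3 (3200 and 3600 ms).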
[0131]
With the above algorithm, as shown in FIG. 9, the buffer occupancy Sum in the terminal 102 can be made to transition so that it always stays in the vicinity of S_target without exceeding S_target. Therefore, even if there are a plurality of types of terminals 102 and the total buffer capacity S_max varies depending on the model, buffer overflow and underflow can be prevented by setting S_target to an appropriate value according to the S_max of each terminal 102.
[0132]
In this example, packets are generated in a pattern in which a plurality of frames are inserted into one packet, as shown in FIG. 15A. Alternatively, packets may be generated in a pattern in which one frame is inserted into one packet, as shown in FIG. 15B. In that case, it suffices to change the condition in step S502 of FIG. 14 to
delta + (L + hdr) <= deltaMax
and the latter half of step S504 to
delta += (L + hdr)
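The effect of counting a header for every frame can be illustrated with the following sketch, which counts how many frames fit within deltaMax under each of the two conditions. The function name frames_per_period and the flag header_per_frame are hypothetical, and the check against S_target is omitted for brevity.

```python
from typing import List


def frames_per_period(sizes: List[int], delta_max: int, hdr: int,
                      header_per_frame: bool) -> int:
    """Count the frames that fit in one period under the deltaMax condition.

    header_per_frame=False uses delta + L <= deltaMax;
    header_per_frame=True uses delta + (L + hdr) <= deltaMax.
    """
    delta = 0
    count = 0
    for frame_len in sizes:
        cost = frame_len + hdr if header_per_frame else frame_len
        if delta + cost > delta_max:
            break
        delta += cost
        count += 1
    return count
```

For frames of 100 bytes, deltaMax = 350 and hdr = 20, three frames fit when the header is not counted per frame, but only two fit when each frame carries its own header.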
[0133]
Further, in the present embodiment, for the sake of simplicity, an encoding method that generates frames at a fixed time interval Tfrm is used. However, if the decoding amount calculation algorithm A2 is designed to suit the encoding method used, for example MPEG4 video (ISO/IEC 14496-2), frames obviously need not be generated at fixed time intervals. Further, the algorithm does not necessarily need to handle data in units of frames. For example, it may handle data in units of slices, or in units of packs of an MPEG1 or MPEG2 system stream.
[0134]
On the other hand, when the value of S_target is changed partway through, the present algorithm immediately generates packets in step S502 of FIG. 14 with the new S_target as the target. FIGS. 10 and 11 show the buffer transition when the value of S_target is changed partway through. In FIG. 10, when S_target is changed to S_target2 at the time of i = 3 (S_target < S_target2 ≦ S_max), a large amount of frame data is packetized for a while after the change (delta3 and delta4 in the figure), and Sum reaches the vicinity of the new target value S_target2.
[0135]
As shown in FIG. 11, when S_target is changed to S_target3 at the time of i = 2 (S_target3 < S_target), only a small amount (delta4) or none (delta3) of the frame data is packetized. Meanwhile, the buffered data is consumed by decoding, so that Sum reaches the vicinity of the new target value S_target3. With such a mechanism, the buffer occupancy Sum in the terminal 102 can be dynamically increased or decreased according to the transmission capability of the network 103 (or the radio wave reception state of the terminal 102), which makes possible the application described below.
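The behaviour when S_target is changed partway through can be tried out with a compact simulation that combines the packetization check and the decoding subtraction. This is a sketch under simplifying assumptions: every frame has the same length, times are integers in milliseconds, the stream never ends, and the name simulate_peaks and all numeric values are illustrative only.

```python
from typing import Dict, List


def simulate_peaks(frame_size: int, ts: int, tfrm: int, t_delay: int,
                   delta_max: int, targets: Dict[int, int],
                   periods: int) -> List[int]:
    """Return Sum right after packetization for each period i.

    targets maps a period index to the S_target in force from that period on
    and must contain an entry for period 0.
    """
    s_target = targets[0]
    total = sent = decoded = 0
    grid = dts = t_delay
    peaks: List[int] = []
    for i in range(periods):
        s_target = targets.get(i, s_target)      # new target takes effect at once
        delta = 0
        while (total + frame_size <= s_target
               and delta + frame_size <= delta_max):
            total += frame_size                   # packetize one more frame
            delta += frame_size
            sent += 1
        peaks.append(total)
        if i * ts >= grid:                        # terminal has started decoding
            while dts < grid + ts and decoded < sent:
                total -= frame_size
                decoded += 1
                dts += tfrm
            grid += ts
    return peaks
```

With 10-byte frames, Ts = 1000 ms, Tfrm = 500 ms, T_delay = 2000 ms and deltaMax = 40, raising the target from 60 to 100 at i = 3 gives peaks of 40, 60, 60, 80, 100, 100, while lowering it from 60 to 30 at i = 2 gives 40, 60, 60, 40, 30, 30. In the second case no frame is packetized at i = 2 and i = 3, and decoding alone brings Sum down to the new target.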
[0136]
In FIG. 7A, consider a case where a user carrying a mobile phone 701 (corresponding to the terminal 102 in FIG. 1) moves from the area of the relay station B1 to the area of the relay station B2 as indicated by an arrow 702 in the figure. Along with the movement, the task of accepting calls from the mobile phone 701 is handed over from the relay station B1 to the relay station B2. At this time, the received radio wave intensity of the mobile phone 701 changes as shown in the graph of FIG. 7B. In this model, to simplify the explanation, three thresholds on the transmission capability of the network 103 are defined: threshold A, where the radio wave intensity changes from strong to medium (or medium to strong); threshold B, where it changes from medium to weak (or weak to medium); and threshold C, where it changes from weak to out of service (or out of service to weak).
[0137]
In FIG. 7B, assume that the user holding the mobile phone 701 has moved the distance d1 and the transmission capability has fallen below the threshold A. At this time, the mobile phone 701 changes S_target to a larger value (S_target2) and notifies the server 101 of the value, as shown in FIG. 10. This encourages the server 101 to generate and transmit new packets in preparation for the further decrease in transmission capability that is expected to follow, so that as much data as possible (denoted by Δt) is stored in the internal buffer. Even if the transmission capability falls below the threshold A, no packet transfer loss occurs while it remains equal to or higher than the threshold B. Therefore, the transmission speed can be increased.
[0138]
When the user moves on and reaches the distance d2, the transmission capability falls below the threshold B, and packet transfer loss starts to occur. At this time, the mobile phone 701 changes S_target to a small value (S_target3) and notifies the server 101 of the value, as shown in FIG. 11. This is to suppress generation and transmission of new packets by the server 101 as much as possible in preparation for the further decrease in transmission capability that is expected to follow. The reason for suppressing packet generation and transmission is as follows.
[0139]
For example, when the mobile phone 701 employs the PIAFS system of PHS as its communication system, data retransmission based on the PIAFS layer protocol, which is a link layer protocol, is performed whenever a packet transfer loss occurs. If new packets were generated and transmitted during this retransmission, they would interfere with it, which is undesirable.
[0140]
When the user moves on and reaches the distance d3, the transmission capability falls below the threshold C, and packet transfer becomes momentarily impossible. However, when the user moves further and reaches the distance d4, the transmission capability exceeds the threshold B and the handover of the task of accepting calls is completed. The mobile phone 701 therefore returns the target from S_target3 to the original value S_target and notifies the server 101 of the value, thereby increasing the data storage amount (that is, the buffer occupancy Sum). Since a handover in PHS or the like normally completes in about 2 to 3 seconds even at walking speed, securing about 3 to 4 seconds for the above Δt allows streaming playback on the mobile phone 701 to continue without interruption even when a handover occurs.
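The terminal-side behaviour at the thresholds A and B described above can be sketched as a small function that compares the previous and current transmission capability. The function name next_target, the numeric thresholds and the target values are all illustrative assumptions; how the terminal actually measures the transmission capability is outside this sketch.

```python
def next_target(prev_cap: float, cap: float, thr_a: float, thr_b: float,
                current: int, s_target: int, s_target2: int,
                s_target3: int) -> int:
    """Return the target amount to notify after a capability measurement.

    thr_a > thr_b are the thresholds A and B on the transmission capability.
    """
    if prev_cap >= thr_b > cap:     # fell below threshold B: suppress sending
        return s_target3
    if prev_cap >= thr_a > cap:     # fell below threshold A: stock up on data
        return s_target2
    if prev_cap < thr_b <= cap:     # recovered above threshold B: back to normal
        return s_target
    return current                  # no threshold crossed: keep the target
```

For a capability sequence 100, 70, 40, 5, 60, 100 with A = 80 and B = 50, the notified targets are S_target2, S_target3, S_target3, S_target and S_target in turn, which mirrors the walk from d1 to d4.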
[0141]
Incidentally, as shown in FIG. 11, when the set value of S_target is changed to a smaller value in the middle of stream distribution, the condition in step S502 of the algorithm of FIG. 14 may not be satisfied, so that frames cannot be sent for some time. When this happens frequently, by the time a packet is finally delivered to the terminal 102, the time at which the frame data in the packet should be reproduced (presentation time) may already have passed, and the data is wasted. In such a case, it is more efficient to skip the frame data whose playback time has passed, since unnecessary data then does not flow through the network 103.
[0142]
FIG. 16 is a flowchart showing another example of the function mkPacket in step S404 of FIG. 13. The function mkPacket in FIG. 16 includes steps (S601 and S602) for skipping the transmission of data whose playback time has passed when the server 101 decreases the transmission speed. That is, the algorithm of FIG. 16 differs from that of FIG. 14 only in that steps S601 and S602 are added. The other steps are exactly the same as in FIG. 14 and are given the same reference numbers. In step S601, the CPU 412 determines whether the in-th frame data to be transmitted is not the 0th frame data and has a playback time later than that of the out-th frame data, which is regarded as having been decoded by the terminal 102.
[0143]
If this determination result is true, the CPU 412 regards the data of the in-th frame as being in time for its playback time at the terminal, packetizes the data in step S503, and sends it to the terminal 102. If it is false, the in-th frame data is treated as nonexistent, and L = 0 is set in step S602. As a result, the determination in step S502 is always true, and in the packetization of step S503 the transmission frame can be advanced without copying unnecessary frame data. When such a frame skip occurs, playback at the decoder 509 skips by the time Tfrm, so information notifying the terminal 102 of this is to be described in the packets of FIGS. 15A and 15B. For example, an area for describing such reproduction time information may be provided in the header.
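The idea of skipping frames whose playback time has already passed can be sketched as below. Note that this sketch compares each playback time with a current time supplied by the caller, which is a simplification chosen for illustration; the flowchart of FIG. 16 instead compares the in-th frame with the out-th frame. The name drop_late_frames and the tuple layout are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[int, int]  # (playback time in ms, frame length in bytes)


def drop_late_frames(frames: List[Frame], now_ms: int) -> Tuple[List[Frame], int]:
    """Split frames into those still worth sending and a count of skipped ones."""
    kept: List[Frame] = []
    skipped = 0
    for playback_ms, size in frames:
        if playback_ms < now_ms:
            skipped += 1            # counterpart of setting L = 0 in step S602
        else:
            kept.append((playback_ms, size))
    return kept, skipped
```

In a real packet format the number of skipped frames, or the playback time of the first kept frame, would have to be conveyed to the terminal, for example in the header area mentioned above.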
[0144]
The algorithm shown in FIG. 16 is sufficiently effective when there is no difference in priority (importance) between frames, as in MPEG audio. However, in MPEG video, as already explained in the description of the conventional example, an I frame can be reconstructed into a meaningful image by itself, whereas P and B frames cannot be reconstructed into meaningful images without their reference frames. In this case, when thinning out frames in the algorithm of FIG. 16, preferentially transmitting the I frames that are in time for their playback time while skipping all P and B frames allows higher quality video to be provided to the terminal 102 even when the transfer rate of the network 103 is slow.
[0145]
FIG. 17 is a flowchart showing yet another example of the function mkPacket in step S404 of FIG. 13. The function mkPacket of FIG. 17 includes steps (S505′, S601, S602, S701, and S702) for skipping, when the server 101 decreases the transmission speed, the transmission of low priority data and of high priority data whose playback time has passed. That is, compared to FIG. 14, the algorithm of FIG. 17 adds steps S601, S602, S701, and S702 and replaces step S505 with step S505′. Step S505′ is obtained by adding, to the function NexTfrm of step S505, a function of detecting the priority pri. The other steps are exactly the same as in FIGS. 14 and 16 and are given the same reference numerals.
[0146]
Accordingly, compared with FIG. 16, the algorithm of FIG. 17 adds steps S701 and S702 and replaces step S505 with step S505′.
[0147]
In order to execute the algorithm of FIG. 17, a function for notifying the server 101 of information (reception status information) indicating the reception status detected by the terminal 102 is required. FIG. 18 shows a configuration example of a server client system having such a function. In FIG. 18, the terminal 102 includes a detection unit 801 that detects a reception state. Between the terminal 102 and the server 101, a notification unit 802 that notifies the server 101 of the detected reception state information is provided. The server 101 includes a holding unit 803 and holds the notified reception state information.
[0148]
Returning to FIG. 17, when the function mkPacket is called, step S701 is executed prior to step S501. In step S701, the server 101 (its CPU 412) refers to the information in the holding unit 803 (the reception state on the terminal 102 side) and determines whether or not the transmission capability of the network 103 is below the threshold B. If the transmission capability is below the threshold B, the slow flag is set to true; otherwise it is set to false.
[0149]
In step S505′, the priority of the next frame is detected. In the subsequent step S702, it is determined whether the priority of the frame data is high and the slow flag is true. If the result is affirmative, that is, if the slow flag indicating that the transfer speed of the network 103 is slow is true and the frame has a high priority, the process proceeds to step S601, where it is determined whether or not the frame has passed its playback time. If the result is negative, the process proceeds to step S602 and L = 0 is set (that is, the frame is skipped even if it is in time for its playback time). The subsequent processing is exactly the same as the processing in FIGS. 14 and 16.
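The priority-based thinning can be sketched as follows. The sketch makes explicit assumptions that are not spelled out in the text: when the slow flag is false, every frame is passed through unchanged, and lateness is judged against a current time supplied by the caller. The names FrameInfo and frames_to_send are hypothetical.

```python
from typing import List, NamedTuple


class FrameInfo(NamedTuple):
    playback_ms: int      # playback time of the frame
    size: int             # frame length in bytes
    high_priority: bool   # True for, e.g., an I frame


def frames_to_send(frames: List[FrameInfo], now_ms: int,
                   slow: bool) -> List[FrameInfo]:
    """Select the frames to transmit.

    While slow is true, low priority frames are always skipped and high
    priority frames are skipped only when their playback time has passed.
    """
    if not slow:
        return list(frames)          # assumption: no thinning when not slow
    return [f for f in frames
            if f.high_priority and f.playback_ms >= now_ms]
```

Applied to an I, B, B, P pattern in which only the I frames are marked as high priority, the function keeps just the I frames that can still be played in time while the network is slow.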
[0150]
As described above, according to the present embodiment, the terminal 102 determines a target amount according to its own buffer capacity and the transmission capability of the network 103, and further determines the delay time within a range not exceeding the value obtained by dividing the target amount by the transmission capability. Since the server 101 controls the transmission speed based on the target amount and the delay time determined by the terminal 102 in this way, transmission speed control suited to the buffer capacity and the transmission capability can be performed even if the buffer capacity of the terminal 102 differs depending on the model or the transmission capability of the network 103 varies. As a result, failure of streaming playback due to buffer underflow or overflow can be avoided. In addition, since the delay time is determined independently of the target amount, it is possible both to avoid failure of streaming playback and to reduce the waiting time at cueing.
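The upper bound on the delay time can be written out as a short calculation. The function names max_delay and choose_delay, the units (bytes and bytes per second) and the notion of a "preferred" delay are assumptions of this sketch.

```python
def max_delay(s_target: int, network_rate: float) -> float:
    """Upper bound of T_delay in seconds: target amount / transmission capability."""
    if network_rate <= 0:
        raise ValueError("network_rate must be positive")
    return s_target / network_rate


def choose_delay(preferred: float, s_target: int, network_rate: float) -> float:
    """Pick the preferred delay, clamped to the upper bound."""
    return min(preferred, max_delay(s_target, network_rate))
```

For example, with a target amount of 64000 bytes and a transmission capability of 8000 bytes per second, the bound is 8 seconds, so a preferred delay of 2 seconds can be used as is, whereas a preferred delay of 10 seconds is reduced to 8 seconds.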
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of a server client system that executes a streaming method according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration of the server 101 in FIG. 1.
FIG. 3 is a block diagram illustrating a configuration of the terminal 102 of FIG. 1.
FIG. 4 is a sequence diagram for explaining the overall operation of the system of FIG. 1.
FIG. 5 is a flowchart showing an operation of the terminal 102 of FIG. 1.
FIG. 6 is a diagram showing the contents stored in the ROM 502 of FIG. 3.
FIG. 7 is a schematic diagram showing an electric field intensity distribution (A) in a certain area and a change in transmission capability (B) accompanying the movement of the terminal.
FIG. 8 is a flowchart showing details of step S107 in FIG. 5;
FIG. 9 is a diagram illustrating how the buffer occupancy of the terminal 102 changes (approaching S_target) according to the transmission speed control performed by the server 101 in FIG. 1.
FIG. 10 is a diagram showing how the buffer occupancy of the terminal 102 changes under the transmission speed control performed by the server 101 in FIG. 1 when the value of S_target is changed to a larger value (S_target2) while the buffer occupancy is transitioning in the vicinity of S_target.
FIG. 11 is a diagram showing how the buffer occupancy of the terminal 102 changes under the transmission speed control performed by the server 101 in FIG. 1 when the value of S_target is changed to a smaller value (S_target3) while the buffer occupancy is transitioning in the vicinity of S_target.
FIG. 12 is a flowchart showing an example of a transmission rate control algorithm performed by the server 101 of FIG. 1.
FIG. 13 is a flowchart showing another example of a transmission rate control algorithm performed by the server 101 to realize the buffer transition of FIGS. 9 to 11;
FIG. 14 is a flowchart illustrating an example of the function mkPacket in step S404 in FIG. 13.
FIG. 15 is a diagram illustrating a configuration example of a packet generated by the server 101 in FIG. 1 (A is a case where a plurality of frames are inserted into one packet, and B is a case where one frame is inserted into one packet).
FIG. 16 is a flowchart showing another example of the function mkPacket in step S404 in FIG. 13.
FIG. 17 is a flowchart showing still another example of the function mkPacket in step S404 in FIG. 13.
FIG. 18 is a block diagram showing another configuration example of the server client system that executes the streaming method according to the embodiment of the present invention.
FIG. 19 is a diagram for explaining a conventional streaming method (A is a video frame, B is a transition of buffer occupancy, and C is a configuration example of a conventional terminal).
FIG. 20 is a block diagram illustrating a configuration example of a server / client system that executes a conventional streaming method;
FIG. 21 is a diagram for explaining how the transition of buffer occupancy changes by adding a reception buffer (A is before addition and B is after addition);
[Explanation of symbols]
101 Server
102 Terminal
103 network
402,507 Transmission / reception module
404 RAM
405 generation module
406 Packet generation circuit
407 Read buffer
408 Packet generation buffer
409 Transmission buffer
410,506 Network controller
411 Storage device
412, 503 CPU
413, 502 ROM
505 Receive buffer
508 Decoder buffer
509 Decoder
510 Playback module
511 Display device

Claims

A streaming method in which a server transmits stream data to a terminal through a network, and the terminal reproduces the stream data while receiving it, the method comprising:
a target amount determination step in which the terminal determines a target amount of stream data to be stored in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
a delay time determination step in which the terminal determines, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from when the terminal writes the head data of the stream to its own buffer until the data is read and playback is started;
a step in which the terminal notifies the server of the determined target amount and delay time; and
a control step in which the server controls a transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network.

The streaming method wherein, in the control step, the server controls the transmission speed so that the amount of stream data accumulated in the buffer of the terminal transitions in the vicinity of the target amount without exceeding the target amount.

The streaming method according to claim 2, wherein, in the control step, the server predicts and calculates the amount of stream data stored in the buffer of the terminal based on the transmission speed, the delay time, and the speed at which the terminal decodes the stream data.

The streaming method according to claim 1, further comprising:
a detection step in which the terminal detects that the transmission capability of the network has changed across a predetermined threshold;
a target amount changing step in which the terminal changes the target amount according to the detection result of the detection step; and
a step in which the terminal notifies the server of the changed target amount,
wherein, in the control step, when the server receives the notification of the changed target amount, the server controls the transmission speed so that the amount of stream data accumulated in the buffer of the terminal transitions in the vicinity of the changed target amount without exceeding the changed target amount.

The streaming method according to claim 4, wherein, when it is detected in the detection step that the transmission capability of the network has dropped across a first threshold, the terminal changes the target amount in an increasing direction in the target amount changing step, and
in the control step, the server controls the transmission speed in an increasing direction in accordance with the increase in the target amount.

The streaming method according to claim 5, wherein the first threshold is a value approximately midway between the maximum transmission capability that can be realized and the transmission capability at which transfer loss of stream data starts to occur.

The streaming method according to claim 4, wherein, when it is detected in the detection step that the transmission capability of the network has dropped across a second threshold that is smaller than the first threshold, the terminal changes the target amount in a decreasing direction in the target amount changing step, and
in the control step, the server controls the transmission speed in a decreasing direction in accordance with the decrease in the target amount.

The streaming method according to claim 7, wherein the second threshold value is a value corresponding to a transmission capability at which stream data transfer loss starts to occur.

The streaming method according to claim 8, wherein, when the terminal changes the target amount in a decreasing direction in the target amount changing step, the server, in the control step, sequentially compares the playback time of each frame constituting the stream to be transmitted with the current time and skips transmission of any frame whose playback time is older than the current time, thereby controlling the transmission speed in a decreasing direction.

The streaming method according to claim 8, wherein, when the terminal changes the target amount in a decreasing direction in the target amount changing step, the server, in the control step, sequentially compares the importance of each frame constituting the stream to be transmitted with a reference value,
skips transmission of all frames whose importance is less than the reference value, and
for frames whose importance is greater than or equal to the reference value, sequentially compares each playback time with the current time and skips transmission only when the playback time is older than the current time, thereby controlling the transmission speed in a decreasing direction.

A system comprising a server that transmits stream data through a network and a terminal that reproduces the stream data while receiving it, wherein
the terminal comprises:
target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading the data and starting playback; and
means for notifying the server of the determined target amount and delay time, and
the server comprises control means for controlling a transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network.

A terminal that is used together with a server that transmits stream data through a network, and that reproduces the stream data while receiving it, wherein
the server comprises control means for controlling a transmission speed based on a notified target amount and delay time when transmitting stream data to the terminal through the network, and
the terminal comprises:
target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading the data and starting playback; and
means for notifying the server of the determined target amount and delay time.

A server that is used together with a terminal that reproduces stream data while receiving it, and that transmits the stream data through a network, wherein
the terminal comprises:
target amount determining means for determining a target amount of stream data to be accumulated in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
delay time determining means for arbitrarily determining, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from writing the head data of the stream to its own buffer until reading the data and starting playback; and
means for notifying the server of the determined target amount and delay time,
the server comprises control means for controlling a transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network, and
the control means controls the transmission speed so that the amount of stream data stored in the buffer of the terminal transitions in the vicinity of the target amount without exceeding the target amount.

A program describing a streaming method in which a server transmits stream data to a terminal through a network and the terminal reproduces the stream data while receiving it, the streaming method comprising:
a target amount determination step in which the terminal determines a target amount of stream data to be stored in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
a delay time determination step in which the terminal determines, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from when the terminal writes the head data of the stream to its own buffer until the data is read and playback is started;
a step in which the terminal notifies the server of the determined target amount and delay time; and
a control step in which the server controls a transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network.

A recording medium on which is recorded a program describing a streaming method in which a server transmits stream data to a terminal through a network and the terminal reproduces the stream data while receiving it, the streaming method comprising:
a target amount determination step in which the terminal determines a target amount of stream data to be stored in its own buffer in relation to its own buffer capacity and the transmission capability of the network;
a delay time determination step in which the terminal determines, within a range not exceeding the value obtained by dividing the buffer capacity by the transmission capability, the delay time from when the terminal writes the head data of the stream to its own buffer until the data is read and playback is started;
a step in which the terminal notifies the server of the determined target amount and delay time; and
a control step in which the server controls a transmission speed based on the notified target amount and delay time when transmitting stream data to the terminal through the network.