JP3831180B2

JP3831180B2 - Video information printing apparatus, video information summarizing method, and computer-readable recording medium storing a program for causing a computer to execute the method

Info

Publication number: JP3831180B2
Application number: JP2000200271A
Authority: JP
Inventors: 孝之國枝; 由喜脇田
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2000-06-30
Filing date: 2000-06-30
Publication date: 2006-10-11
Anticipated expiration: 2020-06-30
Also published as: JP2002027365A

Description

【０００１】
【発明の属する技術分野】
本発明は、映像情報印刷装置、映像情報要約方法およびその方法をコンピュータに実行させるプログラムを記録したコンピュータ読み取り可能な記録媒体に関する。
【０００２】
【従来の技術】
例えばビデオテープやＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｃ）といった情報を映像として記録できる記録媒体（映像記録媒体）が普及している。このような映像記録媒体には、記録されている映像の内容が一見して分かるインデックスが一般的に付されており、映像記録媒体の整理や管理に利用されている。
【０００３】
現在、インデックスを簡易に作成できる構成として、特開平６−２４５１７９号公報に記載された発明、特開平８−２９４０８０号公報に記載された発明、特開平９−９３５２７号公報に記載された発明がある。このうちの特開平６−２４５１７９号公報に記載された発明は、フリーズボタンの押下によって映像信号をメモリに蓄積し、蓄積された映像信号によって表される静止画像を印刷している。
【０００４】
また、特開平８−２９４０８０号公報に記載された発明は、映像記録媒体にインデックス領域を設け、インデックス領域に映像データの代表画面、撮影日時、撮影場所などを記録することによってインデックスデータを作成する。インデックスデータとして記録された代表画面は、表示画面に表示することも紙に出力することもできる。また、この際、代表画面と対応させて撮影日時や撮影場所を表示、あるいは紙出力することができる。
【０００５】
また、特開平９−９３５２７号公報に記載された発明は、再生された映像中のシーンチェンジに基づいて（基本的に）代表画像を抽出し、抽出された代表画像を格納してインデックスを作成するものである。このインデックスは代表画像の一覧であって、表示部に表示することも印刷することも可能である。以上の特開平６−２４５１７９号公報に記載された発明、特開平８−２９４０８０号公報に記載された発明、特開平９−９３５２７号公報に記載された発明によれば、いずれも映像に含まれる画像を紙に印刷して出力することができる。そして、出力された印刷物は、映像記録媒体のインデックスとして利用することができる。
【０００６】
【発明が解決しようとする課題】
しかしながら、上記した従来の発明は、いずれもインデックスとして映像に含まれる画像を主に使用している。画像を主にしたインデックスは、映像の内容をある程度理解することに有効ではあるものの、理解のために必須の情報が欠落する、あるいは不要な情報が混在する可能性がある。
【０００７】
また、従来の発明は、情報の欠落を補うために撮影日時や撮影場所といった簡単な情報を付加する、あるいはシーンチェンジに基づいて決定した代表画像を手動で置き換えているものの、映像の内容を充分に理解できるインデックスを作成するには未だ改善の余地があるものといえる。
【０００８】
また、インデックスを利用して映像記録媒体を整理する場合、例えば、インデックスを映像記録媒体のケースなどに直接貼付する、あるいはインデックスのみをファイリングすることが考えられる。しかしながら、上記した従来の発明は、作成したデータインデックスを印刷する際の印刷レイアウトをインデックスの利用方法に合せて変更することができない。このため、利用者は、印刷されたインデックスを利用方法に合せて加工することが必要であった。
【０００９】
ところで、現在、映像の編集、加工を容易にするために映像を構造化する映像構造化装置が提唱されている。映像の構造化とは、映像を構成する画像の変化点を検出するなどして映像を分割し、分割されたまとまりごとに映像を検索できるようにする処理をいう。
【００１０】
映像を構造化する映像構造化装置の公知例としては、例えば、特開平８−３３９３７９号公報、特開平１０−１３７７３号公報があり、さらに、特願平１１−２７７３６１号として映像を木構造として管理する映像構造化装置が出願されている。構造化された映像には分割されたまとまりごとに内容を示す情報が付加されているが、この情報を印刷して映像記録媒体のインデックスとして使用する技術は未だ完成していない。
【００１１】
本発明は上述の問題点を解決するために成されたものであり、その第１の目的は、映像記録媒体の紙出力されるインデックス中に映像の内容を理解するのに十分な情報を付加し、簡易かつ確実に映像の内容が理解できるインデックスを作成することができる映像情報印刷装置、映像情報要約方法およびその方法をコンピュータに実行させるプログラムを記録したコンピュータ読み取り可能な記録媒体を提供することを目的とする。
【００１２】
その第２の目的は、インデックスのレイアウトをユーザの要求に応じて決定し、映像記録媒体ケースへの貼付、あるいはファイリングといった使用方法に最適なインデックスを作成することができる映像情報印刷装置、映像情報要約方法およびその方法をコンピュータに実行させるプログラムを記録したコンピュータ読み取り可能な記録媒体を提供することを目的とする。
【００１３】
【課題を解決するための手段】
上記した課題を解決し、目的を達成するため、請求項１に記載の発明にかかる映像情報印刷装置は、映像に関する情報を要約すると共に要約された情報を印刷する映像情報印刷装置であって、情報要約時の条件である要約条件および要約情報印刷時の印刷書式を設定する条件設定手段と、前記映像の構造を木構造で表す映像構造インデックスと前記要約条件とに基づいて前記映像に関する情報を要約する映像情報要約手段と、前記映像情報要約手段によって要約された情報を前記印刷書式に基づいて整形する印刷情報整形手段と、前記映像情報要約手段によって要約され、かつ、前記印刷情報整形手段によって整形された情報を印刷物として出力する印刷出力手段と、を備えることを特徴とする。
【００１４】
この請求項１に記載の発明によれば、木構造の映像構造インデックスを利用し、映像に関する情報を要約条件に沿って要約することができる。また、要約した情報を印刷書式に沿って印刷することができる。
【００１５】
請求項２に記載の発明にかかる映像情報印刷装置は、前記要約条件が、前記映像構造インデックスで定義されている前記映像を構成する各セグメントのフレーム画像を代表する代表フレーム群から、前記各代表フレームを集約した参照フレームを抽出するための抽出条件を含むことを特徴とする。
【００１６】
この請求項２に記載の発明によれば、映像構造インデックスを利用して映像をあらわす画像（代表フレーム群）を映像情報から抽出することにより効率的に映像情報を検索し、抽出条件に従って代表フレーム群から参照フレームを抽出することができる。
【００１７】
請求項３に記載の発明にかかる映像情報印刷装置は、前記要約条件と前記印刷書式とを登録する登録手段を備えることを特徴とする。
【００１８】
この請求項３に記載の発明によれば、要約条件と印刷書式とを登録することができ、映像情報印刷のたびに要約条件や印刷書式を入力する必要をなくすことができる。また、オペレータが、映像情報印刷装置に新たな要約条件や印刷書式を記憶させることができる。
【００１９】
請求項４に記載の発明にかかる映像情報印刷装置は、前記映像情報要約手段が、映像から抽出された画像の特徴量を取得する特徴量取得手段と、画像間において、前記特徴量取得手段によって取得された特徴量の差を取得する特徴量差取得手段とを備え、前記特徴量差取得手段によって取得された特徴量の差が所定の値以下である複数の画像を１つの画像で代表することを特徴とする。
【００２０】
この請求項４に記載の発明によれば、抽出された画像を１つの画像で表すことができる。また、特徴量の差が小さい画像を１つの画像で代表することにより、抽出された画像全体に含まれる情報量の低下を抑えることができる。
【００２１】
請求項５に記載の発明にかかる映像情報印刷装置は、前記印刷情報整形手段が、印刷時のレイアウトを複数記憶するレイアウト記憶手段を有し、前記レイアウト記憶手段に記憶されているレイアウトを使用して前記映像情報要約手段によって要約された情報を整形することを特徴とする。
【００２２】
この請求項５に記載の発明によれば、印刷時のレイアウトを予め記憶しておくことができ、印刷のたびにレイアウトを入力する必要をなくすことができる。また、オペレータが、新たなレイアウトを映像情報印刷装置に記憶させることができる。
【００２３】
請求項６に記載の発明にかかる映像情報印刷装置は、前記レイアウト記憶手段が、映像メディアの識別ラベルを対象としたレイアウトを記憶することを特徴とするものである。
【００２４】
この請求項６に記載の発明によれば、要約された映像に関する情報を映像メディアの識別ラベルに最適なレイアウトで印刷することができる。
【００２５】
請求項７に記載の発明にかかる映像情報印刷装置は、さらに、前記印刷情報整形手段によって整形された情報に含まれる画像の映像における位置に関する映像位置情報を前記印刷情報に付す位置情報付加手段を備え、前記印刷出力手段は、前記印刷情報と共に前記映像位置情報を印刷物として出力することを特徴とする。
【００２６】
この請求項７に記載の発明によれば、印刷物中に映像における画像の位置に関する情報を付加することができる。
【００２７】
請求項８に記載の発明にかかる映像情報要約方法は、映像に関する情報を要約すると共に、印刷可能な形式に整形する映像情報要約方法であって、情報要約時の条件である要約条件および要約情報印刷時の印刷書式を設定する条件設定工程と、前記映像の構造を木構造で表す映像構造インデックスと前記要約条件とに基づいて前記映像に関する情報を要約する映像情報要約工程と、前記情報要約工程において要約された情報を前記印刷書式に基づいて整形する印刷情報整形工程と、を含むことを特徴とする。
【００２８】
この請求項８に記載の発明によれば、木構造の映像構造インデックスを利用し、映像に関する情報を要約条件に沿って要約することができる。
【００２９】
請求項９に記載の発明にかかる映像情報要約方法は、前記要約条件が、前記映像構造インデックスで定義されている前記映像を構成する各セグメントのフレーム画像を代表する代表フレーム群から、前記各代表フレームを集約した参照フレームを抽出するための抽出条件を含むことを特徴とする。
【００３０】
この請求項９に記載の発明によれば、映像構造インデックスを利用して映像をあらわす画像（代表フレーム群）を映像情報から抽出することにより効率的に映像情報を検索し、抽出条件に従って代表フレーム群から参照フレームを抽出することができる。
【００３１】
請求項１０に記載の発明にかかる映像情報要約方法は、前記情報要約工程が、映像から抽出された画像の特徴量を取得する特徴量取得工程と、画像間において、前記特徴量取得工程において取得された特徴量の差を取得する特徴量差取得工程とを含み、前記特徴量差取得工程において取得された特徴量の差が所定の値以下である複数の画像を１つの画像で代表することを特徴とする。
【００３２】
この請求項１０に記載の発明によれば、抽出された画像を１つの画像で表すことができる。また、特徴量の差が小さい画像を１つの画像で代表することにより、抽出された画像全体に含まれる情報量の低下を抑えることができる。
【００３３】
請求項１１に記載の発明にかかる記録媒体は、前記請求項８〜１０に記載の映像情報要約方法のいずれか一つをコンピュータに実行させるプログラムを記録することができる。
【００３４】
【発明の実施の形態】
以下に添付図面を参照して、この発明にかかる映像情報印刷装置の好適な実施の形態を詳細に説明する。なお、本明細書の実施の形態でいう映像の要約とは、構造化された映像に基づいて映像に含まれるフレーム画像を、フレーム画像に付加されている文書と共に抽出する処理をいうものとする。
【００３５】
図１は、本実施の形態の映像情報印刷装置を説明するためのブロック図である。本実施の形態の映像情報印刷装置は、映像（映像情報１０２）を要約して要約情報を作成し、作成された要約情報を印刷する映像情報印刷装置である。
【００３６】
図示した映像情報印刷装置は、映像構造解釈部１０３、入力部１０４、登録部１１１を備えた映像要約生成部１０５、印刷整形部１０６、印刷言語生成部１０７、映像参照フレーム抽出部１０９、バーコード付加部１１０、印刷部１０８を備えている。なお、本実施の形態の映像情報印刷装置で処理される映像は、映像情報（映像本体）１０２と、映像情報１０２に付され、かつ、映像の構造を木構造で表す映像構造インデックス１０１とからなっている。
【００３７】
図２は、本実施の形態の映像情報印刷装置で使用する映像構造インデックス１０１を説明するための図である。映像構造インデックス１０１は、階層的な木構造として構成されている。木構造の階層は、情報の詳細さの度合いによって分けられていて、下位の階層ほど映像に関する詳細な情報を保持している。図示した木構造は、記録メディア全体に対して１つのルート・ディレクトリ２０１を持つ映像管理データを表すものである。
【００３８】
映像構造インデックス１０１は、映像情報１０２を構造化して表すデータである。映像情報１０２の構造化とは、映像情報で表される映像を連続するフレーム画像（動画を分割して作成された画像）としてとらえ、フレーム画像を所定の単位にまとめる処理をいう。構造化によってまとめられた複数のフレーム画像のまとまりは、セグメントとよばれる。映像構造インデックス１０１によれば、各セグメントがどのような画像のまとまりであるかを示す情報を得ることができる。
【００３９】
本実施の形態では、撮影ごとにセグメントを作成し、撮影ごとにまとめられたセグメントをさらにカット・ショット変化点ごとにまとめてセグメントを作成した。カット・ショット変化点とは、映像の記録開始から起算される記録時間、あるいは画像記録数に基づいて配列されたフレーム画像において、隣り合って配列され、かつ特徴量の差が所定の値よりも大きいフレーム画像間に設けられるものである。なお、特徴量としては、例えば以下のものがある。
・カラーヒストグラム
・空間色分布
・空間エッジ分布
・空間テクスチャ分布
・オブジェクト形状
【００４０】
図２中、Ｘは環境情報を示し、Ｙはカメラ情報を表し、Ｚは代表フレームの静止画特徴量をそれぞれ表すものとする。環境情報Ｘとは、撮影時の外部環境に関する情報であって、日時、位置（緯度および経度）、高度、気温、湿度に関する情報を含んでいる。なお、映像構造インデックス１０１を作成する映像構造化装置は、このような情報を取得するための時計（ＧＭＴ時間を標準とする）、ＧＰＳ（Global Positioning System）、気圧センサ、温度センサ、湿度センサを備えている。
【００４１】
また、カメラ情報Ｙとは、あるセグメントに含まれるフレーム画像（動画像を分割して作成された画像）を撮影したときの焦点距離、絞り値、シャッタースピード、ピント合焦距離といった情報を含んでいる。このような情報のうち、焦点距離に関する情報からは、フレーム画像の撮影対象物が風景か、あるいは人物や物といったオブジェクトか、広角撮影かあるいは望遠撮影かといった点ついて推測することができる。また、絞り値に関する情報からは、撮影対象物が風景か、あるいはオブジェクトかといった判断に加えて、撮影場所の明暗についても推測することができる。
【００４２】
また、シャッタースピードに関する情報からは、撮影対象物が静止したものか、あるいは移動しているものかについて、さらに撮影場所の明暗について推測することができる。そして、ピント合焦距離に関する情報からは、映像記録装置のカメラ部分と撮影対象物との距離がおおよそ推測できる。上記した情報を複合的に解釈することにより、例えば、シャッタースピードに関わらず絞り値が低く、合焦距離が近くて焦点距離が遠いこと場合には、オブジェクトを撮影した可能性が高いというような推測ができるようになる。
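[0043]
Furthermore, the still image feature amount Z is information representing the feature amount of a frame image when it is viewed as a still image.
[0044]
The cut/shot general information 202A and the cut/shot information 203A located directly under the root directory 201 are information common to all of the video information 102. The cut/shot general information 202B is general information about the cut/shot change points of the segments managed by the cut/shot information 203A.
[0045]
The tree structure shown in FIG. 2 contains two per-shooting segments that were created as shooting progressed, and cut/shot information 203B and 203C is attached to these two segments, respectively. Camera information Y is added to both segments.
[0046]
Below each of the segments to which the cut/shot information 203B and 203C is added, the four cut/shot segments contained in that segment are arranged. The four segments under 203B were created by dividing the segment carrying the cut/shot information 203B at its cut/shot change points, and cut/shot information 203D, 203E, 203F, and 203G is placed on them, respectively. Cut/shot information 203H, 203I, 203J, and 203K is added to the four segments arranged below the segment carrying the cut/shot information 203C. Camera information Y is added to each of the cut/shot information 203D to 203G and 203H to 203K.
[0047]
In addition, the cut/shot information 203D to 203G and 203H to 203K is accompanied by information (representative frame information) 204A, 204B, 204C, 204D and 204E, 204F, 204G, 204H, respectively, on the representative frame (also called key frame) image that represents the frame images contained in the corresponding segment. The still image feature amount Z of each representative frame is added to the representative frame information 204A to 204D and 204E to 204H.
[0048]
Besides the still image feature amount Z of each representative frame, the representative frame information 204A to 204D and 204E to 204H contains a document describing that representative frame. This document is either written by an operator when the video structure index 101 is created, or is a sample document added automatically according to the feature amounts. Within the video structure index 101, the representative frame information 204A to 204D and 204E to 204H, including these documents, is the printable information.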
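[0049]
The input unit 104 of the video information printing apparatus according to the present embodiment accepts the summary condition (denoted condition A in the figure), which is used by the video summary generation unit 105 for summarization, and the print format (denoted condition B in the figure), which is used by the print shaping unit 106 for shaping the information. The entered summary condition and print format are set in the video information printing apparatus. The video structure interpretation unit 103 interprets the structure of the video structure index 101. The video summary generation unit 105 and the video reference frame extraction unit 109 summarize the video information 102 based on the summary condition and on the video structure index 101 as interpreted by the video structure interpretation unit 103. The print shaping unit 106 shapes the information (video summary) generated by the video summary generation unit 105 and the video reference frame extraction unit 109 into a printable form based on the print format.
[0050]
In the present embodiment, the summary condition specifies the proportion of the video summary occupied by images and the proportion occupied by text, the precision of summarization, the extraction condition for reference frames (the frame images to be printed) used by the video reference frame extraction unit 109, and the number of reference frames to be extracted. The print format specifies the layout used at printing. The summary condition and the print format can be entered from the input unit 104, or registered in advance in the registration unit 111 of the video summary generation unit 105.
[0051]
Of these summary conditions, the precision of summarization is set by specifying a level of the tree structure. The extraction condition for reference frames is set by keywords or images that the operator enters at the input unit 104, or by keywords or images registered in the registration unit 111.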
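[0052]
The keywords and images that have been set are input to the video reference frame extraction unit 109 via the video summary generation unit 105. The video reference frame extraction unit 109 searches the frame images (in the present embodiment, the representative frames) based on the keywords and extracts the matching representative frames. Alternatively, it extracts images similar to the image set as the extraction condition. When more images are extracted than the number of reference frames given in the summary condition, the video reference frame extraction unit 109 narrows down the representative frames by the method shown in FIG. 3.
[0053]
That is, according to FIG. 3, when the seven representative frames a, b, c, d, e, f, and g are extracted at the initial stage, the similarity between the representative frames a to g is compared. The similarity may be determined, for example, from the feature amounts. If the comparison finds that representative frame a and representative frame e are similar (for example, the difference between their feature amounts is smaller than a preset criterion), the video reference frame extraction unit 109 keeps only one of them. The frame that remains becomes representative frame h; in other words, representative frame h represents both a and e. Likewise, if representative frame c and representative frame f are judged to be similar, only one of them is kept, and c and f are represented by representative frame j. Through this processing, the seven representative frames a to g are narrowed down to the five representative frames h, i, j, k, and l.
[0054]
The video reference frame extraction unit 109 then applies to the five representative frames h to l the same narrowing process that was applied to a to g, reducing them to the three representative frames m, n, and o. When the value 3 matches the number of reference frames to be extracted that is set in the summary condition, the video reference frame extraction unit 109 outputs information indicating that representative frames m to o were extracted to the video summary generation unit 105 and the print shaping unit 106.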
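The repeated narrowing of FIG. 3 (seven frames to five, then five to three) can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which frame of a similar pair is kept, nor how a second pass finds new similar pairs, so this sketch keeps the earlier frame and loosens the threshold by a hypothetical factor `step` on each pass. The function names, the L1 distance, and the dictionary layout are likewise assumptions.

```python
from itertools import combinations

def feature_distance(a, b):
    """L1 distance between two equal-length feature vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def narrow_once(frames, threshold):
    """One pass: for each pair of similar frames, keep only the earlier one."""
    dropped = set()
    for (i, fa), (j, fb) in combinations(enumerate(frames), 2):
        if i in dropped or j in dropped:
            continue
        if feature_distance(fa["feature"], fb["feature"]) <= threshold:
            dropped.add(j)  # frame i now represents frame j as well
    return [f for k, f in enumerate(frames) if k not in dropped]

def narrow_to(frames, target, threshold, step=1.5, max_rounds=20):
    """Repeat passes until at most `target` frames remain.

    Loosening the threshold between passes is an assumption of this sketch.
    """
    for _ in range(max_rounds):
        if len(frames) <= target:
            break
        frames = narrow_once(frames, threshold)
        threshold *= step
    return frames
```

With seven frames whose one-dimensional features are 0, 10, 20, 30, 0.5, 20.5, and 40, a first pass at threshold 1 merges the two near-duplicate pairs and leaves five frames, and later passes reduce them to three.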
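[0055]
The video summary generation unit 105 compares the amounts of the representative frames m to o and of the documents attached to them against the proportions of images and text set in the summary condition. These amounts may be obtained, for example, by simulating the area ratio occupied by the extracted reference frames and the area ratio occupied by text in the printed image. In that case, the proportions of images and text in the summary condition are specified as the area ratios that images and text occupy in the printed image.
[0056]
If the proportion occupied by the representative frames extracted as reference frames (hereinafter, reference frames) and the proportion occupied by text do not match the summary condition, the video summary generation unit 105 adjusts them until they do. The adjustment is made, for example, by deleting an extracted reference frame and giving its space to text, or by deleting information expressed as text. When deleting text information, the video summary generation unit 105 may, for example, extract important phrases from the text, judge a document containing many important phrases to be more important and one containing few to be less important, and delete the less important documents first.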
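The text-trimming rule described here, in which documents with fewer important phrases are deleted first, can be illustrated with a small sketch. The source does not specify how important phrases are extracted or how the text area is measured, so `key_phrases` is assumed to be given and a character budget `max_chars` stands in for the text area; both names are hypothetical.

```python
def importance(document, key_phrases):
    """Count occurrences of the key phrases in a document (case-insensitive)."""
    text = document.lower()
    return sum(text.count(p.lower()) for p in key_phrases)

def trim_documents(documents, key_phrases, max_chars):
    """Delete the least important documents until the text fits max_chars."""
    kept = list(documents)
    while kept and sum(len(d) for d in kept) > max_chars:
        # Remove the document with the fewest key-phrase hits first.
        kept.remove(min(kept, key=lambda d: importance(d, key_phrases)))
    return kept
```

The order of the surviving documents is preserved, so the printed text still follows the order of the reference frames it describes.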
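[0057]
The video summary generation unit 105 then outputs information about the reference frames and documents whose proportions have been adjusted in this way to the print shaping unit 106. The print shaping unit 106 arranges the reference frames and documents according to the layout set as the print format and outputs the result to the print language generation unit 107. At the same time, the print shaping unit 106 takes the position of each reference frame in the video from the information about the reference frames and outputs it to the barcode adding unit 110. The barcode adding unit 110 converts the position of the reference frame in the video into barcode information and outputs it to the print language generation unit 107. The position of a reference frame in the video may be its ordinal position among the frame images that make up the video, or a temporal position expressed as how many hours (minutes, seconds) after the start of playback the reference frame is played.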
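The two ways of expressing a reference frame's position, as a frame number or as elapsed playback time, can be related by a short sketch. The frame rate of 30 frames per second and the payload format are assumptions made for illustration; the source does not specify a barcode symbology or data format, and no barcode library is used here.

```python
def frame_to_timecode(frame_index, fps=30):
    """Convert a frame number to HH:MM:SS elapsed from the start of playback."""
    total_seconds = frame_index // fps
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def barcode_payload(frame_index, fps=30):
    """Build a hypothetical payload string carrying both position forms."""
    timecode = frame_to_timecode(frame_index, fps).replace(":", "")
    return f"F{frame_index:08d}T{timecode}"
```

A payload built this way could then be handed to whatever barcode encoder the printing side provides.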
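[0058]
FIGS. 4(a), 4(b), and 4(c) illustrate layouts that can be set as the print format. FIG. 4(a) is an image-centered layout. FIG. 4(b) is a mixed image and text layout in which images and documents are allotted in relatively similar proportions. FIG. 4(c) is a text-centered layout.
[0059]
The video information printing apparatus according to the present embodiment stores a plurality of layouts, including those illustrated in FIG. 4, in the print shaping unit 106, for example. In the present embodiment, layouts intended for identification labels (sticky notes) of video media such as video tapes and DVDs are also stored in the print shaping unit 106. When the print format is entered, a list of layouts is shown on a display unit (not shown) and the operator selects any one of them. The layouts are not limited to those stored in advance; the operator can also register layouts in the registration unit 111.
[0060]
The reference frames and documents shaped according to the layout are output from the print shaping unit 106 to the print language generation unit 107. The barcode adding unit 110 outputs the position of each reference frame in the video, converted into a barcode, to the print language generation unit 107. The print language generation unit 107 converts the shaped reference frames and documents, together with the barcodes, into the print language used by the printing unit 108 and outputs them to the printing unit 108. The printing unit 108 prints the reference frames, documents, and barcodes on the same sheet. A dedicated barcode area may be provided in the layout for this purpose.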
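[0061]
Next, the processing performed by the video information printing apparatus described above is explained with reference to the flowchart in FIG. 5. The processing shown in FIG. 5 includes the video information summarizing method according to an embodiment of the present invention. As shown in FIG. 5, the video information printing apparatus displays a screen for entering the summary condition and the print format on a display unit (not shown) and accepts both (step S501).
[0062]
The video information printing apparatus then determines whether the operator has specified a summary condition (such as a keyword) or a print format (such as a layout) registered in the registration unit 111 (step S502). If registered data has been specified (step S502: Yes), it reads the registered data from the registration unit 111 (step S510). If no registered data has been specified (step S502: No), it interprets the structure of the video represented by the video information 102 based on the video structure index 101 (step S503).
[0063]
Next, the video information printing apparatus executes the video summary creation subroutine (step S504). This subroutine is the processing performed by the video summary generation unit 105 and the video reference frame extraction unit 109. It includes extracting reference frames according to the summary condition, adjusting the proportions of the reference frames and the documents attached to them, and identifying the position of each reference frame in the video.
[0064]
Next, the video information printing apparatus arranges the reference frames and documents obtained by the video summary creation subroutine in the layout based on the print format, shaping them into the form used for printing (step S505). The barcode adding unit 110 converts the information on the position of each reference frame into a barcode, which is added to the reference frames and documents (step S506). The shaped reference frames, documents, and barcodes are converted into the language used by the printing unit 108 to generate the print language (step S507) and are printed by the printing unit 108 (step S508).
[0065]
Finally, the video information printing apparatus determines whether the operator wishes to register the summary condition and print format used this time (step S509). If so (step S509: Yes), it registers the summary condition and print format in the registration unit 111 (step S511). If not (step S509: No), the processing ends without registration.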
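The flow of FIG. 5 (steps S501 to S511) can be outlined as a small driver function. This is a sketch only: each processing unit is passed in as a callable under a hypothetical key in `steps`, the `request` and `registry` dictionaries are illustrative, and the sketch assumes that processing continues to step S503 after registered data is read in step S510, which the text does not state explicitly.

```python
def run_print_job(request, registry, steps, save_as=None):
    """Run the S501-S511 flow and return whatever the print step returns."""
    # S501/S502: use the entered conditions, or a registered preset if named.
    if request.get("preset") is not None:
        conditions = registry[request["preset"]]             # S510
    else:
        conditions = request["conditions"]
    structure = steps["interpret"]()                         # S503
    summary = steps["summarize"](structure, conditions)      # S504
    page = steps["layout"](summary, conditions)              # S505
    page = steps["add_barcode"](page)                        # S506
    print_data = steps["to_print_language"](page)            # S507
    output = steps["print"](print_data)                      # S508
    if save_as is not None:                                  # S509
        registry[save_as] = conditions                       # S511
    return output
```

Passing the units in as callables keeps the control flow visible and lets each stage be replaced by a stub when trying the sketch out.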
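[0066]
The video information summarizing method described in the present embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. The program is recorded on a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM, MO, or DVD, and is executed by being read from the recording medium by the computer. The program can also be distributed via such a recording medium, or as a transmission medium over a network such as the Internet.
[0067]
By using the video structure index to extract the images and documents that represent the video from the video information, the video information printing apparatus according to the present embodiment can search the video information efficiently and extract reference frames. It can also obtain the documents that explain the reference frames relatively easily.
[0068]
Because the video is summarized according to the summary condition set by the operator and printed according to the print format set by the operator, the selection of reference frames and documents can reflect the user's purpose and preferences, as can the layout in which they are printed. Furthermore, because the operator can register summary conditions and print formats, reference frames and documents extracted under any summary condition can be printed in any layout. If the printed matter is used as the index of a video recording medium, an original index can be created for each user.
[0069]
Because the video information printing apparatus according to the present embodiment outputs the summarized video information on paper, the video information can be browsed on paper. The summary information of a video can also be stored and managed on paper, which is convenient, for example, for keeping the recipes from a cooking program.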
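[0070]
[Effect of the invention]
As described above, the invention according to claim 1 summarizes information about a video using a tree-structured video structure index, and can therefore provide a video information printing apparatus that obtains information sufficient for understanding the content of the video. Because information about the video can be summarized according to the summary condition, it can provide a video information printing apparatus that obtains summary information suited to the purpose. Because the summary information can be printed according to the print format, it can provide a video information printing apparatus that outputs on paper information sufficient for understanding the content of the video. It can also provide a video information printing apparatus that determines the layout of the index according to the user's request and creates an index best suited to the intended use, such as attaching it to a video recording medium case or filing it. In addition, the invention described in Embodiment 1 allows information including the video summary to be viewed at a glance on paper, and can provide a video information printing apparatus that brings the convenience of paper (it can be read anywhere and without special equipment such as a playback device) to the handling of video information. It can further provide a video information printing apparatus that converts video into a document.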
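[0071]
The invention according to claim 2 can provide a video information printing apparatus that searches video information efficiently by using the video structure index to extract the images representing the video (the representative frame group) from the video information, and that extracts reference frames from the representative frame group according to the extraction condition.
[0072]
Because the invention according to claim 3 allows the summary condition and the print format to be registered, it can provide a highly operable video information printing apparatus that does not require them to be entered each time video information is printed. In addition, when the operator registers a summary condition and print format that reflect the user's preferences, it can provide a video information printing apparatus that produces an index reflecting those preferences.
[0073]
The invention according to claim 4 can represent extracted images by a single image and narrow down the extracted images to the required number. It can therefore provide a video information printing apparatus that extracts any desired number of images from the information representing a video.
[0074]
The invention according to claim 5 can provide a highly operable video information printing apparatus that does not require a layout to be entered at each printing. In addition, when the operator registers a layout that reflects the user's preferences, it can provide a video information printing apparatus that produces an index printed in that layout.
[0075]
The invention according to claim 6 can print the summarized information about a video in a layout best suited to the identification label of a video medium, and can therefore provide a video information printing apparatus that produces printed matter best suited to such labels.
[0076]
The invention according to claim 7 can add information about the position of an image in the video to the printed matter, so that the content shown by the image can easily be located in the video. It can therefore provide a video information printing apparatus that produces an index convenient for processing and editing video.
[0077]
The invention according to claim 8 summarizes information about a video using a tree-structured video structure index, and can therefore provide a video information summarizing method that obtains information sufficient for understanding the content of the video. Because information about the video can be summarized according to the summary condition, it can provide a video information summarizing method that obtains summary information suited to the purpose.
[0078]
The invention according to claim 9 can provide a video information summarizing method that searches video information efficiently by using the video structure index to extract the images representing the video (the representative frame group) from the video information, and that extracts reference frames from the representative frame group according to the extraction condition.
[0079]
The invention according to claim 10 can represent extracted images by a single image and narrow down the extracted images to the required number. It can therefore provide a video information summarizing method that extracts any desired number of images from the information representing a video.
[0080]
Because the recording medium according to claim 11 can record a program that causes a computer to execute any one of the video information summarizing methods according to claims 8 to 10, it enables a computer to execute any one of those methods.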
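[Brief description of the drawings]
[FIG. 1] A block diagram for explaining a video information printing apparatus according to an embodiment of the present invention.
[FIG. 2] A diagram for explaining the video structure index used in the video information printing apparatus according to an embodiment of the present invention.
[FIG. 3] A diagram for explaining the narrowing down of reference frames.
[FIG. 4] A diagram illustrating layouts set as the print format in an embodiment of the present invention.
[FIG. 5] A diagram for explaining the video information summarizing method of the present invention.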
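[Explanation of symbols]
101 Video structure index
102 Video information
103 Video structure interpretation unit
104 Input unit
105 Video summary generation unit
106 Print shaping unit
107 Print language generation unit
108 Printing unit
109 Video reference frame extraction unit
110 Barcode adding unit
111 Registration unit

[0001]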
[Technical field to which the invention belongs]
The present invention relates to a video information printing apparatus, a video information summarizing method, and a computer-readable recording medium recording a program for causing a computer to execute the method.
[0002]
[Prior art]
Recording media capable of recording information as video (video recording media), such as video tapes and DVDs (Digital Versatile Discs), are in widespread use. Such a video recording medium is generally given an index that shows the content of the recorded video at a glance, and the index is used for organizing and managing video recording media.
[0003]
Configurations that make it easy to create an index currently include the inventions described in JP-A-6-245179, JP-A-8-294080, and JP-A-9-93527. Of these, the invention described in JP-A-6-245179 stores a video signal in a memory when a freeze button is pressed and prints the still image represented by the stored video signal.
[0004]
The invention described in JP-A-8-294080 provides an index area on the video recording medium and creates index data by recording a representative screen of the video data, the shooting date and time, the shooting location, and the like in the index area. The representative screen recorded as index data can be displayed on a display screen or output on paper. The shooting date and time and the shooting location can be displayed or output on paper together with the corresponding representative screen.
[0005]
The invention described in JP-A-9-93527 (basically) extracts representative images based on scene changes in the reproduced video and stores the extracted representative images to create an index. This index is a list of representative images and can be displayed on a display unit or printed. With any of the inventions described in JP-A-6-245179, JP-A-8-294080, and JP-A-9-93527, images contained in the video can be printed on paper, and the printed output can be used as an index of the video recording medium.
[0006]
[Problems to be solved by the invention]
However, all of the conventional inventions described above rely mainly on images contained in the video as the index. An index made up mainly of images helps the content of the video to be understood to some extent, but information essential for understanding may be missing, or unnecessary information may be mixed in.
[0007]
The conventional inventions do add simple information such as the shooting date and time and the shooting location to compensate for missing information, or allow a representative image chosen from scene changes to be replaced manually. Even so, there is still room for improvement in creating an index from which the content of the video can be fully understood.
[0008]
When video recording media are organized using indexes, the index may, for example, be attached directly to the case of the video recording medium, or only the indexes may be filed. However, the conventional inventions described above cannot change the print layout of the created index data to suit the way the index will be used. The user therefore has to rework the printed index to suit its use.
[0009]
Meanwhile, video structuring apparatuses that structure video to make it easier to edit and process have been proposed. Video structuring refers to processing that divides a video, for example by detecting change points in the images that make up the video, so that the video can be searched by each of the resulting groups.
[0010]
Known examples of video structuring apparatuses include JP-A-8-339379 and JP-A-10-13773, and a video structuring apparatus that manages video as a tree structure has been filed as Japanese Patent Application No. 11-277361. Information indicating the content of each group is added to structured video, but a technique for printing this information and using it as the index of a video recording medium has not yet been established.
[0011]
The present invention has been made to solve the problems described above. Its first object is to provide a video information printing apparatus, a video information summarizing method, and a computer-readable recording medium storing a program for causing a computer to execute the method, which add information sufficient for understanding the content of a video to the index of a video recording medium output on paper, and can thereby create an index from which the content of the video can be understood easily and reliably.
[0012]
Its second object is to provide a video information printing apparatus, a video information summarizing method, and a computer-readable recording medium storing a program for causing a computer to execute the method, which determine the layout of the index according to the user's request and can thereby create an index best suited to the intended use, such as attaching it to a video recording medium case or filing it.
[0013]
[Means for Solving the Problems]
In order to solve the problems described above and achieve the objects, the video information printing apparatus according to claim 1 is a video information printing apparatus that summarizes information about a video and prints the summarized information, comprising: condition setting means for setting a summary condition, which is a condition applied when summarizing information, and a print format applied when printing the summary information; video information summarizing means for summarizing information about the video based on the summary condition and on a video structure index that represents the structure of the video as a tree structure; print information shaping means for shaping the information summarized by the video information summarizing means based on the print format; and print output means for outputting, as printed matter, the information summarized by the video information summarizing means and shaped by the print information shaping means.
[0014]
According to the invention of claim 1, information about a video can be summarized according to the summary condition by using the tree-structured video structure index. The summarized information can also be printed according to the print format.
[0015]
In the video information printing apparatus according to claim 2, the summary condition includes an extraction condition for extracting, from a representative frame group that represents the frame images of each segment constituting the video as defined by the video structure index, reference frames that consolidate the representative frames.
[0016]
According to the invention of claim 2, video information can be searched efficiently by using the video structure index to extract the images representing the video (the representative frame group) from the video information, and reference frames can be extracted from the representative frame group according to the extraction condition.
[0017]
The video information printing apparatus according to claim 3 comprises registration means for registering the summary condition and the print format.
[0018]
According to the invention of claim 3, the summary condition and the print format can be registered, which removes the need to enter them each time video information is printed. The operator can also store new summary conditions and print formats in the video information printing apparatus.
[0019]
In the video information printing apparatus according to claim 4, the video information summarizing means comprises feature amount acquiring means for acquiring the feature amounts of images extracted from the video, and feature amount difference acquiring means for acquiring, between images, the difference between the feature amounts acquired by the feature amount acquiring means, and a plurality of images for which the difference acquired by the feature amount difference acquiring means is equal to or less than a predetermined value are represented by a single image.
[0020]
According to the invention of claim 4, extracted images can be represented by a single image. Because it is images with a small difference in feature amount that are represented by a single image, the loss of information contained in the extracted images as a whole can be kept small.
[0021]
In the video information printing apparatus according to claim 5, the print information shaping means has layout storage means for storing a plurality of print layouts, and shapes the information summarized by the video information summarizing means using a layout stored in the layout storage means.
[0022]
According to the invention of claim 5, print layouts can be stored in advance, which removes the need to enter a layout at each printing. The operator can also store new layouts in the video information printing apparatus.
[0023]
In the video information printing apparatus according to claim 6, the layout storage means stores a layout intended for the identification label of a video medium.
[0024]
According to the invention of claim 6, the summarized information about a video can be printed in a layout best suited to the identification label of a video medium.
[0025]
The video information printing apparatus according to claim 7 further comprises position information adding means for attaching to the print information video position information, which concerns the position in the video of an image included in the information shaped by the print information shaping means, and the print output means outputs the video position information together with the print information as printed matter.
[0026]
According to the invention of claim 7, information about the position of an image in the video can be added to the printed matter.
[0027]
The video information summarizing method according to claim 8 is a video information summarizing method that summarizes information about a video and shapes it into a printable form, comprising: a condition setting step of setting a summary condition, which is a condition applied when summarizing information, and a print format applied when printing the summary information; a video information summarizing step of summarizing information about the video based on the summary condition and on a video structure index that represents the structure of the video as a tree structure; and a print information shaping step of shaping the information summarized in the information summarizing step based on the print format.
[0028]
According to the invention of claim 8, information about a video can be summarized according to the summary condition by using the tree-structured video structure index.
[0029]
In the video information summarizing method according to claim 9, the summary condition includes an extraction condition for extracting, from a representative frame group that represents the frame images of each segment constituting the video as defined by the video structure index, reference frames that consolidate the representative frames.
[0030]
According to the invention of claim 9, video information can be searched efficiently by using the video structure index to extract the images representing the video (the representative frame group) from the video information, and reference frames can be extracted from the representative frame group according to the extraction condition.
[0031]
In the video information summarizing method according to claim 10, the information summarizing step includes a feature amount acquiring step of acquiring the feature amounts of images extracted from the video, and a feature amount difference acquiring step of acquiring, between images, the difference between the feature amounts acquired in the feature amount acquiring step, and a plurality of images for which the difference acquired in the feature amount difference acquiring step is equal to or less than a predetermined value are represented by a single image.
[0032]
According to the invention of claim 10, extracted images can be represented by a single image. Because it is images with a small difference in feature amount that are represented by a single image, the loss of information contained in the extracted images as a whole can be kept small.
[0033]
The recording medium according to claim 11 can record a program that causes a computer to execute any one of the video information summarizing methods according to claims 8 to 10.
[0034]
[Embodiments of the invention]
Preferred embodiments of the video information printing apparatus according to the present invention are described in detail below with reference to the accompanying drawings. In the embodiments of this specification, summarizing a video means extracting frame images contained in the video, together with the documents added to those frame images, based on the structured video.
[0035]
FIG. 1 is a block diagram for explaining the video information printing apparatus according to the present embodiment. The apparatus summarizes a video (video information 102) to create summary information and prints the summary information it has created.
[0036]
The illustrated video information printing apparatus includes a video structure interpretation unit 103, an input unit 104, a video summary generation unit 105 having a registration unit 111, a print shaping unit 106, a print language generation unit 107, a video reference frame extraction unit 109, a barcode adding unit 110, and a printing unit 108. The video processed by the video information printing apparatus of the present embodiment consists of video information (the video body) 102 and a video structure index 101, which is attached to the video information 102 and represents the structure of the video as a tree structure.
[0037]
FIG. 2 is a diagram for explaining the video structure index 101 used in the video information printing apparatus according to the present embodiment. The video structure index 101 is organized as a hierarchical tree structure. The levels of the tree are divided according to the degree of detail of the information, and lower levels hold more detailed information about the video. The illustrated tree structure represents video management data that has one root directory 201 for the entire recording medium.
[0038]
The video structure index 101 is data that represents the video information 102 in structured form. Structuring the video information 102 means treating the video it represents as a sequence of frame images (images created by dividing the moving image) and grouping the frame images into predetermined units. A group of frame images brought together by structuring is called a segment. The video structure index 101 provides information indicating what kind of group of images each segment is.
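As a rough illustration of a tree-structured index whose lower levels hold finer segments, the following sketch models segments as nested nodes and collects the segments at a chosen depth. The class name, field names, and frame ranges are assumptions made for this example and are not taken from the video structure index format itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Segment:
    """A group of consecutive frame images; children are finer sub-segments."""
    label: str
    first_frame: int
    last_frame: int
    children: List["Segment"] = field(default_factory=list)

def segments_at_depth(node, depth):
    """Return the segments exactly `depth` levels below `node`."""
    if depth == 0:
        return [node]
    result = []
    for child in node.children:
        result.extend(segments_at_depth(child, depth - 1))
    return result

# Hypothetical medium with two shooting sessions, each split into two cuts.
root = Segment("root", 0, 399, [
    Segment("take1", 0, 199, [Segment("cut1", 0, 99), Segment("cut2", 100, 199)]),
    Segment("take2", 200, 399, [Segment("cut3", 200, 299), Segment("cut4", 300, 399)]),
])
```

Choosing a greater depth returns more, shorter segments, which corresponds to reading the index at a more detailed level of the hierarchy.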
[0039]
In the present embodiment, a segment is created for each shooting session, and each of these per-shooting segments is further divided into segments at its cut/shot change points. A cut/shot change point is placed, among frame images arranged according to the recording time counted from the start of video recording or according to the number of recorded images, between adjacent frame images whose difference in feature amount is larger than a predetermined value. Examples of feature amounts include the following.
・ Color histogram
・ Spatial color distribution
・ Spatial edge distribution
・ Spatial texture distribution
・ Object shape
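As a rough illustration of how a cut/shot change point could be located from one of these feature amounts, the following sketch compares the histograms of consecutive frames and marks a boundary where they differ strongly. The frame representation (a flat list of 0 to 255 pixel values), the bin count, the L1 distance, and the threshold are all assumptions made for illustration; the embodiment does not specify them.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 4) -> List[float]:
    """Normalized histogram of a frame given as 0-255 pixel values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def cut_points(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return each index i where a change point lies between frame i-1 and frame i."""
    hists = [histogram(f) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        # L1 distance between consecutive histograms (an assumed measure).
        diff = sum(abs(a - b) for a, b in zip(hists[i - 1], hists[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts
```

Other feature amounts in the list, such as edge or texture distributions, could be substituted for the histogram without changing the comparison loop.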
[0040]
In FIG. 2, X represents environment information, Y represents camera information, and Z represents the still image feature amount of a representative frame. The environment information X is information on the external environment at the time of shooting, and includes the date and time, position (latitude and longitude), altitude, temperature, and humidity. To acquire such information, the video structuring apparatus that creates the video structure index 101 includes a clock (referenced to GMT), a GPS (Global Positioning System) receiver, a barometric sensor, a temperature sensor, and a humidity sensor.
[0041]
The camera information Y includes information such as the focal length, aperture value, shutter speed, and focusing distance at the time a frame image (an image created by dividing a moving image) included in a given segment was captured. From the focal length, it is possible to infer whether the subject of the frame image is a landscape or an object such as a person or thing, and whether the shot was wide-angle or telephoto. From the aperture value, in addition to judging whether the subject is a landscape or an object, it is also possible to infer the brightness of the shooting location.
[0042]
Further, from the shutter speed, it is possible to infer whether the subject is stationary or moving, as well as the brightness of the shooting location. From the focusing distance, the distance between the camera portion of the video recording apparatus and the subject can be roughly estimated. By interpreting these items in combination, it can be inferred, for example, that if the aperture value is low, the focusing distance is short, and the focal length is long, regardless of the shutter speed, an object was most likely shot.
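The combined interpretation described above can be sketched as a simple rule. The field names and the numeric cut-offs below are hypothetical, chosen only to make the rule concrete; the embodiment gives no specific values.

```python
from dataclasses import dataclass


@dataclass
class CameraInfo:
    focal_length_mm: float
    aperture_f: float         # f-number; a low value means a wide-open aperture
    shutter_s: float          # exposure time in seconds (ignored by the rule below)
    focus_distance_m: float


def likely_object_shot(info: CameraInfo,
                       max_aperture_f: float = 4.0,
                       max_focus_m: float = 3.0,
                       min_focal_mm: float = 70.0) -> bool:
    """True when aperture is low, focus is close, and focal length is long.

    All three thresholds are illustrative assumptions.
    """
    return (info.aperture_f <= max_aperture_f
            and info.focus_distance_m <= max_focus_m
            and info.focal_length_mm >= min_focal_mm)
```

For example, `likely_object_shot(CameraInfo(85, 2.8, 1 / 250, 1.5))` evaluates to `True`, whereas a 24 mm shot focused at 50 m with an f-number of 11 does not satisfy the rule.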
[0043]
Furthermore, the still image feature amount Z is information representing the feature amount when the frame image is viewed as a still image.
[0044]
The cut/shot control information 202A and the cut/shot information 203A located immediately below the root directory 201 are information common to the whole of the video information 102. The cut/shot control information 202B is general information on the cut/shot change points of the segments managed by the cut/shot information 203A.
[0045]
In the tree structure shown in FIG. 2, there are two per-shooting segments, created as shooting progressed, and the cut/shot information 203B and 203C are attached to these two segments, respectively. Camera information Y is added to both segments.
[0046]
Further, four cut/shot segments are arranged below each of the segments to which the cut/shot information 203B and 203C are added. The four segments below the segment with the cut/shot information 203B are created by dividing that segment at the cut/shot change points, and the cut/shot information 203D, 203E, 203F, and 203G are added to them, respectively. The cut/shot information 203H, 203I, 203J, and 203K are added to the four segments arranged below the segment to which the cut/shot information 203C is added. Camera information Y is added to each of the cut/shot information 203D to 203G and 203H to 203K.
[0047]
Further, information on a representative frame (also called a key frame) image, which represents the frame images included in the segment to which each piece of cut/shot information is added, is attached to the cut/shot information 203D to 203G and 203H to 203K: representative frame information 204A, 204B, 204C, and 204D, and representative frame information 204E, 204F, 204G, and 204H, respectively. The still image feature amount Z of each representative frame is added to the representative frame information 204A to 204D and 204E to 204H.
[0048]
The representative frame information 204A to D and the representative frame information 204E to H include a document describing each representative frame in addition to the still image feature amount Z of each representative frame. The document describing the representative frame is a document created by the operator when creating the video structure index 101 or a sample document automatically added according to the feature amount. In the video structure index 101, representative frame information 204A to D and representative frame information 204E to H including a document describing each representative frame are printable information.
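The tree described in connection with FIG. 2 could be modeled roughly as follows. This is a minimal sketch: the class and field names are assumptions, and only the parts of the index needed to reach the printable representative frame information are represented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RepresentativeFrame:
    frame_id: str                  # e.g. "204A"
    features: List[float]          # still image feature amount Z
    document: str = ""             # text describing the frame


@dataclass
class Segment:
    label: str                     # e.g. "203B"
    camera_info: Dict[str, float] = field(default_factory=dict)   # camera information Y
    children: List["Segment"] = field(default_factory=list)
    representative: Optional[RepresentativeFrame] = None


def printable_frames(node: Segment) -> List[RepresentativeFrame]:
    """Collect representative frames depth-first, in tree order."""
    found = [node.representative] if node.representative is not None else []
    for child in node.children:
        found.extend(printable_frames(child))
    return found
```

Building a root with two per-shooting segments of four cut/shot segments each, and calling `printable_frames` on the root, would yield the eight representative frames in left-to-right order.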
[0049]
The input unit 104 of the video information printing apparatus according to the present embodiment is used to enter the summary conditions (referred to as condition A in the figure), which the video summary generation unit 105 uses for summarization, and the print format (referred to as condition B in the figure), which the print shaping unit 106 uses for shaping the information. The entered summary conditions and print format are set in the video information printing apparatus. The video structure interpretation unit 103 interprets the structure of the video structure index 101. The video summary generation unit 105 and the video reference frame extraction unit 109 summarize the video information 102 based on the video structure index 101 interpreted by the video structure interpretation unit 103 and on the summary conditions. The print shaping unit 106 shapes the information (video summary) that the video summary generation unit 105 and the video reference frame extraction unit 109 generated by summarizing the video information 102, converting it into a printable form based on the print format.
[0050]
In the present embodiment, the summary conditions that are set are the ratio of images and the ratio of characters in the video summary, the accuracy of summarization, the extraction conditions for the reference frames (frame images to be printed) extracted by the video reference frame extraction unit 109, and the number of reference frames to be extracted. The layout at the time of printing is set as the print format. The summary conditions and the print format can be input from the input unit 104 or registered in advance in the registration unit 111 of the video summary generation unit 105.
[0051]
Among the above-described summarization conditions, the accuracy at the time of summarization is set by designating a tree structure hierarchy. The reference frame extraction condition is set by a keyword or image input by the operator to the input unit 104 or a keyword or image registered in the registration unit 111.
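Designating a hierarchy level as the summarization accuracy can be pictured as selecting the nodes at a given depth of the tree. The sketch below assumes a plain dictionary representation of the index (`{"label": ..., "children": [...]}`), which is not taken from the embodiment.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]   # assumed shape: {"label": str, "children": [Node, ...]}


def nodes_at_depth(root: Node, depth: int) -> List[str]:
    """Labels of the nodes `depth` levels below the root (the root is depth 0)."""
    level = [root]
    for _ in range(depth):
        level = [child for node in level for child in node.get("children", [])]
    return [node["label"] for node in level]
```

A larger depth selects finer segments and therefore a more detailed summary; a smaller depth selects coarser ones.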
[0052]
The set keyword and image are input to the video reference frame extraction unit 109 via the video summary generation unit 105. The video reference frame extraction unit 109 searches the frame images (in this embodiment, the representative frames) based on the keyword and extracts the matching representative frames. Alternatively, it extracts images similar to the image set as the extraction condition. If the number of extracted images is larger than the number of reference frames set as the summary condition, the video reference frame extraction unit 109 narrows down the representative frames by the method shown in FIG. 3.
[0053]
That is, according to FIG. 3, when seven representative frames a, b, c, d, e, f, and g are extracted at the initial stage of extraction, the similarities between the representative frames a to g are compared. The similarity may be determined from, for example, the feature amounts. When the comparison shows that the representative frame a and the representative frame e are similar (for example, the difference between their feature amounts is smaller than a preset standard), the video reference frame extraction unit 109 keeps only one of the representative frames a and e. The frame that remains becomes the representative frame h; that is, the representative frame h represents the representative frames a and e. Similarly, when the representative frame c and the representative frame f are determined to be similar, only one of them is kept, and the two are represented by the representative frame j. With the above processing, the seven representative frames a to g are narrowed down to five representative frames h, i, j, k, and l.
[0054]
Further, the video reference frame extraction unit 109 applies to the five representative frames h to l the same narrowing-down process that was applied to the representative frames a to g, narrowing them down to three representative frames m, n, and o. When this number, 3, matches the number of reference frames to be extracted set as the summary condition, the video reference frame extraction unit 109 outputs information indicating that the representative frames m to o have been extracted to the video summary generation unit 105 and the print shaping unit 106.
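The narrowing-down of FIG. 3 can be sketched as repeated rounds of merging similar frames. The embodiment does not say how the second round differs from the first, so this sketch assumes that the similarity threshold is loosened by a fixed step each round; the L1 distance, the initial threshold, and the step are likewise assumptions.

```python
from typing import Dict, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 difference between two feature vectors (an assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def merge_similar(frames: Dict[str, Sequence[float]],
                  threshold: float) -> Dict[str, Sequence[float]]:
    """Keep a frame only if it differs by at least `threshold` from every kept frame."""
    kept: Dict[str, Sequence[float]] = {}
    for name, feat in frames.items():
        if all(distance(feat, other) >= threshold for other in kept.values()):
            kept[name] = feat
    return kept


def narrow_down(frames: Dict[str, Sequence[float]], target: int,
                threshold: float = 0.1, step: float = 0.1) -> Dict[str, Sequence[float]]:
    """Merge in rounds, loosening the threshold, until at most `target` frames remain."""
    if target < 1:
        raise ValueError("target must be at least 1")
    current = dict(frames)
    while len(current) > target:
        current = merge_similar(current, threshold)
        threshold += step
    return current
```

Because a whole round is applied at once, the result can fall below the requested number; a variant that merges one pair at a time would hit the target exactly at the cost of more comparisons.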
[0055]
The video summary generation unit 105 compares the amounts of the representative frames m to o and of the documents attached to them with the ratio of images and the ratio of characters set as the summary conditions. These amounts may be obtained, for example, by calculating the area ratio occupied by the extracted reference frames and the area ratio occupied by characters in the printed image. In that case, the ratio of images and the ratio of characters set as the summary conditions are likewise specified as area ratios in the printed image.
[0056]
Then, when the ratio of the representative frames extracted as reference frames (hereinafter simply called reference frames) and the ratio of characters do not match the summary conditions, the video summary generation unit 105 adjusts the two ratios to match the summary conditions. This adjustment is performed, for example, by deleting an extracted reference frame and assigning its space to characters, or by deleting information represented by characters. When deleting character information, the video summary generation unit 105 may, for example, extract important phrases from the character information, judge a document containing many important phrases to be of higher importance and a document containing few to be of lower importance, and preferentially delete the documents judged to be of lower importance.
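The preferential deletion of less important documents might look like the following. In this sketch the total character count stands in for the area occupied by characters, and importance is simply the number of words that appear in a given set of important phrases; both simplifications are assumptions, as is every name used.

```python
from typing import List, Set


def importance(document: str, key_phrases: Set[str]) -> int:
    """Count the words of the document that are listed as important."""
    return sum(1 for word in document.lower().split() if word in key_phrases)


def trim_documents(documents: List[str], key_phrases: Set[str],
                   max_chars: int) -> List[str]:
    """Drop the least important documents until the total length fits max_chars."""
    kept = list(documents)
    while kept and sum(len(d) for d in kept) > max_chars:
        least = min(kept, key=lambda d: importance(d, key_phrases))
        kept.remove(least)
    return kept
```

When two documents tie for lowest importance, `min` returns the earlier one, so earlier documents are removed first in this sketch.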
[0057]
Further, the video summary generation unit 105 outputs information on the reference frames and the documents whose ratios have been adjusted as described above to the print shaping unit 106. The print shaping unit 106 arranges the reference frames and the documents according to the layout set as the print format, and outputs the result to the print language generation unit 107. At this time, the print shaping unit 106 obtains the position of each reference frame in the video from the information on the reference frame and outputs it to the barcode adding unit 110. The barcode adding unit 110 converts the position of the reference frame in the video into information represented by a barcode, and outputs the information to the print language generation unit 107. Note that the position of the reference frame in the video may be its ordinal position among the frame images constituting the video, or a temporal position, that is, the elapsed time (hours, minutes, seconds) from the start of playback at which the reference frame is played back.
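The two forms of position mentioned above, ordinal and temporal, are related through the frame rate. The following sketch converts a frame index to an elapsed time and builds a digits-only string that a barcode encoder could accept. The frame rate of 30, the 0-based indexing, and the eight-digit payload format are assumptions; the embodiment does not specify a barcode symbology or payload layout.

```python
def frame_to_timecode(frame_index: int, fps: int = 30) -> str:
    """Convert a 0-based frame index to HH:MM:SS elapsed from the start of playback."""
    total_seconds = frame_index // fps
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def barcode_payload(frame_index: int) -> str:
    """Zero-padded frame index as a digits-only payload (hypothetical format)."""
    return f"{frame_index:08d}"
```

For instance, at 30 frames per second, frame 5400 corresponds to `00:03:00`, and its payload in this hypothetical format is `00005400`.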
[0058]
FIGS. 4A, 4B, and 4C are diagrams illustrating layouts set as print formats. FIG. 4A shows an image-centered layout composed mainly of images. FIG. 4B shows a mixed image/character layout in which images and documents are distributed at relatively close ratios. FIG. 4C shows a character-centered layout composed mainly of documents.
[0059]
The video information printing apparatus according to the present embodiment stores a plurality of layouts, including the layouts illustrated in FIG. 4, in the print shaping unit 106. In the present embodiment, a layout for an identification label (sticky note) of a video medium such as a video tape or DVD is also assumed to be stored in the print shaping unit 106. When the print format is input, a list of layouts is displayed on a display unit (not shown), and the operator designates any one of them. Further, the layout is not limited to those stored in advance; the operator can also register a layout in the registration unit 111.
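A stored set of layouts with operator registration could be represented as a small registry. The layout names and the area ratios below are invented for illustration; the embodiment describes the layouts only qualitatively.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Layout:
    name: str
    image_ratio: float   # share of the printed area for images (assumed value)
    text_ratio: float    # share of the printed area for documents (assumed value)


def make_registry() -> Dict[str, Layout]:
    """Preset layouts; every ratio here is a hypothetical placeholder."""
    presets = [
        Layout("image-main", 0.8, 0.2),
        Layout("mixed", 0.5, 0.5),
        Layout("text-main", 0.2, 0.8),
        Layout("media-label", 0.6, 0.4),
    ]
    return {layout.name: layout for layout in presets}


def register_layout(registry: Dict[str, Layout], layout: Layout) -> None:
    """Add an operator-defined layout after checking that the ratios sum to 1."""
    if abs(layout.image_ratio + layout.text_ratio - 1.0) > 1e-9:
        raise ValueError("image_ratio and text_ratio must sum to 1")
    registry[layout.name] = layout
```

Listing `sorted(make_registry())` would correspond to the list of layouts shown to the operator when the print format is entered.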
[0060]
The reference frames and the documents shaped for printing according to the layout are output from the print shaping unit 106 to the print language generation unit 107. Further, the barcode adding unit 110 outputs the position of each reference frame in the video, converted into a barcode, to the print language generation unit 107. The print language generation unit 107 converts the shaped reference frames, documents, and barcodes into the print language used by the printing unit 108 and outputs the result to the printing unit 108. The printing unit 108 prints the reference frames, the documents, and the barcodes on the same sheet. An area dedicated to barcodes may be provided in the layout.
[0061]
Next, processing performed by the video information printing apparatus described above will be described with reference to the flowchart shown in FIG. 5. The process shown in FIG. 5 includes the video information summarizing method according to the embodiment of the present invention. As shown in FIG. 5, the video information printing apparatus according to the present embodiment displays a screen for entering a summary condition and a print format on a display unit (not shown) and accepts their input (step S501).
[0062]
At this time, the video information printing apparatus determines whether or not the operator has designated a summary condition (such as a keyword) or a print format (such as a layout) registered in the registration unit 111 (step S502). If registered data is designated (step S502: Yes), the registered data is read from the registration unit 111 (step S510). If registered data is not designated (step S502: No), the video structure represented by the video information 102 is interpreted based on the video structure index 101 (step S503).
[0063]
Next, the video information printing apparatus executes a video summary creation subroutine (step S504). The video summary creation subroutine is processing performed in the video summary generation unit 105 and the video reference frame extraction unit 109. The process includes a process of extracting a reference frame in accordance with the summary condition, a process of adjusting the ratio of the document added to the extracted reference frame and the reference frame, and a process of specifying the position of the reference frame in the video.
[0064]
Next, the video information printing apparatus arranges the reference frames and the documents obtained by the video summary creation subroutine in the layout based on the print format, shaping them into a printing form (step S505). The barcode adding unit 110 converts the information on the position of each reference frame into a barcode and adds it to the reference frames and documents (step S506). The shaped reference frames, documents, and barcodes are then converted into the language used by the printing unit 108 to generate a print language (step S507), and the printing unit 108 prints it (step S508).
[0065]
Further, the video information printing apparatus according to the present embodiment determines whether or not the operator wishes to register the summary conditions and the print format used this time (step S509). If so (step S509: Yes), the summary conditions and the print format are registered in the registration unit 111 (step S511). If not (step S509: No), the process ends without registration.
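The order of the steps in FIG. 5 can be traced with a short sketch. It only records which step labels are visited; the actual processing of each step is not modeled. One point is an assumption: the text does not state where the flow continues after step S510, and this sketch assumes it rejoins the main flow at step S503.

```python
from typing import List

STEP_NOTES = {
    "S501": "input summary condition and print format",
    "S502": "registered data designated?",
    "S510": "read registered data from the registration unit",
    "S503": "interpret the video structure index",
    "S504": "video summary creation subroutine",
    "S505": "shape into the layout",
    "S506": "add barcode",
    "S507": "generate print language",
    "S508": "print",
    "S509": "register this time's settings?",
    "S511": "register settings in the registration unit",
}


def trace_steps(use_registered: bool, register_after: bool) -> List[str]:
    """Return the step labels visited, in order, for the given two decisions."""
    steps = ["S501", "S502"]
    if use_registered:
        steps.append("S510")
    steps.append("S503")   # assumed to follow S510 as well as the "No" branch
    steps += ["S504", "S505", "S506", "S507", "S508", "S509"]
    if register_after:
        steps.append("S511")
    return steps
```

Calling `trace_steps(False, False)` gives the shortest path, S501 through S509, with neither S510 nor S511 visited.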
[0066]
Note that the video information summarization method described in this embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. This program is recorded on a computer-readable recording medium such as a hard disk, flexible disk, CD-ROM, MO, or DVD, and is executed by being read from the recording medium by the computer. The program can also be distributed via the recording medium, or as a transmission medium via a network such as the Internet.
[0067]
The video information printing apparatus according to the present embodiment described above extracts images and documents representing the video from the video information using the video structure index, and can therefore search the video information efficiently and extract reference frames. A document describing each reference frame can also be acquired relatively easily.
[0068]
Also, by summarizing the video according to the summarizing conditions set by the operator and printing it according to the print format set by the operator, the user's purpose and preference can be reflected in the selection of the reference frame and the document. In addition, the user's purpose and preference can be reflected in the layout when the reference frame and the document are printed. Furthermore, the operator registers the summary condition and the print format, so that the reference frame and the document extracted by the arbitrary summary condition can be printed with an arbitrary layout. If the printed material is used as an index of a video recording medium, an original index can be created for each user.
[0069]
Further, by outputting information summarizing the video information on paper, the video information printing apparatus according to the present embodiment allows the video information to be browsed on paper. In addition, since the summary information of the video can be stored and managed on paper, it is convenient, for example, for keeping recipes from a cooking program.
[0070]
[Effects of the invention]
As described above, since the invention according to claim 1 summarizes the information about the video using the tree-structured video structure index, it can provide a video information printing apparatus that obtains information sufficient for understanding the content of the video. In addition, since the information about the video can be summarized according to the summary condition, it can provide a video information printing apparatus that obtains summary information whose content suits the purpose. Furthermore, since the summary information can be printed according to the print format, it can provide a video information printing apparatus capable of outputting on paper information sufficient for understanding the content of the video. It can also provide a video information printing apparatus capable of determining the index layout according to the user's request and creating an index best suited to the intended use, such as sticking it on a video recording medium case or filing it. In addition, the invention according to claim 1 can list information including the video summary on paper, and can therefore provide a video information printing apparatus that brings the convenience of paper (such as browsing anywhere, without a special device such as a playback device) to the handling of video information. It can also provide a video information printing apparatus capable of converting video into a document.
[0071]
According to the invention described in claim 2, images representing the video (a representative frame group) are extracted from the video information using the video structure index, so that the video information is searched efficiently, and reference frames are extracted from the representative frame group according to the extraction condition. A video information printing apparatus capable of doing so can thus be provided.
[0072]
According to the third aspect of the present invention, since the summary condition and the print format can be registered, there is no need to input them every time video information is printed, and a video information printing apparatus with high operability can be provided. In addition, when the operator registers a summary condition and a print format reflecting the user's preference, a video information printing apparatus capable of obtaining an index reflecting that preference can be provided.
[0073]
According to the fourth aspect of the present invention, the extracted images can be represented by one image, and the extracted images can be narrowed down according to the number of necessary images. Therefore, it is possible to provide a video information printing apparatus that can extract an arbitrary number of images from information representing video.
[0074]
According to the fifth aspect of the present invention, it is possible to provide a video information printing apparatus with high operability that does not require a layout to be input every time printing is performed. In addition, it is possible to provide a video information printing apparatus in which an operator can obtain an index printed with a layout reflecting the user's preference by registering the layout reflecting the user's preference.
[0075]
According to the sixth aspect of the present invention, it is possible to provide a video information printing apparatus capable of printing the summarized information about the video in a layout best suited to an identification label of the video media, and thus of obtaining printed matter best suited to such a label.
[0076]
According to the seventh aspect of the present invention, information regarding the position of the image in the video can be added to the printed matter, and the contents indicated by the image can be easily retrieved from the video. Therefore, it is possible to provide a video information printing apparatus capable of obtaining an index convenient for video processing and editing.
[0077]
Since the invention according to claim 8 summarizes the information about the video using the tree-structured video structure index, it can provide a video information summarizing method capable of obtaining information sufficient for understanding the content of the video. In addition, since the information about the video can be summarized according to the summary condition, it can provide a video information summarizing method capable of obtaining summary information whose content suits the purpose.
[0078]
According to the invention described in claim 9, images representing the video (a representative frame group) are extracted from the video information using the video structure index, so that the video information is searched efficiently, and reference frames are extracted from the representative frame group according to the extraction condition. A video information summarizing method capable of doing so can thus be provided.
[0079]
According to the tenth aspect of the present invention, the extracted images can be represented by one image, and the extracted images can be narrowed down according to the number of necessary images. Therefore, it is possible to provide a video information summarizing method that can extract an arbitrary number of images from information representing video.
[0080]
Since the recording medium according to the invention of claim 11 records a program that causes a computer to execute any one of the video information summarizing methods according to claims 8 to 10, any one of the video information summarizing methods according to claims 8 to 10 can be executed by a computer.
[Brief description of the drawings]
FIG. 1 is a block diagram for explaining a video information printing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a video structure index used in the video information printing apparatus according to the embodiment of the present invention.
FIG. 3 is a diagram for explaining narrowing down of a reference frame.
FIG. 4 is a diagram illustrating a layout set as a print format in one embodiment of the present invention.
FIG. 5 is a diagram for explaining a video information summarizing method according to the present invention.
[Explanation of symbols]
101 Video structure index
102 Video information
103 Video structure interpretation unit
104 Input unit
105 Video summary generation unit
106 Print shaping unit
107 Print language generation unit
108 Printing unit
109 Video reference frame extraction unit
110 Barcode adding unit
111 Registration unit

Claims

A video information printing apparatus that summarizes information about video and prints the summarized information,
Condition setting means for setting a summary condition that is a condition for information summarization and a print format for printing summary information;
Video information summarizing means for summarizing information about the video based on a video structure index representing the structure of the video in a tree structure and the summary condition;
Print information shaping means for shaping the information summarized by the video information summarizing means based on the print format;
Print output means for outputting the information summarized by the video information summarizing means and shaped by the print information shaping means as a printed matter;
A video information printing apparatus comprising:

The video information printing apparatus according to claim 1, wherein the summary condition includes an extraction condition for extracting reference frames, into which the representative frames are aggregated, from a representative frame group representing the frame images of each segment constituting the video defined by the video structure index.

The video information printing apparatus according to claim 1, further comprising registration means for registering the summary condition and the print format.

The video information printing apparatus according to claim 1, wherein the video information summarizing means includes feature amount acquisition means for acquiring a feature amount of each image extracted from the video, and feature amount difference acquisition means for acquiring, between images, the difference between the feature amounts acquired by the feature amount acquisition means, and wherein a plurality of images for which the difference in feature amount acquired by the feature amount difference acquisition means is equal to or less than a predetermined value are represented by one image.

The video information printing apparatus according to claim 1, wherein the print information shaping means has layout storage means for storing a plurality of layouts for printing, and shapes the information summarized by the video information summarizing means using a layout stored in the layout storage means.

The video information printing apparatus according to claim 5, wherein the layout storage unit stores a layout for an identification label of the video media.

The video information printing apparatus according to claim 1, further comprising position information adding means for attaching, to the print information, video position information relating to the position in the video of an image included in the information shaped by the print information shaping means, wherein the print output means outputs the video position information together with the print information as a printed matter.

A video information summarization method for summarizing information about video and shaping it into a printable format,
A condition setting step for setting a summary condition that is a condition for summarizing information and a print format for printing summary information;
A video information summarizing step for summarizing information about the video based on a video structure index representing the structure of the video in a tree structure and the summary condition;
A print information shaping step for shaping the information summarized in the information summarization step based on the print format;
A video information summarizing method comprising:

The video information summarizing method according to claim 8, wherein the summary condition includes an extraction condition for extracting reference frames, into which the representative frames are aggregated, from a representative frame group representing the frame images of each segment constituting the video defined by the video structure index.

The video information summarizing method according to claim 8, wherein the information summarization step includes a feature amount acquisition step of acquiring a feature amount of each image extracted from the video, and a feature amount difference acquisition step of acquiring, between images, the difference between the feature amounts acquired in the feature amount acquisition step, and wherein a plurality of images for which the difference in feature amount acquired in the feature amount difference acquisition step is equal to or less than a predetermined value are represented by one image.

A computer-readable recording medium having recorded thereon a program for causing a computer to execute the video information summarizing method according to any one of claims 8 to 10.