JP2006202118A

JP2006202118A - Attribute evaluation apparatus, method and program

Info

Publication number: JP2006202118A
Application number: JP2005014263A
Authority: JP
Inventors: Brodie Julian; ブロディジュリアン
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2005-01-21
Filing date: 2005-01-21
Publication date: 2006-08-03
Anticipated expiration: 2025-01-21
Also published as: JP4755834B2

Abstract

PROBLEM TO BE SOLVED: To provide an attribute evaluation apparatus, method, and program capable of evaluating attributes related to content flexibly, efficiently, accurately, and dynamically.

SOLUTION: Mutually orthogonal vectors associated with the candidates for the attributes related to the content are used as a basis, and information is created on vectors representing the position, within that vector space, of general nouns extracted from the content. The attributes related to the content are evaluated on the basis of the created information, so that they can be evaluated flexibly, efficiently, accurately, and dynamically.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to an attribute evaluation apparatus, an attribute evaluation method, and an attribute evaluation program that evaluate attributes related to content on the basis of information related to that content, and in particular to an attribute evaluation apparatus, method, and program capable of evaluating content attributes flexibly, efficiently, accurately, and dynamically.

Conventionally, in interactive media such as the Internet, the attributes of content accessed by users have been evaluated. A content attribute is, for example, a category into which the content is classified according to its subject matter.

Evaluating content attributes is particularly important when, for example, delivering Internet advertisements to content published on the Internet. When the attribute of the content is “sports”, delivering Internet advertisements related to “sports” is effective, so attributes must be determined accurately and objectively.

One method of evaluating content attributes is for a person to browse each piece of content and determine its attributes from its subject matter (see, for example, Non-Patent Document 1). When the content is considered to have a plurality of attributes, all of them are assigned to the content.

For example, when the content concerns “sports news”, it is determined to have both the attribute “sports” and the attribute “news”, and both attributes are assigned to it.

With this method, however, a person evaluates the attributes subjectively, so it is difficult to evaluate them accurately, objectively, and efficiently.

For example, “sports news” content was evaluated above as having both the “sports” and “news” attributes, but some people might limit it to the “news” attribute alone, while others might additionally assign a “media” attribute.

Because evaluations made subjectively by people are inconsistent in this way, it has been difficult to use the evaluated attributes for purposes such as delivering Internet advertisements.

For this reason, methods have also been developed that evaluate content attributes objectively by taking into account factors such as keywords extracted by text mining from the text in the content, the frequency with which words and phrases appear, font size, and the link structure of the website, and that deliver Internet advertisements related to those attributes to the website carrying the content (see, for example, Non-Patent Document 2).

Non-Patent Document 1: Yahoo Japan Corporation, “Category Search”, [online], [retrieved January 5, 2005], Internet <URL: http://howto.yahoo.co.jp/chapters/8/1.html>

Non-Patent Document 2: Google Inc., “AdSense”, [online], [retrieved January 18, 2005], Internet <URL: https://www.google.com/adsense/premium-overview?hl=ja>

However, the conventional techniques typified by Non-Patent Document 2 above make it difficult to determine content attributes flexibly. The classification of content attributes changes as language and culture change, and methods that determine attributes from keywords extracted from the content have difficulty keeping up with such change.

For example, advertisers who commission Internet advertising have in recent years begun to want content classified not only by directly expressed attributes such as “sports” and “news” but also by attributes expressed in sensory terms that suit everyday-life scenes, such as “cozy”, “sharp”, “heartwarming”, and “warm”, so that related advertisements can be delivered.

In such cases, merely extracting keywords from the text in the content does not make it easy to classify the content into the attributes the advertiser wants, and it is difficult to evaluate content attributes flexibly and dynamically.

How flexibly, efficiently, accurately, and dynamically content attributes can be evaluated has therefore become a very important issue for improving the effectiveness of Internet advertising.

The present invention has been made to solve the problems of the conventional techniques described above, and its object is to provide an attribute evaluation apparatus, an attribute evaluation method, and an attribute evaluation program capable of evaluating content attributes flexibly, efficiently, accurately, and dynamically.

To solve the problems described above and achieve this object, the present invention is an attribute evaluation apparatus that evaluates attributes related to content on the basis of information related to that content, comprising: vector information generation means for generating information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for the attributes related to the content and that represents the position, in the vector space, of the information related to the content; and attribute evaluation means for evaluating the attributes related to the content on the basis of the information generated by the vector information generation means.
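The idea of using attribute candidates as mutually orthogonal basis vectors can be illustrated with a minimal sketch. The candidate names, the function names, and the coefficients below are assumptions chosen for illustration; the text above does not prescribe any particular data representation.

```python
# Hypothetical attribute candidates; each one is assigned its own axis.
CANDIDATES = ["music", "computer", "linguistics"]


def basis_vector(attribute):
    """Return the unit vector (one-hot) for a single attribute candidate."""
    return [1.0 if name == attribute else 0.0 for name in CANDIDATES]


def dot(u, v):
    """Inner product; it is 0 for two distinct basis vectors."""
    return sum(a * b for a, b in zip(u, v))


def position_vector(coefficients):
    """Linear combination of the basis vectors with the given coefficients."""
    position = [0.0] * len(CANDIDATES)
    for attribute, coefficient in coefficients.items():
        for axis, component in enumerate(basis_vector(attribute)):
            position[axis] += coefficient * component
    return position


# A piece of content information placed halfway between "music" and "computer".
synthesizer = position_vector({"music": 0.5, "computer": 0.5})
```

Because each candidate has its own axis, a piece of content information that relates to several candidates is simply a point with several non-zero coordinates.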

In the above invention, the present invention further comprises a database that stores information related to content in association with information on candidates for attributes related to content, wherein the vector information generation means searches the database for attribute candidate information using the information related to the content as a search key, and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the database stores a plurality of different combinations of information related to content and attribute candidates, and the vector information generation means searches for the attribute candidate information in a designated one of the plurality of combinations and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the database further stores information on a weight for each attribute candidate, and the vector information generation means reads the attribute candidate information and the weight information from the database and generates the information on the vector on the basis of the information read.

In the above invention, the attribute evaluation means evaluates, for given content, a plurality of attributes related to the content and the priority of each attribute.
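One way to picture evaluating several attributes together with their priorities is to sort the components of a content vector. The sketch below assumes that a larger component means a higher priority and that the vector is held as a dictionary; both are illustrative assumptions, and `rank_attributes` is a hypothetical helper name.

```python
def rank_attributes(vector, top_k=None):
    """Return (attribute, score) pairs ordered from highest to lowest score.

    `vector` maps attribute names to components of the content vector.
    `top_k` optionally limits how many attributes are reported.
    """
    ranked = sorted(vector.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Illustrative content vector (values are made up for the example).
content = {"music": 0.5, "computer": 1.1, "linguistics": 0.4}
top_two = rank_attributes(content, top_k=2)
```

Varying `top_k` changes how many attributes are reported for a piece of content, from the single dominant attribute to the full ordered list.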

In the above invention, the attribute evaluation means sets a weight for the information on the vector on the basis of the frequency with which the information related to the content appears, and evaluates the attributes related to the content on the basis of the weight set.
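Frequency-based weighting can be sketched as follows. The sketch assumes that the weight is simply the raw number of occurrences of each noun and that per-noun vectors are already available; the text above does not fix the weighting function, and the vectors in `NOUN_VECTORS` are hypothetical.

```python
from collections import Counter

# Hypothetical per-noun vectors, keyed by attribute candidate.
NOUN_VECTORS = {
    "synthesizer": {"music": 0.5, "computer": 0.5},
    "guitar": {"music": 1.0},
}


def weighted_content_vector(nouns):
    """Sum the noun vectors, scaling each by how often the noun appears."""
    counts = Counter(nouns)
    vector = {}
    for noun, count in counts.items():
        # Nouns without a known vector contribute nothing.
        for attribute, value in NOUN_VECTORS.get(noun, {}).items():
            vector[attribute] = vector.get(attribute, 0.0) + count * value
    return vector


result = weighted_content_vector(["guitar", "guitar", "synthesizer"])
```

Other weighting functions, such as a logarithm of the count, would fit the same structure by changing only the multiplication factor.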

In the above invention, when the number of attribute candidates increases, the vector information generation means generates the information on the vector in a vector space whose number of dimensions has increased in accordance with the increase in the number of candidates.
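The growth of the vector space when a new attribute candidate is added can be illustrated with a small sketch. The class `AttributeSpace` and its methods are hypothetical names, and storing vectors sparsely as dictionaries is an assumption made so that existing vectors remain valid after the space grows.

```python
class AttributeSpace:
    """Vector space with one axis per attribute candidate (illustrative)."""

    def __init__(self, candidates):
        self.candidates = list(candidates)

    def add_candidate(self, name):
        """Add a new axis; the dimension grows by one if the name is new."""
        if name not in self.candidates:
            self.candidates.append(name)

    def to_dense(self, sparse_vector):
        """Expand a sparse vector into coordinates for the current axes."""
        return [sparse_vector.get(name, 0.0) for name in self.candidates]


space = AttributeSpace(["music", "computer"])
synthesizer = {"music": 0.5, "computer": 0.5}
before = space.to_dense(synthesizer)

space.add_candidate("warm")  # e.g. a newly requested sensory attribute
after = space.to_dense(synthesizer)
```

A vector created before the new candidate existed simply has a zero component on the new axis, so nothing needs to be recomputed for it.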

In the above invention, the information related to the content is metadata containing information on the subject matter of the content, or information extracted from the content.

In the above invention, the vector information generation means generates information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for attributes related to second content, the second content being associated by a hyperlink or a trackback with first content whose attributes are to be evaluated, and that represents the position, in the vector space, of information related to the second content; and the attribute evaluation means evaluates the attributes related to the first content on the basis of the information generated by the vector information generation means.

In the above invention, the information related to the content is a search term used to search for the content, or a word on which a hyperlink is set.

The present invention is also an attribute evaluation method for evaluating attributes related to content on the basis of information related to that content, comprising: a vector information generation step of generating information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for the attributes related to the content and that represents the position, in the vector space, of the information related to the content; and an attribute evaluation step of evaluating the attributes related to the content on the basis of the information generated in the vector information generation step.

In the above invention, the vector information generation step searches, using the information related to the content as a search key, a database that stores information related to content in association with attribute candidate information, and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the vector information generation step searches a database that stores a plurality of different combinations of information related to content and attribute candidates for the attribute candidate information in a designated combination, and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the vector information generation step reads attribute candidate information and weight information from a database that stores the attribute candidate information and information on a weight for each attribute candidate, and generates the information on the vector on the basis of the information read.

In the above invention, the attribute evaluation step evaluates, for given content, a plurality of attributes related to the content and the priority of each attribute.

In the above invention, the attribute evaluation step sets a weight for the information on the vector on the basis of the frequency with which the information related to the content appears, and evaluates the attributes related to the content on the basis of the weight set.

In the above invention, when the number of attribute candidates increases, the vector information generation step generates the information on the vector in a vector space whose number of dimensions has increased in accordance with the increase in the number of candidates.

In the above invention, the vector information generation step generates information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for attributes related to second content, the second content being associated by a hyperlink or a trackback with first content whose attributes are to be evaluated, and that represents the position, in the vector space, of information related to the second content; and the attribute evaluation step evaluates the attributes related to the first content on the basis of the information generated in the vector information generation step.

The present invention is also an attribute evaluation program for evaluating attributes related to content on the basis of information related to that content, the program causing a computer to execute: a vector information generation procedure of generating information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for the attributes related to the content and that represents the position, in the vector space, of the information related to the content; and an attribute evaluation procedure of evaluating the attributes related to the content on the basis of the information generated by the vector information generation procedure.

In the above invention, the vector information generation procedure searches, using the information related to the content as a search key, a database that stores information related to content in association with attribute candidate information, and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the vector information generation procedure searches a database that stores a plurality of different combinations of information related to content and attribute candidates for the attribute candidate information in a designated combination, and generates the information on the vector on the basis of the attribute candidate information obtained by the search.

In the above invention, the vector information generation procedure reads attribute candidate information and weight information from a database that stores the attribute candidate information and information on a weight for each attribute candidate, and generates the information on the vector on the basis of the information read.

In the above invention, the attribute evaluation procedure evaluates, for given content, a plurality of attributes related to the content and the priority of each attribute.

In the above invention, the attribute evaluation procedure sets a weight for the information on the vector on the basis of the frequency with which the information related to the content appears, and evaluates the attributes related to the content on the basis of the weight set.

In the above invention, when the number of attribute candidates increases, the vector information generation procedure generates the information on the vector in a vector space whose number of dimensions has increased in accordance with the increase in the number of candidates.

In the above invention, the vector information generation procedure generates information on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for attributes related to second content, the second content being associated by a hyperlink or a trackback with first content whose attributes are to be evaluated, and that represents the position, in the vector space, of information related to the second content; and the attribute evaluation procedure evaluates the attributes related to the first content on the basis of the information generated by the vector information generation procedure.

According to the present invention, information is generated on a vector that takes as its basis mutually orthogonal vectors associated with the respective attribute candidates and that represents the position, in the vector space, of the information related to the content, and the attributes related to the content are evaluated on the basis of the generated information. This has the effect that content attributes can be evaluated flexibly, efficiently, accurately, and dynamically.

Further, according to the present invention, a database that stores information related to content in association with attribute candidate information is searched using the information related to the content as a search key, and the information on the vector is generated on the basis of the attribute candidate information obtained by the search. Reading the attribute candidate information stored in the database has the effect that the vector information can be generated efficiently.

Further, according to the present invention, a database that stores a plurality of different combinations of information related to content and attribute candidates is searched for the attribute candidate information in a designated combination, and the information on the vector is generated on the basis of the attribute candidate information obtained by the search. This has the effect that content attributes can be evaluated flexibly.

Further, according to the present invention, attribute candidate information and weight information are read from a database that stores the attribute candidate information and information on a weight for each attribute candidate, and the information on the vector is generated on the basis of the information read. Evaluating the information related to the content with the weight of each attribute candidate taken into account has the effect that content attributes can be evaluated more accurately.

Further, according to the present invention, a plurality of attributes related to the content and the priority of each attribute are evaluated for given content. This has the effect that content attributes can be evaluated at any desired level of precision.

Further, according to the present invention, a weight is set for the information on the vector on the basis of the frequency with which the information related to the content appears, and the attributes related to the content are evaluated on the basis of the weight set. Taking this frequency of appearance into account has the effect that content attributes can be evaluated more accurately.

Further, according to the present invention, when the number of attribute candidates increases, the information on the vector is generated in a vector space whose number of dimensions has increased in accordance with the increase in the number of candidates. This has the effect that an increase in the number of attribute candidates can be accommodated dynamically.

Further, according to the present invention, the information related to the content is metadata containing information on the subject matter of the content, or information extracted from the content. This has the effect that content attributes can be evaluated flexibly, efficiently, accurately, and dynamically on the basis of the metadata or the information extracted from the content.

Further, according to the present invention, information is generated on a vector that takes as its basis mutually orthogonal vectors associated with the respective candidates for attributes related to second content, the second content being associated by a hyperlink or a trackback with first content whose attributes are to be evaluated, and that represents the position, in the vector space, of information related to the second content, and the attributes related to the first content are evaluated on the basis of the generated information. This has the effect that the attributes of the first content can be evaluated flexibly, efficiently, accurately, and dynamically from the second content associated with it by a hyperlink or a trackback.

Further, according to the present invention, the information related to the content is a search term used to search for the content, or a word on which a hyperlink is set. This has the effect that content attributes can be evaluated flexibly, efficiently, accurately, and dynamically on the basis of such search terms or hyperlinked words.

Preferred embodiments of the attribute evaluation apparatus, attribute evaluation method, and attribute evaluation program according to the present invention are described in detail below with reference to the accompanying drawings.

First, the concept of the attribute evaluation processing according to the present invention is described. FIG. 1 is a diagram explaining this concept. It shows an example of evaluating the attributes of content consisting of the text “I am about to graduate; I like composing music on a synthesizer, and my major is natural language processing.”

Here, an attribute related to content is, for example, the category to which the content belongs as determined from its subject matter, or an attribute of the users who access the content. For example, when content has the attribute “sports”, a user who accesses it is evaluated as being interested in “sports”, and “sports” is set as an attribute of that user's interests.

In this attribute evaluation processing, morphological analysis is first performed on the above text. FIG. 2 shows an example of the result of the morphological analysis. As shown in FIG. 2, morphological analysis breaks the text down into morphemes and identifies the part of speech of each morpheme.

Next, the morphemes whose part of speech is general noun (corresponding to “名詞-一般”, that is, “noun-general”, in FIG. 2) are extracted. In the example of FIG. 2, the two general nouns “synthesizer” and “natural language processing” are extracted.

Although only general nouns are extracted here, sahen-connection nouns (verbal nouns that combine with the verb “suru”; “名詞-サ変接続” in FIG. 2) such as “graduation”, “composition”, and “major” may be extracted as well.
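The extraction step can be sketched in Python as a filter over (surface form, part-of-speech tag) pairs. This is only an illustration: the token list below is written by hand and stands in for the output of a morphological analyzer, and the function name `extract_nouns` and its `include_sahen` flag are assumptions, not part of the patent.

```python
# Hand-written (surface, part-of-speech) pairs standing in for analyzer output.
# The tags "名詞-一般" and "名詞-サ変接続" are the labels cited from FIG. 2;
# the token list itself is illustrative, not real analyzer output.
TOKENS = [
    ("卒業", "名詞-サ変接続"),
    ("僕", "名詞-代名詞-一般"),
    ("シンセサイザー", "名詞-一般"),
    ("作曲", "名詞-サ変接続"),
    ("専攻", "名詞-サ変接続"),
    ("自然言語処理学", "名詞-一般"),
]

GENERAL_NOUN = "名詞-一般"
SAHEN_NOUN = "名詞-サ変接続"


def extract_nouns(tokens, include_sahen=False):
    """Return surfaces tagged as general nouns (optionally also sahen nouns)."""
    wanted = {GENERAL_NOUN}
    if include_sahen:
        wanted.add(SAHEN_NOUN)
    # Exact tag comparison, so "名詞-代名詞-一般" is not mistaken for "名詞-一般".
    return [surface for surface, pos in tokens if pos in wanted]


general_only = extract_nouns(TOKENS)
with_sahen = extract_nouns(TOKENS, include_sahen=True)
```

With the default setting only the two general nouns remain; enabling `include_sahen` corresponds to the optional extension described above.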

一方、一般名詞に対応する基本属性要素および基本属性要素間の比率の情報を記憶したデータベースをあらかじめ準備しておく。このデータベースを以下ではデジタルシソーラスと呼ぶこととする。図３は、基本属性要素および基本属性要素間の比率の情報を記憶したデジタルシソーラスの一例を示す図である。 On the other hand, a database storing basic attribute elements corresponding to general nouns and information on ratios between basic attribute elements is prepared in advance. Hereinafter, this database is referred to as a digital thesaurus. FIG. 3 is a diagram illustrating an example of a digital thesaurus that stores information on basic attribute elements and ratios between basic attribute elements.

図３に示すように、このデジタルシソーラスは、一般名詞、基本属性要素および要素比率の情報を記憶している。一般名詞は、品詞が一般名詞に分類される単語である。基本属性要素は、一般名詞が属するカテゴリーである。 As shown in FIG. 3, the digital thesaurus stores information on general nouns, basic attribute elements, and element ratios. A general noun is a word whose part of speech is classified as a general noun. A basic attribute element is a category to which a general noun belongs.

For example, in FIG. 3, “synthesizer” belongs to the two basic attribute elements “music” and “computer”, and “natural language processing” belongs to the two basic attribute elements “computer” and “linguistics”.

The element ratio is the ratio assigned among the basic attribute elements. For example, in FIG. 3, the element ratio of “music” to “computer”, the two basic attribute elements of “synthesizer”, is set to 1:1, and the element ratio of “computer” to “linguistics”, the two basic attribute elements of “natural language processing”, is set to 3:2. The element ratio is used as a weight for the basic attribute elements.
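One way to picture the digital thesaurus is as a mapping from each general noun to its basic attribute elements and their element ratios. The sketch below is a hypothetical in-memory stand-in using the two entries described for FIG. 3; the names `DIGITAL_THESAURUS` and `lookup`, and the choice to return an empty mapping for unregistered nouns, are assumptions.

```python
# Hypothetical in-memory stand-in for the digital thesaurus of FIG. 3.
# Each general noun maps to {basic attribute element: element ratio}.
DIGITAL_THESAURUS = {
    "synthesizer": {"music": 1, "computer": 1},
    "natural language processing": {"computer": 3, "linguistics": 2},
}


def lookup(noun, thesaurus=DIGITAL_THESAURUS):
    """Return a copy of the element-ratio mapping, or {} if the noun is not registered."""
    return dict(thesaurus.get(noun, {}))


entry = lookup("natural language processing")
unknown = lookup("piano")  # not registered in this toy thesaurus
```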

この属性評価処理では、一般名詞がコンテンツから抽出された場合に、その一般名詞に対応する基本属性要素および要素比率の情報をデジタルシソーラスから検索する。そして、それらの情報に基づいて、図１に示すように、各基本属性要素に対応し、互いに直交する基底ベクトルを用いて一般名詞のベクトル空間（ヒルベルト空間）内での位置を表現する。 In this attribute evaluation process, when a general noun is extracted from the content, basic attribute elements and element ratio information corresponding to the general noun are searched from the digital thesaurus. Based on these pieces of information, as shown in FIG. 1, the positions of the general nouns in the vector space (Hilbert space) are expressed using base vectors that correspond to the basic attribute elements and are orthogonal to each other.

Specifically, a general noun is expressed as follows:

$$ n_i = \sum_j a_j \, |e_j\rangle \qquad \text{(Equation 1)} $$

$$ \sum_j a_j^2 = 1 \qquad \text{(Equation 2)} $$

Here, $n_i$ is a unit vector (magnitude 1) representing the general noun $i$; $|e_j\rangle$ is the orthonormal basis vector (magnitude 1) corresponding to the basic attribute element $j$; $a_j$ is the weight of $|e_j\rangle$, which corresponds to the basic attribute element $j$ and is calculated from the element ratio; and both sums run over all $j$.

For example, the basic attribute elements corresponding to the general noun “synthesizer” are “music” and “computer”, and their element ratio is 1:1, so the unit vector $n_1$ representing “synthesizer” is

$$ n_1 = \frac{1}{\sqrt{2}}\,|e_1\rangle + \frac{1}{\sqrt{2}}\,|e_2\rangle \qquad \text{(Equation 3)} $$

Likewise, the basic attribute elements corresponding to the general noun “natural language processing” are “computer” and “linguistics”, and their element ratio is 3:2, so the unit vector $n_2$ representing “natural language processing” is

$$ n_2 = \frac{3}{\sqrt{13}}\,|e_2\rangle + \frac{2}{\sqrt{13}}\,|e_3\rangle \qquad \text{(Equation 4)} $$
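The conversion from an element ratio to weights can be sketched as follows: divide each ratio value by the square root of the sum of squares, so that the squared weights sum to 1. The function name `to_unit_vector` and the dictionary representation of a vector are assumptions made for illustration.

```python
import math


def to_unit_vector(ratios):
    """Scale element ratios so that the squared weights sum to 1."""
    norm = math.sqrt(sum(r * r for r in ratios.values()))
    return {element: r / norm for element, r in ratios.items()}


n1 = to_unit_vector({"music": 1, "computer": 1})        # "synthesizer", ratio 1:1
n2 = to_unit_vector({"computer": 3, "linguistics": 2})  # "natural language processing", ratio 3:2
```

For the ratio 3:2 the divisor is the square root of 3² + 2² = 13, which is where the 3/√13 and 2/√13 weights come from.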

The attribute related to the content is then evaluated by the sum of the vectors corresponding to the general nouns, that is,

$$ p = \sum_i n_i \qquad \text{(Equation 5)} $$

Here, $p$ is the attribute vector representing the attribute related to the content, and the sum runs over all $i$.

For example, the attribute vector $p$ for content from which the general nouns “synthesizer” and “natural language processing” have been extracted is

$$ p = n_1 + n_2 = \frac{1}{\sqrt{2}}\,|e_1\rangle + \left(\frac{1}{\sqrt{2}} + \frac{3}{\sqrt{13}}\right)|e_2\rangle + \frac{2}{\sqrt{13}}\,|e_3\rangle \qquad \text{(Equation 6)} $$

Accordingly, the proportions in which the attribute related to the content is “music”, “computer”, and “linguistics” are evaluated as

$$ \frac{1}{\sqrt{2}} : \left(\frac{1}{\sqrt{2}} + \frac{3}{\sqrt{13}}\right) : \frac{2}{\sqrt{13}} \approx 25\% : 55\% : 20\% \qquad \text{(Equation 7)} $$

respectively.
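The summation of the noun vectors and the resulting proportions can be checked with a short Python sketch. It is a minimal illustration rather than the patent's implementation; the dictionary representation and the function names `add_vectors` and `proportions` are assumptions.

```python
import math

# Unit vectors for "synthesizer" (n1) and "natural language processing" (n2).
n1 = {"music": 1 / math.sqrt(2), "computer": 1 / math.sqrt(2)}
n2 = {"computer": 3 / math.sqrt(13), "linguistics": 2 / math.sqrt(13)}


def add_vectors(vectors):
    """Component-wise sum over the orthonormal basis (the attribute vector p)."""
    total = {}
    for vec in vectors:
        for element, coeff in vec.items():
            total[element] = total.get(element, 0.0) + coeff
    return total


def proportions(p):
    """Share of each coefficient in the sum of all coefficients."""
    s = sum(p.values())
    return {element: coeff / s for element, coeff in p.items()}


p = add_vectors([n1, n2])
shares = proportions(p)
rounded = {element: round(100 * share) for element, share in shares.items()}
```

Rounded to whole percentages, the shares come out as 25, 55, and 20 for music, computer, and linguistics.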

ここで、上記割合はコンテンツに係る属性を選択する場合の優先度と考えることができる。すなわち、コンテンツに係る属性を１つに決定する必要がある場合には「コンピュータ」を選択し、コンテンツに係る属性を２つ決定する場合には「コンピュータ」および「音楽」を選択すればよい。 Here, the ratio can be considered as a priority in selecting an attribute related to content. That is, if it is necessary to determine one attribute related to the content, “computer” is selected, and if two attributes related to the content are determined, “computer” and “music” may be selected.
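Treating the proportions as priorities amounts to sorting them and taking the first k entries. A small sketch, in which the function name `top_attributes` is hypothetical and the shares are the rounded values from the example:

```python
def top_attributes(shares, k):
    """Return the k attributes with the largest shares, highest first."""
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    return [element for element, _ in ranked[:k]]


shares = {"music": 0.25, "computer": 0.55, "linguistics": 0.20}
best_one = top_attributes(shares, 1)
best_two = top_attributes(shares, 2)
```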

Note that, although the ratios of the coefficients of the basis vectors $|e_j\rangle$ are simply compared here, as shown in Equation 7, the attribute vector of Equation 5 may instead be normalized, and the ratios of the squares of the coefficients of the basis vectors $|e_j\rangle$ in the normalized attribute vector may be compared.

For example, normalizing the attribute vector $p$ given by Equation 6 yields

$$ p = 0.3967\,|e_1\rangle + 0.8636\,|e_2\rangle + 0.3112\,|e_3\rangle $$

In this case, the proportions in which the attribute related to the content is “music”, “computer”, and “linguistics” are evaluated as

$$ 0.3967^2 : 0.8636^2 : 0.3112^2 \approx 16\% : 74\% : 10\% $$

respectively.
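The normalized variant can be sketched in the same way: divide the attribute vector by its length and compare the squared coefficients, which sum to 1 by construction. The function name `normalize` is an assumption.

```python
import math

# Attribute vector p for "synthesizer" + "natural language processing".
p = {
    "music": 1 / math.sqrt(2),
    "computer": 1 / math.sqrt(2) + 3 / math.sqrt(13),
    "linguistics": 2 / math.sqrt(13),
}


def normalize(vec):
    """Divide every coefficient by the Euclidean length of the vector."""
    length = math.sqrt(sum(c * c for c in vec.values()))
    return {element: c / length for element, c in vec.items()}


p_hat = normalize(p)
squared_shares = {element: c * c for element, c in p_hat.items()}
```

Because squaring emphasizes the largest coefficient, the dominant attribute stands out more strongly than in the simple ratio of coefficients.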

また、コンテンツに係る属性を評価する属性ベクトルの大きさを正規化しておくと、他のコンテンツの属性ベクトルとの間で基底ベクトルに対応する係数の比較ができるようになるという利点も生じる。 Further, if the size of the attribute vector for evaluating the attribute related to the content is normalized, there is an advantage that the coefficient corresponding to the base vector can be compared with the attribute vector of other content.

Next, the reason for representing the vectors of general nouns with mutually orthogonal basis vectors will be described. FIG. 4 is a diagram explaining attribute evaluation for content when basis vectors that are not mutually orthogonal are used. Consider, for example, evaluating the attribute related to content consisting of the text “Thinking about quantized ciphers while listening to music”.

Suppose that the morphological analysis extracts the general nouns “music” and “quantized cipher”; that the digital thesaurus search returns “music” as the basic attribute element corresponding to “music”, and “physics” and “computer” as the basic attribute elements corresponding to “quantized cipher”; and that the element ratios corresponding to these basic attribute elements are also retrieved.

図４には、「音楽」および「量子化暗号」という一般名詞に対応し、基本属性要素および要素比率に基づいて算出されたベクトルｎ₃，ｎ₄が示されている。ただし、このベクトルの基底 |ｅ₄> 〜 |ｅ₇> は互いに直交するものではない。 FIG. 4 shows vectors n ₃ and n ₄ calculated on the basis of basic attribute elements and element ratios, corresponding to the general nouns “music” and “quantized cipher”. However, the bases | e ₄ > to | e ₇ > of the vectors are not orthogonal to each other.

In such a case, if the sum of the vectors n₃ and n₄ is simply calculated based on Equation 5, the resulting attribute vector q may no longer have a “music” component, which can lead to the erroneous determination that the content is unrelated to “music”. To prevent this, mutually orthogonal vectors are used here as the basis vectors for the vectors representing general nouns.
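The role of orthogonality can be made explicit with a short calculation. This is a sketch in the bra-ket notation used above, assuming unit-length basis vectors; the symbols $c_k$ (the coefficient of $|e_k\rangle$ in a summed vector) and $\delta_{jk}$ (the Kronecker delta) are introduced here only for illustration.

```latex
p = \sum_k c_k\,|e_k\rangle
\quad\Longrightarrow\quad
\langle e_j | p \rangle = \sum_k c_k\,\langle e_j | e_k \rangle

% Orthonormal basis: \langle e_j | e_k \rangle = \delta_{jk}
\langle e_j | p \rangle = c_j

% Non-orthogonal unit basis: cross terms remain
\langle e_j | p \rangle = c_j + \sum_{k \neq j} c_k\,\langle e_j | e_k \rangle
```

With an orthonormal basis, the component along each basic attribute element depends only on the nouns that carry that element. Without orthogonality, the cross terms mix in contributions from other elements, so a component such as “music” may be offset or masked in the sum.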

As described above, in the attribute evaluation processing according to the present invention, information is generated on vectors that take as their basis the mutually orthogonal vectors associated with the basic attribute elements and that represent the positions, in the vector space, of the general nouns extracted from the content. The attribute related to the content is then evaluated on the basis of the generated information.

Because any basic attribute element can be associated with a basis vector, even when an Internet advertiser wishes to add new attributes expressed in sensory terms suited to everyday scenes, such as “cozy”, “sharp”, “heartwarming”, or “warm”, this can be handled flexibly by updating the digital thesaurus.

In addition, because the attribute related to the content is evaluated using mutually orthogonal vectors, the attribute can be evaluated accurately. Furthermore, because the evaluation is carried out by vector operations, it can be performed efficiently.

また、上述したように、基本属性要素がＮ個ある場合には、一般名詞は、基本属性要素に対応する正規直交ベクトルを基底とするＮ次元のベクトルで表される。そのため、デジタルシソーラスに一般名詞に対する基本属性要素が追加された場合でも、単にベクトルの次元を増加させることにより、基本属性要素の追加に動的に対応することができる。 As described above, when there are N basic attribute elements, the general noun is represented by an N-dimensional vector based on an orthonormal vector corresponding to the basic attribute element. Therefore, even when a basic attribute element for a general noun is added to the digital thesaurus, it is possible to dynamically respond to the addition of the basic attribute element by simply increasing the dimension of the vector.
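One way to see why adding a basic attribute element only increases the dimension is to store each noun vector sparsely and expand it against the current list of basis elements. The sketch below is an assumption about representation, not something the patent specifies; `basis` and `to_dense` are hypothetical names.

```python
import math

# Hypothetical registry of basic attribute elements (one basis vector each).
basis = ["music", "computer", "linguistics"]


def to_dense(sparse_vec, basis):
    """Expand a {element: coefficient} vector into a list ordered by `basis`."""
    return [sparse_vec.get(element, 0.0) for element in basis]


n1 = {"music": 1 / math.sqrt(2), "computer": 1 / math.sqrt(2)}
before = to_dense(n1, basis)

# Adding a new basic attribute element appends one dimension; vectors that
# do not use it simply get a zero coefficient there.
basis.append("physics")
after = to_dense(n1, basis)
```

Existing coefficients are unchanged, so vectors computed before the addition remain valid in the larger space.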

Next, the functional configuration of the attribute evaluation system according to the present embodiment will be described. FIG. 5 is a diagram showing the functional configuration of the attribute evaluation system 20 according to the present embodiment. The following describes the case of evaluating attributes related to the content of websites published via the Internet 30.

図５に示すように、この属性評価システム２０は、外部ウェブサーバ１０ａ〜１０ｃとインターネット３０を介して接続されている。外部ウェブサーバ１０ａ〜１０ｃは、属性評価システム２０の外部でウェブサイトの閲覧サービスを提供しているサーバである。この外部ウェブサーバ１０ａ〜１０ｃは、ウェブサイトを構築するＨＴＭＬ（Hyper Text Markup Language）データおよびウェブサイトに対するユーザのアクセス履歴であるアクセスログを記憶している。 As shown in FIG. 5, the attribute evaluation system 20 is connected to external web servers 10 a to 10 c via the Internet 30. The external web servers 10 a to 10 c are servers that provide website browsing services outside the attribute evaluation system 20. The external web servers 10a to 10c store HTML (Hyper Text Markup Language) data for constructing a website and an access log that is a user's access history to the website.

属性評価システム２０は、ウェブサイトにおけるコンテンツの閲覧サービスを提供するとともに、閲覧サービスを提供するコンテンツおよび当該コンテンツにアクセスするユーザの属性を評価するシステムである。 The attribute evaluation system 20 is a system that provides a content browsing service on a website and evaluates the content of the browsing service and the attributes of a user who accesses the content.

この属性評価システム２０は、ウェブサーバ４０ａ〜４０ｃ、情報収集サーバ５０、属性評価サーバ６０および広告サーバ７０がＬＡＮ（Local Area Network）８０を介して接続された構成となっている。また、ウェブサーバ４０ａ〜４０ｃ、情報収集サーバ５０および広告サーバ７０は、外部ウェブサーバ１０ａ〜１０ｃとインターネット３０を介して接続されている。 The attribute evaluation system 20 has a configuration in which web servers 40a to 40c, an information collection server 50, an attribute evaluation server 60, and an advertisement server 70 are connected via a LAN (Local Area Network) 80. The web servers 40a to 40c, the information collection server 50, and the advertisement server 70 are connected to the external web servers 10a to 10c via the Internet 30.

ウェブサーバ４０ａ〜４０ｃは、ウェブサイト内のコンテンツの閲覧サービスを提供するサーバである。このウェブサーバ４０ａ〜４０ｃは、ウェブサイトを構築するＨＴＭＬデータおよびウェブサイトに対するユーザのアクセス履歴であるアクセスログを記憶している。 The web servers 40a to 40c are servers that provide a browsing service for content in a website. The web servers 40a to 40c store HTML data for constructing a website and an access log that is a user's access history to the website.

The information collection server 50 is a server that accesses other servers and collects information such as the links set between items of website content, the metadata of content within websites, and the content itself. The information collection server 50 includes a data transmission/reception unit 500, an information collection unit 501, a storage unit 502, and a control unit 503.

データ送受信部５００は、他のサーバとの間でインターネット３０またはＬＡＮ８０を介してさまざまなデータの授受をおこなうネットワークインターフェースである。情報収集部５０１は、インターネット３０に接続された外部ウェブサーバ１０ａ〜１０ｃやウェブサーバ４０ａ〜４０ｃにアクセスし、ウェブサイトのコンテンツ間に設定されたリンクの情報や、ウェブサイト内のコンテンツのメタデータの情報、ウェブサイト内のコンテンツの情報などを収集して、それらの情報を記憶部５０２に記憶する。 The data transmission / reception unit 500 is a network interface that exchanges various data with other servers via the Internet 30 or the LAN 80. The information collection unit 501 accesses the external web servers 10a to 10c and the web servers 40a to 40c connected to the Internet 30, and information on links set between the contents of the website and the metadata of the contents in the website. Information, information on contents in the website, and the like are collected and stored in the storage unit 502.

記憶部５０２は、ハードディスク装置などの記憶デバイスである。この記憶部５０２は、リンク情報５０２ａ、メタデータ情報５０２ｂ、コンテンツ情報５０２ｃを記憶している。 The storage unit 502 is a storage device such as a hard disk device. The storage unit 502 stores link information 502a, metadata information 502b, and content information 502c.

リンク情報５０２ａは、コンテンツ間に設定されたリンクの情報を記憶したものである。このリンクは、ハイパーリンクまたはトラックバックにより設定されたものである。メタデータ情報５０２ｂは、ウェブサイト内のコンテンツに係る情報を記述したメタデータの情報を記憶したものである。コンテンツ情報５０２ｃは、ウェブサイト内のテキストや画像データなどのコンテンツの情報を記憶したものである。 The link information 502a stores information on links set between contents. This link is set by a hyperlink or trackback. The metadata information 502b stores metadata information describing information related to contents in the website. The content information 502c stores content information such as text and image data in the website.

制御部５０３は、情報収集サーバ５０を全体制御する制御部であり、各機能部間のデータの授受などを司る。 The control unit 503 is a control unit that controls the information collection server 50 as a whole, and controls data exchange between the functional units.

属性評価サーバ６０は、情報収集サーバ５０により収集された情報を取得し、評価対象となるコンテンツの属性および当該コンテンツにアクセスするユーザの属性を評価する処理をおこなう。 The attribute evaluation server 60 acquires the information collected by the information collection server 50 and performs processing for evaluating the attribute of the content to be evaluated and the attribute of the user accessing the content.

この属性評価サーバ６０は、データ送受信部６００、記憶部６０１、ベクトル情報生成部６０２、属性評価部６０３および制御部６０４を有する。データ送受信部６００は、他のサーバとの間でＬＡＮ８０を介してさまざまなデータの授受をおこなうネットワークインターフェースである。 The attribute evaluation server 60 includes a data transmission / reception unit 600, a storage unit 601, a vector information generation unit 602, an attribute evaluation unit 603, and a control unit 604. The data transmission / reception unit 600 is a network interface that exchanges various data with other servers via the LAN 80.

記憶部６０１は、ハードディスク装置などの記憶デバイスである。この記憶部６０１は、デジタルシソーラスデータ６０１ａおよび属性評価情報６０１ｂを記憶している。 The storage unit 601 is a storage device such as a hard disk device. The storage unit 601 stores digital thesaurus data 601a and attribute evaluation information 601b.

デジタルシソーラスデータ６０１ａは、図３で説明したデジタルシソーラスに対応するものである。このデジタルシソーラスデータ６０１ａは、一般名詞、基本属性要素および要素比率の情報を記憶している。属性評価情報６０１ｂは、評価されたコンテンツの属性および当該コンテンツにアクセスするユーザの属性に係る情報を記憶したものである。 The digital thesaurus data 601a corresponds to the digital thesaurus described with reference to FIG. The digital thesaurus data 601a stores information on general nouns, basic attribute elements, and element ratios. The attribute evaluation information 601b stores information regarding the attribute of the evaluated content and the attribute of the user who accesses the content.

Here, only one combination of general nouns, basic attribute elements, and element ratios is stored in the digital thesaurus data 601a. However, in order to classify content and the users who access it into the attributes desired by an Internet advertiser, a plurality of different combinations of general nouns, basic attribute elements, and element ratios may be stored in the digital thesaurus data 601a, and the combination to be used may be switched according to the advertiser.

The combination to be used may be set in advance for each advertiser, or may be set by having the vector information generation unit 602 accept a combination specified by the advertiser from a terminal device (not shown) connected to the Internet 30.

This makes it possible to respond flexibly even when one advertiser wants to use directly expressed elements such as “sports” or “news” as basic attribute elements while another wants to use elements expressed in sensory terms, such as “cozy”, “sharp”, “heartwarming”, or “warm”.
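Switching between combinations per advertiser could look like the following sketch. Everything in it is hypothetical: the set names, the advertiser IDs, the “sensory” entries and their ratios, and the function `select_thesaurus` are invented for illustration and do not come from the patent.

```python
# Hypothetical thesaurus sets; names, entries, and ratios are invented for illustration.
THESAURUS_SETS = {
    "direct": {"synthesizer": {"music": 1, "computer": 1}},
    "sensory": {"synthesizer": {"sharp": 2, "warm": 1}},
}

# Hypothetical preset mapping from advertiser ID to the combination to use.
ADVERTISER_PRESET = {"advertiser_a": "direct", "advertiser_b": "sensory"}


def select_thesaurus(advertiser_id, requested=None):
    """Use the combination the advertiser specifies, otherwise the preset one."""
    name = requested if requested is not None else ADVERTISER_PRESET[advertiser_id]
    return THESAURUS_SETS[name]


preset = select_thesaurus("advertiser_b")
override = select_thesaurus("advertiser_b", requested="direct")
```

The `requested` argument models a combination specified from the advertiser's terminal, and the preset mapping models a combination configured in advance.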

図５の説明に戻ると、ベクトル情報生成部６０２は、コンテンツに対して形態素解析を実行し、コンテンツから一般名詞を抽出する。そして、ベクトル情報生成部６０２は、抽出された一般名詞を互いに直交する基底ベクトルを用いて表現し、ベクトル空間内の位置を算出する処理をおこなう。具体的には、ベクトル情報生成部６０２は、図１において説明したように、デジタルシソーラスデータ６０１ａを参照し、一般名詞を式１および式２を用いて表現する。 Returning to the description of FIG. 5, the vector information generation unit 602 performs morphological analysis on the content and extracts general nouns from the content. Then, the vector information generation unit 602 performs processing for expressing the extracted general noun using base vectors orthogonal to each other and calculating a position in the vector space. Specifically, the vector information generation unit 602 refers to the digital thesaurus data 601a and expresses general nouns using Equations 1 and 2 as described in FIG.

属性評価部６０３は、コンテンツの属性および当該コンテンツにアクセスするユーザの属性を、互いに直交する基底ベクトルを用いて表現された一般名詞のベクトル空間内での位置の情報に基づいて評価する処理をおこなう。 The attribute evaluation unit 603 performs a process of evaluating the attribute of the content and the attribute of the user who accesses the content based on the position information in the vector space of the general nouns expressed using the mutually orthogonal basis vectors. .

Specifically, as described with reference to FIG. 1, the attribute evaluation unit 603 uses Equation 5 to generate the attribute vector from the sum of the vectors representing the general nouns, and evaluates the attribute by examining the coefficient of each basis vector in the attribute vector as in Equation 7.

広告サーバ７０は、属性評価サーバ６０により評価されたコンテンツの属性および当該コンテンツにアクセスするユーザの属性に基づいて、各コンテンツおよびユーザに適した広告を配信するサーバである。この広告サーバ７０は、データ送受信部７００、記憶部７０１、広告配信処理部７０２および制御部７０３を有する。 The advertisement server 70 is a server that distributes an advertisement suitable for each content and user based on the attribute of the content evaluated by the attribute evaluation server 60 and the attribute of the user who accesses the content. The advertisement server 70 includes a data transmission / reception unit 700, a storage unit 701, an advertisement distribution processing unit 702, and a control unit 703.

データ送受信部７００は、他の装置との間でインターネット３０またはＬＡＮ８０を介してさまざまなデータの授受をおこなうネットワークインターフェースである。記憶部７０１は、ハードディスク装置などの記憶デバイスである。この記憶部７０１は、広告データ７０１ａおよび配信条件データ７０１ｂを記憶している。 The data transmission / reception unit 700 is a network interface that exchanges various data with other devices via the Internet 30 or the LAN 80. The storage unit 701 is a storage device such as a hard disk device. The storage unit 701 stores advertisement data 701a and distribution condition data 701b.

広告データ７０１ａは、外部ウェブサーバ１０ａ〜１０ｃまたはウェブサーバ４０ａ〜４０ｃが閲覧サービスを提供しているコンテンツに配信するインターネット広告のデータを記憶したものである。配信条件データ７０１ｂは、インターネット広告を配信するコンテンツのＵＲＩ（Uniform Resource Identifier）や配信期間などのインターネット広告の配信条件を記憶したデータである。 The advertisement data 701a stores data of Internet advertisements that are distributed to content provided by the external web servers 10a to 10c or the web servers 40a to 40c. The distribution condition data 701b is data storing distribution conditions for Internet advertisements such as a URI (Uniform Resource Identifier) of contents for distributing Internet advertisements and a distribution period.

広告配信処理部７０２は、記憶部７０１に記憶された配信条件データ７０１ｂに基づいて、コンテンツに広告データ７０１ａに記憶されたインターネット広告を配信する処理をおこなう。制御部７０３は、広告サーバ７０を全体制御する制御部であり、各機能部間のデータの授受などを司る。 The advertisement distribution processing unit 702 performs processing for distributing the Internet advertisement stored in the advertisement data 701a to the content based on the distribution condition data 701b stored in the storage unit 701. The control unit 703 is a control unit that controls the advertisement server 70 as a whole, and controls data exchange between the functional units.

つぎに、本実施例に係る属性評価処理の処理手順について説明する。図６は、本実施例に係る属性評価処理の処理手順を示すフローチャートである。 Next, a processing procedure of attribute evaluation processing according to the present embodiment will be described. FIG. 6 is a flowchart illustrating a processing procedure of attribute evaluation processing according to the present embodiment.

図６に示すように、まず、属性評価サーバ６０のベクトル情報生成部６０２は、属性を評価する対象であるコンテンツの情報を情報収集サーバ５０の記憶部５０２に記憶されたコンテンツ情報５０２ｃから取得する（ステップＳ１０１）。 As shown in FIG. 6, first, the vector information generation unit 602 of the attribute evaluation server 60 acquires information on the content that is the target of attribute evaluation from the content information 502 c stored in the storage unit 502 of the information collection server 50. (Step S101).

そして、ベクトル情報生成部６０２は、図２を用いて説明したように、コンテンツの情報に対する形態素解析を実行し（ステップＳ１０２）、一般名詞を抽出する（ステップＳ１０３）。 Then, as described with reference to FIG. 2, the vector information generation unit 602 performs morphological analysis on the content information (step S102), and extracts general nouns (step S103).

続いて、ベクトル情報生成部６０２は、記憶部６０１に記憶されたデジタルシソーラスデータ６０１ａを参照することにより、抽出した一般名詞に対応する基本属性要素および要素比率を検索し（ステップＳ１０４）、式１および式２を用いて、一般名詞をベクトルに変換する（ステップＳ１０５）。 Subsequently, the vector information generation unit 602 refers to the digital thesaurus data 601a stored in the storage unit 601 to search for basic attribute elements and element ratios corresponding to the extracted general nouns (step S104). And the general noun is converted into a vector by using Equation 2 (step S105).

そして、属性評価部６０３は、式５を用いて、各一般名詞に対応するベクトルの和を算出し（ステップＳ１０６）、式７で説明したようにして、ベクトルの和における各基底ベクトルの係数からコンテンツに係る属性を評価し（ステップＳ１０７）、評価したコンテンツに係る属性を属性評価情報６０１ｂとして記憶部６０１に記憶し（ステップＳ１０８）、この属性評価処理を終了する。 Then, the attribute evaluation unit 603 calculates the sum of the vectors corresponding to the respective general nouns using Equation 5 (step S106), and uses the coefficient of each base vector in the vector sum as described in Equation 7. The attribute related to the content is evaluated (step S107), the attribute related to the evaluated content is stored in the storage unit 601 as the attribute evaluation information 601b (step S108), and this attribute evaluation process ends.
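Steps S104 to S107 of the flowchart can be condensed into one function, sketched below under stated assumptions: the thesaurus is a plain dictionary standing in for the digital thesaurus data 601a, nouns that are not registered are skipped (the patent does not say how they are handled), and the function name `evaluate_attributes` is hypothetical. Steps S101 to S103 are assumed to have already produced the list of general nouns, and step S108 (storing the result) is omitted.

```python
import math

THESAURUS = {  # hypothetical stand-in for digital thesaurus data 601a
    "synthesizer": {"music": 1, "computer": 1},
    "natural language processing": {"computer": 3, "linguistics": 2},
}


def evaluate_attributes(general_nouns, thesaurus):
    """Look up, vectorize, sum, and rank basic attribute elements (S104 to S107)."""
    p = {}
    for noun in general_nouns:
        ratios = thesaurus.get(noun)           # S104: thesaurus lookup
        if not ratios:
            continue                           # assumption: skip unregistered nouns
        norm = math.sqrt(sum(r * r for r in ratios.values()))
        for element, r in ratios.items():      # S105 and S106: unit vector, accumulate
            p[element] = p.get(element, 0.0) + r / norm
    total = sum(p.values())
    shares = {e: c / total for e, c in p.items()} if total else {}  # S107
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)


ranking = evaluate_attributes(["synthesizer", "natural language processing"], THESAURUS)
```

The result is a list of (basic attribute element, share) pairs in descending order, which is the form needed to pick one or several attributes by priority.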

In the above embodiment, general nouns are extracted from the content and converted into vectors based on the basic attribute elements in order to evaluate the attribute related to the content. Instead of extracting general nouns from the content, however, the words converted into vectors may be search words entered by a user when searching for the content with a search engine, or words on which a hyperlink to the content is set and which a user has clicked with a mouse or the like.

上述してきたように、本実施例では、属性評価サーバ６０のベクトル情報生成部６０２が、各基本属性要素に対応付けられた互いに直交するベクトルを基底とし、かつ、一般名詞のベクトル空間内での位置を表すベクトルに係る情報を生成し、属性評価部６０３が、ベクトル情報生成部６０２により生成された情報に基づいて、コンテンツに係る属性を評価することとしたので、柔軟に、効率的に、正確に、また、動的にコンテンツに係る属性を評価することができる。 As described above, in this embodiment, the vector information generation unit 602 of the attribute evaluation server 60 uses the vectors orthogonal to each other associated with each basic attribute element as a basis, and in the vector space of general nouns. Since the information related to the vector representing the position is generated and the attribute evaluation unit 603 evaluates the attribute related to the content based on the information generated by the vector information generation unit 602, it is flexible and efficient. It is possible to accurately and dynamically evaluate the attributes relating to the content.

また、本実施例では、ベクトル情報生成部６０２が、一般名詞とコンテンツに係る属性の候補の情報とを対応付けて記憶したデジタルシソーラスデータ６０１ａから、一般名詞を検索キーとして基本属性要素の情報を検索し、属性評価部６０３が、検索の結果得られた基本属性要素の情報を基にして、一般名詞のベクトル空間内での位置を表すベクトルに係る情報を生成することとしたので、デジタルシソーラスデータ６０１ａに記憶された基本属性要素の情報を読み出すことにより、効率的にベクトル情報を生成することができる。 Further, in this embodiment, the vector information generation unit 602 uses the general noun as a search key to retrieve basic attribute element information from the digital thesaurus data 601a in which the general noun and the attribute candidate information related to the content are stored in association with each other. Since the search is performed and the attribute evaluation unit 603 generates information related to the vector representing the position of the general noun in the vector space based on the information of the basic attribute element obtained as a result of the search, the digital thesaurus By reading the basic attribute element information stored in the data 601a, vector information can be generated efficiently.

また、本実施例では、ベクトル情報生成部６０２が、一般名詞と基本属性要素との間の異なる組み合わせを複数記憶したデジタルシソーラスデータ６０1ａから、指定された組み合わせにおける基本属性要素の情報を検索し、検索の結果得られた基本属性要素の情報を基にして、ベクトルに係る情報を生成することとしたので、コンテンツに係る属性を柔軟に評価することができる。 In this embodiment, the vector information generation unit 602 searches the digital thesaurus data 601a that stores a plurality of different combinations between general nouns and basic attribute elements, and searches for information on the basic attribute elements in the specified combination. Since the information related to the vector is generated based on the basic attribute element information obtained as a result of the search, the attribute related to the content can be flexibly evaluated.

また、本実施例では、ベクトル情報生成部６０２が、基本属性要素の情報および基本属性要素の要素比率に係る情報を記憶したデジタルシソーラスデータ６０１ａから基本属性要素の情報および要素比率に係る情報を読み出し、属性評価部６０３が、読み出した情報に基づいて一般名詞のベクトル空間内での位置を表すベクトルに係る情報を生成することとしたので、基本属性要素の要素比率を考慮してコンテンツに係る情報の評価をおこなうことにより、より正確にコンテンツに係る属性を評価することができる。 In this embodiment, the vector information generation unit 602 reads basic attribute element information and element ratio information from digital thesaurus data 601a storing basic attribute element information and basic attribute element ratio information. Since the attribute evaluation unit 603 generates information related to the vector representing the position of the general noun in the vector space based on the read information, the information related to the content in consideration of the element ratio of the basic attribute elements By performing the evaluation, it is possible to evaluate the attribute related to the content more accurately.

また、本実施例では、属性評価部６０３が、所定のコンテンツに対してコンテンツに係る複数の属性と、各属性の優先度とを評価することとしたので、コンテンツに係る属性を任意の精度で評価することができる。 In this embodiment, since the attribute evaluation unit 603 evaluates a plurality of attributes related to the content and the priority of each attribute with respect to the predetermined content, the attribute related to the content is determined with arbitrary accuracy. Can be evaluated.

また、本実施例では、ベクトル情報生成部６０２が、コンテンツに係る属性の候補の数が増加した場合に、当該候補の数の増加に応じて次元が増加したベクトル空間における一般名詞の位置を表すベクトルに係る情報を生成することとしたので、コンテンツに係る属性の候補の数の増加に動的に対応することができる。 In this embodiment, when the number of attribute candidates related to the content increases, the vector information generation unit 602 represents the position of the general noun in the vector space whose dimension increases in accordance with the increase in the number of candidates. Since the information related to the vector is generated, it is possible to dynamically cope with an increase in the number of attribute candidates related to the content.

また、本実施例では、一般名詞は、コンテンツの内容に係る情報を含んだメタデータまたは当該コンテンツから抽出されたものであることとしたので、メタデータまたはコンテンツから抽出された一般名詞を基にして、柔軟に、効率的に、正確に、また、動的にコンテンツに係る属性を評価することができる。 In this embodiment, the general noun is metadata including information related to the content or extracted from the content. Therefore, the general noun is based on the general noun extracted from the metadata or content. Thus, it is possible to evaluate the attribute relating to the content flexibly, efficiently, accurately and dynamically.

In this embodiment, the vector information generation unit 602 also generates information on vectors that take as their basis the mutually orthogonal vectors associated with the basic attribute elements and that represent the position, in the vector space, of a search word used to search for the content or of a word on which a hyperlink is set. The attribute related to the content can therefore be evaluated flexibly, efficiently, accurately, and dynamically on the basis of such search words or hyperlinked words.

(Modification 1 of the Embodiment)
In the above embodiment, the attribute related to the content is evaluated on the basis of the basic attribute elements and element ratios corresponding to the general nouns. The appearance frequency of each general noun may additionally be taken into account to evaluate the attribute more accurately. Modification 1 of the embodiment therefore describes the case in which the appearance frequency of general nouns is also considered.

The example here evaluates the attributes of content consisting of a first text, "With graduation approaching, I like composing music on a synthesizer, and my major is natural language processing.", which was used in the embodiment above, and a second text, "Natural language processing is fun."

FIG. 7 is a diagram illustrating an example of the result of morphological analysis in Modification 1. FIG. 7 shows the result of applying morphological analysis to the second text. The result of morphological analysis of the first text is the same as that shown in FIG. 2.

As shown in FIG. 7, one general noun, "natural language processing", is extracted from the second text. The general nouns extracted from the first and second texts are therefore "synthesizer" and "natural language processing", with appearance frequencies of one and two, respectively.

In the attribute evaluation process of Modification 1, when the attributes related to the content are evaluated, the appearance frequency of each general noun is set as the weight of the vector corresponding to that noun, and the weighted sum of the vectors is calculated.

That is, in the attribute evaluation process of Modification 1, the sum of the vectors is calculated using
$$p = \sum_i w_i n_i \qquad \text{(Formula 8)}$$
instead of Formula 5. Here, $w_i$ is the appearance frequency of the general noun corresponding to the vector $n_i$, and the sum runs over all $i$.

For example, for content from which the general nouns "synthesizer" and "natural language processing" are extracted, with "synthesizer" appearing once and "natural language processing" appearing twice, the attribute vector $p$ is
$$\begin{aligned}
p &= n_1 + 2 n_2 \\
  &= \frac{1}{\sqrt{2}}\,|e_1\rangle + \frac{1}{\sqrt{2}}\,|e_2\rangle + 2\left(\frac{3}{\sqrt{13}}\,|e_2\rangle + \frac{2}{\sqrt{13}}\,|e_3\rangle\right) \\
  &= \frac{1}{\sqrt{2}}\,|e_1\rangle + \left(\frac{1}{\sqrt{2}} + \frac{6}{\sqrt{13}}\right)|e_2\rangle + \frac{4}{\sqrt{13}}\,|e_3\rangle
\end{aligned} \qquad \text{(Formula 9)}$$

As a result, the proportions in which the attributes related to the content are "music", "computer", and "linguistics" are, respectively,
$$\frac{1}{\sqrt{2}} : \left(\frac{1}{\sqrt{2}} + \frac{6}{\sqrt{13}}\right) : \frac{4}{\sqrt{13}} \approx 17\% : 57\% : 26\% \qquad \text{(Formula 10)}$$

Therefore, when a single attribute must be chosen for the content, "computer" is selected; when two attributes are to be chosen, "computer" and "linguistics" are selected.
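The worked example of Formulas 8 to 10 can be reproduced with the following Python sketch. It is an illustration under stated assumptions rather than the apparatus itself: the function names `attribute_vector`, `attribute_shares`, and `top_attributes` are hypothetical, the noun vectors are copied from the coefficients in Formula 9, and the appearance frequencies are supplied directly instead of being obtained by morphological analysis.

```python
import math

# Orthogonal basis |e1>, |e2>, |e3> and the unit vectors of each general noun
# (coefficients taken from Formula 9).
BASIS = ("music", "computer", "linguistics")
NOUN_VECTORS = {
    "synthesizer": (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0),
    "natural language processing": (0.0, 3 / math.sqrt(13), 2 / math.sqrt(13)),
}

def attribute_vector(frequencies):
    """Formula 8: p = sum_i w_i * n_i, where w_i is the appearance frequency."""
    p = [0.0] * len(BASIS)
    for noun, weight in frequencies.items():
        for k, coeff in enumerate(NOUN_VECTORS[noun]):
            p[k] += weight * coeff
    return p

def attribute_shares(p):
    """Formula 10: share of each attribute among the basis coefficients."""
    total = sum(p)
    return {name: coeff / total for name, coeff in zip(BASIS, p)}

def top_attributes(shares, k):
    """Pick the k attributes with the largest shares."""
    return sorted(shares, key=shares.get, reverse=True)[:k]

freqs = {"synthesizer": 1, "natural language processing": 2}
shares = attribute_shares(attribute_vector(freqs))
print({name: round(100 * s) for name, s in shares.items()})
print(top_attributes(shares, 1), top_attributes(shares, 2))
```

With the frequencies of the example, the rounded shares agree with Formula 10, and the top one and top two attributes are the ones selected in the text.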

The functional configuration of the attribute evaluation system according to Modification 1 is substantially the same as that shown in FIG. 5. In Modification 1, however, the vector information generation unit 602 performs morphological analysis on the content, extracts general nouns from the content, and stores the appearance frequency of each extracted general noun in the storage unit 601.

Then, as described with reference to FIG. 1, the vector information generation unit 602 expresses each extracted general noun using mutually orthogonal basis vectors and calculates the position of the general noun in the vector space.

The attribute evaluation unit 603 then uses Formula 8 to evaluate the attributes of the content and the attributes of users who access the content, based on the appearance frequency of each general noun and on the information on the position in the vector space of each general noun expressed using the mutually orthogonal basis vectors.

The processing procedure of the attribute evaluation process according to Modification 1 is substantially the same as that shown in FIG. 6. In Modification 1, however, in step S103 the vector information generation unit 602 extracts general nouns from the content and stores the appearance frequency of each extracted general noun in the storage unit 601.

In step S106, the attribute evaluation unit 603 calculates the sum of the vectors corresponding to the general nouns using Formula 8 instead of Formula 5. In step S107, the attribute evaluation unit 603 evaluates the attributes related to the content from the coefficient of each basis vector in the vector sum, as described for Formula 10.

As described above, in Modification 1 of the embodiment, the attribute evaluation unit 603 sets the weight of the information on each vector generated by the vector information generation unit 602 based on the appearance frequency of the corresponding general noun, and evaluates the attributes related to the content based on the weights so set. Taking the appearance frequency of general nouns into account allows the attributes related to the content to be evaluated more accurately.

(Modification 2 of the Embodiment)
In the embodiment above and in Modification 1, the content is text. When the content is an image, the attributes related to the content may also be evaluated by analyzing the metadata set for that image. Modification 2 of the embodiment describes the case of evaluating the attributes of content that includes images together with text.

The example here evaluates the attributes of content consisting of the text used in Modification 1, "With graduation approaching, I like composing music on a synthesizer, and my major is natural language processing. Natural language processing is fun.", together with an image of a synthesizer and an image of a musical score.

Here, the metadata of the synthesizer image is assumed to include the general noun "synthesizer", and the metadata of the score image is assumed to include the general noun "music".

In this case, when general nouns are extracted from the text and the metadata by morphological analysis, the general nouns "synthesizer", "natural language processing", and "music" are extracted, with appearance frequencies of two, two, and one, respectively.

In this case, using Formula 8, the attribute vector $p$ representing the attributes related to the content is
$$\begin{aligned}
p &= 2 n_1 + 2 n_2 + n_3 \\
  &= 2\left(\frac{1}{\sqrt{2}}\,|e_1\rangle + \frac{1}{\sqrt{2}}\,|e_2\rangle\right) + 2\left(\frac{3}{\sqrt{13}}\,|e_2\rangle + \frac{2}{\sqrt{13}}\,|e_3\rangle\right) + |e_1\rangle \\
  &= \left(\sqrt{2} + 1\right)|e_1\rangle + \left(\sqrt{2} + \frac{6}{\sqrt{13}}\right)|e_2\rangle + \frac{4}{\sqrt{13}}\,|e_3\rangle
\end{aligned} \qquad \text{(Formula 11)}$$
Here, $n_3$ is the unit vector corresponding to the general noun "music". Because the basic attribute element corresponding to the general noun "music" is "music", $n_3$ is equal to $|e_1\rangle$.

As a result, the proportions in which the attributes related to the content are "music", "computer", and "linguistics" are, respectively,
$$\left(\sqrt{2} + 1\right) : \left(\sqrt{2} + \frac{6}{\sqrt{13}}\right) : \frac{4}{\sqrt{13}} \approx 37\% : 47\% : 16\% \qquad \text{(Formula 12)}$$

Therefore, when a single attribute must be chosen for the content, "computer" is selected; when two attributes are to be chosen, "music" and "computer" are selected.
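The combination of text and image metadata in Modification 2 can be sketched in Python as follows. This is a hedged illustration only: the noun lists stand in for the output of morphological analysis (no analyzer is invoked), the variable names are hypothetical, and the noun vectors are copied from the coefficients in Formula 11. The computed shares come out close to the proportions of Formula 12; the last share evaluates to roughly 16.8%.

```python
import math
from collections import Counter

BASIS = ("music", "computer", "linguistics")
NOUN_VECTORS = {
    "synthesizer": (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0),
    "natural language processing": (0.0, 3 / math.sqrt(13), 2 / math.sqrt(13)),
    "music": (1.0, 0.0, 0.0),  # n3 equals |e1>
}

# Assumed results of morphological analysis on the text and on the metadata
# of the two images (synthesizer image, score image).
text_nouns = ["synthesizer", "natural language processing",
              "natural language processing"]
image_metadata_nouns = ["synthesizer", "music"]

# Appearance frequencies are counted over text and metadata together.
frequencies = Counter(text_nouns) + Counter(image_metadata_nouns)

# Formula 8 applied to the merged frequencies.
p = [sum(w * NOUN_VECTORS[noun][k] for noun, w in frequencies.items())
     for k in range(len(BASIS))]
shares = {name: coeff / sum(p) for name, coeff in zip(BASIS, p)}
ranked = sorted(shares, key=shares.get, reverse=True)

print(dict(frequencies))
print({name: round(100 * s, 1) for name, s in shares.items()})
print(ranked)
```

Counting the metadata nouns in the same tally as the text nouns is what raises the weight of "synthesizer" to two and introduces "music", which moves "music" ahead of "linguistics" in the ranking.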

The functional configuration of the attribute evaluation system according to Modification 2 is substantially the same as that shown in FIG. 5. In Modification 2, however, the vector information generation unit 602 extracts general nouns by performing morphological analysis on the text included in the content and on the metadata of the images, and stores the appearance frequency of each extracted general noun in the storage unit 601.

Then, as described with reference to FIG. 1, the vector information generation unit 602 expresses each extracted general noun using mutually orthogonal basis vectors and calculates its position in the vector space.

The processing procedure of the attribute evaluation process according to Modification 2 is substantially the same as that shown in FIG. 6. In Modification 2, however, in step S103 the vector information generation unit 602 extracts general nouns from the text included in the content and from the metadata of the images, and stores the appearance frequency of each extracted general noun in the storage unit 601.

In step S106, the attribute evaluation unit 603 calculates the sum of the vectors corresponding to the general nouns using Formula 8 instead of Formula 5. In step S107, the attribute evaluation unit 603 evaluates the attributes of the content including the text and the images from the coefficient of each basis vector in the vector sum, as described for Formula 10.

As described above, in Modification 2, the attribute evaluation unit 603 evaluates the attributes related to the content based on the general nouns extracted from the text included in the content and from the metadata of the images. Additionally using the image metadata allows the attributes related to the content to be evaluated more accurately.

(Modification 3 of the Embodiment)
In the embodiment and modifications above, general nouns are extracted from the content whose attributes are to be evaluated, or from the metadata of that content, and the attributes are evaluated based on the extracted general nouns. Alternatively, general nouns may be extracted from second content that is associated with first content by a hyperlink or trackback, and the attributes of the first content may be evaluated from them.

Specifically, the attribute evaluation process described with reference to FIG. 1 is performed on the second content to evaluate the attributes related to the second content, and the attributes related to the first content are evaluated based on that result. Here, the second content is one or more pieces of content.

For example, when the attributes of the plural pieces of second content are evaluated as "music", "computer", and "linguistics", and the largest number of those pieces of content were evaluated as "computer", the attribute related to the first content may be evaluated as "computer".
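The majority rule in the example above can be sketched in Python as follows. The function name `infer_from_linked` and the sample list are hypothetical, and the handling of ties and of an empty list is an assumption, since the text does not specify either case.

```python
from collections import Counter

def infer_from_linked(linked_attributes):
    """Return the attribute assigned to the largest number of linked contents.

    linked_attributes holds one evaluated attribute per piece of second
    content. Ties fall to the attribute seen first (an assumption).
    """
    if not linked_attributes:
        return None  # no linked content, nothing to infer (an assumption)
    return Counter(linked_attributes).most_common(1)[0][0]

# Hypothetical evaluation results for five pieces of second content that are
# associated with the first content by hyperlink or trackback.
linked = ["computer", "music", "computer", "linguistics", "computer"]
print(infer_from_linked(linked))
```

Each entry of the list would itself be the result of running the attribute evaluation process on one piece of second content.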

The functional configuration of the attribute evaluation system according to Modification 3 is substantially the same as that shown in FIG. 5. In Modification 3, however, the vector information generation unit 602 performs morphological analysis on the second content, extracts general nouns from the second content, and stores the appearance frequency of each extracted general noun in the storage unit 601.

The attribute evaluation unit 603 then uses Formula 8 to evaluate the attributes related to the second content, based on the appearance frequency of each general noun and on the information on the position in the vector space of each general noun expressed using the mutually orthogonal basis vectors, and evaluates the attributes related to the first content based on the attributes related to the second content.

The processing procedure of the attribute evaluation process according to Modification 3 is substantially the same as that shown in FIG. 6. In Modification 3, however, the processing of steps S101 to S108 is executed on the second content, and the attributes related to the first content are then evaluated based on the attribute information evaluated for the second content.

As described above, in Modification 3 of the embodiment, the vector information generation unit 602 generates information on vectors that take as a basis the mutually orthogonal vectors associated with the attribute candidates of second content, which is associated by a hyperlink or trackback with the first content whose attributes are to be evaluated, and that represent the positions in the vector space of the general nouns related to the second content. The attribute evaluation unit 603 evaluates the attributes related to the first content based on the information generated by the vector information generation unit 602. The attributes of the first content can therefore be evaluated flexibly, efficiently, accurately, and dynamically from the second content associated with it by a hyperlink or trackback.

The foregoing has described the case where the attribute evaluation process is realized on a computer. The process may also be realized by recording a program for the attribute evaluation process on a computer-readable recording medium and having a computer read and execute the program recorded on that medium. FIG. 8 is a block diagram illustrating the hardware configuration of a computer 100 that realizes the attribute evaluation process.

As shown in FIG. 8, the computer 100 has a configuration in which the following are connected by a bus 180: a CPU 110 that executes the program, an input device 120 for entering data, a ROM 130 that stores various data, a RAM 140 that stores calculation parameters and the like, a reading device 150 that reads the program from a recording medium 200 on which the program for realizing the attribute evaluation process is recorded, an output device 160 such as a display, and a network interface 170 that exchanges data with other computers via a network 300.

The CPU 110 realizes the attribute evaluation process by reading the program recorded on the recording medium 200 via the reading device 150 and then executing the program. Examples of the recording medium 200 include an optical disk, a flexible disk, a CD-ROM, and a hard disk. The program may also be introduced into the computer 100 via the network 300.

Although embodiments of the present invention have been described so far, the present invention may be implemented in various other embodiments within the scope of the technical idea set forth in the claims.

Of the processes described in this embodiment, all or part of the processes described as being performed automatically may be performed manually, and all or part of the processes described as being performed manually may be performed automatically by known methods.

In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in this document and in the drawings may be changed as desired unless otherwise specified.

The components of each illustrated device are functional and conceptual, and need not be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to that illustrated; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.

Furthermore, all or any part of the processing functions performed in each device may be realized by a CPU and a program analyzed and executed by that CPU, or may be realized as hardware using wired logic.

As described above, the attribute evaluation apparatus, attribute evaluation method, and attribute evaluation program according to the present invention are useful for attribute evaluation systems that need to evaluate attributes related to content flexibly, efficiently, accurately, and dynamically.

FIG. 1 is a diagram explaining the concept of the attribute evaluation process according to the present invention.
FIG. 2 is a diagram showing an example of the result of morphological analysis.
FIG. 3 is a diagram showing an example of a digital thesaurus that stores information on basic attribute elements and the ratios between basic attribute elements.
FIG. 4 is a diagram explaining an attribute evaluation method for content when basis vectors that are not mutually orthogonal are used.
FIG. 5 is a diagram showing the functional configuration of the attribute evaluation system 20 according to the embodiment.
FIG. 6 is a flowchart showing the processing procedure of the attribute evaluation process according to the embodiment.
FIG. 7 is a diagram showing an example of the result of morphological analysis in Modification 1.
FIG. 8 is a block diagram showing the hardware configuration of the computer 100 that realizes the attribute evaluation process.

Explanation of symbols

10a to 10c External web server
20 Attribute evaluation system
30 Internet
40a to 40c Web server
50 Information collection server
500 Data transmission/reception unit
501 Information collection unit
502 Storage unit
502a Link information
502b Metadata information
502c Content information
503 Control unit
60 Attribute evaluation server
600 Data transmission/reception unit
601 Storage unit
601a Digital thesaurus data
601b Attribute evaluation information
602 Vector information generation unit
603 Attribute evaluation unit
604 Control unit
70 Advertisement server
700 Data transmission/reception unit
701 Storage unit
701a Advertisement data
701b Advertisement distribution data
702 Advertisement distribution processing unit
703 Control unit
80 LAN

Claims

An attribute evaluation apparatus that evaluates attributes related to content based on information related to content,
Vector information generating means for generating information related to a vector representing a position in a vector space of information related to the content based on mutually orthogonal vectors associated with each candidate of the attribute related to the content;
Attribute evaluation means for evaluating an attribute relating to the content based on information generated by the vector information generation means;
An attribute evaluation apparatus characterized by comprising:

The database further includes a database that stores information related to content and information about candidate attributes related to the content in association with each other, and the vector information generation unit uses the information related to the content as a search key to store information about candidate attributes related to the content. The attribute evaluation apparatus according to claim 1, wherein information relating to the vector is generated based on information on candidate attributes relating to the content obtained as a result of retrieval from the database.

The database stores a plurality of different combinations between information related to content and attribute candidates related to the content, and the vector information generation unit is configured to select candidate attributes related to the content in a specified combination among the plurality of combinations. The attribute evaluation apparatus according to claim 2, wherein information is searched, and information related to the vector is generated based on information on candidate attributes related to the content obtained as a result of the search.

The database further stores information relating to the weight of each candidate for the attribute relating to the content, and the vector information generating means reads out the information relating to the candidate for the attribute relating to the content and the information relating to the weight from the database, and reads the information. The attribute evaluation apparatus according to claim 2, wherein information related to the vector is generated based on the received information.

The attribute evaluation apparatus according to claim 1, wherein the attribute evaluation unit evaluates, for predetermined content, a plurality of attributes related to the content and a priority of each attribute.

The attribute evaluation apparatus according to any one of claims 1 to 5, wherein the attribute evaluation unit sets a weight of information related to the vector based on an appearance frequency of information related to the content, and evaluates an attribute related to the content based on the set weight.

The vector information generation unit generates information related to the vector in a vector space whose dimension increases in accordance with the increase in the number of candidates when the number of attribute candidates related to the content increases. The attribute evaluation apparatus as described in any one of Claims 1-6.

The attribute evaluation apparatus according to claim 1, wherein the information relating to the content is metadata including information relating to the content of the content or information extracted from the content.

The attribute evaluation apparatus according to claim 1, wherein the vector information generation means generates information related to a vector that takes as a basis mutually orthogonal vectors associated with the respective attribute candidates related to second content, the second content being associated by hyperlink or trackback with first content whose attribute is to be evaluated, and that represents a position in the vector space of information related to the second content, and the attribute evaluation means evaluates an attribute related to the first content based on the information generated by the vector information generation means.

The attribute evaluation apparatus according to claim 1, wherein the information related to the content is a search word used for searching the content or a word for which a hyperlink is set.

An attribute evaluation method for evaluating attributes related to content based on information related to content,
A vector information generating step for generating information related to a vector representing a position in a vector space of information related to the content based on mutually orthogonal vectors associated with each candidate for the attribute related to the content;
An attribute evaluation step for evaluating an attribute relating to the content based on the information generated by the vector information generation step;
Attribute evaluation method characterized by including.

The vector information generation step searches for information on candidate attributes related to content from a database that stores information related to the content and information on candidate attributes related to the content in association with the information related to the content as a search key, The attribute evaluation method according to claim 11, wherein the information related to the vector is generated based on information on candidate attributes related to the content obtained as a result of the search.

The vector information generation step searches for information on attribute candidates related to the content in the specified combination from a database storing a plurality of different combinations between the information related to the content and the attribute candidates related to the content. 13. The attribute evaluation method according to claim 12, wherein information related to the vector is generated on the basis of information on candidate attributes related to the content obtained as a result.

The vector information generation step reads out and reads out the information on the candidate attributes and the information on the weights from the database storing the information on the candidate attributes on the contents and the information on the weights of the candidate attributes on the contents. The attribute evaluation method according to claim 12 or 13, wherein information on the vector is generated based on information.

The attribute evaluation method according to claim 11, wherein the attribute evaluation step evaluates, for predetermined content, a plurality of attributes related to the content and a priority of each attribute.

The attribute evaluation step sets the weight of the information related to the vector based on the appearance frequency of the information related to the content, and evaluates the attribute related to the content based on the set weight. The attribute evaluation method as described in any one of 11-15.

In the vector information generation step, when the number of attribute candidates related to content increases, information related to the vector in a vector space whose dimension increases in accordance with the increase in the number of candidates is generated. The attribute evaluation method according to any one of claims 11 to 16.

The attribute evaluation method according to claim 11, wherein the information relating to the content is metadata including information relating to the content of the content or information extracted from the content.

The attribute evaluation method according to claim 11, wherein the vector information generation step generates information related to a vector that takes as a basis mutually orthogonal vectors associated with the respective attribute candidates related to second content, the second content being associated by hyperlink or trackback with first content whose attribute is to be evaluated, and that represents a position in the vector space of information related to the second content, and the attribute evaluation step evaluates an attribute related to the first content based on the information generated by the vector information generation step.

The attribute evaluation method according to claim 11, wherein the information related to the content is a search word used for searching the content or a word for which a hyperlink is set.

An attribute evaluation program for evaluating attributes related to content based on information related to the content, the program causing a computer to execute:
a vector information generation procedure for generating information related to a vector representing a position in a vector space of the information related to the content, based on mutually orthogonal vectors associated with each candidate for the attribute related to the content; and
an attribute evaluation procedure for evaluating an attribute related to the content based on the information generated by the vector information generation procedure.
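The two procedures of this claim can be sketched as follows. This is a minimal sketch under stated assumptions: the candidate list, the word dictionary, and all function names are hypothetical, the orthogonal vectors are taken to be one-hot unit vectors, and the evaluation is assumed to pick the axis with the largest component of the summed vector, which the claim itself does not specify.

```python
# Hypothetical attribute candidates; each one is assigned its own axis.
CANDIDATES = ["sports", "finance", "travel"]

# Hypothetical dictionary associating a word with attribute candidates.
WORD_TO_CANDIDATES = {
    "baseball": ["sports"],
    "stadium": ["sports", "travel"],
    "stock": ["finance"],
}


def basis_vector(candidate):
    """One-hot unit vector for a candidate; distinct candidates are orthogonal."""
    return [1.0 if c == candidate else 0.0 for c in CANDIDATES]


def word_vector(word):
    """Vector information generation: position of a word in the candidate space."""
    vec = [0.0] * len(CANDIDATES)
    for cand in WORD_TO_CANDIDATES.get(word, []):
        vec = [a + b for a, b in zip(vec, basis_vector(cand))]
    return vec


def evaluate_attribute(words):
    """Attribute evaluation: sum the word vectors and take the largest axis."""
    total = [0.0] * len(CANDIDATES)
    for word in words:
        total = [a + b for a, b in zip(total, word_vector(word))]
    best = max(range(len(CANDIDATES)), key=lambda i: total[i])
    return CANDIDATES[best], total


attribute, total = evaluate_attribute(["baseball", "stadium", "stock"])
```

Here the words contribute two units on the "sports" axis and one each on "finance" and "travel", so the sketch evaluates the content as "sports".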

The attribute evaluation program according to claim 21, wherein the vector information generation procedure searches a database, which stores information related to content in association with information on attribute candidates related to the content, using the information related to the content as a search key, and generates the information related to the vector based on the attribute candidate information obtained as a result of the search.

The attribute evaluation program according to claim 22, wherein the vector information generation procedure searches for information on attribute candidates related to content in a specified combination, from a database that stores a plurality of different combinations of information related to content and attribute candidates related to the content, and generates the information related to the vector based on the attribute candidate information obtained as a result of the search.

The attribute evaluation program according to claim 22 or 23, wherein the vector information generation procedure reads the attribute candidate information and the weight information from a database that stores information on attribute candidates related to content together with information on the weight of each attribute candidate, and generates the information related to the vector based on the information read.
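A database that stores attribute candidates together with their weights might be used as in the following minimal sketch. The in-memory dictionary `DATABASE`, its entries, the weight values, and the function `lookup_vector` are all hypothetical; the sketch assumes that a stored weight becomes the coordinate on the axis of its candidate.

```python
CANDIDATES = ["sports", "finance", "travel"]

# Hypothetical database rows: search key -> [(attribute candidate, weight), ...]
DATABASE = {
    "stadium": [("sports", 0.8), ("travel", 0.2)],
    "ticket": [("travel", 0.6), ("sports", 0.4)],
}


def lookup_vector(word):
    """Search the database with the word as key and build its weighted vector."""
    vec = [0.0] * len(CANDIDATES)
    for candidate, weight in DATABASE.get(word, []):
        vec[CANDIDATES.index(candidate)] += weight
    return vec


stadium_vec = lookup_vector("stadium")
unknown_vec = lookup_vector("unknown")
```

A word that is absent from the database yields the zero vector in this sketch, so it does not influence the evaluation.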

The attribute evaluation program according to any one of claims 21 to 24, wherein the attribute evaluation procedure evaluates, for predetermined content, a plurality of attributes related to the content and a priority of each attribute.
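One way to obtain several attributes with a priority for each is to rank the axes of the content vector by component size. The following is a minimal sketch under that assumption; the function `rank_attributes`, the `top_n` parameter, and the sample values are hypothetical, and the claim does not state how the priority is derived.

```python
def rank_attributes(candidates, vector, top_n=2):
    """Return up to top_n (attribute, priority) pairs; priority 1 is the highest."""
    # Keep only candidates with a positive component on their axis.
    scored = [(c, v) for c, v in zip(candidates, vector) if v > 0]
    # Larger components are assumed to mean higher priority.
    scored.sort(key=lambda cv: cv[1], reverse=True)
    return [(c, rank) for rank, (c, _) in enumerate(scored[:top_n], start=1)]


ranking = rank_attributes(["sports", "finance", "travel"], [2.0, 0.0, 5.0])
```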

The attribute evaluation program according to any one of claims 21 to 25, wherein the attribute evaluation procedure sets a weight of the information related to the vector based on the appearance frequency of the information related to the content, and evaluates the attribute related to the content based on the set weight.
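Weighting by appearance frequency can be sketched as follows. This is a minimal sketch that assumes the weight of a word is simply its raw occurrence count; the candidate list, the dictionary, and the function names are hypothetical, and the claim does not fix a particular weighting formula.

```python
from collections import Counter

CANDIDATES = ["sports", "finance"]

# Hypothetical dictionary associating each word with one attribute candidate.
WORD_TO_CANDIDATE = {"baseball": "sports", "pitcher": "sports", "stock": "finance"}


def weighted_vector(words):
    """Sum the basis vectors of known words, each scaled by its appearance count."""
    counts = Counter(w for w in words if w in WORD_TO_CANDIDATE)
    vec = [0.0] * len(CANDIDATES)
    for word, n in counts.items():
        vec[CANDIDATES.index(WORD_TO_CANDIDATE[word])] += float(n)
    return vec


def evaluate(words):
    """Pick the candidate whose axis has the largest weighted component."""
    vec = weighted_vector(words)
    return CANDIDATES[max(range(len(CANDIDATES)), key=lambda i: vec[i])]


words = ["stock", "stock", "stock", "baseball", "pitcher", "weather"]
vec = weighted_vector(words)
result = evaluate(words)
```

In this example two distinct sports words appear once each while a single finance word appears three times, so frequency weighting shifts the evaluation to "finance".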

The attribute evaluation program according to any one of claims 21 to 26, wherein, when the number of attribute candidates related to content increases, the vector information generation procedure generates the information related to the vector in a vector space whose dimension increases in accordance with the increase in the number of candidates.

The attribute evaluation program according to any one of claims 21 to 27, wherein the information relating to the content is metadata including information relating to the content of the content or information extracted from the content.

The attribute evaluation program according to any one of claims 21 to 28, wherein the vector information generation procedure generates information related to a vector representing a position in a vector space of information related to second content, the second content being associated by hyperlink or trackback with first content whose attribute is to be evaluated, based on mutually orthogonal vectors associated with each candidate for an attribute related to the second content, and the attribute evaluation procedure evaluates an attribute related to the first content based on the information generated by the vector information generation procedure.
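Evaluating first content from the content it is linked to can be sketched as follows. This is a minimal sketch, assuming the vectors of all linked pages are summed with equal weight; the link table `LINKS`, the page names, the word lists, and the function names are hypothetical.

```python
CANDIDATES = ["sports", "finance"]
WORD_TO_CANDIDATE = {"baseball": "sports", "stock": "finance", "bond": "finance"}

# Hypothetical link table: first content -> second content linked by hyperlink or trackback.
LINKS = {"page_a": ["page_b", "page_c"]}

# Hypothetical words extracted from each piece of second content.
PAGE_WORDS = {"page_b": ["stock", "bond"], "page_c": ["baseball"]}


def vector_for(words):
    """Position of a word list in the candidate space (one axis per candidate)."""
    vec = [0.0] * len(CANDIDATES)
    for word in words:
        if word in WORD_TO_CANDIDATE:
            vec[CANDIDATES.index(WORD_TO_CANDIDATE[word])] += 1.0
    return vec


def evaluate_via_links(first_content):
    """Evaluate the first content using only the vectors of its linked content."""
    total = [0.0] * len(CANDIDATES)
    for linked in LINKS.get(first_content, []):
        linked_vec = vector_for(PAGE_WORDS.get(linked, []))
        total = [a + b for a, b in zip(total, linked_vec)]
    return CANDIDATES[max(range(len(CANDIDATES)), key=lambda i: total[i])]


result = evaluate_via_links("page_a")
```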

The attribute evaluation program according to any one of claims 21 to 28, wherein the information related to the content is a search word used for searching the content or a word for which a hyperlink is set.