JP2008287550A - Recommendation device in consideration of order of purchase, recommendation method, recommendation program and recording medium with the program recorded thereon

Info

Publication number: JP2008287550A
Application number: JP2007132593A
Authority: JP
Inventors: Tomoharu Iwata; 具治岩田; Takeshi Yamada; 武士山田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2007-05-18
Filing date: 2007-05-18
Publication date: 2008-11-27
Anticipated expiration: 2027-05-18
Also published as: JP4847916B2

Abstract

PROBLEM TO BE SOLVED: To provide a recommendation device that reduces calculation cost while considering the order of purchases and achieving high prediction accuracy.

SOLUTION: The recommendation device 1 comprises: a preprocessing part 21 for generating, from a user purchase history log 45, input data 46 in which the purchase history of each user is extracted; an extended Markov model estimation part 22 for estimating, from the input data 46, a prior probability that a user purchases a product and a gap Markov model representing the probability that a specific product was purchased in the past given that the user purchases a predetermined product; a weight estimation part 23 for building a combined model that couples the prior probability 47 and the gap Markov model 48 by the maximum entropy principle, and estimating the weights that are its unknown parameters; and a recommendation part 24 for selecting the product with the highest probability of purchase by the user, as calculated from the combined model using the input data 46, the prior probability 47, the gap Markov model 48, and the weight 49, and presenting it as a recommendation target.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a recommendation technology that recommends products to a user by taking a purchase history as input and predicting the product the user will purchase next, and more particularly to a recommendation technology that presents products as recommendation targets in an online store or the like.

In general, many product providers (sellers), such as online stores that sell products or services (hereinafter simply referred to as products), present products as recommendation targets (that is, recommend them) so as to influence the purchasing behavior of the users of the store. A recommendation is a direct or indirect presentation of a product to a user. It serves two purposes: improving convenience so that the user can quickly reach information on the products they want, and increasing the revenue of the product provider. Recommendations are used in many online stores.

Various recommendation techniques are known (see, for example, Non-Patent Document 1 and Non-Patent Document 2). For a product provider, it is important to predict which products to recommend so that a user who has previously purchased products at an online store or the like will purchase again. The product that a user purchases next is considered to best represent the user's interest at that point in time. A method that predicts the user's purchases accurately can therefore be regarded as a method that predicts the user's interests accurately. However, the recommendation method described in Non-Patent Document 1 does not consider the order in which products were purchased (hereinafter referred to as order information), and its prediction accuracy is therefore low.

On the other hand, the recommendation method described in Non-Patent Document 2 applies a maximum entropy model that takes order information into account. Here, a maximum entropy model is a probability model obtained using the maximum entropy principle. The method described in Non-Patent Document 2 considers only the product that the user purchased immediately before, because the most recently purchased product is thought to carry a great deal of information about the user's current interests. Conventional recommendation techniques that take order information into account in this way use either a Markov model or a maximum entropy model.

Non-Patent Document 1: Badrul Sarwar, George Karypis, Joseph Konstan, and John Reidl: Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web, pp. 285-295, New York, NY, USA, 2001. ACM Press.
Non-Patent Document 2: Xin Jin, Bamshad Mobasher, and Yanzan Zhou: A web recommendation system based on maximum entropy. In Proceedings of the international conference on Information Technology: Coding and Computing (ITCC'05), Volume I, pp. 213-218, Washington, DC, USA, 2005. IEEE Computer Society.

However, the recommendation method described in Non-Patent Document 2 does not consider any information other than the product the user purchased immediately before. In other words, it ignores the information about the user's current interests that is thought to be contained, to some extent, in products the user purchased further in the past. As a result, its prediction accuracy is low.

Furthermore, the Markov model used in conventional recommendation techniques that consider order information allows parameters to be estimated and updated quickly, but its prediction accuracy is low. The maximum entropy model used in such techniques has high prediction accuracy, but estimating and updating its parameters requires a large amount of computation.

The present invention has been made in view of the above problems, and its object is to provide a recommendation technique that has low calculation cost and high prediction accuracy while taking the purchase order into account.

In order to solve the above problems, a recommendation device according to the present invention presents, to each of a plurality of users who have purchased sales targets (products or services), one of the individual targets belonging to the sales targets as a recommendation target, based on the purchase order of those users. The recommendation device comprises: preprocessing means for creating, for each user, processing data in which the purchase history of the individual targets is extracted, using purchase history information that includes information on one or more individual targets purchased by the user in the past; prior probability estimation means for estimating, using the created processing data, a prior probability indicating the probability that a user purchases an individual target; gap Markov model estimation means for estimating, using the created processing data, a gap Markov model indicating the probability that, when a user purchases a given individual target, an individual target purchased before it is a specific individual target; weight estimation means for constructing a combined model in which the estimated prior probability and the estimated gap Markov model are combined by the maximum entropy principle, and estimating weights that are the unknown parameters of the constructed combined model; and recommendation means for selecting, using the created processing data, the estimated prior probability, the estimated gap Markov model, and the estimated weights, the individual target for which the user's purchase probability calculated from the combined model is highest, and presenting it as the recommendation target.

Likewise, a recommendation method according to the present invention is a method for a recommendation device that presents, to each of a plurality of users who have purchased sales targets (products or services), one of the individual targets belonging to the sales targets as a recommendation target, based on the purchase order of those users. The method comprises: a preprocessing step in which the preprocessing means creates, for each user, processing data in which the purchase history of the individual targets is extracted, using purchase history information that includes information on one or more individual targets purchased by the user in the past; a prior probability estimation step in which the prior probability estimation means estimates, using the created processing data, a prior probability indicating the probability that a user purchases an individual target; a gap Markov model estimation step in which the gap Markov model estimation means estimates, using the created processing data, a gap Markov model indicating the probability that, when a user purchases a given individual target, an individual target purchased before it is a specific individual target; a weight estimation step in which the weight estimation means constructs a combined model in which the estimated prior probability and the estimated gap Markov model are combined by the maximum entropy principle, and estimates weights that are the unknown parameters of the constructed combined model; and a recommendation step in which the recommendation means selects, using the created processing data, the estimated prior probability, the estimated gap Markov model, and the estimated weights, the individual target for which the user's purchase probability calculated from the combined model is highest, and presents it as the recommendation target.

With the recommendation device configured in this way, or with the recommendation method following this procedure, the recommendation device, as a first stage, creates processing data for each user from the purchase history information. The purchase history information indicates, for example, which user purchased what and when. When the sales targets are the products of an online store, for example, it can be obtained in the form of an online transaction log. The processing data indicates, for each user, what was purchased and in what order. As a second stage, the recommendation device uses the processing data to estimate the prior probability and the gap Markov model. Here, the prior probability means, for example, the probability, obtained for each product (type), that among all products sold so far a product of that type was purchased by some user. The gap Markov model means, for example, the probability, obtained for each product (type), that when some user purchased a product at a given purchase occasion, a product of a specific type had been purchased a certain number of purchases earlier. This makes it possible to take in the information about the user's current interests that is thought to be contained, to some extent, in products the user purchased further in the past. Maximum a posteriori (MAP) estimation, for example, can be used as the estimation method.

As a third stage, the recommendation device estimates the weights that are the unknown parameters of the combined model obtained by combining the prior probability and the gap Markov model by the maximum entropy principle. The combined model thereby has both the high prediction accuracy of a maximum entropy model and the low calculation cost of a Markov model. As a fourth stage, the recommendation device uses the processing data and the previously estimated parameters to recommend the individual target for which the user's purchase probability calculated from the combined model is highest. The product considered to best represent a given user's interest at that point in time can thus be obtained accurately and at low calculation cost.

In the recommendation device according to the present invention, the weight estimation means preferably constructs the combined model using a first condition and a second condition, and estimates the weights by maximizing, over the processing data of all target users, the log likelihood of the purchase probabilities of all target individual targets. The first condition states that the log likelihood of the prior distribution under the empirical distribution is equal to its expected value under the combined model, expressed as the product of the combined model and the log likelihood of the prior distribution. The second condition states that the log likelihood of the gap Markov model is equal to its expected value under the combined model, expressed as the product of the combined model and the log likelihood of the gap Markov model.

In the recommendation method according to the present invention, the weight estimation step preferably constructs the combined model using a first condition and a second condition, and estimates the weights by maximizing, over the processing data of all target users, the log likelihood of the purchase probabilities of all target individual targets. The first condition states that the log likelihood of the prior distribution under the empirical distribution is equal to its expected value under the combined model, expressed as the product of the combined model and the log likelihood of the prior distribution. The second condition states that the log likelihood of the gap Markov model is equal to its expected value under the combined model, expressed as the product of the combined model and the log likelihood of the gap Markov model.

With the recommendation device configured in this way, or with the recommendation method following this procedure, the estimated combined model can achieve higher prediction accuracy than a maximum entropy model that considers the purchase order.
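The first and second conditions have the shape of the moment-matching constraints used in maximum entropy modeling, with the log prior and the log gap Markov probabilities acting as features. As a sketch only, the following block writes them out and gives the log-linear form that such constraints lead to. The symbols R, \tilde{P}, \lambda_0, \lambda_l, L and Z(u) are notation assumed here for illustration and are not taken from the specification, whose own equations may differ in detail.

```latex
% Assumed notation (not from the specification):
%   R(i | u)         combined model: probability that a user with history u buys item i
%   \tilde{P}        empirical distribution over (history, next item) pairs
%   j_l              item purchased l steps before the current purchase in history u
%   \lambda_0, \lambda_l   weights (the unknown parameters); L is the largest gap used
\begin{align}
\sum_{u,i} \tilde{P}(u,i)\,\log \hat{P}(i)
  &= \sum_{u,i} \tilde{P}(u)\,R(i \mid u)\,\log \hat{P}(i)
  && \text{(first condition)} \\
\sum_{u,i} \tilde{P}(u,i)\,\log \hat{P}_l(j_l \mid i)
  &= \sum_{u,i} \tilde{P}(u)\,R(i \mid u)\,\log \hat{P}_l(j_l \mid i),
  \quad l = 1,\dots,L
  && \text{(second condition)} \\
R(i \mid u)
  &= \frac{1}{Z(u)} \exp\!\Big(\lambda_0 \log \hat{P}(i)
     + \sum_{l=1}^{L} \lambda_l \log \hat{P}_l(j_l \mid i)\Big)
   = \frac{1}{Z(u)}\,\hat{P}(i)^{\lambda_0} \prod_{l=1}^{L} \hat{P}_l(j_l \mid i)^{\lambda_l}
\end{align}
```

Here Z(u) is the sum of the unnormalized term over all items, so that R(i | u) sums to one. Under this reading only the L + 1 weights have to be fitted by likelihood maximization, since the component probabilities are estimated beforehand.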

A recommendation program according to the present invention causes a computer to execute the recommendation method described above. With this configuration, a computer on which the program is installed can realize each function based on the program.

A computer-readable recording medium according to the present invention has the recommendation program described above recorded on it. With this configuration, a computer in which the recording medium is mounted can realize each function based on the program recorded on the medium.

According to the present invention, it is possible to provide a recommendation technique that has low calculation cost and high prediction accuracy while taking the purchase order into account. As a result, users can quickly reach information on the products they want, and the revenue of the product provider can be increased.

The best mode for carrying out the recommendation device and the recommendation method of the present invention (hereinafter referred to as the "embodiment") will now be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a recommendation device according to an embodiment of the present invention. Based on the purchase order of a plurality of users who have purchased sales targets (a product group) consisting of products or services, the recommendation device 1 presents (recommends) to each user one of the individual targets (products) belonging to the sales targets (product group) as a recommendation target. As shown in FIG. 1, the recommendation device 1 includes calculation means 2, input means 3, storage means 4, and output means 5, and these means 2 to 5 are connected to a bus line 6.

The calculation means 2 is a main control device composed of, for example, a CPU (Central Processing Unit) and RAM (Random Access Memory). As shown in FIG. 1, the calculation means 2 includes a preprocessing unit (preprocessing means) 21, an extended Markov model estimation unit (extended Markov model estimation means) 22, a weight estimation unit (weight estimation means) 23, a recommendation unit (recommendation means) 24, and a memory 25. The units 21 to 24 are described in detail later.

The input means 3 is composed of, for example, a keyboard, a mouse, a disk drive device, and the like. The input means 3 receives, for example, a purchase history log (purchase history information) as data and stores it in the storage means 4.

The purchase history log (purchase history information) contains information on one or more individual targets (products) that users purchased in the past. As shown in Table 1, for example, the purchase history log records a user number, a product number, and a purchase time for each purchase of a product (each completed sale). In the example of Table 1, the user with user number "1" is seen to have purchased the products with product numbers "3", "1", and "6".

The storage means 4 is composed of, for example, a general hard disk device, and stores the various programs and data used by the calculation means 2. As programs, the storage means 4 stores a preprocessing program 41, an extended Markov model estimation program 42, a weight estimation program 43, and a recommendation program 44 in a program storage unit 40a. The calculation means 2 reads these programs 41 to 44 from the storage means 4, loads them into the memory 25, and executes them, thereby realizing the functions of the preprocessing unit 21, the extended Markov model estimation unit 22, the weight estimation unit 23, and the recommendation unit 24 described above.

The storage means 4 also stores a purchase history log 45, input data (processing data) 46, a prior probability 47, a gap Markov model 48, and a weight 49 in a data storage unit 40b. The purchase history log 45 is data input from the input means 3, for example the log shown in Table 1 above. The input data 46 is the result computed by the preprocessing unit 21 of the calculation means 2. The prior probability 47 and the gap Markov model 48 are the results computed by the extended Markov model estimation unit 22 of the calculation means 2. The weight 49 is the result computed by the weight estimation unit 23 of the calculation means 2.

The output means 5 is, for example, a graphics board (output interface) and a monitor connected to it. The monitor is composed of, for example, a liquid crystal display, and displays calculation results (for example, information on the products to be recommended).

Next, the configuration of each unit of the calculation means 2 is described in detail.
<Preprocessing unit>
FIG. 2 is a functional block diagram showing the configuration of the preprocessing unit. Using the purchase history log 45, the preprocessing unit 21 creates input data (processing data) in which the purchase sequence (purchase history) of products (individual targets) is extracted for each user. As shown in FIG. 2, it includes a purchase history log reading unit 211 and an input data writing unit 212.

The purchase history log reading unit 211 reads the user, time, and product of each purchase from the purchase history log 45 and outputs them to the input data writing unit 212.
The input data writing unit 212 computes, for each user, the purchase sequence of purchased products from the user number, product number, and purchase time recorded for each purchase in the purchase history log 45. It then writes the purchase sequence computed for each user to the storage means 4 (see FIG. 1) as the input data (processing data) 46. The written input data 46 is used by the extended Markov model estimation unit 22, the weight estimation unit 23, and the recommendation unit 24.
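The preprocessing described here amounts to grouping log records by user and ordering each group by purchase time. A minimal sketch in Python follows; the function name `build_purchase_sequences`, the tuple layout `(user, item, timestamp)`, the timestamps, and the records for user 2 are assumptions made for illustration. Only the purchases of user 1 (products 3, 1, 6) come from the text.

```python
from collections import defaultdict


def build_purchase_sequences(log):
    """Group (user, item, timestamp) records by user and order them by time.

    Returns a dict mapping each user number to the list of purchased
    product numbers, oldest purchase first.
    """
    by_user = defaultdict(list)
    for user, item, timestamp in log:
        by_user[user].append((timestamp, item))
    # sorted() is stable, so purchases with equal timestamps keep log order.
    return {
        user: [item for _, item in sorted(events, key=lambda e: e[0])]
        for user, events in by_user.items()
    }


# Hypothetical log: user 1 buys products 3, 1, 6 in that order (as in the
# text); timestamps and the user 2 records are invented for the example.
log = [(1, 3, 10), (2, 5, 11), (1, 1, 12), (1, 6, 15), (2, 2, 16)]
sequences = build_purchase_sequences(log)
print(sequences)
```

Keeping one ordered list per user is enough for the later stages, because both the prior probability and the gap Markov model only need the items and their relative order.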

In the following, the user set U is defined by Equation (1) and the product set S by Equation (2). In Equation (1), n denotes a user number (also referred to simply as a user) and N denotes the number of users. In Equation (2), j denotes a product number (also referred to simply as a product) and V denotes the number of products.

The k-th product x_{n,k} purchased by a user n is contained in the product set S, as shown in Equation (3), and the purchase sequence u_{n,k} of user n at that time is given by Equation (4). The purchase sequence u_{n,k} in Equation (4) is the sequence of the (k-1) products that user n purchased before purchasing the k-th product x_{n,k}. The input data writing unit 212 computes this purchase sequence u_{n,k} from the purchase history log 45.
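Equations (1) to (4) are not reproduced in this text. One reconstruction that is consistent with the definitions given in the surrounding paragraphs might read as follows; the exact form of the original equations is an assumption.

```latex
% Reconstruction from the prose definitions only; the original typography is unknown.
\begin{align}
U &= \{1, \dots, n, \dots, N\} \tag{1} \\
S &= \{1, \dots, j, \dots, V\} \tag{2} \\
x_{n,k} &\in S \tag{3} \\
u_{n,k} &= (x_{n,1}, x_{n,2}, \dots, x_{n,k-1}) \tag{4}
\end{align}
```

Under this reading, u_{n,1} is the empty sequence, since nothing has been purchased before the first item.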

A specific example of the input data (processing data) 46 is described with reference to Table 2. As shown in Table 2, the input data consists of the purchase sequence of the products purchased by each user.

For example, according to the purchase history log 45 shown in Table 1, the user with user number "1" (n = 1) purchased the product with product number "1" after the product with product number "3". Accordingly, as shown in Table 2, the element x_{1,1} of the purchase sequence for user number n = 1 denotes the product with product number "3", and likewise x_{1,2} denotes the product with product number "1".

<Extended Markov model estimation unit>
FIG. 3 is a functional block diagram showing the configuration of the extended Markov model estimation unit.
As shown in FIG. 3, the extended Markov model estimation unit 22 includes an input data reading unit 221, a prior probability estimation unit 222, a gap Markov model estimation unit 223, and an extended Markov model writing unit 224.
The input data reading unit 221 reads the input data 46 and outputs it to the prior probability estimation unit 222 and the gap Markov model estimation unit 223.

≪Prior probability estimation unit≫
The prior probability estimation unit (prior probability estimation means) 222 uses the input data 46 created by the preprocessing unit 21 (see FIG. 1) to estimate the prior probability, that is, the probability that a user purchases a product (individual target).
In the present embodiment, the prior probability estimation unit 222 computes the prior probability P^(i) given by Equation (5). In this specification, the symbol "^" (hat) denotes a mark written above the immediately preceding character.
According to maximum a posteriori (MAP) estimation, the prior probability P^(i) of purchasing product i is estimated by Equation (5). In Equation (5), δ is a hyperparameter that stabilizes the computation when little data is available, and it can be estimated by leave-one-out cross-validation.
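Equation (5) itself is not reproduced in this text. As a sketch under that caveat, the following Python assumes the common additive-smoothing form, in which δ is added to each item count and δV to the total. This is one standard way a MAP estimate with a smoothing hyperparameter is written and may not match the specification's exact expression. The function name `estimate_prior` and the sample data are hypothetical.

```python
from collections import Counter


def estimate_prior(sequences, num_items, delta=1.0):
    """Smoothed estimate of the probability that product i is purchased.

    Assumed form: (count(i) + delta) / (total purchases + delta * V),
    with products numbered 1..V. Requires delta > 0.
    """
    counts = Counter(item for seq in sequences.values() for item in seq)
    total = sum(counts.values())
    denom = total + delta * num_items
    return {i: (counts[i] + delta) / denom for i in range(1, num_items + 1)}


# Hypothetical purchase sequences for two users over V = 6 products.
sequences = {1: [3, 1, 6], 2: [3, 2]}
prior = estimate_prior(sequences, num_items=6, delta=0.5)
print(prior[3])  # (2 + 0.5) / (5 + 0.5 * 6) = 0.3125
```

With this form, products that never appear in the log still receive a small non-zero probability, which keeps later logarithms finite.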

The prior probability estimation unit 222 outputs the estimated prior probability P^(i) to the extended Markov model writing unit 224.

≪Gap Markov model estimation unit≫
The gap Markov model estimation unit (gap Markov model estimation means) 223 uses the input data 46 created by the preprocessing unit 21 (see FIG. 1) to estimate the gap Markov model, which indicates the probability that, when a user purchases a given product (individual target), a product purchased before it is a specific product.
In the present embodiment, the gap Markov model estimation unit 223 computes the l-gap Markov model P_l(j_l|i) given by Equation (6). The l-gap Markov model P_l(j_l|i) represents the probability that the product purchased l purchases before product i is j. According to MAP estimation, the l-gap Markov model is estimated by Equation (6). The gap Markov model estimation unit 223 outputs the estimated l-gap Markov model to the extended Markov model writing unit 224.
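Equation (6) is likewise not reproduced here. The sketch below counts, for each gap l, how often product j appears l purchases before product i, and then smooths the counts in the same additive way assumed for the prior. The smoothing form, the function name `estimate_gap_markov`, the parameter `max_gap`, and the sample sequences are assumptions for illustration.

```python
from collections import defaultdict


def estimate_gap_markov(sequences, num_items, max_gap, delta=1.0):
    """Return model[l][i][j]: smoothed probability that product j was
    purchased l purchases before product i (products numbered 1..V).

    Assumed form: (count_l(j, i) + delta) / (count_l(., i) + delta * V).
    """
    counts = {l: defaultdict(lambda: defaultdict(int))
              for l in range(1, max_gap + 1)}
    for seq in sequences.values():
        for k, i in enumerate(seq):
            for l in range(1, max_gap + 1):
                if k - l < 0:
                    break  # fewer than l earlier purchases exist
                counts[l][i][seq[k - l]] += 1

    model = {}
    for l in range(1, max_gap + 1):
        model[l] = {}
        for i in range(1, num_items + 1):
            row = counts[l][i]
            denom = sum(row.values()) + delta * num_items
            model[l][i] = {j: (row[j] + delta) / denom
                           for j in range(1, num_items + 1)}
    return model


# Hypothetical data: product 3 precedes product 1 in both sequences.
sequences = {1: [3, 1, 6], 2: [3, 1, 2]}
gap_model = estimate_gap_markov(sequences, num_items=6, max_gap=2, delta=0.5)
print(gap_model[1][1][3])  # (2 + 0.5) / (2 + 0.5 * 6) = 0.5
```

This dense table needs on the order of L x V x V entries, which is fine for a toy example; a sparse representation would be the natural choice for a real catalogue.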

The parameters of the prior probability in Expression (5) and of the l-gap Markov model in Expression (6) can each be computed from simple sums alone. The computation required for these estimates is therefore small, and they can easily be updated when new data arrives.
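Because both estimates depend only on counts, new purchases can be folded in by incrementing counters instead of re-reading the whole log. The sketch below shows one way to do this; the class `PurchaseCounts` and its fields are hypothetical and not part of the described device.

```python
from collections import Counter

class PurchaseCounts:
    """Running counts from which the prior and gap Markov estimates can be rebuilt."""

    def __init__(self, max_gap):
        self.max_gap = max_gap
        self.item_counts = Counter()   # product -> number of purchases
        self.pair_counts = Counter()   # (l, i, j) -> times j was bought l before i
        self.histories = {}            # user -> products in purchase order

    def add_purchase(self, user, item):
        history = self.histories.setdefault(user, [])
        self.item_counts[item] += 1
        for l in range(1, self.max_gap + 1):
            if len(history) >= l:
                self.pair_counts[(l, item, history[-l])] += 1
        history.append(item)

counts = PurchaseCounts(max_gap=2)
for user, item in [("a", 1), ("a", 2), ("a", 3), ("b", 2), ("b", 3)]:
    counts.add_purchase(user, item)
```

Each new purchase touches at most `max_gap + 1` counters, so the cost of an update does not grow with the size of the log.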

The extended Markov model writing unit 224 stores the prior probability of Expression (5) in the storage means 4 (see FIG. 1) as the prior probability 47. It also stores the gap Markov model of Expression (6) in the storage means 4 (see FIG. 1) as the gap Markov model 48. The stored prior probability 47 and gap Markov model 48 are used by the weight estimation unit 23 and the recommendation unit 24.

<Weight estimation unit>
FIG. 4 is a functional block diagram showing the configuration of the weight estimation unit. The weight estimation unit 23 constructs a combined model in which the prior probability 47 and the gap Markov model 48 estimated by the extended Markov model estimation unit 22 (see FIG. 1) are combined by the maximum entropy principle, and estimates the weights, which are the unknown parameters of the constructed combined model. This combined model is also called the extended Markov model.

As shown in FIG. 4, the weight estimation unit 23 includes an input data reading unit 231, an extended Markov model reading unit 232, a gap weight estimation unit 233, and a weight writing unit 234.
The input data reading unit 231 reads the input data 46 and outputs it to the gap weight estimation unit 233.
The extended Markov model reading unit 232 reads the prior probability 47 and the gap Markov model 48 and outputs them to the gap weight estimation unit 233.

≪Gap weight estimation unit≫
The gap weight estimation unit 233 uses the input data 46, the prior probability 47, and the gap Markov model 48 to maximize entropy under the constraints of Expressions (7) and (8). In this way it combines the prior probability 47 and the gap Markov model 48 by the maximum entropy principle into the combined model shown in Expression (9), and estimates its weights. In the following, logarithms are natural logarithms, that is, the base of log is e. In Expression (9), j_l denotes the product purchased l purchases before product i, and L denotes the number of products purchased before the k-th product is purchased.

The left side of Expression (7) is the log likelihood of the prior distribution under the empirical distribution. The right side of Expression (7) is the product of the model P (a probability) and the log likelihood of the prior distribution, which corresponds to the expected value under the model P. Requiring that the left side of Expression (7) equal its right side (the first condition) is the constraint.

The left side of Expression (8) is the log likelihood of the gap Markov model. The right side of Expression (8) is the product of the model P (a probability) and the log likelihood of the gap Markov model, which corresponds to the expected value under the model P. Requiring that the left side of Expression (8) equal its right side (the second condition) is the constraint.

In this case, the combined model shown in Expression (9) can be expanded as shown in Expression (10). In Expression (10), Z is the normalization term given by Expression (11), and α is the unknown parameter (hereinafter also called the weight) given by Expression (12). In this specification, α with a subscript refers to an individual parameter, and α without a subscript refers to the set of L parameters.
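A combined model with a normalization term Z and one weight per component can be sketched as a log-linear model. Expressions (9) to (12) are not reproduced in this text, so the form used below, a score alpha[0] * log P(i) + sum over l of alpha[l] * log P_l(j_l | i) followed by normalization over all products, is an assumption consistent with the description rather than the patent's exact formula. The function name and the toy probabilities are hypothetical.

```python
import math

def combined_probabilities(prior, gap_model, alpha, history):
    """Purchase probability of every product given a user's history.

    prior[i]           : prior probability of product i
    gap_model[l][i][j] : probability that j was bought l purchases before i
    alpha              : [alpha_0, alpha_1, ..., alpha_L]
    history            : the user's past purchases, oldest first
    """
    scores = {}
    for i in prior:
        s = alpha[0] * math.log(prior[i])
        for l in range(1, len(alpha)):
            if l <= len(history):          # skip gaps longer than the history
                j = history[-l]
                s += alpha[l] * math.log(gap_model[l][i][j])
        scores[i] = s
    m = max(scores.values())               # subtract the max for numerical stability
    z = sum(math.exp(s - m) for s in scores.values())
    return {i: math.exp(s - m) / z for i, s in scores.items()}

# Hypothetical toy model with two products and a single gap (L = 1).
prior = {1: 0.5, 2: 0.5}
gap_model = {1: {1: {1: 0.2, 2: 0.8}, 2: {1: 0.6, 2: 0.4}}}
probs = combined_probabilities(prior, gap_model, alpha=[1.0, 1.0], history=[1])
```

Setting a weight to zero removes the corresponding component from the score, so the weights control how much each gap contributes to the prediction.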

For the unknown parameter α, a globally optimal solution can be obtained by maximizing the log likelihood J shown in Expression (13) with an optimization method such as a quasi-Newton method.

The log likelihood J shown in Expression (13) is the log likelihood of the purchase probabilities of all target products (all individual targets) x_{n,k}, obtained from the purchase sequences u_{n,k} of all target users, that is, from the input data (processing data). The gap weight estimation unit 233 therefore estimates the weight α by maximizing the log likelihood J shown in Expression (13).

In this embodiment, to suppress overfitting, the gap weight estimation unit 233 uses a normal distribution with mean 0 as the prior distribution of the unknown parameter α. Overfitting is the phenomenon in which the generalization error on data not used for training becomes large. If the training data that the extended Markov model estimation unit 22 used to estimate the prior probability and the gap Markov model is also used to estimate the unknown parameter α, overfitting may occur. The unknown parameter α is therefore estimated by cross-validation. That is, the predetermined training data is divided, and the weight α is estimated by maximizing, with an optimization method such as a quasi-Newton method, the log likelihood of the data that was not used to estimate the prior probability and the gap Markov model.
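The weight estimation can be sketched as maximizing a penalized log likelihood. Expression (13) is not reproduced in this text, so the objective below (log likelihood of a log-linear model plus a zero-mean normal prior on α) is an assumption, and plain gradient ascent is substituted for the quasi-Newton method to keep the sketch free of external dependencies. Each sample is assumed to come from held-out data and to carry precomputed features [log P(i), log P_1(j_1 | i), ...] per candidate product; all names and numbers are hypothetical.

```python
import math

def log_likelihood_and_grad(samples, alpha, variance=1.0):
    """Penalized log likelihood J(alpha) and its gradient.

    samples: list of (features, target); features[i] is the feature list of
    candidate product i, target is the product actually purchased.
    """
    dim = len(alpha)
    value = -sum(a * a for a in alpha) / (2.0 * variance)   # normal prior, mean 0
    grad = [-a / variance for a in alpha]
    for features, target in samples:
        scores = {i: sum(a * f for a, f in zip(alpha, fs))
                  for i, fs in features.items()}
        m = max(scores.values())
        log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
        value += scores[target] - log_z
        for d in range(dim):
            grad[d] += features[target][d]                   # observed feature
        for i, fs in features.items():
            p = math.exp(scores[i] - log_z)
            for d in range(dim):
                grad[d] -= p * fs[d]                         # expected feature
    return value, grad

def fit_weights(samples, dim, variance=1.0, step=0.1, iterations=500):
    """Gradient ascent from alpha = 0 (stand-in for a quasi-Newton optimizer)."""
    alpha = [0.0] * dim
    for _ in range(iterations):
        _, grad = log_likelihood_and_grad(samples, alpha, variance)
        alpha = [a + step * g for a, g in zip(alpha, grad)]
    return alpha

# Hypothetical held-out samples: two candidates, prior feature plus one gap feature.
feats = {1: [math.log(0.5), math.log(0.8)], 2: [math.log(0.5), math.log(0.2)]}
samples = [(feats, 1)] * 3 + [(feats, 2)]
alpha = fit_weights(samples, dim=2)
```

The objective is concave in α, which is why any reasonable ascent method reaches the same global optimum; the step size and iteration count here are tuned only for this toy example.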

The weight writing unit 234 stores the weight α estimated by the gap weight estimation unit 233 in the storage means 4 (see FIG. 1) as the weight 49. The stored weight 49 is used by the recommendation unit 24.

<Recommendation unit>
FIG. 5 is a functional block diagram showing the configuration of the recommendation unit.
The recommendation unit 24 uses the input data (processing data) 46, the estimated prior probability 47, the estimated gap Markov model 48, and the estimated weight 49 to select the product (individual target) for which the user's purchase probability calculated from the combined model (extended Markov model) is maximal, and presents it as the recommendation target.

As shown in FIG. 5, the recommendation unit 24 includes an input data reading unit 241, an extended Markov model reading unit 242, a weight reading unit 243, a maximum product selection unit 244, and a recommendation output unit 245.

The input data reading unit 241 reads the input data 46 and outputs it to the maximum product selection unit 244.
The extended Markov model reading unit 242 reads the prior probability 47 and the gap Markov model 48 and outputs them to the maximum product selection unit 244.
The weight reading unit 243 reads the weight 49 and outputs it to the maximum product selection unit 244.

The maximum product selection unit 244 executes the calculation shown in Expression (14). Here, the purchase history u_n of user n is given by Expression (15), and K_n is the number of products purchased by user n.

In this embodiment, the maximum product selection unit 244 selects the product that user n is most likely to purchase according to Expression (14). That is, from the product set S (product numbers 1 ≤ i ≤ V) of products purchased by the user u, the maximum product selection unit 244 selects the product (product number) that maximizes the probability given by Expression (14).
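The selection step amounts to taking the arg max of the purchase probabilities over the candidate set. A minimal sketch follows; the function name, the optional `candidates` argument, and the probabilities are hypothetical.

```python
def select_recommendation(probabilities, candidates=None):
    """Return the product number with the highest purchase probability.

    probabilities: product number -> probability from the combined model
    candidates   : optional iterable restricting the choice to a product set S
    """
    if candidates is None:
        pool = probabilities
    else:
        pool = {i: probabilities[i] for i in candidates}
    return max(pool, key=pool.get)

# Hypothetical probabilities for three products.
best = select_recommendation({1: 0.2, 2: 0.5, 3: 0.3})
best_in_subset = select_recommendation({1: 0.2, 2: 0.5, 3: 0.3}, candidates=[1, 3])
```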

The recommendation output unit 245 outputs the product number selected by the maximum product selection unit 244, thereby presenting the product with that number as the recommendation target. The recommendation target is output to the output means 5.

[Operation of the recommendation device]
The operation of the recommendation device 1 shown in FIG. 1 is described with reference to FIG. 6 (and FIG. 1 as appropriate). FIG. 6 is a flowchart showing the operation of the recommendation device. The recommendation device 1 generates the input data from the purchase history log 45 by means of the preprocessing unit 21 (step S1: preprocessing step). Next, it executes the extended Markov model estimation process by means of the extended Markov model estimation unit 22 (step S2: extended Markov model estimation step). It then executes the weight estimation process by means of the weight estimation unit 23, using the prior probability 47 and the gap Markov model 48 estimated in step S2 (step S3: weight estimation step). Finally, by means of the recommendation unit 24, it selects and recommends the product (individual target) for which the user's purchase probability calculated from the combined model (extended Markov model) is maximal (step S4: recommendation step).

Next, the extended Markov model estimation process of step S2 and the weight estimation process of step S3 are described with reference to FIG. 7 and FIG. 8, respectively. FIG. 7 is a flowchart showing the extended Markov model estimation process, and FIG. 8 is a flowchart showing the weight estimation process.

First, in the extended Markov model estimation process of step S2, as shown in FIG. 7, the extended Markov model estimation unit 22 reads the input data 46 from the storage means 4 (see FIG. 1) by means of the input data reading unit 221 (step S21). The extended Markov model estimation unit 22 then estimates the prior probability by means of the prior probability estimation unit 222 (step S22) and estimates the gap Markov model by means of the gap Markov model estimation unit 223 (step S23). It then stores the estimated prior probability and gap Markov model in the storage means 4 (see FIG. 1) by means of the extended Markov model writing unit 224 (step S24). Steps S22 and S23 may be executed in either order, or in parallel.

Next, in the weight estimation process of step S3, as shown in FIG. 8, the weight estimation unit 23 reads the input data 46 from the storage means 4 (see FIG. 1) by means of the input data reading unit 231 (step S31). The weight estimation unit 23 also reads the prior probability 47 and the gap Markov model 48 from the storage means 4 (see FIG. 1) by means of the extended Markov model reading unit 232 (step S32). The weight estimation unit 23 then estimates the unknown parameter α of the combined model by means of the gap weight estimation unit 233, using the input data 46, the prior probability 47, and the gap Markov model 48 (step S33). Finally, it stores the estimated parameter α in the storage means 4 (see FIG. 1) as the weight 49 by means of the weight writing unit 234 (step S34). Steps S31 and S32 may be executed in either order, or in parallel.

The recommendation device 1 can also be realized by having a general-purpose computer run a recommendation program that causes it to execute the steps described above. This program can be distributed over a communication line, or written to a recording medium such as a CD-ROM and distributed.

According to this embodiment, products can be recommended to users with low computational cost and high prediction accuracy while taking the purchase order into account. As a result, users can quickly reach information on the products they want, and the revenue of the product provider can be increased.

An embodiment of the present invention has been described above, but the present invention is not limited to it and can be practiced within a range that does not depart from its gist. For example, the recommendation device 1 need not consist of a single device; its functions may be distributed over a plurality of devices. For example, the preprocessing unit 21, the extended Markov model estimation unit 22, the weight estimation unit 23, and the recommendation unit 24 of the calculation means 2, and the data storage unit 40b of the storage means 4, may be configured as separate devices. This distributes the load across the devices and enables high-speed processing.

To confirm the effect of the present invention, the product prediction accuracy of the recommendation device 1 according to this embodiment was measured with a purchase history log of a music distribution service as input, and with a purchase history log of a video distribution service as input.

<Settings>
The purchase history log of the music distribution service (hereinafter, the music data) is a log of purchases in the music distribution service from April 1, 2005 to June 30, 2005. In the music data, the number of users was 2,104, the number of products (songs) was 561, and the number of purchases was 15,216.
The purchase history log of the video distribution service (hereinafter, the video data) is a log of purchases in the video distribution service on January 1, 2007. In the video data, the number of users was 3,085, the number of products was 1,569, and the number of purchases was 25,363.

Products with fewer than 10 sales and users with fewer than 5 purchases were removed from the music data and the video data. When a user purchased the same product two or more times, the second and later purchases of that product were removed from the purchase history. The last product purchased by each user was used as test data, and the purchase history before it was used as training data. When the last product purchased by a user did not appear in the training data, that product was removed from the test data.
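The filtering and splitting rules above can be sketched as follows. The text does not state the order in which the filters were applied or whether they were repeated, so this sketch assumes a single pass: deduplicate, filter products, then filter users. The function name and the toy data, which use smaller thresholds than the experiment, are hypothetical.

```python
from collections import Counter

def prepare_dataset(histories, min_sales=10, min_purchases=5):
    """Split purchase histories into training sequences and test products.

    histories: user -> list of product numbers in purchase order
    Returns (train, test): user -> earlier purchases, user -> last purchase.
    """
    # Keep only the first purchase of each product per user.
    deduped = {u: list(dict.fromkeys(seq)) for u, seq in histories.items()}
    # Drop products with too few sales, then users with too few purchases.
    sales = Counter(i for seq in deduped.values() for i in seq)
    kept = {u: [i for i in seq if sales[i] >= min_sales]
            for u, seq in deduped.items()}
    kept = {u: seq for u, seq in kept.items() if len(seq) >= min_purchases}
    # Last purchase is the test item; everything before it is training data.
    train = {u: seq[:-1] for u, seq in kept.items()}
    seen = {i for seq in train.values() for i in seq}
    # Test items that never occur in the training data are dropped.
    test = {u: seq[-1] for u, seq in kept.items() if seq[-1] in seen}
    return train, test

# Hypothetical toy data with small thresholds.
raw = {"a": [1, 2, 2, 3], "b": [2, 3, 1], "c": [1, 4]}
train, test = prepare_dataset(raw, min_sales=2, min_purchases=3)
```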

The example (Our Method) was compared with the following six models (Comparative Examples 1 to 6).
Comparative Example 1: First-order Markov model (1stMarkov)
Comparative Example 2: Second-order Markov model (2ndMarkov)
Comparative Example 3: Third-order Markov model (3rdMarkov)
Comparative Example 4: Model assuming that the gap Markov models are mutually independent (GapMarkov)
Comparative Example 5: Maximum entropy model that considers the purchase order (MaxEnt(seq))
Comparative Example 6: Maximum entropy model that does not consider the purchase order (MaxEnt)

The hyperparameters of the prior probability and the gap Markov model were obtained by leave-one-out cross-validation. That is, in the recommendation device 1 of this embodiment, the hyperparameter δ in Expression (5) and the hyperparameter δ in Expression (6) were obtained by leave-one-out cross-validation.
The prior distribution of the parameters of the maximum entropy models was a normal distribution with variance 1. That is, in the recommendation device 1 of this embodiment, the prior distribution of the unknown parameter (weight) α in Expression (12) was a normal distribution with variance 1.
In the recommendation device 1 of this embodiment, the weight α was obtained by 10-fold cross-validation.

<Results (correct answer rate)>
Table 3 shows the experimental results (correct answer rates) of each method.

For Comparative Example 4 (GapMarkov), Comparative Example 5 (MaxEnt(seq)), and the example (Our Method), experiments were run with methods using the purchase history up to l purchases back (l = 1, ..., 10). Table 3 shows the best correct answer rate obtained, with the corresponding l in parentheses. For example, for the example (Our Method), the correct answer rate was best at l = 8 for the music data and at l = 9 for the video data.
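The evaluation described above can be sketched as a top-1 correct answer rate followed by a search over l. The text does not define the correct answer rate formally, so treating it as the percentage of test users whose last purchase equals the recommended product is an assumption. The function names and all numbers below are made up for illustration and are not values from Table 3.

```python
def correct_answer_rate(predictions, test):
    """Percentage of test users whose held-out product was the one recommended."""
    hits = sum(1 for user, item in test.items() if predictions.get(user) == item)
    return 100.0 * hits / len(test)

def best_history_length(rate_by_l):
    """Return the l with the highest correct answer rate."""
    return max(rate_by_l, key=rate_by_l.get)

# Made-up illustration values.
rate = correct_answer_rate({"a": 3, "b": 2}, {"a": 3, "b": 1})
best_l = best_history_length({1: 10.0, 2: 12.5, 3: 11.0})
```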

As shown in Table 3, the example (Our Method) had the highest correct answer rate on both the music data and the video data. The correct answer rate of Comparative Example 4 (GapMarkov) is considered to be lower than that of the example (Our Method) because the assumption that the gap Markov models are mutually independent is not appropriate.

As Table 3 also shows, Comparative Example 3 (3rdMarkov) had a lower correct answer rate than Comparative Example 1 (1stMarkov) and Comparative Example 2 (2ndMarkov) on all data sets. This is considered to be because, at higher orders, the number of parameters becomes large relative to the amount of data, so robust estimation is not possible.

Comparing Comparative Example 5 (MaxEnt(seq)) with Comparative Example 6 (MaxEnt), Comparative Example 5 had the higher correct answer rate on all data sets. This suggests that taking the purchase order into account is important for raising the correct answer rate.

<Relationship between weight α and gap l>
FIG. 9 shows the relationship between the gap l and the weight α estimated by the recommendation device 1 in this example. In FIG. 9, "music" denotes the music data and "movie" denotes the video data. α_l (l = 1 to 10) is the weight of the l-gap Markov model, and α_l (l = 0) is the weight of the prior probability. As the graph in FIG. 9 shows, on all data sets the gap Markov models with small gaps have large weights. This agrees with the intuition that recent history carries the most information for predicting purchases.

<Correct answer rate using weight α estimated from data up to d days earlier>
For the music data, FIG. 10 shows the correct answer rate when the weight α estimated from data up to d days earlier is used and only the prior probability and the gap Markov model are updated. The hyperparameter δ of the prior probability and of the gap Markov model was also the value estimated from data up to d days earlier. The vertical axis of FIG. 10 is the correct answer rate (accuracy), in percent.

As shown in FIG. 10, the example (Our Method) has a higher correct answer rate (accuracy) than Comparative Example 6 (MaxEnt), Comparative Example 4 (GapMarkov), and Comparative Example 1 (1stMarkov). The correct answer rate of the example (Our Method) varies with how many days back the data used goes. As a result, the example (Our Method) is sometimes above and sometimes below Comparative Example 5 (MaxEnt(seq)), and on average its performance is comparable to that of Comparative Example 5. When, for example, the weight estimated from data up to 30 days earlier was used, the correct answer rate of the example (Our Method) exceeded that of Comparative Example 5 (MaxEnt(seq)) by about 1%.

These results lead to the following conclusion. When new data is added to the purchase history log 45 shown in FIG. 1, that is, when input data (processing data) 46 is added, the prior probability 47, the gap Markov model 48, and the weight 49 each need to be updated. Of these, updating the weight 49 requires more computation than updating the prior probability 47 and the gap Markov model 48. However, even if the weight α is updated only about once a month while the prior probability and the gap Markov model are updated at shorter intervals, a prediction accuracy as high as when the weight α is updated frequently can be achieved. Updating the weight α about once a month therefore effectively reduces the computational cost over the long term.
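The update policy suggested by these results can be sketched as a small scheduling rule: refresh the count-based parameters whenever new data arrives, and refresh the weights only when a longer interval has elapsed. The function name, the 30-day default, and the dates are hypothetical choices for illustration.

```python
from datetime import date, timedelta

def plan_update(today, last_weight_update, weight_interval=timedelta(days=30)):
    """Return which stored parameters to refresh when new purchase data arrives."""
    # Cheap to refresh: both are computed from simple sums.
    tasks = ["prior probability 47", "gap Markov model 48"]
    # Costly to refresh: requires numerical optimization, so do it rarely.
    if today - last_weight_update >= weight_interval:
        tasks.append("weight 49")
    return tasks

soon = plan_update(date(2007, 5, 18), last_weight_update=date(2007, 5, 1))
later = plan_update(date(2007, 6, 18), last_weight_update=date(2007, 5, 1))
```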

FIG. 1 is a block diagram showing the configuration of the recommendation device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the configuration of the preprocessing unit.
FIG. 3 is a functional block diagram showing the configuration of the extended Markov model estimation unit.
FIG. 4 is a functional block diagram showing the configuration of the weight estimation unit.
FIG. 5 is a functional block diagram showing the configuration of the recommendation unit.
FIG. 6 is a flowchart showing the operation of the recommendation device.
FIG. 7 is a flowchart showing the extended Markov model estimation process.
FIG. 8 is a flowchart showing the weight estimation process.
FIG. 9 is a graph showing the weights estimated using the purchase history up to l purchases back as gaps.
FIG. 10 is a graph showing the correct answer rate on the music data when the weights estimated from data up to d days earlier are used.

Explanation of symbols

1 Recommendation device
2 Calculation means
3 Input means
4 Storage means
5 Output means
6 Bus line
21 Preprocessing unit (preprocessing means)
22 Extended Markov model estimation unit
23 Weight estimation unit (weight estimation means)
24 Recommendation unit (recommendation means)
25 Memory
40a Program storage unit
41 Preprocessing program
42 Extended Markov model estimation program
43 Weight estimation program
44 Recommendation program
40b Data storage unit
45 Purchase history log
46 Input data
47 Prior probability
48 Gap Markov model
49 Weight
211 Purchase history log reading unit
212 Input data writing unit
221 Input data reading unit
222 Prior probability estimation unit (prior probability estimation means)
223 Gap Markov model estimation unit (gap Markov model estimation means)
224 Gap Markov model writing unit
231 Input data reading unit
232 Extended Markov model reading unit
233 Gap weight estimation unit
234 Weight writing unit
241 Input data reading unit
242 Extended Markov model reading unit
243 Weight reading unit
244 Maximum product selection unit
245 Recommendation output unit

Claims

A recommendation device that, based on the purchase order of a plurality of users who have purchased sales targets, each being a product or a service, presents to each user one of the individual targets belonging to the sales targets as a recommendation target, the recommendation device comprising:
preprocessing means for creating, using purchase history information including information on one or more individual targets purchased by the user in the past, processing data in which the purchase history of the individual targets is extracted for each user;
prior probability estimation means for estimating, using the created processing data, a prior probability indicating the probability that the user purchases the individual target;
gap Markov model estimation means for estimating, using the created processing data, a gap Markov model indicating the probability that, when the user purchases a predetermined individual target, an individual target purchased before it is a specific individual target;
weight estimation means for constructing a combined model in which the estimated prior probability and the estimated gap Markov model are combined by the maximum entropy principle, and estimating a weight indicating an unknown parameter of the constructed combined model; and
recommendation means for selecting, using the created processing data, the estimated prior probability, the estimated gap Markov model, and the estimated weight, the individual target for which the user's purchase probability calculated from the combined model is maximal, and presenting it as the recommendation target.

The weight estimation means
constructs the combined model using a first condition indicating that the log likelihood of the prior distribution based on the empirical distribution is equal to its expected value under the combined model, expressed by the product of the combined model and the log likelihood of the prior distribution, and
a second condition indicating that the log likelihood of the gap Markov model is equal to its expected value under the combined model, expressed by the product of the combined model and the log likelihood of the gap Markov model, and
estimates the weight by maximizing, over the processing data of all target users, the log likelihood of the purchase probabilities of all target individual objects.
The recommendation device according to claim 1.
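The two conditions and the resulting model can be written out as follows. The notation is assumed for illustration and does not come from the claims: u_k is a user's purchase history before the k-th purchase, s_k the item bought at the k-th purchase, P the prior, P_n the gap Markov model for gap n, \tilde{P} the empirical distribution, R the combined model, and \lambda_n the weights.

```latex
\begin{align}
% Features: log likelihoods of the prior and of each gap Markov model
f_0(u_k, s) &= \log P(s), &
f_n(u_k, s) &= \log P_n(s \mid u_k) \quad (n = 1, \dots, N) \\
% First condition (n = 0) and second condition (n >= 1)
\sum_{u_k, s} \tilde{P}(u_k, s)\, f_n(u_k, s)
  &= \sum_{u_k, s} \tilde{P}(u_k)\, R(s \mid u_k)\, f_n(u_k, s) \\
% Maximum entropy solution under these conditions
R(s \mid u_k) &= \frac{1}{Z(u_k)} \exp\Bigl( \sum_{n=0}^{N} \lambda_n f_n(u_k, s) \Bigr), &
Z(u_k) &= \sum_{s'} \exp\Bigl( \sum_{n=0}^{N} \lambda_n f_n(u_k, s') \Bigr) \\
% Weight estimation over all target users and purchases
\hat{\lambda} &= \arg\max_{\lambda} \sum_{u} \sum_{k} \log R(s_k \mid u_k)
\end{align}
```

Under this reading, the only unknown parameters are the N + 1 weights, one per component model.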

A recommendation method of a recommendation device that presents, to each user, one of the individual objects belonging to a sales target as a recommendation target, based on the purchase order of a plurality of users who have purchased the sales target, which indicates goods or services,
A preprocessing step in which preprocessing means creates processing data, which is data obtained by extracting the purchase history of the individual objects for each user, using purchase history information including information on one or more individual objects purchased by the user in the past;
A prior probability estimation step in which prior probability estimating means estimates, using the created processing data, a prior probability indicating the probability that the user purchases the individual object;
A gap Markov model estimation step in which gap Markov model estimation means estimates, using the created processing data, a gap Markov model indicating the probability that an individual object purchased before the user purchases a predetermined individual object is a specific individual object;
A weight estimation step in which weight estimation means constructs a combined model, which is a model obtained by combining the estimated prior probability and the estimated gap Markov model by the maximum entropy principle, and estimates a weight indicating an unknown parameter of the constructed combined model;
A recommendation step in which recommendation means selects the individual object for which the probability of purchase by the user, calculated from the combined model using the created processing data, the estimated prior probability, the estimated gap Markov model, and the estimated weight, is the highest, and presents it as a recommendation object;
A recommendation method characterized by comprising the above steps.

The weight estimation step
constructs the combined model using a first condition indicating that the log likelihood of the prior distribution based on the empirical distribution is equal to its expected value under the combined model, expressed by the product of the combined model and the log likelihood of the prior distribution, and
a second condition indicating that the log likelihood of the gap Markov model is equal to its expected value under the combined model, expressed by the product of the combined model and the log likelihood of the gap Markov model, and
estimates the weight by maximizing, over the processing data of all target users, the log likelihood of the purchase probabilities of all target individual objects.
The recommendation method according to claim 3.
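The weight estimation step can be sketched as plain gradient ascent on the log likelihood of a log-linear model. The claim specifies only that the log likelihood is maximized, so the optimizer, the learning rate `lr`, the iteration count `n_iter`, the function names, and the input layout are all assumptions. Each sample is taken to hold, for every candidate item, the list of log-likelihood features (prior first, then each gap model), together with the index of the item actually purchased.

```python
import math

def softmax(scores):
    """Normalize scores into probabilities, subtracting the max for stability."""
    top = max(scores)
    exps = [math.exp(v - top) for v in scores]
    z = sum(exps)
    return [e / z for e in exps]

def candidate_probs(weights, feats):
    """Combined-model probability of each candidate given its features."""
    return softmax([sum(w * f for w, f in zip(weights, row)) for row in feats])

def log_likelihood(weights, samples):
    """Sum of log probabilities of the items actually purchased."""
    return sum(math.log(candidate_probs(weights, feats)[chosen])
               for feats, chosen in samples)

def fit_weights(samples, n_features, lr=0.1, n_iter=500):
    """Gradient ascent: gradient = observed feature minus expected feature."""
    weights = [0.0] * n_features
    for _ in range(n_iter):
        grad = [0.0] * n_features
        for feats, chosen in samples:
            probs = candidate_probs(weights, feats)
            for n in range(n_features):
                expected = sum(p * row[n] for p, row in zip(probs, feats))
                grad[n] += feats[chosen][n] - expected
        weights = [w + lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights

# Toy samples (hypothetical): two candidates, two features per candidate.
# Feature 0 separates the candidates; feature 1 is identical for both.
feats = [[math.log(0.7), math.log(0.5)], [math.log(0.3), math.log(0.5)]]
samples = [(feats, 0), (feats, 0), (feats, 0), (feats, 1)]
weights = fit_weights(samples, n_features=2)
print(weights, log_likelihood(weights, samples))
```

At the optimum the gradient vanishes, which means the observed and expected values of each feature agree. A feature that does not distinguish between candidates, like feature 1 here, receives no weight.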

A recommendation program for causing a computer to execute the recommendation method according to claim 3.

A computer-readable recording medium on which the recommendation program according to claim 5 is recorded.