JP7530134B1

JP7530134B1 - Information processing device, program, and information processing method

Info

Publication number: JP7530134B1
Application number: JP2024054860A
Authority: JP
Inventors: ケニーイジョソン
Original assignee: Citadel Ai
Current assignee: Citadel Ai
Priority date: 2024-03-28
Filing date: 2024-03-28
Publication date: 2024-08-07
Anticipated expiration: 2044-03-28

Abstract

[Problem] To provide an information processing device for improving the reliability of a content generation system that uses generative AI.
[Solution] According to the present invention, an information processing device for verifying a content generation system using a generative AI is provided, comprising a control means and a storage means, wherein the control means comprises an output information acquisition unit, an output information verification unit, a feedback information acquisition unit, a history information recording unit, and a reward model learning unit, wherein the output information acquisition unit acquires AI output information from the generative AI in response to input from an inputter who inputs a prompt to the content generation system, the output information verification unit verifies the AI output information, the feedback information acquisition unit acquires feedback information from the inputter on system output information from the content generation system, the history information recording unit records the AI output information and the feedback information in the storage means as history information, and the reward model learning unit trains a reward model based on the history information.
[Selected Figure] Figure 1

Description

The present invention relates to an information processing device, a program, and an information processing method.

In recent years, applications and services that generate text using large language models (LLMs) have been developed. For example, Patent Literature 1 discloses a device that generates question text (a prompt) to be input to an LLM in order to improve the accuracy of the LLM's answers.

Patent Literature 1: Japanese Patent No. 7313757

However, even with improved prompts, an LLM or a text generation system using an LLM may still generate inappropriate answers, so text generation systems could not be provided or used safely. Moreover, this problem of undesired output is not limited to LLMs and their text generation systems; it also arises in other generative AI, such as image and video generation AI, and in the corresponding content generation systems.

The present invention has been made in view of these circumstances, and provides an information processing device for improving the reliability of a content generation system that uses generative AI.

According to the present invention, the following inventions are provided.
[1] An information processing device for verifying a content generation system using a generative AI, comprising a control means and a storage means, wherein the control means comprises an output information acquisition unit, an output information verification unit, a feedback information acquisition unit, a history information recording unit, and a reward model learning unit, wherein the output information acquisition unit acquires AI output information from the generative AI in response to input from an inputter who inputs a prompt to the content generation system, the output information verification unit verifies the AI output information, the feedback information acquisition unit acquires feedback information from the inputter on system output information from the content generation system, the history information recording unit records the AI output information and the feedback information as history information in the storage means, and the reward model learning unit trains a reward model based on the history information.
[2] The information processing device according to [1], wherein the control means further comprises an input information acquisition unit, the input information acquisition unit acquires at least one of inputter input information from the inputter to the content generation system and system input information from the content generation system to the generative AI, the history information recording unit records, in addition to the AI output information and the feedback information, at least one of the inputter input information and the system input information as the history information in the storage means, and the reward model learning unit trains the reward model based on that history information.
[3] The information processing device according to [2], wherein the input information acquisition unit and the feedback information acquisition unit provide an API to the content generation system in advance and use the API to acquire, from the content generation system, at least one of the inputter input information and the system input information, and the feedback information, respectively.
[4] The information processing device according to any one of [1] to [3], wherein the output information acquisition unit provides an API to the content generation system in advance and uses the API to acquire the AI output information from the content generation system.
[5] The information processing device according to any one of [1] to [4], wherein the control means further comprises an output information presentation unit, and the output information presentation unit, based on the verification result of the output information verification unit, presents the verification result to the inputter together with, or in place of, the system output information.
[6] The information processing device according to any one of [1] to [5], wherein the reward model learning unit, in a learning phase, pre-trains the reward model based on the AI output information and an evaluation of the AI output information, the output information verification unit, in a provision phase, verifies the AI output information using the reward model, and the reward model learning unit, in the provision phase, updates the reward model based on the history information.
[7] The information processing device according to [6], wherein the control means further comprises an input information acquisition unit, the input information acquisition unit acquires, in the learning phase, at least one of inputter input information from the inputter to the content generation system and system input information from the content generation system to the generative AI, and the reward model learning unit, in the learning phase, trains the reward model based on at least one of the inputter input information and the system input information for a training dataset input by the inputter, the corresponding AI output information, and an evaluation of that AI output information.
[8] The information processing device according to [6] or [7], wherein the output information verification unit, in the provision phase, in addition to verification using the reward model, further verifies at least one of the harmfulness, accuracy, and emotion of the AI output information.
[9] The information processing device according to [2] or [3], wherein the control means further comprises an input information verification unit, and the input information verification unit verifies at least one of the harmfulness and emotion of the inputter input information.
[10] The information processing device according to any one of [1] to [9], wherein the reward model learning unit trains the reward model for each single inputter or for each set of multiple inputters belonging to the same group.
[11] The information processing device according to any one of [1] to [10], wherein the generative AI is a large language model and the content generation system is a text generation system.
[12] An information processing method for verifying a content generation system using a generative AI, comprising performing an output information acquisition process, an output information verification process, a feedback information acquisition process, a history information recording process, and a reward model learning process, wherein the output information acquisition process acquires AI output information from the generative AI in response to input from an inputter who inputs a prompt to the content generation system, the output information verification process verifies the AI output information, the feedback information acquisition process acquires feedback information from the inputter on system output information from the content generation system, the history information recording process records the AI output information and the feedback information as history information in a storage means, and the reward model learning process trains a reward model based on the history information.
[13] A program for verifying a content generation system using a generative AI, the program causing a computer to execute an output information acquisition process, an output information verification process, a feedback information acquisition process, a history information recording process, and a reward model learning process, wherein the output information acquisition process acquires AI output information from the generative AI in response to input from an inputter who inputs a prompt to the content generation system, the output information verification process verifies the AI output information, the feedback information acquisition process acquires feedback information from the inputter on system output information from the content generation system, the history information recording process records the AI output information and the feedback information as history information in a storage means, and the reward model learning process trains a reward model based on the history information.

The present invention makes it possible to improve the reliability of a content generation system that uses generative AI.

FIG. 1 is a diagram showing the overall configuration of a system including an information processing device 100 according to a first embodiment of the present invention.
FIG. 2A is a block diagram showing the hardware configuration of the information processing device 100 of FIG. 1, and FIG. 2B is a block diagram showing the hardware configuration of a user terminal 400.
FIG. 3 is a block diagram showing the functional configuration of the control means 1 of the information processing device 100 of FIG. 2A.
FIG. 4 is a sequence diagram showing the processing performed among the user terminal 400, the content generation system 200, the generative AI 300, and the information processing device 100 when a user U uses the content generation system 200.
Comparison screen example D1, output by the content generation system 200 and displayed on the display means 405 of the user terminal 400 of FIG. 1.
Screen example D2, output by the content generation system 200 and displayed on the display means 405 of the user terminal 400 of FIG. 1.
Comparison screen example D3, output by the content generation system 200 and displayed on the display means 405 of the user terminal 400 of FIG. 1.
Screen example D4, output by the information processing device 100 and displayed on the display means 405 of the user terminal 400 of FIG. 1.

Embodiments of the present invention are described below. The various features shown in the embodiments below can be combined with one another. Each feature also independently constitutes an invention.

1. First Embodiment
1.1 Overall Configuration of the System Including the Information Processing Device 100
FIG. 1 is a diagram showing the overall configuration of a system including an information processing device 100 according to an embodiment of the present invention, a content generation system 200 to which the device is applied, a generative AI 300, and a user terminal 400 used by a user U of the content generation system 200. Here, the generative AI 300 to which the information processing device 100 of this embodiment is applied is an LLM (large language model), and the content generation system 200 is a text generation system that uses the generative AI 300 to generate text as content.

More specifically, an LLM is an AI that learns from large amounts of text data and generates human-like text. Given an instruction or question entered as a prompt, it returns text as a response, for example completing a sentence, answering a question, or writing an essay.

More specifically, the content generation system 200 accepts user input information (inputter input information) from the user U, acting as the inputter, via the user terminal 400, processes that inputter input information as appropriate, and transmits it to the generative AI 300 as system input information. It then obtains AI output information from the generative AI 300 and provides it to the user U as system output information. Examples of the content generation system 200 to which the information processing device 100 of this embodiment is applied include systems created or customized for each organization of a given kind, such as a call center, a sales department, or a technical department. For customization, data unique to the organization, such as response manuals and business manuals, is imported and used when processing user input information into system input information and when processing AI output information into system output information. The content generation system 200 may, however, also be customized for each individual.
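The four kinds of information described above can be pictured with a minimal sketch. Everything here is an illustrative assumption rather than part of the disclosure: the names `Exchange`, `run_exchange`, and `stub_model`, the way organization data is prepended to the prompt, and the trivial post-processing are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Exchange:
    user_input: str     # what user U typed (inputter input information)
    system_input: str   # processed prompt sent to the generative AI
    ai_output: str      # raw response returned by the generative AI
    system_output: str  # processed response shown to user U


def run_exchange(user_input: str, org_context: str,
                 generate: Callable[[str], str]) -> Exchange:
    """One pass through a hypothetical content generation system."""
    # Assumption: customization is done by prepending organization data.
    system_input = f"{org_context}\n\nUser question: {user_input}"
    ai_output = generate(system_input)
    # Assumption: post-processing only trims surrounding whitespace.
    system_output = ai_output.strip()
    return Exchange(user_input, system_input, ai_output, system_output)


def stub_model(prompt: str) -> str:
    """Stand-in for the generative AI; it just echoes the last prompt line."""
    return "  (stub answer to) " + prompt.splitlines()[-1] + "  "
```

Calling `run_exchange("How do I reset my password?", "Follow the support manual.", stub_model)` returns an `Exchange` whose four fields correspond to the user input, system input, AI output, and system output information.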

The information processing device 100 of this embodiment is used to verify whether such a content generation system 200 outputs inappropriate answers, that is, whether the generative AI 300 operating behind the content generation system 200 outputs inappropriate answers. The information processing device 100 of this embodiment is described in detail below. In this embodiment, the information processing device 100, the content generation system 200, the generative AI 300, and the user terminal 400 are connected via a network N.

1.2 Hardware Configuration of the Information Processing Device 100
FIG. 2A is a block diagram showing the hardware configuration of the information processing device 100. As shown in FIG. 2A, the information processing device 100 specifically includes a control means 1, a storage means 2, and a communication means 3.

The control means 1 is composed of one or more processors such as a CPU (Central Processing Unit), and controls the operation of the entire information processing device 100 by executing a predetermined program stored in the storage means 2. The control means 1 may also include other processors such as a GPU (Graphics Processing Unit) or a DSP (Digital Signal Processor). At least part of the control means 1 may be an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).

Part of the storage means 2 is composed of, for example, RAM (Random Access Memory) or DRAM (Dynamic Random Access Memory), and is used as a work area or the like when the control means 1 executes processing based on various programs. Another part of the storage means 2 is, for example, a non-volatile memory such as ROM (Read Only Memory), an HDD (Hard Disk Drive), or an SSD (Solid State Drive), and stores various data and the programs used in the processing of the control means 1. At least part of the storage means 2 may be composed of an external cloud or distributed storage.

The programs stored in the storage means 2 include, for example, an OS (Operating System) for implementing the basic functions of the information processing device 100, drivers for controlling various hardware, and programs for implementing various functions, and include the computer program according to this embodiment.

As shown in FIG. 2A, the storage means 2 of this embodiment also stores history information HS and a reward model RM. The history information HS includes system input information, AI output information, and feedback information, which is the user U's feedback on the system output information. The reward model RM is a model for evaluating the output (answer) of the generative AI 300 and, in this embodiment, is trained on the history information HS. Details of the history information HS and the reward model RM are described later.

The communication means 3 is, for example, a NIC (Network Interface Controller) and has a function of connecting to the network N. Instead of or in addition to the NIC, the communication means 3 may have a function of connecting to a wireless LAN (Local Area Network), a function of connecting to a wireless WAN (Wide Area Network), and functions enabling short-range wireless communication such as Bluetooth (registered trademark), infrared communication, and the like. The information processing device 100 is connected to the content generation system 200 and the user terminal 400 via the network N, and can transmit and receive various data.

The control means 1, the storage means 2, and the communication means 3 are electrically connected to one another via a bus 4. The control means 1 can therefore access the storage means 2 and, via the communication means 3, communicate with the content generation system 200, the user terminal 400, and so on.

The information processing device 100 need not be configured as a single device as shown in FIG. 2; it may be realized by a plurality of devices using so-called cloud or distributed computing technology. Likewise, the various data (information) stored in the storage means 2 need not be held in a single database as illustrated, and may be held in a distributed database.

1.3 Hardware Configuration of the User Terminal 400
FIG. 2B is a block diagram showing the hardware configuration of the user terminal 400. The user terminal 400 is a terminal used by the user U of the content generation system 200 and is, for example, an information processing terminal such as a personal computer (PC), a smartphone, a tablet terminal, or an in-vehicle terminal. As shown in FIG. 2B, the user terminal 400 specifically includes a control means 401, a storage means 402, a communication means 403, an input means 404, and a display means 405. These components are electrically connected to one another via a bus 406.

The general configurations of the control means 401, the storage means 402, and the communication means 403 are the same as those of the information processing device 100 described above, so their description is omitted.

The input means 404 is a device that accepts input from the user U and is composed of various input devices such as a mouse, a keyboard, a touch panel, and a microphone. The display means 405 is a liquid crystal display, a touch panel display, or the like, and displays images and other content to the user U.

1.4 Functional Configuration of the Information Processing Device 100 (Control Means 1)
As shown in FIG. 3, the control means 1 of the information processing device 100 includes an input information acquisition unit 10, an output information acquisition unit 11, a feedback information acquisition unit 12, a history information recording unit 13, a reward model learning unit 14, an output information presentation unit 15, an input information verification unit 16, and an output information verification unit 17. With this functional configuration, the information processing device 100 of this embodiment verifies whether the content generation system 200 (generative AI 300) outputs inappropriate answers. Each functional component is described below.

In this embodiment, the verification function of the information processing device 100 is provided by exchanging information with the content generation system 200 via an API. In FIG. 1 and the following description, "user input information," "system input information," "AI output information," "system output information," and "feedback information" may each refer, depending on context, either to the natural language itself that the user U inputs or is shown, or to its digitized (encoded) form used for transmission and reception or for recording in the storage means 2. It is also preferable that this information be encrypted.

As an input information acquisition process (see step S4 in FIG. 4), the input information acquisition unit 10 uses an API (Application Programming Interface) to acquire the system input information sent from the content generation system 200 to the generative AI 300. Specifically, the input information acquisition unit 10 provides the API to the content generation system 200 in advance, and when the content generation system 200 transmits system input information to the generative AI 300, it causes the content generation system 200 to also transmit that system input information to the information processing device 100, either at the same time or shortly before or after.

As an output information acquisition process (see step S10 in FIG. 4), the output information acquisition unit 11 uses the API to acquire the AI output information sent from the generative AI 300 to the content generation system 200. Specifically, the output information acquisition unit 11 provides the API to the content generation system 200 in advance, and when the content generation system 200 obtains AI output information from the generative AI 300, it causes the content generation system 200 to transmit that AI output information to the information processing device 100, either at the same time or afterwards.
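The relay pattern described for the input and output information acquisition units can be sketched as follows. This is only an illustration under stated assumptions: the class `VerifierAPI`, its methods `submit_system_input` and `submit_ai_output`, the `interaction_id` argument, and the in-memory list standing in for network transport are all hypothetical and are not taken from the disclosure.

```python
from typing import Callable, List, Tuple


class VerifierAPI:
    """Hypothetical stand-in for the API that device 100 provides to system 200."""

    def __init__(self) -> None:
        # Each entry is (kind, interaction_id, text), in order of arrival.
        self.received: List[Tuple[str, str, str]] = []

    def submit_system_input(self, interaction_id: str, text: str) -> None:
        self.received.append(("system_input", interaction_id, text))

    def submit_ai_output(self, interaction_id: str, text: str) -> None:
        self.received.append(("ai_output", interaction_id, text))


def call_with_relay(api: VerifierAPI, interaction_id: str, system_input: str,
                    generate: Callable[[str], str]) -> str:
    """Code on the content generation system side, wrapping the generative AI call."""
    # Relay the system input around the time it is sent to the generative AI.
    api.submit_system_input(interaction_id, system_input)
    ai_output = generate(system_input)
    # Relay the AI output once it has been obtained.
    api.submit_ai_output(interaction_id, ai_output)
    return ai_output
```

In this sketch the content generation system stays in control of the call to the generative AI, and the verifier only receives copies, which matches the description that the content generation system is made to transmit the information to the information processing device 100.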

As a feedback information acquisition process (see step S18 in FIG. 4), the feedback information acquisition unit 12 uses the API to acquire feedback information from the user U on the system output information. Specifically, the feedback information acquisition unit 12 provides the API to the content generation system 200 in advance, and when the content generation system 200 obtains feedback information from the user terminal 400, it causes the content generation system 200 to transmit that feedback information to the information processing device 100, either at the same time or afterwards. Feedback information is information on the user U's evaluation of the system output information that the content generation system 200 presented to the user U via the display means 405 of the user terminal 400, in other words, of the response to the prompt the user U entered. In this embodiment, the feedback information is, for example, a two-level "good/bad" rating (three levels if no response is counted). It may, however, be a rating with more levels (for example, a numerical score from 0 to 100) or a written evaluation.
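One way to handle the different feedback forms mentioned above is to map them onto a common scale before recording them. The sketch below is an assumption made for illustration: the function name `normalize_feedback`, the choice of a 0.0 to 1.0 scale, and the use of `None` for "no response" are hypothetical, and written evaluations are deliberately left out.

```python
from typing import Optional, Union


def normalize_feedback(raw: Union[str, int, float, None]) -> Optional[float]:
    """Map raw feedback to a score in [0.0, 1.0]; None means no response."""
    if raw is None:
        return None  # the user gave no rating
    if isinstance(raw, bool):
        raise TypeError("boolean feedback is not supported in this sketch")
    if isinstance(raw, str):
        label = raw.strip().lower()
        if label == "good":
            return 1.0
        if label == "bad":
            return 0.0
        raise ValueError(f"unknown feedback label: {raw!r}")
    # Numeric ratings are assumed to be on the 0 to 100 scale.
    if not 0 <= raw <= 100:
        raise ValueError("numeric feedback must be between 0 and 100")
    return raw / 100.0
```

Keeping "no response" distinct from "bad" preserves the three-level reading of the two-level rating described above.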

As a history information recording process (see steps S5, S11, and S19 in FIG. 4), the history information recording unit 13 records the system input information acquired by the input information acquisition unit 10, the AI output information acquired by the output information acquisition unit 11, and the feedback information acquired by the feedback information acquisition unit 12 in the storage means 2 as history information HS. More specifically, the history information recording unit 13 associates with one another the system input information, AI output information, and feedback information acquired during a single sequence in which the user U enters a prompt, obtains a response, and gives feedback on that response, and records them as history information HS.
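The association of the three pieces of information can be sketched as a store keyed by an interaction identifier. The identifier, the names `HistoryRecord` and `HistoryStore`, and the in-memory dictionary are assumptions for illustration; the source does not say how the association is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HistoryRecord:
    system_input: Optional[str] = None
    ai_output: Optional[str] = None
    feedback: Optional[str] = None


class HistoryStore:
    """In-memory stand-in for the history information HS in the storage means."""

    _FIELDS = ("system_input", "ai_output", "feedback")

    def __init__(self) -> None:
        self._records: Dict[str, HistoryRecord] = {}

    def record(self, interaction_id: str, field: str, value: str) -> None:
        """Attach one piece of information to the record for this interaction."""
        if field not in self._FIELDS:
            raise ValueError(f"unknown field: {field}")
        rec = self._records.setdefault(interaction_id, HistoryRecord())
        setattr(rec, field, value)

    def complete_records(self) -> List[HistoryRecord]:
        """Records that have all three parts and could be used for training."""
        return [r for r in self._records.values()
                if None not in (r.system_input, r.ai_output, r.feedback)]
```

Because the three pieces arrive at different steps (S5, S11, and S19 in the description), the store fills each record incrementally and only treats it as usable once all three parts are present.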

報酬モデル学習部１４は、報酬モデル学習処理（図４のステップＳ２０参照）として、履歴情報記録部１３が記録した履歴情報ＨＳに基づいて報酬モデルＲＭを学習させる。本実施形態において、報酬モデルＲＭの学習には、学習フェーズにおける事前学習と提供フェーズにおける更新が含まれる。ここで、学習フェーズにおける事前学習は、主にユーザＵがコンテンツ生成システム２００を使用する前の報酬モデルＲＭの学習であり、提供フェーズにおける更新は、ユーザＵが実際にコンテンツ生成システム２００を使用するなかでの報酬モデルＲＭの更新である。 The reward model learning unit 14 learns the reward model RM based on the history information HS recorded by the history information recording unit 13 as a reward model learning process (see step S20 in FIG. 4). In this embodiment, learning the reward model RM includes pre-learning in the learning phase and updating in the provision phase. Here, the pre-learning in the learning phase is mainly learning the reward model RM before the user U uses the content generation system 200, and the updating in the provision phase is updating the reward model RM while the user U actually uses the content generation system 200.

学習フェーズにおける事前学習は、具体的には、コンテンツ生成システム２００の開発者やコンテンツ生成システム２００を運用する運用者、コンテンツ生成システム２００のβ版ユーザ、さらには本実施形態に係る情報処理装置１００の提供者など、ユーザＵとは別の入力者に学習用データセットとして多数のプロンプト（最低でも数百件のプロンプト）を入力させ、当該入力者からプロンプト（つまりシステム入力情報）ごとにＡＩ出力情報及びシステム出力情報に対する評価（フィードバック情報）を得ることで実施される。ここで、フィードバックをもらう入力者は１人であってもよく、２人以上であっても良い。ただし、１つの報酬モデルＲＭは、単一入力者ごと又は同じ目的でコンテンツ生成システム２００を使用する組織（グループ）に所属する複数の入力者ごとに作成され、学習される。また、人間ではなく、他の生成ＡＩに評価をさせることで報酬モデルＲＭを学習させるようなことも考えられる。 Specifically, the pre-learning in the learning phase is performed by having an inputter other than the user U, such as a developer of the content generation system 200, an operator who operates the content generation system 200, a beta user of the content generation system 200, or a provider of the information processing device 100 according to this embodiment, input a large number of prompts (at least several hundred) as a learning dataset, and by obtaining from that inputter an evaluation (feedback information) of the AI output information and the system output information for each prompt (i.e., each piece of system input information). Here, feedback may be obtained from one inputter or from two or more inputters. However, one reward model RM is created and trained for each single inputter, or for each set of inputters belonging to an organization (group) that uses the content generation system 200 for the same purpose. It is also conceivable to train the reward model RM by having another generative AI, rather than a human, perform the evaluation.

一方、提供フェーズにおける更新は、具体的には、ユーザＵがコンテンツ生成システム２００を使用して履歴情報ＨＳが追加されるごと、つまり、ユーザＵがプロンプトを入力してレスポンスを得、当該レスポンスに対するフィードバックをする一連の流れごと、あるいは履歴情報ＨＳが所定量蓄積されるごと（例えば、１００回のフィードバックが得られるごと）に実行される。 On the other hand, updates in the provision phase are specifically performed each time history information HS is added by user U using content generation system 200, that is, each time a series of steps is performed in which user U inputs a prompt, obtains a response, and provides feedback on the response, or each time a predetermined amount of history information HS is accumulated (for example, each time 100 pieces of feedback are obtained).
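
The two update timings described above (after every feedback, or after a predetermined amount has accumulated) can be sketched with a small counter. The class name `UpdateScheduler` and its interface are assumptions for illustration; the description does not prescribe how the trigger is implemented.

```python
class UpdateScheduler:
    """Decides when the reward model should be updated (hypothetical helper).

    batch_size=1 updates on every feedback; batch_size=100 corresponds to
    the "every 100 pieces of feedback" example in the text.
    """

    def __init__(self, batch_size: int = 1) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self._pending = 0

    def on_feedback(self) -> bool:
        """Register one new feedback; return True when an update is due."""
        self._pending += 1
        if self._pending >= self.batch_size:
            self._pending = 0  # start counting toward the next update
            return True
        return False
```

A caller would invoke `on_feedback()` each time step S19 records feedback and run the reward model update only when it returns `True`.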

出力情報提示部１５は、出力情報検証部１７の検証結果に基づいて、出力情報提示処理として、ユーザ端末４００の表示手段４０５を介し、システム出力情報とともに、又はシステム出力情報に代えて、出力情報検証部１７による検証結果をユーザＵに提示する（図６の警告文Ｗを参照）。ここで、「システム出力情報に代えて」というのは、出力情報検証部１７の検証結果が好ましくなくユーザＵにシステム出力情報を提示するべきではないと判断した場合には、当該システム出力情報は表示手段４０５に表示させないということである。なお、具体的な検証結果の提示の方法については、後述する。また、出力情報提示部１５は、履歴情報ＨＳに基づき、コンテンツ生成システム２００の入出力の統計情報を提示する（図８参照）。 Based on the verification result of the output information verification unit 17, the output information presentation unit 15 presents the verification result by the output information verification unit 17 to the user U via the display means 405 of the user terminal 400 together with or instead of the system output information as an output information presentation process (see warning message W in FIG. 6). Here, "instead of the system output information" means that if the verification result of the output information verification unit 17 is unfavorable and it is determined that the system output information should not be presented to the user U, the system output information is not displayed on the display means 405. A specific method of presenting the verification result will be described later. In addition, the output information presentation unit 15 presents statistical information on the input and output of the content generation system 200 based on the history information HS (see FIG. 8).

入力情報検証部１６は、入力情報検証処理（図４のステップＳ６参照）として、入力情報取得部１０が取得したユーザ入力情報を検証する。本実施形態の入力情報検証部１６は、具体的には、図３に示すように、入力有害性フィルタ１６ａと入力感情フィルタ１６ｂとを備え、ユーザ入力情報の有害性及び感情を検証するよう構成される。ここで、「ユーザ入力情報の有害性」とは、ユーザ入力の中に差別や偏見など不適切な用語や表現が含まれる状態のことであり、「ユーザ入力情報の感情」とは、ユーザ入力から推測されるポジティブな感情表現あるいはネガティブな感情表現のことである。なお、入力情報検証部１６は、入力有害性フィルタ１６ａと入力感情フィルタ１６ｂのいずれかのみを備えていても良い。また、入力情報検証部１６は、これら以外のフィルタを追加で備えていても良い。 The input information verification unit 16 verifies the user input information acquired by the input information acquisition unit 10 as an input information verification process (see step S6 in FIG. 4). Specifically, as shown in FIG. 3, the input information verification unit 16 of this embodiment is configured to include an input harmfulness filter 16a and an input emotion filter 16b, and to verify the harmfulness and emotion of the user input information. Here, the "harmfulness of the user input information" refers to a state in which the user input contains inappropriate terms or expressions such as discrimination or prejudice, and the "emotion of the user input information" refers to a positive or negative emotional expression inferred from the user input. The input information verification unit 16 may include only either the input harmfulness filter 16a or the input emotion filter 16b. The input information verification unit 16 may also include additional filters other than these.

出力情報検証部１７は、出力情報検証処理（図４のステップＳ１３参照）として、出力情報取得部１１が取得したＡＩ出力情報を検証する。本実施形態の出力情報検証部１７は、具体的には、図３に示すように、出力有害性フィルタ１７ａと、出力感情フィルタ１７ｂと、ファクトチェックフィルタ１７ｃと、報酬モデルフィルタ１７ｄとを備える。出力有害性フィルタ１７ａは、ＡＩ出力情報の有害性を検証し、出力感情フィルタ１７ｂは、ＡＩ出力情報の感情を検証する。ここで、「ＡＩ出力情報の有害性」とは、ＡＩ出力の中に差別や偏見など不適切な用語や表現が含まれる状態のことであり、「ＡＩ出力情報の感情」とは、ＡＩ出力から推測されるポジティブな感情表現あるいはネガティブな感情表現のことである。また、ファクトチェックフィルタ１７ｃは、ＡＩ出力情報の内容が事実に沿ったものであるか（正確性）を検証するものであり、事実との一致度合いが出力される。加えて、報酬モデルフィルタ１７ｄは、報酬モデルＲＭを用いてＡＩ出力情報を検証する。報酬モデルフィルタ１７ｄは、具体的には、システム入力情報に対してコンテンツ生成システム２００が出力するシステム出力情報（基本的に生成ＡＩ３００が出力するＡＩ出力情報と同じもの）を評価（例えば、０点～１点での数値評価）する。ただし、報酬モデルフィルタ１７ｄによるシステム出力情報の評価は、例えば、Ａ，Ｂ，Ｃなどのクラス分けとすることもできる。報酬モデルフィルタ１７ｄによるＡＩ出力情報の検証については、後述する。なお、出力情報検証部１７は、出力感情フィルタ１７ｂ、ファクトチェックフィルタ１７ｃ及び報酬モデルフィルタ１７ｄの少なくとも１つを備えていなくても良い。また、出力情報検証部１７は、これら以外のフィルタを追加で備えていても良い。 The output information verification unit 17 verifies the AI output information acquired by the output information acquisition unit 11 as an output information verification process (see step S13 in FIG. 4). Specifically, as shown in FIG. 3, the output information verification unit 17 of this embodiment includes an output harmfulness filter 17a, an output emotion filter 17b, a fact check filter 17c, and a reward model filter 17d. The output harmfulness filter 17a verifies the harmfulness of the AI output information, and the output emotion filter 17b verifies the emotion of the AI output information. Here, the "harmfulness of the AI output information" refers to a state in which the AI output contains inappropriate terms or expressions such as discrimination or prejudice, and the "emotion of the AI output information" refers to a positive emotional expression or a negative emotional expression inferred from the AI output. In addition, the fact check filter 17c verifies whether the content of the AI output information is in line with the facts (accuracy), and outputs the degree of agreement with the facts. In addition, the reward model filter 17d verifies the AI output information using the reward model RM. 
Specifically, the reward model filter 17d evaluates (e.g., with a numerical score from 0 to 1) the system output information that the content generation system 200 outputs in response to the system input information (which is essentially the same as the AI output information output by the generation AI 300). However, the evaluation of the system output information by the reward model filter 17d may instead be a classification into classes such as A, B, and C. Verification of the AI output information by the reward model filter 17d will be described later. Note that the output information verification unit 17 may omit one or more of the output emotion filter 17b, the fact check filter 17c, and the reward model filter 17d. The output information verification unit 17 may also include additional filters other than these.
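
One way to picture a verification unit with an optional, extensible set of filters is a mapping from filter names to scoring functions. The sketch below is an illustration only: `OutputVerifier`, the `Filter` signature, and the keyword-based `toy_harmfulness_filter` are hypothetical stand-ins, and a real harmfulness, emotion, fact check, or reward model filter would be a trained model rather than a word list.

```python
import re
from typing import Callable, Dict

# A filter takes (system_input, ai_output) and returns a score.
Filter = Callable[[str, str], float]


def toy_harmfulness_filter(system_input: str, ai_output: str) -> float:
    """Placeholder filter: 1.0 if a blocked word appears, else 0.0."""
    blocked_terms = {"idiot", "stupid"}  # illustrative word list only
    words = set(re.findall(r"[a-z]+", ai_output.lower()))
    return 1.0 if words & blocked_terms else 0.0


class OutputVerifier:
    """Runs every configured filter and collects the scores by name."""

    def __init__(self, filters: Dict[str, Filter]) -> None:
        self._filters = dict(filters)

    def verify(self, system_input: str, ai_output: str) -> Dict[str, float]:
        return {
            name: check(system_input, ai_output)
            for name, check in self._filters.items()
        }
```

Because the filters are held in a dictionary, leaving one out or adding another does not change the calling code, which mirrors the optional filter set described above.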

なお、上述した各機能構成及び処理は、情報処理装置１００に適宜インストールされるソフトウェア（いわゆるアプリを含む）によって実現してもよく、ハードウェアによって実現してもよい。ソフトウェアによって実現する場合、制御手段１がソフトウェアを構成するプログラムを実行することによって各種機能を実現することができる。また、単一のソフトウェアではなく、複数のソフトウェアによって実現されていても良い。 The above-mentioned functional configurations and processes may be realized by software (including so-called apps) that is appropriately installed in the information processing device 100, or may be realized by hardware. When realized by software, the various functions can be realized by the control means 1 executing the programs that make up the software. Also, they may be realized by multiple pieces of software rather than a single piece of software.

プログラムを実行することで実現される場合、当該プログラムは、情報処理装置１００が内蔵する記憶手段２に格納してもよく、コンピュータが読み取り可能な非一時的な記録媒体に格納してもよい。また、外部の記憶装置に格納されたプログラムを読み出し、いわゆるクラウドコンピューティングにより実現してもよい。もしくは、ハードウェアによって実現する場合、ＡＳＩＣ、ＳＯＣ、ＦＰＧＡ、又はＤＲＰなどの種々の回路によって実現することができる。また、上述した機能構成のうちの少なくとも一部の機能構成を、ソフトウェア又はハードウェアによって、入力を受け付けるユーザ端末４００等で処理されるようにしてもよい。 When it is realized by executing a program, the program may be stored in the storage means 2 built into the information processing device 100, or may be stored in a non-transitory recording medium that is readable by a computer. It may also be realized by reading out a program stored in an external storage device, so-called cloud computing. Or, when it is realized by hardware, it can be realized by various circuits such as an ASIC, SOC, FPGA, or DRP. Also, at least a part of the functional configurations described above may be processed by software or hardware on a user terminal 400 that accepts input, etc.

また、上述した機能構成は、複数のコンピュータによって実現してもよく、その場合、上述した各機能構成は、複数のコンピュータに分散して配置してもよい。 The above-mentioned functional configuration may be realized by multiple computers, in which case each of the above-mentioned functional configurations may be distributed among multiple computers.

１．５情報処理装置１００による情報処理方法 1.5 Information processing method by the information processing device 100
次に、図４のシーケンス図及び図５～図８の画面例Ｄ１～Ｄ４を用いて、本実施形態の情報処理装置１００がコンテンツ生成システム２００の出力（生成ＡＩ３００の出力）を検証する情報処理方法を説明する。なお、図４のシーケンス図には、情報処理装置１００の処理だけでなく、ユーザ端末４００、コンテンツ生成システム２００及び生成ＡＩ３００の処理も含まれている。また、上述したように、本実施形態の情報処理装置１００による検証は、ＡＰＩを介してコンテンツ生成システム２００との間で必要な情報を送受信することで実現される。 Next, an information processing method in which the information processing device 100 of this embodiment verifies the output of the content generation system 200 (the output of the generation AI 300) will be described using the sequence diagram of FIG. 4 and screen examples D1 to D4 of FIGS. 5 to 8. Note that the sequence diagram of FIG. 4 includes not only the processing of the information processing device 100, but also the processing of the user terminal 400, the content generation system 200, and the generation AI 300. Also, as described above, the verification by the information processing device 100 of this embodiment is realized by exchanging the necessary information with the content generation system 200 via an API.

本実施形態の情報処理方法は、具体的には、まず、ステップＳ１において、ユーザ端末４００が入力手段４０４を介してユーザＵからコンテンツ生成システム２００への入力、すなわちプロンプトを受け付ける。次に、ステップＳ２において、ユーザ端末４００は、受け付けたユーザ入力情報をコンテンツ生成システム２００に送信する。次に、コンテンツ生成システム２００は、ステップＳ３において、必要に応じてユーザ入力情報を加工する。ここで、コンテンツ生成システム２００によるユーザ入力情報の加工には、例えば、ユーザＵが入力したプロンプトに、ＲＡＧ（ＲｅｔｒｉｅｖａｌＡｕｇｍｅｎｔｅｄＧｅｎｅｒａｔｉｏｎ）の仕組みを使って補足情報を付加する等が考えられる。 Specifically, in the information processing method of this embodiment, first, in step S1, the user terminal 400 accepts an input, i.e., a prompt, from the user U to the content generation system 200 via the input means 404. Next, in step S2, the user terminal 400 transmits the accepted user input information to the content generation system 200. Next, in step S3, the content generation system 200 processes the user input information as necessary. Here, processing of the user input information by the content generation system 200 can be, for example, by adding supplemental information to the prompt entered by the user U using a mechanism of RAG (Retrieval Augmented Generation).

次に、ステップＳ４において、コンテンツ生成システム２００は、当該加工したユーザ入力情報をシステム入力情報として情報処理装置１００に送信し、情報処理装置１００の制御手段１の入力情報取得部１０は、当該システム入力情報を取得する。 Next, in step S4, the content generation system 200 transmits the processed user input information to the information processing device 100 as system input information, and the input information acquisition unit 10 of the control means 1 of the information processing device 100 acquires the system input information.

次に、制御手段１の履歴情報記録部１３は、ステップＳ５において、入力情報取得部１０の取得した履歴情報ＨＳとして記憶手段２に記録する。また、制御手段１の入力情報検証部１６は、ステップＳ６において、システム入力情報を検証し、ステップＳ７において、システム入力情報の検証結果をコンテンツ生成システム２００に送信する。なお、情報処理装置１００による検証結果をどのように活用するかは、コンテンツ生成システム２００側で自由に設定できる。例えば、入力に有害な言葉や文が含まれた場合に、ユーザＵに警告することや、当該ユーザＵからの入力の生成ＡＩ３００への送信をブロックするなどが想定される。 Next, in step S5, the history information recording unit 13 of the control means 1 records the system input information acquired by the input information acquisition unit 10 in the storage means 2 as history information HS. In step S6, the input information verification unit 16 of the control means 1 verifies the system input information, and in step S7, it transmits the verification result for the system input information to the content generation system 200. Note that how the verification result from the information processing device 100 is used can be set freely on the content generation system 200 side. For example, if the input contains harmful words or sentences, the content generation system 200 might warn the user U or block the transmission of that user's input to the generation AI 300.

次に、コンテンツ生成システム２００は、ステップＳ８において、システム入力情報の検証結果が当該システム入力情報を生成ＡＩ３００に送信するべきではないというものでない限り、当該システム入力情報を生成ＡＩ３００に送信する。生成ＡＩ３００は、ステップＳ９において、プロンプトとしてシステム入力情報を受け取り、レスポンスとしてＡＩ出力情報をコンテンツ生成システム２００に送信する。 Next, in step S8, the content generation system 200 transmits the system input information to the generation AI 300 unless the verification result of the system input information indicates that the system input information should not be transmitted to the generation AI 300. In step S9, the generation AI 300 receives the system input information as a prompt and transmits the AI output information to the content generation system 200 as a response.

次に、ステップＳ１０において、コンテンツ生成システム２００は、生成ＡＩ３００から取得したＡＩ出力情報を情報処理装置１００に送信し、情報処理装置１００の制御手段１の出力情報取得部１１は、当該ＡＩ出力情報を取得する。 Next, in step S10, the content generation system 200 transmits the AI output information acquired from the generation AI 300 to the information processing device 100, and the output information acquisition unit 11 of the control means 1 of the information processing device 100 acquires the AI output information.

次に、制御手段１の履歴情報記録部１３は、ステップＳ１１において、出力情報取得部１１の取得したＡＩ出力情報を履歴情報ＨＳとして記憶手段２に記録する。また、制御手段１の出力情報検証部１７は、ステップＳ１２において、ＡＩ出力情報を検証し、ステップＳ１３において、検証結果をコンテンツ生成システム２００に送信する。 Next, in step S11, the history information recording unit 13 of the control means 1 records the AI output information acquired by the output information acquisition unit 11 in the storage means 2 as history information HS. In addition, in step S12, the output information verification unit 17 of the control means 1 verifies the AI output information, and in step S13, transmits the verification result to the content generation system 200.

次に、コンテンツ生成システム２００は、ステップＳ１４において、情報処理装置１００から取得したＡＩ出力情報の検証結果に基づいてＡＩ出力情報を加工し、ユーザ端末４００に送信するシステム出力情報を生成する。生成されたシステム出力情報は、ステップＳ１５においてユーザ端末４００に送信され、表示手段４０５に表示される。 Next, in step S14, the content generation system 200 processes the AI output information based on the verification result of the AI output information acquired from the information processing device 100, and generates system output information to be transmitted to the user terminal 400. In step S15, the generated system output information is transmitted to the user terminal 400 and displayed on the display means 405.

なお、図５は、ユーザＵがコンテンツ生成システム２００のプロンプト入力欄Ａ１に「日本で最も小さい都道府県はどこですか？」と質問し、レスポンス欄Ａ２に「最も小さい都道府県は北海道です。」と回答が表示された比較画面例Ｄ１である。ここで、最も小さい都道府県は実際には北海道ではないので、この回答は適切ではない回答と言える。生成ＡＩ３００がＡＩ出力情報としてこのような回答を出力した場合、情報処理装置１００の出力情報検証部１７は、例えば当該ＡＩ出力情報をハルシネーション（つまり、もっともらしい誤情報）の可能性がある出力であると判断し、出力情報提示部１５は、当該検証結果をコンテンツ生成システム２００に送信する。これにより、コンテンツ生成システム２００は、図６の画面例Ｄ２に示すように、ＡＩ出力情報を加工し、ハルシネーションの可能性のある出力である旨の注意喚起の警告文Ｗを追加したシステム出力情報を生成することが可能となる。 FIG. 5 shows a comparative screen example D1 in which the user U enters the question "What is the smallest prefecture in Japan?" in the prompt input field A1 of the content generation system 200, and the answer "The smallest prefecture is Hokkaido." is displayed in the response field A2. Since the smallest prefecture is not actually Hokkaido, this answer is inappropriate. When the generation AI 300 outputs such an answer as the AI output information, the output information verification unit 17 of the information processing device 100 determines, for example, that the AI output information may be a hallucination (i.e., plausible but false information), and the output information presentation unit 15 transmits the verification result to the content generation system 200. As a result, the content generation system 200 can process the AI output information and generate system output information to which a warning statement W has been added, cautioning that the output may be a hallucination, as shown in the screen example D2 of FIG. 6.

また、図７は、図５のような生成ＡＩ３００の誤った回答に対してユーザＵがプロンプト入力欄Ａ１にて誤りを指摘した結果、生成ＡＩ３００がＡＩ出力情報としてユーザＵに対して有害な回答（悪口）をした例を示す比較画面例Ｄ３である。生成ＡＩ３００がＡＩ出力情報としてこのような回答を出力した場合、情報処理装置１００の出力情報検証部１７は、例えば当該ＡＩ出力情報をユーザＵに提示するべきではない出力であると判断し、当該検証結果をコンテンツ生成システム２００に送信する。これにより、コンテンツ生成システム２００は、ユーザＵに対して有害な回答を提示することを回避することが可能となる。この場合、コンテンツ生成システム２００は、ＡＩ出力情報に代えて、例えば、問題のある出力であったため出力内容を表示することはできない旨のシステム出力情報を生成することができる。 FIG. 7 is a comparative screen example D3 showing a case in which, after the user U points out the error in the prompt input field A1 in response to an incorrect answer from the generation AI 300 such as that in FIG. 5, the generation AI 300 returns a harmful answer (an insult) to the user U as the AI output information. When the generation AI 300 outputs such an answer as the AI output information, the output information verification unit 17 of the information processing device 100 determines, for example, that the AI output information should not be presented to the user U, and transmits the verification result to the content generation system 200. This enables the content generation system 200 to avoid presenting a harmful answer to the user U. In this case, instead of the AI output information, the content generation system 200 can generate system output information stating, for example, that the output content cannot be displayed because the output was problematic.
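
The two outcomes illustrated by FIGS. 6 and 7 (adding a warning, or replacing the output entirely) can be sketched as a single decision function on the content generation system side. Everything named here is an assumption for illustration: the score keys `harmfulness` and `fact_check`, the thresholds, and the message texts are hypothetical, and the description leaves the actual policy to the content generation system.

```python
from typing import Dict

# Hypothetical message texts; the actual wording is up to the system.
HALLUCINATION_WARNING = "Warning: this answer may contain inaccurate information."
BLOCKED_MESSAGE = "The output cannot be displayed because it was problematic."


def build_system_output(
    ai_output: str,
    scores: Dict[str, float],
    harm_threshold: float = 0.5,
    fact_threshold: float = 0.5,
) -> str:
    """Turn AI output plus verification scores into system output (step S14).

    Assumes a higher "harmfulness" score is worse and a higher "fact_check"
    score means closer agreement with the facts. Missing scores are treated
    as passing.
    """
    if scores.get("harmfulness", 0.0) >= harm_threshold:
        # Shown instead of the AI output, as in the harmful-answer case.
        return BLOCKED_MESSAGE
    if scores.get("fact_check", 1.0) < fact_threshold:
        # Shown together with the AI output, as with warning statement W.
        return f"{ai_output}\n\n{HALLUCINATION_WARNING}"
    return ai_output
```

Checking harmfulness first means a harmful answer is never shown, even if it would also have earned a hallucination warning.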

次に、ステップＳ１６において、ユーザ端末４００は、入力手段４０４を介してユーザＵからシステム出力情報に対するフィードバック（例えば、良い／悪いの２段階評価）を受け付ける。そして、ステップＳ１７において、ユーザ端末４００は、受け付けたフィードバック情報をコンテンツ生成システム２００に送信する。コンテンツ生成システム２００は、ステップＳ１８において当該フィードバック情報を情報処理装置１００に送信し、制御手段１のフィードバック情報取得部１２は、当該フィードバック情報を取得する。 Next, in step S16, the user terminal 400 accepts feedback on the system output information (e.g., a two-level evaluation of good/bad) from the user U via the input means 404. Then, in step S17, the user terminal 400 transmits the accepted feedback information to the content generation system 200. In step S18, the content generation system 200 transmits the feedback information to the information processing device 100, and the feedback information acquisition unit 12 of the control means 1 acquires the feedback information.

次に、制御手段１の履歴情報記録部１３は、ステップＳ１９において、フィードバック情報取得部１２の取得したフィードバック情報を履歴情報ＨＳとして記憶手段２に記録する。なお、履歴情報記録部１３は、ステップＳ５におけるシステム入力情報、ステップＳ１１におけるＡＩ出力情報及びこのステップＳ１９におけるフィードバック情報を関連付けて記録するようになっている。 Next, in step S19, the history information recording unit 13 of the control means 1 records the feedback information acquired by the feedback information acquisition unit 12 as history information HS in the storage means 2. Note that the history information recording unit 13 is configured to record the system input information in step S5, the AI output information in step S11, and the feedback information in step S19 in association with each other.

最後に、ステップＳ２０において、制御手段１の報酬モデル学習部１４は、記録された履歴情報ＨＳに基づいて、報酬モデルＲＭを更新する。 Finally, in step S20, the reward model learning unit 14 of the control means 1 updates the reward model RM based on the recorded history information HS.

以上のようなステップＳ１～Ｓ２０の一連のステップにより、本実施形態の情報処理方法では、ユーザＵがコンテンツ生成システム２００を使用してフィードバックを提供する度に、あるいは履歴情報ＨＳが所定量蓄積された度に、報酬モデルＲＭが更新されるようになっている。 By performing the above-described series of steps S1 to S20, in the information processing method of this embodiment, the reward model RM is updated each time the user U uses the content generation system 200 to provide feedback, or each time a predetermined amount of history information HS is accumulated.

なお、情報処理装置１００の出力情報提示部１５は、図８に示すように、記録された履歴情報ＨＳに基づいて、コンテンツ生成システム２００の入出力の最新の統計情報をＷｅｂページや電子メール等でユーザＵに提示することも可能である。図８に示す画面例Ｄ４では、ＡＩ出力情報の有害性（テキストの有害性）についての統計データが示されている。 In addition, as shown in FIG. 8, the output information presentation unit 15 of the information processing device 100 can also present the latest statistical information on the input and output of the content generation system 200 to the user U via a web page, email, etc., based on the recorded history information HS. In the example screen D4 shown in FIG. 8, statistical data on the harmfulness of the AI output information (harmfulness of text) is displayed.
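
As a rough sketch of how such statistics might be derived from recorded history, the function below summarizes a list of harmfulness scores. The function name, the 0.5 threshold, and the returned fields are assumptions; the description does not state which figures screen example D4 actually contains.

```python
from typing import Dict, List, Union


def harmfulness_stats(
    scores: List[float], threshold: float = 0.5
) -> Dict[str, Union[int, float]]:
    """Summarize harmfulness scores taken from recorded history.

    Hypothetical helper: a score at or above the threshold counts as flagged.
    """
    total = len(scores)
    flagged = sum(1 for s in scores if s >= threshold)
    return {
        "total": total,
        "flagged": flagged,
        # Avoid division by zero when no history has been recorded yet.
        "flagged_rate": flagged / total if total else 0.0,
    }
```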

１．６作用効果 1.6 Effects
（１）本実施形態に係る情報処理装置１００によれば、出力情報取得部１１が生成ＡＩ３００からのＡＩ出力情報を取得し、出力情報検証部１７が報酬モデルＲＭに基づいてＡＩ出力情報を検証するようになっている。そして、フィードバック情報取得部１２が、コンテンツ生成システム２００を介してユーザＵからのシステム出力情報に対するフィードバック情報を取得し、履歴情報記録部１３がＡＩ出力情報とともにフィードバック情報を履歴情報ＨＳとして記録し、報酬モデル学習部１４が当該履歴情報ＨＳに基づいて報酬モデルＲＭを学習・更新するようになっている。本実施形態に係る情報処理装置１００は、このような構成となっていることから、フィードバック情報により日々更新される報酬モデルＲＭに基づいて出力情報検証部１７によるコンテンツ生成システム２００（システム出力情報）の検証を行うことができ、日々変化する社会の中でのコンテンツ生成システム２００の信頼性を向上させることが可能となっている。 (1) According to the information processing device 100 of this embodiment, the output information acquisition unit 11 acquires the AI output information from the generation AI 300, and the output information verification unit 17 verifies the AI output information based on the reward model RM. The feedback information acquisition unit 12 acquires, via the content generation system 200, feedback information from the user U on the system output information, the history information recording unit 13 records the feedback information together with the AI output information as history information HS, and the reward model learning unit 14 learns and updates the reward model RM based on the history information HS. Because the information processing device 100 of this embodiment is configured in this way, the output information verification unit 17 can verify the content generation system 200 (the system output information) based on a reward model RM that is updated daily by the feedback information, which makes it possible to improve the reliability of the content generation system 200 in a society that changes daily.

（２）本実施形態に係る情報処理装置１００は、入力情報取得部１０がシステム入力情報を取得し、ＡＩ出力情報及びフィードバック情報とともに履歴情報ＨＳとして記録して、報酬モデル学習部１４が当該履歴情報ＨＳに基づいて報酬モデルＲＭを学習・更新するようになっている。このように、報酬モデルＲＭの学習及び更新にシステム入力情報も用いることで、報酬モデルＲＭの学習・更新の精度を向上させることが可能となっている。 (2) In the information processing device 100 according to this embodiment, the input information acquisition unit 10 acquires system input information and records it together with AI output information and feedback information as history information HS, and the reward model learning unit 14 learns and updates the reward model RM based on the history information HS. In this way, by also using the system input information to learn and update the reward model RM, it is possible to improve the accuracy of learning and updating the reward model RM.

（３）本実施形態に係る情報処理装置１００は、入力情報取得部１０、出力情報取得部１１及びフィードバック情報取得部１２がそれぞれＡＰＩを介してコンテンツ生成システム２００から対応する情報を取得するよう構成されている。これにより、生成ＡＩ３００やユーザ端末４００から直接情報を取得できない場合でも、出力情報検証部１７によるコンテンツ生成システム２００の検証及び報酬モデルＲＭを学習・更新を行うことが可能となっている。なお、ＡＰＩの設計によりコンテンツ生成システム２００が情報処理装置１００に各情報を提供する際に、当該情報を暗号化（ベクトル化）するようにすることも好適である。これにより、コンテンツ生成システム２００の開発者や運用者、各ユーザＵは、安心して情報処理装置１００を利用することが可能である。 (3) In the information processing device 100 according to this embodiment, the input information acquisition unit 10, the output information acquisition unit 11, and the feedback information acquisition unit 12 are each configured to acquire corresponding information from the content generation system 200 via an API. This makes it possible for the output information verification unit 17 to verify the content generation system 200 and learn and update the reward model RM even when information cannot be acquired directly from the generation AI 300 or the user terminal 400. It is also preferable to design the API so that when the content generation system 200 provides each piece of information to the information processing device 100, the information is encrypted (vectorized). This allows the developer and operator of the content generation system 200, and each user U, to use the information processing device 100 with peace of mind.

（４）本実施形態に係る情報処理装置１００は、出力情報提示部１５が、コンテンツ生成システム２００によるシステム出力情報とともに、又はシステム出力情報に代えて、出力情報検証部１７による検証結果をユーザＵが用いるユーザ端末４００の表示手段４０５に表示するようになっている。これにより、コンテンツ生成システム２００のユーザＵは、コンテンツ生成システム２００によるレスポンス（あるいは、生成ＡＩによるレスポンス）が信頼できるものであるかを判断することができる。 (4) In the information processing device 100 according to this embodiment, the output information presenting unit 15 displays the verification result by the output information verifying unit 17 on the display means 405 of the user terminal 400 used by the user U, together with or instead of the system output information by the content generation system 200. This allows the user U of the content generation system 200 to determine whether the response by the content generation system 200 (or the response by the generation AI) is trustworthy.

（５）本実施形態に係る情報処理装置１００は、報酬モデル学習部１４が学習フェーズにおいて学習用データセットにより報酬モデルＲＭを予め学習させ、その後の提供フェーズにおいて、出力情報検証部１７が学習させた報酬モデルＲＭを用いて前記ＡＩ出力情報を検証するよう構成されている。これにより、情報処理装置１００を適用したコンテンツ生成システム２００のユーザＵへの提供の当初から、ユーザＵ向けにカスタマイズされた検証結果を出力することが可能となっている。 (5) The information processing device 100 according to this embodiment is configured such that the reward model learning unit 14 pre-learns the reward model RM using a learning dataset in the learning phase, and then in the provision phase, the output information verification unit 17 verifies the AI output information using the trained reward model RM. This makes it possible to output verification results customized for the user U from the beginning of the provision of the content generation system 200 to the user U, to which the information processing device 100 is applied.

（６）本実施形態に係る情報処理装置１００において、制御手段１は、出力情報検証部１７が報酬モデルフィルタ１７ｄに加えて出力有害性フィルタ１７ａ、出力感情フィルタ１７ｂ及びファクトチェックフィルタ１７ｃも備えている。これにより、ユーザＵのフィードバックに基づいてチューニングされる報酬モデルフィルタ１７ｄによる検証に限らない、総合的な検証を行うことが可能となっている。 (6) In the information processing device 100 according to this embodiment, the control means 1 includes an output information verification unit 17 that includes an output harmfulness filter 17a, an output emotion filter 17b, and a fact check filter 17c in addition to a reward model filter 17d. This makes it possible to perform comprehensive verification that is not limited to verification using the reward model filter 17d, which is tuned based on feedback from the user U.

（７）さらに、本実施形態の制御手段１は、入力有害性フィルタ１６ａと入力感情フィルタ１６ｂとを備えた入力情報検証部１６も備えており、適切ではないユーザ入力情報及びこれに基づいたＡＩ出力情報（システム出力情報）と、これに対するフィードバック情報とにより、報酬モデルＲＭが適切でない学習をしてしまうことを回避することが可能となっている。加えて、適切ではない入力からコンテンツ生成システム２００を保護することも可能となっている。 (7) Furthermore, the control means 1 of this embodiment also includes an input information verification unit 16 equipped with an input harmfulness filter 16a and an input emotion filter 16b, and it is possible to prevent the reward model RM from inappropriately learning due to inappropriate user input information, AI output information (system output information) based on the inappropriate user input information, and feedback information thereto. In addition, it is possible to protect the content generation system 200 from inappropriate input.

（８）本実施形態において、報酬モデル学習部１４は、単一ユーザＵごと又は同じ目的でコンテンツ生成システム２００を使用するグループに所属する複数のユーザＵ（入力者）ごとに報酬モデルＲＭを作成し、事前学習及び更新するようになっている。これにより、報酬モデルＲＭは単一ユーザあるいはグループの目的に沿うよう生成・更新されるため、検証結果が微調整され、すべてのユーザＵが同じ報酬モデル（汎用の報酬モデル）を用いる場合と比較して、使用するユーザＵに精度の高い検証結果を提供することが可能となっている。 (8) In this embodiment, the reward model learning unit 14 creates, pre-learns, and updates a reward model RM for each individual user U or for each group of multiple users U (inputters) who use the content generation system 200 for the same purpose. As a result, the reward model RM is generated and updated to suit the purpose of a single user or group, and the verification results are fine-tuned, making it possible to provide each user U with more accurate verification results than if all users U used the same reward model (general-purpose reward model).
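
Keeping one reward model per single user or per group can be sketched as a small registry. The class `RewardModelRegistry`, its methods, and the `factory` callback that creates a fresh model are hypothetical names chosen for illustration; the model type itself is left open.

```python
from typing import Callable, Dict, Generic, Optional, TypeVar

M = TypeVar("M")  # the reward model type, left unspecified here


class RewardModelRegistry(Generic[M]):
    """Hands out one reward model per group, or per user without a group."""

    def __init__(self, factory: Callable[[], M]) -> None:
        self._factory = factory          # creates a new, untrained model
        self._group_of: Dict[str, str] = {}
        self._models: Dict[str, M] = {}

    def assign_group(self, user_id: str, group_id: str) -> None:
        """Register a user as belonging to a group with a shared purpose."""
        self._group_of[user_id] = group_id

    def model_for(self, user_id: str) -> M:
        """Return the model for this user, creating it on first request."""
        group_id: Optional[str] = self._group_of.get(user_id)
        # Prefix the key so a user ID can never collide with a group ID.
        key = f"group:{group_id}" if group_id is not None else f"user:{user_id}"
        if key not in self._models:
            self._models[key] = self._factory()
        return self._models[key]
```

With this arrangement, feedback from any member of a group updates the same model, while an ungrouped user trains a model of their own.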

（９）本実施形態において、情報処理装置１００を適用するコンテンツ生成システム２００は文章生成システムであり、生成ＡＩ３００はＬＬＭ（大規模言語モデル）である。ここで、ＬＬＭやこれを用いた文章生成システムによる不適切なレスポンス（回答）は、不適切性が取り上げられやすいため、近年特に問題となっている。そして、外部のＬＬＭサービスを用いるコンテンツ生成システム２００の開発者や運用者等は、不適切なレスポンスが報告されると、ＬＬＭサービス側で対応がなされるまでシステムの提供中止を余儀なくされたり、システム側での問題に対する対応に追われたりするおそれがあった。この点、本実施形態に係る情報処理装置１００によれば、コンテンツ生成システム２００でも生成ＡＩ３００でもない第３者の立場でコンテンツ生成システム２００及び生成ＡＩ３００の出力を検証するとともに、報酬モデルＲＭを日々更新するようになっている。これにより、コンテンツ生成システム２００開発者や運用者等が自ら不適切なレスポンスに対応しなくても、安心してシステムを提供することが可能となっている。 (9) In this embodiment, the content generation system 200 to which the information processing device 100 is applied is a text generation system, and the generation AI 300 is an LLM (large-scale language model). Inappropriate responses (answers) by LLMs and text generation systems using them have become a particular problem in recent years because their inappropriateness is easily highlighted. When an inappropriate response is reported, developers and operators of the content generation system 200 using an external LLM service may be forced to suspend the provision of the system until the LLM service responds, or may be forced to deal with the problem on the system side. In this regard, according to the information processing device 100 of this embodiment, the output of the content generation system 200 and the generation AI 300 is verified from the standpoint of a third party that is neither the content generation system 200 nor the generation AI 300, and the reward model RM is updated daily. This makes it possible for developers and operators of the content generation system 200 to provide the system with peace of mind, even if they do not themselves respond to inappropriate responses.

３．変形例 3. Modifications
なお、本発明は、以下の態様でも実施可能である。 The present invention can also be implemented in the following aspects.

上記実施形態において、情報処理装置１００は、生成ＡＩ３００を用いたコンテンツ生成システム２００として、ＬＬＭ（大規模言語モデル）を用いた文章生成システムを検証するよう構成されていた。しかしながら、本発明の情報処理装置１００は、画像を生成するＡＩを用いた画像生成システムや動画を生成するＡＩを用いた動画生成システム、さらには、音楽を生成するＡＩを用いた音楽生成システムを検証するものであっても良い。 In the above embodiment, the information processing device 100 was configured to verify a text generation system using an LLM (large-scale language model) as a content generation system 200 using the generative AI 300. However, the information processing device 100 of the present invention may also be used to verify an image generation system using AI to generate images, a video generation system using AI to generate videos, or even a music generation system using AI to generate music.

上記実施形態において、情報処理装置１００は、履歴情報記録部１３が履歴情報ＨＳとして、出力情報取得部１１の取得したＡＩ出力情報に加えて入力情報取得部１０の取得したユーザ入力情報も、フィードバック情報取得部１２の取得したフィードバック情報とともに記録し、当該履歴情報ＨＳに基づいて報酬モデル学習部１４が報酬モデルＲＭを学習及び更新するよう構成されていた。しかしながら、履歴情報記録部１３がＡＩ出力情報とフィードバック情報に基づいて、すなわち、ユーザ入力情報には基づかずに報酬モデルＲＭを学習及び更新するようにしても良い。 In the above embodiment, the information processing device 100 was configured such that the history information recording unit 13 records, as history information HS, the AI output information acquired by the output information acquisition unit 11 as well as the user input information acquired by the input information acquisition unit 10, together with the feedback information acquired by the feedback information acquisition unit 12, and the reward model learning unit 14 learns and updates the reward model RM based on the history information HS. However, the history information recording unit 13 may also learn and update the reward model RM based on the AI output information and feedback information, i.e., not based on the user input information.

The input information acquisition unit 10 of this embodiment was configured to acquire, as AI input information, the processed prompt that the content generation system 200 sends to the generative AI 300. However, the input information acquisition unit 10 may instead acquire, as AI input information, the raw prompt (the prompt exactly as entered by the user U) that the user terminal 400 sends to the content generation system 200. In this case, the reward model RM is learned and updated based on the raw prompt in addition to the AI output information and the feedback information.

In the above embodiment, the input information acquisition unit 10, the output information acquisition unit 11, and the feedback information acquisition unit 12 were each configured to acquire the corresponding information from the content generation system 200 via an API. However, the way in which the input information acquisition unit 10, the output information acquisition unit 11, and the feedback information acquisition unit 12 according to the present invention acquire information is not limited to this. For example, at least one of the input information acquisition unit 10 and the feedback information acquisition unit 12 may acquire the corresponding information directly from the user terminal 400 via the network N. Possible approaches include acquiring the information via a web browser, or using an application that stays resident on the terminal, such as antivirus software. The output information acquisition unit 11 may also acquire the answer directly from the generative AI 300. Furthermore, the information processing device 100 may itself accept input from the content generation system 200 and transmit that input to the generative AI 300 as a prompt, and may itself receive the response from the generative AI 300 and transmit that response to the content generation system 200. In this case, the information processing device 100 is placed between the content generation system 200 and the generative AI 300 and functions as a kind of firewall for the content generation system 200 or the generative AI 300.
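
The firewall-like arrangement can be pictured with the short Python sketch below, in which the device sits between the two systems, forwards the system input, and returns the response together with a verification result. The names `GatewayResult` and `make_gateway` and the two callables are hypothetical; how the generative AI 300 is actually called, and what the content generation system 200 does with a failed verification, are not specified here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GatewayResult:
    response: str  # response received from the generative AI
    passed: bool   # verification result for that response


def make_gateway(
    call_generative_ai: Callable[[str], str],
    verify_output: Callable[[str], bool],
) -> Callable[[str], GatewayResult]:
    """Return a handler placed between the content generation system and the AI.

    Both callables are stand-ins: one forwards a prompt to the generative AI,
    the other plays the role of the output information verification unit.
    """
    def handle(system_input: str) -> GatewayResult:
        response = call_generative_ai(system_input)  # forward the prompt
        return GatewayResult(response=response, passed=verify_output(response))

    return handle
```

Returning the verification result alongside the response, rather than blocking inside the gateway, is a design choice of this sketch; it leaves the decision on how to present a flagged response to the caller.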

In the above embodiment, the output information acquisition unit 11 of the information processing device 100 acquired the AI output information, which was recorded together with the user input information and the feedback information as history information HS and used to train the reward model RM. However, the output information acquisition unit 11 may instead acquire the system output information in place of the AI output information, record it as history information HS, and use it to train the reward model RM. Note, however, that if the system output information already contains the verification result sent by the output information presentation unit 15 of the information processing device 100, having the output information verification unit 17 verify it again would create a loop; it is therefore necessary to extract only the output of the generative AI 300 from the system output information before performing verification and related processing.
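
The loop-avoidance step can be sketched as follows in Python, assuming (purely for illustration) that the system output information arrives as a mapping with separate `ai_output` and `verification_notice` fields. The actual structure of the system output information is not given in the specification, so both keys and both function names are hypothetical.

```python
from typing import Callable, Mapping


def extract_ai_output(system_output: Mapping[str, str]) -> str:
    """Return only the generative AI's own output, dropping any earlier notice."""
    return system_output["ai_output"]


def verify_system_output(
    system_output: Mapping[str, str],
    verify: Callable[[str], bool],
) -> bool:
    """Verify the AI output part only, so a prior verification result
    embedded in the system output is never fed back into verification."""
    return verify(extract_ai_output(system_output))
```

If the system output were a single free-text string instead, the extraction step would need some agreed delimiter or marker, which this sketch does not attempt to define.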

In the above embodiment, the output information presentation unit 15 was configured to transmit the verification result of the output information verification unit 17 to the content generation system 200, which then processed the AI output information and displayed the warning message W within the content generation system 200 (see FIG. 6). However, the verification result of the output information verification unit 17 may also be displayed separately from the content generation system 200, for example in a web browser, a pop-up, or a push notification. It is also preferable to display the verification result within the content generation system 200 and, in addition, to make it separately viewable together with statistical data in a web browser or the like.

In the above embodiment, the reward model learning unit 14 pre-trained the reward model RM after the information processing device 100 was applied to the content generation system 200 and before the content generation system 200 was provided to the user U. However, the information processing device 100 according to the present invention may also be provided to the user U without pre-training the reward model RM. In this case, the reward model RM is trained solely on feedback from the user U.
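
To illustrate what training solely on user feedback could look like, the sketch below uses a tiny online logistic regression over word counts that starts from all-zero weights (no pre-training) and is updated one feedback item at a time. This model, its `OnlineRewardModel` name, the whitespace tokenization, and the learning rate are assumptions chosen for brevity; the specification does not state what kind of model the reward model RM is.

```python
import math
from collections import defaultdict


class OnlineRewardModel:
    """Toy reward model with no pre-training (hypothetical stand-in for RM)."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.weights = defaultdict(float)  # every weight starts at zero
        self.learning_rate = learning_rate

    def score(self, text: str) -> float:
        """Estimated probability that the user would rate this output as good."""
        z = sum(self.weights.get(tok, 0.0) for tok in text.lower().split())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, text: str, liked: bool) -> None:
        """One gradient step of logistic regression on a single feedback item."""
        error = (1.0 if liked else 0.0) - self.score(text)
        for tok in text.lower().split():
            self.weights[tok] += self.learning_rate * error
```

Before any feedback arrives, every output scores exactly 0.5, so a verification filter based on such a model would only become informative once enough feedback has accumulated.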

In the above embodiment, the information processing device 100 performed the acquisition of the system input information by the input information acquisition unit 10 and the acquisition of the AI output information by the output information acquisition unit 11 at different times (step S4 and step S10). However, the system input information and the AI output information may also be acquired at the same time. In this case, the subsequent recording of the system input information and the AI output information as history information HS, and their verification, can also be performed at the same time.

In the above embodiment, it was assumed that the user U gives feedback on the answer from the generative AI 300 (the AI output information) within the system output information. Separately from this, however, feedback may also be collected on the verification result of the output information verification unit 17 presented by the output information presentation unit 15.

In the above embodiment, the information processing device 100 was connected to the content generation system 200 via the network N and verified the content generation system 200 (generative AI 300) as a so-called web service. However, the information processing device 100 may instead be deployed on-premises, on the same local network or the same private cloud as the content generation system 200.

In the above embodiment, the content generation system 200 and the generative AI 300 were each connected to the user terminal 400 via the network N and provided various functions to the user U. However, the content generation system 200 and the generative AI 300 may each be provided as so-called edge AI, that is, as a system running on an integrated circuit mounted in the user terminal 400, specifically a local device such as a PC, a mobile terminal, an automobile, or a medical device. In this case, at least some of the functional elements of the information processing device 100 can also be placed on-premises on the same user terminal 400 (local device). Likewise, when the generative AI 300 is on the network N and only the content generation system 200 is on the user terminal 400, the information processing device 100 can be installed on-premises on the same device as the content generation system 200.

1: Control means
2: Storage means
3: Communication means
4: Bus
10: Input information acquisition unit
11: Output information acquisition unit
12: Feedback information acquisition unit
13: History information recording unit
14: Reward model learning unit
15: Output information presentation unit
16: Input information verification unit
16a: Input harmfulness filter
16b: Input emotion filter
17: Output information verification unit
17a: Output harmfulness filter
17b: Output emotion filter
17c: Fact check filter
17d: Reward model filter
100: Information processing device
200: Content generation system
300: Generative AI
400: User terminal
401: Control means
402: Storage means
403: Communication means
404: Input means
405: Display means
406: Bus
A1: Prompt input field
A2: Response field
HS: History information
N: Network
RM: Reward model
U: User
W: Warning message

Claims

An information processing device for verifying a content generation system using a generative AI, wherein
the content generation system accepts inputter input information from an inputter who inputs a prompt, transmits it to the generative AI as system input information, acquires AI output information from the generative AI in response to the system input information, and provides system output information to the inputter based on the AI output information,
the information processing device comprises a control means and a storage means,
the control means includes an output information acquisition unit, an output information verification unit, a feedback information acquisition unit, a history information recording unit, and a reward model learning unit,
the output information acquisition unit acquires the AI output information,
the output information verification unit verifies the AI output information,
the feedback information acquisition unit acquires feedback information from the inputter on the system output information,
the history information recording unit records the AI output information and the feedback information as history information in the storage means, and
the reward model learning unit learns a reward model based on the history information.

The information processing device according to claim 1, wherein
the control means further includes an input information acquisition unit,
the input information acquisition unit acquires at least one of the inputter input information and the system input information,
the history information recording unit records at least one of the inputter input information and the system input information as the history information in the storage means in addition to the AI output information and the feedback information, and
the reward model learning unit learns the reward model based on the history information.

The information processing device according to claim 2, wherein
the input information acquisition unit and the feedback information acquisition unit provide an API to the content generation system in advance and use the API to acquire, from the content generation system, at least one of the inputter input information and the system input information, and the feedback information, respectively.

The information processing device according to claim 1, wherein
the output information acquisition unit provides an API to the content generation system in advance and acquires the AI output information from the content generation system using the API.

The information processing device according to claim 1, wherein
the control means further includes an output information presentation unit, and
the output information presentation unit, based on the verification result of the output information verification unit, presents the verification result to the inputter together with the system output information or in place of the system output information.

The information processing device according to claim 1, wherein
the reward model learning unit pre-learns the reward model in a learning phase based on the AI output information and an evaluation of the AI output information,
the output information verification unit verifies the AI output information using the reward model in a provision phase, and
the reward model learning unit updates the reward model based on the history information in the provision phase.

The information processing device according to claim 6, wherein
the control means further includes an input information acquisition unit,
the input information acquisition unit acquires at least one of the inputter input information and the system input information in the learning phase, and
the reward model learning unit, in the learning phase, learns the reward model based on at least one of the inputter input information and the system input information for a learning dataset input by the inputter, the AI output information corresponding thereto, and an evaluation of the AI output information.

The information processing device according to claim 6, wherein
the output information verification unit, in the provision phase, further verifies at least one of harmfulness, accuracy, and emotion of the AI output information in addition to the verification using the reward model.

The information processing device according to claim 2, wherein
the control means further includes an input information verification unit, and
the input information verification unit verifies at least one of harmfulness and emotion of the inputter input information.

The information processing device according to claim 1, wherein
the reward model learning unit learns the reward model for each individual inputter or for each set of multiple inputters belonging to the same group.

The information processing device according to any one of claims 1 to 10, wherein
the generative AI is a large language model, and the content generation system is a text generation system.

An information processing method for verifying a content generation system using a generative AI, wherein
the content generation system accepts inputter input information from an inputter who inputs a prompt, transmits it to the generative AI as system input information, acquires AI output information from the generative AI in response to the system input information, and provides system output information to the inputter based on the AI output information,
the method comprises performing an output information acquisition process, an output information verification process, a feedback information acquisition process, a history information recording process, and a reward model learning process,
in the output information acquisition process, the AI output information is acquired,
in the output information verification process, the AI output information is verified,
in the feedback information acquisition process, feedback information from the inputter on the system output information is acquired,
in the history information recording process, the AI output information and the feedback information are recorded as history information in a storage means, and
in the reward model learning process, a reward model is learned based on the history information.

A program for verifying a content generation system using a generative AI, wherein
the content generation system accepts inputter input information from an inputter who inputs a prompt, transmits it to the generative AI as system input information, acquires AI output information from the generative AI in response to the system input information, and provides system output information to the inputter based on the AI output information,
the program causes a computer to execute an output information acquisition process, an output information verification process, a feedback information acquisition process, a history information recording process, and a reward model learning process,
in the output information acquisition process, the AI output information is acquired,
in the output information verification process, the AI output information is verified,
in the feedback information acquisition process, feedback information from the inputter on the system output information is acquired,
in the history information recording process, the AI output information and the feedback information are recorded as history information in a storage means, and
in the reward model learning process, a reward model is learned based on the history information.