CN117291185A - Task processing method, entity identification method and task processing data processing method
- Publication number: CN117291185A
- Application number: CN202311047348.1A
- Authority: CN (China)
- Prior art keywords: task, text, sample, format, target
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed and no representation is made as to the accuracy of the status listed)
Classifications
- G06F40/253—Grammatical analysis; Style critique
- G06F40/295—Named entity recognition
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/09—Supervised learning
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Embodiments of this specification provide a task processing method, an entity recognition method, and a data processing method for task processing. The task processing method includes: receiving a target text input at a front end and task information of a target task selected for the target text; constructing, based on the target text and the task information, an instruction text conforming to a first format; and executing, based on the instruction text, the target task on the target text using a task processing model, generating a task execution result conforming to a second format. The task processing model is trained on sample instruction texts in the first format and label result texts in the second format, where a sample instruction text comprises a sample task text and sample task information, and the second format is determined from the sample task information and the label text corresponding to that sample task information. The model is highly versatile and achieves high task processing accuracy, improving both the versatility and the precision of task processing.
Description
Technical Field
Embodiments of this specification relate to the technical field of text processing, and in particular to a task processing method.
Background
With the continued development of deep learning techniques, natural language processing (NLP), typified by natural language understanding (NLU) and natural language generation (NLG), has advanced greatly.
At present, in various subdivided task scenarios, deep learning models are pre-trained on large-scale, high-quality sample data so that the resulting task processing models attain very high task processing accuracy. However, a task processing model's performance is directly tied to the scale and quality of the sample data, the training method, and the software and hardware resources used during pre-training. Even when these conditions are met and a model with excellent performance is obtained, a practical question remains: how to fully exploit that performance so that the model both generalizes across many subdivided task scenarios (high versatility) and retains high task processing accuracy in each of them. A task processing method with both high versatility and high accuracy is therefore needed.
Disclosure of Invention
In view of this, embodiments of this specification provide a task processing method. One or more embodiments of this specification further relate to another task processing method, an entity recognition method, a data processing method for task processing, a task processing device, another task processing device, an entity recognition device, a data processing device for task processing, a computing device, a computer-readable storage medium, and a computer program, so as to solve technical deficiencies in the prior art.
In one embodiment of the present disclosure, a task processing method is provided, including:
receiving a target text input at a front end and task information of a target task selected for the target text;
constructing an instruction text conforming to a first format based on the target text and the task information;
and executing, based on the instruction text, the target task on the target text using a task processing model, and generating a task execution result conforming to a second format, wherein the task processing model is trained on sample instruction texts in the first format and label result texts in the second format, the sample instruction text comprises a sample task text and sample task information, and the second format is determined from the sample task information and the label text corresponding to the sample task information.
In one embodiment of this specification, a target text input at a front end is received, together with task information of a target task selected for the target text; an instruction text conforming to a first format is constructed from the target text and the task information; and, based on the instruction text, the target task is executed on the target text using a task processing model, generating a task execution result conforming to a second format. The task processing model is trained on sample instruction texts in the first format and label result texts in the second format; a sample instruction text comprises a sample task text and sample task information, and the second format is determined from the sample task information and its corresponding label text. Because the task processing model has been supervised-trained in advance on first-format sample instruction texts and second-format label result texts, it has learned both the first-format input and the second-format output. Since the second format is determined from the sample task information and its corresponding label text, the model can, given the task information in an instruction text, generate a task execution result in the second format corresponding to that task information and its label text. The model can thus execute target tasks on target texts under different task information while retaining refined processing capability for specific task information; that is, task processing attains both high versatility and high task processing accuracy.
Drawings
FIG. 1 is a flow chart of a method of task processing provided in one embodiment of the present disclosure;
FIG. 2 is a flow chart of another task processing method provided by one embodiment of the present disclosure;
FIG. 3 is a flow chart of a method of entity identification provided in one embodiment of the present disclosure;
FIG. 4 is a flow chart of a data processing method for task processing provided in one embodiment of the present disclosure;
FIG. 5 is a front end interface schematic diagram of a task processing method according to an embodiment of the present disclosure;
FIG. 6 is a process flow diagram of a task processing method applied to a search engine according to one embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a task processing device according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of another task processing device according to one embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an entity identification device according to one embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a task processing data processing device according to one embodiment of the present disclosure;
FIG. 11 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth to provide a thorough understanding of this specification. This specification may, however, be embodied in many forms other than those described herein, and those skilled in the art may make similar generalizations without departing from its spirit; this specification is therefore not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, such information should not be limited by these terms, which serve only to distinguish one kind of information from another. For example, without departing from the scope of one or more embodiments of this specification, "first" may also be referred to as "second", and similarly "second" as "first". Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
Furthermore, it should be noted that the user information (including but not limited to user equipment information and user personal information) and data (including but not limited to data for analysis, stored data, and displayed data) involved in one or more embodiments of this specification are information and data authorized by the user or fully authorized by all parties; the collection, use, and processing of such data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entries are provided for the user to choose to authorize or to refuse.
In one or more embodiments of this specification, a large model is a deep learning model with a large parameter scale, typically hundreds of millions, billions, or even trillions of model parameters and more. A large model, also called a foundation model, is pre-trained on large-scale unlabeled corpora to produce a pre-trained model with more than a hundred million parameters; such a model can adapt to a wide range of downstream tasks and has good generalization capability. Examples include large language models (LLM) and multi-modal pre-trained models.
In practical applications, a pre-trained large model can be adapted to different tasks by fine-tuning on only a small number of samples. Large models are widely applied in fields such as natural language processing (NLP) and computer vision; typical computer vision tasks include visual question answering (VQA), image captioning (IC), and image generation, and typical natural language processing tasks include text-based emotion classification, text summarization, and machine translation. Main application scenarios of large models include digital assistants, intelligent robots, search, online education, office software, e-commerce, and intelligent design.
First, terms related to one or more embodiments of the present specification will be explained.
Natural language understanding (NLU): the task of using a deep learning model to understand natural language, specifically converting natural language into feature vectors the model can understand and then further processing and applying those feature vectors; it covers syntactic analysis, semantic analysis, entity recognition and extraction, text classification, and the like. Natural language understanding is an important foundation of natural language processing and an underlying technology of systems such as information management systems, office automation systems, search engines, and recommendation systems. In general, a deep learning model relies on pre-training sample inputs and sample outputs to learn the content of a natural language understanding task.
Open domain model: a general model not restricted to a particular field (task scenario), with high versatility.
CNN (Convolutional Neural Network) model: a multi-layer deep learning model with forward and backward propagation, using convolution kernels (filters) to process feature data.
RNN (Recurrent Neural Network) model: a recursive deep learning model that processes vector representations recurrently along the sequence direction, with its intermediate layers chained.
LSTM (Long Short-Term Memory) model: a deep learning model with the ability to memorize long- and short-term information, using gating units to control the flow of feature data.
Transformer model: a deep learning model based on the attention mechanism, which extracts and analyzes data features through attention.
BERT (Bidirectional Encoder Representations from Transformers) model: a deep learning model with bidirectional attention-based encoding and representation.
RoBERTa (A Robustly Optimized BERT Pretraining Approach) model: a BERT-derived model with hybrid character-level and word-level encoding and a dynamic masking mechanism.
GAN (Generative Adversarial Network) model: comprises a generator and a discriminator; a high-accuracy generator is obtained by training the generator and the discriminator alternately.
At present, natural language understanding is a key research and development direction owing to its importance and broad applicability, and it has made great progress since the advent of the BERT model. In various subdivided task scenarios, supervised natural language understanding models trained on large-scale, high-quality sample data achieve extremely high accuracy. However, the high customization cost of supervised models (sample data labeling cost, model development cost, etc.) limits their scope of application: apart from relatively fixed requirements and scenarios where the cost can be amortized over large-scale use, they are hard to deploy in practice where cost is sensitive, sample data are difficult to label, or requirements change rapidly. Researchers have therefore been exploring a general model not restricted to a particular field, i.e., an open domain model, that retains high task processing accuracy across many subdivided task scenarios.
However, limited by the parameter scale of current open domain models (typically on the order of 10^9), their generalization is insufficient. With the advent of large language models, and in particular generative large language models, users need only describe their needs carefully to obtain fairly reliable task execution results, without labeling large amounts of sample data to train a model. Generative large language models nevertheless have the following problems:
1. Cost: natural language understanding tasks are basic tasks with enormous call volumes, and the long-term calling cost can even exceed the cost of developing a dedicated supervised model.
2. Data security: in scenarios with strict data security requirements, a generative large language model deployed on external distributed clusters cannot be called.
3. Inference speed: the concurrency and response time of generative large language models can hardly support large-scale real-time calls for basic natural language understanding tasks.
4. Instruction sensitivity: the task processing accuracy of a generative large language model is strongly affected by the instruction text, and the format style of the instruction text must be designed separately for each task.
5. Output parseability and stability: the output of a generative large language model is free-form natural language without strict format constraints, which makes it difficult for downstream tasks to use.
6. Deep customization: a closed-source model cannot be deeply customized with accumulated project data, and a general model can hardly satisfy task scenarios with high accuracy requirements.
In view of the above problems, the embodiment of this specification shown in FIG. 1 provides a task processing method that trains a task processing model with a unified multi-task pre-training approach and fixes the format of the instruction text and the format of the result text, thereby overcoming problems 1-6 above while remaining versatile across multiple natural language understanding tasks and offering a friendly usage pattern similar to an application programming interface (API).
This specification provides a task processing method, and further relates to another task processing method, an entity recognition method, a data processing method for task processing, a task processing device, another task processing device, an entity recognition device, a data processing device for task processing, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
Referring to fig. 1, fig. 1 shows a flowchart of a task processing method according to an embodiment of the present disclosure, including the following specific steps:
step 102: and receiving target text input by the front end and task information of a target task selected by the target text.
The embodiments of this specification apply to a client, a server, or an application platform of an application that provides multiple task processing functions, on which a deep learning model with multiple task processing functions, namely the task processing model, is deployed. The front end is the front end of such an application or application platform into which the user has logged. The user may enter the target text manually at the front end, or the front end may otherwise acquire the target text as input. Task information for various tasks can be configured at the front end; the user may select the task information of the target task by clicking or dragging, or by entering it directly, and no limitation is imposed here.
The target text is the natural language text on which the task is to be executed; it may be at least one sentence or at least one word. For example, the target text is "I don't care".
The target task is the text processing task to be executed, a natural language understanding task, including but not limited to recognition and extraction tasks such as entity recognition (entity extraction), event recognition (event extraction), and question-answer recognition (question-answer extraction), and text classification tasks such as topic classification, intent classification, and emotion classification. The task information of the target task is the task feature information of the target task; it is a natural language text characterizing the task features of the target task, including but not limited to task type, tag type, and task execution logic. For example, for the target text "I don't care" with emotion classification selected as the target task, the task information of the target task is "classification: emotion".
For example, user A logs in at the front end of a multifunctional integrated application platform T that provides recognition-extraction and text classification functions, inputs the target text "Place A is really hot" at the front end, and selects the location-entity task among the recognition-extraction tasks; the task information of the target task selected for the target text is "extract: location".
Receiving the target text input at the front end and the task information of the target task selected for the target text lays a data foundation for subsequently constructing the first-format instruction text.
Step 104: constructing an instruction text conforming to the first format based on the target text and the task information.
The instruction text is the natural language text instruction input directly into the task processing model; it guides the task processing model to understand the task content of the target task, execute the target task corresponding to the task information on the target text, and generate an output result in the corresponding output format. In general, an instruction text has a corresponding format style: through the constraints of a specific format style, the model clearly understands the target task and executes it on the target text accordingly. For example, if the format style of an instruction text were "adverb + adjective + noun; text style label, text style example", the model would understand the task content and output format of the target task at the textual grammar level through the adverb, adjective, and noun, and at the text style level through the text style label and text style example.
The first format is the text instruction format of instruction texts for the target task. It is determined from the sample task text and sample task information during pre-training and is learned by the task processing model; during application, instruction texts are input in the first format style. The first format style includes control symbols ("input", the task) that guide the task processing model in understanding the meaning of the instruction text and control its task execution, together with the target text and the task information. In this embodiment, the first format style is: "input: <target text>; <task information>". For example, for person-entity recognition extraction, the first format style is "input: <target text>; extract: person name".
Based on the target text and the task information, an instruction text conforming to the first format is constructed as follows: following the first format style, the instruction text is assembled from the target text and the task information. A minimal sketch of this construction is shown after the example below.
For example, based on the target text "Place A is really hot" and the task information "extract: location", following the first format style "input: <target text>; <task information>", the instruction text conforming to the first format is constructed as: "input: Place A is really hot; extract: location".
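As an illustration only (not part of the claimed method), the assembly of a first-format instruction text can be sketched in Python; the function and argument names are hypothetical, and the format string follows the "input: <target text>; <task type>: <tag type>" style described above:

```python
def build_instruction(target_text: str, task_type: str, tag_type: str) -> str:
    # First-format style: "input: <target text>; <task type>: <tag type>".
    # The control symbol "input" and the task information guide the model's
    # understanding of the instruction, as described above.
    return f"input: {target_text}; {task_type}: {tag_type}"

instruction = build_instruction("Place A is really hot", "extract", "location")
# -> "input: Place A is really hot; extract: location"
```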
Constructing an instruction text conforming to the first format from the target text and the task information lets the subsequent task processing model understand the task content of the target task more accurately and lays a format foundation for subsequently generating a high-precision task execution result.
Step 106: executing, based on the instruction text, the target task on the target text using a task processing model, and generating a task execution result conforming to a second format, wherein the task processing model is trained on sample instruction texts in the first format and label result texts in the second format, the sample instruction text comprises a sample task text and sample task information, and the second format is determined from the sample task information and the label text corresponding to the sample task information.
The task processing model is a deep learning model with multiple task processing functions and natural language generation capability; based on the task information of different tasks, it can execute the corresponding task and generate a task execution result in a specific format. Task processing models include but are not limited to CNN, RNN, LSTM, Transformer, BERT, RoBERTa, and GAN models. The task processing model here is a generative small-scale language model (parameter scale typically on the order of 10^12); compared with generative large-scale language models (parameter scale typically on the order of 10^15), it can be deployed on a client, a server, or an application platform rather than on an external distributed cluster. The task processing model is trained in advance, i.e., pre-trained (Pre-train) and/or fine-tuned (Fine-tune). In this embodiment, the task processing model is a generative language model that processes natural language understanding tasks by natural language generation: for the model to understand the task content of a natural language understanding task, the input instruction text must be constrained by the first format, and for it to output the result in a specific format, the output must be constrained by the second format, so that a task processing model with multiple task processing functions can execute natural language understanding tasks accurately.
The task execution result is the natural language text result of executing the target task on the target text; it is an output result generated by the task processing model.
The second format is the text format of the task execution result corresponding to the target task. It is determined during pre-training from the sample task information and the label text corresponding to that sample task information, and is learned by the task processing model; during application, the task execution result is output in the second format style. The second format style includes a control symbol ("output") that guides the task processing model in generating the task execution result. Specifically, the second format style is: "output: <task information>: <predicted text>". For example, for person-entity recognition extraction, the second format style is "output: person name: <person-name entity>".
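Because the second format is fixed, a downstream consumer can parse the task execution result mechanically, which is one of the parseability and stability benefits discussed above. A hedged sketch, assuming the "output: <task information>: <predicted text>" style (the function name and example values are illustrative only):

```python
def parse_result(result_text: str) -> tuple[str, str]:
    # Second-format style: "output: <task information>: <predicted text>".
    body = result_text.removeprefix("output:").strip()
    task_info, _, predicted = body.partition(":")
    return task_info.strip(), predicted.strip()

parse_result("output: person name: Zhang San")
# -> ("person name", "Zhang San")
```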
The sample task text is the natural language text of a sample task on which that task is to be executed. The sample task information characterizes the task features of the sample task. A sample task is a pre-training text processing task, one of various natural language understanding tasks. For example, if the sample task is emotion classification, the sample task text is "I don't care" and the sample task information is "classification: emotion".
The label text is the text of the execution object of the task processing, i.e., the text corresponding to the sample task information during task processing (the pre-training process). For example, if the sample task is emotion classification with sample task text "I don't care" and sample task information "classification: emotion", the corresponding label text is "not happy".
During pre-training of the task processing model, sample instruction texts in the first format serve as training samples and label result texts in the second format serve as label samples for supervised training. The first format is determined from the sample task text and sample task information, and the second format from the sample task information and its corresponding label text. Through such supervised training the task processing model learns the first format of different sample instruction texts and the second format of the corresponding label result texts; in step 106, the model understands and executes the target task through the input instruction text conforming to the first format and correspondingly generates a task execution result conforming to the second format. By constraining the output to the target task, the second format improves the parseability and stability of the output result and facilitates its use by downstream tasks.
Based on the instruction text, the target task is executed on the target text with the task processing model, and a task execution result conforming to the second format is generated, as follows: the instruction text is input into the task processing model, a predicted text corresponding to the task information is generated based on the context of the instruction text, and the task execution result conforming to the second format is determined from the task information and the predicted text. Generating the predicted text based on context means encoding the instruction text (in an encode-decode manner) into a feature vector and decoding that feature vector with a contextual decoding mechanism, which may be an attention masking mechanism or a diagonal masking mechanism; concretely, the token at the next position is predicted from the preceding tokens of the instruction text's feature vector. The task execution result conforming to the second format is then assembled from the task information and the predicted text according to the second format style.
For example, the task processing model is a Transformer model trained on first-format sample instruction texts and second-format label result texts for recognition-extraction and text classification tasks, and so has both functions. The instruction text "input: Place A is really hot; extract: location" is input into the task processing model; the instruction text is feature-encoded to obtain its feature vector; the feature vector is decoded with a diagonal masking mechanism to generate the predicted text "Place A" corresponding to the task information "extract: location"; and, following the second format style "output: <task information>: <predicted text>", the task execution result conforming to the second format is constructed: "output: location: Place A".
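For concreteness, inference with a pre-trained encoder-decoder checkpoint could look like the following sketch using the Hugging Face transformers API; the checkpoint name is hypothetical, and the expected output assumes the model learned the second format during pre-training as described above:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint trained on first-format/second-format pairs.
tokenizer = AutoTokenizer.from_pretrained("org/task-processing-model")
model = AutoModelForSeq2SeqLM.from_pretrained("org/task-processing-model")

instruction = "input: Place A is really hot; extract: location"
input_ids = tokenizer(instruction, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=32)
result = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# Expected (second format): "output: location: Place A"
```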
In this embodiment, a target text input at the front end and the task information of a target task selected for that text are received; an instruction text conforming to the first format is constructed from them; and, based on the instruction text, the target task is executed on the target text with the task processing model, generating a task execution result conforming to the second format, where the task processing model is trained on first-format sample instruction texts and second-format label result texts, a sample instruction text comprises a sample task text and sample task information, and the second format is determined from the sample task information and its corresponding label text. Because the task processing model has been supervised-trained in advance on such pairs, it has learned both the first-format input and the second-format output; given the task information of an instruction text, it can generate a second-format task execution result corresponding to that task information and its label text. The model thus executes target tasks on target texts under different task information while retaining refined processing capability for specific task information; that is, task processing attains both high versatility and high task processing accuracy.
In an alternative embodiment of the present specification, the task information of the target task includes a task type and a tag type;
correspondingly, the step 106 includes the following specific steps:
inputting the instruction text into a task processing model, executing a target task corresponding to the task type on the target text based on the tag type, generating a prediction text corresponding to the tag type, and determining a task execution result conforming to the second format based on the tag type and the prediction text.
The task information characterizes the task features of the target task, thereby determining the first and second formats and constraining the input and output of the task processing model so that the model accurately understands the task content of the target task and produces an output result in the specific format. Task information can be further subdivided into a task type and a tag type, characterizing the task mode and the task object, respectively.
The task type is the task mode of the target task, i.e., the type of task execution method. The tag type is the task object of the target task, i.e., the type of the label text. For example, for a location-entity recognition-extraction task, the task type is recognition extraction and the tag type is location; label texts of tag type "location" must be recognized and extracted from the target text. Task types and tag types are preset; the user inputs the target text and selects them according to the target task to be executed on it.
In this embodiment, the first format style includes the control symbols ("input", the task), the target text, the task type, and the tag type. Specifically, the first format style is: "input: <target text>; <task type>: <tag type>".
Since the task processing model in this embodiment is a generative language model that executes natural language understanding tasks by means of natural language generation, it must generate a predicted text rather than directly output a task execution result in the manner of a discriminative natural language understanding model.
The instruction text is input into the task processing model, the target task corresponding to the task type is executed on the target text based on the tag type, and a predicted text corresponding to the tag type is generated, as follows: the instruction text is input into the task processing model, and the predicted text corresponding to the tag type is generated based on the tag type and the context of the instruction text.
For example, the instruction text "input: Place A is really hot; extract: location" is input into the task processing model; the instruction text is feature-encoded to obtain its feature vector; the target task corresponding to the task type "extract" is executed on the target text "Place A is really hot" based on the tag type "location" and a diagonal masking mechanism, generating the corresponding predicted text "Place A"; and, following the second format style "output: <task information>: <predicted text>", the task execution result conforming to the second format is determined: "output: location: Place A".
The instruction text is input into the task processing model, the target task corresponding to the task type is executed on the target text based on the tag type, a predicted text corresponding to the tag type is generated, and the task execution result conforming to the second format is determined from the tag type and the predicted text. The task type and tag type let the task processing model understand the task content of the target task at a finer granularity and generate a second-format task execution result, further improving task processing accuracy.
In an alternative embodiment of the present disclosure, before step 106, the following specific steps are further included:
acquiring sample task texts and sample task information of various sample tasks;
based on the sample task text and the sample task information, constructing a sample instruction text conforming to a first format;
determining a second format and a label result text conforming to the second format based on the sample task information and sample label text corresponding to the sample task information;
based on the sample instruction text, executing a corresponding sample task on the sample task text by using a task processing model, and generating a prediction result text conforming to a second format;
training the task processing model based on the predicted result text and the label result text, and obtaining the task processing model after training under the condition that the preset training ending condition is reached.
A sample task is a pre-training text processing task, one of various natural language understanding tasks. The sample task text is the natural language text of the sample task on which that task is to be executed; it may be at least one sentence or at least one word. Sample task information is the task feature information of a sample task, a natural language text characterizing its task features, including but not limited to task type, tag type, and task execution logic. Sample task texts and sample task information may be obtained from open-source sample sets, collected from historical task processing data, or constructed manually.
The sample instruction text is a natural language text instruction which is directly input into the task processing model in the pre-training process, and is used for guiding the task processing model to understand the task content of the sample task in the pre-training process, and executing the sample task corresponding to the sample task information on the sample task text to generate an output result corresponding to the output format. During the pre-training of the task processing model, the sample instruction text in the first format serves as a training sample.
The sample label text is text of an execution object of a sample task in a pre-training process, namely text corresponding to sample task information. For example, the sample task is emotion classification and the tag text is an emotion word. The label result text is a label natural language text result used for measuring the task execution effect of the task processing model in the pre-training process. In the pre-training process of the task processing model, the label result text in the second format is used as a label sample. The predicted result text is a natural language text result of executing the sample task on the sample task text by the task processing model in the pre-training process, and is an output result generated by the task processing model.
The preset training ending condition is a preset judging condition for ending training of the task processing model, including but not limited to: preset iteration times, preset loss value threshold value and preset training convergence conditions.
Based on the sample task text and the sample task information, a sample instruction text conforming to the first format is constructed as follows: following the first format style, the sample instruction text is assembled from the sample task text and the sample task information.
Based on the sample task information and its corresponding sample label text, the second format and the label result text conforming to it are determined as follows: the second format is determined from the sample task information and its corresponding sample label text, and the label result text conforming to the second format is assembled from them according to the second format style.
The task processing model is trained on the predicted result text and the label result text as follows: a loss value is computed from the predicted result text and the label result text, and the model parameters of the task processing model are adjusted based on that loss value. The loss value measures the degree of difference between the predicted result text and the label result text and includes, but is not limited to, cosine loss, cross-entropy loss, and vector-distance loss; parameters may be adjusted by gradient descent.
For example, sample task texts and sample task information are obtained for recognition-extraction tasks (entity extraction, event extraction, question-answer extraction) and text classification tasks (topic classification, intent classification, emotion classification), 100 of each. Based on the sample task texts and sample task information, sample instruction texts conforming to the first format are constructed in the first format style "input: <sample task text>; <sample task information>". The second format is determined from the sample task information and its corresponding sample label text as "output: <sample task information>: <sample label text or predicted result text>", and label result texts conforming to the second format are constructed accordingly. The sample instruction texts are input into a Transformer model, the corresponding sample tasks are executed on the sample task texts, and predicted result texts conforming to the second format are generated. A loss value is computed from the predicted result text and the label result text, the model parameters of the task processing model are adjusted based on the loss value, and when the preset training convergence condition is reached, the trained task processing model is obtained.
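A minimal supervised training step under these definitions might look as follows. This is a sketch assuming a Hugging Face-style seq2seq model, where passing `labels` yields a token-level cross-entropy loss over the label result text (one of the loss choices named above), followed by a gradient-descent update; all names are illustrative:

```python
import torch

def train_step(model, tokenizer, sample_instruction: str, label_result: str,
               optimizer: torch.optim.Optimizer) -> float:
    # Encode the first-format sample instruction text and the
    # second-format label result text.
    inputs = tokenizer(sample_instruction, return_tensors="pt")
    labels = tokenizer(label_result, return_tensors="pt").input_ids
    # Teacher-forced forward pass; the returned loss is the cross-entropy
    # between the predicted result text and the label result text.
    loss = model(**inputs, labels=labels).loss
    loss.backward()        # backpropagate the loss value
    optimizer.step()       # gradient-descent parameter adjustment
    optimizer.zero_grad()
    return loss.item()
```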
Sample task texts and sample task information of multiple sample tasks are obtained; sample instruction texts conforming to the first format are constructed from them; the second format and label result texts conforming to it are determined from the sample task information and its corresponding sample label texts; based on the sample instruction texts, the corresponding sample tasks are executed on the sample task texts with the task processing model, generating predicted result texts conforming to the second format; the model is trained on the predicted result texts and label result texts, and the trained task processing model is obtained when the preset training end condition is reached. Training under the constraint of the sample task information of multiple sample tasks gives the task processing model processing capability for different tasks, i.e., high versatility; at the same time, training under the fixed constraints of the first and second formats gives it refined processing capability for each sample task, i.e., high task processing accuracy.
In an alternative embodiment of the present disclosure, based on the sample instruction text, executing a corresponding sample task on the sample task text by using a task processing model, and generating a predicted result text conforming to the second format, including the following specific steps:
The sample instruction text is input into the task processing model, a predicted text corresponding to the sample task information is generated based on the context of the sample instruction text, and the predicted result text conforming to the second format is determined from the sample task information and the predicted text.
A sample instruction text conforming to the first format can be regarded as a natural language sentence; the predicted text is generated from the context of that sentence, and the predicted result text conforming to the second format is determined from it.
The predicted text is the natural language text result corresponding to the sample task information that the task processing model generates directly during pre-training; it is an output result of the model. For example, for the sample task information "extract: location", the predicted text is "Place B".
Generating the predicted text corresponding to the sample task information from the context of the sample instruction text means feature-encoding the sample instruction text (in an encode-decode manner) into a feature vector and decoding that feature vector with a contextual decoding mechanism, which may be an attention masking mechanism or a diagonal masking mechanism; concretely, the token at the next position is predicted from the preceding tokens of the feature vector.
Based on the sample task information and the predicted text, the predicted result text conforming to the second format is determined as follows: the predicted result text is assembled from the sample task information and the predicted text according to the second format style.
For example, the sample instruction text is input into the task processing model and feature-encoded to obtain its feature vector; the feature vector is decoded with a diagonal masking mechanism to generate the predicted text corresponding to the sample task information; and, following the second format style "output: <task information>: <predicted text>", the predicted result text conforming to the second format is constructed.
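The diagonal masking mechanism mentioned here corresponds to the standard causal (lower-triangular) attention mask: when decoding, position i may attend only to positions up to and including i, so each token is predicted from the preceding tokens. A minimal sketch in PyTorch:

```python
import torch

def diagonal_mask(seq_len: int) -> torch.Tensor:
    # Lower-triangular boolean mask: True where attention is allowed.
    # Row i (the token being predicted) can only see columns 0..i,
    # i.e., the previous tokens in the feature sequence.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

print(diagonal_mask(4))
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True,  True,  True]])
```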
The sample instruction text is input into the task processing model, a predicted text corresponding to the sample task information is generated from the context of the sample instruction text, and the predicted result text conforming to the second format is determined from them. Generating the predicted text from context improves the accuracy of the determined predicted result text and thus the training effect of the model.
In an alternative embodiment of the present specification, after obtaining sample task text and sample task information of a plurality of sample tasks, the method further includes the following specific steps:
And screening the sample task texts according to the quantity distribution of the sample task texts and a preset quantity balance strategy.
Because the sample task texts of the sample tasks are unbalanced in quantity distribution, for example, the number of sample task texts of the entity identification extraction task is large, so that the sample of the entity identification extraction task is oversampled, the number of sample task texts of the text classification task is small, so that the sample of the text classification task is undersampled, the task processing model obtained through training has higher task processing precision for the entity identification task, has lower task processing precision for the text classification task, the universality of task processing is reduced, and the task processing precision is reduced. To prevent this problem, the embodiment of the present specification summarizes the screening of sample task texts according to a preset number balance strategy.
The preset quantity balance policy is a measure of distribution balance set for the quantity distribution of the sample task texts, including but not limited to: a quantity balancing strategy for sample task types and a quantity balancing strategy for sample tag types.
Illustratively, the sample tasks include recognition extraction tasks (entity recognition, event extraction, question answer extraction) and text classification tasks (topic classification, intention classification, emotion classification), six subclasses in total. The preset quantity balance policy is 100 sample task texts per subclass. The sample task texts are screened according to the quantity distribution of the sample task texts of the six subclasses of sample tasks and the quantity balance policy, obtaining 100 sample task texts per subclass, 600 sample task texts in total.
The sample task texts are screened according to the quantity distribution of the sample task texts and the preset quantity balance strategy, so that the training effect of the task processing model is improved, and the universality and the precision of task processing are improved.
In an alternative embodiment of the present description, the sample task information includes a sample task type;
correspondingly, according to the quantity distribution of the sample task texts and a preset quantity balance strategy, the sample task texts are screened, and the method comprises the following specific steps:
according to the sample task types, determining the quantity distribution of sample task texts of each sample task type;
according to the quantity distribution of sample task texts of each sample task type, screening the sample task texts according to a preset first quantity balancing strategy, wherein the first quantity balancing strategy is a quantity balancing strategy aiming at the sample task type.
The sample task type is the task mode of a sample task, namely the task execution mode, including but not limited to: recognition extraction and text classification.
The first quantity balancing strategy is a quantity balancing strategy for sample task types. In the embodiment of the present disclosure, the first quantity balancing strategy is that the number of sample task texts of any sample task type does not exceed a first preset threshold M.
Illustratively, the quantity distribution of sample task texts of each sample task type is determined from the sample task types: recognition extraction tasks: 400, text classification tasks: 380. The first quantity balance strategy is that the number of sample task texts of any sample task type does not exceed a first preset threshold of 300; the sample task texts are screened to obtain 300 sample task texts for the recognition extraction task and 300 sample task texts for the text classification task.
According to the sample task types, determining the quantity distribution of sample task texts of each sample task type; according to the quantity distribution of sample task texts of each sample task type, screening the sample task texts according to a preset first quantity balancing strategy, wherein the first quantity balancing strategy is a quantity balancing strategy aiming at the sample task type. The universality of the task processing model on different task types and the task processing precision on different task types are improved.
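A minimal sketch of the first quantity balance strategy follows, reproducing the 400/380 to 300/300 example above; the record layout and field names are illustrative assumptions.

```python
import random
from collections import defaultdict

def cap_samples(samples, key_fn, threshold):
    # Group sample task texts by key_fn and keep at most `threshold` per group.
    groups = defaultdict(list)
    for sample in samples:
        groups[key_fn(sample)].append(sample)
    kept = []
    for group in groups.values():
        random.shuffle(group)           # unbiased screening within a group
        kept.extend(group[:threshold])  # cap the group at the preset threshold
    return kept

samples = ([{"task_type": "recognition extraction", "text": f"t{i}"} for i in range(400)]
           + [{"task_type": "text classification", "text": f"t{i}"} for i in range(380)])
balanced = cap_samples(samples, key_fn=lambda s: s["task_type"], threshold=300)
print(len(balanced))  # 600, i.e. 300 sample task texts per sample task type
```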
In an alternative embodiment of the present description, the sample task information includes a sample tag type;
correspondingly, according to the quantity distribution of the sample task texts and a preset quantity balance strategy, the sample task texts are screened, and the method comprises the following specific steps:
According to the sample label types, determining the quantity distribution of sample task texts corresponding to the sample label types;
and screening the sample task texts according to a preset second quantity balance strategy according to the quantity distribution of the sample task texts corresponding to each sample label type, wherein the second quantity balance strategy is a quantity balance strategy aiming at the sample label type.
The sample tag type is a task object of the sample task, i.e., the type of sample tag text, including but not limited to: entity extraction (place entity extraction, name entity extraction), event extraction, question answer extraction, topic classification, intention classification, emotion classification.
The second quantity balancing strategy is a quantity balancing strategy for sample tag types. In the embodiment of the present disclosure, the second quantity balancing strategy is that the number of sample task texts of any sample tag type does not exceed a second preset threshold K.
Illustratively, the quantity distribution of sample task texts of each sample tag type is determined from the sample tag types: place entity extraction: 120, person name entity extraction: 150, event extraction: 80, question answer extraction: 90, topic classification: 110, intention classification: 130, emotion classification: 100. The second quantity balance strategy is that the number of sample task texts of any sample tag type does not exceed a second preset threshold of 75; the sample task texts are screened to obtain 75 sample task texts for place entity extraction, 75 for person name entity extraction, 75 for event extraction, 75 for question answer extraction, 75 for topic classification, 75 for intention classification, and 75 for emotion classification.
According to the sample label types, determining the quantity distribution of sample task texts corresponding to the sample label types; and screening the sample task texts according to a preset second quantity balance strategy according to the quantity distribution of the sample task texts corresponding to each sample label type, wherein the second quantity balance strategy is a quantity balance strategy aiming at the sample label type. The universality of the task processing model on different label types and the task processing precision on different label types are improved.
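The same screening helper covers the second quantity balance strategy; the `tag_type` field below is a hypothetical stand-in for the sample tag type carried by each record.

```python
# assuming each record also carries its sample tag type, e.g.
# {"task_type": "recognition extraction", "tag_type": "place entity extraction", ...}
balanced_by_tag = cap_samples(samples, key_fn=lambda s: s["tag_type"], threshold=75)
# at most 75 sample task texts remain per sample tag type, giving the
# 7 x 75 = 525 texts of the example above
```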
In an optional embodiment of the present disclosure, the sample task information includes target sample task information and reference sample task information, a semantic association exists between the target sample task information and the reference sample task information, and the sample task text includes a sample tag text corresponding to the target sample task information;
correspondingly, based on the sample task text and the sample task information, constructing a sample instruction text conforming to the first format, comprising the following specific steps of:
constructing a sample instruction text conforming to a first format based on the sample task text, the target sample task information and the reference sample task information;
correspondingly, based on sample task information and sample label text corresponding to the sample task information in the sample task text, determining a second format and constructing label result text conforming to the second format, wherein the method comprises the following specific steps of:
Determining a second format based on the target sample task information and sample label text corresponding to the target sample task information, and constructing a positive label result text conforming to the second format;
constructing a negative tag result text conforming to a second format based on the reference sample task information and the interference tag text corresponding to the reference sample task information;
based on the positive and negative label result texts, a label result text is determined.
If the sample task information is only ever the correct task feature information of the sample task, that is, information corresponding to the sample tag text (for example, the sample task information contains only "extract: place" and never erroneous sample task information such as "classification", "person name", or "event"), the task processing model learns to generate only one type of predicted text regardless of whether the sample task information is correct. Therefore, in order to eliminate this influence, a certain number of interference negative examples need to be added in the pre-training process, so that the anti-interference performance of the task processing model is improved.
The target sample task information is the correct task characteristic information of the sample task, and the reference sample task information is erroneous task characteristic information of the sample task, i.e., interference information. In the embodiment of the present disclosure, in order to improve the interference immunity of the model, the target sample task information and the reference sample task information have semantic similarity; for example, the target sample task information is "place" and the reference sample task information is "direction". The reference sample task information can be selected from the sample task information of the plurality of sample tasks, and the principle of selection is to consider both high-frequency sample tasks and low-frequency sample tasks (i.e., long-tail sample tasks).
The positive label result text is a label natural language text result for positively measuring the task execution effect of the task processing model. The negative label result text is a label natural language text result for negatively measuring the task execution effect of the task processing model.
Based on the sample task text, the target sample task information and the reference sample task information, a sample instruction text conforming to a first format is constructed in the following specific mode: and constructing the sample instruction text conforming to the first format according to the first format style based on the sample task text, the target sample task information and the reference sample task information.
Based on the target sample task information and the sample label text corresponding to the target sample task information, determining a second format and constructing a positive label result text conforming to the second format, wherein the specific mode is as follows: and determining a second format based on the target sample task information and sample label texts corresponding to the target sample task information, and constructing a positive label result text conforming to the second format according to a second format style based on the target sample task information and the sample label texts corresponding to the target sample task information.
Based on the reference sample task information and the interference label text corresponding to the reference sample task information, constructing a negative label result text conforming to a second format, wherein the specific mode is as follows: and constructing a negative tag result text conforming to the second format according to the second format style based on the reference sample task information and the interference tag text corresponding to the reference sample task information.
Based on the positive label result text and the negative label result text, the label result text is determined by the following specific modes: and splicing the positive label result text and the negative label result text to obtain the label result text.
Illustratively, the sample task text is "A is true heat", the sample tag text is "place A", the interference tag text is "none", the target sample task information is "extract: place", and the reference sample task information is "extract: person name". Based on the sample task text, the target sample task information, and the reference sample task information, the sample instruction text conforming to the first format is constructed according to the first-format style "input: sample task text; target sample task information, reference sample task information", namely "input: A is true heat; extract: place, person name". Based on the target sample task information and the sample label text corresponding to the target sample task information, the second format is determined as "output: sample task information: sample label text", and the positive label result text conforming to the second format is constructed according to the second-format style, namely "output: place: place A". Based on the reference sample task information and the interference label text corresponding to the reference sample task information, the negative label result text conforming to the second format is constructed according to the second-format style, namely "output: person name: none". The positive label result text and the negative label result text are spliced to obtain the label result text "output: place: place A; person name: none".
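A minimal sketch reproducing the construction in the example above; the exact separator characters of the first-format and second-format styles are illustrative assumptions.

```python
def build_sample_instruction(task_text, target_info, reference_info):
    # first-format style: "input: sample task text; target info, reference info"
    target_type, target_label = target_info.split(": ")
    _, reference_label = reference_info.split(": ")
    return f"input: {task_text}; {target_type}: {target_label}, {reference_label}"

def build_label_result(target_info, sample_label, reference_info, interference_label):
    # second-format style: "output: sample task information: sample label text";
    # the positive and negative label result texts are spliced together
    positive = f"{target_info.split(': ')[1]}: {sample_label}"
    negative = f"{reference_info.split(': ')[1]}: {interference_label}"
    return f"output: {positive}; {negative}"

instruction = build_sample_instruction("A is true heat", "extract: place", "extract: person name")
label = build_label_result("extract: place", "place A", "extract: person name", "none")
print(instruction)  # input: A is true heat; extract: place, person name
print(label)        # output: place: place A; person name: none
```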
A sample instruction text conforming to the first format is constructed based on the sample task text, the target sample task information, and the reference sample task information; the second format is determined based on the target sample task information and the sample label text corresponding to the target sample task information, and a positive label result text conforming to the second format is constructed; a negative label result text conforming to the second format is constructed based on the reference sample task information and the interference label text corresponding to the reference sample task information; and the label result text is determined based on the positive and negative label result texts. In this way, the task understanding bias of the task processing model is eliminated, and the fixed-format input and output constraints are prevented from reducing the task processing performance of the model.
In an alternative embodiment of the present specification, before executing the target task on the target text by using the task processing model based on the instruction text, the method further includes the following specific steps:
acquiring a fine tuning task text and fine tuning task information of a fine tuning task;
constructing a fine tuning instruction text conforming to a first format based on the fine tuning task text and the fine tuning task information;
determining a second format based on the fine tuning task information and an object text corresponding to the fine tuning task information in the fine tuning task text, and constructing a tag result text conforming to the second format;
Based on the fine tuning instruction text, executing a corresponding fine tuning task on the fine tuning task text by using the trained task processing model, and generating a prediction result text conforming to the second format;
and adjusting the model parameters of the task processing model based on the prediction result text and the label result text, and obtaining the fine-tuned task processing model when the preset fine tuning ending condition is reached.
In the pre-training process of the task processing model, continued pre-training of the task processing model can be completed by using large-scale weakly supervised sample data constructed by remote supervision. Although this improves the universality and the task processing precision of the task processing model to a certain extent, there remains room for improving the task processing precision in refined task scenarios; at this point, small-scale strongly supervised sample data (high-quality labeled sample data) constructed by experts can be used to complete fine tuning of the task processing model.
The fine tuning task is a text processing task used for fine tuning, and may be any of various natural language understanding tasks. The fine tuning task text is the natural language text to be processed by the fine tuning task, and may be at least one sentence or at least one word. The fine tuning task information is the task feature information of the fine tuning task; it is natural language text and characterizes the task features of the fine tuning task, including but not limited to: task type, tag type, and task execution logic. The fine tuning task text and the fine tuning task information are high-quality labeled sample data constructed by experts.
The fine tuning instruction text is a natural language text instruction which is directly input into the task processing model in the fine tuning process, and is used for guiding the task processing model to understand the task content of the fine tuning task in the fine tuning process, executing the fine tuning task corresponding to the fine tuning task information on the fine tuning task text, and generating an output result corresponding to the output format. In the fine tuning process of the task processing model, fine tuning instruction text in a first format is used as a fine tuning sample.
The fine tuning label text is an execution object text of a fine tuning task in the fine tuning process, namely a text corresponding to fine tuning task information. For example, the fine tuning task is emotion classification and the tag text is an emotion word. The label result text is a label natural language text result used for measuring the task execution effect of the task processing model in the fine tuning process. In the fine tuning process of the task processing model, the label result text in the second format is used as a label sample. The predicted result text is a natural language text result of the task processing model executing the fine tuning task on the fine tuning task text in the fine tuning process, and is an output result generated by the task processing model.
The preset fine tuning ending condition is a preset judgment condition for ending the fine tuning of the task processing model, including but not limited to: a preset number of iterations, a preset loss value threshold, and a preset convergence condition.
The specific implementation of the fine tuning is similar to the implementation of the pre-training described above, and is specifically referred to the pre-training process described above, and will not be described here again.
The fine tuning task text and the fine tuning task information of the fine tuning task are acquired; the fine tuning instruction text conforming to the first format is constructed based on the fine tuning task text and the fine tuning task information; the second format is determined based on the fine tuning task information and the object text corresponding to the fine tuning task information in the fine tuning task text, and the label result text conforming to the second format is constructed; based on the fine tuning instruction text, the corresponding fine tuning task is executed on the fine tuning task text by using the trained task processing model to generate the prediction result text conforming to the second format; and the model parameters of the task processing model are adjusted based on the prediction result text and the label result text, and the fine-tuned task processing model is obtained when the preset fine tuning ending condition is reached. The universality and the task processing precision of the task processing model are thereby further improved.
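A minimal sketch of the fine tuning loop described above; a HuggingFace-style encoder-decoder interface, the optimizer choice, and the ending-condition values are illustrative assumptions rather than the actual task processing model.

```python
import torch

def fine_tune(model, tokenizer, pairs, epochs=3, lr=1e-5, loss_threshold=0.01):
    # pairs: list of (fine tuning instruction text, label result text)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):  # preset number of iterations
        total = 0.0
        for instruction, label in pairs:
            inputs = tokenizer(instruction, return_tensors="pt")
            labels = tokenizer(label, return_tensors="pt").input_ids
            # the model generates a prediction result text in the second format;
            # its loss compares that prediction with the label result text
            out = model(**inputs, labels=labels)
            out.loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            total += out.loss.item()
        if total / len(pairs) < loss_threshold:  # preset loss value threshold
            break  # preset fine tuning ending condition reached
    return model
```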
Referring to fig. 2, fig. 2 shows a flowchart of another task processing method provided in an embodiment of the present disclosure, where the method is applied to cloud-side devices, and includes the following specific steps:
Step 202: receiving target text input by the front end and task information of a target task selected for the target text.
Step 204: and constructing the instruction text conforming to the first format based on the target text and the task information.
Step 206: and executing a target task on the target text by using a task processing model based on the instruction text, and generating a task execution result conforming to a second format, wherein the task processing model is trained based on a sample instruction text in a first format and a label result text in the second format, the sample instruction text comprises a sample task text and sample task information, and the second format is determined based on the sample task information and the label text corresponding to the sample task information.
Step 208: and feeding back the task execution result to the front end.
The cloud-side device is the network cloud device where the server end of an application with multiple task processing functions is located, and is a virtual device; a deep learning model with multiple task processing functions, namely the task processing model, is deployed on the cloud-side device. The front end is the front end of the application or application platform with multiple task processing functions into which the user logs. The cloud-side device and the end-side device are connected through a network transmission channel for data transmission. The computing power of the cloud-side device is higher than that of the end-side device.
It should be noted that, the steps 202 to 208 are described in detail in the embodiment of fig. 1, and are not described herein.
In the embodiment of the specification, the target text input by the front end and the task information of the target task selected for the target text are received; the instruction text conforming to the first format is constructed based on the target text and the task information; the target task is executed on the target text by using the task processing model based on the instruction text to generate the task execution result conforming to the second format, wherein the task processing model is trained based on the sample instruction text in the first format and the label result text in the second format, the sample instruction text includes the sample task text and the sample task information, and the second format is determined based on the sample task information and the label text corresponding to the sample task information; and the task execution result is fed back to the front end. The task processing model is obtained by supervised training in advance based on the sample instruction text in the first format and the label result text in the second format, and thus learns first-format input instruction texts and second-format output results. Because the second format is determined based on the sample task information and the label text corresponding to the sample task information, the task processing model can generate, according to the task information of the instruction text, a task execution result that corresponds to the task information and the corresponding label text and conforms to the second format.
In an alternative embodiment of the present disclosure, following step 106, the following specific steps are further included:
feeding back the task execution result to the front end;
receiving execution result feedback sent by a front end, wherein the execution result feedback is generated based on a task execution result;
based on the execution result feedback, the model parameters of the task processing model are adjusted.
After receiving the feedback task execution result, the front end can further generate execution result feedback based on the task execution result, and further adjust the task processing model in an interactive mode.
Illustratively, the task execution result "output: place: B" is fed back to the front end; the user generates execution result feedback based on the task execution result: "the output place should be place A"; the execution result feedback sent by the front end is received, and the model parameters of the task processing model are adjusted based on the execution result feedback.
Feeding back the task execution result to the front end; receiving execution result feedback sent by a front end, wherein the execution result feedback is generated based on a task execution result; based on the execution result feedback, the model parameters of the task processing model are adjusted. Further adjustment of model parameters of the task processing model is achieved through an interactive mode, and task processing precision of the task processing model is further improved.
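A minimal sketch of turning execution result feedback into adjustment data; the record structure and the replay into the fine tuning loop are illustrative assumptions.

```python
def make_feedback_pair(instruction_text: str, corrected_result: str) -> tuple:
    # The user's correction plays the role of the label result text, so the
    # (instruction text, corrected result) pair can be replayed through the
    # fine tuning loop sketched above to adjust the model parameters.
    return (instruction_text, corrected_result)

pair = make_feedback_pair("input: A is true heat; extract: place",
                          "output: place: place A")
# fine_tune(model, tokenizer, [pair]) would apply the adjustment
```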
Referring to fig. 3, fig. 3 shows a flowchart of an entity identification method according to an embodiment of the present disclosure, including the following specific steps:
Step 302: receiving target text input by the front end and identification task information of the identification task selected for the target text.
Step 304: based on the target text and the recognition task information, an instruction text conforming to a first format is constructed.
Step 306: and executing an identification task on the target text by using a task processing model based on the instruction text to generate an entity identification result conforming to a second format, wherein the task processing model is trained based on a sample instruction text in a first format and a label result text in the second format, the sample instruction text comprises a sample task text and sample task information, and the second format is determined based on the sample task information and an object text corresponding to the sample task information in the sample task text.
For the same inventive concept as the embodiment of fig. 1, the specific manner of steps 302 to 306 is referred to steps 102 to 106, and will not be repeated here.
In the embodiment of the specification, the target text input by the front end and the identification task information of the identification task selected for the target text are received; the instruction text conforming to the first format is constructed based on the target text and the identification task information; and the identification task is executed on the target text by using the task processing model based on the instruction text to generate the entity identification result conforming to the second format, wherein the task processing model is trained based on the sample instruction text in the first format and the label result text in the second format, the sample instruction text includes the sample task text and the sample task information, and the second format is determined based on the sample task information and the label text corresponding to the sample task information. The task processing model is obtained by supervised training in advance based on the sample instruction text in the first format and the label result text in the second format, and thus learns first-format input instruction texts and second-format output results. Because the second format is determined based on the sample task information and the label text corresponding to the sample task information, the task processing model can generate, according to the identification task information of the instruction text, an entity identification result that corresponds to the identification task information and the corresponding label text and conforms to the second format, so that the task processing model has a refined entity identification capability for specific identification task information, i.e., high universality together with high entity identification precision while executing identification tasks under different task information.
Referring to fig. 4, fig. 4 shows a flowchart of a data processing method for task processing, which is applied to cloud-side equipment and includes the following specific steps:
step 402: sample task text and sample task information of various sample tasks are obtained.
Step 404: based on the sample task text and the sample task information, a sample instruction text conforming to a first format is constructed.
Step 406: and determining a second format and a label result text conforming to the second format based on the sample task information and the sample label text corresponding to the sample task information.
Step 408: and executing a corresponding sample task on the sample task text by using the task processing model based on the sample instruction text, and generating a prediction result text conforming to the second format.
Step 410: training the task processing model based on the predicted result text and the label result text, and obtaining the task processing model after training under the condition that the preset training ending condition is reached.
Step 412: and sending the task processing model to the end-side equipment.
The cloud-side device is a network cloud device providing a model training function, and is a virtual device. The end-side device is a terminal device providing various task processing functions, and is a physical device. The end-side device and the cloud-side device are connected through a network channel for data transmission. The computing power and the storage capacity of the cloud-side device are higher than those of the end-side device.
For the same inventive concept as the embodiment of fig. 1, the specific manner of steps 402 to 410 is referred to the embodiment of fig. 1 for pre-training the task processing model, which is not described herein.
In the embodiment of the specification, the sample task texts and the sample task information of various sample tasks are obtained; the sample instruction text conforming to the first format is constructed based on the sample task text and the sample task information; the second format and the label result text conforming to the second format are determined based on the sample task information and the sample label text corresponding to the sample task information; based on the sample instruction text, the corresponding sample task is executed on the sample task text by using the task processing model to generate the prediction result text conforming to the second format; the task processing model is trained based on the prediction result text and the label result text, and the trained task processing model is obtained when the preset training ending condition is reached; and the task processing model is sent to the end-side device. Training the task processing model under the constraint of the sample task information of various sample tasks gives the model task processing capability for different tasks, so that the task processing model has high universality; meanwhile, through the constraint of the first format and the second format, the task processing model acquires a refined task processing capability for each sample task, so that the task processing model has high task processing precision. Because the training process is executed on the cloud-side device, the advantages of high computing power and large storage of the cloud-side device are utilized, the task processing cost is reduced, and the training efficiency and the training effect of the task processing model are improved.
Fig. 5 is a schematic front-end interface of a task processing method according to an embodiment of the present disclosure, where the front-end interface is shown in fig. 5:
The front-end interface of the task processing platform includes an input and output display area, a task selection area, and an input box. The input and output display area displays the instruction text conforming to the first format that is input to the task processing model and the output task execution result (result text) conforming to the second format. The task selection area displays the alternative tasks: two task types, extraction and classification, and the tag types of place entity, person name entity, event, question answer, topic classification, intention classification, emotion classification, and question answering. The target text "A is true heat" is input in the input box, the task type "extraction" and the tag type "place entity" are clicked in the task selection area, and finally the "input" control is clicked; the input and output display area then displays the instruction text conforming to the first format, "input: A is true heat; extract: place". Based on the instruction text, the place entity extraction task is executed on the target text by using the task processing model to generate the task execution result conforming to the second format, "output: place: place A", which is displayed in the input and output display area.
The task processing method provided in the present specification is further described below with reference to fig. 6 by taking an application of the task processing method to a search engine as an example. Fig. 6 is a flowchart of a processing procedure of a task processing method applied to a search engine according to an embodiment of the present disclosure, where the processing procedure includes the following specific steps:
step 602: sample task text and sample task information of various sample tasks are obtained.
Step 604: based on the sample task text and the sample task information, a sample instruction text conforming to a first format is constructed.
Step 606: and determining a second format and a label result text conforming to the second format based on the sample task information and the sample label text corresponding to the sample task information.
Step 608: and executing a corresponding sample task on the sample task text by using the task processing model based on the sample instruction text, and generating a prediction result text conforming to the second format.
Step 610: training the task processing model based on the predicted result text and the label result text, and obtaining the task processing model after training under the condition that the preset training ending condition is reached.
Step 612: and receiving target text input by the front end and task information of a search task selected for the target text.
Step 614: and constructing the instruction text conforming to the first format based on the target text and the task information.
Step 616: and executing the target task on the target text by utilizing the task processing model based on the instruction text, and generating a task execution result conforming to the second format.
Step 618: and determining the task execution result as a search keyword.
Step 620: based on the search keywords, a search engine is called, and corresponding search results are obtained through searching and fed back to the front end.
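A minimal sketch of steps 618 and 620; the result-text layout follows the second-format style above, and the search engine call is a hypothetical placeholder rather than a real API.

```python
def extract_search_keyword(task_result: str) -> str:
    # second-format result text: "output: task information: prediction text";
    # the prediction text after the last separator becomes the search keyword
    return task_result.rsplit(": ", 1)[-1].strip()

keyword = extract_search_keyword("output: place: place A")
print(keyword)  # "place A"
# search_engine.search(keyword) is a hypothetical call that retrieves the
# corresponding search results, which are then fed back to the front end
```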
In the embodiment of the specification, a multitasking unified pre-training mode is adopted to train a task processing model, the format of an instruction text and the format of a result text are fixed, and the task execution result in the fixed format is used as a search keyword to search, so that the search efficiency and the search accuracy are improved, and the search experience of a user is improved.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of a task processing device, and fig. 7 shows a schematic structural diagram of the task processing device provided in one embodiment of the present disclosure. As shown in fig. 7, the apparatus includes:
a first receiving module 702 configured to receive target text input by a front end and task information of a target task selected for the target text;
A first construction module 704 configured to construct an instruction text conforming to a first format based on the target text and the task information;
the first generating module 706 is configured to execute the target task on the target text by using a task processing model based on the instruction text, and generate a task execution result conforming to a second format, where the task processing model is trained based on the sample instruction text in the first format and the tag result text in the second format, and the sample instruction text includes a sample task text and sample task information, and the second format is determined based on the sample task information and the tag text corresponding to the sample task information.
Optionally, the task information of the target task includes a task type and a tag type;
correspondingly, the first generation module 706 is further configured to:
inputting the instruction text into a task processing model, executing a target task corresponding to the task type on the target text based on the tag type, generating a prediction text corresponding to the tag type, and determining a task execution result conforming to the second format based on the tag type and the prediction text.
Optionally, the apparatus further comprises:
the first training module is configured to acquire sample task texts and sample task information of various sample tasks; based on the sample task text and the sample task information, constructing a sample instruction text conforming to a first format; determining a second format and a label result text conforming to the second format based on the sample task information and sample label text corresponding to the sample task information; based on the sample instruction text, executing a corresponding sample task on the sample task text by using a task processing model, and generating a prediction result text conforming to a second format; training the task processing model based on the predicted result text and the label result text, and obtaining the task processing model after training under the condition that the preset training ending condition is reached.
Optionally, the first training module is further configured to:
and inputting the sample instruction text into a task processing model, generating a prediction text corresponding to the sample task information based on the context of the sample instruction text, and determining a prediction result text conforming to the second format based on the sample task information and the prediction text.
Optionally, the apparatus further comprises:
the first screening module is configured to screen the sample task text according to the quantity distribution of the sample task text and a preset quantity balance strategy.
Optionally, the sample task information includes a sample task type;
correspondingly, the first screening module is further configured to:
according to the sample task types, determining the quantity distribution of sample task texts of each sample task type; according to the quantity distribution of sample task texts of each sample task type, screening the sample task texts according to a preset first quantity balancing strategy, wherein the first quantity balancing strategy is a quantity balancing strategy aiming at the sample task type.
Optionally, the sample task information includes a sample tag type;
correspondingly, the first screening module is further configured to:
according to the sample label types, determining the quantity distribution of sample task texts corresponding to the sample label types; and screening the sample task texts according to a preset second quantity balance strategy according to the quantity distribution of the sample task texts corresponding to each sample label type, wherein the second quantity balance strategy is a quantity balance strategy aiming at the sample label type.
Optionally, the sample task information includes target sample task information and reference sample task information, a semantic association exists between the target sample task information and the reference sample task information, and the sample task text includes a sample tag text corresponding to the target sample task information;
correspondingly, the first training module is further configured to: constructing a sample instruction text conforming to a first format based on the sample task text, the target sample task information and the reference sample task information; determining a second format based on the target sample task information and sample label text corresponding to the target sample task information, and constructing a positive label result text conforming to the second format; constructing a negative tag result text conforming to a second format based on the reference sample task information and the interference tag text corresponding to the reference sample task information; based on the positive and negative label result texts, a label result text is determined.
Optionally, the apparatus further comprises:
the first fine tuning module is configured to acquire fine tuning task text and fine tuning task information of a fine tuning task; constructing a fine tuning instruction text conforming to a first format based on the fine tuning task text and the fine tuning task information; determining a second format based on the fine tuning task information and an object text corresponding to the fine tuning task information in the fine tuning task text, and constructing a tag result text conforming to the second format; based on the fine tuning instruction text, executing a corresponding fine tuning task on the fine tuning task text by using the trained task processing model, and generating a prediction result text conforming to the second format; and adjusting model parameters of the task processing model based on the predicted result text and the label result text, and obtaining the task processing model with fine adjustment completed under the condition that the preset fine adjustment ending condition is reached.
In the embodiment of the present disclosure, the task processing model is obtained by performing supervision training in advance based on a sample instruction text in a first format and a tag result text in a second format, and learns an input instruction text in the first format and an output result in the second format.
The above is a schematic solution of a task processing device of the present embodiment. It should be noted that, the technical solution of the task processing device and the technical solution of the task processing method belong to the same concept, and details of the technical solution of the task processing device, which are not described in detail, can be referred to the description of the technical solution of the task processing method.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a task processing device, and fig. 8 shows a schematic structural diagram of another task processing device provided in one embodiment of the present disclosure. As shown in fig. 8, the apparatus is applied to cloud-side equipment, and includes:
A second receiving module 802 configured to receive target text input by the front end and task information of a target task selected for the target text;
a second construction module 804 configured to construct an instruction text conforming to the first format based on the target text and the task information;
a second generating module 806, configured to execute, based on the instruction text, the target task on the target text by using a task processing model, and generate a task execution result according to a second format, where the task processing model is obtained by training based on the sample instruction text in the first format and the tag result text in the second format, and the sample instruction text includes a sample task text and sample task information, and the second format is determined based on the sample task information and the tag text corresponding to the sample task information;
and a feedback module 808 configured to feedback the task execution result to the front end.
Optionally, the apparatus further comprises:
the interaction module is configured to feed back a task execution result to the front end; receiving execution result feedback sent by a front end, wherein the execution result feedback is generated based on a task execution result; based on the execution result feedback, the model parameters of the task processing model are adjusted.
In the embodiment of the specification, the task processing model is obtained by supervised training in advance based on the sample instruction text in the first format and the label result text in the second format, and thus learns first-format input instruction texts and second-format output results. Because the second format is determined based on the sample task information and the label text corresponding to the sample task information, the task processing model can generate, according to the task information of the instruction text, a task execution result that corresponds to the task information and the corresponding label text and conforms to the second format; the task processing model thus has a refined processing capability for specific task information, i.e., high universality together with high task processing precision while executing target tasks under different task information, and the task processing efficiency is improved by utilizing the advantage of the high computing power of the cloud-side device.
The above is a schematic solution of a task processing device of the present embodiment. It should be noted that, the technical solution of the task processing device and the technical solution of the task processing method belong to the same concept, and details of the technical solution of the task processing device, which are not described in detail, can be referred to the description of the technical solution of the task processing method.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of an entity identification device, and fig. 9 shows a schematic structural diagram of an entity identification device provided in one embodiment of the present disclosure. As shown in fig. 9, the apparatus includes:
a third receiving module 902 configured to receive target text input by the front end and identification task information of an identification task selected for the target text;
a third construction module 904 configured to construct an instruction text conforming to the first format based on the target text and the recognition task information;
and a third generating module 906, configured to perform an identification task on the target text by using a task processing model based on the instruction text, and generate an entity identification result conforming to a second format, where the task processing model is trained based on the sample instruction text in the first format and the label result text in the second format, and the sample instruction text includes a sample task text and sample task information, and the second format is determined based on the sample task information and the object text corresponding to the sample task information in the sample task text.
In the embodiment of the present disclosure, the task processing model is obtained by performing supervision training in advance based on a sample instruction text in a first format and a tag result text in a second format, and learns an input instruction text in the first format and an output result in the second format.
The above is a schematic scheme of an entity recognition apparatus of the present embodiment. It should be noted that, the technical solution of the entity recognition device and the technical solution of the entity recognition method belong to the same concept, and details of the technical solution of the entity recognition device, which are not described in detail, can be referred to the description of the technical solution of the entity recognition method.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of a task processing data processing device, and fig. 10 shows a schematic structural diagram of a task processing data processing device according to one embodiment of the present disclosure. As shown in fig. 10, the apparatus is applied to cloud-side equipment, and includes:
an acquisition module 1002 configured to acquire sample task text and sample task information of a plurality of sample tasks;
a building module 1004 configured to build a sample instruction text conforming to the first format based on the sample task text and the sample task information;
a determining module 1006 configured to determine a second format and a tag result text conforming to the second format based on the sample task information and the sample tag text corresponding to the sample task information;
a generating module 1008 configured to execute a corresponding sample task on the sample task text using the task processing model based on the sample instruction text, generating a prediction result text conforming to the second format;
The training module 1010 is configured to train the task processing model based on the predicted result text and the label result text, and obtain a trained task processing model when a preset training end condition is reached;
a sending module 1012 is configured to send the task processing model to the end-side device.
In the embodiment of the specification, the sample instruction text conforming to the first format is constructed based on the sample task text and the sample task information, and the second format and the label result text conforming to the second format are determined based on the sample task information and the sample label text. Training the task processing model based on the sample instruction text and the label result text, under the constraint of the sample task information of various sample tasks, gives the model task processing capability for different tasks, so that the task processing model has high universality; meanwhile, through training under the fixed-format constraint of the first format and the second format, the task processing model acquires a refined task processing capability for each sample task, so that the task processing model has high task processing precision. Because the training process is executed on the cloud-side device, the advantages of high computing power and large storage of the cloud-side device are utilized, the task processing cost is reduced, and the training efficiency and the training effect of the task processing model are improved.
The above is a schematic scheme of a data processing apparatus for task processing of the present embodiment. It should be noted that, the technical solution of the task processing data processing device and the technical solution of the task processing data processing method belong to the same concept, and details of the technical solution of the task processing data processing device which are not described in detail can be referred to the description of the technical solution of the task processing data processing method.
FIG. 11 illustrates a block diagram of a computing device provided in one embodiment of the present description. The components of computing device 1100 include, but are not limited to, a memory 1110 and a processor 1120. Processor 1120 is coupled to memory 1110 via bus 1130, and database 1150 is used to hold data.
The computing device 1100 also includes an access device 1140, and the access device 1140 enables the computing device 1100 to communicate via one or more networks 1160. Examples of these networks include a public switched telephone network (PSTN, Public Switched Telephone Network), a local area network (LAN, Local Area Network), a wide area network (WAN, Wide Area Network), a personal area network (PAN, Personal Area Network), or a combination of communication networks such as the Internet. The access device 1140 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, Network Interface Controller), an IEEE 802.11 wireless local area network (WLAN, Wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (WiMAX, Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a universal serial bus (USB, Universal Serial Bus) interface, a cellular network interface, a Bluetooth interface, or a near field communication (NFC, Near Field Communication) interface.
In one embodiment of the present description, the above components of computing device 1100, as well as other components not shown in FIG. 11, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 11 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 1100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smart phone), a wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, Personal Computer). Computing device 1100 may also be a mobile or stationary server.
The processor 1120 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the task processing method, the entity identification method, or the data processing method for task processing described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solutions of the task processing method, the entity identification method and the task processing data processing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solutions of the task processing method, the entity identification method or the task processing data processing method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the task processing method, the entity identification method, or the data processing method for task processing described above.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solutions of the task processing method, the entity identification method and the data processing method for task processing described above belong to the same concept, and details of the technical solution of the storage medium which are not described in detail may be referred to the description of the technical solutions of the task processing method, the entity identification method or the data processing method for task processing described above.
An embodiment of the present disclosure also provides a computer program, where the computer program, when executed in a computer, causes the computer to perform the steps of the task processing method, the entity recognition method, or the data processing method for task processing described above.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the task processing method, the entity identification method and the data processing method for task processing described above belong to the same concept, and details of the technical solution of the computer program which are not described in detail can be referred to the description of the technical solution of the task processing method, the entity identification method or the data processing method for task processing described above.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, executable file form, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be increased or decreased as appropriate according to the requirements of legislative and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislative and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of action combinations, but those skilled in the art should understand that the embodiments are not limited by the order of the actions described, as some steps may be performed in another order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by every embodiment.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely intended to help clarify the present specification. The alternative embodiments are not intended to be exhaustive or to limit the specification to the precise forms disclosed; obviously, many modifications and variations are possible in light of their teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and their practical application, thereby enabling others skilled in the art to understand and utilize the specification. The specification is to be limited only by the claims and the full scope of equivalents thereof.
Claims (14)
1. A task processing method, comprising:
receiving a target text input by a front end and task information of a target task selected for the target text;
constructing an instruction text conforming to a first format based on the target text and the task information;
and executing the target task on the target text by using a task processing model based on the instruction text, and generating a task execution result conforming to a second format, wherein the task processing model is trained based on the sample instruction text in the first format and the label result text in the second format, the sample instruction text comprises sample task text and sample task information, and the second format is determined based on the sample task information and the label text corresponding to the sample task information.
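By way of illustration only: read procedurally, the first two steps of claim 1 build a first-format instruction from the target text and the task information, then let the model produce a prediction. The minimal Python sketch below assumes a plain-text first format and a `model.generate` interface; the template and all field names are our assumptions, not fixed by the claim.

```python
# Minimal sketch of claim 1's first two steps. The plain-text first-format
# template and the `model.generate` interface are illustrative assumptions.

def build_instruction(target_text: str, task_info: dict) -> str:
    """Construct an instruction text conforming to an assumed first format."""
    return (
        f"Task: {task_info['task_type']}\n"
        f"Tags: {', '.join(task_info['tag_types'])}\n"
        f"Text: {target_text}"
    )

def execute_task(model, target_text: str, task_info: dict) -> str:
    """Run the target task; returns the model's prediction text."""
    instruction = build_instruction(target_text, task_info)
    return model.generate(instruction)
```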
2. The method of claim 1, wherein the task information of the target task comprises a task type and a tag type;
wherein the executing the target task on the target text by using a task processing model based on the instruction text and generating a task execution result conforming to a second format comprises:
inputting the instruction text into the task processing model, executing the target task corresponding to the task type on the target text based on the tag type, generating a prediction text corresponding to the tag type, and determining a task execution result conforming to the second format based on the tag type and the prediction text.
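Continuing the sketch above, claim 2's final step shapes the prediction text into a second-format result keyed by tag type. The line-per-tag output convention below is an assumption made for illustration only.

```python
# Assembling a second-format result from the tag types and the prediction
# text (claim 2). The "tag: span" line convention is assumed.

def to_second_format(tag_types: list[str], prediction: str) -> dict[str, list[str]]:
    result: dict[str, list[str]] = {tag: [] for tag in tag_types}
    for line in prediction.splitlines():
        tag, sep, span = line.partition(":")
        if sep and tag.strip() in result:
            result[tag.strip()].append(span.strip())
    return result
```

For tag types `["person", "place"]` and prediction text `"person: Alice\nplace: Hangzhou"`, this yields `{"person": ["Alice"], "place": ["Hangzhou"]}`.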
3. The method of claim 1 or 2, further comprising, prior to the executing the target task on the target text by using a task processing model based on the instruction text:
acquiring sample task texts and sample task information of a plurality of sample tasks;
constructing a sample instruction text conforming to a first format based on the sample task text and the sample task information;
determining a second format and a label result text conforming to the second format based on the sample task information and sample label text corresponding to the sample task information;
based on the sample instruction text, executing a corresponding sample task on the sample task text by using a task processing model, and generating a prediction result text conforming to the second format;
training the task processing model based on the prediction result text and the label result text, and obtaining a trained task processing model when a preset training end condition is reached.
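Claim 3 amounts to supervised training on (sample instruction text, label result text) pairs. The sketch below assumes a HuggingFace-style causal language model; `encode_pair` is a hypothetical helper that tokenizes a pair and masks the instruction tokens out of the loss, and `max_steps` stands in for the unspecified "preset training end condition".

```python
# Sketch of the claim-3 training loop under a HuggingFace-style interface.
# `encode_pair` is a hypothetical tokenization helper.

def train(model, optimizer, samples, encode_pair, max_steps=10_000):
    model.train()
    for step, sample in enumerate(samples):
        if step >= max_steps:  # assumed training end condition
            break
        input_ids, labels = encode_pair(sample["instruction"], sample["label_result"])
        loss = model(input_ids=input_ids, labels=labels).loss  # cross-entropy on label tokens
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```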
4. The method of claim 3, wherein the executing a corresponding sample task on the sample task text by using a task processing model based on the sample instruction text and generating a prediction result text conforming to the second format comprises:
inputting the sample instruction text into the task processing model, generating a prediction text corresponding to the sample task information based on the context of the sample instruction text, and determining a prediction result text conforming to the second format based on the sample task information and the prediction text.
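The "generating ... based on the context" in claim 4 is ordinary autoregressive decoding. A greedy-decoding sketch follows, with a deliberately simplified model interface; `model.next_token_logits` and the tokenizer API are assumptions, not the patent's API.

```python
# Greedy autoregressive decoding (claim 4): each token is predicted from
# the context of the sample instruction text plus tokens emitted so far.

def generate(model, tokenizer, instruction: str, max_new_tokens: int = 128) -> str:
    prompt_ids = tokenizer.encode(instruction)
    out_ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model.next_token_logits(out_ids)          # assumed interface
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == tokenizer.eos_token_id:
            break
        out_ids.append(next_id)
    return tokenizer.decode(out_ids[len(prompt_ids):])
```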
5. The method of claim 3, further comprising, after the acquiring sample task texts and sample task information of a plurality of sample tasks:
and screening the sample task texts according to the quantity distribution of the sample task texts and a preset quantity balance strategy.
6. The method of claim 5, wherein the sample task information comprises a sample task type;
wherein the screening the sample task texts according to the quantity distribution of the sample task texts and a preset quantity balance strategy comprises:
determining, according to the sample task types, the quantity distribution of the sample task texts of each sample task type;
and screening the sample task texts according to the quantity distribution of the sample task texts of each sample task type and a preset first quantity balance strategy, wherein the first quantity balance strategy is a quantity balance strategy for the sample task type.
7. The method of claim 5, wherein the sample task information comprises a sample tag type;
wherein the screening the sample task texts according to the quantity distribution of the sample task texts and a preset quantity balance strategy comprises:
determining, according to the sample tag types, the quantity distribution of the sample task texts corresponding to each sample tag type;
and screening the sample task texts according to the quantity distribution of the sample task texts corresponding to each sample tag type and a preset second quantity balance strategy, wherein the second quantity balance strategy is a quantity balance strategy for the sample tag type.
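Claims 5-7 do not fix a concrete quantity balance strategy. One common reading is downsampling each group (by sample task type for the first strategy, by sample tag type for the second) to a shared cap; the cap-at-the-smallest-group policy in the sketch below is our assumption.

```python
# One possible quantity balance strategy for claims 5-7: group the samples,
# then downsample every group to a shared cap. The cap policy is assumed.
import random
from collections import defaultdict

def balance(samples, key, cap=None):
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key]].append(sample)
    if cap is None:
        cap = min(len(group) for group in groups.values())  # smallest group
    balanced = []
    for group in groups.values():
        balanced.extend(random.sample(group, min(len(group), cap)))
    return balanced

# First quantity balance strategy (claim 6):  balance(samples, key="task_type")
# Second quantity balance strategy (claim 7): balance(samples, key="tag_type")
```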
8. The method of claim 1, further comprising, prior to the executing the target task on the target text by using a task processing model based on the instruction text:
acquiring a fine tuning task text and fine tuning task information of a fine tuning task;
constructing a fine tuning instruction text conforming to a first format based on the fine tuning task text and the fine tuning task information;
determining a second format based on the fine tuning task information and an object text corresponding to the fine tuning task information in the fine tuning task text, and constructing a label result text conforming to the second format;
based on the fine tuning instruction text, executing a corresponding fine tuning task on the fine tuning task text by using the trained task processing model, and generating a prediction result text conforming to the second format;
and adjusting model parameters of the task processing model based on the prediction result text and the label result text, and obtaining a fine-tuned task processing model when a preset fine tuning end condition is reached.
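Claim 8 reuses the claim-3 loop on fine-tuning data whose label result text is built from an object text inside the fine-tuning task text. The pair construction is sketched below, reusing `build_instruction` from the claim-1 sketch; the single-tag label layout is an assumption.

```python
# Building one claim-8 fine-tuning pair. `object_text` is the annotated
# span inside the task text; the "tag: span" label layout is assumed.

def build_finetune_pair(task_text: str, task_info: dict, object_text: str) -> dict:
    instruction = build_instruction(task_text, task_info)          # first format
    label_result = f"{task_info['tag_types'][0]}: {object_text}"   # second format
    return {"instruction": instruction, "label_result": label_result}
```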
9. A task processing method, applied to a cloud-side device, comprising:
receiving a target text input by a front end and task information of a target task selected for the target text;
constructing an instruction text conforming to a first format based on the target text and the task information;
executing the target task on the target text by using a task processing model based on the instruction text, and generating a task execution result conforming to a second format, wherein the task processing model is trained based on a sample instruction text in the first format and a label result text in the second format, the sample instruction text comprises a sample task text and sample task information, and the second format is determined based on the sample task information and the label text corresponding to the sample task information;
and feeding back the task execution result to the front end.
10. The method of claim 9, further comprising, after the feeding back the task execution result to the front end:
receiving execution result feedback sent by the front end, wherein the execution result feedback is generated based on the task execution result;
and adjusting model parameters of the task processing model based on the execution result feedback.
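Claim 10 leaves the adjustment mechanism open. One plain reading, sketched below under that assumption, treats corrected results from the front end as new labeled pairs and periodically folds them into further training with the claim-3 loop; the payload shape and buffer threshold are arbitrary choices.

```python
# One reading of claim 10: fold execution-result feedback back into training.
# The feedback payload shape and the batch threshold of 64 are assumptions.

def on_feedback(model, optimizer, encode_pair, buffer, feedback):
    buffer.append({
        "instruction": feedback["instruction"],
        "label_result": feedback["corrected_result"],
    })
    if len(buffer) >= 64:  # retrain once enough feedback has accumulated
        train(model, optimizer, buffer, encode_pair, max_steps=len(buffer))
        buffer.clear()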
11. An entity identification method, comprising:
receiving a target text input by a front end and identification task information of an identification task selected for the target text;
constructing an instruction text conforming to a first format based on the target text and the identification task information;
and executing the identification task on the target text by using a task processing model based on the instruction text to generate an entity identification result conforming to a second format, wherein the task processing model is trained based on the sample instruction text in the first format and the label result text in the second format, the sample instruction text comprises sample task text and sample task information, and the second format is determined based on the sample task information and the object text corresponding to the sample task information in the sample task text.
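Claim 11 is the claim-1 pipeline specialized to entity identification: the tag types are entity types, and the second-format result groups recognized entities by type. A worked example reusing the illustrative helpers above; the model output shown is hypothetical.

```python
# Entity identification as a special case of the claim-1 pipeline.
task_info = {
    "task_type": "entity_identification",
    "tag_types": ["person", "organization", "location"],
}
target_text = "Alice joined Acme Corp in Hangzhou."

# A hypothetical second-format prediction from the task processing model:
prediction = "person: Alice\norganization: Acme Corp\nlocation: Hangzhou"
entities = to_second_format(task_info["tag_types"], prediction)
# -> {"person": ["Alice"], "organization": ["Acme Corp"], "location": ["Hangzhou"]}
```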
12. A data processing method for task processing, applied to a cloud-side device, comprising:
acquiring sample task texts and sample task information of a plurality of sample tasks;
constructing a sample instruction text conforming to a first format based on the sample task text and the sample task information;
determining a second format and a label result text conforming to the second format based on the sample task information and sample label text corresponding to the sample task information;
based on the sample instruction text, executing a corresponding sample task on the sample task text by using a task processing model, and generating a prediction result text conforming to the second format;
training the task processing model based on the prediction result text and the label result text, and obtaining a trained task processing model when a preset training end condition is reached;
and sending the trained task processing model to an end-side device.
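The final step of claim 12 ships the trained model from the cloud-side device to an end-side device. The serialization format and HTTP transport below are implementation assumptions, not prescribed by the claim.

```python
# Assumed deployment step for claim 12: serialize the trained weights and
# push them to the end-side device. Transport and format are assumptions.
import io

import requests
import torch

def send_model(model, endpoint_url: str) -> None:
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)  # serialize trained parameters
    requests.post(endpoint_url, data=buffer.getvalue())
```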
13. A computing device, comprising:
a memory and a processor;
wherein the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 12.
14. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311047348.1A (CN117291185A) | 2023-08-17 | 2023-08-17 | Task processing method, entity identification method and task processing data processing method
Publications (1)
Publication Number | Publication Date |
---|---|
CN117291185A (en) | 2023-12-26
Family
ID=89256082
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311047348.1A (CN117291185A, pending) | Task processing method, entity identification method and task processing data processing method | 2023-08-17 | 2023-08-17
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117291185A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117648986A (en) * | 2024-01-26 | 2024-03-05 | 浙江阿里巴巴机器人有限公司 | Task processing and code processing method, computing device, medium, and program product |
CN117648986B (en) * | 2024-01-26 | 2024-05-14 | 浙江阿里巴巴机器人有限公司 | Task processing and code processing method, computing device, medium, and program product |
CN118095266A (en) * | 2024-03-15 | 2024-05-28 | 北京师范大学 | Language model-based ancient Chinese text processing method, device and storage medium |
Similar Documents
Publication | Title | Publication Date
---|---|---|
WO2020228376A1 (en) | Text processing method and model training method and apparatus | |
CN117521675A (en) | Information processing method, device, equipment and storage medium based on large language model | |
EP4209965A1 (en) | Data processing method and related device | |
CN112487182A (en) | Training method of text processing model, and text processing method and device | |
CN111897941A (en) | Dialog generation method, network training method, device, storage medium and equipment | |
CN113268609B (en) | Knowledge graph-based dialogue content recommendation method, device, equipment and medium | |
US11475225B2 (en) | Method, system, electronic device and storage medium for clarification question generation | |
CN116579339B (en) | Task execution method and optimization task execution method | |
CN116720004B (en) | Recommendation reason generation method, device, equipment and storage medium | |
CN108846077A (en) | Semantic matching method, device, medium and the electronic equipment of question and answer text | |
CN117291185A (en) | Task processing method, entity identification method and task processing data processing method | |
CN113761153B (en) | Picture-based question-answering processing method and device, readable medium and electronic equipment | |
CN114676234A (en) | Model training method and related equipment | |
WO2023137911A1 (en) | Intention classification method and apparatus based on small-sample corpus, and computer device | |
CN111125406A (en) | Visual relation detection method based on self-adaptive cluster learning | |
CN117573842B (en) | Document retrieval method and automatic question-answering method | |
US20240046067A1 (en) | Data processing method and related device | |
CN117216544A (en) | Model training method, natural language processing method, device and storage medium | |
CN116050405A (en) | Text processing, question-answer text processing and text processing model training method | |
US20240037335A1 (en) | Methods, systems, and media for bi-modal generation of natural languages and neural architectures | |
CN118246537B (en) | Question and answer method, device, equipment and storage medium based on large model | |
CN117971420A (en) | Task processing, traffic task processing and task processing model training method | |
CN109002498B (en) | Man-machine conversation method, device, equipment and storage medium | |
CN116384405A (en) | Text processing method, text classification method and emotion recognition method | |
CN116861913A (en) | Position detection method based on GPT large model and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||