CN117312979A - Object classification method, classification model training method and electronic equipment - Google Patents
- Publication number
- CN117312979A CN202311065380.2A
- Authority
- CN
- China
- Prior art keywords
- category
- sample
- information
- classification
- level
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Abstract
The application provides an object classification method, a classification model training method, an apparatus, an electronic device, and a computer-readable storage medium. The object classification method comprises: acquiring preset category information and description information to be tested of an object to be tested, wherein the preset category information comprises preset categories of a plurality of category levels; performing, through a classification model, hierarchical classification processing based on the description information to be tested and the preset category information to obtain classification results of the object to be tested at the plurality of category levels; and determining target categories of the object to be tested at the plurality of category levels according to those classification results. By classifying the object to be tested level by level and determining a target category at each category level, the application takes the plurality of category levels into account together and effectively improves classification accuracy.
Description
Technical Field
The present application relates to artificial intelligence technology, and in particular, to an object classification method, a classification model training method, an apparatus, an electronic device, and a computer readable storage medium.
Background
A category system is created so that objects of a particular kind can be distinguished and used efficiently. It includes a plurality of category levels, each of which includes a number of categories. Category systems are common on e-commerce platforms, where determining the category to which each commodity belongs helps the platform maintain and manage its on-site commodities.
To determine the category to which a commodity belongs, schemes in the related art generally match the text information of the commodity against the categories of the last category level, and thereby determine the commodity's category at that level. However, the inventors have found that, because the category system itself is large and complex, classifying by the last category level alone is inaccurate, and commodities are easily assigned to the wrong category.
Disclosure of Invention
The embodiment of the application provides an object classification method, a classification model training method, an object classification device, electronic equipment and a computer readable storage medium, which can comprehensively consider the situations of multiple category levels, improve the training effect of a classification model and further improve classification precision.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides an object classification method, comprising:
acquiring preset category information and description information to be tested of an object to be tested; wherein the preset category information comprises preset categories of a plurality of category levels;
performing hierarchical classification processing based on the description information to be tested and the preset category information through a classification model to obtain classification results of the object to be tested at the plurality of category levels;
and determining target categories of the object to be tested at the plurality of category levels according to the classification results of the object to be tested at the plurality of category levels.
Through this scheme, the trained classification model classifies the object to be tested at multiple levels. Compared with a scheme that relies only on the last category level, considering the plurality of category levels together effectively improves classification accuracy.
In the above scheme, the method further comprises:
for any one category level, performing the following processing:
screening the plurality of preset categories of that category level according to the classification result of the preceding category level;
and updating the preset category information according to the preset categories retained by the screening.
Through this scheme, the preset category information is updated according to the classification results already obtained, so that the classification conforms to the rules of the category system, classification accuracy and rationality are improved, and the amount of calculation is reduced.
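The screening step above can be illustrated with a minimal Python sketch. It is not the implementation of the application: the `CHILDREN` table, the `score` callable standing in for the classification model, and the toy scores are all hypothetical, and the sketch assumes that screening keeps only the child categories of the category chosen at the preceding level.

```python
# Hypothetical category system: each category maps to its child categories.
CHILDREN = {
    "clothing": ["coat", "pants"],
    "food": ["fruit", "vegetables"],
    "coat": ["T-shirt", "down jacket"],
    "pants": ["jeans", "shorts"],
    "fruit": ["apple", "banana"],
    "vegetables": ["carrot", "spinach"],
}
TOP_LEVEL = ["clothing", "food"]


def classify_level_by_level(score, num_levels):
    """Pick one target category per level, screening each level by its parent.

    `score(category)` is a stand-in for the classification model: it returns
    a probability-like score of the object belonging to `category`.
    """
    candidates = TOP_LEVEL
    targets = []
    for _ in range(num_levels):
        best = max(candidates, key=score)
        targets.append(best)
        # Screening (assumed rule): only children of the chosen category
        # remain as preset categories for the next level.
        candidates = CHILDREN.get(best, [])
        if not candidates:
            break
    return targets


# Toy scores; "apple" scores highest overall but is screened out because
# its ancestors were not chosen at the earlier levels.
toy_scores = {"clothing": 0.9, "food": 0.1, "coat": 0.7, "pants": 0.3,
              "T-shirt": 0.2, "down jacket": 0.8, "apple": 0.95}
targets = classify_level_by_level(lambda c: toy_scores.get(c, 0.0), 3)
print(targets)  # ['clothing', 'coat', 'down jacket']
```

Because each level only scores the retained candidates, the number of comparisons is bounded by the size of one branch rather than by the size of the whole last level, which is consistent with the reduced amount of calculation described above.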
In the above scheme, the performing hierarchical classification processing based on the description information to be tested and the preset category information through the classification model to obtain classification results of the object to be tested at the plurality of category levels comprises:
performing the following processing through the classification model:
extracting a description feature to be tested from the description information to be tested, and extracting preset category features from the preset categories included in the preset category information;
calculating, according to the description feature to be tested and a preset category feature, the probability that the object to be tested belongs to the preset category corresponding to that preset category feature;
and for any one category level, determining the probabilities that the object to be tested belongs to each of the plurality of preset categories of that category level as the classification result of the object to be tested at that category level.
Through this scheme, the probability that the object to be tested belongs to a preset category can be predicted based on the learned feature association, which ensures the accuracy of the obtained probability.
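As a hedged illustration of the probability calculation, the following Python sketch scores one category level. The text above does not fix how the probability is computed from the two features; cosine similarity followed by a softmax over the level, the `temperature` parameter, and the toy two-dimensional features are assumptions made only for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def level_probabilities(description_feature, category_features, temperature=0.1):
    """Return {category: probability} for the preset categories of one level."""
    names = list(category_features)
    logits = [cosine(description_feature, category_features[n]) / temperature
              for n in names]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}


# Toy features; real features would be extracted by the classification model.
probs = level_probabilities([0.9, 0.1],
                            {"clothing": [1.0, 0.0], "food": [0.0, 1.0]})
target = max(probs, key=probs.get)
print(target)  # clothing
```

The dictionary returned for a level plays the role of the classification result at that level, and taking its highest-probability entry gives one possible way to pick the target category.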
The embodiment of the application provides a classification model training method, which comprises the following steps:
acquiring sample description information and sample category information of a sample object; the sample category information comprises sample categories of the sample object at a plurality of category levels;
performing hierarchical classification processing based on the sample description information and the sample category information through a classification model to obtain classification results at the plurality of category levels;
performing loss calculation according to sample categories and classification results of the same category level to obtain level loss, and performing multi-level fusion processing on the level loss corresponding to each category level to obtain multi-level loss;
training the classification model according to the multi-level loss; the trained classification model is used for predicting classification results of the object to be tested in the category levels.
Through this scheme, the level losses corresponding to the plurality of category levels are considered together, which improves the training effect of the classification model. The trained classification model has multi-level classification capability and does not rely on the last category level alone, so classification accuracy is effectively improved.
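The loss calculation above can be sketched as follows. This is an illustrative example under stated assumptions, not the method of the application: the level loss is taken to be the negative log likelihood of the sample category, and the multi-level fusion is taken to be a weighted sum with equal weights by default. The function names and the `weights` parameter are hypothetical.

```python
import math


def level_loss(probabilities, sample_category):
    """Negative log likelihood of the true sample category at one level."""
    return -math.log(probabilities[sample_category])


def multi_level_loss(results, sample_categories, weights=None):
    """Fuse per-level losses into one multi-level loss (weighted sum)."""
    losses = [level_loss(p, c) for p, c in zip(results, sample_categories)]
    # Assumed default: every category level contributes equally.
    weights = weights or [1.0 / len(losses)] * len(losses)
    return sum(w * l for w, l in zip(weights, losses))


# Classification results at two category levels for one sample object.
results = [
    {"clothing": 0.5, "food": 0.5},
    {"coat": 0.25, "pants": 0.75},
]
sample_categories = ["clothing", "coat"]
loss = multi_level_loss(results, sample_categories)
print(loss)
```

Passing explicit `weights` would let a particular level, for example the last one, count more heavily; whether and how the levels are weighted is a design choice the text above leaves open.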
In the above scheme, the performing hierarchical classification processing based on the sample description information and the sample category information through the classification model to obtain classification results at the plurality of category levels comprises:
performing the following processing through the classification model:
extracting a sample description feature from the sample description information, and extracting sample category features from the sample category information;
and for any one category level, calculating a category probability distribution at that category level according to the sample description feature and the sample category features, and taking the category probability distribution as the classification result at that category level.
This scheme provides one implementation of hierarchical classification: features are extracted from the input information and used to predict the category probability distribution of each category level, so the method adapts to the situation of each category level.
In the above aspect, after extracting the sample description feature from the sample description information and extracting the sample category feature from the sample category information, the method further includes:
performing linear projection processing on the sample description characteristic and the sample category characteristic;
and carrying out normalization processing on the sample description characteristic and the sample category characteristic after linear projection.
Through this scheme, linear projection brings the sample description feature and the sample category feature to the same dimension, and normalization brings them to the same scale, which makes the features more comparable and improves the learning effect.
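A minimal sketch of the projection and normalization steps is given below. The projection matrices are fixed toy values rather than learned parameters, and L2 normalization is an assumption, since the text above does not name a specific normalization.

```python
import math


def project(feature, weight):
    """Linear projection: `weight` has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weight]


def l2_normalize(vector):
    """Scale a vector to unit length so features share one scale."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Hypothetical features with different dimensions (3 and 4).
description_feature = [3.0, 0.0, 4.0]
category_feature = [1.0, 2.0, 2.0, 0.0]

# Hypothetical projection matrices mapping both features to 2 dimensions.
w_description = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
w_category = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]

description_unit = l2_normalize(project(description_feature, w_description))
category_unit = l2_normalize(project(category_feature, w_category))
print(description_unit, category_unit)
```

After both steps the two vectors have the same length and unit norm, so a dot product between them is directly a cosine similarity.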
In the above scheme, the sample description information includes description information of a plurality of modalities; the extracting the sample description feature from the sample description information comprises the following steps:
extracting a modality description feature from the description information of each modality;
and performing feature fusion processing on the modality description features respectively corresponding to the plurality of modalities to obtain the sample description feature.
Through the scheme, the description information of a plurality of modes related to the sample object is comprehensively considered, so that the richness of the information can be improved, and the training effect of the classification model is further improved.
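The feature fusion step can be sketched as below. The text above does not specify the fusion operation; concatenation is assumed here as one simple option, and the modality names "text" and "image" and their feature values are hypothetical.

```python
def fuse_modalities(modality_features):
    """Concatenate per-modality features in a fixed (sorted) modality order."""
    fused = []
    for modality in sorted(modality_features):
        fused.extend(modality_features[modality])
    return fused


# Hypothetical modality description features for one sample object.
features = {"text": [0.2, 0.8], "image": [0.5, 0.1, 0.4]}
sample_description_feature = fuse_modalities(features)
print(sample_description_feature)  # [0.5, 0.1, 0.4, 0.2, 0.8]
```

Sorting the modality names keeps the layout of the fused vector stable across samples, which matters for any layer that consumes it.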
In the above scheme, there are a plurality of sample objects; the performing hierarchical classification processing based on the sample description information and the sample category information through the classification model to obtain classification results at the plurality of category levels comprises:
combining the plurality of sample description information and the plurality of sample category information to obtain a plurality of information combinations; each information combination comprises a sample description information and a sample category information;
performing hierarchical classification processing based on information combination through the classification model to obtain classification results of the information combination in the category levels;
the performing loss calculation according to sample categories and classification results of the same category level to obtain the level loss comprises:
for any one category level, the following processing is performed:
determining a category label of the information combination at any category level according to the sample category of the target sample object corresponding to the information combination at any category level;
determining the difference between the category label of any category level and the classification result of the information combination as the information combination loss;
and carrying out information combination fusion processing on the information combination losses corresponding to the plurality of information combinations respectively to obtain the hierarchy loss of any category hierarchy.
Through this scheme, a plurality of information combinations are generated based on the idea of contrastive learning, which enriches the training data and improves the model training effect.
In the above scheme, the method further comprises:
when sample description information and sample category information in the information combination correspond to the same sample object, determining the same sample object as a target sample object corresponding to the information combination;
when the sample description information and the sample category information in the information combination correspond to different sample objects, determining the target sample object corresponding to the information combination as no object; and the sample category of the no-object at any category level is no category.
Through the scheme, when the sample description information and the sample category information in the information combination correspond to the same sample object, the information combination is used as a positive sample; when the sample description information and the sample category information in the information combination correspond to different sample objects, the information combination is taken as a negative sample. Thus, the expansion of training data is accurately and effectively realized.
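The construction of information combinations and their category labels can be sketched as follows. The sample data, the dictionary layout, and the use of `None` to represent "no category" are assumptions made for illustration only.

```python
from itertools import product

NO_CATEGORY = None  # assumed representation of the "no category" label


def build_combinations(samples, level):
    """Pair every sample description with every sample category information.

    A combination whose two parts come from the same sample object is a
    positive sample labelled with that object's category at `level`;
    any other combination is a negative sample labelled NO_CATEGORY.
    """
    combos = []
    for (i, desc_owner), (j, cat_owner) in product(enumerate(samples), repeat=2):
        label = desc_owner["categories"][level] if i == j else NO_CATEGORY
        combos.append({
            "description": desc_owner["description"],
            "category_info": cat_owner["categories"],
            "label": label,
        })
    return combos


# Hypothetical sample objects with categories at three category levels.
samples = [
    {"description": "red down jacket",
     "categories": ["clothing", "coat", "down jacket"]},
    {"description": "fresh apple",
     "categories": ["food", "fruit", "apple"]},
]
combos = build_combinations(samples, level=0)
print([c["label"] for c in combos])  # ['clothing', None, None, 'food']
```

With N sample objects this yields N x N combinations, of which N are positive, so a batch of samples produces many more training signals than the samples alone.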
An embodiment of the present application provides an object classification device, including:
the target acquisition module is used for acquiring preset category information and description information to be detected of the object to be detected; the preset category information comprises preset categories of each category level in a plurality of category levels;
the target processing module is used for carrying out hierarchical classification processing on the basis of the description information to be detected and the preset category information through a classification model to obtain classification results of the object to be detected in the category levels;
and the determining module is used for determining the category of the object to be detected in the category levels according to the classification results of the object to be detected in the category levels.
The embodiment of the application provides a classification model training device, which comprises:
the sample acquisition module is used for acquiring sample description information and sample category information of the sample object; the sample category information comprises sample categories of the sample object at a plurality of category levels;
The sample processing module is used for carrying out hierarchical classification processing on the basis of the sample description information and the sample category information through a classification model to obtain classification results at the category levels;
the loss calculation module is used for carrying out loss calculation according to sample categories and classification results of the same category level to obtain level loss, and carrying out multi-level fusion processing on the level loss corresponding to the category levels to obtain multi-level loss;
the training module is used for training the classification model according to the multi-level loss; the trained classification model is used for predicting classification results of the object to be tested in the category levels.
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for realizing the object classification method or the classification model training method provided by the embodiment of the application when executing the executable instructions stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed, cause a processor to perform the object classification method or the classification model training method provided by the embodiments of the present application.
Embodiments of the present application provide a computer program product comprising executable instructions for implementing the object classification method or classification model training method provided in embodiments of the present application when executed by a processor.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a classification system according to an embodiment of the present application;
FIG. 2A is a schematic structural diagram of a server according to an embodiment of the present application;
FIG. 2B is another schematic structural diagram of a server according to an embodiment of the present application;
FIG. 3A is a schematic flow chart of a classification model training method according to an embodiment of the present application;
FIG. 3B is another flow chart of a classification model training method according to an embodiment of the present application;
FIG. 3C is another flow chart of a classification model training method according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of an object classification method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a commodity display of an e-commerce platform according to an embodiment of the present application;
FIG. 6 is a schematic diagram of constructing a feature combination matrix according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict. In the following description, the term "plurality" refers to at least two.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
Before further describing embodiments of the present application in detail, the terms and expressions that are referred to in the embodiments of the present application are described, and are suitable for the following explanation.
1) Artificial intelligence (Artificial Intelligence, AI): a theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Machine Learning (ML) is the core of artificial intelligence, and is the fundamental approach to make computers intelligent, which is applied throughout various fields of artificial intelligence. Machine learning involves multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, etc., and specially studies how a computer simulates or implements learning behavior of a human being to obtain new knowledge or skills, and reorganizes existing knowledge structures to continuously improve their own performance.
In the embodiments of the present application, the classification model may be an artificial neural network model, which is implemented based on machine learning theory and simulates the way biological neurons transmit signals to one another.
2) Forward propagation (Forward Propagation): the process of computing through the network layers of a neural network in order, from the input layer to the last layer, until the last layer produces an output result. Back propagation (Back Propagation): the process of updating the weights and biases in the network layers of the neural network according to the error between the output result and the expected result, so as to reduce that error. Back propagation proceeds in the reverse order of forward propagation.
3) Category system: the system comprises a plurality of category levels, wherein each category level comprises a plurality of preset categories. By way of example with a merchandise category system, the first category level may include categories such as "clothing," "food," and the like; for the "clothing" category, the next level (second category level) may include the "coat", "pants", and the like; for the "coat" category, the next level (third category level) may include the "T-shirt", "down jacket", and the like.
4) Object: the thing to be classified. The embodiments of the present application do not limit the type of the object, which may be a commodity, a vehicle, a human, an animal, or the like.
5) Loss value (Loss): the "loss" is the penalty imposed on the model for failing to output the expected result. The loss function evaluates the performance of the model by comparing the model's output result with the expected result, and thereby indicates the direction of optimization; the value produced by the loss function is the loss value. If the deviation between the output result and the expected result is large, the loss value is large; if the deviation is small, the loss value is small. The embodiments of the present application do not limit the type of the loss function, which may be a cross entropy loss function, a hinge loss function, a negative log likelihood loss function, or the like.
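The terms forward propagation, loss value, and back propagation defined above can be tied together with a toy Python sketch. It assumes a single linear unit and a squared-error loss, neither of which is prescribed by the application; the function name, learning rate, and data are hypothetical.

```python
def train_step(w, b, x, target, lr=0.1):
    """One training step for the toy model y = w * x + b."""
    # Forward propagation: compute the output from the input.
    y = w * x + b
    # Loss value: squared deviation between output and expected result.
    error = y - target
    loss = error ** 2
    # Back propagation: gradients of the loss with respect to w and b,
    # followed by an update that reduces the error.
    grad_w = 2 * error * x
    grad_b = 2 * error
    return w - lr * grad_w, b - lr * grad_b, loss


w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=1.0, target=2.0)
    losses.append(loss)
print(losses[0], losses[-1])
```

The first recorded loss is large because the initial output is far from the expected result, and the loss shrinks as the updates bring the output closer, which matches the relation between deviation and loss value described in the definition.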
To classify a specific object into a category, schemes in the related art generally match the text information of the object against each category in the last category level, and take the category with the greatest text similarity as the category to which the object belongs. However, the inventors have found that, on the one hand, text information is one-sided and carries too little information, so judging the category from text alone is highly prone to confusion; on the other hand, classifying by the last category level alone gives low accuracy, and objects are easily assigned to the wrong category.
In view of this, the embodiments of the present application provide an object classification method, a classification model training method, an apparatus, an electronic device, and a computer-readable storage medium, which enable a classification model to have a multi-level classification capability by means of machine learning, thereby improving the accuracy of object classification. An exemplary application of the electronic device provided by the embodiment of the present application is described below, where the electronic device provided by the embodiment of the present application may be implemented as various types of terminal devices, and may also be implemented as a server.
Referring to fig. 1, fig. 1 is a schematic diagram of an architecture of a classification system 100 according to an embodiment of the present application, a terminal device 400 is connected to a server 200 through a network 300, and the server 200 is connected to a database 500, where the network 300 may be a wide area network or a local area network, or a combination of the two.
In some embodiments, taking the case where the electronic device is a server as an example, the classification model training method provided in the embodiments of the present application may be implemented by the server. For example, the server 200 acquires sample description information and sample category information of a sample object from the terminal device 400 or the database 500, the sample category information including the sample categories to which the sample object belongs at a plurality of category levels, respectively. Then, the server 200 invokes a classification model to perform hierarchical classification processing based on the sample description information and the sample category information, so as to obtain classification results at the plurality of category levels; performs loss calculation according to the sample category and the classification result of the same category level to obtain a level loss, and performs multi-level fusion processing on the level losses corresponding to the plurality of category levels to obtain a multi-level loss; and trains the classification model according to the multi-level loss. The trained classification model is used for predicting classification results of an object to be tested at the plurality of category levels.
In some embodiments, taking the case where the electronic device is a server as an example, the object classification method provided in the embodiments of the present application may be implemented by the server. For example, the server 200 may acquire preset category information (which may, for example, be stored in advance locally on the server 200 or in the database 500) and acquire the description information to be tested of the object to be tested from the terminal device 400 or the database 500. Then, the server 200 invokes the trained classification model to perform hierarchical classification processing based on the description information to be tested and the preset category information, obtaining classification results of the object to be tested at multiple category levels, and determines the target categories of the object to be tested at those category levels according to the classification results. The server 200 may store the resulting target categories in the database 500 or transmit them to the terminal device 400.
In some embodiments, taking an example that the electronic device is a terminal device, the object classification method provided in the embodiments of the present application may be implemented by the terminal device. For example, the server 200 may send the trained classification model to the terminal device 400, such that the terminal device 400 deploys the trained classification model locally. The terminal device 400 may obtain preset category information and description information of the object to be tested, call the trained classification model to perform hierarchical classification processing based on the description information and the preset category information, obtain classification results of the object to be tested at multiple category levels, and determine target categories of the object to be tested at multiple category levels according to the classification results of the object to be tested at multiple category levels.
In some embodiments, the terminal device 400 or the server 200 may implement the classification model training method or the object classification method provided in the embodiments of the present application by running a computer program. For example, the computer program may be a native program or a software module in an operating system; a native application (APP), i.e., a program that needs to be installed in the operating system to run, such as an e-commerce platform application (e.g., client 410 in fig. 1); an applet, i.e., a program that only needs to be downloaded into a browser environment to run; or an applet that can be embedded in any APP, where the applet can be run or closed under user control. In general, the computer program described above may be any form of application, module or plug-in.
In some embodiments, the server 200 may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDNs), and basic cloud computing services such as big data and artificial intelligence platforms, where the cloud services may be classification model training services or object classification services for the terminal device 400 to call. The terminal device 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart television, a smart watch, etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiments of the present application.
Taking the electronic device provided in the embodiment of the application as a server as an example, it can be understood that, for the case that the electronic device is a terminal device, modules such as a user interface, a presentation module, an input processing module and the like may also be included on the basis of fig. 2A. Referring to fig. 2A, fig. 2A is a schematic structural diagram of a server 200 provided in an embodiment of the present application, and the server 200 shown in fig. 2A includes: at least one processor 210, a memory 250, and at least one network interface 220. The various components in server 200 are coupled together by bus system 240. It is understood that the bus system 240 is used to enable connected communications between these components. The bus system 240 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 240 in fig. 2A.
The processor 210 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor (for example, a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 250 optionally includes one or more storage devices physically located remote from processor 210.
Memory 250 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory). The memory 250 described in embodiments of the present application is intended to comprise any suitable type of memory.
In some embodiments, memory 250 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 251 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
network communication module 252 for reaching other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including: Bluetooth, Wireless Fidelity (WiFi), universal serial bus (USB, Universal Serial Bus), etc.;
In some embodiments, the classification model training apparatus provided in the embodiments of the present application may be implemented in software. Fig. 2A shows a classification model training apparatus 2551 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: a sample acquisition module 25511, a sample processing module 25512, a loss calculation module 25513, and a training module 25514. These modules are logical, and thus may be arbitrarily combined or further split according to the functions implemented. The functions of the respective modules will be described hereinafter.
In some embodiments, the object classification device provided in the embodiments of the present application may also be implemented in software. Fig. 2B shows an object classification device 2552 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: a target acquisition module 25521, a target processing module 25522, and a determination module 25523. These modules are logical, and thus may be arbitrarily combined or further split according to the functions implemented. The functions of the respective modules will be described hereinafter. It should be noted that, except for the object classification device 2552 shown in fig. 2B, the rest of the structure may be the same as in fig. 2A.
The classification model training method provided by the embodiment of the application will be described in connection with exemplary applications and implementations of the electronic device provided by the embodiment of the application.
Referring to fig. 3A, fig. 3A is a schematic flow chart of a classification model training method according to an embodiment of the present application, and will be described with reference to the steps shown in fig. 3A.
In step 101, sample description information and sample category information of a sample object are obtained; the sample category information includes sample categories for sample objects at multiple category levels.
In the embodiment of the present application, the category system is pre-established, and the category system includes a plurality of category levels, where each category level includes a plurality of preset categories, and there is an association relationship between the categories of different category levels, that is, there is an association relationship between a category in a category level and at least one category in a next category level. Based on the established category system, the specific objects need to be classified into the correct categories, so that the related data of the sample objects can be used as training data to train the classification model, so that the classification model has the capability of accurately classifying, wherein the sample objects refer to the objects marked with the categories.
The related data of the sample object comprises sample description information and sample category information, wherein the sample description information is used for describing the sample object, and the sample description information can be single-mode information such as text information; multimodal information is also possible, including text information, picture information, etc. The sample category information includes sample categories to which the sample object belongs at each category level, and the sample categories refer to tagged categories that are deemed correct.
Taking an e-commerce scenario as an example, an e-commerce category system may include three category levels. The first category level may include categories such as "clothes", "food", and "appliances". For the "clothes" category, the next level (the second category level) may include the "jacket" and "pants" categories, i.e., the "clothes" category has an association relationship with the "jacket" and "pants" categories; likewise, for the "jacket" category, the next level (the third category level) may include the "T-shirt" and "down jacket" categories. In this scenario, the sample object may be a commodity whose categories have been manually labeled, the sample description information may include pictures, text (such as a commodity introduction) and the like of the commodity, and the sample category information includes the sample category to which the sample object belongs at each category level of the e-commerce category system. For example, the sample category information may be "clothes-jacket-T-shirt", where the sample category "clothes" corresponds to the first category level, the sample category "jacket" corresponds to the second category level, and the sample category "T-shirt" corresponds to the third category level.
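By way of illustration only, the three-level category system of this example could be held in a nested mapping, and a sample category string could be split into one sample category per category level. The following is a minimal sketch; the names `CATEGORY_SYSTEM` and `split_sample_categories` are hypothetical and are not part of the embodiments, and the children of "pants", "food" and "appliances" are left empty because the example does not name them.

```python
# Hypothetical nested mapping for the three-level e-commerce example.
# Level 1 -> level 2 -> list of level-3 categories.
CATEGORY_SYSTEM = {
    "clothes": {"jacket": ["T-shirt", "down jacket"], "pants": []},
    "food": {},
    "appliances": {},
}


def split_sample_categories(category_info, system=CATEGORY_SYSTEM):
    """Split 'level1-level2-level3' into one sample category per level.

    maxsplit=2 keeps hyphenated leaf names such as 'T-shirt' intact.
    Raises ValueError if the path is not present in the category system.
    """
    parts = category_info.split("-", 2)
    if len(parts) != 3:
        raise ValueError("expected three category levels: " + category_info)
    level1, level2, level3 = parts
    if level1 not in system:
        raise ValueError("unknown first-level category: " + level1)
    if level2 not in system[level1]:
        raise ValueError("unknown second-level category: " + level2)
    if level3 not in system[level1][level2]:
        raise ValueError("unknown third-level category: " + level3)
    return [level1, level2, level3]


sample_categories = split_sample_categories("clothes-jacket-T-shirt")
```

The validation step reflects the association relationship between adjacent category levels: a lower-level category is accepted only under the higher-level category it is associated with.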
In step 102, a hierarchical classification process is performed by a classification model based on the sample description information and the sample category information, so as to obtain classification results at multiple category levels.
Here, the sample description information and the sample category information are used as input data of the classification model, and forward propagation processing is performed in the classification model, that is, hierarchical classification processing is performed by the classification model based on the sample description information and the sample category information, and the obtained output result is a classification result at each category level. The classification model may be an artificial neural network model, and the forward propagation processing refers to processing input data of a network layer according to a sequence from front to back of the network layer in the classification model.
In step 103, a loss calculation is performed according to the sample category and the classification result of the category hierarchy to obtain a hierarchy loss, and a multi-level fusion process is performed on the hierarchy losses corresponding to the category hierarchies to obtain a multi-level loss.
For each category level, calculating the level loss of the category level according to the classification result of the category level and the sample category in the sample category information, wherein the level loss represents the error between the output result and the expected result of the classification model in the category level. The loss function used in the loss calculation in the embodiment of the present application is not limited, and may be various loss functions for solving the classification problem, such as a cross entropy loss function.
Therefore, the hierarchy loss corresponding to the category hierarchies can be obtained, and then the hierarchy losses are subjected to multi-hierarchy fusion processing to obtain multi-hierarchy losses, wherein the multi-hierarchy losses represent the total error of the classification model in all category hierarchies. The method of the multi-level fusion processing in the embodiment of the present application is not limited, and may be, for example, weighted summation processing, and weights of various category levels may be set according to actual application scenarios.
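As a non-limiting sketch of the loss calculation and multi-level fusion described above, the example below uses a cross-entropy loss per category level and a weighted sum across levels. The function names, the example weights, and the second-level probabilities are assumptions chosen only for illustration.

```python
import math


def cross_entropy(label, probs, eps=1e-12):
    """Cross-entropy between a one-hot label and a probability distribution."""
    return -sum(y * math.log(p + eps) for y, p in zip(label, probs))


def multi_level_loss(labels, results, weights):
    """Return (multi-level loss, per-level losses).

    labels[i] and results[i] are the label and classification result of
    category level i; weights[i] is the fusion weight of that level.
    """
    level_losses = [cross_entropy(y, p) for y, p in zip(labels, results)]
    total = sum(w * loss for w, loss in zip(weights, level_losses))
    return total, level_losses


# Two category levels with 3 and 2 preset categories (illustrative values).
labels = [[0, 1, 0], [1, 0]]
results = [[0.1, 0.7, 0.2], [0.6, 0.4]]
total, per_level = multi_level_loss(labels, results, weights=[1.0, 0.5])
```

Setting all weights to 1 reduces the weighted sum to a direct sum of the level losses.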
In step 104, the classification model is trained according to the multi-level loss; the trained classification model is used for predicting classification results of the object to be tested at a plurality of category levels.
For example, once the multi-level loss is obtained, back propagation is performed on the classification model according to the multi-level loss, and the weight parameters and biases of the classification model are updated during back propagation, thereby training the classification model. During back propagation, a gradient descent algorithm may be employed to find the weight parameters and biases that minimize the multi-level loss. The trained classification model has the capability of accurately classifying objects level by level, so that the classification result of the object to be tested at each category level can be predicted by the trained classification model.
In some embodiments, when training the classification model according to the multi-level loss, further comprising: and stopping training the classification model when the preset stopping condition is met. Here, the stop condition may be set in advance, for example, the stop condition may be that the number of training rounds of the classification model reaches a training round number threshold, or the performance index of the classification model reaches a performance index threshold, and the performance index may be an accuracy rate, an F1-score, or the like. Through the mode, the training process can be accurately controlled, and the waste of computing resources is reduced on the premise of guaranteeing the performance of the classification model.
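The two stop conditions mentioned above can be sketched as a simple loop around a gradient descent update. This is a toy illustration, not the training procedure of the embodiments: the single-parameter quadratic loss stands in for the multi-level loss, and `train`, `grad_fn`, `metric_fn` and the threshold values are hypothetical.

```python
def train(params, grad_fn, metric_fn, lr=0.1, max_rounds=100, metric_threshold=0.99):
    """Gradient descent that stops on a round threshold or a metric threshold."""
    for round_idx in range(1, max_rounds + 1):
        grads = grad_fn(params)
        # Gradient descent step: move each parameter against its gradient.
        params = [p - lr * g for p, g in zip(params, grads)]
        if metric_fn(params) >= metric_threshold:
            return params, round_idx, "metric threshold reached"
    return params, max_rounds, "round threshold reached"


# Toy stand-in for the multi-level loss: loss(w) = (w - 3)^2.
def toy_grad(params):
    return [2.0 * (params[0] - 3.0)]


# Toy performance index in (0, 1]; it approaches 1 as the loss approaches 0.
def toy_metric(params):
    return 1.0 / (1.0 + (params[0] - 3.0) ** 2)


final_params, rounds_used, reason = train([0.0], toy_grad, toy_metric)
```

In practice the performance index would be an accuracy rate or F1-score computed on held-out data rather than a function of the training loss.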
As shown in fig. 3A, the embodiment of the present application comprehensively considers the level loss of each category level, so that the trained classification model has the capability of accurately classifying level by level, and can improve the classification accuracy, and avoid the problem of high classification error rate caused by only depending on the last category level.
In some embodiments, referring to fig. 3B, fig. 3B is a schematic flow chart of a classification model training method provided in the embodiments of the present application, and step 102 shown in fig. 3A may be implemented by steps 201 to 202, which will be described in connection with the steps.
In step 201, sample description features are extracted from the sample description information, and sample category features are extracted from the sample category information.
Here, feature extraction is performed on the sample description information to obtain the sample description features, and on the sample category information to obtain the sample category features. The feature extraction manner is not limited in the embodiments of the present application and may be implemented by a specific model corresponding to the information type (where the specific model is a part of the classification model). For example, for text (e.g., text in the sample description information, or sample category information embodied in text form), feature extraction may be implemented by a text feature extraction model, such as a text Transformer model; for pictures (e.g., pictures in the sample description information), feature extraction may be implemented by a picture feature extraction model, such as a convolutional neural network (Convolutional Neural Network, CNN) or a Vision Transformer model. It should be noted that, for the sample description information and the sample category information, the types of the models implementing feature extraction may be the same or partially the same, but the weight parameters and biases in the models are different.
In some embodiments, the sample description information includes description information of a plurality of modalities; the extraction of sample description features from sample description information described above may be achieved in such a way: extracting a mode description characteristic from the description information of each mode; and carrying out feature fusion processing on the mode description features respectively corresponding to the modes to obtain sample description features.
The sample description information may include description information of a single modality, such as including only picture information or text information; the method can also comprise description information of a plurality of modes, such as picture information, text information, sound information and the like, and the description information of the plurality of modes can describe the sample object more fully.
In the case that the sample description information includes description information of a plurality of modes, the mode description features may be extracted from the description information of each mode, and then feature fusion processing is performed on all the mode description features to obtain the sample description features, where a manner of the feature fusion processing is not limited, and may be direct summation or weighted summation, for example. Through the multi-mode fusion, the comprehensiveness of the obtained sample description characteristics can be improved, so that the model training effect is improved, and the data rule between each mode and category is fully learned.
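The feature fusion processing described above (direct summation or weighted summation of the modality description features) can be sketched as follows. The function name `fuse_features` and the example vectors are hypothetical, and the sketch assumes that all modality description features already have the same dimension.

```python
def fuse_features(modality_features, weights=None):
    """Fuse per-modality feature vectors into one sample description feature.

    With weights=None this is a direct summation; otherwise a weighted sum.
    """
    if weights is None:
        weights = [1.0] * len(modality_features)
    dim = len(modality_features[0])
    if any(len(f) != dim for f in modality_features):
        raise ValueError("all modality features must share one dimension")
    return [
        sum(w * f[i] for w, f in zip(weights, modality_features))
        for i in range(dim)
    ]


text_feature = [1.0, 2.0]   # e.g. from a text feature extraction model
image_feature = [3.0, 4.0]  # e.g. from a picture feature extraction model
summed = fuse_features([text_feature, image_feature])
weighted = fuse_features([text_feature, image_feature], weights=[0.5, 0.5])
```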
In step 202, for any category level, a category probability distribution at that category level is calculated based on the sample description features and the sample category features, and is used as the classification result at that category level.
Here, for each category level, the category probability distribution is calculated according to the sample description features and the sample category features, that is, the probability corresponding to each preset category in the category level is calculated. Accordingly, the dimension of the category probability distribution of a category level (when the distribution is embodied in vector form) is the same as the number of preset categories of that category level. It is worth noting that the process of obtaining the category probability distribution is independent for different category levels; the category probability distributions of the category levels may be determined sequentially, in order from the first category level to the last category level in the category system.
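The text does not fix how the category probability distribution is computed from the two kinds of features. One possible realization, given here purely as an assumption, scores the description feature against one feature vector per preset category of the level using a dot product and converts the scores to probabilities with a softmax. The names `softmax` and `level_distribution` and the example vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def level_distribution(description_feature, category_features):
    """Category probability distribution for one category level.

    Assumes one feature vector per preset category of the level, so the
    output dimension equals the number of preset categories of that level.
    """
    scores = [
        sum(d * c for d, c in zip(description_feature, cat_feature))
        for cat_feature in category_features
    ]
    return softmax(scores)


# One level with three preset categories (illustrative feature values).
probs = level_distribution([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

Calling `level_distribution` once per category level, each time with that level's category features, keeps the per-level computations independent, as noted above.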
In some embodiments, extracting the sample description feature from the sample description information, and after extracting the sample category feature from the sample category information, further comprises: performing linear projection processing on the sample description characteristics and the sample category characteristics; and carrying out normalization processing on the linear projected sample description characteristics and sample category characteristics.
Here, the sample description features and the sample category features may be linearly projected, where the purpose of the linear projection is to project the sample description features and the sample category features to the same dimension, so as to facilitate subsequent computation. On the basis of the linear projection, the projected sample description features and sample category features can be normalized, where normalization means converting the dimensional values in the features into dimensionless values, i.e., into scalars, so that the comparability among different features is enhanced and the model training effect is improved. The normalization method is not limited in the embodiments of the present application and may be, for example, L1 normalization or L2 normalization. In this way, the sample description features and the sample category features become more suitable for computation and comparison, and the model training effect can be effectively enhanced.
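A minimal sketch of the linear projection followed by L2 normalization is given below. The projection matrix and bias shown are illustrative placeholders (in a trained model they would be learned), and the function names are hypothetical.

```python
import math


def linear_project(feature, weight, bias):
    """Project a feature with a (out_dim x in_dim) matrix plus a bias vector."""
    return [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weight, bias)
    ]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against a zero vector)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


# Project a 3-dimensional feature to 2 dimensions, then normalize it.
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]
projected = linear_project([3.0, 4.0, 0.0], weight, bias)
normalized = l2_normalize(projected)
```

Using separate projection matrices for the description features and the category features, each with the same output dimension, places both in a common space where they can be compared directly.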
As shown in fig. 3B, by extracting features in the information and predicting the category probability distribution of each category level based on the extracted features, the classification can be realized while being adapted to the actual situation of each category level.
In some embodiments, referring to fig. 3C, fig. 3C is a schematic flow chart of a classification model training method provided in the embodiments of the present application, and step 102 shown in fig. 3A may be implemented by steps 301 to 302, which will be described in connection with the steps.
In step 301, a plurality of sample description information and a plurality of sample category information are combined to obtain a plurality of information combinations; each combination of information includes a sample description information and a sample category information.
In the embodiments of the present application, the sample description information and the sample category information of a sample object can be regarded as a positive sample for model training. If model training is performed only on positive samples, the classification model easily overfits, which greatly reduces its generalization capability. Therefore, negative samples can be constructed for model training based on the idea of contrastive learning.
For example, when there are a plurality of sample objects, a plurality of pieces of sample description information and a plurality of pieces of sample category information may be obtained. Combination processing is then performed on them to obtain a plurality of information combinations, where each information combination includes one piece of sample description information and one piece of sample category information. The combination processing may be exhaustive; for example, if there are N pieces of sample description information and N pieces of sample category information, N×N information combinations are obtained.
For ease of understanding, an illustration is made. The sample object comprises a sample object A and a sample object B, wherein the sample object A corresponds to sample description information A1 and sample category information A2, and the sample object B corresponds to sample description information B1 and sample category information B2, so that four information combinations, namely A1-A2, A1-B2, B1-A2 and B1-B2, can be obtained after combination processing. For the obtained information combination, there are two cases, the first case is that the sample description information and the sample category information in the information combination correspond to the same sample object (such as A1-A2 and B1-B2), and the information combination conforming to the case is a positive sample; in the second case, the sample description information and the sample category information in the information combination correspond to different sample objects (such as A1-B2 and B1-A2), and the information combination conforming to the case is a negative sample.
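The exhaustive combination in this illustration can be sketched as follows. The tuple layout `(object_id, description, category_info)` and the function name `build_combinations` are assumptions made for the sketch.

```python
from itertools import product


def build_combinations(samples):
    """Pair every sample description with every sample category information.

    samples: list of (object_id, description, category_info) tuples.
    A combination is positive when both parts come from the same object.
    """
    combos = []
    for (desc_id, desc, _), (cat_id, _, cat) in product(samples, repeat=2):
        combos.append({
            "description": desc,
            "category": cat,
            "is_positive": desc_id == cat_id,
        })
    return combos


samples = [("A", "A1", "A2"), ("B", "B1", "B2")]
combos = build_combinations(samples)
```

With the two sample objects A and B, this yields the four combinations A1-A2, A1-B2, B1-A2 and B1-B2, of which A1-A2 and B1-B2 are positive samples.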
In step 302, a hierarchical classification process is performed based on the information combination by a classification model, so as to obtain classification results of the information combination at multiple category levels.
And for each information combination, calling a classification model to perform hierarchical classification processing based on sample description information and sample category information in the information combination, namely performing forward propagation processing to obtain a classification result of the information combination in each category level.
In fig. 3C, step 103 shown in fig. 3A can be implemented by steps 303 to 306, and the description will be made in connection with each step.
In step 303, a category label of the information combination at any one category level is determined according to the sample category of the target sample object corresponding to the information combination at any one category level.
Here, after determining the classification result at each category level corresponding to each information combination, the level loss is calculated at each category level based on the classification result, and for ease of understanding, the first category level will be described below as an example.
First, for each information combination, a category label of the information combination at the first category level is determined according to the sample category, at the first category level, of the target sample object corresponding to the information combination, wherein the category label is the expected result.
The form of the category label is not limited. For example, the category label may be a one-hot vector, i.e., a vector in which exactly one element has the value 1 and all other elements have the value 0. On this basis, in the process of determining the level loss of the first category level, the dimension of the category label is equal to the number of preset categories in the first category level, each element in the category label corresponds to one preset category in the first category level, and the element with the value 1 corresponds to the sample category of the target sample object at the first category level. Correspondingly, the classification result of the first category level may also be in vector form (such as a category probability distribution embodied in vector form) with the same dimension as the category label of the first category level; the values of the elements in the classification result sum to 1, and the value of each element represents the probability of belonging to the corresponding preset category.
For example, if the first category level has 3 preset categories, namely "clothes", "food", and "electric appliance", then in the process of determining the level loss of the first category level, the category label has dimension 3 and takes the form [element 1, element 2, element 3], where element 1 corresponds to the category "clothes", element 2 to the category "food", and element 3 to the category "electric appliance"; of course, the correspondence between elements and categories may be adjusted and is not limited to this example. By way of example, if the classification result of a certain information combination at the first category level is [0.1, 0.7, 0.2], this indicates that the information combination has a 10% probability of belonging to "clothes", a 70% probability of belonging to "food", and a 20% probability of belonging to "electric appliance".
In some embodiments, when the sample description information and the sample category information in the information combination correspond to the same sample object, determining the same sample object as a target sample object corresponding to the information combination; when the sample description information and the sample category information in the information combination correspond to different sample objects, determining the target sample object corresponding to the information combination as no object; wherein, the sample category of the no object in any category hierarchy is no category.
When the sample description information and the sample category information in the information combination correspond to the same sample object, the information combination is a positive sample; that sample object is determined as the target sample object corresponding to the information combination, and the category label is determined according to the sample category of the target sample object. Continuing the foregoing example, if the sample category of the target sample object at the first category level is "clothes", the category label of the corresponding information combination at the first category level is determined to be [1, 0, 0].
When the sample description information and the sample category information in the information combination correspond to different sample objects, the information combination is a negative sample, and the target sample object corresponding to the information combination is determined to be no object. For no object, the category label may be fixed, e.g., a vector whose elements are all 0. Continuing the foregoing example, the sample category of no object at the first category level is no category, and the category label of the corresponding information combination at the first category level is determined to be [0, 0, 0].
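The labeling rule for positive and negative samples can be sketched as below for a first category level with the three preset categories of the foregoing example, so that every label has three elements. The function name `category_label` is hypothetical, and `None` is used here as an assumed stand-in for "no category".

```python
LEVEL1_CATEGORIES = ["clothes", "food", "electric appliance"]


def category_label(sample_category, preset_categories):
    """Build the category label of an information combination at one level.

    A positive sample gets a one-hot vector marking its sample category;
    a negative sample (sample_category is None, i.e. no category) gets a
    vector whose elements are all 0.
    """
    label = [0] * len(preset_categories)
    if sample_category is not None:
        label[preset_categories.index(sample_category)] = 1
    return label


positive_label = category_label("clothes", LEVEL1_CATEGORIES)
negative_label = category_label(None, LEVEL1_CATEGORIES)
```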
Through the mode, the positive sample and the negative sample can be accurately divided, the expansion of training data can be accurately and effectively realized, and meanwhile, the accurate category label can be obtained.
In step 304, the difference between the category labels and the classification results of the information combination at any one category level is determined as an information combination loss.
For each information combination, a difference between the category label (expected result) of the information combination at the first category level and the classification result (output result) of the information combination at the first category level is determined as an information combination loss, for example, the category label of the information combination at the first category level and the classification result of the information combination at the first category level may be substituted into a loss function to obtain the information combination loss.
In step 305, information combination fusion processing is performed on the information combination losses corresponding to the plurality of information combinations, so as to obtain a hierarchy loss of any category hierarchy.
Here, after obtaining the information combination loss of each information combination at the first category level, the information combination loss is subjected to information combination fusion processing to obtain the level loss of the first category level. The method of the information combination fusion processing in the embodiment of the present application is not limited, and may be, for example, direct summation or weighted summation.
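As a sketch only, the information combination loss and its fusion into a level loss could look like the following. The loss function is an assumption: an element-wise binary cross-entropy is used because a plain categorical cross-entropy evaluates to zero for an all-zero label and would therefore ignore negative samples. The function names and example values are hypothetical.

```python
import math


def binary_cross_entropy(label, probs, eps=1e-12):
    """Element-wise binary cross-entropy between a label and a result vector."""
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for y, p in zip(label, probs)
    )


def level_loss(labels, results, weights=None):
    """Fuse the information combination losses of one category level.

    With weights=None this is a direct summation; otherwise a weighted sum.
    """
    losses = [binary_cross_entropy(y, p) for y, p in zip(labels, results)]
    if weights is None:
        weights = [1.0] * len(losses)
    return sum(w * loss for w, loss in zip(weights, losses))


# One positive combination ("clothes") and one negative combination.
labels = [[1, 0, 0], [0, 0, 0]]
results = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2]]
first_level_loss = level_loss(labels, results)
```

The same function can be applied at each remaining category level, after which the resulting level losses are fused into the multi-level loss.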
The above is merely an example of the process of determining the level loss of the first category level; the level losses of the remaining category levels may be determined in the same way.
In step 306, multi-level fusion processing is performed on the level losses corresponding to the plurality of category levels, so as to obtain multi-level losses.
As shown in fig. 3C, the embodiment of the application generates a plurality of information combinations based on the idea of contrastive learning and determines the corresponding category labels. This enriches the training data and allows the classification model to be fully trained on both positive samples and negative samples, so that the generalization capability of the trained classification model is effectively improved and overfitting is avoided.
In some embodiments, referring to fig. 4, fig. 4 is a schematic flow chart of an object classification method according to an embodiment of the present application, and description will be made with reference to each step shown.
In step 401, preset category information and description information to be tested of an object to be tested are obtained; the preset category information comprises preset categories of a plurality of category levels.
Here, the classification result of the object to be tested at each category level may be predicted by the classification model, which can be obtained through training by the classification model training method provided by the embodiments of the present application. First, preset category information and description information to be tested of the object to be tested are obtained. The preset category information may be determined based on the category system; for example, it may include all preset categories of each category level in the category system. Depending on the requirements of the actual application scenario, only some of the preset categories in the category system may be added to the preset category information; for example, preset categories that do not need to be classified need not be added.
It should be noted that the description information to be tested and the sample description information are of the same type; for example, both include only picture information, or both include picture information and text information.
In step 402, a hierarchical classification process is performed based on the description information to be tested and the preset category information through the classification model, so as to obtain classification results of the object to be tested at multiple category levels.
Here, the classification model is called to perform hierarchical classification processing based on the description information to be detected and the preset category information, so as to obtain classification results of the object to be detected at each category level. Here, the hierarchical classification may refer to classification in order from the first category level to the last category level.
In some embodiments, the above-mentioned hierarchical classification processing based on the description information to be detected and the preset category information through the classification model, which obtains classification results of the object to be detected at multiple category levels, may be implemented as follows. The following processing is performed by the classification model: extracting description features to be detected from the description information to be detected, and extracting preset category features from the preset categories included in the preset category information; calculating, according to the description features to be detected and a preset category feature, the probability that the object to be detected belongs to the preset category corresponding to that preset category feature; and, for any category level, determining the probabilities that the object to be detected belongs to each of the plurality of preset categories of that category level as the classification result of the object to be detected at that category level.
Here, the probability that the object to be detected belongs to a preset category is predicted individually for each preset category in the preset category information. First, description features to be detected are extracted from the description information to be detected, and preset category features are extracted from the preset categories included in the preset category information; preset categories and preset category features are in a one-to-one relation. It should be noted that "preset category feature" here means a feature of a preset category; the feature itself is not preset.
Then, the probability that the object to be detected belongs to the preset category corresponding to a preset category feature is calculated according to the description features to be detected and that preset category feature. If a category probability distribution was calculated from the sample description feature and the sample category feature when training the classification model, a category probability distribution may likewise be calculated here from the description features to be detected and the preset category feature, and the probability corresponding to the preset category (that is, the preset category corresponding to the preset category feature) may be looked up in that distribution.
For example, among the preset category information, the first category level has 3 preset categories, respectively, "clothes", "food", "electric appliance". Then, for the category of 'clothes', the characteristics of the category of 'clothes' are extracted, and then the probability distribution of the category is calculated according to the description characteristics to be tested and the characteristics of the category of 'clothes'. Since the "clothes" category is located in the first category level, the calculated dimension of the category probability distribution is equal to the number of preset categories of the preset category information in the first category level, that is, the dimension is 3, and each element in the category probability distribution corresponds to one preset category of the preset category information in the first category level. If the elements in the category probability distribution correspond to "clothes", "food" and "electric appliance" in sequence from front to back, and the category probability distribution is specifically [0.6,0.3,0.1], the probability that the object to be measured belongs to the category of "clothes" can be obtained to be 60%, and as for the numerical values of other elements in the category probability distribution, attention is not required in the process of obtaining the probability that the object to be measured belongs to the category of "clothes".
Thus, for any category level, the probability that the object to be detected belongs to each preset category of that category level can be obtained, and these probabilities are the classification result of the object to be detected at that category level. In this way, the probability that the object to be detected belongs to the corresponding preset category can be accurately solved based on the description features to be detected and the preset category features, and accurate prediction is realized for each preset category.
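The following sketch shows one way the probabilities at a single category level could be computed from a description feature and the preset category features. The embodiment does not state how the features are turned into probabilities; the dot-product similarity followed by a softmax used here, the function name `level_probabilities` and the toy feature values are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def level_probabilities(desc_feature: List[float],
                        category_features: Dict[str, List[float]]) -> Dict[str, float]:
    """Probability of each preset category of one category level.

    Assumption: similarity is the dot product between the description feature
    and each preset category feature, and a softmax turns the similarities
    into a probability distribution over the level's preset categories.
    """
    names = list(category_features)
    scores = [sum(d * c for d, c in zip(desc_feature, category_features[n]))
              for n in names]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}


# Toy two-dimensional features for the three first-level categories.
first_level = {
    "clothes": [1.0, 0.0],
    "food": [0.0, 1.0],
    "electric appliance": [-1.0, 0.0],
}
probs = level_probabilities([0.8, 0.6], first_level)
```

The returned dictionary plays the role of the classification result at that category level: one probability per preset category, summing to 1.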
In some embodiments, when the hierarchical classification processing is performed based on the description information to be tested and the preset category information through the classification model, the method further includes: for any one category level, the following processing is performed: screening a plurality of preset categories of any category hierarchy according to the classification result of the previous category hierarchy of any category hierarchy; and updating the preset category information according to the screened preset categories.
In the hierarchical classification process according to the order from the first category hierarchy to the last category hierarchy, for any category hierarchy except the first category hierarchy, a plurality of preset categories of any category hierarchy can be screened according to the classification result of the previous category hierarchy of the any category hierarchy.
For easy understanding, taking the second category level as an example, for the second category level, a category to which the object to be measured belongs in the first category level (for convenience of distinguishing, named as a target category) may be determined according to a classification result of the first category level, then, a preset category in the second category level, which has an association relationship with the target category of the first category level, may be screened out, and preset category information may be updated according to the preset category screened out in the second category level, so that the second category level in the preset category information only includes the screened preset category.
For example, in the preset category information, the first category level includes the "clothes" and "food" categories. The "clothes" category is associated with the "coats" and "pants" categories at the second category level, and the "food" category is associated with the "vegetables" and "fruits" categories at the second category level. If the target category of the first category level is "clothes", the "coats" and "pants" categories associated with the "clothes" category are screened out from the second category level, and the preset category information is updated so that it includes only the "coats" and "pants" categories at the second category level.
In this way, layer-by-layer reasoning can be performed based on the category association relations among different category levels, so that the amount of calculation is reduced and the target categories of the different category levels are associated with one another.
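The screening step can be sketched as a simple filter over the next category level. The dictionary `CATEGORY_CHILDREN` and the function `screen_preset_categories` are hypothetical names; the association relations are taken from the example above.

```python
from typing import Dict, List

# Hypothetical table of association relations between a category and the
# categories of the next category level (from the example in the text).
CATEGORY_CHILDREN: Dict[str, List[str]] = {
    "clothes": ["coats", "pants"],
    "food": ["vegetables", "fruits"],
}


def screen_preset_categories(target_category: str,
                             next_level_categories: List[str],
                             children: Dict[str, List[str]] = CATEGORY_CHILDREN) -> List[str]:
    """Keep only the next-level preset categories associated with the target category."""
    allowed = set(children.get(target_category, []))
    return [c for c in next_level_categories if c in allowed]


# Preset category information: one list of preset categories per category level.
preset_category_info = [
    ["clothes", "food"],
    ["coats", "pants", "vegetables", "fruits"],
]
# The target category of the first level is "clothes": update the second level.
preset_category_info[1] = screen_preset_categories("clothes", preset_category_info[1])
```

After the update, the second category level contains only "coats" and "pants", so the model only needs to score two preset categories instead of four.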
In step 403, the target category of the object to be tested in the multiple category levels is determined according to the classification results of the object to be tested in the multiple category levels.
Here, for each category level, the target category of the object to be measured at the category level may be determined according to the classification result of the object to be measured at the category level. For example, for any one category level, when the classification result of the object to be measured in the any one category level refers to the probability that the object to be measured belongs to each preset category of the any one category level, the preset category corresponding to the probability with the largest value can be taken as the target category.
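Selecting the preset category with the largest probability can be written in a few lines. The function name `target_category` is an assumption; the probabilities reuse the [0.6, 0.3, 0.1] example given earlier for the first category level.

```python
from typing import Dict


def target_category(classification_result: Dict[str, float]) -> str:
    """Return the preset category whose probability is largest."""
    return max(classification_result, key=classification_result.get)


result = {"clothes": 0.6, "food": 0.3, "electric appliance": 0.1}
chosen = target_category(result)
```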
It should be noted that, in some embodiments, step 403 may be performed during the execution of step 402, for example, after obtaining a classification result of an object to be measured at a category level, determining a target category of the object to be measured at the category level, so as to determine a classification result of the object to be measured at a next category level based on an inference principle.
As shown in fig. 4, in the embodiment of the present application, by performing a hierarchical classification process, the situations of each category hierarchy are comprehensively considered, so that the target category of the object to be measured in each category hierarchy can be accurately obtained, and the object to be measured is accurately classified according to the category system.
In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described, and for convenience of understanding, a commodity classification scenario will be exemplified.
Step one: training data of the classification model is constructed.
Here, the data input into the classification model includes three types of information, which are respectively picture information, text information and category information, wherein the picture information and the text information are descriptive information of the commodity, the picture information may be the appearance of the commodity, and the text information may include the name of the commodity, the introduction text of the commodity, and the like. Fig. 5 is an example of a display diagram of a commodity in the electronic commerce platform provided in the embodiment of the present application, including a commodity appearance 51, a commodity name 52, a commodity introduction text 53, and category information 54, where the commodity appearance 51 includes six small diagrams on the left side and one large diagram on the right side.
For sample commodities (commodities with marked categories), relevant picture information, text information and category information can be identified from the display diagram of the sample commodities or can be directly obtained from a database of an electronic commerce platform.
In order to improve the training effect of the model, a plurality of sample commodities can be accurately obtained, and sample picture information (picture information of the sample commodity, the same shall apply hereinafter), sample text information and sample category information of each sample commodity can be obtained to be used as training data of a classification model.
Step two: hierarchical classification is performed based on training data by a classification model.
For ease of understanding, description will be given with reference to fig. 6.
Step 1) for each sample commodity, extracting sample category characteristics from sample category information through a first text characteristic extraction model, extracting sample picture characteristics from sample picture information through a picture characteristic extraction model, extracting sample text characteristics from sample text information through a second text characteristic extraction model, and respectively marking the sample picture characteristics, the sample text characteristics and the sample category characteristics as I_f, T_f and C_f.
The first text feature extraction model, the picture feature extraction model and the second text feature extraction model are all part of the classification model. The types of the text feature extraction models (the first text feature extraction model and the second text feature extraction model) and of the picture feature extraction model are not limited; for example, a text feature extraction model may be a text Transformer model, and the picture feature extraction model may be a CNN model or a Vision Transformer model.
Step 2) performing feature fusion processing on the I_f and the T_f of the sample commodity, for example, summing processing can be performed to obtain a sample description feature I_f+T_f.
Step 3) performing linear projection processing on the I_f+T_f and C_f of the sample commodity so as to project them to the same dimension. Then, L2 normalization processing is performed on the linearly projected I_f+T_f and C_f to obtain I (corresponding to I_f+T_f) and T (corresponding to C_f) respectively. The linear projection processing and the L2 normalization processing are not shown in fig. 6.
The L2 normalization formula is as follows:
$$x = [x_1, x_2, \ldots, x_n]$$
$$y = [y_1, y_2, \ldots, y_n], \quad y_i = \frac{x_i}{\sqrt{\sum_{j=1}^{n} x_j^2}}$$
In the above formula, x represents an original feature vector, y represents the feature vector normalized by L2, and n represents the dimension of the feature vector.
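L2 normalization divides every component of a vector by the vector's Euclidean norm, so the result has unit length. The sketch below applies it to a description feature obtained by summing a picture feature and a text feature, as in Step 2); the linear projection of Step 3) is omitted. The function names and the small `eps` guard against division by zero are assumptions of this example.

```python
import math
from typing import List


def fuse_by_sum(i_f: List[float], t_f: List[float]) -> List[float]:
    """Feature fusion by element-wise summation: I_f + T_f."""
    return [a + b for a, b in zip(i_f, t_f)]


def l2_normalize(x: List[float], eps: float = 1e-12) -> List[float]:
    """Divide each component by the Euclidean norm of x.

    eps (an assumption) avoids division by zero for an all-zero vector.
    """
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]


description = fuse_by_sum([1.0, 2.0], [2.0, 2.0])  # [3.0, 4.0]
normalized = l2_normalize(description)             # norm is 5, giving [0.6, 0.8]
```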
Step 4) carrying out exhaustive combination processing on the plurality of I and T to obtain a plurality of feature combinations, wherein each feature combination comprises one I and one T. As shown in fig. 6, a feature combination matrix is formed, wherein I1 represents the I of the first sample commodity, T1 represents the T of the first sample commodity, and so on; I1.T1 represents the feature combination constituted by I1 and T1, and so on.
For each feature combination, a category probability distribution Logits at each category level is calculated from the two features therein, and the category probability distribution can be embodied in the form of a vector.
Step three: multi-level losses are calculated.
For each category level, the level loss is calculated separately, and then all the level losses are fused into a multi-level loss. The category levels may use the same loss function, such as a cross-entropy loss function, which in its standard form is as follows:
$$L = -\frac{1}{N}\sum_{i=1}^{N} \hat{y}^{(i)} \log y^{(i)}$$
In the above formula, L represents the loss value, $y^{(i)}$ represents the output result corresponding to the i-th sample commodity, $\hat{y}^{(i)}$ represents the expected result corresponding to the i-th sample commodity, and N represents the number of sample commodities.
For ease of understanding, the process of finding the level loss of the first category level is illustrated. In the feature combination matrix, if the two features in a feature combination correspond to the same sample commodity, the feature combination is a positive sample (i.e. the feature combinations shown in bold in fig. 6). Taking I1.T1 as an example, the category label Labels corresponding to I1.T1 is a one-hot vector, and the element with the value 1 in the one-hot vector corresponds to the sample category to which the first sample commodity (i.e. the sample commodity shared by the two features in the feature combination) belongs at the first category level. If the two features in a feature combination do not correspond to the same sample commodity, the feature combination is a negative sample, and the corresponding category label Labels is an all-0 vector. Logits is the output result, and Labels is the expected result. On this basis, each feature combination in the feature combination matrix is traversed by rows (i.e. in the order I1.T1 to I1.TN, I2.T1 to I2.TN, ..., IN.T1 to IN.TN), the Logits and Labels corresponding to each traversed feature combination are substituted into the loss function, and the obtained loss values are summed to obtain loss_i_1. Each feature combination in the feature combination matrix is then traversed by columns (i.e. in the order I1.T1 to IN.T1, I1.T2 to IN.T2, ..., I1.TN to IN.TN), the Logits and Labels corresponding to each traversed feature combination are substituted into the loss function, and the obtained loss values are summed to obtain loss_t_1. Finally, the level loss of the first category level is calculated as loss_1 = (loss_i_1 + loss_t_1)/2, which is essentially a process of taking the loss value of each feature combination and summing.
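The row-wise and column-wise traversal can be sketched as below. This is an illustration under stated assumptions, not the loss of the embodiment: the per-combination loss is taken to be an element-wise binary cross-entropy over per-category scores in (0, 1), chosen here so that the all-zero labels of negative samples still contribute to the loss; the names `combination_loss` and `level_loss` and the toy values are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[Vector]]  # matrix[r][c] is the vector for combination I(r+1).T(c+1)


def combination_loss(probs: Vector, label: Vector, eps: float = 1e-12) -> float:
    """Assumed per-combination loss: element-wise binary cross-entropy."""
    total = 0.0
    for p, t in zip(probs, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


def level_loss(probs: Matrix, labels: Matrix) -> float:
    """Level loss of one category level: (loss_i + loss_t) / 2."""
    n = len(probs)
    # Row-wise traversal: I1.T1..I1.TN, I2.T1..I2.TN, ...
    loss_i = sum(combination_loss(probs[r][c], labels[r][c])
                 for r in range(n) for c in range(n))
    # Column-wise traversal: I1.T1..IN.T1, I1.T2..IN.T2, ...
    loss_t = sum(combination_loss(probs[r][c], labels[r][c])
                 for c in range(n) for r in range(n))
    return (loss_i + loss_t) / 2.0


# Two sample commodities, two categories at this level (toy values).
probs = [[[0.9, 0.1], [0.2, 0.1]],
         [[0.1, 0.3], [0.2, 0.7]]]
labels = [[[1.0, 0.0], [0.0, 0.0]],
          [[0.0, 0.0], [0.0, 1.0]]]
loss_1 = level_loss(probs, labels)
```

In this sketch both traversals visit the same set of feature combinations, so `loss_i` and `loss_t` agree up to floating-point rounding and their average equals the sum of the per-combination losses.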
Similarly, the level losses of the subsequent category levels are calculated in sequence. Assuming that there are 5 category levels in total, the multi-level loss is Loss = x1 × loss_1 + x2 × loss_2 + x3 × loss_3 + x4 × loss_4 + x5 × loss_5,
Where loss_2 represents the level loss of the second category level, and so on; x1, x2, x3, x4, x5 are custom parameters that can be determined according to training conditions, and can be set to 3, 2, 1, respectively, for example.
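The weighted fusion of level losses is a plain weighted sum. In the sketch below the function name `multi_level_loss` is an assumption, and the weight and loss values are illustrative only; they are not values given by the embodiment.

```python
from typing import Sequence


def multi_level_loss(level_losses: Sequence[float],
                     weights: Sequence[float]) -> float:
    """Weighted sum x1*loss_1 + x2*loss_2 + ... over all category levels."""
    if len(level_losses) != len(weights):
        raise ValueError("one weight is required per category level")
    return sum(w * l for w, l in zip(weights, level_losses))


# Five category levels with illustrative level losses and equal weights.
loss = multi_level_loss([0.5, 0.4, 0.3, 0.2, 0.1], [1.0, 1.0, 1.0, 1.0, 1.0])
```

Setting all weights to 1 reduces the fusion to direct summation; unequal weights let some category levels influence training more than others.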
Step four: and training a classification model according to the multi-level loss until a stopping condition is met.
Step five: predicting, through the trained classification model and based on an inference principle, the target categories to which the object to be tested belongs at the plurality of category levels. The inference principle means that, once the target category of a category level has been determined, prediction at the next category level is made only within the range of the preset categories associated with that target category.
At least the following technical effects can be achieved by the embodiments of the present application: 1) The advantages of multiple modalities (pictures and texts) are fully exploited by integrating picture and text information, so that compared with a traditional text matching scheme, the model training effect and the classification precision can be improved; 2) By constructing a multi-level classification loss function, classification can be performed level by level, and compared with a scheme that classifies directly at the last category level, the classification precision is higher. For example, classifying in the order "clothes", "coat", "T-shirt" can be more accurate than directly identifying a T-shirt.
The following continues with an exemplary structure in which the classification model training apparatus 2551 provided in the embodiments of the present application is implemented as software modules. In some embodiments, as shown in fig. 2A, the software modules stored in the classification model training apparatus 2551 of the memory 250 may include: a sample acquiring module 25511, configured to acquire sample description information and sample category information of a sample object, where the sample category information includes sample categories of the sample object at a plurality of category levels; a sample processing module 25512, configured to perform hierarchical classification processing based on the sample description information and the sample category information through the classification model, to obtain classification results at the plurality of category levels; a loss calculation module 25513, configured to perform loss calculation according to the sample categories and classification results of the same category level to obtain a level loss, and to perform multi-level fusion processing on the level losses corresponding to the plurality of category levels to obtain a multi-level loss; and a training module 25514, configured to train the classification model based on the multi-level loss, where the trained classification model is used for predicting classification results of the object to be tested at the plurality of category levels.
In some embodiments, sample processing module 25512 is further configured to perform the following processing by the classification model: extracting sample description features from the sample description information and extracting sample category features from the sample category information; and calculating the category probability distribution at any one category level according to the sample description characteristics and the sample category characteristics for any one category level to serve as a classification result at any one category level.
In some embodiments, sample processing module 25512 is further to: performing linear projection processing on the sample description characteristics and the sample category characteristics; and carrying out normalization processing on the linear projected sample description characteristics and sample category characteristics.
In some embodiments, the sample description information includes description information of a plurality of modalities; sample processing module 25512 is also configured to: extracting a mode description characteristic from the description information of each mode; and carrying out feature fusion processing on the mode description features respectively corresponding to the modes to obtain sample description features.
In some embodiments, the number of sample objects includes a plurality; sample processing module 25512 is also configured to: combining the plurality of sample description information and the plurality of sample category information to obtain a plurality of information combinations; each information combination comprises a sample description information and a sample category information; performing hierarchical classification processing based on the information combination through a classification model to obtain classification results of the information combination in a plurality of category levels; the loss calculation module 25513 is further configured to perform, for any category hierarchy, the following processing: determining category labels of the information combination at any category level according to sample categories of the target sample object corresponding to the information combination at any category level; determining the difference between the category label of any category level and the classification result of the information combination as the information combination loss; and carrying out information combination fusion processing on the information combination losses corresponding to the plurality of information combinations respectively to obtain the hierarchy loss of any category hierarchy.
In some embodiments, the loss calculation module 25513 is further to: when the sample description information and the sample category information in the information combination correspond to the same sample object, determining the same sample object as a target sample object corresponding to the information combination; when the sample description information and the sample category information in the information combination correspond to different sample objects, determining the target sample object corresponding to the information combination as no object; wherein, the sample category of the no object in any category hierarchy is no category.
In some embodiments, as shown in fig. 2B, the software modules stored in the object classification device 2552 of the memory 250 may include: the target obtaining module 25521 is configured to obtain preset category information and description information to be tested of the object to be tested; the preset category information comprises preset categories of each category level in the plurality of category levels; the target processing module 25522 is used for performing hierarchical classification processing based on the description information to be detected and the preset category information through the classification model to obtain classification results of the object to be detected in a plurality of category levels; the determining module 25523 is configured to determine categories of the object to be tested at multiple category levels according to classification results of the object to be tested at multiple category levels.
In some embodiments, the target processing module 25522 is further configured to perform the following processing for any one category hierarchy: screening a plurality of preset categories of any category hierarchy according to the classification result of the previous category hierarchy of any category hierarchy; and updating the preset category information according to the screened preset categories.
In some embodiments, the target processing module 25522 is further configured to perform the following processing by the classification model: extracting description features to be detected from the description information to be detected, and extracting preset category features from preset categories included in the preset category information; according to the description characteristics to be detected and the characteristics of the preset categories, calculating the probability that the object to be detected belongs to the preset categories corresponding to the characteristics of the preset categories; aiming at any category level, determining the probability that the object to be detected respectively belongs to a plurality of preset categories of any category level as a classification result of the object to be detected in any category level.
Embodiments of the present application provide a computer program product or computer program comprising executable instructions stored in a computer readable storage medium. The processor of the electronic device reads the executable instructions from the computer-readable storage medium, and the processor executes the executable instructions, so that the electronic device executes the classification model training method or the object classification method according to the embodiment of the application.
Embodiments of the present application provide a computer readable storage medium having stored therein executable instructions that, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, a classification model training method as shown in fig. 3A, 3B, and 3C, or an object classification method as shown in fig. 4.
In some embodiments, the computer readable storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it may also be any of various devices that include one or any combination of the above memories.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system, may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a hypertext markup language (HTML, hyper Text Markup Language) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices located at one site or, alternatively, distributed across multiple sites and interconnected by a communication network.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.
Claims (10)
1. An object classification method, comprising:
acquiring preset category information and description information to be tested of an object to be tested; wherein the preset category information comprises preset categories of a plurality of category levels;
performing hierarchical classification processing based on the description information to be detected and the preset category information through a classification model to obtain classification results of the object to be detected in the category levels;
and determining target categories of the object to be tested in the category levels according to classification results of the object to be tested in the category levels.
2. The method according to claim 1, wherein the method further comprises:
for any one category level, the following processing is performed:
screening a plurality of preset categories of any category hierarchy according to the classification result of the previous category hierarchy of the any category hierarchy;
and updating the preset category information according to the screened preset categories.
3. The method according to claim 1, wherein performing, through the classification model, hierarchical classification processing based on the description information to be tested and the preset category information to obtain classification results of the object to be tested at the plurality of category levels comprises:
the following processing is performed by the classification model:
extracting description features to be tested from the description information to be tested, and extracting preset category features from the preset categories included in the preset category information;
calculating, according to the description features to be tested and the preset category features, the probability that the object to be tested belongs to the preset category corresponding to each preset category feature;
and, for any category level, determining the probabilities that the object to be tested belongs to each of the plurality of preset categories of that category level as the classification result of the object to be tested at that category level.
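The per-level probability calculation of claim 3 can be sketched as follows. The claim does not specify how the probability is computed; this example assumes a dot-product similarity between the description feature and each preset category feature, followed by a softmax within each level. The feature values and the names `classify_level`, `levels`, and `targets` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_level(description_feature, category_features):
    """Return {category: probability} for one category level."""
    names = list(category_features)
    scores = [
        sum(d * c for d, c in zip(description_feature, category_features[name]))
        for name in names
    ]
    return dict(zip(names, softmax(scores)))


# Hypothetical features; a real model would produce them with its encoders.
description_feature = [1.0, 0.0]
levels = {
    "level_1": {"food": [1.0, 0.0], "electronics": [0.0, 1.0]},
    "level_2": {"fruit": [0.9, 0.1], "phones": [0.1, 0.9]},
}

results = {name: classify_level(description_feature, feats) for name, feats in levels.items()}
# Target category per level: the most probable preset category (an assumed decision rule).
targets = {name: max(probs, key=probs.get) for name, probs in results.items()}
```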
4. A method of training a classification model, comprising:
acquiring sample description information and sample category information of a sample object; wherein the sample category information comprises sample categories of the sample object at a plurality of category levels;
performing, through a classification model, hierarchical classification processing based on the sample description information and the sample category information to obtain classification results at the plurality of category levels;
performing loss calculation according to the sample category and the classification result of a same category level to obtain a level loss, and performing multi-level fusion processing on the level losses corresponding to the respective category levels to obtain a multi-level loss;
training the classification model according to the multi-level loss; wherein the trained classification model is used for predicting classification results of an object to be tested at the plurality of category levels.
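The loss calculation of claim 4 can be illustrated with a small worked example. The claim names neither the per-level loss nor the fusion operation; this sketch assumes cross-entropy for the level loss and a weighted sum for the multi-level fusion. The function names and the probability values are hypothetical.

```python
import math


def level_loss(probabilities, sample_category):
    """Cross-entropy of one level: negative log-probability of the true category."""
    return -math.log(probabilities[sample_category])


def multi_level_loss(level_losses, weights=None):
    """Fuse per-level losses with a weighted sum (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(level_losses)
    return sum(w * loss for w, loss in zip(weights, level_losses))


# Hypothetical classification results for two category levels.
results = [
    {"food": 0.5, "electronics": 0.5},
    {"fruit": 0.25, "snacks": 0.75},
]
sample_categories = ["food", "fruit"]

losses = [level_loss(p, c) for p, c in zip(results, sample_categories)]
total = multi_level_loss(losses)  # log(2) + log(4) = log(8)
```

With unequal weights, the fusion could emphasize coarse or fine levels; the claim leaves that choice open.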
5. The method of claim 4, wherein performing, through the classification model, hierarchical classification processing based on the sample description information and the sample category information to obtain classification results at the plurality of category levels comprises:
the following processing is performed by the classification model:
extracting sample description features from the sample description information and extracting sample category features from the sample category information;
and, for any category level, calculating a category probability distribution at that category level according to the sample description features and the sample category features, and taking the category probability distribution as the classification result at that category level.
6. The method of claim 5, wherein after extracting the sample description features from the sample description information and extracting the sample category features from the sample category information, the method further comprises:
performing linear projection processing on the sample description features and the sample category features;
and performing normalization processing on the linearly projected sample description features and sample category features.
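The projection and normalization of claim 6 can be sketched as a matrix multiplication followed by scaling to unit length. The claim does not say which normalization is used; L2 normalization is assumed here, and the projection matrix `weight` is a hypothetical fixed value standing in for a learned parameter.

```python
import math


def linear_project(feature, weight):
    """Multiply a feature vector by a projection matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weight]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


weight = [[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]]  # hypothetical 3 -> 2 projection
feature = [3.0, 2.0, 0.0]

projected = linear_project(feature, weight)
normalized = l2_normalize(projected)
```

Projecting both feature types into one space and normalizing them makes their dot product a cosine similarity, which is one common reason for this pair of steps.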
7. The method of claim 5, wherein the sample description information includes description information of a plurality of modalities; and extracting the sample description features from the sample description information comprises:
extracting a modality description feature from the description information of each modality;
and performing feature fusion processing on the modality description features respectively corresponding to the plurality of modalities to obtain the sample description features.
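The fusion step of claim 7 can be illustrated as follows. The claim does not name a fusion operation; this sketch assumes an element-wise mean over per-modality feature vectors of equal length (concatenation or attention-based fusion would be other options). The modality names and values are hypothetical.

```python
def fuse_modal_features(modal_features):
    """Average per-modality feature vectors element by element."""
    vectors = list(modal_features.values())
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all modality features must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(length)]


# Hypothetical features extracted from a text description and an image.
modal_features = {"text": [1.0, 0.0], "image": [0.0, 1.0]}
sample_description_feature = fuse_modal_features(modal_features)
```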
8. The method of claim 4, wherein there are a plurality of sample objects; and performing, through the classification model, hierarchical classification processing based on the sample description information and the sample category information to obtain classification results at the plurality of category levels comprises:
combining the plurality of pieces of sample description information with the plurality of pieces of sample category information to obtain a plurality of information combinations; wherein each information combination comprises one piece of sample description information and one piece of sample category information;
performing, through the classification model, hierarchical classification processing based on each information combination to obtain classification results of the information combination at the plurality of category levels;
and performing loss calculation according to the sample category and the classification result of a same category level to obtain a level loss comprises:
for any one category level, the following processing is performed:
determining a category label of the information combination at the category level according to the sample category, at the category level, of the target sample object corresponding to the information combination;
determining the difference between the category label and the classification result of the information combination at the category level as an information combination loss;
and performing information combination fusion processing on the information combination losses respectively corresponding to the plurality of information combinations to obtain the level loss of the category level.
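The combination-based loss of claim 8 can be sketched for a single category level. Several points are assumptions, not statements of the claimed method: each description is paired with every piece of category information, a combination is labeled 1 when the paired category equals the category of the description's own sample object at that level, the "difference" is measured with binary cross-entropy, and the fusion is a mean. The names and the `scores` matrix are hypothetical.

```python
import math
from itertools import product


def combination_label(own_category, paired_category):
    """1.0 if the paired category matches the sample's own category, else 0.0."""
    return 1.0 if own_category == paired_category else 0.0


def combination_loss(label, probability, eps=1e-12):
    """Binary cross-entropy between a 0/1 label and a predicted match probability."""
    p = min(max(probability, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def hierarchy_loss(level_categories, scores):
    """Mean combination loss over all description/category pairs at one level.

    level_categories[i] is the category of sample i at this level;
    scores[i][j] is the predicted match probability for description i
    paired with the category information of sample j.
    """
    n = len(level_categories)
    losses = [
        combination_loss(
            combination_label(level_categories[i], level_categories[j]),
            scores[i][j],
        )
        for i, j in product(range(n), repeat=2)
    ]
    return sum(losses) / len(losses)


level_categories = ["food", "electronics"]
scores = [[0.5, 0.5], [0.5, 0.5]]  # an uninformed model predicts 0.5 everywhere
loss = hierarchy_loss(level_categories, scores)  # log(2)
```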
9. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the object classification method of any one of claims 1 to 3 or the classification model training method of any one of claims 4 to 8 when executing executable instructions stored in the memory.
10. A computer readable storage medium storing executable instructions for implementing the object classification method of any one of claims 1 to 3 or the classification model training method of any one of claims 4 to 8 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311065380.2A CN117312979A (en) | 2023-08-22 | 2023-08-22 | Object classification method, classification model training method and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311065380.2A CN117312979A (en) | 2023-08-22 | 2023-08-22 | Object classification method, classification model training method and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117312979A (en) | 2023-12-29 |
Family
ID=89287373
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311065380.2A Pending CN117312979A (en) | 2023-08-22 | 2023-08-22 | Object classification method, classification model training method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117312979A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118013047A (en) * | 2024-04-03 | 2024-05-10 | 浙江口碑网络技术有限公司 | Data classification prediction method and device based on large language model |
-
2023
- 2023-08-22 CN CN202311065380.2A patent/CN117312979A/en active Pending
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Chen et al. | Deep reinforcement learning in recommender systems: A survey and new perspectives | |
US20230195845A1 (en) | Fast annotation of samples for machine learning model development | |
US11645548B1 (en) | Automated cloud data and technology solution delivery using machine learning and artificial intelligence modeling | |
US11537506B1 (en) | System for visually diagnosing machine learning models | |
CN111291266A (en) | Artificial intelligence based recommendation method and device, electronic equipment and storage medium | |
CN112256537B (en) | Model running state display method and device, computer equipment and storage medium | |
CN110297911A (en) | Internet of Things (IOT) calculates the method and system that cognition data are managed and protected in environment | |
Li et al. | A CTR prediction model based on user interest via attention mechanism | |
CN112182362A (en) | Method and device for training model for online click rate prediction and recommendation system | |
Barry-Straume et al. | An evaluation of training size impact on validation accuracy for optimized convolutional neural networks | |
WO2023050143A1 (en) | Recommendation model training method and apparatus | |
CN112819024B (en) | Model processing method, user data processing method and device and computer equipment | |
CN117312979A (en) | Object classification method, classification model training method and electronic equipment | |
Guo | [Retracted] Financial Market Sentiment Prediction Technology and Application Based on Deep Learning Model | |
US20230117893A1 (en) | Machine learning techniques for environmental discovery, environmental validation, and automated knowledge repository generation | |
CN117573961A (en) | Information recommendation method, device, electronic equipment, storage medium and program product | |
CN116910357A (en) | Data processing method and related device | |
CN116308640A (en) | Recommendation method and related device | |
WO2021115269A1 (en) | User cluster prediction method, apparatus, computer device, and storage medium | |
CN112052386A (en) | Information recommendation method and device and storage medium | |
Sun et al. | Online programming education modeling and knowledge tracing | |
CN112749335B (en) | Lifecycle state prediction method, lifecycle state prediction apparatus, computer device, and storage medium | |
US20240211750A1 (en) | Developer activity modeler engine for a platform signal modeler | |
CN118245638B (en) | Method, device, equipment and storage medium for predicting graph data based on generalization model | |
CN116662814B (en) | Object intention prediction method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||