CN110223095A - Method, apparatus, device and storage medium for determining an item property
- Publication number
- CN110223095A (application number CN201810175616.0A)
- Authority
- CN
- China
- Prior art keywords
- item property
- commodity title
- vocabulary
- vectorization
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Accounting & Taxation (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Finance (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Entrepreneurship & Innovation (AREA)
- General Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Marketing (AREA)
- Economics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The embodiments of the invention disclose a method, apparatus, device and storage medium for determining an item property. The method comprises: obtaining a commodity title; performing character splitting and word segmentation on the commodity title; vectorizing the split and segmented commodity title to obtain a vectorized commodity title; and inputting the vectorized commodity title into an item property model, which outputs the item property corresponding to the commodity title. With the embodiments of the invention, item properties are determined with higher accuracy.
Description
Technical field
The present invention relates to the computer field, and in particular to a method, apparatus, device and computer-readable storage medium for determining an item property.
Background
E-commerce platforms currently determine item properties in two main ways. In the first, the merchant fills in the item properties when publishing the commodity. In the second, the item properties are determined indirectly from information such as commodity transaction information, logistics information and commodity title text.
Most item properties are filled in by the merchants themselves. However, a merchant may fill them in incorrectly, either by mistake or deliberately. Merchants who deliberately misreport item properties are often malicious and can harm consumers' rights and interests. In addition, determining item properties indirectly from commodity transaction information, logistics information, commodity title text and similar information has low accuracy.
In summary, the prior art has the following technical problem: the accuracy of determining item properties is low.
Summary of the invention
Embodiments of the invention provide a method, apparatus, device and storage medium for determining an item property, with which item properties are determined with higher accuracy.
A method for determining an item property, comprising:
obtaining a commodity title;
performing character splitting and word segmentation on the commodity title;
vectorizing the split and segmented commodity title to obtain a vectorized commodity title;
inputting the vectorized commodity title into an item property model, and outputting the item property corresponding to the commodity title.
Performing character splitting and word segmentation on the commodity title comprises:
performing character splitting and word segmentation on the commodity title in combination with a high-frequency vocabulary.
Performing character splitting and word segmentation on the commodity title in combination with the high-frequency vocabulary comprises:
segmenting the commodity title into words in combination with the high-frequency vocabulary, and then splitting the characters of the commodity title other than the high-frequency words into individual characters.
Before performing character splitting and word segmentation on the commodity title in combination with the high-frequency vocabulary, the method further comprises:
segmenting the commodity titles in the training samples to obtain a word segmentation result;
counting the word segmentation result to obtain the high-frequency vocabulary.
After obtaining the high-frequency vocabulary, the method further comprises:
adding specialized vocabulary to the high-frequency vocabulary to update the high-frequency vocabulary.
Vectorizing the split and segmented commodity title to obtain the vectorized commodity title comprises:
vectorizing the split and segmented commodity title using a character vector table and a term vector table to obtain the vectorized commodity title.
The character vector table is a vector table obtained by training character vectors on the commodity titles in the training samples after they have been split according to the high-frequency vocabulary;
the term vector table is a vector table obtained by training term vectors on the commodity titles in the training samples after they have been split according to the high-frequency vocabulary.
The character vector table is a vector table obtained by training character vectors with the Skip-Gram model or the CBOW model;
the term vector table is a vector table obtained by training term vectors with the Skip-Gram model or the CBOW model.
Before inputting the vectorized commodity title into the preset item property model, the method further comprises:
vectorizing the commodity titles of the training samples using the character vector table and the term vector table to obtain vectorized commodity titles of the training samples;
training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples to obtain the item property model.
Training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples to obtain the item property model comprises:
training the item property model on the basis of a classifier, according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model.
The classifier comprises a decision tree, logistic regression, Bayes, a neural network, a random forest or a support vector machine.
A method for determining an item property, comprising:
receiving vocabulary characters input by a user and non-vocabulary characters input by the user;
constructing a commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user;
sending the commodity title so that the item property can be determined.
Receiving the vocabulary characters input by the user and the non-vocabulary characters input by the user comprises:
receiving the vocabulary characters that the user inputs by calling an input method, and the non-vocabulary characters that the user inputs by calling the input method.
The order of the vocabulary characters input by the user comprises: the input order of the vocabulary characters input by the user and/or a random order of the vocabulary characters input by the user;
the order of the non-vocabulary characters input by the user comprises: the input order of the non-vocabulary characters input by the user and/or a random order of the non-vocabulary characters input by the user.
The number of commodity titles is greater than or equal to 1.
A method for determining an item property, comprising:
vectorizing the commodity titles of training samples using a character vector table and a term vector table to obtain vectorized commodity titles of the training samples;
training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples to obtain a trained item property model;
inputting a vectorized commodity title into the trained item property model, and outputting the item property corresponding to the vectorized commodity title.
The character vector table is a vector table obtained by training character vectors with the Skip-Gram model or the CBOW model;
the term vector table is a vector table obtained by training term vectors with the Skip-Gram model or the CBOW model.
Training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples to obtain the item property model comprises:
training the item property model on the basis of a classifier, according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model.
The classifier comprises a decision tree, logistic regression, Bayes, a neural network, a random forest or a support vector machine.
An apparatus for determining an item property, the apparatus comprising:
an obtaining module, configured to obtain a commodity title;
a processing module, configured to perform character splitting and word segmentation on the commodity title;
a vector module, configured to vectorize the split and segmented commodity title to obtain a vectorized commodity title;
an output module, configured to input the vectorized commodity title into an item property model and output the item property corresponding to the commodity title.
A device for determining an item property, comprising:
a memory, configured to store a program;
a processor, configured to run the program stored in the memory so as to execute the above method for determining an item property.
A computer storage medium storing computer program instructions which, when executed by a processor, implement the above method for determining an item property.
An apparatus for determining an item property, the apparatus comprising:
a receiving module, configured to receive vocabulary characters input by a user and non-vocabulary characters input by the user;
a constructing module, configured to construct a commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user;
a sending module, configured to send the commodity title so that the item property can be determined.
A device for determining an item property, comprising:
a memory, configured to store a program;
a processor, configured to run the program stored in the memory so as to execute the above method for determining an item property.
A computer storage medium storing computer program instructions which, when executed by a processor, implement the above method for determining an item property.
An apparatus for determining an item property, the apparatus comprising:
a title module, configured to vectorize the commodity titles of training samples using a character vector table and a term vector table to obtain vectorized commodity titles of the training samples;
a training module, configured to train an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples to obtain a trained item property model;
an attribute module, configured to input a vectorized commodity title into the trained item property model and output the item property corresponding to the vectorized commodity title.
A device for determining an item property, comprising:
a memory, configured to store a program;
a processor, configured to run the program stored in the memory so as to execute the above method for determining an item property.
A computer storage medium storing computer program instructions which, when executed by a processor, implement the above method for determining an item property.
As can be seen from the above technical solution, after a commodity title is obtained, character splitting and word segmentation are performed on it, dividing the commodity title into characters and words. The title is then vectorized using the character vector table and the term vector table to obtain a vectorized commodity title, which is input into a pre-trained item property model to output the item property corresponding to the commodity title. Because the item property is output by the item property model, the item property is determined with higher accuracy.
Brief description of the drawings
The present invention may be better understood from the following description of specific embodiments taken in conjunction with the accompanying drawings, in which the same or similar reference numerals denote the same or similar features.
Fig. 1 is a schematic diagram of a scenario for determining an item property in an embodiment of the invention;
Fig. 2 is a schematic flowchart of a method for determining an item property in an embodiment of the invention;
Fig. 3 is a schematic flowchart of a method for determining an item property in another embodiment of the invention;
Fig. 4 is a schematic flowchart of a method for determining an item property in yet another embodiment of the invention;
Fig. 5 is a schematic structural diagram of an apparatus for determining an item property in an embodiment of the invention;
Fig. 6 is a schematic structural diagram of an apparatus for determining an item property in another embodiment of the invention;
Fig. 7 is a schematic structural diagram of an apparatus for determining an item property in yet another embodiment of the invention;
Fig. 8 is a structural diagram of an exemplary hardware architecture of a computing device for the method and apparatus for determining an item property of an embodiment of the invention;
Fig. 9 is a structural diagram of an exemplary hardware architecture of a computing device for the method and apparatus for determining an item property of another embodiment of the invention;
Fig. 10 is a structural diagram of an exemplary hardware architecture of a computing device for the method and apparatus for determining an item property of yet another embodiment of the invention.
Detailed description of embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
On an e-commerce platform, item properties can be determined in several ways. Buyers typically enter a keyword and decide whether to browse a commodity's web page depending on whether the keyword appears in the commodity titles returned by the search. The commodity title can therefore be used to determine the properties of the commodity.
An item property generally refers to a specific feature of the commodity itself. As an example, commodities can be divided into physical commodities and virtual commodities. Sellers include multiple keywords in the commodity title to attract buyers to browse the commodity's web page. As an example, the commodity title "video member network member privilege price" corresponds to a virtual commodity, and the commodity title "polychrome ox-hide foreign export former single high heel women's shoes" corresponds to a physical commodity.
In general, a user inputs the commodity title through a client. The client may run on a PC or on a mobile device; as an example, the mobile device may be a mobile phone or a tablet computer.
A commodity title consists of multiple characters. Multiple characters can form a word, and the characters that form a word are called vocabulary characters; correspondingly, characters that do not form a word are called non-vocabulary characters. Non-vocabulary characters also include punctuation marks and the like.
The user inputs the commodity title by calling an input method. The user may input one character at a time, or one vocabulary character at a time; the user may also mix vocabulary characters and non-vocabulary characters over several inputs.
As an example, take the commodity title: polychrome ox-hide foreign export former single high heel women's shoes.
If one character is input at a time, i.e. more/color/ox/skin/outer/trade/out/mouth/original/single/high/heel/female/shoes (each item is a single character in the original Chinese), a total of 14 characters are input.
If one vocabulary character is input at a time, i.e. polychrome/ox-hide/foreign trade/export/former single/high heel/women's shoes, a total of 7 vocabulary characters are input.
Vocabulary characters and non-vocabulary characters may also be input together, i.e. polychrome/ox/skin/foreign trade/export/original/single/high heel/women's shoes.
For the same commodity title, a different order of the vocabulary characters or of the non-vocabulary characters also yields a different commodity title. As an example, the first commodity title is: polychrome ox-hide foreign export former single high heel women's shoes; the second commodity title is: ox-hide polychrome foreign export former single high heel women's shoes. Because "ox-hide" precedes "polychrome" in the second title, the second title places more emphasis on "ox-hide".
Therefore, in embodiments of the invention, the order of the vocabulary characters may be the order in which the user input them, a random order of the vocabulary characters input by the user, or a combination of the input order and a random order. In this way, the importance of each input vocabulary character can be fully taken into account when determining the item property.
Likewise, in embodiments of the invention, the order of the non-vocabulary characters may be the order in which the user input them, a random order of the non-vocabulary characters input by the user, or a combination of the input order and a random order. The importance of each input non-vocabulary character can then be fully taken into account when determining the item property.
When constructing commodity titles, it is necessary to consider not only the vocabulary characters and non-vocabulary characters input by the user but also the order of the vocabulary characters and the order of the non-vocabulary characters. As a result, there may be more than one commodity title. As an example, the user inputs the commodity title: polychrome ox-hide foreign export former single high heel women's shoes. Once the order of the vocabulary characters and the order of the non-vocabulary characters are taken into account and the two kinds of characters are combined in order, the same vocabulary characters and the same non-vocabulary characters correspond to multiple commodity titles.
That is, the user inputs one commodity title at the client, and the client may send multiple commodity titles to the server, all of which are related to the commodity title input by the user.
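The construction of several related titles from one user input can be sketched as follows. This is only an illustration under assumptions: the function name `candidate_titles`, the number of extra shuffled titles and the fixed random seed are hypothetical and are not specified in the source, which only states that the input order and/or a random order may be used.

```python
import random


def candidate_titles(tokens, extra=2, seed=0, sep=" "):
    """Return the title in input order plus `extra` randomly ordered variants.

    `tokens` are the vocabulary and non-vocabulary characters in the order
    the user entered them. `extra`, `seed` and `sep` are illustrative
    parameters, not part of the described method.
    """
    rng = random.Random(seed)          # seeded so the sketch is repeatable
    titles = [sep.join(tokens)]        # first title keeps the input order
    for _ in range(extra):
        shuffled = list(tokens)
        rng.shuffle(shuffled)          # a random order of the same tokens
        titles.append(sep.join(shuffled))
    return titles


tokens = ["polychrome", "ox-hide", "shoes"]
titles = candidate_titles(tokens)
```

Every returned title contains the same tokens; only their order differs, which is what lets the server weigh each token's position.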
The server obtains the commodity title that the user input through the client. The server can use natural language processing techniques to judge the item property from the commodity title.
To have machine learning algorithms process natural language, the language usually first has to be converted into a mathematical form; a term vector is one way of representing the words of a language mathematically.
The simplest kind of term vector is the one-hot representation, which represents a word with a very long vector. The length of the vector is the size of the dictionary; exactly one component of the vector is 1 and all the others are 0, and the position of the 1 corresponds to the position of the word in the dictionary. This representation has two drawbacks: (1) it is prone to the curse of dimensionality, especially when used with some deep learning algorithms; (2) it cannot capture the similarity between words well.
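The one-hot representation described above can be sketched in a few lines. The three-word dictionary is a made-up illustration, not data from the source, and `one_hot` is a hypothetical helper name.

```python
def one_hot(word, dictionary):
    """Return a vector as long as the dictionary with a single 1
    at the position of `word` and 0 everywhere else."""
    vector = [0] * len(dictionary)
    vector[dictionary.index(word)] = 1
    return vector


dictionary = ["alliance", "gold coin", "women's shoes"]
encoded = one_hot("gold coin", dictionary)
```

With a real dictionary of several hundred thousand words, every vector would have that many components, which illustrates the dimensionality drawback; and any two distinct one-hot vectors are equally far apart, which illustrates the similarity drawback.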
Natural language processing techniques also include the embedding technique, which has been widely adopted recently. Embedding specifically includes word2vec, doc2vec, characters2vec and the like. word2vec is a toolkit for obtaining term vectors (word-level vectors). doc2vec is a toolkit for obtaining document vectors. characters2vec is a toolkit for obtaining character vectors.
Judging item properties with word2vec alone produces a relatively large dictionary. As an example, segmenting a training set of more than 20 million commodities yields a vocabulary of more than 600,000 words. If the term vector length is set fairly long, the word2vec term vector table occupies a large amount of storage; the term vector length is the length of the vector that represents a word of the language mathematically. Determining an item property requires calling both the model and the term vector table, and an online model, constrained by computing capacity and network conditions, has difficulty calling a term vector table that takes up a lot of storage.
Chinese characters have a great advantage over English characters: each Chinese character carries more information. If the term vectors are simply replaced with character vectors and the item property is determined using the character vector table, the size of the encoding table is greatly reduced, because the character vector table takes far less storage than the term vector table, but the accuracy of determining the item property is lower. Item properties can therefore be judged using both word2vec and characters2vec, that is, from both term vectors and character vectors, to improve the accuracy of determining the item property.
Meanwhile, a high-frequency vocabulary can be compiled from the training samples, and the term vectors and character vectors can be trained on the basis of the high-frequency vocabulary. Specifically, the training samples are selected from the commodities of the e-commerce platform, and each training sample includes at least its commodity title and its item property. All commodity titles in the training samples are segmented into words, and the word segmentation result is then processed; this data processing may include removing stop words. Stop words are characters or words that are automatically filtered out before or after natural language data (or text) is processed in information retrieval, to save storage space and improve search efficiency.
Then a high-frequency vocabulary for the training samples is obtained from the word segmentation result, and the term vectors, the character vectors and the item property model can be trained on the basis of the high-frequency vocabulary. Because the number of words in the high-frequency vocabulary is much smaller than the number of words in the word segmentation result, term vectors trained on the high-frequency vocabulary take less storage than ordinary term vectors, i.e. those trained on all the words in the word segmentation result. Determining item properties with a term vector table trained on the high-frequency vocabulary can still ensure accuracy.
It should be noted that the high-frequency vocabulary may consist of the words whose frequency of occurrence, counted over the word segmentation result, is greater than a preset threshold. That is, high-frequency words occur more frequently than ordinary words.
In addition, in practical applications a high-frequency vocabulary obtained purely from word frequency statistics may not fully meet requirements. As an example, training samples are difficult to update in real time, while internet slang and network-specific terms with particular meanings keep appearing. So, to determine item properties better, specialized vocabulary can be added to the high-frequency vocabulary obtained from word frequency statistics to form an updated high-frequency vocabulary. In this way, the updated high-frequency vocabulary fully reflects the words that are frequently used in practice. The specialized vocabulary may include at least one of: vocabulary specific to a professional field, internet terms, vocabulary compiled by experts, and self-coined words. Vocabulary compiled by experts refers to words of a relevant field compiled by industry experts from a professional perspective. Self-coined words are words created by users themselves according to actual needs.
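The steps above (count the segmented words, drop stop words, keep the words above a threshold, then add specialized vocabulary) can be sketched as follows. The function name, the threshold value and the sample titles are all assumptions chosen for illustration; the source does not specify them.

```python
from collections import Counter


def build_high_freq_vocab(segmented_titles, threshold,
                          stop_words=(), specialized=()):
    """Build a high-frequency vocabulary from already-segmented titles.

    A word is kept when its count is greater than `threshold`
    (the "preset threshold"). `specialized` words are added afterwards
    regardless of how often they occur.
    """
    stop = set(stop_words)
    counts = Counter(
        word
        for title in segmented_titles
        for word in title
        if word not in stop            # stop-word removal
    )
    vocab = {word for word, count in counts.items() if count > threshold}
    vocab.update(specialized)          # update with specialized vocabulary
    return vocab


# Hypothetical segmented training titles.
segmented = [
    ["gold coin", "of", "alliance"],
    ["gold coin", "of", "shoes"],
    ["gold coin", "alliance"],
]
vocab = build_high_freq_vocab(
    segmented, threshold=1,
    stop_words=["of"], specialized=["power leveling"],
)
```

Here "shoes" occurs only once and is dropped, while "power leveling" is kept even though it never appears in the sample titles.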
The purpose of the first segmentation is to remove interfering data; in the second segmentation, the commodity title can be re-split according to the high-frequency vocabulary.
First, the commodity title is checked for high-frequency words. If it contains high-frequency words, each high-frequency word is separated out as an individual word, and the remainder of the commodity title is split directly into characters.
Because the high-frequency vocabulary contains relatively few words, segmenting the commodity title in combination with the high-frequency vocabulary helps ensure that the segmentation is correct; the characters in the commodity title other than the high-frequency words are then split into individual characters.
As an example, the commodity title is "10000 gold medals are practiced in generation by five Brancard alliance, area of World of Warcraft's gold coin", and it contains the keywords "World of Warcraft", "gold coin" and "alliance". The split result for this commodity title should then be: World of Warcraft/gold coin/five/area/cloth/orchid/card/moral/alliance/1/0/0/0/0/0/gold/=/2/3/./0/yuan, i.e. a combination of 20 non-vocabulary characters and 3 vocabulary characters.
Commodity title can be divided based on high frequency vocabulary again and obtain segmentation result, training word vector is to obtain word vector
Table is based on above-mentioned segmentation result, and training term vector is to obtain term vector table.The term vector table obtained based on high frequency vocabulary training
Data volume is less than normal term vector table.The embedding technology that training term vector and word vector are all made of.Show as one
Example, can be according to CBOW model or Skip-Gram model, is obtained word vector table and term vector table respectively based on segmentation result.
CBOW model and Skip-Gram model training term vector are illustrated below.CBOW model and Skip-
Gram model training word vector is similar with training term vector.
The training input of the CBOW model is the term vectors of the context words of a given feature word, and the output is the term vector of that specific word.
As an example, the context size is 4, and the specific word is the word whose term vector is required as output. There are 8 context words, 4 before and 4 after the specific word, and these 8 words are the input of the model. Because CBOW uses the bag-of-words model, the 8 words are treated equally; that is, the distance between a context word and the specific word is not considered, as long as the word lies within the context of the specific word.
The Skip-Gram model reverses the approach of the CBOW model: the input is the term vector of the specific word, and the output is the term vectors of the context words corresponding to the specific word.
As an example, the context size is 4, the specific word is the input, and the 8 context words are the output.
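The difference between the two models can be seen in how their training examples are formed. The sketch below only builds the (input, output) pairs for a context size of 4; it does not train a network, and the helper names `context_of`, `cbow_examples` and `skipgram_examples` are hypothetical.

```python
def context_of(tokens, index, window=4):
    """Return up to `window` tokens on each side of tokens[index]."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


def cbow_examples(tokens, window=4):
    """CBOW: the context tokens are the input, the specific token is the output."""
    return [(context_of(tokens, i, window), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens, window=4):
    """Skip-Gram: the specific token is the input, each context token is an output."""
    return [(tokens[i], c)
            for i in range(len(tokens))
            for c in context_of(tokens, i, window)]


# Illustrative token sequence; "e" has a full context of 4 + 4 = 8 tokens.
sample = list("abcdefghi")
cbow_for_e = cbow_examples(sample)[4]
skipgram_for_e = [c for w, c in skipgram_examples(sample) if w == "e"]
```

Tokens near the start or end of a sequence simply receive a shorter context in this sketch; padding would be an equally reasonable choice.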
That is, training maps each vocabulary item to a fixed-length vector. All of these vectors together form a term vector table, and each vector corresponds to a point in the term vector space. By introducing a "distance" on this space, the similarity or correlation between vocabulary items can be judged from the distance between them. The term vector space is the term vector table.
Likewise, training maps each individual character to a fixed-length vector. All of these vectors together form a word vector table, and each vector corresponds to a point in the word vector space. By introducing a "distance" on this space, the similarity or correlation between characters can be judged from the distance between them.
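The text does not say which distance is used, so the following sketch assumes two common choices, Euclidean distance and cosine similarity. The three-dimensional vectors in `term_vector_table` are made up for illustration and are not trained values.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term vector table: each vocabulary item maps to a fixed-length vector.
term_vector_table = {
    "gold": [0.9, 0.1, 0.0],
    "coin": [0.8, 0.2, 0.1],
    "alliance": [0.0, 0.1, 0.9],
}

near = cosine_similarity(term_vector_table["gold"], term_vector_table["coin"])
far = cosine_similarity(term_vector_table["gold"], term_vector_table["alliance"])
```

With these made-up values, "gold" lies closer to "coin" than to "alliance" under both measures, which is the kind of relationship the distance is meant to expose.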
Using the word vector table and the term vector table, the commodity titles of the training samples are vectorized to obtain the vectorized commodity titles of the training samples. In other words, the training samples give the correspondence between each commodity title and its vectorized form. Because the item property corresponding to each training-sample commodity title is known, the item property model can be trained from the vectorized commodity titles and their item properties.
The item property identification model is trained on the basis of an existing classifier: the vectorized commodity title is input to the item property identification model, and the output of the model is the item property. Extensive experiments show that the classifier can be a decision tree, logistic regression, a Bayes classifier, a neural network, a random forest (Random Forests, RF) or a support vector machine.
Therefore, after the word vector table and the term vector table have been obtained by training, the item property identification model is obtained by training based on the training samples, the word vector table and the term vector table.
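As a rough illustration of training a classifier on vectorized titles, the sketch below fits a binary logistic regression (one of the classifier types listed above) by plain gradient descent. It is written from scratch so that it runs without third-party libraries; the two-dimensional vectors, the labels and the hyperparameters are all hypothetical.

```python
import math


def train_logistic_regression(samples, labels, epochs=500, lr=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    dim = len(samples[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the vectorized title falls on the positive side, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0.0 else 0


# Hypothetical vectorized titles; label 1 = "game currency", 0 = "other".
samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [1, 1, 0, 0]
weights, bias = train_logistic_regression(samples, labels)
```

A real item property usually has more than two possible values, so a multi-class classifier would be needed in practice; the binary case is used here only to keep the sketch short.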
The item property can then be determined based on the trained word vector table, term vector table and item property identification model.
Each time a new commodity title is obtained, the commodity title can be input to the item property identification model to obtain the item property corresponding to the commodity title.
Specifically, refer to Fig. 1, which is a schematic diagram of a scenario in which the item property is determined in an embodiment of the present invention. A seller inputs the commodity title of a new commodity through a client. When the new commodity is put online on the e-commerce platform through the server, the server obtains its commodity title and performs character splitting and word segmentation on it, yielding the commodity title after character splitting and word segmentation.
The commodity title after character splitting and word segmentation is then vectorized to obtain the vectorized commodity title. In this way, a commodity title made up of characters is converted into a form expressed in mathematical symbols.
The item property model can then recognize the vectorized commodity title and, based on the vectorized commodity title that is input, output the item property corresponding to the commodity title.
Based on the above process, the method for determining the item property provided in an embodiment of the present invention can be summarized as the steps shown in Fig. 2. Fig. 2 is a flow diagram of the method for determining the item property in an embodiment of the present invention, which may include:
S201, obtain the commodity title.
When a seller puts a new commodity online on the e-commerce platform through the server, the server can obtain the commodity title in several ways. If the seller directly inputs the text of the commodity title through the client, the commodity title can be obtained directly. If the seller inputs the commodity title in another form, for example as a picture, the picture can be recognized, and the commodity title can then be obtained based on template matching or geometric feature extraction.
S202, perform character splitting and word segmentation on the commodity title.
Character splitting and word segmentation refers to two processes: splitting into characters and segmenting into words. For a commodity title, the title can first be split into characters to obtain a character-splitting result, which is then segmented into words to obtain the character-splitting and word-segmentation result of the commodity title. Alternatively, the commodity title can first be segmented into words to obtain a word segmentation result, which is then split into characters to obtain the character-splitting and word-segmentation result of the commodity title.
S203, vectorize the commodity title after character splitting and word segmentation to obtain the vectorized commodity title.
The commodity title made up of characters is vectorized into a commodity title made up of mathematical symbols, so that the item property model can recognize it.
S204, input the vectorized commodity title into the item property model, and output the item property corresponding to the commodity title.
After the item property model receives the vectorized commodity title, it can directly output the property corresponding to the commodity title. The item property model can be a preset model or a model obtained by training in advance.
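Steps S201 to S204 form a simple chain, which can be sketched as one function that receives each stage as a callable. The function name `determine_item_property` and the three toy stand-ins are hypothetical; they only show the control flow and do not reflect how the real segmentation, vectorization or model behave.

```python
def determine_item_property(title, segment, vectorize, model):
    """Run S202 to S204 on a commodity title that has already been obtained (S201)."""
    tokens = segment(title)       # S202: character splitting and word segmentation
    vector = vectorize(tokens)    # S203: vectorize the segmented title
    return model(vector)          # S204: the model outputs the item property


# Toy stand-ins, for illustration only.
def toy_segment(title):
    return title.split()


def toy_vectorize(tokens):
    return [float(tokens.count("gold")), float(len(tokens))]


def toy_model(vector):
    return "game currency" if vector[0] > 0 else "other"


result = determine_item_property("gold coin alliance",
                                 toy_segment, toy_vectorize, toy_model)
```

Passing the stages in as parameters keeps the flow independent of whether the model is preset or trained in advance, which mirrors the two options the text allows.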
In this embodiment of the present invention, performing word segmentation and character splitting on the commodity title can reduce the amount of data to be processed and stored. Using the item property model can improve the accuracy of the item property determined for the commodity title.
In one embodiment of the present invention, character splitting and word segmentation can be performed on the commodity title in combination with high-frequency vocabulary.
High-frequency vocabulary is vocabulary that has a high probability of occurring in commodity titles. Performing character splitting and word segmentation in combination with high-frequency vocabulary can effectively reduce the amount of term vector data to be stored.
As an example, the commodity title can be segmented in combination with the high-frequency vocabulary, and the characters in the commodity title other than the high-frequency vocabulary are then split into characters. Combining high-frequency vocabulary, term vectors and word vectors effectively ensures, on the one hand, that the data storage required for the word vectors stays within a certain range and, on the other hand, that the word segmentation is correct.
In one embodiment of the present invention, the commodity titles in the training samples can be segmented to obtain word segmentation results. The word segmentation results are then counted, and the vocabulary whose frequency of occurrence is greater than a preset threshold is taken as the high-frequency vocabulary.
In addition, because the high-frequency vocabulary obtained from occurrence-frequency statistics may not fully meet requirements, specialized vocabulary can be added to it to form the updated high-frequency vocabulary. In this way, the updated high-frequency vocabulary fully reflects the vocabulary that is used frequently in practical applications.
In one embodiment of the present invention, the word vector table and the term vector table can be used to vectorize the commodity title after character splitting and word segmentation, obtaining the vectorized commodity title.
The word vector table is the vector table obtained by training word vectors on the commodity titles in the training samples after they have been segmented with the high-frequency vocabulary.
The term vector table is the vector table obtained by training term vectors on the commodity titles in the training samples after they have been segmented with the high-frequency vocabulary.
As an example, the word vector table is a vector table obtained by training word vectors with the Skip-Gram model or the CBOW model. The term vector table is a vector table obtained by training term vectors with the Skip-Gram model or the CBOW model.
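The text does not state how the per-token vectors are combined into one title vector. The sketch below assumes one simple option: look up each token in the term vector table first and the word vector table second, average each group, and concatenate the two averages. The table contents and the names `mean_vector` and `vectorize_title` are hypothetical.

```python
def mean_vector(vectors, dim):
    """Element-wise average of a list of vectors; all zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]


def vectorize_title(tokens, word_vector_table, term_vector_table, dim):
    """Assumed scheme: average term vectors and word vectors, then concatenate."""
    term_vecs = [term_vector_table[t] for t in tokens if t in term_vector_table]
    word_vecs = [word_vector_table[t] for t in tokens
                 if t not in term_vector_table and t in word_vector_table]
    # Tokens found in neither table are ignored in this sketch.
    return mean_vector(term_vecs, dim) + mean_vector(word_vecs, dim)


# Made-up two-dimensional tables, for illustration only.
term_vector_table = {"gold coin": [1.0, 0.0], "alliance": [0.0, 1.0]}
word_vector_table = {"5": [0.5, 0.5], "x": [1.0, 1.0]}

title_vector = vectorize_title(["gold coin", "5", "alliance", "x", "?"],
                               word_vector_table, term_vector_table, dim=2)
```

The result has a fixed length of twice `dim` regardless of how long the title is, which is what a classifier with a fixed input size needs.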
Refer to Fig. 3, which is a flow diagram of the method for determining the item property in another embodiment of the present invention. The item property model can be obtained by training with the technical solution in Fig. 3, which may specifically include the following steps:
S301, vectorize the commodity titles of the training samples by means of the word vector table and the term vector table to obtain the vectorized commodity titles of the training samples.
The word vector table can be preset, and the term vector table can also be preset. As an example, the word vector table is a vector table obtained by training word vectors with the Skip-Gram model or the CBOW model, and the term vector table is a vector table obtained by training term vectors with the Skip-Gram model or the CBOW model.
By means of the word vector table and the term vector table, the commodity titles of the training samples can be vectorized to obtain the vectorized commodity titles of the training samples. The training samples can be obtained from randomly acquired commodity titles.
S302, train the item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the trained item property model.
"Classifier" is a general term for methods that classify samples in data mining. In this embodiment of the present invention, the item property model can be trained based on a classifier, according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the trained item property model.
The classifier may include a decision tree, logistic regression, a Bayes classifier, a neural network, a random forest or a support vector machine.
S303, input a vectorized commodity title into the trained item property model, and output the item property corresponding to the vectorized commodity title.
The vectorized commodity title can be input into the trained item property model, which then directly outputs the item property corresponding to the vectorized commodity title.
Refer to Fig. 4, which is a flow diagram of the method for determining the item property in yet another embodiment of the present invention. The technical solution in Fig. 4 is applied to a user terminal and specifically includes:
S401, receive the vocabulary characters input by the user and the non-vocabulary characters input by the user.
The user inputs the commodity title through a client. The client can be located on a PC or on a mobile device; as an example, the mobile device can be a mobile phone or a tablet computer.
A commodity title is made up of multiple characters. Multiple characters may form a vocabulary item, and the characters that form vocabulary are called vocabulary characters; correspondingly, characters that do not form vocabulary are called non-vocabulary characters. Non-vocabulary characters also include punctuation marks and the like.
As an example, the vocabulary characters that the user inputs by invoking an input method are received, as are the non-vocabulary characters that the user inputs by invoking the input method.
S402, construct the commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user.
The number of commodity titles is greater than or equal to 1. The order of the vocabulary characters input by the user includes the input order of those vocabulary characters and/or a random order of those vocabulary characters. The order of the non-vocabulary characters input by the user includes the input order of those non-vocabulary characters and/or a random order of those non-vocabulary characters.
S403, send the commodity title to determine the item property.
The client sends the commodity title to the server, so that the item property corresponding to the commodity title is determined.
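One way to read step S402 is that the client records each piece of input together with its position and then assembles one or more titles from those pieces. The sketch below follows that reading; the `(position, text)` representation and the names `build_title` and `build_random_title` are assumptions, and sending the title to the server (S403) is not shown.

```python
import random


def build_title(vocab_chars, non_vocab_chars):
    """Merge (position, text) pairs from both inputs in their input order."""
    pieces = sorted(vocab_chars + non_vocab_chars, key=lambda piece: piece[0])
    return "".join(text for _, text in pieces)


def build_random_title(vocab_chars, non_vocab_chars, seed=None):
    """Build a title from the same pieces arranged in a random order."""
    pieces = [text for _, text in vocab_chars + non_vocab_chars]
    random.Random(seed).shuffle(pieces)
    return "".join(pieces)


# Hypothetical user input: vocabulary and non-vocabulary pieces with positions.
vocab_chars = [(0, "gold coin"), (2, "alliance")]
non_vocab_chars = [(1, " 5 "), (3, "!")]

titles = [
    build_title(vocab_chars, non_vocab_chars),
    build_random_title(vocab_chars, non_vocab_chars, seed=0),
]
```

Producing both an input-order title and a random-order title is consistent with the statement that the number of commodity titles is greater than or equal to 1.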
Corresponding to the above method embodiment, an embodiment of the present invention also provides an apparatus for determining the item property, as shown in Fig. 5. Fig. 5 shows a schematic structural diagram of the apparatus for determining the item property in an embodiment of the present invention. The apparatus may include an acquisition module 501, a processing module 502, a vector module 503 and an output module 504.
The acquisition module 501 is configured to obtain the commodity title.
The processing module 502 is configured to perform character splitting and word segmentation on the commodity title.
The vector module 503 is configured to vectorize the commodity title after character splitting and word segmentation to obtain the vectorized commodity title.
The output module 504 is configured to input the vectorized commodity title into the item property model and output the item property corresponding to the commodity title.
In this embodiment of the present invention, performing word segmentation and character splitting on the commodity title reduces the amount of data to be processed and stored. Using the item property model can improve the accuracy of the item property determined for the commodity title.
Corresponding to the above method embodiment, an embodiment of the present invention also provides an apparatus for determining the item property, as shown in Fig. 6. Fig. 6 shows a schematic structural diagram of the apparatus for determining the item property in another embodiment of the present invention. The apparatus may include a receiving module 601, a construction module 602 and a sending module 603.
The receiving module 601 is configured to receive the vocabulary characters input by the user and the non-vocabulary characters input by the user.
The construction module 602 is configured to construct the commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user.
The sending module 603 is configured to send the commodity title to determine the item property.
In this embodiment of the present invention, the commodity title is constructed according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user, and the commodity title is sent to determine the item property, thereby improving the accuracy of the item property determined for the commodity title.
Corresponding to the above method embodiment, an embodiment of the present invention also provides an apparatus for determining the item property, as shown in Fig. 7. Fig. 7 shows a schematic structural diagram of the apparatus for determining the item property in yet another embodiment of the present invention. The apparatus may include a title module 701, a training module 702 and an attribute module 703.
The title module 701 is configured to vectorize the commodity titles of the training samples by means of the word vector table and the term vector table to obtain the vectorized commodity titles of the training samples.
The training module 702 is configured to train the item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the trained item property model.
The attribute module 703 is configured to input a vectorized commodity title into the trained item property model and output the item property corresponding to the vectorized commodity title.
Fig. 8 is a structural diagram showing an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for determining the item property according to an embodiment of the present invention.
As shown in Fig. 8, the computing device 800 includes an input device 801, an input interface 802, a central processing unit 803, a memory 804, an output interface 805 and an output device 806. The input interface 802, the central processing unit 803, the memory 804 and the output interface 805 are connected to one another through a bus 810. The input device 801 and the output device 806 are connected to the bus 810 through the input interface 802 and the output interface 805 respectively, and thereby to the other components of the computing device 800.
Specifically, the input device 801 receives input information from the outside and transmits the input information to the central processing unit 803 through the input interface 802. The central processing unit 803 processes the input information based on computer-executable instructions stored in the memory 804 to generate output information, stores the output information temporarily or permanently in the memory 804, and then transmits the output information to the output device 806 through the output interface 805. The output device 806 outputs the output information to the outside of the computing device 800 for use by the user.
That is, the computing device shown in Fig. 8 may also be implemented as including: a memory storing computer-executable instructions; and a processor which, when executing the computer-executable instructions, can implement the method and apparatus for determining the item property described in conjunction with Fig. 2 and Fig. 5.
Fig. 9 is a structural diagram showing an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for determining the item property according to an embodiment of the present invention.
As shown in Fig. 9, the computing device 900 includes an input device 901, an input interface 902, a central processing unit 903, a memory 904, an output interface 905 and an output device 906. The input interface 902, the central processing unit 903, the memory 904 and the output interface 905 are connected to one another through a bus 910. The input device 901 and the output device 906 are connected to the bus 910 through the input interface 902 and the output interface 905 respectively, and thereby to the other components of the computing device 900.
Specifically, the input device 901 receives input information from the outside and transmits the input information to the central processing unit 903 through the input interface 902. The central processing unit 903 processes the input information based on computer-executable instructions stored in the memory 904 to generate output information, stores the output information temporarily or permanently in the memory 904, and then transmits the output information to the output device 906 through the output interface 905. The output device 906 outputs the output information to the outside of the computing device 900 for use by the user.
That is, the computing device shown in Fig. 9 may also be implemented as including: a memory storing computer-executable instructions; and a processor which, when executing the computer-executable instructions, can implement the method and apparatus for determining the item property described in conjunction with Fig. 3 and Fig. 6.
Fig. 10 is a structural diagram showing an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for determining the item property according to an embodiment of the present invention.
As shown in Fig. 10, the computing device 1000 includes an input device 1001, an input interface 1002, a central processing unit 1003, a memory 1004, an output interface 1005 and an output device 1006. The input interface 1002, the central processing unit 1003, the memory 1004 and the output interface 1005 are connected to one another through a bus 1010. The input device 1001 and the output device 1006 are connected to the bus 1010 through the input interface 1002 and the output interface 1005 respectively, and thereby to the other components of the computing device 1000.
Specifically, the input device 1001 receives input information from the outside and transmits the input information to the central processing unit 1003 through the input interface 1002. The central processing unit 1003 processes the input information based on computer-executable instructions stored in the memory 1004 to generate output information, stores the output information temporarily or permanently in the memory 1004, and then transmits the output information to the output device 1006 through the output interface 1005. The output device 1006 outputs the output information to the outside of the computing device 1000 for use by the user.
That is, the computing device shown in Fig. 10 may also be implemented as including: a memory storing computer-executable instructions; and a processor which, when executing the computer-executable instructions, can implement the method and apparatus for determining the item property described in conjunction with Fig. 4 and Fig. 7.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (28)
1. A method for determining an item property, characterized by comprising:
obtaining a commodity title;
performing character splitting and word segmentation on the commodity title;
vectorizing the commodity title after the character splitting and word segmentation to obtain a vectorized commodity title;
inputting the vectorized commodity title into an item property model, and outputting the item property corresponding to the commodity title.
2. The method for determining an item property according to claim 1, characterized in that performing character splitting and word segmentation on the commodity title comprises:
performing character splitting and word segmentation on the commodity title in combination with high-frequency vocabulary.
3. The method for determining an item property according to claim 2, characterized in that performing character splitting and word segmentation on the commodity title in combination with high-frequency vocabulary comprises:
segmenting the commodity title in combination with the high-frequency vocabulary, and then splitting into characters the characters in the commodity title other than the high-frequency vocabulary.
4. The method for determining an item property according to claim 2, characterized in that, before performing character splitting and word segmentation on the commodity title in combination with high-frequency vocabulary, the method further comprises:
segmenting the commodity titles in training samples to obtain word segmentation results;
counting the word segmentation results to obtain the high-frequency vocabulary.
5. The method for determining an item property according to claim 4, characterized in that, after obtaining the high-frequency vocabulary, the method further comprises:
adding specialized vocabulary to the high-frequency vocabulary to update the high-frequency vocabulary.
6. The method for determining an item property according to claim 1, characterized in that vectorizing the commodity title after the character splitting and word segmentation to obtain a vectorized commodity title comprises:
vectorizing the commodity title after the character splitting and word segmentation by using a word vector table and a term vector table, to obtain the vectorized commodity title.
7. The method for determining an item property according to claim 6, characterized in that the word vector table is a vector table obtained by training word vectors on the commodity titles in training samples after segmentation with high-frequency vocabulary;
the term vector table is a vector table obtained by training term vectors on the commodity titles in the training samples after segmentation with high-frequency vocabulary.
8. The method for determining an item property according to claim 6, characterized in that the word vector table is a vector table obtained by training word vectors with a Skip-Gram model or a CBOW model;
the term vector table is a vector table obtained by training term vectors with a Skip-Gram model or a CBOW model.
9. The method for determining an item property according to claim 6, characterized in that, before inputting the vectorized commodity title into the preset item property model, the method further comprises:
vectorizing the commodity titles of training samples by means of the word vector table and the term vector table, to obtain the vectorized commodity titles of the training samples;
training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model.
10. The method for determining an item property according to claim 9, characterized in that training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model, comprises:
training the item property model based on a classifier, according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model.
11. The method for determining an item property according to claim 10, characterized in that the classifier comprises a decision tree, logistic regression, Bayes, a neural network, a random forest or a support vector machine.
12. A method for determining an item property, characterized by comprising:
receiving vocabulary characters input by a user and non-vocabulary characters input by the user;
constructing a commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user;
sending the commodity title to determine the item property.
13. The method for determining an item property according to claim 12, characterized in that receiving vocabulary characters input by a user and non-vocabulary characters input by the user comprises:
receiving the vocabulary characters that the user inputs by invoking an input method, and the non-vocabulary characters that the user inputs by invoking the input method.
14. The method for determining an item property according to claim 12, characterized in that the order of the vocabulary characters input by the user comprises: the input order of the vocabulary characters input by the user and/or a random order of the vocabulary characters input by the user;
the order of the non-vocabulary characters input by the user comprises: the input order of the non-vocabulary characters input by the user and/or a random order of the non-vocabulary characters input by the user.
15. The method for determining an item property according to claim 12, characterized in that the number of commodity titles is greater than or equal to 1.
16. A method for determining an item property, characterized by comprising:
vectorizing the commodity titles of training samples by means of a word vector table and a term vector table, to obtain the vectorized commodity titles of the training samples;
training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain a trained item property model;
inputting a vectorized commodity title into the trained item property model, and outputting the item property corresponding to the vectorized commodity title.
17. The method for determining an item property according to claim 16, characterized in that the word vector table is a vector table obtained by training word vectors with a Skip-Gram model or a CBOW model;
the term vector table is a vector table obtained by training term vectors with a Skip-Gram model or a CBOW model.
18. The method for determining an item property according to claim 16, characterized in that training an item property model according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model, comprises:
training the item property model based on a classifier, according to the vectorized commodity titles of the training samples and the item properties of the training samples, to obtain the item property model.
19. The method for determining an item property according to claim 18, characterized in that the classifier comprises a decision tree, logistic regression, Bayes, a neural network, a random forest or a support vector machine.
20. An apparatus for determining an item property, characterized in that the apparatus comprises:
an acquisition module, configured to obtain a commodity title;
a processing module, configured to perform character splitting and word segmentation on the commodity title;
a vector module, configured to vectorize the commodity title after the character splitting and word segmentation to obtain a vectorized commodity title;
an output module, configured to input the vectorized commodity title into an item property model and output the item property corresponding to the commodity title.
21. Equipment for determining an item property, characterized by comprising:
a memory, configured to store a program;
a processor, configured to run the program stored in the memory, so as to execute the method for determining an item property according to any one of claims 1-11.
22. A computer storage medium, characterized in that computer program instructions are stored on the computer storage medium; when the computer program instructions are executed by a processor, the method for determining an item property according to any one of claims 1-11 is implemented.
23. An apparatus for determining an item property, characterized in that the apparatus comprises:
a receiving module, configured to receive vocabulary characters input by a user and non-vocabulary characters input by the user;
a construction module, configured to construct a commodity title according to the vocabulary characters input by the user, the order of the vocabulary characters input by the user, the non-vocabulary characters input by the user, and the order of the non-vocabulary characters input by the user;
a sending module, configured to send the commodity title to determine the item property.
24. Equipment for determining an item property, characterized by comprising:
a memory, configured to store a program;
a processor, configured to run the program stored in the memory, so as to execute the method for determining an item property according to any one of claims 12-15.
25. A computer storage medium, characterized in that computer program instructions are stored on the computer storage medium; when the computer program instructions are executed by a processor, the method for determining an item property according to any one of claims 12-15 is implemented.
26. An apparatus for determining an item property, characterized in that the apparatus comprises:
a title module, for vectorizing the commodity title of a training sample by means of a word vector table and a term vector table, to obtain the vectorized commodity title of the training sample;
a training module, for training an item property model according to the vectorized commodity title of the training sample and the item property of the training sample, to obtain a trained item property model;
an attribute module, for inputting a vectorized commodity title into the trained item property model and outputting the item property corresponding to the vectorized commodity title.
27. A device for determining an item property, characterized by comprising:
a memory, for storing a program;
a processor, for running the program stored in the memory, so as to execute the method for determining an item property according to any one of claims 16-19.
28. A computer storage medium, characterized in that computer program instructions are stored in the computer storage medium;
when the computer program instructions are executed by a processor, the method for determining an item property according to any one of claims 16-19 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810175616.0A CN110223095A (en) | 2018-03-02 | 2018-03-02 | Determine the method, apparatus, equipment and storage medium of item property |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110223095A (en) | 2019-09-10 |
Family
ID=67821962
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810175616.0A Pending CN110223095A (en) | 2018-03-02 | 2018-03-02 | Determine the method, apparatus, equipment and storage medium of item property |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110223095A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102929937A (en) * | 2012-09-28 | 2013-02-13 | 福州博远无线网络科技有限公司 | Text-subject-model-based data processing method for commodity classification |
CN105243129A (en) * | 2015-09-30 | 2016-01-13 | 清华大学深圳研究生院 | Commodity property characteristic word clustering method |
CN107203548A (en) * | 2016-03-17 | 2017-09-26 | 阿里巴巴集团控股有限公司 | Attribute acquisition methods and device |
US20170357896A1 (en) * | 2016-06-09 | 2017-12-14 | Sentient Technologies (Barbados) Limited | Content embedding using deep metric learning algorithms |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113256379A (en) * | 2021-05-24 | 2021-08-13 | 北京小米移动软件有限公司 | Method for correlating shopping demands for commodities |
CN113570427A (en) * | 2021-07-22 | 2021-10-29 | 上海普洛斯普新数字科技有限公司 | System for extracting and identifying on-line or system commodity characteristic information |
CN113592512A (en) * | 2021-07-22 | 2021-11-02 | 上海普洛斯普新数字科技有限公司 | Online commodity identity uniqueness identification and confirmation system |
CN114153979A (en) * | 2022-02-09 | 2022-03-08 | 北京泰迪熊移动科技有限公司 | Commodity keyword identification method and device, electronic equipment and storage medium |
CN114153979B (en) * | 2022-02-09 | 2022-05-13 | 北京泰迪熊移动科技有限公司 | Commodity keyword identification method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111784455B (en) | Article recommendation method and recommendation equipment | |
CN103778214B (en) | Item property clustering method based on user comments | |
CN107330752B (en) | Method and device for identifying brand words | |
CN110223095A (en) | Determine the method, apparatus, equipment and storage medium of item property | |
US20180053234A1 (en) | Description information generation and presentation systems, methods, and devices | |
CN106959966A (en) | Information recommendation method and system | |
CN112330455A (en) | Method, device, equipment and storage medium for pushing information | |
CN113449187A (en) | Product recommendation method, device and equipment based on double portraits and storage medium | |
CN112733042A (en) | Recommendation information generation method, related device and computer program product | |
CN114240552A (en) | Product recommendation method, device, equipment and medium based on deep clustering algorithm | |
CN110503459A (en) | User credit assessment method, device and storage medium based on big data | |
CN111797622B (en) | Method and device for generating attribute information | |
CN114387061A (en) | Product pushing method and device, electronic equipment and readable storage medium | |
CN110135769A (en) | Goods attribute filling method and device, storage medium and electronic terminal | |
CN110276065A (en) | Method and apparatus for processing goods reviews | |
CN110633398A (en) | Method for confirming central word, searching method, device and storage medium | |
CN110110035A (en) | Data processing method and device and computer readable storage medium | |
CN113761114A (en) | Phrase generation method and device and computer-readable storage medium | |
CN109359198A (en) | File classification method and device | |
CN115510212A (en) | Text event extraction method, device, equipment and storage medium | |
CN110363206A (en) | Clustering, data processing and data identification method for data objects | |
JP2023554210A (en) | Sort model training method and apparatus for intelligent recommendation, intelligent recommendation method and apparatus, electronic equipment, storage medium, and computer program | |
CN104102662A (en) | Method and device for determining interest and preference similarity of users | |
CN113204643B (en) | Entity alignment method, device, equipment and medium | |
CN112184250B (en) | Method and device for generating retrieval page, storage medium and computer equipment |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190910 |