
CN115828862A - Data processing method, text display method, data processing system and equipment - Google Patents


Info

Publication number
CN115828862A
Authority
CN
China
Prior art keywords
text
commodity
information
categories
target commodity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211460790.2A
Other languages
Chinese (zh)
Inventor
鲁志红
赵帅帅
刘敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Alibaba Overseas Internet Industry Co ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN202211460790.2A
Publication of CN115828862A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present application provide a data processing method, a text display method, a data processing system, and a device. The data processing method comprises: acquiring multimedia data associated with a target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; determining a corresponding category for each of the text messages based on a plurality of categories corresponding to the commodity file; and performing text editing on the plurality of text messages by category to obtain a structured description text of the target commodity. Because the text information comes from a plurality of data sources, the commodity description text is rich in content. In addition, the description text can be crawled and indexed by search engines, which helps improve the search recall rate of commodities. Furthermore, the structured description text is easy to read and can be translated using the built-in translation function of an application or browser, which improves the user experience.

Description

Data processing method, text display method, data processing system and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, a text display method, a data processing system, and a device.
Background
The commodity details displayed on a commodity page of an e-commerce platform consist mainly of pictures and contain little text, which hinders crawling and indexing by search engines and in turn lowers the recall rate of commodity information. In addition, the characters in a picture cannot be translated by the translation function of an application or browser. For example, most of the characters in the commodity detail pictures on some cross-border e-commerce platforms are in English; because they cannot be translated, users in non-English-speaking countries have a poor experience and cannot directly and accurately understand the commodity information.
Disclosure of Invention
Embodiments of the present application provide a data processing method, a text display method, a data processing system, and a device, which can improve the above problems.
In one embodiment of the application, a data processing method is provided and is suitable for a server. The data processing method comprises the following steps:
acquiring multimedia data related to a target commodity from a plurality of data sources;
processing the multimedia data to obtain a plurality of text messages;
respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file;
and performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity.
In another embodiment of the present application, a text display method is provided, which is suitable for a first client. The text display method comprises the following steps:
responding to the operation of a first user for a target commodity, and sending a file application request to a server;
receiving a structured first description text of the target commodity fed back by the server;
displaying the first description text;
wherein the first description text is obtained by performing text editing on a plurality of text messages by category, each of the text messages having been assigned a category in advance based on a plurality of categories corresponding to the commodity file; and the plurality of text messages are determined from multimedia data associated with the target commodity in a plurality of data sources.
In another embodiment of the present application, a text display method is further provided, which is suitable for a first client. The text display method comprises the following steps:
responding to an instruction of a user for a target commodity, and acquiring multimedia data associated with the target commodity from a plurality of data sources;
processing the multimedia data to obtain a plurality of text messages;
respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file;
performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity;
and displaying the description text on a commodity page of the target commodity.
In yet another embodiment of the present application, a data processing system is provided. The system comprises:
the server is used for acquiring multimedia data related to the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity; adding the structural description text into the commodity page information of the target commodity;
the second client is used for responding to the operation of a second user on the target commodity and acquiring the commodity page information from the server; and displaying the commodity page of the target commodity based on the commodity page information.
In yet another embodiment of the present application, a data processing system is provided. The system comprises:
the first client is used for responding to the operation of a first user for the target commodity and sending a file application request to the server;
the server is used for responding to the file application request and acquiring multimedia data related to the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to generate a structured first description text of the target commodity;
the first client is further used for receiving a first description text of the target commodity fed back by the server; and displaying the first descriptive text.
The present application further provides an embodiment of a computing device comprising a memory storing one or more computer instructions and a processor; the processor, coupled to the memory, is configured to execute the one or more computer instructions to implement the steps of the data processing method or the steps of the text display method.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the above data processing method or the above text display method.
Yet another embodiment of the present application provides a computer program product comprising a computer program or instructions which, when executed by a processor, cause the processor to perform the steps of the above-mentioned data processing method or the steps of the above-mentioned text display method.
According to the technical solutions provided by the embodiments of the present application, multimedia data associated with a target commodity is acquired from a plurality of data sources; the multimedia data is then processed to obtain a plurality of text messages; and a corresponding category is determined for each of the text messages based on a plurality of categories corresponding to the commodity file, so that the text messages can be edited by category to obtain a structured description text of the target commodity. Because the text information comes from a plurality of data sources, the commodity description text is rich in content. In addition, the description text can be crawled and indexed by search engines, which helps improve the search recall rate of commodities. Furthermore, the structured description text is easy to read and can be translated using the built-in translation function of an application or browser, which improves the user experience and helps users in many countries obtain commodity information more directly and accurately.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required to be utilized in the description of the embodiments or the prior art are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained according to the drawings without creative efforts for those skilled in the art.
FIG. 1 is a schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 2 illustrates another schematic diagram of a data processing system provided by an embodiment of the present application;
FIG. 3 is a flow chart illustrating a data processing method provided herein;
FIG. 4 is a schematic diagram illustrating a display mode of a commodity page in the technical solution provided by the present application;
FIG. 5 is a schematic diagram illustrating a description text containing text contents corresponding to various categories in the technical solution provided by an embodiment of the present application;
FIG. 6 is a flow chart illustrating a text display method according to an embodiment of the present application;
FIG. 7 is a flow chart illustrating a text display method according to another embodiment of the present application;
FIG. 8 is a schematic diagram illustrating a data processing method according to another embodiment of the present application;
FIG. 9 is a block diagram illustrating a data processing apparatus according to an embodiment of the present application;
FIG. 10 is a block diagram illustrating a structure of a text display device according to an embodiment of the present application;
FIG. 11 is a block diagram illustrating a structure of a text display device according to another embodiment of the present application;
FIG. 12 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Some of the flows described in the specification, the claims, and the above drawings of the present application include a number of operations that appear in a particular order. It should be understood that these operations may be performed out of the order in which they appear herein, or in parallel. The sequence numbers of the operations, e.g., 101, 102, etc., merely distinguish the operations from one another and do not by themselves represent any order of execution. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that the terms "first", "second", etc. herein are used to distinguish different messages, devices, modules, etc.; they do not represent a sequential order, nor do they require that the "first" and the "second" be of different types. In the present application, the term "or/and" merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A or/and B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. The character "/" in this application generally indicates an "or" relationship between the associated objects. It should also be noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a commodity or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such commodity or system. In addition, the embodiments described below are only some, not all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application.
Before introducing the solutions provided in the embodiments of the present application, a brief description is given to an application scenario of the technical solutions provided in the present application.
A first application scenario to which the technical solution of the present application is applicable: the e-commerce platform uses the computing power of the server to generate a structured description text for each commodity on the platform and adds the structured description text to the commodity page information of that commodity, so that the structured description text is displayed in the commodity page when a buyer browses the commodity page through the e-commerce platform's client application or a browser.
A second application scenario to which the technical solution of the present application is applicable: when a merchant user needs to put a new commodity on the shelf, or to publish advertisement recommendation information for a specific commodity, a commodity description text for the new commodity or the advertised commodity has to be entered. If the merchant wants the description text to fully reflect the selling points of the commodity and to have a good advertising effect, but does not know how to write it, the description text of the commodity can be generated automatically using the technical solution provided by the present application.
A third application scenario to which the technical solution of the present application is applicable: the e-commerce platform provides each merchant user with a function corresponding to the technical solution of the present application; a merchant user can apply to enable the function and have corresponding description texts generated for some or all of the commodities in the shop, so that they are displayed in the commodity pages; and so on.
Only a few typical application scenarios are illustrated here. The technical solution of the present application can also be applied to other scenarios, which are not enumerated here.
The technical solutions provided in the present application are described below from the perspective of a hardware system and a method, respectively.
Fig. 1 is a schematic structural diagram of a data processing system according to an embodiment of the present application. As shown in fig. 1, the data processing system includes:
the server 11 is used for acquiring multimedia data associated with the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to obtain a structural description text of the target commodity; adding the structural description text into the commodity page information of the target commodity;
the second client 13 is configured to respond to an operation of a second user on the target commodity, and acquire the commodity page information from the server; and displaying the commodity page of the target commodity based on the commodity page information.
The first application scenario can be implemented based on the data processing system provided by this embodiment. That is, the e-commerce platform can use the computing power of the server to generate corresponding description texts for existing commodities. Put another way, the e-commerce platform upgrades its existing commodity information by using the technical solution provided by the embodiments of the present application. The second user corresponding to the second client 13 may be a user browsing a website or application page of the e-commerce platform.
Further, as shown in FIG. 2, the system provided in this embodiment may further include a first client 12. The first client 12 is configured to send a file application request to the server 11 in response to an operation of the first user on the target commodity. After receiving the file application request, the server 11 executes "acquiring multimedia data associated with the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity; and adding the structured description text to the commodity page information of the target commodity".
That is, the third application scenario can be implemented based on the data processing system provided by this embodiment. The e-commerce platform provides a function corresponding to the technical solution of the present application for a first user (e.g., a merchant user) of the first client 12, and the merchant user can apply to enable the function in order to generate a structured description text for a target commodity that the merchant user specifies. For example, a merchant may designate one or more commodities whose commodity information is to be updated, that is, have a description text generated by the technical solution of the present application added to the commodity page, so as to improve the search recall rate of those commodities.
Still further, the first client 12 is further configured to: receiving and displaying the structural description text of the target commodity fed back by the server 11; responding to the modification operation of the first user on the description text, and adjusting the description text based on the modification operation to obtain an adjusted description text; and responding to a confirmation instruction of the first user for the adjusted description text, sending the adjusted description text to the server 11, so that the server replaces the description text with the adjusted description text.
Namely, the technical scheme of the application also provides an interface for modifying and perfecting the description text automatically generated by the server for the first user (such as a merchant user) of the first client.
Corresponding to the second application scenario, an embodiment of the present application provides a data processing system. The data processing system comprises a first client 12 and a server 11. The server 11 in this embodiment may provide only the description text generation service to the first client 12. After the first user of the first client 12 has the description text generated automatically by the server, the description text may be printed on a billboard, published to the merchant's own online shop application, and so on, which is not limited in this embodiment. Specifically:
the first client 12 is used for responding to the operation of the first user on the target commodity and sending a file application request to the server;
the server 11 is used for responding to the file application request and acquiring multimedia data related to the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to generate a structured first description text of the target commodity;
the first client 12 is further configured to receive a first description text of the target product fed back by the server; and displaying the first descriptive text.
Likewise, the first user corresponding to the first client 12 may modify the first description text. That is, the first client 12 is further configured to: in response to a modification operation of the first user on the first description text, adjust the first description text based on the modification operation to obtain a second description text; and in response to a confirmation instruction of the first user for the second description text, send the second description text to the server 11, so that the server 11 replaces the first description text with the second description text.
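The request, feedback, modification, and replacement exchange described above can be sketched as follows. This is a minimal in-memory illustration only: the class name `DescriptionServer`, its method names, and the placeholder generated text are assumptions made for the example and are not taken from the application.

```python
from typing import Dict


class DescriptionServer:
    """Hypothetical stand-in for server 11; keeps one description per commodity."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def handle_file_application(self, commodity_id: str) -> str:
        # Placeholder for the server-side generation of the first description text.
        first_text = f"[auto-generated description for {commodity_id}]"
        self._store[commodity_id] = first_text
        return first_text

    def replace_description(self, commodity_id: str, second_text: str) -> None:
        # Called after the first user confirms the adjusted (second) text.
        if commodity_id not in self._store:
            raise KeyError(f"no description generated yet for {commodity_id}")
        self._store[commodity_id] = second_text

    def current_description(self, commodity_id: str) -> str:
        return self._store[commodity_id]


# First client 12: request, display, modify, confirm.
server = DescriptionServer()
first_text = server.handle_file_application("sku-1")
second_text = first_text + " Edited by the merchant."
server.replace_description("sku-1", second_text)
```

In a real deployment the two sides would communicate over a network; the sketch keeps both in one process so that the order of operations is easy to follow.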
Corresponding to the second application scenario, an embodiment of the present application further provides a data processing system. The data processing system may be understood as a software system that may be deployed on a corresponding hardware device of the first client. That is, the hardware device of the first client 12 is provided with a data processing system capable of implementing the technical solution of the present application, and the first user may call the locally deployed data processing system when needing to generate the description text for a certain target commodity.
Here, it should be noted that: the server 11 may be a server, a service cluster formed by a plurality of servers, a virtual server deployed on the server or the service cluster, or a cloud, and the like, which is not limited in this embodiment. The hardware devices corresponding to the first client 12 and the second client 13 may be a mobile phone, a computer, a tablet computer, a smart wearable device, and the like, and also are not limited in particular.
Specific contents of the corresponding functions of the server and the first client are explained in the following method embodiments.
Fig. 3 shows a schematic flowchart of a data processing method according to an embodiment of the present application. The execution main body of the method provided by this embodiment may be the server in the system embodiment described above. Wherein the method comprises the following steps:
101. Acquiring multimedia data associated with the target commodity from a plurality of data sources.
102. Processing the multimedia data to obtain a plurality of text messages.
103. Determining a corresponding category for each of the text messages based on a plurality of categories corresponding to the commodity file.
104. Performing text editing on the plurality of text messages by category to obtain a structured description text of the target commodity.
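Steps 101 to 104 can be read as a four-stage pipeline. The sketch below is one possible arrangement under stated assumptions: the function `build_description`, the toy data sources, and the keyword-based `toy_classify` are all hypothetical, and step 104 is reduced to simple concatenation rather than the text editing the application describes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def build_description(
    sources: Iterable[Callable[[str], List[str]]],
    to_text: Callable[[str], str],
    classify: Callable[[str, List[str]], str],
    categories: List[str],
    commodity_id: str,
) -> Dict[str, str]:
    # Step 101: collect raw items from every data source.
    raw_items = [item for fetch in sources for item in fetch(commodity_id)]
    # Step 102: turn each raw item into a text message.
    texts = [to_text(item) for item in raw_items]
    # Step 103: assign each text message to one of the given categories.
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text in texts:
        grouped[classify(text, categories)].append(text)
    # Step 104: edit per category (here simply joined) into a structured result.
    return {c: " ".join(grouped[c]) for c in categories if grouped[c]}


# Toy stand-ins, for illustration only.
def catalog_source(commodity_id: str) -> List[str]:
    return ["  Waterproof hiking boot  "]


def web_source(commodity_id: str) -> List[str]:
    return ["Free returns within 30 days"]


def toy_classify(text: str, categories: List[str]) -> str:
    return "service policy" if "returns" in text.lower() else "overview"


description = build_description(
    [catalog_source, web_source],
    str.strip,
    toy_classify,
    ["overview", "service policy"],
    "sku-1",
)
```

Passing the stages in as callables keeps the sketch independent of any particular data source, text converter, or classifier.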
In step 101, the plurality of data sources may include, but are not limited to: a corresponding database on the server side, data crawled from the external network, and the like. For example, the database may include a commodity information base, an audio and video information base, and the like. That is, step 101 "acquiring multimedia data associated with the target commodity from a plurality of data sources" may include:
1011. Acquiring a main picture, commodity detail information, and/or attribute information of the target commodity from a commodity information base;
1012. Acquiring first video information and/or first audio information associated with the target commodity from an audio and video information base;
1013. Acquiring, from the Internet, multimedia data of the target commodity and multimedia data of commodities of the same style as the target commodity.
The main picture of the target commodity can be a picture designated by a first user (such as a user of a merchant of an e-commerce platform) when the commodity is put on the shelf. The attribute information of the target product may include, but is not limited to: title, commodity category and commodity attribute data (e.g., CPV data), and so on. As shown in fig. 4, a title 2 and a main picture 1 may be displayed at the head of the goods page. For example, title 2 may be shown on one side of main picture 1. The item detail information may include, but is not limited to: at least one of a detail picture and a detail text. The article detail information may be displayed below the title 2 and the main picture 1. Part of the data in the merchandise attribute data may be shown in the header, for example, on one side of the main picture. The structured description text of the product in this embodiment may be displayed below the title 2 and the main picture 1. The commodity detail information can be displayed below the structural description text, and the content which is not displayed can be browsed by operating a mouse or a touch screen to scroll pages.
In step 102, the multimedia data may include, but is not limited to: text, pictures, video, audio, and the like. Therefore, when step 102 is executed, the non-text information in the multimedia data can be converted into text; in addition, a selling-point text (namely, text information) matching the target commodity can be obtained from a commodity selling point library to serve as material for subsequently generating the description text, and so on.
For the contents of the text processing and the acquisition of the adaptive text information based on the commodity selling point library, reference may be made to the following description.
In step 103, the plurality of categories corresponding to the commodity file may be pre-configured, or may be determined based on the attribute information of the target commodity. That is, the method provided by the embodiment of the present application may further include:
105. Determining the plurality of categories based on the attribute information of the target commodity; or acquiring the plurality of categories from information pre-configured for the commodity file.
The attribute information of the target commodity may include, but is not limited to: title, commodity category, and commodity attribute data. One achievable solution is: analyzing the commodity characteristics and the user group category of the target commodity according to the attribute information of the target commodity; and determining, based on the commodity characteristics and the user group category, the commodity file categories that are of interest to the group corresponding to the user group category. Another achievable solution is: looking up the categories associated with the commodity category of the target commodity in a relation table that maps commodity categories to commodity file categories.
Alternatively, the plurality of categories are fixed, pre-configured information and do not change from commodity to commodity.
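The relation-table lookup and the fixed, pre-configured alternative can be combined in a few lines. The sketch below is illustrative only: the table contents, the names `CATEGORY_TABLE`, `DEFAULT_CATEGORIES`, and `categories_for`, and the choice to fall back to the fixed list when a commodity category is not in the table are assumptions, not details given in the application.

```python
from typing import Dict, List

# Fixed, pre-configured categories (hypothetical values).
DEFAULT_CATEGORIES: List[str] = [
    "overview",
    "selling points",
    "service policy",
    "quality comments",
    "specification introduction",
]

# Hypothetical relation table: commodity category -> commodity file categories.
CATEGORY_TABLE: Dict[str, List[str]] = {
    "footwear": ["overview", "selling points", "specification introduction"],
    "electronics": ["overview", "specification introduction", "service policy"],
}


def categories_for(commodity_category: str) -> List[str]:
    """Return the table entry if present, otherwise the fixed category list."""
    return list(CATEGORY_TABLE.get(commodity_category, DEFAULT_CATEGORIES))


footwear_categories = categories_for("footwear")
fallback_categories = categories_for("garden tools")
```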
In one implementation, the step "respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file" may include:
obtaining a classification model; and inputting the categories and the text information into the classification model, and executing the classification model to obtain the categories corresponding to the text information.
The classification model may be a model implemented based on a machine learning technique, for example, implemented by using a deep neural network model or a convolutional neural network model, which is not limited in this embodiment. The classification model can be obtained by training by using a training sample set. The training sample set may include text samples and labels (i.e. categories) corresponding to the text samples; of course, the training sample set may include negative samples in addition to positive samples such as text samples and labels (i.e., categories) corresponding to the text samples.
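To make the train-then-classify flow concrete, the sketch below uses a very small bag-of-words scorer in place of the neural network models mentioned above. It is a deliberately simplified stand-in, not the model the application describes; the function names and the four training samples are invented for illustration, and negative samples are not modeled.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def train(samples: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Build one token-count table per label from (text sample, label) pairs."""
    model: Dict[str, Counter] = {}
    for text, label in samples:
        model.setdefault(label, Counter()).update(tokenize(text))
    return model


def classify(model: Dict[str, Counter], text: str, categories: List[str]) -> str:
    """Pick the category whose training tokens overlap most with the text."""
    tokens = tokenize(text)

    def score(category: str) -> float:
        counts = model.get(category, Counter())
        total = sum(counts.values()) or 1
        # Normalize by table size so large categories are not favored.
        return sum(counts[t] for t in tokens) / total

    return max(categories, key=score)


training_samples = [
    ("free returns within 30 days", "service policy"),
    ("two year warranty included", "service policy"),
    ("waterproof leather upper", "selling points"),
    ("lightweight sole with strong grip", "selling points"),
]
model = train(training_samples)
all_categories = ["selling points", "service policy"]
label_a = classify(model, "30 days free returns", all_categories)
label_b = classify(model, "waterproof sole", all_categories)
```

The interface mirrors the description: the candidate categories and a text message go in, and a single category comes out.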
In step 104, "performing text editing on the plurality of text messages according to categories" may specifically be:
merging the text messages that belong to the same category to obtain a merged text for that category.
The "merging" process may include, but is not limited to: combining the text messages of the same category, filtering out repeated content, and producing fluent text that conforms to the expression logic of the language. In a specific implementation, a text generation model can be used to merge the text messages of the same category. The text generation model may be a machine learning model trained on a large amount of training data. In implementation, the text messages of the same category are input into the text generation model, and the text generation model is executed to obtain the merged text for the corresponding category. The merged text is fluent and conforms to the expression logic of the language in which it is written.
The text generation model may be a model implemented based on a neural network algorithm, a model implemented using Google's T5 technology, or the like; this embodiment does not limit the specific implementation of the model.
For example, in the example shown in FIG. 5, the categories corresponding to the commodity file include: overview, selling points, service policies, quality comments, specification introduction, etc. There may be one or more text messages under each category.
If a category contains only one text message, it is only necessary to check whether that text message conforms to the expression logic of the language (namely, whether it reads fluently).
If a category contains two or more text messages, the merging process needs to be performed.
Of course, the text generation model can be used to process the text information regardless of whether a category contains one text message or several. That is, the text information under each of the plurality of categories is input to the text generation model, and the text generation model is executed to output the description text content corresponding to each category.
The structural description text of the target commodity comprises description text contents corresponding to various categories in the multiple categories.
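The merging by category described above can be sketched in Python as follows. This is a minimal stand-in, not the text generation model itself: it only groups labelled text, drops exact repeats, and joins what remains. The function name `merge_by_category` and the input shape (a list of category and text pairs) are assumptions made for illustration.

```python
from collections import defaultdict


def merge_by_category(labelled_texts):
    """Group (category, text) pairs and merge each group into one passage.

    Exact duplicates (ignoring case and surrounding whitespace) are dropped;
    the first occurrence of each text is kept, in its original order.
    """
    grouped = defaultdict(list)
    seen = defaultdict(set)
    for category, text in labelled_texts:
        key = text.strip().lower()
        if key and key not in seen[category]:
            seen[category].add(key)
            grouped[category].append(text.strip())
    # A text generation model would rewrite each group into fluent prose;
    # here the surviving pieces are simply joined with spaces.
    return {category: " ".join(texts) for category, texts in grouped.items()}
```

The returned dictionary maps each category to its description text content, which is the shape of the structured description text described above.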
According to the technical solution provided by this embodiment, multimedia data associated with the target commodity is obtained from a plurality of data sources; the multimedia data is then processed to obtain a plurality of text messages; and each text message is assigned a category based on the plurality of categories corresponding to the commodity copy, so that the text messages can be edited by category to obtain the structured description text of the target commodity. Because the text information comes from a plurality of data sources, the commodity description text is rich in content. In addition, the description text can be crawled and indexed by search engines, which helps improve the search recall rate of the commodity. Furthermore, the structured description text is easy to read and can be translated using the built-in translation function of an application or browser, which gives a good user experience and helps users in multiple countries obtain commodity information more directly and accurately.
In one implementation, step 102 above, "processing the multimedia data to obtain a plurality of text messages", may include the following:
1021. Performing text processing on the multimedia data to obtain at least one text message.
1022. Querying, based on the multimedia data, a commodity selling point copy library for at least one text message matching the target commodity.
The multimedia data may include attribute information of the target commodity, where the attribute information includes a commodity title, a commodity category, and commodity attribute data. Accordingly, step 1022, "querying, based on the multimedia data, a commodity selling point copy library for at least one text message matching the target commodity", may include at least one of the following:
recalling at least one first selling point copy matching the attribute information from the commodity selling point library by using a deep semantic matching model;
recalling at least one second selling point copy corresponding to the commodity category of the target commodity from the commodity selling point library;
and determining at least one text message related to selling points of the target commodity according to the at least one first selling point copy and the at least one second selling point copy.
The deep semantic matching model may be a DSSM (Deep Structured Semantic Model) or another model.
The commodity selling point library includes: commodity categories, commodity features, and selling point copy. The commodity features can be determined from the commodity title and the commodity attribute data.
The selling point copy in the commodity selling point library may be written manually, crawled from the network, and so on; this embodiment does not limit this.
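The two recall paths and their combination can be sketched in Python as follows. Note the substitution: a bag-of-words cosine similarity stands in for the deep semantic matching model, since a trained DSSM is outside the scope of a short example. The library rows, the function name `recall_selling_points`, and the `top_k` parameter are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical library rows:
# (commodity category, commodity features, selling point copy).
LIBRARY = [
    ("backpack", "waterproof nylon travel", "Keeps your gear dry in heavy rain."),
    ("backpack", "laptop padded office", "Padded sleeve protects a 15-inch laptop."),
    ("kettle", "stainless steel fast boil", "Boils a litre of water in minutes."),
]


def _vector(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recall_selling_points(title, category, library=LIBRARY, top_k=2):
    """Combine similarity-based recall with category-based recall."""
    query = _vector(title)
    scored = [(_cosine(query, _vector(row[1])), row) for row in library]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # First recall path: best feature matches (stand-in for semantic matching).
    first = [row[2] for score, row in scored[:top_k] if score > 0]
    # Second recall path: every entry filed under the same commodity category.
    second = [row[2] for row in library if row[0] == category]
    # Union of both recall sets, preserving order and dropping repeats.
    return list(dict.fromkeys(first + second))
```

Keeping the two recall paths separate and merging them at the end mirrors the "first" and "second" selling point copy described above, so either path can be replaced independently.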
Further, the multimedia data may further include at least one of: a main picture of the target commodity, first commodity detail information of the target commodity, first audio information related to the target commodity, first video information related to the target commodity, second commodity detail information of a same-style commodity of the target commodity, second audio information of the same-style commodity, and second video information of the same-style commodity. The commodity detail information comprises detail pictures and/or detail texts.
Correspondingly, step 1021, "performing text processing on the multimedia data to obtain at least one text message", includes at least one of the following:
identifying the main picture by using a picture description technology to generate text information describing the main picture;
performing voice recognition on the first audio information and/or the second audio information, and generating text information related to the target commodity based on a voice recognition result;
performing character recognition on the detail pictures in the first commodity detail information and/or the second commodity detail information, and generating text information according to a character recognition result;
obtaining at least one text message based on the detail text in the first commodity detail information and/or the second commodity detail information;
and extracting key frames from the first video information and/or the second video information, carrying out image recognition on the key frames, and generating text information related to the target commodity according to an image recognition result.
Picture (or image) description technology, such as image captioning, takes a picture (or image) as input and, through a mathematical model and computation, has the computer output a natural language description of the image, giving the computer the ability to "look at a picture and talk". Picture (or image) description techniques may include, but are not limited to: encoder-decoder based methods, attention-based methods, generative adversarial network based methods, reinforcement learning based methods, dense captioning based methods, and so forth.
The character recognition performed on the detail pictures can be implemented using OCR (Optical Character Recognition) technology.
The image recognition of the key frames can be performed using an image recognition technology, or using the picture (or image) description technology described above, to generate the corresponding text information.
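One way to organize the per-modality text processing is a dispatch table that maps each kind of multimedia data to a recognizer. The Python sketch below shows only that routing; the recognizers are trivial stand-ins, and the kind labels, `DEMO_RECOGNIZERS`, and `to_text` are hypothetical names introduced for illustration.

```python
# Stand-in recognizers for illustration only; real ones would wrap an image
# captioning model, a speech recognizer, an OCR engine, and so on.
DEMO_RECOGNIZERS = {
    "main_picture": lambda payload: f"Picture shows {payload}",
    "detail_text": lambda payload: payload,
}


def to_text(items, recognizers):
    """Convert (kind, payload) multimedia items into text information.

    `recognizers` maps a kind such as "main_picture" or "audio" to a callable
    that returns a string; items with no registered recognizer are skipped,
    as are items whose recognizer returns only whitespace.
    """
    texts = []
    for kind, payload in items:
        handler = recognizers.get(kind)
        if handler is None:
            continue
        text = handler(payload).strip()
        if text:
            texts.append(text)
    return texts
```

Because the step is described as "at least one of the following", a table of optional handlers lets a deployment register only the modalities it actually supports.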
Fig. 6 is a flowchart illustrating a text display method according to an embodiment of the present application. The method provided by the embodiment may be applicable to the first client in the system embodiment. As shown, the text display method may include:
201. responding to the operation of a first user for a target commodity, and sending a file application request to a server;
202. receiving a structured first description text of the target commodity fed back by the server;
203. displaying the first descriptive text;
wherein the first description text is obtained by text-editing a plurality of text messages category by category; each of the text messages is assigned in advance to one of a plurality of categories corresponding to the commodity copy; and the plurality of text messages are determined from multimedia data associated with the target commodity and obtained from a plurality of data sources.
For the generation process of the first description text, reference may be made to the content in the above method embodiment, which is not described herein again.
Further, the method provided by this embodiment of the present application also provides a text input interface for the first user: through an interactive interface (e.g., an application interface) provided by the first client, the first user can enter text about the target commodity that the first user wishes to add. That is, the method provided by this embodiment further includes the following steps:
204. displaying an application interface, wherein a text input box and an application control are arranged on the application interface;
205. responding to the operation of the first user on the application control, and sending a file application request to a server; or responding to the operation of the application control after the first user inputs characters in the character input box, acquiring character information input by the first user, and carrying the character information in the file application request, so that the server can use the character information as a data source when determining the first description text of the target commodity.
Still further, the method provided by this embodiment of the present application also provides a description copy editing function for the first user; when the first user is not satisfied with the first description text fed back by the server, the first user can modify it using this function. That is, the method provided by this embodiment of the present application may further include:
206. responding to the modification operation of the first user on the first description text, and adjusting the description text based on the modification operation to obtain a second description text;
207. and responding to a confirmation instruction of the first user for the second description text, sending the second description text to the server, so that the server replaces the first description text with the second description text.
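The client-side interaction in steps 201 to 207 can be sketched in Python with an in-memory stand-in for the server. Everything here is hypothetical (the class names, method names, and the way the server composes its reply); the sketch only shows the order of the exchanges, not a real network protocol.

```python
class InMemoryServer:
    """Hypothetical stand-in for the server side."""

    def __init__(self):
        self.descriptions = {}

    def handle_request(self, commodity_id, user_text=None):
        parts = [f"Overview of {commodity_id}."]
        if user_text:
            # Text typed by the user is treated as one more data source.
            parts.append(user_text)
        self.descriptions[commodity_id] = " ".join(parts)
        return self.descriptions[commodity_id]

    def replace(self, commodity_id, new_text):
        self.descriptions[commodity_id] = new_text


class Client:
    """Hypothetical first client that requests, displays, and edits text."""

    def __init__(self, server):
        self.server = server
        self.displayed = None

    def apply(self, commodity_id, user_text=None):
        # Steps 201-203 and 205: send the request, receive and display the text.
        self.displayed = self.server.handle_request(commodity_id, user_text)
        return self.displayed

    def confirm_edit(self, commodity_id, edited_text):
        # Steps 206-207: apply the user's edit and send it back to the server.
        self.displayed = edited_text
        self.server.replace(commodity_id, edited_text)
```

A typical sequence would be `client.apply("kettle-1", "Boils fast.")` followed by `client.confirm_edit("kettle-1", "A fast kettle.")`, after which the server holds the second description text in place of the first.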
Fig. 7 is a flowchart illustrating a text display method according to another embodiment of the present application. The execution subject of the method provided by this embodiment may be the first client in the system embodiment described above. Namely, the method provided by the present embodiment may include:
301. In response to an instruction of a user for a target commodity, acquiring multimedia data associated with the target commodity from a plurality of data sources.
302. Processing the multimedia data to obtain a plurality of text messages.
303. Determining corresponding categories for the plurality of text messages respectively, based on a plurality of categories corresponding to the commodity copy.
304. Performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity.
305. Displaying the description text on a commodity page of the target commodity.
For the contents of 301 to 304, reference may be made to the contents of the above data processing method embodiment, which is not described herein again. In addition, the method provided by this embodiment may further include the content corresponding to steps 204 to 207 described above.
The technical solutions provided by the embodiments of the application address the problem that commodity pages contain little text content. Rich text content helps search engines crawl, index and retrieve the page, and helps improve the conversion rate of the commodity. In addition, a structured commodity description text is clear at a glance, which benefits the user experience; on a cross-border e-commerce platform, rich text is also easier for users in multiple countries to translate and read.
The innovations of the technical solutions provided by the embodiments of the application can be summarized as follows: multimedia information of a commodity is acquired from multi-channel data sources; the multimedia information is processed into text information; the text information is classified by category; and the plurality of text messages are edited by category to construct a structured commodity description copy. Furthermore, in the solutions provided by the embodiments of the application, the plurality of categories corresponding to the commodity copy can be adjusted for different commodities, so that the features of each commodity are highlighted as much as possible and its conversion rate is improved.
The technical solutions provided by the embodiments of the application were developed to address the shortage of text data in commodity detail information; they can also be applied to other similar application scenarios.
The solutions provided in the embodiments of the present application will be further described with reference to a specific example.
As shown in fig. 8, there are 4 data sources in this example, from which the commodity main picture, the commodity attribute information, the commodity detail information filled in by the merchant, and the commodity detail information crawled from the extranet are respectively obtained. The commodity detail information may comprise a detail text and a detail picture.
As shown in the figure, the specific processing flow includes the following parts:
Part one: commodity main picture
The commodity main picture is processed using an image captioning technology to obtain text information A.
Part two: commodity attribute information
The commodity attribute information may include, but is not limited to: commodity title, commodity category, CVP, etc. A DSSM model is used to search the commodity selling point library for selling point copy that matches the commodity attribute information. Description text is then generated based on the matched selling point copy, for example the 10 top-ranked matches, to obtain at least one text message B.
The description text generation above can be implemented using Google's T5 technology.
Part three: first commodity detail information filled in by the merchant
OCR technology is used to recognize the text in the detail pictures of the first commodity detail information; the text obtained this way may be relatively cluttered. Therefore, the detail text in the first commodity detail information and the text recognized by OCR are analyzed and, based on the analysis result, a preprocessing operation (such as filtering out duplicate content or deleting erroneous content) is performed to obtain at least one text message C. Subsequently, the at least one text message C is classified into categories.
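The duplicate-filtering part of this preprocessing can be sketched in Python using the standard library's `difflib.SequenceMatcher`. This is only one possible approach, offered under assumptions: the similarity threshold of 0.9 and the function name `filter_near_duplicates` are illustrative choices, and the deletion of erroneous content is not covered.

```python
from difflib import SequenceMatcher


def filter_near_duplicates(texts, threshold=0.9):
    """Keep the first occurrence of each text and drop near-identical repeats.

    Whitespace is collapsed before comparison, and comparison ignores case.
    `threshold` is an assumed similarity cut-off, not a value from the source.
    """
    kept = []
    for text in texts:
        candidate = " ".join(text.split())
        if not candidate:
            continue
        is_repeat = any(
            SequenceMatcher(None, candidate.lower(), k.lower()).ratio() >= threshold
            for k in kept
        )
        if not is_repeat:
            kept.append(candidate)
    return kept
```

Passing the merchant's detail text first and the OCR output second means that, when both say the same thing, the cleaner typed version is the one retained.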
Part four: commodity detail information crawled from the extranet
Second commodity detail information of a same-style commodity matching the target commodity is looked up in the commodity detail information crawled from the extranet. The same steps as in part three are then applied: OCR technology is used to recognize the text in the detail pictures of the second commodity detail information; the detail text in the second commodity detail information and the text recognized by OCR are analyzed; and a preprocessing operation is performed based on the analysis result to obtain at least one text message D. Subsequently, the at least one text message D is classified into categories.
Here, it should be noted that the text information A generated in part one from the commodity main picture may belong to the overview category. The at least one text message B generated in part two may belong to the commodity selling point category. Among the at least one text message C and the at least one text message D obtained in parts three and four, some may belong to the commodity selling point category, some to the user favorable review category, some to the service policy category, some to the commodity specification introduction category, and so on.
After the text information A, the at least one text message B, the at least one text message C and the at least one text message D are obtained, the text information can be merged by category, and structured output is produced once merging is complete for every category. The categories are displayed in a preset order, for example: overview category, selling point category, prompt category, user favorable review category, service policy category, installation and use guidance category, commodity specification introduction category, and so on.
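The structured output in a preset category order can be sketched in Python as follows. The category labels in `PRESET_ORDER`, the plain-text layout (a title line followed by the merged text), and the function name `render` are illustrative assumptions; categories not listed in the preset order are appended alphabetically, which is a design choice of this sketch rather than something the source specifies.

```python
# Illustrative labels following the preset order named in the text above.
PRESET_ORDER = [
    "overview",
    "selling points",
    "prompts",
    "user reviews",
    "service policy",
    "installation and use",
    "specification",
]


def render(merged, order=PRESET_ORDER):
    """Render {category: merged text} as sections in the preset order.

    Categories missing from `merged` are skipped; categories absent from
    `order` are appended at the end in alphabetical order.
    """
    known = [c for c in order if c in merged]
    extra = sorted(c for c in merged if c not in order)
    sections = [f"{c.title()}\n{merged[c]}" for c in known + extra]
    return "\n\n".join(sections)
```

Rendering each category as its own titled section keeps the output easy to scan and leaves plain text on the page for a browser's translation function to work on.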
Fig. 9 shows a block diagram of a data processing apparatus according to an embodiment of the present application. As shown, the data processing apparatus may include a first obtaining module 21, a first processing module 22, a first classifying module 23, and a first text generating module 24. The first obtaining module 21 is configured to obtain multimedia data associated with a target product from a plurality of data sources. The first processing module 22 is configured to process the multimedia data to obtain a plurality of text messages. The first classification module 23 is configured to determine, based on a plurality of categories corresponding to the commodity copy, corresponding categories for the plurality of text messages, respectively. The first text generating module 24 is configured to perform text editing on the plurality of text messages according to categories to obtain a structured description text of the target product.
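The cooperation of the four modules can be sketched in Python as a small pipeline object. The class name `DataProcessingApparatus`, its field names, and the callable signatures are hypothetical; each field simply stands for one module (21 to 24) and is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Media = List[Tuple[str, str]]  # (kind, payload) pairs, an assumed shape


@dataclass
class DataProcessingApparatus:
    obtain: Callable[[str], Media]                            # module 21
    process: Callable[[Media], List[str]]                     # module 22
    classify: Callable[[str], str]                            # module 23
    generate: Callable[[Dict[str, List[str]]], Dict[str, str]]  # module 24

    def run(self, commodity_id: str) -> Dict[str, str]:
        """Run the four modules in sequence for one target commodity."""
        media = self.obtain(commodity_id)
        texts = self.process(media)
        by_category: Dict[str, List[str]] = {}
        for text in texts:
            by_category.setdefault(self.classify(text), []).append(text)
        return self.generate(by_category)
```

Injecting each module as a callable keeps the sketch independent of any particular model or data source, so a module can be swapped without touching the others.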
Further, when the first processing module 22 processes the multimedia data to obtain a plurality of text messages, it is specifically configured to:
performing text processing on the multimedia data to obtain at least one text message; and querying, based on the multimedia data, a commodity selling point copy library for at least one text message matching the target commodity.
Further, the multimedia data includes attribute information of the target commodity, and the attribute information includes a commodity title, a commodity category, and commodity attribute data. Correspondingly, when querying the commodity selling point copy library for at least one text message matching the target commodity based on the multimedia data, the first processing module 22 is configured to perform at least one of the following:
recalling at least one first selling point copy matching the attribute information from the commodity selling point library by using a deep semantic matching model;
recalling at least one second selling point copy corresponding to the commodity category of the target commodity from the commodity selling point library;
and determining at least one text message related to selling points of the target commodity according to the at least one first selling point copy and the at least one second selling point copy.
Further, the multimedia data includes at least one of: a main picture of the target commodity, first commodity detail information of the target commodity, first audio information related to the target commodity, first video information related to the target commodity, second commodity detail information of a same-style commodity of the target commodity, second audio information of the same-style commodity, and second video information of the same-style commodity; the commodity detail information comprises detail pictures and/or detail texts. Correspondingly, when the first processing module 22 performs text processing on the multimedia data to obtain at least one text message, it is configured to perform at least one of the following:
identifying the main picture by using a picture description technology to generate text information describing the main picture;
performing voice recognition on the first audio information and/or the second audio information, and generating text information related to the target commodity based on a voice recognition result;
performing character recognition on the detail pictures in the first commodity detail information and/or the second commodity detail information, and generating text information according to a character recognition result;
obtaining at least one text message based on the detail text in the first commodity detail information and/or the second commodity detail information;
and extracting key frames from the first video information and/or the second video information, carrying out image recognition on the key frames, and generating text information related to the target commodity according to an image recognition result.
Further, when the first obtaining module 21 obtains the multimedia data associated with the target product from a plurality of data sources, it is specifically configured to:
acquiring a main picture, commodity detail information and/or attribute information of the target commodity from a commodity information base;
acquiring first video information and/or first audio information related to the target commodity from an audio and video information base;
and acquiring multimedia data of the target commodity and multimedia data of a same-style commodity of the target commodity from the Internet.
Further, the first obtaining module 21 may be further configured to: determine the plurality of categories based on the attribute information of the target commodity; or acquire the plurality of categories according to information preconfigured for the commodity copy.
Further, when determining the corresponding category for the text information based on the multiple categories corresponding to the commodity copy, the first classification module 23 is specifically configured to:
obtaining a classification model;
and inputting the categories and the text information into the classification model, and executing the classification model to obtain the categories corresponding to the text information.
Here, it should be noted that: the data processing apparatus provided in this embodiment may implement the technical solutions described in the foregoing data processing method embodiments, and the specific implementation principles of the modules or units may refer to the corresponding contents in the foregoing method embodiments, which are not described herein again.
Fig. 10 shows a block diagram of a text display device according to an embodiment of the present application. As shown in fig. 10, the text display device includes: a sending module 31, a receiving module 32 and a first display module 33. The sending module 31 is configured to send a file application request to the server in response to an operation of the first user on the target commodity. The receiving module 32 is configured to receive the structured first description text of the target commodity fed back by the server. The first display module 33 is configured to display the first description text.
The first description text is obtained by text-editing a plurality of text messages category by category; each of the text messages is assigned in advance to one of a plurality of categories corresponding to the commodity copy; and the plurality of text messages are determined from multimedia data associated with the target commodity and obtained from a plurality of data sources.
Further, the first display module 33 is further configured to display an application interface, and the application interface is provided with a text input box and an application control. The sending module is further configured to:
responding to the operation of the first user on the application control, and sending a file application request to a server; or
And responding to the operation of the first user on the application control after the first user inputs characters in the character input box, acquiring character information input by the first user, and carrying the character information in the file application request, so that the server takes the character information as a data source when determining the first description text of the target commodity.
Further, the text display device provided by the embodiment of the application can further include an adjusting module. The adjusting module is used for responding to the modification operation of the first user on the first description text, and adjusting the description text based on the modification operation to obtain a second description text. Correspondingly, the sending module is further configured to: and responding to a confirmation instruction of the first user for the second description text, and sending the second description text to the server so that the server replaces the first description text with the second description text.
Here, it should be noted that: the text display device provided in this embodiment may implement the technical solutions described in the foregoing text display method embodiments, and the specific implementation principles of the modules or units may refer to the corresponding contents in the foregoing method embodiments, which are not described herein again.
Fig. 11 shows a block diagram of a text display device according to another embodiment of the present application. As shown in the drawing, the text display device includes: a second obtaining module 41, a second processing module 42, a second classifying module 43, a second text generating module 44, and a second display module 45. The second obtaining module 41 is configured to, in response to an instruction of a user for a target commodity, obtain multimedia data associated with the target commodity from a plurality of data sources. The second processing module 42 is configured to process the multimedia data to obtain a plurality of text messages. The second classifying module 43 is configured to determine, based on a plurality of categories corresponding to the commodity copy, corresponding categories for the plurality of text messages, respectively. The second text generating module 44 is configured to perform text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity. The second display module 45 is configured to display the description text on a commodity page of the target commodity.
Here, it should be noted that: the text display device provided in this embodiment may implement the technical solutions described in the foregoing text display method embodiments, and the specific implementation principles of the modules or units may refer to the corresponding contents in the foregoing method embodiments, which are not described herein again.
FIG. 12 is a schematic structural diagram of a computing device provided by an embodiment of the application. Specifically, the computing device includes a memory 51 and a processor 52. The memory 51 is configured to store one or more computer instructions; the processor 52 is coupled to the memory 51 and is configured to execute the one or more computer instructions (for example, computer instructions implementing data storage logic) to implement the steps of the data processing method provided in the embodiments of the present application, or the steps of the text display method.
The memory 51 may be implemented by any type or combination of volatile and non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Further, as shown in FIG. 12, the computing device may also include other components, such as a communication component 53, a display 54, a power component 55, and an audio component 56. FIG. 12 only schematically shows some of the components that a computing device may contain, and does not imply that a computing device includes only the components shown in FIG. 12.
Yet another embodiment of the present application provides a computer program product (not shown in any figure of the drawings). The computer program product comprises computer programs or instructions which, when executed by a processor, cause the processor to carry out the steps in the above-described method embodiments.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the method steps or functions provided by the foregoing embodiments when executed by a computer.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (14)

1. A data processing method, adapted to a server, the method comprising:
acquiring multimedia data related to a target commodity from a plurality of data sources;
processing the multimedia data to obtain a plurality of text messages;
respectively determining corresponding categories for the plurality of text messages based on a plurality of categories corresponding to a commodity copy;
and performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity.
2. The method of claim 1, wherein processing the multimedia data to obtain a plurality of text messages comprises:
performing textual processing on the multimedia data to obtain at least one text message;
and querying, based on the multimedia data, a commodity selling point copy library for at least one text message matching the target commodity.
3. The method of claim 2, wherein the multimedia data comprises attribute information of the target commodity, the attribute information comprising a commodity title, a commodity category, and commodity attribute data;
and
querying, based on the multimedia data, the commodity selling point copy library for at least one text message matching the target commodity comprises at least one of the following:
recalling at least one first selling point copy matching the attribute information from the commodity selling point library by using a deep semantic matching model;
recalling at least one second selling point copy corresponding to the commodity category of the target commodity from the commodity selling point library;
and determining at least one text message related to selling points of the target commodity according to the at least one first selling point copy and the at least one second selling point copy.
4. The method of claim 2, wherein the multimedia data comprises at least one of: a main picture of the target commodity, first commodity detail information of the target commodity, first audio information related to the target commodity, first video information related to the target commodity, second commodity detail information of a same-style commodity of the target commodity, second audio information of the same-style commodity, and second video information of the same-style commodity;
the commodity detail information comprises detail pictures and/or detail texts;
and
performing textual processing on the multimedia data to obtain at least one text message comprises at least one of the following:
identifying the main picture by using a picture description technology to generate text information describing the main picture;
performing voice recognition on the first audio information and/or the second audio information, and generating text information related to the target commodity based on a voice recognition result;
performing character recognition on the detail pictures in the first commodity detail information and/or the second commodity detail information, and generating text information according to a character recognition result;
obtaining at least one text message based on the detail text in the first commodity detail information and/or the second commodity detail information;
and extracting key frames from the first video information and/or the second video information, carrying out image recognition on the key frames, and generating text information related to the target commodity according to an image recognition result.
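The per-modality processing of claim 4 amounts to dispatching each piece of multimedia data to a matching converter. The sketch below shows one way to arrange that dispatch. The `MediaItem` type, the `kind` labels, and all converter functions are hypothetical placeholders: a real system would wrap an image captioning model, a speech recognizer, an OCR engine, and a key-frame image recognizer, none of which the claim names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MediaItem:
    kind: str     # "main_picture", "audio", "detail_picture", "detail_text" or "video"
    payload: str  # placeholder for a file path or raw text


# Hypothetical converters that return canned strings instead of calling real models.
def caption_image(item: MediaItem) -> List[str]:
    return [f"caption of {item.payload}"]


def transcribe_audio(item: MediaItem) -> List[str]:
    return [f"transcript of {item.payload}"]


def read_characters(item: MediaItem) -> List[str]:
    return [f"characters read from {item.payload}"]


def split_detail_text(item: MediaItem) -> List[str]:
    # Detail text is already text, so it is only split into sentences.
    return [s.strip() for s in item.payload.split(".") if s.strip()]


def describe_key_frames(item: MediaItem) -> List[str]:
    return [f"key-frame description of {item.payload}"]


CONVERTERS: Dict[str, Callable[[MediaItem], List[str]]] = {
    "main_picture": caption_image,
    "audio": transcribe_audio,
    "detail_picture": read_characters,
    "detail_text": split_detail_text,
    "video": describe_key_frames,
}


def textualize(items: List[MediaItem]) -> List[str]:
    """Turn mixed multimedia data into a flat list of text messages."""
    texts: List[str] = []
    for item in items:
        converter = CONVERTERS.get(item.kind)
        if converter is None:
            continue  # unknown modality: skip it rather than guess
        texts.extend(converter(item))
    return texts
```

Keeping the converters in a lookup table means a new modality can be supported by registering one more function, without touching `textualize`.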
5. The method of any one of claims 1 to 4, wherein obtaining multimedia data associated with a target commodity from a plurality of data sources comprises:
acquiring a main picture, commodity detail information and/or attribute information of the target commodity from a commodity information base;
acquiring first video information and/or first audio information related to the target commodity from an audio and video information base;
and acquiring multimedia data of the target commodity and multimedia data of a same-style commodity of the target commodity from the Internet.
6. The method of any of claims 1 to 4, further comprising:
determining the plurality of categories based on the attribute information of the target commodity; or
acquiring the plurality of categories from pre-configured commodity file information.
7. The method of claim 6, wherein determining respective categories for the plurality of text messages based on a plurality of categories corresponding to the commodity file comprises:
obtaining a classification model;
and inputting the plurality of categories and the plurality of text messages into the classification model, and executing the classification model to obtain the category corresponding to each text message.
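Claim 7 feeds both the categories and the text messages into a classification model. As a rough illustration of that interface only, the sketch below uses a keyword-overlap scorer in place of a trained model. The class name, the seed keywords per category, and the `other` fallback are all assumptions that do not come from the claims.

```python
import re
from typing import Dict, List


def tokens(text: str) -> set:
    """Lowercased word set of a text message."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class KeywordClassifier:
    """Stand-in for a trained classification model."""

    def predict(self, categories: Dict[str, List[str]], texts: List[str],
                fallback: str = "other") -> Dict[str, str]:
        """Map each text message to the category whose keywords overlap it most."""
        result = {}
        for text in texts:
            words = tokens(text)
            best, best_score = fallback, 0
            for name, keywords in categories.items():
                score = len(words & set(keywords))
                if score > best_score:
                    best, best_score = name, score
            result[text] = best
        return result
```

A call might look like this, with made-up categories and text messages:

```python
categories = {
    "material": ["cotton", "nylon", "leather"],
    "usage": ["travel", "office", "outdoor"],
}
labels = KeywordClassifier().predict(
    categories, ["Soft cotton lining", "Great for outdoor travel"]
)
```

Passing the categories at prediction time, rather than fixing them at training time, mirrors the claim language in which the categories are an input to the model.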
8. A text display method, adapted for a first client, the method comprising:
responding to the operation of a first user for a target commodity, and sending a file application request to a server;
receiving a structured first description text of the target commodity fed back by the server;
displaying the first descriptive text;
wherein the first description text is obtained by performing text editing on a plurality of text messages according to categories, each of the plurality of text messages having been assigned a category in advance based on a plurality of categories corresponding to the commodity file; and the plurality of text messages are determined from multimedia data associated with the target commodity in a plurality of data sources.
9. The method of claim 8, further comprising:
displaying an application interface, wherein a text input box and an application control are arranged on the application interface;
responding to the operation of the first user on the application control, and sending a file application request to a server; or
responding to the operation of the first user on the application control after the first user inputs characters in the text input box, acquiring the character information input by the first user, and carrying the character information in the file application request, so that the server takes the character information as a data source when determining the first description text of the target commodity.
10. The method of claim 8, further comprising:
responding to the modification operation of the first user on the first description text, and adjusting the first description text based on the modification operation to obtain a second description text;
and responding to a confirmation instruction of the first user for the second description text, and sending the second description text to the server so that the server replaces the first description text with the second description text.
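The request, display, modify, and confirm flow of claims 8 to 10 can be sketched with two in-memory objects. This is an assumption-heavy illustration: the class names, method names, and the `fake_generate` function are hypothetical, there is no network layer, and the generator only imitates the server-side pipeline.

```python
from typing import Callable, Dict, Optional


class Server:
    """In-memory stand-in for the server side."""

    def __init__(self, generate: Callable[[str, Optional[str]], str]):
        self._generate = generate
        self._texts: Dict[str, str] = {}

    def handle_application(self, commodity_id: str, typed: Optional[str] = None) -> str:
        # Characters typed by the user, if any, act as one more data source.
        text = self._generate(commodity_id, typed)
        self._texts[commodity_id] = text
        return text

    def replace_text(self, commodity_id: str, second_text: str) -> None:
        self._texts[commodity_id] = second_text

    def current_text(self, commodity_id: str) -> str:
        return self._texts[commodity_id]


class FirstClient:
    """Stand-in for the first client (for example, a seller-side application)."""

    def __init__(self, server: Server):
        self.server = server
        self.commodity_id: Optional[str] = None
        self.displayed: Optional[str] = None

    def apply(self, commodity_id: str, typed: Optional[str] = None) -> str:
        """Send the application request and display the first description text."""
        self.commodity_id = commodity_id
        self.displayed = self.server.handle_application(commodity_id, typed)
        return self.displayed

    def modify(self, second_text: str) -> None:
        # Local edit only; nothing is sent until the user confirms.
        self.displayed = second_text

    def confirm(self) -> None:
        """Send the second description text so the server replaces the first."""
        self.server.replace_text(self.commodity_id, self.displayed)


def fake_generate(commodity_id: str, typed: Optional[str]) -> str:
    """Hypothetical generator standing in for the server-side pipeline."""
    base = f"Description of {commodity_id}"
    return f"{base}\nSeller note: {typed}" if typed else base
```

The server keeps the first description text until `confirm` is called, which matches the claim wording that replacement happens in response to a confirmation instruction.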
11. A text display method, adapted for a first client, the method comprising:
responding to an instruction of a user for a target commodity, and acquiring multimedia data associated with the target commodity from a plurality of data sources;
processing the multimedia data to obtain a plurality of text messages;
respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file;
performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity;
and displaying the description text on a commodity page of the target commodity.
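The step of performing text editing according to categories can be pictured as grouping the classified text messages and rendering them under category headings. The following is a minimal sketch under stated assumptions: the function name, the `(category, text)` pair format, and the plain-text layout with `-` bullets are all illustrative choices, since the claims do not fix an output format.

```python
from typing import List, Tuple


def build_description(classified: List[Tuple[str, str]], category_order: List[str]) -> str:
    """Group (category, text) pairs and render a structured description text."""
    grouped = {name: [] for name in category_order}
    for category, text in classified:
        # Ignore categories outside the configured set and repeated messages.
        if category in grouped and text not in grouped[category]:
            grouped[category].append(text)

    lines = []
    for name in category_order:
        if grouped[name]:  # omit categories that received no text
            lines.append(f"{name}:")
            lines.extend(f"- {text}" for text in grouped[name])
    return "\n".join(lines)
```

Rendering to plain text rather than an image fits the benefits stated in the abstract: such text can be crawled by a search engine and translated by the built-in translation function of an application or browser.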
12. A data processing system, comprising:
the server is used for acquiring multimedia data related to the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to obtain a structured description text of the target commodity; adding the structured description text into the commodity page information of the target commodity;
the second client is used for responding to the operation of a second user on the target commodity and acquiring the commodity page information from the server; and displaying the commodity page of the target commodity based on the commodity page information.
13. A data processing system, comprising:
the first client is used for responding to the operation of a first user for the target commodity and sending a file application request to the server;
the server is used for responding to the file application request and acquiring multimedia data related to the target commodity from a plurality of data sources; processing the multimedia data to obtain a plurality of text messages; respectively determining corresponding categories for the text information based on a plurality of categories corresponding to the commodity file; performing text editing on the plurality of text messages according to categories to generate a structured first description text of the target commodity;
the first client is further used for receiving a first description text of the target commodity fed back by the server; and displaying the first descriptive text.
14. A computing device, comprising: a memory and a processor, wherein,
the memory is configured to store one or more computer instructions;
the processor is coupled with the memory and configured to execute the one or more computer instructions to implement the steps in the method of any of claims 1 to 7, or the steps in the method of any of claims 8 to 10, or the steps in the method of claim 11.
CN202211460790.2A 2022-11-21 2022-11-21 Data processing method, text display method, data processing system and equipment Pending CN115828862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211460790.2A CN115828862A (en) 2022-11-21 2022-11-21 Data processing method, text display method, data processing system and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211460790.2A CN115828862A (en) 2022-11-21 2022-11-21 Data processing method, text display method, data processing system and equipment

Publications (1)

Publication Number Publication Date
CN115828862A true CN115828862A (en) 2023-03-21

Family

ID=85529908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211460790.2A Pending CN115828862A (en) 2022-11-21 2022-11-21 Data processing method, text display method, data processing system and equipment

Country Status (1)

Country Link
CN (1) CN115828862A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117593069A (en) * 2023-07-19 2024-02-23 行吟信息科技(上海)有限公司 Information generation method, device, computer equipment and medium

Similar Documents

Publication Publication Date Title
CN106874467B (en) Method and apparatus for providing search results
US9613268B2 (en) Processing of images during assessment of suitability of books for conversion to audio format
CN107491477B (en) Emotion symbol searching method and device
US11651015B2 (en) Method and apparatus for presenting information
US20160179966A1 (en) Method and system for generating augmented product specifications
CN115982376B (en) Method and device for training model based on text, multimode data and knowledge
CN111240669B (en) Interface generation method and device, electronic equipment and computer storage medium
CN109033282A (en) A kind of Web page text extracting method and device based on extraction template
CN109508448A (en) Short information method, medium, device are generated based on long article and calculate equipment
CN110633398A (en) Method for confirming central word, searching method, device and storage medium
CN110909768B (en) Method and device for acquiring marked data
CN107632974B (en) Chinese analysis platform suitable for multiple fields
CN115828862A (en) Data processing method, text display method, data processing system and equipment
CN114218907A (en) Presentation generation method and device, electronic equipment and storage medium
CN117520343A (en) Information extraction method, server and storage medium
CN117420998A (en) Client UI interaction component generation method, device, terminal and medium
CN111368034A (en) Bidirectional semantic feature matching method and supply content recommendation device
CN113127597A (en) Processing method and device for search information and electronic equipment
CN112989020B (en) Information processing method, apparatus, and computer-readable storage medium
CN114818639A (en) Presentation generation method, device, equipment and storage medium
CN111723177B (en) Modeling method and device of information extraction model and electronic equipment
CN114595191A (en) Webpage processing method and device, electronic equipment and storage medium
US20210073335A1 (en) Methods and systems for semantic analysis of table content
CN112241463A (en) Search method based on fusion of text semantics and picture information
CN110826313A (en) Information extraction method, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240226

Address after: Room 303, 3rd Floor, Building 5, No. 699 Wangshang Road, Changhe Street, Binjiang District, Hangzhou City, Zhejiang Province, 310052

Applicant after: Hangzhou Alibaba Overseas Internet Industry Co.,Ltd.

Country or region after: China

Address before: Room 554, 5/F, Building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant before: Alibaba (China) Co.,Ltd.

Country or region before: China