CN115334333A - Live video processing method and device, live server and storage medium - Google Patents
Live video processing method and device, live server and storage medium
- Publication number: CN115334333A
- Application number: CN202210836239.7A
- Authority: CN (China)
- Prior art keywords: live, video, audio, live broadcast, picture
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H — Electricity; H04 — Electric communication technique; H04N — Pictorial communication, e.g. television; H04N 21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]; H04N 21/20 — Servers specifically adapted for the distribution of content, e.g. VOD servers; operations thereof
  - H04N 21/233 — Processing of audio elementary streams
  - H04N 21/2187 — Live feed
  - H04N 21/23418 — Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
  - H04N 21/266 — Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
Abstract
The application relates to a live video processing method and device, a computer device, and a storage medium. The method comprises the following steps: acquiring a live video sent by an anchor client and processing the live video to obtain live audio segments and live picture frames; performing feature recognition on each live audio segment based on a first feature database and determining a first count, where the first feature database is used for storing risk audio features and the first count is the number of times risk audio features occur in the live audio segments; performing feature recognition on each live picture frame based on a second feature database and determining a second count, where the second feature database is used for storing risk picture features and the second count is the number of times risk picture features occur in the live picture frames; and, if the first count or the second count is greater than a first preset count threshold, deleting the live video and sending warning information to the anchor client. By adopting this method, the distribution of inappropriate video can be avoided.
Description
Technical Field
The present application relates to the field of data communication technologies, and in particular, to a live video processing method and apparatus, a live server, a live system, and a storage medium.
Background
A server is a device that provides computing services. Since a server must respond to and process service requests, it generally needs to be able to undertake and guarantee those services. A server consists of a processor, a hard disk, memory, a system bus, and so on, similar to a general-purpose computer architecture, but because it must provide highly reliable services it has higher requirements for processing capability, stability, reliability, security, scalability, and manageability. In a network environment, servers are divided, according to the type of service they provide, into live servers, file servers, database servers, application servers, WEB servers, and so on.
However, an existing live server only decodes and forwards an anchor's live stream; it cannot process the inappropriate pictures and inappropriate audio that may appear during a live broadcast, so inappropriate video can flow out and degrade the overall environment of the live platform.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a live video processing method and apparatus, a live server, a live system, and a storage medium that can prevent inappropriate video from flowing out and degrading the overall environment of a live platform.
In a first aspect, a method for processing a live video is provided, where the method includes:
acquiring a live video sent by an anchor client, and processing the live video to obtain live audio segments and live picture frames;
performing feature recognition on each live audio segment based on a first feature database, and determining a first count; the first feature database is used for storing risk audio features; the first count is the number of times the risk audio features occur in the live audio segments;
performing feature recognition on each live picture frame based on a second feature database, and determining a second count; the second feature database is used for storing risk picture features; the second count is the number of times the risk picture features occur in the live picture frames;
and if the first count or the second count is greater than a first preset count threshold, deleting the live video and sending warning information to the anchor client.
In one embodiment, the method further includes: if the first count and the second count are both less than or equal to the first preset count threshold, replacing target live audio in the live audio segments with preset feature audio, replacing target live pictures in the live picture frames with preset feature pictures, and generating a sanitized video, where the target live audio is live audio in which the risk audio features occur and the target live picture is a live picture in which the risk picture features occur; and sending the sanitized video to a viewer client.
In one embodiment, the step of generating the sanitized video includes: generating first original video information; and encoding the first original video information to obtain the sanitized video.
In one embodiment, the step of processing the live video to obtain the live audio segments and the live picture frames includes: decoding the live video to obtain second original video information; and extracting the second original video information to obtain the live audio segments and the live picture frames.
In one embodiment, the method further includes: storing the first count and the second count; summing the first counts and the second counts within a preset time to obtain a cumulative count; if the cumulative count is less than or equal to a second preset count threshold, determining the anchor account corresponding to the anchor client as an automatic suspension account; and if the cumulative count is greater than the second preset count threshold, determining the anchor account corresponding to the anchor client as a permanently closed account.
In one embodiment, the method further includes: acquiring text information sent by a viewer client; performing feature recognition on the text information based on a third feature database and judging whether the text information is target text information, where the third feature database is used for storing sensitive vocabulary features and the target text information is text information in which the sensitive vocabulary features occur; if so, blocking the text information; and if not, sending the text information to the viewer client and the anchor client.
In a second aspect, an apparatus for processing a live video is provided, where the apparatus includes a live video acquiring module, an audio feature recognition module, a video feature recognition module, and a warning information sending module.
The live video acquiring module is configured to acquire a live video sent by an anchor client and process the live video to obtain live audio segments and live picture frames; the audio feature recognition module is configured to perform feature recognition on each live audio segment based on a first feature database and determine a first count, where the first feature database is used for storing risk audio features and the first count is the number of times the risk audio features occur in the live audio segments; the video feature recognition module is configured to perform feature recognition on each live picture frame based on a second feature database and determine a second count, where the second feature database is used for storing risk picture features and the second count is the number of times the risk picture features occur in the live picture frames; and the warning information sending module is configured to delete the live video and send warning information to the anchor client if the first count or the second count is greater than a first preset count threshold.
In a third aspect, a live server is provided, where the live server includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of any one of the above method embodiments when executing the computer program.
In a fourth aspect, a live system is provided, including an anchor client and the live server of any one of the above live server embodiments; the anchor client is communicatively connected with the live server and is configured to capture a live video and send the live video to the live server.
In a fifth aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the above-described method embodiments.
According to the live video processing method and apparatus, the live server, the live system, and the storage medium, the live video sent by the anchor client is acquired and processed to obtain live audio segments and live picture frames; feature recognition is then performed on each live audio segment based on the first feature database to determine the first count, and on each live picture frame based on the second feature database to determine the second count; and, if the first count or the second count is greater than the first preset count threshold, the live video is deleted and warning information is sent to the anchor client. This prevents inappropriate video from flowing out and degrading the overall environment of the live platform, cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
Drawings
FIG. 1 is a first flowchart of a live video processing method in one embodiment;
FIG. 2 is a flowchart of the steps of processing the live video and obtaining the live audio segments and the live picture frames in one embodiment;
FIG. 3 is a second flowchart of a live video processing method in another embodiment;
FIG. 4 is a flowchart of the steps of generating the sanitized video in one embodiment;
FIG. 5 is a third flowchart of a live video processing method in another embodiment;
FIG. 6 is a fourth flowchart of a live video processing method in another embodiment;
FIG. 7 is a block diagram of a live video processing apparatus in one embodiment;
FIG. 8 is an internal block diagram of a live server in one embodiment;
FIG. 9 is a schematic structural diagram of a live system in one embodiment.
Detailed Description
To facilitate an understanding of the present application, the present application will now be described more fully with reference to the accompanying drawings. Embodiments of the present application are given in the accompanying drawings. This application may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
It will be understood that, as used herein, the terms "first," "second," and the like may be used to describe various elements, but these elements are not limited by these terms; the terms are only used to distinguish one element from another. For example, a first count may be referred to as a second count, and similarly, a second count may be referred to as a first count, without departing from the scope of the present application. The first count and the second count are both counts, but they are not the same count.
It is to be understood that "connection" in the following embodiments is to be understood as "electrical connection", "communication connection", and the like if the connected circuits, modules, units, and the like have communication of electrical signals or data with each other.
As used herein, the singular forms "a", "an" and "the" may include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises/comprising," "includes" or "including," etc., specify the presence of stated features, integers, steps, operations, components, parts, or combinations thereof, but do not preclude the presence or addition of one or more other features, integers, steps, operations, components, parts, or combinations thereof.
In a first aspect, as shown in fig. 1, the present application provides a method for processing a live video. This embodiment is illustrated by applying the method to a live server; it can be understood that the method can also be applied to a terminal, or to a system including the terminal and the live server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps 101 to 104.
Step 101, acquiring a live video sent by the anchor client, processing the live video, and obtaining live audio segments and live picture frames.
The anchor client is communicatively connected with the live server and is configured to capture a live video and send it to the live server. The anchor client may be, but is not limited to, a terminal, and the terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. The live server acquires the live video sent by the anchor client and processes it, thereby obtaining the live audio segments and the live picture frames.
In one embodiment, as shown in fig. 2, the step of processing the live video and obtaining the live audio segments and the live picture frames includes step 201 and step 202.
The live server may decode the live video, the decoded information being the second original video information; the second original video information is then extracted to obtain the corresponding live audio and live pictures.
It can be understood that the step of the anchor client capturing the live video includes: collecting original audio data and original video data; integrating the original audio data and the original video data to obtain live original video information; and encoding the live original video information to obtain the live video. The anchor client may send the live video obtained through the above steps to the live server.
In this embodiment, the second original video information is obtained by decoding the live video, and the live audio segments and the live picture frames are then obtained by extracting the second original video information, which makes the live video more convenient to process.
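As an illustration only, the decoding and extraction of steps 201 and 202 might look like the following Python sketch; the PyAV library and the fixed 5-second audio segmentation are assumptions made for this example, since the description does not prescribe a particular decoder or segment length.

```python
# Illustrative sketch of steps 201-202: decoding the live video into the second
# original video information and extracting live audio segments and live picture
# frames. PyAV and the segment length are illustrative assumptions.
import av  # PyAV, Python bindings for FFmpeg


def decode_and_extract(live_video_path, segment_seconds=5.0):
    container = av.open(live_video_path)      # step 201: decode the live video
    picture_frames = []                       # each frame of the live picture
    audio_segments, current = [], []          # each segment of live audio

    for frame in container.decode():          # step 202: extraction processing
        if isinstance(frame, av.VideoFrame):
            picture_frames.append(frame.to_ndarray(format="rgb24"))
        elif isinstance(frame, av.AudioFrame):
            current.append(frame.to_ndarray())
            # close a segment once it spans the (assumed) segment length
            if frame.time is not None and frame.time >= segment_seconds * (len(audio_segments) + 1):
                audio_segments.append(current)
                current = []
    if current:
        audio_segments.append(current)
    return audio_segments, picture_frames
```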
Step 102, performing feature recognition on each live audio segment based on the first feature database, and determining a first count.
The first feature database is used for storing risk audio features. It is to be understood that the first feature database stores at least one risk audio feature, and the first feature database may be, but is not limited to being, preset in the live server. A risk audio feature is an audio feature that reflects non-compliant or illegal content, for example an audio feature related to violence, fraud, or the spreading of pornography. In addition, the first count is the number of times the risk audio features occur in the live audio segments.
It can be understood that the live server may perform feature recognition on each live audio segment based on the first feature database, so as to identify the risk audio features stored in the first feature database that occur in the live audio segments; the number of occurrences of risk audio features in the live audio segments, that is, the first count, can then be determined statistically.
In a specific example, the specific manner of performing feature recognition on the live audio segments is not limited here; the feature recognition may be performed through an audio feature recognition algorithm, or through a pre-trained audio feature recognition neural network model. The above is only a specific example, and in practical applications the manner is set flexibly according to user requirements and is not limited here.
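A minimal sketch of step 102 is shown below, assuming each risk audio feature and each live audio segment can be reduced to a fixed-length embedding vector; the embed callable and the 0.9 cosine-similarity threshold are hypothetical, since the description deliberately leaves the recognition algorithm open.

```python
# Sketch of step 102: counting how often risk audio features from the first
# feature database occur in the live audio segments. embed() and the threshold
# are placeholders, not part of the described method.
import numpy as np


def determine_first_count(audio_segments, first_feature_database, embed, threshold=0.9):
    """Return the first count: occurrences of risk audio features in all segments."""
    first_count = 0
    for segment in audio_segments:
        vector = embed(segment)                              # hypothetical audio embedding
        for risk_feature in first_feature_database:          # stored risk audio features
            cos = float(np.dot(vector, risk_feature) /
                        (np.linalg.norm(vector) * np.linalg.norm(risk_feature) + 1e-12))
            if cos >= threshold:
                first_count += 1                             # one occurrence of a risk feature
    return first_count
```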
Step 103, performing feature recognition on each live picture frame based on the second feature database, and determining a second count.
The second feature database is used for storing risk picture features. It is to be understood that the second feature database stores at least one risk picture feature, and the second feature database may be, but is not limited to being, preset in the live server. A risk picture feature is a picture feature that reflects non-compliant or illegal content, for example a picture feature related to violence, fraud, or the spreading of pornography. In addition, the second count is the number of times the risk picture features occur in the live picture frames.
It can be understood that the live server may perform feature recognition on each live picture frame based on the second feature database, so as to identify the risk picture features stored in the second feature database that occur in the live picture frames; the number of occurrences of risk picture features in the live picture frames, that is, the second count, can then be determined statistically.
In a specific example, the specific manner of performing feature recognition on the live picture frames is not limited here; the feature recognition may be performed through a picture feature recognition algorithm, or through a pre-trained picture feature recognition neural network model. The above is only a specific example, and in practical applications the manner is set flexibly according to user requirements and is not limited here.
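Likewise, a minimal sketch of step 103 is given below, assuming the second feature database holds perceptual hashes of known risk pictures; the imagehash and Pillow libraries and the Hamming-distance threshold of 5 are illustrative choices rather than part of the described method, which equally allows a trained neural network model.

```python
# Sketch of step 103: counting occurrences of risk picture features in the live
# picture frames using perceptual hashing (an illustrative recognition technique).
import imagehash
from PIL import Image


def determine_second_count(picture_frames, second_feature_database, max_distance=5):
    """Return the second count: occurrences of risk picture features in all frames."""
    second_count = 0
    for frame in picture_frames:                           # frame: RGB ndarray from extraction
        frame_hash = imagehash.phash(Image.fromarray(frame))
        for risk_hash in second_feature_database:          # stored risk picture hashes
            if frame_hash - risk_hash <= max_distance:     # small distance = feature occurs
                second_count += 1
    return second_count
```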
Step 104, if the first count or the second count is greater than a first preset count threshold, deleting the live video and sending warning information to the anchor client.
After the live server determines the first count and the second count, a first count or a second count greater than the first preset count threshold indicates that the overall risk of the live video is high; the live video can therefore be deleted, and warning information is sent to the anchor client, thereby promptly reminding the anchor user corresponding to the anchor client to regulate his or her own live broadcast behavior.
In a specific example, the first preset count threshold may be, but is not limited to, any preset positive integer. This is only a specific example, and in practical applications the threshold is set flexibly according to user requirements and is not limited here.
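For illustration, the step 104 decision might be sketched as follows, with the first preset count threshold set to 3 purely as an example; delete_video and send_warning are placeholder callables standing in for the live server's own storage and messaging operations, not an existing API.

```python
# Sketch of the step 104 decision: delete the live video and warn the anchor
# client when either count exceeds the first preset count threshold.
def handle_risk_counts(first_count, second_count, live_video, anchor_client,
                       delete_video, send_warning, first_threshold=3):
    if first_count > first_threshold or second_count > first_threshold:
        delete_video(live_video)          # the overall risk of the live video is high
        send_warning(anchor_client, "Warning: please regulate your live broadcast behavior.")
        return True                       # the live video was deleted
    return False                          # low risk: continue to the sanitization of steps 105-106
```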
Based on the above method, the live video sent by the anchor client is acquired and processed to obtain live audio segments and live picture frames; feature recognition is then performed on each live audio segment based on the first feature database to determine the first count, and on each live picture frame based on the second feature database to determine the second count; and, if the first count or the second count is greater than the first preset count threshold, the live video is deleted and warning information is sent to the anchor client. This prevents inappropriate video from flowing out and degrading the overall environment of the live platform, cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
In one embodiment, as shown in fig. 3, the method further includes steps 105 and 106.
Step 105, if the first count and the second count are both less than or equal to the first preset count threshold, replacing target live audio in the live audio segments with preset feature audio, replacing target live pictures in the live picture frames with preset feature pictures, and generating the sanitized video.
The target live audio is live audio in which the risk audio features occur, the target live picture is a live picture in which the risk picture features occur, and the sanitized video is sent to the viewer client. The live server may perform feature recognition on each live audio segment based on the first feature database so as to identify the risk audio features stored in the first feature database that occur in the live audio segments, and can therefore determine the target live audio in the live audio segments. Likewise, the live server may perform feature recognition on each live picture frame based on the second feature database so as to identify the risk picture features stored in the second feature database that occur in the live picture frames, and can therefore determine the target live pictures in the live picture frames.
It can be understood that, when the first count and the second count are both less than or equal to the first preset count threshold, that is, when the overall risk of the live video is low, the live server may directly replace the target live audio in the live audio segments with the preset feature audio and replace the target live pictures in the live picture frames with the preset feature pictures, thereby generating the sanitized video.
In one specific example, the preset feature audio may be, but is not limited to, a beep tone or blank audio, and the preset feature picture may be, but is not limited to, a blank picture or a mosaic picture. This is only a specific example, and in practical applications the settings are made flexibly according to user requirements and are not limited here.
In one embodiment, as shown in FIG. 4, the step of generating the sanitized video includes steps 401 and 402.
When the first count and the second count are both less than or equal to the first preset count threshold, the live server replaces the target live audio in the live audio segments with the preset feature audio and replaces the target live pictures in the live picture frames with the preset feature pictures, thereby generating first original video information; the first original video information is then encoded to obtain the sanitized video.
In a specific example, the live server may keep each generated sanitized video in storage for 36 hours so that it can be reviewed and checked at any time. This is only a specific example, and in practical applications the retention is set flexibly according to user requirements and is not limited here.
In this embodiment, the first original video information is generated and then encoded to obtain the sanitized video, which makes the live video more convenient to process.
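Steps 105, 401, and 402 might be combined as in the following sketch; the is_target_audio, is_target_picture, and encode callables are placeholders for the feature recognition of steps 102-103 and for the server's encoder, and are assumptions made only for illustration.

```python
# Sketch of generating the sanitized video: replace target live audio and target
# live pictures with the preset feature audio/picture (step 105), assemble the
# first original video information (step 401), and encode it (step 402).
def generate_sanitized_video(audio_segments, picture_frames,
                             preset_feature_audio, preset_feature_picture,
                             is_target_audio, is_target_picture, encode):
    clean_audio = [preset_feature_audio if is_target_audio(seg) else seg
                   for seg in audio_segments]
    clean_frames = [preset_feature_picture if is_target_picture(frame) else frame
                    for frame in picture_frames]
    first_original_video_info = (clean_audio, clean_frames)   # step 401
    return encode(first_original_video_info)                  # step 402: the sanitized video
```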
Step 106, sending the sanitized video to the viewer client.
The viewer client is communicatively connected with the live server and is configured to watch the live video and receive the sanitized video sent by the live server. The viewer client may be, but is not limited to, a terminal, and the terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. After the live server generates the sanitized video, it may send the sanitized video to the viewer client, so that the viewer client receives the sanitized video from the live server in a timely manner.
In this embodiment, when the first count and the second count are both less than or equal to the first preset count threshold, the target live audio in the live audio segments is replaced with the preset feature audio and the target live pictures in the live picture frames are replaced with the preset feature pictures to generate the sanitized video, which is then sent to the viewer client. This prevents inappropriate pictures or inappropriate audio from flowing out at the viewer client and degrading the overall environment of the live platform, cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
In one embodiment, as shown in fig. 5, the method further includes steps 107 to 110.
Step 107, storing the first count and the second count.
Step 108, summing the first counts and the second counts within a preset time to obtain a cumulative count.
Step 109, if the cumulative count is less than or equal to a second preset count threshold, determining the anchor account corresponding to the anchor client as an automatic suspension account.
Step 110, if the cumulative count is greater than the second preset count threshold, determining the anchor account corresponding to the anchor client as a permanently closed account.
An automatic suspension account means that the anchor account cannot broadcast live within a preset suspension time, and a permanently closed account means that the anchor account can never broadcast live again. It can be understood that the live server stores the first count and the second count when they are determined; each first count and each second count within the preset time are then summed to obtain the cumulative count. When the cumulative count is less than or equal to the second preset count threshold, the anchor account corresponding to the anchor client presents a relatively high risk, so the anchor account is determined as an automatic suspension account and is restricted from broadcasting live within the preset suspension time; when the cumulative count is greater than the second preset count threshold, the anchor account presents a severe risk, so it is determined as a permanently closed account and is restricted from ever broadcasting live again.
In a specific example, the preset time is 1 hour, the second preset count threshold is 10, and the preset suspension time is 24 hours. The live server determines and stores the first count and the second count, and each first count and each second count within 1 hour are summed to obtain the cumulative count. When the cumulative count is less than or equal to 10, the anchor account corresponding to the anchor client presents a relatively high risk, so it is determined as an automatic suspension account and cannot broadcast live within 24 hours; when the cumulative count is greater than 10, the anchor account presents a severe risk, so it is determined as a permanently closed account and can never broadcast live again.
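The penalty logic of steps 107 to 110, with the example values above (a 1-hour window, a second preset count threshold of 10, and a 24-hour suspension), might look like the following sketch; the in-memory list stands in for the server's data storage, and the rule follows steps 109 and 110 exactly as stated.

```python
# Sketch of steps 107-110: store counts, sum them over a time window, and decide
# between an automatic suspension account and a permanently closed account.
import time


class AnchorAccountPenalty:
    def __init__(self, window_seconds=3600, second_threshold=10, suspension_hours=24):
        self.window_seconds = window_seconds      # preset time (example: 1 hour)
        self.second_threshold = second_threshold  # second preset count threshold
        self.suspension_hours = suspension_hours  # preset suspension time (example: 24 h)
        self.records = []                         # (timestamp, first_count + second_count)

    def store_counts(self, first_count, second_count):             # step 107
        self.records.append((time.time(), first_count + second_count))

    def decide(self):
        now = time.time()
        cumulative = sum(count for stamp, count in self.records    # step 108: cumulative count
                         if now - stamp <= self.window_seconds)
        if cumulative > self.second_threshold:
            return "permanently closed account"                    # step 110
        return "automatic suspension account"                      # step 109
```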
In this embodiment, the first count and the second count are stored; each first count and each second count within the preset time are then summed to obtain the cumulative count; the anchor account corresponding to the anchor client is determined as an automatic suspension account if the cumulative count is less than or equal to the second preset count threshold, and as a permanently closed account if the cumulative count is greater than the second preset count threshold. This improves the efficiency and convenience of penalizing anchor accounts during live video processing, cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
In one embodiment, as shown in fig. 6, the method further includes steps 111 to 114.
Step 111, acquiring text information sent by the viewer client.
Step 112, performing feature recognition on the text information based on the third feature database, and judging whether the text information is target text information.
Step 113, if so, blocking the text information.
Step 114, if not, sending the text information to the viewer client and the anchor client.
The third feature database is used for storing sensitive vocabulary features. A sensitive vocabulary feature is a vocabulary feature that reflects non-compliant or illegal content, for example a vocabulary feature related to violence, fraud, or the spreading of pornography. The target text information is text information in which the sensitive vocabulary features occur.
It can be understood that the live server acquires the text information sent by the viewer client; feature recognition may then be performed on the text information based on the third feature database, so as to identify the sensitive vocabulary features stored in the third feature database that occur in the text information and to judge whether the text information is the target text information. When the text information is the target text information, the text information is blocked; when it is not, the text information is sent to the viewer client and the anchor client.
In a specific example, the specific manner of performing feature recognition on the text information is not limited here; the feature recognition may be performed through a vocabulary feature recognition algorithm, or through a pre-trained vocabulary feature recognition neural network model. The above is only a specific example, and in practical applications the manner is set flexibly according to user requirements and is not limited here.
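For illustration, steps 111 to 114 might be sketched as plain substring matching against the third feature database; this matching rule, along with the send_to_viewer and send_to_anchor placeholder callables, is an assumption, since the description equally allows a vocabulary feature recognition algorithm or a trained neural network model.

```python
# Sketch of steps 111-114: block target text information containing sensitive
# vocabulary features; forward everything else to the viewer and anchor clients.
def process_text_information(text, third_feature_database, send_to_viewer, send_to_anchor):
    is_target = any(word in text for word in third_feature_database)  # sensitive vocabulary?
    if is_target:
        return None                    # step 113: block the target text information
    send_to_viewer(text)               # step 114: forward to the viewer client
    send_to_anchor(text)               # and to the anchor client
    return text
```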
In this embodiment, the text information sent by the viewer client is acquired; feature recognition is then performed on the text information based on the third feature database to judge whether it is the target text information; the text information is blocked when it is the target text information, and sent to the viewer client and the anchor client when it is not. This prevents text information containing sensitive vocabulary features from spreading during the live broadcast and degrading the overall environment of the live platform, further cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
It should be understood that, although the steps in the flowcharts of figs. 1-6 are shown in an order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in figs. 1-6 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In a second aspect, as shown in fig. 7, an apparatus for processing a live video is provided, where the apparatus includes a live video acquiring module 710, an audio feature recognition module 720, a video feature recognition module 730, and a warning information sending module 740.
The live video acquiring module 710 is configured to acquire a live video sent by the anchor client and process the live video to obtain live audio segments and live picture frames.
The audio feature recognition module 720 is configured to perform feature recognition on each live audio segment based on the first feature database and determine the first count; the first feature database is used for storing risk audio features; the first count is the number of times the risk audio features occur in the live audio segments.
The video feature recognition module 730 is configured to perform feature recognition on each live picture frame based on the second feature database and determine the second count; the second feature database is used for storing risk picture features; the second count is the number of times the risk picture features occur in the live picture frames.
The warning information sending module 740 is configured to delete the live video and send warning information to the anchor client if the first count or the second count is greater than the first preset count threshold.
In one embodiment, the apparatus further includes a preset feature replacement module.
The preset feature replacement module is configured to, if the first count and the second count are both less than or equal to the first preset count threshold, replace the target live audio in the live audio segments with the preset feature audio, replace the target live pictures in the live picture frames with the preset feature pictures, and generate the sanitized video; the target live audio is live audio in which the risk audio features occur, the target live picture is a live picture in which the risk picture features occur, and the sanitized video is sent to the viewer client.
In one embodiment, the preset feature replacement module includes a video information generating unit and a video information encoding unit.
The video information generating unit is configured to generate the first original video information, and the video information encoding unit is configured to encode the first original video information to obtain the sanitized video.
In one embodiment, the live video acquiring module 710 includes a video information decoding unit and a video information extraction unit.
The video information decoding unit is configured to decode the live video to obtain the second original video information, and the video information extraction unit is configured to extract the second original video information to obtain the live audio segments and the live picture frames.
In one embodiment, the apparatus further includes a data storage module, a summation calculation module, and an account determination module.
The data storage module is configured to store the first count and the second count; the summation calculation module is configured to sum each first count and each second count within the preset time to obtain the cumulative count; and the account determination module is configured to determine the anchor account corresponding to the anchor client as an automatic suspension account if the cumulative count is less than or equal to the second preset count threshold, and as a permanently closed account if the cumulative count is greater than the second preset count threshold.
In one embodiment, the apparatus further includes a text information acquiring module, a vocabulary feature recognition module, and a text information processing module.
The text information acquiring module is configured to acquire the text information sent by the viewer client; the vocabulary feature recognition module is configured to perform feature recognition on the text information based on the third feature database and judge whether the text information is the target text information, where the third feature database is used for storing sensitive vocabulary features and the target text information is text information in which the sensitive vocabulary features occur; and the text information processing module is configured to block the text information if it is the target text information, and to send the text information to the viewer client and the anchor client if it is not.
For specific limitations of the live video processing apparatus, reference may be made to the above limitations on the live video processing method, and details are not repeated here. The modules in the live video processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded, in hardware form, in or independent of a processor in a computer device, or may be stored, in software form, in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a live server 920 is provided, and its internal structure may be as shown in fig. 8. The live server 920 includes a processor, a memory, a network interface, and databases connected through a system bus. The processor of the live server 920 is configured to provide computing and control capabilities. The memory of the live server 920 includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, the first feature database, the second feature database, and the third feature database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The first feature database of the live server 920 is used to store risk audio features, the second feature database is used to store risk picture features, and the third feature database is used to store sensitive vocabulary features. The network interface of the live server 920 is used to communicate with external terminals through a network connection. The computer program, when executed by the processor, implements a live video processing method.
Those skilled in the art will appreciate that the structure shown in fig. 8 is only a block diagram of part of the structure related to the present application and does not limit the live server 920 to which the present application is applied; a specific live server 920 may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
In a third aspect, a live server 920 is provided, where the live server includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of any one of the above method embodiments when executing the computer program.
In a fourth aspect, as shown in fig. 9, a live system is provided, including an anchor client 910 and the live server 920 of any one of the live server embodiments described above.
The anchor client 910 is communicatively connected with the live server 920 and is configured to capture a live video and send the live video to the live server 920. The anchor client 910 may be, but is not limited to, a terminal, and the terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. It can be understood that at least one anchor client 910 is configured in the live system, and in practical applications the number is set flexibly according to the actual requirements of users and is not limited here.
In this embodiment, the live system acquires the live video sent by the anchor client 910 and processes it to obtain live audio segments and live picture frames; feature recognition is then performed on each live audio segment based on the first feature database to determine the first count, and on each live picture frame based on the second feature database to determine the second count; and, if the first count or the second count is greater than the first preset count threshold, the live video is deleted and warning information is sent to the anchor client 910. This prevents inappropriate video from flowing out and degrading the overall environment of the live platform, cleans up the overall environment of the live platform, protects the physical and mental health of viewer users, and improves the efficiency with which the live platform manages anchors.
In one embodiment, as shown in fig. 9, the live system further includes a viewer client 930.
The viewer client 930 is communicatively connected with the live server 920 and is configured to watch the live video and receive the sanitized video sent by the live server 920. The viewer client 930 may be, but is not limited to, a terminal, and the terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. It can be understood that at least one viewer client 930 is configured in the live system, and in practical applications the number is set flexibly according to the actual requirements of users and is not limited here.
In a fifth aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the above-described method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments may be implemented by instructing the relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium; when executed, the program may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or an external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described, but all such combinations shall be considered within the scope of this specification as long as they are not contradictory.
The above embodiments express only several implementations of the present application, and their descriptions are specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A method for processing a live video, the method comprising:
acquiring a live video sent by an anchor client, and processing the live video to obtain live audio segments and live picture frames;
performing feature recognition on each live audio segment based on a first feature database, and determining a first count; the first feature database is used for storing risk audio features; the first count is the number of times the risk audio features occur in the live audio segments;
performing feature recognition on each live picture frame based on a second feature database, and determining a second count; the second feature database is used for storing risk picture features; the second count is the number of times the risk picture features occur in the live picture frames;
and if the first count or the second count is greater than a first preset count threshold, deleting the live video and sending warning information to the anchor client.
2. The method of claim 1, further comprising:
if the first count and the second count are both less than or equal to the first preset count threshold, replacing target live audio in the live audio segments with preset feature audio, replacing target live pictures in the live picture frames with preset feature pictures, and generating a sanitized video; wherein the target live audio is the live audio in which the risk audio features occur, and the target live picture is the live picture in which the risk picture features occur;
and sending the sanitized video to a viewer client.
3. The method of claim 2, wherein the step of generating the sanitized video comprises:
generating first original video information;
and encoding the first original video information to obtain the sanitized video.
4. The method of claim 1, wherein the step of processing the live video to obtain the live audio segments and the live picture frames comprises:
decoding the live video to obtain second original video information;
and extracting the second original video information to obtain the live audio segments and the live picture frames.
5. The method of claim 1, further comprising:
storing the first count and the second count;
summing each first count and each second count within a preset time to obtain a cumulative count;
if the cumulative count is less than or equal to a second preset count threshold, determining an anchor account corresponding to the anchor client as an automatic suspension account;
and if the cumulative count is greater than the second preset count threshold, determining the anchor account corresponding to the anchor client as a permanently closed account.
6. The method according to any one of claims 1 to 5, further comprising:
acquiring text information sent by a viewer client;
performing feature recognition on the text information based on a third feature database, and judging whether the text information is target text information; the third feature database is used for storing sensitive vocabulary features; the target text information is text information in which the sensitive vocabulary features occur;
if so, blocking the text information;
and if not, sending the text information to the viewer client and the anchor client.
7. An apparatus for processing live video, the apparatus comprising:
a live video acquiring module, configured to acquire a live video sent by an anchor client and process the live video to obtain live audio segments and live picture frames;
an audio feature recognition module, configured to perform feature recognition on each live audio segment based on a first feature database and determine a first count; the first feature database is used for storing risk audio features; the first count is the number of times the risk audio features occur in the live audio segments;
a video feature recognition module, configured to perform feature recognition on each live picture frame based on a second feature database and determine a second count; the second feature database is used for storing risk picture features; the second count is the number of times the risk picture features occur in the live picture frames;
and a warning information sending module, configured to delete the live video and send warning information to the anchor client if the first count or the second count is greater than a first preset count threshold.
8. A live server comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
9. A live system comprising an anchor client and the live server of claim 8; the anchor client is communicatively connected with the live server and is configured to capture the live video and send the live video to the live server.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210836239.7A CN115334333A (en) | 2022-07-15 | 2022-07-15 | Live video processing method and device, live server and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115334333A true CN115334333A (en) | 2022-11-11 |
Family
ID=83917660
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210836239.7A Withdrawn CN115334333A (en) | 2022-07-15 | 2022-07-15 | Live video processing method and device, live server and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115334333A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115914179A (en) * | 2022-12-08 | 2023-04-04 | 上海哔哩哔哩科技有限公司 | Audio auditing method and device, computing equipment and storage medium |
- 2022-07-15: Application CN202210836239.7A filed in China (CN); published as CN115334333A; status: not active (withdrawn)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20221111 |