
CN117041652A - Video recording method and device based on browser, electronic equipment and storage medium - Google Patents

Video recording method and device based on browser, electronic equipment and storage medium

Info

Publication number
CN117041652A
CN117041652A (application CN202310871863.5A)
Authority
CN
China
Prior art keywords
video
recording
stream data
browser
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310871863.5A
Other languages
Chinese (zh)
Inventor
陈龙辉
梁选勤
余毅鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN TIANSHITONG TECHNOLOGY CO LTD
Original Assignee
SHENZHEN TIANSHITONG TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN TIANSHITONG TECHNOLOGY CO LTD filed Critical SHENZHEN TIANSHITONG TECHNOLOGY CO LTD
Priority to CN202310871863.5A
Publication of CN117041652A
Legal status: Pending


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/433: Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N 21/4334: Recording operations
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application relates to the field of video recording technologies, and in particular to a browser-based video recording method and apparatus, an electronic device, and a storage medium. The method first acquires a compiling instruction library that includes a video recording program, and then creates an instruction compiling object based on that library. Video recording parameters and a recording start instruction are acquired through the browser, the media stream data of the browser is received, and the media stream data is parsed to obtain video stream data and video frame marker information. The instruction compiling object is invoked based on the recording start instruction and the video recording parameters, and the video stream data is then calibrated according to the video frame marker information to obtain calibrated video stream data. Finally, recording of the calibrated video stream data is ended based on a recording end instruction, yielding the target video file. Because the video stream data is calibrated against the video frame marker information during recording, video recording quality can be improved.

Description

Video recording method and device based on browser, electronic equipment and storage medium
Technical Field
The present application relates to the field of video recording technologies, and in particular, to a video recording method and apparatus based on a browser, an electronic device, and a storage medium.
Background
With the development of internet technology, the application programming interfaces supported by browsers have become increasingly capable, and Web applications have become increasingly powerful at playing audio and video online, for example Web/HTML5 applications for short video, long video, and live-streaming websites.
In the related art, to let users record video on the client, video is usually recorded in the browser with the MediaRecorder constructor, or the screen is recorded with the FFmpeg tool. Recording video with the MediaRecorder constructor involves frame-capture operations that cannot guarantee the original resolution and bit rate, so the recorded resolution may differ from the original video, or the video may stutter because of network jitter and insufficient page rendering performance. Screen recording with the FFmpeg tool is prone to corrupted frames, audio/video desynchronization, an unusable progress bar, and similar problems.
Both recording with the MediaRecorder constructor and screen recording with the FFmpeg tool therefore have limitations. How to improve the quality of video recording in a browser has become a significant problem for the industry to solve.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art. It therefore provides a browser-based video recording method that can improve the quality of video recording in the browser.
According to an embodiment of the first aspect of the present application, a video recording method based on a browser includes:
acquiring a compiling instruction library, wherein the compiling instruction library comprises a video recording program;
creating an instruction compiling object based on the compiling instruction library;
acquiring video recording parameters and a recording start instruction through the browser, and receiving media stream data of the browser;
parsing the media stream data to obtain video stream data and video frame marker information;
invoking the instruction compiling object based on the recording start instruction and the video recording parameters to execute the video recording program and start recording the video stream data;
performing calibration processing on the video stream data according to the video frame marker information to obtain calibrated video stream data;
and acquiring a recording end instruction through the browser, and ending recording of the calibrated video stream data based on the recording end instruction to obtain a target video file.
According to some embodiments of the application, the video stream data comprises a picture frame sequence and an audio frame sequence;
the performing calibration processing on the video stream data according to the video frame marker information to obtain calibrated video stream data includes:
determining, according to the video frame marker information, a picture key frame in the picture frame sequence and a timestamp corresponding to the picture key frame;
determining, from the audio frame sequence according to the timestamp, an audio key frame matched with the picture key frame;
and generating a merged sequence with the matched picture key frame and audio key frame as references, and obtaining the calibrated video stream data based on the merged sequence.
According to some embodiments of the application, the generating a merged sequence with the matched picture key frame and audio key frame as references, and obtaining the calibrated video stream data based on the merged sequence, includes:
generating a merged sequence with the matched picture key frame and audio key frame as references;
determining a first frame rate based on the picture frame sequence and a second frame rate based on the audio frame sequence;
comparing the first frame rate with the second frame rate to obtain a first comparison result;
and performing alignment processing on the merged sequence based on the first comparison result to obtain the calibrated video stream data.
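The calibration flow above (match audio frames to picture key frames by timestamp, then compare the two frame rates) can be sketched in browser-side JavaScript. This is an illustrative sketch only: the frame objects, field names, and functions below are hypothetical and not taken from the patent, and real codec frames would carry far more state.

```javascript
// Hypothetical sketch of the calibration step: frames are plain objects
// like { ts: <timestamp ms>, key: <bool> } rather than real codec frames.

// Pick the audio frame whose timestamp is closest to the picture key frame's.
function matchAudioFrame(audioFrames, ts) {
  return audioFrames.reduce((best, f) =>
    Math.abs(f.ts - ts) < Math.abs(best.ts - ts) ? f : best);
}

// Estimate frame rate from a timestamped sequence (frames per second).
function frameRate(frames) {
  const span = frames[frames.length - 1].ts - frames[0].ts;
  return span > 0 ? ((frames.length - 1) * 1000) / span : 0;
}

// Build the merged sequence anchored on picture key frames, then compare
// the two rates; the comparison result would drive the later alignment.
function calibrate(pictureFrames, audioFrames) {
  const merged = pictureFrames
    .filter(f => f.key)
    .map(f => ({ picture: f, audio: matchAudioFrame(audioFrames, f.ts) }));
  const comparison = Math.sign(frameRate(pictureFrames) - frameRate(audioFrames));
  return { merged, comparison }; // comparison: -1, 0, or 1
}
```

Here the merged sequence keeps only matched pairs, and the sign of the frame-rate difference stands in for the "first comparison result" that drives the subsequent alignment processing.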
According to some embodiments of the application, the creating an instruction compiling object based on the compiling instruction library includes:
enabling a first thread and a second thread through a thread management unit of the browser, wherein the first thread carries the workload of the browser page and its associated business logic, and the second thread carries the computation workload of video recording;
and reading the compiling instruction library in the second thread, and creating the instruction compiling object.
According to some embodiments of the present application, the acquiring, through the browser, the video recording parameters and the recording start instruction, and receiving media stream data of the browser, includes:
acquiring a recording setting instruction from the first thread through the browser, and determining the video recording parameters based on the recording setting instruction;
after the video recording parameters are determined, acquiring the recording start instruction from the first thread through the browser;
and receiving the media stream data of the browser based on the video recording parameters and the recording start instruction.
According to some embodiments of the application, the receiving media stream data of the browser based on the video recording parameter and the recording start instruction includes:
acquiring the media stream data in the first thread through the browser;
the media stream data is sent from the first thread to the second thread.
According to some embodiments of the application, the parsing the media stream data to obtain video stream data and video frame marker information includes:
acquiring a preset data offset;
and decapsulating the media stream data based on the data offset to obtain the video stream data and the video frame marker information.
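As a toy illustration of offset-based decapsulation, the JavaScript below skips a preset header offset and then walks fixed-layout records, separating payload bytes (the video stream data) from per-frame flags and timestamps (the video frame marker information). The record layout is invented for this sketch; a real container format (FLV, MP4, etc.) defines its own structure.

```javascript
// Hypothetical demuxing sketch. The record layout below is invented for
// illustration; a real container has its own structure.
// Layout after `offset` header bytes, repeated until the buffer ends:
//   [keyFlag:1][timestamp:4 LE][length:2 LE][payload:length]
function demux(buffer, offset) {
  const view = new DataView(buffer);
  const frames = [];   // video frame marker information
  const chunks = [];   // video stream data
  let pos = offset;    // skip the preset data offset (container header)
  while (pos < buffer.byteLength) {
    const key = view.getUint8(pos) === 1;
    const ts = view.getUint32(pos + 1, true);
    const len = view.getUint16(pos + 5, true);
    chunks.push(new Uint8Array(buffer, pos + 7, len));
    frames.push({ key, ts, len });
    pos += 7 + len;
  }
  return { chunks, frames };
}
```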
According to a second aspect of the present application, a video recording apparatus based on a browser includes:
the first acquisition module is used for acquiring a compiling instruction library, wherein the compiling instruction library comprises a video recording program;
the object creating module is used for creating an instruction compiling object based on the compiling instruction library;
the second acquisition module is used for acquiring video recording parameters and recording start instructions through the browser and receiving media stream data of the browser;
the parsing module is used for parsing the media stream data to obtain video stream data and video frame marker information;
the recording start module is used for invoking the instruction compiling object based on the recording start instruction and the video recording parameters, so as to execute the video recording program and start recording the video stream data;
the calibration module is used for performing calibration processing on the video stream data according to the video frame marker information to obtain calibrated video stream data;
and the recording ending module is used for acquiring a recording ending instruction through the browser, ending recording the video stream data after the calibration processing based on the recording ending instruction, and obtaining a target video file.
In a third aspect, an embodiment of the present application provides an electronic device including a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the browser-based video recording method according to any one of the embodiments of the first aspect of the application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a program that, when executed by a processor, implements the browser-based video recording method according to any one of the embodiments of the first aspect of the present application.
The browser-based video recording method and apparatus, electronic device, and storage medium provided by the embodiments of the present application have the following beneficial effects:
According to the browser-based video recording method, a compiling instruction library including a video recording program is acquired first, and an instruction compiling object is then created based on the compiling instruction library. Further, the video recording parameters and the recording start instruction are obtained through the browser, the media stream data of the browser is received, and the video stream data and the video frame marker information are parsed from the media stream data. The instruction compiling object is invoked based on the recording start instruction and the video recording parameters to execute the video recording program and start recording the video stream data, and the video stream data is then calibrated according to the video frame marker information to obtain calibrated video stream data. Further, a recording end instruction is acquired through the browser, and recording of the calibrated video stream data is ended based on the recording end instruction to obtain the target video file. Because the video stream data is calibrated against the video frame marker information during recording, the quality of video recording in the browser can be improved.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
fig. 1 is a flowchart of a video recording method according to an embodiment of the present application;
fig. 2 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 3 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 4 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 5 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 6 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 7 is another flow chart of a video recording method according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a video recording apparatus according to an embodiment of the present application;
fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.
In the description of the present application, "a number of" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it. The descriptions "first" and "second" are used only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of technical features indicated, or the precedence of the technical features indicated.
In the description of the present application, it should be understood that the direction or positional relationship indicated with respect to the description of the orientation, such as up, down, left, right, front, rear, etc., is based on the direction or positional relationship shown in the drawings, is merely for convenience of describing the present application and simplifying the description, and does not indicate or imply that the apparatus or element to be referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present application.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
In the description of the present application, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present application can be determined reasonably by a person skilled in the art in combination with the specific content of the technical solution. In addition, the following description of specific steps does not represent limitations on the order of steps or logic performed, and the order of steps and logic performed between steps should be understood and appreciated with reference to what is described in the embodiments.
The following explains the technical terms related to the present application:
wasm is a short for WebAssemble, which is a portable, small, fast-loading encoding format that can run in a Web browser. WebAssembly is not a handwritten line code, which is a compilation target. With the development of programming technology, more and more languages can be compiled into WebAssembly, webAssembly, and C, C ++, java and other native languages can be directly compiled into machine code, so that the converter is omitted. The WebAssembly running speed is high, and the binary representation method of the WebAssembly running speed greatly reduces the size of the code package, so that the loading speed of the browser is improved.
An application programming interface (API) is a set of predefined functions intended to give applications and developers the ability to access a set of routines provided by software or hardware, without having to access the source code or understand the details of the internal operating mechanisms.
IndexedDB is short for the Indexed Database API, a scheme for storing structured data in the browser that replaces the role of the Web SQL Database API. The API design of IndexedDB is essentially asynchronous: most operations execute asynchronously in the form of requests, and a successful result or an error is delivered through the corresponding onsuccess and onerror event handlers. IndexedDB is a database comparable to MySQL or Web SQL Database, but it stores data as objects rather than tables, in the NoSQL style.
IDBFS is an IndexedDB file system. It is built on the browser's IndexedDB object and can store data persistently, but it can only be used in a browser environment. IDBFS is mounted through the FS.mount() method. In fact, at run time IDBFS still keeps the virtual file system in memory; what distinguishes it is that the FS.syncfs() method can synchronize the in-memory data with IndexedDB in both directions, achieving persistent data storage.
MediaRecorder is a class dedicated to audio and video recording (provided as a Web API in browsers, and under the same name in the Android SDK); it typically captures audio with the device microphone and picture information with the camera.
FFmpeg is an audio/video codec tool and a suite of audio/video coding and development components; as a development suite, it provides developers with rich audio/video processing interfaces. FFmpeg supports muxing and demuxing of many media formats, including many audio and video codecs, streaming media over multiple protocols, conversion between many color formats, sample rate conversion, bit rate conversion, and so on. The FFmpeg framework also provides a rich variety of plug-in modules, including muxing and demuxing plug-ins, encoding and decoding plug-ins, and the like.
LLVM is a compiler framework, written in C++, that optimizes the compile time, link time, run time, and idle time of programs written in arbitrary programming languages; it remains open to developers and is compatible with existing scripts.
Emscripten is a toolchain that compiles through LLVM to generate asm.js and WebAssembly bytecode, so that C and C++ can run in a web page at near-native speed without any plug-ins. Emscripten is an LLVM-based compiler that can, in theory, compile any code that can be lowered to LLVM bitcode into asm.js, a strict subset of JavaScript; it is essentially used to compile C/C++ code into asm.js.
The following describes the technical problems faced by the present application:
with the development of internet technology, the capability of application programming interfaces supported by various browsers is more and more perfect, and the capability of the Web application for realizing online playing of audio and video is more and more powerful, such as Web/HTML5 application of short video, long video, live website, and the like.
In the related art, in order to meet the requirement that a user needs to record video at a client, a MediaRecorder construction method is often used to record video in a browser or record video in a FFmpeg tool screen.
The MediaRecorder constructor depends on the web page Document Object Model (DOM, Document Object Model), such as the Video tag and Canvas tag of the video being played by the page player, and must run on the main thread, so a large number of frame-capture operations may affect the performance of the current page. Meanwhile, frame capture cannot guarantee the original resolution and bit rate, so the recorded resolution may differ from the original video, or the video may stutter because of network jitter and insufficient page rendering performance.
When the FFmpeg tool is used for screen recording, a video file or video stream is read into an ArrayBuffer and copied into memory; the ArrayBuffer is the basic unit referenced by typed arrays and views. Wasm then reads that memory for its calls, and transcoding to an MP4 file while recording also occupies memory, so memory can overrun if the video file is large or the video stream bit rate is high. It is also difficult to guarantee the time ordering of key frames (I frames) and predicted frames (P frames) in the video stream; if an I frame or P frame is lost during recording, corrupted frames, audio/video desynchronization, an unusable progress bar, and similar problems are likely to occur.
Therefore, neither recording with the MediaRecorder constructor nor screen recording with the FFmpeg tool can reliably record at the original bit rate directly in the browser; both have limitations.
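The I-frame/P-frame ordering problem described above can be illustrated with a small JavaScript check: a P frame is only decodable if a key frame precedes it and timestamps are monotonic. The frame representation and function name are hypothetical, for illustration only.

```javascript
// Hypothetical validity check for a recorded frame sequence.
// Frames: { type: 'I' | 'P', ts: <timestamp> }.
// A P frame is only decodable if some I frame precedes it and timestamps
// are non-decreasing; otherwise playback shows the corrupted-frame /
// audio-video desync symptoms described in the text.
function findUndecodableFrames(frames) {
  const bad = [];
  let seenKeyFrame = false;
  let lastTs = -Infinity;
  for (const f of frames) {
    const outOfOrder = f.ts < lastTs;
    if ((f.type === 'P' && !seenKeyFrame) || outOfOrder) bad.push(f);
    if (f.type === 'I') seenKeyFrame = true;
    lastTs = Math.max(lastTs, f.ts);
  }
  return bad;
}
```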
The present application aims to solve at least one of the technical problems existing in the prior art. Therefore, the application provides a video recording method based on a browser, which can improve the quality of video recording in the browser.
Further description will be made based on the drawings.
Referring to fig. 1, a browser-based video recording method according to an embodiment of the first aspect of the present application may include, but is not limited to, steps S101 to S107 described below.
Step S101, a compiling instruction library is obtained, wherein the compiling instruction library comprises a video recording program;
step S102, creating an instruction compiling object based on a compiling instruction library;
step S103, obtaining video recording parameters and recording start instructions through a browser, and receiving media stream data of the browser;
step S104, analyzing the media stream data to obtain video stream data and video frame mark information;
step S105, invoking the instruction compiling object based on the recording start instruction and the video recording parameters to execute the video recording program and start recording the video stream data;
step S106, performing calibration processing on the video stream data according to the video frame marker information to obtain calibrated video stream data;
step S107, obtaining a recording end instruction through a browser, and ending recording of the calibrated video stream data based on the recording end instruction to obtain a target video file.
In the embodiment of the present application shown in steps S101 to S107, a compiling instruction library including a video recording program is acquired first, and an instruction compiling object is then created based on the compiling instruction library. Further, the video recording parameters and the recording start instruction are obtained through the browser, the media stream data of the browser is received, and the video stream data and the video frame marker information are parsed from the media stream data. The instruction compiling object is invoked based on the recording start instruction and the video recording parameters to execute the video recording program and start recording the video stream data, and the video stream data is then calibrated according to the video frame marker information to obtain calibrated video stream data. Further, a recording end instruction is acquired through the browser, and recording of the calibrated video stream data is ended based on the recording end instruction to obtain the target video file. Because the video stream data is calibrated against the video frame marker information during recording, the quality of video recording in the browser can be improved.
In step S101 of some embodiments, a compiling instruction library is obtained, the compiling instruction library including a video recording program. It should be noted that the compiling instruction library refers to an instruction library storing video recording programs, and in some embodiments, the compiling instruction library may further include various associated programs such as an audio codec program and a video codec program.
In some more specific embodiments, the Emscripten SDK may be used to compile the audio codec, video codec, and audio/video recorder from C/C++ into libtps.wasm. In this way, the libtps.wasm library, i.e., the compiling instruction library, can be generated.
In step S102 of some embodiments, an instruction compilation object is created based on a compilation instruction library. It should be noted that, the instruction compiled object is created based on the compiled instruction library, and the instruction compiled object may be specifically used to load the video recording program. In some embodiments, the compiling instruction library may further include various associated programs such as an audio codec program and a video codec program, so that the instruction compiling object may also be used to load various associated programs such as an audio codec program and a video codec program.
Referring to fig. 2, step S102 may include, but is not limited to, steps S201 to S202 described below, according to some embodiments of the present application.
Step S201, enabling a first thread and a second thread through a thread management unit of the browser, wherein the first thread carries the workload of the browser page and its associated business logic, and the second thread carries the computation workload of video recording;
Step S202, a compiling instruction library is read in a second thread, and an instruction compiling object is created.
In steps S201 to S202 of some embodiments, the first thread and the second thread are started through the thread management unit of the browser, the compiling instruction library is then read in the second thread, and the instruction compiling object is created. It should be noted that the thread management unit of the browser manages the browser's multithreaded operation. The first thread carries the workload of the browser page and its associated business logic, and the second thread carries the computation workload of video recording. Because the first thread and the second thread are independent threads, they can run simultaneously without blocking each other. Therefore, when video recording generates a computation task, the task can be handed to the second thread for processing, and when the second thread finishes its computation, the result is returned to the first thread. In this way, the first thread can concentrate on processing business logic rather than spending too much time on large amounts of complex computation, which reduces blocking time, improves running efficiency, and naturally improves page smoothness and user experience.
In some more specific embodiments, the libtps.wasm library can be read directly and a wasm object created, so that modules such as the libtps.wasm audio/video codec, audio/video recording, audio/video merging, and IDBFS file operation modules are loaded. This loading mode is simpler and more direct, and the loading efficiency is higher.
In other embodiments, a Web Worker thread may be started to asynchronously load libtps.js; libtps.js then reads the libtps.wasm library and creates the wasm object, loading modules such as the libtps.wasm audio/video codec, audio/video recording, audio/video merging, and IDBFS file operation modules. It should be noted that the Web Worker is part of the HTML5 standard, which defines a set of APIs that allow a new Web Worker thread to be opened outside the js main thread and a js script to be run in it, giving the developer the ability to operate multiple threads with js. Because the Web Worker thread is an independent thread, it and the js main thread can run simultaneously without blocking each other. Therefore, when there are a large number of computation tasks, they can be handed to the Web Worker thread for processing, and when the Web Worker thread completes the computation, the result is returned to the js main thread. Thus, the js main thread need only concentrate on processing business logic and does not spend too much time on a large amount of complex computation, thereby reducing blocking time, improving running efficiency, and naturally improving page smoothness and user experience.
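As a minimal sketch of this loading step, the worker-side creation of the wasm object might look as follows. Only the standard WebAssembly JavaScript API is assumed; the fetch URL and any export names are illustrative assumptions rather than the actual libtps interface.

```javascript
// Minimal sketch: create a wasm object from the bytes of a compiling
// instruction library. "libtps.wasm" comes from the text above; the fetch
// URL and export names are illustrative assumptions.
async function instantiateWasm(bytes, imports = {}) {
  // Compiles the byte code and returns the instance's exports, through
  // which modules such as codec, recording, merging and IDBFS operations
  // would be exposed.
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance.exports;
}

// Inside a Web Worker the bytes would typically be fetched first, e.g.:
//   const bytes = await (await fetch('libtps.wasm')).arrayBuffer();
//   const libtps = await instantiateWasm(bytes);
```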
In step S103 of some embodiments, video recording parameters and a recording start instruction are acquired through the browser, and media stream data of the browser is received. It should be noted that the video recording parameters are used to specify the relevant attributes of the video recording, such as coding format type, sampling rate, code rate, coded data storage address, data length, whether a frame is a key frame, frame acquisition time stamp, and the like. The recording start instruction is used for instructing the browser to execute the action of starting video recording. It should be noted that the browser needs to receive the media stream data before it can perform the video recording, so the recording start instruction also serves to start the browser receiving the media stream data.
Referring to fig. 3, step S103 may include, but is not limited to, steps S301 to S303 described below, according to some embodiments of the present application.
Step S301, obtaining a recording setting instruction in a first thread through a browser, and determining video recording parameters based on the recording setting instruction;
step S302, after determining video recording parameters, obtaining a recording start instruction in a first thread through a browser;
step S303, based on the video recording parameter and the recording start instruction, receiving the media stream data of the browser.
In step S301 to step S303, a recording setting instruction is first acquired in a first thread through a browser, a video recording parameter is determined based on the recording setting instruction, then a recording start instruction is acquired in the first thread through the browser after the video recording parameter is determined, and further media stream data of the browser is received based on the video recording parameter and the recording start instruction. It is emphasized that the video recording parameters are used to clarify the relevant properties of the video recording, such as coding format type, sampling rate, code rate, coded data storage address, data length, whether it is a key frame, frame acquisition time stamp, etc. The video recording parameters can be obtained through setting by a user in an interface of a browser, and can also be obtained through data transmitted from a network. In the embodiment of the application, a recording setting instruction is acquired in a first thread through a browser, and then video recording parameters are determined based on the recording setting instruction. After the video recording parameters are determined, recording can be performed on the basis, so that a recording start instruction is acquired in the first thread through the browser, and then the media stream data of the browser is received on the basis of the video recording parameters and the recording start instruction. Because the video recording parameters are determined according to the recording setting instructions, the setting of the parameters has greater flexibility, and the video recording requirements of each time can be better adapted.
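By way of illustration, the parameter set determined from a recording setting instruction can be modeled as a plain object merged over defaults. The field names and default values below are assumptions made for this sketch, not the actual parameter list of the recording program.

```javascript
// Sketch: derive video recording parameters from a recording setting
// instruction. Field names and defaults are illustrative assumptions.
const DEFAULT_RECORDING_PARAMS = {
  codecType: 'h264',   // coding format type
  sampleRate: 44100,   // audio sampling rate (Hz)
  bitrate: 2000000,    // code rate (bits per second)
};

function buildRecordingParams(settingInstruction = {}) {
  // User- or network-supplied settings override the defaults, giving the
  // per-recording flexibility described above.
  const params = { ...DEFAULT_RECORDING_PARAMS, ...settingInstruction };
  if (params.sampleRate <= 0 || params.bitrate <= 0) {
    throw new RangeError('sampleRate and bitrate must be positive');
  }
  return params;
}
```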
In some specific embodiments, the user may click a button control for starting recording in the browser; after the browser detects the recording start instruction, it starts initializing the recording file, specifies a file generation path and a file name, and mounts the file to the IDBFS through libtps.js. It then calls libtps.wasm to start the recording program and passes in the necessary video recording parameters (including coding format type, sampling rate, code rate, and the like); as soon as libtps.wasm receives the media stream data, it further decodes the data and continuously writes it into memory.
Referring to fig. 4, step S303 may include, but is not limited to, steps S401 to S402 described below, according to some embodiments of the present application.
Step S401, obtaining media stream data in a first thread through a browser;
step S402, the media stream data is sent from the first thread to the second thread.
Through steps S401 to S402, in the illustrated embodiment of the present application, the browser obtains the media stream data in the first thread and sends it from the first thread to the second thread. It should be noted that, because the first thread is used to carry the running load associated with the browser page and the business logic, the media stream data is usually obtained in the first thread of the browser. And because the second thread is used to carry the running load of recording, the media stream data needs to be sent from the first thread to the second thread, so that the second thread can carry out the recording.
In some specific embodiments, the user clicks to start recording in the first thread of the browser; the first thread sends an instruction (postMessage) prompting the Web Worker thread serving as the second thread to start recording (Start). After the Web Worker thread detects the instruction through its onmessage handler, it starts initializing the recording file, specifies a file generation path and a file name, and mounts the file to the IDBFS through libtps.js. It then calls the libtps.wasm start-recording module and passes in the necessary parameters (including audio coding format type, sampling rate, code rate, and the like); as soon as libtps.wasm receives the media stream data, it further decodes the data and continuously writes it into memory.
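The postMessage/onmessage exchange described above can be sketched as a message handler running inside the Worker. The message type names ('start', 'frame', 'stop') and the recorder interface are illustrative assumptions, not the actual libtps API.

```javascript
// Sketch of the Worker-side message protocol. Message type names and the
// recorder interface are assumptions made for illustration.
function createRecorderHandler(recorder) {
  return function onmessage(event) {
    const { type, payload } = event.data;
    switch (type) {
      case 'start':
        recorder.start(payload);   // init file, mount IDBFS, pass parameters
        return { type: 'started' };
      case 'frame':
        recorder.write(payload);   // decode and buffer incoming media data
        return { type: 'written' };
      case 'stop':
        recorder.stop();           // flush merged data to the mounted file
        return { type: 'done' };   // in a real Worker: postMessage(reply)
      default:
        throw new Error('unknown message type: ' + type);
    }
  };
}
```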
In step S104 of some embodiments, the video stream data and the video frame tag information are parsed from the media stream data. The media stream data includes video stream data. The video stream data is configured with video frame marking information for marking custom information such as key frames, time stamps and the like. It should be noted that the video frame mark information needs to participate in the calibration processing of the video stream data in the subsequent step, and it should be understood that the quality of video recording in the browser can be improved by recording the video stream data after the calibration processing.
In some specific embodiments, the device side, when transmitting the video stream, adds custom information such as a key frame flag and a time stamp at the tail of each video frame. During decoding, the embodiment of the application can then compare whether the time stamps of the picture frame and the audio frame information are consistent, thereby effectively solving the problem of frame loss. In some embodiments, after the time matching of the picture frame and the audio frame is achieved, the custom information can be deleted from the frame data before the audio and video are merged.
Referring to fig. 5, step S104 may include, but is not limited to, steps S501 to S502 described below, according to some embodiments of the present application.
Step S501, obtaining a preset data offset;
step S502, decapsulating the media stream data based on the data offset to obtain video stream data and video frame mark information.
Through the embodiment of the present application shown in steps S501 to S502, a preset data offset is first acquired, and the media stream data is decapsulated based on the data offset to obtain the video stream data and the video frame mark information. It should be noted that the data offset refers to the distance, in bytes, between the start of a block of data in memory and a particular field within it. Because data is stored in memory as a sequence of bytes while a program accesses it in terms of data types, the program must take the offset of each field into account to ensure that the correct bytes are accessed. The data offset is important in computer programming: when accessing complex data types such as arrays and structures, the offset of the data needs to be considered. For example, when accessing a member variable in a structure, the offset of that member variable within the structure must be used. If the offset is not considered, the wrong data may be accessed, resulting in program errors.
Data encapsulation is an important element of data processing; for example, information is packaged in a certain format during communication so that it becomes valid data that a computer can effectively recognize and process. Decapsulation is the reverse operation, used to organize and extract the data so that it can be effectively expressed, transmitted, and stored. It should be noted that decapsulation is generally performed in several steps: the data is first extracted from its encapsulation format and passed into the system or device that will process it; the received data is then unpacked, whereby the different parts of the encapsulation, such as the protocol header, the message body, and the signature, are extracted and parsed according to their definitions. After parsing, the correctness of the parsing result is verified, and finally the data is repackaged into useful information for the next stage of processing. In the embodiment of the present application, the purpose of decapsulation is to enable various types of data to be correctly identified and effectively transmitted: the system must be able to extract the correct content from each field of the encapsulation format so as to ensure the integrity and reliability of the data. Decapsulation can also be understood as the reprocessing of data that is transported as a whole but can be disassembled into several separately processable parts. This advantageously reduces the complexity of the system and the time the computer spends on hardware resources, thereby enabling the required valid data to be gathered, scheduled, managed, and transported in a high-speed transmission environment.
It should be clear that, in the embodiment of the present application, the media stream data needs to be unpacked based on the data offset to obtain the video stream data and the video frame mark information, which aims to ensure that the media stream data is unpacked normally.
In some specific embodiments, after receiving the media stream frame data through onmessage, the Web Worker thread decapsulates the video and audio frame data according to the preset data offset, and then parses the custom information carried as non-frame data to obtain information such as the current frame type (picture/audio), the current frame timestamp, the key frame timestamp, the frame sequence number, and the frame index. After parsing is completed, the media stream frame data is modified: the custom information, which is not playable frame data, is removed according to the agreed data offset.
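The trailer parsing just described can be sketched with a DataView. The trailer layout below (14 bytes at the tail of each frame: frame type, key-frame flag, sequence number, timestamp) is an assumed format for illustration, since the actual agreed offsets are device-specific.

```javascript
// Sketch: strip a custom trailer appended at the tail of each media frame.
// Assumed layout: 1-byte frame type (0=audio, 1=picture), 1-byte key-frame
// flag, 4-byte frame sequence number, 8-byte timestamp (ms), little-endian
// -> 14 bytes total.
const TRAILER_SIZE = 14;

function decapsulateFrame(buffer) {
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const base = buffer.byteLength - TRAILER_SIZE; // data offset of the trailer
  const info = {
    frameType: view.getUint8(base) === 1 ? 'picture' : 'audio',
    isKeyFrame: view.getUint8(base + 1) === 1,
    sequence: view.getUint32(base + 2, true),
    timestamp: Number(view.getBigUint64(base + 6, true)),
  };
  // Remove the non-playable custom information before the frame is recorded.
  const frameData = buffer.subarray(0, base);
  return { frameData, info };
}
```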
In step S105 of some embodiments, the instruction compiling object is called based on the recording start instruction and the video recording parameters to execute the video recording program to start recording the video stream data. It should be noted that after the video stream data and the video frame mark information are parsed from the media stream data, the instruction compiling object can be called based on the recording start instruction and the video recording parameters, the purpose of which is to execute the video recording program to start recording the video stream data.
In step S106 of some embodiments, the video stream data is calibrated according to the video frame mark information, so as to obtain calibrated video stream data. It should be emphasized that the video frame marking information is used for marking custom information such as key frames and time stamps, and participates in the calibration processing of the video stream data in the process of executing the video recording program to start recording the video stream data, and it should be understood that the quality of video recording in the browser can be improved by recording the video stream data after the calibration processing.
Referring to fig. 6, according to some embodiments of the present application, it is noted that video and audio are both information used to transmit and play sound and pictures, but they differ in some respects. Video typically contains sound and pictures that allow viewers to see and hear the content being played. In contrast, audio contains only sound and no picture. Thus, in embodiments of the present application, the video stream data may include a sequence of picture frames and a sequence of audio frames. Step S106 may include, but is not limited to, steps S601 to S603 described below.
Step S601, determining a picture key frame and a timestamp corresponding to the picture key frame in a picture frame sequence according to the video frame marking information;
Step S602, determining an audio key frame matched with the picture key frame from the audio frame sequence according to the time stamp;
in step S603, a merging sequence is generated based on the matched picture key frame and the audio key frame, and video stream data after calibration processing is obtained based on the merging sequence.
Through steps S601 to S603, firstly, determining a picture key frame in a picture frame sequence and a timestamp corresponding to the picture key frame according to the video frame marking information; then, according to the time stamp, determining an audio key frame matched with the picture key frame from the audio frame sequence; further, generating a merging sequence by taking the matched picture key frame and the matched audio key frame as references, and obtaining video stream data after calibration processing based on the merging sequence. It should be noted that, because the video stream data is configured with video frame marking information for marking custom information such as key frames and time stamps, the embodiment of the application can compare whether the time stamps of the picture frames and the audio frame information are consistent during decoding, thereby effectively solving the problem of frame loss. In some embodiments, after the time matching of the picture frame and the audio frame is achieved, the user-defined information of the frame data can be deleted and then audio and video combination can be performed. Thus, the calibration processing of the video stream data can be realized.
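A minimal sketch of steps S601 to S603 follows, treating frames as plain objects with isKeyFrame and timestamp fields — an assumed in-memory representation, not the actual frame structure.

```javascript
// Sketch: pair each picture key frame with the audio key frame that has an
// identical timestamp; frames without a match stay out of the merging
// sequence. The frame object shape is an illustrative assumption.
function buildMergeSequence(pictureFrames, audioFrames) {
  const audioByTimestamp = new Map(
    audioFrames.filter((f) => f.isKeyFrame).map((f) => [f.timestamp, f])
  );
  const mergeSequence = [];
  for (const pic of pictureFrames) {
    if (!pic.isKeyFrame) continue;
    const audio = audioByTimestamp.get(pic.timestamp);
    if (audio) mergeSequence.push({ timestamp: pic.timestamp, picture: pic, audio });
  }
  return mergeSequence;
}
```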
Referring to fig. 7, step S603 may include, but is not limited to, steps S701 to S704 described below, according to some embodiments of the present application.
Step S701, generating a merging sequence by taking the matched picture key frame and the matched audio key frame as a reference;
step S702, determining a first frame rate based on a picture frame sequence, and determining a second frame rate based on an audio frame sequence;
step S703, comparing the first frame rate with the second frame rate to obtain a first comparison result;
step S704, performing alignment processing on the combined sequence based on the first comparison result, to obtain video stream data after calibration processing.
Through the embodiment of the application shown in step S701 to step S704, a merging sequence is generated based on the matched picture key frame and the audio key frame, a first frame rate is determined based on the picture frame sequence, a second frame rate is determined based on the audio frame sequence, the first frame rate and the second frame rate are further compared to obtain a first comparison result, and finally the merging sequence is aligned based on the first comparison result to obtain the video stream data after calibration processing. In this way, the alignment processing can be performed on the picture frame sequence and the audio frame sequence based on the first frame rate of the picture frame sequence and the second frame rate of the audio frame sequence, so as to obtain the video stream data after the calibration processing. It should be understood that the quality of video recording in the browser can be improved by recording the video stream data after the calibration processing.
In some more specific embodiments, after decoding the media stream data, it may be determined frame by frame whether the current frame is a key frame. So that the calibration of the media stream data can proceed smoothly, the first frame of media stream data (picture and audio) transmitted when recording is clicked needs to be a key frame; if the first frame is a non-key frame, the video recording program performs frame-loss processing and sends an instruction requesting a key frame. Further, after a picture key frame is obtained, the video recording program records the timestamp of the current picture key frame and passes it into the video merging queue, searches the memory for the audio key frame whose timestamp exactly matches, and adds it to the merging queue. To keep picture and audio synchronized during merging, the video recording program calculates and converts the frame rates of the picture and the audio, ensuring that the audio does not play too fast or too slow. Further, after the picture and the audio are merged, the merged media stream data is written into memory.
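The frame-rate comparison in steps S701 to S704 can be illustrated with simple timestamp arithmetic. The millisecond timestamps and the ratio-based alignment below are assumptions made for this sketch.

```javascript
// Sketch: estimate the first frame rate (pictures) and second frame rate
// (audio) from frame timestamps in milliseconds, then compare them to
// decide how many audio frames accompany each picture frame.
function estimateFrameRate(frames) {
  if (frames.length < 2) return 0;
  const spanMs = frames[frames.length - 1].timestamp - frames[0].timestamp;
  return ((frames.length - 1) * 1000) / spanMs; // frames per second
}

function audioFramesPerPicture(pictureFrames, audioFrames) {
  const pictureRate = estimateFrameRate(pictureFrames); // first frame rate
  const audioRate = estimateFrameRate(audioFrames);     // second frame rate
  // Comparison result: a ratio > 1 means several audio frames must be
  // interleaved per picture frame so the sound is neither fast nor slow.
  return audioRate / pictureRate;
}
```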
In step S107 of some embodiments, a recording end instruction is acquired through a browser, and recording is ended on the calibrated video stream data based on the recording end instruction, so as to obtain a target video file. The recording end instruction is used for indicating the browser to execute the action of ending video recording.
In some more specific embodiments, when the user clicks in the browser to stop recording, the main thread sends a stop instruction to the Web Worker thread to end the call to the video recording program, and libtps.wasm reads the merged audio and video memory data and writes it into the specified file mounted in the IDBFS. The file writing is an asynchronous operation; after the writing is completed, the Web Worker thread sends a completion (done) instruction to inform the main thread that the writing is finished.
Referring to fig. 8, a browser-based video recording apparatus 800 according to an embodiment of the second aspect of the present application includes:
the first acquisition module is used for acquiring a compiling instruction library, wherein the compiling instruction library comprises a video recording program;
the object creating module is used for creating an instruction compiling object based on the compiling instruction library;
the second acquisition module is used for acquiring video recording parameters and recording start instructions through the browser and receiving media stream data of the browser;
the analysis module is used for analyzing the media stream data to obtain video stream data and video frame mark information;
the recording starting module is used for calling the instruction compiling object based on the recording start instruction and the video recording parameters so as to execute the video recording program to start recording on the video stream data;
The calibration module is used for carrying out calibration processing on the video stream data according to the video frame mark information to obtain video stream data after the calibration processing;
and the recording ending module is used for acquiring a recording ending instruction through the browser, ending recording the calibrated video stream data based on the recording ending instruction, and obtaining the target video file.
Fig. 9 shows an electronic device 900 provided by an embodiment of the application. The electronic device 900 includes: a processor 901, a memory 902, and a computer program stored on the memory 902 and executable on the processor 901, the computer program when executed for performing the browser-based video recording method described above.
The processor 901 and the memory 902 may be connected by a bus or other means.
The memory 902 is used as a non-transitory computer readable storage medium for storing non-transitory software programs and non-transitory computer executable programs, such as the browser-based video recording method described in the embodiments of the present application. The processor 901 implements the above-described browser-based video recording method by running a non-transitory software program and instructions stored in the memory 902.
The memory 902 may include a storage program area and a storage data area; the storage program area may store an operating system and at least one application program required for a function, and the storage data area may store data created during execution of the browser-based video recording method described above. In addition, the memory 902 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state memory device. In some implementations, the memory 902 optionally includes memory located remotely from the processor 901, and such remote memory may be connected to the electronic device 900 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the above-described browser-based video recording method are stored in the memory 902, and when executed by the one or more processors 901, the above-described browser-based video recording method is performed, for example, method steps S101 to S107 in fig. 1, method steps S201 to S202 in fig. 2, method steps S301 to S303 in fig. 3, method steps S401 to S402 in fig. 4, method steps S501 to S502 in fig. 5, method steps S601 to S603 in fig. 6, and method steps S701 to S704 in fig. 7.
The embodiment of the application also provides a computer readable storage medium which stores computer executable instructions for executing the video recording method based on the browser.
In an embodiment, the computer-readable storage medium stores computer-executable instructions that are executed by one or more control processors, for example, to perform method steps S101 through S107 in fig. 1, method steps S201 through S202 in fig. 2, method steps S301 through S303 in fig. 3, method steps S401 through S402 in fig. 4, method steps S501 through S502 in fig. 5, method steps S601 through S603 in fig. 6, and method steps S701 through S704 in fig. 7.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. It should also be appreciated that the various embodiments provided by the embodiments of the present application may be combined arbitrarily to achieve different technical effects.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit and scope of the present application, and these equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.

Claims (10)

1. A browser-based video recording method, comprising:
acquiring a compiling instruction library, wherein the compiling instruction library comprises a video recording program;
creating an instruction compiling object based on the compiling instruction library;
acquiring video recording parameters and recording start instructions through the browser, and receiving media stream data of the browser;
analyzing the media stream data to obtain video stream data and video frame mark information;
calling the instruction compiling object based on the recording start instruction and the video recording parameters to execute the video recording program to start recording the video stream data;
performing calibration processing on the video stream data according to the video frame mark information to obtain the video stream data after the calibration processing;
and acquiring a recording ending instruction through the browser, ending recording the video stream data after the calibration processing based on the recording ending instruction, and obtaining a target video file.
2. The method of claim 1, wherein the video stream data comprises a sequence of picture frames and a sequence of audio frames;
the step of performing calibration processing on the video stream data according to the video frame mark information to obtain the video stream data after the calibration processing includes:
determining a picture key frame in the picture frame sequence and a timestamp corresponding to the picture key frame according to the video frame marking information;
determining an audio key frame matched with the picture key frame from the audio frame sequence according to the time stamp;
and generating a merging sequence by taking the matched picture key frame and the matched audio key frame as references, and obtaining the video stream data after calibration processing based on the merging sequence.
3. The method of claim 2, wherein generating a merged sequence based on the matched picture key frames and the audio key frames, and deriving the calibrated video stream data based on the merged sequence, comprises:
generating a merging sequence by taking the matched picture key frame and the matched audio key frame as references;
determining a first frame rate based on the sequence of picture frames and a second frame rate based on the sequence of audio frames;
Comparing the first frame rate with the second frame rate to obtain a first comparison result;
and carrying out alignment processing on the combined sequence based on the first comparison result to obtain the video stream data after calibration processing.
4. The method of claim 1, wherein creating an instruction compilation object based on the compilation instruction library comprises:
enabling a first thread and a second thread through a thread management unit of the browser; the first thread is used for carrying the running load associated with the browser page and the business logic, and the second thread is used for carrying the running load of recording;
and reading the compiling instruction library in the second thread, and creating the instruction compiling object.
5. The method of claim 4, wherein the obtaining, by the browser, the video recording parameters and the recording start instruction, and receiving the media stream data of the browser, comprises:
acquiring a recording setting instruction from the first thread through the browser, and determining the video recording parameters based on the recording setting instruction;
after the video recording parameters are determined, a recording start instruction is acquired from the first thread through the browser;
And receiving media stream data of the browser based on the video recording parameters and the recording start instruction.
6. The method of claim 5, wherein receiving media stream data of the browser based on the video recording parameters and the recording start instruction comprises:
acquiring the media stream data in the first thread through the browser;
the media stream data is sent from the first thread to the second thread.
7. The method of claim 5, wherein parsing the video stream data and video frame marker information from the media stream data comprises:
acquiring a preset data offset;
and decapsulating the media stream data based on the data offset to obtain the video stream data and the video frame marking information.
8. A browser-based video recording apparatus, comprising:
the first acquisition module is used for acquiring a compiling instruction library, wherein the compiling instruction library comprises a video recording program;
the object creation module is used for creating an instruction compilation object based on the compiling instruction library;
the second acquisition module is used for acquiring video recording parameters and recording start instructions through the browser and receiving media stream data of the browser;
the parsing module is used for parsing the media stream data to obtain video stream data and video frame marking information;
the recording start module is used for calling the instruction compilation object based on the recording start instruction and the video recording parameters, so as to execute the video recording program to start recording the video stream data;
the calibration module is used for carrying out calibration processing on the video stream data according to the video frame marking information to obtain the video stream data after calibration processing;
and the recording end module is used for acquiring a recording end instruction through the browser, and ending the recording of the video stream data after calibration processing based on the recording end instruction to obtain a target video file.
9. An electronic device, comprising: a memory storing a computer program, and a processor; the processor implements the browser-based video recording method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a program which, when executed by a processor, implements the browser-based video recording method of any one of claims 1 to 7.
CN202310871863.5A 2023-07-17 2023-07-17 Video recording method and device based on browser, electronic equipment and storage medium Pending CN117041652A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310871863.5A CN117041652A (en) 2023-07-17 2023-07-17 Video recording method and device based on browser, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310871863.5A CN117041652A (en) 2023-07-17 2023-07-17 Video recording method and device based on browser, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117041652A true CN117041652A (en) 2023-11-10

Family

ID=88623471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310871863.5A Pending CN117041652A (en) 2023-07-17 2023-07-17 Video recording method and device based on browser, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117041652A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118016110A (en) * 2024-04-07 2024-05-10 苏州圈点互动科技有限公司 Media data recording and playing method
CN118573942A (en) * 2024-08-01 2024-08-30 宁波菊风系统软件有限公司 Method and device for processing multimedia data stream, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN117041652A (en) Video recording method and device based on browser, electronic equipment and storage medium
US10887201B2 (en) Method for automatically monitoring end-to-end end user performance and apparatus for performing the method
RU2364922C2 (en) Extensions of message of session description
US7558806B2 (en) Method and apparatus for buffering streaming media
US11356749B2 (en) Track format for carriage of event messages
US20100211867A1 (en) Processing module, a device, and a method for processing of xml data
WO2024139129A1 (en) Multimedia playing method, browser, and electronic device
CN112616069A (en) Streaming media video playing and generating method and equipment
CN108965345B (en) Method and device for optimizing small signaling network data packet
CN108810575B (en) Method and device for sending target video
CN116820635A (en) Document block sharing method, device, system and storage medium
CN113742294A (en) Method, system, device and medium for decoding ASN.1-PER signaling message
US9237363B2 (en) Dynamic injection of metadata into flash video
CN108093258A (en) Coding/decoding method, computer installation and the computer readable storage medium of bit stream data
CN110267062B (en) Optimization method and device for assembled video frame and readable storage medium
CN108600826B (en) Method and device for playing TS (transport stream)
US20230106217A1 (en) Web-end video playing method and apparatus, and computer device
US20090066702A1 (en) Development Tool for Animated Graphics Application
CN115802054A (en) Video alignment method and device
US8301689B2 (en) Controlling presentation engine on remote device
CN114979307A (en) Communication protocol analysis method, intelligent terminal and storage medium
CN113542764A (en) Video quick starting method and device, electronic equipment and computer readable medium
CN109670152B (en) HL7V3 analysis method, storage medium, electronic equipment and system
US9100717B2 (en) Methods and systems for file based content verification using multicore architecture
Chernyshev Library for Remote Copying of Video File Fragments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination