CN111739543B - Debugging method of audio coding method and related device thereof
- Publication number
- CN111739543B (application number CN202010448481.8A)
- Authority
- CN
- China
- Prior art keywords
- encoding
- coding
- coding method
- audio
- optimal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The application discloses a debugging method of an audio encoding method and a related device thereof. The debugging method comprises: acquiring the files obtained by encoding the same audio file with each encoding method, and determining the encoding duration corresponding to each encoding method; determining the duration and accuracy of speech recognition performed by the same speech recognition algorithm on the encoded file of each encoding method; and determining an optimal encoding method based on the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method. The method and device can debug the audio encoding method automatically, without manual intervention by a developer, which saves cost.
Description
Technical Field
The present disclosure relates to the field of audio coding technologies, and in particular, to a method for debugging an audio coding method and a related device thereof.
Background
Currently, a developer tunes the audio encoding method manually to arrive at an optimal encoding method. This manual tuning takes up considerable development time and increases cost.
Disclosure of Invention
The main purpose of the application is to provide a debugging method of an audio coding method and a related device thereof, which can automatically carry out the debugging work of the audio coding method without manual intervention of a developer, thereby saving cost.
In order to achieve the above purpose, a technical scheme adopted in the application is as follows: there is provided a debugging method of an audio encoding method, the method comprising:
acquiring a file obtained by encoding the same audio file by each encoding method, and determining the encoding time length corresponding to each encoding method;
determining the time length and accuracy of the voice recognition of the coded files of each coding method by the same voice recognition algorithm;
and determining an optimal coding method based on the coding time length, the identification time length and the identification accuracy corresponding to each coding method.
Wherein determining the optimal encoding method based on the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method comprises: calculating the delay corresponding to each encoding method from its encoding duration and recognition duration;
and selecting, from the encoding methods whose recognition accuracy exceeds an accuracy threshold, the encoding method with the minimum delay as the optimal encoding method.
Wherein acquiring the files obtained by encoding the same audio file with each encoding method and determining the encoding duration corresponding to each encoding method comprises: acquiring, from a device, the files obtained by encoding the same audio file with each encoding method, together with the encoding duration and the sending time point corresponding to each encoding method;
calculating the delay corresponding to each encoding method from its encoding duration and recognition duration comprises: calculating the network transmission time corresponding to each encoding method based on the sending time point corresponding to that encoding method;
and taking the sum of the encoding duration, the network transmission time, and the recognition duration corresponding to each encoding method as the delay corresponding to that encoding method.
Wherein the encoding method comprises an encoding format and encoding parameters;
acquiring the files obtained by encoding the same audio file with each encoding method and determining the encoding duration of each encoding method comprises: acquiring the files obtained by encoding the same audio file with each set of encoding parameter values of each encoding format, and determining the encoding duration corresponding to each set of encoding parameter values of each encoding format;
selecting, from the encoding methods whose recognition accuracy exceeds the threshold, the encoding method with the minimum delay as the optimal encoding method comprises: for each encoding format, selecting, from the sets of encoding parameter values whose recognition accuracy exceeds the threshold, the set with the minimum delay as the optimal encoding parameter values of that encoding format.
Wherein, after the set of encoding parameter values with the minimum delay is taken as the optimal encoding parameter values of each encoding format, the method comprises: selecting, from the optimal encoding parameter values of the different encoding formats, the one with the minimum delay, and taking it as the optimal encoding parameter values of the optimal encoding format.
Wherein acquiring the files obtained by encoding the same audio file with each encoding method comprises: storing each encoding parameter of each encoding format and its value range.
Wherein acquiring the files obtained by encoding the same audio file with each encoding method comprises: acquiring the files obtained by encoding the same audio file multiple times with each encoding method;
determining the duration and accuracy of speech recognition performed by the same speech recognition algorithm on the encoded files of each encoding method comprises: taking the ratio of the number of encoded files of each encoding method that the speech recognition algorithm recognizes correctly to the total number of encoded files of that encoding method as the recognition accuracy corresponding to that encoding method;
and taking the average of the recognition durations of the correctly recognized encoded files of each encoding method as the recognition duration corresponding to that encoding method.
Wherein the audio files comprise audio files with different parameters;
the method further comprises: taking the audio file of each kind of parameters, in turn, as the same audio file;
determining the optimal encoding method based on the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method comprises: determining, for the audio file of each kind of parameters, the corresponding optimal encoding method based on the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method.
In order to achieve the above object, the present application provides a debugging device of an audio encoding method, the debugging device of the audio encoding method including a memory and a processor; the memory stores a computer program, and the processor is configured to execute the computer program to implement the steps in the above method.
To achieve the above object, the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
Compared with the prior art, the beneficial effects of this application are: the optimal coding method can be determined according to the coding time length, the identification time length and the identification accuracy corresponding to each coding method, the debugging work of the audio coding method can be automatically carried out, manual intervention of a developer is not needed, and cost is saved.
Drawings
FIG. 1 is a flow chart of a first embodiment of a debugging method of an audio encoding method of the present application;
FIG. 2 is a flow chart of a second embodiment of a debugging method of the audio encoding method of the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a debugging device of the audio encoding method of the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a readable storage medium of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be noted that, if directional indications (such as up, down, left, right, front, and rear) are referred to in the embodiments of the present application, they are used only to explain the relative positional relationship, movement, and the like between components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
In addition, if "first", "second", etc. are used in the embodiments of the present application, they are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. Furthermore, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be regarded as not existing and as outside the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a first embodiment of a debugging method of an audio encoding method according to the present application, and as shown in fig. 1, the debugging method of the audio encoding method according to the present embodiment includes the following steps.
S101: Acquire the files obtained by encoding the same audio file with each encoding method, and determine the encoding duration corresponding to each encoding method.
In an implementation manner, the same audio file may be encoded by each encoding method to obtain an encoded file of each encoding method, and the encoding duration corresponding to each encoding method may also be recorded after encoding. The encoding time length corresponding to each encoding method refers to the time spent for encoding the same audio file by each encoding method.
It can be understood that the same audio file may be encoded by each encoding method one by one, so as to obtain encoded files of each encoding method, and record the encoding duration corresponding to each encoding method. For example, when testing the optimal coding method among A, B and C coding methods: firstly, encoding the same audio file by an A encoding method to obtain a file encoded by the A encoding method, and recording the time consumed by encoding the same audio file by the A encoding method; then, encoding the same audio file by using a B encoding method to obtain a file encoded by using the B encoding method, and recording the time consumed by encoding the same audio file by using the B encoding method; and then, encoding the same audio file by using a C encoding method to obtain a file encoded by using the C encoding method, and recording the time consumed by encoding the same audio file by using the C encoding method.
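The one-by-one procedure above can be sketched as follows. This is a minimal illustration, not the implementation described in the application: the three "encoders" are hypothetical stand-ins built on `zlib` at different compression levels, chosen only so that the sketch runs without audio codec libraries, and `encode_all` is an assumed helper name.

```python
import time
import zlib
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for real audio encoders A, B and C.
# zlib is NOT an audio codec; it is used only to keep the sketch runnable.
ENCODERS: Dict[str, Callable[[bytes], bytes]] = {
    "A": lambda pcm: zlib.compress(pcm, 1),
    "B": lambda pcm: zlib.compress(pcm, 6),
    "C": lambda pcm: zlib.compress(pcm, 9),
}


def encode_all(audio: bytes) -> Dict[str, Tuple[bytes, float]]:
    """Encode the same audio with each method in turn and record the time spent."""
    results = {}
    for name, encode in ENCODERS.items():
        start = time.perf_counter()
        encoded = encode(audio)
        results[name] = (encoded, time.perf_counter() - start)
    return results


audio = bytes(range(256)) * 64  # placeholder for the content of one audio file
results = encode_all(audio)
for name, (encoded, duration_s) in results.items():
    print(f"method {name}: {len(encoded)} bytes, {duration_s * 1000:.3f} ms")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock suited to measuring short intervals.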
In addition, the same audio file may be encoded by various encoding methods in parallel to obtain encoded files of the respective encoding methods, and the encoding durations corresponding to the respective encoding methods may be recorded.
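A parallel variant might look like the sketch below. It again uses `zlib` compression levels as hypothetical stand-ins for encoding methods A, B and C; `timed_encode` and `METHODS` are assumed names. Note that durations measured while several methods run concurrently may be inflated by contention for CPU time, so they are not necessarily comparable with sequential measurements.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple

# Hypothetical mapping of encoding method name -> zlib level (stand-in for a real codec).
METHODS = {"A": 1, "B": 6, "C": 9}


def timed_encode(level: int, audio: bytes) -> Tuple[bytes, float]:
    """Encode once and return the encoded bytes plus the elapsed time in seconds."""
    start = time.perf_counter()
    encoded = zlib.compress(audio, level)
    return encoded, time.perf_counter() - start


audio = bytes(range(256)) * 64  # placeholder for the content of one audio file

# Submit every encoding method at once, then collect the results.
with ThreadPoolExecutor(max_workers=len(METHODS)) as pool:
    futures = {name: pool.submit(timed_encode, level, audio) for name, level in METHODS.items()}
    parallel_results = {name: future.result() for name, future in futures.items()}
```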
In another implementation, the encoded files of each encoding method may be obtained from other devices, and the encoding durations corresponding to each encoding method may be obtained from other devices. The other devices may be terminals, servers, etc.
It can be understood that the encoded files and the corresponding encoding durations can be obtained from a plurality of devices, which can save debugging time. For example, the encoded files of encoding methods A and B and their corresponding encoding durations are acquired from device A, the encoded files of encoding methods C and D and their corresponding encoding durations are acquired from device B, and so on.
Of course, the encoded file of each encoding method and the corresponding encoding duration may also be acquired from a single device. For example, the encoded files of all encoding methods and the corresponding encoding durations are acquired from device A.
Accordingly, the encoded files and the corresponding encoding durations of the respective encoding methods may be acquired one by one. Of course, the encoded files and the corresponding encoding durations of all encoding methods may be acquired at once.
S102: Determine the duration and accuracy of speech recognition performed by the same speech recognition algorithm on the encoded file of each encoding method.
In an implementation manner, after the encoded files of the plurality of encoding methods are obtained, the encoded file of each encoding method may be input to a speech recognition algorithm, so that the algorithm performs speech recognition on it. The recognition accuracy corresponding to each encoding method is then determined from the recognition result. The time the speech recognition algorithm spends recognizing the encoded file of an encoding method is recorded as the recognition duration corresponding to that encoding method.
It can be understood that the voice recognition algorithm used for voice recognition of the encoded files of all the encoding methods is the same voice recognition algorithm, so as to reduce interference factors and improve the accuracy of the debugging result.
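As a rough sketch of this step, the same recognizer can be applied to every encoded file while timing each call. The recognizer here, `recognize_stub`, is a hypothetical placeholder that simply decodes bytes as text; a real setup would call an actual speech recognition engine. The `evaluate` helper and the sample data are likewise assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


def recognize_stub(encoded: bytes) -> str:
    """Hypothetical recognizer: decodes bytes as text instead of running real ASR."""
    return encoded.decode("utf-8", errors="replace")


def evaluate(
    encoded_files: Dict[str, bytes],
    expected: str,
    recognize: Callable[[bytes], str],
) -> Dict[str, Tuple[bool, float]]:
    """Run the SAME recognizer on every file; return (correct?, duration in seconds)."""
    report = {}
    for method, data in encoded_files.items():
        start = time.perf_counter()
        text = recognize(data)
        duration_s = time.perf_counter() - start
        report[method] = (text == expected, duration_s)
    return report


# Placeholder "encoded files": method B is made to produce a misrecognition.
files = {"A": b"turn on the light", "B": b"turn on the night"}
report = evaluate(files, "turn on the light", recognize_stub)
```

Passing the recognizer in as a single argument makes it hard to accidentally evaluate different encoding methods with different algorithms.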
In another implementation manner, the encoded files of each encoding method may be sent to the recognition server, so that the recognition server uses the voice recognition algorithm loaded thereon to perform voice recognition on the encoded files of each encoding method, and obtains the recognition duration and the recognition accuracy corresponding to each encoding method from the recognition server.
S103: Determine an optimal encoding method based on the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method.
The optimal coding method can be determined according to the coding time length corresponding to each coding method obtained in the step S101, and the identification time length and the identification accuracy corresponding to each coding method determined in the step S102.
The optimal coding method may refer to a coding method with high timeliness (that is, low time delay) and high recognition accuracy.
Optionally, the time delay corresponding to each coding method may be calculated according to the coding time length and the identification time length corresponding to each coding method, and then the optimal coding method may be determined based on the time delay and the identification accuracy corresponding to each coding method.
In an implementation manner, determining an optimal coding method based on the time delay and the recognition accuracy corresponding to each coding method may include: selecting, from the coding methods whose recognition accuracy exceeds the accuracy threshold, the coding method with the minimum time delay as the optimal coding method. The accuracy threshold may be set according to the actual situation, for example to 98%, 90%, or a similar value.
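A minimal sketch of this selection rule follows. It assumes, for illustration only, that the time delay in this embodiment is the plain sum of the encoding duration and the recognition duration; the function name `pick_min_delay` and the sample figures are hypothetical.

```python
from typing import Dict, Optional, Tuple

# method -> (encoding duration in s, recognition duration in s, recognition accuracy)
Stats = Dict[str, Tuple[float, float, float]]


def pick_min_delay(stats: Stats, accuracy_threshold: float) -> Optional[str]:
    """Among methods whose accuracy exceeds the threshold, return the one with least delay."""
    delays = {
        method: encoding_s + recognition_s  # assumed definition of delay for this sketch
        for method, (encoding_s, recognition_s, accuracy) in stats.items()
        if accuracy > accuracy_threshold
    }
    if not delays:
        return None  # no method is accurate enough
    return min(delays, key=delays.get)


stats = {
    "A": (0.020, 0.100, 0.99),
    "B": (0.010, 0.070, 0.85),  # fastest, but below the accuracy threshold
    "C": (0.030, 0.070, 0.95),
}
best = pick_min_delay(stats, accuracy_threshold=0.90)
print(best)
```

Returning `None` when nothing passes the threshold is a design choice of this sketch; the source does not say how that case is handled.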
In another implementation, determining an optimal coding method based on the delay and the recognition accuracy corresponding to each coding method may include: and selecting the coding method with the highest recognition accuracy from the coding methods with the time delay lower than the time delay threshold, and taking the coding method with the highest recognition accuracy as the optimal coding method. The delay threshold may be set according to practical situations, for example, may be 2ms, 10ms, etc.
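The alternative rule, which filters by delay first and then maximizes accuracy, can be sketched the same way. The function name `pick_max_accuracy` and the sample values are hypothetical; the delays are assumed to have been computed already.

```python
from typing import Dict, Optional, Tuple

# method -> (delay in seconds, recognition accuracy)
DelayStats = Dict[str, Tuple[float, float]]


def pick_max_accuracy(stats: DelayStats, delay_threshold_s: float) -> Optional[str]:
    """Among methods whose delay is below the threshold, return the most accurate one."""
    eligible = {
        method: accuracy
        for method, (delay_s, accuracy) in stats.items()
        if delay_s < delay_threshold_s
    }
    if not eligible:
        return None  # every method is too slow
    return max(eligible, key=eligible.get)


stats = {
    "A": (0.002, 0.97),
    "B": (0.012, 0.99),  # most accurate, but slower than the 10 ms threshold
    "C": (0.008, 0.98),
}
best = pick_max_accuracy(stats, delay_threshold_s=0.010)
print(best)
```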
In the embodiment, the optimal coding method can be determined according to the coding time length, the identification time length and the identification accuracy corresponding to each coding method, the debugging work of the audio coding method can be automatically performed, manual intervention of a developer is not needed, and human resources are saved.
Further, the debugging method of the audio encoding method according to the first embodiment may be carried out in, but is not limited to, the following four execution modes.
The first may be: the terminal or the server independently completes the encoding of the audio file and the recognition of the encoded files, and then collects and compares the encoding duration, recognition duration, and recognition accuracy to determine the optimal encoding method.
The second may be: a transit server acquires the encoded file of each encoding method and the corresponding encoding duration from the terminal, and sends the encoded files to a recognition server, so that the recognition server recognizes the encoded file of each encoding method and determines the corresponding recognition duration and recognition accuracy. The transit server then obtains the recognition duration and recognition accuracy corresponding to each encoding method from the recognition server, and compares the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method to determine the optimal encoding method.
The third may be: the terminal encodes the audio file and sends the encoded file of each encoding method to the server, so that the server recognizes the encoded files with the same speech recognition algorithm. The terminal then obtains the recognition duration and recognition accuracy corresponding to each encoding method from the server, and compares the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method to determine the optimal encoding method. Compared with the first two execution modes, the third takes into account the transmission of the encoded file to the server over the network, as happens in actual voice interaction between the terminal and the server. It therefore matches the actual situation, yields accurate time information, and gives an accurate debugging result.
The fourth may be: the terminal encodes the audio file and sends the encoded file of each encoding method to the server, so that the server recognizes the encoded files with the same speech recognition algorithm. The server then compares the encoding duration, recognition duration, and recognition accuracy corresponding to each encoding method to determine the optimal encoding method. The fourth execution mode is the most consistent with the actual voice interaction between the terminal and the server, so it can give the most accurate test result. Compared with the third execution mode, the terminal does not need to collect statistics, compare them, or determine the optimal encoding method, so less data is transmitted, data loss is avoided, and debugging time is saved. The fourth execution mode is described in detail in the second embodiment of the debugging method of the audio encoding method.
Referring to FIG. 2, FIG. 2 is a flowchart of a second embodiment of the debugging method of the audio encoding method of the present application. In this embodiment, an encoding method comprises an encoding format and encoding parameters. The encoding formats include mp3, speex, opus, and the like. The encoding parameters include the sampling rate, bit width, compression rate, and the like.
Each encoding parameter of each encoding format has a value range. One combination of values of all the encoding parameters of an encoding format forms one set of encoding parameter values for that format, and each encoding format may have one or more sets of encoding parameter values.
So that the terminal can determine the sets of encoding parameter values of each encoding format, that is, the encoding methods corresponding to each encoding format, the terminal may store the value ranges of the encoding parameters of each encoding format. The terminal may also directly store several sets of encoding parameter values for each encoding format. In either case, the value ranges or the sets of encoding parameter values may be stored in the terminal as a configuration file.
As shown in FIG. 2, the debugging method of the audio encoding method of this embodiment includes the following steps.
S201: Acquire, from the terminal, the files obtained by encoding the same audio file with each encoding method, together with the encoding duration and the sending time point corresponding to each encoding method.
In this embodiment, a file obtained by encoding the same audio file by each encoding method, encoding time lengths corresponding to each encoding method, and a transmission time point are acquired from a terminal.
It is understood that where the encoding method includes an encoding format and encoding parameters, an encoding method may refer to a set of encoding parameter values for an encoding format. Thus step S201 may include: and acquiring a file obtained by encoding the same audio file by using the encoding parameter values of each encoding format, and determining the encoding time length and the transmitting time point corresponding to the encoding parameter values of each encoding format.
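To illustrate how stored value ranges could be expanded into concrete encoding methods, the sketch below enumerates every combination of parameter values per format. The format names follow the text, but the parameter names and value ranges in `PARAM_RANGES` are hypothetical and are not claimed to be valid settings for any real codec.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical configuration: encoding format -> parameter -> candidate values.
PARAM_RANGES: Dict[str, Dict[str, List[int]]] = {
    "opus": {"sample_rate": [8000, 16000], "bit_width": [16], "compression": [5, 10]},
    "speex": {"sample_rate": [8000, 16000], "bit_width": [16], "compression": [4]},
}


def parameter_sets(ranges: Dict[str, List[int]]) -> List[Dict[str, int]]:
    """Return every combination of parameter values (one dict per set of values)."""
    names = list(ranges)
    return [dict(zip(names, combo)) for combo in product(*(ranges[n] for n in names))]


# Each (format, set of parameter values) pair is one encoding method to be tested.
methods: List[Tuple[str, Dict[str, int]]] = [
    (fmt, params)
    for fmt, ranges in PARAM_RANGES.items()
    for params in parameter_sets(ranges)
]
print(len(methods), "encoding methods")
```

With these example ranges, opus yields 2 x 1 x 2 = 4 sets and speex yields 2 x 1 x 1 = 2 sets. The number of combinations grows multiplicatively with each parameter, which is one reason a terminal might store a short list of hand-picked sets instead.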
The sending time point corresponding to each encoding method refers to the time at which the terminal starts sending the encoded file of that encoding method. For example, when the terminal starts to send the encoded file of an encoding method to the server, the terminal obtains the current time and uses it as the sending time point corresponding to that encoding method.
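On the terminal side, the sending time point could be captured as in the sketch below. The `EncodedUpload` record and `prepare_upload` helper are hypothetical names, and the message layout is an assumption; the source does not specify a wire format. Wall-clock time (`time.time`) is assumed here because the server later compares this value with its own clock.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class EncodedUpload:
    """Hypothetical message sent from the terminal to the server."""
    method: str
    encoded: bytes
    encoding_duration_s: float
    send_time_s: float  # wall-clock time at which sending starts


def prepare_upload(
    method: str,
    encoded: bytes,
    encoding_duration_s: float,
    clock: Callable[[], float] = time.time,
) -> EncodedUpload:
    """Read the clock immediately before transmission and attach it to the upload."""
    return EncodedUpload(method, encoded, encoding_duration_s, send_time_s=clock())


# A fixed clock is injected here only to make the example deterministic.
upload = prepare_upload("A", b"\x00\x01", 0.004, clock=lambda: 1000.0)
```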
In addition, the terminal can also send the network state parameters to the server when sending the coding files of the coding methods, so that the server can confirm the network transmission time based on the network state parameters.
S202: Determine the duration and accuracy of speech recognition performed by the same speech recognition algorithm on the encoded file of each encoding method.
In one implementation, the files obtained by encoding the same audio file multiple times with each encoding method may be acquired in step S201. Determining the accuracy of speech recognition performed by the same speech recognition algorithm on the encoded files of each encoding method may then include: taking the ratio of the number of encoded files of each encoding method that the speech recognition algorithm recognizes correctly to the total number of encoded files of that encoding method as the recognition accuracy corresponding to that encoding method. For example, an audio file whose speech content is "turn on the light" is encoded 10 times with encoding method A, giving 10 encoded files of encoding method A. When these 10 files are recognized by the same speech recognition algorithm, 5 of them are correctly recognized as "turn on the light", and the content recognized for the other 5 is not "turn on the light", so the recognition accuracy of encoding method A is 5/10 = 50%.
Determining the duration of speech recognition performed by the same speech recognition algorithm on the encoded files of each encoding method may include: taking the average of the recognition durations of the correctly recognized encoded files of each encoding method as the recognition duration corresponding to that encoding method. For example, when the 10 encoded files of encoding method A are recognized by the same speech recognition algorithm and 5 of them are correctly recognized as "turn on the light", the average of the recognition durations of those 5 correctly recognized files can be used as the recognition duration corresponding to encoding method A. Of course, in other implementations, the average of the recognition durations of all the encoded files of each encoding method may be used directly as the recognition duration corresponding to that encoding method.
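The accuracy ratio and the averaged recognition duration can be computed as in this sketch, which reproduces the 10-file example with made-up durations. The `summarize` helper and the duration values are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Tuple


def summarize(
    runs: List[Tuple[str, float]], expected: str
) -> Tuple[float, Optional[float]]:
    """runs holds (recognized text, recognition duration in s) for one encoding method.

    Returns (accuracy, mean duration over correctly recognized files).
    The duration is None when no file was recognized correctly.
    """
    correct_durations = [duration for text, duration in runs if text == expected]
    accuracy = len(correct_durations) / len(runs)
    duration = mean(correct_durations) if correct_durations else None
    return accuracy, duration


# 10 encoded files of one method: 5 recognized correctly, 5 misrecognized.
runs = [("turn on the light", 0.10)] * 5 + [("turn on the night", 0.30)] * 5
accuracy, duration = summarize(runs, "turn on the light")
print(f"accuracy={accuracy:.0%}, recognition duration={duration:.2f} s")
```

Averaging only over correctly recognized files keeps failed recognitions from skewing the duration; the sketch assumes `runs` is non-empty.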
S203: Calculate the network transmission time corresponding to each encoding method based on the sending time point corresponding to that encoding method.
In this embodiment, the time when the encoded file of each encoding method is acquired may be taken as an acquisition time point corresponding to each encoding method, and the network transmission time corresponding to each encoding method may be obtained by subtracting the transmission time point corresponding to each encoding method from the acquisition time point corresponding to each encoding method.
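The subtraction described above is sketched below. One practical caveat, which is an observation of this sketch rather than something stated in the source: the sending time point comes from the terminal's clock and the acquisition time point from the server's clock, so the result is only meaningful if the two clocks are reasonably synchronized. The function name and the sample timestamps are hypothetical.

```python
def network_transmission_time(send_time_s: float, acquire_time_s: float) -> float:
    """Network transmission time = acquisition time point - sending time point."""
    elapsed_s = acquire_time_s - send_time_s
    if elapsed_s < 0:
        # A negative value suggests that terminal and server clocks disagree.
        raise ValueError("acquisition precedes sending; clocks are likely not synchronized")
    return elapsed_s


# Example timestamps in seconds: sent at 1000.000, acquired at 1000.035.
transmission_s = network_transmission_time(1000.000, 1000.035)
print(f"{transmission_s * 1000:.1f} ms")
```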
In another implementation, the network transmission time corresponding to each encoding method may also be determined based on the network state parameters. The network state parameters may be obtained from the terminal or may be obtained from the server itself.
It is to be understood that the execution order of step S203 is not limited to the above, as long as step S203 is executed after step S201; for example, step S203 may be executed before step S202.
S204: Take the sum of the encoding duration, the network transmission time, and the recognition duration corresponding to each encoding method as the delay corresponding to that encoding method.
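The sum in step S204 can be written down directly. In this sketch the `Timing` record and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Timing:
    """Per-method timing components, all in seconds."""
    encoding_s: float
    network_s: float
    recognition_s: float

    @property
    def delay_s(self) -> float:
        # Delay = encoding duration + network transmission time + recognition duration.
        return self.encoding_s + self.network_s + self.recognition_s


timings: Dict[str, Timing] = {
    "A": Timing(encoding_s=0.004, network_s=0.035, recognition_s=0.120),
    "B": Timing(encoding_s=0.010, network_s=0.020, recognition_s=0.110),
}
delays = {method: timing.delay_s for method, timing in timings.items()}
```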
S205: Determine the optimal encoding method based on the delay and the recognition accuracy corresponding to each encoding method.
In one implementation, a coding method with the least time delay is selected from the coding methods with the recognition accuracy exceeding the accuracy threshold, and the coding method with the least time delay is used as an optimal coding method.
In another implementation manner, a coding method with the highest recognition accuracy is selected from the coding methods with time delay smaller than the time delay threshold, and the coding method with the highest recognition accuracy is used as an optimal coding method.
In an application scenario, determining an optimal encoding method based on the delay and the recognition accuracy corresponding to each encoding method may include: determining the optimal encoding parameter values of each encoding format based on the delay and the recognition accuracy corresponding to each set of encoding parameter values of that encoding format.
In another application scenario, determining an optimal encoding method based on the delay and the recognition accuracy corresponding to each encoding method may include: determining the optimal encoding parameter values of each encoding format based on the delay and the recognition accuracy corresponding to each set of encoding parameter values of that encoding format, and then determining the optimal encoding parameter values of the optimal encoding format based on the delay and the recognition accuracy corresponding to the optimal encoding parameter values of each encoding format.
Specifically, determining the optimal coding method based on the time delay and the recognition accuracy corresponding to each coding method may include: for each coding format, selecting the coding parameter value with the smallest time delay from among the coding parameter values whose recognition accuracy exceeds the accuracy threshold, and taking it as the optimal coding parameter value of that coding format; and then selecting, from the optimal coding parameter values of the different coding formats, the one with the smallest time delay, and taking it as the optimal coding parameter value of the optimal coding format.
In another embodiment, determining the optimal coding method based on the time delay and the recognition accuracy corresponding to each coding method may include: for each coding format, selecting the coding parameter value with the highest recognition accuracy from among the coding parameter values whose time delay is below the delay threshold, and taking it as the optimal coding parameter value of that coding format; and then selecting, from the optimal coding parameter values of the different coding formats, the one with the highest recognition accuracy, and taking it as the optimal coding parameter value of the optimal coding format.
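The two-level selection (first within each coding format, then across formats) can be sketched as follows for the minimum-delay variant. The `Candidate` tuple, the function names, and the idea of representing a group of parameter values as a string are assumptions made for this example only.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    """One group of coding parameter values with its measurements (hypothetical)."""
    params: str       # label for the group of coding parameter values
    delay_ms: float   # time delay measured for this group
    accuracy: float   # recognition accuracy in [0, 1]


def best_per_format(measurements: Dict[str, List[Candidate]],
                    accuracy_threshold: float) -> Dict[str, Candidate]:
    """Level 1: per format, the smallest-delay candidate above the accuracy threshold."""
    best: Dict[str, Candidate] = {}
    for fmt, candidates in measurements.items():
        eligible = [c for c in candidates if c.accuracy > accuracy_threshold]
        if eligible:
            best[fmt] = min(eligible, key=lambda c: c.delay_ms)
    return best


def best_format(per_format: Dict[str, Candidate]) -> Optional[Tuple[str, Candidate]]:
    """Level 2: across formats, the per-format optimum with the smallest delay."""
    if not per_format:
        return None
    fmt = min(per_format, key=lambda f: per_format[f].delay_ms)
    return fmt, per_format[fmt]
```

The maximum-accuracy variant follows the same shape: filter on `delay_ms` against a delay threshold and take `max` over `accuracy` at both levels.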
In addition, when determining the optimal coding format based on the time delay and the recognition accuracy corresponding to the optimal coding parameter values of each coding format, the coding formats can also be ranked from best to worst on the same basis.
For example, suppose the recognition accuracy corresponding to the optimal coding parameter value of coding format A is 98% with a time delay of 1 ms; that of coding format B is 97% with a time delay of 1.5 ms; and that of coding format C is 99% with a time delay of 1.25 ms. If the coding format with the highest recognition accuracy is selected as the optimal coding format from among the coding formats whose time delay is below the delay threshold, the coding formats can be ranked in descending order of recognition accuracy, and the order from best to worst is C, A, B. If instead the coding format with the smallest time delay is selected as the optimal coding format from among the coding formats whose recognition accuracy exceeds the accuracy threshold, the coding formats can be ranked in ascending order of time delay, and the order from best to worst is A, C, B.
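The ranking in this example can be reproduced with a short Python sketch. The thresholds are not given in the text, so the values used in the usage note (a 2 ms delay threshold and a 95% accuracy threshold) are assumptions chosen so that all three formats qualify; the function names are likewise illustrative.

```python
from typing import Dict, List, Tuple

# format -> (recognition accuracy, delay in ms) of its optimal parameter values,
# taken from the worked example in the text
FORMATS: Dict[str, Tuple[float, float]] = {
    "A": (0.98, 1.0),
    "B": (0.97, 1.5),
    "C": (0.99, 1.25),
}


def rank_by_accuracy(formats: Dict[str, Tuple[float, float]],
                     delay_threshold_ms: float) -> List[str]:
    """Formats below the delay threshold, highest accuracy first."""
    eligible = {f: v for f, v in formats.items() if v[1] < delay_threshold_ms}
    return sorted(eligible, key=lambda f: eligible[f][0], reverse=True)


def rank_by_delay(formats: Dict[str, Tuple[float, float]],
                  accuracy_threshold: float) -> List[str]:
    """Formats above the accuracy threshold, smallest delay first."""
    eligible = {f: v for f, v in formats.items() if v[0] > accuracy_threshold}
    return sorted(eligible, key=lambda f: eligible[f][1])
```

With the assumed thresholds, `rank_by_accuracy(FORMATS, 2.0)` yields `["C", "A", "B"]` and `rank_by_delay(FORMATS, 0.95)` yields `["A", "C", "B"]`, matching the two orders stated above.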
In yet another application scenario, where each coding format has one group of coding parameter values, determining the optimal coding method based on the time delay and the recognition accuracy corresponding to each coding method may include: determining the optimal coding format based on the time delay and the recognition accuracy corresponding to the group of coding parameter values of each coding format.
Based on the schemes of the first and second embodiments, the present application can also debug the optimal coding method for audio files with different parameters, so as to determine the optimal coding method for the audio files of each parameter. Specifically, before step S201, the audio files with the various parameters may each in turn be taken as the same audio file, so that the optimal coding method for the audio files of each parameter is determined in turn. The audio parameters may include, but are not limited to, audio duration and audio quality. With the debugging method of the above embodiments, the optimal coding method may be determined for audio of different durations, for audio of different sound qualities, or for audio that differs in both duration and sound quality.
After debugging according to the above method, a correspondence table between audio parameters and optimal coding methods can be obtained and stored. After the terminal collects audio, it can then look up the optimal coding method for the collected audio in the correspondence table based on the parameters of that audio, and encode the collected audio with that coding method.
In addition, when the coding method includes a coding format and coding parameters, the correspondence table may map the audio parameters to the optimal coding parameter values of each coding format and to the optimal coding format. The correspondence table may also store the ranking of the coding formats from best to worst, so that the terminal can select the best coding method among the coding formats it supports.
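The lookup described above can be sketched as follows. The table layout, the key shape (a duration bucket plus a quality label), and every name and value in `TABLE` are hypothetical placeholders; the patent only states that audio parameters map to optimal coding methods and that a best-to-worst ranking of formats may be stored.

```python
from typing import Dict, List, Optional, Set, Tuple

AudioKey = Tuple[str, str]             # (duration bucket, quality label), assumed key shape
RankedMethods = List[Tuple[str, str]]  # (coding format, optimal parameter values), best first

# Hypothetical correspondence table produced by the debugging procedure.
TABLE: Dict[AudioKey, RankedMethods] = {
    ("short", "high"): [("fmt_a", "params_1"), ("fmt_b", "params_2")],
    ("long", "low"): [("fmt_b", "params_3"), ("fmt_a", "params_4")],
}


def choose_method(table: Dict[AudioKey, RankedMethods],
                  key: AudioKey,
                  supported_formats: Set[str]) -> Optional[Tuple[str, str]]:
    """Return the best-ranked (format, params) pair that the terminal supports."""
    for fmt, params in table.get(key, []):
        if fmt in supported_formats:
            return fmt, params
    return None  # no entry for these audio parameters, or no supported format
```

A terminal that supports only `fmt_b` would fall through to the second entry for short, high-quality audio, which is the purpose of storing the full ranking rather than only the top choice.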
The above debugging method of the audio coding method is generally implemented by a debugging device, so the present application also proposes a debugging device of the audio coding method. Referring to FIG. 3, FIG. 3 is a schematic structural diagram of an embodiment of the debugging device of the audio coding method of the present application. The debugging device 10 of the audio coding method comprises a processor 12 and a memory 11; the memory 11 is used for storing program instructions that implement the debugging method of the audio coding method described above, and the processor 12 is used for executing the program instructions stored in the memory 11.
The logic of the debugging method of the audio coding method is embodied as a computer program. If the computer program is sold or used as a standalone software product, it can be stored in a computer storage medium, so the present application also proposes a readable storage medium. Referring to FIG. 4, FIG. 4 is a schematic structural diagram of an embodiment of the readable storage medium of the present application. A computer program 21 is stored in the readable storage medium 20 of this embodiment, and the steps of the above debugging method of the audio coding method are implemented when the computer program is executed by a processor.
The readable storage medium 20 may be a medium capable of storing a computer program, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk. It may also be a server that stores the computer program; the server may send the stored computer program to another device to be run, or may run the stored computer program itself. Physically, the readable storage medium 20 may also be a combination of multiple entities, such as a plurality of servers, a server plus a memory, or a memory plus a removable hard disk.
The foregoing describes only embodiments of the present application and is not intended to limit the scope of the patent. All equivalent structures or equivalent processes derived from the description and contents of the present application, or applied directly or indirectly in other related technical fields, are likewise included in the scope of patent protection of the present application.
Claims (9)
1. A method of debugging an audio encoding method, the method comprising:
acquiring a file obtained by encoding the same audio file by each encoding method, and determining the encoding time length corresponding to each encoding method;
determining the time length and accuracy of the voice recognition of the coded files of each coding method by the same voice recognition algorithm;
calculating the corresponding time delay of each coding method according to the coding time length and the identification time length corresponding to each coding method;
and selecting the coding method with the minimum time delay from the coding methods with the recognition accuracy exceeding the threshold value, and taking the coding method with the minimum time delay as the optimal coding method.
2. The method for debugging an audio encoding method according to claim 1, wherein said obtaining a file obtained by encoding the same audio file by each encoding method, determining the encoding duration corresponding to each encoding method, comprises: acquiring a file obtained by encoding the same audio file by each encoding method transmitted by equipment, encoding time length and a transmitting time point corresponding to each encoding method;
the calculating the time delay corresponding to each coding method by the coding time length and the identification time length corresponding to each coding method comprises the following steps: calculating network transmission time corresponding to each coding method based on the transmission time point corresponding to each coding method;
and taking the sum of the coding time length, the network transmission time and the identification time length corresponding to each coding method as the time delay corresponding to each coding method.
3. The method for debugging an audio encoding method according to claim 1, wherein the encoding method comprises an encoding format and encoding parameters,
the obtaining the file obtained by encoding the same audio file by each encoding method, and determining the encoding time length of each encoding method comprises the following steps: acquiring a file obtained by encoding the same audio file by using each group of encoding parameter values of each encoding format, and determining the encoding time length corresponding to each group of encoding parameter values of each encoding format;
the selecting the coding method with the minimum time delay from the coding methods with the recognition accuracy exceeding the threshold value, and taking the coding method with the minimum time delay as the optimal coding method, comprises: selecting the coding parameter value with the minimum time delay from the coding parameter values of each coding format whose recognition accuracy exceeds the accuracy threshold value, and taking the coding parameter value with the minimum time delay as the optimal coding parameter value of each coding format.
4. The method of debugging an audio encoding method as claimed in claim 3, wherein, after said taking the coding parameter value with the minimum time delay as the optimal coding parameter value of each coding format, the method further comprises: selecting the optimal coding parameter value with the minimum time delay from the optimal coding parameter values of the different coding formats, and taking it as the optimal coding parameter value of the optimal coding format.
5. A debugging method of an audio encoding method according to claim 3, wherein,
the obtaining a file obtained by encoding the same audio file by using each encoding method comprises the following steps: and storing each coding parameter of each coding format and the value range thereof.
6. The method for debugging an audio encoding method according to claim 1, wherein,
the obtaining the file obtained by encoding the same audio file by each encoding method comprises the following steps: acquiring a file obtained by encoding the same audio file for a plurality of times by using each encoding method;
determining the time length and the accuracy of the voice recognition of the coded files of each coding method by the same voice recognition algorithm, wherein the method comprises the following steps: the ratio of the number of the plurality of coded files of each coding method, which are correctly identified by the voice identification algorithm, to the total number of the plurality of coded files of each coding method is used as the identification accuracy corresponding to each coding method;
and taking the average value of the identification time lengths of the correctly identified coded files in the plurality of coded files of each coding method as the corresponding identification time length of each coding method.
7. The debugging method of an audio encoding method as claimed in claim 1, wherein the audio files comprise audio files having different parameters;
the method further comprises the steps of: sequentially taking the audio files with various parameters as the same audio file;
the selecting the coding method with the minimum time delay from the coding methods with the recognition accuracy exceeding the threshold value, and taking the coding method with the minimum time delay as the optimal coding method, comprises: determining the optimal coding method corresponding to the audio file of each parameter based on the time delay and the recognition accuracy corresponding to each coding method.
8. A debugging device of an audio coding method, which is characterized by comprising a memory and a processor; the memory has stored therein a computer program, the processor being adapted to execute the computer program to carry out the steps of the method according to any of claims 1-7.
9. A readable storage medium having stored thereon a computer program, wherein the program when executed by a processor realizes the steps of the method according to any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010448481.8A CN111739543B (en) | 2020-05-25 | 2020-05-25 | Debugging method of audio coding method and related device thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111739543A (en) | 2020-10-02 |
CN111739543B (en) | 2023-05-23 |
Family
ID=72647673
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010448481.8A Active CN111739543B (en) | 2020-05-25 | 2020-05-25 | Debugging method of audio coding method and related device thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111739543B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||