CN112669796A - Method and device for converting music into music book based on artificial intelligence - Google Patents
- Publication number
- CN112669796A (CN202011603739.3A)
- Authority
- CN
- China
- Prior art keywords
- music
- file
- format
- score
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The application relates to a method and a device for converting music into a music score based on artificial intelligence, and belongs to the technical field of computers. The method comprises the following steps: inputting a music file into a pre-trained music recognition model to obtain an intermediate file, wherein the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, each group of sample data comprises a sample music file and the digital music score corresponding to that sample music file, and the intermediate file comprises feature vectors indicating the music information corresponding to the music file; acquiring a desired music score format; and calling a preset file conversion tool to convert the file format of the intermediate file into the desired music score format, thereby obtaining the music score file corresponding to the music file. This addresses the problem that the MIDI files produced by existing music-to-score methods often lack voice-part division, and lack information such as key signatures, playing techniques and pedals, so that they can be transcribed into a music score only after a separate division step and a separate recognition step. When the music score file is in a picture format, it can be understood directly by the user without machine translation.
Description
Technical Field
The application relates to a method and a device for converting music into music score based on artificial intelligence, and belongs to the technical field of computers.
Background
Music-to-score technology refers to technology that converts music into a readable and playable music score. Currently, music-to-score technology can be implemented by computer devices.
In a typical music-to-score method, music may be converted into a Musical Instrument Digital Interface (MIDI) file. However, MIDI files often lack voice-part division and can be transcribed into a music score only after a separate division step; they also lack information such as key signatures, playing techniques and pedals, and can be transcribed into a music score only after that information has been recognized separately.
Disclosure of Invention
The application provides a method and a device for converting music into a music score based on artificial intelligence, which can solve the problem that the MIDI files produced by conventional music-to-score methods often lack voice-part division and can be transcribed into a music score only after a separate division step, and lack information such as key signatures, playing techniques and pedals, which must be recognized separately before transcription. The application provides the following technical solutions:
In a first aspect, a method for converting music into a music score based on artificial intelligence is provided, the method comprising:
inputting a music file into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises feature vectors indicating the music information corresponding to the music file;
acquiring a desired music score format;
and calling a preset file conversion tool to convert the file format of the intermediate file into the desired music score format to obtain the music score file corresponding to the music file.
Optionally, the music recognition model is obtained as follows: after the sample data are obtained, the digital music score in each group of sample data is converted into a corresponding sample intermediate file using the file conversion tool; and the artificial intelligence model is trained based on the sample music file in each group of sample data and the sample intermediate file corresponding to each sample music file, to obtain the music recognition model.
Optionally, the music recognition model is obtained as follows: after the sample data are obtained, the digital music score in each group of sample data is converted into a corresponding sample intermediate file using the file conversion tool; the sample music file in each group of sample data is converted into a spectrum file, and the spectrum file is divided into a plurality of music pieces; and the artificial intelligence model is trained based on the plurality of music pieces corresponding to each sample music file and the sample intermediate file, to obtain the music recognition model.
Optionally, the music information includes: the instruments, key signatures, beats, tempo, notes, rests, pitches, durations, tempo changes, key-signature changes, bar division, voice-part division, clef allocation, musical notation, accidentals, and ornaments of the music.
Optionally, the desired score format comprises a first format type and/or a second format type;
the first format type is a format type of a file for storing human-readable music symbols corresponding to music files;
the second format type is a format type of a file storing music information readable by a computer program corresponding to the music file.
Optionally, the first format type includes at least one of the following: a picture format and the Portable Document Format (PDF);
the second format type includes at least one of: MIDI format and MXL format.
Optionally, calling a preset file conversion tool to convert the file format of the intermediate file into the desired music score format to obtain the music score file corresponding to the music file includes:
determining the file conversion tool corresponding to the desired music score format;
and calling the determined file conversion tool to convert the file format of the intermediate file into the desired music score format to obtain the music score file.
In a second aspect, an apparatus for converting music into music score based on artificial intelligence is provided, the apparatus comprising:
a music recognition module, used for inputting a music file into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises feature vectors indicating the music information corresponding to the music file;
a format acquisition module, used for acquiring a desired music score format;
and a format conversion module, used for calling a preset file conversion tool to convert the file format of the intermediate file into the desired music score format to obtain the music score file corresponding to the music file.
The beneficial effects of the application are as follows. A music file is input into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises feature vectors indicating the music information corresponding to the music file; a desired music score format is acquired; and a preset file conversion tool is called to convert the file format of the intermediate file into the desired music score format, obtaining the music score file corresponding to the music file. This solves the problem that the MIDI files produced by existing music-to-score methods often lack voice-part division and can be transcribed into a music score only after a separate division step, and lack information such as key signatures, playing techniques and pedals, which must be recognized separately before transcription. Because the intermediate file generated by the music recognition model is input into a file conversion tool, which converts it into the music score format the user requires, file formats can be converted flexibly. Meanwhile, when the file format is a picture format, the file can be understood directly by the user without machine translation.
In addition, the music information includes the information required for producing the music score corresponding to the music, such as instruments, key signatures, beats, tempo, notes, rests, pitches, durations, beat changes, tempo changes, key-signature changes, bar division, voice-part division, clef allocation, musical notation allocation, playing techniques, accidentals, ornaments, and pedals. Moreover, small deviations between a music file and its score that arise from the performer's interpretation and expression, such as slight variations in the length of notes of the same kind or in the pause time of rests of the same kind, are effectively ignored. As a result, a music score that is accurate, easy to read, and convenient to perform can be generated.
The foregoing is only an overview of the technical solutions of the present application. To make these technical solutions clearer, and to enable them to be implemented according to the description, preferred embodiments of the present application are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a flowchart of a method for converting music into music score based on artificial intelligence according to an embodiment of the present application;
FIG. 2 is a schematic diagram of feature extraction of a sample music file provided by one embodiment of the present application;
FIG. 3 is a schematic diagram of a process for generating a feature vector according to an embodiment of the present application;
FIG. 4 is a schematic illustration of an intermediate file provided by an embodiment of the present application;
FIG. 5 is a block diagram of an apparatus for converting music into music score based on artificial intelligence provided by an embodiment of the present application;
FIG. 6 is a block diagram of an apparatus for converting music into music score based on artificial intelligence according to another embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application will be described in conjunction with the accompanying drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
First, several terms referred to in the present application will be described.
Long Short-Term Memory network (LSTM): a special kind of recurrent neural network. An LSTM is a time-recursive neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series.
Connectionist Temporal Classification (CTC) loss function: used to handle the alignment between input and output labels in sequence labeling problems. Conventional sequence labeling algorithms require the input and output symbols to be perfectly aligned at every time step. CTC instead extends the label set by adding a blank (null) element. After sequences are labeled with the extended label set, every predicted sequence that the mapping function converts into the true sequence counts as a correct prediction. The CTC loss function can therefore produce a predicted sequence without any prior alignment of the data.
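The definition above can be stated compactly in the standard textbook form of the CTC objective. The symbols here are notation introduced for this illustration and do not appear in the application: x is the input sequence of T frames, l is the true label sequence, π is a frame-level labeling over the extended label set, B is the mapping function that merges repeated labels and removes blanks, and y^t_k is the network's output probability for label k at frame t.

```latex
p(l \mid x) \;=\; \sum_{\pi \,\in\, \mathcal{B}^{-1}(l)} \; \prod_{t=1}^{T} y^{t}_{\pi_t},
\qquad
\mathcal{L}_{\mathrm{CTC}} \;=\; -\ln p(l \mid x)
```

Every frame-level labeling π that B maps to l contributes to the probability of l, which is why no frame-by-frame alignment of the training data is needed.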
Optionally, the present application is described by taking an execution subject of each embodiment as an example of an electronic device with computing capability, where the electronic device may be a desktop computer, a notebook computer, a server, a mobile phone, a tablet computer, a wearable device, and the like, and the embodiment does not limit the type of the electronic device.
Fig. 1 is a flowchart of a method for converting music into music score based on artificial intelligence according to an embodiment of the present application. The method at least comprises the following steps:
Music files refer to audio files that can be played directly by an audio player. Optionally, the file format of the music file may be MP3 format and/or WAV format, and the file format of the music file is not limited in this embodiment.
The music recognition model is pre-stored in the electronic device.
Optionally, training the artificial intelligence model with multiple groups of sample data to obtain the music recognition model includes: after the sample data are obtained, converting the digital music score in each group of sample data into a corresponding sample intermediate file using a file conversion tool; and training the artificial intelligence model based on the sample music file in each group of sample data and the sample intermediate file corresponding to each sample music file, to obtain the music recognition model. Alternatively, after the sample data are obtained, converting the digital music score in each group of sample data into a corresponding sample intermediate file using a file conversion tool; converting the sample music file in each group of sample data into a spectrum file, and dividing the spectrum file into a plurality of music pieces; and training the artificial intelligence model based on the plurality of music pieces corresponding to each sample music file and the sample intermediate file, to obtain the music recognition model.
Optionally, the sample data come from free public-domain music and its corresponding digital music scores. Because public-domain music and digital music scores are abundant and varied, they can provide rich data for training the artificial intelligence model.
As data preprocessing, the digital music scores in the sample data are converted into sample intermediate files. These sample intermediate files serve as the labels of the music files during model training. Optionally, the sample data may be divided into three parts: a training set, a development set, and a test set. The training set is used to train several candidate music recognition model architectures; the trained models are evaluated on the development set, and suitable models are selected from them for tuning; finally, the test set is used to evaluate the final performance and obtain the final music recognition model.
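The three-way division of the sample data can be sketched as follows. This is a minimal illustration only: the 80/10/10 proportions, the fixed random seed, the function name `split_samples`, and the file names are all assumptions made for the example, since the application does not specify how the split is performed.

```python
import random

def split_samples(samples, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle (music file, digital score) pairs and split them three ways.

    The fractions are illustrative assumptions; whatever is left after the
    training and development sets becomes the test set.
    """
    if not 0 < train_frac < 1 or not 0 <= dev_frac < 1 or train_frac + dev_frac >= 1:
        raise ValueError("fractions must leave room for a non-empty test share")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for repeatability
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test

# Hypothetical sample pairs: an audio file and its matching digital score.
pairs = [(f"piece_{i}.wav", f"piece_{i}.musicxml") for i in range(10)]
train_set, dev_set, test_set = split_samples(pairs)
print(len(train_set), len(dev_set), len(test_set))
```

Keeping the test set untouched until the very end is what lets it measure final performance, as the paragraph above describes.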
Optionally, the artificial intelligence network includes an LSTM, in which case the music recognition model is built on the LSTM; and/or a Gated Recurrent Unit (GRU), in which case the music recognition model is built on the GRU. This embodiment does not limit the network type of the artificial intelligence network; in other implementations, the artificial intelligence network may also be another type of network model.
Optionally, the music recognition model is trained using a CTC loss function. In other embodiments, other loss functions may be used in the training process, such as a Conditional Random Field (CRF) loss function, and one or more loss functions may be used; this embodiment does not limit the type or number of loss functions.
For example, referring to FIG. 2, the sample music file is converted into a corresponding spectrogram, which records the energy of each audio frequency at different moments. The spectrogram is then divided into a plurality of music pieces according to a preset sampling rate and a preset sampling length. For example, if the audio length is 2 seconds, the sampling rate is 100 Hz, and the sampling length is 0.05 seconds, the audio is divided into 200 pieces, each 0.05 seconds long (that is, one piece begins every 0.01 seconds, so adjacent pieces overlap). If the spectrogram is 400 pixels long, it is divided into 200 pieces, each 10 pixels wide. Each music piece is then input into a feature extraction network in the artificial intelligence model to extract its feature vector, yielding the feature data of each music piece. The feature extraction network may be a neural network including a plurality of convolutional layers. Then, referring to FIG. 3, taking as an example the case where the artificial intelligence network further includes an LSTM and the loss function is the CTC loss, the feature data of the music pieces at different moments obtained in FIG. 2 are input to a bidirectional LSTM network, and the network outputs vectors y_1 to y_n conforming to the CTC loss function specification. Each column represents the sound information required for a score (e.g., c3, a, a half rest, etc.). Each column may be designed with an open-source score format (e.g., MusicXML) in mind, so that the resulting intermediate file can be converted into a score file by a simple transformation. The outputs are arranged into a matrix and processed to obtain the intermediate file of the whole sample music file.
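The arithmetic in the segmentation example can be checked with a short sketch. It rests on one interpretation that the text does not state outright: the 100 Hz sampling rate is read as "one piece starts every 0.01 seconds", which is the reading under which 2 seconds of audio gives 200 pieces of 0.05 seconds each. The function name `segment_spectrogram`, the zero padding at the tail, and the four frequency bins are assumptions for the example.

```python
def segment_spectrogram(spectrogram, hop_px, window_px):
    """Cut a spectrogram (a list of time columns) into overlapping windows.

    One window starts every hop_px columns. The tail is padded with silent
    (all-zero) columns so that every window has exactly window_px columns;
    the padding is an assumption, not something the application specifies.
    """
    if hop_px <= 0 or window_px <= 0:
        raise ValueError("hop_px and window_px must be positive")
    n_bins = len(spectrogram[0]) if spectrogram else 0
    silent = [0.0] * n_bins
    segments = []
    for start in range(0, len(spectrogram), hop_px):
        window = spectrogram[start:start + window_px]
        window = window + [silent] * (window_px - len(window))
        segments.append(window)
    return segments

# Numbers from the example: 2 s of audio, 100 Hz, 0.05 s pieces, 400 px wide.
duration_s = 2.0
sampling_rate_hz = 100
window_s = 0.05
width_px = 400

px_per_second = width_px / duration_s              # 200 pixels per second
hop_px = round(px_per_second / sampling_rate_hz)   # a new piece every 2 pixels
window_px = round(px_per_second * window_s)        # each piece is 10 pixels wide

# Dummy spectrogram: 400 time columns with 4 frequency bins each.
spectrogram = [[0.0] * 4 for _ in range(width_px)]
segments = segment_spectrogram(spectrogram, hop_px, window_px)
print(len(segments), len(segments[0]))
```

Under this reading the pieces overlap by 8 of their 10 pixels, which gives the downstream network several looks at each moment of the audio.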
Schematically, one possible intermediate file is illustrated in FIG. 4. Suppose the output of the artificial intelligence model at each moment is one of the sound-information vectors y_1 to y_n in FIG. 3; y_1 to y_n are then arranged into a matrix, and the matrix is folded to obtain the intermediate file.
Folding is a property of the CTC algorithm familiar from speech recognition: it merges consecutive repeated outputs into a single output. For example, for the pronunciation of the word "music", the letters output every 0.1 second may be "mmuussiiiic"; folding this output yields "music".
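The folding step can be sketched in a few lines. This is the generic CTC collapse rule (merge consecutive repeats, then drop blanks), not code from the application; the blank symbol "-" and the note names are assumptions chosen for the example.

```python
from itertools import groupby

BLANK = "-"  # assumed symbol for the CTC blank (the "null element")

def ctc_collapse(frames, blank=BLANK):
    """Fold a per-frame output sequence into a label sequence.

    Step 1 merges runs of identical consecutive symbols into one symbol.
    Step 2 removes blanks. A blank between two identical symbols keeps
    them as two separate labels, e.g. a repeated note.
    """
    merged = [symbol for symbol, _ in groupby(frames)]
    return [symbol for symbol in merged if symbol != blank]

# Letter example in the spirit of the text: repeated frames fold to one word.
letters = "".join(ctc_collapse("mmuussiiiic"))

# Note example: the blank between the two C4 runs preserves the repeated note.
notes = ctc_collapse(["C4", "C4", "-", "C4", "E4", "E4", "-", "-", "G4"])
print(letters, notes)
```

The blank matters for music in particular: without it, a note played twice in a row would be indistinguishable from one held note.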
Optionally, in this embodiment, the music information includes, but is not limited to: the instruments, key signatures, beats, tempo, notes, rests, pitches, durations, tempo changes, key-signature changes, bar division, voice-part division, clef allocation, musical notation, accidentals, ornaments, pedals, and the like of the music. The music information may also include other information found in a real music score, and this embodiment does not limit its content.
The desired score format is the score format that the user wishes to obtain. Optionally, the desired score format includes a first format type and/or a second format type. The first format type is the format type of a file storing human-readable music symbols corresponding to the music file; for example, the first format type includes at least one of: a picture format and the Portable Document Format (PDF). The first format type may also be another type, which this embodiment does not enumerate. The second format type is the format type of a file storing music information, corresponding to the music file, that can be read by a computer program; the music information must be translated by a machine to obtain the corresponding music symbols. For example, the second format type includes at least one of: MIDI format and MXL format. The second format type may also be another type, which this embodiment does not enumerate.
Optionally, obtaining the desired score format comprises: displaying a format selection interface, wherein the format selection interface displays a plurality of music score formats; when a selection operation of at least one score format in a plurality of score formats is received, the score format indicated by the selection operation is determined as a desired score format. Or receiving an expected music score format sent by other equipment; or, a default expected score format is read, and the embodiment does not limit the manner of obtaining the expected score format.
Step 103, a preset file conversion tool is called to convert the file format of the intermediate file into an expected music score format, so as to obtain a music score file corresponding to the music file.
The file conversion tool supports converting the file format of the intermediate file into the desired music score format; it also supports converting a score format (such as a digital music score format) into the file format of the intermediate file.
In one example, the file conversion tool converts the file format of the intermediate file into a desired music score format, resulting in a music score file corresponding to the music file, comprising: creating a score file in a desired score format; identifying each feature vector in the intermediate file to obtain corresponding music information; and writing the music information into the pre-created music score file in a desired music score format to obtain the music score file with the desired music score format.
The file conversion tool is able to interpret the feature vectors. For example, each feature vector corresponds to one part of the music information, and combining all of these information segments yields the complete music information.
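How a conversion tool might read music information out of feature vectors can be sketched as below. Everything specific here is hypothetical: the application does not define the vector layout, so this example assumes each vector holds one score per entry of a small token vocabulary and that the highest-scoring entry is the one intended. The names `VOCAB`, `decode_columns`, and `one_hot` are invented for the illustration.

```python
# Hypothetical token vocabulary; the real layout is not given in the text.
VOCAB = [
    "clef-G2",
    "key-C",
    "time-4/4",
    "note-C4-quarter",
    "note-E4-quarter",
    "rest-half",
    "barline",
]

def decode_columns(columns, vocab=VOCAB):
    """Turn each feature vector into the vocabulary token it scores highest."""
    tokens = []
    for column in columns:
        if len(column) != len(vocab):
            raise ValueError("feature vector length must match vocabulary size")
        best = max(range(len(column)), key=column.__getitem__)
        tokens.append(vocab[best])
    return tokens

def one_hot(index, size=len(VOCAB)):
    """Build a toy feature vector that selects exactly one token."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# A toy intermediate file: six feature vectors, one per piece of information.
intermediate = [one_hot(i) for i in (0, 2, 3, 3, 5, 6)]
music_info = decode_columns(intermediate)
print(music_info)
```

Once the information segments are recovered as tokens, writing them out in a particular score format becomes a separate, format-specific step.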
Optionally, different expected music score formats correspond to different file conversion tools, and at this time, the electronic device further needs to determine a file conversion tool corresponding to the expected music score format; and calling the determined file conversion tool to convert the file format of the intermediate file into the expected music score format to obtain the music score file.
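Selecting a conversion tool by format can be sketched as a simple lookup table. This is one possible arrangement, not the application's implementation: the registry `CONVERTERS`, the function `convert_score`, and the placeholder converters are all hypothetical, and the placeholders return a tagged string instead of producing real PDF, PNG, MIDI, or MXL output.

```python
def _placeholder(format_name):
    """Build a stand-in converter for one format.

    A real tool would render the music information in that format; this
    stub only tags the tokens so the dispatch logic can be demonstrated.
    """
    def convert(tokens):
        return f"[{format_name}] {' '.join(tokens)}"
    return convert

# Hypothetical registry: desired score format -> conversion tool.
CONVERTERS = {
    "pdf": _placeholder("pdf"),
    "png": _placeholder("png"),
    "midi": _placeholder("midi"),
    "mxl": _placeholder("mxl"),
}

def convert_score(tokens, desired_format):
    """Determine the tool for the desired format, then call it."""
    key = desired_format.lower()
    try:
        converter = CONVERTERS[key]
    except KeyError:
        raise ValueError(f"unsupported score format: {desired_format!r}") from None
    return converter(tokens)

result = convert_score(["note-C4-quarter"], "MIDI")
print(result)
```

With a table like this, supporting a new score format means registering one more converter, without touching the recognition model or the intermediate file.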
For example, if the file format of the intermediate file is designed with reference to MusicXML, a tool for converting the intermediate file into a MusicXML file may be written and used to perform the conversion. The MusicXML file can then be converted to PDF, MIDI, and other formats using third-party open-source tools. For example, a MusicXML file may be converted to PDF, MIDI, etc. using the open-source software MuseScore; a LilyPond file may be converted to a picture using LilyPond; a MIDI file may be converted to MP3 using a synthesizer; and so on.
In summary, in the method for converting music into a music score based on artificial intelligence provided by this embodiment, a music file is input into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises feature vectors indicating the music information corresponding to the music file; a desired music score format is acquired; and a preset file conversion tool is called to convert the file format of the intermediate file into the desired music score format, obtaining the music score file corresponding to the music file. This solves the problem that the MIDI files produced by existing music-to-score methods often lack voice-part division and can be transcribed into a music score only after a separate division step, and lack information such as key signatures, playing techniques and pedals, which must be recognized separately before transcription. Because the intermediate file generated by the music recognition model is input into a file conversion tool, which converts it into the music score format the user requires, file formats can be converted flexibly. Meanwhile, when the file format is a picture format, the file can be understood directly by the user without machine translation.
In addition, the music information includes the information required for producing the music score corresponding to the music, such as instruments, key signatures, beats, tempo, notes, rests, pitches, durations, beat changes, tempo changes, key-signature changes, bar division, voice-part division, clef allocation, musical notation, accidentals, and ornaments. Moreover, small deviations between a music file and its score that arise from the performer's interpretation and expression, such as slight variations in the length of notes of the same kind or in the pause time of rests of the same kind, are effectively ignored. As a result, a music score that is accurate, easy to read, and convenient to play can be generated.
Fig. 5 is a block diagram of an apparatus for converting music into music score based on artificial intelligence according to an embodiment of the present application. The device at least comprises the following modules: a music recognition module 510, a format acquisition module 520, and a format conversion module 530.
A music recognition module 510, configured to input a music file into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model with multiple groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises feature vectors indicating the music information corresponding to the music file;
a format acquisition module 520, configured to acquire a desired music score format;
a format conversion module 530, configured to call a preset file conversion tool to convert the file format of the intermediate file into the desired music score format, so as to obtain the music score file corresponding to the music file.
For relevant details reference is made to the above-described method embodiments.
It should be noted that the division into functional modules described above is only an example of how the device converts music into a music score. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device and the method for converting music into a music score based on artificial intelligence provided by the above embodiments belong to the same concept; the specific implementation process of the device is described in detail in the method embodiments and is not repeated here.
Fig. 6 is a block diagram of an apparatus for converting music into music score based on artificial intelligence according to an embodiment of the present application, such as: a smartphone, a tablet, a laptop, a desktop, or a server. The apparatus for converting music into music score based on artificial intelligence may also be referred to as a user equipment, a portable terminal, a laptop terminal, a desktop terminal, a control terminal, etc., which is not limited in this embodiment. The apparatus comprises at least a processor 601 and a memory 602.
Processor 601 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a Central Processing Unit (CPU), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for handling computational operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the artificial intelligence based music to music score method provided by the method embodiments herein.
In some embodiments, the apparatus for converting music into a music score based on artificial intelligence may further include a peripheral interface and at least one peripheral. The processor 601, the memory 602, and the peripheral interface may be connected by a bus or signal lines. Each peripheral may be connected to the peripheral interface via a bus, a signal line, or a circuit board. Illustratively, the peripherals include, but are not limited to, a radio frequency circuit, a touch display screen, an audio circuit, and a power supply.
Of course, the apparatus for converting music into a music score based on artificial intelligence may also include fewer or more components, which is not limited in this embodiment.
Optionally, the present application further provides a computer-readable storage medium in which a program is stored, where the program is loaded and executed by a processor to implement the method for converting music into a music score based on artificial intelligence of the above method embodiments.
Optionally, the present application further provides a computer product, which includes a computer-readable storage medium, in which a program is stored, where the program is loaded and executed by a processor to implement the method for converting music into music score based on artificial intelligence of the above-mentioned method embodiments.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present patent shall be subject to the appended claims.
Claims (8)
1. An artificial intelligence based music-to-music score method, the method comprising:
inputting a music file into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model using a plurality of groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises a feature vector used for indicating music information corresponding to the music file;
acquiring an expected music score format;
and calling a preset file conversion tool to convert the file format of the intermediate file into the expected music score format to obtain the music score file corresponding to the music file.
2. The method of claim 1, wherein the music recognition model is obtained by: after the sample data is obtained, converting the digital music score in each group of sample data into a corresponding sample intermediate file using the file conversion tool; and training the artificial intelligence model based on the sample music file in each group of sample data and the sample intermediate file corresponding to each sample music file, to obtain the music recognition model.
3. The method of claim 1, wherein the music recognition model is obtained by: after the sample data is obtained, converting the digital music score in each group of sample data into a corresponding sample intermediate file using the file conversion tool; converting the sample music file in each group of sample data into a frequency spectrum file, and dividing the frequency spectrum file into a plurality of music fragments; and training the artificial intelligence model based on the plurality of music fragments corresponding to each sample music file and the sample intermediate file.
4. The method of claim 1, wherein the music information comprises: musical instruments, key marks, tempos, speeds, notes, rests, pitches, durations, tempo changes, speed changes, key mark changes, bar divisions, vocal part divisions, clef allocation, musical notation, inflexion marks, decorative tones, and pedals of music pieces.
5. The method of claim 1, wherein the expected music score format comprises a first format type and/or a second format type;
the first format type is a format type of a file for storing human-readable music symbols corresponding to music files;
the second format type is a format type of a file storing music information readable by a computer program corresponding to the music file.
6. The method of claim 5,
the first format type includes at least one of: picture format, Portable Document Format (PDF);
the second format type includes at least one of: MIDI format and MXL format.
7. The method of claim 1, wherein calling the preset file conversion tool to convert the file format of the intermediate file into the expected music score format to obtain the music score file corresponding to the music file comprises:
determining a file conversion tool corresponding to the expected music score format;
and calling the determined file conversion tool to convert the file format of the intermediate file into the expected music score format to obtain the music score file.
8. An apparatus for converting music into music score based on artificial intelligence, the apparatus comprising:
the music recognition module is used for inputting a music file into a pre-trained music recognition model to obtain an intermediate file; the music recognition model is obtained by training an artificial intelligence model using a plurality of groups of sample data, wherein each group of sample data comprises a sample music file and a digital music score corresponding to the sample music file; the intermediate file comprises a feature vector used for indicating music information corresponding to the music file;
the format acquisition module is used for acquiring an expected music score format;
and the format conversion module is used for calling a preset file conversion tool to convert the file format of the intermediate file into the expected music score format to obtain the music score file corresponding to the music file.
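The flow recited in claims 1 and 5 to 7 (recognize, acquire the expected format, pick the matching conversion tool, convert) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the application: `IntermediateFile`, `recognize`, `CONVERTERS`, and `music_to_score` are illustrative names, the "model" is a stub that derives a toy feature vector from raw bytes, and the converters only return placeholder bytes rather than real PNG, PDF, MIDI, or MXL data. PNG is assumed here as one possible picture format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class IntermediateFile:
    """Hypothetical container for the feature vector named in claim 1."""
    features: List[float]


def recognize(music_bytes: bytes) -> IntermediateFile:
    """Stand-in for the pre-trained music recognition model (not a real model)."""
    # Toy feature vector: the first eight bytes scaled to the range [0, 1].
    return IntermediateFile(features=[b / 255.0 for b in music_bytes[:8]])


# Claims 5 and 6: human-readable formats versus program-readable formats.
# "png" is an assumed example of a picture format.
HUMAN_READABLE = {"png", "pdf"}
MACHINE_READABLE = {"midi", "mxl"}


def _make_stub_converter(fmt: str) -> Callable[[IntermediateFile], bytes]:
    """Build a placeholder conversion tool for one target format."""
    def convert(intermediate: IntermediateFile) -> bytes:
        # A real tool would render notation or write MIDI/MXL; this only tags the data.
        return f"{fmt}:{len(intermediate.features)} features".encode()
    return convert


# Claim 7: one conversion tool per expected music score format.
CONVERTERS: Dict[str, Callable[[IntermediateFile], bytes]] = {
    fmt: _make_stub_converter(fmt) for fmt in HUMAN_READABLE | MACHINE_READABLE
}


def music_to_score(music_bytes: bytes, expected_format: str) -> bytes:
    """Recognize the music, then convert the intermediate file to the expected format."""
    fmt = expected_format.lower()
    if fmt not in CONVERTERS:
        raise ValueError(f"unsupported score format: {expected_format}")
    intermediate = recognize(music_bytes)   # claim 1, first step
    return CONVERTERS[fmt](intermediate)    # claim 7: call the determined tool


if __name__ == "__main__":
    print(music_to_score(b"\x00\x01\x02", "MIDI"))
```

Keeping the converters in a lookup table keyed by format means a new output format only needs a new table entry, while the recognition step and the intermediate file stay unchanged.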
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011603739.3A CN112669796A (en) | 2020-12-29 | 2020-12-29 | Method and device for converting music into music book based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011603739.3A CN112669796A (en) | 2020-12-29 | 2020-12-29 | Method and device for converting music into music book based on artificial intelligence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112669796A (en) | 2021-04-16 |
Family
ID=75410652
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011603739.3A Pending CN112669796A (en) | 2020-12-29 | 2020-12-29 | Method and device for converting music into music book based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112669796A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113851146A (en) * | 2021-09-26 | 2021-12-28 | 平安科技(深圳)有限公司 | Performance evaluation method and device based on feature decomposition |
WO2023124472A1 (en) * | 2021-12-31 | 2023-07-06 | 腾讯音乐娱乐科技(深圳)有限公司 | Midi music file generation method, storage medium and terminal |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1379898A (en) * | 1999-09-16 | 2002-11-13 | 汉索尔索弗特有限公司 | Method and apparatus for playing musical instruments based on digital music file |
US20090064846A1 (en) * | 2007-09-10 | 2009-03-12 | Xerox Corporation | Method and apparatus for generating and reading bar coded sheet music for use with musical instrument digital interface (midi) devices |
CN102610222A (en) * | 2007-02-01 | 2012-07-25 | 缪斯亚米有限公司 | Music transcription method, system and device |
DE202015006043U1 (en) * | 2014-09-05 | 2015-10-07 | Carus-Verlag Gmbh & Co. Kg | Signal sequence and data carrier with a computer program for playing a piece of music |
CN105205047A (en) * | 2015-09-30 | 2015-12-30 | 北京金山安全软件有限公司 | Playing method, converting method and device of musical instrument music score file and electronic equipment |
WO2018194456A1 (en) * | 2017-04-20 | 2018-10-25 | Universiteit Van Amsterdam | Optical music recognition omr : converting sheet music to a digital format |
CN108806657A (en) * | 2018-06-05 | 2018-11-13 | 平安科技(深圳)有限公司 | Music model training, musical composition method, apparatus, terminal and storage medium |
WO2019205383A1 (en) * | 2018-04-28 | 2019-10-31 | 平安科技(深圳)有限公司 | Electronic device, deep learning-based music performance style identification method, and storage medium |
JP2020003536A (en) * | 2018-06-25 | 2020-01-09 | カシオ計算機株式会社 | Learning device, automatic music transcription device, learning method, automatic music transcription method and program |
CN110942758A (en) * | 2019-09-23 | 2020-03-31 | 广东互动电子网络媒体有限公司 | Machine vision-based music score recognition method and device |
CN111862944A (en) * | 2019-04-30 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Speech recognition apparatus, method, electronic device, and computer-readable storage medium |
CN111899727A (en) * | 2020-07-15 | 2020-11-06 | 苏州思必驰信息科技有限公司 | Training method and system for voice recognition model of multiple speakers |
CN111898753A (en) * | 2020-08-05 | 2020-11-06 | 字节跳动有限公司 | Music transcription model training method, music transcription method and corresponding device |
Non-Patent Citations (1)
Title |
---|
Zhang Yibin; Zhou Jie; Bian Zhaoqi; Guo Jun: "A Survey of Content-Based Audio and Music Analysis", Chinese Journal of Computers (计算机学报), no. 05 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110148394B (en) | Singing voice synthesizing method, singing voice synthesizing device, computer equipment and storage medium | |
EP3616190B1 (en) | Automatic song generation | |
CN106920547B (en) | Voice conversion method and device | |
CN108831437B (en) | Singing voice generation method, singing voice generation device, terminal and storage medium | |
CN110136689B (en) | Singing voice synthesis method and device based on transfer learning and storage medium | |
CN111782576B (en) | Background music generation method and device, readable medium and electronic equipment | |
EP3489946A1 (en) | Real-time jamming assistance for groups of musicians | |
CN111653265A (en) | Speech synthesis method, speech synthesis device, storage medium and electronic equipment | |
CN110164460A (en) | Sing synthetic method and device | |
CN103098124B (en) | Method and system for text to speech conversion | |
CN112669796A (en) | Method and device for converting music into music book based on artificial intelligence | |
CN108305611A (en) | Method, apparatus, storage medium and the computer equipment of text-to-speech | |
CN115101042B (en) | Text processing method, device and equipment | |
CN112735371A (en) | Method and device for generating speaker video based on text information | |
CN112289300A (en) | Audio processing method and device, electronic equipment and computer readable storage medium | |
Dongmei | Design of English text-to-speech conversion algorithm based on machine learning | |
CN112071299B (en) | Neural network model training method, audio generation method and device and electronic equipment | |
CN112786020B (en) | Lyric timestamp generation method and storage medium | |
CN116229935A (en) | Speech synthesis method, device, electronic equipment and computer readable medium | |
CN113626635B (en) | Song phrase dividing method, system, electronic equipment and medium | |
CN114242032A (en) | Speech synthesis method, apparatus, device, storage medium and program product | |
Khan et al. | Development of a music score editor based on musicxml | |
CN114093340A (en) | Speech synthesis method, speech synthesis device, storage medium and electronic equipment | |
CN113421544B (en) | Singing voice synthesizing method, singing voice synthesizing device, computer equipment and storage medium | |
CN116645957B (en) | Music generation method, device, terminal, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210416 |