CN110580458A - music score image recognition method combining multi-scale residual error type CNN and SRU - Google Patents
- Publication number
- CN110580458A (application CN201910787184.3A)
- Authority
- CN
- China
- Prior art keywords
- sru
- music score
- model
- data set
- cnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a music score image recognition method combining a multi-scale residual CNN and an SRU, comprising the following steps: first, establishing a data set of music score images; second, constructing a model by combining the multi-scale residual CNN with the SRU; third, training the model on the augmented data set. The model input is a music score image from the data set, and the ground-truth label is the semantic label corresponding to that image. The network parameters are adjusted step by step through a connectionist temporal classification (CTC) loss function until they reach their optimum, and the model finally outputs the predicted semantic information of the notes.
Description
Technical Field
The invention belongs to sequential image recognition, an important branch of the image recognition field. It applies a neural network to image recognition, optimizes the note recognition network for difficult notes, and converts music score images more accurately and quickly.
Background
A music score describes notes, tones, durations and related information in detail and has become the most direct way for musicians to learn, share and pass on music. However, many classical scores have been damaged or even lost through environmental change and the passage of time, so manual storage cannot preserve all scores completely and without loss. With the rapid development of computer applications and image scanning, Optical Music Recognition (OMR) can convert paper music documents into electronic files that a computer can read and understand, which makes them widely usable in fields such as music information retrieval and computer-aided music teaching. However, general-purpose score recognition algorithms are structurally complex and hard to implement, and existing commercial recognition software has low accuracy, so an OMR algorithm that is easy to implement and highly accurate is urgently needed.
Bainbridge et al. [1] proposed an early general framework for OMR algorithms, consisting mainly of image preprocessing, note recognition, music information reconstruction and final representation construction. Staff detection and removal, note segmentation and recognition, and the reassembly of note information are its technical difficulties; each step is hard to implement, and the overall recognition accuracy is insufficient. In recent years, driven by big data, machine learning and deep neural networks have been widely applied. Sober-Mira et al. [2] applied a Convolutional Neural Network (CNN) to the note recognition stage, improving the accuracy of the general-framework algorithm. Shi et al. [3] first proposed the Convolutional Recurrent Neural Network (CRNN) and applied it to scene text recognition with marked effect. Calvo-Zaragoza et al. [4] adopted the method of Shi et al. [3] for music score recognition and carried out model optimization and quantitative analysis: the input pictures are preprocessed, three monophonic music score images with a ratio of 1:4 are fed into the network in a uniform manner, and a Bidirectional Long Short-Term Memory (BiLSTM) network is used for the feature recognition part of the CRNN, forming a C-BiLSTM network. On input images of size 60 × 240 this yields a sequence error rate of about 22.37% and a symbol error rate of 2.16%, but because the feature extraction capability is insufficient, recognition accuracy remains inadequate for difficult notes such as partials and minor pitch lines.
OMR algorithm research to date has the following problems:
1) Algorithms based on the general framework involve complicated steps, each with its own difficulty: staff detection must balance the noise resistance and the deformation resistance of the algorithm; staff removal makes punctuation notes harder to recognize; note recognition and classification must choose different recognition methods for notes with different characteristics, so a general algorithm is hard to select and classification quality differs markedly between notes. These problems leave the overall recognition accuracy of the OMR task insufficient.
2) End-to-end trained deep neural network algorithms simplify the general framework: the key steps of the OMR task are no longer analyzed and studied separately, which reduces the chance of introducing errors in a multi-step pipeline. However, the OMR task is sensitive to detail, and for difficult notes in particular the insufficient feature extraction capability of the model severely limits improvements in recognition accuracy.
3) Note sequences in the data sets are only combinations of simple notes; this lack of richness and diversity gives the model poor generalization and makes overfitting likely.
4) Network models that use a BiLSTM for feature recognition usually converge slowly during training and take a long time to train.
A note sequence is ordered and context-dependent: the note at the current moment is strongly correlated with the notes at the previous and next moments. A Recurrent Neural Network (RNN) can effectively recognize such sequential data and can therefore be used for note recognition. During training, long sequences are prone to the vanishing gradient problem, so most RNN structures control the information flow through gating mechanisms such as LSTM or GRU to alleviate vanishing or exploding gradients. However, in LSTM/GRU the forget gate, input gate and cell state require, in addition to the input at the current moment, the hidden output of the previous time step, which greatly limits parallel computation speed. Simple Recurrent Units (SRUs) effectively address this problem. The SRU is weakly recurrent and highly parallel: its gate computations depend only on the input at the current moment, which relaxes the dependency of the current step on the previous state, lets most of the computation run in parallel, and speeds up model convergence.
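The two error rates quoted above can be illustrated with a short sketch. It assumes the usual definitions: the symbol error rate is the total edit distance between predicted and reference symbol sequences divided by the total number of reference symbols, and the sequence error rate is the fraction of sequences containing at least one error. The function names and the token strings are illustrative placeholders, not taken from the cited work.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference symbol
                            curr[j - 1] + 1,      # insert a predicted symbol
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Return (sequence error rate, symbol error rate) over a set of samples."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_symbols = sum(len(r) for r in references)
    wrong = sum(1 for r, h in zip(references, hypotheses) if list(r) != list(h))
    return wrong / len(references), total_edits / total_symbols


# Hypothetical semantic tokens for two one-staff samples.
refs = [["clef-G2", "note-C4_quarter", "barline"], ["clef-G2", "rest-half"]]
hyps = [["clef-G2", "note-D4_quarter", "barline"], ["clef-G2", "rest-half"]]
seq_err, sym_err = error_rates(refs, hyps)
print(seq_err, sym_err)
```

In this toy case one of two sequences contains a single substituted symbol out of five reference symbols, which shows why the sequence error rate is typically much higher than the symbol error rate.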
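The parallelism argument can be made concrete with a minimal NumPy sketch of one SRU layer. The patent text does not give equations, so the sketch assumes a commonly cited formulation of the SRU in which the forget gate and reset gate are functions of the current input only, and the input width equals the hidden width so that the highway term is well defined. All names (`sru_forward`, `W`, `W_f`, `W_r`) and sizes are illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sru_forward(x, W, W_f, b_f, W_r, b_r):
    """Run one SRU layer over x of shape (T, d); returns (h, c), each (T, d)."""
    # Every matrix product depends only on the inputs, so all time steps
    # are computed in one batched operation (the parallel part).
    x_tilde = x @ W
    f = sigmoid(x @ W_f + b_f)   # forget gate, from the current input only
    r = sigmoid(x @ W_r + b_r)   # reset gate, from the current input only

    T, d = x.shape
    c = np.zeros((T, d))
    h = np.zeros((T, d))
    c_prev = np.zeros(d)
    # Only this cheap element-wise recurrence is sequential.
    for t in range(T):
        c_prev = f[t] * c_prev + (1.0 - f[t]) * x_tilde[t]
        c[t] = c_prev
        h[t] = r[t] * np.tanh(c_prev) + (1.0 - r[t]) * x[t]  # highway connection
    return h, c


rng = np.random.default_rng(0)
T, d = 6, 4                       # toy sequence length and width
x = rng.standard_normal((T, d))
W = rng.standard_normal((d, d)) * 0.1
W_f = rng.standard_normal((d, d)) * 0.1
W_r = rng.standard_normal((d, d)) * 0.1
b_f = np.zeros(d)
b_r = np.zeros(d)
h, c = sru_forward(x, W, W_f, b_f, W_r, b_r)
print(h.shape, c.shape)
```

In an LSTM or GRU the gate pre-activations would also need the previous hidden state, so the matrix products themselves would have to sit inside the time loop.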
References:
[1] Bainbridge D, Bell T. The challenge of optical music recognition[J]. Computers and the Humanities, 2001, 35(2): 95–121.
[2] Sober-Mira J, Calvo-Zaragoza J, Rizo D, et al. Pen-Based Music Document Transcription with Convolutional Neural Networks. In: Fornés A., Lamiroy B. (eds) Graphics Recognition. Current Trends and Evolutions[C]. GREC, 2017, 71-80.
[3] Shi B, Bai X, Yao C. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 39(11): 2298-2304.
[4] Calvo-Zaragoza J, Valero-Mas J J, Pertusa A. End-to-end Optical Music Recognition using Neural Networks[C]. 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017, 472-477.
Disclosure of Invention
To address the problem of note recognition in optical music score images, the invention provides a note recognition method based on a deep neural network. The method can recognize difficult notes in score images of varying quality with high accuracy while also speeding up model training. The technical scheme is as follows:
A music score image recognition method combining a multi-scale residual CNN and an SRU comprises the following steps:
First, establishing a data set of music score images: score samples are selected and image augmentation is applied so that the data set also contains score images under non-ideal conditions, thereby expanding the data set;
Second, constructing a model: the multi-scale residual CNN is combined with the SRU;
(1) Multi-scale residual CNN: the multi-scale residual CNN consists of five convolutional residual blocks and performs multi-scale feature fusion. The input image data pass through the five residual blocks in sequence to produce feature maps C1, C2, C3, C4 and C5. All convolution kernels are 3 × 3, and the number of kernels increases layer by layer as 32, 64, 128, 256 and 256. The last feature map C5 is upsampled by a factor of 2 and fused with the result of a 1 × 1 convolution on feature map C4 to obtain feature F5; F5 and C3 are then processed in the same way as C5 and C4 to obtain feature F4.
(2) SRU part: this part consists of two layers of bidirectional SRUs. Because the height of the music score image and the number of convolution kernels are fixed, the recurrence length of each layer stays constant, and forward learning and backpropagation of the weights in each SRU are carried out through 512 hidden units;
Third, training the model: the model is trained on the augmented data set. The model input is a music score image from the data set, and the ground-truth label is the semantic label corresponding to that image. The network parameters are adjusted step by step through a connectionist temporal classification (CTC) loss function until they reach their optimum, and the predicted semantic information of the notes is finally output.
The invention provides a network combining a residual CNN and an SRU that targets difficult notes in optical music score images. The feature extraction part uses a residual CNN structure with added multi-scale feature fusion, which gathers multi-level features into a single feature map and improves the accuracy of subsequent recognition. The feature recognition part uses an SRU, which makes model training faster.
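The fusion step described above (2× upsampling of the deeper map, 1 × 1 convolution on the shallower map, then fusion) can be sketched in NumPy as follows. Several details are assumptions, since the text does not specify them: nearest-neighbour upsampling, element-wise addition as the fusion operator, a halving of spatial resolution between consecutive blocks, and the toy spatial sizes. Only the channel counts (128, 256, 256) come from the text.

```python
import numpy as np


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return fmap.repeat(2, axis=0).repeat(2, axis=1)


def conv1x1(fmap, weight):
    """1x1 convolution: a per-pixel linear map from C_in to C_out channels."""
    return fmap @ weight  # (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)


def fuse(deep, shallow, weight):
    """Fuse a deeper map with the next shallower one (addition is assumed)."""
    return upsample2x(deep) + conv1x1(shallow, weight)


rng = np.random.default_rng(0)
# Hypothetical spatial sizes; each block is assumed to halve the resolution.
c3 = rng.standard_normal((16, 64, 128))
c4 = rng.standard_normal((8, 32, 256))
c5 = rng.standard_normal((4, 16, 256))
w4 = rng.standard_normal((256, 256)) * 0.01  # 1x1 kernel applied to C4
w3 = rng.standard_normal((128, 256)) * 0.01  # 1x1 kernel applied to C3

f5 = fuse(c5, c4, w4)  # C5 upsampled and fused with C4
f4 = fuse(f5, c3, w3)  # F5 upsampled and fused with C3
print(f5.shape, f4.shape)
```

The 1 × 1 convolution is what lets maps with different channel counts (here 128 and 256) be brought to a common width before they are combined.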
Drawings
Fig. 1 is the algorithm structure diagram.
Fig. 2 shows the multi-scale fusion effect: (a) is the original image, (b) is the feature map extracted by shallow convolution, (c) is the feature map extracted by deeper convolution, (d) is the feature map extracted by deep convolution, and (e) is the feature map extracted after multi-scale fusion.
Table 1: Specific network parameters
Table 2: Accuracy comparison of different networks
Detailed Description
To make the technical scheme of the invention clearer, the invention is further explained below with reference to the drawings. The invention is implemented through the following steps:
The experimental environment of the invention is as follows: Ubuntu 16.04 operating system, Intel Core i7-8700 CPU, 16 GB of RAM, Nvidia GTX 1080 Ti GPU, and the TensorFlow deep learning framework. The network is optimized with Adam, with the learning rate set to 1e-3 and batch_size set to 16. BN (batch normalization) layers are added to accelerate convergence. The loss is printed and the accuracy validated every 1000 training iterations, for a total of 64000 iterations.
First, a music score image data set is established.
The data used in the invention come from the publicly available PrIMuS dataset (Printed Images of Music Staves), which collects 87687 real scores, each containing a single staff of about 4-7 bars. Image augmentation methods such as Perlin noise, Gaussian white noise and elastic deformation are applied to a randomly selected part of the data set to simulate music score images under various non-ideal conditions.
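As a rough illustration of this augmentation step, the sketch below applies Gaussian white noise to a randomly chosen subset of images. It covers only one of the three distortions named above (Perlin noise and elastic deformation are omitted), and the noise level `sigma`, the selected `fraction`, the image size and the function names are all hypothetical choices, not values from the patent.

```python
import numpy as np


def add_gaussian_noise(image, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to a float image in [0, 1] and clip."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def augment_subset(images, fraction=0.5, sigma=0.05, seed=0):
    """Distort a random subset of the images; return the new list and the picked indices."""
    rng = np.random.default_rng(seed)
    n_pick = int(round(fraction * len(images)))
    picked = set(rng.choice(len(images), size=n_pick, replace=False).tolist())
    out = [add_gaussian_noise(img, sigma, rng) if i in picked else img
           for i, img in enumerate(images)]
    return out, picked


# Four placeholder grey images standing in for score images.
images = [np.full((60, 240), 0.5) for _ in range(4)]
augmented, picked = augment_subset(images)
print(sorted(picked))
```

Fixing the seed keeps the selected subset reproducible, which makes it easier to compare training runs on the same augmented data.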
Second, the model is constructed.
The whole algorithm combines a multi-scale residual CNN with an SRU.
(1) Multi-scale residual CNN: the multi-scale residual CNN consists of five convolutional residual blocks and performs multi-scale feature fusion, as shown in Fig. 1. The input image data pass through the five residual blocks in sequence to produce feature maps C1, C2, C3, C4 and C5. All convolution kernels are 3 × 3, and the number of kernels increases layer by layer as 32, 64, 128, 256 and 256. The last feature map C5 is upsampled by a factor of 2 and fused with the result of a 1 × 1 convolution on feature map C4 to obtain feature F5; F5 and C3 are then processed in the same way as C5 and C4 to obtain feature F4.
(2) SRU network: building on the multi-scale residual CNN from the previous step, the CNN is combined with the SRU to form the multi-scale residual CNN and SRU network. The SRU part consists of two layers of bidirectional SRUs. Because the height of the music score image and the number of convolution kernels are fixed, the recurrence length of each layer stays constant. Forward learning and backpropagation of the weights in each SRU are carried out through 512 hidden units. The specific network parameters are shown in Table 1.
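How a convolutional feature map is handed to a bidirectional recurrent layer can be sketched as follows. The sketch assumes the CRNN-style convention in which each image column becomes one time step whose feature vector stacks all rows and channels, so the per-step width is fixed by the map height and the kernel count. The recurrence used here is a deliberately simple stand-in (an exponential moving average), not an SRU, and all names and sizes are illustrative.

```python
import numpy as np


def map_to_sequence(fmap):
    """(H, W, C) feature map -> (W, H*C) sequence, one frame per image column."""
    h, w, c = fmap.shape
    return fmap.transpose(1, 0, 2).reshape(w, h * c)


def running_average(seq, decay=0.5):
    """Stand-in recurrence (NOT an SRU): exponential moving average over time."""
    out = np.zeros_like(seq)
    state = np.zeros(seq.shape[1])
    for t in range(seq.shape[0]):
        state = decay * state + (1.0 - decay) * seq[t]
        out[t] = state
    return out


def bidirectional(recurrence, seq):
    """Run a recurrence left-to-right and right-to-left, then concatenate."""
    forward = recurrence(seq)
    backward = recurrence(seq[::-1])[::-1]  # re-align to the original time order
    return np.concatenate([forward, backward], axis=1)


fmap = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)  # toy H=2, W=3, C=4
seq = map_to_sequence(fmap)                 # 3 time steps, 8 features each
out = bidirectional(running_average, seq)   # 3 time steps, 16 features each
print(seq.shape, out.shape)
```

Each output frame then carries context from both the notes before it and the notes after it, which is the property the bidirectional layers are meant to provide.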
Third, the model is trained. The constructed model is trained on the data set to obtain the optimal model, which is then saved. The input of the deep learning network is a music score image from the data set, and the ground-truth label is the semantic information of the notes in that image. The network parameters are adjusted step by step through a connectionist temporal classification (CTC) loss function until they reach their optimum, and the predicted semantic information of the notes is finally output.
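A network trained with a CTC loss emits one score vector per time step, including a special blank class, and the final symbol sequence has to be read off from those frames. The patent does not say which decoding rule is used; the sketch below assumes the simplest one, greedy (best-path) decoding, which takes the best class per frame, collapses consecutive repeats and drops blanks. The blank index 0 and the toy scores are assumptions.

```python
import numpy as np


def ctc_greedy_decode(frame_scores, blank=0):
    """Best-path CTC decoding of a (T, num_classes) score matrix."""
    best = np.argmax(frame_scores, axis=1)
    decoded = []
    previous = blank
    for label in best:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(int(label))
        previous = label
    return decoded


# Toy per-frame winners: 1 1 blank 1 2 2 blank
path = [1, 1, 0, 1, 2, 2, 0]
scores = np.eye(3)[path]          # one-hot scores, shape (7, 3)
print(ctc_greedy_decode(scores))  # [1, 1, 2]
```

The blank between the two runs of class 1 is what allows the same symbol to appear twice in a row in the output, for example two identical consecutive notes.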
Table 1: Specific network parameters
Table 2: Accuracy comparison of different networks
The multi-scale residual CNN and SRU method recognizes music score images as follows. First, the music score image data set is expanded through deformation, added noise and similar methods to improve the generalization capability of the model. Second, the feature extraction part uses a residual CNN structure and fuses the features extracted by each residual layer at multiple scales, gathering multi-level features into a single feature map to strengthen the feature representation of the model and thereby improve the accuracy of the subsequent RNN recognition. Finally, the RNN feature recognition part uses an SRU model, which speeds up model convergence and shortens training time.
Claims (1)
1. A music score image recognition method combining a multi-scale residual CNN and an SRU, comprising the following steps:
First, establishing a data set of music score images: score samples are selected and image augmentation is applied so that the data set also contains score images under non-ideal conditions, thereby expanding the data set;
Second, constructing a model: the multi-scale residual CNN is combined with the SRU;
(1) Multi-scale residual CNN: the multi-scale residual CNN consists of five convolutional residual blocks and performs multi-scale feature fusion. The input image data pass through the five residual blocks in sequence to produce feature maps C1, C2, C3, C4 and C5. All convolution kernels are 3 × 3, and the number of kernels increases layer by layer as 32, 64, 128, 256 and 256. The last feature map C5 is upsampled by a factor of 2 and fused with the result of a 1 × 1 convolution on feature map C4 to obtain feature F5; F5 and C3 are then processed in the same way as C5 and C4 to obtain feature F4.
(2) SRU part: this part consists of two layers of bidirectional SRUs. Because the height of the music score image and the number of convolution kernels are fixed, the recurrence length of each layer stays constant, and forward learning and backpropagation of the weights in each SRU are carried out through 512 hidden units;
Third, training the model: the model is trained on the augmented data set. The model input is a music score image from the data set, and the ground-truth label is the semantic label corresponding to that image. The network parameters are adjusted step by step through a connectionist temporal classification (CTC) loss function until they reach their optimum, and the predicted semantic information of the notes is finally output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910787184.3A CN110580458A (en) | 2019-08-25 | 2019-08-25 | music score image recognition method combining multi-scale residual error type CNN and SRU |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910787184.3A CN110580458A (en) | 2019-08-25 | 2019-08-25 | music score image recognition method combining multi-scale residual error type CNN and SRU |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110580458A true CN110580458A (en) | 2019-12-17 |
Family
ID=68811891
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910787184.3A Pending CN110580458A (en) | 2019-08-25 | 2019-08-25 | music score image recognition method combining multi-scale residual error type CNN and SRU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110580458A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112633175A (en) * | 2020-12-24 | 2021-04-09 | 哈尔滨理工大学 | Single note real-time recognition algorithm based on multi-scale convolution neural network under complex environment |
CN112686104A (en) * | 2020-12-19 | 2021-04-20 | 北京工业大学 | Deep learning-based multi-vocal music score identification method |
CN112836056A (en) * | 2021-03-12 | 2021-05-25 | 南宁师范大学 | Text classification method based on network feature fusion |
CN113239151A (en) * | 2021-05-18 | 2021-08-10 | 中国科学院自动化研究所 | Method, system and equipment for enhancing spoken language understanding data based on BART model |
CN113239809A (en) * | 2021-05-14 | 2021-08-10 | 西北工业大学 | Underwater sound target identification method based on multi-scale sparse SRU classification model |
CN114092946A (en) * | 2021-11-22 | 2022-02-25 | 重庆理工大学 | Music score recognition method |
CN114202763A (en) * | 2021-12-02 | 2022-03-18 | 厦门大学 | Music numbered musical notation semantic translation method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018194456A1 (en) * | 2017-04-20 | 2018-10-25 | Universiteit Van Amsterdam | Optical music recognition omr : converting sheet music to a digital format |
CN109376720A (en) * | 2018-12-19 | 2019-02-22 | 杭州电子科技大学 | Classification of motion method based on artis space-time simple cycle network and attention mechanism |
US20190122101A1 (en) * | 2017-10-20 | 2019-04-25 | Asapp, Inc. | Fast neural network implementations by increasing parallelism of cell computations |
CN109711409A (en) * | 2018-11-15 | 2019-05-03 | 天津大学 | A kind of hand-written music score spectral line delet method of combination U-net and ResNet |
Non-Patent Citations (2)
Title |
---|
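吴天龙;李锵;关欣;: "Lightweight staff-line removal method based on multi-dimensional local binary patterns and XGBoost" * |
张文达;许悦雷;倪嘉成;马时平;史鹤欢: "Image target recognition algorithm based on a multi-scale block convolutional neural network" * |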
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686104A (en) * | 2020-12-19 | 2021-04-20 | 北京工业大学 | Deep learning-based multi-vocal music score identification method |
CN112686104B (en) * | 2020-12-19 | 2024-05-28 | 北京工业大学 | Multi-sound part music score recognition method based on deep learning |
CN112633175A (en) * | 2020-12-24 | 2021-04-09 | 哈尔滨理工大学 | Single note real-time recognition algorithm based on multi-scale convolution neural network under complex environment |
CN112836056A (en) * | 2021-03-12 | 2021-05-25 | 南宁师范大学 | Text classification method based on network feature fusion |
CN112836056B (en) * | 2021-03-12 | 2023-04-18 | 南宁师范大学 | Text classification method based on network feature fusion |
CN113239809A (en) * | 2021-05-14 | 2021-08-10 | 西北工业大学 | Underwater sound target identification method based on multi-scale sparse SRU classification model |
CN113239809B (en) * | 2021-05-14 | 2023-09-15 | 西北工业大学 | Underwater sound target identification method based on multi-scale sparse SRU classification model |
CN113239151A (en) * | 2021-05-18 | 2021-08-10 | 中国科学院自动化研究所 | Method, system and equipment for enhancing spoken language understanding data based on BART model |
CN114092946A (en) * | 2021-11-22 | 2022-02-25 | 重庆理工大学 | Music score recognition method |
CN114092946B (en) * | 2021-11-22 | 2024-08-20 | 重庆理工大学 | Music score identification method |
CN114202763A (en) * | 2021-12-02 | 2022-03-18 | 厦门大学 | Music numbered musical notation semantic translation method and system |
CN114202763B (en) * | 2021-12-02 | 2024-09-13 | 厦门大学 | Music numbered musical notation semantic translation method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110580458A (en) | music score image recognition method combining multi-scale residual error type CNN and SRU | |
CN107437096B (en) | Image classification method based on parameter efficient depth residual error network model | |
CN110298037B (en) | Convolutional neural network matching text recognition method based on enhanced attention mechanism | |
CN107680580B (en) | Text conversion model training method and device, and text conversion method and device | |
CN111753081B (en) | System and method for text classification based on deep SKIP-GRAM network | |
CN111738169B (en) | Handwriting formula recognition method based on end-to-end network model | |
Calvo-Zaragoza et al. | End-to-end optical music recognition using neural networks | |
CN110443127A (en) | In conjunction with the musical score image recognition methods of residual error convolutional coding structure and Recognition with Recurrent Neural Network | |
CN111858878B (en) | Method, system and storage medium for automatically extracting answer from natural language text | |
CN113159023A (en) | Scene text recognition method based on explicit supervision mechanism | |
CN115587594B (en) | Unstructured text data extraction model training method and system for network security | |
CN111967267B (en) | XLNET-based news text region extraction method and system | |
CN111666376B (en) | Answer generation method and device based on paragraph boundary scan prediction and word shift distance cluster matching | |
CN110852375A (en) | End-to-end music score note identification method based on deep learning | |
CN113988079A (en) | Low-data-oriented dynamic enhanced multi-hop text reading recognition processing method | |
Ríos-Vila et al. | On the use of transformers for end-to-end optical music recognition | |
Helmy et al. | Applying deep learning for Arabic keyphrase extraction | |
Szűcs et al. | Seq2seq deep learning method for summary generation by lstm with two-way encoder and beam search decoder | |
Azawi | Handwritten digits recognition using transfer learning | |
CN111858879B (en) | Question and answer method and system based on machine reading understanding, storage medium and computer equipment | |
CN117437499A (en) | Transfer learning method for extracting constant domain features and optimizing text of CLIP | |
Găman et al. | Self-paced learning to improve text row detection in historical documents with missing labels | |
Soujanya et al. | A CNN based approach for handwritten character identification of Telugu guninthalu using various optimizers | |
Tong et al. | Out-of-Distribution with Text-to-Image Diffusion Models | |
CN113326833A (en) | Character recognition improved training method based on center loss |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | Effective date of abandoning: 20231208 |