
Article Open Access

Classical music recommendation algorithm on art market audience expansion under deep learning

Chunhai Li and Xiaohui Zuo
Published/Copyright: May 30, 2024

Abstract

The purpose of the study is to help users know about their favorite music and expand art market audiences. First, the personalized recommendation data of classical music are obtained based on the deep learning recommendation algorithm technology, artificial intelligence, and music playback software of users. Second, a systematic experiment is conducted on the improved recommendation algorithm, and a classical music dataset is established and used for model training and user testing. Then, the network model of the classical music recommendation algorithm is constructed through the typical convolutional neural network model, and the optimal parameters suitable for the model are found. The experimental results show that the optimal value of the dimension in the hidden layer is 192, and 24,000 training rounds can converge to the global optimum when the learning rate is 0.001. The personalized recommendation is provided for target users by calculating the similarity between user preference and potential features of classical music, relieving the auditory fatigue of art market audiences, improving user experience, and expanding the art market audience through the classical music recommendation system.

1 Introduction

Nowadays, the rapid development of the Internet offers people a vast amount of information to choose from. However, it also causes information overload, which puts great competitive pressure on information producers [1]. For information consumers, it is difficult to pick out valuable and suitable items from such a large amount of information. The time spent selecting information increases greatly, which reduces working efficiency and leaves people drowning in excessive and irrelevant information. For information producers, it is increasingly hard to make their content competitive, to stand out from the mass of information, and to win public attention [2]. As information overload grows more severe, two complementary tools have emerged: search engines and recommendation systems. Because a search engine is passive and waits for the user's query, it is also necessary to know the personalized needs and characteristics of each user. The recommendation system automatically generates suitable keywords, submits them to the search engine, and shows the user the relevant results that are returned [3].

With the continuous development and improvement of Internet technology, more and more classical music-related industries have begun to turn to online music platforms to publicize their music [4]. Platforms such as Netease Cloud, Tencent, and KuGou provide channels for increasing the size of the audience of classical music. The online listening, downloading, purchasing, sharing, and other functions of these platforms greatly increase the speed and scope of the dissemination of classical music, which helps give more users access to classical music resources. However, as music libraries keep growing, users have to spend more time and energy finding their favorite music [5]. The traditional search method usually retrieves the names of classical songs, singers, albums, and so on, which ignores the user’s personalized differences and causes the “long tail” phenomenon of classical music. To solve this problem, recommendation systems of classical music should be applied, because they can collect users’ personalized information and preferences, analyze the data of classical music, and recommend the classical music that conforms to users’ preferences [6].

Based on a convolutional neural network (CNN) and deep learning (DL), users’ personalized preferences and the characteristics of classical music are collected and analyzed, and a personalized recommendation system of classical music is built for each user. The basic principle of the recommendation system of classical music is as follows: (1) the audio resources of classical music in the system are processed, and the spectrum and note characteristics of the music are extracted; (2) the music is divided into several segments, and the probability distribution of each segment over the preset categories is used as a new classification basis; and (3) the traditional classification of classical music and user preference is combined with the user’s listening, collection, praise, sharing, and purchase records to design a recommendation system that automatically identifies user preferences and characteristics. Data analysis on the application of this recommendation system shows that it plays a great role in increasing the size of the audience in the art market. Since audiences are one of the main driving forces for the development and progress of the art industry, the results have great significance and open up a bright prospect for the dissemination of classical music.

This article is divided into five sections. Section 1 presents the research background and the main content of the study; Section 2 introduces the research methods needed in this study; Section 3 describes the research model and the recommendation algorithm; Section 4 presents the experiments, including the data and the comparative analysis of the classical music recommendation algorithm; and Section 5 is the conclusion, which summarizes the main research work of this study and indicates directions for future research.

2 Method

2.1 Recommendation system of classical music

In the literature on music recommendation algorithms, Shi proposed a music recommendation method that combines users’ long-term, medium-term, and real-time behavior, dynamically adjusts the influence weights of the three behaviors, and uses long short-term memory (LSTM) technology to improve the effectiveness of music recommendation [7]. Dharsini et al. built an efficient music recommendation system that uses facial recognition technology to identify users’ emotions [8]. Jin and Han proposed a music recommendation algorithm combining clustering and latent factor models. First, users’ music playback records are processed to form the user music matrix. Then, a probabilistic latent factor model is applied to this matrix to obtain the user preference matrix U and the music feature matrix V. On this basis, two clustering algorithms are used to cluster the users and the music in the two matrices. Finally, a user-based collaborative filtering algorithm makes predictions from the clustered user preference matrix and item feature matrix [9].

The recommendation system of classical music mainly includes three parts: people, model, and results. Based on actual needs, the recommendation model of classical music is divided into two categories, namely the user preference model and the music resource model. The user preference model uses a context-based recommendation algorithm, which requires music labels, user data, and result feedback to establish the model. The music resource model uses a content-based recommendation algorithm, which establishes the model from the characteristics of classical music, such as category, emotion, and melody. After the two models are established, collaborative filtering is applied to realize the personalized recommendation [10].

Deep learning (DL), an increasingly popular research method, is applied to the recommendation system of classical music to improve and optimize the algorithms. DL was recognized by the academic community after Hinton’s work was published in the top academic journal Science in 2006, a pioneering result that established Hinton’s authority in the field of DL. That study draws two conclusions: first, a multi-layer neural network learns and represents data features better than a single-layer network. Second, layer-by-layer training can greatly reduce the difficulty of training a neural network, so that gradient descent becomes more stable [11]. Furthermore, in terms of the application of DL, Google published its research on DL-based recommendation systems for Google Play and for YouTube videos in 2016 [12].

2.2 Increasing the size of the audience

Since the concept of “increasing the size of audience” appeared in the 1980s, it has been widely used in music, drama, dance, film, television, and visual art. Although it has no unified definition, the connotation of increasing the size of the audience is understood from the definition and description by many scholars based on different art categories and their own experiences. Christian Watt, an English scholar, argued that increasing the size of the audience refers to a powerful process to improve the service for existing audiences and attract new audiences. It is not a simple behavior process but a planned and targeted management process involving all aspects of the operation of museums to achieve their overall purpose to a high standard [8]. According to Ebrahimi et al., increasing the size of the audience is defined as enriching the audience’s experience, helping them learn more, and deepening their enjoyment of art museum services [13]. Yan thought that the goal of broadening audiences is to establish a relationship between man and art [14]. Du summed up the increasing size of the audience as “an art to win new audiences and retain old audiences” [15]. Liao deeply analyzed the news selection, decoding, and coding abilities that news audiences should have based on news consumption features and the corresponding problems [16].

Here, the object of the research on broadening the audience is the art market, which is wider, compared with art museums, theatres, and stages. The concept of audience is deeper and broader. Besides, the “audience” in this study refers to art viewers and appreciators in the narrow sense, as well as art followers, learners, lovers, consumers, and the target audience of the whole art industry.

2.3 Recommendation algorithm based on deep learning

2.3.1 Deep neural network model

DL emerged from the upsurge of machine learning. It opens up a more convenient and broader perspective for research and has become one of the most popular research methods. It differs from other machine learning methods in that it learns multiple levels of representation arranged in successive layers [17]. Each level is composed of simple, nonlinear modules, and each module transforms the representation produced by the previous module into a higher level of abstraction. By composing enough of these transformations, the learning system can perform complex tasks, such as simulating how the neural network of the human brain perceives external stimuli. In classification, higher levels of representation amplify the aspects of the input that matter for identifying important features and suppress irrelevant variations [18].

Research on deep neural network models includes the following. Mun et al. developed a deep neural network model that can estimate and quantify gait spatio-temporal parameters from foot features [19]. Ren et al. observed that the training and compression of DL networks involve too many human experience factors, and proposed using GA to iteratively and automatically generate the most suitable network model from existing datasets, removing redundant nodes and connections of the original network so that the optimized model is more streamlined [20]. Lee and Lee developed a process-centered assessment method using the concept of the deep neural network and a series of facial images [21].

The principle of the deep neural network is to establish a suitable learning network so that the many hidden layers in the network can actively learn various features. The hidden layers are progressive: each layer is an abstract description of the previous layer. As a result, the deep neural network analyzes data samples in greater depth and more accurately, and its classification is more reasonable [22]. The deep neural network mainly includes three levels. The first is the input layer, which feeds the original data samples into the neural network for training. The second is the hidden layer, which acquires and analyzes the characteristics of the original data samples and then fully compresses and abstracts them. The third is the output layer, which classifies the analyzed characteristics and calculates the probability of each expected class [23]. During training, the outputs are compared with the expected results, and the differences are propagated back through the network.

2.3.2 Deep neural network model

In common DL systems, the weights of each neuron represent the features it has learned in the network [24]. A neural unit is usually modeled as a linear function of its inputs. To fit external stimuli better, a nonlinear activation function can be added to this linear model. The Sigmoid function in equation (1) is such an activation function; it compresses a continuous input into the interval (0, 1):

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \tag{1}$$

where z represents the linear model of neural units:

$$z = \sum_{j} w_j x_j + b. \tag{2}$$
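Equations (1) and (2) can be illustrated with a minimal Python sketch of a single neural unit. The function names `sigmoid` and `neuron_output` and the sample weights are hypothetical and are not taken from the article; the two-branch form of `sigmoid` is only a numerical safeguard against overflow for large negative inputs and computes the same value as equation (1).

```python
import math


def sigmoid(z):
    """Equation (1): map a real number into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Algebraically identical form that avoids overflow of exp(-z) for very negative z.
    e = math.exp(z)
    return e / (1.0 + e)


def neuron_output(weights, inputs, bias):
    """Equation (2) followed by equation (1): sigma(sum_j w_j * x_j + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Illustrative values only (assumed, not from the article).
print(sigmoid(0.0))                                  # 0.5
print(neuron_output([0.5, -0.25], [2.0, 4.0], 0.0))  # z = 1.0 - 1.0 = 0, so 0.5
```

Because the output always lies between 0 and 1, it can be read as a soft on/off activation of the unit.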

CNN is a typical deep neural network. It combines the advantages of image processing and DL, improves the accuracy of feature recognition, and greatly reduces the computation of neural networks. It can be applied to image recognition and recommendation systems.

2.3.3 Feature extraction method of the notes of classical music

A classical music signal is composed of a fundamental tone and overtones, and the fundamental frequency determines its pitch. Therefore, the detection of the pitch period is the key to the recognition of the notes of classical music [25]. The pitch detection method based on the autocorrelation function is a simple and classical time-domain detection algorithm. The definition of the short-term autocorrelation function is shown in equation (3)

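$$R_i(k) = \sum_{m=1}^{n-k} y_i(m)\, y_i(m+k), \tag{3}$$

where $R_i(k)$ is the short-term autocorrelation of the ith frame at lag k, $y_i(m)$ is the mth sample of the ith frame of the signal, and n is the frame length.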

The frameshift method is used to select a reasonable pitch period, reduce the interference of frequency doubling waves, improve the accuracy of note recognition, and add the note characteristics to the spectrum sample to obtain the note spectrum [26].
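The autocorrelation-based detection in equation (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 8,000 Hz sampling rate, the synthetic 200 Hz sine frame, and the lag search range are all hypothetical, and the sum is written with 0-based indices.

```python
import math


def short_term_autocorrelation(frame, k):
    """Equation (3) for one frame: sum of y(m) * y(m + k) over all valid m (0-based)."""
    n = len(frame)
    return sum(frame[m] * frame[m + k] for m in range(n - k))


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda k: short_term_autocorrelation(frame, k))


# Synthetic test frame (assumed values): a 200 Hz sine sampled at 8,000 Hz.
sample_rate = 8000
frequency = 200.0
frame = [math.sin(2 * math.pi * frequency * t / sample_rate) for t in range(400)]

period = estimate_pitch_period(frame, min_lag=20, max_lag=100)
print(period, sample_rate / period)  # period in samples, estimated pitch in Hz
```

Restricting the lag range is one simple way to keep the search away from lag 0, where the autocorrelation is always largest. Because the unnormalized sum shrinks as the lag grows, the first period is preferred over its multiples in this example.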

2.4 Comparison of three recommendation algorithms

Knees compared content-based and context-based recommendation algorithms, as mentioned in the literature review. Here, the two algorithms are compared with a DL-based recommendation algorithm, as shown in Table 1.

Table 1

Comparison of the three recommendation algorithms

| Algorithm | Prerequisite | Metadata | Cold start problems | Preference | Characteristics |
| --- | --- | --- | --- | --- | --- |
| Content-based | Music files | Need | No | No | Objective, direct, and numerical |
| Context-based | Users | Do not need | Yes | Yes | Subjective, noisy, semantic |
| The algorithm based on DL | Music files | Need | No | No | Objective, fuzzy, and extensible |

DL-based data processing is the same as content-based data processing: both extract features from audio metadata. Although this avoids the cold start problem, it increases the data processing burden [27], because the data volume of a piece of music is far greater than the amount of information generated by users. For this reason, context-based recommendation algorithms such as collaborative filtering are popular in the industry: they are easy to handle and deploy, are independent of metadata, and can explore the preferences of users. However, the cold start is always a problem for the context-based recommendation algorithm, which is not applicable if little information is available [28].

In addition, context-based recommendations tend to reflect what similar users prefer rather than how the songs themselves are related. The context-based recommendation can only offer the user the preferences of other users, as some business platforms do, for example, “people who have listened to this song also like the following ones.” Such a recommendation is not an accurate assessment of the similarity between two songs [29].

Extracting music features from audio signals can essentially reflect the types of music, which is more in line with the intuitive feelings of human beings on music. Content-based and DL-based recommendation algorithms have this advantage. Also, DL-based recommendation algorithms can apply to a broader platform because of the strong scalability of the DL model [30].

Based on the above analysis, it is concluded that each algorithm has inevitable defects. Thus, it is suggested to combine the recommendation algorithms. However, the combined algorithm is very complex. After the three recommendation algorithms are re-examined, it is found that the content-based algorithm can also make use of collected user behavior data for the recommendation.

2.5 Artificial intelligence (AI)

As a technology, AI mainly studies the characteristics and laws of human intelligent activity. Based on these characteristics and laws, it builds artificial systems with a certain degree of intelligence and attempts to make computers complete tasks that require human intelligence. In short, AI studies the basic theories, methods, and technologies for using computer hardware and software, through intelligent algorithms, platforms, or machines, to simulate and extend human intelligent behavior.

Machine learning can mine the effective data and association in large amounts of data through the network model so that the program has the ability of self-learning and self-prediction. As an algorithm for machine learning, DL can continuously optimize the structure of its network model by simulating the neural network of the human brain and extracting more high-quality data and connections. Machine learning and DL greatly promote the realization and development of AI in many fields, such as data mining, natural language, and computer vision.

2.5.1 Machine learning

Machine learning is the core of the basic technical level of AI. It mainly studies how computers simulate human learning behavior to acquire new knowledge and skills. Machine learning is derived from the early research field of AI. Learning methods can be divided into supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Machine learning needs to continuously train the machine through a large amount of data. With the help of cloud computing technology, the machine processes the data to judge the results and help make decisions. Through data description, the machine can describe, identify, classify, and explain things or phenomena, helping human beings complete tasks.

2.5.2 DL

DL is an algorithm that can effectively realize machine learning. This algorithm is mainly used to simulate the neural network of human brains for learning and establish a network imitating the human brain mechanism to analyze data. Machines’ analytical learning ability similar to human beings is finally achieved by learning the inherent laws and representation levels of sample data. The DL algorithm imitates the multi-layer neural network of the human brain, which can make the machine learn more complex feature rules from the low level to the high level, solving some complex and difficult problems more pertinently.

3 Research model and recommendation algorithm

3.1 Self-attention model

The self-attention model is a special attention model that is widely used because of its powerful feature extraction ability for sequence information. In the general attention model, the three elements Query, Key, and Value may come from different sources. In the self-attention model, Query, Key, and Value all come from the same input signal X. In essence, each signal queries its own importance within the group of signals, that is, its association with the other signals, and the signals are then accumulated with these weights to obtain the self-attention value of that signal. The effect of attention can be seen intuitively in machine translation: whenever a word is translated, more attention weight falls on the input words associated with that word. If the translation setting is changed so that the source language and the target language are the same, what self-attention learns in DL can be observed intuitively.

The bottom of the figure is the input sentence of self-attention, and the top is the output sentence obtained by the self-attention calculation. The input and output are the same, that is, the sentence is translated into itself. Each line segment in the figure indicates the weight assigned to a word of the input sentence when a word is translated. The deeper the color is, the stronger the relevance is.

As mentioned above, the self-attention model can effectively model the pre-correlation and post-correlation of internal signals in a sequence and even learn the semantic relationship within a sentence. Since the acquisition of these association features by self-attention is not affected by the distance, indiscriminate association modeling can be carried out no matter how far the two signals are located.
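The computation described in this section can be sketched in plain Python for the simplified case in which Query, Key, and Value are all the raw input X. This is an illustrative sketch, not the article's model: practical implementations usually apply learned projection matrices to X first, and the division by the square root of the feature dimension is a common convention that is assumed here. The function names and the toy input are hypothetical.

```python
import math


def softmax(row):
    """Turn a list of scores into weights that are positive and sum to 1."""
    peak = max(row)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Simplified self-attention with Query = Key = Value = X (no learned projections)."""
    d = len(X[0])
    weights = []
    for query in X:
        # Dot product of this signal with every signal in the sequence, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in X]
        weights.append(softmax(scores))
    # Each output is the weighted sum of all signals in the sequence.
    outputs = [[sum(w * value[j] for w, value in zip(row, X)) for j in range(d)]
               for row in weights]
    return outputs, weights


# Toy sequence of three 2-dimensional signals (assumed values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(X)
for row in weights:
    print([round(w, 3) for w in row])
```

In this toy input the first and third signals are identical, so each of them attends to the other as strongly as to itself even though they are two positions apart, which illustrates that the association does not depend on distance.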

3.2 Content-based recommendation algorithm

3.2.1 Content representation

A feature representation is calculated for each item. This step is the core of the whole recommendation algorithm and is where applications in different recommendation fields differ. An item is usually represented by some of its own attributes. For an article, for example, attributes such as the author and publishing time are easy to obtain, but what is truly representative is the specific content of the article itself. This content needs to be transformed into structured data. An article is composed of words, and the weight of each word can be used to form a vector. Article d can be expressed as follows:

$$d = \{w_1, w_2, \ldots, w_n\}, \tag{4}$$

where $w_i$ represents the weight of the ith word in this article, n is the size of the entire vocabulary, and the specific size of the weight can be obtained by term frequency-inverse document frequency. $w_i$ can be defined as follows:

$$w_i = \frac{n_i}{\sum_{j=1}^{n} n_j} \log \frac{m}{m_i + 1}, \tag{5}$$

where $n_i$ represents the number of times the ith word appears in article d, m represents the total number of articles, $m_i$ represents the number of articles in which the ith word appears, and 1 is added to the denominator to avoid division by zero. In other words, the more often a word appears in article d and the fewer articles it appears in, the greater its weight and the more representative it is of article d. The vector representation of article d is obtained in this way.
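Equation (5) can be computed with a few lines of Python. The sketch below is only an illustration: the function name `tfidf_vector`, the tiny corpus of tokenized "articles", and the use of the natural logarithm are assumptions, since the article does not state the base of the logarithm.

```python
import math
from collections import Counter


def tfidf_vector(article, corpus, vocabulary):
    """Weights w_1..w_n of `article` following equation (5), with the natural log."""
    counts = Counter(article)
    total = sum(counts[word] for word in vocabulary)  # sum over j of n_j
    m = len(corpus)                                   # total number of articles
    vector = []
    for word in vocabulary:
        m_i = sum(1 for doc in corpus if word in doc)  # articles containing the word
        tf = counts[word] / total if total else 0.0
        vector.append(tf * math.log(m / (m_i + 1)))
    return vector


# Hypothetical corpus of four tokenized articles.
corpus = [
    ["bach", "fugue", "organ"],
    ["bach", "cantata"],
    ["chopin", "nocturne", "piano"],
    ["chopin", "piano", "waltz"],
]
vocabulary = sorted({word for doc in corpus for word in doc})
weights = dict(zip(vocabulary, tfidf_vector(corpus[0], corpus, vocabulary)))
print({word: round(w, 3) for word, w in weights.items() if w != 0.0})
```

In the first article, "fugue" appears in only one article of the corpus while "bach" appears in two, so "fugue" receives the larger weight. One consequence of the +1 in the denominator is that a word occurring in every article gets a slightly negative weight, because m / (m + 1) is below 1.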

3.2.2 Learning the features

The user’s preference for new items is determined from the items the user already likes. The feature vector of each item has been obtained in the first step, and it is known which items the user likes and which items the user does not like. A supervised binary classification task can therefore be constructed to train a user model, which then determines whether the user is likely to prefer a new item. Here, the commonly used classification algorithms are decision trees, naive Bayes, and neural networks.

3.2.3 Generation of the recommendation list

The top K recommended items are obtained from the item features and the user model produced in the first two steps. If a user classification model was trained in the second step, the K items that the model rates as most likely to be liked by the user are returned. Alternatively, a user vector is obtained by averaging the feature vectors of all the user’s favorite items, and the K items most similar to the user vector are returned.
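The second variant, averaging the feature vectors of the user's favorite items and returning the K most similar items, can be sketched as follows. The use of cosine similarity, the exclusion of already liked items from the candidates, the function names, and the item vectors are all assumptions made for illustration; the article does not specify them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def user_vector(liked_vectors):
    """Average the feature vectors of the user's favorite items."""
    count = len(liked_vectors)
    return [sum(column) / count for column in zip(*liked_vectors)]


def recommend_top_k(item_vectors, liked_ids, k):
    """Return the k items not yet liked that are most similar to the user vector."""
    profile = user_vector([item_vectors[item] for item in liked_ids])
    candidates = [item for item in item_vectors if item not in liked_ids]
    ranked = sorted(candidates,
                    key=lambda item: cosine(profile, item_vectors[item]),
                    reverse=True)
    return ranked[:k]


# Hypothetical 3-dimensional item features.
item_vectors = {
    "nocturne_a": [1.0, 0.0, 0.2],
    "nocturne_b": [0.9, 0.1, 0.1],
    "symphony": [0.0, 1.0, 0.3],
    "etude": [0.8, 0.2, 0.0],
    "march": [0.1, 0.9, 0.1],
}
print(recommend_top_k(item_vectors, {"nocturne_a", "nocturne_b"}, k=2))
# ['etude', 'march']
```

Because only item features and the user's own favorites are needed, a newly added item can be ranked as soon as its feature vector exists.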

Since the content-based recommendation algorithm mainly analyzes the content attributes of items themselves, it can well solve the cold start problem of items. When a new item is added, it can still be recommended well, even if no user likes it.

4 Results and discussion

4.1 Experimental environment

The experiment is conducted on the servers that have Windows 10 operating systems installed. The server configuration is shown in Table 2.

Table 2

Hardware environment and parameters

| Hardware environment | Processor | Mainboard | Memory | Graphics card | Basic frequency |
| --- | --- | --- | --- | --- | --- |
| Parameters | Intel i5-10400F | Asus B460 TUF Gaming | Weigang DDR4 3,000 MHz 16 GB | Seven Rainbow RTX3060 12 GB | 2.20 GHz |

Table 2 shows the hardware used in the study of the classical music recommendation algorithm. Python 3.6 is the programming language used to write the experimental code. The TensorFlow 1.12.0 DL framework is used to build the model quickly and improve the efficiency of code writing.

4.2 Dataset collection and data pre-processing

4.2.1 Data collection

In this study, the Last.fm Dataset-1k user music dataset (http://www.last.fm) is used to train and test the proposed model. This dataset records the user’s music listening behavior completely and is widely used in the research of music recommendation algorithms, which makes it convenient to compare with other algorithms. The dataset collects 579,195 records from 300 users in the tsv format, and each record is presented as a six-tuple <user, timestamp, artid, artname, traid, traname>. The specific scale of the dataset is shown in Figure 1.

Figure 1: Last.fm Dataset-1k users dataset.

Table 3 shows one record from the dataset, taking user_000001 as an example.

Table 3

Listening record of user_000001

| Attribute | User ID | Listening period | Singer ID | Artist | Song ID | Names of songs |
| --- | --- | --- | --- | --- | --- | --- |
| Example | User_000001 | 2020-10-31T15:41:13Z | 87c5dedd-371d-4a53-9f7f-80522fb73cb | Jay Chou | 268b6266-29ce-4822-9f58-70034a8edb4a | Balloons of Love |

Table 3 shows that user_000001 listened to the song “Balloons of Love” by Jay Chou on 31 October 2020 at 15:41:13 (3:41 p.m.). The record also includes the singer ID, which avoids conflicts between singers and songs.
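A record in this tab-separated format can be read into the six fields with a short Python sketch. The class name `ListeningRecord`, the function `parse_record`, and the assumption that each line holds exactly six tab-separated fields with a timestamp of the form shown in Table 3 are illustrative choices, not part of the dataset documentation. The sample line reuses the values of Table 3 as printed.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ListeningRecord(NamedTuple):
    """One six-tuple <user, timestamp, artid, artname, traid, traname>."""
    user: str
    timestamp: datetime
    artid: str
    artname: str
    traid: str
    traname: str


def parse_record(line):
    """Split one tab-separated line into a ListeningRecord (assumes six fields)."""
    user, stamp, artid, artname, traid, traname = line.rstrip("\n").split("\t")
    # The trailing "Z" marks UTC, so the parsed time is tagged with the UTC time zone.
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return ListeningRecord(user, when, artid, artname, traid, traname)


line = ("user_000001\t2020-10-31T15:41:13Z\t87c5dedd-371d-4a53-9f7f-80522fb73cb\t"
        "Jay Chou\t268b6266-29ce-4822-9f58-70034a8edb4a\tBalloons of Love\n")
record = parse_record(line)
print(record.user, record.timestamp.isoformat(), record.traname)
```

Keeping the timestamp as a `datetime` object makes it straightforward to compute the time gap between consecutive plays in later processing.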

4.2.2 Data pre-processing

Since the user’s listening behavior has the characteristics of sessions, the user’s listening records are divided into session records according to certain rules. Generally speaking, if the user does not listen for 40 min, the user’s short-term preference may have changed. Therefore, if the playing times of two consecutive songs differ by more than 40 min, the earlier and later records are assigned to two different session records. On this basis, the data are further processed as shown in Table 4 to obtain a better recommendation effect.

Table 4

Data pre-processing process

| Operational procedure | Content |
| --- | --- |
| Remove replayed songs | Users often start single-song loop playback inadvertently, which does not reflect their real intention. Even when it is intentional, recommending the same song again is meaningless, because it reduces the diversity of recommendations and occupies valuable recommendation space. Therefore, repeated songs are merged into one. |
| Remove short and long session records | Session records that are too short contribute little to model training and require long runs of _PAD padding symbols, which reduces the efficiency of model training. Session records that are too long are more likely to be noisy and to interfere with user preference modeling, because users often forget to turn off the player and let many songs play at random, and they generally do not keep listening for such a long time. Therefore, only session records containing between 5 and 40 songs are retained. A model trained on such data can still make effective recommendations when it encounters very short or very long sessions. |

After the data are pre-processed according to the above rules, 208,627 session records are obtained. On average, each user has about 220.2 session records in the listening history, and each session record contains about 8.4 songs, which meets the training data requirements of the model.
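The pre-processing rules above can be sketched in Python for the listening history of a single user. The sketch rests on several assumptions that the article does not spell out: a new session starts when the gap between consecutive plays is strictly greater than 40 min, "repeated songs" means consecutive repeats within a session, and the bounds 5 and 40 count songs after repeats are merged. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=40)  # assumed boundary between two sessions
MIN_LEN, MAX_LEN = 5, 40             # assumed bounds on songs per session


def split_sessions(plays):
    """Split time-sorted (timestamp, track_id) pairs of one user into sessions."""
    sessions, current, previous = [], [], None
    for when, track in plays:
        if previous is not None and when - previous > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(track)
        previous = when
    if current:
        sessions.append(current)
    return sessions


def merge_repeats(session):
    """Collapse consecutive plays of the same track into a single entry."""
    merged = []
    for track in session:
        if not merged or merged[-1] != track:
            merged.append(track)
    return merged


def preprocess(plays):
    """Split into sessions, merge repeats, then drop too-short and too-long sessions."""
    cleaned = (merge_repeats(session) for session in split_sessions(plays))
    return [session for session in cleaned if MIN_LEN <= len(session) <= MAX_LEN]


# Hypothetical history: seven plays four minutes apart, then two plays two hours later.
start = datetime(2020, 10, 31, 15, 0)
plays = [(start + timedelta(minutes=4 * i), track)
         for i, track in enumerate(["a", "a", "b", "c", "d", "e", "f"])]
later = start + timedelta(hours=2)
plays += [(later, "g"), (later + timedelta(minutes=4), "h")]

print(preprocess(plays))  # [['a', 'b', 'c', 'd', 'e', 'f']]
```

In this example the first session keeps six songs after the repeated "a" is merged, while the second session has only two songs and is discarded.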

4.3 Hyperparameter setting

The architecture design of the classical music recommendation engine based on the CNN model and the DL model is shown in Figure 2.

Figure 2: Architecture of the recommendation engine.

The architecture of the recommendation engine in Figure 2 contains four steps:

  1. User behavior features and song attributes are obtained from the user data and the databases.

  2. After sorting, the user and music features are fed into the basic music recommendation algorithm to train the initial recommendation sequence.

  3. The initial recommendation sequence is taken out, the features of the corresponding users and songs are obtained, and they are spliced with the features of the user's listening sequence and fed into the CNN model and DL model to train the top N songs, which form the recommendation list.

  4. After the recommendation list is obtained, the user’s request is examined. When a personalized recommendation list is requested, the personalized list with new songs is recommended to the user. When popular songs are requested, the list of popular songs is recommended to the user.

4.4 Experimental results and performance evaluation

4.4.1 Comparative analysis of different dimensions in the hidden layer

The experiment aims to explore the influence of different dimensions in the hidden layer on the model and find the optimal value of the dimensions in the hidden layer. The number of dimensions in the hidden layer is often related to the feature extraction ability of the model. If the number of neurons in the hidden layer is too small, it will not extract enough information. If the number of neurons in the hidden layer is too large, it will introduce unnecessary noise to cause over-fitting and make the model more bloated. In this experiment, the dimensions in the hidden layer are set to 48, 96, 192, and 384, respectively. The experimental results are shown in Figure 3.

Figure 3: The changing state of index value under different dimensions in the hidden layer.

Figure 3 shows the influence of dimension in the hidden layer on the algorithm. When the number of the dimensions in the hidden layer is small, the model cannot carry or mine enough songs and user’s preference information and cannot accurately predict the next song to be listened to by the user. When the number of dimensions in the hidden layer increases to 384, the overall effect of the model begins to decline, indicating that too many nodes in the hidden layer lead to the sparseness of users’ preferences, producing much noise information and hurting the final result. Therefore, the optimal number of dimensions in the hidden layer is 192.

4.4.2 Comparative experimental analysis of different learning rates

This experiment explores the influence of different learning rates on the model. The learning rate determines the step size of each training update. A good learning rate lets the model converge to the optimal solution quickly and stably, while a bad learning rate may make convergence very slow or even prevent the model from converging. If the learning rate is too large, the model oscillates around the lowest point and cannot reach the global optimum. If the learning rate is too small, convergence is slow, or the model falls into a local optimum. In the search for the optimal learning rate, the learning rate is set to 0.1, 0.01, 0.001, and 0.0001, respectively. Scaling the learning rate by factors of 10 is a common way to find a suitable value quickly. The experimental results are shown in Figure 4.

Figure 4

The changing state of indexes under different learning rates.

Figure 4 shows that the final effect first improves and then deteriorates as the learning rate decreases. When the learning rate is set to 0.1, the model oscillates around the minimum and fails to converge to the global optimum. When the learning rate is 0.0001, the model does not converge to the global optimum either: as it approaches the optimal point, it falls into a local optimum, and 32,000 rounds of training are needed for it to converge. When the learning rate is 0.001, the model converges to the global optimum after 24,000 rounds of training.
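The oscillation and slow-convergence behaviors discussed above can be illustrated on a toy problem. The sketch below runs plain gradient descent on a one-dimensional quadratic loss; the function name `steps_to_converge`, the curvature of 20, the starting point, and the tolerance are all assumptions chosen for illustration and are unrelated to the paper's network or data. Because a quadratic has a single minimum, the sketch cannot show the local-optimum effect, only overshooting and slow progress.

```python
from typing import Optional


def steps_to_converge(lr: float,
                      curvature: float = 20.0,
                      w0: float = 1.0,
                      tol: float = 1e-6,
                      max_steps: int = 50_000) -> Optional[int]:
    """Run gradient descent on the toy loss f(w) = 0.5 * curvature * w**2.

    Returns the number of updates needed to bring |w| below tol,
    or None if that does not happen within max_steps.
    """
    w = w0
    for step in range(1, max_steps + 1):
        grad = curvature * w   # derivative of the toy loss at w
        w -= lr * grad         # plain gradient-descent update
        if abs(w) < tol:
            return step
    return None


# Same grid of learning rates as in the experiment, scaled by factors of 10.
for lr in (0.1, 0.01, 0.001, 0.0001):
    print(lr, steps_to_converge(lr))
```

With these toy settings, a learning rate of 0.1 overshoots the minimum on every update and never settles, while each further tenfold reduction from 0.01 needs roughly ten times as many updates to reach the same tolerance.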

5 Conclusion

This study uses a DL-based recommendation algorithm to construct a classical music recommendation system. By combining DL with recommendation algorithms, features of various types of classical music are extracted automatically, and a higher-level feature representation is obtained from the audio of classical music. The implicit characteristics of classical music are then combined with the deep neural network of the recommendation algorithm to derive user preferences. Finally, recommendations matching each user's personalized preferences are produced, with the aim of enlarging the art market audience. According to the data on recommended and shared classical music on music platform A, the classical music recommendation system can broaden the art market audience, save users' search time, and bring convenience to users.

The study has the following shortcomings. First, the classical music recommendation system is based only on a CNN, which may limit its performance. Second, the collected data have few features and are not comprehensive. Future work will therefore integrate other DL network models into the design of the model. In addition, visualization methods will be used to observe the learning state of each layer, improve the structure, and adjust the parameters, which can improve the accuracy of the recommendation system. In terms of user characteristics, the release time of classical music and user age will be added in later research to make the feature set more comprehensive.

Acknowledgments

This research received no external funding.

  1. Funding information: This research received no external funding.

  2. Author contributions: Chunhai Li: conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing (original draft preparation). Xiaohui Zuo: writing (review and editing), visualization, supervision, project administration, funding acquisition.

  3. Conflict of interest: The authors declare that there is no conflict of interest regarding the publication of this article.

  4. Data availability statement: The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

References

[1] Elbir AM. DeepMUSIC: Multiple signal classification via deep learning. IEEE Sens Lett. 2020;4(4):1–4. doi:10.1109/LSENS.2020.2980384

[2] Martin-Gutierrez D, Hernandez Penaloza G, Belmonte-Hernandez A, Alvarez Garcia F. A multimodal end-to-end deep learning architecture for music popularity prediction. IEEE Access. 2020;34(99):1. doi:10.1109/ACCESS.2020.2976033

[3] Wen X. Using deep learning approach and IoT architecture to build the intelligent music recommendation system. Soft Comput. 2020;23(1):1–10.

[4] Pandeya YR, Lee J. Deep learning-based late fusion of multimodal information for emotion classification of music video. Multimed Tools Appl. 2021;80(38):1–19. doi:10.1007/s11042-020-08836-3

[5] Prisco RD, Zaccagnino G, Zaccagnino R. EvoComposer: An evolutionary algorithm for 4-voice music compositions. Evolut Comput. 2019;28(2):1–42. doi:10.1162/evco_a_00265

[6] Deshmukh P, Kale G. Music and movie recommendation system. Int J Eng Trends Technol. 2018;61(3):178–81. doi:10.14445/22315381/IJETT-V61P229

[7] Shi J. Music recommendation algorithm based on multidimensional time-series model analysis. Complexity. 2021;2021(1):1–11. doi:10.1155/2021/5579086

[8] Dharsini SV, Balaji B, Hari K. Music recommendation system based on facial emotion recognition. J Comput Theor Nanosci. 2020;17(4):1662–5. doi:10.1166/jctn.2020.8420

[9] Jin Y, Han C. A music recommendation algorithm based on clustering and latent factor model. MATEC Web Conf. 2020;309(9):3009. doi:10.1051/matecconf/202030903009

[10] Schedl M. Deep learning in music recommendation systems. Front Appl Math Stat. 2019;5:44. doi:10.3389/fams.2019.00044

[11] Edwards JR, Borgstedt S, Barth B. New music recommendation algorithm facilitates audio branding. Mark Rev St Gallen. 2019;4:888–94.

[12] Pacha A, Haji J, Calvo-Zaragoza J. A baseline for general music object detection with deep learning. Appl Sci. 2018;8(9):1488. doi:10.3390/app8091488

[13] Ebrahimi AA, Abutalebi HR, Karimi M. A generalised two stage cumulants-based MUSIC algorithm for passive mixed sources localisation. IET Signal Process. 2019;13(4):409–14. doi:10.1049/iet-spr.2018.5357

[14] Yan F. Music recognition algorithm based on T-S cognitive neural network. Transl Neurosci. 2019;10:123–34. doi:10.1515/tnsci-2019-0023

[15] Du X. Application of deep learning and artificial intelligence algorithm in multimedia music teaching. J Intell Fuzzy Syst. 2020;38(2):1–11.

[16] Liao BY. Composition and improvement strategies of news audience’s media literacy in the omnimedia era. Contemp Soc Sci. 2020;24(4):128–37.

[17] Dorochowicz A, Kurowski A. Employing subjective tests and deep learning for discovering the relationship between personality types and preferred music genres. Electronics. 2020;9(12):2016. doi:10.3390/electronics9122016

[18] Oramas S, Barbieri F, Nieto O, Serra X. Multimodal deep learning for music genre classification. Trans Int Soc Music Inf Retr. 2018;1(1):4–21. doi:10.5334/tismir.10

[19] Mun KR, Song G, Chun S, Kim J. Gait estimation from anatomical foot parameters measured by a foot feature measurement system using a deep neural network model. Sci Rep. 2018;8(1):9879. doi:10.1038/s41598-018-28222-2

[20] Ren HS, Bo XC, Ying XM. A deep neural network model compression method of diffuse large B cell lymphoma recognition based on genetic algorithm. Mil Med. 2018;42(10):757–61.

[21] Lee HJ, Lee D. Study of process-focused assessment using an algorithm for facial expression recognition based on a deep neural network model. Electronics. 2020;10(1):54. doi:10.3390/electronics10010054

[22] Huang Z, Jia X, Guo Y. State-of-the-art model for music object recognition with deep learning. Appl Sci. 2019;9(13):2645. doi:10.3390/app9132645

[23] Chowdhuri S. PhonoNet: Multi-stage deep learning for raga preservation in hindustani classical music. J Acoust Soc Am. 2019;146(4):2947. doi:10.1121/1.5137236

[24] Briot JP, Pachet F. Music generation by deep learning - Challenges and directions. Neural Comput Appl. 2020;32(2):194–212.

[25] Gui R, Chen T, Nie H. The impact of emotional music on active ROI in patients with depression based on deep learning: A task-state fMRI study. Comput Intell Neurosci. 2019;2019(6):1–14. doi:10.1155/2019/5850830

[26] Purwins H, Li B, Virtanen T, Schluter J, Chang SY, Sainath T. Deep learning for audio signal processing. IEEE J Sel Top Signal Process. 2019;21:1. doi:10.1109/JSTSP.2019.2908700

[27] Sotiropoulos DN, Tsihrintzis GA. Artificial immune system-based music recommendation. Intell Decis Technol. 2018;14:1–17.

[28] Li T. Selection of audio materials in college music education courses based on hybrid recommendation algorithm and big data. J Phys Conf Ser. 2021;1774(1):012019. doi:10.1088/1742-6596/1774/1/012019

[29] Mandloi K, Mittal A. Hybrid music recommendation system using content-based filtering and k-mean clustering algorithm. Int J Comput Sci Eng. 2018;6(7):1498–501. doi:10.26438/ijcse/v6i7.14981501

[30] Gong W, Yu Q. A deep music recommendation method based on human motion analysis. IEEE Access. 2021;36(99):1. doi:10.1109/ACCESS.2021.3057486

Received: 2023-12-30
Accepted: 2024-02-04
Published Online: 2024-05-30

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 17.7.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jisys-2023-0351/html