1. Introduction
The financial investment business has been growing in recent years, with growing interest mainly among individuals. According to the Official Brazilian Stock Exchange [1], the number of individuals making financial investments grew from 557,109 in 2015 to 3,173,411 on 30 November 2020. The growth in the number of new investors is considerable, but the low level of financial education in Brazil makes the first investments difficult, precisely because this is a market that demands dedication and in-depth study to obtain good results. The lack of adequate knowledge therefore ends up keeping new investors away.
Within this context, it can be noticed that new investors need help to start their activities in the investment business. This paper therefore seeks to contribute to the financial investment field by using Machine Learning algorithms to assist the decision-making process of new investors, answering the following research questions: how can shares of companies listed on Bovespa be qualitatively classified to assist the decision-making process of new investors? Which Machine Learning algorithms have greater accuracy for this purpose?
Although the literature presents works similar to this one, the relevance of this research lies in the fact that it considers peculiarities of the Brazilian scenario, such as economic crises. In addition, the Selic rate, which is adjusted according to the needs of the Brazilian economy, is considered as one of the parameters. In this sense, peculiar situations can happen that are not portrayed in the databases used in other works.
Machine Learning algorithms that serve as financial investment advisors for new investors may be a solution to these questions. Based on these needs, this paper has the following objectives: identify the market data and investor profile needed to generate reliable investment suggestions; perform pre-processing and database assembly; define the algorithms to be used; implement Machine Learning algorithms to solve the problem at hand; and conduct validation and performance tests with the resulting models. Machine learning models have been used to support decision making in various fields, such as medicine [2,3], security [4,5], electricity [6,7,8], and financial investments [9].
2. Theoretical Background
This section will present concepts and definitions relevant to the development of the solution proposed by this paper.
2.1. Financial Market Investment Types
There are different types of financial investments that can be made in different types of applications, such as stocks, savings, commodities, and foreign currencies, among others. However, a financial investment is basically characterized as an application of capital in some financial instrument aiming at future income [10].
Financial investments can also be divided into two basic groups: fixed-income and variable-income investments. In fixed-income investments, the investor knows in advance the profitability that will be earned when the term of the investment is completed; savings accounts and direct treasury bonds are among the main examples. In variable-income investments, the profitability is not known in advance by the investor, because it varies over time; for this reason, they are investments with higher risk but potentially greater profitability. Stocks and stock funds are among the main examples of variable-income investments [11].
The implementation of the present work is focused on shares. A share is the smallest portion of the capital stock of a publicly traded company, which gives the share’s owner rights and duties equal to those of any other partner in the company, limited by the number of shares owned [12]. The owner of the shares is called a shareholder, who, from the moment he or she acquires a share, becomes a partner or co-owner of the company. Stocks are considered a variable-income investment because their return can vary over time, and a relatively high-risk investment because they belong to a highly volatile market [13].
2.2. Fundamental Analysis
Fundamentalist Analysis determines the appropriate stock prices using the earnings and dividends of a given company, expectations of future interest rates, and risk assessment of the company [14].
The basic objective of valuing a company is to obtain a fair value that reflects the expected return on future performance projections consistent with the reality of the company being evaluated. Since it is based on projections, the valuation is subject to errors and uncertainties, mainly because the external variables being analyzed are not controlled by the company in question. Within this context, the result obtained through an evaluation is not an exact estimate of the value of the evaluated company [15].
To analyze companies, it is important to use variables and indicators that impact a company, known as Fundamentalist Indicators. The present work adopts the indicators most commonly used according to Bered and Rosa [16], with the theoretical foundation presented serving as the basis for obtaining each one. The Fundamentalist Indicators used are: Net Margin (ML), Earnings per Share (EPS), Price/Earnings (P/E), Book Value per Share (NAV), Price/Equity Value (P/EV), Return on Equity (ROE), and EBITDA Margin (EM).
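For reference, these indicators follow the usual textbook formulations (the cited sources define them; the formulas below are the standard ones, not reproduced from those sources):
$$\text{ML} = \frac{\text{Net Income}}{\text{Net Revenue}}, \qquad \text{EPS} = \frac{\text{Net Income}}{\text{Number of Shares}}, \qquad \text{P/E} = \frac{\text{Share Price}}{\text{EPS}},$$
$$\text{NAV} = \frac{\text{Shareholders' Equity}}{\text{Number of Shares}}, \qquad \text{P/EV} = \frac{\text{Share Price}}{\text{NAV}},$$
$$\text{ROE} = \frac{\text{Net Income}}{\text{Shareholders' Equity}}, \qquad \text{EM} = \frac{\text{EBITDA}}{\text{Net Revenue}}.$$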
2.3. Investor Profile
Each investor active in the financial investment market has different characteristics; that is, each investor has a unique way of making investments, and the result of those investments is directly related to how he or she acts and thinks about decisions. Thus, each investor should make investments that match his or her profile, considering mainly three fundamental points: the types of risk he or she is willing to face; how much he or she is willing to lose; and the desired financial return. The following presents the characteristics and differences of the three investor profiles: conservative, moderate, and riskier [17].
Conservative: Their main characteristic is a low tolerance for risk and, consequently, a low return when compared to the other investor profiles. In general, they have little knowledge about investments and do not like to take risks; that is, they prefer to exchange the possibility of reaching high profitability for a higher level of security. They commonly aim at preserving their assets by making fixed-income investments or investments that allow them to withdraw their resources within a short period of time [18].
Moderate: Moderate-profile investors show interest in higher returns, take more risks, and seek to increase their wealth in the medium term through investments. In general terms, they have some knowledge about investments and are not willing to take high levels of risk, but neither are they conservative enough to fit the conservative profile. Because they want a higher return while still preserving some security, they are commonly advised to make investments with different degrees of risk, diversifying among various products, so that they can reach a higher return than the conservative profile while maintaining a moderate level of security [19].
Riskier: Investors with a riskier profile are characterized by the goal of obtaining higher returns on their investments and are therefore willing to take more risks. They have a high level of knowledge about investments, are usually advised by qualified professionals who guide their investments, and use variable income to achieve their financial goals, even accepting the loss of capital due to the high risk involved in the operations [20].
2.4. Machine Learning Techniques Used in Research
To develop this paper, different Artificial Intelligence methods were used, which will be presented further on.
2.4.1. Artificial Neural Networks
Artificial Neural Networks (ANNs) are inspired by the biological neural networks of the human brain, which can perform complex tasks automatically, quickly, and simultaneously [21]. These structures served as the basis for the development of models that seek to simulate the learning capacity of the brain [22]. In other words, ANNs are non-linear mathematical systems composed of neurons linked by connections that are associated with weights [23]. ANNs can recognize patterns, detect relationships, perform operations on imprecise data, and predict time series, among other functions [24].
In addition to time series prediction, ANNs show promise for pattern classification [25,26,27], control [28,29], and optimization. Optimization gains space in this context because of the need to improve the components necessary to maintain a system’s operation [30,31], especially considering the expansive growth of communication systems [32,33], the Internet of Things [34,35,36], the need for sustainability [37,38], technology development [39], and data privacy [40,41].
Figure 1 shows a representation of an artificial neuron from a Perceptron Neural Network.
The output of the neuron is the result of a computation involving the inputs $X_k$, a bias $B_k$, and synaptic weights $W_k$. Each input $X_k$ is multiplied by its synaptic weight $W_k$, the bias $B_k$ is added, and the activation function then determines whether or not the neuron is activated. The activation function is responsible for controlling the activation of a neuron and is therefore fundamental to the correct operation of the ANN [42]. The Sigmoid and ReLU functions are commonly used; ReLU was the one used in the present work, as it mitigates the vanishing gradient problem that can arise with the backpropagation algorithm.
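Written out, the neuron computation described above takes the standard perceptron form (a textbook formulation consistent with the description, with $\varphi$ denoting the activation function, e.g., ReLU, and $n$ the number of inputs):
$$y = \varphi\left(\sum_{k=1}^{n} W_k X_k + B\right),$$
where $B$ is the neuron's bias (written $B_k$ above in per-neuron notation).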
The interconnections between neurons in a neural network are what make it possible to perform complex tasks [43]. The Multilayer Perceptron architecture organizes perceptrons into multiple layers, divided into three parts: the Input Layer, Hidden Layers, and Output Layer [44]. Networks with intermediate layers can implement continuous functions and can perform function approximation using two intermediate layers [45], which served as the basis for the present work.
2.4.2. Logistic Regression
The Logistic Regression method performs a binary classification, returning the probability that the input data belong to a certain class or not: that is, the estimated probability of a given output ($y$) for an input ($x$). The dependent variable is usually binary (nominal or ordinal), and the independent variables can be categorical or continuous. The Logistic Regression model is based on the Sigmoid function, whose output varies between 0 and 1. For this reason, it is a widely used model to describe the probability of something happening or not based on the input variables [46].
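Concretely, the standard logistic model implied by this description estimates (with $\beta_0$ and $\beta$ denoting the fitted intercept and coefficient vector)
$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^{\top} x)}},$$
and classifies $x$ as class 1 when this probability exceeds a threshold, typically 0.5.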
2.4.3. Decision Tree
A Decision Tree is a classification/regression model whose structure has tree form, consisting basically of nodes and arcs (also known as branches) [47]. Decision Trees are widely used for several reasons: they support diverse feature types (categorical and numerical), they represent the acquired knowledge in an easily understood form, and the entire training and learning process is relatively fast compared to other algorithms such as ANNs [48].
Each internal node of the tree represents a test on a feature of an instance. The arcs represent the result of each test performed. The outer nodes, also called end nodes or leaf nodes, represent the classification classes. To classify an instance, the tree is run from top to bottom, traversing the nodes and arcs and performing the tests at each node until a leaf node is reached, which contains the new classification of the instance. An example of a Decision Tree is shown in Figure 2.
The example in Figure 2 succinctly demonstrates how an instance I with two characteristics, yes and yes, is classified. At each node of the tree, a test is executed in which a comparison is performed with the features of the instance. The class to which the instance belongs is the third characteristic, which in this example is class 1. By performing the tests, the instance is labeled according to the class of the leaf node.
2.5. Related Works
Romani’s work [49] aims to increase investors’ returns by presenting investment products that fit their investment profile through the training of an ANN. The neural network was implemented in Python and trained using data from the investors with the highest returns in their investment portfolios within each investor profile, analyzing investments made in September 2016. In this way, investors with similar profiles can benefit from the knowledge of the most profitable investors through the trained neural network. According to the author, the simulations increased the total profitability of the investment portfolio in most cases: approximately 61% of the tests had a profitability gain, 4% maintained profitability, and 35% lost profitability.
The research of Lins [12] aimed to analyze Machine Learning techniques for estimating future variations of financial investments across three fund characteristics: conservative, moderate, and aggressive. The data were collected from the daily share value of three investment funds: Western Asset Investment Fund Equity BDR Level I, JGP Strategy Fund of Investment in Quotas of Multimarket Investment Funds, and Daycoval Classic Fundo de Investimento Renda Fixa Crédito Privado, covering May 2014 to September 2020. A comparison was made between the results obtained with neural networks and with Linear Regression, both implemented using the Weka tool. According to the author, the application proved effective for forecasting financial investment funds, with the ANN being the more effective of the two across various metrics and different databases. To evaluate the results, the following metrics were used: MAE, MSE, RMSE, and MAPE.
The work of Vilela, Penedo, and Pereira [50] developed a model with ANNs, specifically a Multilayer Perceptron, to forecast the prices of shares traded on the BM&FBovespa, using traditional indicators of profitability, liquidity, and debt. The database used was Economática, with quarterly series for 371 companies from 31 March 2012 to 31 March 2017, and the implementation was performed in Matlab. In cases where no sharp variations occurred, the results obtained were extremely close to the observed values. However, the neural network was not able to generate satisfactory predictions of sudden variations in the trend, which are generally linked to factors external to the indicators.
The work of Aydin and Cavdar [51] aimed to develop an early warning system to predict financial crises in Turkey, also using a Multilayer Perceptron. The tests were performed in the JAVA language using 298 months of data on 7 key macroeconomic and financial indicators of the Turkish economy between January 1990 and September 2014, obtained from the Electronic Data Distribution System of the Turkish National Bank (EDDS) and the World Bank. In the conclusion, the authors point out that the results obtained with ANNs are impressive; however, it remains a great challenge to forecast in a complex economic environment influenced by external factors, such as crises in other countries and political disturbances, which may affect the reliability of the forecast.
The work by Dingli and Fournier [52] developed a system to forecast financial time series using Convolutional Neural Networks to predict the direction of the next period relative to the current price of a stock. Historical daily prices from 2003 to 2016 were fetched from Yahoo Finance, and the neural network was built with the open-source library TensorFlow. The system achieved 65% accuracy in predicting the next month’s price direction and 60% accuracy in predicting the next week’s price direction. In addition to models based on deep layers [53,54] and hybrid models [55], other models are gaining space, such as ensemble learning methods [56,57,58], neuro-fuzzy systems [59,60], and the group method of data handling [61]. Table 1 compares some of these related works.
3. Project
In this project, Machine Learning algorithms capable of evaluating a given stock and, based on this evaluation, indicating whether or not to invest in it were implemented. Three techniques were used: ANNs, Logistic Regression, and Decision Tree. The database used is the one provided by Oceans14 [62], containing the history of Fundamentalist Indicators for each stock, complemented by the historical quotes of each stock obtained through the Yahoo Finance API [63].
Each algorithm evaluated the stocks, classifying them as indicated or not indicated as a good financial investment. Finally, the results obtained with each technique were compared to identify the one with the highest accuracy and best classification metrics, namely precision, recall, and F1-score. Each of the steps is presented below in more detail, in sections divided into: Database, Algorithms, and Validation.
3.1. Database
The database used was the one provided by the Oceans14 website [62], containing the history of Fundamentalist Indicators and the history of quotes. The most commonly used indicators were adopted, based on the theoretical foundation presented for obtaining each indicator: ML, EPS, P/E, NAV, P/EV, ROE, and EM. Companies belonging to the financial sector were excluded from the database, according to the Oceans14 [62] classification of sectors, because companies in this sector present very distinct characteristics compared to companies in other sectors, impairing comparability and consequently the training of the algorithms. The tests were performed with data samples from two distinct periods: 1998 to 2019 and 2014 to 2019. The 2014–2019 sample has 1021 records and 13 variables, as follows: share, company, sector, subsector, segment, ML, EPS, P/E, Equity Value per Share, P/EV, ROE, EM, and year.
In the 2014–2019 database, data prior to 2014 were not used because the Brazilian economic recession began in 2014 [64]. Data from 2020 onwards were also not used, because the COVID-19 pandemic affected the market in an unexpected way. Since the objective of the work is to apply fundamentalist analysis, adding data from the pandemic period would only hinder the training, generating inconsistencies that interfere with the reliability of the results.
As the Oceans14 website [62] contains data records since 1998, tests were also performed with the data from 1998 to 2019, containing 2627 records, with the purpose of identifying which database (1998–2019 or 2014–2019) produced the best results for the present work. To use the data, it was necessary to classify each stock as indicated or not indicated as a good financial investment, where “1” represents a stock classified as a good investment and “0” represents a stock that is not considered a good investment. This classification was performed by comparing the variation of each share’s quotation with the variation of the Selic Rate over the same period.
The variation in the price of each share was obtained by comparing the price of the share on the date recorded in the database with its price 5 years after the recorded date, based on the return horizon and the profitability references of fixed-income investments of the main benchmarks demonstrated by Araujo [64], such as savings, IPCA, and CDI, among others. The variation of the Selic Rate was obtained from the History of Basic Interest Rates published on the website of the Central Bank of Brazil.
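A minimal sketch of this labeling rule, assuming hypothetical column names (price_now for the quote on the recorded date, price_5y for the quote five years later, selic_var for the Selic variation over the same period); the actual retrieval of these series from Oceans14, Yahoo Finance, and the Central Bank is not shown:
```python
import pandas as pd

def label_stocks(df: pd.DataFrame) -> pd.DataFrame:
    """Assign label 1 (good investment) or 0 (not good) to each record.

    Hypothetical columns: 'price_now' (quote on the recorded date),
    'price_5y' (quote 5 years later), 'selic_var' (Selic variation
    over the same 5-year period).
    """
    stock_var = (df["price_5y"] - df["price_now"]) / df["price_now"]
    # Label "1" when the stock outperformed the Selic Rate, else "0".
    df["label"] = (stock_var > df["selic_var"]).astype(int)
    return df
```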
3.2. Algorithms
In this paper, three different algorithms were implemented, with the purpose of testing and comparing which one generates better investment indications based on Fundamentalist Analysis: Multilayer Perceptron, Logistic Regression, and Decision Tree. The Multilayer Perceptron is formed by a set of source nodes that form the Input Layer, one or more Hidden Layers consisting of neurons, and an Output Layer [21].
The learning method used in the ANN was Feed-Forward Backpropagation, in which the error is propagated along the reverse path, that is, from the last layers to the first layers of the network [44]. The number of neurons in the input layer equals the number of variables present in the database. The number of neurons in the hidden layers was defined empirically, by testing different numbers of neurons and selecting the configuration with the best accuracy.
3.3. Validation
To compare the performance of the different models, evaluation metrics calculated from the Confusion Matrix were used. The metrics Accuracy, Precision, Recall, and F1-score were used to evaluate the results obtained with the classifications of each algorithm, compare them, and identify the one that achieved the best results for the tests performed in this paper. They are given by:
$$\text{Accuracy} = \frac{tp + tn}{tp + tn + fp + fn}, \qquad \text{Precision} = \frac{tp}{tp + fp},$$
$$\text{Recall} = \frac{tp}{tp + fn}, \qquad \text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},$$
where $tp$ is true positive, $tn$ is true negative, $fn$ is false negative, and $fp$ is false positive.
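In practice, these quantities can be obtained with Scikit-learn, the library the paper uses for the models themselves; a minimal sketch:
```python
from sklearn.metrics import classification_report, confusion_matrix

def evaluate(y_true, y_pred):
    # Rows of the confusion matrix are true classes, columns predictions.
    print(confusion_matrix(y_true, y_pred))
    # Per-class precision, recall, and F1-score, plus overall accuracy.
    print(classification_report(y_true, y_pred, target_names=["0", "1"]))
```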
4. Development
This section presents the steps taken to develop this work: database preparation, Neural Network, Decision Tree, and Logistic Regression. The first step was to prepare the database for the algorithms implemented later. The data preparation was performed in two steps: excluding unnecessary data and transforming the data format to suit the algorithms. First, the columns that were unnecessary for the implementation of the algorithms and that are not Fundamentalist Indicators, the main basis of the Fundamentalist Analysis on which this work is structured, were removed. The columns removed were: “VAR PRICE”, “SUBSECTOR”, “SEGMENT”, “STOCK”, “COMPANY”, and “YEAR”.
In addition to removing unnecessary columns, it was necessary to convert the columns holding String data to Int. The “SECTOR” column was transformed using Sklearn’s “LabelEncoder” function, which represents each class of the column with an integer, so that the data are represented only in numeric format. After the data preparation, the training and test sets were split using Sklearn’s “train_test_split” function, with 30% of the data used for training and 70% for testing.
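A minimal sketch of this preparation step, assuming the DataFrame uses the column names quoted above and a hypothetical “LABEL” column holding the 0/1 classification (the paper does not name the label column):
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

def prepare(df: pd.DataFrame):
    # Drop the columns that are not Fundamentalist Indicators.
    df = df.drop(columns=["VAR PRICE", "SUBSECTOR", "SEGMENT",
                          "STOCK", "COMPANY", "YEAR"])
    # Encode the remaining string column as integers.
    df["SECTOR"] = LabelEncoder().fit_transform(df["SECTOR"])
    # "LABEL" is a hypothetical name for the 0/1 target column.
    X = df.drop(columns=["LABEL"])
    y = df["LABEL"]
    # 30% for training and 70% for testing, as stated in the text.
    return train_test_split(X, y, train_size=0.30, random_state=42)

# Usage: X_train, X_test, y_train, y_test = prepare(df)
```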
4.1. ANN Implementation
The ANN was implemented using Sklearn’s MLPClassifier, a neural network intended for classification problems. The activation function used in the hidden layers is ReLU, which has no negative part, producing outputs between zero and infinity: zero represents a deactivated neuron, and values greater than zero represent activation [65]. To configure the parameters of the ANN, tests were performed using Sklearn’s GridSearchCV function, which automates the process of tuning the parameters of a given algorithm, trying several combinations of parameters and evaluating each configuration to find the one that produces the best results.
The number of neurons in the two hidden layers was defined through the parameter “hidden_layer_sizes”, the learning rate through the parameter “learning_rate_init”, and the number of epochs through the parameter “max_iter”. After running GridSearchCV, the best parameters were obtained through the attribute “grid.best_params_”: 4 neurons in the first hidden layer, 3 neurons in the second hidden layer, a learning rate of 0.01, and 1000 epochs.
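A sketch of this search under the stated setup; the candidate grids here are illustrative assumptions, since the paper reports only the winning configuration:
```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Illustrative candidate values; only the winners are reported in the text.
param_grid = {
    "hidden_layer_sizes": [(3, 2), (4, 3), (5, 4)],
    "learning_rate_init": [0.001, 0.01, 0.1],
    "max_iter": [500, 1000],
}

grid = GridSearchCV(MLPClassifier(activation="relu"), param_grid, cv=5)
# After grid.fit(X_train, y_train), grid.best_params_ yields the selected
# configuration; the text reports (4, 3) hidden neurons, a learning rate
# of 0.01, and 1000 epochs.
```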
After defining the settings of the ANN, the tests were performed using the 2014–2019 database. The accuracy obtained with the model was 74%. Data classified as “1” obtained a precision of 0.74, recall of 0.96, and F1-score of 0.84. However, data classified as “0” obtained a precision of 0.72, recall of 0.22, and F1-score of 0.34. In other words, among the data that should be classified as “0”, only 22% were actually classified as “0”. Considering that the label “0” represents a stock that is not indicated as a good investment option, a recall of only 0.22 means that the algorithm classifies 78% of the stocks that should not be indicated as good investment options as being good investment options. This negatively influences the decision making of an investor who might use the algorithm, causing him to invest in a stock that should not be classified as a good investment choice.
At first, a database containing 1021 stock records from 2014 to 2019 was used. However, the Oceans14 website [62], from which the data were collected, contains records since 1998. Therefore, further tests were performed using the database containing the Fundamentalist Stock Indicators since 1998. With the insertion of data since 1998, the database has 2627 records, of which 1231 have the label “0” and 1396 have the label “1”. After training and testing the ANN using the database with records from 1998 to 2019, a report was generated with the main classification metrics.
Figure 3 shows the metrics of the ANN using the database from 1998 to 2019. The accuracy obtained with the model was 69%. Data classified as “1” obtained a precision of 0.72, recall of 0.69, and F1-score of 0.71. Data classified as “0” obtained a precision of 0.66, recall of 0.68, and F1-score of 0.67.
Figure 4 shows the Confusion Matrix of the ANN: data labeled as “1” are 69% correctly classified, and data labeled as “0” are 68% correctly classified.
4.2. Decision Tree
The Decision Tree was implemented using Sklearn’s DecisionTreeClassifier, a Decision Tree intended for classification problems. To set the parameters of the algorithm, GridSearchCV was used, as with the ANN. Three parameters were tested with GridSearchCV: class_weight, criterion, and max_depth. The class_weight parameter is used to work with unbalanced databases, which is the case in this work: the database contains more data labeled “1” (stocks considered a good investment) than “0” (stocks considered not a good investment option). The criterion parameter defines which rule the Decision Tree uses to generate splits and may vary depending on the application. The max_depth parameter is the maximum height of the Decision Tree, which, if too small, can cause underfitting and, if too large, can cause overfitting. A sketch of this search is shown below.
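As with the ANN, the grid below is illustrative; the paper reports only the selected configuration:
```python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Illustrative candidate values for the three parameters discussed above.
param_grid = {
    "class_weight": [None, "balanced"],
    "criterion": ["gini", "entropy"],
    "max_depth": list(range(3, 21)),
}

grid = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
# After grid.fit(X_train, y_train), the chosen values are read from
# grid.best_params_.
```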
After running GridSearchCV, the best parameters for the Decision Tree were: class_weight set to “balanced”, criterion set to “gini”, and max_depth set to 13. After defining the Decision Tree settings, the tests were performed using the 2014–2019 database. The accuracy obtained with the model was 76%. Data classified as “1” obtained a precision of 0.83, recall of 0.81, and F1-score of 0.82. However, data classified as “0” obtained a precision of 0.62, recall of 0.66, and F1-score of 0.64. After the training and testing with the 2014–2019 database, training and testing were performed using the 1998–2019 database, which produced the classification metrics shown in Figure 5.
In the metrics presented in Figure 5, the accuracy obtained with the model was 77%. Data classified as “1” obtained a precision of 0.75, recall of 0.80, and F1-score of 0.78. Data classified as “0” obtained a precision of 0.79, recall of 0.73, and F1-score of 0.76. As with the ANN, comparing the metrics obtained with the 1998–2019 database against those obtained with the 2014–2019 database shows the improvement in the metrics of the data classified as “0”.
Figure 6 shows the Confusion Matrix obtained with the Decision Tree: data labeled as “1” are 80% correctly classified, and data labeled as “0” are 73% correctly classified.
4.3. Logistic Regression
The regression method performs the prediction of $Y_i$ from knowledge of $x_i$. There are several regression techniques that estimate the relationship between variables, for example: logistic, linear, non-linear, and ridge, among others [46]. Linear regression refers to the relationship between variables where the conditional expectation of $Y$, given $X = x$, is a linear function of $x$ [66]. In a model based on simple or multiple linear regression, the dependent variable $Y$ is a variable of a continuous nature. However, it can also be qualitative, represented by two or more categories; in this case, the least-squares optimization method does not provide a reasonable estimator. The ridge model solves a regression problem where the loss function is the linear least-squares function and the regularization is given by the l2-norm, with built-in support for multivariate regression [67]. For categorical outcomes, logistic regression offers a more appropriate approach, allowing the use of regression models to calculate or predict the probability of a given event [68].
The Logistic Regression algorithm was implemented using Sklearn’s LogisticRegression. To define the parameters of the algorithm, GridSearchCV was used, as it was for the ANN and the Decision Tree. Three parameters were tested with GridSearchCV: class_weight, max_iter, and solver. The class_weight parameter is used to work with unbalanced databases, which is the case in this work. The max_iter parameter is the number of “epochs”, like the parameter defined in the ANN. The solver parameter is the algorithm used for optimization. A sketch of this search is shown below.
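Again, the candidate values are illustrative assumptions; only the winning configuration is reported in the text:
```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Illustrative grid over the three parameters discussed above.
param_grid = {
    "class_weight": [None, "balanced"],
    "max_iter": [100, 500, 1000],
    "solver": ["lbfgs", "liblinear", "newton-cg"],
}

grid = GridSearchCV(LogisticRegression(), param_grid, cv=5)
# After grid.fit(X_train, y_train), the selected configuration is read
# from grid.best_params_.
```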
After running GridSearchCV, the best parameters for the Logistic Regression were: class_weight set to “None”, max_iter set to 100, and solver set to “newton-cg”. After defining the Logistic Regression settings, the tests were performed using the 2014–2019 database. The accuracy obtained with the model was 64%. Data classified as “1” obtained a precision of 0.67, recall of 0.89, and F1-score of 0.76. However, data classified as “0” obtained a precision of 0.41, recall of 0.14, and F1-score of 0.21. After the training and testing with the 2014–2019 database, training and testing were performed using the 1998–2019 database, which produced the classification metrics shown in Figure 7.
In the metrics presented in Figure 7, the accuracy obtained with the model was 66%. Data classified as “1” obtained a precision of 0.69, recall of 0.67, and F1-score of 0.68. Data classified as “0” obtained a precision of 0.62, recall of 0.64, and F1-score of 0.63. As with the ANN and the Decision Tree, comparing the metrics obtained with the 1998–2019 database against those obtained with the 2014–2019 database shows the improvement in the metrics of the data classified as “0”.
Figure 8 shows the Confusion Matrix obtained with the Logistic Regression: data labeled as “1” are 67% correctly classified, and data labeled as “0” are 64% correctly classified.
5. Results
This section presents the results obtained with the algorithms implemented in this work. First, the accuracy obtained with each algorithm using the 1998–2019 database is compared, as shown in Figure 9.
In the graph shown in Figure 9, the Decision Tree has the highest accuracy, 77%, followed by the ANN with 69% and the Logistic Regression with 66%. However, to analyze the results more deeply and adequately, it is necessary to analyze the classification metrics, as presented in Figure 10.
In the graph shown in Figure 10, the Class “1” metrics of the three algorithms are presented. The ANN has a precision of 0.72, recall of 0.69, and F1-score of 0.71. The Decision Tree has a precision of 0.75, recall of 0.80, and F1-score of 0.78. Finally, the Logistic Regression has a precision of 0.69, recall of 0.67, and F1-score of 0.68. The Decision Tree obtained higher values for its classification metrics compared to the other algorithms. It is worth noting that the recall of the Decision Tree was higher than its precision and F1-score, indicating that the algorithm obtained a low number of false negatives for Class “1”.
Figure 11 shows the metrics obtained with the three algorithms for Class “0”.
In the graph shown in Figure 11, the Class “0” metrics of the three algorithms are presented. The ANN has a precision of 0.66, recall of 0.68, and F1-score of 0.67. The Decision Tree has a precision of 0.79, recall of 0.73, and F1-score of 0.76. Finally, the Logistic Regression has a precision of 0.62, recall of 0.64, and F1-score of 0.63.
These graphs show that the Decision Tree presented higher values for all metrics, both for Class “1” and for Class “0”. It is worth noting that, in this application, bad stocks (Class “0”) classified as good (Class “1”) generate worse consequences for the investor than good stocks (Class “1”) classified as bad (Class “0”).
Investing in a bad stock generates losses for the investor, whereas not investing in a good stock generates no profit but also no loss. Therefore, to select the best algorithm for this application, it is necessary to analyze the metrics of the Class “0” classifications (stocks considered bad for investment). By this criterion, the Decision Tree was indeed the algorithm that presented the best precision, recall, and F1-score for the application of this work.
6. Conclusions
This paper was dedicated to implementing Machine Learning algorithms able to act as financial investment advisors for new investors. The proposed objectives were achieved, namely: identify the market and investor profile data needed to generate reliable investment suggestions; perform preprocessing and database assembly; define the algorithms to be used; implement the Machine Learning algorithms; and perform validation and performance tests.
The Fundamentalist Analysis was used as the basis for evaluating and generating investment indications, using the Fundamentalist Indicators of each stock as variables in the database. The Fundamentalist Indicators were obtained from the Oceans14 website [62]. However, the Fundamentalist Indicators alone were not enough for the implementation: it was necessary to label each registered stock as good (Class “1”) or bad (Class “0”) to invest in. Thus, the variation of each share’s price was compared with the variation of the Selic Rate over the same period. If the stock variation was greater than the Selic Rate variation, the stock was labeled as Class “1”; otherwise, it was labeled as Class “0”. To perform the calculations, it was necessary to use the Yahoo Finance quotation history and the Central Bank of Brazil’s Basic Interest Rate history. In addition, the database was prepared by removing variables unnecessary for the implementation of the work and stocks of companies in the financial sector.
The Machine Learning techniques used were: Multilayer Perceptron, Decision Tree, and Logistic Regression. The implementation was completed in the Python language, using Google Colaboratory and the Machine Learning library Scikit-learn. The settings for each algorithm were defined using the GridSearchCV function to find the best configuration for this work. With the parameter settings of each technique defined, the algorithms were implemented. At first, only the database with records from 2014 to 2019 was to be used, but the results obtained with records from this period alone were not satisfactory, because the classification metrics for Class “0” were very low. For this reason, new tests were performed with the database containing records from 1998 to 2019, which obtained better accuracy, precision, recall, and F1-score.
The results obtained with each algorithm were compared using the metrics accuracy, precision, recall, and F1-score. The algorithm that obtained the best classification metrics was the Decision Tree, with an accuracy of 0.77. The metrics for Class “0” were a precision of 0.79, recall of 0.73, and F1-score of 0.76; for Class “1”, a precision of 0.75, recall of 0.80, and F1-score of 0.78.
Regarding the Decision Tree accuracy, the results lead us to consider that the value achieved was limited by market behavior itself, which involves numerous variables beyond what is “predicted” by the indicators of Fundamental Analysis, for example: economic crises, wars, natural disasters, etc. As the work was based on Fundamental Analysis, the forecasts made by the models are limited to the information provided by the indicators used as inputs to the models.
Future Works
For future development of the research in this area, some points can be explored. One is to use a database with more records, such as the one provided by Economática, which is paid but very complete; this would help the training of the algorithms and consequently the results they achieve. It could also be beneficial to use more, or different, Fundamentalist Indicators: this work used some of the main indicators, but there are many others that, if well chosen and combined, can assist the algorithms in the analysis of stock valuation. Lastly, using the yield of other investments instead of the Selic Rate as the criterion for labeling the stocks in the database is an area that needs further research.