Improving Quality of Long-Term Bond Price Prediction Using Artificial Neural Networks

Purpose: The aim of this paper is to propose a nonlinear autoregressive neural network that can improve the quality of bond price forecasting. Methodology/Approach: Due to the complex nature of the market information that influences bonds, artificial intelligence could be an accurate, robust and fast choice of bond price prediction method. Findings: Our results reached a coefficient of determination higher than 95% in the training, validation and testing sets. Moreover, we proposed a nonlinear autoregressive network with external inputs, using 50-year interest rate swaps denominated in EUR and the volatility index VIX as two external variables. Research Limitation/Implication: Our sample of daily prices between 4th January 2016 and 13th January 2021 (1,270 trading days in total) suggests that both the Levenberg-Marquardt and Scaled conjugate gradient learning algorithms achieved excellent results. Originality/Value of paper: Despite the fact that both learning algorithms achieved satisfying outcomes, implementation of an independent variable into the autoregressive neural network environment had no significant impact on prediction ability of the model. Category: Research paper


INTRODUCTION
Bond yields and prices play a crucial role in the global economy. They determine the cost of corporate financing, influence the balances of governments and shape market expectations. A correct forecast of bond yields and prices is therefore of great importance to almost all market participants. An extensive amount of academic research has accordingly focused on the issue of bond return predictability. Many studies indicated that precise forecasts of bond yields and excess returns are possible using forward rates (Fama and Bliss, 1997; Campbell and Shiller, 1991; Huang and Lin, 1996; Fama, 2006; Vieira, Fernandes and Chague, 2017). Although these works confirm the forecasting ability of forward rates, their out-of-sample predictions are usually outperformed by random walk forecasts (Diebold and Li, 2006). Carriero, Kapetanios and Marcellino (2012) introduced a Bayesian vector autoregression with time-varying optimal shrinkage to forecast the term structure of government bond yields. Their approach performed better than benchmark methods on most maturities and forecast horizons; however, the predictive gains of the model with respect to the random walk have declined over time. Given the importance of bond markets, several studies focused on the relationship between the state of the economy and bond return volatility (Bollerslev, Cai and Song, 2000; Andersen et al., 2001; Andersen et al., 2003). Chao (2016) examined whether different economic variables significantly influence the return volatility forecasts of US Treasuries and evaluated the out-of-sample performance of various prediction techniques.
The forecasting capability of the model was evident at the short end of the yield curve and in turbulent historical periods, but the evidence of out-of-sample forecasting ability was weaker, since only a few forecasts significantly outperformed the benchmark. The link between the standard macroeconomic variables and the shape of the sovereign bond yield curve was examined by Aguiar-Conraria, Martins and Soares (2012), who used wavelet tools to analyse the yield curve components with three time-varying latent factors corresponding to its level, slope and curvature. The expectations element of the term spread could only moderately help to forecast the business cycle, since the low liquidity and the risk associated with longer-term securities are represented by the term premium element, which reflects the demand for higher yields. On the other hand, Hamilton and Kim (2002) suggested that both elements are statistically significant.
While many authors claimed that macroeconomic fundamentals such as unemployment or the debt-to-GDP ratio are the major determinants of government bond yields (Bernoth, von Hagen and Schuknecht, 2004; Georgoutsos and Migiakis, 2013), von Hagen, Schuknecht and Wolswijk (2011) and Longstaff et al. (2011) suggested that bond yields are influenced by common parameters such as the risk aversion of investors. On the other hand, De Grauwe and Ji (2013) argue that spreads are significantly influenced by monetary policy. On a sample of 10-year Greek government bonds, Chionis, Pragidis and Schizas (2014) found that, in general, macroeconomic indicators play a significant role as determinants of Greek bond yields, while, when the debt crisis period is isolated, inflation and unemployment, among others, strengthen their effect on the Greek debt market. For the period during the crisis, the balance of the current account was among the top variables determining yields. Even though there has been literature exploring the impact of economic announcements on various asset classes, global bond markets have received less attention than stocks, derivatives, exchange rates, or other financial instruments (Ehrmann and Fratzscher, 2002; Faust et al., 2003; Chuliá, Martens and van Dijk, 2010). Share prices heavily depend on expected profits, the risk premium, and the actual discount rate, which generally move in opposite directions. The relationship between macroeconomic releases and share prices is therefore ambiguous. Setting aside the risk premium, favorable macroeconomic news increases both expected profits and the discount rate, leaving the net effect on stock prices uncertain (Andersen et al., 2007). For foreign exchange markets, positive news about the domestic economy is usually expected to strengthen the domestic currency; however, the empirical evidence is mixed. Moreover, many papers indicate that prices and fundamentals are independent in the foreign exchange market.
Focusing on the link between macro indicators and bond prices, Paiardini (2014) followed the previous analyses of Balduzzi, Elton and Green (2001), Green (2004) and Andersson, Overby and Sebestyén (2009) and investigated the implications of regularly published macro news and monetary policy statements for the returns of Italian bonds. The study found that, out of 68 announcements, 25 had a considerable impact on bond prices and that almost all releases were incorporated into market prices within 20 minutes, which is in line with Goldberg and Leonard (2003).
A considerable amount of academic research has been devoted to the analysis of long-term bond yields from various perspectives. Kurita (2016) applied a Markov-switching variance technique to estimate structural changes in Japan's government bonds and examined internal factors in terms of Japan's inflation rate, short-term interest rate and stock returns. The external factor was represented by yields on US Treasuries. The results of this research emphasize the nonlinear characteristics of Japanese bond yields over the past three decades. The linkages between the short and long ends of the term spread were examined by Byrne, Fazio and Fiess (2012). In particular, the authors separated the roles of global output, inflation, and savings as possible interpretations of the low long-term interest rates in western countries. Their results captured a globalization regime, in which the longer-term spreads were found to respond by approximately one third to changes in longer-term Treasury yields. The magnitude of the reactions was in conformity with the results of Diebold, Li and Yue (2008) and Lange (2014). Moreover, the obtained outcomes were consistent with the cross-correlations between the 10-year bond yields of small economies and Treasuries described in Kulish and Rees (2011). While most studies investigating the predictability of bond yields (Ilmanen and Byrne, 2003; Boyd and Mercer, 2010; Moskowitz, Ooi and Pedersen, 2012) relied on monthly data, Bessembinder et al. (2009) examined abnormal bond returns on a daily basis. They found that applying daily data significantly increased the power of the test, even if the available time series of daily returns was short. Their results were confirmed by Goyenko, Subrahmanyam and Ukhov (2011) and Hong, Lin and. For corporate debt, short-term predictability was also found to be positively connected to default risk. Returns on bond portfolios with risky obligations were more predictable than returns on high-quality portfolios with low risk.
It is crucial to understand that the major drivers of current interest rates are the actions of global central banks. Central banks decrease short-term interest rates through policy rates (and other tools), and the expectations of short-term rates are the crucial factor in determining long-term rates and bond yields. In response to low global interest rates, the private sector in particular significantly increased bond issuance and rebalanced from bank loans towards corporate bonds (Chang, Fernández and Gulan, 2016). The effects of zero-rate monetary policies on bond yields and spreads have been explored by various studies (Hamilton and Wu, 2012; Wright, 2012; Guidolin, Orlov and Pedio, 2014). They highlight a broad range of channels through which expansive monetary policy increases the prices of financial assets and affects the risk aversion of investors.
In addition to the above-mentioned applications, artificial neural networks have been repeatedly utilized for stock and commodity market forecasts. Kara, Boyacioglu and Baykan (2011) developed two neural network based models and compared their performance in predicting the direction of movement of the daily Istanbul Stock Exchange index. Ticknor (2013) proposed a Bayesian regularized artificial neural network to reduce the potential for overfitting and overtraining and performed experiments with blue-chip stocks. While Bildirici and Ersin (2009) upgraded ARCH/GARCH family models with artificial neural networks to evaluate the volatility of daily stock returns, Tseng et al. (2008) integrated a hybrid asymmetric volatility approach into a neural network option-pricing model to enhance the forecasting ability for the price of derivatives. Other interesting papers devoted to the estimation of the evolution of financial asset prices were presented by Hafezi, Shahrabi and Hadavandi (2015), Rezaee, Jozmaleki and Valipour (2018) and Zhang, Li and Morimoto (2019).
Despite the growing prices of stocks and bonds in the last five years, the development of an accurate forecasting method plays an important role in the analysis of the current debt market. As stated above, none of the proposed models was able to outperform the random walk benchmark consistently and provide precise out-of-sample predictions. In order to focus on previously unseen out-of-sample data, this paper proposes a long-term bond price forecasting model based on a biologically inspired nonlinear technique: artificial neural networks. Long-term bonds are obligations with a maturity of more than 30 or 40 years. Contrary to conventional bonds with shorter maturity (if they do not include a call option), their price does not converge to face value for several decades, and therefore for a considerable period of time the price development of a long-term bond can be estimated in the same way as in the case of shares and other similar financial instruments. Due to the inverse relationship between bond yields and prices, there is only a small difference between forecasting the two variables. However, most long-term bond investors do not hold purchased bonds to maturity. Instead of following a buy-and-hold approach, active investors sell a purchased bond after its price rises and realize a capital gain. These investors focus particularly on bond prices. Since long-term bonds are financial instruments that are highly sensitive to underlying interest rates, the aim of this study is to investigate the predictability of long-term bond prices not only using their past values, but also using interest rate swaps as a proxy for market interest rates.
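The inverse relationship between yields and prices, and the high interest rate sensitivity of long maturities, can be illustrated with a short sketch. It assumes a plain annual-coupon bullet bond and flat discounting at a single yield; the function names, the 2% coupon and the one percentage point yield shift are hypothetical choices for illustration and are not taken from the examined sample.

```python
def bond_price(face, coupon_rate, yield_rate, years):
    """Present value of an annual-coupon bullet bond (illustrative assumption)."""
    coupon = face * coupon_rate
    # Discount each annual coupon, then the face value repaid at maturity.
    pv_coupons = sum(coupon / (1 + yield_rate) ** t for t in range(1, years + 1))
    pv_face = face / (1 + yield_rate) ** years
    return pv_coupons + pv_face


def price_change_for_yield_rise(years, coupon_rate=0.02, bump=0.01, face=100.0):
    """Relative price change when the yield rises from the coupon rate by `bump`."""
    before = bond_price(face, coupon_rate, coupon_rate, years)
    after = bond_price(face, coupon_rate, coupon_rate + bump, years)
    return after / before - 1.0


if __name__ == "__main__":
    # Compare a short and a very long maturity under the same yield increase.
    for years in (5, 50):
        print(years, round(price_change_for_yield_rise(years), 4))
```

Under these assumptions the same rise in yield lowers the price of the 50-year bond by a much larger fraction than that of the 5-year bond, which is why a proxy for market interest rates is a natural candidate input for a long-term bond price model.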

METHODOLOGY
Artificial neural networks are a computational technique based on the functioning of biological nervous systems, which emulates the learning process in neural cells (neurons). They operate in a parallel framework, hence they are not sensitive to the degradation of some nodes and are able to solve nonlinear or badly defined tasks. Neural networks have been successfully applied in many financial and business tasks, such as bankruptcy analysis, credit scoring, or time series prediction (see Li, 1994; Wong, Bodnovich and Selvi, 1997; Vellido, Lisboa and Vaughan, 1999; Tkáč and Verner, 2016).
An essential part of every neural network is an artificial neuron, which is an information-processing unit that receives input signals from external sources or other neurons and constructs an output signal that is transmitted further. Figure 1 presents the scheme of an artificial neuron. Every connection is characterized by its synaptic weight. Signal $x_j$ at the input of the connection linked to neuron $i$ is multiplied by weight $w_{ij}$. The summing function of the neuron sums all the weighted inputs, and the activation function restricts the range of its output to a limited interval, usually from zero to one or from minus one to one.
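The computation performed by a single neuron can be sketched in a few lines. The sketch assumes a logistic activation for the zero-to-one interval and uses the hyperbolic tangent for the minus-one-to-one interval; the bias term and all function names are illustrative assumptions that do not appear in the description above.

```python
import math


def logistic(v):
    """Logistic sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron_output(inputs, weights, bias=0.0, activation=logistic):
    """Weighted sum of the inputs followed by an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Summing function: each input x_j is multiplied by its weight w_ij.
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: restricts the output to a limited interval.
    return activation(v)
```

Passing `activation=math.tanh` yields an output between minus one and one instead of between zero and one.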

Figure 1 - An Artificial Neuron
The organization of individual neurons is called the network architecture. In feedforward networks, the signal proceeds exclusively in one direction, from inputs to outputs, and is never passed back. In addition to input and output layers, multilayer feedforward networks contain at least one hidden layer with hidden neurons. An increasing number of hidden neurons gives the network the ability to capture complicated patterns, which is particularly useful when solving problems with a higher number of input variables. A fully connected multilayer feedforward network is depicted in Figure 2.

Figure 2 - Multilayer Feedforward Neural Network
The major advantage of artificial neural networks is their ability to extract knowledge from the surrounding environment by repeatedly modifying their connection weights. Every new iteration should increase the network's comprehension of the presented data sample. The way in which a network changes its free parameters based on external impulses is known as the learning algorithm. Under unsupervised learning, no target outputs are presented to the network; if the network captures valuable patterns in the data, it develops a representation of the input itself and creates its own structure. On the other hand, the most frequently applied learning approach for multilayer feedforward networks is learning with a teacher, or supervised learning, where the network is provided with the desired outputs for a given set of inputs. The synaptic weights are consequently modified according to the difference between the actual and the desired output of the network.
Error for the neuron $i$ might be defined as $e_i(n) = d_i(n) - y_i(n)$, where $d_i(n)$ is the desired output and $y_i(n)$ is the actual output of neuron $i$ at iteration $n$. The cost function $E(n) = \frac{1}{2}\sum_i e_i^2(n)$ is minimized by updating connection weights in the following manner: $w_{ij}(n+1) = w_{ij}(n) + \Delta w_{ij}(n)$, where $w_{ij}(n)$ is the connection weight between neuron $i$ and anterior neuron $j$ at iteration $n$. Various algorithms have been introduced to optimize the cost function and modify connection weights, and they differ from each other primarily in their reaction to the error signal and in the adjustment of the weights. The most frequently applied learning algorithms are based on gradient descent principles; however, they utilize only first-order information about the error surface. Moreover, the learning rate and momentum in the gradient descent framework are additional parameters (besides the number of hidden layers, the number of hidden neurons and the choice of activation function) that have to be determined by the researcher, usually by trial and error.
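The role of the learning rate and momentum can be shown on a deliberately simple example. The sketch below assumes a one-dimensional quadratic cost instead of a network error surface, and the parameter values (learning rate 0.1, momentum 0.9) are arbitrary illustrative choices, not the settings used in this study.

```python
def momentum_step(weight, gradient, velocity, learning_rate=0.1, momentum=0.9):
    """One gradient-descent update with a momentum term."""
    # The velocity keeps a decaying memory of earlier updates.
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


def minimise_quadratic(target=3.0, steps=500):
    """Minimise the toy cost E(w) = (w - target)^2, whose gradient is 2 (w - target)."""
    weight, velocity = 0.0, 0.0
    for _ in range(steps):
        gradient = 2.0 * (weight - target)
        weight, velocity = momentum_step(weight, gradient, velocity)
    return weight
```

Both parameters influence how quickly and how smoothly the weight approaches the minimum, which is why they usually have to be tuned by trial and error.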
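Compared to conventional gradient descent techniques, more sophisticated methods use a second-order Taylor series approximation of the error function. According to the Newton method, the optimal update of synaptic weights with regard to the error function $E(\mathbf{w}(n))$ is $\Delta\mathbf{w}(n) = -\mathbf{H}^{-1}(n)\,\mathbf{G}(n)$, where $\mathbf{G}(n)$ is the gradient vector and $\mathbf{H}(n)$ is the Hessian matrix of the error function. Because computing the Hessian and its inverse is expensive, quasi-Newton methods work with a matrix $\mathbf{A}(n)$ that approximates the second-order information, where $\mathbf{A}(n+1)$ is computed recursively utilizing the former value of $\mathbf{A}(n)$, $\Delta\mathbf{w}$ and $\Delta\mathbf{G}$ (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970; Powell, 1975). On the other hand, the algorithm proposed by Levenberg (1944) and Marquardt (1963) updates network weights based on the Jacobian matrix $\mathbf{J}$ of the network errors $\mathbf{e}$ with respect to the weights: $\Delta\mathbf{w} = -[\mathbf{J}^T\mathbf{J} + \mu\mathbf{I}]^{-1}\mathbf{J}^T\mathbf{e}$. If the parameter $\mu$ is zero, the algorithm becomes a Newton method with the approximate Hessian $\mathbf{J}^T\mathbf{J}$, while with increasing $\mu$ it approaches conventional gradient descent with a small learning rate. For more details see Hagan and Menhaj (1994) or Demuth et al. (2014). This algorithm has high computational speed and is suitable for mid-sized networks; therefore, we apply it in our bond price prediction model as a benchmark method.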
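The Levenberg-Marquardt update, commonly written as dw = -(J^T J + mu I)^-1 J^T e with J the Jacobian of the errors e, can be sketched on a toy problem. The example below assumes a two-parameter linear model instead of a neural network so that the 2x2 system can be solved by hand; the initial value of mu, the adjustment factor of 10 and all function names are illustrative assumptions rather than the configuration used in this study.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def residuals(w, xs, ys):
    """Errors e_k = y_k - (w[0] * x_k + w[1]) of a toy linear model."""
    return [y - (w[0] * x + w[1]) for x, y in zip(xs, ys)]


def sse(e):
    """Sum of squared errors."""
    return sum(v * v for v in e)


def lm_fit(xs, ys, w=(0.0, 0.0), mu=0.01, factor=10.0, iterations=20):
    """Levenberg-Marquardt iterations: dw = -(J^T J + mu I)^-1 J^T e."""
    w = list(w)
    for _ in range(iterations):
        e = residuals(w, xs, ys)
        # Jacobian rows are de_k/dw = [-x_k, -1] for this model.
        jac = [[-x, -1.0] for x in xs]
        jtj = [[sum(r[i] * r[j] for r in jac) for j in range(2)] for i in range(2)]
        jte = [sum(r[i] * v for r, v in zip(jac, e)) for i in range(2)]
        damped = [[jtj[i][j] + (mu if i == j else 0.0) for j in range(2)]
                  for i in range(2)]
        step = solve_2x2(damped, [-g for g in jte])
        candidate = [w[0] + step[0], w[1] + step[1]]
        if sse(residuals(candidate, xs, ys)) < sse(e):
            w, mu = candidate, mu / factor  # success: behave more like Newton
        else:
            mu *= factor  # failure: behave more like gradient descent
    return w
```

Decreasing mu after a successful step and increasing it after a failed one is what lets the method move between the two regimes described above.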
Although most academic research focuses on the daily returns of financial assets, we focused on the estimation of daily prices. Compared to shares, bond prices have limited potential for change in the long term, as they converge to face value at maturity (if no credit event occurs). Therefore, it is more important for bond investors and issuers to estimate the price of the bond or its yield to maturity than its daily returns.
First, we identified the theoretical basis and proposed a conceptual background for the model. Subsequently, we focused on obtaining an available sample of data that would sufficiently approximate the long-term bond market. The sample consisted of long-term investment grade corporate bonds issued by global issuers and one government bond with a similar maturity, all denominated in EUR. Bonds from various business sectors were included to remove potential sector-specific effects on the prediction task. After analysing the database, it was essential to create multiple neural network architectures and to optimize the number of hidden layers and the number of neurons in each layer. It was also necessary to determine the percentage distribution of the sample among the training, validation and test sets.
The bond price prediction model was based on a multilayer feedforward network containing two hidden layers with a sigmoid activation function and an output layer with a linear activation function. Every hidden layer had 15 neurons. All parameters were chosen by trial and error to ensure the most satisfying prediction results. The above-mentioned Levenberg-Marquardt algorithm was applied to the learning process. It suitably combines the properties of gradient and Newton methods. In order to compare its performance with another conventional learning technique, the Scaled conjugate gradient algorithm was also included in the research. After the results were evaluated, external variables were added to the model to integrate information about market interest rates and current market volatility.
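The forward pass of such an architecture can be sketched as follows. The sketch assumes the logistic function as the sigmoid (the exact sigmoid variant is not specified above), random initial weights drawn uniformly from -0.5 to 0.5, zero biases and two input values; these choices and all function names are illustrative assumptions, and no training is performed.

```python
import math
import random


def sigmoid(v):
    """Logistic sigmoid used in the hidden layers (assumed variant)."""
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_inputs, n_neurons, rng):
    """Random weights (one row per neuron) and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def layer_forward(inputs, weights, biases, activation):
    """Each neuron applies the activation to its weighted input sum plus bias."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def build_network(n_inputs=2, hidden=(15, 15), seed=0):
    """Two sigmoid hidden layers followed by a single linear output neuron."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden, 1]
    return [init_layer(sizes[i], sizes[i + 1], rng) for i in range(len(sizes) - 1)]


def predict(network, inputs):
    """Forward pass: sigmoid in the hidden layers, identity in the output layer."""
    signal = inputs
    for index, (weights, biases) in enumerate(network):
        is_output = index == len(network) - 1
        activation = (lambda v: v) if is_output else sigmoid
        signal = layer_forward(signal, weights, biases, activation)
    return signal[0]
```

The linear output neuron leaves the prediction unbounded, which suits a regression target such as a price, while the sigmoid hidden layers provide the nonlinearity.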

RESULTS
The aim of this paper is to expand the scope of previous studies on the predictability of bond yields and prices. The predictability of prices was tested on a set of long-term bonds from 5 private issuers. We employed daily price time series observed between 4th January 2016 and 13th January 2021 (1,270 trading days in total). The data for this interval were kindly provided for our research by Deutsche Boerse and represent a relatively smooth trading interval with no significant shocks or unexpected economic events with a global impact.
It is obvious that the given time interval represents a relatively short period for presenting a complex bond market prediction framework. However, with respect to the efficient market hypothesis, trying to create a single and universal model that is able to forecast the development of any financial asset may be rather controversial or unrealistic. From a practical point of view, it may be preferable to set the model parameters on a shorter period of time during which no major economic changes or paradigm shifts occur on the financial markets. This sector is so dynamic and influenced by so many factors that long-term data predictions can lose efficiency and practical usability. For both traders and individual issuers, a period of two to three years can represent a time horizon that allows them to estimate bond market developments and plan their capital decisions. Table 1 summarizes the descriptive statistics of the sample.

Figure 4 - Development of Examined Bond Price Returns
The first forecasting model was a nonlinear autoregressive network with two delays of the price series as input variables. In our experiments, the prices of the first 890 days were used for training, the following 190 days for validation and the last 190 trading days for out-of-sample testing. To assess the predictive capabilities of the nonlinear autoregressive network, we applied the Levenberg-Marquardt algorithm and the Scaled conjugate gradient algorithm.
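The construction of delayed inputs and the chronological split can be sketched as follows. The function names are hypothetical, and the treatment of the first two observations is an assumption: since they have no complete pair of delayed values, the sketch yields 1,268 samples from 1,270 prices, so the final block contains 188 rather than 190 samples.

```python
def make_lagged_samples(prices, delays=2):
    """Pair each price with its `delays` predecessors (most recent first)."""
    inputs, targets = [], []
    for t in range(delays, len(prices)):
        inputs.append([prices[t - k] for k in range(1, delays + 1)])
        targets.append(prices[t])
    return inputs, targets


def chronological_split(samples, n_train, n_val):
    """Split without shuffling so the test block is strictly out of sample."""
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```

Keeping the blocks in time order means that every test observation lies after all training and validation observations, which is what makes the last block a genuine out-of-sample test.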
Network performance was compared using the mean squared error and the coefficient of determination. As seen in Table 2, both learning algorithms achieved excellent results on the observed sample of bond prices and reached a coefficient of determination higher than 95% in the training, validation and testing sets. The learning process stopped after the 8th epoch, when the mean squared error began to grow again. If the network had continued learning, this would have resulted in overfitting and the model would have lost its generalization ability.
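Both performance measures can be computed directly from the actual and predicted prices. The sketch below assumes the common definition of the coefficient of determination as one minus the ratio of the residual to the total sum of squares; the exact definition used to produce the reported values is not stated here, and the function names are illustrative.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Evaluating both functions separately on the training, validation and testing blocks gives one pair of values per set, in the same layout as a results table.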
In order to verify the predictive power of the proposed model, we changed the nonlinear autoregressive network framework to a nonlinear autoregressive network with external inputs. As two external variables, 50-year interest rate swaps denominated in EUR and the volatility index VIX were used. Interest rate swaps are a popular choice among researchers for representing the risk-free market interest rate in financial time series and are often used as a benchmark rate to which bonds are extremely sensitive. Despite the fact that the maturity of all examined bonds was longer than 50 years, this approximation can be accepted since longer-term swaps are usually not traded. Moreover, the EUR-denominated interest rate swap curve was concave over the explored period, with additional yield increments declining significantly with longer time to maturity. Figure 7 depicts the development of the volatility index VIX in the given period. Results of the nonlinear autoregressive network with external inputs are summarized in Table 3. Experiments showed that both learning algorithms again achieved satisfying outcomes on the examined sample of bond prices. The best result was obtained on the Merck corporate bond, where the Levenberg-Marquardt algorithm reached a coefficient of determination of 99.81% on the validation data set. Based on the obtained results, we can state that the implementation of independent variables into the autoregressive neural network environment had a positive impact on the prediction ability of the model. If we look at individual bonds, in certain cases we can notice a marked improvement in terms of a reduction of the mean squared error and an increase of the coefficient of determination.
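The input layout of a network with external inputs can be sketched in the same way as for the purely autoregressive case. The sketch assumes that the swap rate and VIX series are delayed by the same two steps as the price series, which is not specified above; the function and argument names are hypothetical.

```python
def make_narx_samples(prices, swap_rates, vix, delays=2):
    """Inputs: lagged prices followed by lagged swap rates and lagged VIX values."""
    if not (len(prices) == len(swap_rates) == len(vix)):
        raise ValueError("all series must cover the same trading days")
    inputs, targets = [], []
    for t in range(delays, len(prices)):
        row = []
        # For each series, append its values at t-1, t-2, ..., t-delays.
        for series in (prices, swap_rates, vix):
            row.extend(series[t - k] for k in range(1, delays + 1))
        inputs.append(row)
        targets.append(prices[t])
    return inputs, targets
```

With two delays and two external variables, each sample has six input values instead of two, while the target remains the current bond price.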
The given values indicate that the integration of external variables in the form of 50-year interest rate swaps denominated in EUR and the volatility index VIX improved the predictive ability of the standard autoregressive neural network.

CONCLUSION
Bond prices immediately reflect extensive interactions of market participants, the development of macroeconomic indicators and a set of global central bank policies. That makes the prediction of future bond prices demanding and difficult. In this paper, we propose a long-term bond price forecasting system based on a nonlinear autoregressive neural network. Our research demonstrated that neural networks are effective methods for the prediction of time series owing to their ability to capture nonlinearity and hidden patterns in the data. Both the Levenberg-Marquardt and the Scaled conjugate gradient learning algorithms achieved excellent results on the observed sample of bond prices and reached a coefficient of determination higher than 95% in all sets of data. However, the Levenberg-Marquardt algorithm outperformed Scaled conjugate gradient learning on almost all analysed bond price series. To evaluate the performance of the examined model, we moreover proposed a nonlinear autoregressive network with external inputs, with 50-year interest rate swaps denominated in EUR and the volatility index VIX as external variables. Both learning algorithms again achieved satisfying outcomes, and we might argue that the implementation of independent variables into the autoregressive neural network environment had a positive impact on the prediction ability of the model.
Regarding the observed outcomes of the examined models in predicting bond prices, we might conclude that neural networks gave accurate results in a very short time without overfitting the data. However, we would like to continue our research by integrating the neural network framework with various machine learning and artificial intelligence methods, which may yield even better bond price predictions. Despite the selection of long-term bonds from various sectors, the main disadvantage of the presented work is the relatively small sample of bonds. This can be mostly attributed to their low number, as the popularity of bonds with a maturity of more than 50 years has only slowly increased in recent years as a result of extremely low interest rate policies. However, as a result of the current pandemic, no significant turnaround in interest rates can be expected, and therefore the number of these securities will certainly increase. Although it has been said that bonds with such a long maturity behave to a large extent like shares and that their price may fluctuate significantly, in further research we would also like to focus on determining the point after which price fluctuations decrease and the price gradually begins to converge to nominal value. It is questionable whether this turning point depends only on microeconomic variables such as the rating or coupon of the bond, whether it is affected by macroeconomic variables such as volatility, or whether it is a fixed horizon before maturity. Combined with the application of more advanced methods of artificial intelligence, knowledge of this point could help to improve the quality of bond behaviour prediction and thus the prosperity of the global economy.