Stock market prediction

Stock market prediction is the act of trying to determine the future value of a company stock or other financial instrument traded on an exchange. The successful prediction of a stock's future price could yield significant profit. The efficient market hypothesis suggests that stock prices reflect all currently available information, so that any price changes not based on newly revealed information are inherently unpredictable. Others disagree, and proponents of this opposing view employ a variety of methods and technologies that purportedly allow them to gain information about future prices.

The efficient market hypothesis and the random walk
The efficient market hypothesis posits that stock prices are a function of information and rational expectations, and that newly revealed information about a company's prospects is almost immediately reflected in the current stock price. This would imply that all publicly known information about a company, which obviously includes its price history, would already be reflected in the current price of the stock. Accordingly, changes in the stock price reflect release of new information, changes in the market generally, or random movements around the value that reflects the existing information set.

Burton Malkiel, in his influential 1973 work A Random Walk Down Wall Street, claimed that stock prices could therefore not be accurately predicted by looking at price history. As a result, Malkiel argued, stock prices are best described by a statistical process called a "random walk", meaning that each day's deviations from the central value are random and unpredictable. This led Malkiel to conclude that paying financial services professionals to predict the market actually hurt, rather than helped, net portfolio return. A number of empirical tests support the notion that the theory applies generally, as most portfolios managed by professional stock predictors do not outperform the market average return after accounting for the managers' fees.
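
As a rough illustration of the random walk idea, the sketch below simulates a price path in which each day's change is an independent random draw, so past prices carry no information about the next step. The function name, the Gaussian step distribution and the parameter values are assumptions made for illustration only; they are not taken from Malkiel's work.

```python
import random

def simulate_random_walk(start_price, n_days, daily_sigma, seed=None):
    """Return a list of prices where each daily change is an independent Gaussian draw."""
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(n_days):
        step = rng.gauss(0.0, daily_sigma)  # cannot be predicted from earlier prices
        prices.append(prices[-1] + step)
    return prices

# Hypothetical example: one trading year starting at a price of 100.
path = simulate_random_walk(start_price=100.0, n_days=250, daily_sigma=1.0, seed=42)
print(len(path), path[0])
```

Because the steps are independent with zero mean, the best forecast of tomorrow's price under this model is simply today's price.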

Intrinsic value
Intrinsic value (true value) is the perceived or calculated value of a company, including tangible and intangible factors, using fundamental analysis. It is also frequently called fundamental value. It is compared with the company's market value to determine whether the company is undervalued on the stock market. When calculating it, the investor looks at both the qualitative and quantitative aspects of the business. It is ordinarily calculated by summing the discounted future income generated by the asset to obtain the present value.
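
The discounting step can be sketched as follows. This is a minimal illustration that assumes a single constant discount rate and income arriving at the end of each period; the function name and all figures are hypothetical.

```python
def present_value(cash_flows, discount_rate):
    """Sum of future cash flows discounted back to today.

    cash_flows[0] is assumed to arrive one period from now.
    """
    return sum(cf / (1.0 + discount_rate) ** t
               for t, cf in enumerate(cash_flows, start=1))

# Hypothetical asset paying 100 per year for three years, discounted at 10%.
intrinsic_value = present_value([100.0, 100.0, 100.0], 0.10)
market_value = 230.0  # hypothetical market price
undervalued = market_value < intrinsic_value
print(round(intrinsic_value, 2), undervalued)
```

In practice the result is very sensitive to the assumed discount rate and to the forecast income, which is one reason different analysts arrive at different intrinsic values.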

Prediction methods
Prediction methodologies fall into three broad categories which can (and often do) overlap. They are fundamental analysis, technical analysis (charting) and machine learning.

Fundamental analysis
Fundamental analysts are concerned with the company that underlies the stock itself. They evaluate a company's past performance as well as the credibility of its accounts. Many performance ratios, such as the P/E ratio, have been devised to help the fundamental analyst assess a stock. Warren Buffett is perhaps the most famous of all fundamental analysts. He uses the overall market capitalization-to-GDP ratio to indicate the relative value of the stock market in general; hence this ratio has become known as the "Buffett indicator".
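
The two ratios mentioned above are simple quotients, as the following sketch shows. The function names and input figures are hypothetical, and rejecting non-positive earnings is a design choice of this sketch.

```python
def price_to_earnings(share_price, earnings_per_share):
    """P/E ratio: share price divided by earnings per share."""
    if earnings_per_share <= 0:
        # Design choice of this sketch: treat P/E as undefined for non-positive earnings.
        raise ValueError("P/E is not meaningful for non-positive earnings")
    return share_price / earnings_per_share

def buffett_indicator(total_market_cap, gdp):
    """Total stock market capitalization divided by GDP (same currency and period)."""
    return total_market_cap / gdp

# Hypothetical figures.
print(price_to_earnings(50.0, 2.5))    # share trades at 20 times earnings
print(buffett_indicator(30e12, 20e12)) # market cap is 1.5 times GDP
```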

Fundamental analysis in the stock market aims to find the true value of a stock, which can then be compared with the price at which it trades on stock markets to determine whether the stock is undervalued. The true value can be found by various methods that share essentially the same principle: a company is worth all of its future profits added together, with those future profits discounted to their present value. This principle fits well with the theory that a business is all about profits and nothing else.

In contrast to technical analysis, fundamental analysis is generally regarded as a long-term strategy.

Fundamental analysis is built on the belief that human society needs capital to make progress and that a company that operates well should be rewarded with additional capital, resulting in a rise in its stock price. Fundamental analysis is widely used by fund managers because it is regarded as reasonable, objective and based on publicly available information such as financial statements.

Fundamental analysis can also have a meaning beyond bottom-up company analysis: it can refer to top-down analysis, which starts with the global economy, followed by country analysis and sector analysis, and ends with company-level analysis.

Technical analysis
Technical analysis is a methodology for analysing and forecasting the direction of prices through the study of past market data, primarily price and volume. The efficacy of technical analysis is disputed by the efficient-market hypothesis, which states that stock market prices are essentially unpredictable, and research on whether technical analysis offers any benefit has produced mixed results.

Technical analysts, or chartists, are usually less concerned with a company's fundamentals. They seek to determine likely future stock price movements largely from trends in past prices (a form of time series analysis). Numerous patterns are employed, such as the head and shoulders or cup and saucer. Alongside the patterns, techniques such as the exponential moving average (EMA), oscillators, support and resistance levels, and momentum and volume indicators are used. Candlestick patterns, believed to have been first developed by Japanese rice merchants, are nowadays widely used by technical analysts. Technical analysis is used for short-term strategies rather than long-term ones, and it is therefore far more prevalent in commodities and forex markets, where traders focus on short-term price movements. The analysis rests on three basic assumptions: that everything significant about a company is already priced into the stock, that prices move in trends, and that the history of prices tends to repeat itself, mainly because of market psychology.
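
As an example of one such technique, the sketch below computes an exponential moving average. It assumes the common smoothing factor of 2 / (span + 1) and seeds the average with the first price; other seeding conventions exist, and the function name and sample prices are hypothetical.

```python
def exponential_moving_average(prices, span):
    """EMA with smoothing factor alpha = 2 / (span + 1), seeded with the first price."""
    if not prices:
        return []
    alpha = 2.0 / (span + 1)
    ema = [float(prices[0])]
    for price in prices[1:]:
        # New value = weighted blend of the latest price and the previous EMA.
        ema.append(alpha * price + (1.0 - alpha) * ema[-1])
    return ema

closes = [1.0, 2.0, 3.0]  # hypothetical closing prices
print(exponential_moving_average(closes, span=3))
```

A smaller span gives more weight to recent prices, so the average reacts faster but is also noisier.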

Machine learning
With the advent of the digital computer, stock market prediction has since moved into the technological realm. Several research papers have been published with implementations of machine learning techniques to predict stock markets including, but not limited to, artificial neural networks (ANNs), random forests and supervised statistical classification.

A common form of ANN in use for stock market prediction is the feedforward network, which uses the backward propagation of errors algorithm to update the network weights. These networks are commonly referred to as backpropagation networks. Another form of ANN considered more appropriate for stock prediction is the recurrent neural network (RNN) or the time delay neural network (TDNN). Examples of RNN and TDNN are the Elman, Jordan, and Elman-Jordan networks.
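
To make the weight update concrete, the following is a minimal sketch of a feedforward network with one hidden layer trained by backpropagation. The architecture (tanh hidden units, a linear output, squared error, per-sample updates), the hyperparameters and the toy data are all assumptions of this sketch and are not drawn from any particular study.

```python
import math
import random

def train_feedforward(samples, targets, n_hidden=3, lr=0.1, epochs=200, seed=0):
    """Train a one-hidden-layer network by backpropagation; return (predict, losses)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Each weight list ends with a bias term.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
                  for ws in w_hidden]
        output = sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1]
        return hidden, output

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(samples, targets):
            hidden, output = forward(x)
            error = output - y
            total += 0.5 * error ** 2
            # Backward pass: push the output error back through the output
            # weights (before they are updated) and the tanh derivative.
            hidden_deltas = [error * w_out[j] * (1.0 - hidden[j] ** 2)
                             for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] -= lr * error * hidden[j]
            w_out[-1] -= lr * error
            for j in range(n_hidden):
                for i in range(n_in):
                    w_hidden[j][i] -= lr * hidden_deltas[j] * x[i]
                w_hidden[j][-1] -= lr * hidden_deltas[j]
        losses.append(total / len(samples))  # mean loss seen during this epoch

    return (lambda x: forward(x)[1]), losses

# Toy data: two inputs per sample and a target that is a simple function of them.
samples = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
targets = [0.5 * a - 0.25 * b for a, b in samples]
predict, losses = train_feedforward(samples, targets)
print(losses[0], losses[-1])
```

In a forecasting setting the inputs would typically be lagged market data rather than these toy values, and the network would be evaluated on data it was not trained on.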

For stock prediction with ANNs, there are usually two approaches taken for forecasting different time horizons: independent and joint. The independent approach employs a single ANN for each time horizon, for example, 1-day, 2-day, or 5-day. The advantage of this approach is that the network's forecasting error for one horizon does not affect the error for another horizon, since each time horizon is typically a unique problem. The joint approach, however, incorporates multiple time horizons together so that they are determined simultaneously. In this approach, forecasting error for one time horizon may propagate to another horizon, which can decrease performance. A joint model also requires more parameters, which increases the risk of overfitting.
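
The difference between the two approaches shows up in how the training targets are laid out. The sketch below builds one target series per horizon for the independent approach and one vector-valued target per date for the joint approach; the function names and prices are hypothetical, and simple returns are assumed as the forecast quantity.

```python
def horizon_return(prices, t, h):
    """Simple return from day t to day t + h."""
    return prices[t + h] / prices[t] - 1.0

def independent_targets(prices, horizons):
    """One target series per horizon, each to be fitted by its own model."""
    return {h: [horizon_return(prices, t, h) for t in range(len(prices) - h)]
            for h in horizons}

def joint_targets(prices, horizons):
    """One vector of targets per date, to be fitted by a single multi-output model."""
    last = len(prices) - max(horizons)
    return [[horizon_return(prices, t, h) for h in horizons] for t in range(last)]

prices = [100.0, 200.0, 400.0, 800.0]  # hypothetical daily prices
print(independent_targets(prices, (1, 2)))
print(joint_targets(prices, (1, 2)))
```

Note that the joint layout can only use dates for which every horizon is observable, so it has as many rows as the longest horizon allows.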

Of late, the majority of academic research groups studying ANNs for stock forecasting seem to be using ensembles of independent ANNs more frequently, with greater success. In such an ensemble, one network would use lagged lows and time lags to predict future lows, while another network would use lagged highs to predict future highs. The predicted lows and highs are then used to form stop prices for buying or selling. Outputs from the individual "low" and "high" networks can also be input into a final network that would also incorporate volume, intermarket data or statistical summaries of prices, leading to a final ensemble output that would trigger buying, selling, or market directional change.

Deep learning methods have been used to some extent. The Gated Three-Tower Transformer (GT3) is a transformer-based model designed to integrate numerical market data with textual information from social sources to enhance the accuracy of stock market predictions.

Since NNs require training and can have a large parameter space, it is useful to optimize the network for predictive ability. A major finding with ANNs and stock prediction is that a classification approach (vs. function approximation) using outputs in the form of buy (y=+1) and sell (y=-1) results in better predictive reliability than a quantitative output such as low or high price.

Implementations using random forests and supervised statistical classification follow the same approach of predicting stock movement as a binary classification problem. Under this formulation, the sign of a future return is the label of the data, with forecasted returns being split between negative and non-negative, and the observable features used to feed the classification model can be lagged returns, the lagged sign of returns or any other lagged explanatory economic data.
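
A minimal sketch of this formulation is given below: each label is the sign of a return, and the features are the returns that immediately precede it. The function names, the number of lags and the sample returns are hypothetical.

```python
def returns_from_prices(prices):
    """Simple one-period returns computed from a price series."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]

def make_classification_data(returns, n_lags):
    """Pair each return's sign (+1 non-negative, -1 negative) with the n_lags returns before it."""
    features, labels = [], []
    for t in range(n_lags, len(returns)):
        features.append(returns[t - n_lags:t])      # only information known before t
        labels.append(1 if returns[t] >= 0 else -1)
    return features, labels

daily_returns = [0.01, -0.02, 0.03, -0.01, 0.0]  # hypothetical returns
X, y = make_classification_data(daily_returns, n_lags=2)
print(X)
print(y)
```

Any classifier can then be fitted to the resulting feature rows and labels; other lagged explanatory data could be appended to each feature row in the same way.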

The metric used to evaluate the quality of the classification model can be either the accuracy of the prediction (defined as the number of times that the classifier predicted the correct sign divided by the total number of predictions made) or the total return of a trading strategy that bought when the classifier predicted a positive sign and sold when the classifier predicted a negative sign. As is standard in statistical classification problems, it is important to split the available data into training and test samples and to evaluate the model only on the test sample, since out-of-sample evidence is generally considered more trustworthy than evidence based on in-sample performance, which can be more sensitive to outliers and data mining. Out-of-sample forecasts also better reflect the information available to the forecaster in "real time".
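
Both evaluation measures, together with a chronological train/test split, can be sketched as follows. The sketch assumes that "selling" means taking a short position of the same size and that returns compound from period to period; these assumptions, the function names and the sample figures are illustrative only.

```python
def chronological_split(items, train_fraction=0.8):
    """Split a time-ordered sequence so that the test sample lies entirely after the training sample."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

def sign_accuracy(predicted_signs, realized_returns):
    """Share of predictions whose sign matches the realized return (zero counts as +1)."""
    hits = sum(1 for p, r in zip(predicted_signs, realized_returns)
               if p == (1 if r >= 0 else -1))
    return hits / len(predicted_signs)

def strategy_total_return(predicted_signs, realized_returns):
    """Compound return of going long on +1 and short on -1 (an assumption of this sketch)."""
    wealth = 1.0
    for p, r in zip(predicted_signs, realized_returns):
        wealth *= 1.0 + p * r
    return wealth - 1.0

# Hypothetical test-sample predictions and the returns that followed.
predictions = [1, -1, 1, 1]
realized = [0.1, -0.1, -0.5, 0.0]
print(sign_accuracy(predictions, realized))
print(strategy_total_return(predictions, realized))
```

The two measures can disagree: in this example three of four signs are correct, yet the strategy loses money because the single wrong call coincides with the largest move.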

Data sources for market prediction
Tobias Preis et al. introduced a method to identify online precursors for stock market moves, using trading strategies based on search volume data provided by Google Trends. Their analysis of Google search volume for 98 terms of varying financial relevance, published in Scientific Reports, suggests that increases in search volume for financially relevant search terms tend to precede large losses in financial markets. Out of these terms, three were significant at the 5% level (|z| > 1.96). The best term in the negative direction was "debt", followed by "color".

In a study published in Scientific Reports in 2013, Helen Susannah Moat, Tobias Preis and colleagues demonstrated a link between changes in the number of views of English Wikipedia articles relating to financial topics and subsequent large stock market moves.

The use of text mining together with machine learning algorithms has received more attention in recent years, with textual content from the Internet used as input to predict price changes in stocks and other financial markets.

The collective mood of Twitter messages has been linked to stock market performance. The study, however, has been criticized for its methodology.

The activity in stock message boards has been mined in order to predict asset returns. Company headlines from Yahoo! Finance and Google Finance have been used as a news feed in a text mining process to forecast the price movements of stocks in the Dow Jones Industrial Average.