## Abstract

This thesis concerns probabilistic learning theory and stochastic optimization and investigates applications to a variety of problems arising in finance.

In many sequential decision tasks, the consequences of an action emerge at a multitude of times after the action is taken. A key problem is to find good strategies for selecting actions based on both their short and long term consequences. We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov decision processes with finite state and action spaces, with a discounted reward criterion. The algorithm is of the gradient descent type, searching the space of stationary randomized policies and using certain simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates for enhanced performance. We apply our algorithm to a mortgage refinancing problem and find that it obtains the optimal refinancing strategies in a computationally efficient manner.

The problem of identifying pairs of similar time series is an important one with several applications in finance, especially to directional trading, where traders try to spot arbitrage opportunities. We use a variant of the "Optimal Thermal Causal Path" method (obtained by adding a curvature term and by using an approximation technique to increase the efficiency) to determine the lead-lag structure between a given pair of time series. We apply the method to various market sectors of NYSE data and extract highly correlated pairs of time series.

Because Genetic Programming (GP) is known for its ability to detect patterns such as the conditional mean and conditional variance of a time series, it is potentially well-suited to volatility forecasting. We introduce a technique for forecasting 5-day annualized volatility in exchange rates.
The technique employs a series of standard methods (such as MA, EWMA, GARCH and its variants) alongside Genetic Programming forecasting methods, dynamically opting for the most appropriate technique at a given time, determined through out-of-sample tests. A particular challenge with volatility forecasting using GP is that, during learning, the GP is presented with training data generated by a noisy Markovian process, not something that is modelled in the standard probabilistic learning frameworks. We analyse, in a probabilistic model of learning, how much such training data should be presented to the GP in the learning phase for the learning to be successful.
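
The idea of dynamically choosing a forecasting method through out-of-sample tests can be illustrated with a minimal sketch. It is not the procedure used in the thesis: it covers only MA and EWMA forecasters (GARCH and GP are omitted), and the function names, the 252-day annualization factor, the 20-day window, the decay value 0.94, and the squared-error selection rule are all assumptions made for illustration.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor


def realized_vol(returns):
    """Annualized volatility of daily returns, assuming a zero mean."""
    return math.sqrt(TRADING_DAYS * sum(r * r for r in returns) / len(returns))


def ma_forecast(history, window=20):
    """Moving-average forecast: realized volatility of the last `window` returns."""
    return realized_vol(history[-window:])


def ewma_forecast(history, lam=0.94):
    """EWMA forecast: exponentially weighted variance with decay `lam`."""
    var = history[0] ** 2
    for r in history[1:]:
        var = lam * var + (1 - lam) * r * r
    return math.sqrt(TRADING_DAYS * var)


def select_and_forecast(returns, forecasters, horizon=5, n_tests=10):
    """Score each forecaster on the last `n_tests` non-overlapping
    `horizon`-day blocks (out-of-sample squared error), then use the
    best-scoring one to forecast the next block."""
    errors = {name: 0.0 for name in forecasters}
    for k in range(n_tests, 0, -1):
        split = len(returns) - k * horizon
        history = returns[:split]                 # data available at forecast time
        target = returns[split:split + horizon]   # held-out block
        actual = realized_vol(target)
        for name, forecast in forecasters.items():
            errors[name] += (forecast(history) - actual) ** 2
    best = min(errors, key=errors.get)
    return best, forecasters[best](returns)


# Hypothetical usage on synthetic daily returns of constant magnitude.
returns = [0.01, -0.01] * 100
forecasters = {"MA": ma_forecast, "EWMA": ewma_forecast}
best_name, vol_forecast = select_and_forecast(returns, forecasters)
```

Each forecaster sees only data that precedes the block it is scored on, so the selection is based on genuinely out-of-sample error. Any additional forecaster with the same call signature could be added to the `forecasters` dictionary.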