Financial analysts hope their forecasts about the performance of companies and markets can help families build nest eggs and firms make fortunes. Nowadays, advances in data science have been critical in parsing huge volumes of data in extraordinarily short periods of time, supporting analysts’ ability to make financial predictions and develop complex strategies for clients. In a new book, Machine Learning and Data Sciences for Financial Markets: A Guide to Contemporary Practices, dozens of authorities in the field present their perspectives on how the financial sector can take advantage of cutting-edge tools, systems, and practices. 

Co-edited by Columbia’s Agostino Capponi (Associate Professor of Industrial Engineering and Operations Research; Director of the Center for Digital Finance and Technologies; and Data Science Institute member), the book is now available to order.

Machine Learning and Data Sciences for Financial Markets spans many topics of the moment: from ways that robo-advisors can help clients make investment decisions based on the market, to the role of machine learning in “nowcasting,” the popular term for producing a more accurate estimate of the current state of the economy using large sets of variables and their linkage to macroeconomic indicators. It also examines how machine learning can help design better hedging strategies and evaluate risks while solving difficult equations that can arise in financial decision making and pricing.

Capponi spoke with DSI about his book, what makes it stand out from prior texts, and how it can serve both students and seasoned practitioners.

What sets your new book apart from previous works?

The book is pretty unique in how it broadly covers the entire spectrum of topics in finance and integrates them with machine learning algorithms. It’s a collection of chapters contributed by more than 60 leading experts in the field, written for researchers who want to work in the area and for students who want to take what they are learning in finance (such as asset pricing, risk management, and credit derivatives) and boost this knowledge with the latest machine learning techniques.

Computing has found use in finance for decades. How does your book build on these foundations?

A lot of knowledge has been developed with mathematical models in finance over the last 40 years, in areas like portfolio optimization and financial derivatives. We show how this knowledge can actually be integrated with modern machine learning techniques to do better.

We now have access to impressively fast technologies that we can apply to problems in finance that are extremely complex, for example pricing derivatives that depend on a large number of stocks or bonds. If we want to quote the price of an option contract written on 50 stocks, it becomes a 50-dimensional problem that we have to solve.

There was no hope of solving these problems efficiently up to 10 or so years ago. With recent advances in machine learning, such as deep neural networks (neural networks with a large number of layers), we can solve these problems much faster. And these problems can easily reach dimensions far higher than 50: a thousand or more.
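
To give a rough sense of why a 50-stock option is a 50-dimensional problem, here is a minimal Python sketch. It is an illustration only, not a method from the book: it counts the nodes a full pricing grid would need, then prices a basket call with plain Monte Carlo, a classical baseline rather than the deep learning approach described above. The function names, the assumption of independent geometric Brownian motions, and every parameter value are hypothetical.

```python
import math
import random


def grid_nodes(points_per_axis: int, n_assets: int) -> int:
    """Number of nodes in a full grid with one price axis per asset."""
    return points_per_axis ** n_assets


def basket_call_mc(n_assets=50, spot=100.0, strike=100.0, rate=0.02,
                   vol=0.2, maturity=1.0, n_paths=20_000, seed=0):
    """Monte Carlo price of a European call on an equally weighted basket.

    Assumption: every stock follows an independent geometric Brownian
    motion with the same illustrative parameters.
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(n_paths):
        # Simulate one terminal price per stock, then average them.
        basket = sum(
            spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
            for _ in range(n_assets)
        ) / n_assets
        total_payoff += max(basket - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total_payoff / n_paths
```

With only 10 grid points per stock, `grid_nodes(10, 50)` is 10 to the power 50, which is why grid-based solvers break down at this size. Simulation sidesteps the grid, but it has to be rerun for each new contract or set of market inputs.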

What other machine learning strategies will readers find useful?

This book is really about a broad range of machine learning techniques. Neural networks are especially useful as computational tools for partial differential equations with high dimensions. We also look at techniques such as random forests or boosted decision trees to, for example, see how robo-advising can be used in order to detect investor preferences and to help make recommendations to investors on matters such as portfolio stock allocations. 

Another machine learning approach explored in the book is collaborative filtering, an unsupervised learning technique that can, for instance, determine which set of corporate bonds is most suitable for a given client, given their risk profile.
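
As a rough illustration of the idea, the following Python sketch shows one simple form of collaborative filtering: it scores bonds a client does not yet hold by how similar that client's holdings are to those of other clients. This is a toy example under stated assumptions, not the approach taken in the book. The client names, bond names, holdings data, and the `recommend` function are all hypothetical, and a real system would also account for each client's risk profile.

```python
import math

# Hypothetical holdings: 1 means the client has bought the bond before.
holdings = {
    "client_a": {"bond_1": 1, "bond_2": 1, "bond_3": 0, "bond_4": 0},
    "client_b": {"bond_1": 1, "bond_2": 1, "bond_3": 1, "bond_4": 0},
    "client_c": {"bond_1": 0, "bond_2": 0, "bond_3": 1, "bond_4": 1},
}


def cosine(u, v):
    """Cosine similarity between two holdings rows with the same keys."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(target, holdings):
    """Rank bonds the target does not hold by similarity-weighted votes."""
    scores = {}
    for other, row in holdings.items():
        if other == target:
            continue
        similarity = cosine(holdings[target], row)
        for bond, held in row.items():
            # Only score bonds that are new to the target client.
            if held and not holdings[target][bond]:
                scores[bond] = scores.get(bond, 0.0) + similarity
    return sorted(scores, key=scores.get, reverse=True)
```

Here `recommend("client_a", holdings)` ranks `bond_3` first, because `client_b` holds it and has holdings very similar to those of `client_a`.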

Many challenges remain when it comes to applying data science and machine learning to finance, which means there are significant opportunities to become a pioneer in the field. What areas might you focus on as key to future research?

Many chapters of the book highlight a lot of important research directions. For example, one goal is understanding how to construct an integrated model of wealth management that takes into account life-cycle investing models and integrates them with machine learning methods. Such methods can learn client preferences, client goals, client expectations, and make better investment decisions for households to account for their full balance sheets and future cash flows.

Other challenges include applying natural language processing (NLP) tools to use cases like analyzing documents or balance sheet reports, or analyzing households or firms to better assess their credit quality or credit scores. NLP would also be useful for incorporating news into analyses of stock performance: we can, for instance, try to leverage what is being said in the news to predict the performance of a firm over the next quarter. We can also imagine examining credit card statements to see which products are bought most, and then we might be able to understand which firms can expect the most growth.

How is this book visualizing what’s next for the financial sector?

There has been a lot of evolution in machine learning over the past two decades, and a lot of advances in finance in terms of shifting from the buy side to investment and risk management. Despite all this new growth, there’s never been a strong integration between machine learning and financial practices. This book really aims to bridge this gap, to serve as a common handbook on the latest knowledge and trends and the most important research directions over the next decade.