Yves Hilpisch Python for Algorithmic Trading From Idea to Cloud Deployment
Boston, Farnham, Sebastopol, Tokyo, Beijing
978-1-492-05335-4
[LSI]

Python for Algorithmic Trading
by Yves Hilpisch

Copyright © 2021 Yves Hilpisch. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Michelle Smith
Development Editor: Michele Cronin
Production Editor: Daniel Elfanbaum
Copyeditor: Piper Editorial LLC
Proofreader: nSight, Inc.
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Jose Marzan
Illustrator: Kate Dullea

November 2020: First Edition

Revision History for the First Edition
2020-11-11: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781492053354 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Python for Algorithmic Trading, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author, and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This book is not intended as financial advice. Please consult a qualified professional if you require financial advice.
Table of Contents

Preface

1. Python and Algorithmic Trading
    Python for Finance
    Python Versus Pseudo-Code
    NumPy and Vectorization
    pandas and the DataFrame Class
    Algorithmic Trading
    Python for Algorithmic Trading
    Focus and Prerequisites
    Trading Strategies
    Simple Moving Averages
    Momentum
    Mean Reversion
    Machine and Deep Learning
    Conclusions
    References and Further Resources

2. Python Infrastructure
    Conda as a Package Manager
    Installing Miniconda
    Basic Operations with Conda
    Conda as a Virtual Environment Manager
    Using Docker Containers
    Docker Images and Containers
    Building a Ubuntu and Python Docker Image
    Using Cloud Instances
    RSA Public and Private Keys
    Jupyter Notebook Configuration File
    Installation Script for Python and Jupyter Lab
    Script to Orchestrate the Droplet Set Up
    Conclusions
    References and Further Resources

3. Working with Financial Data
    Reading Financial Data From Different Sources
    The Data Set
    Reading from a CSV File with Python
    Reading from a CSV File with pandas
    Exporting to Excel and JSON
    Reading from Excel and JSON
    Working with Open Data Sources
    Eikon Data API
    Retrieving Historical Structured Data
    Retrieving Historical Unstructured Data
    Storing Financial Data Efficiently
    Storing DataFrame Objects
    Using TsTables
    Storing Data with SQLite3
    Conclusions
    References and Further Resources
    Python Scripts

4. Mastering Vectorized Backtesting
    Making Use of Vectorization
    Vectorization with NumPy
    Vectorization with pandas
    Strategies Based on Simple Moving Averages
    Getting into the Basics
    Generalizing the Approach
    Strategies Based on Momentum
    Getting into the Basics
    Generalizing the Approach
    Strategies Based on Mean Reversion
    Getting into the Basics
    Generalizing the Approach
    Data Snooping and Overfitting
    Conclusions
    References and Further Resources
    Python Scripts
    SMA Backtesting Class
    Momentum Backtesting Class
    Mean Reversion Backtesting Class

5. Predicting Market Movements with Machine Learning
    Using Linear Regression for Market Movement Prediction
    A Quick Review of Linear Regression
    The Basic Idea for Price Prediction
    Predicting Index Levels
    Predicting Future Returns
    Predicting Future Market Direction
    Vectorized Backtesting of Regression-Based Strategy
    Generalizing the Approach
    Using Machine Learning for Market Movement Prediction
    Linear Regression with scikit-learn
    A Simple Classification Problem
    Using Logistic Regression to Predict Market Direction
    Generalizing the Approach
    Using Deep Learning for Market Movement Prediction
    The Simple Classification Problem Revisited
    Using Deep Neural Networks to Predict Market Direction
    Adding Different Types of Features
    Conclusions
    References and Further Resources
    Python Scripts
    Linear Regression Backtesting Class
    Classification Algorithm Backtesting Class

6. Building Classes for Event-Based Backtesting
    Backtesting Base Class
    Long-Only Backtesting Class
    Long-Short Backtesting Class
    Conclusions
    References and Further Resources
    Python Scripts
    Backtesting Base Class
    Long-Only Backtesting Class
    Long-Short Backtesting Class

7. Working with Real-Time Data and Sockets
    Running a Simple Tick Data Server
    Connecting a Simple Tick Data Client
    Signal Generation in Real Time
    Visualizing Streaming Data with Plotly
    The Basics
    Three Real-Time Streams
    Three Sub-Plots for Three Streams
    Streaming Data as Bars
    Conclusions
    References and Further Resources
    Python Scripts
    Sample Tick Data Server
    Tick Data Client
    Momentum Online Algorithm
    Sample Data Server for Bar Plot

8. CFD Trading with Oanda
    Setting Up an Account
    The Oanda API
    Retrieving Historical Data
    Looking Up Instruments Available for Trading
    Backtesting a Momentum Strategy on Minute Bars
    Factoring In Leverage and Margin
    Working with Streaming Data
    Placing Market Orders
    Implementing Trading Strategies in Real Time
    Retrieving Account Information
    Conclusions
    References and Further Resources
    Python Script

9. FX Trading with FXCM
    Getting Started
    Retrieving Data
    Retrieving Tick Data
    Retrieving Candles Data
    Working with the API
    Retrieving Historical Data
    Retrieving Streaming Data
    Placing Orders
    Account Information
    Conclusions
    References and Further Resources

10. Automating Trading Operations
    Capital Management
    Kelly Criterion in Binomial Setting
    Kelly Criterion for Stocks and Indices
    ML-Based Trading Strategy
    Vectorized Backtesting
    Optimal Leverage
    Risk Analysis
    Persisting the Model Object
    Online Algorithm
    Infrastructure and Deployment
    Logging and Monitoring
    Visual Step-by-Step Overview
    Configuring Oanda Account
    Setting Up the Hardware
    Setting Up the Python Environment
    Uploading the Code
    Running the Code
    Real-Time Monitoring
    Conclusions
    References and Further Resources
    Python Script
    Automated Trading Strategy
    Strategy Monitoring

Appendix. Python, NumPy, matplotlib, pandas

Index
Preface

    Dataism says that the universe consists of data flows, and the value of any phenomenon or entity is determined by its contribution to data processing….Dataism thereby collapses the barrier between animals [humans] and machines, and expects electronic algorithms to eventually decipher and outperform biochemical algorithms. [1]
    Yuval Noah Harari

[1] Harari, Yuval Noah. 2015. Homo Deus: A Brief History of Tomorrow. London: Harvill Secker.

Finding the right algorithm to automatically and successfully trade in financial markets is the holy grail in finance. Not too long ago, algorithmic trading was only available and possible for institutional players with deep pockets and lots of assets under management. Recent developments in the areas of open source, open data, cloud compute, and cloud storage, as well as online trading platforms, have leveled the playing field for smaller institutions and individual traders, making it possible to get started in this fascinating discipline while equipped only with a typical notebook or desktop computer and a reliable internet connection.

Nowadays, Python and its ecosystem of powerful packages is the technology platform of choice for algorithmic trading. Among other things, Python allows you to do efficient data analytics (with pandas, for example), to apply machine learning to stock market prediction (with scikit-learn, for example), or even to make use of Google’s deep learning technology with TensorFlow.

This is a book about Python for algorithmic trading, primarily in the context of alpha generating strategies (see Chapter 1). Such a book at the intersection of two vast and exciting fields can hardly cover all topics of relevance. However, it can cover a range of important meta topics in depth.
These topics include:

Financial data
    Financial data is at the core of every algorithmic trading project. Python and packages like NumPy and pandas do a great job of handling and working with structured financial data of any kind (end-of-day, intraday, high frequency).

Backtesting
    There should be no automated algorithmic trading without a rigorous testing of the trading strategy to be deployed. The book covers, among other things, trading strategies based on simple moving averages, momentum, mean-reversion, and machine/deep-learning based prediction.

Real-time data
    Algorithmic trading requires dealing with real-time data, online algorithms based on it, and visualization in real time. The book provides an introduction to socket programming with ZeroMQ and streaming visualization.

Online platforms
    No trading can take place without a trading platform. The book covers two popular electronic trading platforms: Oanda and FXCM.

Automation
    The beauty, as well as some major challenges, in algorithmic trading results from the automation of the trading operation. The book shows how to deploy Python in the cloud and how to set up an environment appropriate for automated algorithmic trading.

The book offers a unique learning experience with the following features and benefits:

Coverage of relevant topics
    This is the only book covering such a breadth and depth with regard to relevant topics in Python for algorithmic trading (see the following).

Self-contained code base
    The book is accompanied by a Git repository with all code in a self-contained, executable form. The repository is available on the Quant Platform.

Real trading as the goal
    The coverage of two different online trading platforms puts the reader in the position to start both paper and live trading efficiently. To this end, the book equips the reader with relevant, practical, and valuable background knowledge.

Do-it-yourself and self-paced approach
    Since the material and the code are self-contained and only rely on standard Python packages, the reader has full knowledge of and full control over what is going on, how to use the code examples, how to change them, and so on. There is no need to rely on third-party platforms, for instance, to do the backtesting or to connect to the trading platforms. With this book, the reader can do all this on their own at a convenient pace and has every single line of code to do so.

User forum
    Although the reader should be able to follow along seamlessly, the author and The Python Quants are there to help. The reader can post questions and comments in the user forum on the Quant Platform at any time (accounts are free).

Online/video training (paid subscription)
    The Python Quants offer comprehensive online training programs that make use of the contents presented in the book and that add additional content, covering important topics such as financial data science, artificial intelligence in finance, Python for Excel and databases, and additional Python tools and skills.

Contents and Structure

Here’s a quick overview of the topics and contents presented in each chapter.

Chapter 1, Python and Algorithmic Trading
    The first chapter is an introduction to the topic of algorithmic trading, that is, the automated trading of financial instruments based on computer algorithms. It discusses fundamental notions in this context and also addresses, among other things, what the expected prerequisites for reading the book are.

Chapter 2, Python Infrastructure
    This chapter lays the technical foundations for all subsequent chapters in that it shows how to set up a proper Python environment. This chapter mainly uses conda as a package and environment manager. It illustrates Python deployment via Docker containers and in the cloud.

Chapter 3, Working with Financial Data
    Financial time series data is central to every algorithmic trading project. This chapter shows you how to retrieve financial data from different public data and proprietary data sources. It also demonstrates how to store financial time series data efficiently with Python.
Chapter 4, Mastering Vectorized Backtesting
    Vectorization is a powerful approach in numerical computation in general and for financial analytics in particular. This chapter introduces vectorization with NumPy and pandas and applies that approach to the backtesting of SMA-based, momentum, and mean-reversion strategies.
Chapter 5, Predicting Market Movements with Machine Learning
    This chapter is dedicated to generating market predictions by the use of machine learning and deep learning approaches. By mainly relying on past return observations as features, approaches are presented for predicting tomorrow’s market direction by using such Python packages as Keras in combination with TensorFlow and scikit-learn.

Chapter 6, Building Classes for Event-Based Backtesting
    While vectorized backtesting has advantages when it comes to conciseness of code and performance, it’s limited with regard to the representation of certain market features of trading strategies. On the other hand, event-based backtesting, technically implemented by the use of object-oriented programming, allows for a rather granular and more realistic modeling of such features. This chapter presents and explains in detail a base class as well as two classes for the backtesting of long-only and long-short trading strategies.

Chapter 7, Working with Real-Time Data and Sockets
    Needing to cope with real-time or streaming data is a reality even for the ambitious individual algorithmic trader. The tool of choice is socket programming, for which this chapter introduces ZeroMQ as a lightweight and scalable technology. The chapter also illustrates how to make use of Plotly to create nice-looking, interactive streaming plots.

Chapter 8, CFD Trading with Oanda
    Oanda is a foreign exchange (forex, FX) and Contracts for Difference (CFD) trading platform offering a broad set of tradable instruments, such as those based on foreign exchange pairs, stock indices, commodities, or rates instruments (benchmark bonds). This chapter provides guidance on how to implement automated algorithmic trading strategies with Oanda, making use of the Python wrapper package tpqoa.

Chapter 9, FX Trading with FXCM
    FXCM is another forex and CFD trading platform that has recently released a modern RESTful API for algorithmic trading. Available instruments span multiple asset classes, such as forex, stock indices, or commodities. A Python wrapper package that makes algorithmic trading based on Python code rather convenient and efficient is available (http://fxcmpy.tpq.io).

Chapter 10, Automating Trading Operations
    This chapter deals with capital management, risk analysis and management, as well as with typical tasks in the technical automation of algorithmic trading operations. It covers, for instance, the Kelly criterion for capital allocation and leverage in detail.
Appendix
    The appendix provides a concise introduction to the most important Python, NumPy, and pandas topics in the context of the material presented in the main chapters. It represents a starting point from which one can add to one’s own Python knowledge over time.

Figure P-1 shows the layers related to algorithmic trading that the chapters cover from the bottom to the top. It necessarily starts with the Python infrastructure (Chapter 2), and adds financial data (Chapter 3), strategy, and vectorized backtesting code (Chapters 4 and 5). Until that point, data sets are used and manipulated as a whole. Event-based backtesting for the first time introduces the idea that data in the real world arrives incrementally (Chapter 6). It is the bridge that leads to the connecting code layer that covers socket communication and real-time data handling (Chapter 7). On top of that, trading platforms and their APIs are required to be able to place orders (Chapters 8 and 9). Finally, important aspects of automation and deployment are covered (Chapter 10). In that sense, the main chapters of the book relate to the layers as seen in Figure P-1, which provide a natural sequence for the topics to be covered.

Figure P-1. The layers of Python for algorithmic trading
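The distinction drawn above between manipulating a data set as a whole and handling data that arrives incrementally can be illustrated with a small sketch. The following is not code from the book: the names `sma_whole` and `IncrementalSMA` are hypothetical, the prices are made up, and plain Python is used instead of NumPy or pandas purely to keep the example free of third-party dependencies.

```python
from collections import deque


def sma_whole(prices, window):
    """Simple moving average computed over the complete price list at once."""
    return [
        sum(prices[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


class IncrementalSMA:
    """Simple moving average updated one price at a time (event-based style)."""

    def __init__(self, window):
        self.window = window
        # Hypothetical helper state: keeps only the last `window` prices.
        self.buffer = deque(maxlen=window)

    def update(self, price):
        """Add a new price; return the SMA, or None until enough data exists."""
        self.buffer.append(price)
        if len(self.buffer) < self.window:
            return None
        return sum(self.buffer) / self.window


prices = [10.0, 11.0, 12.0, 11.0, 13.0]  # made-up sample data

# Whole-data-set approach: all prices are known up front.
batch = sma_whole(prices, 3)

# Incremental approach: prices are processed as if arriving one by one.
sma = IncrementalSMA(3)
stream = [v for v in (sma.update(p) for p in prices) if v is not None]

# Both approaches yield the same averages for the same data.
assert batch == stream
```

The incremental version never sees future prices, which is the property that makes event-based processing a closer model of real-time data handling than the whole-data-set version.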
Who This Book Is For

This book is for students, academics, and practitioners alike who want to apply Python in the fascinating field of algorithmic trading. The book assumes that the reader has, at least on a fundamental level, background knowledge in both Python programming and in financial trading. For reference and review, the Appendix introduces important Python, NumPy, matplotlib, and pandas topics. The following are good references to get a sound understanding of the Python topics important for this book. Most readers will benefit from having at least access to Hilpisch (2018) for reference. With regard to the machine and deep learning approaches applied to algorithmic trading, Hilpisch (2020) provides a wealth of background information and a larger number of specific examples.

Background information about Python as applied to finance, financial data science, and artificial intelligence can be found in the following books:

Hilpisch, Yves. 2018. Python for Finance: Mastering Data-Driven Finance. 2nd ed. Sebastopol: O’Reilly.
Hilpisch, Yves. 2020. Artificial Intelligence in Finance: A Python-Based Guide. Sebastopol: O’Reilly.
McKinney, Wes. 2017. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd ed. Sebastopol: O’Reilly.
Ramalho, Luciano. 2021. Fluent Python: Clear, Concise, and Effective Programming. 2nd ed. Sebastopol: O’Reilly.
VanderPlas, Jake. 2016. Python Data Science Handbook: Essential Tools for Working with Data. Sebastopol: O’Reilly.

Background information about algorithmic trading can be found, for instance, in these books:

Chan, Ernest. 2009. Quantitative Trading: How to Build Your Own Algorithmic Trading Business. Hoboken et al: John Wiley & Sons.
Chan, Ernest. 2013. Algorithmic Trading: Winning Strategies and Their Rationale. Hoboken et al: John Wiley & Sons.
Kissell, Robert. 2013. The Science of Algorithmic Trading and Portfolio Management. Amsterdam et al: Elsevier/Academic Press.
Narang, Rishi. 2013. Inside the Black Box: A Simple Guide to Quantitative and High Frequency Trading. Hoboken et al: John Wiley & Sons.

Enjoy your journey through the algorithmic trading world with Python and get in touch by emailing py4at@tpq.io if you have questions or comments.
Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs, to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
    Shows commands or other text that should be typed literally by the user.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

This element signifies a tip or suggestion.

This element signifies a general note.

This element indicates a warning or caution.

Using Code Examples

You can access and execute the code that accompanies the book on the Quant Platform at https://py4at.pqp.io, for which only a free registration is required.

If you have a technical question or a problem using the code examples, please email bookquestions@oreilly.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example, this book may be attributed as: “Python for Algorithmic Trading by Yves Hilpisch (O’Reilly). Copyright 2021 Yves Hilpisch, 978-1-492-05335-4.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

O’Reilly Online Learning

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/py4at.
Email bookquestions@oreilly.com to comment or ask technical questions about this book.

For news and information about our books and courses, visit http://oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://youtube.com/oreillymedia

Acknowledgments

I want to thank the technical reviewers, Hugh Brown, McKlayne Marshall, Ramanathan Ramakrishnamoorthy, and Prem Jebaseelan, who provided helpful comments that led to many improvements of the book’s content.

As usual, a special thank you goes to Michael Schwed, who supports me in all technical matters, simple and highly complex, with his broad and in-depth technology know-how.

Delegates of the Certificate Programs in Python for Computational Finance and Algorithmic Trading also helped improve this book. Their ongoing feedback has enabled me to weed out errors and mistakes and refine the code and notebooks used in our online training classes and now, finally, in this book.

I would also like to thank the whole team at O’Reilly Media, especially Michelle Smith, Michele Cronin, Victoria DeRose, and Danny Elfanbaum, for making it all happen and helping me refine the book in so many ways.

Of course, all remaining errors are mine alone.

Furthermore, I would also like to thank the team at Refinitiv, in particular Jason Ramchandani, for providing ongoing support and access to financial data. The major data files used throughout the book and made available to the readers were received in one way or another from Refinitiv’s data APIs.

To my family with love. I dedicate this book to my father Adolf whose support for me and our family now spans almost five decades.