Mastering Python for Bioinformatics: How to Write Flexible, Documented, Tested Python Code for Research Computing
Ken Youens-Clark



Mastering Python for Bioinformatics
by Ken Youens-Clark

Copyright © 2021 Charles Kenneth Youens-Clark. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Michelle Smith
Development Editor: Corbin Collins
Production Editor: Caitlin Ghegan
Copyeditor: Sonia Saruba
Proofreader: Rachel Head
Indexer: Sue Klefstad
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

May 2021: First Edition

Revision History for the First Edition
2021-05-04: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098100889 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Mastering Python for Bioinformatics, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author, and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-098-10088-9
[LSI]

Table of Contents

Preface

Part I. The Rosalind.info Challenges

1. Tetranucleotide Frequency: Counting Things
    Getting Started
    Creating the Program Using new.py
    Using argparse
    Tools for Finding Errors in the Code
    Introducing Named Tuples
    Adding Types to Named Tuples
    Representing the Arguments with a NamedTuple
    Reading Input from the Command Line or a File
    Testing Your Program
    Running the Program to Test the Output
    Solution 1: Iterating and Counting the Characters in a String
    Counting the Nucleotides
    Writing and Verifying a Solution
    Additional Solutions
    Solution 2: Creating a count() Function and Adding a Unit Test
    Solution 3: Using str.count()
    Solution 4: Using a Dictionary to Count All the Characters
    Solution 5: Counting Only the Desired Bases
    Solution 6: Using collections.defaultdict()
    Solution 7: Using collections.Counter()
    Going Further
    Review

2. Transcribing DNA into mRNA: Mutating Strings, Reading and Writing Files
    Getting Started
    Defining the Program’s Parameters
    Defining an Optional Parameter
    Defining One or More Required Positional Parameters
    Using nargs to Define the Number of Arguments
    Using argparse.FileType() to Validate File Arguments
    Defining the Args Class
    Outlining the Program Using Pseudocode
    Iterating the Input Files
    Creating the Output Filenames
    Opening the Output Files
    Writing the Output Sequences
    Printing the Status Report
    Using the Test Suite
    Solutions
    Solution 1: Using str.replace()
    Solution 2: Using re.sub()
    Benchmarking
    Going Further
    Review

3. Reverse Complement of DNA: String Manipulation
    Getting Started
    Iterating Over a Reversed String
    Creating a Decision Tree
    Refactoring
    Solutions
    Solution 1: Using a for Loop and Decision Tree
    Solution 2: Using a Dictionary Lookup
    Solution 3: Using a List Comprehension
    Solution 4: Using str.translate()
    Solution 5: Using Bio.Seq
    Review

4. Creating the Fibonacci Sequence: Writing, Testing, and Benchmarking Algorithms
    Getting Started
    An Imperative Approach
    Solutions
    Solution 1: An Imperative Solution Using a List as a Stack
    Solution 2: Creating a Generator Function
    Solution 3: Using Recursion and Memoization
    Benchmarking the Solutions
    Testing the Good, the Bad, and the Ugly
    Running the Test Suite on All the Solutions
    Going Further
    Review

5. Computing GC Content: Parsing FASTA and Analyzing Sequences
    Getting Started
    Get Parsing FASTA Using Biopython
    Iterating the Sequences Using a for Loop
    Solutions
    Solution 1: Using a List
    Solution 2: Type Annotations and Unit Tests
    Solution 3: Keeping a Running Max Variable
    Solution 4: Using a List Comprehension with a Guard
    Solution 5: Using the filter() Function
    Solution 6: Using the map() Function and Summing Booleans
    Solution 7: Using Regular Expressions to Find Patterns
    Solution 8: A More Complex find_gc() Function
    Benchmarking
    Going Further
    Review

6. Finding the Hamming Distance: Counting Point Mutations
    Getting Started
    Iterating the Characters of Two Strings
    Solutions
    Solution 1: Iterating and Counting
    Solution 2: Creating a Unit Test
    Solution 3: Using the zip() Function
    Solution 4: Using the zip_longest() Function
    Solution 5: Using a List Comprehension
    Solution 6: Using the filter() Function
    Solution 7: Using the map() Function with zip_longest()
    Solution 8: Using the starmap() and operator.ne() Functions
    Going Further
    Review

7. Translating mRNA into Protein: More Functional Programming
    Getting Started
    K-mers and Codons
    Translating Codons
    Solutions
    Solution 1: Using a for Loop
    Solution 2: Adding Unit Tests
    Solution 3: Another Function and a List Comprehension
    Solution 4: Functional Programming with the map(), partial(), and takewhile() Functions
    Solution 5: Using Bio.Seq.translate()
    Benchmarking
    Going Further
    Review

8. Find a Motif in DNA: Exploring Sequence Similarity
    Getting Started
    Finding Subsequences
    Solutions
    Solution 1: Using the str.find() Method
    Solution 2: Using the str.index() Method
    Solution 3: A Purely Functional Approach
    Solution 4: Using K-mers
    Solution 5: Finding Overlapping Patterns Using Regular Expressions
    Benchmarking
    Going Further
    Review

9. Overlap Graphs: Sequence Assembly Using Shared K-mers
    Getting Started
    Managing Runtime Messages with STDOUT, STDERR, and Logging
    Finding Overlaps
    Grouping Sequences by the Overlap
    Solutions
    Solution 1: Using Set Intersections to Find Overlaps
    Solution 2: Using a Graph to Find All Paths
    Going Further
    Review

10. Finding the Longest Shared Subsequence: Finding K-mers, Writing Functions, and Using Binary Search
    Getting Started
    Finding the Shortest Sequence in a FASTA File
    Extracting K-mers from a Sequence
    Solutions
    Solution 1: Counting Frequencies of K-mers
    Solution 2: Speeding Things Up with a Binary Search
    Going Further
    Review

11. Finding a Protein Motif: Fetching Data and Using Regular Expressions
    Getting Started
    Downloading Sequences Files on the Command Line
    Downloading Sequences Files with Python
    Writing a Regular Expression to Find the Motif
    Solutions
    Solution 1: Using a Regular Expression
    Solution 2: Writing a Manual Solution
    Going Further
    Review

12. Inferring mRNA from Protein: Products and Reductions of Lists
    Getting Started
    Creating the Product of Lists
    Avoiding Overflow with Modular Multiplication
    Solutions
    Solution 1: Using a Dictionary for the RNA Codon Table
    Solution 2: Turn the Beat Around
    Solution 3: Encoding the Minimal Information
    Going Further
    Review

13. Location Restriction Sites: Using, Testing, and Sharing Code
    Getting Started
    Finding All Subsequences Using K-mers
    Finding All Reverse Complements
    Putting It All Together
    Solutions
    Solution 1: Using the zip() and enumerate() Functions
    Solution 2: Using the operator.eq() Function
    Solution 3: Writing a revp() Function
    Testing the Program
    Going Further
    Review

14. Finding Open Reading Frames
    Getting Started
    Translating Proteins Inside Each Frame
    Finding the ORFs in a Protein Sequence
    Solutions
    Solution 1: Using the str.index() Function
    Solution 2: Using the str.partition() Function
    Solution 3: Using a Regular Expression
    Going Further
    Review

Part II. Other Programs

15. Seqmagique: Creating and Formatting Reports
    Using Seqmagick to Analyze Sequence Files
    Checking Files Using MD5 Hashes
    Getting Started
    Formatting Text Tables Using tabulate()
    Solutions
    Solution 1: Formatting with tabulate()
    Solution 2: Formatting with rich
    Going Further
    Review

16. FASTX grep: Creating a Utility Program to Select Sequences
    Finding Lines in a File Using grep
    The Structure of a FASTQ Record
    Getting Started
    Guessing the File Format
    Solution
    Going Further
    Review

17. DNA Synthesizer: Creating Synthetic Data with Markov Chains
    Understanding Markov Chains
    Getting Started
    Understanding Random Seeds
    Reading the Training Files
    Generating the Sequences
    Structuring the Program
    Solution
    Going Further
    Review

18. FASTX Sampler: Randomly Subsampling Sequence Files
    Getting Started
    Reviewing the Program Parameters
    Defining the Parameters
    Nondeterministic Sampling
    Structuring the Program
    Solutions
    Solution 1: Reading Regular Files
    Solution 2: Reading a Large Number of Compressed Files
    Going Further
    Review

19. Blastomatic: Parsing Delimited Text Files
    Introduction to BLAST
    Using csvkit and csvchk
    Getting Started
    Defining the Arguments
    Parsing Delimited Text Files Using the csv Module
    Parsing Delimited Text Files Using the pandas Module
    Solutions
    Solution 1: Manually Joining the Tables Using Dictionaries
    Solution 2: Writing the Output File with csv.DictWriter()
    Solution 3: Reading and Writing Files Using pandas
    Solution 4: Joining Files Using pandas
    Going Further
    Review

A. Documenting Commands and Creating Workflows with make

B. Understanding $PATH and Installing Command-Line Programs

Epilogue

Index


Preface

Programming is a force multiplier. We can write computer programs to free ourselves from tedious manual tasks and to accelerate research. Programming in any language will likely improve your productivity, but each language has different learning curves and tools that improve or impede the process of coding.

There is an adage in business that says you have three choices:

1. Fast
2. Good
3. Cheap

Pick any two.

When it comes to programming languages, Python hits a sweet spot in that it’s fast because it’s fairly easy to learn and to write a working prototype of an idea; it’s pretty much always the first language I’ll use to write any program. I find Python to be cheap because my programs will usually run well enough on commodity hardware like my laptop or a tiny AWS instance. However, I would contend that it’s not necessarily easy to make good programs using Python because the language itself is fairly lax. For instance, it allows one to mix characters and numbers in operations that will crash the program.

This book has been written for the aspiring bioinformatics programmer who wants to learn about Python’s best practices and tools such as the following:

• Since Python 3.6, you can add type hints to indicate, for instance, that a variable should be a type like a number or a list, and you can use the mypy tool to ensure the types are used correctly.
• Testing frameworks like pytest can exercise your code with both good and bad data to ensure that it reacts in some predictable way.

• Tools like pylint and flake8 can find potential errors and stylistic problems that would make your programs more difficult to understand.
• The argparse module can document and validate the arguments to your programs.
• The Python ecosystem allows you to leverage hundreds of existing modules like Biopython to shorten programs and make them more reliable.

Using these tools and practices individually will improve your programs, but combining them all will improve your code in compounding ways.

This book is not a textbook on bioinformatics per se. The focus is on what Python offers that makes it suitable for writing scientific programs that are reproducible. That is, I’ll show you how to design and test programs that will always produce the same outputs given the same inputs. Bioinformatics is saturated with poorly written, undocumented programs, and my goal is to reverse this trend, one program at a time.

The criteria for program reproducibility include:

Parameters
    All program parameters can be set as runtime arguments. This means no hard-coded values which would require changing the source code to change the program’s behavior.

Documentation
    A program should respond to a --help argument by printing the parameters and usage.

Testing
    You should be able to run a test suite that proves the code meets some specifications.

You might expect that this would logically lead to programs that are perhaps correct, but alas, as Edsger Dijkstra famously said, “Program testing can be used to show the presence of bugs, but never to show their absence!”

Most bioinformaticians are either scientists who’ve learned programming or programmers who’ve learned biology (or people like me who had to learn both). No matter how you’ve come to the field of bioinformatics, I want to show you practical programming techniques that will help you write correct programs quickly. I’ll start with how to write programs that document and validate their arguments.
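As a rough illustration of a program that documents and validates its own arguments, the following sketch uses the standard argparse module. It is not one of the book's solutions: the positional `dna` argument, the `--min-len` option, and the validation rule are hypothetical names and choices made only for this example.

```python
import argparse
from typing import List, Optional


def get_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments; argv=None means argparse reads sys.argv."""
    parser = argparse.ArgumentParser(
        description='Report on a DNA sequence (illustrative only)',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    # Required positional argument (hypothetical name)
    parser.add_argument('dna', metavar='DNA', help='Input DNA sequence')

    # Hypothetical option: argparse converts the value to int or rejects it
    parser.add_argument('-m', '--min-len', metavar='INT', type=int, default=1,
                        help='Minimum acceptable sequence length')

    args = parser.parse_args(argv)

    # Extra validation beyond the type check; parser.error() prints the
    # usage plus this message and exits with a nonzero status
    if args.min_len < 1:
        parser.error(f'--min-len "{args.min_len}" must be at least 1')

    return args


# A list stands in for the command line: prog.py ACGT --min-len 2
args = get_args(['ACGT', '--min-len', '2'])
print(len(args.dna) >= args.min_len)
```

Because argparse adds -h and --help automatically, a program written this way prints its description and the help text for each argument without any extra code, and a non-integer value for `--min-len` is rejected before the rest of the program runs.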
Then I’ll show how to write and run tests to ensure the programs do what they purport. For instance, the first chapter shows you how to report the tetranucleotide frequency from a string of DNA. Sounds pretty simple, right? It’s a trivial idea, but I’ll take about 40 pages to show how to structure, document, and test this program. I’ll spend a lot

of time on how to write and test several different versions of the program so that I can explore many aspects of Python data structures, syntax, modules, and tools.

Who Should Read This?

You should read this book if you care about the craft of programming, and if you want to learn how to write programs that produce documentation, validate their parameters, fail gracefully, and work reliably. Testing is a key skill both for understanding your code and for verifying its correctness. I’ll show you how to use the tests I’ve written as well as how to write tests for your programs.

To get the most out of this book, you should already have a solid understanding of Python. I will build on the skills I taught in Tiny Python Projects (Manning, 2020), where I show how to use Python data structures like strings, lists, tuples, dictionaries, sets, and named tuples. You need not be an expert in Python, but I definitely will push you to understand some advanced concepts I introduce in that book, such as types, regular expressions, and ideas about higher-order functions, along with testing and how to use tools like pylint, flake8, yapf, and pytest to check style, syntax, and correctness. One notable difference is that I will consistently use type annotations for all code in this book and will use the mypy tool to ensure the correct use of types.

Programming Style: Why I Avoid OOP and Exceptions

I tend to avoid object-oriented programming (OOP). If you don’t know what OOP means, that’s OK. Python itself is an OO language, and almost every element from a string to a set is technically an object with internal state and methods. You will encounter enough objects to get a feel for what OOP means, but the programs I present will mostly avoid using objects to represent ideas.

That said, Chapter 1 shows how to use a class to represent a complex data structure.
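A minimal sketch of such a class follows, assuming it is built on typing.NamedTuple (the table of contents lists "Representing the Arguments with a NamedTuple" in Chapter 1). The class name `Args` and its fields are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Args(NamedTuple):
    """Hypothetical container for a program's arguments."""
    dna: str
    min_len: int


args = Args(dna='ACGT', min_len=2)

# Fields are read by name rather than by position
print(args.dna, args.min_len)

# Named tuples are immutable, so a changed copy is made with _replace()
longer = args._replace(min_len=3)
print(longer.min_len)

# A type checker such as mypy should report the next line if it is
# uncommented, because min_len is annotated as int:
# bad = Args(dna='ACGT', min_len='two')
```

Note that the annotations are not enforced when the program runs; they only pay off when a tool like mypy checks the source.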
The class allows me to define a data structure with type annotations so that I can verify that I’m using the data types correctly. It does help to understand a bit about OOP. For instance, classes define the attributes of an object, and classes can inherit attributes from parent classes, but this essentially describes the limits of how and why I use OOP in Python. If you don’t entirely follow that right now, don’t worry. You’ll understand it once you see it. Instead of object-oriented code, I demonstrate programs composed almost entirely of functions. These functions are also pure in that they will only act on the values given to them. That is, pure functions never rely on some hidden, mutable state like global variables, and they will always return the same values given the same arguments. Additionally, every function will have an associated test that I can run to verify it behaves predictably. It’s my opinion that this leads to shorter programs that are more transparent and easier to test than solutions written using OOP. You may disagree

and are of course welcome to write your solutions using whatever style of programming you prefer, so long as they pass the tests. The Python Functional Programming HOWTO documentation makes a good case for why Python is suited for functional programming (FP).

Finally, the programs in this book also avoid the use of exceptions, which I think is appropriate for short programs you write for personal use. Managing exceptions so that they don’t interrupt the flow of a program adds another level of complexity that I feel detracts from one’s ability to understand a program. I’m generally unhappy with how to write functions in Python that return errors. Many people would raise an exception and let a try/except block handle the mistakes. If I feel an exception is warranted, I will often choose to not catch it, instead letting the program crash. In this respect, I’m following an idea from Joe Armstrong, the creator of the Erlang language, who said, “The Erlang way is to write the happy path, and not write twisty little passages full of error correcting code.”

If you choose to write programs and modules for public release, you will need to learn much more about exceptions and error handling, but that’s beyond the scope of this book.

Structure

The book is divided into two main parts. The first part tackles 14 of the programming challenges found at the Rosalind.info website.¹ The second part shows more complicated programs that demonstrate other patterns or concepts I feel are important in bioinformatics. Every chapter of the book describes a coding challenge for you to write and provides a test suite for you to determine when you’ve written a working program.

¹ Named for Rosalind Franklin, who should have received a Nobel Prize for her contributions to discovering the structure of DNA.
Although the “Zen of Python” says “There should be one-- and preferably only one --obvious way to do it,” I believe you can learn quite a bit by attempting many different approaches to a problem. Perl was my gateway into bioinformatics, and the Perl community’s spirit of “There’s More Than One Way To Do It” (TMTOWTDI) still resonates with me. I generally follow a theme-and-variations approach to each chapter, showing many solutions to explore different aspects of Python syntax and data structures.
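To give a feel for the theme-and-variations idea, here is a small sketch with three ways to count the four DNA bases, all held to the same test. These functions were written for this passage and are not the book's solutions; the function names and the choice to ignore characters other than A, C, G, and T are assumptions.

```python
from collections import Counter
from typing import Callable, Dict, List

Counts = Dict[str, int]


def count_loop(dna: str) -> Counts:
    """Variation 1: iterate over the characters and tally the bases."""
    counts = {'A': 0, 'C': 0, 'G': 0, 'T': 0}
    for base in dna:
        if base in counts:
            counts[base] += 1
    return counts


def count_str(dna: str) -> Counts:
    """Variation 2: ask the string to count each base."""
    return {base: dna.count(base) for base in 'ACGT'}


def count_counter(dna: str) -> Counts:
    """Variation 3: count every character, then keep only the bases."""
    seen = Counter(dna)
    return {base: seen[base] for base in 'ACGT'}


SOLUTIONS: List[Callable[[str], Counts]] = [count_loop, count_str, count_counter]


def test_count() -> None:
    """Every variation must satisfy the same expectations."""
    for func in SOLUTIONS:
        assert func('') == {'A': 0, 'C': 0, 'G': 0, 'T': 0}
        assert func('AACGTT') == {'A': 2, 'C': 1, 'G': 1, 'T': 2}
        assert func('ANNT') == {'A': 1, 'C': 0, 'G': 0, 'T': 1}


test_count()
```

Each function is pure in the sense described earlier: it depends only on its argument, so one shared test can be run against every variation. A runner such as pytest would discover `test_count()` by its name; the direct call at the end is only there so the sketch checks itself when run as a script.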

Test-Driven Development

More than the act of testing, the act of designing tests is one of the best bug preventers known. The thinking that must be done to create a useful test can discover and eliminate bugs before they are coded -- indeed, test-design thinking can discover and eliminate bugs at every stage in the creation of software, from conception to specification, to design, coding, and the rest.
-- Boris Beizer, Software Testing Techniques (Thompson Computer Press)

Underlying all my experimentation will be test suites that I’ll constantly run to ensure the programs continue to work correctly. Whenever I have the opportunity, I try to teach test-driven development (TDD), an idea explained in a book by that title written by Kent Beck (Addison-Wesley, 2002). TDD advocates writing tests for code before writing the code. The typical cycle involves the following:

1. Add a test.
2. Run all tests and see if the new test fails.
3. Write the code.
4. Run tests.
5. Refactor code.
6. Repeat.

In the book’s GitHub repository, you’ll find the tests for each program you’ll write. I’ll explain how to run and write tests, and I hope by the end of the material you’ll believe in the common sense and basic decency of using TDD. I hope that thinking about tests first will start to change the way you understand and explore coding.

Using the Command Line and Installing Python

My experience in bioinformatics has always been centered around the Unix command line. Much of my day-to-day work has been on some flavor of Linux server, stitching together existing command-line programs using shell scripts, Perl, and Python. While I might write and debug a program or a pipeline on my laptop, I will often deploy my tools to a high-performance compute (HPC) cluster where a scheduler will run my programs asynchronously, often in the middle of the night or over a weekend and without any supervision or intervention by me.
Additionally, all my work building databases and websites and administering servers is done entirely from the command line, so I feel strongly that you need to master this environment to be successful in bioinformatics.

I used a Macintosh to write and test all the material for this book, and macOS has the Terminal app you can use for a command line. I have also tested all the programs

using various Linux distributions, and the GitHub repository includes instructions on how to use a Linux virtual machine with Docker. Additionally, I tested all the programs on Windows 10 using the Ubuntu distribution on Windows Subsystem for Linux (WSL) version 1. I highly recommend WSL for Windows users to have a true Unix command line, but Windows shells like cmd.exe, PowerShell, and Git Bash can sometimes work sufficiently well for some programs.

I would encourage you to explore integrated development environments (IDEs) like VS Code, PyCharm, or Spyder to help you write, run, and test your programs. These tools integrate text editors, help documentation, and terminals. Although I wrote all the programs, tests, and even this book using the vim editor in a terminal, most people would probably prefer to use at least a more modern text editor like Sublime, TextMate, or Notepad++.

I wrote and tested all the examples using Python versions 3.8.6 and 3.9.1. Some examples use Python syntax that was not present in version 3.6, so I would recommend you not use that version. Python version 2.x is no longer supported and should not be used. I tend to get the latest version of Python 3 from the Python download page, but I’ve also had success using the Anaconda Python distribution. You may have a package manager like apt on Ubuntu or brew on Mac that can install a recent version, or you may choose to build from source. Whatever your platform and installation method, I would recommend you try to use the most recent version as the language continues to change, mostly for the better.

Note that I’ve chosen to present the programs as command-line programs and not as Jupyter Notebooks for several reasons. I like Notebooks for data exploration, but the source code for Notebooks is stored in JavaScript Object Notation (JSON) and not as line-oriented text. This makes it very difficult to use tools like diff to find the differences between two Notebooks.
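To make the point about JSON storage concrete, the sketch below builds a simplified code cell, loosely following the layout of the notebook file format, and compares the saved form of the same one-line program after a first run and after a re-run. The helper names `code_cell()` and `as_lines()` are hypothetical, and a real notebook file carries more metadata than is shown here.

```python
import difflib
import json
from typing import Any, Dict, List


def code_cell(source: str, execution_count: int, stdout: str) -> Dict[str, Any]:
    """Build one code cell (simplified; real files hold more metadata)."""
    return {
        'cell_type': 'code',
        'execution_count': execution_count,
        'metadata': {},
        'outputs': [{'name': 'stdout', 'output_type': 'stream', 'text': [stdout]}],
        'source': [source],
    }


def as_lines(cell: Dict[str, Any]) -> List[str]:
    """Serialize a cell roughly as it would appear inside a notebook file."""
    return json.dumps(cell, indent=1, sort_keys=True).splitlines()


# The same one-line program, saved after a first run and after a re-run
first = as_lines(code_cell("print('GATTACA')\n", 1, 'GATTACA\n'))
rerun = as_lines(code_cell("print('GATTACA')\n", 2, 'GATTACA\n'))

print(len(first), 'lines of JSON for one line of Python')

# The diff is not empty even though the Python source did not change
diff = list(difflib.unified_diff(first, rerun, 'first.ipynb', 'rerun.ipynb',
                                 lineterm=''))
for line in diff:
    print(line)
```

In this sketch the only difference is the execution counter, which is bookkeeping rather than code, and that kind of noise is part of what makes a line-oriented diff of two Notebooks hard to read.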
Also, Notebooks cannot be parameterized, meaning I cannot pass in arguments from outside the program to change the behavior but instead have to change the source code itself. This makes the programs inflexible and automated testing impossible. While I encourage you to explore Notebooks, especially as an interactive way to run Python, I will focus on how to write command-line programs.

Getting the Code and Tests

All the code and tests are available from the book’s GitHub repository. You can use the program Git (which you may need to install) to copy the code to your computer with the following command. This will create a new directory called biofx_python on your computer with the contents of the repository:

$ git clone https://github.com/kyclark/biofx_python

If you enjoy using an IDE, it may be possible to clone the repository through that interface, as shown in Figure P-1. Many IDEs can help you manage projects and write code, but they all work differently. To keep things simple, I will show how to use the command line to accomplish most tasks.

Figure P-1. The PyCharm tool can directly clone the GitHub repository for you

Some tools, like PyCharm, may automatically try to create a virtual environment inside the project directory. This is a way to insulate the version of Python and modules from other projects on your computer. Whether or not you use virtual environments is a personal preference. It is not a requirement to use them.

You may prefer to make a copy of the code in your own account so that you can track your changes and share your solutions with others. This is called forking because you’re breaking off from my code and adding your programs to the repository.

To fork my GitHub repository, do the following:

1. Create an account on GitHub.com.
2. Go to https://github.com/kyclark/biofx_python.
3. Click the Fork button in the upper-right corner (see Figure P-2) to make a copy of the repository in your account.

Figure P-2. The Fork button on my GitHub repository will make a copy of the code in your account

Now that you have a copy of all my code in your repository, you can use Git to copy that code to your computer. Be sure to replace YOUR_GITHUB_ID with your actual GitHub ID:

$ git clone https://github.com/YOUR_GITHUB_ID/biofx_python

I may update the repo after you make your copy. If you would like to be able to get those updates, you will need to configure Git to set my repository as an upstream source. To do so, after you have cloned your repository to your computer, go into your biofx_python directory:

$ cd biofx_python

Then execute this command:

$ git remote add upstream https://github.com/kyclark/biofx_python.git

Whenever you would like to update your repository from mine, you can execute this command:

$ git pull upstream main

Installing Modules

You will need to install several Python modules and tools. I’ve included a requirements.txt file in the top level of the repository. This file lists all the modules needed to run the programs in the book. Some IDEs may detect this file and offer to install these for you, or you can use the following command:

$ python3 -m pip install -r requirements.txt
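After installing, a quick sanity check along the following lines may be useful. This is only a sketch: the list of import names is an assumption based on the tools named in this preface (Biopython is imported as `Bio`), and the actual requirements.txt may list different or additional packages.

```python
import importlib.util
import sys
from typing import List

# Assumed import names, taken from tools mentioned in the preface;
# the real requirements.txt may differ
MODULES = ['Bio', 'pytest', 'mypy', 'pylint', 'flake8']


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print('Python', sys.version.split()[0])

# The preface reports testing with Python 3.8.6 and 3.9.1
if sys.version_info < (3, 8):
    print('Warning: this Python is older than the versions used for the book')

for name in missing_modules(MODULES):
    print(f'Missing module: {name}')
```

If nothing is reported as missing, the modules can at least be located by the interpreter that ran the check, which is worth confirming when several Python installations or virtual environments are present.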
