Uploader: 高宏飞
Shared on 2025-12-22

Author: Sebastian Raschka


Publisher: leanpub.com
Publish Year: 2023
Language: English
Pages: 231
File Format: PDF
File Size: 12.9 MB
Text Preview (First 20 pages)

Machine Learning Q and AI
Expand Your Machine Learning & AI Knowledge With 30 In-Depth Questions and Answers
Sebastian Raschka, PhD

This book is for sale at http://leanpub.com/machine-learning-q-and-ai
This version was published on 2023-05-21

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do.

© 2016 - 2023 Sebastian Raschka, PhD
This book is dedicated to those who tirelessly contribute to advancing the field of machine learning through research and development. Your passion for discovery and innovation and your commitment to sharing knowledge and resources through the open-source community is an inspiration to us all.
Contents

Preface
  Who Is This Book For?
  What Will You Get Out of This Book?
  How To Read This Book
  Discussion Forum
  Sharing Feedback and Supporting This Book
  Acknowledgements
  About the Author
  Copyright and Disclaimer
  Credits
Introduction
Chapter 1. Neural Networks and Deep Learning
  Q1. Embeddings, Representations, and Latent Space
  Q2. Self-Supervised Learning
  Q3. Few-Shot Learning
  Q4. The Lottery Ticket Hypothesis
  Q5. Reducing Overfitting with Data
  Q6. Reducing Overfitting with Model Modifications
  Q7. Multi-GPU Training Paradigms
  Q8. The Keys to Success of Transformers
  Q9. Generative AI Models
  Q10. Sources of Randomness
Chapter 2. Computer Vision
  Q11. Calculating the Number of Parameters
  Q12. The Equivalence of Fully Connected and Convolutional Layers
  Q13. Large Training Sets for Vision Transformers
Chapter 3. Natural Language Processing
  Q15. The Distributional Hypothesis
  Q16. Data Augmentation for Text
  Q17. “Self”-Attention
  Q18. Encoder- And Decoder-Style Transformers
  Q19. Using and Finetuning Pretrained Transformers
  Q20. Evaluating Generative Language Models
Chapter 4. Production, Real-World, And Deployment Scenarios
  Q21. Stateless And Stateful Training
  Q22. Data-Centric AI
  Q23. Speeding Up Inference
Chapter 5. Predictive Performance and Model Evaluation
  Q25. Poisson and Ordinal Regression
  Q27. Proper Metrics
  Q28. The k in k-fold cross-validation
  Q29. Training and Test Set Discordance
  Q30. Limited Labeled Data
Afterword
Appendix A: Reader Quiz Solutions
Appendix B: List of Questions
Preface

Over the years, I shared countless educational nuggets about machine learning and deep learning with hundreds of thousands of people. The positive feedback has been overwhelming, and I continue to receive requests for more. So, in this book, I want to indulge both your desire to learn and my passion for writing about machine learning¹.

¹ I will use machine learning as an umbrella term for machine learning, deep learning, and artificial intelligence.
Who Is This Book For?

This book is for people with a beginner or intermediate background in machine learning who want to learn something new. This book will expose you to new concepts and ideas if you are already familiar with machine learning. However, it is not a math or coding book. You won’t need to solve any proofs or run any code while reading. In other words, this book is a perfect travel companion or something you can read in your favorite reading chair with your morning coffee.
What Will You Get Out of This Book?

Machine learning and AI are moving at a rapid pace. Researchers and practitioners are constantly struggling to keep up with the breadth of concepts and techniques. This book provides bite-sized nuggets for your journey from machine learning beginner to expert, covering topics from various machine learning areas. Even experienced machine learning researchers and practitioners will encounter something new that they can add to their arsenal of techniques.
How To Read This Book

The questions in this book are mainly independent, and you can read them in any desired order. You can also skip individual questions for the most part. However, I organized the questions to bring more structure to the book. For instance, the first question deals with embeddings, which we refer to in later questions on self-supervised learning and few-shot learning. Therefore, I recommend reading the questions in sequence.

The book is structured into five main chapters to provide additional structure. However, many questions could appear in different chapters without affecting the flow.

Chapter 1, Deep Learning and Neural Networks covers questions about deep neural networks and deep learning that are not specific to a particular subdomain. For example, we discuss alternatives to supervised learning and techniques for reducing overfitting.

Chapter 2, Computer Vision focuses on topics that are mainly related to deep learning but specific to computer vision, many of which cover convolutional neural networks and vision transformers.

Chapter 3, Natural Language Processing covers topics around working with text, many of which are related to transformer architectures and self-attention.

Chapter 4, Production, Real-World, And Deployment Scenarios contains questions pertaining to practical scenarios, such as increasing inference speeds and various types of distribution shifts.

Chapter 5, Predictive Performance and Model Evaluation dives a bit deeper into various aspects of squeezing out predictive performance, for example, changing the loss function, setting up k-fold cross-validation, and dealing with limited labeled data.

If you are not reading this book for entertainment but for machine
learning interview preparation, you may prefer a spoiler-free look at the questions to quiz yourself before reading the answers. In this case, you can find a list of all questions, without answers, in the appendix.
Discussion Forum

The best way to ask questions about the book is the discussion forum at https://community.leanpub.com/c/machine-learning-q-a. Please feel free to ask anything about the book, share your thoughts, or just introduce yourself!
Sharing Feedback and Supporting This Book

I enjoy writing, and it is my pleasure to share this knowledge with you. If you obtained a free copy and like this book, you can support me by buying a digital copy on Leanpub at https://leanpub.com/machine-learning-q-and-ai/.

For an author, there is nothing more valuable than your honest feedback. I would really appreciate hearing from you and appreciate any reviews! And, of course, I would be more than happy if you recommend this book to your friends and colleagues or share some nice words on your social channels.
Acknowledgements

Writing a book is an enormous undertaking. This project would not have been possible without the help of the open source and machine learning communities who collectively created the technologies that this book is about. Moreover, I want to thank everyone who encouraged me to share my flashcard decks, as this book is an improved and polished version of these.

I also want to thank the following readers for helpful feedback on the manuscript:

• Anton Reshetnikov for suggesting a cleaner layout for the supervised learning flowchart in Q30.
About the Author

Sebastian Raschka is a machine learning and AI researcher with a strong passion for education. As Lead AI Educator at Lightning AI, he is excited about making AI and deep learning more accessible and teaching people how to utilize these technologies at scale. Before dedicating his time fully to Lightning AI, Sebastian held a position as Assistant Professor of Statistics at the University of Wisconsin-Madison, where he specialized in researching deep learning and machine learning. You can find out more about his research on his website⁴.

Moreover, Sebastian loves open-source software and has been a passionate contributor for over a decade. Next to coding, he also loves writing and authored the bestselling Python Machine Learning book and Machine Learning with PyTorch and Scikit-Learn.

If you would like to find out more about Sebastian and what he is currently up to, please visit his personal website at https://sebastianraschka.com. You can also find Sebastian on Twitter (@rasbt)⁵ and LinkedIn (sebastianraschka)⁶.

⁴ https://sebastianraschka.com/publications/
⁵ https://twitter.com/rasbt
⁶ https://www.linkedin.com/in/sebastianraschka
Copyright and Disclaimer

Machine Learning Q and AI by Sebastian Raschka
Copyright © 2023 Sebastian Raschka. All rights reserved.

No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the author.

The information contained within this book is strictly for educational purposes. If you wish to apply ideas contained in this book, you are taking full responsibility for your actions. The author has made every effort to ensure that the information within this book was correct at the time of publication. The author does not assume and hereby disclaims any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from accident, negligence, or any other cause.
Credits

Cover image by ECrafts / stock.adobe.com.
Introduction

Thanks to rapid advancements in deep learning, we have seen a significant expansion of machine learning and AI in recent years. On the one hand, this rapid progress is exciting if we expect these advancements to create new industries, transform existing ones, and improve the quality of life for people around the world. On the other hand, the rapid emergence of new techniques can make it challenging to keep up, and keeping up can be a very time-consuming process. Nonetheless, staying current with the latest developments in AI and deep learning is essential for professionals and organizations that use these technologies.

With this in mind, I began writing this book in the summer of 2022 as a resource for readers and machine learning practitioners who want to advance their understanding and learn about useful techniques that I consider significant and relevant but often overlooked in traditional and introductory textbooks and classes. I hope readers will find this book a valuable resource for obtaining new insights and discovering new techniques they can implement in their work.

Happy learning,
Sebastian
Chapter 1. Neural Networks and Deep Learning
Q1. Embeddings, Representations, and Latent Space

> Q: In deep learning, we often use the terms embedding vectors, representations, and latent space. What do these concepts have in common, and how do they differ?

> A: While all three concepts, embedding vectors, vectors in latent space, and representations, are often used synonymously, we can make slight distinctions:

• representations are encoded versions of the original input;
• latent vectors are intermediate representations;
• embedding vectors are representations where similar items are close to each other.

Embeddings

Embedding vectors, or embeddings for short, encode relatively high-dimensional data into relatively low-dimensional vectors. We can apply embedding methods to create a continuous dense (non-sparse) vector from a one-hot encoding. However, we can also use embedding methods for dense data such as images. For example, the last layers of a convolutional neural network may yield embedding vectors, as illustrated in the figure below⁷.

⁷ Technically, all intermediate layer outputs of a neural network could yield embedding vectors. Depending on the training objective, the output layer may also produce useful embedding vectors. For simplicity, the convolutional neural network figure above only associates the second-last layer with embeddings.
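The one-hot-to-dense mapping described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the book; the vocabulary size, embedding dimension, and random weights are arbitrary choices. It shows why an embedding layer is just a trainable lookup table: multiplying a one-hot vector by the weight matrix selects a single row, and a small cosine-similarity helper indicates how closeness between embeddings can be measured.

```python
import numpy as np

# Illustrative sizes (not from the book): a 4-token vocabulary
# embedded into 2 dimensions.
vocab_size, embed_dim = 4, 2
rng = np.random.default_rng(seed=0)

# An embedding layer is a trainable lookup table: one row per token.
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))

# One-hot encoding of the token with index 2.
one_hot = np.zeros(vocab_size)
one_hot[2] = 1.0

# Multiplying the one-hot vector by the weight matrix selects row 2,
# which is why embedding layers are implemented as plain lookups.
dense_via_matmul = one_hot @ embedding_matrix
dense_via_lookup = embedding_matrix[2]
assert np.allclose(dense_via_matmul, dense_via_lookup)

def cosine_similarity(a, b):
    """Similarity between two embedding vectors (1.0 = same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In a framework such as PyTorch, the same idea is what an embedding layer provides as a trained component; the sketch above only makes the lookup-equals-matmul equivalence explicit.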
Figure 1.1. An input embedding (left) and an embedding from a neural network (right).

Taking it to the extreme, embedding methods can be used to encode data into two-dimensional dense and continuous representations for visualization purposes and clustering analysis, as illustrated in the figure below.

Figure 1.2. Mapping words (left) and images (right) to a two-dimensional feature space.

A fundamental property of embeddings is that they encode distance or similarity. This means that embeddings capture the semantics of the data such that similar inputs are close in the embedding space.

Latent space
(The above is a preview of the first 20 pages.)