
Authors: Sireesha Muppala, Randy DeFauw, Sina Sojoodi

Publisher: BPB Publications
Publish Year: 2023
Language: English
File Format: PDF
File Size: 15.2 MB
Generative AI for Cloud Solutions

Building end-to-end generative AI stacks with cloud services and model orchestration pipelines

Sireesha Muppala
Randy DeFauw
Sina Sojoodi

www.bpbonline.com
First Edition 2025

Copyright © BPB Publications, India

ISBN: 978-93-65891-454

All Rights Reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception of the program listings, which may be entered, stored, and executed in a computer system but cannot be reproduced by means of publication, photocopy, recording, or any electronic or mechanical means.

LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY

The information contained in this book is true and correct to the best of the author's and publisher's knowledge. The author has made every effort to ensure the accuracy of this publication, but the publisher cannot be held responsible for any loss or damage arising from any information in this book. All trademarks referred to in the book are acknowledged as properties of their respective owners, but BPB Publications cannot guarantee the accuracy of this information.
Dedicated to

My dad, Chandra Shekhar Raju – Sireesha Muppala

My wife and sons – Randy DeFauw

My parents, Gholamreza Sojoodi and Nahideh Shojaei – Sina Sojoodi
About the Authors

Sireesha Muppala is a visionary technology leader with over 25 years of experience driving digital transformation and unlocking the potential of cutting-edge innovations. As a Senior Solutions Architecture Leader at Amazon Web Services (AWS), Sireesha spearheads efforts to empower businesses with the transformative power of generative AI and other emerging technologies. She currently leads a team of dedicated technologists at AWS, focused on partnering with leading organizations in the automotive and manufacturing sectors to unlock the full potential of generative AI and other cutting-edge cloud-based technologies. Sireesha's remarkable career has been fueled by her passion for technology and her keen ability to identify disruptive trends. Sireesha received her Ph.D. in computer science from the University of Colorado, Colorado Springs. Prior to joining AWS, she held leadership roles at organizations of all sizes across diverse industries, including Oracle, Blackhawk Network, and Primer AI, where she was instrumental in developing and implementing groundbreaking solutions that addressed complex business challenges. Sireesha's technical expertise is complemented by her strategic vision and collaborative approach, which have earned her a reputation as a trusted advisor to C-suite executives and industry leaders. An accomplished author and sought-after speaker, Sireesha has authored numerous research papers, white papers, and the acclaimed book "Amazon SageMaker Best Practices". She frequently shares her insights on the transformative power of emerging technologies at industry conferences and events, inspiring audiences with her deep understanding of the evolving technology landscape and its impact on the future of business. Beyond her professional achievements, Sireesha is a passionate advocate for STEM education and for empowering underrepresented communities in the technology sector. She is a co-founder of the Denver chapter of Women in Big Data, where she leverages her expertise to create meaningful change and inspire the next generation of innovators. With her unwavering commitment to innovation and her ability to translate complex technical concepts into practical business solutions, Sireesha Muppala is poised to continue leading the charge in the ever-evolving world of emerging technologies at AWS, driving transformative change and unlocking new frontiers of growth for organizations.

Randy DeFauw is a technologist at heart. While majoring in electrical engineering at the University of Michigan, he took part in an autonomous vehicle research program. That started his journey into signal processing and what we now call computer vision. Randy applied his image processing expertise on a major defense program in San Diego before deciding to pursue more customer-facing roles. Randy worked in consulting, product management, and architecture for several years. In the course of that journey, he worked at a startup that was making distributed storage technology for big data systems. That role reignited his passion for working with data and gave him a solid grounding in modern distributed system design. At this point he picked up an MBA to give him the business grounding to go along with his two electrical engineering degrees. He then led a private cloud architecture team on a multi-year project. The intersection of data and cloud was the ideal spot for him, as the cloud unlocked so much potential for data and ML projects. Randy then moved to AWS, where he has helped customers architect a variety of cloud applications. He focuses on the data ecosystem, from big data to machine learning and now generative AI. Randy likes to focus on unsolved problem areas, like LLM interpretation and solving operations research problems through reinforcement learning. He publishes frequently in various channels and often authors sample code published on GitHub. He is a trusted technical advisor to customers and still a hands-on practitioner. Since his days as a teaching assistant in university, Randy has never lost his enthusiasm for helping others learn. He works closely with early-career colleagues, mentors at the high school and college level, and is an advisor to a Colorado K-12 education program on STEM initiatives.

Sina Sojoodi is a technology executive and innovator with nearly two decades of experience driving digital transformation across global enterprises. As the co-founder and CTO of 8090, he leads the development of AI-powered software solutions that target improving business efficiency by 80% and reducing costs by 90%. His entrepreneurial achievements include founding, or being a founding team member of, multiple ventures, leading to two M&A exits and an IPO. He holds patents in push notification activation, dynamic rendering for software applications, and image annotation systems. At AWS, Sina advised Fortune 500 companies, government agencies, and leading technology startups, implementing AI solutions across regulated and unregulated environments. His work spans speech recognition, conversational AI, MLOps, and software transformation initiatives. He serves as a board and executive advisor to B2B software companies and startups, guiding their integration of generative AI into core products and services. Prior to AWS, Sina served as a Global Field CTO for application and data architecture at VMware, advising C-level executives on cloud and application modernization strategies. He joined VMware through the Pivotal Software acquisition. At Pivotal, he served in multiple roles including Field CTO and Director of Product and Engineering. He led R&D post-M&A strategies by integrating his team at Xtreme Labs with the Pivotal Cloud Foundry R&D team and expanding Pivotal's platform with innovative mobile backend-as-a-service capabilities. Sina holds a bachelor's degree from the University of Waterloo, where he studied electrical engineering and psychology. This combination shapes his approach to building human-centric technology solutions. His understanding of technical implementation and business strategy makes him a valued advisor for organizations navigating AI adoption and digital transformation.

Through his work at 8090, Sina continues to shape the future of enterprise technology. He focuses on helping organizations achieve new levels of efficiency and cost optimization. His practical approach to AI implementation guides organizations in their transformation and modernization efforts.
About the Reviewers

Pronnoy Goswami is a seasoned software engineer specializing in distributed systems, cloud infrastructure, and observability at scale; he is currently employed at Workday. Previously, at Microsoft Azure Compute, he played a key role in building a low-latency, distributed control plane, enabling seamless container lifecycle management across one of the world's largest cloud platforms. With a strong foundation in DevOps, AI-driven automation, and large-scale optimization, Pronnoy thrives on designing high-performance systems that push the boundaries of scalability and reliability. His expertise spans Kubernetes, ELK, Prometheus, and Terraform, alongside a deep passion for mentoring, technical writing, and shaping the future of cloud computing. Beyond his industry work, Pronnoy is an active technical reviewer, contributing to books and research in cloud computing, AI, and distributed architectures. He holds a Master's degree in Computer Engineering from Virginia Tech and enjoys reading and hiking in his free time.

Vasanthi Govindaraj is a distinguished AI, cloud, and mainframe modernization expert with over 18 years of experience in enterprise technology transformation, artificial intelligence, and system optimization. She specializes in hybrid cloud integration, AI-driven automation, and mainframe modernization, working extensively with Azure, DB2, and large-scale system migrations. Vasanthi has played a pivotal role in modernizing legacy systems across the financial, insurance, and healthcare industries, leading critical projects such as the BANA Migration to MSP Platform and CDHP Lumenos Integration at WellPoint. Her expertise spans enterprise architecture, AI-powered analytics, and cloud-mainframe integration, ensuring operational efficiency, compliance, and scalability. Beyond her industry contributions, Vasanthi is deeply involved in research and has published peer-reviewed papers on AI and ML applications. She actively serves as a judge in hackathons and AI competitions, recognizing groundbreaking innovations. She also won second place in a competitive AI hackathon, showcasing her expertise in model development. As a reviewer and session chair for IEEE conferences, and a mentor in AI and mainframe transformation, she remains at the forefront of research and enterprise modernization. Passionate about leveraging AI, cloud technologies, and automation, she continues to drive next-generation digital transformation, making her a recognized leader in enterprise IT innovation.

Vineet Jaiswal is the Vice President of generative AI at one of India's largest banks. He is a distinguished leader with over 17 years of experience in AI, machine learning, and software development. Recognized for his deep expertise across all major cloud platforms, he has made significant contributions to generative AI, deep learning, computer vision, MLOps, and backend technologies. His work spans collaborations with multiple Fortune 500 clients, driving innovation and scalable solutions. In addition to his professional achievements, he holds a master's degree and has completed various certifications, further solidifying his expertise in the field.
Acknowledgements

We would like to express our sincere gratitude to all those who contributed to the completion of this book. First and foremost, we extend our heartfelt appreciation to our family and friends for their unwavering support and encouragement throughout this journey. Their love and encouragement have been a constant source of motivation.

We are immensely grateful to BPB Publications for their guidance and expertise in bringing this book to fruition. Their support and assistance were invaluable in navigating the complexities of the publishing process. We would also like to acknowledge the reviewers, technical experts, and editors who provided valuable feedback and contributed to the refinement of this manuscript. Their insights and suggestions have significantly enhanced the quality of the book.

Last but not least, we want to express our gratitude to the readers who have shown interest in our book. Your support and encouragement have been deeply appreciated. Thank you to everyone who has played a part in making this book a reality.
Preface

The rapid evolution of generative AI has ushered in a transformative era in technology, reshaping how organizations approach problem-solving across industries. This book arrives at a crucial moment when organizations worldwide are seeking to harness the power of generative AI while navigating its complexities and challenges. As technology professionals witnessing the unprecedented growth of AI capabilities, we recognized the need for a comprehensive guide that bridges the gap between theoretical understanding and practical implementation of generative AI solutions in cloud environments. This book is the result of our collective experience working with diverse organizations, from startups to Fortune 50 companies, in implementing AI solutions at scale.

We have structured this book into four main sections, carefully designed to take readers through the complete journey of building and deploying generative AI solutions. Beginning with the foundations of cloud computing and the evolution of generative AI, we progress through practical aspects of implementation, including prompt engineering, model fine-tuning, and the critical considerations of security and governance. Our approach combines conceptual discussions with functional examples, using Amazon Web Services (AWS) as our primary platform to demonstrate real-world applications. We've included working code examples throughout the book, ensuring readers can immediately apply what they learn. Special attention has been paid to crucial aspects such as responsible AI implementation, security considerations, and operational best practices.

This book is written for cloud architects, data analysts, data scientists, and AI professionals who need to understand and implement generative AI solutions in production environments. While basic knowledge of cloud computing and machine learning is helpful, we have ensured the content is accessible to those new to generative AI while still providing advanced insights for experienced practitioners. Each chapter builds upon the previous one, creating a comprehensive learning journey that covers everything from basic concepts to advanced topics like Retrieval Augmented Generation (RAG), agentic workflows, and the future trends shaping this field. We've included practical examples, case studies, and best practices drawn from real-world implementations to help readers understand not just the how but also the why behind each concept.

We hope this book serves as both a practical guide and a reference for professionals working to implement generative AI solutions in the cloud. Our goal is to empower readers with the knowledge and tools they need to build secure, scalable, and responsible generative AI applications that drive real business value. Whether you are just beginning your journey with generative AI or looking to enhance your existing knowledge, we trust this book will serve as a valuable resource in your professional development.
Chapter 1: Cloud Computing - This chapter provides foundational knowledge of cloud computing. Concepts of shared infrastructure, elasticity, and the pay-per-use pricing model will be introduced, along with a discussion of various deployment models. Considerations for leveraging cloud platforms, such as availability, reliability, latency, and data privacy/security, will be explored. For readers who are new to cloud computing, this chapter will introduce the foundational concepts of cloud and when to leverage various types of cloud service and deployment models. For readers already familiar with cloud computing, this chapter will serve as a review of these concepts.

Chapter 2: Evolution of Generative AI - This chapter will provide an overview of the key milestones and innovations that have propelled the evolution of generative AI, tracing its roots from natural language processing to the key transformer breakthroughs. Readers will gain insight into how the AI field evolved into modern-day generative AI through a combination of theoretical discussion and practical results from the powerful foundational models underlying generative AI. The chapter will further touch on potential pitfalls and what's coming next in the field.

Chapter 3: Cloud Computing and Generative AI - This chapter explores how cloud computing enables new opportunities for generative AI applications and services. As cloud platforms provide vast amounts of compute resources on demand, they allow AI techniques like deep learning to be applied at large scale. We discuss how generative AI as a service running in the cloud can democratize access to powerful AI technologies. The chapter will discuss how cloud services like compute instances, storage, databases, and serverless functions support building and deploying AI models. Readers will get a taste of a few popular generative AI applications hosted on the cloud.

Chapter 4: Generative AI Stack - A generative AI model is just one component of a complete solution. This chapter will present a full-stack generative AI solution design. A typical technology stack needed to implement and run generative AI applications in a cloud environment is introduced. This chapter will provide architects and developers with a foundation for understanding the different layers involved in building and operating generative AI solutions at scale in the cloud. Readers will further gain insight into the skill sets and tools to be considered, from monitoring to security to model playgrounds.

Chapter 5: Design Components, Model Selection, Evaluation, and Model Playgrounds - This chapter discusses key components to consider when designing generative AI systems for cloud-based solutions. It covers choosing the optimal AI model based on business requirements, capabilities, data availability, and other factors. The chapter also explores techniques for evaluating generative models, such as accuracy metrics, human evaluation, and model comparisons. In addition, it discusses model playgrounds - online environments that allow users to interact with and experiment with different generative models before selecting one for production.

Chapter 6: Prompt Engineering - Prompt engineering is the art and science of crafting prompts that elicit coherent, truthful, and helpful responses from generative models. This chapter will discuss best practices for prompt construction, both for performance and for security. Readers will be guided through techniques for prompt augmentation and calibration so they can steer the outputs of generative AI applications.

Chapter 7: Retrieval Augmented Generation - After gaining expertise with prompt engineering, readers will use this chapter to customize foundation models using their own data. This chapter will start looking at integrating data sources through the Retrieval Augmented Generation (RAG) architecture, including vector databases and data ingestion. RAG models leverage a retriever to access external documents that help provide relevant context and examples to supplement the generative capabilities of the language model.

Chapter 8: Advanced Model Fine-tuning Techniques - This chapter will cover advanced techniques for fine-tuning generative AI models, which allow the models to adapt their abilities while retaining their pre-trained knowledge. Discussion will include efficient task-specific fine-tuning using PEFT techniques, as well as integrating agents and tools. Readers will explore a hands-on example and fine-tune a pre-trained foundation model on the AWS cloud.

Chapter 9: Model Hosting and Application Frameworks - As generative models become more advanced, frameworks are needed to safely and securely serve models at scale. Going beyond simple development deployments, this chapter will start to look at the components needed to run generative AI solutions in production. This includes efficient model hosting and application frameworks, plus performance management.

Chapter 10: Agentic Workflows - As generative AI models become increasingly sophisticated, organizations