AI Systems Performance Engineering

AI Systems Performance Engineering is a practical, comprehensive guide to improving the performance of AI systems at every level of the infrastructure. Written amid the rapid growth of generative models, the book offers engineers, researchers, and developers a wide range of applied optimization strategies. These strategies help them tune hardware, software, and algorithms together to build robust, scalable, and cost-effective solutions for both training and inference.

About the Author

Chris Fregly, a renowned engineering and product leader in performance optimization, provides a step-by-step guide to transforming complex AI systems into high-performance solutions. The book covers tuning CUDA kernels on GPUs, optimizing PyTorch-based algorithms, and implementing distributed training and inference systems across multiple nodes.

Key Topics Covered

GPU Optimization and Scaling

The book gives special attention to scaling GPU clusters and managing distributed model training jobs so that resources are used efficiently.

High-Performance Inference

Learn about high-performance inference servers and how to reduce latency with modern inference strategies.

Identifying Bottlenecks

Discover how to identify and eliminate performance bottlenecks in complex AI pipelines using industry-leading scaling tools.

Full-Stack Optimization

The book emphasizes applying full-stack approaches to ensure the reliable and stable operation of AI systems.

Conclusion

The book concludes with a detailed checklist of more than 175 ready-to-use optimizations, offering practical insights and tools for designing and optimizing AI systems for maximum throughput and cost efficiency.

About the Author: Chris Fregly

Chris Fregly is a lead solutions architect for generative artificial intelligence at Amazon Web Services (AWS), based in San Francisco, California. He is the co-author of two books published by O'Reilly, Data Science on AWS and Generative AI on AWS, both dedicated to the practical application of machine learning and generative AI in the AWS cloud environment.

Chris also founded the international meetup series Generative AI on AWS, which brings together specialists from around the world. He regularly speaks at leading artificial intelligence and machine learning conferences, including O'Reilly AI, Open Data Science Conference (ODSC), and NVIDIA GPU Technology Conference (GTC), where he shares practical experience in building and scaling AI systems in production environments.
