Pere Martra is an ML Research Engineer specializing in post-training, compression, and alignment of large language models (LLMs). His work aims to bridge the gap between academic research and engineering practice. The main goal of his research is to create specialized small language models (SLMs) that achieve high performance with significantly fewer computational resources.
He is currently working on a book, "Rearchitecting LLMs," for Manning Publications (MEAP: Q1 2026; publication: Q2 2026). The book focuses on advanced optimization methods for language models that go beyond traditional fine-tuning. Previously, he published the book "Large Language Models Projects" (Apress, 2024), dedicated to the practical application of LLMs.
His key technical areas include model architecture optimization, the development of efficient pipelines for creating specialized models, and research on structural pruning. In a preprint (December 2025), he explores systematic pruning in GLU architectures, demonstrating how structural optimization can improve model capabilities such as instruction following.
He is also the creator of OptiPfair, an open-source library for detecting and reducing bias at the level of individual neuron components in models.
With over twenty years of experience in technical leadership, he is currently focused on the development of efficient and ethical AI systems. He is also active in the professional community: he is the author of the "Large Language Model Notebooks" course (over 1,800 stars on GitHub) and participates in the Hugging Face and Google Gemini ecosystems.
He is open to collaboration on research and engineering projects related to model efficiency, compression, and the responsible development of artificial intelligence.