Video accessible from your Account page after purchase.
10+ Hours of Video Instruction
Overview
Learn how modern LLMs power and perform natural language processing.
Introduction to Transformer Models for NLP, Second Edition is designed to provide you with a deep understanding of how modern LLMs process natural language.
Lesson Descriptions
Lesson 1, Introduction to Attention and Language Models: Lesson 1 lays the groundwork for the entire course by tracing the path from traditional NLP to the attention revolution. Sinan explores how attention mechanisms solve the bottlenecks of earlier architectures. He also examines the encoder-decoder framework that enabled sequence-to-sequence learning and explains how language models fundamentally process text.
Lesson 2, How Transformers Use Attention to Process Text: In Lesson 2, Sinan dives into the mechanics that power modern LLMs. He breaks down tokenization and embeddings, how text becomes numbers, and then explores scaled dot-product attention, the elegant computation at the transformer's core. You learn how multi-headed attention captures diverse relationships, how masked attention mechanisms enable autoregressive generation, and how modern advancements push attention mechanisms even further.
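To make scaled dot-product attention concrete, here is a minimal NumPy sketch of the computation Lesson 2 describes. All function and variable names below are illustrative, not taken from the course materials:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V, the core transformer operation."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # similarity of each query to each key
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block disallowed positions
    # Row-wise softmax (subtracting the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Three toy token vectors with d_k = 4
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)  # self-attention: Q = K = V
```

With Q, K, and V derived from the same token matrix, this is self-attention: each output row is a weighted mix of all token vectors, and each row of the attention weights sums to 1.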
Lesson 3, LLM Pre-Training and Recipes: Lesson 3 examines how raw neural networks become capable language models. Sinan walks you through the three-phase training pipeline and then contrasts BERT's masked language modeling with GPT's next-token prediction, two fundamentally different approaches that shape the field of natural language processing. You'll trace BERT's evolution to ModernBERT and explore the scaling laws that guide today's industry toward frontier systems.
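The two pre-training objectives Lesson 3 contrasts can be sketched in a few lines of Python. The token ids and mask id below are made up purely for illustration:

```python
# Toy token ids standing in for a tokenized sentence (ids are invented).
tokens = [5, 12, 7, 9, 3]
MASK_ID = 0

# GPT-style next-token prediction: the input is the sequence, and the
# target is the same sequence shifted left by one position.
gpt_inputs = tokens[:-1]   # [5, 12, 7, 9]
gpt_targets = tokens[1:]   # [12, 7, 9, 3]

# BERT-style masked language modeling: hide some positions and ask the
# model to recover only the hidden tokens.
masked_positions = [1, 3]
mlm_inputs = [MASK_ID if i in masked_positions else t
              for i, t in enumerate(tokens)]           # [5, 0, 7, 0, 3]
mlm_targets = {i: tokens[i] for i in masked_positions}  # {1: 12, 3: 9}
```

The shifted targets force GPT to predict left to right, while the masked targets let BERT attend to context on both sides of each gap.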
Lesson 4, Natural Language Generation with GPT and More: This lesson maps the generative LLM landscape. Sinan surveys the closed-source leaders, including GPT, Claude, and Gemini, examining their capabilities, pricing, and positioning. You'll then explore the open-weight ecosystem: Llama, Qwen, DeepSeek, Mistral, Kimi, and other models that have rapidly closed the performance gap. You will leave with the right context to choose the right model for your applications.
Lesson 5, Prompt Engineering as Craft: This lesson elevates prompting from casual instruction writing to systematic practice. Sinan explores why prompting remains underrated, even as models improve. Then he covers structured outputs and prompt chaining for complex workflows. You'll master chain-of-thought and few-shot techniques and learn how inference parameters like temperature and top-p shape model behavior. Prompting well is a skill.
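As a rough sketch of how the temperature and top-p parameters mentioned in Lesson 5 interact at inference time, here is a toy sampling function of our own, not code from the course:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Sample a token id after temperature scaling and nucleus (top-p) filtering."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature  # low T sharpens
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Nucleus sampling: keep the smallest set of tokens whose mass reaches top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, top_p) + 1]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    filtered /= filtered.sum()
    return int(rng.choice(len(probs), p=filtered))
```

Lowering the temperature concentrates probability on the top logit (near-greedy decoding), while lowering top-p drops the low-probability tail entirely.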
Lesson 6, Alignment and Post-Training: Lesson 6 explores how base models become helpful assistants. Sinan examines supervised training on instruction data and then dives deep into RLHF, the original preference-optimization method. You learn about DPO, GRPO, and the modern post-training landscape that's making alignment more accessible than ever. The lesson closes by examining the alignment spectrum, from minimal safety to highly constrained behavior.
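For readers who want a feel for the preference-optimization math behind Lesson 6, here is a toy single-pair version of the DPO loss, written from its published formulation; the function name and arguments are ours:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    The margin rewards raising the chosen response's log-likelihood
    relative to a frozen reference model, and lowering the rejected one's.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; favoring the chosen response over the rejected one drives the loss down.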
Lesson 7, Fine-Tuning Fundamentals: Lesson 7 equips you with practical fine-tuning skills. Sinan covers transfer learning principles and then focuses on LoRA and other parameter-efficient methods that make fine-tuning accessible without massive compute spend. You'll learn about model distillation for creating smaller, faster models and develop a framework for choosing the right architecture (encoder, decoder, or encoder-decoder) for your specific task.
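The core idea of LoRA from Lesson 7, a trainable low-rank update added to a frozen weight matrix, fits in a few lines of NumPy. Sizes and names below are toy assumptions, not course code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                          # hidden size and LoRA rank, r << d
W = rng.normal(size=(d, d))          # pretrained weight, frozen during fine-tuning
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized
alpha = 16                           # LoRA scaling hyperparameter

def lora_forward(x):
    """Forward pass with the low-rank update: (W + (alpha/r) * B @ A) @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d)
# With B zero-initialized, the adapted layer starts identical to the base layer,
# and only A and B (2*r*d parameters instead of d*d) are trained.
```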
Lesson 8, Multimodal Transformers: Lesson 8 extends beyond text to vision, audio, and unified multimodal systems. Sinan traces the path from the vision transformer to today's models that seamlessly handle images, text, and more. You explore current multimodal trends and come to understand how the same transformer principles that revolutionized NLP are also transforming how AI perceives and generates across modalities.
Lesson 9, Reasoning Models: Lesson 9 examines the new frontier of AI reasoning. Sinan explores how models like DeepSeek, GPT, and Claude with extended thinking actually work, trading inference speed for deeper problem solving. You learn how to benchmark reasoning capabilities and understand when these slower, more deliberate models justify their additional cost and latency.
Lesson 10, Deploying Transformer Models: Lesson 10 bridges the gap from prototype to production. It covers MLOps fundamentals for AI systems and then dives into quantization formats that enable local inference on consumer hardware. You'll learn about architectural patterns for putting models into production and the infrastructure considerations for serving them reliably at scale.
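The quantization idea behind Lesson 10's local-inference formats can be illustrated with symmetric int8 quantization: store one float scale plus 8-bit codes in place of 32-bit weights. This is a simplified sketch of our own, not the course's code:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization."""
    scale = np.abs(weights).max() / 127.0            # map the largest weight to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)   # ~1/4 the memory of float32 storage
```

Rounding bounds the per-weight error at half a quantization step; production formats refine this with per-channel or per-block scales.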
Lesson 11, RAG Plus Agents: This lesson covers the systems that extend LLMs beyond their training data. You build retrieval-augmented generation (RAG) pipelines from scratch and then explore AI agents for dynamic workloads. You learn about MCP for connecting agents to external tools and how to architect multi-agent systems for complex tasks. Then you learn how to implement the reasoning-and-action (ReAct) agent pattern that powers modern agent loops.
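The retrieval half of a RAG pipeline like the one built in Lesson 11 can be sketched with cosine similarity over document embeddings. The documents and random vectors below are placeholders; in a real pipeline the vectors would come from an embedding model:

```python
import numpy as np

# A hypothetical in-memory document store with made-up embeddings.
docs = ["Paris is the capital of France.",
        "The transformer was introduced in 2017.",
        "LoRA enables parameter-efficient fine-tuning."]
rng = np.random.default_rng(1)
doc_vecs = rng.normal(size=(len(docs), 8))

def retrieve(query_vec, k=1):
    """Return the k documents whose embeddings are most cosine-similar to the query."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

def build_prompt(question, query_vec):
    """Augment the prompt with retrieved context before calling the LLM."""
    context = "\n".join(retrieve(query_vec, k=1))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The augmented prompt grounds the model's answer in retrieved text instead of relying solely on its training data.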
Lesson 12, Evaluating LLMs and AI Systems: This lesson addresses one of the hardest problems in applied AI: knowing whether your AI system actually works. Sinan explores why evaluation is harder than it seems, covers techniques for assessing generative outputs, and examines the LLM-as-a-judge approach alongside human evaluation. You learn how to navigate benchmarks and leaderboards and how to track the production metrics that actually matter, like cost, latency, and reliability.
Lesson 13, The Future of AI: In this final lesson, Sinan looks beyond current paradigms to what might come next. He examines the limits of autoregressive generation and explores alternatives like diffusion LLMs that generate tokens in parallel. You'll learn about world models that predict without generating and gain perspective on the trajectories that will shape AI's next chapter.
Lesson 1: Introduction to Attention and Language Models
1.1 A brief history of modern NLP
1.2 Paying attention with attention
1.3 Encoder-decoder architectures
1.4 How language models look at text
Lesson 2: How Transformers Use Attention to Process Text
2.1 Tokenization and embeddings
2.2 Scaled dot product attention
2.3 Multi-headed attention
2.4 Masked attention
2.5 Modern advancements in attention
Lesson 3: LLM Pre-Training and Recipes
3.1 The LLM training recipe book
3.2 How BERT is pre-trained and the path to ModernBERT
3.3 How GPT is pre-trained: Next token prediction at scale
3.4 Scaling laws and modern pre-training
Lesson 4: Natural Language Generation with GPT and More
4.1 The closed-source generative LLM landscape
4.2 The open-weight generative LLM landscape
Lesson 5: Prompt Engineering as Craft
5.1 Why prompting is still underrated
5.2 Structured outputs and prompt chaining
5.3 Chain-of-thought and few-shot prompting
5.4 LLM prompting and inference parameters
Lesson 6: Alignment and Post-Training
6.1 Supervised post-training
6.2 RLHF: The original preference post-training
6.3 DPO, GRPO, and the modern post-training landscape
6.4 The alignment spectrum
Lesson 7: Fine-Tuning Fundamentals
7.1 Introduction to transfer learning
7.2 LoRA and efficient fine-tuning
7.3 Model distillation
7.4 Choosing the best AI architecture for the task
Lesson 8: Multimodal Transformers
8.1 From ViT to unified multimodal architectures
8.2 Multimodal AI trends
Lesson 9: Reasoning Models
9.1 How reasoning models work
9.2 Benchmarking reasoning models
Lesson 10: Deploying Transformer Models
10.1 Introduction to MLOps
10.2 Quantization formats and local inference
10.3 Putting models in production
Lesson 11: RAG Plus Agents
11.1 Introduction to retrieval augmented generation
11.2 Building a RAG pipeline from scratch
11.3 AI agents for dynamic workloads
11.4 MCP: Connecting agents to the world
11.5 Architecting multi-agent systems
Lesson 12: Evaluating LLMs and AI Systems
12.1 Why evaluation is harder than you think
12.2 LLMs-as-judges versus human evaluation
12.3 Benchmarks and leaderboards
Lesson 13: The Future of AI
13.1 The limits of autoregression
13.2 Diffusion and state space LLMs
13.3 What comes next
