Computer Vision with PyTorch

From Pixels to Predictions: Building Image Classifiers

Part 1: Introduction to Computer Vision

What is Computer Vision?

CV Applications

Part 2: Images as Data

How Computers See Images

Image Tensors in PyTorch

Why Normalize Images?

Part 3: Convolutions — The Core Operation

Why Not Fully Connected Layers?

What is a Convolution?

Convolution Math

Conv2d in PyTorch

Output Size Formula

Part 4: Pooling & More Layers

Pooling Layers

Pooling in PyTorch

Activation & BatchNorm

Part 5: Building a CNN

CNN Architecture Overview

Simple CNN for CIFAR-10

Loading Image Datasets

Training the CNN

Part 6: Data Augmentation

Why Data Augmentation?

Common Augmentations

Part 7: Transfer Learning

What is Transfer Learning?

Using Pre-trained Models

Fine-Tuning Strategies

Part 8: Famous CNN Architectures

Historical CNN Milestones

Available in torchvision

Part 9: Beyond Classification

The CV Landscape

Object Detection

Semantic & Instance Segmentation

Video Understanding

Vision-Language Models (VLMs)

World Models & Physical AI

Vision-Language-Action (VLA) Models

All Interactive Demos

Lecture Summary

Supplementary Resources