studio.heelab
[DL for CV] Introduction
Lecture: https://www.youtube.com/playlist?list=PLoROMvodv4rOmsNzYBMe0gJY2XS8AQg16
Stanford CS231N Deep Learning for Computer Vision I 2025
Lecture 1: Introduction
Agenda
- A brief history of CV and DL
1. The Essence and Historical Origins of Computer Vision

1959 Hubel & Wiesel
1963 Roberts
1970s David Marr
1979 Generalized Cylinders - Recognition via Parts
1986 Canny - Recognition via Edge Detection
1990s Recognition via Grouping
2000s Recognition via Matching, Face Detection, PASCAL Visual Object Challenge
2006 Deep Learning
Visual recognition is a fundamental task for visual intelligence.
2. The Deep Learning Revolution and the Importance of Data
ImageNet dataset
A major reason early neural networks failed to recognize complex real-world images was a lack of data. Professor Fei-Fei Li's team proved the decisive role of data in machine learning by constructing the ImageNet dataset, containing 15 million images.
AlexNet 2012: DL
The modern deep learning revolution began in earnest when AlexNet won the ImageNet challenge by an overwhelming margin. This was the result of combining sophisticated algorithms (Backpropagation), powerful computing resources (GPUs), and large-scale data.
2012 to Present: DL Explosion
images, video, human motion
3. Tasks and Applications of Modern Computer Vision
The lecture introduces various visual tasks that go beyond simple image classification:
- Expansion of Visual Understanding: Includes technologies for precise object identification such as Object Detection, Semantic Segmentation, and Instance Segmentation.
- Generative AI and Multimodal: Covers generative models like DALL-E (text-to-image), Style Transfer, and models combining vision with language.
- Future Technologies: 3D Reconstruction, video understanding, and Embodied AI integrated with robotics are mentioned as next-generation core technologies.
4. Human-Centered AI and Responsibility
- Social Impact: Since AI models learn from data created by human activity, there is a risk that they reflect human biases.
- Positive Applications: It is crucial to utilize computer vision to improve human life, such as in medical imaging analysis and elderly care.
5. Course Overview
Deep Learning Basics (Lectures 2-4)
Image Classification: A core task in CV
- Linear classification, optimization, regularization, and basic principles of neural networks.
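To make the four topics above concrete, here is a minimal sketch in plain Python. It is not taken from the lecture: the function names (`scores`, `softmax`, `loss`, `sgd_step`), the toy weights, and the choice of softmax cross-entropy with an L2 penalty are all assumptions made for illustration. It shows a linear score function (linear classification), a loss with a regularization term, and one gradient-descent update (optimization).

```python
import math

def scores(W, b, x):
    """Linear score function: one score per class, s = W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(s):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(s)
    exps = [math.exp(v - m) for v in s]
    total = sum(exps)
    return [e / total for e in exps]

def loss(W, b, x, y, reg=0.0):
    """Cross-entropy loss for true class y plus an L2 penalty on W."""
    p = softmax(scores(W, b, x))
    l2 = sum(w * w for row in W for w in row)
    return -math.log(p[y]) + reg * l2

def sgd_step(W, b, x, y, lr=0.1, reg=0.0):
    """One gradient-descent step on a single (x, y) example."""
    p = softmax(scores(W, b, x))
    new_W, new_b = [], []
    for i, (row, bi) in enumerate(zip(W, b)):
        d = p[i] - (1.0 if i == y else 0.0)  # gradient of the loss w.r.t. score i
        new_W.append([w - lr * (d * xj + 2 * reg * w) for w, xj in zip(row, x)])
        new_b.append(bi - lr * d)
    return new_W, new_b

# Hypothetical toy problem: 3 classes, 2 input features, true class 2.
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]
x, y = [1.0, 2.0], 2

before = loss(W, b, x, y)
W, b = sgd_step(W, b, x, y)
after = loss(W, b, x, y)
print(before, after)  # the loss on this example drops after one step
```

With all-zero weights every class gets the same probability, so the starting loss is log(3); a single update moves the weights toward the true class and lowers the loss. A neural network replaces the single linear map with several stacked layers, but the loss and update loop keep the same shape.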
Perceiving and Understanding the Visual World (Lectures 5-12)
Tasks Beyond Image Classification
classification -> semantic segmentation -> object detection -> instance segmentation
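One way to see how these four tasks differ is to compare the shape of their outputs. The sketch below is an illustration only, not material from the lecture: the 4x4 toy image, the dictionary layout, and the helper names (`to_semantic`, `to_boxes`, `to_image_labels`) are all hypothetical. It starts from instance masks, the most detailed output, and derives the other three from them.

```python
# Toy 4x4 "image" containing two cat instances; all values are made up.
H, W = 4, 4

# Instance segmentation: one label and one pixel mask per object.
instances = [
    {"label": "cat", "mask": {(0, 0), (0, 1), (1, 0), (1, 1)}},
    {"label": "cat", "mask": {(2, 2), (2, 3), (3, 3)}},
]

def to_semantic(instances, h, w, background="bg"):
    """Semantic segmentation: one class label per pixel, instances merged."""
    grid = [[background] * w for _ in range(h)]
    for inst in instances:
        for r, c in inst["mask"]:
            grid[r][c] = inst["label"]
    return grid

def to_boxes(instances):
    """Object detection: one (label, box) per object.

    Each box is (top, left, bottom, right) in inclusive pixel coordinates.
    """
    out = []
    for inst in instances:
        rows = [r for r, _ in inst["mask"]]
        cols = [c for _, c in inst["mask"]]
        out.append((inst["label"], (min(rows), min(cols), max(rows), max(cols))))
    return out

def to_image_labels(instances):
    """Classification: only which classes appear in the image, no locations."""
    return sorted({inst["label"] for inst in instances})

print(to_image_labels(instances))   # image-level labels
print(to_boxes(instances))          # one box per object
for row in to_semantic(instances, H, W):
    print(row)                      # per-pixel labels; the two cats are indistinguishable
```

Classification keeps only the class names, detection adds a box per object, semantic segmentation labels every pixel but cannot tell the two cats apart, and instance segmentation keeps a separate mask for each object.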


Models Beyond Multi-layer Perceptron (Lectures 13-17)
CNN (Convolutional Neural Network)

RNN (Recurrent Neural Network)

Attention mechanism / Transformers

Generative and Interactive Visual Intelligence
Self-supervised Learning

Generative Modeling
using diffusion models
Vision Language Models

3D Vision
Human-Centered Applications and Implications (Lecture 18)