Week |
Date |
Topic |
Invited Speaker |
Reading ("response" indicates that a reading response is required for one of the two articles) |
Due |
1 |
3/4 |
Introduction & Course Overview
[slide]
|
윤상두 |
Please read the updated course syllabus and ask any questions you may have. |
|
2 |
3/11 |
Representation learning in computer vision
Session 1: Backbone architectures for computer vision
[slide]
|
허병호 |
(1) response 1 Kornblith, Simon, et al. "Do Better ImageNet Models Transfer Better?", CVPR 2019
(2) response 1 Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." ICLR 2021
Recommended reading
|
3/9 |
2 |
3/11 |
Representation learning in computer vision
Session 2: Training strong and robust vision models
[slide]
|
윤상두 |
(1) response 2 Zhang, Hongyi, et al. "mixup: Beyond Empirical Risk Minimization.", ICLR 2018
(2) response 2 Shankar, Vaishaal, et al. "Evaluating Machine Accuracy on ImageNet.", ICML 2020
Recommended reading
- He, Tong, et al. "Bag of tricks for image classification with convolutional neural networks.", CVPR 2019
- Wightman, Ross, et al. "ResNet strikes back: An improved training procedure in timm.", arXiv 2021
- Yun, Sangdoo, et al. "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features.", ICCV 2019
- Yun, Sangdoo, et al. "Re-labeling ImageNet: from single to multi-labels, from global to localized labels.", CVPR 2021
|
3/9 |
3 |
3/18 |
Multimodal representation learning
Session 1: Multimodal deep learning |
김진화 |
(1) response 1 Kim, Jin-Hwa, Jaehyun Jun, and Byoung-Tak Zhang. "Bilinear attention networks.", NeurIPS 2018
(2) response 1 Anderson, Peter, et al. "Bottom-up and top-down attention for image captioning and visual question answering.", CVPR 2018
Recommended reading
|
3/16 |
3 |
3/18 |
Multimodal representation learning
Session 2: Vision-and-Language Pre-training |
김원재 |
(1) response 2 Lu, Jiasen, et al. "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks.", NeurIPS 2019
(2) response 2 Kim, Wonjae, et al. "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision.", ICML 2021
Recommended reading
|
3/16 |
4 |
3/25 |
Generative models Session 1: Unsupervised representation learning for class clustering |
김윤지 |
(1) response 1 Ji, Xu, et al. "Invariant Information Clustering for Unsupervised Image Classification and Segmentation.", ICCV 2019
(2) response 1 Van Gansbeke, Wouter, et al. "SCAN: Learning to Classify Images without Labels.", ECCV 2020
Recommended reading
- Chen, Xi, et al. "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets.", NeurIPS 2016
- Singh, Krishna Kumar, et al. "FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery.", CVPR 2019
- Kim, Yunji and Ha, Jung-Woo. "Contrastive Fine-grained Class Clustering via Generative Adversarial Networks.", ICLR 2022
|
|
4 |
3/25 |
Generative models Session 2: How to improve the Generators in GANs? |
김준호 |
(1) response 2 Kang, Minguk and Park, Jaesik. "ContraGAN: Contrastive Learning for Conditional Image Generation." NeurIPS 2020
(2) response 2 Liu, Bingchen, et al. "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis." ICLR 2021
Recommended reading
- Zhao, Long, et al. "Improved Transformer for High-Resolution GANs.", NeurIPS 2021
- Karras, Tero, et al. "Analyzing and Improving the Image Quality of StyleGAN.", CVPR 2020
- Zhang, Han, et al. "Consistency Regularization for Generative Adversarial Networks.", ICLR 2020
- Kim, Junho, et al. "Feature Statistics Mixing Regularization for Generative Adversarial Networks.", arXiv 2021
|
|
5 |
4/1 |
Towards reliable machine learning
Session 1: Threats of untrustworthy AI: understanding shortcut learning through a case study |
전상혁 |
(1) response 1 Brendel, et al. "Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet", ICLR 2019
(2) response 1 Geirhos, et al. "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness", ICLR 2019
Recommended reading
|
|
5 |
4/1 |
Towards reliable machine learning
Session 2: Towards reliable machine learning: through the lens of cross-bias generalization and domain generalization |
전상혁 |
(1) response 2 Madry, et al. "Towards Deep Learning Models Resistant to Adversarial Attacks", ICLR 2018
(2) response 2 Ganin, et al. "Domain-Adversarial Training of Neural Networks", JMLR 2016
Recommended reading
|
|
6 |
4/8 |
Practical scenarios and applications in computer vision
Session 1: Face recognition: research to product
|
유영준 |
(1) response 1 An, Xiang, et al. "Partial FC: Training 10 Million Identities on a Single Machine.", ICCV 2021
(2) response 1 Sculley, David, et al. "Hidden technical debt in machine learning systems.", NeurIPS 2015
Recommended reading
|
|
6 |
4/8 |
Practical scenarios and applications in computer vision
Session 2: Video AI and applications
|
위동윤 |
(1) response 2 Feichtenhofer, Christoph, et al. "SlowFast Networks for Video Recognition.", ICCV 2019
(2) response 2 Wang, Xiaolong, et al. "Non-local Neural Networks.", CVPR 2018
Recommended reading
- Carreira, Joao and Zisserman, Andrew. "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset.", CVPR 2017
- Gu, Chunhui, et al. "AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions.", CVPR 2018
- Kim, Jinhyung, et al. "Regularization on Spatio-Temporally Smoothed Feature for Action Recognition.", CVPR 2020
|
|
7 |
4/15 |
Practical scenarios and applications in computer vision
Session 1: All about CLOVA OCR
|
백영민 |
(1) response 1 Kittenplon, Yair, et al. "Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer.", arXiv 2022
(2) response 1 Baek, Youngmin, et al. "Character region awareness for text detection.", CVPR 2019
Recommended reading
|
|
7 |
4/15 |
Practical scenarios and applications in computer vision
Session 2: AI for generating handwriting
|
이바도 |
(1) response 2 Cha, Junbum, et al. "Few-shot Compositional Font Generation with Dual Memory.", ECCV 2020
(2) response 2 Park, Song, et al. "Few-shot Font Generation with Localized Style Representations and Factorization.", AAAI 2021
Recommended reading
|
|
8 |
4/22 |
No invited talk - Student presentations |
|
TBD |
|
9 |
4/29 |
Speech recognition and applications Session 1: Introduction to End-to-End Speech Recognition |
정남규 |
(1) response 1 Gulati, Anmol, et al. "Conformer: Convolution-augmented Transformer for Speech Recognition.", Interspeech 2020
(2) response 1 Han, Wei, et al. "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.", Interspeech 2020
Recommended reading
|
|
9 |
4/29 |
Speech recognition and applications Session 2: Self-supervised End-to-End Speech Recognition |
김한규 |
(1) response 2 Hsu, Wei-Ning, et al. "HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units.", IEEE/ACM Transactions on Audio, Speech, and Language Processing 2021
(2) response 2 Chung, Yu-An, et al. "W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.", arXiv 2021
Recommended reading
|
|
10 |
5/6 |
Voice synthesis and applications Session 1
| 송은우 |
(1) response 1 Shen, Jonathan, et al. "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.", ICASSP 2018
(2) response 1 Ren, Yi, et al. "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.", ICLR 2021
Recommended reading
- Wang, Yuxuan, et al. "Tacotron: Towards End-to-End Speech Synthesis.", Interspeech 2017
- Li, Naihan, et al. "Neural Speech Synthesis with Transformer Network.", AAAI 2019
- Ren, Yi, et al. "FastSpeech: Fast, Robust and Controllable Text to Speech.", NeurIPS 2019
|
|
10 |
5/6 |
Voice synthesis and applications Session 2
| 황민제 |
(1) response 2 Kumar, Kundan, et al. "MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis.", NeurIPS 2019
(2) response 2 Yamamoto, Ryuichi, et al. "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram.", ICASSP 2020
Recommended reading
|
|
11 |
5/13 |
Large-scale user modeling and its applications Session 1 |
곽하녹 |
(1) response 1 Shin, et al. "Scaling Law for Recommendation Models: Towards General-purpose User Representations", arXiv 2021
(2) response 1 Shin, et al. "One4all user representation for recommender systems in e-commerce", arXiv 2021
|
|
11 |
5/13 |
Large-scale user modeling and its applications Session 2 |
정지수 |
(1) response 2 Hsieh, et al. "Collaborative Metric Learning.", WWW 2017
(2) response 2 Kim, Boseop, et al. "What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers.", EMNLP 2021
Recommended reading
|
|
12 |
5/20 |
AutoML and Practical MLOps Session 1
| 김지훈 |
(1) response 1 Real, Esteban, et al. "AutoML-Zero: Evolving Machine Learning Algorithms From Scratch.", ICML 2020
(2) response 1 Falkner, Stefan, et al. "BOHB: Robust and Efficient Hyperparameter Optimization at Scale.", ICML 2018
Recommended reading
Neural Architecture Search:
- Liu, Hanxiao, et al. "DARTS: Differentiable Architecture Search.", ICLR 2019
- Dong, XuanYi and Yang, Yi. "NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search.", ICLR 2020
Hyperparameter Optimization:
- Li, Lisha, et al. "Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization.", JMLR 2018
- Jaderberg, Max, et al. "Population Based Training of Neural Networks.", arXiv 2017
|
|
12 |
5/20 |
AutoML and Practical MLOps Session 2
| 서동필 |
No reading for this session
|
|
13 |
5/27 |
NLP, Dialogues, and QA Session 1 |
이상우 |
(1) response 1 Devlin, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.", NAACL 2019.
(2) response 1 Raffel, et al. "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.", JMLR 2020.
Recommended reading
- Radford, Alec, et al. "Language Models are Unsupervised Multitask Learners.", OpenAI 2019
- Yoo, Kang Min, et al. "GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation.", EMNLP Findings 2021
- Kim, Sungdong, et al. "NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-Based Simulation.", ACL 2021
|
|
13 |
5/27 |
NLP, Dialogues, and QA Session 2 |
김성동 |
(1) response 2 Roller, Stephen, et al. "Recipes for building an open-domain chatbot.", EACL 2021
(2) response 2 Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", NeurIPS 2020
Recommended reading
- Izacard and Grave. "Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering.", EACL 2021
- Shuster, Kurt, et al. "Retrieval Augmentation Reduces Hallucination in Conversation.", EMNLP Findings 2021
- Xu, Jing, et al. "Beyond Goldfish Memory: Long-Term Open-Domain Conversation.", arXiv 2021
- Borgeaud, Sebastian, et al. "Improving language models by retrieving from trillions of tokens.", arXiv 2021
- Kim, Sungdong and Kim, Gangwoo. "Saving Dense Retriever from Shortcut Dependency in Conversational Search.", arXiv 2022
|
|
14 |
6/3 |
Hyperscale LM & NLP applications Session 1 |
이기창 |
(1) response 1 Brown, et al. "Language Models are Few-Shot Learners.", NeurIPS 2020
(2) response 1 Rae, et al. "Scaling Language Models: Methods, Analysis & Insights from Training Gopher.", arXiv 2021.
Recommended reading
|
|
14 |
6/3 |
Hyperscale LM & NLP applications Session 2 |
유강민 |
(1) response 2 Lester, Brian, et al. "The Power of Scale for Parameter-Efficient Prompt Tuning.", EMNLP 2021
(2) response 2 Li, Xiang Lisa, and Liang, Percy. "Prefix-Tuning: Optimizing Continuous Prompts for Generation.", arXiv 2021
Recommended reading
- He, Junxian, et al. "Towards a Unified View of Parameter-Efficient Transfer Learning.", ICLR 2022
- J. Hu, Edward, et al. "LoRA: Low-Rank Adaptation of Large Language Models.", arXiv 2021
- Schick, Timo and Schütze, Hinrich. "It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners.", NAACL 2021
- Ouyang, Long, et al. "Training language models to follow instructions with human feedback." (InstructGPT), OpenAI Blog 2022
|
|
15 |
6/10 |
Human-centric NLP Session 1 |
이화란 |
(1) response 1 Dinan, Emily, et al. "Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation.", EMNLP 2020
(2) response 1 Perez, Ethan, et al. "Red Teaming Language Models with Language Models.", arXiv 2022.
Recommended reading
- Bender, Emily M., et al. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜", FAccT 2021
- Liu, Haochen, et al. "Does Gender Matter? Towards Fairness in Dialogue Systems.", COLING 2020
- Liu, Haochen, et al. "Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning.", EMNLP 2020
- Sheng, Emily, et al. "“Nice Try, Kiddo”: Investigating Ad Hominems in Dialogue Responses.", NAACL 2021
- Ma, Xinyao, et al. "PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction.", EMNLP 2020
- Xu, Albert, et al. "Detoxifying Language Models Risks Marginalizing Minority Voices.", NAACL 2021
- OpenAI. "WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing.", OpenAI Blog 2021
|
|
15 |
6/10 |
Human-centric NLP Session 2 |
정준영, 이민아 |
(1) response 2 Chung, John Joon Young, et al. "TaleBrush: Sketching Stories with Generative Pretrained Language Models.", CHI 2022
(2) response 2 Lee, Mina, Percy Liang, and Qian Yang. "CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities.", CHI 2022
Recommended reading
- Clark, Elizabeth, et al. "Creative writing with a machine in the loop: Case studies on slogans and stories.", IUI 2018
- Singh, Nikhil, et al. "Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence.", ToCHI 2022
- Krause, Ben, et al. "GeDi: Generative Discriminator Guided Sequence Generation.", EMNLP Findings 2021
- Qian, Jing et al. "Controllable Natural Language Generation with Contrastive Prefixes.", arXiv 2022
- Buschek, Daniel, Martin Zürn, and Malin Eiband. "The impact of multiple parallel phrase suggestions on email input and composition behaviour of native and non-native English writers.", CHI 2021
- Calderwood, Alex, et al. "How Novelists Use Generative Language Models: An Exploratory User Study.", HAI-GEN+user2agent Workshop at IUI 2020
| |
16 |
6/17 |
No invited talk - Student presentations |
|
TBD |
|