Transformer
- Title: Attention Is All You Need
- Year: 2017
- Venue: NIPS
- url: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
- github:
- Notes: [Paper] Attention Is All You Need | Transformer
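The core operation described in Attention Is All You Need is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. Below is a minimal NumPy sketch of just that equation; the toy shapes and random inputs are illustrative assumptions, not code from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (seq_q, seq_k) similarity logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted sum of value vectors

# Toy example: 4 positions, d_k = d_v = 8 (sizes chosen only for illustration).
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)
```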
ELMo
- Title: Deep contextualized word representations
- Year: 2018
- Venue: NAACL
- url: https://arxiv.org/pdf/1802.05365.pdf
- github:
GPT
- Title: Improving Language Understanding by Generative Pre-Training
- Year: 2018
- Venue:
- url: https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
- github:
BERT
- Title: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Year: 2018
- Venue:
- url: https://arxiv.org/pdf/1810.04805.pdf
- github: https://github.com/google-research/bert
- Notes: [Paper] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
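BERT is pre-trained with masked language modeling: about 15% of input tokens are selected, and of those 80% become [MASK], 10% become a random token, and 10% stay unchanged, with the model predicting the original tokens. The sketch below shows only that corruption step over plain token-ID lists; `mask_id`, `vocab_size`, and the -100 ignore label are illustrative assumptions, not the google-research/bert code.

```python
import random

def mask_tokens(token_ids, vocab_size, mask_id, mask_prob=0.15):
    """BERT-style MLM corruption: returns (corrupted_ids, labels) where labels
    hold the original id at masked positions and -100 (ignored) elsewhere."""
    corrupted, labels = list(token_ids), [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if random.random() < mask_prob:              # select ~15% of positions
            labels[i] = tok                          # predict the original token here
            r = random.random()
            if r < 0.8:                              # 80% -> [MASK]
                corrupted[i] = mask_id
            elif r < 0.9:                            # 10% -> random token
                corrupted[i] = random.randrange(vocab_size)
            # remaining 10% -> keep the original token unchanged
    return corrupted, labels

# Toy usage with made-up ids (vocab_size=100 and mask_id=99 are assumptions).
print(mask_tokens([5, 17, 42, 8, 23], vocab_size=100, mask_id=99))
```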
GPT-2
- Title: Language Models are Unsupervised Multitask Learners
- Year: 2019
- Venue:
- url: https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
- github: https://github.com/openai/gpt-2
MT-DNN
- Title: Multi-Task Deep Neural Networks for Natural Language Understanding
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1901.11504.pdf
- github: https://github.com/namisan/mt-dnn
RoBERTa
- Title: RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1907.11692.pdf
- github:
XLNet
- Title: XLNet: Generalized Autoregressive Pretraining for Language Understanding
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1906.08237.pdf
- github: https://github.com/zihangdai/xlnet
SpanBERT
- Title: SpanBERT: Improving Pre-training by Representing and Predicting Spans
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1907.10529.pdf
- github: https://github.com/facebookresearch/SpanBERT
StructBERT
- Title: StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1908.04577.pdf
- github:
XLM
- Title: Cross-lingual Language Model Pretraining
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1901.07291.pdf
- github:
MASS
- Title: MASS: Masked Sequence to Sequence Pre-training for Language Generation
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1905.02450.pdf
- github: https://github.com/microsoft/MASS
UniLM
- Title: Unified Language Model Pre-training for Natural Language Understanding and Generation
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1905.03197.pdf
- github: https://github.com/microsoft/unilm
BART
- Title: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1910.13461.pdf
- github:
DistilBERT
- Title: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1910.01108.pdf
- github:
ERNIE
- Title: ERNIE: Enhanced Language Representation with Informative Entities
- Year: 2019
- Venue: ACL
- url: https://arxiv.org/pdf/1905.07129.pdf
- github: https://github.com/thunlp/ERNIE
KnowBert
- Title: Knowledge Enhanced Contextual Word Representations
- Year: 2019
- Venue:
- url: https://arxiv.org/pdf/1909.04164.pdf
- github:
ALBERT
- Title: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- Year: 2020
- Venue: ICLR
- url: https://arxiv.org/pdf/1909.11942.pdf
- github: https://github.com/google-research/albert
GPT-3
- Title: Language Models are Few-Shot Learners
- Year: 2020
- Venue:
- url: https://arxiv.org/pdf/2005.14165.pdf
- github: https://github.com/openai/gpt-3
T5
- Title: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- Year: 2020
- Venue:
- url: https://arxiv.org/pdf/1910.10683.pdf
- github: https://github.com/google-research/text-to-text-transfer-transformer
ELECTRA
- Title: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
- Year: 2020
- Venue: ICLR
- url: https://openreview.net/pdf?id=r1xMH1BtvB
- github: https://github.com/google-research/electra
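ELECTRA swaps BERT's masked-token prediction for replaced token detection: a small generator fills in the masked positions, and a discriminator labels every token as original (0) or replaced (1), where a generator sample that happens to equal the original keeps label 0. A rough sketch of that label construction, with the generator stubbed out by a uniform random sample (an illustrative assumption, not the paper's generator):

```python
import random

def replaced_token_detection_labels(original_ids, masked_positions, vocab_size):
    """Build the discriminator input and 0/1 labels for ELECTRA-style
    replaced token detection. 1 = the token differs from the original."""
    corrupted = list(original_ids)
    labels = [0] * len(original_ids)
    for i in masked_positions:
        sampled = random.randrange(vocab_size)        # stand-in for the generator's sample
        corrupted[i] = sampled
        labels[i] = int(sampled != original_ids[i])   # correct guesses stay labeled original
    return corrupted, labels

# Toy usage: positions 1 and 3 were masked for the generator (ids are made up).
print(replaced_token_detection_labels([5, 17, 42, 8, 23], [1, 3], vocab_size=100))
```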
DeBERTa
- Title: DeBERTa: Decoding-enhanced BERT with Disentangled Attention
- Year: 2021
- Venue:
- url: https://arxiv.org/pdf/2006.03654.pdf
- github: https://github.com/microsoft/DeBERTa
Longformer
- Title: Longformer: The Long-Document Transformer
- Year: 2020
- Venue:
- url: https://arxiv.org/pdf/2004.05150.pdf
- github: https://github.com/allenai/longformer
BigBird
- Title: Big Bird: Transformers for Longer Sequences
- Year: 2020
- Venue: NeurIPS
- url: https://arxiv.org/pdf/2007.14062.pdf
- github: https://github.com/google-research/bigbird