BERT - Bidirectional Encoder Representations from Transformers
October 11, 2018
Pre-trains bidirectional representations by jointly conditioning on both left and right context
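
To make "both left and right context" concrete, here is a minimal sketch of which tokens are visible when predicting one position, compared with a left-to-right model. It is only an illustration of the idea, not BERT's implementation; the function name `context_for` and the example sentence are assumptions made up for this sketch.

```python
def context_for(tokens, index, bidirectional=True):
    """Return the (left, right) context available when predicting tokens[index].

    Hypothetical helper for illustration only. A left-to-right model sees
    only the tokens before the position; a bidirectional model sees the
    tokens on both sides.
    """
    left = tokens[:index]
    right = tokens[index + 1:] if bidirectional else []
    return left, right


tokens = ["the", "bank", "of", "the", "river"]

# Bidirectional: "bank" can be interpreted using "river" on its right.
both_sides = context_for(tokens, 1, bidirectional=True)

# Left-to-right: only "the" is available, so the word stays ambiguous.
left_only = context_for(tokens, 1, bidirectional=False)

print(both_sides)
print(left_only)
```

In the bidirectional case the right-hand words ("of the river") are available to disambiguate "bank", which is the benefit the summary line refers to.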