"A Survey of Large Language Models"
"Language Models are Few-Shot Learners"
"Spread Your Wings: Falcon 180B is here"
"The Pile: An 800GB Dataset of Diverse Text for Language Modeling"
"Training Compute-Optimal Large Language Models"
"Dropout: A Simple Way to Prevent Neural Networks from Overfitting"