
Exponential-Min and Gumbel-Max
Exponential-min and Gumbel-max tricks for reformulating sampling from a discrete distribution as argmin and argmax, making the sampling operation differentiable.
Exponential-min and Gumbel-max tricks for reformulating sampling from a discrete distribution as argmin and argmax, making the sampling operation differentiable.
This note provides a high-level summary of the progress in large language models (LLMs) covering major milestones from Transformers to ChatGPT. The note serves as a fast-paced recap for readers to catch up on this field quickly.
A quick walk-through of Expectation-Maximization (EM) algorithm and its cousins.
This is the first post of hopefully a series of post walking through diffusion models. This post will introduce the foundations, focusing on two foundational papers, that many other papers built upon.
nanoGPT repo reading notes
This is a quick note to discuss a few topics below related to building LLM-powered products and applications, such as how to let LLM use tools and become autonomous agents, how to incorporate domain adaptation, and the production hurdles.
In this note, we'll take a look at how Auto-GPT work and discuss LLM's ability to do explicit reasoning and to become an autonomous agent. We'll touch upon a few related works such as WebGPT, Toolformer, and Langchain.
This page is a high-level summary / notes of various recent results in language modeling with little explanations
A list of starter resources for Natural Language Processing (NLP), mostly with deep learning.
A literature survey of recent papers on Neural Variational Inference (NVI) and its application in topic modeling.
A high-level summary of various generative models including Variational Autoencoders (VAE), Generative Adverserial Networks (GAN), and their notable extentions and generalizations, such as f-GAN, Adversarial Variational Bayes (AVB), Wasserstein GAN, Wasserstein Auto-Encoder (WAE), Cramer GAN and etc
End of results • 11 posts found