
Variational Autoencoders

Gyana Ranjan Nayak


April 4, 2023
Indian Association for the Cultivation of Science, Kolkata

Table of Contents

Introduction

Variational Autoencoders

Introduction
Table of Contents

Introduction
    What is our Goal?
    Latent Variable Models
    Brief discussion on Variational Autoencoders
Variational Autoencoders
    The Model and The Objective
    Evidence Lower Bound
    Optimization
    Reparameterization

What is our Goal?

Suppose we have observed a random sample $x$ drawn from an unknown probability process with distribution $p(x)$. We want to find a distribution $\hat{p}_\theta(x)$ such that $\hat{p}_\theta(x) \approx p(x)$, where $\theta$ is the parameter to be estimated.
* From now on, I will write $p(x)$ for the distribution to be estimated, instead of $\hat{p}_\theta(x)$, for notational convenience.

Latent Variable Models

We have:
1. A high-dimensional object of interest $x \in \mathcal{X}^D$.
2. Low-dimensional latent variables $z \in \mathcal{Z}^M$, often called hidden factors in the data.
We can define the generative process as:
1. $z \sim p(z)$
2. $x \sim p(x|z)$
The joint probability distribution of $x$ and $z$ is then given by $p(x, z) = p(z)\,p(x|z)$.
To get the likelihood function $p(x)$, we marginalize $p(x, z)$ over $z$:
$$p(x) = \int p(x, z)\,dz = \int p(z)\,p(x|z)\,dz$$
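To make the marginalization concrete, here is a minimal sketch (a PyTorch illustration, not from the slides) that estimates $\ln p(x)$ by ancestral sampling, assuming a standard Gaussian prior, a Gaussian likelihood, and a hypothetical `decoder` stand-in. It also hints at why this route is impractical: the estimator needs many samples and has high variance.

```python
import torch

# Assumed setup (illustrative only): standard Gaussian prior p(z) = N(0, I) over M
# latent dimensions and a Gaussian likelihood p(x|z) whose mean comes from a
# stand-in `decoder` network.
M, D, K = 2, 784, 1000                        # latent dim, data dim, number of samples
decoder = torch.nn.Linear(M, D)               # hypothetical stand-in for a real decoder

def log_p_x_given_z(x, z, sigma=1.0):
    """log N(x | decoder(z), sigma^2 I), summed over the D data dimensions."""
    mu = decoder(z)
    return torch.distributions.Normal(mu, sigma).log_prob(x).sum(dim=-1)

def naive_log_marginal(x, n_samples=K):
    """Monte Carlo estimate of ln p(x) = ln E_{p(z)}[p(x|z)] via ancestral sampling:
    draw z ~ p(z), evaluate p(x|z), and average. The estimate has very high
    variance, which is one reason VAEs optimize the ELBO instead."""
    z = torch.randn(n_samples, M)                     # step 1: z ~ p(z)
    log_px_z = log_p_x_given_z(x, z)                  # step 2: evaluate ln p(x|z)
    # log-mean-exp for numerical stability
    return torch.logsumexp(log_px_z, dim=0) - torch.log(torch.tensor(float(n_samples)))

x = torch.randn(D)                            # a dummy observation
print(naive_log_marginal(x))
```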

Variational Autoencoders

The Model and The Objective

$\int p(z)\,p(x|z)\,dz$ does not have an analytical solution that we can optimize.
To solve this, we introduce a parametric inference model $q_\phi(z)$, called the encoder or recognition model. The parameters $\phi$ are called variational parameters, and we optimize $\phi$ so that $q_\phi(z) \approx p(z|x)$.
We can assume $q_\phi(z)$ is Gaussian with mean $\mu$ and variance $\sigma^2$, that is, $\phi = \{\mu, \sigma^2\}$.
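As a purely illustrative picture of what $\phi = \{\mu, \sigma^2\}$ means, the sketch below treats the variational parameters as free tensors of a diagonal Gaussian $q_\phi(z)$; the names and shapes are assumptions, not part of the slides.

```python
import torch

# Illustrative only: phi = {mu, sigma^2} as directly optimizable variational
# parameters of a diagonal Gaussian q_phi(z) over an assumed M-dimensional latent space.
M = 2
mu = torch.zeros(M, requires_grad=True)        # variational mean
log_var = torch.zeros(M, requires_grad=True)   # log sigma^2, kept in log space for positivity

q = torch.distributions.Normal(mu, torch.exp(0.5 * log_var))
z = q.sample()                                 # a draw z ~ q_phi(z)
print(q.log_prob(z).sum())                     # ln q_phi(z), which appears inside the ELBO
```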

The Model and The Objective

Then,
$$\begin{aligned}
\ln p(x) &= \ln \int p(x|z)\,p(z)\,dz \\
&= \ln \int \frac{q_\phi(z)}{q_\phi(z)}\,p(x|z)\,p(z)\,dz \\
&= \ln \mathbb{E}_{q_\phi(z)}\!\left[\frac{p(x|z)\,p(z)}{q_\phi(z)}\right] \\
&\geq \mathbb{E}_{q_\phi(z)}\!\left[\ln \frac{p(x|z)\,p(z)}{q_\phi(z)}\right] \quad \text{(by Jensen's inequality)} \\
&= \mathbb{E}_{q_\phi(z)}\big[\ln p(x|z) + \ln p(z) - \ln q_\phi(z)\big] \\
&= \mathbb{E}_{q_\phi(z)}\big[\ln p(x|z)\big] - \mathbb{E}_{q_\phi(z)}\big[\ln q_\phi(z) - \ln p(z)\big]
\end{aligned}$$
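The last line is exactly the quantity a VAE estimates with Monte Carlo samples from $q_\phi(z)$. A minimal sketch of a one-sample estimator is below; the variational distribution `q` and the decoder log-likelihood `log_p_x_given_z` are hypothetical stand-ins, and a standard Gaussian prior is assumed.

```python
import torch

def elbo_one_sample(x, q, log_p_x_given_z):
    """One-sample Monte Carlo estimate of E_q[ln p(x|z) + ln p(z) - ln q(z)],
    assuming a standard Gaussian prior p(z) = N(0, I). `q` is the variational
    distribution and `log_p_x_given_z` the decoder log-likelihood (both stand-ins)."""
    prior = torch.distributions.Normal(torch.zeros_like(q.mean), torch.ones_like(q.mean))
    z = q.rsample()                            # z ~ q_phi(z) (reparameterized; see the later slides)
    return (log_p_x_given_z(x, z)
            + prior.log_prob(z).sum()          # ln p(z)
            - q.log_prob(z).sum())             # ln q_phi(z)
```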

The Model and The Objective

If we consider an amortized variational posterior, namely $q_\phi(z|x)$, instead of a separate $q_\phi(z)$ for each $x$, then we get
$$\ln p(x) \geq \mathbb{E}_{q_\phi(z|x)}\big[\ln p(x|z)\big] - \mathbb{E}_{q_\phi(z|x)}\big[\ln q_\phi(z|x) - \ln p(z)\big]$$
This lower bound on the log-likelihood is called the Evidence Lower Bound (ELBO).
$p(z)$ : prior
$q_\phi(z|x)$ : stochastic encoder
$p(x|z)$ : stochastic decoder
$\mathbb{E}_{q_\phi(z|x)}\big[\ln p(x|z)\big]$ : negative reconstruction error
$\mathbb{E}_{q_\phi(z|x)}\big[\ln q_\phi(z|x) - \ln p(z)\big]$ : regularizer (coincides with the KL divergence $\mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big)$)
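For the common choice of a standard Gaussian prior, a diagonal-Gaussian encoder $q_\phi(z|x) = \mathcal{N}(\mu(x), \sigma^2(x))$, and a Bernoulli decoder, the regularizer has a closed form, $\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,1)\big) = \tfrac{1}{2}\sum(\sigma^2 + \mu^2 - 1 - \ln\sigma^2)$, and the negative ELBO can be written as the loss sketched below. These distributional choices are assumptions for illustration, not something fixed by the slides.

```python
import torch
import torch.nn.functional as F

def negative_elbo(x, x_logits, mu, log_var):
    """Negative ELBO = reconstruction error + KL regularizer, assuming
    p(z) = N(0, I), q_phi(z|x) = N(mu, diag(sigma^2)), and a Bernoulli decoder
    whose logits `x_logits` were produced from a sample z ~ q_phi(z|x)."""
    # -E_q[ln p(x|z)]: Bernoulli reconstruction error
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")
    # KL(q_phi(z|x) || p(z)) in closed form for two diagonal Gaussians
    kl = 0.5 * torch.sum(torch.exp(log_var) + mu**2 - 1.0 - log_var)
    return recon + kl
```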

Evidence Lower Bound

$$\begin{aligned}
\ln p(x) &= \mathbb{E}_{q_\phi(z|x)}\big[\ln p(x)\big] \\
&= \mathbb{E}_{q_\phi(z|x)}\!\left[\ln \frac{p(z|x)\,p(x)}{p(z|x)}\right] \\
&= \mathbb{E}_{q_\phi(z|x)}\!\left[\ln \frac{p(x|z)\,p(z)}{p(z|x)}\right] \\
&= \mathbb{E}_{q_\phi(z|x)}\!\left[\ln \frac{p(x|z)\,p(z)}{p(z|x)}\,\frac{q_\phi(z|x)}{q_\phi(z|x)}\right] \\
&= \mathbb{E}_{q_\phi(z|x)}\!\left[\ln p(x|z) + \ln \frac{p(z)}{q_\phi(z|x)} + \ln \frac{q_\phi(z|x)}{p(z|x)}\right] \\
&= \mathbb{E}_{q_\phi(z|x)}\big[\ln p(x|z)\big] - \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big) + \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z|x)\big)
\end{aligned}$$
Since $\mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z|x)\big) \geq 0$, dropping this term recovers the ELBO; the gap between $\ln p(x)$ and the ELBO is exactly the KL divergence between the approximate and the true posterior.

Optimization

Our goal is to maximize the ELBO with respect to $\theta$ and $\phi$ using stochastic gradient ascent. Alternatively, we can minimize the negative ELBO. It is easy to take gradients with respect to $\theta$ using automatic differentiation. Unfortunately, taking gradients with respect to $\phi$ is harder, since we need to take into account that the sampling process itself depends on $\phi$.
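A small illustration of the problem (using torch.distributions, an implementation detail that is not part of the slides): a plain sample carries no gradient information back to $\phi$, whereas a reparameterized sample does, which is what the next slide addresses.

```python
import torch

mu = torch.zeros(2, requires_grad=True)
sigma = torch.ones(2, requires_grad=True)
q = torch.distributions.Normal(mu, sigma)

z = q.sample()           # plain sampling: z is a constant with respect to phi
print(z.requires_grad)   # False -> no gradient path back to mu and sigma

z = q.rsample()          # reparameterized sampling: z = mu + sigma * eps, eps ~ N(0, I)
print(z.requires_grad)   # True -> gradients of the ELBO can reach phi = {mu, sigma}
```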

Reparameterization
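In symbols, the reparameterization trick replaces direct sampling $z \sim \mathcal{N}(\mu, \sigma^2)$ with $\epsilon \sim \mathcal{N}(0, I)$ and $z = \mu + \sigma \odot \epsilon$, so that the randomness no longer depends on $\phi$ and gradients can flow through $\mu$ and $\sigma$. A minimal sketch (variable names are illustrative):

```python
import torch

def reparameterize(mu, log_var):
    """z = mu + sigma * eps with eps ~ N(0, I); the stochasticity lives in eps,
    which does not depend on phi, so gradients flow through mu and sigma."""
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps

# Illustrative check: gradients reach the (hypothetical) encoder outputs.
mu = torch.zeros(2, requires_grad=True)
log_var = torch.zeros(2, requires_grad=True)
z = reparameterize(mu, log_var)
z.sum().backward()
print(mu.grad, log_var.grad)   # both are populated thanks to the reparameterization
```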

Thank You !
