SDEs are used to model the data distribution of images
data ⇒ noise in the forward SDE (the data is gradually perturbed until only noise remains)
Forward SDE: dX_t = f(X_t, t) dt + g(t) dW_t
f = drift coefficient, g = diffusion coefficient, W_t = Brownian motion
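To make the data ⇒ noise direction concrete, here is a minimal Euler-Maruyama sketch of a forward SDE in NumPy. The specific drift f(x, t) = -0.5 * BETA * x, the diffusion g(t) = sqrt(BETA), the constant BETA, and the function name euler_maruyama are all assumptions chosen for illustration; they are not taken from these notes.

```python
import numpy as np

def euler_maruyama(x0, f, g, t_end=1.0, n_steps=1000, rng=None):
    """Simulate dX_t = f(X_t, t) dt + g(t) dW_t from t = 0 to t = t_end."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    dt = t_end / n_steps
    for i in range(n_steps):
        t = i * dt
        # Brownian increment over one step: dW ~ N(0, dt)
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        x = x + f(x, t) * dt + g(t) * dw
    return x

BETA = 10.0  # hypothetical constant noise rate (assumption)

def drift(x, t):
    return -0.5 * BETA * x  # pulls samples toward 0

def diffusion(t):
    return np.sqrt(BETA)  # constant noise scale

# 20000 copies of one scalar "data point" at x = 3, pushed through the SDE
data = np.full(20000, 3.0)
noised = euler_maruyama(data, drift, diffusion, rng=np.random.default_rng(0))
print(noised.mean(), noised.std())
```

With this choice of f and g the samples forget their starting value and end up close to a standard normal, which is the sense in which the forward SDE turns data into noise.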
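Reverse-time SDE (noise ⇒ data): dX_t = [f(X_t, t) - g(t)^2 ∇_x log p_t(X_t)] dt + g(t) dW_t, where time runs backward from T to 0 and W_t is Brownian motion in that reversed time
∇_x log p_t(x): the score, i.e. the gradient of the log probability density of X_t with respect to x
Basically, the score tells us how the log-density of a particular image state changes as the input x changes; it points toward inputs that are more likely under p_t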
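As a sanity check on the reverse-time SDE and the score, here is a toy 1-D sketch where the score is known in closed form. It assumes the "data" is Gaussian with mean MU0 and std SIGMA0, and a forward SDE with f(x, t) = -0.5 * BETA * x and g(t) = sqrt(BETA); these choices and all names are hypothetical. In a real model the score would come from a trained network rather than a formula.

```python
import numpy as np

BETA = 10.0              # hypothetical constant noise rate (assumption)
MU0, SIGMA0 = 2.0, 0.5   # hypothetical 1-D Gaussian "data" distribution

def marginal_stats(t):
    """Mean and variance of p_t for Gaussian data under the assumed forward SDE."""
    decay = np.exp(-BETA * t)
    return MU0 * np.sqrt(decay), SIGMA0**2 * decay + (1.0 - decay)

def score(x, t):
    """grad_x log p_t(x) of a Gaussian marginal: -(x - mean) / var."""
    mean, var = marginal_stats(t)
    return -(x - mean) / var

def reverse_sample(n_samples, t_end=1.0, n_steps=1000, rng=None):
    """Integrate the reverse-time SDE from t_end down to 0 with Euler-Maruyama."""
    rng = np.random.default_rng() if rng is None else rng
    dt = t_end / n_steps
    x = rng.normal(0.0, 1.0, size=n_samples)  # p_T is close to N(0, 1) here
    for i in range(n_steps):
        t = t_end - i * dt
        # reverse drift: f(x, t) - g(t)^2 * score(x, t)
        rev_drift = -0.5 * BETA * x - BETA * score(x, t)
        # stepping backward in time flips the sign of the drift term
        x = x - rev_drift * dt + np.sqrt(BETA * dt) * rng.normal(size=n_samples)
    return x

samples = reverse_sample(20000, rng=np.random.default_rng(0))
print(samples.mean(), samples.std())
```

Starting from pure noise, the samples should end up with mean and std close to MU0 and SIGMA0, which is the noise ⇒ data direction.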
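We can set up a conditional reverse-time SDE for a condition y that was not seen during training, and its score can be estimated from the unconditional score. By Bayes' rule, ∇_x log p_t(x | y) = ∇_x log p_t(x) + ∇_x log p_t(y | x). This works because the score tells us how the density (likelihood) of some image output changes with x, so conditioning only adds a term for how likely y is given x.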
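The way a conditional score splits into an unconditional score plus a likelihood term can be checked on a toy case. The sketch below assumes a single noise level, a prior x ~ N(0, 1), and an observation y = x + noise with noise std SIGMA_Y; these choices and all function names are hypothetical, picked because the posterior is then Gaussian and its score is known exactly.

```python
import numpy as np

SIGMA_Y = 0.5  # hypothetical observation noise std (assumption)

def unconditional_score(x):
    """grad_x log p(x) for the prior x ~ N(0, 1)."""
    return -x

def likelihood_score(x, y):
    """grad_x log p(y | x) for y = x + N(0, SIGMA_Y^2)."""
    return (y - x) / SIGMA_Y**2

def conditional_score(x, y):
    """grad_x log p(x | y) assembled via Bayes' rule."""
    return unconditional_score(x) + likelihood_score(x, y)

def exact_posterior_score(x, y):
    """Score of the exact Gaussian posterior p(x | y), for comparison."""
    post_var = SIGMA_Y**2 / (1.0 + SIGMA_Y**2)
    post_mean = y / (1.0 + SIGMA_Y**2)
    return -(x - post_mean) / post_var

xs = np.linspace(-3.0, 3.0, 7)
y_obs = 1.2
via_bayes = conditional_score(xs, y_obs)
exact = exact_posterior_score(xs, y_obs)
print(np.allclose(via_bayes, exact))
```

The unconditional part needs no knowledge of y, which is why a score model trained without conditions can still be reused once a likelihood term is available.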
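Perturbation kernel p_0t(x_t | x_0): the function that describes how noise is added to data, i.e. the distribution of the noised sample x_t at time t given a clean data point x_0. It relates to the data distribution because averaging the kernel over the data distribution p_0 gives the marginal p_t.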
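For some SDEs the perturbation kernel is a Gaussian with a closed form, so a noised sample at any time t can be drawn in one shot instead of simulating the SDE step by step. The sketch below assumes the forward SDE dX_t = -0.5 * BETA * X_t dt + sqrt(BETA) dW_t with a constant BETA; that choice and the function names are hypothetical.

```python
import numpy as np

BETA = 10.0  # hypothetical constant noise rate (assumption)

def perturbation_kernel(x0, t):
    """Mean and std of the Gaussian p_0t(x_t | x_0) under the assumed SDE."""
    mean = x0 * np.exp(-0.5 * BETA * t)
    std = np.sqrt(1.0 - np.exp(-BETA * t))
    return mean, std

def perturb(x0, t, rng):
    """Draw x_t ~ p_0t(. | x_0) directly, without simulating intermediate steps."""
    mean, std = perturbation_kernel(x0, t)
    return mean + std * rng.normal(size=np.shape(x0))

rng = np.random.default_rng(0)
x0 = np.full(20000, 3.0)                # copies of one scalar "data point"
slightly_noisy = perturb(x0, 0.001, rng)  # early time: still close to the data
fully_noisy = perturb(x0, 1.0, rng)       # late time: close to N(0, 1)
print(slightly_noisy.mean(), fully_noisy.mean(), fully_noisy.std())
```

At small t the kernel is concentrated near x_0, and at large t it no longer depends much on x_0, which matches the data ⇒ noise picture.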