
Commit a803071

liuzi committed
generative models-eg1
1 parent d1e0577 commit a803071

File tree

4 files changed, +66 -4 lines changed

_config.yml

Lines changed: 1 addition & 1 deletion

@@ -97,7 +97,7 @@ theme_mode: # [light | dark]
 cdn:

 # the avatar on sidebar, support local or CORS resources
-avatar: /assets/images/liuzi-avatar.png
+avatar: /assets/img/avatar/liuzi-avatar.png

 # The URL of the site-wide social preview image used in SEO `og:image` meta tag.
 # It can be overridden by a customized `page.image` in front matter.

_posts/2025-02-15-generative-models.md

Lines changed: 65 additions & 3 deletions
@@ -2,6 +2,8 @@
 layout: post
 title: Generative Models
 date: 2025-02-15 21:10 +0800
+categories: [Fundamentals]
+tags: [generative models, discriminative models, bayes' theorem]
 pin: true
 math: true
 mermaid: true
@@ -15,10 +17,70 @@ When discussing **generative models**, it's essential to understand how machine

1. **Discriminative Modeling:** This approach involves building a model that directly predicts classification labels or identifies the decision boundary between elephants and dogs.
2. **Generative Modeling:** This approach entails constructing separate models for elephants and dogs, capturing their respective characteristics. A new animal is then compared against each model to determine which it resembles more closely.
In discriminative modeling, the focus is on learning the conditional probability of labels given the input data, denoted as $$p(y|x)$$. Techniques like logistic regression exemplify this by modeling the probability of a label based on input features. Alternatively, methods such as the perceptron algorithm aim to find a decision boundary that maps new observations to specific labels $$\{0,1\}$$, such as $$0$$ for dogs and $$1$$ for elephants.
Conversely, generative modeling focuses on understanding how the data is generated by learning the joint probability distribution $$p(x,y)$$ or the likelihood $$p(x|y)$$ along with the prior probability $$p(y)$$. This approach models the distribution of the input data for each class, enabling the generation of new data points and facilitating classification by applying Bayes' theorem to compute the posterior probability:

$$
p(y|x) = \frac{p(x|y)p(y)}{p(x)}
$$
The denominator $$p(x)$$ is the marginal probability, obtained by summing the joint probability $$p(x,y)$$ over all possible labels $$y$$, which in the binary case gives:

$$
\begin{aligned}
p(x) &= \sum_{y} p(x,y) \\
&= \sum_{y} p(x|y)p(y) \\
&= p(x|y=0)p(y=0) + p(x|y=1)p(y=1)
\end{aligned}
$$
In fact, $$p(x)$$ acts as a normalization constant, since it does not depend on the label $$y$$: its value is the same for every candidate label. So when we only need the most probable label under $$p(y|x)$$, we do not need to compute $$p(x)$$ at all:

$$
\begin{aligned}
\arg\max_y p(y|x) &= \arg\max_y \frac{p(x|y)p(y)}{p(x)} \\
&= \arg\max_y p(x|y)p(y)
\end{aligned}
$$
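To make the normalization argument concrete, here is a minimal numeric sketch; the prior and likelihood values below are made up purely for illustration (label $$0$$ for dogs, $$1$$ for elephants). It checks that ranking labels by the unnormalized score $$p(x|y)p(y)$$ picks the same label as ranking by the full posterior $$p(y|x)$$:

```python
# Hypothetical numbers for one observation x: class priors p(y)
# and class-conditional likelihoods p(x|y) for labels 0 and 1.
p_y = {0: 0.7, 1: 0.3}
p_x_given_y = {0: 0.02, 1: 0.08}

# Marginal p(x) = sum_y p(x|y) p(y)
p_x = sum(p_x_given_y[y] * p_y[y] for y in p_y)

# Full posterior p(y|x) via Bayes' theorem, and the unnormalized score
posterior = {y: p_x_given_y[y] * p_y[y] / p_x for y in p_y}
score = {y: p_x_given_y[y] * p_y[y] for y in p_y}

print(posterior)  # {0: 0.368..., 1: 0.631...}
# Both rankings pick the same label, so p(x) can be skipped:
assert max(posterior, key=posterior.get) == max(score, key=score.get)
```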
Let's consider a concrete binary classification problem: labeling emails as spam or not spam. Here $$x^{(i)}$$ is the feature vector of the $$i$$-th email, and $$y^{(i)}$$ is the label indicating whether the email is spam ($$1$$) or not spam ($$0$$). The following examples show how discriminative and generative models approach the same problem differently.
### Example 1: Logistic Regression as a Discriminative Model
Since the label $$y$$ can only take on values $$0$$ or $$1$$, it makes sense to choose a hypothesis $$h_{\theta}(x)$$ that ranges in $$[0,1]$$ to represent the probability $$p(y=1|x)$$. We can then threshold $$h_{\theta}(x)$$ at $$0.5$$ to predict whether an email is spam or not. The logistic function fits this case well, as it maps any $$z\in(-\infty, +\infty)$$ into $$(0,1)$$:
$$
h_{\theta}(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}
$$

where

$$
g(z) = \frac{1}{1 + e^{-z}}
$$
is called the logistic function or sigmoid function. Below is a plot of the sigmoid function:

![Sigmoid Function](/assets/img/posts/sigmoid.png){: width="300" height="250" }
From the plot, we can see that $$g(z)$$ tends to $$0$$ as $$z\to-\infty$$ and tends to $$1$$ as $$z\to+\infty$$; at $$z=0$$, $$g(z)=0.5$$. Thus $$g(z)$$, and hence $$h_{\theta}(x)$$, is always bounded between $$0$$ and $$1$$. Keeping the convention of letting $$x_0=1$$, we can expand the hypothesis as $$z = \theta^T x = \theta_0 + \sum_{j=1}^n \theta_j x_j$$, where $$\theta_0$$ is the bias term and $$\theta_j$$ is the weight of the $$j$$-th feature $$x_j$$. Note that other functions that smoothly and monotonically increase from $$0$$ to $$1$$ could also serve as $$h_{\theta}(x)$$.

<!-- to check:notes page22 -->
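As a quick sanity check on these properties, here is a minimal NumPy sketch; the weights and feature values are made up for illustration. It implements $$g(z)$$ and $$h_{\theta}(x)$$ with the $$x_0=1$$ convention:

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def h_theta(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x); x[0] == 1 holds the bias slot."""
    return sigmoid(theta @ x)

# g(0) = 0.5, and g saturates toward 0 and 1 at the extremes.
print(sigmoid(0.0))                   # 0.5
print(sigmoid(-10.0), sigmoid(10.0))  # ~4.54e-05, ~0.99995

# Hypothetical email with two features: x = [x_0=1, x_1, x_2].
theta = np.array([-1.0, 2.0, 0.5])    # [bias, weight_1, weight_2]
x = np.array([1.0, 0.8, 0.3])
prob_spam = h_theta(theta, x)         # estimated p(y=1|x)
print(prob_spam, prob_spam >= 0.5)    # ~0.679 -> classified as spam
```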
### Example 2: Gaussian Discriminant Analysis as a Generative Model
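Gaussian discriminant analysis (GDA) models each class-conditional density $$p(x|y)$$ as a multivariate Gaussian (commonly with a shared covariance matrix) and classifies with the same $$\arg\max_y p(x|y)p(y)$$ rule derived above. As a rough preview, here is a minimal NumPy sketch of that idea; the toy dataset and all numbers are made up for illustration:

```python
import numpy as np

# Toy 2-feature dataset (made up): rows are observations, y[i] is the label.
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2],   # class 0
              [3.0, 0.5], [2.8, 0.7], [3.2, 0.4]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# Maximum-likelihood estimates: prior, per-class means, shared covariance.
phi = y.mean()                                   # p(y=1)
mu = [X[y == k].mean(axis=0) for k in (0, 1)]
Sigma = sum((x - mu[k]).reshape(-1, 1) @ (x - mu[k]).reshape(1, -1)
            for x, k in zip(X, y)) / len(X)
Sigma_inv = np.linalg.inv(Sigma)

def score(x, k):
    """log p(x|y=k) + log p(y=k), dropping terms that are constant in k."""
    d = x - mu[k]
    prior = phi if k == 1 else 1 - phi
    return -0.5 * d @ Sigma_inv @ d + np.log(prior)

x_new = np.array([2.9, 0.6])
print(max((0, 1), key=lambda k: score(x_new, k)))  # predicts class 1
```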
assets/img/posts/sigmoid.png

36.5 KB

0 commit comments
