Commit 62c5642

Author: liuzi
Commit message: gen model post eg1
1 parent b90e737 commit 62c5642

1 file changed: +4 -3 lines changed

_posts/2025-02-15-generative-models.md

Lines changed: 4 additions & 3 deletions
@@ -51,8 +51,9 @@ Let's consider a new task of spam classification. $$ x^{(i)} $$ is the feature v
 
 ### Example 1: Logistic Regression as a Discriminative Model
 
-Since the label $$y$$ can only take on values $$0$$ or $$1$$, it makes sense to choose a hypothesis $$h_{\theta}(x)$$ that ranges in $$[0,1]$$ to represent the probability of
-$$p(y=1|x)$$. Then we can set the threshold of $$h_{\theta}(x)$$ to be $$0.5$$ to predict if an email is spam. Logistic function fits this case well as it ranges in $$[0,1]$$ for $$z\in(-\infty, +\infty)$$:
+Since it is a binary classification problem, it makes sense to choose a hypothesis $$h_{\theta}(x)$$ that ranges in $$[0,1]$$ to represent the probability of
+$$p(y=1|x)$$, where
+$$p(y=0|x) = 1 - h_{\theta}(x)$$. Then we can set the threshold of $$h_{\theta}(x)$$ to be $$0.5$$ to predict if an email is spam. Logistic function fits this case well as it ranges in $$[0,1]$$ for $$z\in(-\infty, +\infty)$$:
 
 $$
 h_{\theta}(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}
@@ -68,7 +69,7 @@ is called the logistic function or sigmoid function. Below is a plot of the sigm
 
 ![Sigmoid Function](/assets/img/posts/sigmoid.png){: width="300" height="250" }
 
-From the plot, we can see $$g(z)$$ tends to $$0$$ as $$z\to-\infty$$ and tends to $$1$$ as $$z\to+\infty$$. When $$z=0$$, $$g(z)=0.5$$. $$g(z)$$ or $$h_{\theta}(x)$$ is always bounded between $$0$$ and $$1$$. To keep the convention of letting $$x_0=1$$, we can rewrite the expression of $$z$$ in the hypothesis as $$z = \theta^T x = \theta_0 + \sum_{j=1}^n \theta_i x_j$$, where $$\theta_0$$ is the bias term and $$\theta_j$$ is the weight of the $$j$$-th feature $$x_j$$. Please note that other functions that smoothly and monotonically increase from $$0$$ to $$1$$ can be also considered for $$h_{\theta}(x)$$.
+From the plot, we can see $$g(z)$$ tends to $$0$$ as $$z\to-\infty$$ and tends to $$1$$ as $$z\to+\infty$$. When $$z=0$$, $$g(z)=0.5$$. $$g(z)$$ or $$h_{\theta}(x)$$ is always bounded between $$0$$ and $$1$$. To keep the convention of letting $$x_0=1$$, we can rewrite the expression of $$z$$ in the hypothesis as $$z = \theta^T x = \theta_0 + \sum_{j=1}^n \theta_j x_j$$, where $$\theta_0$$ is the bias term and $$\theta_j$$ is the weight of the $$j$$-th feature $$x_j$$. Please note that other functions that smoothly and monotonically increase from $$0$$ to $$1$$ can be also considered for $$h_{\theta}(x)$$.
 
 <!-- to check:notes page22 -->
 
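
As a rough illustration of the hypothesis described in the changed paragraphs, the Python sketch below evaluates $$h_{\theta}(x) = g(\theta^T x)$$ with the $$x_0 = 1$$ convention and applies the $$0.5$$ threshold. It is not part of the commit. The function names (`sigmoid`, `hypothesis`, `predict_spam`), the parameter and feature values, and the choice to label a score of exactly $$0.5$$ as spam are all assumptions made for the example. For finite $$z$$ the mathematical value of $$g(z)$$ lies strictly between $$0$$ and $$1$$, although floating-point rounding can return exactly `0.0` or `1.0` for very large $$|z|$$.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form exp(z) / (1 + exp(z)),
    # so math.exp is never called with a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, features):
    """h_theta(x) = g(theta^T x), with x_0 = 1 prepended so theta[0] is the bias."""
    x = [1.0] + list(features)
    if len(theta) != len(x):
        raise ValueError("theta needs one more entry than the feature vector")
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return sigmoid(z)


def predict_spam(theta, features, threshold=0.5):
    """Return 1 (spam) if the estimated p(y=1|x) reaches the threshold, else 0."""
    # Assumption: a score exactly equal to the threshold counts as spam.
    return 1 if hypothesis(theta, features) >= threshold else 0


# Hypothetical parameters and features, chosen only for illustration.
theta = [-1.0, 2.0, 0.5]   # theta_0 (bias), theta_1, theta_2
x = [1.0, 2.0]             # z = -1 + 2*1 + 0.5*2 = 2

p_spam = hypothesis(theta, x)   # estimate of p(y=1|x)
p_ham = 1.0 - p_spam            # p(y=0|x) = 1 - h_theta(x)
print(p_spam, p_ham, predict_spam(theta, x))
```

Because $$g(0) = 0.5$$, thresholding $$h_{\theta}(x)$$ at $$0.5$$ is the same as checking the sign of $$\theta^T x$$, which is why the decision boundary of this model is linear in the features.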
