An Introduction to Linear Regression
Intent
supervised learning:
- regression (continuous)
- classification (discrete)
h_{\theta}(x)=\displaystyle\sum_{i=0}^{n}\theta_i x_i=\theta^{T}x
For historical reasons, this function h is called a hypothesis.
The \theta_i's are the parameters.
(x^{(i)},y^{(i)})\qquad\text{(training example)}
\{(x^{(i)},y^{(i)});\ i=1,\dots,m\}\qquad\text{(training set)}

We want h_\theta(x^{(i)}) to be close to the corresponding y^{(i)} in the training set.
cost function: J(\theta) =\frac{1}{2}\displaystyle\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^2
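As a concrete illustration, here is a minimal NumPy sketch of the hypothesis and this cost function. The toy design matrix X (with a leading column of ones for the intercept \theta_0), the targets y, and the function names are illustrative assumptions, not something from the notes.

```python
import numpy as np

def hypothesis(theta, X):
    """h_theta(x) = theta^T x, evaluated for every row of the design matrix X."""
    return X @ theta

def cost(theta, X, y):
    """J(theta) = 1/2 * sum_i (h_theta(x^(i)) - y^(i))^2."""
    residuals = hypothesis(theta, X) - y
    return 0.5 * np.sum(residuals ** 2)

# Toy data: m = 4 examples, one feature plus a column of ones for the intercept.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
print(cost(np.zeros(2), X, y))  # cost at theta = 0
```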
To obtain a model with stable performance, it is essential to trade off bias against variance. Regularized regression adds an L1 penalty (Lasso) or an L2 penalty (Ridge); the L1 penalty effectively performs feature selection (it drives some parameters to zero). The main effect of regularization is to reduce variance. Cross-validation and the bootstrap are also common techniques for balancing bias and variance.
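For reference, the penalized cost functions this refers to can be written as follows (\lambda is the regularization strength, a hyperparameter chosen by the user):

J_{\text{ridge}}(\theta) = \frac{1}{2}\displaystyle\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^2 + \lambda\displaystyle\sum_{j=1}^{n}\theta_j^2

J_{\text{lasso}}(\theta) = \frac{1}{2}\displaystyle\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^2 + \lambda\displaystyle\sum_{j=1}^{n}|\theta_j|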
LMS
gradient descent
\theta_j is updated to its current value minus the learning rate times the partial derivative of the objective function with respect to \theta_j, i.e. \theta_j = \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta), where \alpha is the learning rate.
Taking the partial derivative of J with respect to \theta_j for a single training example, the factor of \frac{1}{2} cancels with the 2 from the chain rule, leaving (h_\theta(x)-y)\,x_j.
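Spelling out that chain-rule step for a single example (x, y):

\frac{\partial}{\partial\theta_j}J(\theta) = \frac{\partial}{\partial\theta_j}\frac{1}{2}(h_\theta(x)-y)^2 = (h_\theta(x)-y)\cdot\frac{\partial}{\partial\theta_j}\Big(\displaystyle\sum_{i=0}^{n}\theta_i x_i - y\Big) = (h_\theta(x)-y)\,x_j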
For a single training example, this gives the update rule:
\theta_j = \theta_j +\alpha(y^{(i)}-h_\theta(x^{(i)}))x^{(i)}_j
This rule is called the LMS update rule (LMS stands for "least mean squares").
batch gradient descent
Repeat \; until \; convergence \quad \{
\quad \theta_j = \theta_j + \alpha \displaystyle\sum_{i=1}^{m} (y^{(i)}-h_\theta(x^{(i)}))\,x^{(i)}_j \qquad (\text{for every } j)
\}
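Here is a minimal NumPy sketch of batch gradient descent with this update; the learning rate, iteration count, and toy data are arbitrary choices for illustration.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.01, n_iters=1000):
    """Batch LMS: each update sums the error over all m training examples."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        errors = y - X @ theta                  # y^(i) - h_theta(x^(i)) for every i
        theta = theta + alpha * (X.T @ errors)  # update every theta_j at once
    return theta

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
print(batch_gradient_descent(X, y))  # approaches [0, 2] on this noise-free data
```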
stochastic gradient descent
Loop \quad\{
\quad for \; i = 1 \; to \; m \; \{
\qquad \theta_j = \theta_j + \alpha\,(y^{(i)}-h_\theta(x^{(i)}))\,x^{(i)}_j \qquad (\text{for every } j)
\quad \}
\}
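A matching NumPy sketch of the stochastic version; shuffling the examples each pass, the learning rate, and the epoch count are my own illustrative choices.

```python
import numpy as np

def stochastic_gradient_descent(X, y, alpha=0.05, n_epochs=200, seed=0):
    """Stochastic LMS: update theta using one training example at a time."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):         # visit the examples in random order
            error = y[i] - X[i] @ theta      # y^(i) - h_theta(x^(i))
            theta = theta + alpha * error * X[i]
    return theta

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
print(stochastic_gradient_descent(X, y))  # settles near [0, 2] on this toy data
```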
Compared with the batch method, stochastic gradient descent usually gets close to the minimum much faster, since it makes progress after every single example instead of scanning the entire training set before each update (though it may keep oscillating around the minimum rather than converging exactly).
A probabilistic interpretation
Please refer to [1].
A linear algebra interpretation
Ng's draft derives the linear regression model in two ways, one based on linear algebra and one based on probability theory. The probabilistic approach models the problem by maximum likelihood estimation under a Gaussian distribution; by comparison, the linear algebra approach is more tedious to work through, but viewed in terms of projection it can be more intuitive [2].
I will try to lay this out clearly here [2]:
Projection onto a Line
A line goes through the origin in the direction of a = (a_1, \dots, a_m). Along that line, we want the point p closest to b = (b_1, \dots, b_m). The key to projection is orthogonality: the line from b to p is perpendicular to the vector a. This is the error e = b - p, which we now compute by algebra.

The projection p is a multiple of a. Call it p = \hat{x}a, "x hat" times a. Computing the scalar \hat{x} gives the vector p, and from the formula for p we can read off the projection matrix P. These three steps lead to all projection matrices: find \hat{x}, then find the vector p, then find the matrix P.
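Carrying out those three steps (the standard computation from [2]): the orthogonality condition a^T(b - \hat{x}a) = 0 gives

\hat{x} = \frac{a^{T}b}{a^{T}a},\qquad p = \hat{x}\,a = \frac{a^{T}b}{a^{T}a}\,a,\qquad P = \frac{a\,a^{T}}{a^{T}a}\quad(\text{so that } p = Pb)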
Projection onto a Subspace
We now compute projections onto an n-dimensional subspace, namely the column space of a matrix A whose columns are a_1, \dots, a_n. The same three steps apply: find the coefficient vector \hat{x}, compute the projection p = A\hat{x} onto that subspace, and determine the projection matrix P.
The residual b - A\hat{x} is perpendicular to each of the vectors a_1, \dots, a_n. These n right angles give n equations:
a^T_1(b-A\hat{x})=0\\
\quad\quad\vdots
a^T_n(b-A\hat{x})=0\\
The matrix with those rows a^T_i is A^T, so the n equations are exactly A^T(b - A\hat{x}) = 0. Rewritten in its famous form, this is A^TA\hat{x} = A^Tb: the equation for \hat{x}, with coefficient matrix A^TA. Now we can find \hat{x}, p, and P, in that order:
\hat{x}=(A^TA)^{-1}A^Tb,\qquad p = A\hat{x}=A(A^TA)^{-1}A^Tb
The projection matrix is:
P=A(A^TA)^{-1}A^T
With the projection matrix P we can compute p = Pb, the projection of b onto the column space of A.
This requires the columns of A to be linearly independent, so that A^TA is invertible.
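This is exactly where the least-squares solution of linear regression comes from: with the design matrix X in the role of A and the targets y in the role of b, the normal equation yields \theta = (X^TX)^{-1}X^Ty. A small NumPy sketch, reusing the toy data from above (in practice np.linalg.lstsq is the numerically safer route, since it avoids forming X^TX explicitly):

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Normal equation: theta = (X^T X)^{-1} X^T y (columns of X must be independent).
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# The same answer via the library least-squares routine.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta_normal)  # approximately [0, 2]
print(theta_lstsq)   # approximately [0, 2]
```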
classification (Logistic regression)
Logistic regression is just linear regression passed through the sigmoid function; that choice needs little further explanation here.
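For completeness, the hypothesis this refers to (the standard form from [1]):

h_\theta(x) = g(\theta^{T}x) = \frac{1}{1+e^{-\theta^{T}x}}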
Generalized Linear Models
Both linear regression and logistic regression belong to the family of Generalized Linear Models (GLMs). Depending on the probability distribution assumed for the response variable y, a whole series of specific models can be derived: linear regression corresponds to a Gaussian-distributed response, logistic regression to a binary classification problem, and softmax regression to a response that falls into one of a finite set of classes. There is also an R implementation of softmax regression referred to in the literature; the cost function it solves is the elastic net's cost function.
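As a pointer, the softmax hypothesis mentioned above has the form below (writing \theta_k for the parameter vector of class k; this notation is my own assumption):

p(y=k \mid x;\theta) = \frac{e^{\theta_k^{T}x}}{\displaystyle\sum_{j=1}^{K}e^{\theta_j^{T}x}}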
- [1] CS 229 Notes: Supervised Learning (Andrew Ng)
- [2] Introduction to Linear Algebra, Chapter 4, Gilbert Strang
