Regression
- mapping of continuous inputs to discrete or continuous outputs
- concept from statistics: observe data to construct an equation, to be able to make predictions for missing or future data
- regression can fit polynomials of different orders (constant, line, parabola, etc.) or, for vector inputs, surfaces in multiple dimensions (hyperplanes)
- input representation must be numeric and continuous => discrete inputs must be enumerated and ordered
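The last point can be sketched in a few lines of Python. This is only an illustration: the `SIZES` ordering and the `encode_ordinal` helper are hypothetical names made up for the example, and the chosen order of the levels is an assumption.

```python
# Hypothetical ordered levels; the ordering itself is an assumption of this example.
SIZES = ["small", "medium", "large"]


def encode_ordinal(value, ordered_levels):
    """Map a discrete value to its position in an ordered list of levels."""
    return float(ordered_levels.index(value))


samples = ["medium", "small", "large"]
encoded = [encode_ordinal(s, SIZES) for s in samples]
print(encoded)  # [1.0, 0.0, 2.0]
```

The encoding only makes sense when the levels have a natural order, since regression treats the resulting numbers as points on a continuous scale.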
Linear Regression
- linear regression is an attempt to model the relationship between a dependent variable $y$ and independent variables ($x_1, \dots, x_n$) => want to find an equation of the type
  $y = w_0 + w_1 x_1 + \dots + w_n x_n$
  - $y$ is the output variable
  - $x_i$ are the input variables
  - $w_i$ are parameters or weights of the model; the weights tell how important each corresponding $x_i$ is to predicting the outcome
- sample data may not perfectly fit a linear model, causing error in the model
- many ways to calculate error, e.g. sum of absolute errors, sum of squared errors
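- let $\hat{y}_i$ be the predicted output for sample $i$ and $y_i$ the observed output, then:
  - sum of absolute errors: $\sum_i |y_i - \hat{y}_i|$
  - sum of squared errors: $\sum_i (y_i - \hat{y}_i)^2$
- use the gradient descent algorithm to find the weights that minimize the error $E$, repeatedly moving each weight against the gradient with a step size $\alpha$:
  $w_j \leftarrow w_j - \alpha \, \partial E / \partial w_j$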
- for a constant function, the value that minimizes the sum of squared errors is the mean of the data points
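The points above can be tried out with a minimal sketch in plain Python, assuming a single input variable and the sum of squared errors as the error measure. All function names, the toy data, the learning rate `lr`, and the step count are assumptions made for this example, not part of the notes.

```python
def predict(w0, w1, x):
    """Linear model with one input: y_hat = w0 + w1 * x."""
    return w0 + w1 * x


def sum_abs_error(ys, preds):
    return sum(abs(y - p) for y, p in zip(ys, preds))


def sum_sq_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


def fit_line(xs, ys, lr=0.01, steps=5000):
    """Gradient descent on the sum of squared errors for w0 and w1."""
    w0, w1 = 0.0, 0.0
    for _ in range(steps):
        residuals = [y - predict(w0, w1, x) for x, y in zip(xs, ys)]
        # Partial derivatives of sum((y - y_hat)^2) with respect to w0 and w1.
        g0 = sum(-2.0 * r for r in residuals)
        g1 = sum(-2.0 * r * x for r, x in zip(residuals, xs))
        w0 -= lr * g0
        w1 -= lr * g1
    return w0, w1


def fit_constant(ys, lr=0.01, steps=1000):
    """Gradient descent for a constant model y_hat = c."""
    c = 0.0
    for _ in range(steps):
        c -= lr * sum(-2.0 * (y - c) for y in ys)
    return c


# Toy data generated from y = 1 + 2x (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w0, w1 = fit_line(xs, ys)
preds = [predict(w0, w1, x) for x in xs]
c = fit_constant(ys)
mean_y = sum(ys) / len(ys)

print("weights:", w0, w1)
print("squared error:", sum_sq_error(ys, preds))
print("constant fit vs mean:", c, mean_y)
```

On this toy data the fitted weights should approach 1 and 2, and the constant fit should approach the mean of `ys`. The learning rate has to be small enough for the updates to converge; the value used here was picked for this dataset and is not a general recommendation.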
Polynomial regression
In the more general case, for some dataset mapping values
where