Types of Learning
Summarized notes from Machine Learning by GT SL1-SL2
Terms
| term | definition |
| --- | --- |
| instance | set of inputs; vectors of attributes that define the input space |
| concept | function that maps inputs to outputs; objects of the world and members of a set |
| target concept | the function we try to find in classification learning, i.e. the answer |
| hypothesis class | the set of functions we consider in finding the target concept |
| sample | training set: all inputs paired with the correct label |
| testing set | like the training set (but must not intersect it); evaluates the candidate concept |
| candidate | concept that might be the target concept |
| regression | estimating a real-valued target function |
| residual | error in the approximation, $r_i = y_i - \hat{f}(x_i)$ |
| Kernel function | function of the distance $d(x_q, x_i)$ used to determine the weight of each training example, i.e. $K(d(x_q, x_i))$ |
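As a sketch of the kernel-function entry above: a common choice is a Gaussian kernel, which weights each training example by its distance to the query point. The bandwidth parameter `tau` below is an assumption for illustration, not something fixed by the notes.

```python
import numpy as np

def gaussian_kernel(distance, tau=1.0):
    """K(d): weight of a training example given its distance to the query.

    tau is the (assumed) bandwidth: smaller tau makes the fit more local,
    since far-away examples decay to near-zero weight faster.
    """
    return np.exp(-distance**2 / (2 * tau**2))

# A point 0.5 away from the query gets more weight than one 2.0 away.
w_near = gaussian_kernel(0.5)
w_far = gaussian_kernel(2.0)
assert w_near > w_far
```

In locally weighted regression these weights multiply each example's contribution to the loss, so nearby examples dominate the local fit.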
Supervised learning
- classification: mapping from input $x$ to a label $y \in Y$, where $Y$ is a discrete set of labels
- regression: mapping from inputs to continuous-valued outputs
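A toy sketch of the distinction (the threshold and functions here are made up for illustration): a classifier's output ranges over a discrete label set, while a regressor's output is a real number.

```python
def classify(height_cm):
    """Classification: maps an input to one of a discrete set of labels."""
    return "tall" if height_cm >= 170.0 else "short"

def regress(values):
    """Regression: maps inputs to a continuous value (here, the mean)."""
    return sum(values) / len(values)

assert classify(180.0) == "tall"    # output drawn from {"short", "tall"}
assert regress([1.0, 2.0, 3.0]) == 2.0  # output is any real number
```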
Kinds of learners
- lazy learner - instance-based learning: store the examples and postpone generalization until a new instance must be classified (incl. k-NN, locally weighted regression)
- eager learner - learns generalization from examples immediately (opposite of lazy learner)
- weak learner - learner that performs at least slightly better than random chance (error rate < 50% for binary classification)
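The lazy/eager distinction can be sketched with a minimal 1-nearest-neighbor classifier (class and method names are my own): `fit` does no work beyond storing the sample, and all generalization is deferred to prediction time.

```python
import math

class NearestNeighbor:
    """Lazy learner: training just stores the sample."""

    def fit(self, X, y):
        self.X, self.y = X, y   # no generalization happens here
        return self

    def predict(self, x):
        # Generalization happens now: return the label of the
        # closest stored instance to the query point.
        dists = [math.dist(x, xi) for xi in self.X]
        return self.y[dists.index(min(dists))]

clf = NearestNeighbor().fit([(0.0, 0.0), (5.0, 5.0)], ["a", "b"])
assert clf.predict((1.0, 0.0)) == "a"
assert clf.predict((4.0, 5.0)) == "b"
```

An eager learner (e.g. a decision tree) would instead do its work inside `fit`, producing a compact hypothesis before any queries arrive.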
Bias
- algorithms searching through a hypothesis space exhibit two forms of inductive bias:
  - restriction bias: which hypotheses are in the hypothesis set ($H$) under consideration at all
  - preference bias: which hypotheses in $H$ are preferred over others
- generally, preference bias is more desirable than restriction bias: a learner with only preference bias can work with a complete hypothesis space containing the target concept, whereas restriction bias can cause the target concept to be excluded from the search space
- algorithms may exhibit purely one or the other, or a combination of both biases
Overfitting
Definition. Given a hypothesis space $H$, a hypothesis $h \in H$ is said to overfit the training data if there exists some alternative hypothesis $h' \in H$ such that $h$ has smaller error than $h'$ over the training examples, but $h'$ has smaller error than $h$ over the entire distribution of instances.
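The definition above can be demonstrated with a small sketch (synthetic data, degrees chosen arbitrarily): a degree-9 polynomial fit to 10 noisy points from a linear target beats a degree-1 fit on the training sample, yet loses on fresh data from the underlying distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = x + rng.normal(0, 0.1, 10)       # noisy sample of a linear target
x_test = np.linspace(0, 1, 100)
y_test = x_test                      # the true concept, noise-free

def train_test_errors(degree):
    """Mean squared error of a degree-d polynomial on train and test."""
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

tr1, te1 = train_test_errors(1)   # h': matches the target's complexity
tr9, te9 = train_test_errors(9)   # h: interpolates the noise exactly
assert tr9 <= tr1   # h beats h' on the training examples...
assert te9 > te1    # ...but h' beats h over the distribution: h overfits
```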
Curse of dimensionality
Bellman's curse 🧟
As the number of features or dimensions grows, the amount of data we need to generalize accurately grows exponentially.
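A quick back-of-the-envelope sketch of that exponential growth (the 0.1 spacing is an arbitrary choice): covering the unit hypercube so that every point has a sample within 0.1 along each axis needs 10 points per dimension, hence $10^d$ samples in $d$ dimensions.

```python
def samples_needed(d, spacing=0.1):
    """Grid points needed to cover the unit hypercube at the given spacing."""
    per_axis = int(round(1 / spacing))
    return per_axis ** d     # exponential in the dimension d

assert samples_needed(1) == 10
assert samples_needed(2) == 100
assert samples_needed(10) == 10_000_000_000   # 10 billion for d = 10
```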