Machine Learning (1)

Today I'm reviewing the machine learning concepts I learned in school.

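  1. Some application scenarios for machine learning

    1. Handwriting recognition
    2. Object recognition
    3. Sentiment analysis (e.g., "I love this course" is positive, "I will take another course" is negative)
    4. Remote-sensing image classification (identifying land types from overhead imagery)
    5. Web search
    6. Speech recognition
    7. Spam detection
    8. Robotics
    9. Healthcare
    10. Fingerprint recognition
    11. Face recognition
    12. Autonomous driving
    13. Tree species classification from satellite imagery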

    Some free datasets

    1. Image/Video Databases (comprehensive): https://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm
    2. http://archive.ics.uci.edu/
    3. handwritten digits: http://yann.lecun.com/exdb/mnist/
    4. https://image-net.org/
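  2. Some terminology

    1. Classification
    2. Regression
    3. Clustering
    4. Label
    5. Features
    6. Loss function (error on a single example)
    7. Cost function (loss aggregated over the whole training set)
    8. Accuracy
    9. Example / sample (a single data point)
    10. Training set (labeled samples used to fit the model)
    11. Test set (held-out samples whose labels are hidden from the model and used only to score its predictions)
    12. Model
    13. Algorithms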
  3. Some examples
    [Figure: classification]
    [Figure: object classification and recognition]
    [Figure: training set and test set]

  4. Types of machine learning
    Supervised learning: labeled data
    Unsupervised learning: unlabeled data
    Reinforcement learning: reward system
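
The training set / test set distinction from the terminology list can be sketched in a few lines. This is a minimal illustration, not a library routine: the helper name `train_test_split`, the 25% test ratio, and the toy samples are all assumptions made for the example.

```python
import random

def train_test_split(samples, test_ratio=0.25, seed=0):
    """Shuffle the samples and split them into a training and a test list."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(samples)       # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    # The first n_test shuffled samples are held out; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled samples: (feature, label) pairs.
samples = [(x, "big" if x >= 5 else "small") for x in range(8)]
train_set, test_set = train_test_split(samples)
```

The model only ever sees `train_set` while fitting; `test_set` is kept aside so that the score reflects data the model has not seen.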

Supervised learning
● Supervised Learning - Classification
○ Features: image/pixels
○ Label: Cat and Dog
● Task: Given a picture of a cat or a dog, predict its label.

● Supervised Learning - Regression
○ Features: size of house
○ Label: house price
● Task: Given a house’s size, predict the selling price.

Recap:
● A model is trained on historical data that is
labeled, i.e., has known ground truth (e.g., the
previous house sales example).
● Once the model is trained, it can be
tested on new data to predict the label.
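
As a rough sketch of the house price example, a straight line can be fitted to past sales and then used on a new house. The numbers are made up, `fit_line` is a hypothetical helper, and ordinary least squares is just one simple choice of model.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up historical sales: house size (feature) and selling price (label).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]   # chosen as exactly 3 * size for easy checking

# "Training": learn slope and intercept from the labeled data.
slope, intercept = fit_line(sizes, prices)

# "Prediction": apply the trained model to a size it has not seen.
predicted_price = slope * 90 + intercept
```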

Unsupervised learning
● What if no ground truth is available?
● Unsupervised Learning - Clustering
○ Features: length and width measurements for
different types of flowers.
○ Label: none; unsupervised learning has no labels.
● Task: Group the data into clusters of
similar samples.
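
One common clustering method is k-means. The sketch below is a deliberately small version under stated assumptions: the flower measurements are invented, and the initial centers are simply the first `k` points, which is a naive choice that real implementations avoid.

```python
def kmeans(points, k, iterations=10):
    """Very small k-means: returns (centers, cluster index for each point)."""
    centers = list(points[:k])  # naive init: the first k points (an assumption)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        for i, (x, y) in enumerate(points):
            dists = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers]
            assignment[i] = dists.index(min(dists))
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, assignment

# Made-up (length, width) measurements forming two visibly separate groups.
flowers = [(1.0, 0.5), (5.0, 2.0), (1.2, 0.4),
           (5.2, 2.1), (0.9, 0.6), (4.8, 1.9)]
centers, groups = kmeans(flowers, k=2)
```

No labels are passed in anywhere; the grouping comes only from how close the feature values are to each other.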


Reinforcement learning
Reinforcement learning discovers, through trial and error, which actions yield the greatest
rewards.
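
A toy way to see trial and error in code is an epsilon-greedy "bandit": the agent mostly repeats the action it currently believes is best, but sometimes tries a random one. Everything here is an assumption for illustration: the function `run_bandit`, the fixed reward values, and the exploration rate of 0.1.

```python
import random

def run_bandit(true_rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a few actions with fixed rewards."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)  # current reward estimate per action
    counts = [0] * len(true_rewards)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(true_rewards))   # explore: random action
        else:
            action = estimates.index(max(estimates))    # exploit: best so far
        reward = true_rewards[action]                   # feedback from the environment
        counts[action] += 1
        # Incremental mean of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical environment: action 1 pays the most.
estimates, counts = run_bandit([0.2, 0.8, 0.5])
best_action = estimates.index(max(estimates))
```

The agent is never told which action is best; it only finds out by trying actions and observing the rewards.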

Deep learning
A family of machine learning methods that uses deep
architectures to learn high-level feature representations.

Parametric and non-parametric models
• Parametric models:
• Have a fixed number of parameters
• Faster to use, simpler
• Stronger assumptions on the data distribution
• e.g., Logistic regression

• Non-parametric models:
• The number of parameters grows with the amount of
training data
• Flexible on data distribution
• Slower, risk of overfitting, need more data
• e.g., K-nearest neighbors (KNN)
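
To make the non-parametric side concrete, here is a minimal KNN sketch. It has no training step that produces a fixed set of weights; the "model" is the stored training data itself, so it grows with the data. The function name `knn_predict`, the squared Euclidean distance, and the sample points are assumptions for the example.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the stored training samples by squared Euclidean distance to the query.
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up 2-D training samples: (features, label).
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]

label_near_a = knn_predict(train, (1.5, 1.5))
label_near_b = knn_predict(train, (6.5, 6.5))
```

Every prediction scans the whole training set, which is why this kind of model gets slower as more data is added.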

Supervised learning workflow
[Figure: supervised learning workflow]


So how do we evaluate a model?

Supervised Learning: Classification
Evaluation metrics: accuracy, recall, precision, confusion matrix

Accuracy: the number of correctly classified samples divided by the
total number of samples.

Confusion Matrix: Shows the actual and predicted labels from a
classification problem.
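
For a binary problem, the confusion matrix boils down to four counts: true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below tallies them from made-up label lists and derives accuracy from them; `confusion_counts` is a hypothetical helper, not a library function.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, actually positive
        elif p == positive:
            fp += 1          # predicted positive, actually negative
        elif a == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn

# Made-up labels: 1 = positive class, 0 = negative class.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)

# Accuracy: correctly classified samples divided by all samples.
accuracy = (tp + tn) / (tp + fp + fn + tn)
```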

Recall: the proportion of actual positives that were
identified correctly.

Precision: the proportion of positive identifications
that were actually correct.
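
In terms of counts, recall is TP / (TP + FN) and precision is TP / (TP + FP). A small sketch with hypothetical counts follows; returning 0.0 when a denominator is zero is a convention assumed here, not something the definitions themselves fix.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    # Guard against division by zero when there are no positive predictions
    # (precision) or no actual positives (recall); 0.0 is an assumed fallback.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
```

With these counts, most positive predictions are right (high precision), but a third of the actual positives are missed (lower recall), which is why the two metrics are usually reported together.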


https://leiz-eng.github.io/2023/09/04/Machine-Learning-1/
Author: Lei Zhao
Published: September 4, 2023