
Yiyang Dong

Deep Learning Notes | FaceNet: CNN for Face Recognition/Verification

0. Two Problems

Face Verification (1:1 matching):
- Input: an image and a name/ID
- Output: whether the input image is that of the claimed person (a 0/1 binary classification); a sketch of verification with embedding distances follows after this list
- e.g. passport/face matching at an airport, or unlocking a mobile phone with your face

Face Recognition (1:K matching):
- Has a database of K persons
- Input: an image (a person's face)
- Output: the ID if the image matches any of the K persons, or "not recognized"
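As a minimal sketch (not code from the note itself) of how FaceNet-style verification is usually done: each face image is mapped to an embedding vector, and two images are declared the same person when the distance between their embeddings falls below a tuned threshold. The 128-D embedding size and the 0.7 threshold here are illustrative assumptions.

    import numpy as np

    def verify(image_emb, claimed_emb, threshold=0.7):
        # L2 distance between the two face embeddings
        dist = np.linalg.norm(image_emb - claimed_emb)
        # Same person if the embeddings are close enough; the threshold is a tuned hyperparameter
        return dist < threshold, dist

    # Example with random 128-D unit vectors standing in for FaceNet outputs
    emb_a = np.random.randn(128); emb_a /= np.linalg.norm(emb_a)
    emb_b = np.random.randn(128); emb_b /= np.linalg.norm(emb_b)
    same, dist = verify(emb_a, emb_b)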

Deep Learning Notes | YOLO = Application of Stats + Probability + Computer Vision

Object Detection with YOLO

"You Only Look Once" (YOLO) is a popular algorithm because it achieves high accuracy in real-time object detection. The algorithm "only looks once" at the image in the sense that it requires only one forward-propagation pass through the network to make predictions.

Inputs and Outputs
The input is a batch of m images of shape (m, 608, 608, 3), i.e. each image has shape (608, 608, 3).

Anchor Boxes

First Filter: Threshold on Class Scores (a sketch of this step follows below)

    def yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold = 0.
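The excerpt cuts off mid-signature, so here is a minimal sketch of score-thresholding as it is commonly written in TensorFlow for this step; the exact body in the original post may differ, and the default threshold of 0.6 is an assumption.

    import tensorflow as tf

    def yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold=0.6):
        # Class score for each box = objectness confidence * class probabilities
        box_scores = box_confidence * box_class_probs
        # For each box, keep the best class and its score
        box_classes = tf.math.argmax(box_scores, axis=-1)
        box_class_scores = tf.math.reduce_max(box_scores, axis=-1)
        # Drop boxes whose best class score is below the threshold
        mask = box_class_scores >= threshold
        scores = tf.boolean_mask(box_class_scores, mask)
        boxes = tf.boolean_mask(boxes, mask)
        classes = tf.boolean_mask(box_classes, mask)
        return scores, boxes, classes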

RealSense RGB-Depth Camera Tutorial 02

I hope this note finally solves Point Cloud Segmentation + YOLO.

Key Code

    import numpy as np
    import pyrealsense2.pyrealsense2 as rs

    pc = rs.pointcloud()
    points = rs.points()
    points = pc.calculate(depth_frame)
    vertices = np.asanyarray(points.get_vertices(dim=2))

Notes
- points = rs.points(): extends the frame class with additional point-cloud-related attributes and functions.
- points.get_vertices(dim=2): retrieves the vertices of the point cloud.
- o3d.utility.Vector3dVector(vertices_interest): converts a float64 numpy array of shape (n, 3) to Open3D format; the 3 columns are x, y, z.
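As a follow-up to the Vector3dVector note, here is a minimal sketch of wrapping the vertices into an Open3D point cloud; the variable vertices_interest (e.g. the vertices falling inside a YOLO box) and the 0.01 voxel size are assumptions for illustration.

    import numpy as np
    import open3d as o3d

    # Placeholder (n, 3) array standing in for the vertices of interest
    vertices_interest = np.random.rand(1000, 3)

    # Wrap the numpy array in an Open3D point cloud
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(vertices_interest)

    # Optional: downsample before segmentation or visualization
    pcd_down = pcd.voxel_down_sample(voxel_size=0.01)
    o3d.visualization.draw_geometries([pcd_down])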

Depth Estimation | Chapter 12 of CV book

3D Vision Notes 01 | Depth Estimation. This is the first note in the "3D Vision Notes" series and covers Chapter 12, Depth Estimation, of the second edition of Professor Richard Szeliski's computer vision textbook, "Computer Vision: Algorithms and Applications".

0. Introduction
Stereo matching is the process of taking two or more images and building a 3D model of the scene by finding matching pixels in the images and converting their 2D positions into 3D depths. The word stereo comes from the Greek for solid; stereo vision is how we perceive solid shape (Koenderink 1990).
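For context (this relation is standard, not quoted from the excerpt): for a rectified stereo pair with focal length $f$ (in pixels), baseline $B$, and depth $Z$, matching pixels are offset horizontally by the disparity

    d = \frac{f B}{Z} \quad\Longleftrightarrow\quad Z = \frac{f B}{d}

so larger disparities correspond to closer points, and finding $d$ for each pixel is what converts 2D matches into 3D depths.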

RealSense RGB-Depth Camera Tutorial 01

1. Streaming Depth

This example demonstrates how to start streaming depth frames from the camera and print the distance between

    #####################################################
    ## librealsense tutorial #1 - Accessing depth data ##
    #####################################################

    # First import the library
    import pyrealsense2 as rs

    try:
        # Create a context object.
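Since the excerpt is cut off, here is a minimal sketch of the pipeline-based way to stream depth and query a distance with pyrealsense2; the original tutorial uses the older context/device API, and the 640x480 resolution and center-pixel query here are illustrative choices.

    import pyrealsense2 as rs

    # Configure and start a depth stream
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    pipeline.start(config)

    try:
        # Grab one set of frames and read the depth at the image center
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        if depth:
            dist = depth.get_distance(320, 240)  # distance in meters at the center pixel
            print(f"Distance at image center: {dist:.3f} m")
    finally:
        pipeline.stop()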

Deep Learning Notes | Transfer Learning with MobileNets

Transfer Learning with MobileNets

Many machine learning methods work well ONLY under a common assumption: the training and test data are drawn from the same feature space and the same distribution. When the distribution changes, most statistical models need to be rebuilt from scratch using newly collected training data, which in many real-world applications is expensive or impossible. In such cases, transfer learning between task domains is desirable. This note illustrates the steps of using transfer learning on a pre-trained CNN to build an Alpaca/Not Alpaca (alpaca recognition) classifier!
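As a sketch of the approach described (not necessarily the exact code from the note): load MobileNetV2 pre-trained on ImageNet without its classification head, freeze it, and stack a small binary head on top. The 160x160 input size, dropout rate, and single-logit head are assumptions matching the common Keras recipe.

    import tensorflow as tf

    IMG_SIZE = (160, 160)  # assumed input resolution

    # Pre-trained MobileNetV2 backbone without the ImageNet classification head
    base_model = tf.keras.applications.MobileNetV2(
        input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
    base_model.trainable = False  # freeze the backbone for feature extraction

    inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
    x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
    x = base_model(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(1)(x)  # single logit: alpaca vs. not alpaca
    model = tf.keras.Model(inputs, outputs)

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=["accuracy"])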