Object Detection with YOLO
“You Only Look Once” (YOLO) is a popular algorithm because it achieves high accuracy in real time object detection. This algorithm “only looks once” at the image in the sense that it requires only one forward propagation pass through the network to make predictions.
The input is a batch of m images, and each image has the shape (m, 608, 608, 3)
Anchor Boxes
First Filter: Threshold on Class Scores
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
|
def yolo_filter_boxes(boxes, box_confidence, box_class_probs, threshold = 0.6):
"""Filters YOLO boxes by thresholding on object and class confidence.
Arguments:
boxes -- tensor of shape (19, 19, 5, 4)
box_confidence -- tensor of shape (19, 19, 5, 1)
box_class_probs -- tensor of shape (19, 19, 5, 80)
threshold -- real value, if [ highest class probability score < threshold],
then get rid of the corresponding box
Returns:
scores -- tensor of shape (None,), containing the class probability score for selected boxes
boxes -- tensor of shape (None, 4), containing (b_x, b_y, b_h, b_w) coordinates of selected boxes
classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes
Note: "None" is here because you don't know the exact number of selected boxes, as it depends on the threshold.
For example, the actual output size of scores would be (10,) if there are 10 boxes.
"""
x = 10
y = tf.constant(100)
# Step 1: Compute box scores
box_scores = box_class_probs * box_confidence
# Step 2: Find the box_classes using the max box_scores, keep track of the corresponding score
box_classes = tf.math.argmax(box_scores,axis=-1)
box_class_scores = tf.math.reduce_max(box_scores,axis=-1)
# Step 3: Create a filtering mask based on "box_class_scores" by using "threshold". The mask should have the
# same dimension as box_class_scores, and be True for the boxes you want to keep (with probability >= threshold)
filtering_mask = (box_class_scores >= threshold)
# Step 4: Apply the mask to box_class_scores, boxes and box_classes
scores = tf.boolean_mask(box_class_scores,filtering_mask)
boxes = tf.boolean_mask(boxes,filtering_mask)
classes = tf.boolean_mask(box_classes,filtering_mask)
return scores, boxes, classes
|
Second Filter: Non-max Suppression(NMS) with IoU
IOU
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
|
def iou(box1, box2):
"""Implement the intersection over union (IoU) between box1 and box2
Arguments:
box1 -- first box, list object with coordinates (box1_x1, box1_y1, box1_x2, box_1_y2)
box2 -- second box, list object with coordinates (box2_x1, box2_y1, box2_x2, box2_y2)
"""
# Assign variable names to coordinates for clarity
(box1_x1, box1_y1, box1_x2, box1_y2) = box1
(box2_x1, box2_y1, box2_x2, box2_y2) = box2
# Calculate the (yi1, xi1, yi2, xi2) coordinates of the intersection of box1 and box2. Calculate its Area.
xi1 = np.maximum(box1[0], box2[0])
yi1 = np.maximum(box1[1], box2[1])
xi2 = np.minimum(box1[2], box2[2])
yi2 = np.minimum(box1[3], box2[3])
inter_width = xi2-xi1
inter_height = yi2-yi1
# Case in which they don't intersec --> max(,0)
inter_area = max(inter_width, 0) * max(inter_height, 0)
# Calculate the Union area by using Formula: Union(A,B) = A + B - Inter(A,B)
box1_area = (box1[2]-box1[0])*(box1[3]-box1[1])
box2_area = (box2[2]-box2[0])*(box2[3]-box2[1])
union_area = box1_area + box2_area - inter_area
# compute the IoU
### START CODE HERE ### (≈ 1 line)
iou = float(inter_area)/float(union_area)
### END CODE HERE ###
return iou
|