'Learning' an algorithm: kNN as an edge detector, part 2
(This is the second of two parts. The entire post is split into two parts to stay within Blogger post size limits. Part 1 is here.)
Test 4: Same as Test 1, but with a different local patch
Noticing that the ground section is grainy, we could try to select a training patch taken from the ground area of the original image. Let's see if that works.
orig_image=data.camera()
edge_type=0
edge_img=get_image_outline(orig_image,edge_type,False)
start_patch_x,start_patch_y=int(len(orig_image)*3/4)-25,int(len(orig_image[0,:])/2)-25
patch,full_img,full_img_box=create_experiment_data(orig_image,orig_image,edge_img,edge_img,start_patch_x,start_patch_y)
learn_edge_detector_experiment(orig_image,patch,full_img,full_img_box,edge_type)
It definitely worked better! Notice the increase in both accuracy and F-score. The edge image itself is cleaner and closer to the actual Sobel output.
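For readers who want to reproduce the two numbers quoted in these tests, here is a minimal sketch of how accuracy and F-score can be computed for a pair of binary edge maps. The function name `edge_scores` and the toy arrays are assumptions made for illustration; the helper functions used in this post may compute the scores differently.

```python
import numpy as np

def edge_scores(truth, predicted):
    """Return (accuracy, f_score) for two same-shaped binary edge maps."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(truth & predicted)    # edge pixels correctly found
    fp = np.sum(~truth & predicted)   # pixels wrongly marked as edges
    fn = np.sum(truth & ~predicted)   # edge pixels that were missed
    accuracy = np.mean(truth == predicted)
    denom = 2 * tp + fp + fn
    # F-score is the harmonic mean of precision and recall
    f_score = 2 * tp / denom if denom else 0.0
    return float(accuracy), float(f_score)

# Toy example (hypothetical data, not taken from the experiments above)
truth = np.array([[1, 0, 0], [1, 1, 0]])
pred = np.array([[1, 0, 1], [0, 1, 0]])
accuracy, f_score = edge_scores(truth, pred)
```

Because edge pixels are usually a small minority of an image, accuracy alone can look high even for a poor detector, which is why the F-score is worth reporting alongside it.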
Test 5: Same as Test 4, but with a Canny edge detector
We now evaluate if we can also mimic a Canny edge detector using a kNN, using the previously shifted patch to clean the ground section of the image.
edge_type=1
edge_img=get_image_outline(orig_image,edge_type,False)
patch,full_img,full_img_box=create_experiment_data(orig_image,orig_image,edge_img,edge_img,start_patch_x,start_patch_y)
learn_edge_detector_experiment(orig_image,patch,full_img,full_img_box,edge_type)
We get a result similar to the one from the Sobel detector. The F-scores are even the same. How did we get the same score and similar-looking output?
If we look at the training patches for both the Sobel and the Canny detectors, we notice that they are identical! So it should not be a surprise that the model 'learned' exactly the same thing and produced exactly the same output. This is a coincidence (helped by switching to a thresholded black-and-white image), as the two edge detectors generally produce different patterns.
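A quick way to confirm that two training patches really are identical, rather than judging by eye, is to compare them pixel by pixel. The sketch below uses NumPy only; `compare_patches` and the toy arrays are hypothetical stand-ins for the thresholded Sobel and Canny patches, not the data from this test.

```python
import numpy as np

def compare_patches(patch_a, patch_b):
    """Return (identical, agreement) for two same-shaped binary patches."""
    a = np.asarray(patch_a, dtype=bool)
    b = np.asarray(patch_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("patches must have the same shape")
    identical = bool(np.array_equal(a, b))   # True only if every pixel matches
    agreement = float(np.mean(a == b))       # fraction of matching pixels
    return identical, agreement

# Hypothetical 3x3 patches with a single vertical edge
sobel_patch = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])
canny_patch = sobel_patch.copy()
same, agreement = compare_patches(sobel_patch, canny_patch)

# Flip one pixel to see how the comparison reacts to a difference
canny_patch[0, 0] = 1
changed, changed_agreement = compare_patches(sobel_patch, canny_patch)
```

If the first comparison reports identical patches, the kNN model is given exactly the same training set in both cases, so identical output follows.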
Test 6: Same as Test 5 (Canny detector) but with a different image
Let us try the Canny edge detector on the coins image and see how well kNN can replicate the output. We retain the training patch extracted from the upper-left of the source image.
orig_image=data.coins()
edge_type=1
edge_img=get_image_outline(orig_image,edge_type,False)
start_patch_x,start_patch_y=int(len(orig_image)/4)-25,int(len(orig_image[0,:])/4)-25
patch,full_img,full_img_box=create_experiment_data(orig_image,orig_image,edge_img,edge_img,start_patch_x,start_patch_y)
learn_edge_detector_experiment(orig_image,patch,full_img,full_img_box,edge_type)
Uh-oh. This looks bad, even if we adjust for the disparate pixels in the Canny ground truth image. We do not need to see the low F-score to know that this is not a good output. What is different here?
Most of the difference comes from the shading gradient between the top and bottom rows of the image and from the sparser Canny edge map. Notice also how clean the training edge patch is compared with the Sobel black-and-white patch. That leaves little chance of 'learning' the 'edges' that arise from shallow pixel intensity gradients. In short, the variation within the source training image is the main culprit.
The weak result in this instance is an effect of the Canny training data, not of the kNN algorithm. The Sobel edge detector is clearly the better edge detector for these images, so the 'learned' detector is also better in the Sobel-trained variant.
Given this dependence on a 'good quality' training image, would this work in the real world? What if we train on one kind of image and run the 'learned' edge detector on another? Let's answer that below.
Test 7: Train on one Sobel image, test on another
We revert to the Sobel detector since it tends to create thicker and more continuous edge outlines. We also revert to using the middle of the image. There is no particular reason for this other than to avoid the earlier 'guided' attempts at making a more accurate output.
# Train on a patch from the coins image, test on the camera image
orig_image1=data.coins()
orig_image2=data.camera()
edge_type=0
edge_img1=get_image_outline(orig_image1,edge_type,False)
edge_img2=get_image_outline(orig_image2,edge_type,False)
start_patch_x,start_patch_y=int(len(orig_image1)/2)-25,int(len(orig_image1[0,:])/2)-25
patch,_,_=create_experiment_data(orig_image1,orig_image1,edge_img1,edge_img1,start_patch_x,start_patch_y)
_,full_img,full_img_box=create_experiment_data(orig_image2,orig_image2,edge_img2,edge_img2,start_patch_x,start_patch_y,False)
learn_edge_detector_experiment(orig_image2,patch,full_img,full_img_box,edge_type)
# Swap the roles: train on a patch from the camera image, test on the coins image
orig_image1=data.camera()
orig_image2=data.coins()
edge_type=0
edge_img1=get_image_outline(orig_image1,edge_type,False)
edge_img2=get_image_outline(orig_image2,edge_type,False)
start_patch_x,start_patch_y=int(len(orig_image1)/2)-25,int(len(orig_image1[0,:])/2)-25
patch,_,_=create_experiment_data(orig_image1,orig_image1,edge_img1,edge_img1,start_patch_x,start_patch_y)
_,full_img,full_img_box=create_experiment_data(orig_image2,orig_image2,edge_img2,edge_img2,start_patch_x,start_patch_y,False)
learn_edge_detector_experiment(orig_image2,patch,full_img,full_img_box,edge_type)
We get a result that is better than prior attempts! This is of course partly luck: the training patch happened to cover the range of pixel grid conditions found in the test image. We could just as easily have extracted some other patch and performed worse.
Closing thoughts
In this post, we showed that we can indeed automate certain tasks by just showing the desired outcome. The model was never told how the Sobel algorithm works, yet it produces comparable results. This is what makes machine learning interesting.
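To make the idea of learning from the desired outcome concrete, here is a self-contained sketch with a brute-force kNN written in NumPy. It is not the code used in these experiments: the 3x3 neighborhood features, the synthetic image, and the simple left/right difference rule standing in for the edge detector are all assumptions chosen to keep the example small.

```python
import numpy as np

def neighborhoods(img):
    """Flattened 3x3 neighborhood of every interior pixel, one row each."""
    h, w = img.shape
    rows = [img[r - 1:r + 2, c - 1:c + 2].ravel()
            for r in range(1, h - 1) for c in range(1, w - 1)]
    return np.array(rows, dtype=float)

def knn_predict(train_x, train_y, test_x, k=3):
    """Brute-force kNN: majority vote among the k closest training rows."""
    # Squared Euclidean distance between every test row and every training row
    d = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_y[nearest].mean(axis=1)
    return votes > 0.5

# Toy image: dark left half, bright right half
img = np.zeros((8, 8))
img[:, 4:] = 1.0

# Stand-in "teacher" (not Sobel): an interior pixel is an edge when its
# left and right neighbors differ
teacher = (np.abs(img[1:-1, 2:] - img[1:-1, :-2]) > 0.5).ravel()

x = neighborhoods(img)
n_train = 18  # the first three interior rows act as the training patch
pred = knn_predict(x[:n_train], teacher[:n_train], x, k=3)
```

The kNN never sees the difference rule itself, only neighborhoods paired with the teacher's labels, which mirrors how the experiments above treat the Sobel and Canny outputs.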
The outcomes of the above tests underscore the constraints faced by all machine learning algorithms. A model's performance depends on the quality of the training data and on how consistent the test data is with that training set. We can also surmise that the more varied the training set a model 'learns' from, the more likely it is to handle different test data, including new variations.
There is scope to extrapolate and correctly classify previously unseen data, but data that came from a different process (e.g., the different pixel grading and intensities) will throw off models. It is of course possible for models to recognize and handle the variation in the pixel gradients, but a (supervised) model has to be 'told' of these instances for it to be 'learned'.
A better solution in the above cases is to sample the training pixels at random, without constraining them to a particular area of the image or to a contiguous rectangular patch. We leave this to others to test, but we would expect the outcome to be better.
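As a starting point for anyone who wants to try this, the sketch below draws training pixels at random from the whole image instead of from one rectangular patch. The function `sample_training_pixels`, its parameters, and the use of 3x3 neighborhoods as features are assumptions for illustration; the experiments in this post may build their features differently.

```python
import numpy as np

def sample_training_pixels(image, edge_map, n_samples, seed=0):
    """Draw random interior pixels; return 3x3 neighborhoods and edge labels."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    n_interior = (h - 2) * (w - 2)
    # Sample without replacement from the flattened interior-pixel indices
    flat = rng.choice(n_interior, size=min(n_samples, n_interior), replace=False)
    rows = flat // (w - 2) + 1   # +1 skips the border row
    cols = flat % (w - 2) + 1    # +1 skips the border column
    features = np.array([image[r - 1:r + 2, c - 1:c + 2].ravel()
                         for r, c in zip(rows, cols)], dtype=float)
    labels = np.asarray(edge_map, dtype=bool)[rows, cols]
    return features, labels

# Hypothetical 10x10 image and a placeholder "edge map" for demonstration
image = np.arange(100, dtype=float).reshape(10, 10)
edge_map = image > 50
features, labels = sample_training_pixels(image, edge_map, n_samples=20)
```

The returned `features` and `labels` could then be passed to any kNN implementation in place of the patch-based training set. Fixing the `seed` keeps runs repeatable, and varying it gives a rough sense of how much the result depends on which pixels were drawn.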
It is also worth noting that this is a fairly easy problem, since the edge detection here is straightforward. Real-world edge detection that separates objects from one another, or foreground objects from background clutter, is harder. It is harder because some patterns that would normally be identified as an edge might not be one (because they lie within the same object), and vice versa.