[Coursera] Computer Vision Basics: Course Notes
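I got an email about this free course during the pandemic, so I went ahead and worked through it. It is a MATLAB-based CV course; the material is very simple and works well as a basic overview. Link: https://www.coursera.org/learn/computer-vision-basics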
Week 1: What is Computer Vision?
Mostly an introduction to MATLAB and CV, plus some application scenarios.
Week 2: Color, Light, & Image Formation
Covers basic image formation, camera models, color, and definitions of some terms.
* Factors that affect “color”:
- The light sources
- Object surface properties
- Emittance and reflective spectrum
- Relative position and orientation

* Image sensor
The main factors affecting the performance of a digital image sensor are:
- Shutter speed
- Sampling pitch
- Chip size
- Analog gain
- Sensor noise
- The quality of the analog-to-digital converter
* MATLAB: Color Imaging - RGB Channels
This week's lab combines the provided single-channel R, G, and B images into one color image, mainly using the imcrop function for cropping.
%Read the image
img = imread('image.jpg');
%Get the size (rows and columns) of the image
[r,c] = size(img);
%Height of one channel; floor guards against a row count not divisible by 3
rr=floor(r/3);
%Split the image into three equal parts and store them in B, G, R channels
%imcrop rect = [xmin (column of top-left corner), ymin (row of top-left corner), width (columns), height (rows)]
B=imcrop(img,[1,1,c,rr-1]);
G=imcrop(img,[1,1*rr+1,c,rr-1]);
R=imcrop(img,[1,2*rr+1,c,rr-1]);
%concatenate R,G,B channels and assign the RGB image to ColorImg variable
ColorImg(:,:,1) = R;
ColorImg(:,:,2) = G;
ColorImg(:,:,3) = B;
imshow(ColorImg)
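To make the rect convention concrete, here is a small sketch on a synthetic matrix (not the course image; it assumes the Image Processing Toolbox is available). It checks that imcrop with rect = [xmin ymin width height] returns height+1 rows, clipped to the image bounds, which is why the lab code passes rr-1 as the height. The names A, h, top, mid, and bot are made up for illustration.

```matlab
% Synthetic 6x4 "image": three stacked 2-row strips, like the B/G/R plate.
A = uint8(reshape(1:24, 6, 4));
[nRows, nCols] = size(A);
h = nRows / 3;                           % height of one strip (2 rows)

% Same rect pattern as the lab code: [xmin, ymin, width, height].
top = imcrop(A, [1, 1,       nCols, h-1]);
mid = imcrop(A, [1, h+1,     nCols, h-1]);
bot = imcrop(A, [1, 2*h+1,   nCols, h-1]);

% Each crop should match plain row indexing.
disp(isequal(top, A(1:h, :)))
disp(isequal(mid, A(h+1:2*h, :)))
disp(isequal(bot, A(2*h+1:3*h, :)))
```

Plain indexing such as A(1:h,:) gives the same strips without the toolbox, which is the route the Week 4 code takes.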
Week 3: Low-, Mid- & High-Level Vision
* General framework:

* Three levels:
https://www.albany.edu/~ron/papers/marrlevl.html
- Low-level vision: feature detection and matching / early segmentation
- Mid-level vision: scene geometry / inferring camera and object motion
- High-level vision: inferring semantics (object recognition and scene understanding)
* MATLAB: Image Gradient Magnitude
Compute and get familiar with these quantities: Sobel image gradients (Gx and Gy), gradient magnitude (Gmag), and gradient direction (Gdir).
imgradient function: https://www.mathworks.com/help/images/ref/imgradient.html
img=imread('cameraman.tif');
[Gx,Gy]=imgradientxy(img,'sobel');
%Visualize Gx and Gy side by side
figure; imshowpair(Gx,Gy,'montage')
[Gmag, Gdir] = imgradient(Gx,Gy);
%Visualize Gmag and Gdir side by side (in a new figure so the first one is kept)
figure; imshowpair(Gmag,Gdir,'montage')
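For intuition about what these four quantities are, the sketch below computes them by hand on a tiny synthetic step edge using only base MATLAB. It is an approximation of the idea, not a reimplementation of imgradientxy: it uses the standard 3x3 Sobel kernels, drops the border pixels ('valid') instead of padding, and the sign convention for the direction (negating Gy so angles increase counterclockwise on screen) is an assumption.

```matlab
% Synthetic 5x6 image with a vertical step edge between columns 3 and 4.
I = [zeros(5,3), ones(5,3)];

% Standard 3x3 Sobel kernels: hx responds to horizontal change, hy to vertical.
hx = [-1 0 1; -2 0 2; -1 0 1];
hy = hx';

% conv2 flips its kernel, so rotate by 180 degrees first to get correlation.
% 'valid' keeps only pixels whose 3x3 neighborhood lies fully inside I.
Gx = conv2(I, rot90(hx, 2), 'valid');
Gy = conv2(I, rot90(hy, 2), 'valid');

% Magnitude and direction (degrees) derived from the two gradient images.
Gmag = hypot(Gx, Gy);
Gdir = atan2d(-Gy, Gx);

disp(Gx(1,:))    % 0 4 4 0: nonzero only where the window straddles the edge
```

Gy is zero everywhere here because the image does not change from row to row, so Gmag equals Gx.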

Week 4: Mathematics for Computer Vision
Mainly covers basic linear algebra and the concepts and fundamentals of algorithms.
* MATLAB: Aligning RGB Channels
Align the three channel images from the earlier project to get a sharper color image.
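Y = circshift(A,K) circularly shifts the elements of A by K positions. If K is an integer, circshift shifts along the first dimension of A whose size is not 1. If K is a vector of integers, each element of K gives the shift amount along the corresponding dimension of A.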
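A quick sketch of both forms on a small matrix (the variable names are made up for illustration):

```matlab
A = [1 2 3; 4 5 6; 7 8 9];

% Scalar K: shift along the first non-singleton dimension (rows here).
down1 = circshift(A, 1);        % last row wraps around to the top

% Vector K: [row shift, column shift].
right1 = circshift(A, [0 1]);   % last column wraps around to the left

% For a row vector, the first non-singleton dimension is the columns.
v = circshift([1 2 3 4 5], 2);

disp(down1)    % [7 8 9; 1 2 3; 4 5 6]
disp(right1)   % [3 1 2; 6 4 5; 9 7 8]
disp(v)        % 4 5 1 2 3
```

Because the shift wraps around, pixels pushed off one side of an image reappear on the other, which is why the alignment compares only a window in the middle of the image.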
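Compute the SSD (sum of squared differences): with the green channel as the reference, shift the red and blue channels and keep the shift that gives the best-aligned image.
(When computing the SSD, convert the values to double first so that differences can be negative; the original data type is uint8.) The red and blue steps could be factored into a single function.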
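The reason for the double conversion is that MATLAB integer arithmetic saturates instead of going negative. A minimal sketch with two made-up pixel values:

```matlab
a = uint8(10);
b = uint8(20);

% uint8 subtraction saturates at 0, so the difference is lost.
diffUint8 = a - b;

% Converting first keeps the sign, and squaring then gives the true error.
diffDouble = double(a) - double(b);
sqUint8  = diffUint8^2;
sqDouble = diffDouble^2;

fprintf('uint8:  diff = %d, squared = %d\n', diffUint8, sqUint8);    % 0, 0
fprintf('double: diff = %d, squared = %d\n', diffDouble, sqDouble);  % -10, 100
```

Without the conversion, every pixel where the shifted channel is darker than the reference would contribute zero to the SSD, which biases the search.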
%Read the image
img = imread('course1image.jpg'); %size(img) is 1023 400
B=img(1:341,1:400);
G=img(342:682,1:400);
R=img(683:1023,1:400);
%method2:
%[r,c] = size(img);
%rr=r/3;
%Write code to split the image into three equal parts
%and store them in B, G, R channels
%B=imcrop(img,[1,1,c,rr-1]);
%G=imcrop(img,[1,1*rr+1,c,rr-1]);
%R=imcrop(img,[1,2*rr+1,c,rr-1]);
%b=im2double(B);
%g=im2double(G);
%r=im2double(R);
%Reference window from the green channel, as double so differences can be negative
ref_img_region=G(146:196,175:225);
ref_img_region = double(ref_img_region);
%Align red to green: try every shift in [-10,10] and keep the lowest SSD
bestr = R;
sumr = sum(sum((double(R(146:196,175:225))-ref_img_region).^2));
for i = -10:10
    for j = -10:10
        shiftr=circshift(R,[i,j]);
        ref_r = double(shiftr(146:196,175:225));
        C = ref_r - ref_img_region;
        curr = sum(sum(C.^2));
        if (curr < sumr)
            bestr = shiftr;
            sumr = curr;
        end
    end
end
%Align blue to green the same way, with a smaller search range of [-5,5]
bestb = B;
sumb = sum(sum((double(B(146:196,175:225))-ref_img_region).^2));
for i = -5:5
    for j = -5:5
        shiftb=circshift(B,[i,j]);
        ref_b = double(shiftb(146:196,175:225));
        C = ref_b - ref_img_region;
        curb = sum(sum(C.^2));
        if (curb < sumb)
            bestb = shiftb;
            sumb = curb;
        end
    end
end
%Stack the aligned channels in R, G, B order
ColorImg_aligned=cat(3,bestr,G,bestb);
imshow(ColorImg_aligned)
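As noted earlier, the red and blue searches are the same loop and could share one function. Below is a minimal sketch of that refactor, checked on synthetic data rather than the course image. The function name alignChannel, its parameters, and the test values are all hypothetical, and the local function at the end of a script needs MATLAB R2016b or later.

```matlab
% Synthetic check: shift a random "green" channel by a known amount,
% then see whether the search recovers the inverse shift.
rng(0);
G  = uint8(randi([0 255], 60, 60));
ch = circshift(G, [3, -2]);              % misaligned copy of G

[aligned, bestShift] = alignChannel(ch, G, 20:40, 20:40, 5);
disp(bestShift)                          % -3 2
disp(isequal(aligned, G))                % 1 (true)

function [best, bestShift] = alignChannel(ch, ref, rows, cols, maxShift)
    % Exhaustive search over circular shifts in [-maxShift, maxShift],
    % scoring each by the SSD against ref inside the window (rows, cols).
    refRegion = double(ref(rows, cols));
    bestSSD = inf;
    best = ch;
    bestShift = [0 0];
    for i = -maxShift:maxShift
        for j = -maxShift:maxShift
            shifted = circshift(ch, [i, j]);
            d = double(shifted(rows, cols)) - refRegion;
            ssd = sum(d(:).^2);
            if ssd < bestSSD
                bestSSD = ssd;
                best = shifted;
                bestShift = [i, j];
            end
        end
    end
end
```

With this helper, the two loops above would reduce to calls like alignChannel(R, G, 146:196, 175:225, 10) and alignChannel(B, G, 146:196, 175:225, 5).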
Overall, the course content is fairly simple.
