pip install -r requirements.txt # install dependencies
set FLASK_APP=start.py
set FLASK_ENV=development
flask extract # extract image features
flask run # start the Flask development server
flask evaluate # compute the evaluation metrics
UKBench dataset download link
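A clustering algorithm is used to build the Visual Vocabulary, in which each Visual Word represents one cluster center. Every image feature is assigned to its nearest Visual Word, and the assignments are then counted to form a frequency histogram for each image.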
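The assignment and counting step can be sketched as follows. This is a minimal illustration that assumes the vocabulary (the cluster centers) has already been computed; the function name `bovw_histogram` and the toy 2-D descriptors are hypothetical and not taken from this project.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word.

    descriptors: (n, d) array of local features for one image.
    vocabulary:  (k, d) array of cluster centers (the visual words).
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :] - vocabulary[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    # Index of the nearest visual word for each descriptor.
    nearest = sq_dists.argmin(axis=1)
    # Frequency histogram with one bin per visual word.
    return np.bincount(nearest, minlength=len(vocabulary))

# Toy data: three visual words and four descriptors (illustrative values only).
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[1.0, 0.5], [9.0, 11.0], [0.2, 0.1], [1.0, 9.0]])
hist = bovw_histogram(descriptors, vocabulary)
print(hist)
```

Two images can then be compared by comparing their histograms, which have the same length regardless of how many features each image produced.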
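Inertia: for every sample, compute the squared distance to its nearest cluster center, then sum these squared distances over all samples. The smaller the value, the more similar the samples inside each cluster are.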
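As a small worked example of that definition, the sketch below computes inertia directly with NumPy. The function name and the toy points are assumptions made for illustration.

```python
import numpy as np

def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    sq_dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(sq_dists.min(axis=1).sum())

points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

# Two well-placed centers: every point is at squared distance 1 from its center.
two_centers = np.array([[0.0, 1.0], [10.0, 1.0]])
# A single center in the middle: every point is at squared distance 26.
one_center = np.array([[5.0, 1.0]])

print(inertia(points, two_centers))
print(inertia(points, one_center))
```

Inertia keeps decreasing as the number of clusters grows, so it is usually read as a trend across several K values rather than as an absolute score.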
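Silhouette Coefficient: a measure of clustering quality that combines cohesion and separation. It can be used to compare different algorithms on the same data. For a sample it equals (b - a) / max(a, b), where b is the mean distance to the samples of the nearest other cluster and a is the mean distance to the other samples of its own cluster.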
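The formula can be checked by hand on a tiny data set. The sketch below is an illustrative NumPy implementation, assuming at least two clusters and scoring singleton clusters as 0; the name `mean_silhouette` is hypothetical.

```python
import numpy as np

def mean_silhouette(points, labels):
    """Average of (b - a) / max(a, b) over all samples."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = []
    for i, label in enumerate(labels):
        same = labels == label
        same[i] = False  # exclude the sample itself
        if not same.any():
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        a = dists[i, same].mean()  # mean intra-cluster distance
        # Mean distance to the nearest other cluster.
        b = min(dists[i, labels == other].mean()
                for other in np.unique(labels) if other != label)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = mean_silhouette(points, [0, 0, 1, 1])  # clusters follow the two groups
bad = mean_silhouette(points, [0, 1, 0, 1])   # clusters cut across the groups
print(good, bad)
```

A value close to 1 indicates compact, well-separated clusters, while a negative value suggests that samples sit closer to another cluster than to their own.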
Computing the similarity between two images based on the distance between their local descriptors.
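Purpose: to address the problem that a codebook size that is either too large or too small leads to poor retrieval quality.
Herve Jegou's paper proposes a similarity comparison method based on Hamming Embedding. The idea is to generate a binary signature from each image feature and then compare the distance between binary signatures.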
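Advantage: with K-Means alone, two features are considered a match as soon as they fall into the same cluster. With Hamming Embedding, the features must fall into the same cluster and, in addition, the Hamming distance between their signatures must be less than or equal to a preset threshold (Threshold=24). This reduces the impact that a poorly chosen K has on matching.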
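The matching rule can be sketched as below. The example assumes the binary signatures have already been computed and are stored as Python integers; the 64-bit signature length and the function names are assumptions made for illustration, and only the threshold of 24 comes from the text above.

```python
def hamming_distance(sig_a: int, sig_b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(sig_a ^ sig_b).count("1")

def he_match(word_a: int, sig_a: int, word_b: int, sig_b: int,
             threshold: int = 24) -> bool:
    """Two features match if they share a visual word AND their
    signatures are within the Hamming threshold."""
    return word_a == word_b and hamming_distance(sig_a, sig_b) <= threshold

sig = 0xFFFFFFFFFFFFFFFF           # hypothetical 64-bit signature
close_sig = sig ^ 0xFF             # differs in 8 bits
far_sig = sig ^ 0xFFFFFFFF         # differs in 32 bits

print(he_match(3, sig, 3, close_sig))  # same word, small distance
print(he_match(3, sig, 3, far_sig))    # same word, distance above threshold
print(he_match(3, sig, 4, close_sig))  # different visual words
```

Plain K-Means matching would accept the second pair as well, since both features share visual word 3; the signature check is what rejects it.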
How to enable CSRF protection: https://flask-wtf.readthedocs.io/en/0.15.x/csrf/
How to configure the CSRF secret key: https://stackoverflow.com/questions/34902378/where-do-i-get-a-secret-key-for-flask
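One possible way to obtain the secret key is sketched below using only the standard library. The helper name `load_secret_key` and the choice of the `SECRET_KEY` environment variable are assumptions, not this project's actual configuration; the commented lines show how the key would typically be passed to Flask and Flask-WTF.

```python
import os
import secrets

def load_secret_key(env_var="SECRET_KEY"):
    """Return the key from the environment, or a fresh random one
    (suitable for local development only)."""
    key = os.environ.get(env_var)
    if key:
        return key
    # 32 random bytes, hex-encoded (64 characters).
    return secrets.token_hex(32)

SECRET_KEY = load_secret_key()

# Wiring it into the app (requires Flask and Flask-WTF to be installed):
#   from flask import Flask
#   from flask_wtf.csrf import CSRFProtect
#   app = Flask(__name__)
#   app.config["SECRET_KEY"] = SECRET_KEY
#   csrf = CSRFProtect(app)
```

A randomly generated fallback changes on every restart, which invalidates existing sessions and CSRF tokens, so a fixed key supplied through the environment is the safer choice outside of development.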
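Lowe's paper proposes Lowe's Ratio Test. We need to judge how well the Query Image matches each retrieved image. For every feature of the Query Image, find the nearest and the second-nearest feature in the retrieved image by Euclidean Distance, then apply:
    If distance1 < distance2 * 0.7 then keep the match
    Else discard the match
where distance1 is the smallest distance and distance2 is the second-smallest distance. This step can discard about 90% of the false matches and about 5% of the correct matches, at the cost of extra processing time.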
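The rule above can be sketched in plain Python. This is a brute-force illustration with toy 2-D descriptors; the function name is hypothetical, and a real pipeline would more likely use a nearest-neighbor search from a library.

```python
import math

def ratio_test_matches(query_descs, candidate_descs, ratio=0.7):
    """Return (query_index, candidate_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query_descs):
        # Rank candidate indices by Euclidean distance to the query descriptor.
        ranked = sorted(range(len(candidate_descs)),
                        key=lambda ci: math.dist(q, candidate_descs[ci]))
        if len(ranked) < 2:
            continue  # the test needs a nearest and a second-nearest neighbor
        best, second = ranked[0], ranked[1]
        distance1 = math.dist(q, candidate_descs[best])
        distance2 = math.dist(q, candidate_descs[second])
        if distance1 < distance2 * ratio:
            matches.append((qi, best))
    return matches

query = [(0.0, 0.0), (5.0, 5.0)]
candidates = [(0.1, 0.0), (4.0, 4.0), (6.0, 6.0)]
matches = ratio_test_matches(query, candidates)
print(matches)
```

The first query feature has one clearly closest candidate and is kept. The second is equally close to two candidates, so the match is ambiguous and is discarded.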
The Laplacian of Gaussian (LoG) operation goes like this. You take an image, and blur it a little. And then, you calculate second order derivatives on it (or, the "laplacian"). This locates edges and corners on the image. These edges and corners are good for finding keypoints.
But the second order derivative is extremely sensitive to noise. The blur smooths out the noise and stabilizes the second order derivative.
The problem is, calculating all those second order derivatives is computationally intensive. So we cheat a bit and approximate the LoG with a Difference of Gaussians (DoG).
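The blur-then-Laplacian operation described above can be sketched with NumPy as follows. This is a simplified illustration, assuming a separable Gaussian truncated at three sigma and a 4-neighbor discrete Laplacian; the function names are hypothetical.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur (kernel truncated at about 3 sigma)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # Filter every row, then every column.
    rows = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def laplacian(image):
    """4-neighbor discrete Laplacian; borders use edge padding."""
    p = np.pad(image, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4 * p[1:-1, 1:-1])

def laplacian_of_gaussian(image, sigma):
    # Blur first so that the second derivative is not dominated by noise.
    return laplacian(gaussian_blur(image, sigma))

image = np.zeros((15, 15))
image[7, 7] = 1.0  # a single bright spot
response = laplacian_of_gaussian(image, sigma=1.0)
peak = np.unravel_index(response.argmin(), response.shape)
print(peak)
```

For a bright spot on a dark background the response is most negative at the spot's center, which is why extrema of this response are used as keypoint candidates.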
BLOB stands for Binary Large Object. It refers to a group of connected pixels that have similar intensity values but differ from the pixels surrounding them.
Blobs in an image can be detected with techniques such as DoG, LoG and Determinant of Hessian.
In the first step of SIFT, you generate several octaves of the original image. Each octave's image size is half the previous one. Within an octave, images are progressively blurred using the Gaussian Blur operator.
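Find the local maximum and minimum pixels. As shown in the figure below, x marks the current pixel. If this pixel is greater than its 8 neighbors in the same image and also greater than the 9+9=18 pixels in the images one scale above and one scale below, it is a local maximum. Local minima are found in the same way.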
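The 26-neighbor comparison can be sketched as below. The example assumes the DoG images of one octave are stacked into a single array of shape (scales, height, width) and that the pixel is not on a border; the function name is hypothetical.

```python
import numpy as np

def is_local_extremum(dog_stack, s, y, x):
    """True if pixel (y, x) at scale s is strictly greater or strictly
    smaller than all 26 neighbors in the 3x3x3 cube around it."""
    value = dog_stack[s, y, x]
    # 8 neighbors in the same scale plus 9 + 9 in the adjacent scales.
    cube = dog_stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    neighbours = np.delete(cube.ravel(), 13)  # index 13 is the center pixel
    return bool(value > neighbours.max() or value < neighbours.min())

stack = np.zeros((3, 5, 5))
stack[1, 2, 2] = 1.0  # one pixel that stands out in the middle scale

print(is_local_extremum(stack, 1, 2, 2))  # the bright pixel itself
print(is_local_extremum(stack, 1, 1, 1))  # a neighbor of the bright pixel
```

Only the middle scales of a stack can be tested this way, because the first and last DoG images have no scale below or above them.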
Find the local maxima and minima at subpixel accuracy.
The author of SIFT recommends generating two such extrema images. So, you need exactly 4 DoG images. To generate 4 DoG images, you need 5 Gaussian blurred images. Hence the 5 levels of blur in each octave.
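The counting argument (5 blurred images, 4 DoG images, 2 extrema images) can be checked with a short NumPy sketch of one octave. The starting sigma of 1.6, the factor of sqrt(2) between blur levels, and the choice of which blurred image seeds the next octave are assumptions made for illustration; the function names are hypothetical.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur (kernel truncated at about 3 sigma)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def build_octave(image, num_blurs=5, sigma0=1.6, k=2 ** 0.5):
    """Return the blurred stack and the DoG stack for one octave."""
    blurred = np.stack([gaussian_blur(image, sigma0 * k ** i)
                        for i in range(num_blurs)])
    # Each DoG image is the difference of two adjacent blur levels.
    dog = blurred[1:] - blurred[:-1]
    return blurred, dog

image = np.random.default_rng(0).random((64, 64))  # stand-in for a real image
blurred, dog = build_octave(image)

# Extrema need a DoG image above and below, so the first and last are excluded.
extrema_layers = dog.shape[0] - 2

# Assumed seed for the next octave: a blurred image subsampled by a factor of 2.
next_octave_input = blurred[2][::2, ::2]

print(blurred.shape, dog.shape, extrema_layers, next_octave_input.shape)
```

With 5 blur levels the DoG stack has 4 images, of which only the 2 middle ones have both a scale above and a scale below, which matches the two extrema images mentioned above.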
In the image, I've shown just one octave. This is done for all octaves. Also, this image just shows the first part of keypoint detection. The Taylor series part has been skipped.
