Next we replicate all of the centroids N times and each sample point K times, so that both the points and the centroids have shape N×K×2. We can then compute, for every sample, its distance to every centroid along each coordinate dimension.
rep_centroids = tf.reshape(tf.tile(centroids, [N, 1]), [N, K, 2])
rep_points = tf.reshape(tf.tile(points, [1, K]), [N, K, 2])
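The snippet above assumes that N, K, points, and centroids have already been defined earlier in the program. As a minimal setup sketch (the initialization below is only an illustrative assumption, not necessarily how the original code prepares its data), it could look like this:

import tensorflow as tf  # TF 1.x style API, matching the snippets in this article

N = 1000  # number of sample points (example value)
K = 4     # number of clusters (example value)

# N x 2 random sample points; any 2-D data source would do
points = tf.Variable(tf.random_uniform([N, 2]))
# each point's current cluster id, initialized to 0
cluster_assignments = tf.Variable(tf.zeros([N], dtype=tf.int64))
# take the first K points as the initial K x 2 centroids
centroids = tf.Variable(tf.slice(points.initialized_value(), [0, 0], [K, 2]))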
Tiling a tensor
tf.tile(input, multiples, name=None)
Constructs a tensor by tiling a given tensor.
This operation creates a new tensor by replicating input multiples times. The output tensor’s i’th dimension has input.dims(i) * multiples[i] elements, and the values of input are replicated multiples[i] times along the ‘i’th dimension. For example, tiling [a b c d] by [2] produces [a b c d a b c d].
Args:
input: A Tensor. 1-D or higher.
multiples: A Tensor of type int32. 1-D. Length must be the same as the number of dimensions in input.
name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input.
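For example, a small sketch of tf.tile in action (run with the TF 1.x session API to match the rest of the article):

import tensorflow as tf

t = tf.constant([[1, 2],
                 [3, 4]])       # shape [2, 2]
tiled = tf.tile(t, [3, 1])      # repeat 3 times along dim 0, once along dim 1 -> shape [6, 2]

with tf.Session() as sess:
    print(sess.run(tiled))
    # [[1 2]
    #  [3 4]
    #  [1 2]
    #  [3 4]
    #  [1 2]
    #  [3 4]]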
Reshaping a tensor
tf.reshape(tensor, shape, name=None)
Reshapes a tensor.
Given tensor, this operation returns a tensor that has the same values as tensor with shape shape.
If shape is the special value [-1], then tensor is flattened and the operation outputs a 1-D tensor with all elements of tensor.
If shape is 1-D or higher, then the operation returns a tensor with shape shape filled with the values of tensor. In this case, the number of elements implied by shape must be the same as the number of elements in tensor.
For example:
# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]],
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2],
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]
Args:
tensor: A Tensor.
shape: A Tensor of type int32. Defines the shape of the output tensor.
name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as tensor.
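Back in the k-means code, tile followed by reshape is exactly what produces the N×K×2 layout. A toy run with N=2 and K=3 (the values are illustrative only) makes the block structure visible:

import tensorflow as tf

N, K = 2, 3
points = tf.constant([[1., 2.], [3., 4.]])                # N x 2
centroids = tf.constant([[0., 0.], [5., 5.], [9., 9.]])   # K x 2

# every row of rep_points repeats one point K times;
# every row of rep_centroids holds the full list of K centroids
rep_points = tf.reshape(tf.tile(points, [1, K]), [N, K, 2])
rep_centroids = tf.reshape(tf.tile(centroids, [N, 1]), [N, K, 2])

with tf.Session() as sess:
    print(sess.run(rep_points))     # values: [[[1. 2.] [1. 2.] [1. 2.]]
                                    #          [[3. 4.] [3. 4.] [3. 4.]]]
    print(sess.run(rep_centroids))  # values: [[[0. 0.] [5. 5.] [9. 9.]]
                                    #          [[0. 0.] [5. 5.] [9. 9.]]]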
Compute the squared difference between every point and every centroid, then reduce it to a 2-D (N×K) array:
sum_squares = tf.reduce_sum(tf.square(rep_points - rep_centroids), reduction_indices=2)
Having summed over the coordinate dimension, we take the index of the smallest sum for each point; this index is the new cluster that point belongs to.
tf.argmin(input, dimension, name=None) returns the index of the minimum value of input along the given dimension.
best_centroids = tf.argmin(sum_squares, 1)
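Using the same toy values as above (N=2, K=3), sum_squares comes out as an N×K matrix of squared distances, and tf.argmin picks the closest centroid for every point:

import tensorflow as tf

points = tf.constant([[1., 2.], [3., 4.]])                # N=2 sample points
centroids = tf.constant([[0., 0.], [5., 5.], [9., 9.]])   # K=3 centroids
rep_points = tf.reshape(tf.tile(points, [1, 3]), [2, 3, 2])
rep_centroids = tf.reshape(tf.tile(centroids, [2, 1]), [2, 3, 2])

# N x K squared distances, then the index of the closest centroid per point
sum_squares = tf.reduce_sum(tf.square(rep_points - rep_centroids), reduction_indices=2)
best_centroids = tf.argmin(sum_squares, 1)

with tf.Session() as sess:
    print(sess.run(sum_squares))     # [[  5.  25. 113.]
                                     #  [ 25.   5.  61.]]
    print(sess.run(best_centroids))  # [0 1] -- point 0 joins cluster 0, point 1 joins cluster 1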
centroids is also updated after every iteration by the bucket_mean function.
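The article does not show bucket_mean itself. A minimal sketch of what such a helper could look like, assuming it averages the points assigned to each cluster (the tf.unsorted_segment_sum implementation below is an assumption, not code taken from the article):

import tensorflow as tf

def bucket_mean(data, bucket_ids, num_buckets):
    # sum the points in each cluster, count how many points fall into each cluster,
    # then divide to get the per-cluster mean
    total = tf.unsorted_segment_sum(data, bucket_ids, num_buckets)
    count = tf.unsorted_segment_sum(tf.ones_like(data), bucket_ids, num_buckets)
    return total / count

# means has shape K x 2: the new position of every centroid
means = bucket_mean(points, best_centroids, K)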
Stopping condition
In this example, the stopping condition is that the cluster assignments (and therefore the centroids) no longer change:
did_assignments_change = tf.reduce_any(tf.not_equal(best_centroids, cluster_assignments))
Here we use control_dependencies to control whether the centroids are updated:
with tf.control_dependencies([did_assignments_change]):
    do_updates = tf.group(
        centroids.assign(means),
        cluster_assignments.assign(best_centroids))
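To actually run the algorithm, the graph is evaluated in a loop until did_assignments_change comes back False. A sketch of such a driver loop, assuming the tensors defined above (MAX_ITERS and the variable-initialization call are illustrative additions, not part of the code shown in this article):

MAX_ITERS = 1000  # safety cap on the number of iterations (illustrative)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    changed, iters = True, 0
    while changed and iters < MAX_ITERS:
        iters += 1
        # the control dependency above ensures did_assignments_change is computed
        # before the assign ops inside do_updates are applied
        changed, _ = sess.run([did_assignments_change, do_updates])
    centers, assignments = sess.run([centroids, cluster_assignments])
    print("converged after", iters, "iterations")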