Computing the edit distance between sequences in TensorFlow
Updated 2018-10-12 16:38
tf.edit_distance
edit_distance(
    hypothesis,
    truth,
    normalize=True,
    name='edit_distance'
)
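Defined in: tensorflow/python/ops/array_ops.py.
See the guide: Math > Sequence Comparison and Indexing
Computes the edit distance (Levenshtein distance) between sequences.
This operation takes variable-length sequences (hypothesis and truth), each provided as a SparseTensor, and computes the edit distance between them. Setting normalize to True normalizes each edit distance by the length of the corresponding truth sequence.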
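To make the quantity concrete, here is a minimal pure-Python sketch of a Levenshtein distance between two sequences, with optional normalization by the truth length. It illustrates the definition only and is not TensorFlow's implementation; the helper names `levenshtein` and `normalized_distance` are hypothetical, and the normalized helper assumes a non-empty truth sequence.

```python
def levenshtein(hypothesis, truth):
    """Minimum number of insertions, deletions and substitutions."""
    # previous[j] is the distance between the hypothesis prefix processed
    # so far and the first j items of truth.
    previous = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        current = [i]
        for j, t in enumerate(truth, start=1):
            cost = 0 if h == t else 1
            current.append(min(previous[j] + 1,          # delete h
                               current[j - 1] + 1,       # insert t
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def normalized_distance(hypothesis, truth):
    """Edit distance divided by len(truth); assumes truth is non-empty."""
    return levenshtein(hypothesis, truth) / len(truth)


print(levenshtein(["b"], ["b", "c"]))          # 1
print(normalized_distance(["b"], ["b", "c"]))  # 0.5
```

Turning ["b"] into ["b", "c"] takes one insertion, so the raw distance is 1 and the distance normalized by the truth length of 2 is 0.5.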
For example, given the following input:
import tensorflow as tf

# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:
#   (0,0) = ["a"]
#   (1,0) = ["b"]
hypothesis = tf.SparseTensor(
    [[0, 0, 0],
     [1, 0, 0]],
    ["a", "b"],
    (2, 1, 1))
# 'truth' is a tensor of shape `[2, 2]` with variable-length values:
#   (0,0) = []
#   (0,1) = ["a"]
#   (1,0) = ["b", "c"]
#   (1,1) = ["a"]
truth = tf.SparseTensor(
    [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [1, 1, 0]],
    ["a", "b", "c", "a"],
    (2, 2, 2))
normalize = True
This operation returns the following:
# 'output' is a tensor of shape `[2, 2]` with edit distances normalized
# by 'truth' lengths.
output ==> [[inf, 1.0],  # (0,0): no truth, (0,1): no hypothesis
            [0.5, 1.0]]  # (1,0): addition, (1,1): no hypothesis
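The "no truth" and "no hypothesis" annotations follow from how the sparse indices are read: all coordinates except the last select an output position, and the last coordinate is the position inside the sequence. The sketch below shows that grouping in plain Python, without TensorFlow; `group_sequences` is a hypothetical helper written for illustration, not part of the library.

```python
from collections import defaultdict


def group_sequences(indices, values):
    """Group sparse entries by every index coordinate except the last one."""
    groups = defaultdict(list)
    # Sorting by index makes the last coordinate act as the position
    # of each value inside its sequence.
    for index, value in sorted(zip(indices, values)):
        groups[tuple(index[:-1])].append(value)
    return dict(groups)


hypothesis = group_sequences([[0, 0, 0], [1, 0, 0]], ["a", "b"])
truth = group_sequences([[0, 1, 0], [1, 0, 0], [1, 0, 1], [1, 1, 0]],
                        ["a", "b", "c", "a"])

# Positions with no entries are treated as empty sequences.
for position in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(position, hypothesis.get(position, []), truth.get(position, []))
```

The inputs have rank 3 and the grouped positions have two coordinates, which matches the rank 2 output shown above. Position (0, 0) has a hypothesis but an empty truth, and positions (0, 1) and (1, 1) have a truth but an empty hypothesis.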
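Args:
- hypothesis: A SparseTensor containing the hypothesis sequences.
- truth: A SparseTensor containing the truth sequences.
- normalize: A bool. If True, normalizes the edit distances by the length of truth.
- name: A name for the operation (optional).
Returns:
A dense Tensor of rank R - 1, where R is the rank of the SparseTensor inputs hypothesis and truth.
Raises:
- TypeError: If either hypothesis or truth is not a SparseTensor.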