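Indexing into a batched 4D tensor in TensorFlow

Kevin Zakka

Given

batch_images: a 4D tensor of shape (B, H, W, C)
x: a 3D tensor of shape (B, H, W)
y: a 3D tensor of shape (B, H, W)

Goal

How can I index into batch_images using the x and y coordinates to obtain a 4D tensor of shape (B, H, W, C)? That is, for each batch element and for each (x, y) pair, I want to obtain a tensor of shape (C,).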

In numpy this can be achieved with, for example, input_img[np.arange(B)[:,None,None], y, x], but I can't seem to make it work in tensorflow.
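For reference, here is a minimal self-contained sketch of that numpy indexing. The sizes and random data are made up for illustration, and the array is named batch_images here rather than input_img.

```python
import numpy as np

# Illustrative sizes, not taken from the question.
B, H, W, C = 2, 3, 4, 5
rng = np.random.default_rng(0)
batch_images = rng.uniform(size=(B, H, W, C))
x = rng.integers(0, W, size=(B, H, W))  # column coordinates
y = rng.integers(0, H, size=(B, H, W))  # row coordinates

# np.arange(B)[:, None, None] has shape (B, 1, 1) and broadcasts against
# y and x, so every (y, x) pair is looked up in its own batch element.
out = batch_images[np.arange(B)[:, None, None], y, x]
print(out.shape)  # (2, 3, 4, 5)
```

Each out[b, i, j] is the length-C pixel batch_images[b, y[b, i, j], x[b, i, j]].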

My attempt so far

def get_pixel_value(img, x, y):
    """
    Utility function to get pixel value for 
    coordinate vectors x and y from a  4D tensor image.
    """
    H = tf.shape(img)[1]
    W = tf.shape(img)[2]
    C = tf.shape(img)[3]

    # flatten image
    img_flat = tf.reshape(img, [-1, C])

    # flatten idx
    idx_flat = (x*W) + y

    return tf.gather(img_flat, idx_flat)

This returns an incorrectly shaped tensor of shape (B, H, W).

ibab

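It is possible to do this by flattening the tensor, but the batch dimension has to be taken into account in the index computation. To do that, you have to build an additional dummy batch-index tensor with the same shape as x and y that always contains the index of the current batch. This is basically the np.arange(B) from your numpy example, which is missing from your TensorFlow code.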
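The flattened index computation with the batch term included can be sketched in numpy as follows. This is an illustration under assumed sizes and a row-major (B, H, W, C) layout, not the code from the question. Note that in this layout the row term is y * W, with x added as the column offset.

```python
import numpy as np

# Illustrative sizes and data.
B, H, W, C = 3, 4, 5, 6
rng = np.random.default_rng(0)
img = rng.uniform(size=(B, H, W, C))
x = rng.integers(0, W, size=(B, H, W))  # column coordinates
y = rng.integers(0, H, size=(B, H, W))  # row coordinates

# Dummy batch index with the same shape as x and y.
b = np.broadcast_to(np.arange(B)[:, None, None], x.shape)

# Row-major offset of pixel (b, y, x) in the flattened (B*H*W, C) array.
idx_flat = b * (H * W) + y * W + x

img_flat = img.reshape(-1, C)
out = img_flat[idx_flat]  # shape (B, H, W, C)
```

Without the b * (H * W) term, every lookup would land in the first image of the batch.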

You can also simplify things a bit by using tf.gather_nd, which does the index computation for you.

Here is an example:

import numpy as np
import tensorflow as tf

# Example tensors
M = np.random.uniform(size=(3, 4, 5, 6))
x = np.random.randint(0, 5, size=(3, 4, 5))
y = np.random.randint(0, 4, size=(3, 4, 5))

def get_pixel_value(img, x, y):
    """
    Utility function that composes a new image, with pixels taken
    from the coordinates given in x and y.
    The shapes of x and y have to match.
    The batch order is preserved.
    """

    # We assume that x and y have the same shape.
    shape = tf.shape(x)
    batch_size = shape[0]
    height = shape[1]
    width = shape[2]

    # Create a tensor that indexes into the same batch.
    # This is needed for gather_nd to work.
    batch_idx = tf.range(0, batch_size)
    batch_idx = tf.reshape(batch_idx, (batch_size, 1, 1))
    b = tf.tile(batch_idx, (1, height, width))

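    # tf.pack was renamed to tf.stack in TensorFlow 1.0.
    # Cast the coordinates so all three index tensors share one dtype.
    x = tf.cast(x, tf.int32)
    y = tf.cast(y, tf.int32)
    indices = tf.stack([b, y, x], axis=3)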
    return tf.gather_nd(img, indices)

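# TensorFlow 1.x: build the graph, then evaluate it in a session.
# Under TensorFlow 2.x eager execution no session is needed; call
# get_pixel_value(M, x, y) directly and read .shape from the result.
s = tf.Session()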
print(s.run(get_pixel_value(M, x, y)).shape)
# Should print (3, 4, 5, 6).
# We've composed a new image of the same size from randomly picked x and y
# coordinates of each original image.
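To see what tf.gather_nd does with the stacked indices, here is a rough numpy equivalent. It only sketches the semantics for this particular case, where the last axis of indices holds a [b, y, x] triple; it is not TensorFlow's implementation, and gather_nd_numpy is a hypothetical helper name.

```python
import numpy as np

def gather_nd_numpy(params, indices):
    """Hypothetical numpy stand-in for tf.gather_nd in this use case."""
    # Split the last axis into one index array per leading dimension
    # of params, then use them together as an advanced index.
    return params[tuple(np.moveaxis(indices, -1, 0))]

# Same example shapes as above.
M = np.random.uniform(size=(3, 4, 5, 6))
x = np.random.randint(0, 5, size=(3, 4, 5))
y = np.random.randint(0, 4, size=(3, 4, 5))

# Batch index broadcast to the shape of x and y, then stacked last.
b = np.broadcast_to(np.arange(3)[:, None, None], x.shape)
indices = np.stack([b, y, x], axis=3)  # shape (3, 4, 5, 3)

out = gather_nd_numpy(M, indices)
print(out.shape)  # (3, 4, 5, 6)
```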
