
TensorFlow function: tf.estimator.regressor_parse_example_spec

Updated 2018-05-07 11:24

The tf.estimator.regressor_parse_example_spec function

tf.estimator.regressor_parse_example_spec(
    feature_columns,
    label_key,
    label_dtype=tf.float32,
    label_default=None,
    label_dimension=1,
    weight_column=None
)

Defined in: tensorflow/python/estimator/canned/parsing_utils.py

Generates a parsing spec for tf.parse_example to be used with regressors.

If you store your data as tf.Example protos, you need to call tf.parse_example with the correct feature spec. This utility helps with two things:

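The parsing spec for the features has to be combined with the specs for the label and the weights (if any), because all of them are parsed from the same tf.Example instance. This utility combines those specs.
Mapping the label expected by a regressor (for example, DNNRegressor) to the corresponding tf.parse_example spec is difficult. This utility encodes that mapping from the information the user supplies (key, dtype).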
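The combining step described above can be sketched in plain Python. This is a minimal illustration, not the TensorFlow implementation: `FixedLenFeature` here is a hypothetical namedtuple standing in for the real TensorFlow class so the sketch runs without TensorFlow, `combine_specs` is a made-up helper name, and the weight spec shape of `(1,)` with a float32 dtype is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for TensorFlow's FixedLenFeature (assumption, not the real class).
FixedLenFeature = namedtuple("FixedLenFeature", ["shape", "dtype", "default_value"])


def combine_specs(feature_spec, label_key, label_dtype="float32",
                  label_default=None, label_dimension=1, weight_key=None):
    """Merge feature, label, and optional weight specs into one parsing dict."""
    combined = dict(feature_spec)  # copy so the caller's dict is not mutated
    # The label becomes a fixed-length entry with label_dimension values.
    combined[label_key] = FixedLenFeature((label_dimension,), label_dtype, label_default)
    if weight_key is not None:
        # Assumed here: one float32 weight per example.
        combined[weight_key] = FixedLenFeature((1,), "float32", None)
    return combined


feature_spec = {"feature_b": FixedLenFeature((1,), "float32", None)}
spec = combine_specs(feature_spec, "my-label", label_dimension=3,
                     weight_key="example-weight")
print(sorted(spec))  # ['example-weight', 'feature_b', 'my-label']
```

Everything ends up in a single dict because one tf.parse_example call reads features, label, and weight from the same serialized record.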

Example output of the parsing spec:

# Define features and transformations
feature_b = tf.feature_column.numeric_column(...)
feature_c_bucketized = tf.feature_column.bucketized_column(
  tf.feature_column.numeric_column("feature_c"), ...)
feature_a_x_feature_c = tf.feature_column.crossed_column(
    columns=["feature_a", feature_c_bucketized], ...)

feature_columns = [feature_b, feature_c_bucketized, feature_a_x_feature_c]
parsing_spec = tf.estimator.regressor_parse_example_spec(
    feature_columns, label_key='my-label')

# For the above example, regressor_parse_example_spec would return the dict:
assert parsing_spec == {
  "feature_a": tf.VarLenFeature(tf.string),
  "feature_b": tf.FixedLenFeature([1], dtype=tf.float32),
  "feature_c": tf.FixedLenFeature([1], dtype=tf.float32),
  "my-label" : tf.FixedLenFeature([1], dtype=tf.float32)
}

Example usage with a regressor:

feature_columns = [...]  # define features via tf.feature_column
estimator = tf.estimator.DNNRegressor(
    hidden_units=[256, 64, 16],
    feature_columns=feature_columns,
    weight_column='example-weight',
    label_dimension=3)
# This label configuration tells the regressor the following:
# * weights are retrieved with key 'example-weight'
# * label is a 3 dimension tensor with float32 dtype.

# Input builders
def input_fn_train():  # Returns a tuple of features and labels.
  features = tf.contrib.learn.read_keyed_batch_features(
      file_pattern=train_files,
      batch_size=batch_size,
      # creates parsing configuration for tf.parse_example
      features=tf.estimator.regressor_parse_example_spec(
          feature_columns,
          label_key='my-label',
          label_dimension=3,
          weight_column='example-weight'),
      reader=tf.RecordIOReader)
  labels = features.pop('my-label')
  return features, labels

estimator.train(input_fn=input_fn_train)

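Function arguments:

feature_columns: An iterable containing all feature columns. Every item should be an instance of a class derived from _FeatureColumn.
label_key: A string identifying the label. It means that tf.Example stores the label under this key.
label_dtype: A tf.dtype identifying the type of the label. Defaults to tf.float32.
label_default: Used as the label if label_key does not exist in a given tf.Example. The default is None, which means tf.parse_example raises an error if any label is missing.
label_dimension: Number of regression targets per example. This is the size of the last dimension of the label and logits tensors (typically, these have shape [batch_size, label_dimension]).
weight_column: A string, or a _NumericColumn created by tf.feature_column.numeric_column, defining the feature column that represents weights. It is used to down-weight or boost examples during training, and is multiplied by the loss of the example. If it is a string, it is used as the key to fetch the weight tensor from features. If it is a _NumericColumn, the raw tensor is fetched by the key weight_column.key, and weight_column.normalizer_fn is then applied to it to obtain the weight tensor.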
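To make the label_dimension and weight_column descriptions concrete, here is a minimal plain-Python sketch of how a per-example weight scales a per-example loss. It is an illustration under stated assumptions, not the regressor's actual code: `WeightColumn` is a hypothetical stand-in for _NumericColumn, `resolve_weights` and `weighted_squared_error` are made-up helper names, and the squared-error loss averaged over the batch is an assumed choice of loss and reduction.

```python
from collections import namedtuple

# Hypothetical stand-in for _NumericColumn: only key and normalizer_fn are modeled.
WeightColumn = namedtuple("WeightColumn", ["key", "normalizer_fn"])


def resolve_weights(features, weight_column):
    """Fetch per-example weights for either a string key or a column object."""
    if isinstance(weight_column, str):
        return features[weight_column]  # the string is used directly as the key
    raw = features[weight_column.key]   # a column: fetch by its key ...
    if weight_column.normalizer_fn is None:
        return raw
    return [weight_column.normalizer_fn(w) for w in raw]  # ... then normalize


def weighted_squared_error(labels, predictions, weights):
    """Average over the batch of (weight * squared error of one example)."""
    total = 0.0
    for label_row, pred_row, weight in zip(labels, predictions, weights):
        example_loss = sum((l - p) ** 2 for l, p in zip(label_row, pred_row))
        total += weight * example_loss
    return total / len(labels)


# Labels and predictions have shape [batch_size=2, label_dimension=3].
labels = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
predictions = [[1.0, 2.0, 4.0], [1.0, 0.0, 0.0]]
features = {"example-weight": [4.0, 1.0]}

halve = WeightColumn(key="example-weight", normalizer_fn=lambda w: w / 2.0)
weights = resolve_weights(features, halve)  # [2.0, 0.5]
loss = weighted_squared_error(labels, predictions, weights)
print(loss)  # 1.25
```

In this example the first record counts four times as much as the second after normalization, which is the "boost" and "down-weight" effect the argument description refers to.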

Returns:

A dict mapping each feature key to a FixedLenFeature or VarLenFeature value.

Raises:

ValueError: If the label is used in feature_columns.
ValueError: If weight_column is used in feature_columns.
ValueError: If any of the given feature_columns is not a _FeatureColumn instance.
ValueError: If weight_column is not a _NumericColumn instance.
ValueError: If label_key is None.
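The checks listed above can be sketched as a small validation routine in plain Python. This is a simplified illustration, not the library's code: `FeatureColumn` and `NumericColumn` are hypothetical stand-ins for _FeatureColumn and _NumericColumn, `validate_args` is a made-up helper name, each column is assumed to expose a single `key` attribute, and the error messages are invented.

```python
class FeatureColumn(object):
    """Hypothetical stand-in for the _FeatureColumn base class."""

    def __init__(self, key):
        self.key = key


class NumericColumn(FeatureColumn):
    """Hypothetical stand-in for _NumericColumn."""


def validate_args(feature_columns, label_key, weight_column=None):
    """Raise ValueError for each invalid argument combination listed above."""
    if label_key is None:
        raise ValueError("label_key must not be None.")
    feature_keys = set()
    for column in feature_columns:
        if not isinstance(column, FeatureColumn):
            raise ValueError("Not a feature column: %r" % (column,))
        feature_keys.add(column.key)
    if label_key in feature_keys:
        raise ValueError("Label %r is used in feature_columns." % label_key)
    if weight_column is None:
        return
    if isinstance(weight_column, str):
        weight_key = weight_column
    elif isinstance(weight_column, NumericColumn):
        weight_key = weight_column.key
    else:
        raise ValueError("weight_column must be a string or a numeric column.")
    if weight_key in feature_keys:
        raise ValueError("Weight %r is used in feature_columns." % weight_key)


def raises_value_error(*args, **kwargs):
    """Return True if validate_args rejects the given arguments."""
    try:
        validate_args(*args, **kwargs)
    except ValueError:
        return True
    return False


columns = [NumericColumn("feature_b"), FeatureColumn("feature_c")]
print(raises_value_error(columns, "my-label", "example-weight"))  # False
print(raises_value_error(columns, "feature_b"))                   # True
```

Rejecting a label or weight key that also appears among the features avoids two conflicting specs for the same key in the resulting dict.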
