Reading information from disk
刘立博的代码

TensorFlow can read many common standard formats, including the familiar CSV format, image files, and the standard TensorFlow format.

1. Tabular format: CSV

TensorFlow provides its own methods for reading CSV files. Compared with other libraries, reading even a simple CSV file takes somewhat more work.

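Reading a CSV file takes a few preparatory steps. First, we create a filename queue object from the list of files we want to use, then we create a TextLineReader. With this line reader, the remaining work is to decode the CSV columns and store them in tensors. To combine columns of the same type into a single tensor, we can use the stack method (named pack before TensorFlow 1.0).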

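import tensorflow as tf

sess = tf.Session()

# Queue of input file names; the reader pulls files from it.
filename_queue = tf.train.string_input_producer(["/temp/iris.csv"])
reader = tf.TextLineReader()
# Each read returns one line of the file as `value`.
key, value = reader.read(filename_queue)

# The defaults set each column's type and fill in empty fields:
# four float columns followed by one string column.
record_defaults = [[0.], [0.], [0.], [0.], [""]]
col1, col2, col3, col4, col5 = tf.decode_csv(value, record_defaults=record_defaults)

# Combine the four numeric columns into one feature tensor.
features = tf.stack([col1, col2, col3, col4])

# tf.global_variables_initializer replaces the deprecated tf.initialize_all_variables.
sess.run(tf.global_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord, sess=sess)
for iteration in range(0, 5):
    example = sess.run([features, col5])
    print(example)
coord.request_stop()
coord.join(threads)

Output: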
[array([5.1, 3.5, 1.4, 0.2], dtype=float32), b'setosa']
[array([4.9, 3. , 1.4, 0.2], dtype=float32), b'setosa']
[array([4.7, 3.2, 1.3, 0.2], dtype=float32), b'setosa']
[array([4.6, 3.1, 1.5, 0.2], dtype=float32), b'setosa']
[array([5. , 3.6, 1.4, 0.2], dtype=float32), b'setosa']
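
To see what record_defaults does in isolation, decode_csv can also be applied to a constant string, with no file queue involved. The following is a minimal sketch rather than part of the original listing: parse_line is a hypothetical helper, the sample line is made up, and the import assumes TensorFlow 1.14+ or 2.x, where the 1.x API is exposed as tensorflow.compat.v1. On older 1.x releases, a plain import tensorflow as tf without the eager-execution call should behave the same way.

```python
import tensorflow.compat.v1 as tf  # assumption: TensorFlow 1.14+ or 2.x

tf.disable_eager_execution()  # sessions require graph mode


def parse_line(line, record_defaults):
    """Decode one CSV line into a list of column values (hypothetical helper)."""
    graph = tf.Graph()
    with graph.as_default():
        # One output tensor per entry in record_defaults.
        columns = tf.decode_csv(tf.constant(line), record_defaults=record_defaults)
        with tf.Session(graph=graph) as sess:
            return sess.run(columns)


if __name__ == "__main__":
    defaults = [[0.], [0.], [0.], [0.], [""]]
    # The second field is empty, so it falls back to its default of 0.0.
    print(parse_line("5.1,,1.4,0.2,setosa", defaults))
```

Each default plays two roles here: its type decides the type of the decoded column, and its value is used whenever the corresponding field is empty.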

2. Reading image data

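TensorFlow can import data from image files, which is very useful for image-oriented models, since their input is usually images. The supported image formats are JPG and PNG. Internally, a decoded image is represented as uint8 values with one two-dimensional tensor per image channel, that is, a tensor of shape [height, width, channels].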

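import tensorflow as tf

sess = tf.Session()

filename_queue = tf.train.string_input_producer(["/temp/111.png"])
# WholeFileReader returns the full contents of a file as a single value.
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
# channels=3 forces RGB output; encode_jpeg accepts only 1 or 3 channels,
# so a PNG with an alpha channel would otherwise fail to encode.
image = tf.image.decode_png(value, channels=3)

# Mirror the image horizontally and re-encode it as JPEG bytes.
flipImageLeftRight = tf.image.encode_jpeg(tf.image.flip_left_right(image))
sess.run(tf.global_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord, sess=sess)
example = sess.run(flipImageLeftRight)
print(example)
# Stop the queue-runner threads once the image has been processed.
coord.request_stop()
coord.join(threads)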
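
The listing above prints raw JPEG bytes, which makes the effect of the flip hard to inspect. As a rough illustration of the uint8 [height, width, channels] layout, the sketch below encodes a tiny in-memory image as PNG, decodes it again, and mirrors it. The helper png_round_trip_flipped and the two-pixel image are made up for this example, and the import assumes TensorFlow 1.14+ or 2.x (tensorflow.compat.v1).

```python
import tensorflow.compat.v1 as tf  # assumption: TensorFlow 1.14+ or 2.x

tf.disable_eager_execution()  # sessions require graph mode


def png_round_trip_flipped(pixels):
    """Encode pixels as PNG, decode them, and mirror the result (hypothetical helper).

    pixels is a nested list of shape [height, width, channels] with values 0..255.
    Returns the decoded image and its left-right mirrored copy as NumPy arrays.
    """
    graph = tf.Graph()
    with graph.as_default():
        encoded = tf.image.encode_png(tf.constant(pixels, dtype=tf.uint8))
        decoded = tf.image.decode_png(encoded)  # uint8, [height, width, channels]
        flipped = tf.image.flip_left_right(decoded)
        with tf.Session(graph=graph) as sess:
            return sess.run([decoded, flipped])


if __name__ == "__main__":
    # One row, two pixels: red on the left, blue on the right.
    image, mirrored = png_round_trip_flipped([[[255, 0, 0], [0, 0, 255]]])
    print(image.dtype, image.shape)
    print(mirrored.tolist())
```

Because PNG is lossless, the decoded pixels match the input exactly, so the only difference between the two returned arrays is the order of the pixels along the width axis.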

3. Reading the standard TensorFlow format

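Another approach is to convert arbitrary data into TensorFlow's official format. This makes it easier to mix and match datasets and network architectures.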

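We can serialize the data we have collected to a string (typically a serialized tf.train.Example protocol buffer) and write that string to a TFRecords file with the tf.python_io.TFRecordWriter class.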

To read a TFRecords file, we can use tf.TFRecordReader together with the tf.parse_single_example parser, which parses the Example protocol buffer into tensors.
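
The write and read steps can be sketched as follows. This is an illustrative example under stated assumptions, not code from the original post: the helpers write_records and read_records, the feature names "measurements" and "species", and the file name iris.tfrecords are all made up, and the import assumes TensorFlow 1.14+ or 2.x (tensorflow.compat.v1).

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # assumption: TensorFlow 1.14+ or 2.x

tf.disable_eager_execution()  # queue-based readers require graph mode


def write_records(path, rows):
    """Write (measurements, species) rows to a TFRecords file (hypothetical helper)."""
    with tf.python_io.TFRecordWriter(path) as writer:
        for measurements, species in rows:
            example = tf.train.Example(features=tf.train.Features(feature={
                "measurements": tf.train.Feature(
                    float_list=tf.train.FloatList(value=measurements)),
                "species": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[species.encode("utf-8")])),
            }))
            # Each record is one serialized Example protocol buffer.
            writer.write(example.SerializeToString())


def read_records(path, count):
    """Read back the first `count` records as [measurements, species] pairs."""
    graph = tf.Graph()
    with graph.as_default():
        filename_queue = tf.train.string_input_producer([path])
        reader = tf.TFRecordReader()
        _, serialized = reader.read(filename_queue)
        # The feature spec must match what write_records stored.
        parsed = tf.parse_single_example(serialized, features={
            "measurements": tf.FixedLenFeature([4], tf.float32),
            "species": tf.FixedLenFeature([], tf.string),
        })
        with tf.Session(graph=graph) as sess:
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
            try:
                # Each run consumes one record from the file.
                return [sess.run([parsed["measurements"], parsed["species"]])
                        for _ in range(count)]
            finally:
                coord.request_stop()
                coord.join(threads)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "iris.tfrecords")
    write_records(path, [([5.1, 3.5, 1.4, 0.2], "setosa"),
                         ([4.9, 3.0, 1.4, 0.2], "setosa")])
    for measurements, species in read_records(path, 2):
        print(measurements, species)
```

The reading side follows the same queue, reader, and coordinator pattern as the CSV and image examples; only the reader class and the parsing step change.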