
2. LITERATURE REVIEW


block, B is the sample size of the testing block, σ is the standard deviation of the training data in the training block, and ε represents the maximal deviation between the actual and predicted outputs.

Table 3: The proposed mechanism implemented by Wu (2017)

Step 0: Set M as 1.

Step 1.1: Apply the RLEM stated in Table 2 (with envelope width = 2) to the N training examples {(x_{(M-1)B+1}, y_{(M-1)B+1+h}), (x_{(M-1)B+2}, y_{(M-1)B+2+h}), …, (x_{(M-1)B+N}, y_{(M-1)B+N+h})} to filter out Nk potential outliers and obtain an acceptable SLFN.

Step 1.2: Remove the outlier candidates, then use the SLFN obtained in Step 1.1 and the RLEM stated in Table 2 (with envelope width = 2) again to learn the remaining N(1-k) training examples.

Step 2: Apply the SLFN obtained in Step 1.2 to the B testing examples {(x_{(M-1)B+N+1}, y_{(M-1)B+N+1+h}), (x_{(M-1)B+N+2}, y_{(M-1)B+N+2+h}), …, (x_{MB+N}, y_{MB+N+h})}.

Step 3: If more data arrive, set M ← M + 1 and go to Step 1.1; otherwise, STOP.
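The moving-block indexing in the steps above can be sketched in Python. This is a minimal illustration only: `block_indices` is a hypothetical helper, and the RLEM fitting and outlier filtering themselves are omitted.

```python
def block_indices(M, N, B):
    """0-based index ranges of the M-th training block (N examples)
    and the testing block of B examples that follows it."""
    start = (M - 1) * B
    train = range(start, start + N)          # x_{(M-1)B+1} .. x_{(M-1)B+N}
    test = range(start + N, start + N + B)   # x_{(M-1)B+N+1} .. x_{MB+N}
    return train, test

# Both blocks slide forward by B examples each time M increases by 1.
tr1, te1 = block_indices(1, N=6, B=2)
tr2, te2 = block_indices(2, N=6, B=2)
print(list(te1))  # [6, 7]
print(list(tr2))  # [2, 3, 4, 5, 6, 7]
```

Note that the testing block of iteration M becomes part of the training block of later iterations, so the SLFN is repeatedly refreshed as new data arrive.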

2.4 TensorFlow & GPU

2.4.1 TensorFlow

The Google Brain project started in 2011 to explore the use of very-large-scale deep neural networks in both Google’s products and research. Based on DistBelief, their first-generation scalable distributed training and inference system, they built TensorFlow, a second-generation system for the implementation and deployment of large-scale machine learning models. They open-sourced the TensorFlow API and a reference implementation under the Apache 2.0 license in November 2015, available at www.TensorFlow.org. TensorFlow can thus be regarded as the evolution of DistBelief (Abadi et al., 2016).

TensorFlow uses dataflow-like models of computation and maps them onto a variety of hardware platforms, from running inference on mobile device platforms such as Android and iOS, to simulations on a single machine containing one or more GPU cards, to large-scale systems with thousands of GPUs (Abadi et al., 2016).

TensorFlow uses a single dataflow graph to represent all computation and state in a machine learning algorithm, including the individual mathematical operations, the parameters and their update rules, and the input preprocessing. The graph is composed of a set of vertices connected by edges.

In a TensorFlow graph, each vertex represents the instantiation of an operation and has zero or more inputs and zero or more outputs, while each edge represents an output from, or input to, a vertex. The computation at vertices is referred to as operations, and the values that flow along edges as tensors. In this subsection, I describe the basic programming model and core concepts of TensorFlow (Abadi et al., 2016).

• Tensors

In TensorFlow, all data are modeled as tensors (n-dimensional arrays) whose elements have one of a small number of primitive types, such as int32, float32, or string (where string can represent arbitrary binary data). Tensors naturally represent the inputs to, and results of, the common mathematical operations in many machine learning algorithms (Abadi et al., 2016).

This design suits connectionist machine learning algorithms such as logistic regression and artificial neural networks, which can be expressed as graph algorithms: tensors flow through the graph from front to back to complete the forward pass, and the residuals flow from back to front to complete back-propagation.
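The forward and backward flow can be illustrated on a toy two-operation graph computing y = (x · w) + b. This is plain Python for illustration, not the TensorFlow API.

```python
def forward(x, w, b):
    """Forward pass: values flow front to back through two operations."""
    h = x * w          # first operation (multiply)
    y = h + b          # second operation (add)
    return h, y

def backward(x, w, grad_y):
    """Backward pass: the residual flows back through the same graph."""
    grad_b = grad_y        # d(h + b)/db = 1
    grad_h = grad_y        # d(h + b)/dh = 1
    grad_w = grad_h * x    # d(x * w)/dw = x
    grad_x = grad_h * w    # d(x * w)/dx = w
    return grad_w, grad_b, grad_x

h, y = forward(2.0, 3.0, 1.0)
print(y)                        # 7.0
print(backward(2.0, 3.0, 1.0))  # (2.0, 1.0, 3.0)
```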

• Operations

An operation takes m ≥ 0 tensors as input and produces n ≥ 0 tensors as output. An operation has a named “type” (e.g., the add of two tensors of type float versus the add of two tensors of type int32) and may have zero or more compile-time attributes that determine its behavior. All of an operation’s attributes are fixed when the graph is established. Figure 1 shows some of the operations that TensorFlow implements.
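The idea of a named operation type with compile-time attributes can be sketched as follows. This is a schematic plain-Python illustration, not TensorFlow’s internals; the `AddN` name and `T` attribute are chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    """An operation: m >= 0 input tensors, n >= 0 outputs, a named
    type, and attributes fixed when the graph is built."""
    op_type: str
    attrs: dict = field(default_factory=dict)

    def __call__(self, *inputs):
        if self.op_type == "AddN":              # add of m tensors
            dtype = self.attrs.get("T", float)  # attribute picks the element type
            return dtype(sum(inputs))
        raise NotImplementedError(self.op_type)

# Same operation type, different compile-time attribute:
add_int = Operation("AddN", {"T": int})
add_flt = Operation("AddN", {"T": float})
print(add_int(1, 2, 3), add_flt(1, 2, 3))  # 6 6.0
```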

Figure 1: TensorFlow operation examples

• Variables

In most computations a graph is executed multiple times, and machine learning algorithms have parameters whose state must be preserved between executions. Unlike ordinary data, which flow through the graph, parameters occupy fixed positions in it. Variables are therefore implemented as a special kind of operation in TensorFlow: a variable returns the tensor it holds and is updated as part of each Run of the model’s training graph.
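A variable’s persistence across runs can be sketched with a minimal stateful class (an illustration in plain Python, not TensorFlow’s `tf.Variable`; the `assign_sub` name mirrors the kind of in-place update a training step applies).

```python
class Variable:
    """A stateful node: it holds a tensor that survives across graph
    executions and is updated in place during training."""
    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def assign_sub(self, delta):
        """The kind of update a gradient-descent step applies."""
        self.value = self.value - delta
        return self.value

w = Variable(0.5)
for _ in range(3):           # three "runs" of a training graph
    w.assign_sub(0.1)
print(round(w.read(), 1))    # 0.2 -- the state carried over between runs
```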

• Sessions

Client programs interact with the TensorFlow system by creating a Session. In the normal mode of use, a session is created together with an empty graph; nodes and edges are added to form the graph, which is then run. Using the arguments to Run, the TensorFlow implementation can compute the transitive closure of all nodes that must be executed in order to produce the requested outputs.
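The “transitive closure of all nodes that must be executed” amounts to a dependency walk backward from the requested outputs. A minimal sketch in plain Python (the `nodes_to_run` helper and graph encoding are assumptions for illustration, not the TensorFlow runtime):

```python
def nodes_to_run(deps, requested):
    """Given a graph as {node: [input nodes]}, return every node that
    must execute to produce the requested outputs."""
    needed, stack = set(), list(requested)
    while stack:
        n = stack.pop()
        if n not in needed:
            needed.add(n)
            stack.extend(deps.get(n, []))   # walk back through the inputs
    return needed

# b and c feed y; z depends on y; x feeds nothing requested, so it is skipped.
deps = {"y": ["b", "c"], "z": ["y"], "x": []}
print(sorted(nodes_to_run(deps, ["y"])))  # ['b', 'c', 'y']
```

Only the nodes needed for the requested outputs are executed, which is what lets a single large graph serve many different Run calls efficiently.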


Figure 2: Example TensorFlow code fragment via Python

Figure 3: Corresponding computation graph for Figure 2

2.4.2 GPU

One of the most important factors behind the accelerated development of deep learning and artificial intelligence in recent years is that GPUs provide a powerful parallel computing architecture, making deep learning training roughly ten times faster than on a CPU and shortening model training time from weeks to a few days. Deep learning simulates neuron operations with a large number of matrix operations, whose characteristic is that each individual operation is simple but very many of them are required, which makes the workload particularly suitable for parallel execution.

Comparing how CPUs and GPUs process tasks makes the difference between the two clear. A CPU contains a few cores optimized for sequential processing, whereas a GPU contains thousands of smaller, more efficient cores optimized for handling multiple tasks simultaneously.

Since its launch by NVIDIA in February 2007, the CUDA programming model has been used to develop many GPU applications. CUDA provides an easy-to-learn extension of ANSI C in which programmers specify parallel threads, each of which runs scalar code. All NVIDIA graphics cards from the GeForce 8 series onward support CUDA technology (Topa, Karwowski, & Noga, 2011).

With this hardware and software support, GPUs have become an integral part of today’s mainstream computing systems. In recent years, the performance and functionality of GPUs have increased significantly: the modern GPU is not only a powerful graphics engine but also a highly parallel programmable processor that substantially outpaces the CPU (Owens et al., 2008).

