The Application of Algorithms with Neural Network and Support
Vector Machine to Reduce the Makespan for Dynamic Parallel
Machine Problems
Student : Zi-Cheng Han
Advisor : Dr. Shu-Hsing Chung
A Thesis
Submitted to Department of Industrial Engineering and Management
College of Management
National Chiao Tung University
in partial Fulfillment of the Requirements
for the Degree of
Master
in
Industrial Engineering
July 2006
Hsinchu, Taiwan, Republic of China
Abstract
Scheduling jobs on parallel machines so as to reduce machine setups is essential when the parallel-machine stage has sequence-dependent setup times and is regarded as the bottleneck of the process. This situation arises in many manufacturing industries, such as wafer probing, IC testing, IC packing, and polyimide (PI) printing in TFT-LCD manufacturing. In practice, it is important to minimize the makespan, i.e. the maximum completion time, and to balance the utilization of the parallel machines. However, dynamic events such as uncertain job arrivals occur frequently, and their negative effects cannot be neglected. It is therefore essential to develop efficient scheduling algorithms that reduce the number of setups among jobs, and thereby the makespan, and that adopt a reactive mechanism such as rescheduling criteria to overcome the impact of dynamic events. In this paper, we consider the dynamic parallel machine problem with sequence-dependent setup times and the objective of minimizing makespan. Jobs are classified into product types and carry processing times, due dates and dynamic arrival times, and machines must be set up for jobs of different product types. We characterize the problem by four features (job-type, tightness, setup time severity and setup time range) and apply a neural network and a support vector machine to determine the value of the parameter used in the cost function of the considered algorithms. We also propose an idle preventing mechanism that keeps machines working, in order to reduce the makespan and improve machine utilization. Computational results are provided to demonstrate that our approach performs better than the existing algorithm.
Contents
List of Tables
List of Figures
Notations
1. Introduction
2. Problem Definition
2.1 Features Definition for Dynamic Parallel Machine Problems
3. Proposed Methodology
3.1 Neural Network with Backpropagation Learning Rule
3.1.1 Structure of Neural Network
3.1.2 Backpropagation Neural Network
3.2 Support Vector Machine
3.3 Proposed Approach to Estimate the Cost Function Parameters
3.4 Algorithm Selecting Decision
4. The Modified Parallel Insertion Algorithms
4.1 Parallel Insertion Algorithms
4.2 The Idle Preventing Mechanism and Rescheduling Criteria
5. Problem Design and Computational Results
6. Conclusion
Reference
List of Tables
Table 1. Summary of the number of problems with lowest makespan for various rescheduling criteria
Table 2. Summary of the number of problems with lowest makespan for various values for parameters
Table 3. The number of the best among all problems solved by the considered algorithms
Table 4. The number of correct identification for rescheduling criterion and value for parameter with neural network and SVM
Table 5. The number of correct identification for the selecting algorithms with neural network and SVM
Table 6. Comparisons of performance for each considered algorithm with setting the rescheduling criterion and value for parameter by neural network or SVM and fixed them according to the best conditions
Table 7. Comparisons of performance between applying neural network and SVM to choose the considered algorithm and estimate the rescheduling criterion and parameter value for the selecting algorithm
Table A 1. Summary of problem design
Table A 2. Real features of 81 problems
Table A 3. The best rescheduling criterion and value for parameter for each algorithm
Table A 4. The makespan of each algorithm with the best rescheduling criteria and value for parameter and the best algorithm for each problem
Table A 5. The estimated rescheduling criterion and value for parameter by neural network
Table A 6. The estimated rescheduling criterion and value for parameter by SVM
Table A 7. The makespan solved by each algorithm with estimated rescheduling criteria and values for parameters by neural network and SVM
Table A 8. The estimated algorithm and makespan solved by the estimated algorithm with the estimated rescheduling criteria and values for parameters by neural network and SVM
List of Figures
Figure 1. The framework of proposed methodology
Figure 2. The structure of a four-layer neural network
Figure 3. The processes of the backpropagation learning rule
Figure 4. The flowchart of training process for the algorithm selecting decision
Figure 5. The scheduling and dispatching of jobs by the considered algorithms with the idle preventing mechanism and rescheduling criteria
Figure 6. The number of even or better cases for a particular algorithm with different
Notations
Problem Definition
$m_k$ : the kth machine of identical parallel machines
$K$ : the total number of identical parallel machines
$J_r$ : the rth subset (product type) of jobs
$I$ : the total number of jobs
$R$ : the total number of product types
$p_i$ : the processing time of job $j_i$
$F(i)$ : the product type of job $j_i$
$C_{\max}$ : the maximum completion time (makespan)
$W$ : the predetermined machine capacity expressed in terms of processing time units
$s_{il}$ : the sequence-dependent setup time between any two consecutive jobs $j_i$ and $j_l$
$b_i$ : the arrival time of job $j_i$
$d_i$ : the due date of job $j_i$
$e_i$ : the latest starting processing time of job $j_i$
$JR$ : the job-type feature, which indicates the average number of jobs included in a product type
$T$ : the tightness feature, which indicates the difficulty of scheduling
$D(n)$ : the nth time point of due dates
$Num(n)$ : the number of jobs which belong to $D(n)$
$ST$ : the setup time severity feature, which presents the relative significance of the setup time compared to the processing time
$SR$ : the setup time range feature, which presents the degree of the spread of setup time
$s_{\max}$ : the maximum of setup time
$s_{\min}$ : the minimum of setup time
Structure of Neural Network
$G$ : the input layer node in a four-layer neural network
$U$ : the first hidden layer node in a four-layer neural network
$V$ : the second hidden layer node in a four-layer neural network
$A$ : the output layer node in a four-layer neural network
$O^G_g$ : the real output of input layer
$O^U_u$ : the real output of first hidden layer
$O^V_v$ : the real output of second hidden layer
$O^A_a$ : the real output of output layer
$w_{ug}$ : the weight from input layer node to first hidden layer node
$w_{vu}$ : the weight from first hidden layer node to second hidden layer node
$w_{av}$ : the weight from second hidden layer node to output layer node
$D_a$ : the desired output
Backpropagation Neural Network
$f(\cdot)$ : the activation function used in the neural network
$S^A_a$ : the linear summation of the product of $w_{av}$ and $O^V_v$
$E$ : the square-error between $D_a$ and $O^A_a$
$\eta$ : the learning rate which affects the speed of convergence of neural network
$S^V_v$ : the linear summation of the product of $w_{vu}$ and $O^U_u$
$S^U_u$ : the linear summation of the product of $w_{ug}$ and $O^G_g$
$IV$ : the input vector for training of neural network
Support Vector Machine
$PS$ : the training patterns for neural network and SVM
$x_i$ : the ith input pattern
$y_i$ : the ith desired output pattern
$w$ : the weight vector normal to the separating hyperplane
The Modified Parallel Insertion Algorithms
C1ipk : the insertion cost of job ji on position p of machine mk
C1ik : the minimal insertion cost of job ji on machine mk
C1i : the minimal insertion cost of job ji among all positions of all machines
C2i : the regret value of job ji
SLi : the slackness of job ji
: the parameter in PIA-II which is adjusted for more focusing on the additional setup time or on the slackness of job ji
: the parameter in PIA-III which is adjusted for more focusing on the standard deviation of minimum insertion cost on all machines
Stdv(C1ik) : the standard deviation of all the best insertion costs of all machines
1. Introduction
In the real world, scheduling jobs on parallel machines so as to reduce machine setups is essential in processes such as wafer probing, IC testing, IC packing, and polyimide (PI) printing in TFT-LCD manufacturing. In such industries, it is important to minimize the makespan and to balance the utilization of machines. However, dynamic events, such as jobs with uncertain arrival times, affect the efficiency of a scheduling algorithm. Moreover, sequence-dependent setup time is a practical issue: a changeover time is incurred whenever a machine consecutively processes two jobs of different product types. It is therefore important to develop efficient scheduling algorithms that reduce the number of setups among jobs, and thereby the makespan, and that use a reactive mechanism such as rescheduling criteria to overcome the impact of dynamic events. For these reasons, we consider the dynamic parallel machine problem with sequence-dependent setup times and the objective of minimizing makespan.
It is well known that minimizing makespan on identical parallel machines is NP-hard, as shown by Sethi (1977) and Garey et al. (1979). Some researchers have developed mathematical programming approaches, such as integer programming, branch-and-bound and dynamic programming, to find the optimal solution of the identical parallel machines problem with makespan minimization. Jacek (2004) considered the identical parallel machines problem with makespan minimization in which several machines may process one job simultaneously. He derived processing speed functions, based on convex, concave or arbitrary forms, to decide the job sequences. Jozefowska (2004) developed a two-phase algorithm to solve the parallel machines problem of minimizing the maximum completion time with preemption allowed. The first phase uses a linear programming model to solve the problem without preemption, and the second phase applies the branch-and-bound method to find the optimal preemptive schedule from the first-phase solution. Klaus and Lorant (2003) formulated the problem of scheduling on identical parallel machines with makespan minimization, allowing preemption and the simultaneous processing of one job by several machines, as a linear programming model and solved it.
However, mathematical programming can find optimal solutions only for small-scale problems. To solve large-scale problems, several polynomial-time heuristic algorithms have been proposed. Lin and Liao (2004) considered the identical parallel machines problem of minimizing makespan subject to minimum total flowtime. They developed an algorithm that saves computational time through lower bound calculation and job replacement rules. Gupta et al. (2001) proposed the LISTFIT algorithm, which assigns a proper job to the next available machine from a job list ordered by the proposed criteria, to solve the identical parallel machines problem with makespan minimization. Min and Cheng (1999) proposed a genetic algorithm for the parallel machines problem with makespan minimization; it includes the procedures of coding, initializing, reproducing, crossover and mutation, and is efficient for large-scale problems. Veen and Zhang (1996) applied network algorithms to the single machine and parallel machines problems with makespan minimization, which are treated as the traveling salesman problem (TSP) and the vehicle routing problem (VRP).
It is well known that network algorithms are efficient for some scheduling problems. Pearn et al. (2002) considered the wafer probing scheduling problem (WPSP) with the objective of minimizing total workload. They formulated the WPSP as an integer programming problem and transformed it into the vehicle routing problem with time windows (VRPTW). They proposed three VRPTW algorithms for solving the WPSP, and their computational results showed that the network transformation allows VRPTW algorithms to solve the WPSP efficiently. However, the objective of minimizing total workload may result in unbalanced utilization of machines. Tsai (2004) proposed improving heuristic algorithms with an adjusting procedure and a hybrid genetic algorithm for the WPSP with the objective of minimizing makespan.
In this paper, we consider the identical parallel machines problem with the objective of minimizing makespan, taking dynamic arrivals and sequence-dependent setup times into account. We propose an extension of the modified parallel insertion algorithms developed by Hsiao (2003) and Tsai (2004), and apply a neural network and a support vector machine (SVM) to estimate the rescheduling criterion and the value of the parameter used in the cost function of the considered algorithms. In addition, the neural network and SVM are adopted to make the algorithm selection decision based on the features of a problem. For dynamic environments, we develop the idle preventing mechanism to reduce the idle time of machines. The rest of this paper is organized as follows. Section 2 defines the problem and introduces its features. Section 3 describes the proposed approach to the dynamic parallel machine problem with makespan minimization, in which some critical decisions are made by the neural network or SVM to facilitate the execution of the scheduling algorithms. Section 4 introduces the modified parallel insertion algorithms and the reactive mechanisms, namely the rescheduling policy and the idle preventing rule, that deal with dynamic events. The problem design and computational testing are presented in section 5. Finally, section 6 gives the conclusions.
2. Problem Definition
Consider the problem in which jobs of several product types, each with an arrival time and a due date, are to be processed on identical parallel machines. The job processing time depends on the product type of the job. The machine setup time between two consecutive jobs of different product types is sequence dependent. The objective is to find a schedule for the dynamically arriving jobs that minimizes the maximum completion time.
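As an illustration of the objective, the following minimal Python sketch evaluates the makespan of a given schedule. It is not the thesis's algorithm: the function name `makespan`, the dictionary-based data layout, the use of `None` for the idle machine state, the zero setup between jobs of the same product type, and the rule that a setup may begin only after the job has arrived are all assumptions made for this example.

```python
def makespan(schedule, arrival, job_type, proc_time, setup):
    """Return the maximum completion time of a schedule (illustrative sketch).

    schedule  : list of job-id sequences, one sequence per machine
    arrival   : dict job-id -> arrival time
    job_type  : dict job-id -> product type
    proc_time : dict product type -> processing time (depends on type only)
    setup     : dict (previous type, next type) -> setup time;
                previous type None means the machine is idle (assumption)
    """
    c_max = 0.0
    for sequence in schedule:
        t = 0.0            # time at which the machine becomes free
        prev = None        # the machine starts in the idle state
        for job in sequence:
            cur = job_type[job]
            # Assumption: no setup is needed between jobs of the same type.
            s = 0.0 if prev == cur else setup[(prev, cur)]
            # Assumption: the setup starts once the machine is free
            # and the job has arrived.
            start = max(t, arrival[job])
            t = start + s + proc_time[cur]
            prev = cur
        c_max = max(c_max, t)
    return c_max


# Hypothetical data: three jobs of two product types on two machines.
example_setup = {(None, "A"): 1, (None, "B"): 1, ("A", "B"): 2, ("B", "A"): 3}
example_value = makespan(
    schedule=[[0, 2], [1]],
    arrival={0: 0, 1: 0, 2: 6},
    job_type={0: "A", 1: "B", 2: "A"},
    proc_time={"A": 4, "B": 5},
    setup=example_setup,
)
```

In this hypothetical instance the first machine finishes job 0 at time 5, waits for job 2 to arrive at time 6, and completes it at time 10 without a further setup, so the second machine (finishing at time 6) does not determine the makespan.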
We first define the machine set as $M = \{m_1, m_2, \ldots, m_k, \ldots, m_K\}$, containing K identical parallel machines. The job subsets for different product types can be defined as $J = \{J_1, J_2, \ldots, J_r, \ldots, J_R, J_{R+1}\}$, and each subset $J_r = \{ j_i \mid i = I_{r-1}+1, I_{r-1}+2, \ldots, I_{r-1}+I_r \}$ contains $I_r$ jobs, where $I_0 = 0$ and $I_{R+1} = K$. The job subset $J_{R+1}$ is a pseudo product type defined as $J_{R+1} = \{ j_i \mid i = I_R+1, I_R+2, \ldots, I_R+K \}$, which contains K jobs and is used to state that the K machines are idle. The notation r indicates a specific product type and R is the total number of product types. Thus, there are I jobs grouped into R product types and K pseudo jobs belonging to the subset $J_{R+1}$, where $I = I_1 + I_2 + \cdots + I_R$. Let $p_i$ be the processing time of job $j_i$ in cluster $J_r$ ($j_i \in J_r$); it is equal to $p_{F(i)}$ because the job processing time depends on the product type of the job, where $F(i)$ is the product type of job $j_i$. Let $C_{\max}$ be the maximum completion time (makespan) and W be the predetermined machine capacity expressed in terms of processing time units. Let $s_{il}$ be the setup time between any two consecutive jobs $j_i$ and $j_l$, in which $s_{il}$ is equal to $s_{F(i)F(l)}$. Further, let $b_i$ and $d_i$ be the arrival time and due date of job $j_i$, respectively. Let $e_i$ be the latest starting processing time of job $j_i$ to be processed on machine $m_k$, where $e_i$ can be formulated as $e_i = d_i - p_i$.
2.1 Features Definition for Dynamic Parallel Machine Problems
Any scheduling problem can be characterized by several features, such as average processing time, average setup time, average due date and so on. Problems with different features may be best solved by different algorithms. Therefore, we identify four features of the considered problem and analyze the solution quality of various algorithms for problems with diverse characteristics.
The first feature is the job-type feature, which is defined as
$$JR = \frac{I}{R} \tag{1}$$
where I is the number of jobs and R is the number of product types. The feature JR indicates the average number of jobs included in a product type.
The second feature is the tightness feature, which is defined as
$$T = \frac{1}{I}\sum_{n=1}^{N}\left[D(n) \ast Num(n)\right] - \bar{p} \tag{2}$$
In equation (2), the tightness feature indicates the average slackness of jobs. A smaller value of T implies less slackness and a more difficult scheduling problem. We divide the due dates of jobs into N levels; $D(n)$ indicates the due date of the nth level and $Num(n)$ indicates the number of jobs with due date $D(n)$. For instance, suppose there are 10 jobs with due dates of 10 hours and 15 jobs with due dates of 20 hours, and the average processing time is 8 hours. Then the tightness feature is $T = \frac{1}{25}(10 \ast 10 + 20 \ast 15) - 8 = 8$ hours.
The third feature is the setup time severity feature proposed by Lee and Pinedo (1997), which is defined as
$$ST = \frac{\bar{s}}{\bar{p}} \tag{3}$$
where $\bar{p}$ is the average processing time and $\bar{s}$ is the average setup time, including the setup times between jobs of different product types and from the machine idle state to the processing state. It presents the relative significance of the setup time compared to the processing time.
The fourth feature is the setup time range, which is defined as
$$SR = \frac{s_{\max} - s_{\min}}{\bar{p}} \tag{4}$$
where $s_{\max}$ and $s_{\min}$ are the maximum and minimum values of setup time. The larger SR is, the more important the number of setups becomes.
For each particular problem, we can identify the above features and observe the changes in the makespan for problems with different combinations of the four features. By analyzing these variations, we expect to develop proper approaches that solve the problems more efficiently; these approaches are introduced in the next section.
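The four features in equations (1) to (4) can be computed directly from the job data. The following Python sketch is only an illustration; the function name `problem_features` and its list-based arguments are assumptions introduced here, and the due-date levels $D(n)$ are taken to be the distinct due-date values.

```python
from collections import Counter
from statistics import mean


def problem_features(due_dates, proc_times, setup_times, num_types):
    """Compute (JR, T, ST, SR) following equations (1)-(4).

    due_dates   : due date of every job (one entry per job)
    proc_times  : processing time of every job
    setup_times : all setup times considered in the problem
    num_types   : R, the number of product types
    """
    num_jobs = len(due_dates)              # I
    p_bar = mean(proc_times)               # average processing time
    s_bar = mean(setup_times)              # average setup time

    jr = num_jobs / num_types              # equation (1)

    # Equation (2): group jobs by due-date level D(n) with counts Num(n).
    levels = Counter(due_dates)
    t = sum(d * n for d, n in levels.items()) / num_jobs - p_bar

    st = s_bar / p_bar                     # equation (3)
    sr = (max(setup_times) - min(setup_times)) / p_bar   # equation (4)
    return jr, t, st, sr


# The tightness example from the text: 10 jobs due at 10 hours, 15 jobs due
# at 20 hours, average processing time 8 hours. The setup times and the
# number of product types below are hypothetical values.
features = problem_features(
    due_dates=[10] * 10 + [20] * 15,
    proc_times=[8] * 25,
    setup_times=[2, 4, 6],
    num_types=5,
)
```

With these inputs the tightness component reproduces the value of 8 hours from the worked example above.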
3. Proposed Methodology
In our approach, we extend the modified parallel insertion algorithms proposed by Hsiao (2003) and Tsai (2004) with a neural network and a support vector machine (SVM). These are used to estimate the values of the cost function parameters and the rescheduling criteria used in the considered algorithms, and to make the algorithm selecting decision, which gives a strategy for choosing an appropriate modified parallel insertion algorithm for a specific problem. For a particular problem, defined by the features JR, T, ST and SR stated in section 2.1, we first obtain a suggested algorithm through the algorithm selecting decision. We then apply the neural network and SVM to estimate appropriate parameter values for the suggested algorithm, and solve the problem with that algorithm and the estimated values. A parameter value is appropriate if the modified parallel insertion algorithm performs best when using it. In addition, for problems with dynamic arrivals, we propose the idle preventing mechanism to reduce the machine idle time caused by stochastic inter-arrival times and by the unbalanced schedules generated by the ordering criteria of the improving algorithm. Figure 1 shows the framework of the proposed methodology.
Figure 1. The framework of proposed methodology. (Flowchart boxes: a specific problem; identify the features JR, T, ST and SR; the decision of algorithm selecting among the parallel insertion algorithms, made by the trained neural network; the application of neural network and SVM, after training, to determine the value of the parameter in the cost function and the rescheduling criteria for the parallel insertion algorithms; applying the considered algorithm with the idle preventing mechanism to schedule jobs for the dynamic parallel machine problem.)
The neural network and SVM models that estimate the proper values of the cost function parameters and the rescheduling criteria for each modified parallel insertion algorithm are established by training on a large set of data. We generate a set of problems by combining different features and solve them with the considered algorithms and the idle preventing mechanism to obtain the appropriate parameter values for those algorithms. Then, for each problem, we take the four features as the inputs, and the proper values of the cost function parameters and rescheduling criteria as the desired outputs, to train the neural network and SVM. After the neural network and SVM models for the different algorithms have been trained, we can make the algorithm selecting decision by applying these models. We introduce the neural network and SVM in sections 3.1 and 3.2. The estimating approach is stated in section 3.3, and section 3.4 introduces the proposed algorithm selecting decision. The modified parallel insertion algorithms, including the cost function parameters and the idle preventing mechanism, are stated in section 4.
3.1 Neural Network with Backpropagation Learning Rule
The neural network is an artificial intelligence system designed to perform intelligent tasks similar to those performed by the human brain. As stated by Huang (2003), neural networks resemble the human brain in acquiring knowledge through learning and in storing knowledge within inter-neuron connection strengths known as synaptic weights. A neural network can capture and represent complex input/output relations, classify data into characterized groups and, in some applications, solve problems optimally.
Sabuncuoglu et al. (1997) and Chen et al. (2001) applied the Hopfield neural network, which is commonly used to obtain optimal solutions in various applications such as the traveling salesman problem (TSP), to generate optimal solutions for job shop scheduling problems. Unfortunately, the Hopfield neural network can solve only small-scale problems optimally, which makes it unsuitable for practical scheduling problems.
Park et al. (2000) proposed a heuristic algorithm for scheduling problems with the objective of minimizing total weighted tardiness. They applied a neural network to find proper parameters for the apparent tardiness cost with setups (ATCS) dispatching rule developed by Lee and Pinedo (1997), and showed that their extended algorithm based on the neural network performs better than setting the parameters by formula.
Because solving scheduling problems optimally with a neural network is restricted to small-scale problems, we instead use the learning capability of the neural network to find parameter values with which the algorithms perform efficiently. In our approach, the process of estimating proper parameter values is similar to pattern recognition, and the backpropagation (BP) learning rule is a popular method for pattern recognition. Therefore, we apply the backpropagation method as the learning rule in our neural network model. We input characterized data and desired outputs into the neural network model and generate convergent weights by the backpropagation learning rule. After the weighting coefficients have converged, we can determine the appropriate parameter values for the improving algorithms, which are described in section 4, through our neural network model. In the following, we briefly introduce the structure of our neural network, the backpropagation rule and the execution of the learning process.
3.1.1 Structure of Neural Network
We construct a four-layer neural network containing one input layer, two hidden layers and one output layer. Figure 2 shows the architecture of a simple four-layer neural network.
Figure 2. The structure of a four-layer neural network
The nodes in the input layer, the first hidden layer, the second hidden layer and the output layer are indexed with 1, 2, …, G; 1, 2, …, U; 1, 2, …, V; and 1, 2, …, A, respectively. The nodes G+1, U+1 and V+1 are bias nodes, whose outputs are usually set to 1. The outputs of the input, first hidden, second hidden and output layers are denoted $O^G_g$, $O^U_u$, $O^V_v$ and $O^A_a$. $w_{ug}$ is the weight from node g to node u, $w_{vu}$ is the weight from node u to node v, and $w_{av}$ is the weight from node v to node a. We obtain $O^U_u$ by transforming the inner product of $O^G_g$ and $w_{ug}$ with the activation function. In the same way, transforming the inner product of $O^U_u$ and $w_{vu}$ gives $O^V_v$, and transforming the inner product of $O^V_v$ and $w_{av}$ gives $O^A_a$. The activation function adjusts the inner product of the input data and the weights so that it is comparable with the desired output. For each $O^A_a$ in the output layer, there is a corresponding desired output $D_a$. We adjust the weights by comparing the real output $O^A_a$ with the desired output $D_a$.
3.1.2 Backpropagation Neural Network
The backpropagation learning rule means that the adjustment of the weighting coefficients propagates from the output layer to the hidden layers, and then to the input layer. Assume the activation function is $f(\cdot)$; the real output $O^A_a$ is then defined by the following equations.
$$O^A_a = f(S^A_a) \tag{5}$$
$$S^A_a = \sum_{v=1}^{V+1} w_{av} O^V_v \tag{6}$$
where $S^A_a$ is the linear summation of the product of $w_{av}$ and $O^V_v$ over all $v \in V$. Comparing the real output $O^A_a$ with the desired output $D_a$, the sum of the squared-error E can be formulated as
$$E = \frac{1}{2}\sum_{a=1}^{A}\left(D_a - O^A_a\right)^2 \tag{7}$$
From equation (7), the adjustment of the weighting coefficient can be defined by the equation below.
$$\begin{aligned}
\Delta w_{av} &= w_{av}(t+1) - w_{av}(t) \\
&= -\eta \frac{\partial E}{\partial w_{av}} \\
&= -\eta \frac{\partial E}{\partial S^A_a}\frac{\partial S^A_a}{\partial w_{av}} \\
&= -\eta \frac{\partial E}{\partial O^A_a}\frac{\partial O^A_a}{\partial S^A_a}\frac{\partial S^A_a}{\partial w_{av}} \\
&= -\eta \left(-(D_a - O^A_a)\right) f'(S^A_a)\, O^V_v \\
&= \eta\, (D_a - O^A_a)\, f'(S^A_a)\, O^V_v, \qquad a = 1, 2, \ldots, A,\; v = 1, 2, \ldots, V, V+1
\end{aligned} \tag{8}$$
where $\eta$ is the learning rate, which affects the speed of convergence of the neural network. A larger $\eta$ leads to quick convergence and a smaller $\eta$ results in slow convergence. Learning rates that are either too large or too small have ill effects on the training of neural networks. Following equation (8), we can define the adjustment of the weighting coefficients between the second hidden layer and the first hidden layer in the same way.
$$\begin{aligned}
\Delta w_{vu} &= w_{vu}(t+1) - w_{vu}(t) \\
&= -\eta \frac{\partial E}{\partial w_{vu}} \\
&= -\eta \frac{\partial E}{\partial O^V_v}\frac{\partial O^V_v}{\partial S^V_v}\frac{\partial S^V_v}{\partial w_{vu}} \\
&= -\eta \frac{\partial E}{\partial O^V_v}\, f'(S^V_v)\, O^U_u \\
&= -\eta \left(\frac{\partial E}{\partial S^A_1}\frac{\partial S^A_1}{\partial O^V_v} + \frac{\partial E}{\partial S^A_2}\frac{\partial S^A_2}{\partial O^V_v} + \cdots + \frac{\partial E}{\partial S^A_A}\frac{\partial S^A_A}{\partial O^V_v}\right) f'(S^V_v)\, O^U_u \\
&= -\eta \left(\frac{\partial E}{\partial O^A_1}\frac{\partial O^A_1}{\partial S^A_1}\frac{\partial S^A_1}{\partial O^V_v} + \frac{\partial E}{\partial O^A_2}\frac{\partial O^A_2}{\partial S^A_2}\frac{\partial S^A_2}{\partial O^V_v} + \cdots + \frac{\partial E}{\partial O^A_A}\frac{\partial O^A_A}{\partial S^A_A}\frac{\partial S^A_A}{\partial O^V_v}\right) f'(S^V_v)\, O^U_u \\
&= -\eta \left[\sum_{a=1}^{A}\frac{\partial E}{\partial O^A_a}\frac{\partial O^A_a}{\partial S^A_a}\frac{\partial S^A_a}{\partial O^V_v}\right] f'(S^V_v)\, O^U_u \\
&= -\eta \left\{\sum_{a=1}^{A}\left[-(D_a - O^A_a)\, f'(S^A_a)\, w_{av}\right]\right\} f'(S^V_v)\, O^U_u \\
&= \eta \left[\sum_{a=1}^{A}(D_a - O^A_a)\, f'(S^A_a)\, w_{av}\right] f'(S^V_v)\, O^U_u, \qquad v = 1, 2, \ldots, V,\; u = 1, 2, \ldots, U, U+1
\end{aligned} \tag{9}$$
We can obtain the adjustment of the weighting coefficients between the first hidden layer and the input layer in the same way, which gives equation (10).
$$\Delta w_{ug} = \eta \left\{\sum_{v=1}^{V}\left[\sum_{a=1}^{A}(D_a - O^A_a)\, f'(S^A_a)\, w_{av}\right] f'(S^V_v)\, w_{vu}\right\} f'(S^U_u)\, O^G_g, \qquad u = 1, 2, \ldots, U,\; g = 1, 2, \ldots, G, G+1 \tag{10}$$
The backpropagation learning rule of the two-hidden-layer perceptron can be stated as follows.
Step 1. Choose the activation function, $f(\cdot)$, and set the learning rate, $\eta$, and the error limit, $\theta$. Initialize the weighting coefficients, $w_{av}$, $w_{vu}$ and $w_{ug}$.
Step 2. Input a training vector, $IV(t)$, and its desired output vector, $D(t)$.
Step 3. Calculate $S^U_u$, $O^U_u$, $S^V_v$, $O^V_v$, $S^A_a$ and $O^A_a$. $O^G_g$ is equal to $IV(t)$.
Step 4. Adjust the weighting coefficient, $w_{av}$.
$$w_{av}(t+1) = w_{av}(t) + \eta\, (D_a - O^A_a)\, f'(S^A_a)\, O^V_v$$
Step 5. Adjust the weighting coefficient, $w_{vu}$.
$$w_{vu}(t+1) = w_{vu}(t) + \eta \left[\sum_{a=1}^{A}(D_a - O^A_a)\, f'(S^A_a)\, w_{av}\right] f'(S^V_v)\, O^U_u$$
Step 6. Adjust the weighting coefficient, $w_{ug}$.
$$w_{ug}(t+1) = w_{ug}(t) + \eta \left\{\sum_{v=1}^{V}\left[\sum_{a=1}^{A}(D_a - O^A_a)\, f'(S^A_a)\, w_{av}\right] f'(S^V_v)\, w_{vu}\right\} f'(S^U_u)\, O^G_g$$
Step 7. Compute the sum of squared-error, E, between the real output $O^A_a$ and the desired output $D_a$.
Step 8. If the error E is less than the error limit $\theta$, then stop. Otherwise, go to Step 2.
The processing of the backpropagation learning rule for a four-layer neural network is shown in Figure 3.
Figure 3. The processes of the backpropagation learning rule.
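The update rules of the backpropagation learning rule can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work: the sigmoid activation (for which $f'(S) = O(1 - O)$), the function names `sigmoid` and `train_step`, the layer sizes, the learning rate and the random initialization are all assumptions. All three adjustments are computed from the weights at time t before any weight is overwritten.

```python
import numpy as np


def sigmoid(s):
    # Assumed activation function f; the text leaves f unspecified.
    return 1.0 / (1.0 + np.exp(-s))


def train_step(w_ug, w_vu, w_av, iv, d, eta):
    """One backpropagation update for a single training vector.

    Shapes: w_ug (U, G+1), w_vu (V, U+1), w_av (A, V+1); the last column
    of each matrix multiplies the bias node, whose output is fixed to 1.
    The weights are updated in place; the squared error E measured
    before the update is returned.
    """
    # Forward pass (Step 3).
    o_g = np.append(iv, 1.0)
    o_u = np.append(sigmoid(w_ug @ o_g), 1.0)
    o_v = np.append(sigmoid(w_vu @ o_u), 1.0)
    o_a = sigmoid(w_av @ o_v)

    # Error terms; for the sigmoid, f'(S) = O * (1 - O).
    delta_a = (d - o_a) * o_a * (1.0 - o_a)
    delta_v = (w_av[:, :-1].T @ delta_a) * o_v[:-1] * (1.0 - o_v[:-1])
    delta_u = (w_vu[:, :-1].T @ delta_v) * o_u[:-1] * (1.0 - o_u[:-1])

    # Weight adjustments following equations (8), (9) and (10).
    w_av += eta * np.outer(delta_a, o_v)
    w_vu += eta * np.outer(delta_v, o_u)
    w_ug += eta * np.outer(delta_u, o_g)

    return 0.5 * float(np.sum((d - o_a) ** 2))   # equation (7)


# Hypothetical usage: 4 inputs (e.g. JR, T, ST, SR scaled to [0, 1]),
# 3 nodes in each hidden layer and 1 output.
rng = np.random.default_rng(0)
w_ug = rng.normal(scale=0.5, size=(3, 5))
w_vu = rng.normal(scale=0.5, size=(3, 4))
w_av = rng.normal(scale=0.5, size=(1, 4))
iv, d = np.array([0.2, 0.4, 0.6, 0.8]), np.array([0.9])

first_error = train_step(w_ug, w_vu, w_av, iv, d, eta=0.5)
for _ in range(200):
    last_error = train_step(w_ug, w_vu, w_av, iv, d, eta=0.5)
```

In a full training run, Steps 2 to 8 would loop over all training vectors and stop once the error falls below the error limit; the loop above only repeats one pattern to show that the error shrinks.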
3.2 Support Vector Machine
Support vector machine (SVM) is a learning system based on statistic theory and has become a popular method to classify and predict problems in recent years. SVM proposed by Corts and Vapnik (1995) is an excellence classified approach and used to deal with complicated problems widely such as symbol classifications, pattern recognitions and analysis of genetic sequence. Gersmann and Hammer (2004, 2005) considered the resource constrained project scheduling problems and applied SVM to learning an appropriate value function which guides the search strategy to solve the problems. Liu et al. (2005) applied SVM to propose support vector scheduler for scheduling the flexible manufacturing system problems and showed the performance better than other dispatching rules such as FIFO (First in First out), SPT (Shortest Processing Time) and EDD (Earliest Due Date).
The decision boundary of traditional classified approaches such as neural network is decided by all training vectors of both classifications, but most training vectors do not make the critical influence to the decision boundary. SVM is base on statistic theory to pick several key training vectors called support vectors for making the decision boundary and separate the two classifications farther.
Given the training pattern set $PS=\{(x_i,y_i)\mid i=1,2,\ldots,N\}$, where $x_i\in\mathbb{R}^n$ and $y_i\in\{+1,-1\}$ indicate the input and desired output patterns respectively, we want to find the optimum linear separating hyperplane $f(x)=w\cdot x+\delta=0$, where $w$ is the normal vector of the hyperplane and $\delta$ is the bias. The distance between a pattern $x$ and the hyperplane is $|w\cdot x+\delta|/\|w\|$. We define $d_+$ ($d_-$) as the minimum distance between the positive (negative) patterns and the separating hyperplane, and the "Margin" as $(d_+ + d_-)$. It can be shown that the Margin equals $2/\|w\|$; thus, maximizing the Margin is equivalent to the following problem:
Minimize $\|w\|^2$
Subject to $y_i(w\cdot x_i+\delta)-1\ge 0$, $i = 1, 2, \ldots, N$
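The step from the Margin to the minimization problem can be spelled out as follows. This is a short sketch under the usual canonical scaling, an assumption consistent with the constraint above, in which the patterns closest to the hyperplane satisfy the constraint with equality; $x_+$ and $x_-$ are illustrative names for those closest positive and negative patterns.

```latex
\begin{aligned}
w\cdot x_+ + \delta &= +1, \qquad w\cdot x_- + \delta = -1,\\
d_+ &= \frac{|w\cdot x_+ + \delta|}{\|w\|} = \frac{1}{\|w\|}, \qquad
d_- = \frac{|w\cdot x_- + \delta|}{\|w\|} = \frac{1}{\|w\|},\\
\text{Margin} &= d_+ + d_- = \frac{2}{\|w\|},\\
\max_{w,\delta}\ \frac{2}{\|w\|}
&\iff \min_{w,\delta}\ \|w\|^2
\quad\text{subject to } y_i(w\cdot x_i+\delta)-1\ge 0 .
\end{aligned}
```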
For linearly nonseparable problems, two commonly used kernel functions are introduced as follows.
Radial Basis Function (RBF) Kernel: $K(x,x')=\exp\left(-\|x-x'\|^2/(2\zeta^2)\right)$
Polynomial Kernel: $K(x,x')=(x\cdot x'+1)^{\psi}$
Here $\zeta$ and $\psi$ are user-defined parameters which indicate the width of the RBF kernel and the power of the polynomial kernel, respectively. The separating hyperplane can then be defined as $f(x)=\sum_{i=1}^{L}\tau_i y_i K(x_i,x)+\delta=0$, where $L$ is the number of support vectors and $\tau_i$ are the Lagrange multipliers. In general, we apply the Lagrangian method to solve such optimization problems.
Friess et al. (1998) developed the Kernel Adatron (KA) algorithm, which is powerful for classification problems. The Kernel Adatron algorithm is a very fast approach to approximating the solution of support vector learning and is especially effective in separating sets of data that share complex boundaries. The steps of the Kernel Adatron algorithm are as follows.
Step 1. Initialize $\rho_i = 1$.
Step 2. For every training pattern $(x_i, y_i)$, calculate $z_i=\sum_{j=1}^{L}\rho_j y_j K(x_i,x_j)$.
Step 3. Compute $\gamma_i = y_i z_i$ for all patterns.
Step 4. Let $\Delta\rho_i=\eta(1-\gamma_i)$ be the change of the multiplier $\rho_i$.
Step 5. If $(\rho_i+\Delta\rho_i)\le 0$, set $\rho_i = 0$ to avoid the multiplier being negative; otherwise update the multiplier as $\rho_i=\rho_i+\Delta\rho_i$.
Step 6. Calculate the bias from $bias=\frac{1}{2}\left(\min(z_i^+)+\max(z_i^-)\right)$, where $z_i^+$ is the value of a pattern $i$ with class label +1 and $z_i^-$ is the value of one with class label -1.
Step 7. If the maximum number of presentations of the pattern set has been exceeded then stop, otherwise return to Step 2.
The kernel $K(x,x')$ used in the Kernel Adatron algorithm can be the RBF or polynomial kernel function given above. The speed of convergence is controlled by $\eta$, which is the learning rate stated in section 3.1.2.
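The seven steps above can be sketched in a few lines of Python. This is a minimal illustration, not the NeuroSolutions implementation used in this study: the function names, the default values of `eta`, `zeta` and `max_epochs`, and the choice to subtract the bias as a threshold at prediction time are all assumptions made for the example. The bias of Step 6 is computed once after the last pass, which gives the same final value as recomputing it in every pass.

```python
import math

def rbf_kernel(x, z, zeta=1.0):
    """RBF kernel exp(-||x - z||^2 / (2 * zeta^2)); zeta is the kernel width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * zeta ** 2))

def kernel_adatron(X, y, kernel=rbf_kernel, eta=0.1, max_epochs=200):
    """Return (rho, bias) after max_epochs presentations of the pattern set."""
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    rho = [1.0] * n                                   # Step 1
    for _ in range(max_epochs):                       # Step 7 (fixed budget)
        for i in range(n):
            z_i = sum(rho[j] * y[j] * K[i][j] for j in range(n))  # Step 2
            gamma_i = y[i] * z_i                      # Step 3
            delta = eta * (1.0 - gamma_i)             # Step 4
            rho[i] = max(0.0, rho[i] + delta)         # Step 5: keep rho_i >= 0
    z = [sum(rho[j] * y[j] * K[i][j] for j in range(n)) for i in range(n)]
    z_pos = [z[i] for i in range(n) if y[i] == 1]
    z_neg = [z[i] for i in range(n) if y[i] == -1]
    bias = 0.5 * (min(z_pos) + max(z_neg))            # Step 6
    return rho, bias

def predict(x, X, y, rho, bias, kernel=rbf_kernel):
    """Classify x; the bias is used as a threshold (assumption of this sketch)."""
    score = sum(r * yi * kernel(xi, x) for r, yi, xi in zip(rho, y, X)) - bias
    return 1 if score >= 0 else -1

# Usage on two small, well-separated clusters (made-up data).
X = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
y = [-1, -1, 1, 1]
rho, bias = kernel_adatron(X, y)
print(predict((0.0, 0.5), X, y, rho, bias), predict((3.0, 3.5), X, y, rho, bias))
```

Note that only the learning rate and the kernel parameter need to be chosen; there is no hidden-layer structure to tune.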
The Kernel Adatron algorithm is able to maximize the margin in the feature space and to deal with complicated classification problems. Therefore, we adopt the Kernel Adatron algorithm for the support vector machine and assign one Gaussian function to each input training pattern. The structure of SVM is simpler than that of the neural network because the user does not have to determine the number of hidden layers or the number of hidden nodes in each hidden layer. If a hyperplane cannot separate the training patterns, SVM maps them into a higher-dimensional space.
3.3 Proposed Approach to Estimate the Cost Function Parameters
For scheduling the dynamic parallel machine problem with sequence-dependent setup time and the objective of minimizing makespan, we adopt the modified parallel insertion algorithms developed by Hsiao (2003) and Tsai (2004) together with the proposed idle preventing mechanism. In our proposed approach, we apply the neural network and SVM respectively to find appropriate values for the parameters, including the rescheduling criteria, used in the parallel insertion algorithms so that they perform more efficiently.
In our proposed approach, we use the four features, JR, T, ST and SR, stated in section 2.2 as the input vector of the neural network and SVM for a characterized problem. We apply a four-layer neural network with five input nodes, including a bias node, and the desired output is the values of the parameters in which we are interested. We define the activation function f(•) used in the neural network as
$f(S)=\dfrac{1}{1+e^{-S}}$. (11)
There is no objective measure, such as a formula, for the number of nodes in the hidden layers. It can only be determined by repeated testing to find a number of nodes that makes the weighting coefficients converge efficiently.
Our proposed approach can be briefly described in three phases: Phase I, identify the input vectors and desired outputs for the neural network model; Phase II, train on the patterns; Phase III, get the estimated parameter value for a particular problem and solve it. In Phase I, we generate a set of problems which are combinations of different features and use the considered algorithms with the idle preventing mechanism to solve them with several levels of the parameters. Then, for each problem, we identify the four features as the input vector and the parameter value that minimizes the makespan as the desired output. After identifying the input and output data, we feed the features and best parameter values of all problems in sequence into the neural network model for training in Phase II. When the average error is less than the value we set, the training process is finished and we can input the features of a specific problem and obtain an appropriate parameter value for the considered algorithm through the neural network or SVM model. Then, we apply the modified parallel insertion algorithm with the parameter value obtained from the model to generate a quality schedule for the specific problem. The detailed procedure of the proposed approach for estimating the cost function parameters used in the improved algorithms is stated as follows.
Phase I – Identify the input and desired output data:
Step 1. Generate a set of characterized problems by combining different features. Identify the four features of each problem.
Step 2. Use the modified parallel insertion algorithm, including the idle preventing mechanism, with different levels of the parameters and rescheduling criteria to solve each problem.
Step 3. For each problem, find the parameter value and rescheduling criterion that give the minimum makespan.
Phase II – Train the neural network and SVM:
Step 1. For each problem, take the four features as the input vector and the optimal parameter and rescheduling criterion as the desired output for training.
Step 2. Repeat Step 1 of Phase II until the average error is less than the set value.
Phase III – Get the estimated parameter value and rescheduling criterion for a particular problem and solve it:
Step 1. Input the features of a particular problem to get appropriate values of the parameters and rescheduling criterion for the improved algorithms through the trained model.
Step 2. Solve the particular problem using the improved algorithm with the parameter and rescheduling criterion obtained from the output of the trained model and generate a quality schedule.
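Phase I amounts to an exhaustive search over the parameter levels for each problem, whose result becomes the training target. The sketch below illustrates that idea only; `solve` and `features` are hypothetical callables standing in for the simulated scheduling run and the feature extraction, and the toy makespan function in the usage lines is invented for demonstration.

```python
from itertools import product

def best_setting(problem, solve, params, criteria):
    """Phase I, Steps 2-3: try every (parameter, rescheduling criterion) pair
    and return the pair with the minimum makespan.
    `solve(problem, param, criterion)` is a hypothetical function that runs
    one algorithm on one problem and returns its makespan."""
    return min(product(params, criteria),
               key=lambda pc: solve(problem, pc[0], pc[1]))

def build_training_set(problems, features, solve, params, criteria):
    """Pair each problem's feature vector (JR, T, ST, SR) with its best setting.
    `features(problem)` is a hypothetical feature extractor."""
    return [(features(p), best_setting(p, solve, params, criteria))
            for p in problems]

# Usage with a made-up makespan function (illustration only).
def toy_solve(problem, param, criterion):
    penalty = 0 if criterion == problem["best_criterion"] else 1
    return abs(param - problem["best_param"]) + penalty

def toy_features(problem):
    return problem["features"]

problems = [
    {"features": (15, 1975, 1, 0.5), "best_param": 0.9, "best_criterion": 10},
    {"features": (60, 1775, 4, 2.0), "best_param": 0.1, "best_criterion": 20},
]
levels = [0.1, 0.3, 0.5, 0.7, 0.9]
print(build_training_set(problems, toy_features, toy_solve, levels, [10, 20]))
```

The resulting (features, best setting) pairs are what Phase II would feed to the neural network or SVM.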
For each generated problem, we first set the number of jobs, the processing times, the arrival times and the product types of jobs. Then, we generate the setup times and due dates of jobs from the features of the problem. The average setup time is computed from ST, i.e. $\bar{s}=ST\times\bar{p}$. The setup time is generated from a uniform distribution over the interval $[\bar{s}-\bar{p}\cdot SR/2,\ \bar{s}+\bar{p}\cdot SR/2]$. To avoid a negative lower bound, the term $\bar{s}-\bar{p}\cdot SR/2$ must be greater than or equal to 0; replacing $\bar{s}$ with $ST\times\bar{p}$ gives $ST\ge SR/2$. Then, we decide the proper due dates for jobs and avoid unreasonable settings that conflict with the due date restrictions.
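The setup time generation described above can be sketched as follows. The function names and the `mean_proc` argument (taken here to be the average processing time) are assumptions of this example; the value 1500 used in the usage lines is the mean of the [1200, 1800] second processing time range given in section 5.

```python
import random

def setup_time_interval(st, sr, mean_proc):
    """Return (lower, upper) bounds of the uniform setup time distribution.
    st: setup time severity ST, sr: setup time range SR,
    mean_proc: average processing time (assumed meaning of p)."""
    if st < sr / 2:
        # Lower bound s - p*SR/2 would be negative.
        raise ValueError("ST must be at least SR/2")
    mean_setup = st * mean_proc          # s = ST x p
    half_width = mean_proc * sr / 2
    return mean_setup - half_width, mean_setup + half_width

def sample_setup_times(st, sr, mean_proc, n, rng=None):
    """Draw n setup times uniformly from the interval above."""
    rng = rng or random.Random()
    lower, upper = setup_time_interval(st, sr, mean_proc)
    return [rng.uniform(lower, upper) for _ in range(n)]

# Usage: ST = 1 and SR = 2 is the boundary case ST = SR/2.
print(setup_time_interval(1, 2, 1500))
print(sample_setup_times(2, 1, 1500, 3, random.Random(0)))
```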
For each considered algorithm, we establish one neural network training model and one SVM training model to find the proper parameter values for specific problems. Therefore, we build four neural network models and four SVM models, one for each of the four modified parallel insertion algorithms described in section 4.
3.4 Algorithm Selecting Decision
For each modified parallel insertion algorithm, we establish one training model for the neural network and one for SVM to decide the appropriate parameter value. However, for a specific problem, one of the considered algorithms, with a proper parameter, will lead to the minimal value of the objective function. To avoid the inefficiency of trying all algorithms, we propose the algorithm selecting decision, which chooses the algorithm and parameter value that minimize the makespan. To select an appropriate algorithm with proper parameters, we also apply the neural network and SVM, and the structure is similar to the proposed approach stated in section 3.3.
We establish one training model for each modified parallel insertion algorithm to decide the appropriate value of its cost function parameter. After the neural network or SVM models have finished training, we can get the estimated parameters for every specific problem by inputting its features. Then, we solve each problem with the different algorithms, using the parameter values generated by the proposed models. In this way, we can observe the performance of the different improved algorithms and identify which one is suitable for the specific problem.
For the proposed approach of the algorithm selecting decision, we also take the four features of a particular problem as the inputs and the modified parallel insertion algorithm identified in the above manner as the desired output. After the neural network model finishes training, we can input the features of a specific problem and obtain a recommendation of the appropriate algorithm. Figure 4 shows the training process of the algorithm selecting decision.
4. The Modified Parallel Insertion Algorithms
In general, there are many kinds of network algorithms for solving parallel machine problems with the objective of minimizing total workload, including saving and insertion algorithms. The parallel insertion algorithm (PIA) developed by Potvin and Rousseau (1993) is an efficient network algorithm for this problem. Moreover, Hsiao (2003) proposed modified parallel insertion algorithms to solve parallel machine problems more efficiently. To schedule jobs on parallel machines with the objective of minimizing makespan, Tsai (2004) developed an adjusting procedure for the modified parallel insertion algorithms proposed by Hsiao (2003) and showed that it performs well. However, the above studies assume that all jobs are available at the beginning and do not consider the effects of dynamic arrivals. To overcome the influence of jobs arriving dynamically, we propose the idle preventing mechanism against the machine idle time caused by the stochastic inter-arrival time of jobs. In our approach, we combine the modified parallel insertion algorithms proposed by Hsiao (2003) with the idle preventing mechanism to generate the initial solutions, apply the neural network and SVM to estimate proper values of the parameters, including the rescheduling criteria, used in the considered algorithms, and make the algorithm selecting decision with the neural network and SVM to choose an appropriate modified parallel insertion algorithm for a specific problem.
In this section, we introduce the parallel insertion algorithm and its modified versions in section 4.1. The idle preventing mechanism is stated in section 4.2.
4.1 Parallel Insertion Algorithms
Potvin and Rousseau (1993) proposed the parallel insertion algorithm (PIA) for solving parallel machine problems with the objective of minimizing total workload. They set conditions for rescheduling to order the unprocessed jobs, which include jobs not yet scheduled and jobs scheduled but not yet processed. The idea of PIA is to look for the position with the minimum insertion cost and to apply the notion of regret to decide which job to schedule first. The PIA computes the insertion cost of every job for every permitted insertion position on all machines and takes the minimal value as C1. The regret value, C2, indicates the future loss incurred if the job is not scheduled first. A permitted insertion position is a position among the jobs not yet processed at which inserting the job does not violate the due date constraints or the capacity of the machines.
The insertion cost of job $j_i$ at position $p$ of machine $m_k$, $C1_{ipk}$, can be formulated as
$C1_{ipk}=s_{(p-1)i}+s_{ip}-s_{(p-1)p}$. (12)
Equation (12) gives the additional setup time incurred by inserting job $j_i$ at the $p$th position in the original schedule of machine $m_k$. Compute the insertion cost of every position on all machines and choose the minimum one as the best insertion position on each machine. The minimal insertion cost of job $j_i$ on machine $m_k$ is formulated as follows:
$C1_{ik}=\min_{p}\{C1_{ipk}\}$. (13)
$C1_{i}=\min_{k}\{C1_{ik}\}$. (14)
Equation (14) is the minimum insertion cost of job $j_i$ among all positions of all machines. The regret value of job $j_i$ is defined as
$C2_{i}=\sum_{k=1}^{K}(C1_{ik}-C1_{i})$. (15)
The regret value is used to measure the degree of influence for the objective function in the future, if a given job is not immediately inserted within its best place. A large regret value indicates that there is a big gap between the best insertion place for a job and its best insertion places in other schedules. Hence, unscheduled jobs with large regret value should be considered in high priority.
When the number of arriving jobs reaches the rescheduling condition, the PIA computes $C1_i$ and $C2_i$ for all unprocessed jobs. In each rescheduling, it selects the job with the maximum $C2_i$ to schedule first, until all unprocessed jobs have been sequenced. The above operations are repeated until all jobs are scheduled, which yields the complete processing sequence of jobs. The procedure of the PIA is described below.
Step 1. Initialize the schedule on each machine by selecting the K jobs with the K largest initial setup times.
Step 2. If the number of arriving jobs reaches the rescheduling condition, then go to Step 3.
Step 3. For each unscheduled job, find its best allowable insertion position by computing $C1_i$.
Step 4. Calculate the regret value $C2_i$ for each unscheduled job. Select the job with the maximum $C2_i$ and insert it at the $p$th position of machine $m_k$ given by $C1_i$, without violating the machine capacity or its due date restriction.
Step 5. Repeat Step 3 and Step 4 until all unprocessed jobs are scheduled.
Step 6. If all jobs are scheduled, then stop. Otherwise, go to Step 2.
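Steps 3 to 5, together with equations (12) to (15), can be illustrated with the following sketch. It is a simplified reading of the procedure, not the implementation used in this study: the due date and capacity checks are omitted, every machine is assumed to hold at least one initial job (Step 1), an insertion after the last job is assumed to cost only the setup from that job, and the setup matrix in the usage lines is made up.

```python
def insertion_cost(setup, schedule, job, pos):
    """Extra setup time of inserting `job` at list index `pos` (equation (12))."""
    prev = schedule[pos - 1]
    if pos == len(schedule):              # appended after the last job (assumption)
        return setup[prev][job]
    nxt = schedule[pos]
    return setup[prev][job] + setup[job][nxt] - setup[prev][nxt]

def best_position(setup, schedule, job):
    """Return (C1_ik, position) for one machine (equation (13))."""
    return min((insertion_cost(setup, schedule, job, p), p)
               for p in range(1, len(schedule) + 1))

def regret(setup, schedules, job):
    """Return (C2_i, C1_i, machine, position) following equations (14)-(15)."""
    per_machine = [best_position(setup, s, job) + (k,)
                   for k, s in enumerate(schedules)]
    c1_i, pos, k = min(per_machine)
    c2_i = sum(c1_ik - c1_i for c1_ik, _, _ in per_machine)
    return c2_i, c1_i, k, pos

def schedule_by_regret(setup, schedules, unscheduled):
    """Repeatedly insert the job with the largest regret at its best position."""
    unscheduled = list(unscheduled)
    while unscheduled:
        scored = [(regret(setup, schedules, j), j) for j in unscheduled]
        (_, _, k, pos), job = max(scored, key=lambda t: t[0][0])
        schedules[k].insert(pos, job)
        unscheduled.remove(job)
    return schedules

# Usage: jobs 0 and 1 share one product type, jobs 2 and 3 another;
# setup is 0 within a type and 5 between types (made-up numbers).
setup = [[0, 0, 5, 5], [0, 0, 5, 5], [5, 5, 0, 0], [5, 5, 0, 0]]
print(schedule_by_regret(setup, [[0], [2]], [1, 3]))
```

In the usage example each new job has a regret of 5 because placing it on the machine holding the other product type would add one changeover.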
Hsiao (2003) proposed three improved heuristic algorithms for the PIA to generate solutions with better performance. He considered a different initial scheduling step and different cost functions for C1 and C2, taking the latest starting time and the standard deviation of C1 into account, respectively. We introduce the three improved heuristic algorithms as follows.
Parallel Insertion with New Initial Criteria (PIA-I)
The PIA selects the initial jobs with the largest initial setup times. However, this may result in the first jobs on several machines being of the same product type, which does not minimize the total workload. In contrast to the PIA, this modified algorithm, PIA-I, uses a new criterion to select the initial jobs on the machines. Because the setup time between two jobs of the same product type is zero, the PIA-I selects the product type containing the maximal number of jobs and picks the job of this type with the minimal latest starting time as the initial job processed on a machine. Once the job $j_i$ of product type F(i) is selected for a specific machine, other jobs of product type F(i) cannot be selected as the initial job processed on other machines. After initializing K jobs, the following steps of PIA-I are the same as those of PIA.
Parallel Insertion with Slackness (PIA-II)
To consider the impact of job due dates, PIA-II modifies the insertion cost function given by equation (12) by adding the slackness of jobs. The slackness of job $j_i$, $SL_i$, is formulated as $(d_i-p_i-t)$, and the insertion cost is defined as
$C1_{ipk}=\alpha\,(s_{(p-1)i}+s_{ip}-s_{(p-1)p})+(1-\alpha)\,SL_i$, $0\le\alpha\le 1$. (16)
Here $\alpha$ is the weight that shifts the focus between the additional setup time and the slackness of job $j_i$, and $t$ is the time at which a machine becomes free. The insertion procedure is the same as in PIA.
Parallel Insertion with the Standard Deviation of Regret Value (PIA-III)
The original PIA does not consider the variation among the best insertion costs on all machines. Hsiao (2003) modified the regret value measure by adding the variance of the minimal insertion costs of all machines. However, to prevent slight changes in the variance term from having an excessive impact on the new regret value, we replace the variance considered by Hsiao with the standard deviation of the minimal insertion costs of all machines and define the regret value as equation (17).
$C2_i=\beta\left[\sum_{k=1}^{K}(C1_{ik}-C1_i)\right]+(1-\beta)\,Stdv(C1_{ik})$ (17)
$Stdv(C1_{ik})=\left\{\left[\sum_{k=1}^{K}\left(C1_{ik}-Avg(C1_{ik})\right)^2\right]/(K-1)\right\}^{1/2}$ (18)
$Avg(C1_{ik})=\left(\sum_{k=1}^{K}C1_{ik}\right)/K$ (19)
Equation (18) is the standard deviation of the best insertion costs on all machines and $Avg(C1_{ik})$ is the average of the best insertion costs over all parallel machines. We can determine the job priorities by adjusting the parameter $\beta$.
Parallel Insertion with New Initial Criteria and Slackness (PIA-IV)
Tsai (2004) proposed this modified algorithm by combining PIA-I and PIA-II, because both algorithms have the advantage of reducing total machine setup time. In the initial step, the algorithm selects the product type containing the maximal number of jobs and chooses the job with the smallest latest starting time $e_i$ as the first job processed on a machine. Once the job $j_i$ of product type R(i) is selected for a specific machine, other jobs of product type R(i) cannot be selected as the initial job processed on other machines. The insertion cost function is the same as equation (16), and the other steps of this modified algorithm are identical to those of PIA-II.
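The two modified cost measures of this section, the slackness-weighted insertion cost of PIA-II and PIA-IV and the regret value with standard deviation of PIA-III, can be sketched as below. The function and argument names are illustrative assumptions; `statistics.stdev` is the sample standard deviation with the K-1 divisor, which matches the (K-1) term in the standard deviation formula and requires at least two machines.

```python
from statistics import stdev

def slack_weighted_cost(extra_setup, due, proc, now, alpha):
    """PIA-II / PIA-IV insertion cost.
    extra_setup: additional setup time of the insertion,
    due, proc: due date d_i and processing time p_i of the job,
    now: time t at which a machine becomes free, alpha: weight in [0, 1]."""
    slack = due - proc - now                  # SL_i = d_i - p_i - t
    return alpha * extra_setup + (1 - alpha) * slack

def regret_with_stdev(c1_per_machine, beta):
    """PIA-III regret value.
    c1_per_machine: best insertion cost C1_ik of the job on each machine k."""
    c1_i = min(c1_per_machine)                # best cost over all machines
    gap = sum(c - c1_i for c in c1_per_machine)
    return beta * gap + (1 - beta) * stdev(c1_per_machine)

# Usage with made-up numbers.
print(slack_weighted_cost(extra_setup=10, due=100, proc=30, now=20, alpha=0.9))
print(regret_with_stdev([0, 5], beta=0.5))
```

With alpha close to 1 the cost is dominated by the additional setup time; with beta close to 1 the regret reduces to the original sum of cost gaps.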
4.2 The Idle Preventing Mechanism and Rescheduling Criteria
The modified parallel insertion algorithms stated in section 4.1 were originally designed for scheduling problems with the objective of minimizing total workload. However, minimizing total workload may result in unbalanced utilization of machines. Tsai (2004) proposed the adjusting procedure to enable the above algorithms to solve scheduling problems with the objective of minimizing makespan.
However, both Hsiao (2003) and Tsai (2004) assumed that all jobs are available at the beginning and did not consider the influence of dynamic job arrivals. To overcome the machine idle time caused by the stochastic inter-arrival time of jobs, we propose the idle preventing mechanism, which reduces the makespan by preventing unnecessary idle time of machines. If a machine completes all of its scheduled jobs while some jobs remain unprocessed, we assign to the available machine the unprocessed job with the minimal setup time. The proposed policy reduces the additional time spent waiting for rescheduling and shortens the makespan efficiently. Figure 5 shows the process of PIA with the idle preventing mechanism.
Figure 5. The scheduling and dispatching of jobs by the considered algorithms with the idle preventing mechanism and rescheduling criteria
[Figure 5 flowchart, recoverable labels: job arriving; job completed; do jobs achieve the rescheduling criteria?; move the arriving job to the unscheduled buffer; move jobs in the unscheduled buffer to the scheduled buffer; use the considered algorithm to order jobs; update the machine schedule; can a job be selected according to the machine schedule?; dispatch a job according to the machine schedule; is the unscheduled buffer empty?; dispatch the job with minimum setup time in the unscheduled buffer; dispatch the job with minimum setup time in the scheduled buffer.]
The unprocessed jobs include unscheduled jobs and jobs that are scheduled but not yet processed. When jobs arrive and the rescheduling criterion is not yet reached, they are put in the unscheduled buffer. Once the length of the unscheduled buffer reaches the rescheduling criterion, the considered algorithm orders the jobs and moves them to the scheduled buffer, and jobs are then processed according to the schedule. When a machine is free and no unprocessed job remains in its schedule, we choose the unprocessed job in the unscheduled buffer with the minimal setup time between it and the last job processed on the machine, rather than one in the scheduled buffer, to avoid violating the original schedule generated by the algorithm. If there is no job in the unscheduled buffer, we select the unprocessed job in the scheduled buffer with the minimal setup time to reduce the makespan.
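The dispatching rule of the idle preventing mechanism can be sketched as a single function. This is an illustrative reading of the description and of the Figure 5 labels, not the eM-Plant model used in this study: the scheduled buffer is represented here as the union of the per-machine queues, and all names and the setup times in the usage lines are assumptions.

```python
def dispatch(machine, queues, unscheduled, last_job, setup):
    """Pick the next job for an idle machine, or None if nothing is waiting.
    queues: dict machine -> list of scheduled, unprocessed jobs (scheduled buffer),
    unscheduled: list of arrived jobs not yet ordered by the algorithm,
    last_job: last job processed on `machine`,
    setup: setup[a][b] is the setup time from job a to job b."""
    if queues[machine]:
        # Follow the machine's own schedule when it has one.
        return queues[machine].pop(0)
    if unscheduled:
        # Idle preventing: take the unscheduled job with minimal setup time.
        job = min(unscheduled, key=lambda j: setup[last_job][j])
        unscheduled.remove(job)
        return job
    # Otherwise take the scheduled job (on another machine) with minimal setup.
    waiting = [(setup[last_job][j], k, j) for k, q in queues.items() for j in q]
    if waiting:
        _, k, job = min(waiting)
        queues[k].remove(job)
        return job
    return None

# Usage with made-up setup times from the last job "a".
setup = {"a": {"b": 5, "c": 0, "d": 3}}
queues = {0: [], 1: ["d"]}
unscheduled = ["b", "c"]
print(dispatch(0, queues, unscheduled, "a", setup))
```

Taking from the unscheduled buffer first leaves the sequences already produced by the algorithm for the other machines untouched.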
The modified parallel insertion algorithms with the idle preventing mechanism can balance the workloads of machines without using the adjusting procedure proposed by Tsai (2004). The computational time of the considered algorithms with the idle preventing mechanism is much less than that with the adjusting procedure, because the binary search in the adjusting procedure must solve a problem repeatedly until the best solution is found. For the sake of efficiency, we adopt the modified parallel insertion algorithms with the idle preventing mechanism to solve the dynamic parallel machine problems with the objective of minimizing makespan.
5. Problem Design and Computational Results
In the real world, scheduling parallel machines with dynamically arriving jobs and sequence-dependent setup times is important, especially at a bottleneck stage. To design the testing problems, we consider the characteristics of the polyimide (PI) print process in the cell factory of the TFT-LCD industry. The PI print process is regarded as the bottleneck stage in the cell factory because of the high ratio of setup time to processing time.
For the testing problems, we consider 5 machines and set the number of jobs to 180. The processing time is uniformly distributed over the interval [1200, 1800] seconds. The inter-arrival time is exponentially distributed with a mean given by the expected service time divided by the machine utilization. The expected service time is the sum of the expected processing time and the expected setup time. We keep the machine utilization at 90%, and the arrival rate is obtained from the above statements.
The job-type feature, JR, includes three levels, 15, 30, and 60; the setup time severity feature, ST, includes three levels, 1, 2, and 4; the setup time range feature, SR, includes three levels, 0.5, 1, and 2; and finally the tightness feature, T, includes three levels determined by equation (2).
We divide the due dates of jobs into three levels, which are 20, 30 and 40 hours. If the tightness feature, T, is 1975, there are 30 jobs assigned a due date of 20 hours, 60 jobs assigned 30 hours and 90 jobs assigned 40 hours. If T is 1775, there are 60 jobs assigned to each of 20, 30 and 40 hours. When T is 1708.3, there are 70 jobs assigned 20 hours, 60 jobs assigned 30 hours and 50 jobs assigned 40 hours. The due dates get tighter as the value of T decreases. Over all combinations of the different features, we have 81 testing problems.
For all testing problems, we apply the four modified parallel insertion algorithms stated in section 4. We set the rescheduling criterion to 10 and 20 respectively, which means that the job sequences are rescheduled when the length of the unscheduled buffer reaches the rescheduling criterion. The parameters $\alpha$ and $\beta$, used in the considered algorithms PIA-II and PIA-III respectively, are divided into five levels: 0.1, 0.3, 0.5, 0.7, and 0.9. We note that PIA-I has no parameter in the cost function and that PIA-IV, being the combination of PIA-I and PIA-II, also includes the parameter $\alpha$ in the cost function. For all problems, we apply each considered algorithm to find the best rescheduling criterion and the best value of the parameter with which the algorithm solves the testing problem. We use eM-Plant 4.6 to simulate the considered manufacturing environment and apply NeuroSolutions 5.0 to train the neural network and SVM, with 20% of the training set used for cross validation. The computational time for solving a testing problem is around 5 minutes on a Pentium IV 2.4 GHz with 1,536 MB of RAM. Table 1 and Table 2 summarize the number of problems with the lowest makespan for the various rescheduling criteria and parameter values.
Table 1. Summary of the number of problems with lowest makespan for various rescheduling criteria
                         PIA-I      PIA-II     PIA-III    PIA-IV     All Algorithms
Rescheduling criterion   10   20    10   20    10   20    10   20    10    20
Number of cases*         41   40    45   36    45   36    41   40    172   152
* The total cases for each algorithm are 81 and the total cases for all algorithms are 324
Table 2. Summary of the number of problems with lowest makespan for various values for parameters
                      PIA-II (α)      PIA-III (β)     PIA-IV (α)      All Algorithms
Value for parameter   0.9   Others    0.1   Others    0.9   Others    0.1   0.9   Others
Number of cases       58    23        65    16        54    27        83*   113   77
* The total cases for all algorithms that involve parameters are 273 because PIA-I has no parameter
For the considered algorithms, the rescheduling criterion of 10 gives the best result in 172 of the 324 cases, more than the rescheduling criterion of 20. A parameter $\alpha$ equal to 0.9 is appropriate for PIA-II and PIA-IV, while PIA-III is more efficient with $\beta$ equal to 0.1. Table 3 shows the number of best results among all 81 problems solved by the considered algorithms, in which the rescheduling criterion is set to 10 and the parameters $\alpha$ and $\beta$ in the cost functions are set to 0.9 and 0.1 respectively.