# Shrinkage and warpage prediction of injection-molded thin-wall parts using artificial neural networks.

INTRODUCTION

Because of the highly competitive nature of 3C (computers, communication devices and consumer electronics) industries, demands for shortening the research and development time, reduction in the size of the products, and better production accuracy are growing each day. The role of precision injection-molding technology with large-scale production capability and good repeatability is therefore becoming increasingly important in 3C industries. To ensure the quality of injected products, the important processes of part design, mold design, and process conditions should be taken into consideration. Since the processes of mold design and part design are usually determined in the initial stage of product development, they cannot be easily changed. So, the most important work during the final production process is to systematically change the process parameters in order to determine the optimal process condition that conforms to the quality demand of the product.

At present, many 3C products in the market, such as cellular phone covers and memory cards of digital cameras, are injected thin-wall products. Thin-wall parts have been defined as parts with thickness less than 1.5 mm, or with the ratio of flow length/wall thickness larger than 100 (1-3). In the past, the important quality factors (observed factors) for determining the quality of thin-wall injection products were shrinkage and warpage. A number of studies have been carried out to determine the important process parameters and optimal process conditions for the shrinkage and warpage of injected thin-wall parts. The results of Huang et al. (4) showed that the order of process parameter influencing the shrinkage and warpage of injected thin-wall parts was the packing pressure, mold temperature, melt temperature, and packing time. The interaction of mold temperature and the melt temperature was shown to be important. Wang et al. (5) adopted numerical simulation and injection experiments to study the effect of process parameters on the shrinkage and warpage of injected thin-wall parts, and found that the packing pressure was the most important. Liao et al. (6) also investigated experimentally the effect and interaction of process parameters on the shrinkage and warpage of a commercial part--a cover of a cellular phone--and noted that the packing pressure was the most important factor influencing the shrinkage and warpage of injected thin-wall parts. The study also showed that the optimal process conditions are different for the shrinkage and the warpage. Comparing the results in open literature, it was found that the geometric effect of real commercial parts did affect the optimal process conditions and the order of influence of process parameters.

Since a large number of parameters influence injection-molding processes, a full-factorial experiment is required to determine the important factors and optimal process condition. This is both costly and time-consuming. Therefore, design of experiments (DOE) is usually adopted to schedule the injection experiment. DOE commonly uses the method invented by Taguchi (7,8). Taguchi's method uses orthogonal arrays to schedule the experiment and to obtain statistical information with fewer experiments (compared to full-factorial experiments). One of its advantages is to reduce the number of experiments, but the experimental results are obtained under the conditions determined by the orthogonal array. Therefore, the value of each optimal process parameter determined by Taguchi's method is restricted to one of the levels in the orthogonal array, and it is unable to determine an optimal process parameter with a value between the levels of the orthogonal array. To improve on this, researchers suggested the use of AI (Artificial Intelligence) technology, such as Expert Systems (9), GA (Genetic Algorithm) (10), CBR (Case-Based Reasoning) (11) and ANN (Artificial Neural Network) (12). Among these, ANN is widely used.

ANN simulates the operation mechanism of neurons within the human cerebrum during information handling. The advantages of ANN include: 1) it requires no assumption of the problem; 2) it has adaptive learning ability; 3) it is a true MIMO (Multi-Input Multi-Output) system; 4) it is capable of dealing with highly nonlinear problems; and 5) it shortens the computation time. It has hence been applied to various applications of design optimization, system identification, classification and control systems, etc.

Because of its advantages, ANN has also been applied to the study of injection-molding processes. Zhao et al. (13) used ANN to predict the effect of the operation conditions on the melt temperature during plasticizing. The polymer used was high-density polyester, and the results showed that the predicted value agreed well with the experimental data (with a difference less than 2°C). Petrova et al. (14) trained three different types of artificial neural networks to predict the cavity pressure during the injection process of a printer output tray. The results indicated that the hybrid network had the lowest sum-squared error and the quickest convergence rate. However, when the number of training pairs increased, the conventional neural network had the best performance. Sadeghi (15) used a 4-2-3 back-propagation artificial neural network (BPANN), with 4 neurons in the input layer, 2 in the hidden layer, and 3 in the output layer, to predict the filling time and the injection pressure during the injection process of the back cover of a Casio fx-570s calculator. The training and testing data were obtained from the simulated results of CAE software. In the work by Lau et al. (16), the geometric change of the injected part was the input of an ANN, and the process parameters were the output. They found that the geometry of the injected part was determined mainly by the filling time and the coolant temperature. In predicting the quality of the injected product, Lee et al. (17) used an ANN to predict the shrinkage of the part injected with a four-cavity mold and found that the predicted results were in fairly good agreement with the experimental value. Chen et al. (18) used an on-line ANN to successfully predict the wave front diagram during the filling process of a mold cavity.

From the above review, it is noted that ANN has been adopted by various researchers to predict quantities in injection-molding processes, including the time variations of cavity pressure and temperature, filling time, shrinkage and weld line. Yet no work has so far investigated the effect of the structural parameters of artificial neural networks, such as the number of nodes of the hidden layer, the transfer function, and the increasing and decreasing ratios of the learning rate, on the network performance. In addition, the testing pairs used to verify the prediction performance of the trained artificial neural network were usually selected manually and randomly, without any statistical consideration, so the prediction performance of the trained networks could not be quantified statistically.

In this study, a BPANN is trained to predict the shrinkage and warpage of injection-molded thin-wall parts. The prediction performance of the trained BPANN is obtained by systematically increasing the number of tests and randomly selecting the testing data from the experimental data. A comparison of prediction performance of the BPANN and Taguchi's method in determining quality factors and optimal process conditions is also discussed. Finally, the effect of structural parameters on network training quality is examined and addressed.

EXPERIMENT

Mold and Injection Machine

A single-cavity mold of 130 X 56 X 1.3 mm was used to inject the thin-wall cover of a cellular phone, such as shown in Fig. 1. It has a 35 X 25 mm window opening and 12 openings with dimensions of 10 X 5 mm for key numbers. The nominal thickness of the part is 1.3 mm. On the back of the thin-wall cover are four pins. Two fan gates located in the window area of the part are adopted in this study to connect the cold runner and the mold cavity. The polymer used in carrying out the injection experiment is PC/ABS, which is commonly used for commercial cellular phone covers. PC/ABS is an amorphous plastic produced by the GE Plastics Company. An Arburg 520C/2000/350 injection-molding machine was adopted for injecting the parts. Detailed information on the mold and the injection machine can be found in Liao et al. (6).

[FIGURE 1 OMITTED]

Experimental Design

This study used C-Mold to conduct a parametric study to investigate theoretically the effect of each process parameter on the shrinkage and warpage. The simulation mid-plane model of the thin-wall cover is shown in Fig. 2. There were six cooling channels, each with a diameter of 6 mm, and the flow rate through each channel was 13.3 liter/min. From the C-Mold simulation results, the mold temperature, the melt temperature, the packing pressure, and the injection velocity were found to be the four most important process parameters. Based on the C-Mold simulation results and the practical limitation of the experimental facilities adopted in this study, the number and the level range of the process parameters of DOE were determined as shown in Table 1. It should be noted that the packing time was kept at 16 sec to ensure the freezing of the polymer at the gate. The cooling time was maintained at 50 sec to ensure that the temperature of the whole part was below the ejection temperature before the mold was opened.

[FIGURE 2 OMITTED]

As for the DOE, this study adopted the $L_{27}(3^{13})$ orthogonal array of Taguchi's method to schedule the injection experiment. The control factors considered were the four most important process parameters determined from the parametric study using C-Mold--mold temperature, melt temperature, packing pressure, and injection velocity. In this study, the interaction effect is ignored, and the mold temperature was therefore assigned to the first column, melt temperature to the second column, packing pressure to the fifth column, and injection velocity to the ninth column, as shown in Table 2. Table 2 lists only the level value of each process parameter. We conducted 27 sets of injection experiments according to this orthogonal array and then measured the shrinkage and warpage of the injected parts.
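The saving offered by an orthogonal array can be illustrated with a short sketch. The construction below is an assumption made for illustration only: it builds a 27-run, four-factor, three-level design from three base factors plus their sum modulo 3, and it is not claimed to reproduce the column assignment (columns 1, 2, 5 and 9) of the published table. The function names are hypothetical. The check confirms the defining property of an orthogonal array, namely that every pair of columns contains each combination of levels equally often.

```python
from collections import Counter
from itertools import combinations, product


def l27_four_factor_design():
    """Return 27 runs of 4 three-level factors (levels 1..3).

    Columns are a, b, c and (a + b + c) mod 3 over base factors
    a, b, c in {0, 1, 2}. This is an illustrative construction,
    not a copy of the published L27 table.
    """
    runs = []
    for a, b, c in product(range(3), repeat=3):
        d = (a + b + c) % 3
        runs.append((a + 1, b + 1, c + 1, d + 1))
    return runs


def is_orthogonal(runs, levels=3):
    """Check that each pair of columns holds every level pair equally often."""
    n_cols = len(runs[0])
    expected = len(runs) // (levels * levels)
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((run[i], run[j]) for run in runs)
        if len(counts) != levels * levels:
            return False
        if any(count != expected for count in counts.values()):
            return False
    return True


design = l27_four_factor_design()
print(len(design), "runs instead of", 3 ** 4, "for the full factorial")
print("orthogonal:", is_orthogonal(design))
```

With four factors at three levels, a full-factorial plan would need 81 runs, so the 27-run array covers one third of them while keeping the factor levels balanced.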

Experimental Procedure and Measurements

During the injection experiments, each process condition was allowed to stabilize for at least 30 minutes. Then 10 parts were injected at each process condition, and the shrinkage and warpage of the seventh and ninth injected parts were measured. The residual stresses for all the parts were allowed to relax after the experiments for at least a week before measurements were taken. The measurements of the shrinkage and warpage of the injected parts were conducted with Cyclone Scanner, PolyCAD and PolyWorks. Detailed information on the measurement method can be found in Liao et al. (6).

In this study, the shrinkage and warpage of injected covers are defined as shown in Fig. 1. The shrinkage and warpage of the injected part were calculated using the following equations:

x-direction shrinkage: $\Delta\bar{X} = \frac{1}{3}\left(\Delta X_1 + \Delta X_2 + \Delta X_3\right)$ (1)

y-direction shrinkage: $\Delta\bar{Y} = \Delta Y$ (2)

z-direction warpage: $\Delta\bar{Z} = \Delta Z_1 - \Delta Z_2$ (3)

The x-direction shrinkage is defined as the average shrinkage of the $X_1$, $X_2$, and $X_3$ segments. The y-direction shrinkage is defined as the shrinkage of the Y segment. The z-direction warpage is quantified by the difference between the z-direction displacements of points $Z_1$ and $Z_2$. Warpage in this study refers to the out-of-plane deformation with reference to the plane passing through the $Z_2$ point. The injected parts of this study have maximum upward deformation at the $Z_1$ point and maximum downward deformation at the $Z_2$ point. In Eq 3, the plane that passes through the $Z_2$ point is therefore taken as the reference plane for the determination of warpage, and the displacement difference between the $Z_1$ and $Z_2$ points is the out-of-plane deformation.
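Eqs 1-3 can be written as three small functions. This is a minimal sketch; the function names and the sample measurements are hypothetical and are not data from the experiments.

```python
def x_shrinkage(dx1, dx2, dx3):
    """Eq 1: mean shrinkage of the X1, X2 and X3 segments."""
    return (dx1 + dx2 + dx3) / 3.0


def y_shrinkage(dy):
    """Eq 2: shrinkage of the single Y segment."""
    return dy


def z_warpage(dz1, dz2):
    """Eq 3: displacement of Z1 relative to the plane through Z2."""
    return dz1 - dz2


# Hypothetical measurements, chosen only to exercise the formulas.
sample = {"dx": (0.30, 0.60, 0.90), "dy": 0.45, "dz": (0.40, -0.10)}
print(x_shrinkage(*sample["dx"]))
print(y_shrinkage(sample["dy"]))
print(z_warpage(*sample["dz"]))
```

Because $Z_1$ moves upward and $Z_2$ downward, the two displacements have opposite signs and the difference in Eq 3 adds their magnitudes.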

This study divides all measured shrinkage and warpage data into training sets and testing sets (see the discussion below). The dimensionless process parameter and shrinkage and warpage data are used as the input and output layer data, respectively, for the training and testing of artificial neural networks.

THEORY

Configuration of Artificial Neural Network

An artificial neural network (19) is a parallel operation system that simulates a human neuron. A neuron usually obtains information from the preceding layer of neurons and delivers it to the next neuron after calculation. Because of this characteristic, an artificial neural network does not need to make any assumption of the relationship between input/output. Therefore, it can simulate a highly nonlinear system (20).

Among the many artificial neural network models, the BPANN (21) is the best known and most widely used, especially in diagnosis and prediction. As shown in Fig. 3, a BPANN consists of an input layer, a hidden layer (or several hidden layers), and an output layer. The input layer receives and distributes input information, the output layer produces output information, and the hidden layer captures the nonlinear relationship between the input layer and the output layer.

[FIGURE 3 OMITTED]

Weights connect the neurons of adjacent layers of the network. Each neuron of the hidden and output layers possesses a transfer function and a bias that relate the input and the output of the neuron. Generally, the transfer function accepts an input value with a wide range and transforms it into an output with a limited range. The bias is a constant that is added to the input signal. During training, a BPANN compares the calculated output value with the target value and then modifies the weights and biases based on the error between them. The BPANN uses the optimal steepest descent algorithm to modify the weights and biases, and iterates until the convergence criterion is reached.
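The training loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the network has one sigmoid hidden layer and a linear output layer, the step size is fixed (no momentum or adaptive learning rate), the hidden-layer size, step size and initial-weight range are arbitrary choices, and the 24 training pairs are synthetic stand-ins rather than measured data.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in, n_hidden, n_out, rng):
    """Random initial weights, zero biases (an assumed initialisation)."""
    return {
        "W1": rng.uniform(-0.5, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.uniform(-0.5, 0.5, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(p, X):
    """Sigmoid hidden layer followed by a linear output layer."""
    H = sigmoid(X @ p["W1"] + p["b1"])
    P = H @ p["W2"] + p["b2"]
    return H, P


def sse(p, X, T):
    """Sum-squared error between targets T and network outputs."""
    return float(np.sum((T - forward(p, X)[1]) ** 2))


def gradients(p, X, T):
    """Back-propagate the SSE gradient through both layers."""
    H, P = forward(p, X)
    dP = 2.0 * (P - T)                      # d(SSE)/d(output)
    dH = (dP @ p["W2"].T) * H * (1.0 - H)   # through the sigmoid
    return {"W1": X.T @ dH, "b1": dH.sum(axis=0),
            "W2": H.T @ dP, "b2": dP.sum(axis=0)}


def train(X, T, n_hidden=4, lr=0.002, goal=1e-4, max_iter=2000, seed=0):
    """Fixed-step steepest descent; stops at the SSE goal or max_iter."""
    rng = np.random.default_rng(seed)
    p = init_params(X.shape[1], n_hidden, T.shape[1], rng)
    history = [sse(p, X, T)]
    for _ in range(max_iter):
        if history[-1] < goal:
            break
        g = gradients(p, X, T)
        for name in p:
            p[name] = p[name] - lr * g[name]
        history.append(sse(p, X, T))
    return p, history


# Synthetic stand-in data: 24 pairs, 4 scaled inputs, 3 scaled outputs.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (24, 4))
T = np.column_stack([0.2 + 0.6 * X[:, 2],
                     0.5 + 0.3 * np.sin(3.0 * X[:, 0]),
                     0.4 + 0.4 * X[:, 1] * X[:, 3]])
params, history = train(X, T)
print(f"SSE: {history[0]:.3f} -> {history[-1]:.4f} in {len(history) - 1} iterations")
```

Rerunning `train` with different `seed` values gives different final errors, which is the sensitivity to initial weights that motivates averaging over several runs.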

Development of BPANN Model

For the BPANN adopted in this study, there are four neurons in its input layer, which are the dimensionless mold temperature, melt temperature, packing pressure, and injection velocity. The output layer has three neurons, which are the dimensionless x-direction shrinkage, y-direction shrinkage, and z-direction warpage. The dimensionless rules of these parameters are given below:

Mold temperature: $T^*_{mold} = \frac{T_{mold} - T_{mold,min}}{T_{mold,max} - T_{mold,min}}$ (4)

Melt temperature: $T^*_{polymer} = \frac{T_{polymer} - T_{polymer,min}}{T_{polymer,max} - T_{polymer,min}}$ (5)

Packing pressure: $P^*_{p} = \frac{P_{p} - P_{p,min}}{P_{p,max} - P_{p,min}}$ (6)

Injection velocity: $v^* = \frac{v - v_{min}}{v_{max} - v_{min}}$ (7)

X-direction shrinkage: $x^*_{s} = \frac{x_{s} - x_{s,min}}{x_{s,max} - x_{s,min}}$ (8)

Y-direction shrinkage: $y^*_{s} = \frac{y_{s} - y_{s,min}}{y_{s,max} - y_{s,min}}$ (9)

Z-direction warpage: $z^*_{w} = \frac{z_{w} - z_{w,min}}{z_{w,max} - z_{w,min}}$ (10)
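Eqs 4-10 all apply the same min-max scaling, so a single pair of helper functions covers every input and output. The sketch below is illustrative; the function names and the numerical range are hypothetical and are not the levels listed in Table 1.

```python
def to_dimensionless(value, v_min, v_max):
    """Map a physical value onto [0, 1], as in Eqs 4-10."""
    return (value - v_min) / (v_max - v_min)


def to_physical(scaled, v_min, v_max):
    """Invert the scaling to recover the physical value."""
    return scaled * (v_max - v_min) + v_min


# Hypothetical parameter range, used only to show the mapping.
low, high = 50.0, 80.0
for level in (50.0, 65.0, 80.0):
    scaled = to_dimensionless(level, low, high)
    print(level, "->", scaled, "->", to_physical(scaled, low, high))
```

The inverse mapping is needed when the network outputs are converted back to shrinkage and warpage in physical units.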

In this study, we separate all experimental data randomly into training data and testing data. Training data is used to train the BPANN to capture the relationship between input and output. The training data contains 24 pairs of experimental data. The testing data is the data that the network has never seen and is used to test the prediction performance of the network. The testing data of this study contains three pairs of experimental data.

To accelerate the convergence of the network, we add both momentum and an adaptive learning rate to the training process (22, 23). When constructing a BPANN, many network parameters must be set. In this study, some parameters are held constant: the initial learning rate is 0.01, the maximum error ratio is 1.04, and the number of hidden layers is 1.

The effects of the number of training pairs, number of hidden layer nodes, transfer function of hidden layer, increasing ratio of learning rate, decreasing ratio of learning rate and momentum on network training quality are discussed in this study. The testing conditions of all parameters are given in Table 3. Under each test condition, the training of the BPANN is carried out with ten different sets of initial values of weights and biases produced randomly by computer. The training process is stopped when either one of two convergence criteria is reached: 1) the sum-squared error (SSE) is smaller than 0.0001, or 2) the number of iterations reaches 10,000. The training quality is determined by the root-mean-squared error (RMS) and the average root-mean-squared error. The definitions of sum-squared error, root-mean-squared error and average root-mean-squared error are

$SSE = \sum_{j=1}^{m} \sum_{i=1}^{n} \left(T_{ij} - P_{ij}\right)^{2}$ (11)

$RMS_{q} = \sqrt{\frac{SSE_{q}}{m \times n}}$ (12)

$\overline{RMS} = \frac{1}{r} \sum_{q=1}^{r} RMS_{q}$ (13)
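A short sketch of Eqs 11-13 follows. The text does not spell out the symbols, so the reading used here is an assumption: $T$ and $P$ are target and predicted values, $m$ is the number of data pairs, $n$ is the number of output neurons, and $r$ is the number of training runs with different initial values. The function names are hypothetical.

```python
import math


def sum_squared_error(targets, predictions):
    """Eq 11: SSE over all data pairs (rows) and outputs (columns)."""
    return sum((t - p) ** 2
               for t_row, p_row in zip(targets, predictions)
               for t, p in zip(t_row, p_row))


def rms_error(targets, predictions):
    """Eq 12: root-mean-squared error of a single training run."""
    m = len(targets)       # assumed: number of data pairs
    n = len(targets[0])    # assumed: number of output neurons
    return math.sqrt(sum_squared_error(targets, predictions) / (m * n))


def mean_rms(rms_values):
    """Eq 13: average RMS over r training runs."""
    return sum(rms_values) / len(rms_values)


# Tiny made-up example: two data pairs with two outputs each.
targets = [[1.0, 2.0], [3.0, 4.0]]
predictions = [[1.0, 2.0], [3.0, 2.0]]
print(sum_squared_error(targets, predictions))
print(rms_error(targets, predictions))
```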

Determination of Prediction Performance of BPANN

In the past, the prediction performance of a trained BPANN was usually evaluated by manually choosing several testing pairs at random from experimental data that had never been seen by the BPANN. But the prediction performance was found to depend strongly on the selection of testing pairs; therefore, the prediction performance of the trained BPANN was usually not correctly expressed. To overcome this problem, this study systematically increases the total number of tests for the determination of prediction performance. In each prediction-performance test, three sets of experimental data are randomly selected from the experimental data by the statistics software Minitab as the testing pairs. When evaluating the prediction performance of the network, the average relative error $\bar{E}_{rel}$ in each prediction-performance test is calculated first, and then the average prediction-performance error is calculated based on the following equations.

Average relative error: $\bar{E}_{rel,l} = \frac{1}{n \times o} \sum_{k=1}^{o} \sum_{i=1}^{n} \left| \frac{T_{ik,l} - P_{ik,l}}{T_{ik,l}} \right|$ (14)

Average prediction-performance error: $\bar{E}_{quality} = \frac{1}{M} \sum_{l=1}^{M} \bar{E}_{rel,l}$ (15)

where $M$ is the total number of tests for evaluating the prediction performance and $\bar{E}_{rel,l}$ is the average relative error obtained from the $l$-th prediction-performance test.

This study increases $M$ continuously until $\bar{E}_{quality}$ approaches a constant. This constant is then used in this study to evaluate the prediction performance of the trained network.
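The procedure of Eqs 14 and 15 can be sketched as follows. This is an illustration under assumptions: Python's `random` module stands in for the Minitab selection, the function names are hypothetical, and the retraining of the network for each draw is left out. The running average shows how $\bar{E}_{quality}$ evolves as $M$ grows.

```python
import random


def average_relative_error(targets, predictions):
    """Eq 14: mean absolute relative error over all testing pairs and outputs."""
    terms = [abs((t - p) / t)
             for t_row, p_row in zip(targets, predictions)
             for t, p in zip(t_row, p_row)]
    return sum(terms) / len(terms)


def running_quality(test_errors):
    """Eq 15 evaluated for M = 1, 2, ..., len(test_errors)."""
    averages, total = [], 0.0
    for count, error in enumerate(test_errors, start=1):
        total += error
        averages.append(total / count)
    return averages


def draw_test_indices(n_data=27, n_test=3, rng=None):
    """Pick the testing pairs for one prediction-performance test."""
    rng = rng or random.Random()
    return sorted(rng.sample(range(n_data), n_test))


# One draw of three testing pairs out of the 27 experiments.
print(draw_test_indices(rng=random.Random(0)))

# Two per-test errors of very different size, averaged as M grows.
print(running_quality([0.093, 1.226]))
```

In a full evaluation, each draw would be followed by training on the remaining 24 pairs and computing `average_relative_error` on the three held-out pairs before the result is appended to the list passed to `running_quality`.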

Comparison of Prediction Performance of BPANN and Taguchi's Method in Determining Quality Factors (Shrinkage and Warpage) and Optimal Process Condition

This study also uses the optimal process condition suggested by Taguchi's method as the input layer of the trained BPANN, and compares the shrinkage and warpage predicted by its output layer with the predicted value of Taguchi's method. See Liao et al. (6) for details concerning the application of Taguchi's method in determining the optimal process of shrinkage and warpage.

When using Taguchi's method to determine the optimal process condition, each process parameter of the optimal process condition must be one of the levels used in the orthogonal array, which reduces the flexibility of Taguchi's method. In order to compare the optimal process conditions obtained by Taguchi's method and the BPANN, this study uses both the trained BPANN and Taguchi's method to determine the optimal process conditions. In determining the optimal process condition with the BPANN, the range of each of the four process parameters is divided equally into N levels (N ranges between 3 and 51), which forms $N^4$ process conditions in total. To prevent the local-minimum problem commonly encountered by various optimization algorithms, this study adopts a full-range searching method: the trained BPANN is used to predict the shrinkage and warpage under all $N^4$ process conditions. Although this method is time-consuming (when N = 51, the program execution time is about 3 hours on a Celeron 400 MHz personal computer), it ensures that the optimal process condition determined is the full-range minimum. Finally, we compare the optimal shrinkages and warpages determined by the BPANN and Taguchi's method.
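The full-range search can be sketched as an exhaustive loop over the $N^4$ grid. Several details here are assumptions: the grid is taken to include both end points of each dimensionless range, one output is minimised at a time, and `toy_predict` is a made-up smooth function standing in for the trained BPANN so that the example runs on its own.

```python
from itertools import product


def grid_levels(n_div):
    """N equally spaced dimensionless levels from 0 to 1, end points included."""
    return [i / (n_div - 1) for i in range(n_div)]


def full_range_search(predict, n_div, output_index):
    """Evaluate all N**4 conditions and keep the smallest predicted value."""
    levels = grid_levels(n_div)
    best_point, best_value = None, float("inf")
    for point in product(levels, repeat=4):
        value = predict(point)[output_index]
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value


def toy_predict(point):
    """Hypothetical stand-in for the trained network (three outputs)."""
    t_mold, t_melt, p_pack, v_inj = point
    bowl = ((t_mold - 0.3) ** 2 + (t_melt - 0.6) ** 2
            + (p_pack - 0.8) ** 2 + (v_inj - 0.1) ** 2)
    return (bowl, bowl + 0.1, bowl + 0.2)


coarse = full_range_search(toy_predict, 3, 0)   # three levels per parameter
fine = full_range_search(toy_predict, 11, 0)    # eleven levels per parameter
print("N = 3 :", coarse)
print("N = 11:", fine)
```

With three levels per parameter the search can only return points on the coarse grid, whereas the finer grid can land between those levels. The cost grows as $N^4$, which is consistent with the long run time reported for N = 51.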

RESULTS AND DISCUSSION

Training and Testing Results of BPANN

Before discussing the BPANN training quality and prediction performance, this study investigates the effect of the initial values of weights and biases, which are randomly produced by the computer, on the training quality of the BPANN under base-line condition A5. The results are shown in Table 4, which lists the root-mean-squared error obtained for each set of initial values, the average of the ten root-mean-squared errors, and the standard deviation. The best result is obtained with the eighth set of initial values (RMS = 0.00203) and the worst with the fourth set (RMS = 0.0102), roughly a fivefold difference. From this result, it is noted that the initial values of weights and biases indeed significantly influence the training quality of the network. It also explains why we always determine the average root-mean-squared error $\overline{RMS}$ when investigating the effect of structural parameters on the training quality of the BPANN.

In the following, we discuss the variation of the sum-squared error during the training process with the eighth set of initial values for base-line condition A5. Figure 4 shows the variation of the sum-squared error during the training process. The result shows that the SSE drops significantly from 7 to $10^{-3}$ within 2500 iterations, and then decreases gradually.

The results predicted by the trained base-line network for the shrinkage and warpage of the training pairs and testing pairs are shown in Fig. 5. The testing pairs are the fifth, thirteenth, and twenty-fifth pairs of data. From the result, it is noted that the predicted shrinkage and warpage of the training pairs are very accurate, and the relative errors ($E_{rel}$ = (Predicted Value - Target Value)/(Target Value)) are all under 0.4%. For the testing pairs, the largest relative error for the x-direction shrinkage is 12.8% (twenty-fifth pair of data), for the y-direction shrinkage, 7.92% (thirteenth pair of data), and for the z-direction warpage, 15.39% (thirteenth pair of data). The average relative error ($\bar{E}_{rel} = \frac{1}{n \times o} \sum_{k=1}^{o} \sum_{i=1}^{n} |E_{rel,i,k}|$) is 9.3% (see Table 5 for detailed data). The values predicted by C-Mold are also shown in Fig. 5. It is noted that the values predicted by the BPANN agree far better with the experimental data than the C-Mold simulation values. It should be noted that the BPANN is trained by using experimental data, while C-Mold uses numerical modeling of the process and material characterization to predict the behavior of the material during the process. C-Mold has no built-in learning process from the experimental data. The comparison is intended to show the capability of the BPANN and its potential as an alternative tool for predicting the shrinkage and warpage of the injected parts.

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

Prediction Performance of BPANN

In determining the prediction performance of a BPANN, several pairs of experimental data are usually randomly chosen as the testing pairs. In this study, if the fifth, thirteenth, and twenty-fifth pairs of data are chosen as the testing pairs, the average relative error $\bar{E}_{rel}$ of x-direction shrinkage, y-direction shrinkage and z-direction warpage is 9.3%. But if the tenth, twelfth, and fourteenth pairs of experimental data are chosen, the average relative error $\bar{E}_{rel}$ may reach 122.6%. Thus, it is noted that the selection of testing pairs from experimental data has an extremely large influence on the estimation of the prediction performance of a BPANN. To overcome this problem, this study systematically increases the total number of tests for determining prediction performance. In each test, three sets of experimental data are randomly selected as the testing pairs. The average relative error $\bar{E}_{rel,l}$ for each test and the average prediction-performance error $\bar{E}_{quality}$ for all tests are then calculated based on Eqs 14 and 15, respectively.

Figure 6 shows the influence of the number of tests on $\bar{E}_{quality}$. When the total number of tests ($M$) is less than 15, $\bar{E}_{quality}$ fluctuates up and down. The reason is probably that $\bar{E}_{quality}$ has little statistical meaning when the total number of tests is small. When $M > 15$, $\bar{E}_{quality}$ gradually approaches a fixed value of 34.0%.

Comparison of Prediction Performance of BPANN and Taguchi's Method in Determining Quality Factors

To compare the prediction performance of the BPANN and Taguchi's method in determining quality factors (observed factors), this study calculates the quality factors (shrinkage and warpage) under the optimal process condition determined by Taguchi's method (6), using both the trained BPANN (base-line condition A5) and Taguchi's method. It should be noted that the optimum condition indicated in Liao et al. (6) is obtained by considering one quality factor at a time (x-shrinkage, y-shrinkage, and warpage separately) and does not optimize an objective function that takes all three into account. The result shows that the shrinkage and warpage predicted by the BPANN are closer to the experimental data than those predicted by Taguchi's method. This indicates that, under the conditions of this study, the BPANN predicts the shrinkage and warpage better than Taguchi's method at the optimal process condition determined by Taguchi's method. As for computation time, Taguchi's method computes the predicted value with an algebraic equation and therefore takes almost no computation time. The computation time for the calculation of the output layer by the trained BPANN is 0.005 sec (PIII 600 CPU). Thus the computation times of the Taguchi and BPANN methods are almost the same.

[FIGURE 6 OMITTED]

Comparison of Prediction Performance of BPANN and Taguchi's Method in Determining Optimal Process Conditions

Table 6 lists the optimal process conditions determined by the BPANN under different number of divisions N. From Table 6, it is noted that when N is larger than 21, the optimal process conditions determined by the BPANN approach asymptotic conditions. This indicates that the asymptotic condition determined by the BPANN is a full-range optimal process condition. Comparing the results in Tables 6 and 7, we find that the shrinkage and warpage determined by the BPANN under the optimal process conditions are smaller than those by Taguchi's method. This is because a BPANN is more flexible in determining the optimal process conditions; the optimal process parameters determined by a BPANN need not be restricted to the fixed levels used in orthogonal tables, as in Taguchi's method.

Effect of Structural Parameters of BPANN on Training Quality

In this section, we discuss the effect of the structural parameter of a BPANN on the network training quality. The structural parameters discussed are the number of training pairs, number of hidden layer nodes, hidden layer transfer function, increasing ratio of learning rate, decreasing ratio of learning rate and momentum. The network training quality is evaluated by the average root-mean-squared error and the minimum root-mean-squared error defined in Eqs 12 and 13. All testing conditions are given in Table 3. The training time listed in Tables 8-13 is the computation time required for 5000 iterations with the same set of initial values of weights and biases.

Number of Training Pairs

Table 8 shows the effect of the number of training pairs on network training quality (test conditions are A1-A5 given in Table 3). The higher the number of training pairs, the lower the value of $\overline{RMS}$ or $RMS_{min}$. This is because when more information is presented to the network for training, a better training quality of the network will be achieved. Note that the training time increases with the number of training pairs.

Number of Hidden Layer Nodes

The test conditions for examining the effect of number of hidden layer nodes are B1--B4 and A5. The number of hidden layer nodes changes from 1 to 16, and the results are given in Table 9. The results show that when the number of hidden layer nodes is increased, the training quality of the network is better. This is because when the number of hidden layer nodes is increased, the BPANN has more adjustable weights and biases to achieve a better training quality. Based on the training time listed in Table 9, it is observed that when the number of hidden nodes is increased, the training time is longer.
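The growth in adjustable weights and biases can be counted directly for a network with four inputs, one hidden layer and three outputs. The hidden-layer sizes looped over below are illustrative values within the stated range of 1 to 16; the exact sizes used in conditions B1-B4 and A5 are not assumed.

```python
def parameter_count(n_hidden, n_in=4, n_out=3):
    """Weights plus biases of an n_in - n_hidden - n_out network."""
    weights = n_in * n_hidden + n_hidden * n_out
    biases = n_hidden + n_out
    return weights + biases


for n_hidden in (1, 2, 4, 8, 16):
    print(n_hidden, "hidden nodes:", parameter_count(n_hidden), "parameters")
```

For this architecture the count is 8h + 3 for h hidden nodes, so it rises from 11 at one node to 131 at sixteen. For comparison, 24 training pairs with three outputs each supply 72 target values.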

Transfer Function of Hidden Layer

This study considers three differentiable and monotonically increasing transfer functions of the hidden layer. The transfer functions are defined separately as:

Sigmoid Function: $f(x) = \frac{1}{1 + \exp(-x)}$ (16)

Hyperbolic Tangent Function: $f(x) = \tanh(x)$ (17)

Pure Linear Function: $f(x) = x$ (18)

Test conditions are C1-C2 and A5 given in Table 3, and the effect of the transfer function of the hidden layer on network quality is shown in Table 10. The sigmoid function gives the best result; its $\overline{RMS}$ and $RMS_{min}$ are significantly lower than those of the other two transfer functions, for two reasons: 1) the sigmoid function is a nonlinear function, and combining it with the linear transfer function of the output layer nodes enables the network to learn both the linear and nonlinear relationships between input and output; and 2) the sigmoid function is less prone to saturation than the hyperbolic tangent function. Based on the training time listed in Table 10, the training time is the longest for the hyperbolic tangent function and the same for the sigmoid and pure-linear functions.
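One way to read the saturation remark is through the slopes of Eqs 16-18, since back-propagation multiplies the error signal by the slope of the transfer function. The sketch below compares the slopes at a few inputs; the helper names are hypothetical and the sample inputs are arbitrary.

```python
import math


def sigmoid(x):
    """Eq 16."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_slope(x):
    """Derivative of Eq 16: f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_slope(x):
    """Derivative of Eq 17: 1 - tanh(x)**2."""
    return 1.0 - math.tanh(x) ** 2


def linear_slope(x):
    """Derivative of Eq 18: constant slope of one."""
    return 1.0


for x in (0.0, 1.0, 3.0):
    print(x, sigmoid_slope(x), tanh_slope(x), linear_slope(x))
```

At the origin the hyperbolic tangent is the steeper of the two nonlinear functions, but its slope falls off faster, so for larger inputs the sigmoid retains more gradient.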

Increasing and Decreasing Ratios of Learning Rate

Table 11 shows the influence of the increasing ratio of learning rate on network training quality. The test conditions are D1-D4 and A5 given in Table 3. Table 12 shows the influence of the decreasing ratio of learning rate on the network training quality (test conditions are E1--E4 and A5). The initial learning rate used in this study is 0.01. The decreasing ratio of learning rate is the ratio of the new learning rate to the old learning rate that is applied when the ratio of sum-squared errors between successive iterations is larger than a certain prescribed error ratio (= 1.04 in this study). Similarly, when the ratio of sum-squared errors between iterations is less than the prescribed error ratio, the ratio of the new to the old learning rate is termed the "increasing ratio of learning rate." From Table 11, it is observed that $\overline{RMS}$ and $RMS_{min}$ are the lowest of all test conditions when the increasing ratio of learning rate is equal to 1.05. Thus an increasing ratio close to unity is preferred. Table 12 shows that the best training quality in this study is obtained when the decreasing ratio of learning rate is 0.7. Based on the training time listed in Tables 11 and 12, the training time decreases as either ratio increases.
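The adaptive learning-rate rule can be sketched as a small function. The version below follows the commonly used variable-learning-rate scheme, which is an assumption about the details rather than a confirmed description of the study's code: the rate is multiplied by a factor below one and the step is rejected when the error grows by more than the prescribed ratio, the rate is multiplied by a factor above one when the error decreases, and the rate is left unchanged in between. The default values match those quoted in the text (1.05, 0.7 and 1.04); the SSE trace is made up.

```python
def adapt_learning_rate(lr, old_sse, new_sse,
                        lr_inc=1.05, lr_dec=0.7, max_err_ratio=1.04):
    """Return (new_lr, step_accepted) for one training iteration."""
    if new_sse > max_err_ratio * old_sse:
        return lr * lr_dec, False      # error grew too much: shrink, reject
    if new_sse < old_sse:
        return lr * lr_inc, True       # error fell: grow the rate
    return lr, True                    # small increase: keep the rate


# Made-up sequence of SSE values proposed by successive steps.
lr = 0.01
accepted_sse = 7.0
for proposed_sse in (5.0, 5.1, 6.0, 4.0):
    lr, accepted = adapt_learning_rate(lr, accepted_sse, proposed_sse)
    if accepted:
        accepted_sse = proposed_sse
    print(proposed_sse, accepted, lr)
```

In this trace the rate grows on the first and last steps, stays put on the second (a 2% rise is within the 1.04 tolerance) and is cut on the third, where the proposed step is discarded.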

Momentum

In the training process of a BPANN, weights and biases change continuously in the direction of steepest descent of the error. Adding a fraction of the previous iteration's change in the weights and biases allows the network to respond not only to the local gradient but also to recent trends on the error surface. This fraction is called momentum. Momentum allows the network to ignore small features on the error surface. With too little momentum, a network may get stuck in a shallow local minimum; with adequate momentum, it can slide through such a minimum. However, when the error surface is not very complicated, too large a momentum may merely lengthen the training time.
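The momentum update described above can be sketched as follows. This assumes the classical form, in which the new change equals the momentum times the previous change minus the learning rate times the gradient; the exact update used in the study may differ, and the names below are illustrative. Under a constant gradient g the change per iteration grows toward lr * g / (1 - momentum), which is why a larger momentum carries the weights across small features of the error surface.

```python
from typing import Tuple


def momentum_step(
    weight: float,
    prev_change: float,
    gradient: float,
    lr: float,
    momentum: float,
) -> Tuple[float, float]:
    """One gradient-descent update with classical momentum (assumed form).

    Returns the updated weight and the change that was applied, so the
    change can be fed back in as prev_change on the next iteration.
    """
    change = momentum * prev_change - lr * gradient
    return weight + change, change


def steady_state_change(
    gradient: float, lr: float, momentum: float, n_steps: int = 1000
) -> float:
    """Change per iteration after many steps under a constant gradient."""
    weight, change = 0.0, 0.0
    for _ in range(n_steps):
        weight, change = momentum_step(weight, change, gradient, lr, momentum)
    return change


if __name__ == "__main__":
    for mc in (0.6, 0.95):
        print("momentum", mc, "->", steady_state_change(1.0, 0.01, mc))
```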

Table 13 shows the influence of momentum on network training quality and training time. The test conditions are F1-F4 and A5. The [bar.RMS] and [RMS.sub.min] are minimal when the momentum is 0.95. The training time also decreases as the momentum increases, from 134 sec at a momentum of 0.6 to 127 sec at 0.95.

SUMMARY AND CONCLUSIONS

This study uses a BPANN to successfully predict the shrinkage and warpage of thin-wall injected products and investigates the effect of structure parameters on network training quality. The results show that the trained BPANN predicts the target values of the training and testing pairs better than C-Mold. Compared with Taguchi's method, the BPANN is also more flexible and gives a better optimal process condition for minimizing the shrinkage and warpage of thin-wall parts. On the other hand, a BPANN has some shortcomings and limitations. It must be trained before it can be used for prediction and optimization. When the training data are too few to describe the predicted system adequately, the values predicted by the trained network will be inaccurate. A BPANN is also system-dependent; when the predicted system changes, the network has to be retrained.

Based on the results of the parametric study conducted in this work, the following network parameters are suggested for the best training quality: number of training pairs = 24, number of hidden-layer nodes = 16, transfer function = sigmoid, increasing ratio of learning rate = 1.05, decreasing ratio of learning rate = 0.7, and momentum = 0.95. This paper also suggests a method for statistically evaluating the prediction performance of the trained BPANN. The selection of testing pairs from the experimental data is found to affect significantly the evaluated prediction performance. This study therefore suggests adopting randomly sampled experimental data as testing pairs in each test and increasing the total number of tests until the average prediction-performance error approaches a fixed value. This fixed value serves as the estimate of the accuracy of the network's prediction performance.
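The suggested evaluation procedure can be sketched as a loop. This is a minimal sketch under stated assumptions: `run_test` is a hypothetical callback that trains a network on the training indices and returns the prediction error on the testing indices, and the stopping tolerance, the minimum number of tests, and the use of the change in the running average as the convergence check are illustrative choices, not values from the study.

```python
import random
from typing import Callable, Sequence, Tuple


def estimate_prediction_error(
    n_experiments: int,
    n_testing: int,
    run_test: Callable[[Sequence[int], Sequence[int]], float],
    tol: float = 1e-3,
    min_tests: int = 5,
    max_tests: int = 1000,
    seed: int = 0,
) -> Tuple[float, int]:
    """Average the prediction error over randomly re-sampled testing pairs.

    Each test draws n_testing experiments at random as testing pairs and
    uses the rest as training pairs. Tests are added until the running
    average changes by less than tol (assumed stopping rule).
    Returns the average error and the number of tests performed.
    """
    rng = random.Random(seed)
    indices = list(range(n_experiments))
    total = 0.0
    previous_mean = None
    for m in range(1, max_tests + 1):
        test_idx = sorted(rng.sample(indices, n_testing))
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        total += run_test(train_idx, test_idx)
        mean = total / m
        if (
            previous_mean is not None
            and m >= min_tests
            and abs(mean - previous_mean) < tol
        ):
            return mean, m
        previous_mean = mean
    return previous_mean, max_tests


def constant_error(train_idx: Sequence[int], test_idx: Sequence[int]) -> float:
    """Stand-in for 'train a BPANN, then score it'; always reports 0.10."""
    return 0.10


if __name__ == "__main__":
    # 27 experiments split into 24 training and 3 testing pairs per test.
    print(estimate_prediction_error(27, 3, constant_error))
```

With a real `run_test`, the returned average plays the role of the fixed value that the text treats as the accuracy estimate; the seed only makes the random sampling repeatable.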

NOMENCLATURE

| Symbol | Definition |
|---|---|
| BPANN | back-propagation artificial neural network |
| E | error of training quality |
| M | the total number of tests for evaluating the prediction performance |
| m | number of training pairs |
| n | number of nodes of output layer |
| o | number of testing pairs |
| P | pressure, MPa |
| [P.sub.ij] | predicted value of node of output layer of BPANN for training pairs |
| [P.sub.ik] | predicted value of node of output layer of BPANN for testing pairs |
| RMS | root-mean-squared error |
| r | number of sets of weight and bias initial values |
| SSE | sum-squared error |
| T | temperature, [degrees]C |
| [T.sub.ij] | target value of node of output layer of BPANN for training pairs |
| [T.sub.ik] | target value of node of output layer of BPANN for testing pairs |
| v | velocity, mm/sec |
| X | x-direction shrinkage, mm |
| Y | y-direction shrinkage, mm |
| Z | z-direction warpage, mm |

Greek Symbols

| Symbol | Definition |
|---|---|
| [DELTA] | difference |
| [sigma] | standard deviation |

Subscripts

| Symbol | Definition |
|---|---|
| max | maximum value |
| min | minimum value |
| mold | mold |
| p | packing pressure |
| polymer | polymer |
| q | BPANN trained by the [q.sub.th] initial set of values of weight and bias |
| quality | quality |
| rel | relative |
| s | shrinkage |
| w | warpage |

Superscripts

| Symbol | Definition |
|---|---|
| - | average value |
| * | dimensionless parameter |

Table 1. Factors and Levels Selected in This Study.

| Factor | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| A. Mold Temperature (1) ([degrees]C) | 60 | 70 | 80 |
| B. Melt Temperature (2) ([degrees]C) | 260 | 270 | 280 |
| C. Packing Pressure (5) (MPa) | 60 | 80 | 100 |
| D. Injection Speed (9) (mm/sec) | 50 | 90 | 130 |

Table 2. The Orthogonal Array [L.sub.27] ([3.sup.13]) Used in This Study.

| Process Condition | Column 1 (A) | Column 2 (B) | Column 5 (C) | Column 9 (D) |
|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 2 |
| 3 | 1 | 1 | 3 | 3 |
| 4 | 1 | 2 | 1 | 2 |
| 5 | 1 | 2 | 2 | 3 |
| 6 | 1 | 2 | 3 | 1 |
| 7 | 1 | 3 | 1 | 3 |
| 8 | 1 | 3 | 2 | 1 |
| 9 | 1 | 3 | 3 | 2 |
| 10 | 2 | 1 | 1 | 2 |
| 11 | 2 | 1 | 2 | 3 |
| 12 | 2 | 1 | 3 | 1 |
| 13 | 2 | 2 | 1 | 3 |
| 14 | 2 | 2 | 2 | 1 |
| 15 | 2 | 2 | 3 | 2 |
| 16 | 2 | 3 | 1 | 1 |
| 17 | 2 | 3 | 2 | 2 |
| 18 | 2 | 3 | 3 | 3 |
| 19 | 3 | 1 | 1 | 3 |
| 20 | 3 | 1 | 2 | 1 |
| 21 | 3 | 1 | 3 | 2 |
| 22 | 3 | 2 | 1 | 1 |
| 23 | 3 | 2 | 2 | 2 |
| 24 | 3 | 2 | 3 | 3 |
| 25 | 3 | 3 | 1 | 2 |
| 26 | 3 | 3 | 2 | 3 |
| 27 | 3 | 3 | 3 | 1 |

Table 3. Test Conditions To Examine the Effect of Structural Parameters on Network.

| Test Condition | Input Layer: No. of Training Pairs | Hidden Layer: No. of Hidden Nodes |
|---|---|---|
| A1 | 12 | 16 |
| A2 | 15 | 16 |
| A3 | 18 | 16 |
| A4 | 21 | 16 |
| A5* | 24 | 16 |
| B1 | 24 | 1 |
| B2 | 24 | 2 |
| B3 | 24 | 4 |
| B4 | 24 | 8 |
| C1 | 24 | 16 |
| C2 | 24 | 16 |
| D1 | 24 | 16 |
| D2 | 24 | 16 |
| D3 | 24 | 16 |
| D4 | 24 | 16 |
| E1 | 24 | 16 |
| E2 | 24 | 16 |
| E3 | 24 | 16 |
| E4 | 24 | 16 |
| F1 | 24 | 16 |
| F2 | 24 | 16 |
| F3 | 24 | 16 |
| F4 | 24 | 16 |

Overall Parameters

| Test Condition | Transfer Function | Lr-inc | Lr-dec | Momentum |
|---|---|---|---|---|
| A1 | Sigmoid | 1.05 | 0.7 | 0.95 |
| A2 | Sigmoid | 1.05 | 0.7 | 0.95 |
| A3 | Sigmoid | 1.05 | 0.7 | 0.95 |
| A4 | Sigmoid | 1.05 | 0.7 | 0.95 |
| A5* | Sigmoid | 1.05 | 0.7 | 0.95 |
| B1 | Sigmoid | 1.05 | 0.7 | 0.95 |
| B2 | Sigmoid | 1.05 | 0.7 | 0.95 |
| B3 | Sigmoid | 1.05 | 0.7 | 0.95 |
| B4 | Sigmoid | 1.05 | 0.7 | 0.95 |
| C1 | Hyper tangent | 1.05 | 0.7 | 0.95 |
| C2 | Pure linear | 1.05 | 0.7 | 0.95 |
| D1 | Sigmoid | 1.2 | 0.7 | 0.95 |
| D2 | Sigmoid | 1.5 | 0.7 | 0.95 |
| D3 | Sigmoid | 1.7 | 0.7 | 0.95 |
| D4 | Sigmoid | 2.0 | 0.7 | 0.95 |
| E1 | Sigmoid | 1.05 | 0.6 | 0.95 |
| E2 | Sigmoid | 1.05 | 0.8 | 0.95 |
| E3 | Sigmoid | 1.05 | 0.9 | 0.95 |
| E4 | Sigmoid | 1.05 | 0.95 | 0.95 |
| F1 | Sigmoid | 1.05 | 0.7 | 0.6 |
| F2 | Sigmoid | 1.05 | 0.7 | 0.7 |
| F3 | Sigmoid | 1.05 | 0.7 | 0.8 |
| F4 | Sigmoid | 1.05 | 0.7 | 0.9 |

*The test condition of A5 is the base-line condition.

Table 4. Effect of Initial Values on Training Results for Base-Line Condition A5.

| Set Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| RMS | 2.72E-3 | 2.48E-3 | 2.43E-3 | 1.02E-2 | 2.30E-3 | 2.69E-3 | 3.06E-3 | 2.03E-3 | 4.34E-3 | 2.54E-3 |

[bar.RMS] = 3.48E-3; [sigma] = 2.45E-3

Table 5. The Prediction Performance for Base-Line Condition A5.

| Testing Pairs | X Target Value | X Predicted Value | X [E.sub.rel] | Y Target Value | Y Predicted Value | Y [E.sub.rel] | Z Target Value | Z Predicted Value | Z [E.sub.rel] |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.295 | 0.269 | -8.81% | 1.18 | 1.22 | 3.39% | 0.0825 | 0.0932 | 13.0% |
| 13 | 0.305 | 0.326 | 6.88% | 1.01 | 1.09 | 7.92% | 0.0825 | 0.0952 | 15.39% |
| 25 | 0.390 | 0.340 | -12.8% | 1.08 | 1.14 | 5.55% | 0.131 | 0.118 | -9.92% |

Table 6. Optimal Process Conditions Determined by BPANN.

| Number of Divisions | Observed Factor | Mold Temperature ([degrees]C) | Melt Temperature ([degrees]C) | Packing Pressure (MPa) | Injection Speed (mm/sec) | Predicted Value by BPANN (mm) |
|---|---|---|---|---|---|---|
| 3 | [DELTA]X | 70 | 260 | 100 | 50 | 0.1787 |
| 3 | [DELTA]Y | 70 | 280 | 100 | 50 | 0.6305 |
| 3 | [DELTA]Z | 70 | 260 | 100 | 50 | 0.00727 |
| 5 | [DELTA]X | 70 | 260 | 100 | 70 | 0.1786 |
| 5 | [DELTA]Y | 75 | 280 | 100 | 50 | 0.6301 |
| 5 | [DELTA]Z | 70 | 260 | 100 | 50 | 0.00727 |
| 11 | [DELTA]X | 70 | 260 | 100 | 58 | 0.1785 |
| 11 | [DELTA]Y | 74 | 280 | 100 | 50 | 0.6301 |
| 11 | [DELTA]Z | 68 | 260 | 100 | 50 | 0.00727 |
| 21 | [DELTA]X | 70 | 261 | 100 | 62 | 0.1784 |
| 21 | [DELTA]Y | 73 | 280 | 100 | 50 | 0.63 |
| 21 | [DELTA]Z | 68 | 260 | 100 | 50 | 0.00709 |
| 31 | [DELTA]X | 70 | 260.67 | 100 | 60.67 | 0.1784 |
| 31 | [DELTA]Y | 72.67 | 280 | 100 | 50 | 0.63 |
| 31 | [DELTA]Z | 68.67 | 260 | 100 | 50 | 0.00708 |
| 41 | [DELTA]X | 69.5 | 261.5 | 100 | 62 | 0.1784 |
| 41 | [DELTA]Y | 73 | 280 | 100 | 50 | 0.63 |
| 41 | [DELTA]Z | 68.5 | 260 | 100 | 50 | 0.007083 |
| 51 | [DELTA]X | 69.6 | 261.6 | 100 | 61.2 | 0.1784 |
| 51 | [DELTA]Y | 72.8 | 280 | 100 | 50 | 0.63 |
| 51 | [DELTA]Z | 68.4 | 260 | 100 | 50 | 0.007083 |

Table 7. Comparison of the Shrinkage and Warpage Obtained from Experimental Measurement and Predictions of Taguchi's Method and Neural Network at Optimal Process Conditions Deduced by Taguchi's Method (6).

| Observed Factor | [DELTA]X | [DELTA]Y | [DELTA]Z |
|---|---|---|---|
| Experimental Value (mm) | 0.185 | 0.701 | 0.164 |
| Predicted Value by Taguchi's Method (mm) | 0.176 | 0.629 | 0.005 |
| Predicted Value by BPANN (mm) | 0.192 | 0.632 | 0.007 |
| | [DELTA]Y 1.111, [DELTA]Z 0.000 | [DELTA]X 0.236, [DELTA]Z 0.437 | [DELTA]X 0.203, [DELTA]Y 0.886 |

Table 8. Effect of Number of Training Pairs on RMS, [RMS.sub.min], [sigma], and Training Time.

| No. of Training Pairs | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| 12 | 0.005254 | 0.005221 | 0.000018 | 113 |
| 15 | 0.011199 | 0.004712 | 0.003395 | 117 |
| 18 | 0.006705 | 0.004303 | 0.003442 | 120 |
| 21 | 0.005621 | 0.003983 | 0.003354 | 124 |
| 24 | 0.003479 | 0.002028 | 0.002452 | 127 |

Table 9. Effect of the Number of Hidden Nodes on [bar.RMS], [RMS.sub.min], [sigma], and Training Time.

| No. of Hidden Nodes | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| 1 | 0.16931 | 0.169173 | 0.000276 | 101 |
| 2 | 0.150346 | 0.141445 | 0.003456 | 108 |
| 4 | 0.09902 | 0.087777 | 0.007968 | 115 |
| 8 | 0.033142 | 0.02324 | 0.004638 | 121 |
| 16 | 0.003479 | 0.002028 | 0.002452 | 127 |

Table 10. Effect of Transfer Function of Hidden Nodes on [bar.RMS], [RMS.sub.min], [sigma], and Training Time.

| Transfer Function | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| Sigmoid | 0.003479 | 0.002028 | 0.002452 | 127 |
| Hyperbolic Tangent | 0.034746 | 0.007719 | 0.077901 | 135 |
| Pure-Linear | 0.168512 | 0.167731 | 0.002275 | 127 |

Table 11. Effect of Increasing Ratio of Learning Rate on [bar.RMS], [RMS.sub.min], [sigma], and Training Time.

| Increasing Ratio of Learning Rate (Lr-inc) | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| 1.05 | 0.003479 | 0.002028 | 0.002452 | 127 |
| 1.2 | 0.007481 | 0.004067 | 0.003221 | 124 |
| 1.5 | 0.017759 | 0.009079 | 0.008654 | 120 |
| 1.7 | 0.019558 | 0.011438 | 0.00776 | 117 |
| 2.0 | 0.026142 | 0.020779 | 0.006249 | 109 |

Table 12. Effect of Decreasing Ratio of Learning Rate on [bar.RMS], [RMS.sub.min], [sigma], and Training Time.

| Decreasing Ratio of Learning Rate (Lr-dec) | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| 0.6 | 0.005979 | 0.003725 | 0.003312 | 133 |
| 0.7 | 0.003479 | 0.002028 | 0.002452 | 127 |
| 0.8 | 0.005884 | 0.003724 | 0.003357 | 125 |
| 0.9 | 0.00617 | 0.003727 | 0.003346 | 125 |
| 0.95 | 0.010547 | 0.005161 | 0.006113 | 114 |

Table 13. Effect of Momentum on [bar.RMS], [RMS.sub.min], [sigma], and Training Time.

| Momentum | [bar.RMS] | [RMS.sub.min] | [sigma] | Training Time (sec) |
|---|---|---|---|---|
| 0.6 | 0.012575 | 0.006018 | 0.007203 | 134 |
| 0.7 | 0.012201 | 0.005605 | 0.0075 | 133 |
| 0.8 | 0.011616 | 0.004956 | 0.008227 | 133 |
| 0.9 | 0.009325 | 0.004003 | 0.007184 | 132 |
| 0.95 | 0.003479 | 0.002028 | 0.002452 | 127 |

ACKNOWLEDGMENT

The authors are grateful for financial support from Pou-Yuen Technology, 3C Product, Pou-Chen Group.

REFERENCES

1. B. Johnson and M. Sun, Workshop, Hsinchu, Taiwan (February 1997).

2. J. Fassett, SPE ANTEC Conference Proceedings, 1, 430 (1995).

3. N. R. Schott, SPE ANTEC Conference Proceedings, 1, 367 (1998).

4. M. C. Huang and C. C. Tai, J. Materials Processing Technology, 110, 1 (2001).

5. T. H. Wang, W. B. Young, and J. T. Wang, Intern. Polymer Processing XVII, 2, 146 (2002).

6. S. J. Liao, D. Y. Chang, H. J. Chen, L. S. Tsou, J. R. Ho, H. T. Yau, W. H. Hsieh, J. T. Wang, and Y. C. Su, Polym. Eng. Sci., 44, 917 (2004).

7. W. Y. Fowlkes and C. M. Creveling, Engineering Methods for Robust Product Design: Using Taguchi Methods in Technology and Product Development, Addison-Wesley Publishing Co., Reading, Mass. (1995).

8. G. Taguchi, S. Konishi, and Y. Wu, Quality Engineering Series, Vol. 1. Taguchi Methods. Research and Development, ASI (1992).

9. K. Seiji, N. Haramoto, and S. Sakai, Adv. Polym. Technol., 12(4), 403 (1993).

10. S. J. Kim, K. Lee, and Y. I. Kim, SPIE, 2644, 173 (1996).

11. C. K. Kwong and G. F. Smith, International J. Adv. Manufacturing Technol., 14, 239 (1998).

12. W. He, Y. F. Zhang, K. S. Lee, J. Y. H. Fuh, and A. Y. Nee, J. Intelligent Manufacturing, 9, 17 (1998).

13. C. Zhao and F. Gao, Polym. Eng. Sci., 39, 1787 (1999).

14. T. Petrova and D. Kazmer, Adv. Polym. Technol., 18(1), 19 (1999).

15. B. H. M. Sadeghi, J. Materials Processing Technol., 103, 411 (2000).

16. H. C. W. Lau, A. Ning, K. F. Pun, and K. S. Chin, J. Materials Processing Technol., 117, 89 (2001).

17. S. C. Lee and J. R. Youn, J. Reinf. Plastics and Composites, 18(2), 186 (1999).

18. X. Chen and F. Gao, SPE ANTEC Conference Proceedings (2000).

19. R. P. Lippmann, IEEE ASSP Magazine, 4(2), 4 (1987).

20. T. Kohonen, Neural Networks, 1(1), 3 (1988).

21. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Parallel Distributed Processing, 1, 318 (1986).

22. A. A. Minai and R. D. Williams, Proceedings of the 1990 International Joint Conference on Neural Networks, 1, 676, IEEE, Piscataway, N.J. (1990).

23. R. A. Jacobs, Neural Networks, 1(4), 295 (1988).

S. J. LIAO and W. H. HSIEH*

Department of Mechanical Engineering

National Chung Cheng University

Chia-Yi, Taiwan, ROC

JAMES T. WANG and Y. C. SU

Pou Yuen Technology, 3C Product

Pou Chen Group

Chang Hwa, Taiwan, ROC

*To whom correspondence should be addressed. E-mail: imewhh@ccu.edu.tw

Author: | Liao, S.J.; Hsieh, W.H.; Wang, James T.; Su, Y.C. |
---|---|

Publication: | Polymer Engineering and Science |

Date: | Nov 1, 2004 |
