A publication of the National Electronics Manufacturing Center of Excellence, July 2004
Michael D. Frederickson
Reliability is defined as the probability that an item will perform its intended function for a specified interval of time and at stated environmental conditions. Identifying part failures and failure mechanisms is critical to understanding, predicting, and testing for reliability. Failures can occur as a result of any stress applied to the device, such as temperature, environmental conditions, or voltage. The long-term failure distribution is usually determined by the chemical and physical properties associated with the technology, design, and material, and by the full impact of the environmental stresses imposed. Qualification testing is essential for the prediction of product reliability. Such testing is critical to any reliability program, both to highlight deficiencies in the design and to verify that corrective actions will improve quality. The product should be tested in all of the environments it will experience during its application. These include high and low temperatures, humidity, corrosion, mechanical shock, high pressure, fungus, sand, dust, explosive atmosphere, vibration, and acceleration.
Two areas of qualification can be pursued to verify reliability. Qualification of the process utilizes statistical process control (SPC) to ensure consistent device fabrication. This can be used to work toward lower costs and shorter delivery times. Product qualification is essentially a validation of the circuit’s ability to perform to a minimum specification under stress and environmental conditions. It typically includes a measurement demonstrating the failure rate of the part using accelerated life testing. By conducting tests in both of these areas, the technology and fabrication of the part may be verified to meet the indicated level of quality and reliability.
SPC provides a baseline for measuring the continuous improvement of a manufacturer’s semiconductor production process. The SPC program should use in-process monitoring techniques to measure key process characteristics that affect device yield and reliability. Every device lot typically has built-in control monitors from which data are gathered. The resulting data should be analyzed using appropriate SPC methods to determine the effectiveness of continuous improvement plans. Cause-and-effect analysis, Pareto analysis, and design of experiments can all be used to verify which key characteristics are the most important to measure for quality purposes. Control charts are used to plot data based on the assumption that the data follow a normal distribution. Upper and lower control limits are then calculated from the data. The process capability (Cp) is also calculated and incorporates the upper and lower specification limits (USL, LSL) that are set by the manufacturer. The Cp index compares the statistical process variation (conventionally taken as six standard deviations) to the specification range, USL minus LSL. If the process variation band matches the specification range exactly, the Cp index is 1.0. The higher the Cp, the more capable the process is of routinely meeting specifications. A general depiction of Cp is shown in Figure 4-1. Curve A represents a Cp of 1, which indicates a capable process, whereas curve B represents a Cp of less than 1, which indicates a process that is not capable.
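As a rough illustration of the control-limit and Cp calculations described above, the following Python sketch assumes the conventional three-sigma control limits and the usual definition Cp = (USL - LSL) / (6 x sigma). Neither formula is spelled out in this article, and the function names and sample readings are hypothetical.

```python
import statistics


def control_limits(samples):
    """Return (LCL, UCL) as the mean -/+ 3 sample standard deviations.

    Assumption: three-sigma limits computed directly from the data.
    """
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    return mean - 3 * sigma, mean + 3 * sigma


def process_capability(samples, lsl, usl):
    """Cp: specification range divided by the six-sigma process spread."""
    sigma = statistics.stdev(samples)
    return (usl - lsl) / (6 * sigma)


# Hypothetical control-monitor readings and specification limits.
readings = [9.8, 10.0, 10.2]
lcl, ucl = control_limits(readings)
cp = process_capability(readings, lsl=9.4, usl=10.6)
```

With these illustrative numbers the specification range equals the six-sigma spread, so Cp comes out at approximately 1.0, the boundary case the text associates with curve A.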
Since reliability can be defined as the probability that an item will perform a required function under stated conditions for a stated period of time, it can be modeled as a probability distribution. The probability of a component surviving to a time (t) is the reliability (R(t)).
The failure rate, f(t), can be expressed in terms of R(t) as given below.
f(t) = -(1 / R(t)) dR(t)/dt
Thus, the failure rate can be defined as the probability of failure in unit time of a component that is still working satisfactorily. For a constant failure rate f, R(t) varies exponentially as a function of time as given below.
R(t) = e^(-f t)
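The exponential form follows from a short integration. The sketch below assumes the standard hazard-rate relation between a constant failure rate f and R(t), together with the initial condition R(0) = 1 (every component works at time zero); both are stated here as assumptions rather than taken from the article.

```latex
f = -\frac{1}{R(t)}\,\frac{dR(t)}{dt}
\;\Longrightarrow\;
\int_{1}^{R(t)} \frac{dR}{R} = -\int_{0}^{t} f\,d\tau
\;\Longrightarrow\;
\ln R(t) = -f\,t
\;\Longrightarrow\;
R(t) = e^{-f t}
```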
The failure rate, f(t), is given as the number of units failing per unit time. Since the number of components failing is typically very low, the failure rate is usually reported as the percent (%) failure per 1 x 10^6 hours, or as the number of devices failing in 1 x 10^9 device hours. The latter unit is referred to as the FIT (failures in time).
1 FIT = 1 failure per 1 x 10^9 device hours
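To make the unit concrete, here is a minimal Python sketch that turns life-test results into a FIT figure. It is a simple point estimate only: it assumes every device runs for the same number of hours and ignores confidence bounds and any acceleration factor from accelerated testing. The function name and the example counts are hypothetical.

```python
def fit_rate(failures, devices, hours_per_device):
    """Point-estimate failure rate in FIT (failures per 1e9 device hours)."""
    device_hours = devices * hours_per_device
    if device_hours <= 0:
        raise ValueError("total device hours must be positive")
    return failures / device_hours * 1e9


# Hypothetical test: 2 failures among 1000 devices run for 1000 hours each,
# i.e. 2 failures in 1e6 device hours.
example_fit = fit_rate(failures=2, devices=1000, hours_per_device=1000)
```

For this example the estimate is about 2000 FIT, since 2 failures in 1 x 10^6 device hours scales to 2000 failures in 1 x 10^9 device hours.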
Another common method for reporting component reliability is the mean time between failures (MTBF). Assuming that the failures occur randomly at a constant failure rate, the MTBF is given below.
MTBF = 1 / f
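Because a FIT is a failure rate per 1 x 10^9 device hours, the relation MTBF = 1 / f lets the two figures be converted into one another. The Python sketch below shows that conversion under the same constant-failure-rate assumption; the function names and the 100 FIT example are hypothetical, and the conversion to years assumes 8760 hours per year.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours (assumed, ignores leap years)


def mtbf_hours_from_fit(fit):
    """MTBF in hours for a constant failure rate given in FIT."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    # f = fit / 1e9 failures per hour, so MTBF = 1 / f = 1e9 / fit.
    return 1e9 / fit


def fit_from_mtbf_hours(mtbf_hours):
    """Inverse conversion: failure rate in FIT for a given MTBF in hours."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1e9 / mtbf_hours


# Hypothetical part rated at 100 FIT.
mtbf_h = mtbf_hours_from_fit(100)
mtbf_years = mtbf_h / HOURS_PER_YEAR
```

A 100 FIT part therefore corresponds to an MTBF of 1 x 10^7 hours, which is on the order of a thousand years.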
The reliability may then be written as the probability of success, P(s), that is, the probability of zero failures up to time t.
P(s) = e^(-t / MTBF), where t = time
Figure 4-2 shows P(s) versus time, normalized to the MTBF. The plot shows that after 1/2 MTBF the probability of no failures is about 61%, and after 1 MTBF it is about 37%.
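The points read from the plot can be checked numerically. The following Python sketch evaluates the exponential probability-of-success expression at one half and one MTBF; the function name is hypothetical, and the constant-failure-rate assumption carries over from the text.

```python
import math


def probability_of_success(t, mtbf):
    """Probability of zero failures up to time t for a constant failure rate."""
    if mtbf <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-t / mtbf)


# Time normalized to the MTBF, as in the plot (mtbf = 1.0).
p_half = probability_of_success(0.5, 1.0)
p_one = probability_of_success(1.0, 1.0)
```

Here `p_half` evaluates to roughly 0.61 and `p_one` to roughly 0.37, which shows that a part is more likely than not to have failed by the time one MTBF has elapsed.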
In summary, military electronics systems and the accompanying semiconductor devices, particularly GaAs, require thorough reliability testing. This includes both SPC monitoring, to observe and improve the key characteristics, and accelerated life testing, to gather the failure rate data needed to characterize MTBF and FIT over the long-term life of the component and system.
ACI Technologies, Inc. - www.aciusa.org - (610) 362-1200