Guide to Reliability of Electrical/Electronic Equipment and Products--Component and Supplier Selection, Qualification, Testing, and Management (part 2)


4. THE ROLE OF THE COMPONENT ENGINEER

The component engineer (CE) (see also Section 3) is a resource for product development and strategic planning. The focus of the CE is on matching the company's product direction with the direction of the supplier base. The most effective development support the CE can provide is to implement fast and effective means of evaluating new products and technologies before they are needed, provide product road maps, participate in design reviews, and provide component application assistance. The CE has traditionally been a trailer in the product development cycle, qualifying components that are already designed into new products. Let's look at the evolution of the CE function.

In the 1960s and 1970s, the job of the component engineer was to write specifications and issue component part numbers. The specifications were used by Receiving Inspection to identify and measure important parameters. The component engineer was not responsible for component quality; that was the responsibility of the quality organization and was controlled by test and measurement techniques whose objective was finding defective components.

The average defect rate was 1 to 5%. For example, a 1% defect rate for a 1000-piece lot of received components translated into an average of 10 bad parts.

Screening material at incoming was an effective and practical, although expensive, way of finding defective components, but it did nothing to institute/drive continuous improvement on the supplier's part. By this practice, the supplier was not held accountable for the quality of the components.

In the 1980s, the scope of responsibility of the component engineer expanded to include quality as well as specification and documentation of components. Component engineers worked with quality engineers to improve product quality. During the 1980s, the average component defect rate was reduced from 10,000 ppm to around 100 ppm. This reduction in defect rate occurred at the same time that devices became more complex. Electrically testing these complex devices became more and more expensive, and the number of defects found became smaller and smaller. A lot of 1000 parts received would yield an average of 0.1 defects at 100 ppm. Put another way, it would now take 10,000 parts to find a single defect, versus 1 defect in 100 parts in the 1970s. Electrical testing at receiving inspection became more and more expensive (next to impossible for VLSI circuits) and an ineffective means of controlling component quality, and so was discontinued.

Component engineers decided that component quality was best controlled at the supplier's site. Thus the suppliers should be responsible for quality, they should regularly be assessed by means of on-site audits, and they should provide process capability indices (Cp and Cpk) for critical processes and electrical performance parameters. This provided an acceptable means of assessing a supplier's ability to consistently build good components. Component engineers also discovered that construction analysis was an effective means of evaluating the quality of execution of a component supplier's manufacturing process. Thus, both audits and construction analyses were added to the qualification process. Because of the increased cost and complexity of ICs, qualification tests per MIL-STD-883 Method 5005 or EIA/JESD 47 (Stress-Test-Driven Qualification of Integrated Circuits) were found to add little value, and their use by OEMs was terminated by the late 1980s. Wafer-level reliability tests, in-line process monitors, and step stress-to-failure methods were used by the IC suppliers instead.

In the 1990s, component engineers de-emphasized the use of on-site audits and construction analyses, devices became even more costly and complex, and pin counts rose dramatically. New package types (such as BGAs and CSPs) added new complications to device qualification and approval. Many devices were custom or near custom (PLDs, FPGAs, and ASICs), and their associated tool sets made it easier to develop applications using these devices. With this change came still higher component quality requirements and reduced defect rates; the average defect rate was approximately 50 ppm. A lot of 1000 parts received would yield an average of 0.05 defective components. That is, it would take 20,000 parts to find a single defect, versus 1 defect in 100 parts in the 1970s.
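The arithmetic behind these defect-rate comparisons is straightforward; the short Python sketch below (function names are mine, for illustration only) reproduces the numbers quoted above:

# Expected defects in a received lot at a given defect rate,
# and the average number of parts needed to find one defect.
def expected_defects(lot_size: int, defect_rate_ppm: float) -> float:
    return lot_size * defect_rate_ppm / 1e6

def parts_per_defect(defect_rate_ppm: float) -> float:
    return 1e6 / defect_rate_ppm

print(expected_defects(1000, 10_000))  # 1970s, 1% = 10,000 ppm: 10.0 bad parts
print(expected_defects(1000, 100))     # 1980s, 100 ppm: 0.1
print(expected_defects(1000, 50))      # 1990s, 50 ppm: 0.05
print(parts_per_defect(50))            # 20,000 parts per defect, on average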

A dramatic improvement in the quality of integrated circuits occurred during the past 30-40 years. Since the early 1970s, the failure rate for integrated circuits has decreased approximately 50% every 3 years! A new phenomenon began to dominate the product quality area: a review of OEM manufacturing issues and field returns showed that most of the problems encountered were not the result of poor component quality. Higher clock rates and edge rates, as well as impedance mismatches, cause signal integrity problems. Component application (fitness for use) and handling processes, software/hardware interaction, and PWA manufacturing quality (workmanship) and compatibility issues are now the drivers of product quality. Many misapplication and PWA manufacturing process problems can be discovered and corrected using HALT and STRIFE tests in the design phase. Furthermore, many component- and process-related defects can be discovered and controlled using production ESS until corrective actions are in place.

4.1 Beyond the Year 2000

The traditional tasks of the component engineer do not fit the current business model; a new paradigm is needed. To meet the challenges of the 21st century, the component engineer's role in generating specifications and issuing part numbers is minimal. In the years since the referenced paper was written, much has changed. Readily available component and technology databases have been established across the industry on the Internet and are easily licensed and downloaded to the component and development engineer's desktops, easing much of the research and documentation required for new components and suppliers. The services offered by these Internet database providers include bill-of-materials optimization, component and sourcing strategies, component availability and lifecycle issues, and price tracking. The amount of information available on the Internet allows more efficient use of both the design and component engineers' time.

The role of the component engineer has changed from an emphasis on device specification and reactive quality control to a more global concept that includes active up-front design involvement, Alpha/Beta application testing, manufacturing, specification, and quality control. Component engineering has moved from a back-end, reactive documentation function to proactive involvement at the conceptual phase of a design. Technology and product road maps are an important part of the component engineer's tool set in providing source and component recommendations that satisfy both the circuit performance requirements and the issues of manufacturability and an assured source of supply.

The CE must be a forward-looking resource for product development and strategic planning. The focus is on matching a company's product direction with the direction of the supply base (see Section 3, Fig. 2). Throughout the design process, the component engineer questions the need for and value of performing various tasks and activities, eliminating all that are non-value-added and retaining the core competencies. As more IC functions are embedded into ASICs via coreware, the coreware will have to be specified, modeled, and characterized, adding a new dimension to a CE's responsibility.

The component engineer is continuously reinventing the component qualification process (more about this in a subsequent section) in order to effectively support fast time-to-market requirements and short product development and life cycles. The CE has traditionally been a trailer in the product development cycle, qualifying components that are already designed into new products. Now components, both hardware and software and their interactions, need to be qualified ahead of the need.

A listing of the component engineer's responsibilities is as follows:

1. Eliminate non-value-added activities.

Reduce the time spent on clerical activities and documentation.

Eliminate the development of SCDs for standard off-the-shelf components unless special testing is required.

Create procurement specifications only for special/custom components.

2. Implement the use and placement of Internet-driven component databases on designer desktops or workstations. These databases will be expanded to include previous component/supplier usage history, field experience, qualification information, application information, and lessons learned.

3. Adopt the use of standard off-the-shelf components (such as those used in PCs, networking, telecommunication equipment, and mobile applications) as much as possible since they drive the marketplace (i.e., IC manufacturing) by their volume.

4. Use ASICs and PLDs where they provide design flexibility and product performance advantages, and thus competitive marketplace advantage.

5. Provide product definition and development support: early involvement in technology and functional component assessment and selection in the conceptual design phase.

Work closely with design teams and development labs to mold technology needs, product direction, and component and supplier selection.

Provide component application assistance.

Disseminate basic technology and road map information to the company via presentations and the intracompany web. Examples of these technologies include PCI, HSTL, GTL, Fibre Channel, ATM, SSA, I2O, optical interconnect, etc.

Conduct ongoing component/supplier selection application reviews (i.e., bill-of-material reviews) throughout the design process to ensure that the components and suppliers selected are fit for the application, won't go end-of-life within the projected product life, are manufacturable, and can meet delivery, quality, and reliability needs.

6. Coordinate component and technology qualification.

Develop a qualification plan for strategic components.

Begin early qualification of technologies, devices, and package styles that will be required for new designs (such as low-voltage, 1.2-V logic) as soon as the parts become available, before they are needed in product design.

Ensure that the component is qualified and available just as the circuit design is completed, so the new component can be inserted in the product and shipped.

Obtain device SPICE models for signal integrity analysis and use in system modeling and evaluation.

Determine the need for construction analyses, special studies, and design reviews and coordinate their conduct.

Define combinational "smorgy" test plans (mixing and matching suppliers on a PWA) for product test.

7. Manage supplier design usage.

Consider current suppliers first. Do the current suppliers' technology and component road maps match ours? Is there a good fit? Are there other suppliers we should be using? Only approved suppliers will be used for new designs. Suppliers not on the approved suppliers list (ASL) need to be justified for use and pass the supplier approval process. If the business dictates (lower price or increased performance), new suppliers will be considered.

Reduce the number of supplier audits by eliminating the need to audit the top tier suppliers. Institute supplier self-audits in lieu of customer audits for top tier suppliers.

Focus resources on components that pose the greatest quality or design risks, the greatest areas for improvement, and/or those that are strategic to a given product design.

8. Provide manufacturing support.

Work with suppliers on quality and availability issues. Ensure that the right devices are used for application requirements. Ensure that parts meet safety requirements and manufacturing requirements, such as PWA water wash and reducing the number of device families to support.

Work with Development, Manufacturing, and Purchasing on cost reduction programs for existing and new designs. Focus on total cost of ownership in component/supplier selection.

9. Provide CAD model support. Component engineers and MCAD and ECAD designers should combine efforts to create singular and more accurate component models. The MCAD and ECAD models will become the specification control vehicle for components, replacing SCDs.

10. Coordinate the establishment of quality (ppm) and reliability (FIT) goals for critical, core, or strategic commodities including a 3-year improvement plan. Establish a plan to monitor commodity performance to these quality and reliability goals on a quarterly basis.

11. Institute obsolescence BOM reviews for each production PWA every 2 years. Institute obsolescence BOM reviews when a portion of an existing board design will be used in a new design.

5. COMPONENT SELECTION

As discussed in Section 3, the right technology and functional implementation (i.e., specific component) must be selected based on the application requirements.

The type of component selected is a double-edged sword and can significantly impact a product manufacturer's ability to compete in the marketplace. For example, in a low-volume product application market such as military/aerospace or high-end computer manufacturing, a very viable component strategy is to use as many industry-standard off-the-shelf components as possible. The advantages this provides are low price, high availability, low risk, manufacturing process stability due to high volumes produced, high quality and reliability, common building blocks, and fast product ramping. High-volume industry-standard ICs tend to have sufficient design margins and minimal variability of critical parameters, indicating good manufacturing (wafer fab) process control. However, when multiple sources are used for a given component, there is significant variability among the sources of supply. Mixing and matching various suppliers' components can lead to timing violations due to variability and lack of margin, for example. Essentially the low-volume users "piggyback" on the high-volume users, who drive the IC marketplace and who resolve any issues that occur. The disadvantage of this strategy is that anyone can use the same components in their designs.

Thus, all competing hardware designs could end up looking alike (true commodity products), with no real leverage, competitive performance advantage, or differentiating factor between the various manufacturers' product offerings, as is the case for personal computers.

On the flip side of this issue is the competitive advantage and differentiation provided by the use of customizable application-specific ICs (ASICs), PLDs, and systems on a chip (SOCs). They provide the circuit designer with flexibility, performance advantages, and reduced size. But for the low-volume product markets, the device usage is small; there is no benefit to be gained from the quality improvement inherent in large manufacturing (wafer fab) runs; and because of the small quantities purchased, the component prices will tend to be high. Also, since many of these high-leverage parts are designed by fabless IC companies involving multiple outsource providers (wafer fab, assembly, electrical test), several questions arise: who, how, when, and how fast with regard to problem resolution, and who has overall IC quality responsibility? Conversely, due to the long product lifetimes in these markets, the use of ASICs and PLDs provides a solution to the escalating obsolescence of standard off-the-shelf ICs and the subsequent expensive and time-consuming system requalification efforts required for substitute parts.

Today, there is also the issue of leverage in bringing products to market rapidly. Printed wire assemblies are being designed using ASICs, PLDs, and SOCs as placeholders before the logic details are determined, allowing a jump start on PWA design and layout. This is contrary to waiting until the circuit design (actual components used) is fixed using standard off-the-shelf components before PWA layout begins. Again the issues are market leverage, product performance, and time to market. Another problem is that incurred in using ASICs: normally, several "spins" of a design are required before the IC design is finalized, resulting in high nonrecurring engineering (NRE) and mask charges for the product manufacturer. Many designers have turned to programmable devices [PLDs and field-programmable gate arrays (FPGAs)] because of their flexibility of use and the fact that device densities, performance, and price have approached those of ASICs. Another strategic decision needs to be made as well.

For which components are multiple sources both available and acceptable for use in a given design? For which components will a single-source solution be used? These decisions are important from a risk perspective. Using a single-sourced component may provide a performance advantage, but at the same time may involve increased risk from a yield perspective and in ensuring a continuous source of components in the manufacturing flow. This last point is key because often (but not always) unique sole-sourced components (which provide product functional and performance advantages) come from smaller boutique suppliers that do not have the fab capacity arrangements and quality infrastructure in place to provide the level of support required by the OEM. Multiple suppliers for any component increase the number of variables in the product and manufacturing process, both of which make it difficult to ensure a consistent product.

6. INTEGRATED CIRCUIT RELIABILITY

The reliability of ICs critical to equipment operation (primarily microprocessors, memories, ASICs, and FPGAs) by and large determines (or drives) equipment (product) reliability. This section traces the historical perspective of IC quality and reliability and how it evolved to where we are today. I then discuss accelerated (life) testing and how it is used to estimate fielded reliability.

6.1 Historical Perspective: Integrated Circuit Test

Accelerated testing has been extensively used for product improvement, for product qualification, and for improving manufacturing yields for several decades.

Starting with the lowest level in the electronics equipment food chain, accelerated testing was rigorously applied to the packaged integrated circuit, for economic reasons. Component screening has been a way of life for the military/aerospace community, led by NASA and the U.S. Air Force, since the inception of the integrated circuit industry in the late 1950s. Almost every major computer, telecommunications, and automotive manufacturer performed accelerated stress testing on all purchased integrated circuits until major quality improvements became widely evident.

The reason OEMs performed this seemingly non-value-added screening was that U.S. IC manufacturers had an essentially complacent, almost arrogant, attitude toward quality. "We'll decide what products you need and we'll provide the quality levels that we feel are appropriate." They turned a deaf ear to the users' needs. In the 1960s and 1970s, IC suppliers often had a bone pile of reject devices at final electrical test, stored in 50-gallon drums. Additionally, the ICs that were delivered to customers had a high field failure rate. There was no concerted effort on the part of the IC suppliers to find the root cause of these in-house rejects or of the high field failure rates experienced, and to institute a corrective action feedback loop to design and manufacturing.

As a result, the automotive industry and the U.S. computer industry (led by Burroughs/Unisys, Honeywell, and others), because of usage problems encountered with the first DRAMs (notably the Intel 1103), came to the conclusion that it would be prudent to perform 100% incoming electrical inspection, burn-in, and environmental stress testing. Other industries followed suit.

The decision to perform accelerated stress testing (AST) resulted in the creation of an open forum in 1970 for the discussion of electrical testing issues between IC manufacturers and users: the International Test Conference. Testing required expensive capital equipment and a technical personnel overhead structure at most large users of ICs. Some of the smaller users chose to outsource their incoming electrical inspection needs to independent third-party testing laboratories, thus fueling that industry's growth.

==========

TABLE 8 Cost of Failures

Type of business                     Cost per hour of downtime
Retail brokerage                     $6,450,000
Credit card sales authorization      $2,600,000
Home shopping channels               $113,700
Catalog sales center                 $90,000
Airline reservation centers          $89,500
Cellular service activation          $41,000
Package shipping service             $28,250
Online network connect fees          $22,250
ATM service fees                     $14,500

Source: IBM internal studies.

==========


FIGURE 7 Cost of failure versus place of discovery for electronic devices.

Up to the mid-1980s most users of integrated circuits performed some level of screening, up to the LSI level of integration, for the following reasons:

They lacked confidence in all suppliers' ability to ship high-quality, high-reliability ICs.

They felt some screening was better than none (i.e., self-protection).

They embraced the economic concept that the earlier in the manufacturing cycle you find and remove a defect, the lower the total cost.

The last item is known as the "law of 10" and is graphically depicted in Figure 7. From the figure, the lowest-cost node where the user could make an impact was at incoming test; hence the rationale for implementing 100% electrical and environmental stress screening at this node.

The impact of failure in the field can be demonstrated by considering some of the effects that failure of a computer in commercial business applications can have. Such a failure can mean

The Internet going down

A bank unable to make any transactions

The telephone system out of order

A store unable to fill your order

Airlines unable to find your reservation

A slot machine not paying off

The dramatic increase in cost of failures when moving from identification at the component level to identification in the field is shown in Table 8. This is the overwhelming reason why accelerated stress testing was performed.

The arrogant independence of the IC suppliers came to a screeching halt in 1980 when Hewlett-Packard dropped a bombshell. They announced that Japanese 16K DRAMs exhibited one to two orders of magnitude fewer defects than did those same DRAMs produced by IC manufacturers based in the United States.

The Japanese aggressively and insightfully implemented the quality teachings of such visionaries as Drs. W. Edwards Deming and Joseph Juran and consequently forever changed the way that integrated circuits are designed, manufactured, and tested. The Japanese sent a wake-up call to all of U.S. industry, not just the semiconductor industry. Shoddy quality products were no longer acceptable.

They raised the bar for product quality and changed the focus from a domestic one to a global one.

The Hewlett-Packard announcement and subsequent loss of worldwide DRAM market share served to mobilize the U.S. IC industry to focus on quality.

The result was a paradigm shift from an inspection mindset to a genuine concern for the quality of ICs produced. It was simply a matter of designing and manufacturing quality ICs or going out of business. Easy to say; longer and tougher to realize.

Also, in the 1980-1981 timeframe, the U.S. Navy found that over 50% of its fielded F-14 aircraft were sitting on the flight line unavailable for use. Investigations traced the root cause of these unavailable aircraft to IC defects. As a result, Willis Willoughby instituted the requirement that all semiconductors used in Navy systems be 100% rescreened (duplicating the screens that the manufacturer performs) until such time as each IC manufacturer could provide data showing that the outgoing defect level of parts from fab, assembly, and electrical test was less than 100 ppm. This brute-force approach was necessary to show the semiconductor industry that the military was serious about quality.

Since components (ICs) have historically been the major causes of field failures, screening was used

To ensure that the ICs meet all the electrical performance limits in the supplier's data sheets (supplier and user).

To ensure that the ICs meet the unspecified parameters required for system use of the selected ICs/suppliers (user).

To eliminate infant mortalities (supplier and user).

To monitor the manufacturing process and use the gathered data to institute appropriate corrective action measures to minimize the causes of variation (supplier).

As a temporary solution until the appropriate design and/or process corrective actions could be implemented based on a root cause analysis of the problem (user or supplier) and until IC manufacturing yields improved (user).

Because suppliers expect sophisticated users to find problems with their parts. It was true in 1970 for the Intel 1103 DRAM, it is true today for the Intel Pentium, and it will continue to be true in the future. No matter how many thousands of hours a supplier spends developing the test vectors for and testing a given IC, all the possible ways a complex IC will be used cannot be anticipated. So early adopters are relied upon to help identify the bugs.

As mentioned, this has changed: IC quality and reliability have improved dramatically as IC suppliers have made major quality improvements in design, wafer fabrication, packaging, and electrical test. Since the early 1970s, IC failure rates have fallen, typically 50% every 3 years. Quality has improved to such an extent that:

1. Integrated circuits are not the primary cause of problems in the field and product failure. The major issues today deal with IC attachment to the printed circuit board, handling issues (mechanical damage and ESD), misapplication or misuse of the IC, and problems with other system components such as connectors and power supplies. When IC problems do occur, however, it is a big deal and requires a focused effort on the part of all stakeholders for timely resolution and implementation of permanent corrective action.

2. Virtually no user performs component screening. There is no value to be gained. With the complexity of today's ICs, no one but the IC supplier is in a position to do an effective job of electrically testing them. The supplier has the design knowledge (architectural, topographical, and functional databases), the people who understand the device operation and idiosyncrasies, and the simulation and test tools to develop the most effective test vector set for a given device and thus assure high test coverage.

3. U.S. IC suppliers continually regained lost market share throughout the 1990s.

4. Many failures today are system failures, involving timing, worst case combinations, or software-hardware interactions. Increased system complexity and component quality have resulted in a shift of system failure causes away from components to more system-level factors, including manufacturing, design, system-level requirements, interface, and software.


FIGURE 8 The impact of advanced technology and improved manufacturing processes on the failure rate (bathtub) curve.

6.2 Failure Mechanisms and Acceleration Factors

Integrated circuits are not the primary cause of product failure, as they were from the 1960s through the 1980s. The reason for this improvement is that IC suppliers have focused on quality. They use sophisticated design tools and models that accurately match the circuit design with the process. They have a better understanding and control of process parameters and device properties. There is a lower incidence of localized material defects; greater attention is paid to detail, based on data gathered to drive continuous improvement; and when a problem occurs, the true root cause is determined and permanent corrective action is implemented.

Figure 8 shows this improvement graphically using the bathtub failure rate curve.

Notice the improved infant mortality and lower useful-life failure rates due to improved manufacturing processes and more accurate design processes (design for quality) and models. Also notice the potentially shorter life due to smaller feature size ramifications (electromigration and hot carrier injection, for example).

Nonetheless failures still do occur. Table 9 lists some of the possible failure mechanisms that impact reliability. It is important to understand these mechanisms and what means, if any, can be used to accelerate them in a short period of time so that ICs containing these defects will be separated and not shipped to customers. The accelerated tests continue only until corrective actions have been implemented by means of design, process, material, or equipment change. I would like to point out that OEMs using leading edge ICs expect problems to occur. However, when problems occur they expect a focused effort to understand the problem and the risk, contain the problem parts, and implement permanent corrective action. How issues are addressed when they happen differentiates strategic suppliers from the rest of the pack.

=======

TABLE 9 Potential Reliability-Limiting Failure Mechanisms

Mobile ion contamination
  Impure metals and targets
  Manufacturing equipment
Metals
  Electromigration (via and contact)
  Stress voiding
  Contact spiking
  Via integrity
  Step coverage
  Copper issues
Oxide and dielectric layers
  TDDB and wearout
  Hot carrier degradation
  Passivation integrity
  Gate oxide integrity (GOI)
  Interlayer dielectric integrity/delamination
EOS/ESD
Cracked die
Package/assembly defects
  Wire bonding
  Die attach
  Delamination
  Solder ball issues
Single event upset (SEU) or soft errors
  Alpha particles
  Cosmic rays

=======

TABLE 10 Examples of Acceleration Stresses

Voltage acceleration
  Dielectric breakdown
  Surface state generation
  Gate-induced drain leakage current
  Hot carrier generation, injection
  Corrosion
Current density acceleration
  Electromigration in metals
  Hot-trapping in MOSFETs
Humidity/temperature acceleration
  Water permeation
  Electrochemical corrosion

========

Integrated circuit failure mechanisms may be accelerated by temperature, temperature change or gradient, voltage (electric field strength), current density, and humidity, as shown in Tables 10 and 11.

For more information on IC failure mechanisms, see Ref. 2.

(coming soon) TABLE 11 Reliability Acceleration Means

Temperature Acceleration

Most IC failure mechanisms involve one or more chemical processes, each of which occurs at a rate that is highly dependent on temperature; chemical reactions and diffusion rates are examples of this. Because of this strong temperature dependency, several mathematical models have been developed to predict the temperature dependence of various chemical reactions and to determine the acceleration factor of temperature for various failure mechanisms.


FIGURE 9 The Arrhenius model showing relationship between chip temperature and acceleration factor as a function of Ea.

The most widely used model is the Arrhenius reaction rate model, determined empirically by Svante Arrhenius in 1889 to describe the effect of temperature on the rate of inversion of sucrose. Arrhenius postulated that chemical reactions can be made to occur faster by increasing the temperature at which the reaction occurs. The Arrhenius equation is a method of calculating the speed of a reaction at a specified higher temperature and is given by

r(T) = r0 e^(−Ea/kT)    (1)

where

r0 = a constant
e = base of the natural logarithm, 2.7182818
Ea = activation energy for the reaction (eV)
k = Boltzmann's constant = 8.617 × 10^−5 eV/K
T = temperature (K)

Today the Arrhenius equation is widely used to predict how IC failure rates vary with different temperatures and is given by the equation

R1 = R2 e^[(Ea/k)(1/T1 − 1/T2)]    (2)

The acceleration factor is then

AT = e^[(Ea/k)(1/T1 − 1/T2)]    (3)

where

T1 and T2 = temperatures (K)
k = Boltzmann's constant = 86.17 µeV/K

These failure mechanisms are primarily chemical in nature. Other acceleration models are used for nonchemical failure mechanisms, as we shall shortly see. The activation energy Ea is a factor that describes the accelerating effect that temperature has on the rate of a reaction and is expressed in electron volts (eV). A low value of Ea indicates a reaction that has a small dependence on temperature. A high value of Ea indicates a high degree of temperature dependence and thus represents a high acceleration factor. Figure 9 shows the relationship between temperature, activation energy, and acceleration factor.
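As a concrete illustration of Eq. (3), here is a minimal Python sketch (function and constant names are mine, not from the text) that computes the temperature acceleration factor for a given activation energy:

import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant (eV/K)

def temperature_af(ea_ev: float, t_use_k: float, t_stress_k: float) -> float:
    """Arrhenius acceleration factor, Eq. (3): AT = e^[(Ea/k)(1/T1 - 1/T2)]."""
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))

# A higher Ea gives a much larger acceleration factor (cf. Figure 9).
# Illustrative case: 55 C (328 K) use vs. 125 C (398 K) stress, Ea = 0.7 eV:
print(temperature_af(0.7, 328.0, 398.0))  # ~78x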


FIGURE 10 Coffin-Manson model showing the number of stress cycles as a function of temperature change or gradient.

Voltage Acceleration

An electric field acceleration factor (voltage or current) is used to reduce the time required to stress the IC by testing at higher electric field levels; a higher electric field requires less time. Since the advent of VLSI circuits, voltage has been used to accelerate oxide defects, such as pinholes and contamination, in CMOS ICs.

Since the gate oxide of a CMOS transistor is extremely critical to its proper functioning, the purity and cleanliness of the oxide are very important; thus the need to identify potential early-life failures. The IC is operated at higher than normal operating VDD for a period of time, and the resulting acceleration factor AV is used to find the equivalent real time. Data show an exponential relation for most defects according to the formula

AV = e^[γ(VS − VN)]    (4)

where

VS = stress voltage on the thin oxide
VN = thin oxide voltage at normal conditions
γ = 4-6 V^−1
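A minimal Python sketch of Eq. (4), with γ passed in explicitly since it is device- and process-dependent (the 4-6 V^−1 range is the text's; the example voltages are assumptions):

import math

def voltage_af(v_stress: float, v_normal: float, gamma_per_volt: float) -> float:
    """Voltage acceleration factor, Eq. (4): AV = e^[gamma(VS - VN)]."""
    return math.exp(gamma_per_volt * (v_stress - v_normal))

# e.g., stressing a nominal 3.3-V gate oxide at 3.9 V with gamma = 5/V:
print(voltage_af(3.9, 3.3, 5.0))  # ~20x acceleration (illustrative values)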

Humidity Acceleration

The commonly used humidity accelerated test consists of 85°C at 85% RH. The humidity acceleration formula is

AH = e^[0.08(RHs − RHn)]    (5)

where

RHs = relative humidity of stress (%)
RHn = normal relative humidity (%)

For failure mechanisms accelerated by both temperature and humidity, the combined acceleration factor becomes

AT&H = AH × AT    (6)
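Eqs. (5) and (6) translate directly into code; in the sketch below the 85/85 stress is the text's, while the 40% RH use condition and the temperature factor are illustrative assumptions:

import math

def humidity_af(rh_stress_pct: float, rh_normal_pct: float) -> float:
    """Humidity acceleration factor, Eq. (5): AH = e^[0.08(RHs - RHn)]."""
    return math.exp(0.08 * (rh_stress_pct - rh_normal_pct))

ah = humidity_af(85.0, 40.0)  # 85% RH stress vs. assumed 40% RH use: ~36.6x
at = 6.5                      # assumed temperature acceleration factor
print(ah)
print(ah * at)                # combined factor AT&H per Eq. (6)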

Temperature Cycling

Temperature cycling, which simulates power on/off cycles for an IC, with the associated field and temperature stressing, is useful in identifying die bond, wire bond, and metallization defects, and it accelerates delamination. The Coffin-Manson model for thermal cycling acceleration is given by

ATC = (ΔTstress/ΔTuse)^c    (7)

where c = 2-7 and depends on the defect mechanism. Figure 10 shows the number of cycles to failure as a function of temperature change for various values of c.
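A sketch of Eq. (7) in Python; the temperature swings and exponent are assumed values for illustration:

def thermal_cycling_af(dt_stress: float, dt_use: float, c: float) -> float:
    """Coffin-Manson acceleration factor, Eq. (7): ATC = (dT_stress/dT_use)^c."""
    return (dt_stress / dt_use) ** c

# e.g., -40/+125 C chamber cycling (165 C swing) vs. a 30 C field swing, c = 4:
print(thermal_cycling_af(165.0, 30.0, 4.0))  # ~915x (illustrative)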

The total failure rate for an IC is simply the mathematical sum of the individual failure rates obtained during the various acceleration stress tests, or

Ftotal = FT + FVDD + FT&H + FTC    (8)

where

FT = failure rate due to elevated temperature
FVDD = failure rate due to accelerated voltage (or electric field)
FT&H = failure rate due to temperature-humidity acceleration
FTC = failure rate due to temperature cycling

Temperature Acceleration Calculation Example

Reliability defects are treated as chemical reactions accelerated by temperature.

Refining Eq. (3) for the temperature acceleration factor, we get

TAF = r2/r1 = e^[(Ea/k)(1/T1 − 1/T2)]    (9)

Conditions: TBI = 125°C (398 K), TN = 55°C (328 K), k = 8.617 × 10^−5 eV/K.

For Ea = 0.3 eV: TAF = e^[(0.3/8.617 × 10^−5)(1/328 − 1/398)] = 6.47

For Ea = 1.0 eV: TAF = e^[(1.0/8.617 × 10^−5)(1/328 − 1/398)] = 504.1

What does this mean? This shows that a failure mechanism with a high activation energy is greatly accelerated by high temperature compared to a failure mechanism with a low activation energy.
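Both results can be checked in a few lines of Python (temperatures in kelvin as given, 328 K and 398 K):

import math

K_EV_PER_K = 8.617e-5  # Boltzmann's constant (eV/K)

def taf(ea_ev: float, t1_k: float, t2_k: float) -> float:
    # Eq. (9): TAF = e^[(Ea/k)(1/T1 - 1/T2)]
    return math.exp((ea_ev / K_EV_PER_K) * (1.0 / t1_k - 1.0 / t2_k))

print(taf(0.3, 328.0, 398.0))  # ~6.47
print(taf(1.0, 328.0, 398.0))  # ~504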

Voltage Acceleration Calculation Example

Using the electric field form of Eq. (4), we get

VAF = e^[(ES − EN)/EEF]    (10)

where

ES = stress field on the thin oxide (MV/cm)
EN = field on the thin oxide at normal conditions (MV/cm)
EEF = electric field acceleration parameter, determined experimentally or by calculation (Suyko, IRPS '91) (MV/cm); EEF = f(tOX, process) = 1/[ln(10) γ], where γ = 0.4 exp(0.07/kT)

Conditions: tOX = 60 Å, VN = 3.3 V, VS = 4.0 V, ES = 6.67 MV/cm, EN = 5.5 MV/cm, EEF = 0.141 MV/cm. Then

VAF = e^[(6.67 − 5.5)/0.141] ≈ 3920

If VS = 4.5 V (ES = 7.5 MV/cm) and the other conditions remain the same, then

VAF = e^[(7.5 − 5.5)/0.141] ≈ 1.45 × 10^6

This shows the greater acceleration provided by using a higher voltage (E field) for a failure mechanism with a high voltage activation energy.
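The same check for Eq. (10), with the field values in MV/cm as given above:

import math

def vaf(e_stress: float, e_normal: float, e_ef: float) -> float:
    # Eq. (10): VAF = e^[(ES - EN)/EEF]
    return math.exp((e_stress - e_normal) / e_ef)

print(vaf(6.67, 5.5, 0.141))  # ~4.0e3 (the text quotes ~3920; ES is rounded)
print(vaf(7.5, 5.5, 0.141))   # ~1.45e6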

Integrated Circuit FIT Calculation Example

A practical example is now presented, using temperature and voltage acceleration stresses and the activation energies of the encountered failure mechanisms to calculate the reliability of a specific IC. (This example is provided courtesy of Professor Charles Hawkins, University of New Mexico.) Three different lots of a given IC were subjected to an accelerated stress life test (two lots) and a high-voltage extended life test (one lot). The results of these tests are listed in the following table.

The way to read the information is as follows: for lot 1, "1/999" at the 168-hr electrical measurement point means that 1 device of 999 failed. An investigation of the failure pointed to contamination as the root cause (see Step 2).

A step-by-step approach is used to calculate the FIT rate from the experimental results of the life tests.

Step 1. Organize the data by failure mechanism and activation energy.

Step 2. Calculate the total device hours for Lots 1 and 2 excluding infant mortality failures, which are defined as those failures occurring in the first 48 hr.

Step 3. Arrange the data:

Step 4. Divide the number of failures for each Ea by the total equivalent device hours:

Calculate the total oxide failure rate over the total time: two failures gives 6.5 FITs [2/(1.44811 × 10^7 + 2.9175 × 10^8 hours)].

Therefore, the total failure rate is 6.5 + 42 + 3.6 ≈ 52 FITs.
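The FIT bookkeeping in this example is easy to script. A minimal sketch, assuming the equivalent device-hour totals quoted above for the oxide failures and taking the other two per-mechanism rates (42 and 3.6 FITs) as given, since the source tables are not reproduced here:

def fit_rate(failures: int, equivalent_device_hours: float) -> float:
    """FIT = failures per 1e9 equivalent device-hours."""
    return failures / equivalent_device_hours * 1e9

# Oxide defects: 2 failures over the combined equivalent hours from the text.
oxide_fits = fit_rate(2, 1.44811e7 + 2.9175e8)
print(round(oxide_fits, 1))  # ~6.5 FITs

# Eq. (8): the total failure rate is the sum of the per-mechanism rates.
total_fits = oxide_fits + 42.0 + 3.6
print(round(total_fits))  # ~52 FITs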

 
