Balancing Privacy and Explainable AI in Semiconductor Manufacturing

Explainable AI (that is, the ability to understand and interpret the decisions made by a machine learning model) can make ML a practical option in a variety of applications while also aligning with ethical and regulatory imperatives. Earlier in our Explainable AI series, we explored why SHAP values are a leading technique for Explainable AI. In this fourth and final installment, we demonstrate how privacy-preserving SHAP values can solve crucial challenges by examining a use case in semiconductor manufacturing, where SHAP values enable root-cause analysis of detected anomalies.

A Crash Course in Semiconductor Manufacturing

Semiconductor fabrication is the process of creating integrated circuits or microchips – electronic devices that are at the heart of nearly all modern electronic products, from smartphones to medical devices. Semiconductor manufacturing involves two main parties, Fabs and SMEs.

The nanoscale circuit of a microchip is manufactured layer by layer on a wafer, a thin silicon disk that carries billions of transistors. Integrated circuits (ICs) are produced in fabrication plants, commonly called Fabs, that manufacture semiconductor products either for other companies (customers that follow the fabless business model) or for themselves.

Modern semiconductor manufacturing processes involve several complex phases using advanced equipment and cleanroom environments to minimize contamination and defects. The high-tech specialized machines are produced by diverse Semiconductor Manufacturing Equipment (SME) companies.

Challenges in Semiconductor Manufacturing

The current manufacturing process involves two significant challenges:

1. Data Privacy

At the end of the manufacturing process, dies are tested and assembled into a finished IC product. To maximize the percentage of microchips that pass the functional tests, precise control and measurements of temperature, pressure, and chemical processes are required all along the production line. 

This testing requires sharing sensitive data between the Fabs and the SMEs. However, SME companies are reluctant to give Fabs access to their machines’ data, due to IP concerns. Likewise, Fabs do not want to expose commercially sensitive information about the fabrication processes and quality metrics to the equipment vendors. 

2. Explainability

In the testing process, AI algorithms, particularly computer vision techniques, can be used to analyze high-resolution images of wafers and identify defects or abnormalities in individual dies. Deep learning models, such as convolutional neural networks (CNNs), can learn to recognize patterns and anomalies that may not be obvious to human inspectors.

Yet without Explainable AI, inspectors may not be able to tell which factors led the model to flag a defect or abnormality; that is, they may not be able to determine the key factors contributing to yield excursions downstream in the process.

Privacy-Preserving Model Explainability

Inpher’s XOR Platform supports privacy-preserving model explainability for gradient boosting decision trees. The recently released XorSHAP algorithm computes SHAP values on sensitive IP data. In semiconductor manufacturing, where metrics and machine sensor readings are proprietary to the SME, this algorithm enables Fabs to perform root-cause analysis and yield optimization by identifying exactly which phases and machines of the manufacturing pipeline account for defective dies.

SHAP Values for Root-Cause Analysis and Yield Optimization

A complex modern semiconductor manufacturing process is constantly monitored; however, not all the signals collected by sensors are equally valuable to detect and explain defects. Often, useful information is buried in irrelevant information and noise.

Our use case employs a publicly available UCI SECOM dataset from sensors used in the production of one specific wafer. For simplicity, we assume that the dataset needed for the analytics is split between the Fab, holding the labels (pass/fail), and the SME, holding sensor data. We split the private datasets into training and testing datasets, train a gradient boosting model, then compute SHAP values on the test dataset without ever exposing information on the original datasets.
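To make concrete what a SHAP value measures, here is a minimal plaintext sketch that computes exact Shapley values by brute-force enumeration of feature coalitions. It is not XorSHAP and involves no privacy protection: the model (`toy_model`), the background rows and the sample are all hypothetical, and the enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical stand-in for a trained tree model: returns a failure score."""
    score = 0.0
    if x[0] > 0.5:        # sensor 0 above its (made-up) threshold
        score += 0.6
        if x[1] > 0.5:    # interaction with sensor 1
            score += 0.3
    return score          # x[2] is never used by the model


def coalition_value(model, x, background, subset):
    """Average output when features in `subset` come from x and the
    remaining features come from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values via enumeration of all coalitions (exponential cost)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical background data and one sample to explain.
background = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
sample = [1.0, 1.0, 0.0]

phi = shapley_values(toy_model, sample, background)
base_value = coalition_value(toy_model, sample, background, set())
print(phi, base_value)
```

The values are additive: the base value plus the sum of the SHAP values equals the model output for the sample, and a feature the model never reads receives a SHAP value of zero.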

Next, we interpret the SHAP values to facilitate the root-cause analysis for the defective samples.

Observation 1
We start by identifying the sensor readings (features) that most significantly influence the pass/fail prediction. To do so, we average the absolute values of the computed SHAP values across the samples and sort the resulting means. This ranking does not require revealing the individual SHAP values to the data analyst. Our first observation is that features 126 and 127 have the most influence on the model prediction.
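The ranking step can be sketched in plain Python as follows. This is a plaintext illustration only: `shap_matrix` holds made-up values (rows are samples, columns are features), whereas in the use case the averaging would run on protected data.

```python
def rank_features_by_mean_abs_shap(shap_matrix):
    """Return feature indices sorted by mean |SHAP|, most influential first."""
    n_samples = len(shap_matrix)
    n_features = len(shap_matrix[0])
    mean_abs = [
        sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: mean_abs[j], reverse=True)
    return order, mean_abs


# Hypothetical SHAP values for 3 samples and 4 features.
shap_matrix = [
    [0.10, -0.40, 0.00, 0.05],
    [-0.20, 0.50, 0.00, 0.00],
    [0.05, -0.30, 0.00, -0.10],
]

order, mean_abs = rank_features_by_mean_abs_shap(shap_matrix)
print(order)  # feature indices, most influential first
```

Taking absolute values before averaging matters: a feature that pushes some samples toward "fail" and others toward "pass" would otherwise look unimportant.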

Observation 2
Of the 415 input features, only 46 contributed to the model output; that is, the sum of the absolute SHAP values of the feature is greater than 0. So nearly 89% of the features have no impact, which suggests that the issues causing the failures are located in a small part of the production line.
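As a quick check of the arithmetic, the sketch below counts contributing features in a small made-up SHAP matrix and then recomputes the share from the two figures quoted above (46 and 415). Summing absolute values is an assumption of this sketch, chosen so that positive and negative contributions do not cancel.

```python
def contributing_features(shap_matrix):
    """Indices of features whose summed |SHAP| over all samples is non-zero."""
    n_features = len(shap_matrix[0])
    return [
        j for j in range(n_features)
        if sum(abs(row[j]) for row in shap_matrix) > 0
    ]


# Hypothetical values: feature 0 cancels out if signs are kept, feature 1 is unused.
toy_shap = [
    [0.2, 0.0, -0.1],
    [-0.2, 0.0, 0.3],
]
used = contributing_features(toy_shap)

# Figures from the text: 46 contributing features out of 415.
n_features_total = 415
n_contributing = 46
share_without_impact = 1 - n_contributing / n_features_total
print(used, f"{share_without_impact:.1%}")
```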

Observation 3
For the 3 highest contributing features (left plot below), we see that a high feature value correlates with a high SHAP value, and that the blue and red samples are far apart. We conclude that for these features, a high feature value indicates failure.

Observation 4
We now analyze whether it is possible to identify failures by simple thresholding of a single feature. The scatter plots below show the relationship between the feature values and the SHAP values for the 4 most important features. Each data point is colored by its failure probability. The histograms of the feature values are drawn in gray.

For features 126, 127 and 214, a vertical line could decently separate samples from the 2 classes. There is little vertical overlap of the red and blue samples for higher values of the features. The fourth most influential feature, 16, seems to add noise to the model prediction.

Observation 5
Surprisingly, using only the top 2 features, a naive approach of classifying samples by single-feature thresholding correctly identifies 14 of the 21 failures, with only 1 false positive. This is almost as good as the XorBoost model, which scores TP=15 and FP=1! We remark that feature 214 is redundant with feature 127; these may be neighboring measurements.
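A naive thresholding classifier of this kind can be sketched as follows. The feature names, thresholds and samples are hypothetical and are not taken from the SECOM data; the sketch only shows how true and false positives would be counted.

```python
def flag_failure(sample, thresholds):
    """Flag a sample as a failure if any monitored feature exceeds its threshold."""
    return any(sample[name] > limit for name, limit in thresholds.items())


def count_tp_fp(samples, labels, thresholds):
    """Count true positives and false positives of the thresholding rule."""
    tp = fp = 0
    for sample, is_failure in zip(samples, labels):
        if flag_failure(sample, thresholds):
            if is_failure:
                tp += 1
            else:
                fp += 1
    return tp, fp


# Hypothetical thresholds on the two most influential features.
thresholds = {"f126": 0.6, "f127": 0.7}

# Hypothetical sensor readings and ground-truth labels (True = failed die).
samples = [
    {"f126": 0.2, "f127": 0.1},
    {"f126": 0.9, "f127": 0.3},
    {"f126": 0.4, "f127": 0.8},
    {"f126": 0.7, "f127": 0.2},
    {"f126": 0.3, "f127": 0.4},
]
labels = [False, True, True, False, True]

tp, fp = count_tp_fp(samples, labels, thresholds)
print(f"TP={tp} FP={fp}")
```

The rule flags high values because, for these features, a high feature value indicates failure.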

Observation 6
Let’s take a closer look at the SHAP values of the one failure sample not yet explained, shown in the left plot below. Clearly, feature 358 is the primary factor leading the model to classify the sample as a failure. The scatter plot of its feature values versus SHAP values (top right) shows an isolated data point in the top-right corner. That’s our sample! Again, a simple threshold allows us to identify the last failure.
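Explaining a single sample amounts to sorting its SHAP values. In the minimal sketch below, the SHAP row is made up, and it assumes that positive SHAP values push the prediction toward "fail".

```python
def top_contributors(shap_row, feature_names, k=2):
    """Return the k features with the largest positive SHAP value for one sample."""
    ranked = sorted(zip(feature_names, shap_row), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical SHAP values of one failed sample.
feature_names = ["f126", "f127", "f358", "f16"]
shap_row = [0.02, -0.01, 0.85, 0.03]

top = top_contributors(shap_row, feature_names)
print(top)
```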

All the detected failures could be explained by only 3 features out of 415.


In our semiconductor use case, we quickly and precisely identified the troublesome piece of equipment in the production line, which should be inspected first by the operators. This use of SHAP values will enable an increase in process throughput, decreased time to learning, and reduced per unit production costs.

This methodology can be adapted for a wide variety of applications, fields and industries, including government, finance, insurance and healthcare. Simply put, wherever explainability is required to make AI models practical and useful, SHAP values are a powerful tool. For a deeper dive, see our research paper, XorSHAP: Privacy-Preserving Explainable AI for Decision Tree Models.
