AI@Edge — Model Fingerprinting: Choosing the Right Model for the Edge and Knowing When the Model Doesn’t Know
Authors: Nirmit Desai, Wei-Han Lee, Utpal Mangla, Satish Sadagopan, Mudhakar Srivatsa, Matthews Thomas, Dinesh Verma
In our first blog, we provided an overview of the challenges in implementing AI/ML at the edge. In our second blog, we discussed a solution to one of those challenges: how to intelligently sub-sample data from multiple edges before it is transferred to a central location for model training. That solution was coresets, machine learning algorithms that succinctly summarize data from edge devices without losing data quality, while enabling real-time decisions with pre-trained AI models at the point of action, at the speed of 5G.
In our third blog, we discussed whether it is essential to gather data at a centralized location for model training. We discussed how, with Federated Learning, AI/ML models can learn from data residing across multiple edges, without sharing the raw data (thereby, offering higher levels of data privacy). Federated Learning algorithms form distributed cohorts of data and create a global model without revealing raw (personal) data.
In this blog, we will discuss NeuralFP, a Model Fingerprinting technique that allows us to select the right ML model for the dataset we are dealing with, based on the characteristics of the local/edge data. Further, at inference time, model fingerprints can also be used to detect anomalous inputs. Anomalous inputs may indicate a shift in the edge environment: for example, a camera's viewpoint may change, or an assembly line may start manufacturing a different kind of product than the one the model was originally trained on.
Returning to the ambulance use case introduced in the previous blogs, we need to select the right model, possibly from a repository of pre-trained models, that can provide the best results for the dataset at hand at a given edge location, depending on the context of the patient and the ambulance. Various types of data are collected for analysis, such as the personal characteristics of the patient, including age, gender, ethnicity, pre-existing conditions, and current vital statistics. Then there is data about the context of the ambulance and the ambient conditions, such as the status and location of the ambulance, whether it is on city roads or highways, weather conditions such as rain or snow, and whether it is day or night. Given data from an edge node, the challenge is in selecting the machine learning model that best matches the data characteristics of that node, both to get the best results for a patient en route to the hospital and to route the ambulance on the best path, with the best 5G network slice. Such a solution requires working with the network OSS systems to orchestrate connectivity: either move the ambulance to a better 5G slice, or create a new 5G slice that better matches the requirements for patient care.
In addition, things can change rapidly during the journey. The patient, for example, may start to exhibit a new medical problem. The ambulance may have to be redirected due to congestion on the road network. The underlying network supporting the far and near edge devices may change. All of these require new models to be deployed at the different edge nodes, and the underpinning challenge is in automating model selection for deployment at the respective edge nodes.
Model Fingerprinting helps in the above scenario by selecting the right model for the right edge node and execution context. IBM Research has proposed a new technique called NeuralFP (Neural Fingerprinting), which constructs fingerprints of neural network models that inherently capture the characteristics of their training data. The key intuition behind NeuralFP is that a neural network model responds differently to data resembling its training set than to data records whose characteristics are inherently different from those in the training dataset. The fingerprints learnt through NeuralFP, stored alongside their neural network models, are then leveraged to select the most suitable model for a deep learning task.
Specifically, NeuralFP constructs fingerprints for neural network models in two steps: 1) build deep generative models for each layer in a deep neural network; and 2) obtain the reconstruction error distribution of the training set on those deep generative models. At a high level, NeuralFP first passes all the training data through the original neural network model to produce activations at the latent-space layers, based on which deep generative models (such as autoencoders) are constructed. Autoencoders are a popular type of deep generative model, optimized to learn reconstructions that are close to the input through compressed representations. As a consequence, autoencoders are likely to compress anomalous data poorly, and the corresponding reconstruction errors should be higher. Motivated by this, we feed the activations (the data used to train the autoencoders) back into the learnt autoencoders to obtain the distribution of reconstruction errors, which serves as the fingerprint of the neural network model. A target dataset can then be matched against the fingerprints of existing (pre-trained) models to determine the best match.
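A minimal sketch of these two steps follows. For self-containment it uses a closed-form linear autoencoder (PCA) in place of a trained deep autoencoder, and a toy randomly-weighted MLP in place of a real pre-trained model; the function names and shapes are our illustrative assumptions, not the published implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_activations(x, weights):
    """Forward pass through a toy MLP, collecting every hidden-layer activation."""
    acts, h = [], x
    for w in weights:
        h = np.tanh(h @ w)
        acts.append(h)
    return acts

def fit_linear_autoencoder(acts, k):
    """Closed-form linear autoencoder (PCA): mean plus top-k principal directions."""
    mu = acts.mean(axis=0)
    _, _, vt = np.linalg.svd(acts - mu, full_matrices=False)
    return mu, vt[:k]

def reconstruction_errors(acts, mu, v):
    z = (acts - mu) @ v.T          # compress to the latent representation
    rec = z @ v + mu               # reconstruct back in activation space
    return np.linalg.norm(acts - rec, axis=1)

def fingerprint(x_train, weights, k=2):
    """NeuralFP-style fingerprint: one autoencoder per layer, plus the
    training-set reconstruction-error distribution (mean, std) for each."""
    fp = []
    for acts in layer_activations(x_train, weights):
        mu, v = fit_linear_autoencoder(acts, k)
        errs = reconstruction_errors(acts, mu, v)
        fp.append({"mu": mu, "v": v, "err_mean": errs.mean(), "err_std": errs.std()})
    return fp

# Toy model with two hidden layers over 8-dimensional inputs.
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 16))]
x_train = rng.normal(size=(500, 8))
fp = fingerprint(x_train, weights)   # one fingerprint entry per layer
```

The stored error distributions (here just mean and standard deviation per layer) are what later gets compared against a target dataset's errors.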
It is interesting to observe that model fingerprints can enable other AI tasks as well, such as out-of-distribution (OOD) detection. Neural network models trained on the cloud serve as useful predictive functions for data at edge devices. However, a model may not fit the edge context even when its prediction output is highly confident. This can happen due to changes in the edge context (e.g., different lighting conditions), data drift, or adversarial influence. Consider the ambulance use case again, where a patient's data may change over time or the network state may be intentionally manipulated by an adversary. NeuralFP passes this data through the neural network model and the associated autoencoders to measure the reconstruction error, which is then compared with the stored reconstruction error distribution to determine abnormality before obtaining meaningful prediction results.
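The OOD check can be sketched as below. To keep the example self-contained, the "activations" are simulated as points near a low-dimensional subspace; the linear autoencoder (PCA) and the 3-sigma threshold on the stored error distribution are our illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated in-distribution "activations": points near a 3-d subspace of R^10.
basis = rng.normal(size=(3, 10))
x_train = rng.normal(size=(400, 3)) @ basis + 0.05 * rng.normal(size=(400, 10))

# Linear autoencoder (PCA) fitted on the training activations.
mu = x_train.mean(axis=0)
_, _, vt = np.linalg.svd(x_train - mu, full_matrices=False)
v = vt[:3]

def recon_error(x):
    rec = ((x - mu) @ v.T) @ v + mu   # compress, then reconstruct
    return np.linalg.norm(x - rec, axis=1)

# Fingerprint: the stored training-set reconstruction-error distribution.
train_err = recon_error(x_train)
threshold = train_err.mean() + 3 * train_err.std()   # illustrative 3-sigma rule

def is_ood(x):
    """Flag inputs whose reconstruction error exceeds the stored distribution."""
    return recon_error(x) > threshold

x_in = rng.normal(size=(50, 3)) @ basis + 0.05 * rng.normal(size=(50, 10))
x_out = rng.normal(size=(50, 10))   # drifted/adversarial inputs off the subspace
# Most of x_out is flagged as OOD; almost none of x_in is.
```

In deployment, flagged inputs would trigger the human-in-the-loop fallback rather than being silently fed to the predictor.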
Thus, in the ambulance use case, the assumption is that there is a repository of pre-trained models that can be deployed to the edge when the ambulance initially responds to the call. When the ambulance begins its journey, data from the ambulance edge is evaluated against existing model fingerprints to measure suitability, based on which the best model is selected from the model repository. Typically, such a fine-grained match would be carried out after applying coarse-grained matching techniques, such as keyword-based searches on model metadata. The identified model is then deployed to the ambulance edge so that the best possible care is provided to the patient. As time passes, the condition of the patient may change, and the edge devices transmit a different set of data. The existing models may no longer be appropriate for the new condition the patient has developed, so a request is made for a more appropriate model. If a better model is found, it is deployed. It is also possible that no suitable model is found. In such cases, the best solution may be to detect out-of-distribution data samples and bring a human expert into the loop to verify model predictions. Dynamic construction of 5G slices can enable real-time transmission of vital patient data to a remote expert for this purpose.
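A toy sketch of fingerprint-based model selection from such a repository is shown below. The repository names, the PCA stand-in for the per-layer autoencoders, and the z-score matching rule are all our illustrative assumptions; a real system would score against the full per-layer error distributions.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_pca(x, k):
    """Closed-form linear autoencoder (PCA), standing in for a deep autoencoder."""
    mu = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mu, full_matrices=False)
    return mu, vt[:k]

def recon_error(x, mu, v):
    rec = ((x - mu) @ v.T) @ v + mu
    return np.linalg.norm(x - rec, axis=1)

def make_fingerprint(x_train, k=2):
    """Fingerprint = autoencoder parameters + training-error distribution."""
    mu, v = fit_pca(x_train, k)
    e = recon_error(x_train, mu, v)
    return {"mu": mu, "v": v, "err_mean": e.mean(), "err_std": e.std()}

def select_model(x_edge, fingerprints):
    """Pick the model whose training-error distribution best explains the edge data."""
    scores = {
        name: abs(recon_error(x_edge, fp["mu"], fp["v"]).mean() - fp["err_mean"])
              / (fp["err_std"] + 1e-9)
        for name, fp in fingerprints.items()
    }
    return min(scores, key=scores.get)

# Hypothetical repository: two models trained on different data regimes.
basis_a = rng.normal(size=(2, 6))
basis_b = rng.normal(size=(2, 6))
repo = {
    "highway_model": make_fingerprint(
        rng.normal(size=(300, 2)) @ basis_a + 0.1 * rng.normal(size=(300, 6))),
    "city_model": make_fingerprint(
        rng.normal(size=(300, 2)) @ basis_b + 0.1 * rng.normal(size=(300, 6))),
}

# Edge data resembling the first regime should match the first fingerprint.
x_edge = rng.normal(size=(80, 2)) @ basis_a + 0.1 * rng.normal(size=(80, 6))
best = select_model(x_edge, repo)
```

The coarse-grained metadata search mentioned above would narrow the repository first; the fingerprint match then breaks ties among the remaining candidates.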
In our next blog, we will address the challenge of ensuring that application performance is maintained across varying conditions of the telco service provider(s)' network topology, especially with 5G network slices. In the case of a moving ambulance, Neural Tomography can monitor and configure the network slice dynamically, instructing the driver in real time to avoid a particular route (e.g., a congested tunnel) and redirect to an alternate route that affords access to a more appropriate 5G network slice.
References:
1. Wei-Han Lee, Steve Millman, Nirmit Desai, Mudhakar Srivatsa and Changchang Liu. NeuralFP: Out-of-distribution Detection using Fingerprints of Neural Networks. In IEEE International Conference on Pattern Recognition (ICPR), Nov 2020.
2. Nirmit Desai, Raghu Ganti, Heesung Kwon, Ian Taylor and Mudhakar Srivatsa. Unsupervised Estimation of Domain Applicability of Models. In Military Communications Conference (MILCOM), Nov 2018.