The example in this repository demonstrates how to perform open-vocabulary Class Activation Mapping (CAM), a neural network explainability technique. This is achieved via CLIP Surgery [1], a training-free modification to OpenAI's CLIP model that enables several downstream open-vocabulary tasks.
The following image illustrates the class activation map for the prompt "car" before and after applying CLIP Surgery.
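
To make the idea of an open-vocabulary activation map concrete, the following is a minimal sketch and not the code in this repository. It assumes that per-patch image embeddings and a text embedding for the prompt are already available (random placeholders are used here), and that patch tokens are stored in row-major order over a square grid. All variable names and sizes are hypothetical, and the architecture and feature modifications that CLIP Surgery [1] applies are omitted.

```matlab
% Illustrative sketch only: random placeholders stand in for CLIP features.
rng(0);                                  % reproducible placeholder data
gridSize = 14;                           % assumed number of patches per side
embedDim = 512;                          % assumed embedding dimension

patchFeatures = randn(gridSize^2, embedDim);   % one row per image patch (placeholder)
textFeature   = randn(1, embedDim);            % embedding of the prompt "car" (placeholder)

% L2-normalize so that the dot product equals the cosine similarity.
patchFeatures = patchFeatures ./ vecnorm(patchFeatures, 2, 2);
textFeature   = textFeature ./ norm(textFeature);

% Cosine similarity between every patch and the prompt.
similarity = patchFeatures * textFeature.';    % (gridSize^2)-by-1

% Arrange the patches on the image grid. Tokens are assumed to be row-major,
% while MATLAB reshapes column-major, so reshape and then transpose.
activationMap = reshape(similarity, gridSize, gridSize).';

% Min-max normalize to [0, 1] for display as a heat map.
activationMap = rescale(activationMap);

imagesc(activationMap);
axis image off;
colorbar;
```

With real features in place of the placeholders, high values in `activationMap` would mark the patches most similar to the prompt; the map would typically be upsampled to the image size before overlaying it.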

- Download the repository and extract the contents.
- Install the required products listed below.
Requires MATLAB® release R2026a or newer with the following toolboxes/add-ons:
- Computer Vision Toolbox®
- Deep Learning Toolbox®
- Computer Vision Toolbox Model for OpenAI CLIP Network
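
One possible way to check these requirements from the MATLAB command line is sketched below. It uses `isMATLABReleaseOlderThan` and `matlab.addons.installedAddons`; the assumption that the installed add-on names match the strings listed above exactly is not confirmed by this repository, so treat a "Missing" result as a prompt to check the Add-On Manager.

```matlab
% Sketch: verify the MATLAB release and look for the required add-ons.
if isMATLABReleaseOlderThan("R2026a")
    warning("This example requires MATLAB R2026a or newer.");
end

% Names as listed above; the exact strings MATLAB reports are an assumption.
requiredAddons = [ ...
    "Computer Vision Toolbox"
    "Deep Learning Toolbox"
    "Computer Vision Toolbox Model for OpenAI CLIP Network"];

installed = matlab.addons.installedAddons;   % table of installed add-ons
if isempty(installed)
    installedNames = strings(0, 1);          % nothing installed
else
    installedNames = string(installed.Name);
end
isInstalled = ismember(requiredAddons, installedNames);

% Report the status of each required product.
for k = 1:numel(requiredAddons)
    if isInstalled(k)
        fprintf("Found:   %s\n", requiredAddons(k));
    else
        fprintf("Missing: %s\n", requiredAddons(k));
    end
end
```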
Run the openVocabularyCAMviaCLIPSurgery.m live script.
The license is available in the License.txt file in this GitHub repository.
[1] Li, Yi, et al. "A closer look at the explainability of contrastive language-image pre-training." Pattern Recognition 162 (2025): 111409.
Copyright © 2026 The MathWorks, Inc.