SHAP Save
Saves a SHAP explainer object to a binary file using Joblib.
Processing
This brick saves a SHAP explainer object (a tool used to interpret machine learning models) to a file on your system. It serializes the object with the Joblib library, producing a binary file with a .shap extension. Persisting the explainer to disk lets you reload it later without repeating the expensive computation that built it. The brick also builds the filename automatically from a prefix, optional date and time stamps, and an optional version suffix, so existing files are not overwritten by accident.
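The save-and-reload cycle described above can be sketched with Joblib directly. This is a minimal illustration, not the brick's own code: `explainer_stub` is a hypothetical stand-in for a real SHAP explainer, on the assumption that any picklable object round-trips through `joblib.dump` and `joblib.load` in the same way.

```python
import pathlib
import tempfile

import joblib

# Hypothetical stand-in for a SHAP explainer (assumption: a real explainer
# is serialized by Joblib in the same way as any other picklable object).
explainer_stub = {"expected_value": 0.25, "feature_names": ["age", "income"]}

with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "shap_explainer_v1.shap"
    joblib.dump(explainer_stub, target)  # write the binary file
    restored = joblib.load(target)       # read it back later

print(restored == explainer_stub)
```

Joblib files are pickle-based, so only load .shap files that come from a source you trust.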
Inputs
- explainer
- The SHAP explainer object you wish to save. This is typically the output from a previous brick that calculated the model explanations.
- directory
- The destination folder where the file will be saved. If this directory does not exist, the brick will create it for you.
Inputs Types
| Input | Types |
|---|---|
| explainer | Any |
| directory | Str, DirectoryPath |
You can check the list of supported types here: Available Type Hints.
Outputs
- result_path
- The full file path of the saved SHAP explainer. This indicates exactly where the file was stored and can be passed to other bricks that might need to load or reference the file.
Outputs Types
| Output | Types |
|---|---|
| result_path | Str, FilePath |
You can check the list of supported types here: Available Type Hints.
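Because the output can arrive either as a string or as a path object, a downstream step may want to normalize it before use. The sketch below shows one way to do that with `pathlib`; `describe_saved_file` and the example path are hypothetical and are not part of the brick.

```python
import pathlib


def describe_saved_file(result_path):
    """Hypothetical downstream helper: accepts a str or a pathlib.Path."""
    path = pathlib.Path(result_path)  # a no-op copy if it is already a Path
    return {
        "name": path.name,
        "stem": path.stem,
        "suffix": path.suffix,
        "folder": str(path.parent),
    }


as_text = describe_saved_file("models/shap_explainer_v1.shap")
as_path = describe_saved_file(pathlib.Path("models/shap_explainer_v1.shap"))

print(as_text == as_path)  # True
print(as_text["suffix"])   # .shap
```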
Options
The SHAP Save brick exposes the following options:
- Name Prefix
- Defines the beginning of the filename. Use this to label your file meaningfully (e.g., "credit_risk_model"). The default is "shap_explainer".
- Include Date (YYYYMMDD)
- If enabled, the current date will be appended to the filename. This is useful for organizing experiments by the day they were run.
- Include Time (HHMMSS)
- If enabled, the current time (Hours, Minutes, Seconds) will be appended to the filename. This helps differentiate between multiple runs on the same day.
- Auto-Increment Version
- Controls how the brick avoids overwriting earlier saves. When enabled (default), the brick scans the directory for files with the same base name and appends the next version number (e.g., _v1, _v2) to the filename. When disabled, no suffix is added and an existing file with the same name is overwritten.
- Return as Path Object
- Determines the data type of the output. When enabled, the brick returns the path as a Python pathlib.Path object (useful for workflows that manipulate paths programmatically); otherwise it returns a plain string.
- Verbose
- When enabled (default), the brick logs its progress and any errors.
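How the naming options combine can be sketched as follows. `build_filename` is a hypothetical helper that follows the naming logic in the brick's source (prefix, then date, then time, joined by underscores, then the version suffix). The timestamp is fixed here so the output is reproducible.

```python
import datetime


def build_filename(prefix, now, include_date=False, include_time=False, version=None):
    """Hypothetical helper that follows the brick's naming rules."""
    parts = [prefix]
    if include_date:
        parts.append(now.strftime("%Y%m%d"))
    if include_time:
        parts.append(now.strftime("%H%M%S"))
    base = "_".join(p for p in parts if p)  # skip an empty prefix
    suffix = f"_v{version}" if version is not None else ""
    return f"{base}{suffix}.shap"


stamp = datetime.datetime(2024, 3, 15, 14, 30, 5)  # fixed example timestamp

print(build_filename("shap_explainer", stamp, version=1))
# shap_explainer_v1.shap
print(build_filename("credit_risk_model", stamp, include_date=True, version=2))
# credit_risk_model_20240315_v2.shap
print(build_filename("credit_risk_model", stamp, include_date=True, include_time=True))
# credit_risk_model_20240315_143005.shap
```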
import re
import logging
import pathlib
import datetime
import joblib
import shap
from coded_flows.types import Union, Str, Any, FilePath, DirectoryPath
from coded_flows.utils import CodedFlowsLogger

logger = CodedFlowsLogger(name="SHAP Save", level=logging.INFO)


def _get_next_version_index(
    directory: pathlib.Path, base_name: str, extension: str
) -> int:
    """Finds the next integer version based on existing files in the directory."""
    pattern = re.compile(f"^{re.escape(base_name)}_v(\\d+){re.escape(extension)}$")
    max_v = 0
    if directory.exists():
        for item in directory.iterdir():
            if item.is_file():
                match = pattern.match(item.name)
                if match:
                    current_v = int(match.group(1))
                    if current_v > max_v:
                        max_v = current_v
    return max_v + 1


def save_explainer(
    explainer: Any, directory: Union[Str, DirectoryPath], options=None
) -> Union[Str, FilePath]:
    options = options or {}
    verbose = options.get("verbose", True)
    custom_prefix = options.get("custom_prefix", "shap_explainer")
    include_date = options.get("include_date", False)
    include_time = options.get("include_time", False)
    use_versioning = options.get("use_versioning", True)
    return_as_pathlib = options.get("return_as_pathlib", False)
    file_extension = ".shap"
    save_dir = pathlib.Path(directory)
    result_path = None
    try:
        verbose and logger.info(f"Preparing to save SHAP explainer to: {save_dir}")
        if explainer is None:
            verbose and logger.error("Input explainer is None.")
            raise ValueError("Input explainer cannot be None.")
        # Create the destination folder (and any missing parents) if needed.
        if not save_dir.exists():
            verbose and logger.info(f"Creating output directory: {save_dir}")
            save_dir.mkdir(parents=True, exist_ok=True)
        # Build the base name: prefix, then optional date and time stamps.
        base_name_parts = [custom_prefix]
        now = datetime.datetime.now()
        if include_date:
            base_name_parts.append(now.strftime("%Y%m%d"))
        if include_time:
            base_name_parts.append(now.strftime("%H%M%S"))
        clean_base_name = "_".join(filter(None, base_name_parts))
        # With versioning enabled, always append the next _vN suffix.
        version_str = ""
        if use_versioning:
            next_v = _get_next_version_index(save_dir, clean_base_name, file_extension)
            version_str = f"_v{next_v}"
        final_filename = f"{clean_base_name}{version_str}{file_extension}"
        output_file_path = save_dir / final_filename
        verbose and logger.info(f"Saving explainer to file: {output_file_path}")
        joblib.dump(explainer, output_file_path)
        verbose and logger.info("SHAP explainer saved successfully via Joblib.")
        result_path = output_file_path
    except Exception as e:
        verbose and logger.error(f"Failed to save SHAP explainer: {e}")
        raise  # re-raise the original exception unchanged
    return result_path if return_as_pathlib else str(result_path)
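The version scan in the source above takes the highest existing version and adds one, so gaps left by deleted files are not reused. The sketch below reproduces that behaviour in a temporary directory. `next_version` is a compact, hypothetical rewrite of `_get_next_version_index`, not the brick's own function.

```python
import pathlib
import re
import tempfile


def next_version(directory, base_name, extension=".shap"):
    """Hypothetical rewrite of the scan: highest existing _vN plus one."""
    pattern = re.compile(f"^{re.escape(base_name)}_v(\\d+){re.escape(extension)}$")
    highest = 0
    for item in directory.iterdir():
        match = pattern.match(item.name) if item.is_file() else None
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1


with tempfile.TemporaryDirectory() as tmp:
    folder = pathlib.Path(tmp)
    first = next_version(folder, "shap_explainer")  # empty folder
    # Simulate earlier saves; v2 is missing and one file has another prefix.
    for name in ("shap_explainer_v1.shap", "shap_explainer_v3.shap", "other_v9.shap"):
        (folder / name).touch()
    after = next_version(folder, "shap_explainer")

print(first, after)  # 1 4
```

Files with a different base name, such as other_v9.shap here, do not affect the count, so each prefix and timestamp combination is versioned independently.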
Brick Info
- shap>=0.47.0
- joblib
- numba>=0.56.0
- shap