# KINTSUGI Troubleshooting Guide

This guide covers common issues and their solutions when installing or running KINTSUGI.

## Table of Contents

- [Installation Issues](#installation-issues)
  - [Conda Environment Creation Fails](#conda-environment-creation-fails)
  - [Package Installation Errors](#package-installation-errors)
- [Dependency Issues](#dependency-issues)
  - [libvips Not Found](#libvips-not-found)
  - [Java/JVM Issues (Legacy)](#javajvm-issues-legacy)
  - [CUDA/GPU Issues](#cudagpu-issues)
- [Runtime Issues](#runtime-issues)
  - [Import Errors](#import-errors)
  - [Memory Issues](#memory-issues)
  - [Stitched Image Quality Issues](#stitched-image-quality-issues)
  - [Registration Failures](#registration-failures)
- [HPC/SLURM Issues](#hpcslurm-issues)
- [Platform-Specific Issues](#platform-specific-issues)
  - [Windows](#windows)
  - [Linux](#linux)
  - [macOS](#macos)

---

## Installation Issues

### Conda Environment Creation Fails

**Symptom**: `conda env create` command fails with dependency conflicts.

**Solutions**:

1. **Use libmamba solver** (faster and more reliable):
   ```bash
   conda install -n base conda-libmamba-solver
   conda config --set solver libmamba
   conda env create -f envs/env-linux.yml
   ```

2. **Update conda**:
   ```bash
   conda update -n base conda
   ```

3. **Try the streamlined environment**:
   ```bash
   conda env create -f env_streamlined.yml
   ```

4. **Create minimal environment and add packages**:
   ```bash
   conda create -n KINTSUGI python=3.10
   conda activate KINTSUGI
   pip install -e .
   # Install additional dependencies as needed
   ```

### Package Installation Errors

**Symptom**: `pip install -e .` fails.

**Solutions**:

1. **Ensure setuptools is updated**:
   ```bash
   pip install --upgrade pip setuptools wheel
   ```

2. **Check Python version**:
   ```bash
   python --version  # Should be 3.10+
   ```

3. **Install with verbose output**:
   ```bash
   pip install -e . -v
   ```

---

## Dependency Issues

### libvips Not Found

**Symptom**: Error message like `cannot find library libvips` or `pyvips.GObject.Error`.

#### Windows

1. **Download from Zenodo**:
   - Download PyVips-dev-8.16 from [Zenodo](https://zenodo.org/records/14969214)
   - Extract to KINTSUGI folder

2. **Set PATH manually**:
   ```powershell
   $env:PATH = "C:\Users\[username]\KINTSUGI\vips-dev-8.16\bin;$env:PATH"
   ```

3. **Add to system PATH permanently**:
   - Open System Properties > Environment Variables
   - Add `C:\Users\[username]\KINTSUGI\vips-dev-8.16\bin` to PATH

#### Linux

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y libvips-dev

# Fedora/RHEL
sudo dnf install vips-devel

# Verify installation
vips --version
```

#### macOS

```bash
brew install vips
```

#### Verify libvips Installation

```python
import pyvips
print(f"libvips version: {pyvips.version(0)}.{pyvips.version(1)}.{pyvips.version(2)}")
```

---

### Java/JVM Issues (Legacy)

> **Note:** Java, Maven, and FIJI/CLIJ2 are no longer required. KINTSUGI now uses pure Python
> implementations for all processing. This section is kept for users of older versions.

**Symptom**: `java.lang.Exception` or `JVMNotFoundException`.

#### Check Java Installation

```bash
java -version
echo $JAVA_HOME    # Linux/macOS
echo %JAVA_HOME%   # Windows
```

#### Fix Missing Java

1. **Install via conda** (recommended):
   ```bash
   conda install openjdk=11
   ```

2. **Download from Zenodo** (Windows):
   - Download java-jdk21 from Zenodo
   - Extract to KINTSUGI folder

3. **Set JAVA_HOME**:
   ```bash
   # Linux/macOS
   export JAVA_HOME=/path/to/java
   export PATH=$JAVA_HOME/bin:$PATH

   # Windows PowerShell
   $env:JAVA_HOME = "C:\Users\[username]\KINTSUGI\java-jdk21"
   $env:PATH = "$env:JAVA_HOME\bin;$env:PATH"
   ```

#### JVM Already Started Error

**Symptom**: `JVMAlreadyStarted` exception.

This occurs when trying to start JVM multiple times. Solutions:

1. **Restart Python kernel** (in Jupyter)

2. **Single JVM initialization**:
   ```python
   import jpype
   if not jpype.isJVMStarted():
       jpype.startJVM()
   ```

---

### CUDA/GPU Issues

**Symptom**: `torch.cuda.is_available()` returns `False`.

#### Verify GPU Detection

```python
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
```

#### Solutions

1. **Check NVIDIA driver**:
   ```bash
   nvidia-smi
   ```

2. **Reinstall PyTorch with CUDA**:
   ```bash
   pip uninstall torch torchvision
   pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
   ```

3. **Verify CUDA toolkit**:
   ```bash
   nvcc --version
   ```

4. **Use CPU fallback**:
   - Most KINTSUGI operations work without GPU
   - Set device manually: `device = "cpu"`

---

## Runtime Issues

### Import Errors

**Symptom**: `ModuleNotFoundError` or `ImportError`.

#### General Solutions

1. **Verify environment is activated**:
   ```bash
   conda activate KINTSUGI
   which python  # Should show conda environment path
   ```

2. **Run dependency check**:
   ```bash
   kintsugi check
   ```

3. **Reinstall package**:
   ```bash
   pip install -e . --force-reinstall
   ```

#### Specific Import Errors

**`ImportError: cannot import name 'Valis'`**:

```python
# Correct import path
from Kreg.registration import Valis
# or via kintsugi package
from kintsugi.kreg import Valis
```

**`ModuleNotFoundError: No module named 'Kreg'`**:

```python
import sys
sys.path.insert(0, '/path/to/KINTSUGI/notebooks')
```

---

### Memory Issues

**Symptom**: `MemoryError` or process killed during processing.

#### Solutions

1. **Reduce image dimensions**:
   ```python
   config = {
       "max_image_dim_px": 2048,  # Reduce from default
       "max_processed_image_dim_px": 1024,
   }
   ```

2. **Process images in tiles**:
   - Use tiled processing for large images
   - Reduce batch size

3. **Monitor memory usage**:
   ```python
   import psutil
   print(f"Memory: {psutil.virtual_memory().percent}%")
   ```

4. **Clear pyvips cache**:
   ```python
   import pyvips
   pyvips.cache_set_max(0)
   ```

5. **Use zarr for large datasets**:
   ```python
   import zarr
   # Process in chunks
   ```

---

### Stitched Image Quality Issues

**Symptom**: Stitched images appear blank/white (saturated) or show visible tile grid patterns.

#### Diagnosing the Problem

```python
from skimage.io import imread
import numpy as np

img = imread('path/to/stitched.tif')
mean_val = img.mean()
pct_max = 100.0 * np.sum(img > 64000) / img.size
std_mean_ratio = img.std() / img.mean()

print(f"Mean: {mean_val:.0f}")
print(f"Pct at max: {pct_max:.1f}%")
print(f"Std/Mean ratio: {std_mean_ratio:.2f}")

# Saturation: pct_max > 50%
# Tile grid: std_mean_ratio > 0.8
```

#### Saturation Issues (Blank White Images)

**Cause**: BaSiC illumination correction amplifies signal excessively when flatfield values are too low.

**Solutions**:

1. **Ensure flatfield minimum is enforced**:
   ```python
   BASIC_FLATFIELD_MIN = 0.1
   flatfield_safe = np.clip(flatfield, BASIC_FLATFIELD_MIN, None)
   corrected = (tiles_norm - darkfield) / flatfield_safe
   ```

2. **Verify input normalization**:
   ```python
   # Input must be normalized to [0,1] before BaSiC correction
   tiles_norm = tiles.astype(np.float64) / 65535
   ```

3. **Reprocess affected z-planes**:
   ```bash
   python notebooks/reprocess_problematic_images.py --dry-run
   python notebooks/reprocess_problematic_images.py
   ```

#### Tile Grid Patterns (Visible Seams)

**Cause**: Insufficient blending at tile overlap boundaries or stitch model mismatch.

**Solutions**:

1. **Increase blend sigma**:
   ```python
   from kintsugi.stitch_blend import stitch_with_blending_gpu

   stitched = stitch_with_blending_gpu(
       tiles, result_df,
       sigma=10.0,  # Increase from default
       overlap_fraction=(0.3, 0.3)
   )
   ```

2. **Verify stitch model compatibility**:
   - Stitch model computed from one z-plane may not work for all z-planes
   - Use consistent reference z-plane across channels

3. **Check overlap settings**:
   - Ensure overlap_fraction matches actual tile overlap
   - Typical values: (0.2, 0.2) to (0.3, 0.3)

#### Batch Reprocessing

For multiple problematic images, use the reprocessing script:

```bash
# First scan for problems
python -c "
from Kio import scan_stitched_quality
issues = scan_stitched_quality('/path/to/stitched')
print(f'Saturated: {len(issues[\"saturated\"])}')
print(f'Tile grid: {len(issues[\"tile_grid\"])}')
"

# Then reprocess
python notebooks/reprocess_problematic_images.py
```

---

### Registration Failures

**Symptom**: Registration produces poor results or fails.

#### Check Input Images

1. **Verify image format**:
   ```python
   import tifffile
   img = tifffile.imread('image.tif')
   print(f"Shape: {img.shape}, dtype: {img.dtype}")
   ```

2. **Check image quality**:
   - Ensure images are not corrupted
   - Verify adequate contrast and features

#### Adjust Parameters

1. **Try different registrars**:
   ```python
   config = {
       "micro_rigid_registrar_cls": "RigidRegistrar",  # or "AffineRegistrar"
   }
   ```

2. **Adjust image dimensions**:
   ```python
   config = {
       "max_image_dim_px": 4096,  # Increase for more detail
   }
   ```

3. **Disable non-rigid registration**:
   ```python
   config = {
       "compose_non_rigid": False,
   }
   ```

---

## HPC/SLURM Issues

### "snakemake: command not found"

```bash
conda activate KINTSUGI
pip install "snakemake>=8.0" snakemake-executor-plugin-slurm
```

### "No workflow/config.yaml found"

Run `kintsugi workflow config .` from your project directory first.

### CuPy Appears Unavailable

**On login nodes** (no GPU hardware), `import cupy` may succeed but GPU operations fail with `cudaErrorInsufficientDriver`. Functions like `gpu.cupy_available` test **GPU hardware availability**, not package installation. CuPy IS installed; do not attempt to reinstall it.

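The difference between an installed package and usable GPU hardware can be reproduced without KINTSUGI. The following is a minimal sketch, not KINTSUGI's own implementation: `probe_cupy` is a hypothetical helper that first looks for the package with `importlib`, then asks the CUDA runtime for a device count and treats any failure (such as `cudaErrorInsufficientDriver` on a login node) as "no GPU available".

```python
import importlib.util


def probe_cupy():
    """Return (installed, available) for CuPy.

    Hypothetical helper for illustration; not part of KINTSUGI.
    """
    # Package check: locates the module without importing it or touching the GPU.
    installed = importlib.util.find_spec("cupy") is not None
    if not installed:
        return False, False

    try:
        import cupy
        # Hardware check: raises on nodes without a usable driver or device.
        available = cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        available = False
    return True, available


installed, available = probe_cupy()
print(f"CuPy installed: {installed}")
print(f"CuPy available: {available}")
```

On a login node this is expected to report the package as installed but not available; the same call on a GPU compute node should report both as true.
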
To check if CuPy is actually installed (import-only, no GPU needed):

```python
from kintsugi.gpu import get_gpu_manager
gpu = get_gpu_manager()
print(f"CuPy installed: {gpu.cupy_installed}")  # True if package is importable
print(f"CuPy available: {gpu.cupy_available}")  # True only if GPU hardware present
```

### Jobs Pending Indefinitely (QOS Limits)

Check your account allocations:

```bash
sacctmgr show associations user=$(whoami) format=account,partition,qos,grptres -n -P
```

Reduce `-j` when running `kintsugi workflow run` or set `jobs:` lower in `profiles/slurm/config.yaml`.

### SLURM Jobs Fail with OOM (Out of Memory)

GPU jobs use CuPy (float32 in GPU memory), needing ~48 GB CPU RAM. CPU jobs use SciPy (float64 in system memory), needing ~128 GB. Adjust memory in `workflow/config.yaml`:

```yaml
resources:
  mem_stitch: 48000      # MB, GPU jobs
  mem_decon: 48000
  cpu_mem_decon: 128000  # MB, CPU jobs
```

### "Missing output files after job completion" (NFS Latency)

Increase latency wait in `workflow/profiles/slurm/config.yaml`:

```yaml
latency-wait: 300  # Increased from 120 to 300 seconds
```

### "CUDA initialization failed" in Job Logs

This is expected on CPU-only nodes. Scripts automatically fall back to CPU mode via `KINTSUGI_DEVICE_MODE`. To ensure jobs land on GPU nodes, verify the partition setting in `workflow/config.yaml` points to a GPU partition (e.g., `hpg-b200`, `hpg-turin`).

### "srun: fatal: SLURM_TRES_PER_TASK is mutually exclusive"

SLURM >= 24.11 sets `SLURM_TRES_PER_TASK` in GPU job environments, which conflicts with the Snakemake jobstep plugin's `srun` call. Fix by patching the jobstep plugin:

```python
# In snakemake_executor_plugin_slurm_jobstep/__init__.py, add to __post_init__():
import os
os.environ.pop("SLURM_TRES_PER_TASK", None)
```

**This patch must be re-applied after any pip upgrade of `snakemake-executor-plugin-slurm-jobstep`.**

### Stitch Model Not Found for CH2+

Channel 1 computes the stitching model used by all other channels. If CH1 fails, subsequent channels fail with "No stitch model." Check the CH1 stitching log first:

```bash
tail /path/to/project/slurm/logs/snakemake/stitch_cyc01.log
```

---

## Platform-Specific Issues

### Windows

#### Long Path Issues

**Symptom**: `FileNotFoundError` with long paths.

**Solution**:

1. Enable long paths in Windows:
   - Run `gpedit.msc`
   - Navigate to: Computer Configuration > Administrative Templates > System > Filesystem
   - Enable "Enable Win32 long paths"

2. Or move KINTSUGI to shorter path (e.g., `C:\KINTSUGI`)

#### DLL Load Failures

**Symptom**: `ImportError: DLL load failed`.

**Solutions**:

1. Install Visual C++ Redistributable
2. Verify all Zenodo dependencies are extracted
3. Add paths to system PATH

---

### Linux

#### Permission Issues

**Symptom**: Permission denied errors.

**Solutions**:

```bash
# Fix permissions
chmod +x scripts/install.sh
chmod -R u+rw KINTSUGI/

# Don't run as root
# If needed, use:
sudo chown -R $USER:$USER KINTSUGI/
```

#### Display Issues (Napari)

**Symptom**: Napari doesn't display.

**Solutions**:

```bash
# Check display
echo $DISPLAY

# For headless servers, use Xvfb
Xvfb :99 -screen 0 1024x768x24 &
export DISPLAY=:99
```

---

### macOS

#### Apple Silicon (M1/M2)

**Symptom**: Package incompatibility on ARM.

**Solutions**:

1. Use Rosetta 2 for x86 compatibility:
   ```bash
   arch -x86_64 conda create -n KINTSUGI python=3.10
   ```

2. Use native ARM packages where available:
   ```bash
   conda config --add channels apple
   ```

#### Gatekeeper Blocks

**Symptom**: "Cannot be opened because the developer cannot be verified".

**Solution**:

```bash
xattr -d com.apple.quarantine /path/to/file
```

---

## Getting Help

If you're still experiencing issues:

1. **Run full diagnostics**:
   ```bash
   kintsugi check --verbose > diagnostics.txt
   python -c "import sys; print(sys.version)"
   conda list > packages.txt
   ```

2. **Check GitHub Issues**: [https://github.com/smith6jt-cop/KINTSUGI/issues](https://github.com/smith6jt-cop/KINTSUGI/issues)

3. **Create a new issue** with:
   - Operating system and version
   - Python version
   - Error message (full traceback)
   - Steps to reproduce
   - Output of `kintsugi check`

---

## Quick Reference

| Issue | Quick Fix |
|-------|-----------|
| libvips not found | Windows: Download from Zenodo; Linux: `apt install libvips-dev` |
| GPU not detected | Check `nvidia-smi`; `kintsugi install gpu` |
| CuPy unavailable on login node | Normal: no GPU on login nodes; test on compute node |
| Import errors | `conda activate KINTSUGI && pip install -e .` |
| Memory errors | Reduce `max_image_dim_px` in config |
| Blank white stitched images | Check BaSiC flatfield min (0.1); run `reprocess_problematic_images.py` |
| Tile grid in stitched images | Increase blend sigma (10.0); verify stitch model compatibility |
| Snakemake not found | `pip install "snakemake>=8.0" snakemake-executor-plugin-slurm` |
| SLURM OOM kill | GPU: 48 GB RAM; CPU: 128 GB RAM; increase in `config.yaml` |
| SLURM_TRES_PER_TASK error | Patch jobstep plugin (see HPC section above) |
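
The operating system and Python details requested under Getting Help can be collected in one step. This is a minimal sketch using only the Python standard library; `environment_report` is a hypothetical helper, not a KINTSUGI command, and it does not replace the output of `kintsugi check`.

```python
import platform
import sys


def environment_report():
    """Return OS and Python details as text for pasting into an issue.

    Hypothetical helper for illustration; not part of KINTSUGI.
    """
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Machine: {platform.machine()}",
        f"Python: {platform.python_version()}",
        f"Executable: {sys.executable}",
    ]
    return "\n".join(lines)


print(environment_report())
```

The executable path helps confirm that the `KINTSUGI` conda environment is the interpreter actually in use.
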