[2606.04922] Geometry-Aware Distillation for Prompt Tuning Biomedical Vision-Language Models
Abstract page for arXiv paper 2606.04922: Geometry-Aware Distillation for Prompt Tuning Biomedical Vision-Language Models
Abstract page for arXiv paper 2606.03730: Beyond False Stability: High-Noise Drift Gating for Test-Time Adversarial Defenses in Vision-Language Models
Abstract page for arXiv paper 2606.02742: Consistent Yet Wrong: Evidence Insensitivity in Spatial Vision-Language Models
Abstract page for arXiv paper 2606.01847: The Lie We Tell: Correcting the Euclidean Fallacy in Vision Language Action Policies via Score Matching on Tangent Spa...
Abstract page for arXiv paper 2606.02273: Vision-language Models for Driver Monitoring Systems: A Driver Activity Description Dataset
Abstract page for arXiv paper 2606.01612: Self-Improving Small Object Grounding in LVLMs
Abstract page for arXiv paper 2605.30713: Diversity Matters: Revisiting Test-Time Compute in Vision-Language Models
Abstract page for arXiv paper 2605.30716: Simple Token-Efficient Vision-Language Model for Case-level Pathology Synoptic Report Generation
Abstract page for arXiv paper 2605.31196: Probing Collision Grounding in Vision-Language Models for Safe Human-Robot Collaboration
Abstract page for arXiv paper 2605.31349: FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection
Abstract page for arXiv paper 2605.31556: Vision-Language Models Suppress Female Representations Under Ambiguous Input
Abstract page for arXiv paper 2605.29438: ElegantVLA: Learning When to Think for Efficient Vision-Language-Action Models
Abstract page for arXiv paper 2605.29585: World Models in Words: Auditing Physical State-Transition Commitments in Vision-Language Models
Abstract page for arXiv paper 2605.29881: Mitigating Hallucination in Vision-Language Models through Barrier-Regulated Adaptive Closed-form Steering
Abstract page for arXiv paper 2605.27894: Towards Unified Vision-Language Models with Incomplete Multi-Modal Inputs
Abstract page for arXiv paper 2605.28051: Beyond Surrogate Gradients: Fully Differentiable Token Pruning for Vision-Language Models
Abstract page for arXiv paper 2605.28346: When Discourse Pressures Conflict: Information Structure in Vision-Language Model Outputs