- Grounding Language with Vision: A Conditional Mutual Information . . .
To alleviate this issue, we introduce a novel Conditional Pointwise Mutual Information (C-PMI) calibrated decoding strategy, which adaptively strengthens the mutual dependency between generated texts and input images to mitigate hallucinations
- Visual Description Grounding Reduces Hallucinations and Boosts . . .
TL;DR: We propose a novel method to reduce hallucinations and improve LVLM performance on cognitive prompts requiring reasoning and knowledge retrieval. Large Vision-Language Models (LVLMs) often produce responses that misalign with factual information, a phenomenon known as hallucinations.
- Grounding Language with Vision: A Conditional Mutual Information . . .
We revisit the hallucination mitigation problem in LVLMs from an information-theoretic perspective, where we reformulate it as a conditional mutual information maximization problem and introduce a novel bi-level optimization-based solution framework
- Multi-Modal Hallucination Control by Visual Information Grounding
We introduced M3ID, a new approach designed to combat multi-modal hallucinations by maximizing the mutual information between the text generated by VLMs and the corresponding visual context.
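- [Paper review] Grounding Language with Vision: A Conditional Mutual Information . . .
To address this problem, the paper proposes a new Conditional Mutual Information-aware adaptive Vision-Language Decoding (CMI-VLD) strategy. Taking an information-theoretic view, it strengthens cross-modal association by maximizing the conditional mutual information (CMI) between the visual input and the generated text, thereby mitigating hallucinations.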
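Several of the entries above describe decoding strategies that strengthen the dependency between generated text and the input image. As a rough illustration of that general idea, the sketch below re-scores next-token candidates with a pointwise mutual information term, `log p(y | image, text) - log p(y | text)`. It is not the C-PMI, CMI-VLD, or M3ID algorithm from any of the listed papers: the function names, the weight `lam`, and the toy logits are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def pmi_calibrated_scores(logits_with_image, logits_text_only, lam=1.0):
    """Hypothetical re-scoring: image-conditioned log-prob plus lam * PMI.

    PMI here is log p(y | image, text) - log p(y | text), i.e. how much
    the image raises the probability of each candidate token.
    """
    lp_img = log_softmax(logits_with_image)
    lp_txt = log_softmax(logits_text_only)
    return [a + lam * (a - b) for a, b in zip(lp_img, lp_txt)]


def greedy_pick(scores):
    """Index of the highest-scoring candidate token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example (made-up logits): token 1 is favored mostly by the language
# prior, while token 0 is the one the image actually supports.
with_image = [2.0, 2.1, 0.0]
text_only = [0.0, 3.0, 0.0]

plain_choice = greedy_pick(pmi_calibrated_scores(with_image, text_only, lam=0.0))
calibrated_choice = greedy_pick(pmi_calibrated_scores(with_image, text_only, lam=1.0))
```

With `lam=0.0` the score reduces to ordinary image-conditioned greedy decoding; a positive `lam` penalizes tokens that the text-only prior already predicts well, which is the intuition behind mutual-information-based hallucination mitigation.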