Hugging Face: ConTextual benchmark: evaluating context-sensitive text-rich visual reasoning in multimodal models | SignalBreak