Just as human eyes tend to focus on pictures before reading the accompanying text, multimodal artificial intelligence (AI), which processes multiple types of sensory data at once, also tends to rely more heavily on certain types of data. KAIST researchers have now developed a multimodal AI training technique that lets models weigh text and images more evenly, yielding far more accurate predictions.
Multimodal AI learns to weigh text and images more evenly
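The article does not spell out how the KAIST method achieves this balance, so the following is only an illustrative sketch of the general idea, not the team's actual technique. One common remedy for over-reliance on a dominant modality is to randomly zero out one input stream during training ("modality dropout") so the model is forced to learn from both. All class, function, and dimension names in this PyTorch sketch are hypothetical.

```python
# Illustrative sketch only: a toy fusion classifier that applies modality
# dropout so it cannot lean exclusively on the (often dominant) image stream.
# This is NOT the KAIST method, just one generic balancing strategy.
import torch
import torch.nn as nn

class BalancedFusionClassifier(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, hidden=256, n_classes=10,
                 drop_modality_p=0.3):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)  # project image features
        self.txt_proj = nn.Linear(txt_dim, hidden)  # project text features
        self.head = nn.Linear(hidden, n_classes)
        self.drop_modality_p = drop_modality_p

    def forward(self, img_feat, txt_feat):
        img = torch.relu(self.img_proj(img_feat))
        txt = torch.relu(self.txt_proj(txt_feat))
        if self.training and torch.rand(1).item() < self.drop_modality_p:
            # Zero out one modality at random so the classifier must
            # extract usable signal from whichever stream remains.
            if torch.rand(1).item() < 0.5:
                img = torch.zeros_like(img)
            else:
                txt = torch.zeros_like(txt)
        return self.head(img + txt)  # simple additive fusion

# Usage with stand-in features from hypothetical image and text encoders.
model = BalancedFusionClassifier()
img_feat = torch.randn(4, 512)   # e.g., ViT-style image embeddings
txt_feat = torch.randn(4, 768)   # e.g., BERT-style text embeddings
logits = model(img_feat, txt_feat)
print(logits.shape)  # torch.Size([4, 10])
```

In this toy setup, occasionally hiding one modality during training keeps the classifier from ignoring the other, which mirrors the imbalance problem the article describes.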
