Just as human eyes tend to focus on pictures before reading accompanying text, multimodal artificial intelligence (AI)—which processes multiple types of sensory data at once—also tends to depend more heavily on certain types of data. KAIST researchers have now developed a new multimodal AI training technology that enables models to recognize both text and images evenly, yielding far more accurate predictions.
Multimodal AI learns to weigh text and images more evenly
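The article does not describe the details of KAIST's training technology. One common family of approaches to the problem it addresses is to rebalance each modality's contribution to the training loss so the model cannot lean on a single dominant input. The sketch below is purely illustrative of that general idea and is not the researchers' method; the function names `balanced_weights` and `combined_loss` are hypothetical.

```python
# Illustrative sketch only: upweight the modality the model is currently
# neglecting (the one with the higher loss), so training pressure shifts
# toward the under-learned input instead of the dominant one.
# This is NOT the KAIST method described in the article.

def balanced_weights(image_loss, text_loss):
    """Return per-modality weights proportional to each modality's share
    of the total loss, so the lagging modality gets more emphasis."""
    total = image_loss + text_loss
    if total == 0:
        return 0.5, 0.5  # both branches fully trained; weight evenly
    return image_loss / total, text_loss / total

def combined_loss(image_loss, text_loss):
    """Blend the two losses using the balancing weights."""
    w_img, w_txt = balanced_weights(image_loss, text_loss)
    return w_img * image_loss + w_txt * text_loss
```

For example, if the text branch is learning well (loss 1.0) while the image branch lags (loss 3.0), the image branch receives weight 0.75 and the text branch 0.25, nudging subsequent updates toward the neglected modality.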