When people watch the scene in the film “Jurassic Park” where a giant dinosaur walks toward them, they naturally imagine a heavy, rumbling sound, as if the ground were shaking. This is because humans predict sound by considering not only an object's shape but also physical properties such as its size, weight, and speed of movement. Existing video-to-audio generation AI, however, generates sound mainly from the category of objects or the scene information in the video, and does not sufficiently reflect physical properties that vary with weight or speed.
Physics-aware AI generates more realistic sounds by estimating mass and velocity from video
