Cross-Modal Understanding

Artificial intelligence links what it sees with what it reads and hears. By combining vision, text, and sound, it builds a deeper understanding of the world.