Cross-Modal World Modeling with HY-Himmel: Unifying Video, Text, and Sensor Streams for Embodied AI