Technical Innovations
Extends R1's reinforcement learning (RL) framework to multimodal inputs
Rule-Based Rewards: 128 objective scoring metrics for visual reasoning
3D Attention Gates: Fuse visual/textual tokens through volumetric processing
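
To make the rule-based reward idea concrete, here is a minimal Python sketch of how objective rules might be combined into a single scalar reward for RL training. It is an illustration only, not the model's actual implementation: the rule names, the <answer> tag format, and the weights are all assumptions, and only two hypothetical rules stand in for the 128 metrics mentioned above.

```python
import re
from typing import Callable, Dict

# A rule maps (model_output, reference_answer) to a score in [0, 1].
Rule = Callable[[str, str], float]

# Assumed output format: the final answer is wrapped in <answer>...</answer>.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(output: str) -> str:
    """Return the text inside the first <answer> tag, or "" if absent."""
    match = ANSWER_PATTERN.search(output)
    return match.group(1).strip() if match else ""


def format_rule(output: str, reference: str) -> float:
    """Hypothetical rule: reward outputs that follow the answer-tag format."""
    return 1.0 if ANSWER_PATTERN.search(output) else 0.0


def exact_match_rule(output: str, reference: str) -> float:
    """Hypothetical rule: reward answers that match the reference exactly."""
    answer = extract_answer(output).lower()
    return 1.0 if answer and answer == reference.strip().lower() else 0.0


# Illustrative rule set and weights (assumptions, not published values).
RULES: Dict[str, Rule] = {"format": format_rule, "exact_match": exact_match_rule}
WEIGHTS: Dict[str, float] = {"format": 0.2, "exact_match": 0.8}


def rule_based_reward(output: str, reference: str) -> float:
    """Weighted average of all rule scores, normalized to [0, 1]."""
    total_weight = sum(WEIGHTS[name] for name in RULES)
    weighted = sum(WEIGHTS[name] * rule(output, reference)
                   for name, rule in RULES.items())
    return weighted / total_weight


if __name__ == "__main__":
    print(rule_based_reward("<answer>cat</answer>", "cat"))
    print(rule_based_reward("<answer>dog</answer>", "cat"))
    print(rule_based_reward("It is a cat.", "cat"))
```

Because every rule is a deterministic check rather than a learned judge, the reward is cheap to compute and hard to game through stylistic tricks, which is the usual motivation for rule-based rewards in RL pipelines.
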
Benchmark Shockers
Stanford IQ-VQA: IQ score of 148 (human average: 100)
Medical Imaging: 97.3% tumor detection vs. 92.1% for radiologists
Autonomous Driving: 0.01 disengagements/km (one per 100 km) in Shanghai robotaxi trials
Controversy
Ethicists warn that "Artificial Intuition" could surpass human pattern recognition in military targeting systems.