Despite remarkable technological progress in recent decades, latency remains an issue for today’s technologies and their applications. To better understand how latency and the resulting feedback delays affect the interaction between humans and cyber-physical systems (CPS), the present study examines the separate and joint effects of visual and auditory feedback delays of 200 ms on performance (speed, accuracy) and motor control strategy (movement kinematics) in a complex visuomotor task. Visual feedback delays slowed movement execution and impaired precision, whereas delayed auditory feedback improved performance (i.e., increased precision) relative to a condition without feedback delays. Descriptively, this latter finding appeared mainly in the condition with joint (congruent) visual and auditory feedback delays. In this regard, we discuss the role of temporal congruency of audiovisual information as well as potential compensatory mechanisms, which can inform the design of multisensory feedback for human-CPS interaction subject to latency.