When data movement is delayed, even the fastest compute engines are left waiting, reducing throughput, increasing latency, ...