I'm experiencing the same issue on ARTIQ 7.
I load a 100k × 32-bit array using very simple kernel code (without the long array, everything took <0.1 s in total).
This results in a 0.9-second dead time, while the sequence itself is set to run for 1 second (so it repeats every 1.9 s).
I added a timestamp printer inside artiq.coredevice.core._run_compiled, and here’s what I observed:
self.comm.load() takes about 0.28 seconds, which is already quite a lot. Even at 100 Mb/s, it should be significantly faster.
self.comm.run() takes 0.0001 seconds. I'm not sure what it does; my guess is that it covers compiling the kernel code for the FPGA and setting the run flag. In any case it is fine, since it takes essentially no time.
self.comm.serve() takes 1.17 seconds, which includes 0.17 seconds of unexplained idle time at the end of the sequence (there are two RPC calls at the end, and the second one takes up this time). What’s going on here?
On top of that, about 0.4 s goes to something else! What is this time, spent before load, used for?
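For anyone who wants to reproduce the measurement, here is a minimal sketch of this kind of timestamp printer. It is not the actual ARTIQ source: `FakeComm` is a hypothetical stand-in so the snippet runs without hardware, `stamp` and `run_compiled_timed` are names I made up, and the argument names are my assumption about what `_run_compiled` passes to load/run/serve.

```python
import time
from contextlib import contextmanager


@contextmanager
def stamp(label, log=print):
    """Report how long the wrapped block took, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log(f"{label}: {time.perf_counter() - start:.4f} s")


class FakeComm:
    """Hypothetical stand-in for the real comm object (no hardware needed)."""

    def load(self, kernel_library):
        time.sleep(0.01)  # placeholder for uploading the kernel

    def run(self):
        pass  # placeholder for starting the kernel

    def serve(self, embedding_map, symbolizer, demangler):
        time.sleep(0.02)  # placeholder for serving RPCs until the kernel ends


def run_compiled_timed(comm, kernel_library, embedding_map,
                       symbolizer, demangler, log=print):
    """Call load/run/serve in order and time each one separately.

    The argument names are assumptions, not copied from ARTIQ.
    """
    with stamp("load", log):
        comm.load(kernel_library)
    with stamp("run", log):
        comm.run()
    with stamp("serve", log):
        comm.serve(embedding_map, symbolizer, demangler)
```

Calling `run_compiled_timed(FakeComm(), b"", None, None, None)` prints one line per phase. In the real case the same `with stamp(...)` wrappers would go around the three calls inside `_run_compiled`.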
Thank you!
P.S. The total dead time, and each of the individual times above, scales roughly linearly with the data size.
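To put the numbers above side by side, here is a rough back-of-the-envelope calculation. It assumes the payload is just the raw array (100k words of 32 bits, ignoring the kernel binary and any protocol overhead) and that the link really runs at a nominal 100 Mb/s; both are assumptions, so treat the results as order-of-magnitude only.

```python
# Assumed payload: the raw array only, no kernel binary or protocol overhead.
N_WORDS = 100_000
BITS_PER_WORD = 32
LINK_BPS = 100e6  # nominal 100 Mb/s link (assumption)

payload_bits = N_WORDS * BITS_PER_WORD
payload_bytes = payload_bits // 8

# Measured values quoted in this post, in seconds.
period_s = 1.9      # repetition period
sequence_s = 1.0    # intended sequence length
load_s = 0.28
run_s = 0.0001
serve_s = 1.17

ideal_load_s = payload_bits / LINK_BPS    # transfer time at full link speed
effective_bps = payload_bits / load_s     # throughput implied by load()

dead_s = period_s - sequence_s                        # total dead time
idle_in_serve_s = serve_s - sequence_s                # idle part of serve()
unaccounted_s = period_s - (load_s + run_s + serve_s) # time outside the 3 calls

print(f"payload: {payload_bytes} bytes")
print(f"ideal load time: {ideal_load_s * 1e3:.1f} ms")
print(f"effective load throughput: {effective_bps / 1e6:.1f} Mb/s")
print(f"dead time: {dead_s:.2f} s")
print(f"idle inside serve: {idle_in_serve_s:.2f} s")
print(f"outside load/run/serve: {unaccounted_s:.2f} s")
```

Under these assumptions the array is 400 kB, which would take about 32 ms at 100 Mb/s, so the measured 0.28 s for load corresponds to roughly 11 Mb/s. The 0.9 s of dead time then splits into about 0.28 s (load), 0.17 s (idle in serve) and 0.45 s outside the three calls.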