I'm trying to understand the behavior of asynchronous RPC functions better. This basic example successfully executes two loops and then throws an RTIO Underflow:

from artiq.experiment import *
from time import sleep

class Example(EnvExperiment):

    def build(self):
        self.setattr_device('core')
        self.setattr_device('ttl0')

    def prepare(self):
        pass

    @rpc(flags={'async'})
    def host_function(self, i) -> TNone:
        sleep(0.5)
        print("Executing on the host.")
        print(i)

    @kernel
    def run(self):
        self.core.reset()

        # Delay a bit
        delay(10*us)

        # Determine the starting time
        t_start = now_mu()

        # Enter the loop
        for i in range(10):

            at_mu(t_start)

            self.ttl0.pulse(20*us)
            print("at loop")
            print(i)
            self.host_function(i)

            t_start += self.core.seconds_to_mu(0.2)
            self.core.wait_until_mu(t_start - self.core.seconds_to_mu(0.0001))
at loop
0
Executing on the host.
0
at loop
1
Executing on the host.
1

ERROR:worker(1773,test.py):root:Terminating with exception (RTIOUnderflow: RTIO underflow at 18076965246240 mu, channel 9, slack -124180360 mu)
Core Device Traceback (most recent call last):
  File "C:\Users\Admin\Documents\artiq\experiments/test.py", line 35, in artiq_worker_C:\Users\Admin\Documents\artiq\experiments\test.Example.run(..., ...) (RA=+0x624)
    self.ttl0.pulse(20*us)
  File "<artiq>\coredevice/ttl.py", line 201, in ... artiq.coredevice.ttl.TTLInOut.pulse<artiq.coredevice.ttl.TTLInOut>(...) (inlined)
    self.on()
  File "<artiq>\coredevice/ttl.py", line 173, in ... artiq.coredevice.ttl.TTLInOut.on<artiq.coredevice.ttl.TTLInOut>(...) (inlined)
    self.set_o(True)
  File "<artiq>\coredevice/ttl.py", line 163, in ... artiq.coredevice.ttl.TTLInOut.set_o<artiq.coredevice.ttl.TTLInOut>(...) (inlined)
    rtio_output(self.target_o, 1 if o else 0)
  File "ksupport/rtio.rs", line 67, column 14, in (Rust function)
    <unknown>
                 ^
artiq.coredevice.exceptions.RTIOUnderflow(0): RTIO underflow at 18076965246240 mu, channel 9, slack -124180360 mu
Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\envs\artiq\lib\site-packages\artiq\master\worker_impl.py", line 290, in main
    exp_inst.run()
  File "C:\Users\Admin\anaconda3\envs\artiq\lib\site-packages\artiq\language\core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "C:\Users\Admin\anaconda3\envs\artiq\lib\site-packages\artiq\coredevice\core.py", line 137, in run
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "C:\Users\Admin\anaconda3\envs\artiq\lib\site-packages\artiq\coredevice\comm_kernel.py", line 505, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "C:\Users\Admin\anaconda3\envs\artiq\lib\site-packages\artiq\coredevice\comm_kernel.py", line 497, in _serve_exception
    raise python_exn
artiq.coredevice.exceptions.RTIOUnderflow: RTIO underflow at 18076965246240 mu, channel 9, slack -124180360 mu
ERROR:master:artiq.master.scheduler:got worker exception in run stage, deleting RID 1773

There are two things I don’t understand here:

  1. I would have expected the second at loop print statement to occur before the first Executing on the host. print statement, as the second at loop print statement should occur 0.4 seconds after the run start time and the first Executing on the host. print statement should happen 0.5 seconds after the run start time.
  2. I do not get an RTIO Underflow if I remove the sleep(0.5) statement from host_function. This behavior is confusing to me since host_function should execute asynchronously and have no effect on the timing inside the kernel function.

I expect that I’m making some wrong assumptions about the behavior of RPC functions. What am I missing/assuming incorrectly?

In case it is relevant, this is on a system running ARTIQ-5.

    Hi jroth,

    It seems like the async RPC call in point(1) is blocking because it waits until it is completed. The manual mentions this behavior here: Asynchronous RPCs. If that's the case, it should result in point(2) as the delay accumulates, causing an RTIO underflow error. From my understanding, async RPCs can still indirectly interfere with RTIO timing if they take longer than the intervals between scheduled RTIO events. This might be why commenting out the sleep call or setting it to a time faster than the kernel delay prevents the underflow.

      Thanks @fsagbuya! I don't think I understand that part of the manual. It states

      If an RPC returns no value, it can be invoked in a way that does not block until the RPC finishes execution, but only until it is queued.

      I understand the first part--that would be consistent with what I had assumed before that an asynchronous RPC call does not block. However I do not understand what "only until it is queued" means here. I haven't seen any documentation about the RPC queue before--could you point me to where I might learn more?

      Thanks!

        jroth

        However I do not understand what "only until it is queued" means here. I haven't seen any documentation about the RPC queue before--could you point me to where I might learn more?

        This is part of the internality of ARTIQ and there's no explicit documentation on it. You could learn more by reading through the source code if you're so inclined -- e.g. rpc_send_async, _serve_rpc, etc. The gist is that outgoing communications are placed in a queue to be sent over the network, but as regards what fsagbuya said, I think the salient part is that even when we don't wait for a return value, simply making the RPC call (even if this is queuing it and not sending it) still consumes a nonzero amount of time.

        It is worth mentioning that print() is in effect (when executed in a standard kernel) an RPC itself -- after all, you're communicating "at loop" and i to the host to be printed -- and I'm not sure if it's by default an asynchronous one. That may be related to what's happening here...?

        Hi @architeuthis, thanks--that makes sense! When I remove the print statement in the kernel method the RPC method executes without producing any error.