- Edited
The problem:
artiq_coremgmt -s 0 log
succeeds, but artiq_coremgmt -s 1 log
fails with OSError: Incorrect reply from device: Reply.Error (expected Reply.LogContent)
and the following lines appear in the master's UART log:
[ 72.712499s] INFO(runtime::mgmt): received connection
[ 72.726959s] ERROR(runtime::mgmt::remote_coremgmt): aux packet error (link down)
[ 72.734340s] WARN(runtime::mgmt): connection terminated: DrtioError(LinkDown)
Moreover, neither the master's nor the satellite's log contains any mention of attempting to establish a link with each other.
For a full picture of what's going on, see the full UART logs further below.
This behavior is 100% reproducible no matter:
- which fiber/cable/transceivers we use to connect
SFP0
(master) toSFP0
(satellite), - which power supply we use for either the master or the satellite,
- which SD card we use for either the master or the satellite,
- which of the 2 Kasli SoC boards is master and which is satellite.
Edit from later
We transferred the SD cards from these two Kasli SoCs to two other Kasli SoCs and there the DRTIO links were established automatically and artiq_coremgmt -s 1 log
succeeds. Could the former two Kasli SoCs have a hardware problem?
Our system:
We have two Kasli SoC v1.1.1 boards flashed with artiq-zynq commit 63157588bb8ed214bc9f24ade2d09b4ddefada03 from 2025_01_15:
- master with 12x DIO_SMA peripherals
(We have tested both with and without actual peripherals plugged into its EEM sockets.) - satellite with 2x Fastino peripherals
(We have tested both with and without actual peripherals plugged into its EEM sockets.) SFP0
(master) is connected toSFP0
(satellite).- We communicate with the master via Ethernet using artiq commit d1177836f2ab7248befecd5403041b2ca2e36746 from 2025_02_07.
Full UART log of master
Contains responses to 1 successful artiq_coremgmt -s 0 log
followed by 3 failed artiq_coremgmt -s 1 log
:
[ 0.019996s] INFO(szl): Simple Zynq Loader starting...
[ 0.025201s] DEBUG(libboard_zynq::clocks::source): Set ARM_PLL to 2000000000 Hz
[ 0.007041s] DEBUG(libboard_zynq::clocks::source): Set IO_PLL to 1000000000 Hz
[ 0.016260s] DEBUG(libboard_zynq::clocks::source): Set DDR_PLL to 1066666666 Hz
[ 0.023531s] DEBUG(libboard_zynq::ddr): DDR 3x/2x clocks: 533333328/355555552
[ 0.030752s] DEBUG(libboard_zynq::ddr): DDR DCI clock: 10062892 Hz (divisors=2*53)
[ 0.042001s] DEBUG(libboard_zynq::sdio): Reset SDIO!
[ 0.046948s] DEBUG(libboard_zynq::sdio): Changing clock frequency to 400000
[ 0.053896s] INFO(szl): Card inserted. Mounting file system.
[ 0.073372s] DEBUG(libboard_zynq::sdio): Changing clock frequency to 25000000
[ 0.080427s] DEBUG(libboard_zynq::sdio::sd_card): Getting bus width
[ 0.086942s] DEBUG(libboard_zynq::sdio::sd_card): 4 bit support
[ 0.092839s] DEBUG(libboard_zynq::sdio::sd_card): Changing bus width
[ 0.101001s] DEBUG(libboard_zynq::sdio): Set block size to 512
[ 0.107352s] DEBUG(libconfig::sd_reader): Partition ID: C
[ 0.113805s] INFO(szl): Loading gateware
[ 0.118308s] DEBUG(libconfig::bootgen): Partition header pointer = C80
[ 0.125112s] DEBUG(libconfig::bootgen): Unencrypted length = EB1B0
[ 0.131274s] DEBUG(libconfig::bootgen): Partition start address: B360
[ 0.562538s] DEBUG(libboard_zynq::devc): Invalidate DCache for bitstream buffer
[ 0.576735s] DEBUG(libboard_zynq::devc): Init preload FPGA
[ 0.582200s] DEBUG(libboard_zynq::devc): Toggling PROG_B
[ 0.617489s] DEBUG(libboard_zynq::devc): Waiting for done
[ 0.622867s] DEBUG(libboard_zynq::devc): Init postload FPGA
[ 0.628343s] INFO(szl): Loading runtime
[ 0.632546s] DEBUG(libconfig::bootgen): Partition header pointer = C80
[ 0.639350s] DEBUG(libconfig::bootgen): Unencrypted length = AD9D
[ 0.645427s] DEBUG(libconfig::bootgen): Unencrypted length = 43A2C
[ 0.651588s] DEBUG(libconfig::bootgen): Partition start address: F6510
[ 0.782958s] INFO(szl): Preparing for runtime execution
[ 0.788597s] INFO(szl): executing payload
[ 0.000067s] INFO(runtime): NAR3/Zynq7000 starting...
[ 0.005248s] INFO(runtime): gateware ident: 9+unknown;crate_mqva_dio_kasli_soc_v1.1_artiq_zynq_master_631575
[ 0.020340s] INFO(libboard_zynq::i2c): PCA9548 detected
[ 0.170589s] WARN(runtime::rtio_clocking): error reading configuration. Falling back to default.
[ 0.179442s] WARN(runtime::rtio_clocking): Using default configuration - internal 125MHz RTIO clock.
[ 0.188643s] INFO(runtime::rtio_clocking): using internal 125MHz RTIO clock
[ 0.579377s] INFO(libboard_artiq::si5324): waiting for Si5324 lock...
[ 5.830667s] INFO(libboard_artiq::si5324): ...locked
[ 5.885989s] INFO(runtime::rtio_clocking): SYS CLK switched successfully
[ 5.897953s] INFO(libboard_zynq::i2c): PCA9548 detected
[ 5.933535s] INFO(runtime::comms): network addresses: MAC=fc-0f-e7-0f-90-ef IPv4=192.168.1.30 IPv6-LL=fe80::fe0f:e7ff:fe0f:90ef IPv6: no configured address
[ 5.960253s] INFO(libboard_artiq::drtio_routing): could not read routing table from configuration, using default
[ 5.970504s] INFO(libboard_artiq::drtio_routing): routing table: RoutingTable { 0: 0; 1: 1 0; 2: 2 0; 3: 3 0; 4: 4 0; }
[ 5.982697s] INFO(runtime::rtio_mgt): SED spreading disabled by default
[ 5.990587s] WARN(ksupport): error reading device map (Configuration key `device_map` not found), device names will not be available in RTIO error messages
[ 6.006015s] INFO(runtime::rtio_mgt::drtio): [DEST#0] destination is up
[ 6.014044s] INFO(runtime::comms): no idle kernel found
[ 10.505082s] INFO(libboard_zynq::eth): eth: got Link { speed: S1000, duplex: Full }
[ 15.243863s] INFO(runtime::mgmt): received connection
[ 19.328225s] INFO(runtime::mgmt): received connection
[ 19.343072s] ERROR(runtime::mgmt::remote_coremgmt): aux packet error (link down)
[ 19.350457s] WARN(runtime::mgmt): connection terminated: DrtioError(LinkDown)
[ 20.514414s] INFO(runtime::mgmt): received connection
[ 20.529064s] ERROR(runtime::mgmt::remote_coremgmt): aux packet error (link down)
[ 20.536370s] WARN(runtime::mgmt): connection terminated: DrtioError(LinkDown)
[ 21.500834s] INFO(runtime::mgmt): received connection
[ 21.516064s] ERROR(runtime::mgmt::remote_coremgmt): aux packet error (link down)
[ 21.523442s] WARN(runtime::mgmt): connection terminated: DrtioError(LinkDown)
Full UART log of satellite
The following is simultaneous in time to the above log from the master:
[ 0.019990s] INFO(szl): Simple Zynq Loader starting...
[ 0.025196s] DEBUG(libboard_zynq::clocks::source): Set ARM_PLL to 2000000000 Hz
[ 0.007039s] DEBUG(libboard_zynq::clocks::source): Set IO_PLL to 1000000000 Hz
[ 0.016260s] DEBUG(libboard_zynq::clocks::source): Set DDR_PLL to 1066666666 Hz
[ 0.023609s] DEBUG(libboard_zynq::ddr): DDR 3x/2x clocks: 533333328/355555552
[ 0.030769s] DEBUG(libboard_zynq::ddr): DDR DCI clock: 10062892 Hz (divisors=2*53)
[ 0.042000s] DEBUG(libboard_zynq::sdio): Reset SDIO!
[ 0.046947s] DEBUG(libboard_zynq::sdio): Changing clock frequency to 400000
[ 0.053895s] INFO(szl): Card inserted. Mounting file system.
[ 0.073371s] DEBUG(libboard_zynq::sdio): Changing clock frequency to 25000000
[ 0.080504s] DEBUG(libboard_zynq::sdio::sd_card): Getting bus width
[ 0.086958s] DEBUG(libboard_zynq::sdio::sd_card): 4 bit support
[ 0.092856s] DEBUG(libboard_zynq::sdio::sd_card): Changing bus width
[ 0.101001s] DEBUG(libboard_zynq::sdio): Set block size to 512
[ 0.107346s] DEBUG(libconfig::sd_reader): Partition ID: C
[ 0.113583s] INFO(szl): Loading gateware
[ 0.117863s] DEBUG(libconfig::bootgen): Partition header pointer = C80
[ 0.124658s] DEBUG(libconfig::bootgen): Unencrypted length = E7C59
[ 0.130822s] DEBUG(libconfig::bootgen): Partition start address: B360
[ 0.554995s] DEBUG(libboard_zynq::devc): Invalidate DCache for bitstream buffer
[ 0.569170s] DEBUG(libboard_zynq::devc): Init preload FPGA
[ 0.574639s] DEBUG(libboard_zynq::devc): Toggling PROG_B
[ 0.609521s] DEBUG(libboard_zynq::devc): Waiting for done
[ 0.614898s] DEBUG(libboard_zynq::devc): Init postload FPGA
[ 0.620452s] INFO(szl): Loading runtime
[ 0.624645s] DEBUG(libconfig::bootgen): Partition header pointer = C80
[ 0.631439s] DEBUG(libconfig::bootgen): Unencrypted length = AD9D
[ 0.637510s] DEBUG(libconfig::bootgen): Unencrypted length = 2F7A8
[ 0.643671s] DEBUG(libconfig::bootgen): Partition start address: F2FC0
[ 0.739063s] INFO(szl): Preparing for runtime execution
[ 0.744701s] INFO(szl): executing payload
[ 0.000067s] INFO(satman): ARTIQ satellite manager starting...
[ 0.006029s] INFO(satman): gateware ident 9+unknown;crate_mqva_ao_kasli_soc_v1.1_artiq_zynq_master_6315758
[ 0.020873s] INFO(libboard_zynq::i2c): PCA9548 detected
[ 0.534652s] INFO(libboard_artiq::si5324): waiting for Si5324 lock...
[ 2.465227s] INFO(libboard_artiq::si5324): ...locked
[ 2.570436s] INFO(satman): Switching SYS clocks...
[ 2.625298s] INFO(satman): SYS CLK switched successfully
[ 2.650379s] INFO(satman): SED spreading disabled by default