sb10q, here is an example of what we see: the satellite just periodically reboots with no error message. The master shows the aux packet errors and the link re-establishing, but doesn't report why the satellite has fallen over.
Satellite Log (Kasli just has 4x DIO SMA and 4x MCX boards)
 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|
MiSoC Bootloader
Copyright (c) 2017-2021 M-Labs Limited
Bootloader CRC passed
Gateware ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
Initializing SDRAM...
Read leveling scan:
Module 1:
00000000000011111111111000000000
Module 0:
00000000000011111111111100000000
Read leveling: 17+-5 17+-6 done
SDRAM initialized
Memory test passed
Booting from flash...
Starting firmware.
[ 0.000004s] INFO(satman): ARTIQ satellite manager starting...
[ 0.005612s] INFO(satman): software ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
[ 0.014072s] INFO(satman): gateware ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
[ 0.293009s] INFO(board_artiq::si5324): waiting for Si5324 lock...
[ 2.245255s] INFO(board_artiq::si5324): ...locked
[ 2.393374s] INFO(satman): uplink is up, switching to recovered clock
[ 2.426444s] INFO(board_artiq::si5324): waiting for Si5324 lock...
[ 4.169475s] INFO(board_artiq::si5324): ...locked
[ 6.857361s] INFO(board_artiq::si5324::siphaser): calibration successful, lead: 80, width: 435 (349deg)
[ 7.098975s] WARN(satman): aux packet error (routing error)
[ 7.306905s] WARN(satman): aux packet error (routing error)
[ 7.521824s] WARN(satman): aux packet error (routing error)
[ 7.729419s] INFO(satman): TSC loaded from uplink
[ 7.733062s] WARN(satman): aux packet error (routing error)
[ 7.940356s] WARN(satman): aux packet error (routing error)
[ 8.343516s] INFO(satman): rank: 1
[ 8.345622s] INFO(satman): routing table: RoutingTable { 0: 0; 1: 1 0; 2: 2 0; }
[ 16.036747s] INFO(satman): resetting RTIO
[ 48.406541s] INFO(satman): resetting RTIO
[ 48.552867s] INFO(satman): resetting RTIO
[ 110.166787s] INFO(satman): resetting RTIO
[ 110.320345s] INFO(satman): resetting RTIO
[ 383.968813s] INFO(satman): resetting RTIO
[ 384.094257s] INFO(satman): resetting RTIO
 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|
MiSoC Bootloader
Copyright (c) 2017-2021 M-Labs Limited
Bootloader CRC passed
Gateware ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
Initializing SDRAM...
Read leveling scan:
Module 1:
00000000000011111111111000000000
Module 0:
00000000000011111111111100000000
Read leveling: 17+-5 17+-6 done
SDRAM initialized
Memory test passed
Booting from flash...
Starting firmware.
[ 0.000004s] INFO(satman): ARTIQ satellite manager starting...
[ 0.005612s] INFO(satman): software ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
[ 0.014072s] INFO(satman): gateware ident 6.7582.c2248278;Kasli_Earth_Luna_artiq6_Satellite
[ 0.293009s] INFO(board_artiq::si5324): waiting for Si5324 lock...
[ 2.253302s] INFO(board_artiq::si5324): ...locked
[ 3.051533s] INFO(satman): uplink is up, switching to recovered clock
[ 3.084603s] INFO(board_artiq::si5324): waiting for Si5324 lock...
[ 4.827635s] INFO(board_artiq::si5324): ...locked
[ 8.500670s] INFO(board_artiq::si5324::siphaser): calibration successful, lead: 259, width: 435 (349deg)
[ 8.675852s] WARN(satman): aux packet error (routing error)
[ 8.883582s] WARN(satman): aux packet error (routing error)
[ 9.098713s] WARN(satman): aux packet error (routing error)
[ 9.306158s] INFO(satman): TSC loaded from uplink
[ 9.309789s] WARN(satman): aux packet error (routing error)
[ 9.517295s] WARN(satman): aux packet error (routing error)
[ 9.932076s] INFO(satman): rank: 1
[ 9.934181s] INFO(satman): routing table: RoutingTable { 0: 0; 1: 1 0; 2: 2 0; }
[ 26.772851s] INFO(satman): resetting RTIO
[ 86.098893s] INFO(satman): resetting RTIO
[ 86.241249s] INFO(satman): resetting RTIO
 __  __ _ ____         ____
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |
| |  | | |___) | (_) | |___
|_|  |_|_|____/ \___/ \____|
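In case it helps anyone compare captures: a rough Python sketch for splitting a saved satellite UART log into boots and pulling out the last line logged before each reboot. It assumes the capture is plain text and that every boot prints the "MiSoC Bootloader" line. The function names `split_boots` and `last_entries` are my own and not part of ARTIQ.

```python
import re

# Matches firmware lines such as "[ 384.094257s] INFO(satman): resetting RTIO".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)s\]\s+(\w+)\(([^)]*)\):\s*(.*)$")

# Assumption: each boot prints this line once, before any firmware output.
BOOT_MARKER = "MiSoC Bootloader"


def split_boots(log_text):
    """Return one list of (timestamp, level, module, message) tuples per boot."""
    boots = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if BOOT_MARKER in line:
            boots.append([])  # a new boot starts here
            continue
        m = LINE_RE.match(line)
        if m and boots:
            boots[-1].append(
                (float(m.group(1)), m.group(2), m.group(3), m.group(4))
            )
    return boots


def last_entries(boots):
    """Return (timestamp, message) of the final firmware line of each boot."""
    return [(b[-1][0], b[-1][3]) for b in boots if b]
```

Calling `last_entries(split_boots(text))` on a capture gives the uptime reached and the final message of each boot, which makes it easy to see whether the reboots always follow the same message.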
Master Log (Kasli just has 4x DIO SMA and 4x MCX boards)
[104689.643272s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.651103s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.658938s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.666773s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.674608s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.682443s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.690281s] ERROR(runtime::moninj::remote_moninj): aux packet error (link went down)
[104689.698390s] ERROR(runtime::moninj::remote_moninj): aux packet error (aux packet error)
[104689.706144s] ERROR(runtime::moninj::remote_moninj): aux packet error (aux packet error)
[104689.820219s] INFO(runtime::rtio_mgt::drtio): [LINK#0] link RX became up, pinging
[104689.915386s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104690.324361s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104690.733413s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104691.142388s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104691.551559s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104691.960487s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104692.369576s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104692.778520s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104693.187559s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104693.596597s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104694.005673s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104694.414636s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104694.823560s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104695.232533s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104695.641696s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104695.849079s] INFO(runtime::rtio_mgt::drtio): [LINK#0] remote replied after 15 packets
[104695.855886s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104696.067695s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104696.275156s] ERROR(runtime::moninj::remote_moninj): aux packet error (timeout)
[104696.461447s] INFO(runtime::rtio_mgt::drtio): [LINK#0] link initialization completed
[104696.468527s] INFO(runtime::rtio_mgt::drtio): [DEST#1] destination is up
[104696.474657s] INFO(runtime::rtio_mgt::drtio): [DEST#1] buffer space is 128
[104696.681648s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] error(s) found (0x03):
[104696.687544s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] received packet of an unknown type
[104696.695727s] ERROR(runtime::rtio_mgt::drtio): [LINK#0] received truncated packet
[105708.648765s] INFO(runtime::session): new connection from 192.168.1.38:58273
[105708.724260s] INFO(runtime::kern_hwreq): resetting RTIO
[105709.436403s] INFO(runtime::session): no connection, starting idle kernel
[105709.442471s] INFO(runtime::session): no idle kernel found
[155065.367958s] INFO(runtime::moninj): new connection from 192.168.1.38:49884
[155108.665804s] INFO(runtime::session): new connection from 192.168.1.38:49890
[155108.730199s] INFO(runtime::kern_hwreq): resetting RTIO
[155108.875240s] INFO(runtime::kern_hwreq): resetting RTIO
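For the master side, a similar rough sketch that tallies the aux packet errors by reason and estimates how long the link was out, measured from the first "link went down" line to the next "destination is up" line. It assumes the log format shown above. The helper names `count_aux_errors` and `outage_seconds` are hypothetical.

```python
import re
from collections import Counter

# Captures the reason in parentheses, e.g. "timeout" or "link went down".
AUX_RE = re.compile(r"aux packet error \(([^)]*)\)")
# Captures the leading timestamp, e.g. "[104689.643272s]".
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)s\]")


def count_aux_errors(log_text):
    """Count 'aux packet error (...)' lines grouped by the stated reason."""
    counts = Counter()
    for line in log_text.splitlines():
        m = AUX_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def outage_seconds(log_text):
    """Seconds from the first 'link went down' to the next 'destination is up'.

    Returns None if either event is missing from the log.
    """
    start = None
    for raw in log_text.splitlines():
        line = raw.strip()
        ts = TS_RE.match(line)
        if not ts:
            continue
        t = float(ts.group(1))
        if start is None and "link went down" in line:
            start = t
        elif start is not None and "destination is up" in line:
            return t - start
    return None
```

Running both over a longer master log should show whether every satellite reboot produces the same error pattern and a similar recovery time.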