I have Urukul cards on a master and a satellite. They have synchronization enabled, and I am trying to store io_update_delay and sync_delay_seed in the EEPROM. It turns out that the EEPROM ports are aliased across the master and satellite crates; see the DDB entries below. This basically makes the EEPROM useless for systems with DRTIO. Is there a known fix or workaround for this? I guess I could make a custom port mapping that uses the EEPROMs of other EEM ports that do not otherwise use their EEPROM.

...
device_db["eeprom_urukul2"] = {
    "type": "local",
    "module": "artiq.coredevice.kasli_i2c",
    "class": "KasliEEPROM",
    "arguments": {"port": "EEM6"}
}
...
device_db["eeprom_urukul6"] = {
    "type": "local",
    "module": "artiq.coredevice.kasli_i2c",
    "class": "KasliEEPROM",
    "arguments": {"port": "EEM6"}
}
...
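A minimal sketch of how such aliasing could be detected automatically, assuming the device DB is a plain Python dict shaped like the entries above. The helper name find_aliased_eeprom_ports is made up for illustration and is not part of ARTIQ.

```python
from collections import defaultdict


def find_aliased_eeprom_ports(device_db):
    """Map each EEPROM port used by more than one device to the names using it."""
    users = defaultdict(list)
    for name, entry in device_db.items():
        # Device DB values can also be alias strings, so only inspect dicts.
        if isinstance(entry, dict) and entry.get("class") == "KasliEEPROM":
            users[entry["arguments"]["port"]].append(name)
    return {port: names for port, names in users.items() if len(names) > 1}


# The two entries quoted above, reduced to the fields that matter here.
device_db = {
    "eeprom_urukul2": {"class": "KasliEEPROM", "arguments": {"port": "EEM6"}},
    "eeprom_urukul6": {"class": "KasliEEPROM", "arguments": {"port": "EEM6"}},
}

print(find_aliased_eeprom_ports(device_db))
# {'EEM6': ['eeprom_urukul2', 'eeprom_urukul6']}
```

An empty result would mean that every KasliEEPROM entry points at its own port.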
a month later

I see an issue was created for this. just fyi, we currently use a manual remapping to make this work.

# device db...

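def _mutate_ddb():
    def eeprom_remap(key, value):
        # Keys look like "eeprom_urukul<N>". The prefix is 13 characters long,
        # so key[13:] is the Urukul index N, which is unique across crates.
        if value.get("class") == "KasliEEPROM":
            value["arguments"]["port"] = f"EEM{key[13:]}"

    for k, v in device_db.items():
        # Skip alias entries, which are plain strings rather than dicts.
        if isinstance(v, dict):
            eeprom_remap(k, v)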


# Mutate the auto-generated device db
_mutate_ddb()
del _mutate_ddb
    8 months later

    yes, the code above solves the problem. instead of numbering the EEPROM channels based on the EEM channel number (which could overlap for urukuls in different crates), this code numbers the used EEPROM channels based on the index of the urukul, which is sequential over multiple crates and therefore unique. you will run into trouble though if you have more urukuls than there are EEPROM chips.
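A sketch of the same index-based remapping with an explicit check for the "more urukuls than EEPROM chips" case mentioned above. The limit of 12 ports (EEM0 to EEM11) is an assumption, and remap_eeprom_by_index is a hypothetical helper name, not an ARTIQ function.

```python
import re

# Assumption: the master exposes EEPROM ports EEM0 .. EEM11 (12 ports).
NUM_EEM_PORTS = 12
_EEPROM_KEY = re.compile(r"eeprom_urukul(\d+)")


def remap_eeprom_by_index(device_db, num_ports=NUM_EEM_PORTS):
    """Point every eeprom_urukul<N> entry at port EEM<N>, modifying device_db in place."""
    for key, value in device_db.items():
        if not (isinstance(value, dict) and value.get("class") == "KasliEEPROM"):
            continue
        match = _EEPROM_KEY.fullmatch(key)
        if match is None:
            continue  # EEPROM entries with other names are left untouched
        index = int(match.group(1))
        if index >= num_ports:
            raise ValueError(
                f"{key}: no port EEM{index}, only {num_ports} EEM ports assumed"
            )
        value["arguments"]["port"] = f"EEM{index}"


device_db = {
    "eeprom_urukul2": {"class": "KasliEEPROM", "arguments": {"port": "EEM6"}},
    "eeprom_urukul6": {"class": "KasliEEPROM", "arguments": {"port": "EEM6"}},
}
remap_eeprom_by_index(device_db)
print(device_db["eeprom_urukul2"]["arguments"]["port"])  # EEM2
print(device_db["eeprom_urukul6"]["arguments"]["port"])  # EEM6
```

Failing loudly when the index runs past the last port avoids silently generating a port name that does not exist.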

    let me know if you get this to work, or if things are still unclear.

    on our system, we have actually changed to a hard-coded mapping from urukul to EEPROM chip because two EEPROM chips appear to be broken. that EEPROM function looks like this

    
        def eeprom_remap(key, value):
            if value.get("class") == "KasliEEPROM":
                # Explicit Urukul-to-EEPROM mapping; an unknown key raises KeyError.
                value["arguments"]["port"] = {
                    "eeprom_urukul0": "EEM0",
                    "eeprom_urukul1": "EEM10",  # EEM1 is defective
                    "eeprom_urukul2": "EEM11",  # EEM2 is defective
                    "eeprom_urukul3": "EEM3",
                    "eeprom_urukul4": "EEM4",
                    "eeprom_urukul5": "EEM5",
                    "eeprom_urukul6": "EEM6",
                    "eeprom_urukul7": "EEM7",
                    "eeprom_urukul8": "EEM8",
                }[key]

      lriesebos How do you know your EEPROM chip is broken, do you get something like the following error?

      Traceback (most recent call first):
        File "<artiq>/coredevice/i2c.py", line 133, column 13, in artiq.coredevice.i2c.i2c_read_many(..., ...)
          raise I2CError("failed to ack bus address")
          ^
        File "example.py", line 28, in ... artiq_run_example.DIO.init<artiq_run_example.DIO>(...) (RA=+0x2c4)
          channel.init()
        File "example.py", line 38, in artiq_run_example.DIO.run(..., ...) (inlined)
          self.init()
        File "<artiq>/coredevice/ad9910.py", line 92, in ... artiq.coredevice.ad9910.SyncDataEeprom.init<artiq.coredevice.ad9910.SyncDataEeprom>(...) (RA=+0xbec)
          word = self.eeprom_device.read_i32(self.eeprom_offset) >> 16
        File "<artiq>/coredevice/ad9910.py", line 464, in ... artiq.coredevice.ad9910.AD9910.init<artiq.coredevice.ad9910.AD9910>(...) (inlined)
          self.sync_data.init()
        File "<artiq>/coredevice/kasli_i2c.py", line 70, in ... artiq.coredevice.kasli_i2c.KasliEEPROM.read_i32<artiq.coredevice.kasli_i2c.KasliEEPROM>(...) (RA=+0x2aa0)
          i2c_read_many(self.busno, self.address, addr, data)
        File "<artiq>/coredevice/i2c.py", line 133, in ?? (RA=+0x3000)
          raise I2CError("failed to ack bus address")
      artiq.coredevice.exceptions.I2CError(3): failed to ack bus address
      
      End of Core Device Traceback
      
      Traceback (most recent call last):
        File "/nix/store/72j55i85kdybcwqmyz50hqk80bnpcy8i-python3.9-artiq-7.0.db79100/bin/.artiq_run-wrapped", line 9, in <module>
          sys.exit(main())
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/frontend/artiq_run.py", line 224, in main
          return run(with_file=True)
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/frontend/artiq_run.py", line 210, in run
          raise exn
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/frontend/artiq_run.py", line 203, in run
          exp_inst.run()
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/language/core.py", line 54, in run_on_core
          return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/coredevice/core.py", line 140, in run
          self._run_compiled(kernel_library, embedding_map, symbolizer, demangler)
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/coredevice/core.py", line 130, in _run_compiled
          self.comm.serve(embedding_map, symbolizer, demangler)
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/coredevice/comm_kernel.py", line 716, in serve
          self._serve_exception(embedding_map, symbolizer, demangler)
        File "/nix/store/qfh7rcama7s6ps2sb65p5vsm4cfj99lp-python3-3.9.16-env/lib/python3.9/site-packages/artiq/coredevice/comm_kernel.py", line 698, in _serve_exception
          raise python_exn
      artiq.coredevice.exceptions.I2CError: failed to ack bus address

      Have you tried using the EEPROM of the Kasli LOC0?

      so we use the 9910 with EEPROM for storing sync values. during initialization of the sync_data object, we see this

      artiq.coredevice.exceptions.I2CError: I2C bus could not be accessed

      then after a warm reboot using artiq_flash start I noticed this in the UART logs

      [2024-03-12 15:16:24] panic at runtime/main.rs:98:30: I2C initialization failed: "SDA is stuck low and doesn\'t get unstuck"

      that gave me the idea that the EEPROM chip might be messed up and that switching the I2C mux to a broken chip makes SDA hang. so I just power cycle the device and figure out which EEPROM chips give me trouble.

      just want to note that we have seen various errors. the ones shown were the most common. see also https://gitlab.com/duke-artiq/dax/-/wikis/ARTIQ/Core-device-hardware#i2c-errors-and-defective-eeprom-chips

        lriesebos When you write I2C mux, do you mean I2CSwitch? Are both of these broken for you?

        For me, it seems that only the EEPROM on the Urukul itself is the problem, as I am only missing the ACK from it but don't get an error when writing the bits before it. (I also can't observe errors via the USB serial console or aqctl_corelog.)

        Now it clicked: what I observe is actually expected. On my master Kasli, I only installed four expansion modules, at EEM0, EEM1, EEM10, and EEM11 (none of them Urukuls, by the way).

        So, of course, I can only use those EEMx ports of the master Kasli that have an expansion module with an EEPROM attached.

        I think it would make much more sense to store this kind of data in LOC0, the EEPROM of the master Kasli, than in, for example, a DIO's EEPROM.

        ok now you got me confused. as far as I am aware, there is no EEPROM on the urukul cards that can be accessed. I could be wrong here, but that is my understanding. all the EEPROM is on the kasli controller. so all of that EEPROM is accessible regardless of the cards installed.

        So with the I2C mux I am referring to the I2CSwitch. first we had the aliasing issue, which I resolved by remapping. the aliasing issue does not give you errors, but causes multiple urukuls to use the same EEPROM chip and therefore the same sync settings. that is how I noticed. I remapped them and that solved it. then after some hardware changes, we got these I2C errors. we tracked that down to two specific EEPROM chips. I mapped those to different chips, and all worked fine again.

        There are a bunch of EEPROM chips called EEMx and a few that have special names. I don't know what the other ones might be used for, so I only use the ones named EEMx.

          lriesebos

          ok now you got me confused. as far as I am aware, there is no EEPROM on the urukul cards that can be accessed. I could be wrong here, but that is my understanding. all the EEPROM is on the kasli controller. so all of that EEPROM is accessible regardless of the cards installed.

          Yes, that was also my first hypothesis, but after checking the schematics with our electrical engineer, I think only LOC0 is Kasli's EEPROM, and EEMx is the I2C buses going to the extension module connected to EEM port x.

          See the attached excerpts from the schematics:



          • IC5 in the Kasli schematic is the LOC0 using the 24AA025E48 EEPROM (red)
          • SHARED_SDA and SHARED_SCL make up the I2C bus from the second I2CSwitch to the Kasli's EEPROM (green)
          • I2C_SDA_x and I2C_SCL_x make up the I2C bus going to the xth EEM extension module (orange)
          • IC8B in the Urukul schematic is the EEMx 24AA02E48 EEPROM (blue)

          Do TTL cards also have EEPROM then? because we have TTL cards in the first two EEM slots, but use the EEPROM successfully on those channels. We also have our urukuls connected with two EEM connectors, and use EEPROM chips on both EEM channels. I have not noticed any problems.

          Besides, this would also cause issues with DRTIO, because I don't think we get I2CSwitch devices on any DRTIO destinations. This is getting me really confused.

          @Pavel (assuming this is the Pawel from technosystem, Hi, hope all is going well!) sorry for dragging you in here, but do you maybe know how this works?

            Thanks sb10q , that clears up some things.

            Just wondering, do Urukuls have EEPROM chips for both EEM channels? Because that is how we are using it right now.

            @bodokaiser I have the feeling that the issues we experienced were different from yours. it might still be worth trying to map to different EEPROM chips, but your symptoms seem a bit different from mine.

              lriesebos

              Just wondering, do Urukuls have EEPROM chips for both EEM channels? Because that is how we are using it right now.

              If I am not mistaken, the pins of both EEM connectors are attached to the same nets. So one can use either EEM port to write to the same EEPROM.

              @bodokaiser I have the feeling that the issues we experienced were different from yours. it might still be worth trying to map to different EEPROM chips, but your symptoms seem a bit different from mine.

              I think our problems have some overlap but are not the same. @fsagbuya helped me get the LOC0 option working. For us this will be the new default to store DDS timing data as its invariant under the EEM module configuration of the master Kasli.
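A rough sketch of how several Urukuls could share one EEPROM without sharing sync data, by giving every DDS channel its own 32-bit word. This is not the implementation referred to above. It assumes that sync_delay_seed and io_update_delay accept a string of the form "<eeprom device>:<byte offset>", that "eeprom_loc0" is a (hypothetical) KasliEEPROM entry with port LOC0, and that 128 bytes are writable and unused; all three should be checked against your ARTIQ version and the EEPROM datasheet.

```python
WORD_BYTES = 4  # one 32-bit word per DDS channel (read_i32 in the traceback above)
CHANNELS_PER_URUKUL = 4
# Assumption: number of writable bytes available for sync data. Check the
# EEPROM datasheet and any existing contents before relying on this.
WRITABLE_BYTES = 128


def shared_eeprom_offset(urukul_index, channel, base=0, capacity=WRITABLE_BYTES):
    """Return a byte offset that is unique for each (urukul_index, channel) pair."""
    if urukul_index < 0 or not 0 <= channel < CHANNELS_PER_URUKUL:
        raise ValueError("invalid Urukul index or channel")
    offset = base + (urukul_index * CHANNELS_PER_URUKUL + channel) * WORD_BYTES
    if offset + WORD_BYTES > capacity:
        raise ValueError(f"offset {offset} does not fit in {capacity} bytes")
    return offset


def sync_data_argument(eeprom_name, urukul_index, channel, base=0):
    """Build the assumed '<eeprom device>:<byte offset>' argument string."""
    return f"{eeprom_name}:{shared_eeprom_offset(urukul_index, channel, base)}"


print(sync_data_argument("eeprom_loc0", 0, 0))  # eeprom_loc0:0
print(sync_data_argument("eeprom_loc0", 6, 3))  # eeprom_loc0:108
```

With these assumptions, eight Urukuls (32 channels) would fill the 128 bytes exactly; a ninth raises ValueError instead of wrapping into another channel's word.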

                Interestingly enough, I don't see that behavior. I have an urukul connected to EEM10/11, and my DDB points urukul 1 and urukul 2 to those respective EEPROM ports. if I read their sync data, their values are different. @sb10q can you confirm whether the two EEM ports on an urukul wire to the same EEPROM or not?

                10 months later

                bodokaiser
                Do you mind sharing your implementation? We just ran into this issue (again? [1]):

                • M: Master with 2 Urukuls
                • S1: Satellite1 with 2 Urukuls
                • S2: Satellite2 with 3 Urukuls

                Apparently, the device DB entry for Urukul3's EEPROM in S2 got an EEM address assigned that is not populated on M. This makes the Urukul unusable.
                Rather than using [2], i.e. frankensteining the EEPROMs of random cards in M for the S1/S2 Urukuls, we would love to use the Kasli EEPROM of M (LOC0). It seems like you went that route.
                So my question is: How did you implement it?
                And also: Does your implementation allow individual (non-shared) sync data for all Urukuls?

                [1] We have already seen sync issues on RTIO before (https://github.com/m-labs/artiq/issues/1692), but that bug also uncovered a hardware issue in the end. Disregarding the HW bug, I still think sync is broken on DRTIO with this EEPROM mixup.
                [2] https://forum.m-labs.hk/d/622-eeprom-aliasing-when-using-drtio/2

                  KlausZipfel Apparently, the device DB entry for Urukul3's EEPROM in S2 got an EEM address assigned that is not populated on M. This makes the Urukul unusable.

                  Sounds surprising. Did you physically check that it is not populated? Where is this card from?

                    sb10q I just checked our device configuration JSON file. EEM10 on the master is not populated, so satellite 2 cannot "frankenstein" any EEPROM on EEM10 on the master for syncing the Urukul on satellite 2.

                    In my honest opinion, this behaviour is absolutely annoying on DRTIO setups and needs to be addressed.
                    For standalone systems, the behaviour works as intended.

                    sb10q I think the problem is in how the device DB is generated or in how the master/compiler interprets it. Yes, channels are assigned correctly, but I2C access is always performed locally on the master crate.

                    This is an excerpt of our master json file

                        {
                            "type": "urukul",
                            "ports": [3, 4],
                    ...
                            "synchronization": true
                        },
                        {
                            "type": "urukul",
                            "ports": [5, 6],
                    ...
                            "synchronization": true
                        },

                    This is satellite 1 (which by chances has the same EEM ports as the Urukuls on the master)

                        {
                            "type": "urukul",
                            "ports": [3, 4],
                    ...
                            "synchronization": true
                        },
                        {
                            "type": "urukul",
                            "ports": [5, 6],
                        }

                    And here finally satellite 2

                        {
                            "type": "urukul",
                            "ports": [4, 5],
                    ...
                            "synchronization": true
                        },
                        {
                            "type": "urukul",
                            "ports": [6, 7],
                    ...
                            "synchronization": true
                        },
                        {
                            "type": "urukul",
                            "ports": [10, 11],
                    ...
                            "synchronization": true
                        },

                    The device db has been derived via artiq_ddb_template -s 2 yb-satellite.json -s 1 yb-satellite2.json yb-master.json > device_db_auto.py which resulted in entries like this:

                    # Master
                    device_db["eeprom_urukul0"] = {
                        "type": "local",
                        "module": "artiq.coredevice.kasli_i2c",
                        "class": "KasliEEPROM",
                        "arguments": {"port": "EEM3"}
                    }
                    
                    device_db["eeprom_urukul1"] = {
                    ...
                        "arguments": {"port": "EEM5"}
                    }
                    
                    
                    # Satellite 1
                    device_db["eeprom_urukul2"] = {
                        "type": "local",
                        "module": "artiq.coredevice.kasli_i2c",
                        "class": "KasliEEPROM",
                        "arguments": {"port": "EEM3"}
                    }
                    
                    device_db["eeprom_urukul3"] = {
                    ...
                        "arguments": {"port": "EEM5"}
                    }
                    
                    
                    # Satellite 2
                    device_db["eeprom_urukul4"] = {
                        "type": "local",
                        "module": "artiq.coredevice.kasli_i2c",
                        "class": "KasliEEPROM",
                        "arguments": {"port": "EEM4"}
                    }
                    
                    device_db["eeprom_urukul5"] = {
                    ...
                        "arguments": {"port": "EEM6"}
                    }
                    
                    device_db["eeprom_urukul6"] = {
                    ...
                        "arguments": {"port": "EEM10"}
                    }

                    Assessment

                    • Satellite 1's Urukuls access the same EEPROMs as the master's Urukuls.
                    • Satellite 2's first two Urukuls access EEPROMs offset by +1 on the master (which, due to how we connected the Urukuls on the master, are again the master's Urukul EEPROMs).
                    • But the third Urukul on Satellite 2 now wants to access an EEPROM on EEM10 (which is not populated on the master).
                    • No satellite accesses the EEPROMs on the respective satellite via remote I2C.
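The assessment above can be reproduced with a short sketch that splits the generated EEPROM entries by whether their port is populated on the master. The port names are taken from the device DB excerpt; master_populated is an assumption that lists only the master's Urukul connectors known from the JSON excerpt (other cards are elided there), and the helper name is made up.

```python
def split_by_master_population(eeprom_ports, master_populated):
    """Return (populated, unpopulated) dicts of EEPROM entries by master port status."""
    populated, unpopulated = {}, {}
    for name, port in eeprom_ports.items():
        target = populated if port in master_populated else unpopulated
        target[name] = port
    return populated, unpopulated


# Ports from the auto-generated device db shown above.
eeprom_ports = {
    "eeprom_urukul0": "EEM3",   # master
    "eeprom_urukul1": "EEM5",   # master
    "eeprom_urukul2": "EEM3",   # satellite 1
    "eeprom_urukul3": "EEM5",   # satellite 1
    "eeprom_urukul4": "EEM4",   # satellite 2
    "eeprom_urukul5": "EEM6",   # satellite 2
    "eeprom_urukul6": "EEM10",  # satellite 2
}
# Assumption: only the master's Urukul connectors (ports [3, 4] and [5, 6]).
master_populated = {"EEM3", "EEM4", "EEM5", "EEM6"}

populated, unpopulated = split_by_master_population(eeprom_ports, master_populated)
print(unpopulated)  # {'eeprom_urukul6': 'EEM10'}
```

Every entry that lands in the populated group but belongs to a satellite is an aliasing candidate, since the I2C access goes to the master's port of the same name.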

                    Since you pointed out remote I2C: it simply seems like this feature does not work as intended with the DDB above. If one could figure out how to make this work, it would be the most elegant solution, TBH.