OM2+ getting stuck in boot stage with custom firmware using b255



  • The default build system is not changed. I believe it is a built-in script that we don't modify. The only customization is adding the binaries, scripts, and supporting packages required by our FW. The same build also works fine on some devices. However, they all eventually brick after a power cycle.

    The same image worked fine with our older batch of devices, for which we used the b252 build system to create the custom image. After we came across this notice, we upgraded the build system to b255, as suggested, and started to see this error.



  • @mrahul If you install b258 does that resolve your issue?



  • No build other than b255 works on these devices, and I don't think trying builds lower than b255 would be sensible on them. I tested this with the images downloaded from the Onion FW Repo.

    However, as far as the custom build goes, I only created a custom image using the b255 build system.


  • administrators

    @mrahul What happens when b258 firmware is loaded on the device?

    Note that not too much has changed in the OnionIoT/source repo between b255 and b258.
    For details see: https://github.com/OnionIoT/source/compare/be7777b369c55c53b3582e868fd268ff4bf31318...928cad900fd7ecbf37af5ab6804e1af161e89430

    The most notable difference is the change of the default repo from http://downloads.openwrt.org/releases/18.06-SNAPSHOT to http://downloads.openwrt.org/releases/18.06.1 in package/base-files/image-config.in

    If the b255 fw released by Onion works on these devices and the custom image based on the b255 build system does not, this points to an issue introduced in the custom image building process.

    Can you share more details about that?



  • @Lazar-Demin, I've compared the commits too, and agree that there aren't many significant changes. Flashing b255 from the Onion Firmware Repo works fine. As for the custom image creation process, the issue was observed even when no changes were made to the build system before creating the image.

    I cloned the b255 commit from the repo and made no changes. I created the image using the same cloned build system, AS-IS, and when flashed with the image created, the onion still goes in the loop of the booting process. I'm ok to share more details if you want, please let me know what, and I'll share accordingly.

    Thanks.



  • @mrahul Can I ask how you are updating your devices? Are you using sysupgrade or flashing via the u-boot option?



  • I'm using sysupgrade to flash the custom image.


  • administrators

    @mrahul said in OM2+ getting stuck in boot stage with custom firmware using b255:

    I created the image using the same cloned build system, AS-IS, and when flashed with the image created, the onion still goes in the loop of the booting process. I'm ok to share more details if you want, please let me know what, and I'll share accordingly.

    Yes please share more on your process of making the image.
    Did you use the instructions from the OnionIoT/source readme?

    EDIT: actually, the b255 commit of OnionIoT/source shouldn't successfully compile anymore since it points to the OpenWRT 18.06-SNAPSHOT package repos. These package repos have since been deleted.
    The error you see when booting now makes sense: Linux expects /etc/hotplug.json to exist, but since the build system couldn't find that package it's not included in the firmware.

    You will need to use b257 (and up) to successfully compile firmware. See the b257 commit.
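A build that silently drops a package can be caught before flashing. The following is a minimal sketch, assuming you have an unpacked copy of the image's root filesystem on the build host; the function name and the directory argument are hypothetical, and only /etc/hotplug.json is taken from this thread.

```python
from pathlib import Path

# Files the booted system is known to need. /etc/hotplug.json comes from this
# thread; extend the tuple with anything else your firmware depends on.
REQUIRED_FILES = ("etc/hotplug.json",)


def missing_rootfs_files(rootfs_dir, required=REQUIRED_FILES):
    """Return the required files that are absent from an unpacked root filesystem.

    rootfs_dir is wherever the image's root filesystem was unpacked on the
    build host (an assumption; the location depends on your setup).
    """
    root = Path(rootfs_dir)
    return [rel for rel in required if not (root / rel).is_file()]
```

An empty list means every listed file is present; anything else is a reason not to flash that image.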



  • @mrahul try sysupgrade -n so the existing configuration is not preserved. I'm wondering if there is something in your configuration creating the issue.



  • Hi @crispyoz & @Lazar-Demin, thank you so much for your input. I've been working on some immediate deliverables, so I won't be able to share what you requested for at least the next few days. I'll post the logs and observations as soon as I try what you both have suggested. Thanks again!



  • Hi @crispyoz & @Lazar-Demin, hope you guys are doing great.

    The error you see when booting now makes sense: Linux expects /etc/hotplug.json to exist, but since the build system couldn't find that package it's not included in the firmware.
    

    Should we see the same behavior if we use the binary downloaded from the Onion FW Repo?

    Lately, we've started seeing the same behavior again with b255 where it gets stuck in the boot process. On checking, it is the same error related to the hotplug.json file not being found. We're not creating and flashing custom images for now, all this is using the binaries from the FW Repo. I saw that there is a new build release, b259, that we are planning to try.

    One way we recover from this state is by accessing the device through failsafe mode and running firstboot. I'll try flashing the new build using sysupgrade -n <filename> and share the behavior here. I also wanted to know whether the different ways of flashing an image onto a device differ in effect. There are several methods to upgrade or re-install images: through ethernet, SD Card/USB, and sysupgrade. However, we only use two methods, i.e. through ethernet or sysupgrade to upgrade or re-flash a device on bricking. How different are these two processes?

    Thanks again for all the responses and suggestions. Really looking forward to getting to the core of this.


  • administrators

    @mrahul said in OM2+ getting stuck in boot stage with custom firmware using b255:

    Should we see the same behavior if we use the binary downloaded from the Onion FW Repo?

    No, my previous comment does not apply to firmware in the Onion firmware repo - these firmware images are already compiled and should contain everything needed.

    So devices with MAC addresses that start with 40:A3:6B should boot and operate properly when flashed with any firmware from the Onion firmware repo.

    Devices with MAC addresses starting with 88:1E:59 will boot and operate correctly when flashed with firmware b255 and higher.

    I've confirmed this with devices I have on hand:

    Device with 40:A3:6B MAC address running firmware b254:
    Screenshot 2024-10-04 at 3.54.44 PM.png

    Device with 40:A3:6B MAC address running firmware b259:
    Screenshot 2024-10-04 at 3.49.19 PM.png

    Device with 88:1E:59 MAC address running firmware b256:
    Screenshot 2024-10-04 at 3.35.20 PM.png
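The MAC-prefix rule above can be written down as a small lookup, which may help when sorting a mixed batch of devices. This is only a sketch: the table holds the two prefixes and the b255 minimum stated in this post, the function names are made up for illustration, and the build number is passed as a plain integer (255 for b255).

```python
# MAC prefix -> lowest Onion firmware build stated to boot, per this thread.
# None means no minimum was given (any firmware from the Onion firmware repo).
MIN_BUILD_BY_PREFIX = {
    "40:A3:6B": None,
    "88:1E:59": 255,
}


def mac_prefix(mac):
    """Return the first three octets of a MAC address as 'AA:BB:CC'."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def firmware_ok(mac, build):
    """True or False for the prefixes covered above, None for any other prefix."""
    prefix = mac_prefix(mac)
    if prefix not in MIN_BUILD_BY_PREFIX:
        return None
    minimum = MIN_BUILD_BY_PREFIX[prefix]
    return minimum is None or build >= minimum
```

For example, firmware_ok("88:1E:59:00:11:22", 252) flags a device that should not be given b252. A None result only means the prefix is not one of the two discussed here.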


    @mrahul said in OM2+ getting stuck in boot stage with custom firmware using b255:

    However, we only use two methods, i.e. through ethernet or sysupgrade to upgrade or re-flash a device on bricking. How different are these two processes?

    For a device that boots into Linux successfully, upgrading the firmware thru sysupgrade and thru ethernet with the bootloader will have the same effect, as long as sysupgrade is run with the -n option to overwrite the existing configuration and filesystem.

    For a device that cannot boot into Linux, we recommend reflashing the firmware with ethernet thru the bootloader.
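If the sysupgrade step is scripted, the -n option is easy to make the default so it is not forgotten. A rough sketch, assuming Python is installed on the device and the image has already been copied there; the function names and the image path in the usage note are placeholders, not part of any Onion tooling.

```python
import subprocess


def sysupgrade_command(image_path, keep_settings=False):
    """Build the sysupgrade argument list; -n discards the saved configuration."""
    cmd = ["sysupgrade"]
    if not keep_settings:
        cmd.append("-n")
    cmd.append(image_path)
    return cmd


def flash(image_path, keep_settings=False):
    """Run sysupgrade on the device itself. Expect the device to reboot."""
    subprocess.run(sysupgrade_command(image_path, keep_settings), check=True)
```

Calling sysupgrade_command("/tmp/firmware.bin") gives the equivalent of typing sysupgrade -n /tmp/firmware.bin. This only applies to a device that still boots into Linux; otherwise use the bootloader and ethernet as described above.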


    @mrahul said in OM2+ getting stuck in boot stage with custom firmware using b255:

    Lately, we've started seeing the same behavior again with b255 where it gets stuck in the boot process.

    Does this mean the firmware sometimes works and sometimes doesn't?
    This could point to something in the hardware design impacting the boot sequence. Do you use any SPI devices? Do you follow the bootstrapping pins guidelines in your hardware?


    My recommendation

    Since you mentioned you're using the through-hole Omega2+, we can try to isolate where the issue is coming from.

    I recommend doing the following:

    • Remove an Omega2+ device from your custom board
    • Plug it into an Onion Expansion Dock with an Ethernet Expansion
    • Use the bootloader + ethernet to flash firmware b259 from the Onion firmware repo
    • Observe the outcome; given my testing, I expect this device to boot properly
    • Plug the device back into your custom board and attempt to boot it again

    Let me know how it goes!


