Firmware Boot Failure on Onion Omega2+: Steady Orange LED, Stuck Bootloader Mode and Missing Hostname
-
@crispyoz thank you for looking into my issue.
Log Details:
The logs provided above are from the same Onion Omega2+ device, captured before and after it got stuck. The working log corresponds to the device's state before the issue occurred, while the non-working log was generated after it got stuck.
Dock Information:
I am not using an expansion dock. Instead, I am using a custom PCB, which has been thoroughly tested and verified for compatibility with the Omega2+. Both the non-working and previously working Omega2+ units were tested on this same custom PCB setup.
Apart from this, the Omega2+ is monitored using Docklight. I am using the same Docklight v1.6 setup for both the working and non-working states of the Omega2+ unit.
Reset Button Attempt:
I have tried holding the reset button while powering on the Omega2+, keeping it pressed throughout the startup process for 10-15 seconds, and repeatedly pressing the Enter/Space keys to exit boot mode. Unfortunately, this did not resolve the issue. After releasing the reset button, the orange LED remains steadily ON, and the device fails to move beyond bootloader mode.
Power Source:
I have verified that the power source is stable. The same power source is used for both the working and non-working states of the Omega2+ unit, ruling out power-related concerns.
Testing Another Omega2+:
To investigate further, I tested a fresh, new Omega2+ unit using the same custom PCB and power source. Initially, the new unit worked fine. However, after 7–8 days, it exhibited the same behavior as the previous device, with identical logs indicating it is now stuck in bootloader mode.
Dock and Power Source Consistency:
The same custom PCB and power source were used to test the Omega2+ unit both before and after it got stuck. This confirms that the setup is consistent across both states.
-
@mayur_ingle A custom PCB makes it trickier. Just so I am clear, you are using the Omega2+ (through-hole version), not the Omega2S+ (surface mount)? Do you have a power dock and an ethernet expansion?
If the device can't boot, how did you carry out the step you described as "Verified the /etc/rc.local configuration for any errors"? I assume you mean you checked it in your version control system, not on the device directly.
I have a few thoughts on possible causes:
1) Have you checked your design against the Hardware Design Guide? I read that your design is well tested already, but I had a similar issue a few years ago where a pin should have been pulled/pushed (I forget) and had accidentally been so, but every once in a while my devices wouldn't boot or started to boot then stopped. I think this is less likely in your case, but worth checking.
2) My guess is the RAM is getting corrupted by something, because we can see the correct relocation address, it reads the environment, then stops. I recall a similar issue on another custom PCB; the Onion guys looked into it, and adding some shielding resolved the issue.
3) You mentioned you are running some scripts. It is possible to corrupt your file system programmatically. Can you provide more detail on what the script(s) are doing?
If you are using an Omega2+ (through hole) my next step would be to insert it into a standard dock and view the boot process using minicom or some other terminal software. I looked at DockLight but haven't used it, but a raw terminal would remove any potential issues of handshaking or such causing the issue.
You can access the flash by removing the cover and accessing the chip directly (if you're adventurous). @luz provided some instructions on this; check his posts for more details.
-
-
@crispyoz Thank you for your detailed response and suggestions.
1) Omega2+ Version Confirmation:
I am using the Omega2+ (through-hole version), not the Omega2S+ (surface-mount).
2) RAM Corruption and Shielding:
Your observations about potential RAM corruption due to inadequate shielding (your points 2 and 3) are insightful. I will investigate the custom PCB design to identify any potential interference or shielding gaps that might be causing the problem.
3) Script Behavior and Potential File System Corruption:
I suspect the scripts might also be contributing to the issue. For clarity:
The main_app.py script captures Modbus data packets (32 packets of 105 bytes each) and appends them to an Excel file every 2 seconds. It then publishes a JSON string (~3147 bytes) via MQTT every 5 minutes. It handles reading Excel parameters, connecting to MQTT, and clearing processed data.
The check_system_status.sh script monitors the system and triggers a reboot if main_app.py encounters issues like corruption or logging failure.
Important Observation:
The last log observed before the device got stuck in bootloader mode occurred immediately after executing the check_system_status.sh script, which runs only if main_app.py is corrupted or fails to log data.
The automatic reboot is issued from that same .sh script, and the device has been stuck ever since.
Recovery for Stuck Device:
Could you provide detailed steps to recover a device stuck in bootloader mode, given the observed behavior below?
Observed Behavior:
The device fails to boot correctly, getting stuck in bootloader mode with a steady orange LED and no hostname availability.
Only a partial boot log is available after resetting.
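For a sense of scale, the write pattern described in point 3 can be worked out as below. This is a back-of-the-envelope sketch that counts raw payload bytes only; it makes no assumption about the Excel file format or about how the flash filesystem batches writes, so the real flash traffic is likely higher.

```python
# Lower-bound estimate of the data main_app.py appends, using the figures
# quoted in the post. Payload bytes only; file-format overhead is ignored.
PACKETS_PER_WRITE = 32      # from the post
BYTES_PER_PACKET = 105      # from the post
WRITE_INTERVAL_S = 2        # from the post
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_write = PACKETS_PER_WRITE * BYTES_PER_PACKET
writes_per_day = SECONDS_PER_DAY // WRITE_INTERVAL_S
bytes_per_day = bytes_per_write * writes_per_day

print(f"{bytes_per_write} bytes per write")
print(f"{writes_per_day} writes per day")
print(f"{bytes_per_day / 1e6:.1f} MB of payload per day")
```

That is 43,200 separate write operations per day, which is the kind of sustained load the later replies identify as a risk for the on-board flash.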
-
@mayur_ingle this is a great bug report, very detailed!
Since you're using the through-hole Omega2+, I agree with @crispyoz's suggestion to try out your stuck devices on a standard dock, just to rule out any hardware issues.
IMO the issue is more likely to be file system corruption than RAM corruption.
Avoiding File System Corruption
@mayur_ingle said in Firmware Boot Failure on Onion Omega2+: Steady Orange LED, Stuck Bootloader Mode and Missing Hostname:
The main_app.py script captures Modbus data packets (32 packets of 105 bytes each) and appends them to an Excel file every 2 seconds.
I responded to your colleague on a GitHub Issue but I will post this here for visibility:
A few other users have reported file system instability when programs are running that frequently write to the flash storage. To get around this, we recommend moving any file writes to the /tmp directory (as this is actually in RAM, not on the flash).
In this case, data that should persist indefinitely should be copied over from /tmp to the flash filesystem (anything else on /) at some longer interval, perhaps daily. Cron is a solid tool for this copy job.
Recovering Stuck Devices
How many stuck devices do you currently have?
I'd like to confirm if the bootloader can be accessed on a stuck device.
On a working device, the bootloader menu can be enabled by powering on the device while holding the FW_RST pin (GPIO38) active. This reset pin is active-high, and this is the pin used by the reset button on the Omega2 Docks.
Keep in mind pressing the Enter or Space keys will not activate the bootloader menu.
Please try this first on a working device, and then try it on a "stuck" device. Report back how it goes.
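The /tmp buffering approach recommended above could be sketched roughly as follows. This is a minimal illustration, not Onion's tooling: the function name and both directory arguments are hypothetical, and it assumes the application writes its working files into a directory under /tmp and that a scheduled job (for example from cron) calls this at a long interval.

```python
import os
import shutil
import tempfile


def flush_to_flash(buffer_dir, persist_dir):
    """Copy every regular file in buffer_dir to persist_dir.

    Returns the list of file names that were copied.
    """
    os.makedirs(persist_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(buffer_dir)):
        src = os.path.join(buffer_dir, name)
        if not os.path.isfile(src):
            continue  # skip sub-directories and special files
        # Write under a temporary name first, then rename, so that a reset
        # mid-copy does not leave a half-written file under the final name.
        fd, tmp_path = tempfile.mkstemp(dir=persist_dir, prefix=name + ".")
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        os.replace(tmp_path, os.path.join(persist_dir, name))
        copied.append(name)
    return copied
```

The point of the design is that flash sees one write per file per flush interval instead of one write every couple of seconds; anything still in /tmp is lost on power failure, so the interval is a trade-off between flash wear and how much data you can afford to lose.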
-
A few other users have reported file system instability when programs are running that frequently write to the flash storage
I can echo @Lazar-Demin's observation re regular flash writes leading to file system issues. I have a custom PCB based on the Omega2S+ that maintains a sqlite3 database of network traffic on a specific set of ports; it then pushes counters via MQTT to our central server every few seconds. The upshot is that it is writing to flash sometimes every second or more. After about 3 months of normal use I started to see devices failing with file system issues. JFFS2 was able to fix many of the issues on restart and the sqlite3 db could be rebuilt to largely recover the historical data, but it was not a good long-term solution.
I added an SD card to my design and all my problems went away. The only extra is the hardware cost: I use Kingston 16GB SDHC U1 C10 cards, which you can buy for a few shekels each. A symlink or mountpoint negates the need to modify any software.
-
@Lazar-Demin Thank you for your detailed response and suggestions.
Updates and Observations
Setup: Using a custom PCB (not an expansion dock).
1) File Writes to /tmp
Following your suggestion, I have updated the setup to move all file writes to the /tmp directory. This is intended to prevent flash wear and file system corruption.
2) Recovery Attempts on Devices
I currently have three stuck devices.
a) On a working device:
I followed the recommended steps: powering on while holding the FW_RST pin (GPIO38) active.
Observation: This process erases the existing firmware on the working device, resets it, and allows me to re-upload the firmware. After this, folders become accessible, and the hostname is visible.
I did not see a bootloader menu during this process; it directly erased the firmware and enabled reconfiguration. In short, working fine.
b) On the stuck devices:
I performed the same steps as with the working device, but there was no change. The stuck devices remain in their frozen state, displaying the same log output as shared earlier:
Board: Onion Omega2 APSoC
DRAM: 128 MB
relocate_code Pointer at: 87f60000
flash manufacture id: c2, device id 20 19
find flash: MX25L25635E
*** Warning - bad CRC, using default environment
Current Status:
Following your suggestion, I updated the file writes to /tmp and the device is under observation for 8 to 10 days (as my old device got stuck after 8 days).
Do you have any further suggestions for recovering the stuck devices, given that the bootloader menu doesn't appear to be accessible? Could there be an underlying hardware issue contributing to this behavior?
I look forward to your advice.
-
@crispyoz Thank you for sharing your experience.
I’m using the Omega2+ (through-hole version) on a custom PCB, which already has an SD card slot below it.
**Do I need to configure anything on the software side for the SD card, or can I simply insert the SD card and start using it? If any configuration steps are required, could you kindly share them with me? Your guidance would be greatly appreciated.**
-
@mayur_ingle I can see you're using a standard Onion release of the firmware so the SD Card requirements are pre-installed. You can insert an SD card into the slot and you'll see in the log something like this:
[63130.024501] mmc0: new high speed SDHC card at address 59b4
[63130.041132] mmcblk0: mmc0:59b4 SD16G 14.6 GiB
[63130.048468] mmcblk0: p1 p2
So we can see the device is mmcblk0 (the first device) and it has two partitions, p1 and p2. You set it to automount using these commands:
uci set fstab.@global[0].auto_mount='1'
uci commit fstab
Re-insert the card and you should see the card is mounted automagically. Use the mount command to see where it was mounted:
/dev/mmcblk0p2 on /mnt/mmcblk0p2 type ext4 (rw,relatime)
/dev/mmcblk0p1 on /mnt/mmcblk0p1 type vfat (rw,relatime,fmask=0000,dmask=0000,allow_utime=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
The card I used above is from a Raspberry Pi 4B, so it has two partitions; you will probably only see /dev/mmcblk0p1. You can see it is mounted at /mnt/mmcblk0p1, and you can view its contents using the command:
ls -la /mnt/mmcblk0p1
To mount the card at my preferred location of /etc/myappname/data, we can add an entry to fstab.
First we need the UUID assigned to the SD card device, which we get using the command:
block info
The output will be something like this:
/dev/mtdblock5: UUID="188c96f5-f6939c36-1805def7-637d9f6c" VERSION="4.0" MOUNT="/rom" TYPE="squashfs"
/dev/mtdblock6: MOUNT="/overlay" TYPE="jffs2"
/dev/mtdblock7: MOUNT="/mnt/mtdblock7" TYPE="jffs2"
/dev/mmcblk0p1: UUID="3537-3964" LABEL="NO NAME" VERSION="FAT32" MOUNT="/mnt/mmcblk0p1" TYPE="vfat"
You can see the UUID of the mmc device at the bottom is "3537-3964" and the card is formatted as FAT32. Now I can add the default mount point using the following commands:
uci add fstab mount
# @mount[-1] refers to the section just added, even if other mount sections already exist
uci set fstab.@mount[-1].uuid='3537-3964'   # UUID you found above
uci set fstab.@mount[-1].target='/etc/myappname/data'
uci set fstab.@mount[-1].enabled='1'
uci commit fstab
Remove the card then reinsert it, you should now be able to see the card is mounted at /etc/myappname/data
You can point your database or script output to this directory and everything will be written to your SD Card.
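If the card is missing or fails to mount, anything written to /etc/myappname/data lands on the internal flash again, which is what this setup is meant to avoid. A script can guard against that before it opens its output files. The following is a minimal sketch; the function name and the fallback directory are hypothetical, and only the mount target comes from the steps above.

```python
import os

SD_DATA_DIR = "/etc/myappname/data"   # mount target configured above
FALLBACK_DIR = "/tmp/myappname"       # hypothetical RAM-backed fallback


def choose_data_dir(sd_dir=SD_DATA_DIR, fallback_dir=FALLBACK_DIR):
    """Return sd_dir if a filesystem is mounted there, else fallback_dir."""
    if os.path.ismount(sd_dir):
        return sd_dir
    # Card not mounted: use the fallback rather than writing to flash.
    os.makedirs(fallback_dir, exist_ok=True)
    return fallback_dir
```

Calling choose_data_dir() once at startup and building all output paths from its result keeps the rest of the script unchanged.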
-
@mayur_ingle said in Firmware Boot Failure on Onion Omega2+: Steady Orange LED, Stuck Bootloader Mode and Missing Hostname:
a) On a working device:
I followed the recommended steps: powering on while holding the FW_RST pin (GPIO38) active.
Observation: This process erases the existing firmware on the working device, resets it, and allows me to re-upload the firmware. After this, folders become accessible, and the hostname is visible.
I did not see a bootloader menu during this process; it directly erased the firmware and enabled reconfiguration. In short working fine
This is not expected.
Can you elaborate on your observations? What do you mean by it erases the existing firmware on the device? What steps did you take to make this happen? Can you post a log of the terminal?
Expected behaviour
If you have Omega2 devices manufactured in the last ~7 years, you should see a bootloader menu if the device is powered on with the FW_RST active:
You then need to select an option from the menu. See the Firmware Flashing With Web Recovery Mode docs article for the full process.
@mayur_ingle said in Firmware Boot Failure on Onion Omega2+: Steady Orange LED, Stuck Bootloader Mode and Missing Hostname:
Following your suggestion, I updated the file writes to /tmp and the device is under observation for 8 to 10 days (as my old device got stuck after 8 days).
Great! I suspect this will resolve the issue. Let us know how it goes!
-
@Lazar-Demin After retesting on a working device, I now get the menu as expected when following the steps you mentioned:
log as below
b) On the stuck devices:
On the stuck devices there is no change; I see the same logs as posted earlier for the non-working device.
I performed the same steps (multiple times) on the stuck devices. They remain in their frozen state, displaying the same log output as shared earlier for the non-working device.
Observed Behavior
The device fails to boot correctly, getting stuck in bootloader mode with a steady orange LED and no hostname availability.
Only a partial boot log is available after following the mentioned steps. Log as below:
Board: Onion Omega2 APSoC
DRAM: 128 MB
relocate_code Pointer at: 87f60000
flash manufacture id: c2, device id 20 19
find flash: MX25L25635E
*** Warning - bad CRC, using default environment
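When several units need checking, a captured serial log can be tested for this pattern automatically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that a unit which gets past this point prints at least one further line after the environment warning, which matches the earlier observation that the stuck units read the environment and then stop.

```python
ENV_WARNING = "bad CRC, using default environment"


def stops_after_env_warning(log_text):
    """True if the last non-blank line of the log is the environment warning."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    return bool(lines) and ENV_WARNING in lines[-1]


# The partial log from the stuck devices, as posted above.
stuck_log = """Board: Onion Omega2 APSoC
DRAM: 128 MB
relocate_code Pointer at: 87f60000
flash manufacture id: c2, device id 20 19
find flash: MX25L25635E
*** Warning - bad CRC, using default environment
"""

print(stops_after_env_warning(stuck_log))
```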
-
@mayur_ingle ok, glad to hear you're now seeing the expected behaviour with the working devices.
The situation is a little unusual with the stuck devices. I didn't expect the bootloader to be impacted by the file system issue.
I agree with what @crispyoz said above:
If you are using an Omega2+ (through hole) my next step would be to insert it into a standard dock and view the boot process using minicom or some other terminal software. I looked at DockLight but haven't used it, but a raw terminal would remove any potential issues of handshaking or such causing the issue.
For the stuck devices your next step should be trying them on a standard Dock from Onion, and using a simple terminal program like screen, minicom, or putty.
Otherwise, these 3 devices might be write-offs. You can try to recover them by using an external device to rewrite the flash but we (Onion) don't recommend this procedure as a lot can go wrong.