Omega2S+ overlay JFFS2 filesystem corruption



  • @magge There are 3 scenarios I am aware of that have previously been reported to corrupt the file system.

    1. flaky power supply
    2. exhausted file system capacity
    3. EMI

    2 should be easy to isolate.
    3 perhaps try shielding the Omega2S temporarily and see if this resolves your issue; if so, some redesign may be required.

    A fourth option is a software defect that corrupts the files you're writing to disk, but JFFS2 is pretty robust so I'm not convinced this is the most likely cause.


  • administrators

    @magge what does your Golang program do? Are there a lot of writes/reads to the fs?

    Also, cheers for following the How to Ask for Help post.



  • @crispyoz Thanks again!

    1. I've seen the issue on more than one power supply, but I will check with an oscilloscope to see if anything shows up.

    2. Yes, that was my initial suspicion. However, the FS still gets corrupted for some reason even when I add only a small /etc/init.d/ script (for the case where I run the app from USB storage). df -h still reports free space after I upload my files, but my understanding is that JFFS2 is a compressing FS and that the remaining space is a guesstimate.

    3. EMI. I read another article here on community that shielding was a thing. There is no radio on our device. Stick a tin foil hat on my Omega2S+ and see if it helps? 😆

    I will try to check the power supply and get back to you. There are PWM'ed motors connected to the device, so I could see some noise coming back from them. That being said, I'm almost sure I've seen the FS go bad even without running the motors, i.e. uploading the software, rebooting without even running it, and getting a bad FS.

    If there is noise from motors/supply could that permanently damage the flash so that it's just broken now and that's why "nothing works"?
    Could a damaged flash operate like I have described due to caching smartness or similar; i.e. I upload the file and it looks okay, trouble only starts after a reboot?
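
For the free-space question above, a minimal sketch of a pre-upload check, assuming the BusyBox `df` and `awk` on the device accept the POSIX options used here. The helper names `free_kb` and `fits_with_margin` and the margin figure are hypothetical. Because JFFS2 compresses data, the number `df` reports is only an estimate, so the check compares the uncompressed file size plus a safety margin against it.

```shell
#!/bin/sh
# free_kb PATH
# Prints the free space, in KiB, that df reports for the filesystem
# holding PATH (POSIX output format, so the value is in column 4).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# fits_with_margin DIR FILE MARGIN_KB
# Succeeds only if FILE's uncompressed size plus MARGIN_KB is smaller
# than the free space reported for DIR. MARGIN_KB is an arbitrary
# safety figure chosen by the caller, not a JFFS2 constant.
fits_with_margin() {
    bytes=$(wc -c < "$2")
    need_kb=$(( (bytes + 1023) / 1024 + $3 ))
    [ "$need_kb" -lt "$(free_kb "$1")" ]
}

# Example (paths taken from this thread, margin is a guess):
#   fits_with_margin /overlay /mnt/sda1/isys-3/isysctrl-full 1024 \
#       || echo "not enough room, keep the binary on USB storage"
```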



  • @Lazar-Demin Thanks!

    My app logic only reads a config file, and that's it for FS access. It will then do GPIO to control motor directions and read buttons, I2C to PWM chip for motor speeds, I2C to AD converter chip for current measurements.
    Running the app does not seem to trigger issues from what I can tell. The trouble seems to start after uploading the files and rebooting.
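
One way to double-check that the app really writes nothing to flash is to look for files modified while it was running. This is a sketch only: the helper name `files_changed_since` is hypothetical, `/overlay/upper` is assumed to be where the writable overlay keeps its files on this firmware, and it relies on the device's `find` supporting `-newer`.

```shell
#!/bin/sh
# files_changed_since MARKER DIR
# Lists regular files under DIR whose modification time is newer than
# that of MARKER. Empty output means nothing was written in between.
files_changed_since() {
    find "$2" -type f -newer "$1"
}

# Example session (the marker lives in /tmp, which is RAM-backed on
# OpenWrt, so creating it does not itself touch the flash):
#   touch /tmp/before_app
#   /isys/isysctrl            # run the app for a while, then stop it
#   files_changed_since /tmp/before_app /overlay/upper
```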


  • administrators

    @magge Interesting. If the app only reads a config file then I think it's safe to assume the app itself isn't the problem.

    You should try to confirm this is the case. One way would be to try running the same software + firmware combination on all-Onion hardware, like an Omega2S development kit or an Omega2 on a Dock.

    If the same issue happens, then we can safely assume the program is causing the fs issue.
    If not, we can rule out the software, and you can focus on your circuit/power supply/EMI/etc.



  • @magge If you don't have Omega2 and a dock, you could try just running your custom board but not your app. Then do some file system reads like ls -laR on a loop then see if rebooting results in a corrupted file system.
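
The read-only loop suggested above could look roughly like this. It is a sketch, not a tested procedure for the Omega2: the function name `read_stress` is hypothetical, and the directory and pass count are up to whoever runs it.

```shell
#!/bin/sh
# read_stress DIR PASSES
# Recursively lists DIR the given number of times, discarding the
# listing, and prints how many passes finished without ls failing.
read_stress() {
    dir=$1
    passes=$2
    ok=0
    i=0
    while [ "$i" -lt "$passes" ]; do
        if ls -laR "$dir" > /dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok"
}

# Example: read_stress /etc 100, then check dmesg for jffs2 messages
# and reboot to see whether the file system comes back clean.
```

Pointing it at a directory on the overlay such as /etc, rather than at /, avoids spurious failures from /proc entries that disappear while `ls` is walking them.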



  • Short update.

    Unfortunately I don't have an Omega2P that is known to be good; I only have my custom hardware. Is there a prototyping board I should order when working with this? I tried to google for it but did not get any wiser...

    I have been working on this, trying to be more structured and changing fewer things at once, rebooting often and going through different variations.
    New observations when using the official omega2p-v0.3.3-b256.bin firmware:

    • The binary used above (go build -ldflags "-s -w" and upx --brute, filesize < 2MB) can be transferred to Omega2P and MD5 does not change over reboot. No logs of filesystem corruption.
      However! If the app has been started at least once, the next reboot will log xz decompression failed data is probably corrupt, and on the reboot after that the system is not well. My application fails, other things like opkg update log errors, and there are errors logged by squashfs. It was my understanding that squashfs was read-only on the Omega2P, so I'm not sure how it can fail.
      Recovery seems consistent by going to failsafe and wiping out rootfs_data (for example firstboot or mtd erase rootfs_data).

    • With the alternative binary (no stripping, no upx, filesize 9.4 MB), I start getting JFFS2 warnings (errors?) in the log, like:

      [ 1179.029279] jffs2: warning: (703) jffs2_do_read_inode_internal: no data nodes found for ino #134
      [ 1179.038223] jffs2: iget() failed for ino #134
      

      The MD5 of the app binary changes over reboot (somehow it still worked though!).

      # before reboot
      $ md5sum /isys/isysctrl /mnt/sda1/isys-3/isysctrl-full
      ef7936ff1a4030871fd47687d9306965  /isys/isysctrl
      ef7936ff1a4030871fd47687d9306965  /mnt/sda1/isys-3/isysctrl-full
      
      # after reboot
      $ md5sum /isys/isysctrl /mnt/sda1/isys-3/isysctrl-full
      05300935ae776926d7ce20f145682392  /isys/isysctrl
      ef7936ff1a4030871fd47687d9306965  /mnt/sda1/isys-3/isysctrl-full
      

      /etc/banner got corrupted:

      BusyBox v1.28.3 () built-in shell (ash)
      
      ▒local▒▒JSON_PREFIX=▒1=▒=▒"▒$___val▒"▒eval_json_set_var▒1 .bp▒a_json_set_var1▒btb▒b▒b▒b▒blb,b▒bb$bb_a_value=▒▒2=▒local▒▒JSON_PREFIX=▒1=▒=▒"▒${▒JSON_PREFIX=▒1=} ▒$_a_value▒"▒eval_jroot@Omega-266F:/#
      

      No xz decompression errors logged when rebooting after app has been running.

      JFFS2 seems to be in trouble; for example, opkg fails:

      root@Omega-266F:/# opkg update
      [   86.107495] jffs2: notice: (1925) jffs2_get_inode_nodes: Wrong magic bitmask 0x0000 in node header at 0x220558.
      [   86.118551] jffs2: notice: (1925) jffs2_get_inode_nodes: Wrong magic bitmask 0x0000 in node header at 0x2204fc.
      [   86.129229] jffs2: notice: (1925) jffs2_get_inode_nodes: Wrong magic bitmask 0x0000 in node header at 0x22006c.
      [   86.139536] jffs2: warning: (1925) jffs2_do_read_inode_internal: no data nodes found for ino #113
      [   86.148627] jffs2: iget() failed for ino #113
      [   86.155510] jffs2: warning: (1925) jffs2_get_inode_nodes: Eep. No valid nodes for ino #113.
      [   86.164064] jffs2: warning: (1925) jffs2_do_read_inode_internal: no data nodes found for ino #113
      [   86.173074] jffs2: iget() failed for ino #113
      [   86.178849] jffs2: warning: (1925) jffs2_get_inode_nodes: Eep. No valid nodes for ino #113.
      [   86.187422] jffs2: warning: (1925) jffs2_do_read_inode_internal: no data nodes found for ino #113
      [   86.196445] jffs2: iget() failed for ino #113
      Collected errors:
       * opkg_conf_parse_file: /etc/opkg/distfeeds.conf:1: Ignoring invalid line: `▒@'
      
      
    • The best working option I have found (no stripping, upx --brute, filesize 4MB).
      No errors after transfer.
      No errors first reboot after transfer.
      No xz decompression errors first reboot after app has been running.
      Still got /etc/banner corruption, opkg update still fails.
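
The before/after md5sum comparison in this report can be made repeatable with a small manifest. The following is a sketch under assumptions: the helper names `save_manifest` and `check_manifest` are hypothetical, and the manifest is assumed to be kept on storage other than the JFFS2 overlay (for example the USB stick), since a manifest stored on the overlay could be corrupted along with the files it describes.

```shell
#!/bin/sh
# save_manifest MANIFEST FILE...
# Records the current md5sum of every FILE in MANIFEST.
save_manifest() {
    manifest=$1
    shift
    md5sum "$@" > "$manifest"
}

# check_manifest MANIFEST
# Recomputes each checksum, prints one OK/CHANGED line per file, and
# returns non-zero if any file differs from its recorded checksum.
check_manifest() {
    status=0
    while read -r expected path; do
        actual=$(md5sum "$path" 2>/dev/null | cut -d' ' -f1)
        if [ "$actual" = "$expected" ]; then
            echo "OK      $path"
        else
            echo "CHANGED $path"
            status=1
        fi
    done < "$1"
    return "$status"
}

# Example, using paths from this thread:
#   save_manifest /mnt/sda1/manifest.md5 /isys/isysctrl /etc/banner
#   reboot
#   check_manifest /mnt/sda1/manifest.md5
```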



  • @magge If you don't use upx do you have the same issue?



  • @crispyoz Thanks for responding again!
    That should be the 2nd bullet above. There I get a bunch of JFFS2 errors, so I'm not sure if maybe the file is just too big without upx; it's about 9.4 MB in that case.
    It looks to me like the 3rd bullet is slightly better (there the binary is not stripped but is compressed with upx).



  • @magge Sorry, missed that. We really need more information on your software's interaction with the file system. JFFS2 is a very robust system; Omega2 file system corruption such as you are describing is not something I have ever seen outside of a hardware design issue, and I don't recall any reports similar to your experience. My guess is that there is something in your code causing the issue and you've simply been lucky not to have experienced the issue on other hardware (I recall you mentioned your software was on another device).

    To remove hardware from the equation, I'd install your software on a stock power dock with Omega2+. Since we don't know what additional hardware your custom PCB includes, or exactly how your software interacts with the device, I can only offer an opinion on how I would proceed if faced with a similar issue.

    Honestly, I don't think the issue is the Omega2; your issue is simply too reproducible not to have been experienced by those of us who have had 100s or 1000s of Omega2 in production for years. So we're here to assist you in working through the issue until the light bulb moment arrives 🙂


  • administrators

    @magge Thanks for sharing your detailed testing report.

    I agree with @crispyoz's suggestion to try running your software on stock Onion hardware - this way we can rule out circuit issues.

    @magge said in Omega2S+ overlay JFFS2 filesystem corruption:

    Is there a prototyping board I should order when working with this? I tried to google for it but did not get any wiser...

    There's an Omega2S development board you can use. Or if you would prefer, you can use an Omega2+ and an Expansion Dock.
    Our distributors have all 3 products in stock and ready to ship.


