@dixit I just noticed this post after answering the other one in the p44-ledchain thread.
Hearing that you had similar problems with the ws2812-draiveris driver as with my p44-ledchain, I conclude that the problem is most likely not the timing but the hardware (missing level shifter?).
The reason is that ws2812-draiveris always meets the timing for the WS28xx: as you might have seen when porting it, it simply locks all interrupts and then bit-bangs the data.
While this is good for meeting the LED chip timing (and simple to code), locking interrupts for dozens of milliseconds is a cardinal sin in a Linux driver. On the original AR9331 CPU (Omega1), where ws2812-draiveris originated, this was the only way to get such SmartLEDs working at all, so ws2812-draiveris is as good as it can get on that chip.
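To put a rough number on "dozens of milliseconds", here is a minimal back-of-the-envelope sketch. It assumes the nominal 800 kHz WS281x bit rate (1.25 µs per bit) and 24 bits per LED; the function name and the chain lengths are just illustrative, and the reset/latch gap at the end of a frame is not included.

```python
# Nominal WS281x bit period at 800 kHz (assumption; exact timing varies by chip variant).
BIT_PERIOD_US = 1.25
# 3 colour channels x 8 bits; RGBW chips would need 32 bits instead.
BITS_PER_LED = 24


def frame_time_ms(num_leds, bits_per_led=BITS_PER_LED, bit_period_us=BIT_PERIOD_US):
    """Time to clock out one full frame in ms, excluding the reset/latch gap."""
    return num_leds * bits_per_led * bit_period_us / 1000.0


# Illustrative chain lengths: a bit-banging driver keeps interrupts
# locked for roughly this long on every single update.
for n in (100, 300, 1000):
    print(f"{n} LEDs: {frame_time_ms(n):.1f} ms per frame")
```

So under these assumptions even a modest chain of a few hundred LEDs blocks everything else on the CPU for several milliseconds per update, and a 1000-LED chain for about 30 ms.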
However, on the MT7688/Omega2 with its PWM units, the p44-ledchain approach, which does not lock interrupts, is the way to go. And it works easily, at least with WS2813 or WS2815, with all 4 PWMs running in parallel and driving 1000 LEDs each.