URGENT: Pixhawk Gyro Drift causing crashes in flight

Oliver Volkmann

Apr 23, 2015, 11:57:16 PM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, Micro Aerial Projects L.L.C.
Dear Developers, 

Chris Anderson sent me here to this group to share with you an issue that we feel is worth escalating since we have had similar crashes with 3 Pixhawks so far. Please see the information that I sent to the help department at 3DR, Brandon Basso and Chris Anderson below. The flight logs can be found here: https://drive.google.com/folderview?id=0B62edZ4l5_rOfkxrZlFzM0F1TElWd0xWd1NEVFVycTVNTVBienpqakNDcFVURWhYbzhwNTA&usp=sharing 

On March 15th of this year we had a very strange experience where, during an Auto mission, our Pixhawk (FW3.1.5) based SteadiDrone QU4D X started drifting off its flight line, increasing in roll angle and speed. We tried to intervene but could not save the vehicle from crashing. We have until now not found out why this happened, even with help from Santiago. I discussed this issue with Andreas Breitenstein (a colleague of mine), who had a similar experience in Namibia on the 13th of March with a completely different drone equipped with a Pixhawk and FW3.1.5. During his Auto mission, the drone did a very similar thing: it deviated from the flight plan and inevitably crashed.

Yesterday we were testing some of the new features on another drone with a Pixhawk and FW3.2.1 and noticed that as soon as we put this vehicle down and powered it up, the horizon in Mission Planner's HUD would drift constantly in the roll axis. At first I thought that, since I had just put this drone together, perhaps the accelerometer calibration had not been done properly, so I repeated it in the field. We then flew 3 very nice flights in Auto and tested some of the features. At the end of the 3rd flight, we landed and immediately noticed on the HUD that the horizon had drifted to about 45 degrees and continued to drift. We decided to see whether the pre-arm checks worked and tried to arm the unit, which succeeded even with the horizon at 45 degrees. Andreas then attempted to take off to see if it was possible and the unit did fly, although, as expected, it flew sideways and hit the dirt about 3 feet away.

 We then re-started the drone and the horizon went back to normal. I enabled EKF as we have never flown with that and believed it to have far more safety features and that it would produce a far more stable flight. It was quite impressive and I flew the drone around quite a bit in POS-HOLD mode. As we stood there with the drone in POS-HOLD waiting for the battery failsafe to kick in, the drone started rolling to its right and flying away by itself even while I tried to roll left. It continued to fly away at which point I put it in Stabilize to try and prevent it from flying too far away however even Stabilize mode did not seem to help.

 We have looked at the log files of all three flights, which you will find attached to this email, and have discovered that at the point at which each of these 3 flights went bad, there was a deviation between the two IMUs' gyros in either X (X1 vs. X2) or Y (Y1 vs. Y2). We are now very concerned because we have a lot of drones out with our customers that have Pixhawks in them, and this issue seems to happen randomly. The three drones (multi-rotors) that crashed all had different firmware, were flown in different locations and used different hardware/electronic components. The only things they share are that they all had Pixhawks on board and that they all experienced this strange fly-away behavior, which could not be stopped even by switching to Stabilize.
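For readers who want to look for the same kind of gyro disagreement in their own logs, here is a minimal sketch. It assumes the gyro values of both IMUs for one axis have already been extracted from the log into two equally sampled lists in rad/s; the function name, the threshold and the sample data are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def find_gyro_divergence(
    gyro1: List[float],
    gyro2: List[float],
    threshold: float = 0.05,   # hypothetical limit in rad/s
    min_samples: int = 3,      # ignore very short spikes
) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges where |gyro1 - gyro2| stays above threshold."""
    ranges = []
    start = None
    n = min(len(gyro1), len(gyro2))
    for i in range(n):
        diverged = abs(gyro1[i] - gyro2[i]) > threshold
        if diverged and start is None:
            start = i                      # a divergent run begins
        elif not diverged and start is not None:
            if i - start >= min_samples:   # keep only runs that last long enough
                ranges.append((start, i - 1))
            start = None
    if start is not None and n - start >= min_samples:
        ranges.append((start, n - 1))      # run continues to the end of the data
    return ranges


# Made-up X-axis samples: the second gyro starts drifting at index 3.
imu1_gyr_x = [0.00, 0.01, 0.00, 0.01, 0.02, 0.01, 0.00, 0.01]
imu2_gyr_x = [0.00, 0.00, 0.01, 0.10, 0.15, 0.20, 0.25, 0.30]
print(find_gyro_divergence(imu1_gyr_x, imu2_gyr_x))  # [(3, 7)]
```

A disagreement found this way only says that the two sensors differ; it does not say which one is wrong or why.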

Could you please take a look at these logs and let us know what is going on? We are pretty confident that this is not a firmware issue nor a stability algorithm issue as the firmware versions vary and we have had this experience with EKF on as well. We are at a loss as to what to do now other than test each and every Pixhawk that we (and our various clients) have on a crash test drone as this issue happens randomly. Sometimes it happens when it is powered via USB on the desk, sometimes before take-off and sometimes during the flights which result in crashes. We don’t know how many flights to do to replicate the issue but will keep the logs of all of the upcoming tests and send them to you as we have them.

 If there are any tests or if there is any other information that you may need such as serial numbers please do not hesitate to let us know. Perhaps these pixhawks are faulty and could be associated with a batch production in which case it may be easier to determine which pixhawks suffer from this strange behavior.

 

Thank you in advance for your prompt attention to this matter and we look forward to hearing back from you soon.

Regards,

Oliver Volkmann

Micro Aerial Projects L.L.C.

Randy Mackay

Apr 24, 2015, 2:40:59 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, Micro Aerial Projects L.L.C.

Oliver,

 

     So this is what we call “the leans” when it occurs in the roll-pitch.  It can be caused by:

1.       very bad vibration which leads to inaccurate accelerometer values which leads to the attitude not being corrected properly by the accelerometer values.

2.       Inaccurate gyro bias which can be caused by:

a)      Vehicle being jostled during the calibration at startup or during the first arming

b)      A bug/issue in AC3.1.5 (and earlier versions) in which we did not reset the gyro bias estimate when we redid a gyro calibration.  The effect was most obvious if the vehicle was left for a few minutes between when the battery was plugged in and the first arming/take-off.  During these few minutes DCM/EKF would learn the gyro biases, but those biases would shift to zero as part of the first arming calibration, meaning DCM/EKF would be left with incorrectly learned biases for another few minutes (until it “unlearned” them).  Normally we only saw this issue in yaw but it could happen in roll/pitch.

c)       Large temperature variations between the calibration and the rest of the flight.

 

     From the logs #1 is not the cause because the vibes appear pretty low so I think it’s 2a, 2b or 2c.

 

     In the AC3.2.1 log it appears to me that it switched automatically from EKF to DCM right at the end of the flight.  DCM’s roll estimate was off by 20deg so that switch wasn’t good although the EKF only gives up and hands back control to DCM if it’s having serious problems.  I’ve asked Paul Riseborough if he can take a look.

 

-Randy


Randy Mackay

Apr 24, 2015, 10:03:25 AM
to and...@aeromapix.com, drones-...@googlegroups.com, ch...@3drobotics.com, Micro Aerial Projects L.L.C.

Andreas,

 

     Paul Riseborough (EKF expert) had a look at the logs and determined that, in fact, it’s #1 (vibration).  The high frequency vibration doesn’t show up well in the logs so it can only be seen indirectly, which is why I missed it.  The EKF is happy until the end, but then it develops a bad Z axis accel bias and becomes so unhappy with the data consistency that it flags itself as unhealthy and hands control to DCM .. which sadly was in even worse shape.

 

     With a lot of vibration we get accelerometer clipping (where the accelerations go outside the 8G accelerometer range) and aliasing.  When this happens DCM/EKF get an incorrect angle from the accels and then can’t correct the attitude estimate properly and we get “the leans”.  Sometimes the clipping/aliasing appears only when the mounting/airframe/motors hit the right resonance, which is why it can come and go.  For example a particular throttle speed can be a factor.
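To make the clipping idea concrete, here is a small sketch that counts how many acceleration samples would fall outside a sensor's full-scale range. The sample values are invented, the function name is hypothetical, and the samples stand for the true acceleration on one axis in m/s^2 (a real sensor would simply saturate at its limit instead of reporting the larger value).

```python
from typing import Iterable

G = 9.80665  # standard gravity in m/s^2


def count_clipped(samples_ms2: Iterable[float], range_g: float) -> int:
    """Count samples whose magnitude reaches or exceeds the full-scale range."""
    limit = range_g * G
    return sum(1 for a in samples_ms2 if abs(a) >= limit)


# Made-up Z-axis accelerations during a strong vibration burst.
accel_z = [-9.8, -30.0, 78.5, -78.5, 40.0, -120.0, 95.0]

print(count_clipped(accel_z, range_g=8))   # 4 samples exceed the 8G range
print(count_clipped(accel_z, range_g=16))  # 0 samples exceed a 16G range
```

The same burst that clips four times on an 8G range does not clip at all on a 16G range, which is the reasoning behind widening the range. It does nothing about aliasing, which is a separate effect.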

    

     There are two things that could make it better:

1.       Revisit the vehicle’s vibration isolation.  We recommend:

·         3M vibration isolating foam sold by 3dr: https://store.3drobotics.com/products/pixhawk-foam

·         Paul says the pink vibration foam is as good or better than the 3M foam but it requires double sided tape as well

2.       Upgrade to AC3.3 when it’s released (currently in beta testing – you can try it now if you want).  We think this version will be more resilient to vibration because we’ve increased the accel range from 8G to 16G, which will reduce the chance of clipping.  We’ve also reworked the filtering (I’m not qualified to explain the new scheme but Paul or Leonard could).

 

     On the code side there are two things that we could do to help you and other people facing this problem:

·         Add in-flight vibration monitoring (i.e. look for clipping) so that we can alert the pilot that vibes are getting bad hopefully before it affects flight.

·         Stop the EKF from handing control to DCM unless DCM reports it is happy.  In your AC3.2.1 crash log DCM was not happy (i.e. ATT message’s ErrRP field was 0.8 which is bad, 0.1 is good).
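The second idea above (only fall back to DCM if DCM itself looks healthy) can be sketched roughly as follows. This is not ArduPilot code: the class, the function and especially the 0.3 threshold are hypothetical, since the thread only states that an ErrRP of 0.1 is good and 0.8 is bad.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical threshold: the thread only says 0.1 is good and 0.8 is bad.
DCM_ERR_RP_LIMIT = 0.3


@dataclass
class EstimatorStatus:
    ekf_healthy: bool   # the EKF's own health flag
    dcm_err_rp: float   # DCM roll/pitch error (the ATT message's ErrRP field)


def choose_attitude_source(status: EstimatorStatus) -> Tuple[str, bool]:
    """Return (attitude_source, alert_pilot)."""
    if status.ekf_healthy:
        return "EKF", False
    if status.dcm_err_rp <= DCM_ERR_RP_LIMIT:
        # EKF is unhappy and DCM looks trustworthy: hand over, but tell the pilot.
        return "DCM", True
    # Both are unhappy: do not hand over to an estimator in worse shape.
    return "EKF", True


# Values like those in the AC3.2.1 crash log: EKF unhealthy, ErrRP = 0.8.
print(choose_attitude_source(EstimatorStatus(ekf_healthy=False, dcm_err_rp=0.8)))
```

With the crash-log values this sketch stays on the EKF and raises an alert instead of switching to a DCM estimate that was already off.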

 

     So this info from Paul changes things and probably some of your concerns but here’s some answers to your questions in any case:

·         The vibration issue isn’t related (as far as we know) to whether you’re using an APM or Pixhawk.  Theoretically the Pixhawk should be better because it has two IMUs that operate at different frequencies meaning it should be less susceptible to aliasing.

·         Re 2c, this failure is a bit theoretical.  We know temperature affects the gyro drift, but in the thousands of logs we’ve looked at we have not identified a case of this causing a crash.

·         Both DCM and EKF can learn the gyro drift at up to 1deg/sec per minute so generally when we’ve seen problems caused by the gyros, it’s been early in the flight (within the first few minutes) before DCM/EKF have had time to learn the offsets.
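As a rough worked example of the learning rate mentioned above: if the bias estimate can change by at most 1 deg/sec per minute, a bias error of a given size needs at least that many minutes to be learned. The function below is only this arithmetic; its name and the example value are hypothetical.

```python
def minutes_to_learn_bias(bias_error_deg_s: float,
                          learn_rate_deg_s_per_min: float = 1.0) -> float:
    """Lower bound on the time needed to learn a gyro bias error of this size."""
    return abs(bias_error_deg_s) / learn_rate_deg_s_per_min


# A made-up bias error of 2.5 deg/s needs at least 2.5 minutes at 1 deg/s per minute.
print(minutes_to_learn_bias(2.5))  # 2.5
```

This is why gyro-related problems tend to show up in the first few minutes of a flight rather than later.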

 

     So, I think in the short-term looking at the vibration isolation is the best move, then move to AC3.3 when it’s released.

 

     Sorry for your troubles, hope this helps some.

 

-Randy

 

From: Andreas Breitenstein [mailto:and...@aeromapix.com]
Sent: 24-Apr-15 6:15 PM
To: 'Randy Mackay'; drones-...@googlegroups.com
Cc: ch...@3drobotics.com; 'Micro Aerial Projects L.L.C.'
Subject: RE: [drones-discuss] URGENT: Pixhawk Gyro Drift causing crashes in flight

 

Hi Randy,

 

All this is a bit disturbing especially as this is not really documented anywhere and that this can apparently happen during flight at any time.

 

I have some questions arising from point 2(c).

What do you suggest one can do about this?

 

Also, have you got any idea whether this is also a problem in the ArduPlane code?

We often fly for long periods of time (up to 2 hours) and this lean would be a disaster if the plane or hardware heat up or more than likely cool down during flight.

 

Is this fault present on the Pixhawk only? I have some copters on APM and they have been flawless for the past 2-3 years now.

 

Thank you for your prompt response.

 

kind regards  

  

 

Andreas Breitenstein

AEROmapix UAV services

www.aeromapix.com

P. O. Box 90725

Windhoek

Tel: 0811294050



Radek Voltr

Apr 26, 2015, 1:49:18 PM
to drones-...@googlegroups.com, wal...@unirove.com, and...@aeromapix.com, ch...@3drobotics.com
Is there a way to check for high-frequency vibrations before I turn EKF on?

Andy Piper

Apr 26, 2015, 3:18:38 PM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com


On Friday, 24 April 2015 15:03:25 UTC+1, Randy Mackay wrote:
·         Add in-flight vibration monitoring (i.e. look for clipping) so that we can alert the pilot that vibes are getting bad hopefully before it affects flight.

+1000

I'm fairly new to all of this, but clearly a bad compass messes you up and bad vibrations or a bad IMU mess you up - both potentially disastrously. Any in-flight diagnosis is a huge step in the right direction. When I look through the code I see a lot of "if this thing is bad do this other thing to prevent disaster", which is good from a defensive programming point of view but has two bad consequences:

Andy Piper

Apr 26, 2015, 3:24:04 PM
to drones-...@googlegroups.com, and...@aeromapix.com, wal...@unirove.com, ch...@3drobotics.com
[I accidentally hit POST!]

1. It masks hardware failure. Most people want to know about hardware failure so that they can beat up the supplier - not crashing with hardware failure is good, not knowing is worse IMO.
2. The transition between alternatives can be confusing or lethal, especially if the alternative is only less-bad. Often I would prefer to know that this happened and immediately land, rather than cope and carry on.

[FWIW The roll right fly-away has happened to me and others and my issue is compass rather than vibes.]

andy

Randy Mackay

Apr 26, 2015, 8:49:16 PM
to and...@aeromapix.com, drones-...@googlegroups.com, ch...@3drobotics.com, Micro Aerial Projects L.L.C.

Andreas,

 

      Ok, the bench test issue you’re seeing is related to a gyro calibration and/or gyro drift (could be 2a or 2b) plus the way we change the weighting of accels vs gyros when armed vs disarmed.  So the HUD rolling as soon as the vehicle is armed can happen because while disarmed (and landed) we give higher priority to the accelerometers than we do during flight.  While on the ground the correction from the accels can compensate for the bad gyro drift/calibration, but once the vehicle is armed and we weaken the accel correction, the gyro drift causes the attitude estimate to rotate away from reality.

 

     We do this change in accel/gyro weighting (only in copter) because during flight the accelerometers are more susceptible to vibration than the gyros but while on the ground (during startup or brief periods of being landed) we want to correct the HUD back to reality as quickly as possible.
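The effect of the armed/disarmed weighting can be illustrated with a toy complementary filter. This is a sketch under simplifying assumptions, not the DCM or EKF code: the vehicle sits level and still, the gyro reports only a constant bias, the accelerometer always reports zero roll, and both gain values are invented.

```python
def simulate_roll_estimate(gyro_bias_deg_s: float,
                           accel_gain: float,
                           seconds: float,
                           dt: float = 0.01) -> float:
    """Estimated roll (deg) after `seconds` for a vehicle that is actually level."""
    roll = 0.0
    steps = int(round(seconds / dt))
    for _ in range(steps):
        roll += gyro_bias_deg_s * dt            # integrate the biased gyro
        roll += accel_gain * (0.0 - roll) * dt  # pull toward the accel-derived roll (0 deg)
    return roll


bias = 1.0  # deg/s, made-up uncorrected gyro bias

# Strong accel correction (as when disarmed): the error stays small.
print(simulate_roll_estimate(bias, accel_gain=2.0, seconds=60.0))

# Weak accel correction (as when armed): the same bias rolls the estimate far off.
print(simulate_roll_estimate(bias, accel_gain=0.05, seconds=60.0))
```

In this toy model the steady-state error is roughly the bias divided by the gain, so weakening the accel correction by a factor of 40 lets the same gyro bias produce a roll error about 40 times larger. That matches the described symptom of a level HUD that starts rolling as soon as the vehicle is armed.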

 

     You’re being careful not to jostle the copter while the pixhawk led is flashing red/blue during startup?

 

@Andy,

     I agree with your point that we should not mask hardware failures.  I think with AC3.2.1 if the MPU6k fails we alert the user by setting the IMU unhealthy, which the MP shows as “accel bad”.  For the switch between DCM/EKF you’re right that we need to make it more obvious in the logs when that happens and alert the user that things are going wrong.  AC3.3 is the first version which will have the EKF on by default so we’ve done some work including adding an EKF_STATUS_REPORT message that we hope the GCS people will implement.

            Tower: https://github.com/DroidPlanner/Tower/issues/1437

            AP Planner 2: https://github.com/diydrones/apm_planner/issues/693

            MP: https://github.com/diydrones/MissionPlanner/issues/851

 

-Randy

 

From: Andreas Breitenstein [mailto:and...@aeromapix.com]
Sent: 24-Apr-15 11:29 PM
To: 'Randy Mackay'; drones-...@googlegroups.com
Cc: ch...@3drobotics.com; 'Micro Aerial Projects L.L.C.'
Subject: RE: [drones-discuss] URGENT: Pixhawk Gyro Drift causing crashes in flight

 

Hi Randy,

 

Thank you for the hard work and the detailed answer. I can however not take this as an answer that I can live with. As far as I am concerned, there is something else going on here.

The copter that crashed on 3.2.1 showed an offset in the HUD of some 30* just as it was connected to telemetry, while standing still on the flying field. The HUD continued to turn up to some 45* until we restarted the copter. On this same day, the same copter stood still on the lawn on the field with the HUD showing level until I armed the copter. At this point the HUD started rolling, and as soon as I throttled up it rolled up to 45* and the copter flipped on takeoff. This flip was not by accident, but was induced by me to see if the copter would even take off.

 

That same copter when connected to the USB on my office table would show everything in order then when armed with the radio the HUD will start rolling in a random direction. All this without the Battery even connected.

If it shows level after connecting I leave it connected for some time (maybe 10 minutes) If I look at the Mission Planner again the HUD will show a drift of some 15* to 45* at random.

 

All this has nothing to do with vibration but is a bug somewhere. I suspect that there is a faulty batch of Pixhawks out there.

I will send the Pixhawks that have this problem back to the US with Oliver and maybe you could arrange for someone to look at them.  

 

Thank you

Kind regards

Al B

Apr 27, 2015, 10:42:26 AM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com
Hi Andreas,

When you wrote "I will send the Pixhawks that have this problem back to the US", do you mean that you have another batch of Pixhawks that don't have this issue?

Also, do you see the same behavior if you load/use the PX4 Flight Control Stack instead of the APM firmware and use QGroundControl instead of Mission Planner?  That might help to isolate the root of this issue from the software standpoint.

Al B

Apr 27, 2015, 10:51:34 AM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com
Just for clarification: by seeing the same behavior using the PX4 Flight Control Stack + QGroundControl, I meant this:

"That same copter when connected to the USB on my office table would show everything in order then when armed with the radio the HUD will start rolling in a random direction. If it shows level after connecting I leave it connected for some time (maybe 10 minutes) If I look at the Mission Planner again the HUD will show a drift of some 15* to 45* at random."


I didn't mean that you should try to fly using the PX4 Flight Control Stack to see if it crashes.

Andy Piper

Apr 27, 2015, 11:11:52 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, wal...@unirove.com, and...@aeromapix.com
FWIW I think it's vital that these issues are discernible from the logs. I have found that HUD messages in-flight are easy to miss - and with the compass in 3.2.1, for instance, if you miss them the data is gone. It's also easy to get these messages pre-arm, to make them go away by waiting/rebooting/shorting/doing a rain dance and to think all is well, when in actual fact it's not.

I'm looking forward to trying 3.3, sounds like it will be a big improvement.

My $0.02 :)

andy

Oliver Volkmann

Apr 28, 2015, 3:06:29 AM
to drones-...@googlegroups.com, and...@aeromapix.com, ch...@3drobotics.com, wal...@unirove.com
Hi Randy and everyone else, 

We have 3 Pixhawks that do not have this issue and have been flown extensively to find out whether they do. These units were each tested for well over an hour's worth of flight time over the past few days. We have 3 units that DO have this issue, and it shows up with EKF on and with EKF off and on different firmware versions. We have tested each of these on exactly the same platform with no difference other than the Pixhawk being used. The vibration dampening has consistently been the 3DR Vibration Dampening Foam, so we really can't see that this is in fact a vibration issue. Randy, what typically happens when you have high vibration on a platform? As far as we can tell, and because we have units that do not have the leans and ones that do, we can only conclude that this is in fact a hardware issue, as they have all been tested with the same firmware.

With this being said, there is a failure point somewhere that needs to be addressed. While it could possibly be solved via code, the risk of crashes is still high and concerning, especially considering the locations where platforms with Pixhawks are being flown, their weights, and the demonstrated fact that we cannot recover the unit when the leans take place.

Has 3DR identified the batch of these units that have this hardware issue? If so, can the units be traced via the serial number that is found in the logs/terminal window?

Regards,
Oliver Volkmann



Oliver Volkmann

Apr 28, 2015, 3:14:08 AM
to drones-...@googlegroups.com, wal...@unirove.com, and...@aeromapix.com, ch...@3drobotics.com
Hi Al,

I am here with Andreas in Namibia (where we have ample place for the leans to occur - sorry, bad joke) so am responding on his behalf. We have not heard anything about using the PX4 Flight Control Stack, nor do we know how to go about that. We are quite familiar with Pixhawk and Mission Planner, and of course now our clients are too. We would be willing to try the PX4 Stack, however there is a potentially long learning curve and we have no clue as to where to start. Thanks again for mentioning that we don't have to fly to see if it works. Typically that's what we are doing now with every Pixhawk we have and with our clients' units. So far, the success rate is a bit alarming. If you would be willing to share with us how to do the test you are suggesting, we would appreciate it; however, I do want to get the opinion of Randy (or anyone else on the code side) on whether such a test would help in solving the problem and how it would do that.

Regards,
Oliver & Andreas

Randy Mackay

Apr 28, 2015, 3:48:10 AM
to drones-...@googlegroups.com, and...@aeromapix.com, ch...@3drobotics.com, wal...@unirove.com

Oliver,

 

     If this can be reproduced on the bench then it would be good to do this:

·         Ensure the Pixhawk has AC3.2.1 loaded on it (or alternatively AC3.3-rc1 if you prefer)

·         Set the LOG_BITMASK to ALL+DisarmedLogging (ie. “131070”)

·         Restart the pixhawk and attempt to recreate the problem (i.e. set the board down, try arming and see if the HUD begins to roll).

·         Download the dataflash logs from the test and send them to Paul and me
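The LOG_BITMASK value in the steps above is a bit field, so it can help to see which bit positions the number 131070 actually sets. The sketch below only decodes positions; it deliberately does not name what each bit logs, because the thread does not list the bit assignments.

```python
from typing import List


def set_bits(mask: int) -> List[int]:
    """Return the positions (0 = least significant) of all bits set in mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(set_bits(131070))  # bits 1 through 16; bit 0 is not set
```

In other words, 131070 is 2^17 - 2, i.e. sixteen consecutive bits set starting at bit 1.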

    

     I don’t know about the serial numbers of boards with the MPU6k accelerometer issue.  Maybe Vu from 3dr could advise.  So far we haven’t seen that failure in the logs you’ve sent along though so there’s no reason to think it’s that particular MPU6k accelerometer issue that 3DR had from June-2014 to this Feb-2015.

 

-Randy

--

Al B

Apr 28, 2015, 11:18:28 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, wal...@unirove.com
Hi Oliver,

I should have also clarified that I was not suggesting switching to the PX4 Flight Control Stack for your product line.  I just mentioned that option because ardupilot builds on top of that stack, so it would have allowed us to determine whether the problem seen on the bench was specific to the ardupilot code, which is the firmware that Randy and this group lead.  However, my suggestion might now be irrelevant after I saw your previous comment saying that you have 3 Pixhawks that do not have this issue and 3 units that DO.  At this point, it is probably better to wait until Paul and Randy analyze the dataflash logs they are asking for.

One question though.  When you wrote, "All this without the Battery even connected.", does it mean that you are powering the copter only with the USB from the computer when you run your bench tests?

Oliver Volkmann

Apr 29, 2015, 2:47:40 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, wal...@unirove.com, and...@aeromapix.com
Hi Al, 

Thanks. Yes, in the case of "All this without the Battery even connected", the pixhawk was powered only via USB.

Oliver Volkmann

Apr 29, 2015, 2:50:21 AM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com
Hi Randy,

Thanks, we are in the process of doing those tests right now. We did experience it right away with one unit, however it happened just as we had connected the Pixhawk to the computer to change the log bitmask, so it was not recorded. Please find attached a log from a client of mine which seems to show a complete IMU failure around the 5 minute mark. IMU1 seems to stop working altogether...

Regards,
Oliver
2015-04-23 08-55-02.log

Oliver Volkmann

Apr 29, 2015, 6:05:09 AM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com
In case the attachment is not visible, here is a link to the folder on Google Drive where you can find the file: https://drive.google.com/folderview?id=0B62edZ4l5_rOfkxrZlFzM0F1TElWd0xWd1NEVFVycTVNTVBienpqakNDcFVURWhYbzhwNTA&usp=sharing 


It's the IMU FAIL one...

Randy Mackay

Apr 29, 2015, 6:54:59 AM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com

Oliver,

 

     Yes indeed, as you say, this log shows an IMU failure (the MPU6k).  Assuming this Pixhawk was purchased between June-2014 and Feb-2015 it should be returned to 3DR and I suspect they will send a replacement.  My understanding is the MPU6k failures are due to a manufacturing issue so on the software side, all it can do is alert the user and try and cope.  It looks like the fail-over to the 2nd IMU worked and the pilot got the copter back hopefully in one piece?  If yes, then that’s a success!

 

     By the way, the vibes on the vehicle look pretty high and possibly the pitch tuning is a bit off.  There are places where the desired and actual pitch go off by as much as 20deg although perhaps there were some environmental factors making the control this bad.

OliverIMUFail.png

Oliver Volkmann

Apr 29, 2015, 7:16:03 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, wal...@unirove.com
Hi Randy, 

Thanks for the feedback on that log. I will send that unit back to 3DR. The vehicle is in one piece still so that is a success! I do know that this copter of theirs is in need of tuning... 

With this issue being an obvious hardware failure, where do we stand with the other 3 pixhawks that we have issues with? Since it happens randomly, are there any other tests we can do? We have been logging a lot now but have yet to catch the lean. Would it help if we sent these units (or one of them) that have had the lean in to 3DR for further analysis?

Best regards,
Oliver

Randy Mackay

Apr 29, 2015, 9:10:25 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, wal...@unirove.com

Oliver,

 

     My guess is that the two issues are not related to each other.  So this most recent log was clearly an accel failure but that failure doesn’t appear in the earlier logs we saw which developed the lean.  I suspect that in this most recent failure the user didn’t see the leans right?

 

     I don’t think that sending the boards into 3DR will help get to the root cause.  With such an intermittent problem, I suspect that if they look over the boards (from a hardware point of view) nothing will turn up.  Of course if a replacement board(s) is good enough then an RMA with 3DR is the quickest solution.  We would then be left with the nagging doubt of what was wrong and whether it could happen again.

 

     We just need to be able to reproduce the issue to nail the cause down.  My guess is it’s vibration and not being able to reproduce it on the desk is consistent with this (although it’s not proof).  If it is vibes then I suspect AC3.3 will solve the problem and will include faster logging of IMU data and will likely also include warnings of accelerometer clipping.

Oliver Volkmann

May 11, 2015, 9:30:27 AM
to drones-...@googlegroups.com, ch...@3drobotics.com, wal...@unirove.com, and...@aeromapix.com
Hi Randy, 

So we have been running these Pixhawks that were subject to the leans for the last week plus some days and have yet to replicate the issue. I was thinking about what could possibly be causing this to start happening and wanted to run an idea/theory by you to see if you think it is an avenue worth investigating... 

With all 3 units that we have that have demonstrated the leans, they each flew for a bit, then were landed and disarmed and then armed and flown again without power having been removed. Is there a chance that perhaps that is where the issue lies? Perhaps there is something in there where the weighting between Gyro and Acc gets messed up between these multiple take-offs, disarmings and landings? 

I have thrown yet another Pixhawk on a crash testing drone and am doing this test however based on past experiences so far, this issue happens randomly which is making it very frustrating... This is the first time I can honestly say I WANT to see the thing crash so that we can get the data out of it! 

Please let me know your thoughts on my theory.

Thanks, 
Oliver

Oliver Volkmann

May 11, 2015, 1:41:14 PM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com
Randy,

In the hopes of finding some reason for the leans starting, I performed the following tests:

Here is what I did: Place quad on the ground, power it up, wait for the blinking green light, press the safety switch button, move the drone to a different location, arm and then fly. On take-off, the drone immediately started flying off to its right. I could still control it using the remote control (I was in Stabilize) however whenever I let go of the lateral control stick, it would try to fly off that way again. I then landed the drone, took power off, then powered back on and repeated the same procedure another 4 times but this time without the drone flying off as if the gyros were calibrated incorrectly. I made absolutely sure that the unit was still when the blue/red LED sequence was running. 

Upon looking at the logs, I found that one of the IMUs does not show up in the logs which is very strange. In the subsequent flight logs both IMUs are present. The pixhawk that I am flying is not one that had the leans before but perhaps it belongs to the batch with the faulty hardware. Could you please take a look at the two logs and let me know what you see? The one with the drone flying away is: 2015-05-11 09-02.log. The flight there-after is: 2015-05-11 12-10-27.log. 

I will now repeat this above test with a Pixhawk that had the leans and get back to you. 

Looking forward to your response.

Kind regards,
Oliver
2015-05-11 12-10-27.log
2015-05-11 12-09-02.log
2015-05-11 12-12-33.log

Randy Mackay

May 11, 2015, 3:04:08 PM
to drones-...@googlegroups.com, wal...@unirove.com, ch...@3drobotics.com, and...@aeromapix.com

Oliver,

 

     Ok, thanks for the determined testing.  You may have uncovered the issue!

 

     If the 1st IMU (the mpu6k) fails to start up (like it did in the problem log you provided), the secondary lsm303d becomes the 1st IMU but the code will use the offsets and scaling meant for the mpu6k.  In AC3.1.5 we had a similar problem with the compasses and we put a lot of effort into AC3.2.1 to ensure this kind of error could never happen (with the compass).  I don’t think it ever crossed our minds that the same thing could happen with the accels.

 

     I’ve just done a test and confirmed that the issue happens with master.  On my particular pixhawk the lean angle difference between the two accels is only about 2.5deg but on your vehicle’s board I think it’s closer to 7 degrees so this could explain the lean.

 

     So we have a bug to fix: we should certainly ensure that all expected accels are present and, if not, fail to arm or at least use the correct accel offsets and scaling.
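
The "fail to arm" option could look something like the sketch below. This is only an illustration under assumed names: `EXPECTED_ACCELS`, `accel_prearm_check` and the message text are hypothetical, not the actual pre-arm check code.

```python
# Hypothetical sketch: the sensor list, function name and messages are
# illustrative, not the actual ArduPilot pre-arm implementation.
EXPECTED_ACCELS = ("mpu6k", "lsm303d")


def accel_prearm_check(detected, expected=EXPECTED_ACCELS):
    """Return (ok, message); refuse to arm if any expected accel is absent."""
    missing = [name for name in expected if name not in detected]
    if missing:
        return False, "PreArm: missing accel(s): " + ", ".join(missing)
    return True, "PreArm: all expected accels present"


print(accel_prearm_check(["mpu6k", "lsm303d"]))  # arming allowed
print(accel_prearm_check(["lsm303d"]))           # arming refused, mpu6k missing
```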

 

      Thanks very much for this.

--

Andy Piper

May 11, 2015, 3:07:00 PM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, wal...@unirove.com
Is this why some folk (myself included) see a slight lean in 3.3 after calibration?

andy

Andy Piper

May 11, 2015, 3:09:43 PM
to drones-...@googlegroups.com, and...@aeromapix.com, ch...@3drobotics.com, wal...@unirove.com
Incidentally, I was mulling doing an IMU health log patch like I did for the compass. Do you think this is worth it, or is it covered elsewhere?


andy


Randy Mackay

May 11, 2015, 3:25:54 PM
to drones-...@googlegroups.com, ch...@3drobotics.com, and...@aeromapix.com, wal...@unirove.com

Andy,

 

     This IMU failure should be quite rare and I don’t think most people should see it.

 

     Immediately after the first upload of AC3.3 to the board and before the required accel calibration, I’d expect a lean but otherwise I wouldn’t expect it to lean any more than AC3.2.1 I think.

 

     For the logging, yes, we certainly need some kind of logging of accel health.  That could be an event message if an accel becomes unhealthy, or it could somehow log what the blend of IMU1 vs IMU2 is (that info is in the AHRS).  That blending has changed a bit: at one point it could slide up and down from 0% ~ 100% for each accel, but it may be locked at 50/50 now as long as the accels are reporting healthy.
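
A minimal sketch of both logging ideas, assuming health is sampled as one boolean per accel. The function names, the event labels and the equal-weight blend rule are assumptions for illustration, not the AHRS implementation.

```python
# Hypothetical sketch: event labels and the blend rule are illustrative only.

def health_events(samples, accel_count=2):
    """Return (time_s, accel_index, state) for every health state change.

    samples: iterable of (time_s, [bool per accel]) health snapshots.
    Accels are assumed healthy before the first sample.
    """
    previous = [True] * accel_count
    events = []
    for time_s, flags in samples:
        for index, healthy in enumerate(flags):
            if healthy != previous[index]:
                state = "healthy" if healthy else "unhealthy"
                events.append((time_s, index, state))
        previous = list(flags)
    return events


def blend_weights(flags):
    """Equal weight across healthy accels; zero weight for unhealthy ones."""
    healthy_count = sum(flags)
    if healthy_count == 0:
        return [0.0] * len(flags)
    return [1.0 / healthy_count if ok else 0.0 for ok in flags]


samples = [
    (0.0, [True, True]),
    (1.0, [False, True]),  # first accel drops out
    (2.0, [False, True]),  # no change, so no new event
    (3.0, [True, True]),   # first accel recovers
]
print(health_events(samples))
print(blend_weights([True, True]), blend_weights([False, True]))
```

Logging only the transitions keeps the log small while still recording when an accel dropped out in flight.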

 

-Randy

--

Oliver Volkmann

May 11, 2015, 3:34:25 PM
to drones-...@googlegroups.com, and...@aeromapix.com, wal...@unirove.com, ch...@3drobotics.com
Randy! 

Let me just say... "Phew"! I'm glad that you found something useful there with the logs I sent you... I have probably about 100 more of them but was fighting desperately to get something to go obviously wrong as I am no fundi at reading logs. 

So, with what you have discovered now, how soon do you think a patch/fix would be available? I am sorry for the obvious pressure in that question; however, we have a lot of clients with Pixhawks who have been grounded (our call), and there are plenty of Pixhawks out there that may have this potentially fatal issue. As you already know, 3 of our drones have crashed so far because of this, and the issue was not apparent on take-off but rather happened during flights.

Thank you for your time, patience and willingness to help here Randy. I really wish I knew more about coding so that I could jump in and help but if there is something else that I can do, please do not hesitate to let me know! 

Looking forward to hearing back from you.

Best regards,
Oliver

Oliver Volkmann

May 11, 2015, 3:39:51 PM
to drones-...@googlegroups.com, ch...@3drobotics.com, wal...@unirove.com, and...@aeromapix.com
Sorry, I might have gotten too excited about your findings, so I just wanted to confirm that this is also the reason for the in-flight failures. Do you think that one of the IMUs might fail or "restart" during flight, so that the code switches to the other while the failed/restarted one "re-boots/calibrates" and then switches back again, and hence the copter suddenly thinks that it is off level, resulting in the (dare I say) fly-away?

Regards,
Oliver

Andy Piper

May 11, 2015, 4:02:50 PM
to drones-...@googlegroups.com, wal...@unirove.com, and...@aeromapix.com, ch...@3drobotics.com


On Monday, 11 May 2015 20:25:54 UTC+1, Randy Mackay wrote:

Andy,

 

     This IMU failure should be quite rare and I don’t think most people should see it.

I should admit that I have some self-interest here, since I have one of the Pixhawks from the known bad batch but have not seen evidence of the known failure. I'm wondering whether the mpu6k failures can be more subtle than simply flat-lining. Could vibration cause very intermittent failure (it was a soldering issue, right?), not enough to trigger the IMU mismatch error but enough to cause subtle errors in flight?
 

 

     Immediately after the first upload of AC3.3 to the board and before the required accel calibration, I’d expect a lean but otherwise I wouldn’t expect it to lean any more than AC3.2.1 I think.

 

     For the logging, yes, we certainly need some kind of logging of accel health.  That could be an event message if an accel becomes unhealthy, or it could somehow log what the blend of IMU1 vs IMU2 is (that info is in the AHRS).  That blending has changed a bit: at one point it could slide up and down from 0% ~ 100% for each accel, but it may be locked at 50/50 now as long as the accels are reporting healthy.

OK, I might try this, if only to rule this out as a source of my problems (I'm hoping lightning doesn't strike twice!)

andy
 

 

-Randy

 

James Harrison

May 11, 2015, 4:37:15 PM
to drones-...@googlegroups.com
On 11/05/2015 20:25, 'Randy Mackay' via drones-discuss wrote:
> For the logging, yes, we certainly need some kind of logging of
> accel health. That could be an event message if an accel becomes
> unhealthy
>

As a humble user, I'd certainly expect that any indication of a
hardware failure (in any system, but particularly the IMU) would be
flagged up as both an item in the logs and as an event message for my
GCS to alert me to. Depending on payload/vehicle, a failed IMU at
startup is likely to make me go back to the bench and test, test, test
(or replace) rather than fly. Which is the right outcome!

--
Cheers,
James Harrison

Andy Piper

May 11, 2015, 5:12:24 PM
to drones-...@googlegroups.com
I'm pretty sure GCS already does this at startup - it's the in-flight logging that is missing. Compass was the same, but not any more :)

andy