Lion Air flight 610: The Maintenance
One of the aspects of Lion Air flight 610 that has drawn attention is the quality of the maintenance. It is clear now that intermittent problems related to the air data system had been logged and had not been resolved in a satisfactory manner.
I have been wanting to do a deeper analysis of the various aspects of Lion Air flight 610 but to be honest, I haven’t been sure where to begin. I think there’s a logic to following the aircraft around for the weeks previous in order to get an understanding of the fault(s) and the level of maintenance received.
For this post, I’m going to presume basic familiarity with the crash. I have already covered the Preliminary Report on Lion Air flight 610. However, I’d like to invite you to highlight areas that are unclear or confusing, which will help me to know where further explanation would be useful and I’ll tackle those aspects separately.
There’s a lot of ground to cover, so my writing about the maintenance doesn’t mean that the MCAS system or the previous flight or the crash itself are not important, just that they aren’t the focus of this specific post. Instead I want to follow the events that affected aircraft serial number 43000, with the registration PK-LQP, in the weeks before the crash.
The aircraft was a Boeing 737-8 (MAX) manufactured in 2018 with a Certificate of Airworthiness issued on the 15th of August 2018. Before crashing, it had flown 895 hours and 21 minutes with a total of 443 cycles (a take-off and landing makes for one cycle). The first issue related to the crash came up when it was just six weeks old.
On the 9th of October 2018, twenty days before the crash, an error of Angle of Attack Signal is Out of Range was detected by the left-hand (captain’s side) Air Data Inertial Reference Unit shortly before arrival in Jakarta.
This is important as it shows what seems to be the start of an ongoing series of errors and flags to do with the air data and the angle of attack readings of the aircraft.
The angle of attack (AOA) is the angle between the direction of the relative wind and a reference line on the aircraft or the aircraft’s wing; a simple way to describe it is the difference between where a wing is pointing and where it is going. Increasing the angle of attack increases lift and induced drag up to a certain point, known as the critical angle of attack. If the critical angle of attack (usually around 17°) is exceeded, the air begins to flow less smoothly and starts to separate from the upper surface of the wing. As the angle of attack increases, the wing loses lift and the aircraft stalls.
The stall speed of an aircraft changes based on configuration and weight, so the angle of attack cannot easily be determined by the pilots while in flight. Modern commercial aircraft have an AOA indicator: the angle of attack is measured using an AOA sensor and displayed on the primary flight display. They also have warning systems to signal a high AOA, for example aural alarms and stick shaker.
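The description above of “where a wing is pointing” versus “where it is going” can be put into numbers. This is only a minimal sketch under simplifying assumptions of my own (wings level, no allowance for the angle at which the wing is mounted on the fuselage); the function names are made up for illustration and the 17° figure is just the typical value quoted above.

```python
CRITICAL_AOA_DEG = 17.0  # typical value quoted in the text; differs per wing


def angle_of_attack(pitch_deg, flight_path_deg):
    """Simplified AOA: where the nose points minus where the aircraft is going."""
    return pitch_deg - flight_path_deg


def beyond_critical(aoa_deg, critical_deg=CRITICAL_AOA_DEG):
    """True once the AOA exceeds the critical angle, i.e. the wing stalls."""
    return aoa_deg > critical_deg


# Nose 10 degrees up while climbing along a 6 degree path: a modest AOA.
climb = angle_of_attack(10.0, 6.0)
# Same nose attitude, but the aircraft is sinking along a -10 degree path.
sinking = angle_of_attack(10.0, -10.0)

print(climb, beyond_critical(climb))
print(sinking, beyond_critical(sinking))
```

The nose attitude is identical in both cases, which is why pitch alone tells the pilots little about how close the wing is to stalling.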
While the Boeing 737 was being parked, the STBY PWR OFF (stand-by power off) light illuminated. Then a number of circuit breakers tripped: the DC battery, the Auxiliary Power Unit Generator Control Unit, the Generator Control Unit for both engines and the Generator Disconnect of both the left and the right generator. The engineer in Jakarta reset all of the circuit breakers and ran the engine to test before logging an entry to say that the problem was resolved. The aircraft was released back into service.
On the 26th of October, just three days before the crash, the flight crew of the aircraft that day arrived in Manado and logged a report that the SPD and ALT flags had appeared on the captain’s primary flight display (PFD).
These flags appear over the speed and altitude information to show that there has been a data failure. The maintenance light on the overhead panel was also illuminated. The engineer in Manado checked the Onboard Maintenance Function to check the maintenance messages and found one highlighting a fault related to the Stall Management Yaw Damper (SMYD). After performing the associated task for the message, he did a self-test of the Stall Management Yaw Damper. The self-test passed and the engineer erased the maintenance message. The airspeed and altitude indicators now correctly showed on the captain’s display so the engineer released the aircraft for flight.
The Boeing 737-8 (MAX) flew to Denpasar next, arriving the following day. The SPD and ALT flags had appeared again during this flight along with the MAINT light on the overhead. The engineer in Denpasar checked the Onboard Maintenance Function and found another message to do with the Stall Management Yaw Damper: AD DATA INVALID. He performed the self-test which passed, and the MAINT light extinguished. He released the aircraft for flight.
The next flight for the 737 was a round trip between Denpasar and Lombok. Upon arrival back in Denpasar, the flight crew did not report any problems.
However, on the next flight that day, from Denpasar to Manado, the SPD and ALT flags again appeared on the captain’s display. This time the SPEED TRIM FAIL and MACH TRIM FAIL lights also illuminated during the flight. The speed trim and the mach trim both trim the stabiliser (as does the MCAS).
The stabiliser is a horizontal piece on the tail which tilts up and down in order to affect the pitch. The pilots can trim the angle of the stabiliser for nose-up or nose-down by using the electric trim switches or the manual trim wheel. The autopilot and other systems within the aircraft can also make trim inputs. For those interested, here’s more detail from The Boeing 737 Technical Site page about flight controls and stabiliser trim:
Speed trim is applied to the stabilizer automatically at low speed, low weight, aft C of G and high thrust. Sometimes you may notice that the speed trim is trimming in the opposite direction to you, this is because the speed trim is trying to trim the stabilizer in the direction calculated to provide the pilot with positive speed stability characteristics. The speed trim system adjusts stick force so the pilot must provide significant amount of pull force to reduce airspeed or a significant amount of push force to increase airspeed. Whereas, pilots are typically trying to trim the stick force to zero. Occasionally these may be in opposition.
Mach trim is automatically applied above M0.615 (Classics onwards), M0.715 (-1/200) to the elevators. This provides speed stability against Mach Tuck; i.e. as Mach increases, the centre of pressure moves aft and the nose of the aircraft tends to drop.
The important thing to understand in all this is that the speed trim and mach trim are automatically applied to the stabiliser in every 737, just as the MCAS trim inputs are in the 737 MAX.
The pilot can disengage the electric trim, and thus the automatic trim inputs, by flipping a cutout switch on the control stand. Generally, moving the control column (or yoke) in the opposite direction will stop the trimming action, so if a nose-up trim input is made then pushing the control column forward to pitch down will stop the trim input.
All that said, the really important thing to understand here is that the air data was already wrong and the speed trim and mach trim were failing as a result.
At this point, the 737 MAX had arrived in Manado for an overnight stay, with a flight to Denpasar scheduled for first thing the following morning.
The engineer used the Onboard Maintenance Function to find the error messages. This time when he performed the self-test of the Stall Management Yaw Damper, it failed. He found correlated maintenance messages and stepped through other equipment tests which led him to receive two more maintenance messages: AIR DATA INVALID and AOA SIGNAL FAIL.
Briefly: AOA stands for Angle of Attack. The performance of an aircraft is affected by the speed that the aircraft moves through the air; it is critical for the aircraft’s Angle of Attack. If your aircraft is about to stall and you aren’t able to reduce the angle of attack, you will crash. A common cause for a stall is that the airspeed is too low: the wings lose lift and it doesn’t matter one bit what your groundspeed is, your aircraft will stop flying. On the other hand, if your airspeed gets too high, it can cause fatal structural damage to the aircraft. The important thing to understand here is that Air Data is used to determine your angle of attack and the faulty angle of attack on the left (captain’s) side was a critical point in the crash on the 29th.
The engineer at Manado reset a number of circuit breakers to do with the left Air Data Inertial Reference Unit and then ran another self-test of the Stall Management Yaw Damper and the Digital Flight Control System. This time both tests passed.
As a part of the fault isolation procedure, he should have checked the wiring of the Air Data Module and the Air Data Inertial Reference Unit. But it was raining and there was a risk of lightning in the bad weather, so the engineer did not perform the wiring checks. He inspected the electrical connectors and did not find anything wrong. He recorded in the log that the problems were not active.
When the flight crew arrived the next morning (the 28th of October and the day before the crash), the engineer spoke to them about the actions he had taken. This particular flight crew had experienced one of the previous faults with the SPD and ALT flags and requested that more be done to fix the underlying problem.
The engineer suggested that they would be better off doing this in Denpasar.
The 737 MAX departed Manado for the flight to Denpasar and, as the flight crew expected, they experienced another data failure and the SPD and ALT flags again appeared where the speed and altitude information should have been. In addition, the SPEED TRIM FAIL and the MACH TRIM FAIL lights had illuminated again. The Onboard Maintenance Function showed AD DATA INVALID and STALL WARNING SYS L on the status message page.
The 737 MAX offers an advanced onboard network system (ONS) to connect airline operations and maintenance with key data. This is a network of aircraft systems collecting a high volume of aircraft data and consolidating that data with the Onboard Maintenance Function (OMF) which can be accessed via the flight deck or remotely, for example on a tablet. This data is used to create maintenance procedures, enabling the engineers to perform maintenance and fault isolation for each of the aircraft systems without having to access each one individually in the electronics equipment bay. It allows for focused troubleshooting and reporting.
The engineer at Denpasar once again performed the self-test of the Stall Management Yaw Damper, which failed. The Onboard Maintenance Function showed various maintenance messages again referring to the Inertial Air Data and the AOA signal. The engineer reset the circuit breakers of the left Air Data Inertial Reference Unit and conducted the various self tests again. This time the self-tests all passed.
However this time, probably at the request of the flight crew, the engineer took into account that the faults kept recurring and decided to replace the AOA sensors for trouble-shooting. There was no AOA sensor on site so he ordered an AOA sensor from Batam Aero Technique in Batam and grounded the aircraft until it arrived in Denpasar.
Once the AOA sensor arrived, he removed the existing left-hand sensor and installed the new one which had arrived from Batam Aero Technique. The next step was to perform an installation test.
The Aircraft Maintenance Manual gives two methods for doing this but one of them required test equipment which the engineer did not have. The second method involves deflecting the AOA vane up and down to its furthest positions and checking each position on the Stall Management Yaw Damper computer to ensure it gives the correct indication. There is no record of the results which means that it is now impossible to know whether the installation test was successful or not.
The engineer then performed the heater test, which involves dropping water onto the AOA vane. It passed the test. He used the Built-in Test Equipment on the control display unit of the flight management computer and it showed that there were no current faults.
Jumping ahead for a moment: The engineer claims to have completed all of these tests as normal and even supplied photographs, which he claimed he had taken after he had replaced the AOA sensor.
However, the photograph of the Captain’s primary flight display has a time-stamp from before the AOA sensor part had arrived in Denpasar and the photographs of the Stall Management Yaw Damper were not of the same aircraft.
So when I say that the aircraft passed all the standard tests after the new AOA sensor was installed, we should remember that this is based on the word of one man, an engineer who did not correctly log his results. He may have cut corners and certainly had high motivation to claim that he had run all the necessary checks but no evidence to back his claims. Or maybe he did everything correctly except for the log and the photographs.
He released the 737 MAX back into service on the evening of the 28th October (local time), the night before the crash. It was put into action for the next flight which was from Denpasar to Jakarta. This flight deserves looking at in more detail but within this context I’d like to note that when the flight crew landed in Jakarta, they reported that after they took off, they received IAS DISAGREE and ALT DISAGREE alerts, that is, the indicated air speeds and the altitude information as collected from multiple sources did not all match. The FEEL DIFF PRESS light had also illuminated which meant that there was a very specific failure.
A common reason for the illumination is that one of the hydraulic systems powering the elevators (which control the pitch) has failed. This might be because one of the elevator pitots has failed. The Elevator Feel Computer receives dynamic pressure from two pitot tubes mounted on either side of the vertical stabiliser.
Finally, it may be signalling a fault related to the Stall Management and Yaw Damper which uses a reduced system A pressure. If this reducer fails, the A system pressure related to the feel actuator is higher than normal, which triggers the FEEL DIFF PRESS light.
The engineer at Jakarta responded to this report by flushing the left pitot and the static Air Data Module and ran an operational test. He noted that the results were satisfactory. Then he cleaned the electrical connector to the computer which had illuminated the error and ran a self-test, which passed.
He released the 737 MAX back into service at 2:30 in the morning of the 29th, local time, ready for Lion Air flight 610 departing at 05:45.
“The engineer suggested that they would be better off doing this in Denpasar.”
I would like to suggest that this ‘engineer’ would be better off in prison than shirking his responsibilities when lives are in his hands.
I did think that was rather flippant, especially after almost every flight was reporting speed and altitude errors.
Just from my fast read through, it appears that the plane (only a couple of months old!) was throwing errors or warnings during most flights for almost 3 weeks. Is this number of warnings/errors remotely close to normal, or is it reasonable to assume that the plane itself should have been grounded earlier and the manufacturer told to sort it out due to not being airworthy?
It clearly had a faulty part and the repeated issues showed that the minimal maintenance being done was not actually addressing the issue, just temporarily clearing the symptoms. It shouldn’t have gone back to the manufacturer but the issues should have been treated more seriously.
However, when the AOA sensor was finally replaced, it clearly had serious issues of its own. The engineer who did the installation showed evidence that he had tested it — evidence that was proved to be false. So a new part was put into place which was clearly not working correctly.
There are another two issues related to this, which I will try to cover soon: 1) why was there no AOA disagree message which might have helped the engineers actually identify the problem and 2) how did it happen that a replacement AOA sensor was faulty?
A pretty interesting talk on the subject, though it gets a bit incoherent in the last few minutes of the actual talk:
https://media.ccc.de/v/36c3-10961-boeing_737max_automated_crashes
The Q&A is interesting in that the (software orientated) audience seem to characterize the problem as a software bug when it seems to me to be more of a wider systems engineering issue. AFAIK, the software did what it was specified to do; it was the implications of that specification which weren’t properly thought through.
There’s more on the software to come — I have to admit that I had not quite understood the issues before getting deeper into this. And I would agree with you that it was not a bug but a problem in the specification and/or testing.
CCC uploaded it to YouTube at https://youtu.be/PlaMQBEg-9M, if you’d like to peruse the comments.
One comment suggested comparing it to the 2009 crash of Turkish Airlines Flight 1951 at Schiphol.
From the table, “test of the servicable AOA sensor”, it appears that this procedure requires getting out of and back inside the plane 3 times, unless you have a helper, and a 5° margin of error seems rather sizeable for such a consequential piece of data. I think the test at best ensures that the holes were aligned correctly when the new sensor was screwed to the hull, and no wires crossed. (The holes seem irregularly spaced anyway, so it should not be possible to mount this incorrectly?)
I wonder what it takes to call up the AOA number on the SMYD display in flight.
The AOA mismatch was 21°, which is 1/17 of a circle, and impossible to achieve by mis-mounting the sensor. It appears that the error was introduced when that sensor was refurbished in the US (it was not a new part), and while that would have been caught had the installation check been carried out properly, I can see why the mechanic felt they could rely on the part working properly and on installing it correctly, but thereby missing a chance to prevent the fatal accident.
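To put numbers on this, here is a minimal sketch of the kind of comparison the installation check makes, assuming (hypothetically) three reference vane positions and a constant 21° offset in the sensor. The positions and the function name are invented for illustration and are not taken from the maintenance manual; only the 5° margin and the 21° mismatch come from the discussion above.

```python
def check_positions(expected_deg, indicated_deg, tolerance_deg=5.0):
    """Return True for each vane position whose reading is within tolerance."""
    return [abs(e - i) <= tolerance_deg
            for e, i in zip(expected_deg, indicated_deg)]


# Hypothetical reference positions for the vane (not from the manual).
expected = [-30.0, 0.0, 30.0]

# Assumption: the faulty sensor reads a constant 21 degrees high everywhere.
indicated = [e + 21.0 for e in expected]

results = check_positions(expected, indicated)
print(results)
print("installation check passed:", all(results))
```

Under these assumptions every position falls well outside the 5° margin, so even a fairly coarse check would have flagged the part.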
“The engineer suggested that they would be better off doing this in Denpasar.” — I’ve looked at both Manado and Denpasar airports on Google Earth, and Manado seems to be a much smaller operation than Denpasar. Given the impact of these ALT and SPD errors, the judgment call to not tinker with the airplane with the probably limited facilities at Manado and leave it to well-rested daytime engineers at Denpasar feels somewhat justified. That Denpasar would then install a faulty part was quite unforeseeable.
I also wonder what the weather was like, and if rain tends to make engineers want to skip tasks that require working outside the airplane.
It was raining; he skipped the wiring tests because of the risk of lightning. And yes, I’m sure it must be tempting to rush or skip things.
Part of the point of the installation tests is to check for faulty parts, so I disagree that the engineer was reasonable in feeling that they could rely on the part working properly. It’s another hole in the cheese, effectively.
I didn’t say that what the engineer did was reasonable; to skip a procedure that has been deemed so important that it must be documented is not reasonable. But it is just human to skip a safety precaution for convenience, when it has, in our experience, never made a difference, and we must consciously train ourselves to be reasonable and not do that.
Fair enough; although the point of the checklists, tests and supervision is to ensure that people don’t skip processes or take short cuts. I’d be surprised if the engineer still has a job there, although I haven’t investigated to see if there’s any information on this.
Before Sylvia posts her next article:
The missing element here, the big issue or, if you like: the glue that binds these elements is that Boeing appears to have been very frugal with the dissemination of information about the elemental differences that resulted from continuously developing the 737, rather than using their expertise to introduce a new type.
This had many advantages for the manufacturer and operators. Cost was a major factor: certification, training, etc.
But the original B737 was developed as a short-range airliner intended to serve small airports used by airlines that did not have a large infrastructure at these destinations. It had to be as self-sufficient as possible. The landing gear could be adapted even to suit relatively rough, unimproved runways. These considerations led to the design and introduction of stairs that could be retracted into the fuselage. A very clever design that did away with the need for stairs that had to be brought to the aircraft, a much neater design than the stairs at the back of e.g. the BAC 1-11.
It also restricted the ground clearance, not an issue with the original JT8D engines but with the later, high bypass models this became an issue. Anyone who has had a closer look at them must have noticed that the nacelles are not circular, but flatter at the bottom (Of course, the fans themselves do not have a “flat spot”, the nacelles also house other components).
With the “Max”, Boeing took it all a step further. From the original “Classic” which carried about 70 passengers it grew into a short- and midrange airliner with 200 seats.
It required substantial design adaptations and the need to move the even larger engines more forward. This changed the flight characteristics of the aircraft, which Boeing “solved” by the additional MCAS which now has been found to have had catastrophic faults.
In many aircraft, the number of essential computerised systems installed is three. If one fails or malfunctions, the on-board computers will make a comparison and the two that agree will take over and the faulty one will go off-line. Warnings will appear on the annunciator panel to alert the crew. Boeing obviously not only deemed two to be sufficient, but also was very secretive about the role that MCAS played. Certainly airlines were not always aware, witness the extra cost for certain alerts that were an “option” and not bought, cockpit crew were not given the extra training they should have been given, and I suspect that the same goes for ground engineers.
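The voting idea described above can be sketched roughly as follows. This is a generic illustration of median voting between redundant channels, not the logic of any particular aircraft system; the function name, the tolerance and the sample readings are all assumptions made for the example.

```python
from statistics import median


def vote(readings, tolerance):
    """Return (voted value, indices of channels that disagree with it)."""
    mid = median(readings)
    outvoted = [i for i, r in enumerate(readings) if abs(r - mid) > tolerance]
    return mid, outvoted


# Three channels: two agree, the third is far off and can be singled out.
print(vote([5.0, 5.2, 26.0], tolerance=2.0))

# Two channels: the disagreement is visible, but neither can be trusted
# over the other, so both end up flagged.
print(vote([5.0, 26.0], tolerance=2.0))
```

With three channels the odd one out can be identified and taken off-line; with only two, the comparison shows that they disagree but cannot say which one is wrong.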
What is also not fully understood is that the engine nacelles themselves, at high angles of attack, provided extra lift. This resulted in a further uncommanded “nose-up” during climb, something the crew were not aware of because the MCAS was supposed to counter it.
Meanwhile, there now is talk about potential wiring looms to the command inputs to the tail plane, looms that may be too close together for safety.
There is also a disturbing video on Youtube that suggests that the company culture at Boeing underwent a drastic change, already at the introduction of the 787 “Dreamliner”, resulting in more emphasis on maintaining share value and profits at the expense of safety. It also was in this period that intensive lobbying resulted in the – at least partial – transfer of regulation from the FAA to Boeing itself.
1) The “Boeing philosophy” has always been that the pilot flies the plane; it fits with that philosophy that you stop thinking once you have determined that the pilot can turn this piece of automation (MCAS) off and keep flying, via the “runaway stab trim” checklist. Designing a computerized cockpit means you spend a lot of thought on making the computers so safe that they don’t need to be turned off (e.g. through a system with redundancy and voting). In effect, this makes the computers a greater danger in the former case.
2) The “AOA disagree” alert was never meant to be optional. From the final report, KNKT.18.10.35.04 section 2.2:
“The certified design of Boeing 737-8 (MAX) was to include an AOA DISAGREE message on all aircraft. The software which generates the AOA DISAGREE message was subcontracted by Boeing [to] another company. The installed software did not include the AOA DISAGREE message for aircraft that was not installed with the AOA indicator. The Lion Air elected not to enable the AOA indicators on the PFDs and such the AOA DISAGREE message would not appear on both PFDs even though the DFDR recorded AOA value difference of about 21°.
The lack of an AOA DISAGREE message did not match the Boeing system description that was the basis for certifying the aircraft design. The software not having the intended functionality was not detected by Boeing nor the FAA during development and certification of the 737-8 MAX before the aircraft had entered service. Soon after, Boeing reviewed the situation and concluded that the inoperative AOA DISAGREE message on selected aircraft did not represent a safety of flight issue. One consideration was that additional maintenance alerts (e.g. stuck AOA or bent AOA) were still available. As a result, the implementation error was scheduled to be corrected for the next display system software update.”
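The gap the report describes can be sketched as two versions of the same check. This is only an illustration of the logic as described in the quote; the function names are hypothetical, the threshold is a placeholder parameter rather than a figure from the report, and the two vane readings are made-up values chosen to give the 21° difference mentioned above.

```python
def vanes_disagree(left_deg, right_deg, threshold_deg):
    """True when the two AOA readings differ by more than the threshold."""
    return abs(left_deg - right_deg) > threshold_deg


def message_as_certified(left_deg, right_deg, threshold_deg):
    """Intended design: the message is available on all aircraft."""
    return vanes_disagree(left_deg, right_deg, threshold_deg)


def message_as_installed(left_deg, right_deg, threshold_deg,
                         aoa_indicator_enabled):
    """Installed software: the message is tied to the optional AOA indicator."""
    return aoa_indicator_enabled and vanes_disagree(
        left_deg, right_deg, threshold_deg)


# Made-up readings giving a 21 degree difference; placeholder threshold.
left, right, threshold = 21.0, 0.0, 10.0

print(message_as_certified(left, right, threshold))
print(message_as_installed(left, right, threshold,
                           aoa_indicator_enabled=False))
```

The first call reports the disagreement; the second stays silent on an aircraft where the indicator option was not enabled, even though the underlying readings are the same.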
I haven’t seen the video — but from what I read on the web this morning, the mess has well and truly hit the fan. A lot of internal emails have finally been released, and they are … enlightening … about some people exulting about “putting one over” on the FAA, and another describing the plane as being “designed by clowns who in turn are supervised by monkeys”; see https://www.npr.org/2020/01/10/795366610/boeing-employees-mocked-faa-privately-in-emails-before-737-max-disasters. And that’s only a sidebar in the BBC’s latest story, https://www.bbc.com/news/business-51058929, about Boeing being fined for faulty parts in the wings; this is looking like overall hasty work, not just one bad decision.
I was a software engineer for 20 years, all of it in direct-use systems (DTP and MCAD), but I’d already heard the consecutive lightbulb jokes involving fixing a hardware problem with software, and am glad I didn’t have to work on life-critical systems; the occasional push I got to ship something I wasn’t satisfied with would only risk a user having to restart the software while sitting safely in an office.
I’m working my way through the emails. It would be super nice to have the systems they are referencing explained in one place (Sylvia?); http://www.b737.org.uk has been helpful.
One of the issues is the RCAS (roll command alerting system); if one of the engines fails, the autopilot won’t be able to command enough roll to compensate, and the pilots have to trim for that before they can use the autopilot. If you have alerts to help you, going back to a classic without these alerts might be a problem.
Another neat thing is using the spoilers to control lift (DLC = direct lift control), which is helpful if you need to keep to a glide path on an approach with jammed elevators; and again, if you fly/train with that, going back to classic would be difficult.
These are some of the issues I have understood as a layman where having a differences training by watching some training video at home might ultimately not be enough, and Boeing was aware of that. It looks like Boeing retrofitted these systems to the 737NG (optionally!) to defuse the situation, so that prospective 737Max pilots would be trained on them before the airline received their first 737Max planes, and that would in theory eliminate the need for differences training on these systems.
Nice comments after my last effort.
I have always been a “stick-and-rudder” pilot, to the point that a check pilot of a certain airline queried me about my habit of hand-flying the ILS.
When I answered that I like to keep my skills up, the sneering reply was that if I felt that I needed the practice, maybe I should be taken off the line for some extra simulator training.
This contradicts Mendel when he mentions that Boeing’s philosophy is that “…the pilot flies the plane”. Airlines in general want, no, even insist, that the autopilot be used as much as possible.
Anyway, what seems to emerge, is that a faulty MCAS can actually lead to a situation where manually flying the aircraft rapidly becomes impossible. I just wonder if it pushed the nose down before pilots had a reasonable time to assess the situation? I wonder if this was because of “Mach tuck”?
FAA Flight Standardization Board (FSB) Report, Revision: 16, Date: 10/17/2018, section 9.5: “Tuck and Mach Buffet Training: The B-737, B-737-CL, B-737NG, and B-737-MAX do not exhibit any Mach Tuck tendency and therefore no training is required for this flight maneuver. Demonstration of the aircraft’s overspeed protection capabilities is an acceptable substitute.”
I understand this to mean that you can’t possibly fly the B-737 too fast, and therefore mach tuck is not an issue?
The faulty MCAS does not make flying the plane impossible; the pilots have to realize that they need to turn electric trim off, and do it. That is the difficulty. The plane does not give them a choice of having electric trim, but not automatic trim. In theory, in a Boeing, pilot input can always override the computer; pulling back on the yoke to disable automatic nose down trim was a feature consistent with that philosophy, but on the 737Max, the right seat could no longer do that.
On an Airbus, in normal law, the pilot can’t override the computer’s flight envelope protection. A stackexchange answer sums it up succinctly: “On a Boeing, pilots overrule the flight-envelope-protections by attempting to break the controls (“excessive force”). On an Airbus, pilots overrule the flight-envelope-protections by pressing two switches (2xFAC). – RedGrittyBrick Jan 7 ’15 at 15:33″
On an Airbus, the computer is responsible for keeping the plane safe, as long as the equipment is working. On a Boeing, the pilot is; there are just various bits and pieces of automation to help him do that, more with each generation of plane, but because the control chain is hydro-mechanically linked from the pilot to the control surfaces, it has to kinda compete with the pilot to do so. And this leads to, one, a patchwork nature of various “helper” systems, and two, a system design that can abdicate responsibility for its own state. The Airbus flight computers know (in theory) when they’re working in a state they’ve not been designed for, they go to “alternate law” and tell the pilot, “we’re done here, automation is off because we can’t guarantee we’re making the correct decisions any longer”. MCAS, on the other hand, lacks this awareness: the flight computer knows when the AOA sensors disagree, but it does not turn MCAS off, that’s the pilot’s responsibility.
In practice, most of the time, this difference does not matter. Systems like Inertial Navigation, GPS, TCAS or EGPWS have driven improvements in flight safety that meant flying got orders of magnitude safer, sparing pilots edge-of-the-envelope emergency manoeuvres. (When will they put ice sensors on the wings?) But in the special case where Boeing designed a system (MCAS) to guarantee the safety of the plane in an edge condition, the lack of authority of that system (due to the design philosophy) also led to a lack of responsibility designed into the system that made it fail to detect when it wasn’t working and turn itself off. And that is something Airbus designs differently: because the computer is responsible for the plane in normal law, it has to give this responsibility up when things go south, so the computer and its designers won’t be blamed when that situation leads to an accident. It needs to detect when it can no longer safely fly the plane, so it won’t be blamed for attempting to fly the plane when it can’t. (The question is, can the pilots actually deal with a flight mode that is not normally in use? That’s a training issue.) Oversimplified, with Boeing, you can always blame the pilots, so the system doesn’t need this kind of self-awareness designed into it. (The question is, do the pilots know they need to disable the system? That’s a training issue.)
The point I am trying to make is that the system design philosophy (which includes the role of the humans in the system) influences the ways in which the system fails, and that led to a plane where “the pilot outranks the automation” being brought down by automation–not quite what anyone expected.
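To make the contrast concrete, here is a toy sketch of the two philosophies described above. It is not real Airbus or Boeing logic: the function names, the thresholds, and the `Decision` record are all made up for illustration, and the only thing it is meant to show is where the responsibility for noticing bad sensor data sits.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
DISAGREE_LIMIT_DEG = 5.0   # how far apart two AOA readings may be
HIGH_AOA_DEG = 14.0        # above this, the automation wants nose-down trim


@dataclass
class Decision:
    active: bool      # is the automation still commanding trim?
    nose_down: bool   # does it command nose-down trim?
    message: str      # what the crew is told, if anything


def self_monitoring(aoa_left: float, aoa_right: float) -> Decision:
    """Automation that owns its own state: it stands down on bad inputs."""
    if abs(aoa_left - aoa_right) > DISAGREE_LIMIT_DEG:
        return Decision(False, False, "AUTOMATION OFF: AOA DISAGREE")
    aoa = (aoa_left + aoa_right) / 2
    return Decision(True, aoa > HIGH_AOA_DEG, "")


def pilot_monitored(aoa_left: float, aoa_right: float) -> Decision:
    """Automation that trusts one sensor and leaves the shutdown to the pilot."""
    # aoa_right is deliberately ignored: a disagreement may be visible
    # elsewhere in the system, but it does not change what this function does.
    return Decision(True, aoa_left > HIGH_AOA_DEG, "")


if __name__ == "__main__":
    # A faulty left vane reads 22 degrees while the right reads 2 degrees.
    print(self_monitoring(22.0, 2.0))
    print(pilot_monitored(22.0, 2.0))
```

With a faulty left reading, the first function switches itself off and says so, while the second keeps commanding nose-down trim until a human intervenes. Both can be defended as designs, but they fail in very different ways.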
ISTM that “the pilot outranks the automation” wasn’t actually followed here; if it were, the automation would have come with a Big Red Switch that would have let the pilot disable the automation quickly and directly. My read is that somewhere in the design cycle someone(s) at Boeing decided that automation sufficient to avoid requiring pilots to re-train was more important than keeping the pilot in control; this morning’s news (https://www.npr.org/2020/01/13/795835280/as-new-ceo-takes-charge-boeings-challenges-remain, about half-way down) reports that they’ve formally backed off this position — after pushing the first crashed airline very hard not to request training. I’ve been watching an unfolding mess in a writers’ association; like this case, it shows a major Watergate lesson that the coverup will cause more trouble than the original issue.
Well, you can’t put a big red switch on for everything. In fact, that’s part of the problem in this case: there were so many different alerts and alarms that the crew completely failed to recognise the symptoms of a runaway trim, which is what Boeing expected them to do.
Interesting to hear that you are following the RWA mess too. Another case of lots of moving parts and lots of failures…
I just saw (recommended by YouTube) a 2011 documentary about quality control issues with Boeing subcontractors for the 737NG (dating back to 2005 or so). I’m not sure I want to subscribe to its alarmist stance on air safety without further data, but what it says about Boeing company culture has been confirmed by what the 737 MAX affair (including the recent release of emails) has now brought to light: it highlights Boeing management ignoring quality issues, Boeing failing its self-regulatory commitments, and the FAA failing its oversight role. The documentary was apparently made by Al Jazeera, but also aired on Australian TV, and features whistleblowers from Boeing’s Wichita plant and a former FAA employee. It kinda fits this blog post, because the Lion Air crash had a Florida subcontractor failing to do its own quality control on the refurbished angle of attack vane properly.
https://youtu.be/vWxxtzBTxGU
Hello everyone! My name is Jeff and I worked for Boeing for 25 years. I spent 17 years on the 747 program and the remainder with Boeing A.O.G. Aircraft Services. I saw the video you are referencing. From my experience, some of what the whistleblowers said when they were giving details on confronting the vendor is absolute garbage. I wanted to turn the video off but finished it reluctantly. From my recollection (I really do not wish to sit through the video again), the vendor threatened the women. Boeing is run by some that are brilliant and others that I consider unintelligent. In every situation I encountered, I was always there to ensure the issues were dealt with appropriately. I can’t speak with certainty about their story not being true. But from every perspective, having been immersed in the manufacturing element and dealing with engineering, stores, and managers, there is no way a line Q.A. had authority to go to any vendor. There are systems that have been in place for much longer than I worked there that address the issues as they come up. They were talking about a bear strap or maybe a fail safe having some short edge margins around the door, if I remember correctly. A couple of things to consider. How long had those employees been around any aircraft? How much did they ever interact with liaison engineers? Were they ever familiar with the Boeing SRM, the Structural Repair Manual? There are a few, not many, places on the 747 wings where edge margins cannot be maintained no matter what you do, or wish for. The attachment of some chords to upper or lower panels, given the angle of the wing surface and the allowable area to jam one more rivet in up against a vertical member, is in some places well under the design standard. Sometimes by half. It is known. It was drawn that way. If it was overlooked, there were flagnotes added after engineering reviews that made certain areas acceptable and would state the absolute minimum required.
I even paused the video when they were trying to talk about the tooling issues. It didn’t add up.
First, they were probably already fired from the company when they went to the vendor. In that case he may have said some of the things they claimed because they had no business talking about any of it, much less investigating it. You need to know what you’re looking at before you can factually claim that systemic errors are knowingly allowed down the assembly line. I have seen Boeing nitpick the smallest of issues. If there are doublers or fail safes with short margins, they know about it; it was evaluated. If it is an issue there are service bulletins to identify increased frequency of inspections in those areas. Nothing that is flagged as defective is let out of the manufacturing environment. Are there things that slip by? Given the size of the machine and the complexity of it, most likely there are. Think about this. I drilled the holes. I inspected them. I stood back while Boeing QA looked at them. After I fastened the parts, they are checked by me, and again by Boeing QA. When an area of the aircraft is ready to move positions, it is inspected by the entire crew, again by Boeing QA, then by the Boeing QA Lead, who notifies the customer coordinator with a form 1401. The coordinator brings in the live customer. The aircraft is spotless and they spend at least a couple hours on a section such as the 747 Wing Body Join. All known unresolved issues are spelled out as exceptions on the 1401 and they are required to be inspected in the same chain of eyes when they are resolved.
I have been reading about the 737. I did my 25 years and left to get out of the rat race. I am so disappointed in what I see in the news. In fact I wrote the company that was kinda my wife and told them I was embarrassed to say I worked there after the second crash. Unreal that they didn’t catch the Human Factors issue, when we were made aware to look for those very kinds of problems at least annually in a day-long class.
This is an absolutely awesome message section, or comments area and I will return to read more. I have some interesting photos if anyone wants to have a look. Quite a few actually.
I better copy this body of text in case it doesn’t want to send.
That done. Have a great Day everyone!!
Jeff
Hi Jeff! Thanks for your detailed comment. You might want to move over to the latest post which is where current discussion is: https://fearoflanding.com/accidents/accident-reports/lion-air-flight-610-and-the-aoa-disagree-alert/
I’ve also logged your details so that future posts by you will go straight through without hitting moderation.
I’d be happy to look at your photos if you want to email me at [email protected]
Hi Jeff!
Thanks for your insights!
Where did you work, in Everett or in one of the newer facilities? The film makes the point to say the culture was different there.
Of course you never know about disgruntled workers, but I did hear they updated the lightning protection on the 737 while it is down. I think that was mentioned in the film?
Covid-19 killed the A380; hopefully we’ll see many 777X showing off their wingtips next year!
I left out the FAA. I worked with an electrician at Boeing who was fired before he had more than a couple of years in. We wondered why it took so long. Anyway, he was gone and we were better off for it. A few years later, I heard his voice nearby and looked over to see him standing there bragging he was now at the FAA. Talk about unbelievable. I am stating the absolute truth here. He wasn’t the only one, either, to return in another capacity for another company, be it a vendor or an agency. As I said in my previous post, I was there for all my signatures, stamps, approvals etc. I sleep well at night. The FAA guy in the video likely was washed out of there. Just a little heads up.