MacOS Catalina – Overview of Problems

The purpose of this post is to create a quick summary of problems that others and I have encountered when starting to use the Catalina version of the MacOS operating system for Apple laptops and desktops.  The purpose of an idiot chart is to present a short list of items so that the reader can determine which ones describe their situation.  For the selected items, the page then lists documents or procedures that will aid in further identification and resolution of the problem.

This document is a work in progress.  Please contact me if you have any suggestions.

Some Common Problems

There are a few entries on the internet containing lists of fixes for Catalina.

There are a number of issues that are common to many of the complaints.

  • 32 bit applications are no longer supported.  If a 32 bit application is specified as the default application for opening a file, the file cannot be opened.
  • It may be necessary to change the default applications for opening some files.
  • There are some additional security restrictions on access to files by applications.  Some of the directories that have had problems are network attached filesystems, Desktop, Documents, and Downloads.  Some applications need to be updated to work correctly with these restrictions.  You can also move the files used by these applications to another directory until an updated version is available.
  • Some 64 bit applications will require updates before they work correctly.
  • Resetting the System Management Controller (SMC) and/or Non-Volatile Random Access Memory (NVRAM) will resolve some problems.
  • In some cases, deleting plist files will resolve the problem, with the system creating a new starter file.  Be sure to back up the files before doing anything to them.
  • The Activity Monitor application can be used to find processes that hang or use unreasonable amounts of CPU time.

Some Things Are Normal

Some people have been complaining about things they have seen after they have moved to Catalina, but a number of these are normal.

  • If you open the Disk Utility application, you will see two volumes for the disk containing the operating system.  In my case, they are Macintosh HD and Macintosh HD – Data.  The one without the Data suffix is a read-only volume that contains unchanging portions of the operating system.  The volume with the Data suffix is everything else on the disk.  There are also apparently some volumes on the boot disk that don’t show up in Disk Utility.  I suspect that these are for updates and recovery, and you shouldn’t worry about them.
  • If you select About This Mac under the Apple icon in the menu bar, you will see a window appear with information about your installation.  Selecting the Storage tab will cause a summary of the contents of your disks to appear.  Don’t worry about the large amount of disk space marked as Other on the boot drive.  This is actually the read/write portion of the operating system (95 gigabytes on my system).  This is normal.  There is no directory or file named Other.  These are not files that can be deleted without damaging the system.

Unable to Start Installation

Not all Apple computers can be loaded with the Catalina version of the operating system.  Apple has provided a document “macOS Catalina is compatible with these computers” and you should go through the list of supported hardware and minimum amount of RAM and available disk space.  If Apple says that your system can’t load Catalina, don’t try to install it even if you have people telling you that they know how to work around the limitations.

Installation Freezes

A number of people, including myself, have had the system hang when running the Catalina installer.  The system may hang with the message “Calculating remaining time” or “Twenty minutes remaining” or “Less than one minute remaining”.  If this condition lasts for more than twelve hours, do a hard shutdown of the computer, wait thirty seconds, and then start it up again.  You may have to do this two or three times.  If you have to force a shutdown, do a normal restart using the Apple menu once everything seems to be working properly.

Application Won’t Load

One of the major changes in Catalina is that 32 bit applications will no longer work.  If you look at the linked articles, you will see suggestions that will identify the vast bulk of the 32 bit applications on your system.  If you can’t live without those applications, do not load the Catalina version.
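
If you want to check a single executable by hand, one approach is to look at the first few bytes of the file.  The following is a minimal sketch in Python, assuming the executable is a Mach-O file (inside an application bundle it is normally found under Contents/MacOS).  The function names are my own, the limit of 20 architectures is an arbitrary guess used to skip Java class files (which share a magic number with universal binaries), and rarer header variants are not handled.

```python
import struct

# Mach-O magic numbers as seen when the first four bytes are read big-endian.
THIN_32 = {0xFEEDFACE, 0xCEFAEDFE}  # 32-bit, either byte order
THIN_64 = {0xFEEDFACF, 0xCFFAEDFE}  # 64-bit, either byte order
FAT_MAGIC = 0xCAFEBABE              # universal binary; its header is big-endian
CPU_ARCH_ABI64 = 0x01000000         # bit set in cputype for 64-bit architectures
MAX_ARCHS = 20                      # assumption: larger counts are not Mach-O


def classify_macho(data: bytes) -> str:
    """Classify the leading bytes of a file as 32-bit, 64-bit, universal, or other."""
    if len(data) < 4:
        return "not Mach-O"
    (magic,) = struct.unpack(">I", data[:4])
    if magic in THIN_32:
        return "32-bit"
    if magic in THIN_64:
        return "64-bit"
    if magic == FAT_MAGIC and len(data) >= 8:
        (count,) = struct.unpack(">I", data[4:8])
        if count == 0 or count > MAX_ARCHS or len(data) < 8 + 20 * count:
            return "not Mach-O"
        # Each architecture entry is five 32-bit fields; cputype comes first.
        cputypes = [
            struct.unpack(">I", data[8 + 20 * i : 12 + 20 * i])[0]
            for i in range(count)
        ]
        if any(cpu & CPU_ARCH_ABI64 for cpu in cputypes):
            return "universal (includes 64-bit)"
        return "universal (32-bit only)"
    return "not Mach-O"


def classify_file(path: str) -> str:
    """Read the start of a file and classify it."""
    with open(path, "rb") as f:
        return classify_macho(f.read(4096))
```

Anything reported as 32-bit or as universal (32-bit only) will not run on Catalina.  This only looks at one file at a time, so the lists produced by the tools in the linked articles are still the easier way to survey a whole system.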

Attempting to run a 32 bit application will result in a pop-up menu reading “The developer of this app needs to update it to work with this version of macOS.  Contact the developer for more information”.

In addition, some applications will require updates before they will work with Catalina.  Check with the vendor for the applications having problems.  The vendor forums will normally have a lot of discussions concerning these problems.

Note: A number of people have complained that Microsoft Office for MacOS no longer works.  The contents of this package were 32 bit applications.

Incorrect Default Application for Opening Files

In this case, attempting to open a data file will produce an unexpected response.  It may indicate that the application is unable to handle the format used by the data or it may do nothing at all.

Each data file has a default application for opening it.  The default application can be determined using Finder by right-clicking on the file and selecting Get Info.  The window opened by Get Info allows you to view and change attributes of the file.  Because of changes to the applications (32 bit applications being no longer supported and the supported formats being changed in some cases), it may be that the default application is no longer capable of opening the file.  You can use the drop-down menu under “Open with” to change the default application.  If you click on “Change All…”, all data files of the same type will have the default application changed.

Unable to use Desktop, Downloads, and Documents Folders

Apple has placed additional access restrictions on the Desktop, Downloads, and Documents folders.  Applications using the normal Xcode work flows will set flags so that the system will ask the user whether they want to give the application access to those folders.  Answering yes will give the application access.  However, some applications, especially ports of UNIX/Linux software, use non-standard work flows that do not set the flags.  This was true with the MacOS version of GIMP and a number of other packages.  In this case, the application is unable to work with the directories and there is no way to add the access privileges.  Other directories may show the same problem, especially shared directories.

Although the problem has been resolved with the latest version of GIMP, there are probably still applications where the problem has not been resolved.  In this case, the simplest solution is to create a new subdirectory in the user folder (the one that contains the Documents, Desktop, and Downloads directories) and move all of your work to this new folder.

Background Tasks Using Excessive CPU

If you use the Activity Monitor app to look at the five or six processes using the most CPU time, you may see a large portion of the system being utilized.  This can result in the following.

  • System responds slowly.
  • System is hot to the touch.
  • Fans run continuously or for long periods of time.

The first thing to realize is that there may be legitimate reasons for this.  Some applications, such as Spotlight, may reindex large portions of the disk space after an update.  In iMovie, the application seemed to redo the conversions to MPEG, which took a few hours.  There may also be other reasons for this behavior.

The following process can be used to examine questionable processes.

  1. Open the Activity Monitor application and select the CPU tab.  Click the “% CPU” column header until the processes are sorted by CPU consumption with the largest users at the top.  (A downward pointing caret will appear in the column.)
  2. Pick one of the processes and double click on the name.  This will cause a window to pop up containing more information about the process, including the parent process.  If neither the name of the process nor the name of the parent process is helpful, click on the name of the parent process, and it will show you the parent of the parent process.
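
The same walk up the chain of parent processes can be done from the command line.  The following is a rough sketch in Python, assuming the standard ps command is available; the function names are my own, and the process names in the usage note are made up for illustration.

```python
import subprocess


def parse_ps(output: str) -> dict:
    """Parse lines of 'pid ppid command' into {pid: (ppid, command)}."""
    table = {}
    for line in output.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3:
            pid, ppid, command = parts
            table[int(pid)] = (int(ppid), command)
    return table


def ancestry(table: dict, pid: int) -> list:
    """Return [(pid, command), ...] from the given process up to the top."""
    chain = []
    seen = set()  # guards against a loop in malformed input
    while pid in table and pid not in seen:
        seen.add(pid)
        ppid, command = table[pid]
        chain.append((pid, command))
        pid = ppid
    return chain


def snapshot() -> dict:
    """Ask ps for every process; one -o option per column suppresses headers."""
    result = subprocess.run(
        ["ps", "-ax", "-o", "pid=", "-o", "ppid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)
```

For example, ancestry(snapshot(), 845) would list process 845, then its parent, and so on until it reaches a process whose parent is not in the list.  If a hypothetical helper process turns out to have an application such as Example.app as its parent, that tells you which package to investigate.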

In my case, I found that the offending programs were part of Parallels Toolbox, so I uninstalled it.  (This is a separate product from Parallels Desktop.)  Other people apparently have found that various security and antivirus packages have caused the problem, with specific mentions of Kaspersky and Norton.  I am currently using Avira, and haven’t been having any problems.

A major problem has been with login and startup items.  Startup items are launched by launchd when the system boots, and can be identified by the fact that proceeding up through the chain of processes will have launchd and kernel at the top.  Login items can be identified by going to the Users and Groups panel of System Preferences.

Bluetooth Audio Problems

There have been complaints about connecting Bluetooth audio devices when using Catalina.  People are suggesting resetting SMC and NVRAM as discussed below.

The site CatalinaOSX (https://www.catalinaosx.com) suggests removing the troublesome Bluetooth devices from the list of known devices as well as the plist file containing the options for Bluetooth (/Library/Preferences/com.apple.Bluetooth.plist).   The command entry

$ plutil -convert xml1 -o - /Library/Preferences/com.apple.Bluetooth.plist

can be used to convert this file into a human readable form.
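
If you would rather inspect the file from a script, Python’s standard plistlib module can read both the binary and XML forms.  The following is a small sketch; the function names are my own, the “.bak” suffix is just a choice I made for the backup copy, and reading or copying files under /Library may require administrator rights.

```python
import plistlib
import shutil


def backup_copy(path: str) -> str:
    """Copy the plist next to itself with a .bak suffix and return the new path."""
    destination = path + ".bak"
    shutil.copy2(path, destination)  # copy2 also preserves timestamps
    return destination


def load_plist(path: str):
    """Read a property list (binary or XML) into ordinary Python objects."""
    with open(path, "rb") as f:
        return plistlib.load(f)


def top_level_summary(path: str) -> list:
    """Return one 'key: type' line per top-level entry, sorted by key."""
    data = load_plist(path)
    if not isinstance(data, dict):
        return [type(data).__name__]
    return [f"{key}: {type(value).__name__}" for key, value in sorted(data.items())]
```

Calling backup_copy on the file before removing it follows the earlier advice to back up plist files before doing anything to them.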

Unable to Access WiFi Routers

People have expressed difficulties in using WiFi access points when running Catalina.  This is especially true when dealing with access points that display pages asking the user to accept the rules for usage when they first connect to the site.

You can go into the Network menu of System Preferences and try the following.

  • Try creating a new location using “Edit Locations…” in the pull-down list of locations.  Select the newly created location and see if you can attach to the access point.
  • Click on “Advanced” in the Network pane.  This will show a list of access points that the system is aware of.  Note that Catalina only shows options for “Auto-Join” while previous versions had options for “Auto-Join” and “Auto-Login”.  If the access point you are attempting to join is already in the list, try deleting it and then turn off WiFi.  Turn WiFi back on and then try to attach to the access point.
  • There have been suggestions that the user should try rebooting in “safe mode” and then try to connect to the access point.

It appears that information on WiFi connections is contained in /Library/Preferences/SystemConfiguration/com.apple.airport.preferences.plist, which can be displayed in human readable form using the following command.  (There is also a plist-orig file that apparently contains the original version.)

  • $ plutil -convert xml1 -o - /Library/Preferences/SystemConfiguration/com.apple.airport.preferences.plist

Network Attached Storage

There has been some discussion of problems with Network Attached Storage (NAS) with Catalina, especially when waking from sleep mode.  There are a number of different protocols for NAS.

SMC Reset

Some people have indicated that resetting the System Management Controller (SMC) has resolved some problems.  The linked document will describe the procedure for carrying out this action.  The System Management Controller handles a number of hardware related actions such as startup, shutdown, sleep, wake, keyboard and display backlights, indicator lights, battery charging, and fan speed.  Problems with the SMC can also result in the computer running slowly.  See the article for more information.

NVRAM Reset

Another action that some people have indicated as helping is resetting the Non-Volatile Random Access Memory.  The linked document will describe the procedure for carrying out this action.  NVRAM stores some parameters such as display resolution, audio output volume, startup disk selection, time zone, and information about conditions when the computer was last shut down.  See the article for more information.

Diagnostic Tools

  • Apple Diagnostics – An Apple utility for obtaining information about your MacOS system.
  • EtreCheck – EtreCheck is a diagnostic tool that is available on the Apple MacOS App Store.  It has been mentioned in a number of posts in the Apple Community Forums.

 

 

Cellebrite/GrayKey hack (continued)

On June 14, 2019, Wired published an article about Cellebrite having updated their hacking software so that it could unlock any version of iOS up to 12.3.  The news was also reported by CNET.  (Wired has a partial pay wall, so I wanted to show multiple sources.)  Since Apple only released version 12.4 in July, this means that it was effective for the latest release at the time.  In addition, the Apple security notice for version 12.4 does not mention the vulnerability being fixed nor have there been reports by Apple denying the vulnerability.  Wired said that they expect the problem to be solved in version 13, but Apple has reported the vulnerability fixed many times in the past.

I am not surprised, and have written about this several times in the past.  Let me repeat my hypothesis once more.  The bug is probably in the Secure Enclave Processor.  If it isn’t, Apple has even bigger problems with their code.  Using the nomenclature of the National Vulnerability Database, it is probably a member of the CWE-485:7PK – Encapsulation category.  The code needs to be changed so that it is impossible to test a passcode without incrementing the counter in case the test fails.  This would be accomplished by having both the test of the passcode and incrementing of the counter in the same subroutine with the incrementing of the counter coming first.  Good encapsulation would mean that the value of the counter would be local data to this subroutine so that no other code could modify it.  All tests of the password should be routed through this subroutine.
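
To make the structure concrete, here is a minimal sketch in Python of what I mean.  It is not Apple’s code, and I have no knowledge of how the Secure Enclave is actually written.  The names, the limit of ten attempts, and the reset of the counter after a successful test are all my own assumptions.

```python
import hmac


def make_passcode_checker(stored_passcode: str, max_attempts: int = 10):
    """Return a check() function that owns its own failure counter."""
    failures = 0  # local data: no name outside check() refers to it

    def check(candidate: str) -> bool:
        nonlocal failures
        if failures >= max_attempts:
            return False  # locked out: the passcode is not even compared
        failures += 1  # count the attempt BEFORE testing the passcode
        if hmac.compare_digest(candidate.encode(), stored_passcode.encode()):
            failures = 0  # assumption: a correct passcode clears the counter
            return True
        return False

    return check
```

Because the increment comes first, an interruption between the two steps can only leave the counter too high, never too low.  Python cannot truly hide the counter from a determined caller, so this only illustrates the shape of the subroutine; in real hardware the isolation would have to be enforced by the processor.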

I was informed by Apple and others that it is impossible to discuss fixes before they are implemented and unwise to provide information on fixes that have been inserted or recommended.  However, if this information is not distributed, how can anyone have confidence that the problem has been resolved?

Instead of saying “continued” in the title, I probably should have said “continued ad nauseam”.

Boeing 737 Max (Part 3)

In Part 1, I discussed some of the nomenclature and systems for the Boeing 737 Max.  In Part 2, I discussed some of the reasons that information released by Boeing and the FAA reduced confidence in those organizations.  In this part, I am going to discuss some of the actions that could increase confidence.  This is based mainly on my experience with customer representatives on government contracts.

Stop Defensive Comments

People are looking for information from Boeing and the FAA.  The perception by the public, operators, and pilots was that some of the statements were made to protect Boeing’s sales and legal status and provided little usable information.

Meeting Legal Requirements

Some of the comments by Boeing were to state that all legal and regulatory requirements were met.  The last time that I heard this argument, it was in response to the sinking of the Titanic.  Although the Titanic had enough lifeboats for only one third of the passengers and crew, the White Star Line stated that that was all the law required.  Meeting the legal requirements is necessary, but not necessarily sufficient.  In addition, the question of whether regulatory and legal requirements were met is a matter for the courts, not Boeing.  Having the FAA sign off does not mean that they met all requirements.  This means that the press releases serve no purpose except to avoid bad publicity.  Furthermore, they sound self-serving and won’t improve the reputation of the company.

Senior Management

In Boeing’s press release on the removal of the AOA (Angle of Attack) Sensor Disagree Alert, Boeing made the statement “Senior company leadership was not involved in the review and first became aware of this issue in the aftermath of the Lion Air accident.”  The customers, operators, and pilots don’t care who was at fault.  The “senior leadership” is by definition the senior management of the company.  As President Truman indicated when he placed a sign on his desk that “The Buck Stops Here”, senior management can’t pass the buck.  According to the maxim that “you can delegate authority, but responsibility can only be shared”, senior management is always responsible.

Assigning Blame

When problems occur due to a company’s actions, the customers don’t care who is to blame at the company.  As far as they are concerned, that is an issue for the company’s management, not the customer.  All the customer wants to hear is when and how the problem will be fixed, and what they are to do in the interim.

Pilot Error

Yes, there is such a thing as pilot error.  However, when the error is in how to respond to a fault in the system, there is a question as to whether the manufacturer did what was required to prevent pilot error, and the designers of the system must share responsibility.  If the manufacturer claims the maintenance and operating personnel were inadequately trained, the question arises as to who specified the amount of training required (manufacturer and regulatory agencies).  The manufacturer is also required to reduce pilot workload so that confusion and pilot error are reduced.

Inadequate Training

Our organization installed a computer system at a customer site, and the initial run was a miserable failure.  Our management blamed the problem on inadequate training of the personnel at the site.  The customer responded by stating (rather forcefully) that we had provided the training.  Even if the manufacturer doesn’t provide the training themselves, they generally supply most of the curriculum and materials for the courses.  Instead of saying that the training was inadequate, state how you feel the training should be improved.  Furthermore, inadequate training is not an excuse for poor design.

Another thing to consider is that proficiency in a time-sensitive action often requires practice in a simulator.  If the manufacturer expects someone to be proficient, they should require simulator time.  Otherwise, they shouldn’t complain about the pilots being unable to carry out the operation fast enough.

Improper Maintenance

If you are claiming inadequate or improper maintenance, many of the same issues apply.  The manufacturer has to ensure that the materials that they supply are adequate.  This may also mean explicitly stating some things that the manufacturer thinks are obvious, such as when a test flight without passengers should be carried out before a flight with passengers.

Actionable Information

Customers and regulators want actionable information.  They want to know what changes the manufacturer intends to make, and what changes should be made by the operator.  They also want sufficient information so that they can understand the implications of the changes and evaluate the appropriateness of the changes.  In addition, delaying the information can incur additional costs because problems will be harder to fix.

In this case, publishing articles discussing the problems and their resolution in more detail would provide more confidence in Boeing’s actions, but only if the reports are deemed to be honest.  Boeing could also create technical articles on methods for creating reliable aircraft and reviewing work to make sure that nothing is overlooked.

Provide Free Tools

Receiving something useful for free can make people very happy.  Providing something that provides better support for the customer would also demonstrate understanding of the customer’s needs.

Automated Testing

A good way to make the operators, pilots, and ground crews happier is to provide automated testing tools that make life easier for them.  For example, if there are duplicate AOA sensors, pitot tubes, static ports and other instruments, why not check them on a regular basis to make sure that the values match and are reasonable?  Questionable readings could then be recorded and presented to the ground crew at the next stop.  Part of this would be covered under Built-In Test Equipment (BITE), although I don’t know to what extent such technology is currently being used.  Recording the readings would serve a variety of purposes.

  • The information could be sent to the next airport so that the ground crew could be waiting with the appropriate tools and parts when the plane lands.
  • The ground crew could then be provided with a tablet that would document the repair and/or test process.  This would reduce work for the ground crew and provide better reporting of actions.  It would also make it easier to verify that required follow-up testing is carried out.
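
As an illustration of the kind of check I have in mind, here is a short sketch in Python.  It is not based on any actual avionics software.  The names and the range limits are hypothetical, and a real system would need much more care (filtering, timing, and certification among other things).

```python
from dataclasses import dataclass


@dataclass
class Finding:
    sensor: str   # which instrument (or group of instruments) is questionable
    problem: str  # human readable description for the ground crew


def cross_check(name, readings, valid_range, max_spread):
    """Flag readings that are out of range or that disagree with each other."""
    findings = []
    low, high = valid_range
    for index, value in enumerate(readings):
        if not (low <= value <= high):
            findings.append(
                Finding(f"{name}[{index}]", f"reading {value} outside {low}..{high}")
            )
    if readings:
        spread = max(readings) - min(readings)
        if spread >= max_spread:
            findings.append(
                Finding(name, f"redundant readings differ by {spread:.1f}")
            )
    return findings
```

For example, cross_check("AOA", [4.0, 22.0], (-20.0, 40.0), 5.5) would report that the two angle of attack readings disagree, while the readings 4.0 and 4.5 would produce no findings.  The list of findings is what would be recorded and handed to the ground crew.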

Interactive Training and Reference Materials

The Quick Reference Handbook (QRH) is the document that pilots refer to when they don’t know what to do.  Familiarity with and understanding of the QRH can be improved by having an annotated electronic version of the QRH with links to other material related to each step.  Instead of simply saying to adjust trim manually, put in an annotation that contains a link to a document that discusses what happens if you are unable to adjust the trim manually.

Improve Accounting

One of the claims, whether justified or not, was that Boeing put too much emphasis on cost-cutting.  I have seen several cases where the bookkeeping system indicates that individual units have saved money although the company as a whole actually loses money.  So, improving the accounting will often make the customers happy by preventing false savings from ruining the product.

If they want to make the point clearly, they could create articles on this topic, possibly a discussion of Goodhart’s Law as it applies to aeronautical design.  In this instance, the law states that any over-reliance on cost cutting as a target will make the measurement of cost cutting unreliable.  (The law actually refers to any measurement.)

The following are some examples of this.

Delivering The Mail

The mail room saved money by laying off the people who delivered mail to the various departments, and requested that the individual departments send people down to pick up the mail.

  • The people from the mail room had delivered mail to several departments in a single trip.  They also had a cart that allowed them to carry the mail.
  • The people from the departments had to make multiple trips because the time the mail was ready differed from day to day.  They were also not given bags to carry the mail, which made the mail awkward to carry.
  • The people from the departments were much more highly paid than the mail room personnel.

The result was that the cost of delivering the mail was increased by an order of magnitude, although the mail room itself showed a cost savings.

Preparing The Mail

My group was preparing dozens of reports a month, but the recipients were complaining that they weren’t receiving the reports until weeks after they were printed.  I was told to go down to the mail room and tell them to get the reports out quicker.  Instead, I went down and talked to them.  I found out that they were spending several man-days a month sorting the reports so that they could mail them out.  Since they had other tasks, they had to do this in between the other tasks.  I changed the program printing the reports so that they were already collated by recipient when they came off the printer.  After that, it only took half a man-day to put the reports in the envelopes and mail them out.

This meant that my group incurred the costs of changing the programs, while the mail room received the cost savings.  However, my manager was delighted by the fact that the recipients were getting their reports a few weeks earlier and no longer yelling at her.  The number of missing reports was also reduced.

Technical Debt

Technical Debt refers to the fact that saving money now by skipping tasks will often result in much higher expenses later.  The Boeing 737 Max MCAS system can be viewed as an example.  In 2010, Boeing estimated the cost of development for the 737 MAX at two to three billion dollars, while they have already taken a five billion dollar charge against earnings due to problems with the Boeing 737 Max, and it is likely that there will be more charges to come.  I don’t know how much was saved by the cost reductions, but I strongly doubt that it was enough to offset the remedial costs.

 

Boeing 737 Max (Part 2)

Part 1 of this series discussed some of the nomenclature and attributes of the controls for the Boeing 737 Max.  This part is going to refer to the crashes of the Boeing 737 MAX and my observations on the actions taken by Boeing in response.  Admittedly, these observations are based on my experiences as an engineer, but I have also included references to responses by others.

The Problem

The Boeing 737 Max has two Angle of Attack (AOA) sensors that measure the angle between the direction the nose is pointing and the direction from which the air is flowing.  If the angle of attack is too great, the plane can enter a stall condition.  If the auto-pilot is disengaged and some other conditions are met, a high AOA will cause the MCAS system to move the horizontal stabilizer to push the nose down a little to make it easier to recover from the stall.

If the AOA sensor has a fault (indicating an incorrect angle), the MCAS system can cause a non-normal condition that resembles a runaway stabilizer trim.  (mentour.pilot has a video on YouTube that shows the procedure for a runaway stabilizer trim recovery on a Boeing 737.)  However, in the two incidents that led to the grounding of the Boeing 737 MAX, the crew was unable to successfully apply the procedure, resulting in the loss of the aircraft with all passengers and crew.  Part of the reason that they were unable to follow the procedure was that it required too much torque to turn the trim wheel.

Background

The following facts are not disputed.  One source for a timeline of events is “Boeing 737 MAX crisis: a timeline” on AeroTimes News Hub (Part 1, Part 2).  There have been two crashes of Boeing 737 MAX aircraft: Lion Air Flight 610 on October 29, 2018 and Ethiopian Airlines Flight 302 on March 10, 2019.

Prior Opinion

Prior to these incidents, it appears that other regulatory agencies around the world were willing to accept FAA directives without review.  It also appears that people were willing to accept that an aircraft was safe as long as the FAA stated it was safe.  Boeing also had a reputation that caused people to accept its statements, although this may have been partly because of faith in the FAA.

My Opinion

It appears that no statements were issued by Boeing or the FAA between November 10, 2018 and March 10, 2019.  This is a period of four months, after which the FAA says that the list of required changes will be issued in April.  What was happening in the four months?  If they really wanted to increase trust, it seems that they should have released the draft document and requested comments.  After all, their memorandum states that 80% of the time for the preparation of the document had already elapsed.  It is now August, and the directive has not been released.

Boeing’s Reaction

The sequence of events damaged confidence in Boeing and the FAA, and Business Insider wrote an article on this.  My only complaint was that as an engineer, I resented the fact that a few of the people interviewed said Boeing executives were talking like engineers.  They were talking like lawyers, not engineers.  And when people feel that the comments are written by lawyers, they tend to view them in a different light.  Bill Clinton is famous for saying “it depends on what the meaning of the word ‘is’ is” in his testimony concerning Monica Lewinsky.

So when I read Boeing’s description of the MCAS software changes, I am tempted to parse the individual words.  The following is taken from the Boeing statement:

  • Flight control system will now compare inputs from both AOA sensors. If the sensors disagree by 5.5 degrees or more with the flaps retracted, MCAS will not activate. An indicator on the flight deck display will alert the pilots.
  • If MCAS is activated in non-normal conditions, it will only provide one input for each elevated AOA event. There are no known or envisioned failure conditions where MCAS will provide multiple inputs.
  • MCAS can never command more stabilizer input than can be counteracted by the flight crew pulling back on the column. The pilots will continue to always have the ability to override MCAS and manually control the airplane.

If the goal of Boeing is to provide assurance to passengers and operators of the 737 MAX aircraft, I see a number of problems.  In addition to the sparsity of information, there are the following issues.

  • Will there be an alert for the pilots whenever the AOA sensors disagree, or only if the MCAS software would have activated without the disagreement?  I would think that the flight and maintenance crew would want to know about the disagreement before an event where the nose would pitch upward.
  • The second statement says that there will be only one input to the stabilizer trim per event.  Is this assumption based on the fact that MCAS will not trigger if the AOA sensors disagree, or is it based on a timer or other device to prevent multiple activations in a short period of time?  Perhaps it would be reasonable to prevent repeated activations less than thirty seconds apart?  Does the MCAS software know if the cutout switches are thrown?  The statement that there are “no known or envisioned failure conditions” is extremely weak.  Have they looked for potential failure conditions?
  • On the third statement, are they referring to the cutout switch as the means of overriding MCAS, necessitating the use of the trim wheels?  Is there another means of overriding MCAS while leaving the trim switches on the yoke operational?  (Perhaps by using additional cutout switches to provide a finer granularity of control.)  Apparently the cutout switches were changed from the design on the Boeing 737 NG.

The statement on the necessity for the AOA Disagree Alert also has problems.  The main purpose of the release appears to be to indicate that senior management had no input on the decision that the AOA Disagree Alert was unnecessary.  However, senior management had apparently made statements to lower management that delays to the delivery schedule were to be avoided at all costs, and did have indirect influence via this pathway.  Adding the comment about senior management would not appear to reassure the concerned parties.  Instead, it would appear to be an attempt to show that the senior management were not at fault.

It also indicates that the AOA Disagree Alert was present on the 737 NG.  Was this change included in the training manuals and handbooks for the Boeing 737 MAX?  Also, the statement that something is unnecessary is not the same as saying it would not be required by good engineering practice.  The fact that they presented the information to the FAA does not indicate that the decision by Boeing was correct.  Again, this added statement appears to be provided for use as a legal defense.  Another question is whether removal of the AOA Disagree Alert was indicated to the operators.

Trust in Software and Instruments

It is essential that the crew have confidence in their instruments and documentation as well as all of the systems on board the aircraft.  Although it is possible that instruments will sometimes lie, the first instinct has to be to believe the instruments.  Although pilots can often determine which instrument readings are bad by comparing them with other information, it will cause confusion, and confusion is bad.  However, computer programs can’t look out the plane’s windows and see that the nose isn’t pointing straight down and have a strong tendency to believe instrument readings.  This means that an automated system that believes an incorrect instrument reading can be disastrous.  There are several steps that can be carried out to increase the ability to trust the instruments and other avionics.

Pitot tubes, static tubes, and angle of attack indicators have been known to have a variety of faults. You therefore have to assume the presence of incorrect measurements.  That is why you have redundant sensors.  In addition, another possible source of tests for the quality of the measurements would be the attitude indicator or vertical speed indicator.
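The cross-checking of redundant sensors described above can be sketched in a few lines of Python. This is only an illustration of the general idea and not a description of how any Boeing system works. The function names and the five degree threshold are assumptions made up for the example.

```python
from statistics import median

def aoa_disagree(left_deg, right_deg, threshold_deg=5.0):
    """Return True when two angle of attack readings differ by more than the threshold.

    The threshold is a hypothetical value chosen for illustration.
    """
    return abs(left_deg - right_deg) > threshold_deg

def vote(readings, threshold_deg=5.0):
    """Median vote over three or more readings.

    Returns the median value and the indexes of readings that sit
    further than the threshold from that median (the suspect sensors).
    """
    mid = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - mid) > threshold_deg]
    return mid, suspects

# Two sensors can only tell you that something is wrong.
print(aoa_disagree(4.0, 24.0))

# A third source of information lets you guess which reading is wrong.
print(vote([4.0, 4.5, 24.0]))
```

With only two sensors, the software can detect a disagreement but can't tell which reading is bad. That is why a third source, such as a value derived from the attitude indicator or vertical speed indicator, is useful.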

Boeing has lost a lot of this trust, especially in regard to the Boeing 737 MAX.  Maybe I’m a bit paranoid, but I have found the following to be true.

  • When people say “trust me”, I don’t.
  • When people say it is good enough, it almost never is.
  • If they knew enough to deceive me successfully, they would have known enough to do it correctly the first time.  (This is another utterance from an Air Force General that sticks in my mind.)  When I read the Boeing statements, I get the feeling that they are being deceptive.
  • The only way to successfully cover yourself is with excellent work.

This is the type of thinking and methodology that programmers and aviation engineers should view as routine.  Why it isn’t routine is due to a number of factors that are beyond the scope of this blog.

Making Them Angry – Part I

I was originally thinking of making this post about the Boeing 737 Max problems, but I realized that many people are reading about things like this and don’t even realize that there is a problem.  I remember an Air Force General many years ago making the statement “We wouldn’t be so angry with them if we thought that they understood why we were so angry with them”. It stuck in my mind as applying to a number of situations that I have seen in the past.  I have listed several of my encounters with these situations below.

Different People, Different Reactions

Many of the stories below involve military officers or civilian employees of the military services.  However, you will find similar situations in many other organizations, such as railroads, trucking, shipping, aviation, manufacturing, and construction.  Either the people in charge are task-oriented or there will be delays, accidents, and injuries.  The attitude may seem obsessive, but it’s the only way to keep your organization in business, prevent legal claims, and keep your employees healthy.

There are people who think that they can get away with not doing things the right way; that they can cut costs and speed things up without creating disasters.  However, if they were really that good, they would realize the dangers and implications of their actions.

The Proposal

Many years ago, a co-worker showed me a copy of a proposal that he had been working on. I looked through the document and was shocked that in many parts of the proposal, information was missing and had been replaced by TBD (to be developed) and TBR (to be reviewed). I told him that if the proposal was presented to the customer representative, the representative would pick up the heavy three-ring binder and throw it at them. His response was “Oh, you heard about the meeting. We still don’t know why he was so angry.” The reason that the representative had been so angry was that so many sections had been left blank and replaced by TBD and TBR that there was effectively no proposal to be reviewed. If I could tell that in ten seconds and they couldn’t work it out in a week, I felt that we needed a different proposal writing team. In another case, the customer had asked if we had sent a rough draft by mistake instead of the actual proposal. My gut feeling was that it was a similar situation.

On another project, the customer wanted to know why we felt that the hardware being proposed was fast enough. When we replied that we were buying the fastest unit the vendor had, he became very angry. When dealing with heavily compute-intensive operations such as weather prediction, simulated wind tunnels, optimization, etc., the feasibility depends on the efficiency of the proposed algorithms. With inefficient algorithms and techniques, even the world’s fastest supercomputers might be too slow for the task.
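A little arithmetic shows why “the fastest unit the vendor had” is not an answer. The sketch below uses made-up numbers: the problem size and the two machine speeds are hypothetical values chosen only to illustrate the point.

```python
import math

def seconds_needed(operations, ops_per_second):
    """Time to finish a job given its operation count and the machine's speed."""
    return operations / ops_per_second

n = 1_000_000          # hypothetical problem size
supercomputer = 1e15   # hypothetical speed in operations per second
desktop = 1e9          # hypothetical speed in operations per second

# An inefficient algorithm needing n^3 operations on the fast machine.
cubic_on_super = seconds_needed(n ** 3, supercomputer)

# An efficient algorithm needing n log n operations on the slow machine.
nlogn_on_desktop = seconds_needed(n * math.log2(n), desktop)

print(f"n^3 algorithm on the fast machine: {cubic_on_super:.0f} seconds")
print(f"n log n algorithm on the slow machine: {nlogn_on_desktop:.3f} seconds")
```

With these figures, the efficient algorithm on a machine a million times slower still finishes far sooner. The customer wanted to see this kind of estimate, not the name of the vendor’s top model.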

When a customer sends an RFP (request for proposal), he expects a response that will demonstrate the vendor’s understanding of the problem, the ability to satisfy the RFP and some rough ideas of cost and development time. If the proposal doesn’t satisfy these requirements, the customer will not accept it as a proposal, and the representative will view it as an insult and a waste of his time.

Some people will tell the people evaluating the proposals that the problems aren’t that bad, and that their response is unreasonable.  I guarantee that that will just make them angrier.

The Manual

A co-worker had previously been involved in manual writing for a military contract. When the customer representative saw the first draft, he was furious. He sat down and marked up several pages, showing locations where the manual was hard to understand and ambiguous, used bad word choices, or failed to clearly indicate the tools to be used and how they would be used. My co-worker realized that this would take weeks for the entire document and asked his manager how he should proceed. His manager told him to fix the things that were explicitly marked and ignore the rest of the document, following which he would go on to his other assigned tasks.

A few weeks later, the customer representative came back for another visit. When he saw how much of the document had been left untouched, he was furious. The intensity was such that the co-worker actually felt afraid. I asked him where his manager had been at the time, and he replied that his manager had been on vacation that week.  I then asked him why he thought his manager had gone on vacation that week.

When the customer indicates that he wants to see changes made, he generally expects the whole document to be edited. Expecting him to be satisfied with a smaller amount of work will make the customer unhappy. I suspect that, after berating my co-worker, the representative expressed his displeasure to the manager’s manager, who then had a talk with the co-worker’s manager when he returned from vacation.

With regard to writing instructions and manuals, being able to understand what you write is not enough. You have to make it clear enough for the person who is going to try to follow the manual. If the people using the manual can’t understand it, it is usually the fault of the writer, not the reader. The military takes manual writing very seriously. When you are dealing with explosives, weapons, large pieces of machinery, and other hazardous items, having people not understand the manuals can result in very bad situations.

The Data Center

I was talking to a co-worker who had a friend who was a junior officer in the Air Force and who he felt had been treated very unfairly by a general. It turns out that the general was located on a base in the Midwest and had been talking to an old classmate in the Pentagon. The classmate then mentioned that he had just had a report that a data node on the general’s base was no longer connected to the network. When the call ended, the general walked over to the building where the computer equipment was located. Being informed by someone in the Pentagon about a problem on his base before he hears it from the people on his base can make an officer very irritable. It turns out that the computer center had been below ground level and a flash flood had filled the ramp area leading down to the door with water. The power had been turned off to avoid electrical damage. The general found the junior officer in charge just standing there and demanded to know what was happening and when it would be fixed. The officer replied “I’m doing everything that can be done, sir”, after which the general demonstrated his usage of the English language. I don’t believe that “ripped up one side and down the other” really applied, but the general made his displeasure known.

I indicated that what the general had wanted to hear was something along the lines of “I will have the ramp emptied of water in ten minutes, all standing water in the data center mopped up within twenty minutes, and the power restored in half an hour. The data center will be operational and online in an hour. In the morning, I will send a memorandum to the base engineer asking that the area around the ramp be regraded to prevent this from happening again. Sir.” followed by a salute.

The co-worker was dismayed by such a logical answer, but responded by saying that there was no way to get all of the water out of the ramp area in ten minutes. I then asked him if this was a United States Air Force facility, and he agreed that it was. I informed him that all such facilities had large water pumps mounted on truck bodies for mobility as standard equipment, just waiting for such a problem. When he responded that he had been on a great many bases and never seen such a piece of equipment, I replied “You have never seen a fire engine?”

When a senior manager or customer representative wants a full report, he wants the person making the reports to follow the old rules for newspaper reporting: “Who, What, When, Where, Why, and How.” Not all of the information may be available at the time, but there should be enough to satisfy the other person that the situation is in hand. A senior officer also expects a junior officer to figure out what to do. It wouldn’t hurt for the junior officer to have a little list of options of who to contact.

  • Fire Department — They are good for dealing with fires, floods, and a variety of different types of building damage.
  • Medical Services — If somebody is bleeding, injured, non-responsive, or seems confused, this would be the logical source for assistance.
  • Officer of the Day (OOD) — If you feel there is a threat to the base.
  • Shore Patrol/Military Police — For the drunk who insists on singing at the top of his lungs at 0200.
  • Non-commissioned Officer — The Sergeants and Petty Officers are good sources of information. Don’t worry about looking foolish. They have already formed that opinion.

Just be sure that you talk to somebody and get the problem resolved.

Management Bliss Through Ignorance

One of the sections that I worked for had an “all hands” meeting and the manager asked for questions from the audience.  One of the attendees said that he had heard that our three biggest customers were “mad as hell” with us and wanted the director’s response.  The director said that it was true but that we were good at smoothing over problems with our customers.  That seemed rather doubtful as the customers were telling us that it “was becoming very difficult to have confidence in us” and talking about canceling contracts.  When somebody with a military background talks about it being “difficult to have confidence”, that is a very high level of disapproval.

They had had an employee attitude survey which indicated that the employees had very little trust in management.  The manager indicated that our section had made up only two-thirds of the employees being polled.  He said the problem must have been with the other group being polled because he hadn’t heard of any trust problems in his section.

Next Steps

The next few parts will cover some more situations, the personalities involved in these situations, what drives them, and what can make the people happy.

Boeing 737 Max (Part 1)

There has been a great deal of discussion concerning the problems experienced with the Boeing 737 Max that resulted in all of the aircraft being grounded until the problems are identified and remedial actions are developed, tested, approved, and installed.

I am viewing this from the viewpoint of a software developer and systems engineer.  However, this still requires an understanding of the basic concepts of aviation.  If you don’t understand the subject matter, you won’t understand how to develop the system.

However, I am still learning.  I will update this as I learn more.

Terms

I’m not a pilot and have provided my best understanding of the terms being used in the articles.  In writing this glossary, I am specifically writing about the Boeing 737 NG and Boeing 737 Max.  (The Boeing 737 Max is the version after the Boeing 737 NG.)

Please let me know if you find any errors or places where it needs more information.

  • Aircraft Flight Manual — This is the manual approved by the FAA that gives the pilot the information that he needs to operate the aircraft.
  • Auto-pilot — A system that will adjust the control surfaces to maintain a constant course and altitude.
  • Auto-throttle — A system that will adjust the engine throttles to maintain a constant speed.
  • Elevator — This is a horizontal control surface at the rear of the plane.  By tilting the surface up and down, forces are applied to move the nose up or down relative to the tail.  (This is known as pitch.)  The elevator is normally controlled by moving the sidestick (in video games, we call this a joystick) or the yoke toward or away from the pilot.
  • Horizontal stabilizer — This is another horizontal control surface that modifies the force applied by the elevator.  Horizontal stabilizer trim is the act of using this to reduce the force applied by the elevator.  The horizontal stabilizer trim tab is the actual control surface.  On the Boeing 737, the position of the horizontal stabilizer is controlled using a switch on the yoke.  (See here for an explanation from Air&Space Magazine.  See here for a brief video from mentour.pilot on the controls located on the Boeing 737 NG control yoke.)  There is also a horizontal stabilizer tab wheel that is connected to the horizontal stabilizer trim tab with a cable.  The wheel shows the position of the tab and allows it to be adjusted manually.  (See here for a video from mentour.pilot for Boeing 737 NG.)
  • Horizontal Stabilizer Trim Runaway – It appears that this is defined in the Boeing manuals as continuous uncommanded movement of the horizontal stabilizer tab.  However, the instructions for dealing with Horizontal Stabilizer Trim Runaway cover some other problems as well.
  • Maneuvering Characteristics Augmentation System (MCAS) – This was a piece of software introduced in the Boeing 737 MAX to aid the pilot in avoiding and recovering from stalls.  It was only active when the auto-pilot was disengaged, and it had big problems.  It was activating because of a mechanical fault and sending commands to the motor for the stabilizer trim tab.  (Video from mentour.pilot  Why does the B737Max8 need MCAS in the first place? )
  • Non-normal checklist (NNC)
  • Quick Reference Handbook (QRH) — This is a short document listing things the pilot will have to reference in a hurry, often due to something going wrong.
  • Requirements Traceability Matrix — A list of every requirement and specification in a set of documents.  The location in other documents where the item is implemented is then identified and placed in the document.
  • Roller Coaster Maneuver — This was listed as a way to move the stabilizer trim tab manually if turning the trim wheel required too much force.
  • Speed Trim System (STS) — This piece of software sends commands to the motor on the stabilizer trim jackscrew to help keep the speed steady.  It is only active when the auto-pilot is disengaged.
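The Requirements Traceability Matrix entry above is easier to picture with a small example. The sketch below is a minimal illustration in Python. The requirement identifiers and document names are hypothetical and are not taken from any Boeing or FAA document.

```python
def untraced(requirements, trace):
    """Return the requirement IDs that have no implementing location recorded."""
    return [req for req in requirements if not trace.get(req)]

# Every requirement found in the specification documents (hypothetical IDs).
requirements = ["REQ-001", "REQ-002", "REQ-003"]

# For each requirement, the places in other documents where it is implemented.
trace = {
    "REQ-001": ["Design Document 4.2", "Test Plan 7.1"],
    "REQ-002": [],  # listed, but nobody has said where it is implemented
}

print(untraced(requirements, trace))
```

The value of the matrix is in the empty entries.  A requirement with no location listed is one that may have been dropped without anyone deciding to drop it.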

Control of Horizontal Stabilizer Trim

There is a jackscrew in the tail that moves the stabilizer up and down.  It can be moved by an electric motor or by a cable which can be manually moved from the cockpit.  Commands can be sent by the following methods.

  • There is a pair of switches on the control yoke that can send power to the electric motor driving the jackscrew.
  • The autopilot can trigger the electric motor that drives the stabilizer.
  • The MCAS system can trigger the electric motor that drives the stabilizer. (Boeing 737 Max only)
  • The STS can also trigger the electric motor that drives the stabilizer.
  • The trim wheel can be used to manually turn the jackscrew.

There are also cutout switches that can be used if there are problems with the aircraft. (See here for pictures of the cutout switches.)

On the Boeing 737 NG, there are two switches marked “MAIN ELEC” and “AUTO PILOT”.  Flipping the switch marked “AUTO PILOT” will stop the auto-pilot from sending commands to the electric motor.  It should be mentioned that disengaging the auto-pilot will also stop it from sending commands to the motor.  It appears that flipping “MAIN ELEC” will either prevent the control yoke switches from sending power or cut off the electrical power at the motor.  In any event, flipping both switches will prevent the electric motor from getting power.

On the Boeing 737 Max, the two switches have been renamed “PRI” and “B/U”.  Apparently, throwing either switch will stop commands from the auto-pilot, MCAS, and the control yoke switches as well as power to the electric motor.  The auto-pilot can be disengaged separately, but MCAS can’t be disengaged without also cutting power to the electric motor, forcing the use of the trim wheel for manual control.
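The difference between the two switch arrangements can be written out as a small model. This is a sketch of my understanding as described above, not of the actual aircraft wiring. For the Boeing 737 NG, it assumes that “MAIN ELEC” only blocks the control yoke switches, which is one of the two possibilities mentioned. The function names are made up for the example, and the STS is left out because I don’t know which switch affects it.

```python
def ng_electric_sources(main_elec_cutout, auto_pilot_cutout):
    """Sources that can still drive the trim motor on the 737 NG (assumed model)."""
    sources = set()
    if not main_elec_cutout:
        sources.add("yoke switches")
    if not auto_pilot_cutout:
        sources.add("auto-pilot")
    return sources

def max_electric_sources(pri_cutout, bu_cutout):
    """Sources that can still drive the trim motor on the 737 Max (assumed model).

    Throwing either switch removes every electric source at once.
    """
    if pri_cutout or bu_cutout:
        return set()
    return {"yoke switches", "auto-pilot", "MCAS"}

# On the NG, the automation can be cut while keeping the yoke switches.
print(ng_electric_sources(main_elec_cutout=False, auto_pilot_cutout=True))

# On the Max, stopping MCAS also removes the yoke switches.
print(max_electric_sources(pri_cutout=True, bu_cutout=False))
```

In this model, no combination of switch positions on the Max removes MCAS while leaving the yoke switches working.  The manual trim wheel is the only thing left.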

Boeing Responsibilities

As I understand it, Boeing has the following responsibilities to the public.

  • Make the plane as safe as practical.
  • Make the plane as easy to operate as practical.
  • Make the plane as easy to maintain as practical.
  • Avoid situations that will confuse the pilot or maintenance crew and cause them to take incorrect actions to the extent that is practical.
  • Provide training manuals and standards that will enable pilots and maintenance crews to have the skills to operate the aircraft.

The responsibility of the pilot is to maintain the skills that Boeing has declared are required to fly the plane.  There may also be some skills that Boeing feels are desirable or required in special situations.  You can see that regulatory agencies and the airlines should also specify additional skills that they feel are required, but they have to act on information from Boeing.

The term “as practical” probably needs to be better defined.  However, it doesn’t mean “as long as it doesn’t cause Boeing to lose money.”  Complaints about monetary losses where over 300 people died are likely to cause a huge backlash and loss of reputation.

Analysis of Problems

It has been said that “Success has many fathers, but defeat is an orphan.”  However, defeat has many fathers.  It’s just that nobody is willing to admit paternity.  When people aren’t willing to admit that there are problems in how they are doing things, they will keep suffering defeats.  Or as has been stated by Alcoholics Anonymous, the first step to recovery is to admit that you have a problem.

I remember a joke I heard several years ago.  A man wanted to show off his twin engine plane to a friend.  As they took off, he said “I’m having a little problem with one of the engines.  However, don’t worry.  I can fly the entire distance on just one engine.”  An hour later, one engine went out and the friend asked the pilot why he was so worried if he could fly on one engine.  His reply was that it was the other engine that he had expected problems with.  Now, he should have had both engines looked at before taking off, but it hadn’t seemed important at the time.  The condition of the plane had seemed to be “good enough”.  However, I have found that whenever someone says that something is good enough, it isn’t.  But you have to look for the things that aren’t good enough.   Of course you won’t find the problems if you don’t look for them.

As this is getting a little long, I am going to break this into a few parts.  The next part will cover the Quick Reference Handbook (QRH) and possible needs for improvement.


Robotics Parts and Information

Local Vendors

I used to get my electronics parts from a number of brick and mortar stores such as Radio Shack (no longer in operation), Lafayette Radio Electronics (went out of business many years ago), and some stores specializing in amateur radio equipment.  However, I have found it difficult to find brick and mortar stores in the Philadelphia, PA area.

MicroCenter in Wayne, Pennsylvania has a number of robotics and electronics parts.  They carry starter kits for the Arduino and Raspberry Pi, which are good places to start experimenting with these technologies.  However, I have found the organization of this section of the store to be confusing, and it is hard to find things.

Electronics Suppliers

It now appears that I am going to have to get most of the supplies by ordering over the internet.  The firms listed here carry a wide variety of electronic parts. In addition, a list of suppliers can be found at https://www.build-electronic-circuits.com/buy-electronic-components/.  There is also a blog entry comparing some of the vendors at https://lowpowerlab.com/2018/04/14/component-sourcing-and-mouser-vs-digikey-pros-and-cons/.

If anyone has information on ordering from these firms, I would like to add it to the listing.  I would especially be interested in minimum purchase and shipping charges, since many people only need a small part or two to complete a project. This list is merely intended as a list of references that I have found in various pieces of literature.  No endorsement or statement of quality is intended.

  1. All Electronics
  2. Allied Electronics
  3. Arrow Electronics
  4. Avnet
  5. Digi-Key Electronics
  6. Hammond Manufacturing – Enclosures and Transformers
  7. Jameco Electronics – Five dollar processing fee for orders less than twenty five dollars (fifteen dollars for web orders).  Shipping via UPS, FedEx, or USPS.
  8. Markertek – The selection here is designed for broadcast and audio-video operators.  Because of this, the selection of parts is somewhat different from the other vendors.
  9. Mouser Electronics – No minimum purchase for normally stocked parts with shipping via UPS, FedEx, or USPS.
  10. Newark – also Newark Electronics and Newark Element14™

Robotics Oriented Vendors

These are companies that specialize in DIY robotics and electronics.  Many of them have educational sections, blogs, and forums as part of their website.  This is merely intended as a list of references that I have found in various pieces of literature.  No endorsement or statement of quality is intended.

  1. Actobotics  – When I went to MicroCenter, I found a number of robotic parts that were listed under the brand name Actobotics.  Some of these are now found on the ServoCity website but without the Actobotics branding.  (The Actobotics branding still appears in some of the videos.)  The trademark was apparently sold to SparkFun, with the web page being located at https://www.sparkfun.com/actobotics.   Actobotics is now apparently a set of modular parts for building robot chassis.
  2. AdaFruit   – The items sold on this web site are mainly designed for the Raspberry Pi microcomputer, although there are a few items for the Arduino microcontroller.
  3. Arduino – This is the official site for information on the Arduino microcontroller.  This is a very low cost microcontroller that is designed for a very small form factor and can be embedded in DIY projects.
  4. Evil Mad Scientist – With a name like this, you can tell that they don’t take things too seriously.
  5. OSepp – Appears to be mainly oriented towards Arduino.  Many of their products are carried at MicroCenter and mail order distributors.
  6. Pololu – A number of robotics kits.
  7. Raspberry Pi – This is the official site for the Raspberry Pi.  The Raspberry Pi is a low cost UNIX microcomputer and is designed for a very small form factor that can be embedded in DIY projects.
  8. RobotShop
  9. ServoCity – They have a large selection of servos and motors for use in robots.  Some of the Actobotics robot kits are now listed on the ServoCity website, but without the Actobotics branding.  (The Actobotics brand name is still visible in some of the videos.)
  10. Solarbotics – A number of robotics kits and parts.
  11. Sparkfun   – Robotics kits and parts.
  12. Technologic Systems – This appears to be another source of embedded computers.  I’m not sure how useful it would be for robotics and electronics hobbyists.
  13. Tinkersphere – They have a brick and mortar store in New York City at 152 Allen Street, although they also sell online.  Now that there is no more Radio Row in Manhattan, this might be worth looking at if you are visiting the city.
  14. Velleman – Although their main site is at https://www.velleman.eu, the American site is at https://www.vellemanusa.com.  MicroCenter carries a number of items from this company as do many of the mail order firms in the USA.

Fixing the Cellebrite/GrayKey Hack

According to an article by Thomas Brewster on the Forbes website, it appears that Apple has closed the vulnerability on the iPhone used by the GrayKey unlocking tool.  However, what I have seen would seem to indicate that Apple is being deceptive about how they blocked the exploit.  Normally, I approve of deception in this regard, but I am also unable to determine whether the problem has been completely rectified.

  • Apple claimed that the USB Restricted Mode was what resolved the problem.  However, several organizations indicated that they had workarounds for this mode.  It would also seem that there are many circumstances where a phone could be seized within minutes of the lock screen appearing.  For example, seizure at customs inspection points and airplane boarding.  (See Bradley Ross, “Failure by Apple to Stop iPhone Unlock Exploit”.)
  • When I looked at the information on the exploit (Bradley Ross, “Analyzing a Hack”), it appeared that there was a better than even chance that the problem could be resolved by incrementing the counter of failed attempts before testing the passcode.  There was also a similar problem with an earlier version of iOS, and it appears that the problem was only fixed for that specific instance as opposed to looking for similar situations.  I still don’t know if there was a search for similar situations.  Fixing a bug is not sufficient; you need to fix all instances of that class of vulnerability.
  • The articles on the exploit being blocked appear to be based on the one article by Thomas Brewster.  Having only one source makes it difficult to trust information.
  • As I said before, it is possible that Apple has corrected the vulnerability, but it does not appear that it was done in the way described by Apple.  More transparency would help in evaluation of the potential security problems.

What I would really like to see would be some of the following.

  • A statement from a different source that the problem has been resolved.  (I realize that others are searching for ways to work around Apple’s patch, and this would only be correct at this time.)
  • A statement on how to prevent this vulnerability in other programs or platforms.  This is not the same as saying how Apple fixed it.  It is quite possible to list a number of topics without mentioning which were used for this specific patch.  To quote the musical “Hello, Dolly”, techniques for eliminating vulnerabilities “are like manure.  They only work if you spread them around.”

Failure by Apple to Stop iPhone Unlock Exploit

Cellebrite and GrayShift (GrayShift’s product is named GrayKey) have been selling their services for unlocking iPhones even when the phone has been set to delete data after ten failed login attempts. Since these companies apparently use the USB connection to run the exploit, Apple has inserted code (USB Restricted Mode) that will prevent data transfer via USB if the phone is locked and the phone has not been unlocked within the past hour. (For more information, see my previous blog entry.)
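The rule as described can be written as a single test. The sketch below is based only on the description in the paragraph above. It is not Apple’s code, and the function and parameter names are made up for the example.

```python
from datetime import timedelta

ONE_HOUR = timedelta(hours=1)

def usb_data_allowed(locked, time_since_unlock):
    """USB data is blocked only when the phone is locked AND the last unlock is over an hour old."""
    return not (locked and time_since_unlock >= ONE_HOUR)

# Phone seized fifteen minutes after the owner last used it.
print(usb_data_allowed(locked=True, time_since_unlock=timedelta(minutes=15)))

# Phone that has been sitting locked for two hours.
print(usb_data_allowed(locked=True, time_since_unlock=timedelta(hours=2)))
```

Written this way, it is easy to see that the protection depends entirely on how long the phone has been idle when it is seized.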

At least, that’s how it’s supposed to work. Even before the USB Restricted Mode was implemented in beta versions of the operating system, several people were saying that they already had updates ready that would enable their exploits to continue working. Personally, I thought that USB Restricted Mode would never be effective for the following reasons.

  • The real problem is that it is possible to test passcodes without incrementing the counters in the SEP (Secure Enclave Processor). This is not addressed in the press releases from Apple, and there is no indication that they are aware of the underlying cause. Unless you can identify the root cause of this vulnerability, it is impossible to determine all the ways that this vulnerability can be exploited.
  • USB Restricted Mode only addresses one method of exploiting the vulnerability. I suspect that there is a high probability that any vulnerability that can be exploited using items attached via USB can also be exploited via Bluetooth, WiFi, and malicious applications being introduced in the iOS App Store.
  • The chances that the phone has been unlocked in the past hour are actually very high. People seem to find it very hard to go for more than an hour without using their iPhone. A malicious actor could also wait until the user has accessed the phone and then steal it within fifteen minutes.

As I stated in the previous blog entry, there is a strong possibility that the exploit can be thwarted by simply incrementing the counter of failed login attempts before testing the passcode. Whatever the case, it would appear unreasonable to declare a fix without getting to the underlying flaw. It almost seems like Apple is denying the existence of an underlying flaw that can be exploited.
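The idea of counting the attempt before testing the passcode can be sketched as follows. This is a simplified illustration in Python and not a description of how the SEP firmware actually works. The class and method names are hypothetical, and a real implementation would have to commit the counter to persistent storage before the comparison starts.

```python
import hmac

class PasscodeGuard:
    """Toy model of a passcode check that counts every attempt up front."""

    MAX_ATTEMPTS = 10

    def __init__(self, passcode):
        self._passcode = passcode.encode()
        self.failed_attempts = 0
        self.erased = False

    def try_unlock(self, guess):
        if self.erased:
            return False
        # Count the attempt BEFORE testing it.  If the attacker interrupts
        # the check after this point, the attempt has still been counted.
        self.failed_attempts += 1
        if hmac.compare_digest(guess.encode(), self._passcode):
            # Only a successful unlock resets the counter.
            self.failed_attempts = 0
            return True
        if self.failed_attempts >= self.MAX_ATTEMPTS:
            self.erased = True  # stands in for erasing the data
        return False

guard = PasscodeGuard("1234")
for guess in ["0000", "1111", "2222"]:
    guard.try_unlock(guess)
print(guard.failed_attempts)
```

If the counter is only incremented after a failed comparison, anything that interrupts the code between the comparison and the increment gives the attacker a free guess.  Incrementing first closes that window at the cost of one extra write on each successful unlock.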

References

“New IOS Security Feature Ripe for Defeat”, SoftBro.tech, 21-August-2018, https://www.softbro.tech/new-ios-security-feature/

Bradley Ross, “Analyzing a Hack”, Bradley Ross’s Blog on Life and Computer Software, 16-June-2018, https://bradleyaross.wordpress.com/2018/06/16/analyzing-a-hack/ – This is my previous blog entry on this topic.

“Grayshift claims it defeated Apple’s forthcoming ‘USB Restricted Mode’ security feature”, AppleInsider, 14-June-2018, https://appleinsider.com/articles/18/06/14/grayshift-claims-it-defeated-apples-forthcoming-usb-restricted-mode-security-feature

Joseph Cox and Lorenzo Franceschi-Bicchierai, “Cops Are Confident iPhone Hackers Have Found a Workaround to Apple’s New Security Feature”, Vice Motherboard, 14-June-2018, https://motherboard.vice.com/en_us/article/pavwzv/cops-are-confident-iphone-hackers-have-found-a-workaround-to-apples-new-security-feature

Thomas Fox-Brewster, “Apple iOS Security Boost Not Stopping Cops Hacking iPhones”, Forbes, 26-July-2018, https://www.forbes.com/sites/thomasbrewster/2018/07/26/apple-ios-security-boost-not-stopping-cops-hacking-iphones/#34279dcd7129

Vladimir Katalov, “USB Restricted Mode Inside Out”, ElcomSoft, 12-July-2018, https://blog.elcomsoft.com/2018/07/usb-restricted-mode-inside-out/

Oleg Afonin, “This $39 Device Can Defeat iOS USB Restricted Mode”, ElcomSoft Blog, 9-July-2018, https://blog.elcomsoft.com/2018/07/this-9-device-can-defeat-ios-usb-restricted-mode/

Analyzing a Hack

There has been a lot of discussion on the internet recently on the ability of an attacker to test unlimited numbers of PIN/Passcode values without triggering the auto-delete function built into the iPhones. So I have put together this article to show a means of looking at such a problem logically.

Determining Probable Location of Vulnerability

The exploit involves testing more than ten possible passwords without triggering the erasure of data on the phone. According to the Apple literature, this would not be possible without flaws in the hardware, firmware, or software in the Secure Enclave Processor (SEP). There may be flaws in other parts of the phone that enable the flaw in the SEP to be exploited, but the exploits described involving Cellebrite, GrayKey, and others require flaws in the SEP that violate the postulates that make up the requirements for the SEP. Some of these postulates are as follows.

  • It is not feasible to remove the cover of the chip carrier containing the SEP and insert probes to access or modify data without triggering the erasure. The descriptions of several of the exploits state that the case and chip carriers are not opened or modified.
  • Key pieces of persistent data contained in the SEP can’t be read or written using the pins for the chip carrier without going through the software in the SEP.
  • There are certain “salt” values that are set during the manufacture of the device and stored in persistent memory in the SEP. Output data lines for memory containing this information only connects to the cryptographic portion of the chip and not by the general purpose CPU in the SEP. It is not possible to read operating system, firmware, or other files without using the cryptographic portion of the chip.
  • Certain variables, such as the number of consecutive unsuccessful login attempts and the time of the last unsuccessful login attempt are considered extremely critical. They should only be accessible by the SEP firmware and the code in the firmware capable of reading and/or writing this information has been subjected to additional review.

Web Search

The first step in analyzing anything is generally research. So I spent a little time on the internet using Google to search for relevant articles. Some combinations that will work well with Google are the following.

  • ios unlock tool
  • “ios 11” unlock tool
  • iphone crack passcode

When you use something like Google Search, it isn’t enough to just look at the first few entries. It is necessary to go through pages of listings and then try new searches based on the items you find.

Relevant Events

Based on the Google search, I found that the relevant literature seemed to fall within a few categories. I have listed them below in chronological order.

  1. Apple CVE-2014-4451 – This was an exploit that allowed users to try an unlimited number of PIN codes. The articles were dated November 18, 2014, and the vulnerability was reported as detected and resolved during the development of iOS 8.
  2. Secure Enclave Processor Decryption – Articles published on August 17, 2017 indicate that the decryption key for the Secure Enclave Processor had been published. This means that an attacker could reverse compile the SEP code to learn how it works.
  3. Chinese Cracker Box – This is a cracker box (reportedly from China) that will crack the PIN code on the latest iPhones. The articles were dated August 17, 2017. I am simply referring to it as the “Chinese Cracker Box” for convenience. The cracker community is not necessarily known for “truth in labeling”.
  4. GrayKey Cracker Box – GrayKey is a cracker device manufactured by GrayShift. The articles state that information on the device first appeared in late 2017.
  5. Cellebrite Exploit – Cellebrite provides much less information than the others, but a number of articles in February and March 2018 report claims that Cellebrite has developed a means of unlocking the latest iPhones. It appears that this technique was developed in late 2017, since one of the articles stated that it was developed in the last few months.

A review of the dates seems to indicate a few connections between the problems.

  • Reports on the decryption of the Secure Enclave Processor (SEP) and the Chinese Cracker Box appeared at the same time. This was a few years after CVE-2014-4451.
  • CVE-2014-4451, the Chinese Cracker Box, and the GrayKey device all appear to interrupt a process after a passcode is tested, but before the counter for unsuccessful login attempts is incremented.
  • The GrayKey and Cellebrite cracks appear to have been developed at about the same time, and after the Chinese Cracker Box.

Hypothesis

Scientific analysis first requires observation and research. The next step is to create a hypothesis. Let us consider the following as a hypothesis.

  1. Assume that a person interested in studying vulnerabilities on the iPhone has been collecting iPhones after each update. That would seem reasonable given the amount of money being spent.
  2. After the firmware decryption key was determined, it would be possible to reverse compile the SEP code before and after CVE-2014-4451 was resolved.
  3. Comparing the two versions of the SEP code would enable the researcher to find the section of code that was changed to remove the vulnerability.
  4. (This step is a wild-ass guess.) The fix may have been to move the incrementing of the counter before the test of the trial password. After all, a successful login will reset the counter, so the overall logic would remain unchanged.
  5. You would then examine the current version for other places where the same erroneous code appears. (e.g., other locations where the password is tested before incrementing the counter.) Each of these locations would then be checked against the exploit in CVE-2014-4451 to see if any of them could be exploited.
  6. The Chinese Cracker Box would then be developed using the newly found vulnerability. Sales of the device took place by August, 2017.
  7. The release of the firmware decryption key came from someone who had access to the development of the Chinese Cracker Box. This is based on the fact that both items were reported at the same time.
  8. Since the Chinese Cracker Box was available on the underground market, both GrayKey and Cellebrite would likely have purchased copies of the unit. Four months sounds like a reasonable development time for them to implement their “new and improved” versions.
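
The guess in step 4 can be illustrated with a toy simulation. This is only a sketch of the hypothesized logic, not Apple's code: the `Device` class, the `PowerCut` exception, and the method names are all invented for illustration, and the "interrupt" simply models an attacker cutting the process short as soon as a failed comparison is observed.

```python
class PowerCut(Exception):
    """Hypothetical stand-in for the attacker interrupting the device."""


MAX_ATTEMPTS = 10  # the ten-attempt limit mentioned above; everything else is assumed


class Device:
    def __init__(self, passcode):
        self._passcode = passcode
        self.failed_attempts = 0  # persistent counter of consecutive failures
        self.erased = False

    def _locked_out(self):
        # Erase once the limit has been reached and another attempt arrives.
        if self.failed_attempts >= MAX_ATTEMPTS:
            self.erased = True
        return self.erased

    def _compare(self, guess, interrupt):
        match = guess == self._passcode
        if interrupt and not match:
            # The attacker sees the failure and interrupts before anything else runs.
            raise PowerCut()
        return match

    def check_then_increment(self, guess, interrupt=False):
        """Hypothesized vulnerable order: test first, count the failure afterwards."""
        if self._locked_out():
            return False
        if self._compare(guess, interrupt):
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        return False

    def increment_then_check(self, guess, interrupt=False):
        """Hypothesized fix: count the attempt first, reset on success."""
        if self._locked_out():
            return False
        self.failed_attempts += 1
        if self._compare(guess, interrupt):
            self.failed_attempts = 0
            return True
        return False


def brute_force(device, method_name, guesses):
    """Try each guess, interrupting every failed comparison."""
    verify = getattr(device, method_name)
    for guess in guesses:
        if device.erased:
            return None
        try:
            if verify(guess, interrupt=True):
                return guess
        except PowerCut:
            continue  # interrupted attempt; move on to the next guess
    return None


if __name__ == "__main__":
    guesses = [f"{i:04d}" for i in range(10000)]
    old = Device("0042")
    print(brute_force(old, "check_then_increment", guesses), old.erased)
    new = Device("0042")
    print(brute_force(new, "increment_then_check", guesses), new.erased)
```

With the test-first ordering, every interrupted guess leaves the counter untouched, so the search runs until the passcode is found. With the count-first ordering, the same interruptions still consume attempts and the toy device erases itself. A successful login resets the counter in both versions, which is why the overall logic is otherwise unchanged.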

Human Behavior

I have seen many cases where a maintainer will fix one occurrence of a bug but not look for other occurrences. In fact, many times the word from management is “Just fix this bug and close out the task. Don’t waste time.” I have heard this expressed very frequently and very forcefully.

Since the potential hackers can now reverse compile the code on the SEP, it would be only logical for them to determine the coding changes that fixed earlier problems and then look for other occurrences of the incorrect logic. In this case, a good method would be to look for all code that can change the value of the counter, a task well within the capability of many IDEs (Integrated Development Environments) or even the UNIX grep command.
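
As a rough sketch of that kind of search, the following Python function does a grep-style scan for lines that appear to write to a named counter. The identifier `failed_attempts` and the C-like sample are invented for illustration; real decompiled SEP code would not come with friendly variable names, so this only shows the idea of listing every place a value can be modified.

```python
import re


def find_counter_writes(source, name):
    """Return (line_number, text) for lines that appear to modify `name`."""
    ident = re.escape(name)
    pattern = re.compile(
        rf"(\+\+|--)\s*{ident}\b"              # ++name or --name
        rf"|\b{ident}\s*(\+\+|--)"             # name++ or name--
        rf"|\b{ident}\s*([+\-*/|&^]?=)(?!=)"   # name = ..., name += ... (not ==)
    )
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical C-like source, used only to exercise the search.
SAMPLE = """\
int verify(const char *guess) {
    if (failed_attempts >= MAX_ATTEMPTS) erase();
    if (compare(guess)) { failed_attempts = 0; return 1; }
    failed_attempts++;
    return 0;
}
void restore(void) { failed_attempts = load_counter(); }
"""

if __name__ == "__main__":
    for number, text in find_counter_writes(SAMPLE, "failed_attempts"):
        print(number, text)
```

Reads and comparisons (such as the `>=` test on the second line of the sample) are deliberately skipped, so the output is limited to the places a reviewer would need to check against the original exploit.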

Another problem that I have seen is that nobody seems to use fault trees or coverage charts. These are attempts to use formal logic to determine the items that need to be examined. Many consider it too much work to go through all the possibilities, and state that it’s good enough. People keep telling me “Don’t worry. It’s good enough”. My experience in these cases is that it’s almost never good enough and I worry a lot. However, this will be covered in more detail in a future post.

References

I have divided the references according to the events that they describe.

Apple CVE-2014-4451

Decryption of Secure Enclave Processor

Chinese Cracker Box

GrayKey Device

Cellebrite Exploit

Secure Enclave Processor