ANR问题分析和原理学习

Application Not Responding

发布日期 2018-05-22

一、概述

ANR即Application Not Responding(应用程序无响应),一般在ANR的时候会弹出一个应用无响应对话框,同时会候产生一个日志文件trace.txt,位于/data/anr/文件夹下面,trace文件是Android Davik虚拟机在收到异常终止信号时产生的,最常见的一个触发条件就是Android应用中产生了FC(force close)。由于该文件的产生是在DVM中的,所以只有运行DVM实例的进程才能产生该文件,也就是说只有Java代码才能产生该文件,App应用的Native层(如Android Library、用c/c++编译的库)即使异常也不会产生ANR日志文件。我们可以通过ANR产生的traces日志文件分析应用在哪里产生了ANR,以此来有效解决应用中的ANR。</p>

二、ANR产生的原因

只用当应用程序的UI线程响应超时才会引起ANR,超时原因一般有两种:

1.当前的事件没有机会得到处理,例如UI线程正在响应另一个事件,当前事件由于某种原因被阻塞了。

2.当前的事件正在被处理,但是由于好事太长没有能够及时完成

根据ANR产生的原因不同,超时时间也不尽相同,从本质上讲,产生ANR的原因有三种,大致可以对应到Android中四大组件中的三个(Activity、BroadcastReceive、Service)

第一种:KeyDispatchTimeout:最常见的一种类型,原因是View事件或者触摸事件在特定时间内(5s)无法得到响应

第二种:BroadcastTimeout:BroadcastReceiver的onReceive函数运行在主线程,并在特定的时间内(10s)无法完成处理

第三种:ServiceTimeout:Service的各个生命周期函数在特定时间内(20s)无法完成响应

三、典型的ANR场景

1)主线程频繁进行IO操作,比如读写文件或者数据库;
2)硬件操作如进行调用照相机或者录音等操作;
3)多线程操作的死锁,导致主线程等待超时;
4)主线程操作调用join()方法、sleep()方法或者wait()方法;
5)耗时动画/耗资源行为导致CPU负载过重
6)system server中发生WatchDog ANR;
7)service binder的数量达到上限

四、ANR的定位和分析

1)Logcat分析

查询关键字(ANR|ActivityManager)

/system_process E/ActivityManager:
ANR in admanager.zhh.com.anrdemo (admanager.zhh.com.anrdemo/.MainActivity)
                                                             PID: 12639
                                                             Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 6.  Wait queue head age: 5841.3ms.)
                                                             Load: 13.28 / 11.53 / 9.48
                                                             CPU usage from 71228ms to 0ms ago (1970-03-01 00:08:47.864 to 1970-03-01 00:09:59.093):
                                                               15% 920/system_server: 9.2% user + 5.9% kernel / faults: 10139 minor 285 major
                                                               8.2% 13985/com.tencent.mm: 6.5% user + 1.6% kernel / faults: 35481 minor 1879 major
                                                               0.1% 555/fingerprintd: 0% user + 0.1% kernel / faults: 51 minor 22 major
                                                               3.5% 266/mmcqd/0: 0% user + 3.5% kernel
                                                               3.2% 4424/com.tencent.android.qqdownloader:daemon: 1.7% user + 1.5% kernel / faults: 12007 minor 357 major
                                                               3.2% 430/adbd: 0.4% user + 2.7% kernel / faults: 2002 minor
                                                               3.1% 135/kswapd0: 0% user + 3.1% kernel
                                                               1.8% 12271/com.tencent.mobileqq: 1.2% user + 0.5% kernel / faults: 8799 minor 580 major
                                                               1.6% 11695/com.cleanmaster.mguard: 1.5% user + 0.1% kernel / faults: 9171 minor 72 major
                                                               1.3% 13039/com.tencent.mm:push: 0.9% user + 0.3% kernel / faults: 9969 minor 192 major
                                                               1.2% 4954/com.google.android.gms: 0.9% user + 0.2% kernel / faults: 7546 minor 209 major
                                                               1.1% 342/logd: 0.3% user + 0.8% kernel / faults: 628 minor 3 major
                                                               1.1% 1485/kworker/u13:3: 0% user + 1.1% kernel
                                                               1.1% 10/rcu_preempt: 0% user + 1.1% kernel
                                                               1.1% 12166/com.android.vending: 0.8% user + 0.3% kernel / faults: 9897 minor 380 major
                                                               1% 3846/com.tencent.qqmusic:QQPlayerService: 0.6% user + 0.4% kernel / faults: 7354 minor 485 major
                                                               1% 10486/kworker/u13:4: 0% user + 1% kernel
                                                               1% 11251/kworker/u13:1: 0% user + 1% kernel
                                                               1% 9498/com.cleanmaster.security:DefendService: 0.7% user + 0.3% kernel / faults: 11998 minor 287 major
                                                               0.9% 1973/kworker/u13:5: 0% user + 0.9% kernel
                                                               0.9% 14113/com.tencent.mm:exdevice: 0.7% user + 0.1% kernel / faults: 8802 minor 142 major
                                                               0.8% 10814/com.tencent.qqmusic: 0.4% user + 0.4% kernel / faults: 9823 minor 221 major
                                                               0.8% 2062/sogou.mobile.explorer: 0.5% user + 0.3% kernel / faults: 8532 minor 1288 major
                                                               0.8% 528/netd: 0.1% user + 0.7% kernel / faults: 4203 minor 35 major
                                                               0.8% 3925/com.google.android.gms.persistent: 0.4% user + 0.3% kernel / faults: 12578 minor 546 major
                                                               0.7% 556/cnss_diag: 0.6% user + 0% kernel / faults: 121 minor 5 major
                                                               0.6% 4971/com.cleanmaster.mguard:service: 0.4% user + 0.2% kernel / faults: 7468 minor 174 major
                                                               0.5% 9180/com.tencent.android.qqdownloader: 0.2% user + 0.3% kernel / faults: 9571 minor 216 major
                                                               0.5% 15/ksoftirqd/1: 0% user + 0.5% kernel
                                                               0.4% 3280/com.android.phone: 0.2% user + 0.2% kernel / faults: 1857 minor 133 major
                                                               0.4% 8489/kworker/0:1: 0% user + 0.4% kernel
                                                               0.4% 3127/sdcard: 0% user + 0.4% kernel / faults: 92 minor 17 major
                                                               0.4% 397/servicemanager: 0.1% user + 0.2% kernel / faults: 127 minor
                                                               0.4% 11878/swift.space.cleaner.free:service: 0.2% user + 0.1% kernel / faults: 6087 minor 140 major
                                                               0.4% 20/ksoftirqd/2: 0% user + 0.4% kernel
                                                               0.4% 8761/com.netease.cloudmusic:play: 0.3% user + 0% kernel / faults: 4819 minor 977 major
                                                               0.3% 6525/com.tencent.qqlive:services: 0.2% user + 0.1% kernel / faults: 7834 minor 483 major
                                                               0.3% 12250/com.tencent.mobileqq:MSF: 0.2% user + 0.1% kernel / faults: 5368 minor 114 major
                                                               0.3% 9717/com.speed.boost.booster:service: 0.1% user + 0.1% kernel / faults: 5526 minor 77 major
                                                               0.3% 8524/sogou.mobile.explorer:push_service: 0.2% user + 0.1% kernel / faults: 6280 minor 42 major
                                                               0.3% 427/irq/215-fc38800: 0% user + 0.3% kernel
                                                               0.3% 8721/kworker/1:2: 0% user + 0.3% kernel
                                                               0.3% 260/cfinteractive: 0% user + 0.3% kernel
                                                               0.3% 294/msm-core:sampli: 0% user + 0.3% kernel
                                                               0.1% 8756/kworker/u12:3: 0% user + 0.1% kernel
                                                               0.1% 11204/kworker/2:0: 0% user + 0.1% kernel
                                                               0.3% 3150/com.android.systemui: 0.1% user + 0.1% kernel / faults: 3318 minor 129 major
                                                               0.2% 7244/kworker/u12:2: 0% user + 0.2% kernel
                                                               0% 3084/VosMCThread: 0% user + 0% kerne

可以看到,Logcat日志信息中主要包含如下内容:

1)导致ANR的类名及所在包名:
admanager.zhh.com.anrdemo (admanager.zhh.com.anrdemo/.MainActivity)

2)导致ANR的进程名及ID:admanager.zhh.com.anrdemo,12639

3)ANR产生的原因(类型):Reason: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 6. Wait queue head age: 5841.3ms.),明显属于KeyDispatchTimeout类型。

4)系统中CPU使用率的统计信息及某段时间内的变换情况

从trace.txt文件分析

ANR Logcat信息可以帮助我们分析引发ANR的具体类的信息以及ANR的类型,但是却无法帮助我们定位到具体引发问题的代码行,这时候就需要借助ANR发生过程中生成的堆栈信息文件data/anr/trace.txt

获取trace文件:adb pull /data/anr/trace.txt

trace.txt文件分析

由于trace文件记录的是整个手机的ANR日志,所以我们需要根据进程名(包名)和ANR发生的时间找到对应日志,具体以pid 进程id开始,以pid进程id结束。由于阻塞始终会发生在主线程,因此我们需要关注
线程名为main的线程状态。下面摘出一部分关键日志进行分析:

//关键日志的起始标记
----- pid 20678 at 2018-08-13 21:58:59 -----
//应用程序包名
Cmd line: admanager.zhh.com.anrdemo
Build fingerprint: 'Android/aosp_bullhead/bullhead:7.1.2/N2G48C/han-li03221443:userdebug/test-keys'

//手机的CPU架构
ABI: 'arm64'

//堆内存信息
Heap: 33% free, 4MB/6MB; 27144 objects
Dumping cumulative Gc timings

Total time spent in GC: 10.680ms
Mean GC size throughput: 68MB/s
Mean GC object throughput: 1.32996e+06 objects/s
Total number of allocations 41348
Total bytes allocated 5MB
Total bytes freed 753KB

//手机Memory内存信息
Free memory 2MB
Free memory until GC 2MB
Free memory until OOME 187MB
Total memory 6MB
Max memory 192MB
Zygote space size 1128KB
Total mutator paused time: 463us
Total time waiting for GC to complete: 938ns
Total GC count: 1
Total GC time: 10.680ms
Total blocking GC count: 0
Total blocking GC time: 0

//主线程日志分析
//基本信息:main-线程名称,prio-线程优先级,tid-线程id,Sleeping-线程状态
"main" prio=5 tid=1 Sleeping
  //详细信息:group-线程组名称,sCount-线程被挂起的次数,dsCount-线程被调试器刮起的次数,obj-线程的Java对象地址,self-线程本身的Native对象地址
  | group="main" sCount=1 dsCount=0 obj=0x75190ed0 self=0x7f72095a00
  //线程的调度信息:sysTid-linux系统中内核线程id(观察发现主线程的线程号和进程号相同),nice-线程调度的优先级,cgrp-线程调度组,sched-线程调度策略和优先级,handle-线程处理函数地址
  | sysTid=20678 nice=-10 cgrp=default sched=0/0 handle=0x7f75fe0a98
  //上下文信息:state-线程调度状态,schedstat-线程在CPU中的执行时间、线程等待时间、线程执行的时间片长度,utm-线程在用户状态中调度的时间值,core-最后执行这个线程的CPU核序号
  | state=S schedstat=( 274637713 30817705 229 ) utm=20 stm=6 core=1 HZ=100
  //堆栈信息:stack-堆栈地址,stackSize-堆栈大小,余下的为堆栈信息,也是我们分析引发ANR问题的关键
  | stack=0x7fc8318000-0x7fc831a000 stackSize=8MB
  | held mutexes=
  at java.lang.Thread.sleep!(Native method)
  - sleeping on <0x02f69763> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:371)
  - locked <0x02f69763> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:313)
  at admanager.zhh.com.anrdemo.MainActivity$1.onClick(MainActivity.java:24)
  at android.view.View.performClick(View.java:5637)
  at android.view.View$PerformClick.run(View.java:22429)
  at android.os.Handler.handleCallback(Handler.java:751)
  at android.os.Handler.dispatchMessage(Handler.java:95)
  at android.os.Looper.loop(Looper.java:154)
  at android.app.ActivityThread.main(ActivityThread.java:6121)
  at java.lang.reflect.Method.invoke!(Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:889)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:779)

"Jit thread pool worker thread 0" daemon prio=5 tid=2 Native
  | group="main" sCount=1 dsCount=0 obj=0x12c64790 self=0x7f6a635000
  | sysTid=20683 nice=9 cgrp=default sched=0/0 handle=0x7f71701450
  | state=S schedstat=( 696978 5677 2 ) utm=0 stm=0 core=2 HZ=100
  | stack=0x7f71603000-0x7f71605000 stackSize=1021KB
  | held mutexes=
  kernel: __switch_to+0x8c/0x98
  kernel: futex_wait_queue_me+0xd4/0x130
  kernel: futex_wait+0xfc/0x210
  kernel: do_futex+0xe0/0x920
  kernel: SyS_futex+0x11c/0x1b0
  kernel: cpu_switch_to+0x48/0x4c
  native: #00 pc 000000000001bcec  /system/lib64/libc.so (syscall+28)
  native: #01 pc 00000000000e7e4c  /system/lib64/libart.so (_ZN3art17ConditionVariable16WaitHoldingLocksEPNS_6ThreadE+156)
  native: #02 pc 000000000046c910  /system/lib64/libart.so (_ZN3art10ThreadPool7GetTaskEPNS_6ThreadE+248)
  native: #03 pc 000000000046bdac  /system/lib64/libart.so (_ZN3art16ThreadPoolWorker3RunEv+124)
  native: #04 pc 000000000046b6d0  /system/lib64/libart.so (_ZN3art16ThreadPoolWorker8CallbackEPv+132)
  native: #05 pc 0000000000068734  /system/lib64/libc.so (_ZL15__pthread_startPv+208)
  native: #06 pc 000000000001da7c  /system/lib64/libc.so (__start_thread+16)
  (no managed stack frames)

//关键日志的结束标记
----- end 20678 -----

由于ANR只会发生在主线程,所以我们需要关注主线程的状态,从上 面日志中分析可以知道:发生ANR时主线程的状态为Sleep,且引起该状态的地方在MainActivity$1.onClick(MainActivity.java:24)
这里写图片描述

五、ANR源码分析

下面分析Service内onCreate发生ANR异常时源代码的执行情况

大家都知道Android中的四大组件启动时,都会通过跨进程的方式利用Handler消息机制最终将消息Push到ActivityThread中的内部类去处理,因此我先在ActivityThread中搜索service.onCreate调用,然后依次追溯:

注:下面均以删除无关代码

ActivityThread中调用:

  private void handleCreateService(CreateServiceData data) {
            //调用service的onCreate
            service.onCreate();

    }

该方法由Activity的内部类H(继承与Handler)接收到Servic onCreate消息时调用,该消息由通过scheduleCreateService发出,该方法被ActiveService类调用

        public final void scheduleCreateService(IBinder token,
                ServiceInfo info, CompatibilityInfo compatInfo, int processState) {
            updateProcessState(processState, false);
            CreateServiceData s = new CreateServiceData();
            s.token = token;
            s.info = info;
            s.compatInfo = compatInfo;

            sendMessage(H.CREATE_SERVICE, s);
        }

ActiveServicde的realStartServiceLocked方法调用上面方法,这个方法比较关键需要认真分析:

private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {

        r.app = app;
        r.restartTime = r.lastActivity = SystemClock.uptimeMillis();

        final boolean newService = app.services.add(r);
        //发送delay消息(SERVICE_TIMEOUT_MSG)
        bumpServiceExecutingLocked(r, execInFg, "create");
        mAm.updateLruProcessLocked(app, false, null);
        updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
        mAm.updateOomAdjLocked();

        boolean created = false;
        try {
            synchronized (r.stats.getBatteryStats()) {
                r.stats.startLaunchedLocked();
            }
            mAm.notifyPackageUse(r.serviceInfo.packageName,
                                 PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
            app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
            //最终执行服务的onCreate()方法
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
                    app.repProcState);
            r.postNotification();
            created = true;
        } catch (DeadObjectException e) {
            Slog.w(TAG, "Application dead when creating service " + r);
            mAm.appDiedLocked(app);
            throw e;
        } finally {
            if (!created) {
                // Keep the executeNesting count accurate.
                final boolean inDestroying = mDestroyingServices.contains(r);
                //当service启动完毕,则remove SERVICE_TIMEOUT_MSG消息
                serviceDoneExecutingLocked(r, inDestroying, inDestroying);

                // Cleanup.
                if (newService) {
                    app.services.remove(r);
                    r.app = null;
                }

                // Retry.
                if (!inDestroying) {
                    scheduleServiceRestartLocked(r, false);
                }
            }
        }

        if (r.whitelistManager) {
            app.whitelistManager = true;
        }

        requestServiceBindingsLocked(r, execInFg);

        updateServiceClientActivitiesLocked(app, null, true);

        // If the service is in the started state, and there are no
        // pending arguments, then fake up one so its onStartCommand() will
        // be called.
        if (r.startRequested && r.callStart && r.pendingStarts.size() == 0) {
            r.pendingStarts.add(new ServiceRecord.StartItem(r, false, r.makeNextStartId(),
                    null, null, 0));
        }

        sendServiceArgsLocked(r, execInFg, true);
    }

bumpServiceExecutingLocked方法内又调用了scheduleServiceTimeoutLocked方法:

    void scheduleServiceForegroundTransitionTimeoutLocked(ServiceRecord r) {
        if (r.app.executingServices.size() == 0 || r.app.thread == null) {
            return;
        }
        //指定id为SERVICE_TIMEOUT_MSG的消息
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_FOREGROUND_TIMEOUT_MSG);
        msg.obj = r;
        r.fgWaiting = true;
        //前台Service和后台Service发送的Delay消息时间不同
        mAm.mHandler.sendMessageDelayed(msg, SERVICE_START_FOREGROUND_TIMEOUT);
    }

serviceDoneExecutingLocked方法

    private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
            boolean finishing) {
        r.executeNesting--;
        if (r.executeNesting <= 0) {
            if (r.app != null) {
                if (DEBUG_SERVICE) Slog.v(TAG_SERVICE,
                        "Nesting at 0 of " + r.shortName);
                r.app.execServicesFg = false;
                r.app.executingServices.remove(r);
                if (r.app.executingServices.size() == 0) {
                    if (DEBUG_SERVICE || DEBUG_SERVICE_EXECUTING) Slog.v(TAG_SERVICE_EXECUTING,
                            "No more executingServices of " + r.shortName);
                   //remove掉id为SERVICE_TIMEOUT_MSG的消息,从上面的分析可知,该方法是onCreate调用之前发出的一个Delay执行的消息
                   mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
                } else if (r.executeFg) {
                    // Need to re-evaluate whether the app still needs to be in the foreground.
                    for (int i=r.app.executingServices.size()-1; i>=0; i--) {
                        if (r.app.executingServices.valueAt(i).executeFg) {
                            r.app.execServicesFg = true;
                            break;
                        }
                    }
                }
                if (inDestroying) {
                    mDestroyingServices.remove(r);
                    r.bindings.clear();
                }
                mAm.updateOomAdjLocked(r.app, true);
            }
            r.executeFg = false;
            if (r.tracker != null) {
                r.tracker.setExecuting(false, mAm.mProcessStats.getMemFactorLocked(),
                        SystemClock.uptimeMillis());
                if (finishing) {
                    r.tracker.clearCurrentOwner(r, false);
                    r.tracker = null;
                }
            }
            if (finishing) {
                if (r.app != null && !r.app.persistent) {
                    r.app.services.remove(r);
                    if (r.whitelistManager) {
                        updateWhitelistManagerLocked(r.app);
                    }
                }
                r.app = null;
            }
        }
    }

小结:Service启动调用onCreate方法之前send一个Delay执行的消息,当发生异常或者Service的onCreate方法执行完毕之后,remove掉之前send的消息,如果onCreate方法执行时间超过Delay的时间,那么该消息就会被处理,此时如果Delay的时间是我们设定的ANR时间,则发生ANR,系统作出处理,否则该消息被remove,不会被执行。以一种观察者模式的角度去实现。

id为SERVICE_TIMEOUT_MSG的消息被AMS中MainHandler接收

            case SERVICE_TIMEOUT_MSG: {
                if (mDidDexOpt) {
                    mDidDexOpt = false;
                    Message nmsg = mHandler.obtainMessage(SERVICE_TIMEOUT_MSG);
                    nmsg.obj = msg.obj;
                    mHandler.sendMessageDelayed(nmsg, ActiveServices.SERVICE_TIMEOUT);
                    return;
                }
                mServices.serviceTimeout((ProcessRecord)msg.obj);
            } break;

ActiveService的serviceTimeout方法中调用AppErrors中的appNotResponding方法,很明显ANR异常处理的方法:

    final void appNotResponding(ProcessRecord app, ActivityRecord activity,
            ActivityRecord parent, boolean aboveSystem, final String annotation) {
        ArrayList<Integer> firstPids = new ArrayList<Integer>(5);
        SparseArray<Boolean> lastPids = new SparseArray<Boolean>(20);

        if (mService.mController != null) {
            try {
                // 0 == continue, -1 = kill process immediately
                int res = mService.mController.appEarlyNotResponding(
                        app.processName, app.pid, annotation);
                if (res < 0 && app.pid != MY_PID) {
                    app.kill("anr", true);
                }
            } catch (RemoteException e) {
                mService.mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

        long anrTime = SystemClock.uptimeMillis();
        if (ActivityManagerService.MONITOR_CPU_USAGE) {
            mService.updateCpuStatsNow();
        }

        // Unless configured otherwise, swallow ANRs in background processes & kill the process.
        boolean showBackground = Settings.Secure.getInt(mContext.getContentResolver(),
                Settings.Secure.ANR_SHOW_BACKGROUND, 0) != 0;

        boolean isSilentANR;

        synchronized (mService) {
            // PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.
            if (mService.mShuttingDown) {
                Slog.i(TAG, "During shutdown skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.notResponding) {
                Slog.i(TAG, "Skipping duplicate ANR: " + app + " " + annotation);
                return;
            } else if (app.crashing) {
                Slog.i(TAG, "Crashing app skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.killedByAm) {
                Slog.i(TAG, "App already killed by AM skipping ANR: " + app + " " + annotation);
                return;
            } else if (app.killed) {
                Slog.i(TAG, "Skipping died app ANR: " + app + " " + annotation);
                return;
            }

            // In case we come through here for the same app before completing
            // this one, mark as anring now so we will bail out.
            app.notResponding = true;

            // Log the ANR to the event log.
            EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
                    app.processName, app.info.flags, annotation);

            // Dump thread traces as quickly as we can, starting with "interesting" processes.
            firstPids.add(app.pid);

            // Don't dump other PIDs if it's a background ANR
            isSilentANR = !showBackground && !isInterestingForBackgroundTraces(app);
            if (!isSilentANR) {
                int parentPid = app.pid;
                if (parent != null && parent.app != null && parent.app.pid > 0) {
                    parentPid = parent.app.pid;
                }
                if (parentPid != app.pid) firstPids.add(parentPid);

                if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);

                for (int i = mService.mLruProcesses.size() - 1; i >= 0; i--) {
                    ProcessRecord r = mService.mLruProcesses.get(i);
                    if (r != null && r.thread != null) {
                        int pid = r.pid;
                        if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {
                            if (r.persistent) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
                            } else if (r.treatLikeActivity) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
                            } else {
                                lastPids.put(pid, Boolean.TRUE);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
                            }
                        }
                    }
                }
            }
        }

        //输出ANR异常Log

        // Log the ANR to the main log.
        StringBuilder info = new StringBuilder();
        info.setLength(0);
        //ANR发生时的进程Name
        info.append("ANR in ").append(app.processName);
        if (activity != null && activity.shortComponentName != null) {
            info.append(" (").append(activity.shortComponentName).append(")");
        }
        info.append("\n");
        //进程ID
        info.append("PID: ").append(app.pid).append("\n");
        //ANR发生的原因
        if (annotation != null) {
            info.append("Reason: ").append(annotation).append("\n");
        }
        if (parent != null && parent != activity) {
            info.append("Parent: ").append(parent.shortComponentName).append("\n");
        }

        ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

        // don't dump native PIDs for background ANRs unless it is the process of interest
        String[] nativeProcs = null;
        if (isSilentANR) {
            for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
                if (NATIVE_STACKS_OF_INTEREST[i].equals(app.processName)) {
                    nativeProcs = new String[] { app.processName };
                    break;
                }
            }
        } else {
            nativeProcs = NATIVE_STACKS_OF_INTEREST;
        }

        int[] pids = nativeProcs == null ? null : Process.getPidsForCommands(nativeProcs);
        ArrayList<Integer> nativePids = null;

        if (pids != null) {
            nativePids = new ArrayList<Integer>(pids.length);
            for (int i : pids) {
                nativePids.add(i);
            }
        }

        // For background ANRs, don't pass the ProcessCpuTracker to
        // avoid spending 1/2 second collecting stats to rank lastPids.
        //dump栈信息
        File tracesFile = mService.dumpStackTraces(true, firstPids,
                                                   (isSilentANR) ? null : processCpuTracker,
                                                   (isSilentANR) ? null : lastPids,
                                                   nativePids);

        String cpuInfo = null;
        if (ActivityManagerService.MONITOR_CPU_USAGE) {
            mService.updateCpuStatsNow();
            synchronized (mService.mProcessCpuTracker) {
                //各进程使用CPU情况
                cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
            }
            //CPU当前负载
            info.append(processCpuTracker.printCurrentLoad());
            info.append(cpuInfo);
        }

        info.append(processCpuTracker.printCurrentState(anrTime));

        Slog.e(TAG, info.toString());
        if (tracesFile == null) {
            // There is no trace file, so dump (only) the alleged culprit's threads to the log
            Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
        }
        //将anr信息添加到dropbox
        mService.addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
                cpuInfo, tracesFile, null);

        if (mService.mController != null) {
            try {
                // 0 == show dialog, 1 = keep waiting, -1 = kill process immediately
                int res = mService.mController.appNotResponding(
                        app.processName, app.pid, info.toString());
                if (res != 0) {
                    if (res < 0 && app.pid != MY_PID) {
                        app.kill("anr", true);
                    } else {
                        synchronized (mService) {
                            mService.mServices.scheduleServiceTimeoutLocked(app);
                        }
                    }
                    return;
                }
            } catch (RemoteException e) {
                mService.mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

        synchronized (mService) {
            mService.mBatteryStatsService.noteProcessAnr(app.processName, app.uid);

            if (isSilentANR) {
                app.kill("bg anr", true);
                return;
            }

            // Set the app's notResponding state, and look up the errorReportReceiver
            makeAppNotRespondingLocked(app,
                    activity != null ? activity.shortComponentName : null,
                    annotation != null ? "ANR " + annotation : "ANR",
                    info.toString());

            通过Handler消息机制弹出ANR对话框
            // Bring up the infamous App Not Responding dialog
            Message msg = Message.obtain();
            HashMap<String, Object> map = new HashMap<String, Object>();
            msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
            msg.obj = map;
            msg.arg1 = aboveSystem ? 1 : 0;
            map.put("app", app);
            if (activity != null) {
                map.put("activity", activity);
            }

            mService.mUiHandler.sendMessage(msg);
        }
    }

Android中的ANR监测以Handler消息机制实现,使用观察者模式,当开始监测时注册消息(该消息在规定时间后执行),事件处理完之后移除消息,当该消息执行时说明事件处理超过了规定的时间,即ANR。

六、总结

当ANR发生时,我们可以先通过logcat定位ANR的类型,然后通过trace信息分析产生ANR的原因,需要重点关注主线程(main)的当前状态和CPU的使用情况!当然ANR还是以预防为主,切记不要在主线程做耗时操作,不管是主动发起还是调用系统方法都需要对耗时的地方进行评估!