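Android 10 System Startup: The Zygote Process in Detail

This article walks through how the Zygote process starts during Android 10 system boot.

Published 2020-03-11

1. Overview

The previous article covered the full startup flow of the init process. Once init is running, the most important process it starts is Zygote, the ancestor of every application. SystemServer and all other Dalvik (ART) virtual machine processes are forked from Zygote.

Zygote is started from app_process and follows a client/server model: the Zygote process is the server, and other processes act as clients that send it fork requests. When Zygote receives such a request, it forks a new process.

Because Zygote creates the Java virtual machine when it starts, the application processes and the SystemServer process created by fork each get their own copy of that virtual machine instance.

2. Core source files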

/system/core/rootdir/init.rc
/system/core/init/main.cpp
/system/core/init/init.cpp
/system/core/rootdir/init.zygote64_32.rc
/frameworks/base/cmds/app_process/app_main.cpp
/frameworks/base/core/jni/AndroidRuntime.cpp
/libnativehelper/JniInvocation.cpp
/frameworks/base/core/java/com/android/internal/os/Zygote.java
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
/frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
/frameworks/base/core/java/android/net/LocalServerSocket.java
/system/core/libutils/Threads.cpp

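3. Architecture

3.1 Architecture diagram

3.2 How Zygote is started

rc parsing and process launch

After init starts, it parses init.rc and then creates and launches the processes declared by service entries. Zygote is launched by init in exactly this way.

/system/core/rootdir/init.rc pulls in the Zygote rc file with the following import:

import /init.${ro.zygote}.rc

The value of ${ro.zygote} is set by each device vendor. Most mainstream vendors currently use zygote64_32, so the rc file discussed here is init.zygote64_32.rc.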
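
As a rough illustration of how that import line resolves, the sketch below substitutes a property value into an rc path. The class RcImportExpander, its expand() method, and the property map are hypothetical names made up for this example; init's real parser is written in C++ and is not reproduced here.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not init's real parser: expands ${prop} references in an rc import path.
class RcImportExpander {
    // Matches ${name} and captures the property name.
    private static final Pattern PROP_REF = Pattern.compile("\\$\\{([^}]+)\\}");

    static String expand(String path, Map<String, String> props) {
        Matcher m = PROP_REF.matcher(path);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("property not set: " + m.group(1));
            }
            // quoteReplacement keeps '$' or '\' in the value from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed property value, matching the zygote64_32 case discussed in the text.
        Map<String, String> props = Map.of("ro.zygote", "zygote64_32");
        System.out.println(expand("/init.${ro.zygote}.rc", props)); // prints /init.zygote64_32.rc
    }
}
```

With ro.zygote set to zygote64_32, the import resolves to /init.zygote64_32.rc, which is the file examined in the next section.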

3.2.1 init.zygote64_32.rc

The first Zygote process

Process name: zygote

Started from /system/bin/app_process64

Startup arguments: -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote

Socket name: zygote

   service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
            class main
            priority -20
            user root                                   // run as user root
            group root readproc reserved_disk           // groups: root, readproc, reserved_disk
            socket zygote stream 660 root system        // create a Unix domain stream socket named zygote; it appears as /dev/socket/zygote
            socket usap_pool_primary stream 660 root system
            onrestart write /sys/android_power/request_state wake       // onrestart: run the command that follows when this service restarts
            onrestart write /sys/power/state on
            onrestart restart audioserver
            onrestart restart cameraserver
            onrestart restart media
            onrestart restart netd
            onrestart restart wificond
            onrestart restart vendor.servicetracker-1-1
            writepid /dev/cpuset/foreground/tasks       // when the service process is forked, write its pid to /dev/cpuset/foreground/tasks

The second Zygote process:

Process name: zygote_secondary

Started from /system/bin/app_process32

Startup arguments: -Xzygote /system/bin --zygote --socket-name=zygote_secondary --enable-lazy-preload

Socket name: zygote_secondary

     service zygote_secondary /system/bin/app_process32 -Xzygote /system/bin --zygote --socket-name=zygote_secondary --enable-lazy-preload
            class main
            priority -20
            user root
            group root readproc reserved_disk
            socket zygote_secondary stream 660 root system      // create a Unix domain stream socket named zygote_secondary; it appears as /dev/socket/zygote_secondary
            socket usap_pool_secondary stream 660 root system
            onrestart restart zygote
            writepid /dev/cpuset/foreground/tasks

As shown above, zygote is started from the executables /system/bin/app_process64 and /system/bin/app_process32. The corresponding code entry point is:

frameworks/base/cmds/app_process/app_main.cpp

3.2.2 When is the Zygote process restarted

To find out when Zygote is restarted, look for the line "restart zygote" in the rc files. Searching the whole Android source tree for "restart zygote" turns up the following files:

/frameworks/native/services/inputflinger/host/inputflinger.rc      process: inputflinger
/frameworks/native/cmds/servicemanager/servicemanager.rc           process: servicemanager
/frameworks/native/services/surfaceflinger/surfaceflinger.rc       process: surfaceflinger
/system/netd/server/netd.rc                                        process: netd

From these files, the zygote process can be restarted when:

  1. the inputflinger process is killed (onrestart)
  2. the servicemanager process is killed (onrestart)
  3. the surfaceflinger process is killed (onrestart)
  4. the netd process is killed (onrestart)
  5. the zygote process itself is killed (oneshot=false)
  6. the system_server process is killed (waitpid)

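3.3 What Zygote does after it starts

Zygote startup sequence:

  1. init uses init.zygote64_32.rc to launch /system/bin/app_process64, which starts the zygote process; the entry point is app_main.cpp.
  2. AndroidRuntime's startVm() creates the virtual machine, and startReg() then registers the JNI functions.
  3. ZygoteInit.main() is called through JNI, entering the Java world for the first time.
  4. The ZygoteServer constructor sets up the socket channel; zygote acts as the server side and responds to client requests.
  5. preload() preloads common classes, drawable and color resources, OpenGL, shared libraries, and WebView to speed up app startup.
  6. With most of its work done, zygote calls forkSystemServer() to fork its key helper, the system_server process, which hosts the upper-layer framework.
  7. Its job complete, zygote calls runSelectLoop() and stands by; when a request to create a new process arrives, it wakes up immediately and handles it.

3.4 Main functions involved in Zygote startup:

C space: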

[app_main.cpp] main()
[AndroidRuntime.cpp] start()
[JniInvocation.cpp] Init()
[AndroidRuntime.cpp] startVm()
[AndroidRuntime.cpp] startReg()
[Threads.cpp] androidSetCreateThreadFunc
[AndroidRuntime.cpp] register_jni_procs()    --> gRegJNI.mProc

Java space:

[ZygoteInit.java] main()
[ZygoteInit.java] preload()
[ZygoteServer.java] ZygoteServer
[ZygoteInit.java] forkSystemServer
[Zygote.java] forkSystemServer
[Zygote.java] nativeForkSystemServer
[ZygoteServer.java] runSelectLoop

4. Zygote startup source code analysis

The analysis below is based on the Zygote startup code in Android Q (10.0).

4.1 Native (C/C++) world: main call flow of Zygote startup:

4.1.1 [app_main.cpp] main()

int main(int argc, char* const argv[])
{
    // zygote is started with argv "-Xzygote /system/bin --zygote --start-system-server --socket-name=zygote"
    // zygote_secondary is started with argv "-Xzygote /system/bin --zygote --socket-name=zygote_secondary --enable-lazy-preload"
	...
    while (i < argc) {
        const char* arg = argv[i++];
        if (strcmp(arg, "--zygote") == 0) {
            zygote = true;
            // the nice name is zygote64 on 64-bit and zygote on 32-bit
            niceName = ZYGOTE_NICE_NAME;
        } else if (strcmp(arg, "--start-system-server") == 0) {
            // whether system_server should be started
            startSystemServer = true;
        } else if (strcmp(arg, "--application") == 0) {
            // run in standalone application mode
            application = true;
        } else if (strncmp(arg, "--nice-name=", 12) == 0) {
            // niceName is the process alias, used to tell the ABI variants apart
            niceName.setTo(arg + 12);
        } else if (strncmp(arg, "--", 2) != 0) {
            className.setTo(arg);
            break;
        } else {
            --i;
            break;
        }
    }
    ...
    if (!className.isEmpty()) { // className is set: application launch mode
        ...
    } else {
        // zygote mode: create the Dalvik cache directory /data/dalvik-cache
        maybeCreateDalvikCache();
        if (startSystemServer) { // pass start-system-server on to ZygoteInit
            args.add(String8("start-system-server"));
        }
        String8 abiFlag("--abi-list=");
        abiFlag.append(prop);
        args.add(abiFlag);    // append the --abi-list= argument
        // In zygote mode, pass all remaining arguments to the zygote
        // main() method.
        for (; i < argc; ++i) {    // append the remaining arguments to args
			args.add(String8(argv[i]));
		}
	}
	...
    if (!niceName.isEmpty()) {
        // set a friendlier process name, zygote or zygote64; until now the name was app_process
        runtime.setArgv0(niceName.string(), true /* setProcName */);
    }
    if (zygote) {    // zygote launch mode: load ZygoteInit
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {    // application launch mode: load RuntimeInit
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        // neither a class name nor --zygote was given: invalid arguments
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}

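Zygote itself is a native program. Its process name is initially "app_process"; while running it calls setArgv0 to rename itself to zygote or zygote64 (depending on whether the binary is 32-bit or 64-bit), and finally calls the runtime's start() method, which actually loads the virtual machine and enters the Java world.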

4.1.2 [AndroidRuntime.cpp] start()

void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
    ALOGD(">>>>>> START %s uid %d <<<<<<\n",
            className != NULL ? className : "(unknown)", getuid());
	...
    JniInvocation jni_invocation;
    jni_invocation.Init(NULL);
    JNIEnv* env;
    // Create the virtual machine; this is mostly about setting VM options
    if (startVm(&mJavaVM, &env, zygote, primary_zygote) != 0) {
        return;
    }
    onVmCreated(env);    // no-op in the base AndroidRuntime class

    // Register the JNI methods
    if (startReg(env) < 0) {
        ALOGE("Unable to register all android natives\n");
        return;
    }

    jclass stringClass;
    jobjectArray strArray;
    jstring classNameStr;

    // Equivalent to: strArray = new String[options.size() + 1];
    stringClass = env->FindClass("java/lang/String");
    assert(stringClass != NULL);

    // Equivalent to: strArray[0] = "com.android.internal.os.ZygoteInit"
    strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
    assert(strArray != NULL);
    classNameStr = env->NewStringUTF(className);
    assert(classNameStr != NULL);
    env->SetObjectArrayElement(strArray, 0, classNameStr);

    // strArray[1] = "start-system-server";
    // strArray[2] = "--abi-list=xxx";
    // where xxx is the list of CPU ABIs the system supports, e.g. arm64-v8a.
    for (size_t i = 0; i < options.size(); ++i) {
        jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
        assert(optionsStr != NULL);
        env->SetObjectArrayElement(strArray, i + 1, optionsStr);
    }

    // Convert "com.android.internal.os.ZygoteInit" to "com/android/internal/os/ZygoteInit"
    char* slashClassName = toSlashClassName(className != NULL ? className : "");
    jclass startClass = env->FindClass(slashClassName);
    // Look up the ZygoteInit class
    if (startClass == NULL) {
        ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
    } else {
        // With the class found, look up the method ID of its static main method
        jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
            "([Ljava/lang/String;)V");
        if (startMeth == NULL) {
            ALOGE("JavaVM unable to find main() in '%s'\n", className);
        } else {
            // Call ZygoteInit.main() through JNI
            env->CallStaticVoidMethod(startClass, startMeth, strArray);
        }
    }
    // Release the memory held by these objects
    free(slashClassName);
    ALOGD("Shutting down VM\n");
    if (mJavaVM->DetachCurrentThread() != JNI_OK)
        ALOGW("Warning: unable to detach main thread\n");
    if (mJavaVM->DestroyJavaVM() != 0)
        ALOGW("Warning: VM did not shut down cleanly\n");
}

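start() does three main things: it calls startVm to start the virtual machine, calls startReg to register the JNI methods, and then uses JNI to call ZygoteInit.main(), which gets the Zygote process going.

Related log: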

01-10 11:20:31.369 722 722 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<
01-10 11:20:31.429 722 722 I AndroidRuntime: Using default boot image
01-10 11:20:31.429 722 722 I AndroidRuntime: Leaving lock profiling enabled
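
The argument handling inside start() can be mirrored in plain Java to make the resulting array easier to picture. This is only a sketch: StartArgsSketch and its two methods are invented for illustration, and the ABI value arm64-v8a is an assumed example rather than something read from a device.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: mirrors how start() prepares the String[] handed to ZygoteInit.main().
class StartArgsSketch {
    // Same idea as toSlashClassName(): JNI's FindClass expects '/' separators.
    static String toSlashClassName(String className) {
        return className.replace('.', '/');
    }

    // Element 0 is the class name; the options follow in order.
    static String[] buildMainArgs(String className, List<String> options) {
        String[] argv = new String[options.size() + 1];
        argv[0] = className;
        for (int i = 0; i < options.size(); i++) {
            argv[i + 1] = options.get(i);
        }
        return argv;
    }

    public static void main(String[] args) {
        String className = "com.android.internal.os.ZygoteInit";
        // Assumed option values for the primary zygote.
        List<String> options = List.of(
                "start-system-server", "--abi-list=arm64-v8a", "--socket-name=zygote");
        System.out.println(toSlashClassName(className));
        System.out.println(Arrays.toString(buildMainArgs(className, options)));
    }
}
```

Because element 0 holds the class name, a receiver of this array has to start reading its real options at index 1.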

4.1.3 [JniInvocation.cpp] Init()

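Init mainly initializes JNI. It first loads libart.so with dlopen to obtain a handle, then uses dlsym to look up the addresses of three functions in libart.so: JNI_GetDefaultJavaVMInitArgs, JNI_CreateJavaVM, and JNI_GetCreatedJavaVMs. The addresses are stored in the corresponding member fields, and the three functions are called later when the virtual machine is created.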

bool JniInvocation::Init(const char* library) {
  char buffer[PROP_VALUE_MAX];
  const int kDlopenFlags = RTLD_NOW | RTLD_NODELETE;
  /*
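   * 1. dlopen opens the given shared library in the given mode and returns a handle.
   * 2. RTLD_NOW means all undefined symbols must be resolved before dlopen returns; if any cannot be resolved, dlopen returns NULL.
   * 3. RTLD_NODELETE means the library is not unloaded on dlclose(), and its static variables are not reinitialized if it is later reloaded with dlopen().
   */
  handle_ = dlopen(library, kDlopenFlags); // get the handle of libart.so
  if (handle_ == NULL) { // on failure, retry with the fallback library, or log an error if that was already the fallback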
    if (strcmp(library, kLibraryFallback) == 0) {
      // Nothing else to try.
      ALOGE("Failed to dlopen %s: %s", library, dlerror());
      return false;
    }
    library = kLibraryFallback;
    handle_ = dlopen(library, kDlopenFlags);
    if (handle_ == NULL) {
      ALOGE("Failed to dlopen %s: %s", library, dlerror());
      return false;
    }
  }
  /*
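   * 1. FindSymbol calls dlsym internally.
   * 2. dlsym takes a shared-library handle and a symbol name and returns the address of that symbol.
   * 3. Here the addresses of JNI_GetDefaultJavaVMInitArgs and the other two functions in libart.so are stored into JNI_GetDefaultJavaVMInitArgs_ and its sibling members.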
   */
  if (!FindSymbol(reinterpret_cast<void**>(&JNI_GetDefaultJavaVMInitArgs_),
                  "JNI_GetDefaultJavaVMInitArgs")) {
    return false;
  }
  if (!FindSymbol(reinterpret_cast<void**>(&JNI_CreateJavaVM_),
                  "JNI_CreateJavaVM")) {
    return false;
  }
  if (!FindSymbol(reinterpret_cast<void**>(&JNI_GetCreatedJavaVMs_),
                  "JNI_GetCreatedJavaVMs")) {
    return false;
  }
  return true;
}

4.1.4 [AndroidRuntime.cpp] startVm()

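This function mainly configures the virtual machine options and then calls JNI_CreateJavaVM_, obtained earlier during JniInvocation initialization, to start the virtual machine.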

int AndroidRuntime::startVm(JavaVM** pJavaVM, JNIEnv** pEnv, bool zygote, bool primary_zygote)
{
    JavaVMInitArgs initArgs;
	...
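    // CheckJNI: performs extra checks when native code calls JNI functions, for example whether string formats are valid and whether resources are released correctly.
    // It is usually enabled only for early bring-up or on eng builds; user builds normally leave it off because it costs CPU time and lowers system performance.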
    bool checkJni = false;
    property_get("dalvik.vm.checkjni", propBuf, "");
    if (strcmp(propBuf, "true") == 0) {
        checkJni = true;
    } else if (strcmp(propBuf, "false") != 0) {
        /* property is neither true nor false; fall back on kernel parameter */
        property_get("ro.kernel.android.checkjni", propBuf, "");
        if (propBuf[0] == '1') {
            checkJni = true;
        }
    }
    ALOGV("CheckJNI is %s\n", checkJni ? "ON" : "OFF");
    if (checkJni) {
        /* extended JNI checking */
        addOption("-Xcheck:jni");
    }

    addOption("exit", (void*) runtime_exit); // store the option in the mOptions array

    // These values usually need tuning per hardware and software configuration to get the best system performance
    parseRuntimeOption("dalvik.vm.heapstartsize", heapstartsizeOptsBuf, "-Xms", "4m");
    parseRuntimeOption("dalvik.vm.heapsize", heapsizeOptsBuf, "-Xmx", "16m");
    parseRuntimeOption("dalvik.vm.heapgrowthlimit", heapgrowthlimitOptsBuf, "-XX:HeapGrowthLimit=");
    parseRuntimeOption("dalvik.vm.heapminfree", heapminfreeOptsBuf, "-XX:HeapMinFree=");
    parseRuntimeOption("dalvik.vm.heapmaxfree", heapmaxfreeOptsBuf, "-XX:HeapMaxFree=");
	
	...
	
    // Read the build fingerprint and pass it to the runtime so that ANR dumps contain the fingerprint and can be parsed.
    std::string fingerprint = GetProperty("ro.build.fingerprint", "");
    if (!fingerprint.empty()) {
        fingerprintBuf = "-Xfingerprint:" + fingerprint;
        addOption(fingerprintBuf.c_str());
    }

    initArgs.version = JNI_VERSION_1_4;
    initArgs.options = mOptions.editArray(); // hand the mOptions array to initArgs
    initArgs.nOptions = mOptions.size();
    initArgs.ignoreUnrecognized = JNI_FALSE;

    // Call JNI_CreateJavaVM_, resolved earlier by JniInvocation, see [4.1.3]
    if (JNI_CreateJavaVM(pJavaVM, pEnv, &initArgs) < 0) {
        ALOGE("JNI_CreateJavaVM failed\n");
        return -1;
    }

    return 0;
}

4.1.5 [AndroidRuntime.cpp] startReg()

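startReg first sets the function Android uses to create threads, then pushes a local reference frame with room for 200 references so that the local references created during registration are released together and do not exhaust memory, and finally calls register_jni_procs to register the JNI methods.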

int AndroidRuntime::startReg(JNIEnv* env)
{
    ATRACE_NAME("RegisterAndroidNatives");
    // Set javaCreateThreadEtc as Android's thread-creation function; it ultimately creates the thread through Linux clone
    androidSetCreateThreadFunc((android_create_thread_fn) javaCreateThreadEtc);

    ALOGV("--- registering native functions ---\n");

    // Push a local reference frame with capacity for 200 local references
    env->PushLocalFrame(200);

    // Register the JNI methods
    if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {
        env->PopLocalFrame(NULL);
        return -1;
    }
    env->PopLocalFrame(NULL); // release the local reference frame
    return 0;
}

4.1.6 [Threads.cpp] androidSetCreateThreadFunc()

During startReg(), after the virtual machine has started, the thread-creation function pointer gCreateThreadFn is set to point at javaCreateThreadEtc.

void androidSetCreateThreadFunc(android_create_thread_fn func) {
    gCreateThreadFn = func;
}

4.1.7 [AndroidRuntime.cpp] register_jni_procs()

The work is delegated to the mProc member of RegJNIRec. RegJNIRec is a very simple struct, and mProc is a function pointer.

static int register_jni_procs(const RegJNIRec array[], size_t count, JNIEnv* env)
{
    for (size_t i = 0; i < count; i++) {
        if (array[i].mProc(env) < 0) { // call each gRegJNI entry's mProc, see [4.1.8]
            return -1;
        }
    }
    return 0;
}

4.1.8 [AndroidRuntime.cpp] gRegJNI[]

static const RegJNIRec gRegJNI[] = {
    REG_JNI(register_com_android_internal_os_RuntimeInit),
    REG_JNI(register_com_android_internal_os_ZygoteInit_nativeZygoteInit),
    REG_JNI(register_android_os_SystemClock),
    REG_JNI(register_android_util_EventLog),
    REG_JNI(register_android_util_Log),
	...
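};

    #define REG_JNI(name)      { name }
    struct RegJNIRec {
        int (*mProc)(JNIEnv*);
    };

gRegJNI is a table of function pointers, so looping over it and calling each entry's mProc is equivalent to calling the registration function that entry points to.
For example: register_com_android_internal_os_RuntimeInit
This is the standard pattern for dynamically registering JNI functions.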

int register_com_android_internal_os_RuntimeInit(JNIEnv* env)
{
    const JNINativeMethod methods[] = {
        { "nativeFinishInit", "()V",
            (void*) com_android_internal_os_RuntimeInit_nativeFinishInit },
        { "nativeSetExitWithoutCleanup", "(Z)V",
            (void*) com_android_internal_os_RuntimeInit_nativeSetExitWithoutCleanup },
    };

    // These entries map one-to-one to native methods such as nativeFinishInit() declared in com/android/internal/os/RuntimeInit.java on the Java side
    return jniRegisterNativeMethods(env, "com/android/internal/os/RuntimeInit",
        methods, NELEM(methods));
}
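
The table-of-function-pointers idea from 4.1.7 and 4.1.8 can be sketched in Java with functional interfaces. Everything in this block (RegistrationTableSketch, Env, the register* methods) is a hypothetical stand-in chosen for illustration; it imitates the shape of register_jni_procs() and does not touch real JNI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Illustrative only: a Java analogue of the gRegJNI table; all names here are hypothetical.
class RegistrationTableSketch {
    // Stand-in for JNIEnv: it just records which registrations ran.
    static final class Env {
        final List<String> registered = new ArrayList<>();
    }

    static int registerRuntimeInit(Env env) { env.registered.add("RuntimeInit"); return 0; }
    static int registerSystemClock(Env env) { env.registered.add("SystemClock"); return 0; }
    static int registerFailing(Env env) { return -1; } // simulates a failed registration

    // Same shape as register_jni_procs(): call every entry, stop at the first failure.
    static int registerAll(List<ToIntFunction<Env>> table, Env env) {
        for (ToIntFunction<Env> proc : table) {
            if (proc.applyAsInt(env) < 0) {
                return -1;
            }
        }
        return 0;
    }

    // A small table of "function pointers", playing the role of gRegJNI.
    static List<ToIntFunction<Env>> goodTable() {
        List<ToIntFunction<Env>> table = new ArrayList<>();
        table.add(RegistrationTableSketch::registerRuntimeInit);
        table.add(RegistrationTableSketch::registerSystemClock);
        return table;
    }

    public static void main(String[] args) {
        Env env = new Env();
        System.out.println(registerAll(goodTable(), env) + " " + env.registered);
    }
}
```

Keeping the registrations in a table means adding a new native module only requires one more table entry; the loop itself never changes.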

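4.2 Java world: main call flow of Zygote startup:

In the previous section, JNI called ZygoteInit's main function and Zygote entered the Java framework layer. No code had run in the Java framework layer before this point; in other words, Zygote is what brings the Java framework layer into being.

4.2.1 [ZygoteInit.java] main()

Code path: frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

The main work done by main():

  1. Call preload() to preload classes and resources.
  2. Call the ZygoteServer constructor to create the server-side socket: /dev/socket/zygote in the primary zygote and /dev/socket/zygote_secondary in the secondary one. The socket waits for ActivityManagerService to ask Zygote to create new application processes.
  3. Call forkSystemServer to start the SystemServer process, which in turn starts the key system services.
  4. Finally call runSelectLoop to wait for client requests.

The sections below go through these four steps.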

public static void main(String argv[]) {
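        // 1. Declare the ZygoteServer
        ZygoteServer zygoteServer = null;

        // Call into native code to mark zygote start, so that no other threads run during startup
        ZygoteHooks.startZygoteNoThreadCreation();

        // setpgid(0, 0): put Zygote into its own process group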
        Os.setpgid(0, 0);
        ......
        Runnable caller;
        try {
            ......
            // Pick the systrace tag
            String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
            TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
                    Trace.TRACE_TAG_DALVIK);
            // Trace ZygoteInit with systrace so it can be analysed with the systrace tool
            // traceBegin and traceEnd must come in pairs and use the same tag
            bootTimingsTraceLog.traceBegin("ZygoteInit");

            // Enable DDMS (Dalvik Debug Monitor Service) support
            // Registers handlers for all known Java VM chunk types: thread monitoring, heap monitoring, native heap monitoring, debug-mode monitoring, and so on
            RuntimeInit.enableDdms();

            boolean startSystemServer = false;
            String zygoteSocketName = "zygote";
            String abiList = null;
            boolean enableLazyPreload = false;
            
            // 2. Parse the arguments prepared in app_main.cpp and passed in through start()
            for (int i = 1; i < argv.length; i++) {
                if ("start-system-server".equals(argv[i])) {
                    startSystemServer = true; // start-system-server is passed only when starting zygote
                } else if ("--enable-lazy-preload".equals(argv[i])) {
                    enableLazyPreload = true; // --enable-lazy-preload is passed only when starting zygote_secondary
                } else if (argv[i].startsWith(ABI_LIST_ARG)) { // value passed from native code, read from ro.product.cpu.abilist64 / ro.product.cpu.abilist32
                    abiList = argv[i].substring(ABI_LIST_ARG.length());
                } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                    zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length()); // one of two values: zygote or zygote_secondary
                } else {
                    throw new RuntimeException("Unknown command line argument: " + argv[i]);
                }
            }

            // Use the socket name to decide whether this is the primary zygote or zygote_secondary
            final boolean isPrimaryZygote = zygoteSocketName.equals(Zygote.PRIMARY_SOCKET_NAME);

            // For the primary zygote enableLazyPreload is false, so preload runs now
            if (!enableLazyPreload) {
                // systrace: begin ZygotePreload
                bootTimingsTraceLog.traceBegin("ZygotePreload");
                EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
                        SystemClock.uptimeMillis());
                // 3. Load the classes and resources, see [4.2.2]
                preload(bootTimingsTraceLog);
                EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
                        SystemClock.uptimeMillis());
                // systrace: end ZygotePreload
                bootTimingsTraceLog.traceEnd(); // ZygotePreload
            } else {
                // Lazy preload: reset the Zygote process priority to normal; preload happens on the first fork
                Zygote.resetNicePriority();
            }

            // End the ZygoteInit systrace section
            bootTimingsTraceLog.traceEnd(); // ZygoteInit
            // Disable tracing so that forked processes do not inherit stale tracing tags from zygote
            Trace.setTracingEnabled(false, 0);

            // 4. Call the ZygoteServer constructor to create the sockets; depending on the argument
            // this is /dev/socket/zygote or /dev/socket/zygote_secondary
            // see [4.2.3]
            zygoteServer = new ZygoteServer(isPrimaryZygote);

            if (startSystemServer) {
                // 5. fork system_server, see [4.2.4]
                Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

                // In the child (system_server) process: run SystemServer
                if (r != null) {
                    r.run();
                    return;
                }
            }

            // 6. The zygote process enters an endless loop and handles requests
            caller = zygoteServer.runSelectLoop(abiList);
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            throw ex;
        } finally {
            if (zygoteServer != null) {
                zygoteServer.closeServerSocket();
            }
        }

        // 7. We are in a child process and have left the select loop; go on and run the command
        if (caller != null) {
            caller.run();
        }
    }

Log:

01-10 11:20:32.219   722   722 D Zygote  : begin preload
01-10 11:20:32.219   722   722 I Zygote  : Calling ZygoteHooks.beginPreload()
01-10 11:20:32.249   722   722 I Zygote  : Preloading classes...
01-10 11:20:33.179   722   722 I Zygote  : ...preloaded 7587 classes in 926ms.
01-10 11:20:33.449   722   722 I Zygote  : Preloading resources...
01-10 11:20:33.459   722   722 I Zygote  : ...preloaded 64 resources in 17ms.
01-10 11:20:33.519   722   722 I Zygote  : Preloading shared libraries...
01-10 11:20:33.539   722   722 I Zygote  : Called ZygoteHooks.endPreload()
01-10 11:20:33.539   722   722 I Zygote  : Installed AndroidKeyStoreProvider in 1ms.
01-10 11:20:33.549   722   722 I Zygote  : Warmed up JCA providers in 11ms.
01-10 11:20:33.549   722   722 D Zygote  : end preload
01-10 11:20:33.649   722   722 D Zygote  : Forked child process 1607
01-10 11:20:33.649   722   722 I Zygote  : System server process 1607 has been created
01-10 11:20:33.649   722   722 I Zygote  : Accepting command socket connections
10-15 06:11:07.749   722   722 D Zygote  : Forked child process 2982
10-15 06:11:07.789   722   722 D Zygote  : Forked child process 3004

4.2.2 [ZygoteInit.java] preload()

static void preload(TimingsTraceLog bootTimingsTraceLog) {
        Log.d(TAG, "begin preload");

        beginPreload(); // pin ICU data, obtain charset conversion resources, etc.

        // List of classes to preload: /system/etc/preloaded-classes on the device, /frameworks/base/config/preloaded-classes in the source tree; Android 10.0 lists roughly 7603 classes
        // The log below shows 7587 classes loaded successfully
        preloadClasses();

        preloadResources(); // load drawables, colors and other resources; some are defined in /frameworks/base/core/res/res/values/arrays.xml
        ......
        preloadSharedLibraries();   // load libraries such as android, compiler_rt and jnigraphics
        preloadTextResources();    // initialize text resources

        WebViewFactory.prepareWebViewInZygote();    // initialize WebView
        endPreload();   // preload finished; see the log below
        warmUpJcaProviders();
        Log.d(TAG, "end preload");

        sPreloadComplete = true;
    }

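What is preloading:

Preloading means loading at zygote startup. The system performs the load only once, in zygote, and apps that use the resource do not need to load it again, which shortens resource loading time and speeds up app startup. In general, resources shared by apps across the system are listed for preloading.

When zygote forks a child process, fork's copy-on-write mechanism means that classes which are never modified do not even need to be copied; the child can share that data with the parent, saving a considerable amount of memory.

How preloading works:

After it starts, zygote reads the resources and stores them in a global static variable in Resources; later lookups of system resources check that static variable first.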
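
The "load once into a static variable, then look there first" idea can be shown with a toy cache. This is a minimal sketch under stated assumptions: PreloadCacheSketch, loadFromDisk(), and the resource names are all hypothetical, and the real Resources code is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy version of "preload into a static, consult the static first".
class PreloadCacheSketch {
    // Plays the role of the global static variable that holds preloaded resources.
    private static final Map<String, String> sPreloaded = new HashMap<>();
    // Counts how often the slow path runs, so the effect of the cache is observable.
    static int diskLoads = 0;

    // Hypothetical slow path standing in for reading and decoding a resource.
    static String loadFromDisk(String name) {
        diskLoads++;
        return "resource:" + name;
    }

    // Done once, at "zygote startup".
    static void preload(String... names) {
        for (String name : names) {
            sPreloaded.put(name, loadFromDisk(name));
        }
    }

    // Later lookups check the static cache first and fall back to the slow path.
    static String get(String name) {
        String cached = sPreloaded.get(name);
        return cached != null ? cached : loadFromDisk(name);
    }

    public static void main(String[] args) {
        preload("drawable/btn_default", "color/primary");
        get("drawable/btn_default");   // served from the cache, no extra load
        get("layout/app_specific");    // not preloaded, goes to the slow path
        System.out.println("disk loads: " + diskLoads);
    }
}
```

In the real system the saving is larger than this toy suggests, because every forked app process inherits the already-filled cache through copy-on-write instead of filling its own.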

frameworks/base/config/preloaded-classes:

Related logs:

01-10 11:20:32.219 722 722 D Zygote : begin preload
01-10 11:20:32.219 722 722 I Zygote : Calling ZygoteHooks.beginPreload()
01-10 11:20:32.249 722 722 I Zygote : Preloading classes...
01-10 11:20:33.179 722 722 I Zygote : ...preloaded 7587 classes in 926ms.
01-10 11:20:33.539 722 722 I Zygote : Called ZygoteHooks.endPreload()
01-10 11:20:33.549 722 722 D Zygote : end preload

4.2.3 [ZygoteServer.java] ZygoteServer()
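path: frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
Purpose: when constructed, ZygoteServer uses LocalServerSocket to create local server-side sockets according to the argument passed in; these are used to accept connections. Each zygote creates two, for four in total:
zygote and usap_pool_primary in the primary zygote, zygote_secondary and usap_pool_secondary in the secondary zygote.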

    private LocalServerSocket mZygoteSocket;
    private LocalServerSocket mUsapPoolSocket;

    // Create the zygote sockets
    ZygoteServer(boolean isPrimaryZygote) {
        mUsapPoolEventFD = Zygote.getUsapPoolEventFD();

        if (isPrimaryZygote) {
            // Get the socket object for the socket created by init; socket name: zygote
            mZygoteSocket = Zygote.createManagedSocketFromInitSocket(Zygote.PRIMARY_SOCKET_NAME);
            // Get the socket object for the socket created by init; socket name: usap_pool_primary
            mUsapPoolSocket =
                    Zygote.createManagedSocketFromInitSocket(
                            Zygote.USAP_POOL_PRIMARY_SOCKET_NAME);
        } else {
            // Get the socket object for the socket created by init; socket name: zygote_secondary
            mZygoteSocket = Zygote.createManagedSocketFromInitSocket(Zygote.SECONDARY_SOCKET_NAME);
            // Get the socket object for the socket created by init; socket name: usap_pool_secondary
            mUsapPoolSocket =
                    Zygote.createManagedSocketFromInitSocket(
                            Zygote.USAP_POOL_SECONDARY_SOCKET_NAME);
        }
        fetchUsapPoolPolicyProps();
        mUsapPoolSupported = true;
    }

    static LocalServerSocket createManagedSocketFromInitSocket(String socketName) {
        int fileDesc;
        // ANDROID_SOCKET_PREFIX is "ANDROID_SOCKET_"
        // If the argument is zygote, fullSocketName is ANDROID_SOCKET_zygote
        final String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;

        try {
            // init.zygote64_32.rc declares four sockets:
            // zygote, usap_pool_primary, zygote_secondary and usap_pool_secondary
            // When the service process is created, the matching file descriptors are created and published in environment variables
            // Read the matching environment variable here
            String env = System.getenv(fullSocketName);
            fileDesc = Integer.parseInt(env);
        } catch (RuntimeException ex) {
            throw new RuntimeException("Socket unset or invalid: " + fullSocketName, ex);
        }

        try {
            FileDescriptor fd = new FileDescriptor();
            fd.setInt$(fileDesc);   // point the FileDescriptor at the zygote socket's fd
            return new LocalServerSocket(fd);   // create the local server-side socket
        } catch (IOException ex) {
            throw new RuntimeException(
                "Error building socket from file descriptor: " + fileDesc, ex);
        }
    }

    path: frameworks/base/core/java/android/net/LocalServerSocket.java
    public LocalServerSocket(FileDescriptor fd) throws IOException
    {
        impl = new LocalSocketImpl(fd);
        impl.listen(LISTEN_BACKLOG);
        localAddress = impl.getSockAddress();
    }
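
The lookup done by createManagedSocketFromInitSocket() boils down to building an environment variable name and parsing its value as an integer. The sketch below isolates that step. It takes the environment as a Map so it can run anywhere; InitSocketEnvSketch, socketFdFromEnv(), and the descriptor number 17 are hypothetical and only serve the illustration.

```java
import java.util.Map;

// Illustrative only: isolates the "env var name -> file descriptor number" step.
class InitSocketEnvSketch {
    static final String ANDROID_SOCKET_PREFIX = "ANDROID_SOCKET_";

    // env is passed in (instead of calling System.getenv) so the sketch is testable off-device.
    static int socketFdFromEnv(String socketName, Map<String, String> env) {
        String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
        try {
            // Integer.parseInt throws NumberFormatException for null or non-numeric input.
            return Integer.parseInt(env.get(fullSocketName));
        } catch (NumberFormatException ex) {
            throw new RuntimeException("Socket unset or invalid: " + fullSocketName, ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical environment as a service process might see it; 17 is a made-up fd.
        Map<String, String> env = Map.of("ANDROID_SOCKET_zygote", "17");
        System.out.println(socketFdFromEnv("zygote", env)); // prints 17
    }
}
```

Passing the descriptor number through the environment lets the child process pick up a socket that its parent already created and bound, without needing the permissions to create it itself.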

4.2.4 [ZygoteInit.java] forkSystemServer()

  private static Runnable forkSystemServer(String abiList, String socketName,
          ZygoteServer zygoteServer) {
      
      long capabilities = posixCapabilitiesAsBits(
              OsConstants.CAP_IPC_LOCK,
              OsConstants.CAP_KILL,
              OsConstants.CAP_NET_ADMIN,
              OsConstants.CAP_NET_BIND_SERVICE,
              OsConstants.CAP_NET_BROADCAST,
              OsConstants.CAP_NET_RAW,
              OsConstants.CAP_SYS_MODULE,
              OsConstants.CAP_SYS_NICE,
              OsConstants.CAP_SYS_PTRACE,
              OsConstants.CAP_SYS_TIME,
              OsConstants.CAP_SYS_TTY_CONFIG,
              OsConstants.CAP_WAKE_ALARM,
              OsConstants.CAP_BLOCK_SUSPEND
      );
      ......
      // Prepare the arguments
      /* Hardcoded command line to start the system server */
      String args[] = {
              "--setuid=1000",
              "--setgid=1000",
              "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,"
                      + "1024,1032,1065,3001,3002,3003,3006,3007,3009,3010",
              "--capabilities=" + capabilities + "," + capabilities,
              "--nice-name=system_server",
              "--runtime-args",
              "--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
              "com.android.server.SystemServer",
      };
      ZygoteArguments parsedArgs = null;

      int pid;

      try {
          // Wrap the arguments prepared above in a ZygoteArguments object
          parsedArgs = new ZygoteArguments(args);
          Zygote.applyDebuggerSystemProperty(parsedArgs);
          Zygote.applyInvokeWithSystemProperty(parsedArgs);

          boolean profileSystemServer = SystemProperties.getBoolean(
                  "dalvik.vm.profilesystemserver", false);
          if (profileSystemServer) {
              parsedArgs.mRuntimeFlags |= Zygote.PROFILE_SYSTEM_SERVER;
          }

          // fork the system_server child process
          /* Request to fork the system server process */
          pid = Zygote.forkSystemServer(
                  parsedArgs.mUid, parsedArgs.mGid,
                  parsedArgs.mGids,
                  parsedArgs.mRuntimeFlags,
                  null,
                  parsedArgs.mPermittedCapabilities,
                  parsedArgs.mEffectiveCapabilities);
      } catch (IllegalArgumentException ex) {
          throw new RuntimeException(ex);
      }

      // We are now in the system_server child process
      /* For child process */
      if (pid == 0) {
          // Handle the 32_64 and 64_32 configurations
          if (hasSecondZygote(abiList)) {
              waitForSecondaryZygote(socketName);
          }

          // fork copies the socket, so system_server has to close its copy explicitly
          zygoteServer.closeServerSocket();
          // The system_server process goes on with its own work
          return handleSystemServerProcess(parsedArgs);
      }

      return null;
  }


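In the newly forked child process, ZygoteInit.forkSystemServer() calls handleSystemServerProcess().

This mainly returns a MethodAndArgsCaller from RuntimeInit.java as a Runnable; r.run() then invokes the main method of com.android.server.SystemServer.

This is covered in detail in the later chapter on SystemServer.

Call flow of handleSystemServerProcess: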
handleSystemServerProcess()
    |
    [ZygoteInit.java]
    zygoteInit()
        |
    [RuntimeInit.java]
    applicationInit()
        |
    findStaticMain()
        |
    MethodAndArgsCaller()
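
The capabilities value built at the top of forkSystemServer() is a bit mask with one bit per capability. The sketch below shows that idea with three capabilities, using the numeric values defined in Linux's <linux/capability.h>. CapabilityBitsSketch and its methods are hypothetical names; this is not the framework's own posixCapabilitiesAsBits(), whose body is not shown in this article.

```java
// Illustrative only: packs capability numbers into a bit mask, one bit per capability.
class CapabilityBitsSketch {
    // Numeric values as defined in Linux's <linux/capability.h>.
    static final int CAP_KILL = 5;
    static final int CAP_NET_BIND_SERVICE = 10;
    static final int CAP_SYS_NICE = 23;

    static long capabilitiesAsBits(int... capabilities) {
        long result = 0;
        for (int capability : capabilities) {
            if (capability < 0 || capability > 63) {
                throw new IllegalArgumentException("capability out of range: " + capability);
            }
            result |= (1L << capability); // set the bit for this capability
        }
        return result;
    }

    // Builds the argument in the same "permitted,effective" shape used in the args array above.
    static String capabilitiesArg(long bits) {
        return "--capabilities=" + bits + "," + bits;
    }

    public static void main(String[] args) {
        long bits = capabilitiesAsBits(CAP_KILL, CAP_NET_BIND_SERVICE, CAP_SYS_NICE);
        System.out.println(capabilitiesArg(bits));
    }
}
```

The same value appears twice in the argument because the permitted and effective capability sets are set to the same mask.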

4.2.5 [ZygoteServer.java] runSelectLoop()

Code path: frameworks/base/core/java/com/android/internal/os/ZygoteServer.java

 Runnable runSelectLoop(String abiList) {
        ArrayList<FileDescriptor> socketFDs = new ArrayList<FileDescriptor>();
        ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();

        // Add the server socket to socketFDs first
        socketFDs.add(mZygoteSocket.getFileDescriptor());
        peers.add(null);

        while (true) {
            fetchUsapPoolPolicyPropsWithMinInterval();

            int[] usapPipeFDs = null;
            StructPollfd[] pollFDs = null;

            // Rebuild the array of poll fds to watch on every iteration
            // Allocate enough space for the poll structs, taking into account
            // the state of the USAP pool for this Zygote (could be a
            // regular Zygote, a WebView Zygote, or an AppZygote).
            if (mUsapPoolEnabled) {
                usapPipeFDs = Zygote.getUsapPipeFDs();
                pollFDs = new StructPollfd[socketFDs.size() + 1 + usapPipeFDs.length];
            } else {
                pollFDs = new StructPollfd[socketFDs.size()];
            }

            /*
             * For reasons of correctness the USAP pool pipe and event FDs
             * must be processed before the session and server sockets.  This
             * is to ensure that the USAP pool accounting information is
             * accurate when handling other requests like API blacklist
             * exemptions.
             */

            int pollIndex = 0;
            for (FileDescriptor socketFD : socketFDs) {
                 // Watch for incoming data (POLLIN)
                pollFDs[pollIndex] = new StructPollfd();
                pollFDs[pollIndex].fd = socketFD;
                pollFDs[pollIndex].events = (short) POLLIN;
                ++pollIndex;
            }

            final int usapPoolEventFDIndex = pollIndex;

            if (mUsapPoolEnabled) {
                pollFDs[pollIndex] = new StructPollfd();
                pollFDs[pollIndex].fd = mUsapPoolEventFD;
                pollFDs[pollIndex].events = (short) POLLIN;
                ++pollIndex;

                for (int usapPipeFD : usapPipeFDs) {
                    FileDescriptor managedFd = new FileDescriptor();
                    managedFd.setInt$(usapPipeFD);

                    pollFDs[pollIndex] = new StructPollfd();
                    pollFDs[pollIndex].fd = managedFd;
                    pollFDs[pollIndex].events = (short) POLLIN;
                    ++pollIndex;
                }
            }

            try {
                // Block until at least one FD has an event (timeout -1 waits indefinitely)
                Os.poll(pollFDs, -1);
            } catch (ErrnoException ex) {
                throw new RuntimeException("poll failed", ex);
            }

            boolean usapPoolFDRead = false;

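            // Walk the array in reverse: USAP FDs first, then established session sockets,
            // and the server socket (new connection requests) last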
            while (--pollIndex >= 0) {
                if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
                    continue;
                }

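                // The server socket was added first, so index 0 means the server socket is readable
                if (pollIndex == 0) {
                    // A client is asking for a new connection; accept it
                    ZygoteConnection newPeer = acceptCommandPeer(abiList);
                    // Add it to peers and socketFDs so it is polled from the next iteration on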
                    peers.add(newPeer);
                    socketFDs.add(newPeer.getFileDescriptor());

                } else if (pollIndex < usapPoolEventFDIndex) {
                    // A request arrived on an established session socket, e.g. AMS asking for
                    // a new application process; processOneCommand creates that process
                    // Session socket accepted from the Zygote server socket
                    try {
                        // Look up the existing ZygoteConnection for this session socket
                        ZygoteConnection connection = peers.get(pollIndex);
                        // Handle the request, see [4.2.6]
                        final Runnable command = connection.processOneCommand(this);

                        // TODO (chriswailes): Is this extra check necessary?
                        if (mIsForkChild) {
                            // We're in the child. We should always have a command to run at this
                            // stage if processOneCommand hasn't called "exec".
                            if (command == null) {
                                throw new IllegalStateException("command == null");
                            }

                            return command;
                        } else {
                            // We're in the server - we should never have any commands to run.
                            if (command != null) {
                                throw new IllegalStateException("command != null");
                            }

                            // We don't know whether the remote side of the socket was closed or
                            // not until we attempt to read from it from processOneCommand. This
                            // shows up as a regular POLLIN event in our regular processing loop.
                            if (connection.isClosedByPeer()) {
                                connection.closeSocket();
                                peers.remove(pollIndex);
                                socketFDs.remove(pollIndex);    // Peer closed the connection: stop polling its FD
                            }
                        }
                    } catch (Exception e) {
                        ......
                    } finally {
                        mIsForkChild = false;
                    }
                } else {
                    // Handle a USAP pool event
                    // Either the USAP pool event FD or a USAP reporting pipe.

                    // If this is the event FD the payload will be the number of USAPs removed.
                    // If this is a reporting pipe FD the payload will be the PID of the USAP
                    // that was just specialized.
                    long messagePayload = -1;

                    try {
                        byte[] buffer = new byte[Zygote.USAP_MANAGEMENT_MESSAGE_BYTES];
                        int readBytes = Os.read(pollFDs[pollIndex].fd, buffer, 0, buffer.length);

                        if (readBytes == Zygote.USAP_MANAGEMENT_MESSAGE_BYTES) {
                            DataInputStream inputStream =
                                    new DataInputStream(new ByteArrayInputStream(buffer));

                            messagePayload = inputStream.readLong();
                        } else {
                            Log.e(TAG, "Incomplete read from USAP management FD of size "
                                    + readBytes);
                            continue;
                        }
                    } catch (Exception ex) {
                        if (pollIndex == usapPoolEventFDIndex) {
                            Log.e(TAG, "Failed to read from USAP pool event FD: "
                                    + ex.getMessage());
                        } else {
                            Log.e(TAG, "Failed to read from USAP reporting pipe: "
                                    + ex.getMessage());
                        }

                        continue;
                    }

                    if (pollIndex > usapPoolEventFDIndex) {
                        Zygote.removeUsapTableEntry((int) messagePayload);
                    }

                    usapPoolFDRead = true;
                }
            }

            // Check to see if the USAP pool needs to be refilled.
            if (usapPoolFDRead) {
                int[] sessionSocketRawFDs =
                        socketFDs.subList(1, socketFDs.size())
                                .stream()
                                .mapToInt(fd -> fd.getInt$())
                                .toArray();

                final Runnable command = fillUsapPool(sessionSocketRawFDs);

                if (command != null) {
                    return command;
                }
            }
        }
}
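
The meaning of each poll index is easy to lose in the full listing, so here is a minimal sketch of just that bookkeeping. It assumes the USAP pool is enabled, and `PollIndexDemo`, `Kind`, `classify` and `processingOrder` are hypothetical names for illustration, not AOSP code. Only the index arithmetic mirrors runSelectLoop: index 0 is the server socket, the following indices are session sockets, then the USAP pool event FD, then the USAP reporting pipes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of AOSP: models how runSelectLoop interprets a poll index.
final class PollIndexDemo {
    enum Kind { SERVER_SOCKET, SESSION_SOCKET, USAP_EVENT_FD, USAP_PIPE }

    // sessionCount is the number of accepted connections, i.e. socketFDs.size() - 1.
    static Kind classify(int pollIndex, int sessionCount) {
        // Same value runSelectLoop captures right after filling in the socket FDs.
        int usapPoolEventFDIndex = 1 + sessionCount;
        if (pollIndex == 0) return Kind.SERVER_SOCKET;
        if (pollIndex < usapPoolEventFDIndex) return Kind.SESSION_SOCKET;
        if (pollIndex == usapPoolEventFDIndex) return Kind.USAP_EVENT_FD;
        return Kind.USAP_PIPE;
    }

    // Order in which the entries would be handled if every FD were readable.
    static List<Kind> processingOrder(int sessionCount, int usapPipeCount) {
        int pollIndex = 1 + sessionCount + 1 + usapPipeCount;
        List<Kind> order = new ArrayList<>();
        while (--pollIndex >= 0) { // reverse walk, as in runSelectLoop
            order.add(classify(pollIndex, sessionCount));
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(processingOrder(2, 1));
    }
}
```

With two sessions and one reporting pipe, the reverse walk visits the pipe, then the event FD, then both session sockets, and the server socket last. This matches the comment in the source that USAP FDs must be processed before the session and server sockets.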

4.2.6 [ZygoteConnection.java] processOneCommand()

Runnable processOneCommand(ZygoteServer zygoteServer) {
    ...
    // Fork the child process
    pid = Zygote.forkAndSpecialize(parsedArgs.mUid, parsedArgs.mGid, parsedArgs.mGids,
            parsedArgs.mRuntimeFlags, rlimits, parsedArgs.mMountExternal, parsedArgs.mSeInfo,
            parsedArgs.mNiceName, fdsToClose, fdsToIgnore, parsedArgs.mStartChildZygote,
            parsedArgs.mInstructionSet, parsedArgs.mAppDataDir, parsedArgs.mTargetSdkVersion);
    if (pid == 0) {
        // Runs in the child process
        zygoteServer.setForkChild();
        // Continue with the child-side flow, see [4.2.7]
        return handleChildProc(parsedArgs, descriptors, childPipeFd,
                parsedArgs.mStartChildZygote);
    } else {
        // Runs in the parent process
        // In the parent. A pid < 0 indicates a failure and will be handled in
        // handleParentProc.
        handleParentProc(pid, descriptors, serverPipeFd);
        return null;
    }
    ...
}
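
The branch on `pid` is the key idea here: the same code returns in two processes, and only the child gets a command to run. The sketch below is a toy model under stated assumptions. No real fork happens, the `pid` argument stands in for the value the fork call returned, and `ForkDispatchDemo` and `dispatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not AOSP code: pid stands in for the return value of the fork call.
final class ForkDispatchDemo {
    // Mirrors the branch in processOneCommand: child gets a Runnable, parent gets null.
    static Runnable dispatch(int pid, List<String> log) {
        if (pid == 0) {
            // Child side: hand back the work to run instead of running it here.
            return () -> log.add("child: run target main()");
        }
        // Parent side: a negative pid means the fork failed.
        log.add(pid < 0 ? "parent: fork failed" : "parent: child pid " + pid);
        return null;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Runnable inParent = dispatch(1234, log); // what the parent observes
        Runnable inChild = dispatch(0, log);     // what the child observes
        if (inChild != null) {
            inChild.run();
        }
        System.out.println(inParent == null);
        System.out.println(log);
    }
}
```

Returning a `Runnable` rather than calling the target directly lets the child unwind out of the select loop before it starts the application code, which is why runSelectLoop checks `mIsForkChild` and returns the command.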

4.2.7 [ZygoteConnection.java] handleChildProc()

private Runnable handleChildProc(ZygoteArguments parsedArgs, FileDescriptor[] descriptors, FileDescriptor pipeFd, boolean isZygote) {
      ...
      if (parsedArgs.mInvokeWith != null) {
          ...
          throw new IllegalStateException("WrapperInit.execApplication unexpectedly returned");
      } else {
          if (!isZygote) {
              // App processes take this branch, which ends up running the target class's main() method
              return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
                      parsedArgs.mRemainingArgs, null /* classLoader */);
          } else {
              return ZygoteInit.childZygoteInit(parsedArgs.mTargetSdkVersion,
                      parsedArgs.mRemainingArgs, null /* classLoader */);
          }
      }
  }

5.Questions and Analysis

5.1 Why do SystemServer and Zygote communicate over a socket?

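Binder is the usual choice for inter-process communication on Android, so why is a socket used here?
The main reason is to avoid a problem with fork:


Rule 3 of C++ programming on UNIX: do not call fork in a multithreaded program.
Binder communication is inherently multithreaded: a call made through a proxy object is delivered on a Binder thread, and the receiver then has to hand it over to the main thread through a Handler.
For example, when AMS talks to an application process, AMS calls scheduleLaunchActivity on its local IApplicationThread proxy, and the application's ApplicationThread.scheduleLaunchActivity runs on a Binder thread.
It then wraps the arguments in an ActivityClientRecord and passes it with sendMessage to class H (the main-thread Handler, an inner class of ActivityThread).
The underlying risk: a Binder thread in the parent may be holding a lock at the moment of the fork. fork copies only the calling thread into the child, so in the child the lock is still marked as held while the thread that owns it does not exist. The child's main thread then waits for that lock forever, which is a deadlock.


So fork should not run while other threads exist. Binder, however, is multithreaded by design, so the parent process (Zygote) simply does not use Binder threads at this point and takes its requests over a socket instead.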
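
Java has no fork call, so the deadlock cannot be reproduced directly. The toy model below only simulates the bookkeeping: memory (including the lock state) is copied, but only the calling thread survives. All names (`ForkLockDemo`, `ProcessImage`, `lockIsOrphaned`) are hypothetical and exist only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model, not a real fork: it only tracks which threads exist and who owns one lock.
final class ForkLockDemo {
    static final class ProcessImage {
        final Set<String> threads = new HashSet<>();
        String lockOwner; // null means the lock is free

        // Models fork: lock state is copied, but only the calling thread is recreated.
        ProcessImage fork(String callingThread) {
            ProcessImage child = new ProcessImage();
            child.threads.add(callingThread);
            child.lockOwner = this.lockOwner;
            return child;
        }

        // A held lock whose owner does not exist in this process can never be released.
        boolean lockIsOrphaned() {
            return lockOwner != null && !threads.contains(lockOwner);
        }
    }

    public static void main(String[] args) {
        ProcessImage parent = new ProcessImage();
        parent.threads.add("main");
        parent.threads.add("Binder:1");
        parent.lockOwner = "Binder:1"; // a Binder thread holds the lock during the fork

        ProcessImage child = parent.fork("main");
        System.out.println("parent lock orphaned: " + parent.lockIsOrphaned());
        System.out.println("child lock orphaned: " + child.lockIsOrphaned());
    }
}
```

In the parent the owning thread still exists and will eventually release the lock. In the child the owner is gone, so any attempt to take the lock blocks forever.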

5.2 Why does each Java application get its own virtual machine?

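  1. Android's VM (Virtual Machine) plays a role similar to the JRE. The two differ in almost every detail, but they share one purpose: providing a runtime environment for apps.
  2. Android gives every app its own VM instance, so each app runs in an isolated environment, which improves stability.
  3. The VM design improves compatibility. APK code is compiled to bytecode, and the VM turns that bytecode into code the device can actually execute; without this layer, supporting different hardware would be a major problem.
  4. Android (non-rooted) has nothing like the keyboard hooks on Windows. Each app has its own VM and apps cannot freely access each other's memory, so trojans and viruses of that kind are almost nonexistent.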

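5.3 What is Zygote resource preloading?

Preloading means loading resources when the zygote process starts. The load happens once, in zygote, and apps that use those resources do not have to load them again, which cuts resource loading time and speeds up app startup. In general, resources shared by the apps on the system are the ones listed for preloading.
When zygote forks a child process, fork's copy-on-write mechanism means that classes which are never modified do not even need to be copied: the child shares that data with the parent, which saves a good deal of memory.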

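5.4 Why does Zygote preload?

Every application is forked from Zygote, and every application inherits everything Zygote holds.
If the classes and resources are loaded when Zygote starts, the forked applications inherit them and do not need to load them at launch, so applications start much faster.
The device boots rarely, but applications are launched very often.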

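5.5 How does Zygote preloading work?

After the zygote process starts, it reads the resources and stores them in a global static variable of the Resources class; later lookups of system resources check that static variable first.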
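
The "check the static variable first" idea can be sketched in plain Java. This is a minimal illustration, not the framework's Resources implementation: `PreloadDemo`, `sPreloaded`, `loadFromDisk` and the `slowLoads` counter are hypothetical names, and the "resource" is just a string.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a preload cache; not the real android.content.res.Resources.
final class PreloadDemo {
    // Filled once at startup; forked children would inherit it without reloading.
    private static final Map<Integer, String> sPreloaded = new HashMap<>();
    static int slowLoads = 0; // counts how often the expensive path runs

    // Stand-in for an expensive read and parse of a resource.
    private static String loadFromDisk(int id) {
        slowLoads++;
        return "resource#" + id;
    }

    // What the zygote-side preload step does conceptually.
    static void preload(int... ids) {
        for (int id : ids) {
            sPreloaded.put(id, loadFromDisk(id));
        }
    }

    // Lookups check the preloaded cache first and fall back to loading.
    static String get(int id) {
        String cached = sPreloaded.get(id);
        return cached != null ? cached : loadFromDisk(id);
    }

    public static void main(String[] args) {
        preload(1, 2);
        get(1); // served from the cache, no extra load
        get(3); // not preloaded, takes the slow path
        System.out.println("slow loads: " + slowLoads);
    }
}
```

Preloaded entries cost one slow load in total, no matter how many lookups follow, while anything left out of the preload list pays the slow path on each lookup in this sketch.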

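6.Summary

This completes the Zygote startup flow. In total, the following steps take place:

  1. init parses init.zygote64_32.rc and launches app_process, which creates an AppRuntime and calls its start method to start the Zygote process.
  2. The JavaVM is created and the JNI functions are registered with it.
  3. ZygoteInit's main function is invoked through JNI, entering the Java framework layer of Zygote.
  4. ZygoteServer creates the server-side socket, classes and resources are preloaded, and runSelectLoop waits for requests from clients such as ActivityManagerService.
  5. The SystemServer process is started.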