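When we call a native method from Java, many people assume the program jumps straight into the corresponding C/C++ function. In reality it does not: an intermediate function is needed to handle thread state transitions, Local Reference Table updates, argument conversion, and similar work. This function is usually called the "JNI trampoline", and the less time it takes to run, the better JNI calls perform.
Over the history of Android, Google has made several major optimizations to the JNI trampoline. They fall broadly into two directions:
- Moving from a generic trampoline that works for all parameter types to specific trampolines that only work for particular parameter types.
- Skipping some of the trampoline's work, depending on what the C/C++ function actually does.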
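As is well known, the ART virtual machine supports three execution modes: interpretation, AOT, and JIT. The latter two both execute machine code. When the interpreter encounters a native method, it uses the runtime's built-in art_quick_generic_jni_trampoline to handle the intermediate work. This generic trampoline works for every parameter type, but because it has to account for every case, including the most extreme ones, it does not perform well. For example, even if we pass a single argument, art_quick_generic_jni_trampoline still reserves 5 KB (5120 bytes) on the stack.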
// Reserved area on stack for art_quick_generic_jni_trampoline:
// 4 local state ref
// 4 padding
// 4096 4k scratch space, enough for 2x 256 8-byte parameters
// 8*(32+32) max 32 GPRs and 32 FPRs on each architecture, 8 bytes each
// + 4 padding for 16-bytes alignment
// -----------
// 4616
// Round up to 5k, total 5120
#define GENERIC_JNI_TRAMPOLINE_RESERVED_AREA 5120
As for why the comment says "256 8-byte parameters": the JVM Specification limits the number of parameters a Java method can take to 255.
The number of method parameters is limited to 255 by the definition of a method descriptor (4.3.3), where the limit includes one unit for this in the case of instance or interface method invocations.
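The arithmetic in the ART comment quoted above can be checked with a few lines of Java. This is only a sketch: the class and constant names are made up for illustration, and the byte counts are copied from the comment. Note that the itemized entries add up to 4620, whereas the subtotal printed in the comment reads 4616; both round up to the same 5120.

```java
final class GenericJniReservedArea {
    // Sizes in bytes, copied from the ART comment quoted above.
    static final int LOCAL_STATE_REF = 4;
    static final int PADDING = 4;
    static final int SCRATCH_SPACE = 2 * 256 * 8;      // "2x 256 8-byte parameters" = 4096
    static final int REGISTER_SAVE = 8 * (32 + 32);    // 32 GPRs + 32 FPRs, 8 bytes each = 512
    static final int ALIGNMENT_PADDING = 4;

    /** Sum of the itemized entries listed in the comment. */
    static int itemizedTotal() {
        return LOCAL_STATE_REF + PADDING + SCRATCH_SPACE + REGISTER_SAVE + ALIGNMENT_PADDING;
    }

    /** Rounds a byte count up to the next multiple of 1024. */
    static int roundUpToKiB(int bytes) {
        return (bytes + 1023) / 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println("itemized total = " + itemizedTotal());
        System.out.println("rounded up     = " + roundUpToKiB(itemizedTotal()));
    }
}
```

The scratch space is sized for twice the 256 slots, which is where the 255-parameter limit from the JVM Specification enters the calculation.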
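Things improve when machine code is executed, because the compiler generates a dedicated trampoline for each combination of parameter types, usually called the "compiler JNI trampoline". Since these trampolines know the parameter types, converting and passing arguments is more direct. In addition, the compiler inlines the thread state transition. Together these make the compiler JNI trampoline faster than the generic JNI trampoline.
Note that a trampoline is generated per combination of parameter types, not per native method. For example, below are two different native methods from boot.oat. They have different return types, but their parameter types match: the first parameter is a reference type and the second is a 4-byte primitive. They therefore share one trampoline.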
- static native void listen(FileDescriptor fd, int backlog) throws IOException;
- private static native int chmod(String fileName, int permission);
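Dumping the assembly for these two methods with oatdump (that is, the compiler trampoline) shows that they are completely identical, down to the address. This indicates that the trampoline exists only once in the oat file and that different native methods with matching parameter types all point to it. (For a detailed walkthrough of this assembly, see my earlier article.)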
32: int java.util.prefs.FileSystemPreferences.chmod(java.lang.String, int) (dex_method_idx=26767)
DEX CODE:
OatMethodOffsets (offset=0x0000e9fc)
code_offset: 0x00098d90
OatQuickMethodHeader (offset=0x00098d8c)
vmap_table: (offset=0x0006f7bd)
QuickMethodFrameInfo
frame_size_in_bytes: 176
core_spill_mask: 0x7ff80000 (r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30)
fp_spill_mask: 0x0000ff00 (fr8, fr9, fr10, fr11, fr12, fr13, fr14, fr15)
CODE: (code_offset=0x00098d90 size=304)...
0x00098d90: d102c3ff sub sp, sp, #0xb0 (176)
0x00098d94: a90553f3 stp tr, x20, [sp, #80]
0x00098d98: a9065bf5 stp x21, x22, [sp, #96]
0x00098d9c: a90763f7 stp x23, x24, [sp, #112]
0x00098da0: a9086bf9 stp x25, x26, [sp, #128]
0x00098da4: a90973fb stp x27, x28, [sp, #144]
0x00098da8: a90a7bfd stp x29, lr, [sp, #160]
0x00098dac: 6d0127e8 stp d8, d9, [sp, #16]
0x00098db0: 6d022fea stp d10, d11, [sp, #32]
0x00098db4: 6d0337ec stp d12, d13, [sp, #48]
0x00098db8: 6d043fee stp d14, d15, [sp, #64]
0x00098dbc: f90003e0 str x0, [sp]
0x00098dc0: 35000614 cbnz w20, #+0xc0 (addr 0x98e80)
0x00098dc4: b900bbe1 str w1, [sp, #184]
0x00098dc8: 910003f0 mov x16, sp
0x00098dcc: f9005a70 str x16, [tr, #176] ; top_quick_frame_method
0x00098dd0: 885f7e70 ldxr w16, [tr]
0x00098dd4: 52ab8011 mov w17, #0x5c000000
0x00098dd8: 35000610 cbnz w16, #+0xc0 (addr 0x98e98)
0x00098ddc: 8810fe71 stlxr w16, w17, [tr]
0x00098de0: 35ffff90 cbnz w16, #-0x10 (addr 0x98dd0)
0x00098de4: f904f67f str xzr, [tr, #2536] ; 2536
0x00098de8: f9406a76 ldr x22, [tr, #208] ; jni_env
0x00098dec: b9401ad7 ldr w23, [x22, #24]
0x00098df0: b94022d8 ldr w24, [x22, #32]
0x00098df4: b9001ad8 str w24, [x22, #24]
0x00098df8: 2a0203e3 mov w3, w2
0x00098dfc: 9102e3f0 add x16, sp, #0xb8 (184)
0x00098e00: 7100003f cmp w1, #0x0 (0)
0x00098e04: 9a9f1202 csel x2, x16, xzr, ne
0x00098e08: aa0003e1 mov x1, x0
0x00098e0c: aa1603e0 mov x0, x22
0x00098e10: f940083e ldr lr, [x1, #16]
0x00098e14: d63f03c0 blr lr
0x00098e18: 885ffe70 ldaxr w16, [tr]
0x00098e1c: 52ab8011 mov w17, #0x5c000000
0x00098e20: 6b11021f cmp w16, w17
0x00098e24: 54000401 b.ne #+0x80 (addr 0x98ea4)
0x00098e28: 88107e7f stxr w16, wzr, [tr]
0x00098e2c: 35ffff70 cbnz w16, #-0x14 (addr 0x98e18)
0x00098e30: f943d270 ldr x16, [tr, #1952] ; 1952
0x00098e34: f904f670 str x16, [tr, #2536] ; 2536
0x00098e38: b9401ad8 ldr w24, [x22, #24]
0x00098e3c: b90022d8 str w24, [x22, #32]
0x00098e40: b9001ad7 str w23, [x22, #24]
0x00098e44: f9405270 ldr x16, [tr, #160] ; exception
0x00098e48: b5000350 cbnz x16, #+0x68 (addr 0x98eb0)
0x00098e4c: a94553f3 ldp tr, x20, [sp, #80]
0x00098e50: a9465bf5 ldp x21, x22, [sp, #96]
0x00098e54: a94763f7 ldp x23, x24, [sp, #112]
0x00098e58: a9486bf9 ldp x25, x26, [sp, #128]
0x00098e5c: a94973fb ldp x27, x28, [sp, #144]
0x00098e60: a94a7bfd ldp x29, lr, [sp, #160]
0x00098e64: 6d4127e8 ldp d8, d9, [sp, #16]
0x00098e68: 6d422fea ldp d10, d11, [sp, #32]
0x00098e6c: 6d4337ec ldp d12, d13, [sp, #48]
0x00098e70: 6d443fee ldp d14, d15, [sp, #64]
0x00098e74: b9402674 ldr w20, [tr, #36] ; is_gc_marking
0x00098e78: 9102c3ff add sp, sp, #0xb0 (176)
0x00098e7c: d65f03c0 ret
0x00098e80: b9400016 ldr w22, [x0]
0x00098e84: b94006d0 ldr w16, [x22, #4]
0x00098e88: 37eff9f0 tbnz w16, #29, #-0xc4 (addr 0x98dc4)
0x00098e8c: f942fe7e ldr lr, [tr, #1528] ; pJniReadBarrier
0x00098e90: d63f03c0 blr lr
0x00098e94: 17ffffcc b #-0xd0 (addr 0x98dc4)
0x00098e98: f941967e ldr lr, [tr, #808] ; pJniMethodStart
0x00098e9c: d63f03c0 blr lr
0x00098ea0: 17ffffd2 b #-0xb8 (addr 0x98de8)
0x00098ea4: f9419a7e ldr lr, [tr, #816] ; pJniMethodEnd
0x00098ea8: d63f03c0 blr lr
0x00098eac: 17ffffe3 b #-0x74 (addr 0x98e38)
0x00098eb0: f9405260 ldr x0, [tr, #160] ; exception
0x00098eb4: f942867e ldr lr, [tr, #1288] ; pDeliverException
0x00098eb8: d63f03c0 blr lr
0x00098ebc: d4200000 brk #0x0
39: void sun.nio.ch.Net.listen(java.io.FileDescriptor, int) (dex_method_idx=34223)
DEX CODE:
OatMethodOffsets (offset=0x0000748c)
code_offset: 0x00098d90
OatQuickMethodHeader (offset=0x00098d8c)
vmap_table: (offset=0x0006f7bd)
QuickMethodFrameInfo
frame_size_in_bytes: 176
core_spill_mask: 0x7ff80000 (r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30)
fp_spill_mask: 0x0000ff00 (fr8, fr9, fr10, fr11, fr12, fr13, fr14, fr15)
CODE: (code_offset=0x00098d90 size=304)...
0x00098d90: d102c3ff sub sp, sp, #0xb0 (176)
0x00098d94: a90553f3 stp tr, x20, [sp, #80]
0x00098d98: a9065bf5 stp x21, x22, [sp, #96]
0x00098d9c: a90763f7 stp x23, x24, [sp, #112]
0x00098da0: a9086bf9 stp x25, x26, [sp, #128]
0x00098da4: a90973fb stp x27, x28, [sp, #144]
0x00098da8: a90a7bfd stp x29, lr, [sp, #160]
0x00098dac: 6d0127e8 stp d8, d9, [sp, #16]
0x00098db0: 6d022fea stp d10, d11, [sp, #32]
0x00098db4: 6d0337ec stp d12, d13, [sp, #48]
0x00098db8: 6d043fee stp d14, d15, [sp, #64]
0x00098dbc: f90003e0 str x0, [sp]
0x00098dc0: 35000614 cbnz w20, #+0xc0 (addr 0x98e80)
0x00098dc4: b900bbe1 str w1, [sp, #184]
0x00098dc8: 910003f0 mov x16, sp
0x00098dcc: f9005a70 str x16, [tr, #176] ; top_quick_frame_method
0x00098dd0: 885f7e70 ldxr w16, [tr]
0x00098dd4: 52ab8011 mov w17, #0x5c000000
0x00098dd8: 35000610 cbnz w16, #+0xc0 (addr 0x98e98)
0x00098ddc: 8810fe71 stlxr w16, w17, [tr]
0x00098de0: 35ffff90 cbnz w16, #-0x10 (addr 0x98dd0)
0x00098de4: f904f67f str xzr, [tr, #2536] ; 2536
0x00098de8: f9406a76 ldr x22, [tr, #208] ; jni_env
0x00098dec: b9401ad7 ldr w23, [x22, #24]
0x00098df0: b94022d8 ldr w24, [x22, #32]
0x00098df4: b9001ad8 str w24, [x22, #24]
0x00098df8: 2a0203e3 mov w3, w2
0x00098dfc: 9102e3f0 add x16, sp, #0xb8 (184)
0x00098e00: 7100003f cmp w1, #0x0 (0)
0x00098e04: 9a9f1202 csel x2, x16, xzr, ne
0x00098e08: aa0003e1 mov x1, x0
0x00098e0c: aa1603e0 mov x0, x22
0x00098e10: f940083e ldr lr, [x1, #16]
0x00098e14: d63f03c0 blr lr
0x00098e18: 885ffe70 ldaxr w16, [tr]
0x00098e1c: 52ab8011 mov w17, #0x5c000000
0x00098e20: 6b11021f cmp w16, w17
0x00098e24: 54000401 b.ne #+0x80 (addr 0x98ea4)
0x00098e28: 88107e7f stxr w16, wzr, [tr]
0x00098e2c: 35ffff70 cbnz w16, #-0x14 (addr 0x98e18)
0x00098e30: f943d270 ldr x16, [tr, #1952] ; 1952
0x00098e34: f904f670 str x16, [tr, #2536] ; 2536
0x00098e38: b9401ad8 ldr w24, [x22, #24]
0x00098e3c: b90022d8 str w24, [x22, #32]
0x00098e40: b9001ad7 str w23, [x22, #24]
0x00098e44: f9405270 ldr x16, [tr, #160] ; exception
0x00098e48: b5000350 cbnz x16, #+0x68 (addr 0x98eb0)
0x00098e4c: a94553f3 ldp tr, x20, [sp, #80]
0x00098e50: a9465bf5 ldp x21, x22, [sp, #96]
0x00098e54: a94763f7 ldp x23, x24, [sp, #112]
0x00098e58: a9486bf9 ldp x25, x26, [sp, #128]
0x00098e5c: a94973fb ldp x27, x28, [sp, #144]
0x00098e60: a94a7bfd ldp x29, lr, [sp, #160]
0x00098e64: 6d4127e8 ldp d8, d9, [sp, #16]
0x00098e68: 6d422fea ldp d10, d11, [sp, #32]
0x00098e6c: 6d4337ec ldp d12, d13, [sp, #48]
0x00098e70: 6d443fee ldp d14, d15, [sp, #64]
0x00098e74: b9402674 ldr w20, [tr, #36] ; is_gc_marking
0x00098e78: 9102c3ff add sp, sp, #0xb0 (176)
0x00098e7c: d65f03c0 ret
0x00098e80: b9400016 ldr w22, [x0]
0x00098e84: b94006d0 ldr w16, [x22, #4]
0x00098e88: 37eff9f0 tbnz w16, #29, #-0xc4 (addr 0x98dc4)
0x00098e8c: f942fe7e ldr lr, [tr, #1528] ; pJniReadBarrier
0x00098e90: d63f03c0 blr lr
0x00098e94: 17ffffcc b #-0xd0 (addr 0x98dc4)
0x00098e98: f941967e ldr lr, [tr, #808] ; pJniMethodStart
0x00098e9c: d63f03c0 blr lr
0x00098ea0: 17ffffd2 b #-0xb8 (addr 0x98de8)
0x00098ea4: f9419a7e ldr lr, [tr, #816] ; pJniMethodEnd
0x00098ea8: d63f03c0 blr lr
0x00098eac: 17ffffe3 b #-0x74 (addr 0x98e38)
0x00098eb0: f9405260 ldr x0, [tr, #160] ; exception
0x00098eb4: f942867e ldr lr, [tr, #1288] ; pDeliverException
0x00098eb8: d63f03c0 blr lr
0x00098ebc: d4200000 brk #0x0
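The two dumps above report the same code_offset (0x00098d90), which is what sharing looks like in the oat file. One way to picture it is a lookup keyed by parameter "shape". The sketch below is a toy model of that idea only: shapeOf, shapeKey, and the single-letter codes are hypothetical names chosen for illustration, and this is not ART's actual deduplication logic, which the article does not show.

```java
final class TrampolineSharingSketch {
    /** Collapses a parameter type to a one-letter shape: every reference becomes 'L'. */
    static char shapeOf(Class<?> type) {
        if (!type.isPrimitive()) return 'L';   // any reference type, arrays included
        if (type == int.class) return 'I';
        if (type == long.class) return 'J';
        if (type == boolean.class) return 'Z';
        if (type == byte.class) return 'B';
        if (type == char.class) return 'C';
        if (type == short.class) return 'S';
        if (type == float.class) return 'F';
        if (type == double.class) return 'D';
        throw new IllegalArgumentException("unsupported parameter type: " + type);
    }

    /** Builds the hypothetical lookup key from the parameter types only. */
    static String shapeKey(Class<?>... parameterTypes) {
        StringBuilder key = new StringBuilder();
        for (Class<?> type : parameterTypes) {
            key.append(shapeOf(type));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // static native void listen(FileDescriptor fd, int backlog)
        String listenKey = shapeKey(java.io.FileDescriptor.class, int.class);
        // private static native int chmod(String fileName, int permission)
        String chmodKey = shapeKey(String.class, int.class);
        System.out.println(listenKey + " / " + chmodKey
                + " -> same key: " + listenKey.equals(chmodKey));
    }
}
```

In this model FileDescriptor and String both collapse to the same reference shape, so listen and chmod end up with one key and hence one trampoline.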
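So compiling a native method with JIT/AOT improves its performance. The gain does not come from turning bytecode into machine code, because a native method is empty and has no bytecode. It comes from switching from art_quick_generic_jni_trampoline to the compiler JNI trampoline.
Combined with the Baseline Profile approach that mainstream apps are now trying, it may be worth adding these native methods to the profile. Because different methods with the same parameter types share a single trampoline, the added code size is negligible, while every native method benefits from the speedup.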
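Let us go back to the start of the story: why does a JNI call need a trampoline at all?
Argument conversion is easy to understand: the C/C++ function takes an extra JNIEnv* parameter, and Java reference types and String have to be converted to their C++ counterparts. But what are the thread state transition and the Local Reference Table for?
In short, both exist so that GC can run correctly. However GC collectors evolve, two fundamentals stay the same: there must be a quiescent window in which a stable heap state can be observed, and all GC roots must be found. The quiescent window is also known as "stop the world", meaning that no thread may touch heap memory during that phase. The notion of Java thread states grew out of this. The Runnable state means the thread is running in the Java world and may use the heap at any time. The Native state means the thread is running in the C/C++ world and will not use the Java heap. To the GC, a thread in the Native state is a "paused" thread. "Paused" here does not mean the thread is not running, only that it does not touch the Java heap.
So on a regular JNI call, the thread state has to switch from Runnable to Native, telling the GC: I will not use the Java heap from here on, so treat me as asleep.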
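What if the C/C++ function needs to use the Java heap again? That happens all the time, for example env->CallObjectMethod to call a Java method, or env->GetBooleanField to read a field of a reference argument. At that point the thread state has to switch from Native back to Runnable. A regular JNI call can therefore involve frequent thread state transitions, which is a significant performance cost.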
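To make the cost concrete, the transitions described above can be counted in a toy model. This is a simplified sketch, not ART code: ThreadStateSketch and its methods are hypothetical, and it only tracks a state flag rather than doing any real synchronization with a GC.

```java
final class ThreadStateSketch {
    enum State { RUNNABLE, NATIVE }

    private State state = State.RUNNABLE;
    private int transitions = 0;

    int transitions() { return transitions; }

    /** In this model the GC treats the thread as paused whenever it is in the NATIVE state. */
    boolean looksPausedToGc() { return state == State.NATIVE; }

    private void transitionTo(State next) {
        if (state != next) {
            state = next;
            transitions++;
        }
    }

    /** Models a regular JNI call: Runnable -> Native before the body, Native -> Runnable after. */
    void callNative(Runnable nativeBody) {
        transitionTo(State.NATIVE);
        try {
            nativeBody.run();
        } finally {
            transitionTo(State.RUNNABLE);
        }
    }

    /** Models a JNIEnv call made from native code that has to touch the Java heap. */
    void touchJavaHeap(Runnable heapAccess) {
        transitionTo(State.RUNNABLE);
        try {
            heapAccess.run();
        } finally {
            transitionTo(State.NATIVE);
        }
    }

    /** Counts the transitions for one native call that makes envCalls heap-touching JNIEnv calls. */
    static int transitionsFor(int envCalls) {
        ThreadStateSketch thread = new ThreadStateSketch();
        thread.callNative(() -> {
            for (int i = 0; i < envCalls; i++) {
                thread.touchJavaHeap(() -> { });
            }
        });
        return thread.transitions();
    }

    public static void main(String[] args) {
        System.out.println("no JNIEnv calls:  " + transitionsFor(0));
        System.out.println("two JNIEnv calls: " + transitionsFor(2));
    }
}
```

In this model a native call costs two transitions on its own, and every heap-touching JNIEnv call inside it adds two more.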
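Then why not skip the transition altogether and leave the thread in the Runnable state? What would happen?
Whether interpreted or compiled to machine code, Java/Kotlin code has many checkpoints inserted as it runs. They ensure that a thread can be suspended promptly when needed, rather than charging ahead blindfolded. C/C++ code has no such checkpoints (they would hurt performance, and different compilers follow different rules, so they cannot be guaranteed). If such a thread stays in the Runnable state, the GC can only wait for it to return to the Java world and reach a checkpoint before suspending it. If the C/C++ function runs only briefly, skipping the transition does improve performance. But if it runs for too long, or may block internally (waiting on a lock or sleeping, for example), GC will be seriously disrupted, and the time saved may not make up for the harm done to GC.
So when a developer can guarantee that the C/C++ function behind a native method is short-running, they can use the mechanisms Android provides to tell the trampoline not to perform the thread state transition.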
In early Android versions, prefixing the signature with ! skipped the thread state transition. This approach was called fast jni, as shown below.
static JNINativeMethod gMethods[] = {
NATIVE_METHOD(Unsafe, compareAndSwapInt, "!(Ljava/lang/Object;JII)Z"),
NATIVE_METHOD(Unsafe, compareAndSwapLong, "!(Ljava/lang/Object;JJJ)Z"),
This approach was deprecated starting with Android 8 and replaced by the @FastNative annotation, as shown below. The generated JNI trampoline is 61 lines of assembly.
@FastNative
private static native int getArrayBaseOffsetForComponentType(Class component_class);
2: int sun.misc.Unsafe.getArrayBaseOffsetForComponentType(java.lang.Class) (dex_method_idx=32803)
DEX CODE:
OatMethodOffsets (offset=0x000070cc)
code_offset: 0x00096080
OatQuickMethodHeader (offset=0x0009607c)
vmap_table: (offset=0x00082eed)
QuickMethodFrameInfo
frame_size_in_bytes: 176
core_spill_mask: 0x7ff80000 (r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30)
fp_spill_mask: 0x0000ff00 (fr8, fr9, fr10, fr11, fr12, fr13, fr14, fr15)
CODE: (code_offset=0x00096080 size=244)...
0x00096080: d102c3ff sub sp, sp, #0xb0 (176)
0x00096084: a90553f3 stp tr, x20, [sp, #80]
0x00096088: a9065bf5 stp x21, x22, [sp, #96]
0x0009608c: a90763f7 stp x23, x24, [sp, #112]
0x00096090: a9086bf9 stp x25, x26, [sp, #128]
0x00096094: a90973fb stp x27, x28, [sp, #144]
0x00096098: a90a7bfd stp x29, lr, [sp, #160]
0x0009609c: 6d0127e8 stp d8, d9, [sp, #16]
0x000960a0: 6d022fea stp d10, d11, [sp, #32]
0x000960a4: 6d0337ec stp d12, d13, [sp, #48]
0x000960a8: 6d043fee stp d14, d15, [sp, #64]
0x000960ac: f90003e0 str x0, [sp]
0x000960b0: 35000494 cbnz w20, #+0x90 (addr 0x96140)
0x000960b4: b900bbe1 str w1, [sp, #184]
0x000960b8: 910003f0 mov x16, sp
0x000960bc: f9005a70 str x16, [tr, #176] ; top_quick_frame_method
0x000960c0: f9406a76 ldr x22, [tr, #208] ; jni_env
0x000960c4: b9401ad7 ldr w23, [x22, #24]
0x000960c8: b94022d8 ldr w24, [x22, #32]
0x000960cc: b9001ad8 str w24, [x22, #24]
0x000960d0: 9102e3f0 add x16, sp, #0xb8 (184)
0x000960d4: 7100003f cmp w1, #0x0 (0)
0x000960d8: 9a9f1202 csel x2, x16, xzr, ne
0x000960dc: aa0003e1 mov x1, x0
0x000960e0: aa1603e0 mov x0, x22
0x000960e4: f940083e ldr lr, [x1, #16]
0x000960e8: d63f03c0 blr lr
0x000960ec: b9401ad8 ldr w24, [x22, #24]
0x000960f0: b90022d8 str w24, [x22, #32]
0x000960f4: b9001ad7 str w23, [x22, #24]
0x000960f8: f9405270 ldr x16, [tr, #160] ; exception
0x000960fc: b5000350 cbnz x16, #+0x68 (addr 0x96164)
0x00096100: b9400270 ldr w16, [tr] ; state_and_flags
0x00096104: 72000a1f tst w16, #0x7
0x00096108: 54000281 b.ne #+0x50 (addr 0x96158)
0x0009610c: a94553f3 ldp tr, x20, [sp, #80]
0x00096110: a9465bf5 ldp x21, x22, [sp, #96]
0x00096114: a94763f7 ldp x23, x24, [sp, #112]
0x00096118: a9486bf9 ldp x25, x26, [sp, #128]
0x0009611c: a94973fb ldp x27, x28, [sp, #144]
0x00096120: a94a7bfd ldp x29, lr, [sp, #160]
0x00096124: 6d4127e8 ldp d8, d9, [sp, #16]
0x00096128: 6d422fea ldp d10, d11, [sp, #32]
0x0009612c: 6d4337ec ldp d12, d13, [sp, #48]
0x00096130: 6d443fee ldp d14, d15, [sp, #64]
0x00096134: b9402674 ldr w20, [tr, #36] ; is_gc_marking
0x00096138: 9102c3ff add sp, sp, #0xb0 (176)
0x0009613c: d65f03c0 ret
0x00096140: b9400016 ldr w22, [x0]
0x00096144: b94006d0 ldr w16, [x22, #4]
0x00096148: 37effb70 tbnz w16, #29, #-0x94 (addr 0x960b4)
0x0009614c: f942fe7e ldr lr, [tr, #1528] ; pJniReadBarrier
0x00096150: d63f03c0 blr lr
0x00096154: 17ffffd8 b #-0xa0 (addr 0x960b4)
0x00096158: f942827e ldr lr, [tr, #1280] ; pTestSuspend
0x0009615c: d63f03c0 blr lr
0x00096160: 17ffffeb b #-0x54 (addr 0x9610c)
0x00096164: f9405260 ldr x0, [tr, #160] ; exception
0x00096168: f942867e ldr lr, [tr, #1288] ; pDeliverException
0x0009616c: d63f03c0 blr lr
0x00096170: d4200000 brk #0x0
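For comparison, here is the assembly generated for a regular native method with the same parameter types. Like getArrayBaseOffsetForComponentType, it is a private static method, so the only difference is the @FastNative annotation. The regular native method compiles to 75 lines of assembly, while the @FastNative method compiles to only 61.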
5: int sun.nio.ch.IOUtil.fdVal(java.io.FileDescriptor) (dex_method_idx=34034)
DEX CODE:
OatMethodOffsets (offset=0x0000738c)
code_offset: 0x00097110
OatQuickMethodHeader (offset=0x0009710c)
vmap_table: (offset=0x0008147a)
QuickMethodFrameInfo
frame_size_in_bytes: 176
core_spill_mask: 0x7ff80000 (r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30)
fp_spill_mask: 0x0000ff00 (fr8, fr9, fr10, fr11, fr12, fr13, fr14, fr15)
CODE: (code_offset=0x00097110 size=300)...
0x00097110: d102c3ff sub sp, sp, #0xb0 (176)
0x00097114: a90553f3 stp tr, x20, [sp, #80]
0x00097118: a9065bf5 stp x21, x22, [sp, #96]
0x0009711c: a90763f7 stp x23, x24, [sp, #112]
0x00097120: a9086bf9 stp x25, x26, [sp, #128]
0x00097124: a90973fb stp x27, x28, [sp, #144]
0x00097128: a90a7bfd stp x29, lr, [sp, #160]
0x0009712c: 6d0127e8 stp d8, d9, [sp, #16]
0x00097130: 6d022fea stp d10, d11, [sp, #32]
0x00097134: 6d0337ec stp d12, d13, [sp, #48]
0x00097138: 6d043fee stp d14, d15, [sp, #64]
0x0009713c: f90003e0 str x0, [sp]
0x00097140: 350005f4 cbnz w20, #+0xbc (addr 0x971fc)
0x00097144: b900bbe1 str w1, [sp, #184]
0x00097148: 910003f0 mov x16, sp
0x0009714c: f9005a70 str x16, [tr, #176] ; top_quick_frame_method
0x00097150: 885f7e70 ldxr w16, [tr]
0x00097154: 52ab8011 mov w17, #0x5c000000
0x00097158: 350005f0 cbnz w16, #+0xbc (addr 0x97214)
0x0009715c: 8810fe71 stlxr w16, w17, [tr]
0x00097160: 35ffff90 cbnz w16, #-0x10 (addr 0x97150)
0x00097164: f904f67f str xzr, [tr, #2536] ; 2536
0x00097168: f9406a76 ldr x22, [tr, #208] ; jni_env
0x0009716c: b9401ad7 ldr w23, [x22, #24]
0x00097170: b94022d8 ldr w24, [x22, #32]
0x00097174: b9001ad8 str w24, [x22, #24]
0x00097178: 9102e3f0 add x16, sp, #0xb8 (184)
0x0009717c: 7100003f cmp w1, #0x0 (0)
0x00097180: 9a9f1202 csel x2, x16, xzr, ne
0x00097184: aa0003e1 mov x1, x0
0x00097188: aa1603e0 mov x0, x22
0x0009718c: f940083e ldr lr, [x1, #16]
0x00097190: d63f03c0 blr lr
0x00097194: 885ffe70 ldaxr w16, [tr]
0x00097198: 52ab8011 mov w17, #0x5c000000
0x0009719c: 6b11021f cmp w16, w17
0x000971a0: 54000401 b.ne #+0x80 (addr 0x97220)
0x000971a4: 88107e7f stxr w16, wzr, [tr]
0x000971a8: 35ffff70 cbnz w16, #-0x14 (addr 0x97194)
0x000971ac: f943d270 ldr x16, [tr, #1952] ; 1952
0x000971b0: f904f670 str x16, [tr, #2536] ; 2536
0x000971b4: b9401ad8 ldr w24, [x22, #24]
0x000971b8: b90022d8 str w24, [x22, #32]
0x000971bc: b9001ad7 str w23, [x22, #24]
0x000971c0: f9405270 ldr x16, [tr, #160] ; exception
0x000971c4: b5000350 cbnz x16, #+0x68 (addr 0x9722c)
0x000971c8: a94553f3 ldp tr, x20, [sp, #80]
0x000971cc: a9465bf5 ldp x21, x22, [sp, #96]
0x000971d0: a94763f7 ldp x23, x24, [sp, #112]
0x000971d4: a9486bf9 ldp x25, x26, [sp, #128]
0x000971d8: a94973fb ldp x27, x28, [sp, #144]
0x000971dc: a94a7bfd ldp x29, lr, [sp, #160]
0x000971e0: 6d4127e8 ldp d8, d9, [sp, #16]
0x000971e4: 6d422fea ldp d10, d11, [sp, #32]
0x000971e8: 6d4337ec ldp d12, d13, [sp, #48]
0x000971ec: 6d443fee ldp d14, d15, [sp, #64]
0x000971f0: b9402674 ldr w20, [tr, #36] ; is_gc_marking
0x000971f4: 9102c3ff add sp, sp, #0xb0 (176)
0x000971f8: d65f03c0 ret
0x000971fc: b9400016 ldr w22, [x0]
0x00097200: b94006d0 ldr w16, [x22, #4]
0x00097204: 37effa10 tbnz w16, #29, #-0xc0 (addr 0x97144)
0x00097208: f942fe7e ldr lr, [tr, #1528] ; pJniReadBarrier
0x0009720c: d63f03c0 blr lr
0x00097210: 17ffffcd b #-0xcc (addr 0x97144)
0x00097214: f941967e ldr lr, [tr, #808] ; pJniMethodStart
0x00097218: d63f03c0 blr lr
0x0009721c: 17ffffd3 b #-0xb4 (addr 0x97168)
0x00097220: f9419a7e ldr lr, [tr, #816] ; pJniMethodEnd
0x00097224: d63f03c0 blr lr
0x00097228: 17ffffe3 b #-0x74 (addr 0x971b4)
0x0009722c: f9405260 ldr x0, [tr, #160] ; exception
0x00097230: f942867e ldr lr, [tr, #1288] ; pDeliverException
0x00097234: d63f03c0 blr lr
0x00097238: d4200000 brk #0x0
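@FastNative and fast jni both skip the thread state transition, but their implementations differ in some details. From the JNI trampoline's point of view, you can think of @FastNative as a reworking of fast jni with slightly better performance.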
However, @FastNative does not support synchronized, because the locking for synchronized happens inside the trampoline and may block, which goes against what @FastNative was designed for.
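With thread state transitions covered, let us turn to GC roots during a JNI call. They come mainly from two places:
- Reference arguments passed to the native method.
- Java objects created inside the C/C++ function.
These objects must be treated as GC roots because they may not be referenced from anywhere else, for example when an object is created with new and passed directly as an argument. To find these GC roots during GC, the runtime introduced data structures such as the Local Reference Table, the Global Reference Table, and HandleScope (deprecated), and added corresponding handling to the JNI trampoline. But what if the C/C++ function does not need these Java objects at all? Could the JNI trampoline then skip that handling?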
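The role of a local reference table can be sketched as a per-call stack of roots. The model below is deliberately simplified and is not ART's data structure: LocalRefTableSketch, pushFrame, addLocalRef, and popFrame are hypothetical names, and a plain list stands in for the real table.

```java
import java.util.ArrayList;
import java.util.List;

final class LocalRefTableSketch {
    private final List<Object> entries = new ArrayList<>();

    /** On entry to a native method: remember where this call's references start. */
    int pushFrame() {
        return entries.size();
    }

    /** Registers an object that native code can reach, so a GC would see it as a root. */
    int addLocalRef(Object obj) {
        entries.add(obj);
        return entries.size() - 1;
    }

    /** On return: drop every reference added since the matching pushFrame(). */
    void popFrame(int cookie) {
        entries.subList(cookie, entries.size()).clear();
    }

    /** Snapshot of what a GC would scan as roots in this model. */
    List<Object> roots() {
        return new ArrayList<>(entries);
    }

    public static void main(String[] args) {
        LocalRefTableSketch table = new LocalRefTableSketch();
        int cookie = table.pushFrame();
        table.addLocalRef(new Object());         // argument created with `new`, referenced nowhere else
        table.addLocalRef("created in native");  // object created inside the C/C++ function
        System.out.println("roots during the call: " + table.roots().size());
        table.popFrame(cookie);
        System.out.println("roots after return:    " + table.roots().size());
    }
}
```

The push and pop around the call are the kind of bookkeeping the trampoline has to do on every invocation, which is the work the next optimization tries to avoid.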
Following this line of thinking, Google introduced the @CriticalNative annotation, as shown below.
@CriticalNative
public static native long getNanoTimeAdjustment(long offsetInSeconds);
JNIEXPORT jlong JNICALL VM_getNanoTimeAdjustment(jlong offsetInSeconds) {
return JVM_GetNanoTimeAdjustment(nullptr, nullptr, offsetInSeconds);
}
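Because a JNI function annotated with @CriticalNative cannot use Java objects internally, the annotation can only be applied to static methods, since a non-static method implicitly receives this as an argument. In addition, the corresponding C/C++ function no longer takes the JNIEnv* and jclass parameters.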
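These constraints can be expressed as a small reflection check. The sketch below is an illustration, not part of Android: CriticalNativeCheck and the Sample declarations are hypothetical, and the rule that parameters and the return type must be primitives (and the method not synchronized) is taken from the annotation's documented restrictions rather than from this article. The native methods are only inspected, never called, so no native library is needed.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class CriticalNativeCheck {
    /** Hypothetical declarations used only as inputs to the check. */
    static final class Sample {
        static native long getNanoTimeAdjustment(long offsetInSeconds); // static, primitives only
        native int instanceMethod(int x);                               // implicitly receives `this`
        static native int fdVal(java.io.FileDescriptor fd);             // takes a Java object
    }

    /** Returns true if the method's shape is compatible with the constraints described above. */
    static boolean isEligible(Method method) {
        int modifiers = method.getModifiers();
        if (!Modifier.isNative(modifiers) || !Modifier.isStatic(modifiers)
                || Modifier.isSynchronized(modifiers)) {
            return false;
        }
        if (!method.getReturnType().isPrimitive()) {   // void counts as primitive here
            return false;
        }
        for (Class<?> parameter : method.getParameterTypes()) {
            if (!parameter.isPrimitive()) {
                return false;
            }
        }
        return true;
    }

    /** Convenience wrapper that looks the method up on Sample by name and parameter types. */
    static boolean sampleIsEligible(String name, Class<?>... parameterTypes) {
        try {
            return isEligible(Sample.class.getDeclaredMethod(name, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sampleIsEligible("getNanoTimeAdjustment", long.class));
        System.out.println(sampleIsEligible("instanceMethod", int.class));
        System.out.println(sampleIsEligible("fdVal", java.io.FileDescriptor.class));
    }
}
```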
Let us see what @CriticalNative can do!
10: long jdk.internal.misc.VM.getNanoTimeAdjustment(long) (dex_method_idx=32015)
DEX CODE:
OatMethodOffsets (offset=0x00006fd0)
code_offset: 0x000988c0
OatQuickMethodHeader (offset=0x000988bc)
vmap_table: (offset=0x0006fd81)
QuickMethodFrameInfo
frame_size_in_bytes: 0
core_spill_mask: 0x00000000
fp_spill_mask: 0x00000000
CODE: (code_offset=0x000988c0 size=16)...
0x000988c0: aa0003ef mov x15, x0
0x000988c4: aa0103e0 mov x0, x1
0x000988c8: f94009f0 ldr x16, [x15, #16]
0x000988cc: d61f0200 br x16
As you can see, a method annotated with @CriticalNative compiles down to just 4 lines of assembly, a dramatic improvement.
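The line counts quoted in this article can be cross-checked against the size field that oatdump prints for each method, since every A64 instruction is 4 bytes long. A minimal sketch, with the sizes copied from the dumps above and a class name made up for illustration:

```java
final class OatCodeSize {
    static final int A64_INSTRUCTION_BYTES = 4;

    /** Converts a code size in bytes, as printed by oatdump, to an instruction count. */
    static int instructionCount(int codeSizeBytes) {
        if (codeSizeBytes % A64_INSTRUCTION_BYTES != 0) {
            throw new IllegalArgumentException("size is not a multiple of 4: " + codeSizeBytes);
        }
        return codeSizeBytes / A64_INSTRUCTION_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("chmod / listen (regular, two parameters): " + instructionCount(304));
        System.out.println("fdVal (regular, one parameter):           " + instructionCount(300));
        System.out.println("getArrayBaseOffsetForComponentType:       " + instructionCount(244));
        System.out.println("getNanoTimeAdjustment (@CriticalNative):  " + instructionCount(16));
    }
}
```

The one-instruction difference between the two regular trampolines corresponds to the extra register move for the second argument in the chmod/listen listing.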
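@FastNative and @CriticalNative do help performance, but pay close attention to where they apply. Using them carelessly can cause other problems. For details and caveats, see the official documentation: @FastNative, @CriticalNative. I will not try to improve on it here.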