Background
As performance optimization work moves into deeper waters, it is easy to see that more and more big companies are pushing their optimizations further down the stack. Memory has always been one of the more important metrics in performance work: a mobile app's heap is capped by default at something like 256 MB/512 MB, and long-running resident apps face even more memory pressure. So companies like ByteDance have produced quite a few memory "black tech" solutions, for example moving Bitmap memory into the native layer on devices below Android O (which is exactly what the platform itself does from Android O onward), or outright breaking the heap limit and enlarging the effective heap, as in ByteDance's article 拯救OOM!字节自研 Android 虚拟机内存管理优化黑科技 mSponge.
The most impressive part of mSponge is that on devices below Android O it delivers the same benefit as the "Bitmap in native memory" approach (below Android O the memory behind a Bitmap object is really a Java-level byte array, and that array usually fits the definition of a large object), while it also hides non-Bitmap large objects from the heap-size accounting, thereby breaking through the heap ceiling (the gain depends on how much memory sits in the LargeObjectSpace). Unfortunately the mSponge implementation is not open source. No matter: today we are going to replicate our own mSponge!
The code is already in my mooner project as a sub-feature; a star would be much appreciated.
Theory
The ART heap memory model
(Figure source: 拯救OOM!字节自研 Android 虚拟机内存管理优化黑科技 mSponge)
The figure above shows how the ART VM models the heap. I have covered the heap model before in Art 虚拟机系列 – Heap内存模型 1, so I will not walk through the individual Spaces again here.
The mSponge approach is, in essence, to hide the memory belonging to the heap's LargeObjectSpace from the heap-size accounting, and thereby raise the effective heap ceiling.
Some readers may ask: why can LargeObjectSpace be hidden from the accounting while the other Spaces cannot? It comes down to a property of LargeObjectSpace itself. LargeObjectSpace derives from DiscontinuousSpace, which has one very important characteristic: unlike the other Spaces, its memory is not laid out as one tightly packed, contiguous address range, as the heap diagram above also shows. That is what keeps hiding it from triggering memory errors during GC or allocation. The other Spaces do depend on contiguous, address-related layouts, so hiding them would very easily lead to memory-access faults.
Back to LargeObjectSpace: in ART it actually has two implementations, a free-list based one (FreeListSpace) and a map based one (LargeObjectMapSpace).
FreeListSpace allocates by finding a suitable free chunk in its list and placing the object there:
mirror::Object* FreeListSpace::Alloc(Thread* self, size_t num_bytes, size_t* bytes_allocated,
                                     size_t* usable_size, size_t* bytes_tl_bulk_allocated) {
  MutexLock mu(self, lock_);
  const size_t allocation_size = RoundUp(num_bytes, kAlignment);
  AllocationInfo temp_info;
  temp_info.SetPrevFreeBytes(allocation_size);
  temp_info.SetByteSize(0, false);
  AllocationInfo* new_info;
  // Find the smallest chunk at least num_bytes in size.
  auto it = free_blocks_.lower_bound(&temp_info);
  if (it != free_blocks_.end()) {
    AllocationInfo* info = *it;
    free_blocks_.erase(it);
    // Fit our object in the previous allocation info free space.
    new_info = info->GetPrevFreeInfo();
    // Remove the newly allocated block from the info and update the prev_free_.
    info->SetPrevFreeBytes(info->GetPrevFreeBytes() - allocation_size);
    if (info->GetPrevFreeBytes() > 0) {
      AllocationInfo* new_free = info - info->GetPrevFree();
      new_free->SetPrevFreeBytes(0);
      new_free->SetByteSize(info->GetPrevFreeBytes(), true);
      // If there is remaining space, insert back into the free set.
      free_blocks_.insert(info);
    }
  } else {
    // Try to steal some memory from the free space at the end of the space.
    if (LIKELY(free_end_ >= allocation_size)) {
      // Fit our object at the start of the end free block.
      new_info = GetAllocationInfoForAddress(reinterpret_cast<uintptr_t>(End()) - free_end_);
      free_end_ -= allocation_size;
    } else {
      return nullptr;
    }
  }
  DCHECK(bytes_allocated != nullptr);
  *bytes_allocated = allocation_size;
  if (usable_size != nullptr) {
    *usable_size = allocation_size;
  }
  DCHECK(bytes_tl_bulk_allocated != nullptr);
  *bytes_tl_bulk_allocated = allocation_size;
  // Need to do these inside of the lock.
  ++num_objects_allocated_;
  ++total_objects_allocated_;
  num_bytes_allocated_ += allocation_size;
  total_bytes_allocated_ += allocation_size;
  mirror::Object* obj = reinterpret_cast<mirror::Object*>(GetAddressForAllocationInfo(new_info));
  // We always put our object at the start of the free block, there cannot be another free block
  // before it.
  if (kIsDebugBuild) {
    CheckedCall(mprotect, __FUNCTION__, obj, allocation_size, PROT_READ | PROT_WRITE);
  }
  new_info->SetPrevFreeBytes(0);
  new_info->SetByteSize(allocation_size, false);
  return obj;
}
LargeObjectMapSpace allocates through MemMap memory mapping: take the requested object size, round it up, and map a fresh region for it:
mirror::Object* LargeObjectMapSpace::Alloc(Thread* self, size_t num_bytes,
                                           size_t* bytes_allocated, size_t* usable_size,
                                           size_t* bytes_tl_bulk_allocated) {
  std::string error_msg;
  // Every allocation calls MapAnonymous, which ultimately boils down to mmap.
  MemMap mem_map = MemMap::MapAnonymous("large object space allocation",
                                        num_bytes,
                                        PROT_READ | PROT_WRITE,
                                        /*low_4gb=*/ true,
                                        &error_msg);
  if (UNLIKELY(!mem_map.IsValid())) {
    LOG(WARNING) << "Large object allocation failed: " << error_msg;
    return nullptr;
  }
  mirror::Object* const obj = reinterpret_cast<mirror::Object*>(mem_map.Begin());
  const size_t allocation_size = mem_map.BaseSize();
  MutexLock mu(self, lock_);
  large_objects_.Put(obj, LargeObject {std::move(mem_map), false /* not zygote */});
  DCHECK(bytes_allocated != nullptr);
  if (begin_ == nullptr || begin_ > reinterpret_cast<uint8_t*>(obj)) {
    begin_ = reinterpret_cast<uint8_t*>(obj);
  }
  end_ = std::max(end_, reinterpret_cast<uint8_t*>(obj) + allocation_size);
  *bytes_allocated = allocation_size;
  if (usable_size != nullptr) {
    *usable_size = allocation_size;
  }
  DCHECK(bytes_tl_bulk_allocated != nullptr);
  *bytes_tl_bulk_allocated = allocation_size;
  num_bytes_allocated_ += allocation_size;
  total_bytes_allocated_ += allocation_size;
  ++num_objects_allocated_;
  ++total_objects_allocated_;
  return obj;
}
In ART the default LargeObjectSpace implementation is FreeListSpace. So if we follow 拯救OOM!字节自研 Android 虚拟机内存管理优化黑科技 mSponge literally and hook the LargeObjectMapSpace symbols, the hook simply will not take effect on most phones, so watch out for that! Which raises the question: if the default implementation is not LargeObjectMapSpace, can FreeListSpace have its memory hidden too? Even though FreeListSpace manages its memory with a free list, which does introduce address relationships inside the space, the space as a whole is still isolated from the addresses of the other Spaces. So the mSponge approach still works on FreeListSpace: we can hide its bytes from the accounting without breaking FreeListSpace's own memory management.
What counts as a large object
We have rattled on for a while, but there is one important prerequisite: what exactly is a large object? How does the VM define one? Only large objects end up being allocated in the LargeObjectSpace.
art/runtime/gc/heap-inl.h
inline bool Heap::ShouldAllocLargeObject(ObjPtr<mirror::Class> c, size_t byte_count) const {
  // We need to have a zygote space or else our newly allocated large object can end up in the
  // Zygote resulting in it being prematurely freed.
  // We can only do this for primitive objects since large objects will not be within the card table
  // range. This also means that we rely on SetClass not dirtying the object's card.
  return byte_count >= large_object_threshold_ && (c->IsPrimitiveArray() || c->IsStringClass());
}
As you can see, when the allocation is at least large_object_threshold_ bytes and the class is a primitive array or String, the object is allocated in the LargeObjectSpace.
large_object_threshold_ defaults to 12 KB, i.e. 3 * kPageSize, the size of three pages:
static constexpr size_t kMinLargeObjectThreshold = 3 * kPageSize;
static constexpr size_t kDefaultLargeObjectThreshold = kMinLargeObjectThreshold;
The concrete allocation path lives in Heap::AllocObjectWithAllocator; I will not go through it in this article, but more heap-related articles are coming.
The idea behind mSponge
Let's look at the flow chart the ByteDance authors give:
- Step one: listen for OOM. When an OOM is about to happen, hide the heap's LargeObjectSpace memory from the accounting and intercept this OOM.
- The amount of LargeObjectSpace memory hidden equals the current size of the LargeObjectSpace.
- Retry the memory allocation.
That is the main flow. We also have to take care of a few side effects. First, we need to bypass the VM's post-GC memory verification (judging from the commit history, it exists mainly to validate the correctness of the GC itself); the GC has matured over so many years that the probability of an internal accounting error there is negligible. Second, once a GC finishes, if memory belonging to the LargeObjectSpace was released, we have to compensate the heap counter under certain conditions (because in step two we already subtracted the LargeObjectSpace bytes from the heap).
All right, let's jump straight into the hands-on part!
Hands-on
To implement the approach above we need to complete a few small steps; once they are all done, the whole thing is in place. My test phone runs Android 11, so the symbols hooked below are the Android 11 ones. We will be using inline hooking (through ByteDance's own shadowhook, the original flavor); if inline hooking is still unfamiliar, it is worth reading up on it first.
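Before going through the steps one by one, here is a minimal sketch of the shared state and function-pointer typedefs that the snippets below assume. Everything in it (MSPONGE_TAG, los, lastAllocLOS, the two flags, the *_orig pointers and the typedef names) comes from my replication, not from ART or shadowhook, so treat it as scaffolding you are free to rename:
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSPONGE_TAG "msponge"

// Placeholder for ART's AllocatorType enum; all we rely on here is that it has int width.
enum AllocatorType { kAllocatorTypePlaceholder = 0 };

// Function-pointer types matching the hooked ART methods (the first parameter is the C++ `this`).
typedef void *(*los_alloc)(void *thiz, void *self, size_t num_bytes, size_t *bytes_allocated,
                           size_t *usable_size, size_t *bytes_tl_bulk_allocated);
typedef void (*out_of_memory)(void *heap, void *self, size_t byte_count,
                              enum AllocatorType allocator_type);
typedef void *(*alloc_internal_with_gc_type)(void *heap, void *self, enum AllocatorType allocator,
                                             bool instrumented, size_t alloc_size,
                                             size_t *bytes_allocated, size_t *usable_size,
                                             size_t *bytes_tl_bulk_allocated, void *klass);
typedef void (*grow_for_utilization)(void *heap, void *collector_ran,
                                     uint64_t bytes_allocated_before_gc);

// Original entry points, filled in by shadowhook when each hook is installed.
static void *los_alloc_orig = NULL;
static void *throw_out_of_memory_error_orig = NULL;
static void *alloc_internal_with_gc_orig = NULL;
static void *grow_for_utilization_orig = NULL;

// Shared state used by the proxies.
static void *los = NULL;                          // FreeListSpace instance captured in its Alloc hook
static uint64_t lastAllocLOS = 0;                 // LOS bytes already hidden from the heap counter
static bool sFindThrowOutOfMemoryError = false;   // an OOM path was reached
static bool sForceAllocateInternalWithGc = false; // retrying after the LOS bytes were hidden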
Getting the current LargeObjectSpace size
LargeObjectSpace exposes a method that returns the number of bytes the space currently holds:
uint64_t GetBytesAllocated() override {
  MutexLock mu(Thread::Current(), lock_);
  return num_bytes_allocated_;
}
So we can call it through symbol resolution: dlopen the shared library to get a handle, then dlsym the specific symbol to obtain the function address. dlopen on system libraries has, however, been locked down by Google, so we cannot call it directly (we discussed this, and the ways around it, in an earlier article). Here we can simply use the dlopen that shadowhook provides:
void *handle = shadowhook_dlopen("libart.so");
void *func = shadowhook_dlsym(handle,
"_ZN3art2gc5space16LargeObjectSpace17GetBytesAllocatedEv");
_ZN3art2gc5space16LargeObjectSpace17GetBytesAllocatedEv is the mangled symbol of GetBytesAllocated in libart.
As we saw above, this is an instance method that returns uint64_t, so the resolved address is called through a function pointer whose first (implicit this) argument is the LargeObjectSpace instance:
((uint64_t (*)(void *)) func)
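The snippets further down call a helper named get_num_bytes_allocated(los). That helper is not a shadowhook or ART API; it is just a thin wrapper of my own around the symbol resolved above. A minimal sketch (note the uint64_t return type; an int cast would truncate large values):
#include <stdint.h>
#include "shadowhook.h"

// Thin wrapper around art::gc::space::LargeObjectSpace::GetBytesAllocated().
// `space` is the LargeObjectSpace instance we capture in the next step.
static uint64_t get_num_bytes_allocated(void *space) {
    static uint64_t (*fn)(void *) = NULL;
    if (fn == NULL) {
        void *handle = shadowhook_dlopen("libart.so");
        if (handle != NULL) {
            fn = (uint64_t (*)(void *)) shadowhook_dlsym(
                    handle, "_ZN3art2gc5space16LargeObjectSpace17GetBytesAllocatedEv");
        }
    }
    return (fn != NULL && space != NULL) ? fn(space) : 0;
}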
Getting the LargeObjectSpace object
As mentioned above, LargeObjectSpace in ART is realized through its subclasses, and the default one is FreeListSpace. So we can hook FreeListSpace's allocation entry point, the Alloc method, and grab the FreeListSpace pointer whenever an allocation happens.
The symbol of FreeListSpace::Alloc is:
_ZN3art2gc5space13FreeListSpace5AllocEPNS_6ThreadEmPmS5_S5_
Once the hook is in place we hold the FreeListSpace pointer, which makes the later GetBytesAllocated calls possible:
// Proxy for FreeListSpace::Alloc: forward to the original, but remember the
// FreeListSpace instance ("thiz") so we can query its size later.
void *los_alloc_proxy(void *thiz, void *self, size_t num_bytes, size_t *bytes_allocated,
                      size_t *usable_size,
                      size_t *bytes_tl_bulk_allocated) {
    void *obj = ((los_alloc) los_alloc_orig)(thiz, self, num_bytes, bytes_allocated,
                                             usable_size,
                                             bytes_tl_bulk_allocated);
    los = thiz;
    return obj;
}
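The proxy still has to be installed. Below is a minimal sketch using shadowhook's shadowhook_hook_sym_name (hook_los_alloc is my own helper name; shadowhook_init is assumed to have been called already, e.g. from your Application class). As discussed in the theory part, FreeListSpace is the default implementation; on devices where LargeObjectMapSpace is used instead, its Alloc symbol can be hooked with the same proxy, but I have left that line commented out because that mangled name is my own reconstruction and should be verified against the device's libart.so first:
#include <android/log.h>
#include "shadowhook.h"

// Install the FreeListSpace::Alloc hook so the proxy above can capture the LOS pointer.
static void hook_los_alloc() {
    void *stub = shadowhook_hook_sym_name(
            "libart.so",
            "_ZN3art2gc5space13FreeListSpace5AllocEPNS_6ThreadEmPmS5_S5_",
            (void *) los_alloc_proxy,
            (void **) &los_alloc_orig);
    if (stub == NULL) {
        __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG,
                            "hook FreeListSpace::Alloc failed: %d", shadowhook_get_errno());
    }
    // Fallback for devices whose default LOS implementation is LargeObjectMapSpace
    // (verify this mangled name against your libart.so before enabling it):
    // shadowhook_hook_sym_name("libart.so",
    //         "_ZN3art2gc5space19LargeObjectMapSpace5AllocEPNS_6ThreadEmPmS5_S5_",
    //         (void *) los_alloc_proxy, (void **) &los_alloc_orig);
}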
Subtracting the LargeObjectSpace size from the heap counter
void Heap::RecordFree(uint64_t freed_objects, int64_t freed_bytes) {
  ......
  // Note: This relies on 2s complement for handling negative freed_bytes.
  // After a free, the VM's overall heap usage counter has to be updated accordingly.
  num_bytes_allocated_.fetch_sub(static_cast<ssize_t>(freed_bytes), std::memory_order_relaxed);
  ......
}
RecordFree is what lets us shrink the heap's byte counter: freed_bytes is the number of bytes to subtract, and freed_objects is the number of objects freed (it only feeds runtime statistics). Since we are not actually freeing any object, we pass a placeholder such as -1 for freed_objects, and the byte count we subtract is exactly the size held by our LargeObjectSpace.
// Resolve Heap::RecordFree by symbol and call it to hide freeSize bytes from the heap counter.
void *handle = shadowhook_dlopen("libart.so");
void *func = shadowhook_dlsym(handle, "_ZN3art2gc4Heap10RecordFreeEml");
((void (*)(void *, uint64_t, int64_t)) func)(heap, -1, freeSize);
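Just like get_num_bytes_allocated, the OOM proxy in the next step goes through a small wrapper named call_record_free; that name is mine, not an ART or shadowhook API. A minimal sketch: freed_bytes is signed, so a positive delta hides bytes from the heap counter and a negative delta adds them back, which is exactly what the compensation step later relies on.
#include <stdint.h>
#include "shadowhook.h"

// Thin wrapper around art::gc::Heap::RecordFree(uint64_t freed_objects, int64_t freed_bytes).
static void call_record_free(void *heap, int64_t freed_bytes) {
    static void (*fn)(void *, uint64_t, int64_t) = NULL;
    if (fn == NULL) {
        void *handle = shadowhook_dlopen("libart.so");
        if (handle != NULL) {
            fn = (void (*)(void *, uint64_t, int64_t)) shadowhook_dlsym(
                    handle, "_ZN3art2gc4Heap10RecordFreeEml");
        }
    }
    if (fn != NULL && heap != NULL) {
        // We did not actually free any object, so pass the placeholder -1 as the object count
        // and only adjust the byte counter.
        fn(heap, (uint64_t) -1, freed_bytes);
    }
}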
Listening for OOM
The plan also needs to detect when an OOM is about to happen, intercept it, and let a GC pass sort things out instead. In the diagram below, the left side is the normal OOM flow and the right side is the flow of our approach.
To detect the OOM we can inline-hook this symbol:
_ZN3art2gc4Heap21ThrowOutOfMemoryErrorEPNS_6ThreadEmNS0_13AllocatorTypeE
void throw_out_of_memory_error_proxy(void *heap, void *self, size_t byte_count,
                                     enum AllocatorType allocator_type) {
    __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "OOM reached, tid=%d, forced=%d",
                        pthread_gettid_np(pthread_self()), sForceAllocateInternalWithGc);
    // An OOM path has been reached; remember it.
    sFindThrowOutOfMemoryError = true;
    // If this is not the OOM raised after we already hid the LOS bytes, hide them now;
    // otherwise fall through and let the real OOM be thrown.
    if (!sForceAllocateInternalWithGc) {
        __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "OOM reached, intercepting it");
        if (los != NULL) {
            uint64_t currentAlloc = get_num_bytes_allocated(los);
            if (currentAlloc > lastAllocLOS) {
                // Hide the newly accumulated LOS bytes from the heap counter and skip this OOM.
                call_record_free(heap, currentAlloc - lastAllocLOS);
                __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "hidden this round: %lu",
                                    currentAlloc - lastAllocLOS);
                lastAllocLOS = currentAlloc;
                return;
            }
        }
        .....
    }
    // Interception is not allowed here; call the original function and let the OOM be thrown.
    __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "OOM interception skipped");
    ((out_of_memory) throw_out_of_memory_error_orig)(heap, self, byte_count, allocator_type);
}
AllocateInternalWithGc
Note, though, that an allocation failure does not immediately produce an OOM: the VM first tries AllocateInternalWithGc, which runs GC over the various Spaces and retries the allocation, and only if that still fails does ThrowOutOfMemoryError throw the real OOM exception. So we also hook AllocateInternalWithGc and check whether the object it returns is null; a null result means ThrowOutOfMemoryError is about to be reached and a real OOM thrown.
The symbol of this method is:
_ZN3art2gc4Heap22AllocateInternalWithGcEPNS_6ThreadENS0_13AllocatorTypeEbmPmS5_S5_PNS_6ObjPtrINS_6mirror5ClassEEE
void *allocate_internal_with_gc_proxy(void *heap, void *self,
                                      enum AllocatorType allocator,
                                      bool instrumented,
                                      size_t alloc_size,
                                      size_t *bytes_allocated,
                                      size_t *usable_size,
                                      size_t *bytes_tl_bulk_allocated,
                                      void *klass) {
    __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "entering AllocateInternalWithGc");
    sForceAllocateInternalWithGc = false;
    void *object = ((alloc_internal_with_gc_type) alloc_internal_with_gc_orig)(heap, self,
                                                                               allocator,
                                                                               instrumented,
                                                                               alloc_size,
                                                                               bytes_allocated,
                                                                               usable_size,
                                                                               bytes_tl_bulk_allocated,
                                                                               klass);
    // The allocation returned null and an OOM path was reached.
    if (object == NULL && sFindThrowOutOfMemoryError) {
        // Even after the GC the VM could not find room, so hide the LOS bytes and retry.
        __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG,
                            "not enough memory after GC, applying heap hiding and retrying");
        sForceAllocateInternalWithGc = true;
        object = ((alloc_internal_with_gc_type) alloc_internal_with_gc_orig)(heap, self, allocator,
                                                                             instrumented,
                                                                             alloc_size,
                                                                             bytes_allocated,
                                                                             usable_size,
                                                                             bytes_tl_bulk_allocated,
                                                                             klass);
        // If this GC released memory that belongs to the LargeObjectSpace, the heap counter has to
        // be compensated, because those bytes were already hidden from it earlier.
        if (los != NULL) {
            uint64_t currentAllocLOS = get_num_bytes_allocated(los);
            __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "current LOS %lu : last LOS %lu",
                                currentAllocLOS, lastAllocLOS);
            if (currentAllocLOS < lastAllocLOS) {
                // Negative delta: add the doubly subtracted bytes back to the heap counter.
                call_record_free(heap, (int64_t) (currentAllocLOS - lastAllocLOS));
                __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "LOS compensation: %ld",
                                    (int64_t) (currentAllocLOS - lastAllocLOS));
            }
        }
        sForceAllocateInternalWithGc = false;
    }
    return object;
}
Post-GC memory verification
Because we hid the LargeObjectSpace bytes from the accounting, the heap sizes before and after a GC no longer add up, and execution runs into this check:
void Heap::GrowForUtilization(collector::GarbageCollector* collector_ran,
                              uint64_t bytes_allocated_before_gc) {
  // After the GC finishes, read the current heap usage again.
  const uint64_t bytes_allocated = GetBytesAllocated();
  ......
  if (!ignore_max_footprint_) {
    const uint64_t freed_bytes = current_gc_iteration_.GetFreedBytes() +
                                 current_gc_iteration_.GetFreedLargeObjectBytes() +
                                 current_gc_iteration_.GetFreedRevokeBytes();
    // The post-GC usage plus the bytes freed by this GC should be at least the pre-GC usage;
    // if not, this CHECK aborts with a fatal error!
    CHECK_GE(bytes_allocated + freed_bytes, bytes_allocated_before_gc);
  }
  ......
}
This check mainly exists to catch accounting anomalies introduced around a GC. In practice the GC has been around for many years and the anomalies it would catch are practically nonexistent, and according to ByteDance's own validation, hard-wiring bytes_allocated_before_gc to 0 has no observable impact. So we directly hook the GrowForUtilization symbol and pass 0 for bytes_allocated_before_gc, which keeps the CHECK_GE from ever tripping.
void grow_for_utilization_proxy(void *heap, void *collector_ran,
                                uint64_t bytes_allocated_before_gc) {
    // Always report 0 as the pre-GC usage so the CHECK_GE above can never fail.
    ((grow_for_utilization) grow_for_utilization_orig)(heap, collector_ran, 0);
}
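To tie everything together, here is a minimal sketch of an init routine that installs the remaining Heap hooks. The two mangled symbols are the ones quoted above for Android 11; msponge_init and hook_heap_symbol are names I made up, and shadowhook_init is assumed to have run beforehand. This article does not quote GrowForUtilization's mangled name, so look it up in your device's libart.so (e.g. with readelf) and install it the same way:
#include <stdbool.h>
#include <android/log.h>
#include "shadowhook.h"

static bool hook_heap_symbol(const char *sym, void *proxy, void **orig) {
    void *stub = shadowhook_hook_sym_name("libart.so", sym, proxy, orig);
    if (stub == NULL) {
        __android_log_print(ANDROID_LOG_ERROR, MSPONGE_TAG, "hook %s failed: %d",
                            sym, shadowhook_get_errno());
        return false;
    }
    return true;
}

void msponge_init() {
    // Capture the LargeObjectSpace pointer (see "Getting the LargeObjectSpace object").
    hook_los_alloc();
    // Intercept OOMs.
    hook_heap_symbol("_ZN3art2gc4Heap21ThrowOutOfMemoryErrorEPNS_6ThreadEmNS0_13AllocatorTypeE",
                     (void *) throw_out_of_memory_error_proxy,
                     (void **) &throw_out_of_memory_error_orig);
    // Retry the allocation after hiding the LOS bytes, and compensate the heap after GC.
    hook_heap_symbol(
            "_ZN3art2gc4Heap22AllocateInternalWithGcEPNS_6ThreadENS0_13AllocatorTypeEbmPmS5_S5_PNS_6ObjPtrINS_6mirror5ClassEEE",
            (void *) allocate_internal_with_gc_proxy,
            (void **) &alloc_internal_with_gc_orig);
    // GrowForUtilization: hook its symbol the same way with grow_for_utilization_proxy,
    // which always reports 0 as the pre-GC usage (symbol not quoted in this article).
}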
Summary
With the steps above broken down and completed, we have our own working version of the mSponge approach, with a few adjustments made along the way based on the theory discussed earlier.
If, after the hands-on part, some details are still unclear, or you are itching to see the effect for yourself: no problem, it is already open source as a sub-feature of my mooner project, so go and try it! github.com/TestPlanB/m…
Through the demo you can see very directly what the msponge approach brings; it really is powerful [doge]. Don't forget to star!
Finally, for anyone who wants to chat more, you can also run into me from time to time at community events such as bagutree or offline salons; if you have questions, just grab me and ask! Off I go!