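This is the second article in the performance optimization series on the Matrix framework. In this performance optimization column I walk through the code of the Matrix APM framework end to end. Performance optimization is something senior Android engineers are expected to know and a frequent interview topic; readers who are interested can find all my posts on Matrix on my profile page.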
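Preface
Matrix monitors IO in four areas:
- IO operations performed on the main thread
- Buffers that are too small
- Repeated reads of the same file
- Leaks of Closeable resources that are never closed
IOCanaryPlugin is the entry point; internally, IOCanaryCore does the actual work.
The start method
Installs hooks according to the configuration: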
// file IO hooks
if (ioConfig.isDetectFileIOInMainThread() || ioConfig.isDetectFileIOBufferTooSmall() || ioConfig.isDetectFileIORepeatReadSameFile()) {
    IOCanaryJniBridge.install(ioConfig, this);
}
// Closeable leak hook
if (ioConfig.isDetectIOClosableLeak()) {
    this.mCloseGuardHooker = new CloseGuardHooker(this);
    this.mCloseGuardHooker.hook();
}
The stop method
Removes the hooks:
if (this.mCloseGuardHooker != null) {
    this.mCloseGuardHooker.unHook();
}
IOCanaryJniBridge.uninstall();
IOCanaryJniBridge.install()
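Installing the native hooks takes several steps: loading the .so and configuring what gets hooked. These steps correspond to the methods below.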
loadJni
System.loadLibrary("io-canary")
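Calling System.loadLibrary("io-canary") enters the JNI_OnLoad function in io_canary_jni.cc, which does two key things: 1. it looks up some information from the Java layer, and 2. it sets a callback used to report monitoring results.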
InitJniEnv()
static bool InitJniEnv(JavaVM *vm) {
    ....
    jclass temp_cls = env->FindClass("com/tencent/matrix/iocanary/core/IOCanaryJniBridge");
    ....
}
SetIssuedCallback()
iocanary::IOCanary::Get().SetIssuedCallback(OnIssuePublish)
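OnIssuePublish takes the reported information, assembles each item into a Java-layer IOIssue object, adds it to a List, and then hands the list to the Java layer by calling onIssuePublish on the IOCanaryJniBridge class. The two Java classes involved are: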
com/tencent/matrix/iocanary/core/IOIssue
com/tencent/matrix/iocanary/core/IOCanaryJniBridge
enableDetector
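The predefined detector type is passed down to the native layer to enable IO monitoring of that type:
iocanary::IOCanary::Get().RegisterDetector(static_cast<DetectorType>(detector_type));
As the code shows, this ends up adding the corresponding detector to the detectors_ vector. Every detector is a subclass of FileIODetector: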
- FileIOMainThreadDetector
- FileIORepeatReadDetector
- FileIOSmallBufferDetector
void IOCanary::RegisterDetector(DetectorType type) {
    switch (type) {
        case DetectorType::kDetectorMainThreadIO:
            detectors_.push_back(new FileIOMainThreadDetector());
            break;
        case DetectorType::kDetectorSmallBuffer:
            detectors_.push_back(new FileIOSmallBufferDetector());
            break;
        case DetectorType::kDetectorRepeatRead:
            detectors_.push_back(new FileIORepeatReadDetector());
            break;
        default:
            break;
    }
}
setConfig
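Sets the threshold for each kind of IO monitoring and stores it in the configs_ array. The settings and their default values are listed below; an issue is reported when a threshold is crossed.
- kMainThreadThreshold = 500 milliseconds
- kSmallBufferThreshold = 4096 bytes
- kRepeatReadThreshold = 20 reads
iocanary::IOCanary::Get().SetConfig(static_cast<IOCanaryConfigKey>(key), val);
void IOCanaryEnv::SetConfig(IOCanaryConfigKey key, long val) {
    if (key >= IOCanaryConfigKey::kConfigKeysLen) {
        return;
    }
    configs_[key] = val;
}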
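To make the bounds check concrete, here is a minimal sketch of a fixed-size config table indexed by an enum, where the last enumerator doubles as the array length. ConfigKey, ConfigTable, Set, and Get are hypothetical names for illustration, not the Matrix types; the defaults are the ones listed above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical keys; kConfigKeysLen is not a real setting, it only marks
// the number of valid keys, like the bound checked in SetConfig above.
enum ConfigKey {
    kMainThreadThreshold = 0,
    kSmallBufferThreshold,
    kRepeatReadThreshold,
    kConfigKeysLen
};

class ConfigTable {
public:
    ConfigTable() {
        configs_[kMainThreadThreshold] = 500;
        configs_[kSmallBufferThreshold] = 4096;
        configs_[kRepeatReadThreshold] = 20;
    }

    // Returns false and leaves the table untouched for out-of-range keys.
    bool Set(int key, long val) {
        if (key < 0 || key >= kConfigKeysLen) {
            return false;
        }
        configs_[static_cast<std::size_t>(key)] = val;
        return true;
    }

    // The caller is expected to pass a valid key (anything but kConfigKeysLen).
    long Get(ConfigKey key) const { return configs_[key]; }

private:
    std::array<long, kConfigKeysLen> configs_{};
};
```

With this layout an unknown key coming from the Java side is ignored instead of writing past the end of the array.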
dohook
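dohook is the core method. With the configuration in place, it hooks the target functions. The .so files that get hooked are:
const static char* TARGET_MODULES[] = {
    "libopenjdkjvm.so",
    "libjavacore.so",
    "libopenjdk.so"
};
For background on GOT hooking, see iQIYI's open-source framework xHook; the details are not covered here. github.com/iqiyi/xHook…
The hooked functions are:
open, open64, close, android_fdsan_close_with_tag
If the library is libjavacore.so, it also tries to hook these functions inside it:
read, __read_chk, write, __write_chk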
open
When a file is opened, the call lands in the installed proxy, ProxyOpen. It checks whether the call is on the main thread: if not, it only forwards to the original open; if so, it also runs DoProxyOpenLogic after a successful open.
int ProxyOpen(const char *pathname, int flags, mode_t mode) {
    if(!IsMainThread()) {
        return original_open(pathname, flags, mode);
    }
    int ret = original_open(pathname, flags, mode);
    if (ret != -1) {
        DoProxyOpenLogic(pathname, flags, mode, ret);
    }
    return ret;
}
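The proxy pattern can be illustrated without any hooking library. The sketch below is a simulation, not how xHook patches the GOT: a plain function pointer (g_open_slot) stands in for the GOT entry, and RealOpen, InstallHook, and g_opened are hypothetical names. It only shows the idea of saving the original pointer, swapping in a proxy, and always forwarding to the original.

```cpp
#include <string>
#include <vector>

// Stand-in for a GOT slot: callers reach "open" only through this pointer.
using OpenFn = int (*)(const char* path);

// Pretend implementation that always succeeds with descriptor 3.
int RealOpen(const char* /*path*/) { return 3; }

OpenFn g_open_slot = RealOpen;      // the slot callers go through
OpenFn g_original_open = nullptr;   // saved when the hook is installed
std::vector<std::string> g_opened;  // paths the proxy has observed

// Forwards to the original and records the path on success.
int ProxyOpen(const char* path) {
    int fd = g_original_open(path);
    if (fd != -1) {
        g_opened.push_back(path);
    }
    return fd;
}

// Swaps the slot to the proxy and remembers the original; safe to call twice.
void InstallHook() {
    if (g_open_slot == ProxyOpen) {
        return;
    }
    g_original_open = g_open_slot;
    g_open_slot = ProxyOpen;
}
```

After InstallHook(), a call through g_open_slot behaves exactly like the original from the caller's point of view, while the proxy gets a chance to record what happened.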
DoProxyOpenLogic captures the current stack trace:
static void DoProxyOpenLogic(const char *pathname, int flags, mode_t mode, int ret) {
    ....
    // kJavaBridgeClass = com/tencent/matrix/iocanary/core/IOCanaryJniBridge
    // kMethodIDGetJavaContext = getJavaContext(), which returns a JavaContext.
    // JavaContext is an inner class with a stack field; the stack trace is
    // captured when the JavaContext is created and stored in that field.
    jobject java_context_obj = env->CallStaticObjectMethod(kJavaBridgeClass, kMethodIDGetJavaContext);
    if (NULL == java_context_obj) {
        return;
    }
    // stack trace
    jstring j_stack = (jstring) env->GetObjectField(java_context_obj, kFieldIDStack);
    jstring j_thread_name = (jstring) env->GetObjectField(java_context_obj, kFieldIDThreadName);
    // current thread name
    char* thread_name = jstringToChars(env, j_thread_name);
    char* stack = jstringToChars(env, j_stack);
    JavaContext java_context(GetCurrentThreadId(), thread_name == NULL ? "" : thread_name, stack == NULL ? "" : stack);
    ....
    // pathname is the file being opened; java_context holds the stack trace and thread name.
    // flags and mode are the arguments passed to the open call; ret is its return value.
    // Control now passes to IOCanary::OnOpen.
    iocanary::IOCanary::Get().OnOpen(pathname, flags, mode, ret, java_context);
    ....
}
open64
Same as open().
close
On the main thread, the call goes on to IOCanary::OnClose:
int ProxyClose(int fd) {
    if(!IsMainThread()) {
        return original_close(fd);
    }
    int ret = original_close(fd);
    iocanary::IOCanary::Get().OnClose(fd, ret);
    return ret;
}
android_fdsan_close_with_tag
Same as close().
read
The proxy measures how long the read takes and passes that to IOCanary::OnRead:
ssize_t ProxyRead(int fd, void *buf, size_t size) {
    if(!IsMainThread()) {
        return original_read(fd, buf, size);
    }
    // current time
    int64_t start = GetTickCountMicros();
    // call the original read
    size_t ret = original_read(fd, buf, size);
    // elapsed time of the read
    long read_cost_us = GetTickCountMicros() - start;
    // pass the information to IOCanary::OnRead
    iocanary::IOCanary::Get().OnRead(fd, buf, size, ret, read_cost_us);
    return ret;
}
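The timing step can be tried in isolation. The sketch below is a generic stand-in, not the Matrix code: TimedCall is a hypothetical helper, and it uses std::chrono::steady_clock from the standard library where the proxy uses GetTickCountMicros.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>

// Runs fn(), returns its result, and stores the elapsed time in
// microseconds in cost_us. fn must return a value (not void).
template <typename Fn>
auto TimedCall(Fn&& fn, std::int64_t& cost_us) -> decltype(fn()) {
    const auto start = std::chrono::steady_clock::now();
    auto ret = std::forward<Fn>(fn)();
    const auto elapsed = std::chrono::steady_clock::now() - start;
    cost_us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    return ret;
}
```

A monotonic clock is the safer choice for measuring a duration, since a wall clock can jump if the system time is adjusted between the two samples.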
__read_chk
Same as read.
write
ssize_t ProxyWrite(int fd, const void *buf, size_t size) {
    if(!IsMainThread()) {
        return original_write(fd, buf, size);
    }
    // current time
    int64_t start = GetTickCountMicros();
    // call the original write
    size_t ret = original_write(fd, buf, size);
    // elapsed time of the write
    long write_cost_us = GetTickCountMicros() - start;
    // pass the information to IOCanary::OnWrite
    iocanary::IOCanary::Get().OnWrite(fd, buf, size, ret, write_cost_us);
    return ret;
}
__write_chk
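Same as write.
IOCanary
As the flow of open, close, read, and write above shows, everything converges on the C++ class IOCanary. Stepping into its methods shows that IOCanary in turn delegates to the IOInfoCollector class.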
OnOpen
void IOCanary::OnOpen(const char *pathname, int flags, mode_t mode,
                      int open_ret, const JavaContext& java_context) {
    collector_.OnOpen(pathname, flags, mode, open_ret, java_context);
}
The logic is straightforward: the file name and related information are assembled into an info object, which is stored in the C++ info_map_ (a std::unordered_map) with the file descriptor as key and the info as value. The stored information is used later, as we will see. After a file is opened, the next step is a read or a write, so let's look at read next.
void IOInfoCollector::OnOpen(const char *pathname, int flags, mode_t mode
        , int open_ret, const JavaContext& java_context) {
    if (open_ret == -1) {
        return;
    }
    // open_ret is the return value of open, i.e. the file descriptor of the file just opened.
    // If an entry already exists for it, return.
    if (info_map_.find(open_ret) != info_map_.end()) {
        return;
    }
    std::shared_ptr<IOInfo> info = std::make_shared<IOInfo>(pathname, java_context);
    info_map_.insert(std::make_pair(open_ret, info));
}
OnRead
void IOCanary::OnRead(int fd, const void *buf, size_t size,
                      ssize_t read_ret, long read_cost) {
    collector_.OnRead(fd, buf, size, read_ret, read_cost);
}
The key work appears to happen in CountRWInfo. Its name suggests both reads and writes go through it, so we set it aside for now and come back to it after looking at write.
void IOInfoCollector::OnRead(int fd, const void *buf, size_t size,
                             ssize_t read_ret, long read_cost) {
    if (read_ret == -1 || read_cost < 0) {
        return;
    }
    if (info_map_.find(fd) == info_map_.end()) {
        return;
    }
    CountRWInfo(fd, FileOpType::kRead, size, read_cost);
}
OnWrite
void IOCanary::OnWrite(int fd, const void *buf, size_t size,
                       ssize_t write_ret, long write_cost) {
    collector_.OnWrite(fd, buf, size, write_ret, write_cost);
}
As with read, this ends up in CountRWInfo:
void IOInfoCollector::OnWrite(int fd, const void *buf, size_t size,
                              ssize_t write_ret, long write_cost) {
    if (write_ret == -1 || write_cost < 0) {
        return;
    }
    if (info_map_.find(fd) == info_map_.end()) {
        return;
    }
    CountRWInfo(fd, FileOpType::kWrite, size, write_cost);
}
CountRWInfo
CountRWInfo accumulates per-file information in the IOInfo class, including:
- Number of reads (writes)
- Total bytes read (written)
- Total time spent reading (writing)
- Longest single read (write)
- Accumulated time of consecutive reads (writes) spaced less than 8000 microseconds apart (the maximum such run is kept)
- Buffer size
- Operation type: read or write
While a file is being read or written, this method is called repeatedly and keeps the information up to date. When reading or writing finishes, the final values are in place and close is called.
void IOInfoCollector::CountRWInfo(int fd, const FileOpType &fileOpType, long op_size, long rw_cost) {
    if (info_map_.find(fd) == info_map_.end()) {
        return;
    }
    const int64_t now = GetSysTimeMicros();
    // number of reads/writes
    info_map_[fd]->op_cnt_ ++;
    // total bytes read/written
    info_map_[fd]->op_size_ += op_size;
    // total time spent reading/writing
    info_map_[fd]->rw_cost_us_ += rw_cost;
    // longest single read/write
    if (rw_cost > info_map_[fd]->max_once_rw_cost_time_s_) {
        info_map_[fd]->max_once_rw_cost_time_s_ = rw_cost;
    }
    // accumulated time of reads/writes spaced less than 8000 microseconds apart
    if (info_map_[fd]->last_rw_time_s_ > 0 && (now - info_map_[fd]->last_rw_time_s_) < kContinualThreshold) {
        info_map_[fd]->current_continual_rw_time_s_ += rw_cost;
    } else {
        info_map_[fd]->current_continual_rw_time_s_ = rw_cost;
    }
    if (info_map_[fd]->current_continual_rw_time_s_ > info_map_[fd]->max_continual_rw_cost_time_s_) {
        info_map_[fd]->max_continual_rw_cost_time_s_ = info_map_[fd]->current_continual_rw_time_s_;
    }
    info_map_[fd]->last_rw_time_s_ = now;
    // buffer size
    if (info_map_[fd]->buffer_size_ < op_size) {
        info_map_[fd]->buffer_size_ = op_size;
    }
    // operation type: read or write
    if (info_map_[fd]->op_type_ == FileOpType::kInit) {
        info_map_[fd]->op_type_ = fileOpType;
    }
}
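A worked example helps with the "continual" bookkeeping. The sketch below is a simplified stand-in for the logic above: RwStats and Record are hypothetical names, the timestamp is passed in rather than read from a clock so the behaviour can be checked by hand, and the 8000 microsecond gap is the value described above.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for IOInfo. All times are in microseconds.
struct RwStats {
    int op_cnt = 0;
    long op_size = 0;               // total bytes read or written
    long rw_cost_us = 0;            // total time spent in read/write
    long max_once_us = 0;           // longest single operation
    long current_continual_us = 0;  // cost of the current back-to-back run
    long max_continual_us = 0;      // longest back-to-back run seen so far
    std::int64_t last_rw_time_us = 0;
    long buffer_size = 0;           // largest single request size
};

constexpr std::int64_t kContinualThresholdUs = 8000;

// Folds one read/write that finished at now_us into the running stats.
void Record(RwStats& s, std::int64_t now_us, long op_size, long cost_us) {
    s.op_cnt++;
    s.op_size += op_size;
    s.rw_cost_us += cost_us;
    if (cost_us > s.max_once_us) {
        s.max_once_us = cost_us;
    }
    // Extend the current run only if the previous operation was recent enough.
    if (s.last_rw_time_us > 0 && now_us - s.last_rw_time_us < kContinualThresholdUs) {
        s.current_continual_us += cost_us;
    } else {
        s.current_continual_us = cost_us;
    }
    if (s.current_continual_us > s.max_continual_us) {
        s.max_continual_us = s.current_continual_us;
    }
    s.last_rw_time_us = now_us;
    if (op_size > s.buffer_size) {
        s.buffer_size = op_size;
    }
}
```

Suppose three operations finish at 1000, 3000, and 20000 microseconds and cost 100, 200, and 50 microseconds. The first two are 2000 microseconds apart, so they form one run costing 300. The third comes 17000 microseconds later, so it starts a new run costing 50, and the longest run stays at 300.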
OnClose
void IOCanary::OnClose(int fd, int close_ret) {
    std::shared_ptr<IOInfo> info = collector_.OnClose(fd, close_ret);
    if (info == nullptr) {
        return;
    }
    OfferFileIOInfo(info);
}
On close, the collector records the total duration and the file size and returns the info, which is then passed to OfferFileIOInfo:
std::shared_ptr<IOInfo> IOInfoCollector::OnClose(int fd, int close_ret) {
    if (info_map_.find(fd) == info_map_.end()) {
        return nullptr;
    }
    // total time from open to close
    info_map_[fd]->total_cost_s_ = GetSysTimeMicros() - info_map_[fd]->start_time_s_;
    // file size
    info_map_[fd]->file_size_ = GetFileSize(info_map_[fd]->path_.c_str());
    std::shared_ptr<IOInfo> info = info_map_[fd];
    // remove from the map
    info_map_.erase(fd);
    // return the info
    return info;
}
OfferFileIOInfo pushes the info onto a queue and calls notify_one to wake the consumer. This is the producer-consumer pattern: the producer puts what it produces on the queue, and the consumer takes items off the queue and processes them. Let's find the consumer.
void IOCanary::OfferFileIOInfo(std::shared_ptr<IOInfo> file_io_info) {
    std::unique_lock<std::mutex> lock(queue_mutex_);
    queue_.push_back(file_io_info);
    queue_cv_.notify_one();
    lock.unlock();
}
As shown here, IOCanary starts a thread when it is constructed:
IOCanary::IOCanary() {
    exit_ = false;
    std::thread detect_thread(&IOCanary::Detect, this);
    detect_thread.detach();
}
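The thread runs an infinite loop that keeps taking info objects from the queue; if the queue is empty, the thread blocks and waits.
Earlier we saw that once an info is ready it is pushed onto the queue and the consumer is notified. At that point the consumer thread wakes up inside TakeFileIOInfo, takes one info, and hands it to each detector for inspection.
After detection, the items that meet the conditions are added to published_issues, and issued_callback_ reports them. As mentioned earlier, there are three detectors; their internal logic is covered next.
void IOCanary::Detect() {
    std::vector<Issue> published_issues;
    std::shared_ptr<IOInfo> file_io_info;
    while (true) {
        published_issues.clear();
        int ret = TakeFileIOInfo(file_io_info);
        if (ret != 0) {
            break;
        }
        for (auto detector : detectors_) {
            detector->Detect(env_, *file_io_info, published_issues);
        }
        if (issued_callback_ && !published_issues.empty()) {
            issued_callback_(published_issues);
        }
        file_io_info = nullptr;
    }
}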
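TakeFileIOInfo itself is not shown in this article, so the following is only a sketch of how the consumer side of such a queue is commonly written with std::condition_variable. BlockingQueue, Offer, Take, and Close are hypothetical names; Take returning false plays the role of the non-zero return value that ends the loop above.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical minimal blocking queue for one producer and one consumer.
template <typename T>
class BlockingQueue {
public:
    // Producer side: enqueue an item and wake one waiting consumer.
    void Offer(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(item));
        }
        cv_.notify_one();
    }

    // Marks the queue as closed so that waiting consumers can exit.
    void Close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Consumer side: blocks until an item is available or the queue is
    // closed. Returns false only when the queue is closed and drained.
    bool Take(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<T> queue_;
    bool closed_ = false;
};
```

Waiting with a predicate guards against spurious wakeups, and releasing the lock before notify_one lets the woken consumer acquire the mutex without contending with the producer.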
FileIOMainThreadDetector
Detects IO on the main thread:
void FileIOMainThreadDetector::Detect(const IOCanaryEnv &env, const IOInfo &file_io_info,
                                      std::vector<Issue>& issues) {
    // only runs for the main thread
    if (GetMainThreadId() == file_io_info.java_context_.thread_id_) {
        int type = 0;
        // a single IO operation longer than 13 milliseconds is recorded
        //constexpr static const int kPossibleNegativeThreshold = 13*1000;
        if (file_io_info.max_once_rw_cost_time_s_ > IOCanaryEnv::kPossibleNegativeThreshold) {
            type = 1;
        }
        // the longest continual read/write time exceeds env.GetMainThreadThreshold() = 500
        if(file_io_info.max_continual_rw_cost_time_s_ > env.GetMainThreadThreshold()) {
            type |= 2;
        }
        if (type != 0) {
            Issue issue(kType, file_io_info);
            issue.repeat_read_cnt_ = type;
            // store the issue
            PublishIssue(issue, issues);
        }
    }
}
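The type value is a two-bit mask, which is easy to miss on a first read. The sketch below isolates that step; ClassifyMainThreadIo is a hypothetical helper, and both thresholds are plain parameters in microseconds rather than Matrix constants.

```cpp
// Hypothetical helper mirroring the two checks above.
// Bit 0 (value 1): one single operation was too slow.
// Bit 1 (value 2): a back-to-back run of operations was too slow.
int ClassifyMainThreadIo(long max_once_us, long max_continual_us,
                         long once_threshold_us, long continual_threshold_us) {
    int type = 0;
    if (max_once_us > once_threshold_us) {
        type |= 1;
    }
    if (max_continual_us > continual_threshold_us) {
        type |= 2;
    }
    return type;
}
```

A result of 3 means both conditions held. Note that in the detector above this mask is stored in the issue's repeat_read_cnt_ field, so for main-thread issues that field carries the type rather than a count.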
FileIORepeatReadDetector
Detects repeated reads of the same file:
void FileIORepeatReadDetector::Detect(const IOCanaryEnv &env,
                                      const IOInfo &file_io_info,
                                      std::vector<Issue>& issues) {
    const std::string& path = file_io_info.path_;
    if (observing_map_.find(path) == observing_map_.end()) {
        if (file_io_info.max_continual_rw_cost_time_s_ < env.kPossibleNegativeThreshold) {
            return;
        }
        observing_map_.insert(std::make_pair(path, std::vector<RepeatReadInfo>()));
    }
    std::vector<RepeatReadInfo>& repeat_infos = observing_map_[path];
    if (file_io_info.op_type_ == FileOpType::kWrite) {
        repeat_infos.clear();
        return;
    }
    RepeatReadInfo repeat_read_info(file_io_info.path_, file_io_info.java_context_.stack_, file_io_info.java_context_.thread_id_,
                                    file_io_info.op_size_, file_io_info.file_size_);
    if (repeat_infos.size() == 0) {
        repeat_infos.push_back(repeat_read_info);
        return;
    }
    if((GetTickCount() - repeat_infos[repeat_infos.size() - 1].op_timems) > 17) { //17ms todo astrozhou add to params
        repeat_infos.clear();
    }
    bool found = false;
    int repeatCnt;
    for (auto& info : repeat_infos) {
        if (info == repeat_read_info) {
            found = true;
            info.IncRepeatReadCount();
            repeatCnt = info.GetRepeatReadCount();
            break;
        }
    }
    if (!found) {
        repeat_infos.push_back(repeat_read_info);
        return;
    }
    if (repeatCnt >= env.GetRepeatReadThreshold()) {
        Issue issue(kType, file_io_info);
        issue.repeat_read_cnt_ = repeatCnt;
        issue.stack = repeat_read_info.GetStack();
        PublishIssue(issue, issues);
    }
}
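The core idea (count identical reads that follow each other closely, and start over after a gap or a write) can be shown with a much smaller model. The sketch below is a deliberate simplification and not the Matrix algorithm: it tracks one stack per path instead of a list, takes the timestamp as a parameter, and ReadRun and RepeatReadCounter are hypothetical names. The 17 ms window is the value from the code above.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical record of the current run of identical reads for one path.
struct ReadRun {
    std::string stack;
    int count;      // repeats seen after the first read of the run
    long last_ms;   // time of the most recent read in the run
};

class RepeatReadCounter {
public:
    // Returns how many times this (path, stack) read has been repeated in
    // the current run: 0 for the first read, 1 for the second, and so on.
    int OnRead(const std::string& path, const std::string& stack, long now_ms) {
        auto it = runs_.find(path);
        if (it == runs_.end() || it->second.stack != stack ||
            now_ms - it->second.last_ms > kWindowMs) {
            runs_[path] = ReadRun{stack, 0, now_ms};  // start a new run
            return 0;
        }
        it->second.count++;
        it->second.last_ms = now_ms;
        return it->second.count;
    }

    // A write changes the file, so earlier reads no longer count as repeats.
    void OnWrite(const std::string& path) { runs_.erase(path); }

private:
    static constexpr long kWindowMs = 17;
    std::unordered_map<std::string, ReadRun> runs_;
};
```

A caller would compare the returned count against the repeat-read threshold and report an issue once it is reached.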
FileIOSmallBufferDetector
Detects buffers that are too small:
void FileIOSmallBufferDetector::Detect(const IOCanaryEnv &env, const IOInfo &file_io_info,
                                       std::vector<Issue>& issues) {
    if (file_io_info.op_cnt_ > env.kSmallBufferOpTimesThreshold && (file_io_info.op_size_ / file_io_info.op_cnt_) < env.GetSmallBufferThreshold()
        && file_io_info.max_continual_rw_cost_time_s_ >= env.kPossibleNegativeThreshold) {
        PublishIssue(Issue(kType, file_io_info), issues);
    }
}
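The three-part condition is easier to follow with numbers plugged in. The sketch below restates it as a standalone predicate; IsSmallBuffer is a hypothetical helper and every threshold is a parameter. The value 20 used for the operation-count threshold in the example afterwards is an assumption, since the article does not give the value of kSmallBufferOpTimesThreshold.

```cpp
// Hypothetical restatement of the small-buffer condition above.
// A file is flagged when it was accessed many times, the average request
// size is below the buffer threshold, and the IO was slow enough to matter.
bool IsSmallBuffer(long op_cnt, long op_size, long max_continual_us,
                   long op_times_threshold, long buffer_threshold,
                   long negligible_us) {
    if (op_cnt <= 0) {
        return false;  // avoid dividing by zero
    }
    return op_cnt > op_times_threshold &&
           (op_size / op_cnt) < buffer_threshold &&
           max_continual_us >= negligible_us;
}
```

For instance, 100 reads totalling 102400 bytes average 1024 bytes per read. With an assumed count threshold of 20, a 4096 byte buffer threshold, and a continual cost of 15000 microseconds against the 13000 microsecond floor, the predicate returns true. The same file read in 8192 byte chunks would not be flagged.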
OnIssuePublish
Once all the information is collected, it is time to call back, which returns us to the call we saw at the start:
iocanary::IOCanary::Get().SetIssuedCallback(OnIssuePublish)
void OnIssuePublish(const std::vector<Issue>& published_issues) {
    ....
    // create a Java-layer List
    jobject j_issues = env->NewObject(kListClass, kMethodIDListConstruct);
    // iterate over all issues and wrap each one in a Java-layer IOIssue object
    for (const auto& issue : published_issues) {
        jint type = issue.type_;
        jstring path = env->NewStringUTF(issue.file_io_info_.path_.c_str());
        jlong file_size = issue.file_io_info_.file_size_;
        jint op_cnt = issue.file_io_info_.op_cnt_;
        jlong buffer_size = issue.file_io_info_.buffer_size_;
        jlong op_cost_time = issue.file_io_info_.rw_cost_us_/1000;
        jint op_type = issue.file_io_info_.op_type_;
        jlong op_size = issue.file_io_info_.op_size_;
        jstring thread_name = env->NewStringUTF(issue.file_io_info_.java_context_.thread_name_.c_str());
        jstring stack = env->NewStringUTF(issue.stack.c_str());
        jint repeat_read_cnt = issue.repeat_read_cnt_;
        jobject issue_obj = env->NewObject(kIssueClass, kMethodIDIssueConstruct, type, path, file_size, op_cnt, buffer_size,
                                           op_cost_time, op_type, op_size, thread_name, stack, repeat_read_cnt);
        // add the IOIssue object to the List
        env->CallBooleanMethod(j_issues, kMethodIDListAdd, issue_obj);
        ....
    }
    // call back into the static method onIssuePublish of the Java-layer IOCanaryJniBridge class
    env->CallStaticVoidMethod(kJavaBridgeClass, kMethodIDOnIssuePublish, j_issues);
    ....
}
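Later, in onIssuePublish on the Java side, the information is assembled into JSON and printed to the console or uploaded to a server. That completes the flow.
Summary
IOCanaryPlugin intercepts IO by hooking the low-level functions open, read, write, and close. IO that goes through these functions can therefore be observed (the proxies shown above only record calls made on the main thread), so information is recorded during each operation, analyzed against the configured thresholds, and reported when the conditions are met.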