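This article is a signed, first-release article for the Xitu (稀土) technical community. Reposting is prohibited within 30 days of publication, and prohibited without authorization after 30 days. Infringement will be pursued.
Introduction
This is the fifth article in the ArkUI Engine series. After the previous four articles, readers should understand the two most important mechanisms behind an ArkUI component: the rendering process and the event binding process. Rendering is the main workflow in the Engine, but the Engine does more than draw UI. It also contains a smoothness monitoring system, the WatchDog mechanism.
After reading this article, you will understand the code details of HarmonyOS's WatchDog mechanism and of ANR (Application Not Responding) detection, which will help with later performance monitoring and optimization.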
WatchDog
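Every UI system needs to monitor how smoothly it runs, and HarmonyOS is no exception. With the following code, tapping the Text enters an infinite loop. If we then tap again, the familiar ANR dialog appears.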
Column() {
  Text(this.father.name)
    .width("200vp")
    .onClick(() => {
      // Busy loop that never returns, so the thread handling the click stays blocked
      while (true) {
      }
    })
}
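The ANR dialog looks like this (screenshot omitted):
ANR detection is in fact implemented by the WatchDog mechanism. Let's look at it in detail.
WatchDog initialization
Two types are involved in the WatchDog mechanism: the Watchers struct and the WatchDog class. WatchDog holds a map (watchMap_) keyed by instance id whose values are Watchers.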
namespace OHOS::Ace {
class ThreadWatcher;
struct Watchers {
RefPtr<ThreadWatcher> jsWatcher;
RefPtr<ThreadWatcher> uiWatcher;
};
class WatchDog final : public Referenced {
public:
WatchDog();
~WatchDog() override;
void Register(int32_t instanceId, const RefPtr<TaskExecutor>& taskExecutor, bool useUIAsJSThread);
void Unregister(int32_t instanceId);
void BuriedBomb(int32_t instanceId, uint64_t bombId);
void DefusingBomb(int32_t instanceId);
private:
std::unordered_map<int32_t, Watchers> watchMap_;
ACE_DISALLOW_COPY_AND_MOVE(WatchDog);
};
} // namespace OHOS::Ace
The WatchDog constructor creates and starts an AnrThread:
WatchDog::WatchDog()
{
AnrThread::Start();
#if defined(OHOS_PLATFORM) || defined(ANDROID_PLATFORM)
AnrThread::PostTaskToTaskRunner(InitializeGcTrigger, GC_CHECK_PERIOD);
#endif
}
AnrThread's definition is also simple. It provides an event loop, continuously dispatching events much like Android's Looper.
namespace OHOS::Ace {
class AnrThread {
public:
static void Start();
static void Stop();
using Task = std::function<void()>;
static bool PostTaskToTaskRunner(Task&& task, uint32_t delayTime);
};
} // namespace OHOS::Ace
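#endif // closes the header guard; its opening #ifndef is not shown in this excerpt
The event dispatching capability is provided by the TaskRunnerAdapter class. TaskRunnerAdapter abstracts event dispatching, and the underlying implementation can be any class capable of dispatching events, for example OHOS::AppExecFwk::EventRunner.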
namespace {
// A TaskRunnerAdapter is needed to dispatch events
RefPtr<TaskRunnerAdapter> g_anrThread;
} // namespace
void AnrThread::Start()
{
if (!g_anrThread) {
g_anrThread = TaskRunnerAdapterFactory::Create(false, "anr");
}
}
void AnrThread::Stop()
{
g_anrThread.Reset();
}
bool AnrThread::PostTaskToTaskRunner(Task&& task, uint32_t delayTime)
{
if (!g_anrThread || !task) {
return false;
}
if (delayTime > 0) {
g_anrThread->PostDelayedTask(std::move(task), delayTime, {});
} else {
g_anrThread->PostTask(std::move(task), {});
}
return true;
}
} // namespace OHOS::Ace
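To make the "post now or post with a delay" behavior concrete, here is a minimal sketch of a delayed task queue. It is not the TaskRunnerAdapter implementation: SimulatedTaskRunner, Post, and AdvanceTo are hypothetical names, and a virtual clock replaces a real thread so that the example stays deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task runner, driven by a virtual clock.
class SimulatedTaskRunner {
public:
    using Task = std::function<void()>;

    // Mirrors the shape of PostTaskToTaskRunner: reject empty tasks,
    // otherwise queue the task to run `delay` time units from now.
    bool Post(Task task, uint64_t delay)
    {
        if (!task) {
            return false;
        }
        queue_.push(Entry { now_ + delay, nextSeq_++, std::move(task) });
        return true;
    }

    // Advance the virtual clock and run every task that has become due.
    void AdvanceTo(uint64_t now)
    {
        if (now < now_) {
            return; // the clock never moves backwards
        }
        while (!queue_.empty() && queue_.top().due <= now) {
            Entry entry = queue_.top();
            queue_.pop();
            assert(entry.due >= now_); // tasks are executed in due-time order
            now_ = entry.due;
            entry.task(); // a task may post further tasks, e.g. to reschedule itself
        }
        now_ = now;
    }

private:
    struct Entry {
        uint64_t due;
        uint64_t seq; // keeps FIFO order among tasks with the same due time
        Task task;
    };
    struct Later {
        bool operator()(const Entry& a, const Entry& b) const
        {
            return a.due != b.due ? a.due > b.due : a.seq > b.seq;
        }
    };
    std::priority_queue<Entry, std::vector<Entry>, Later> queue_;
    uint64_t now_ = 0;
    uint64_t nextSeq_ = 0;
};
```

A task that posts itself again with a delay gives a periodic check, which is the pattern the watchdog code relies on.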
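Initialization is simple: start an object with an event loop, which is later used to dispatch events repeatedly. In addition, if the current platform defines either OHOS_PLATFORM or ANDROID_PLATFORM, the first event is posted to register the GC signal. That's right: the Engine triggers GC through a signal, binding a handler to the custom signal SIGNAL_FOR_GC (60).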
void InitializeGcTrigger()
{
// Record watch dog thread as signal handling thread
g_signalThread = pthread_self();
int32_t result = BlockGcSignal();
if (result != 0) {
LOGE("Failed to block GC signal, errno = %{public}d", result);
return;
}
// Start to receive GC signal
signal(SIGNAL_FOR_GC, OnSignalReceive);
// Start check GC signal
CheckGcSignal();
}
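CheckGcSignal uses sigtimedwait to check whether the signal has arrived. Because the timeout passed here is zero, the call returns immediately, so the function effectively polls and reposts itself with a delay of GC_CHECK_PERIOD. If the signal has been received, AceEngine::Get().TriggerGarbageCollection() is called to perform GC. (When sigtimedwait times out, result is less than 0 and errno is set to EAGAIN. EINTR is tolerated as well, because the arrival of another signal can interrupt the sigtimedwait call.)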
void CheckGcSignal()
{
// Check if GC signal is in pending signal set
sigset_t sigSet;
sigemptyset(&sigSet);
sigaddset(&sigSet, SIGNAL_FOR_GC);
struct timespec interval = {
.tv_sec = 0,
.tv_nsec = 0,
};
int32_t result = sigtimedwait(&sigSet, nullptr, &interval);
if (result < 0) {
if (errno != EAGAIN && errno != EINTR) {
LOGE("Failed to wait signals, errno = %{public}d", errno);
return;
}
} else {
ACE_DCHECK(result == SIGNAL_FOR_GC);
// Start GC
LOGE("Receive GC signal");
AceEngine::Get().TriggerGarbageCollection();
}
// Check again
AnrThread::PostTaskToTaskRunner(CheckGcSignal, GC_CHECK_PERIOD);
}
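The polling pattern can be tried in isolation. The sketch below assumes a Linux/POSIX environment and uses SIGUSR1 in the usage example instead of the Engine's SIGNAL_FOR_GC. BlockSignal and PollSignal are hypothetical helper names, not Engine functions.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <signal.h>
#include <time.h>

// Block `signo` in the calling thread so that it stays pending
// until it is collected with sigtimedwait.
bool BlockSignal(int signo)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return pthread_sigmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Non-blocking poll: returns `signo` if it was pending, 0 if nothing was
// pending (or the call was interrupted), and -1 on any other error.
int PollSignal(int signo)
{
    assert(signo > 0);
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    struct timespec zero = { 0, 0 }; // zero timeout: return immediately
    int result = sigtimedwait(&set, nullptr, &zero);
    if (result >= 0) {
        return result;
    }
    return (errno == EAGAIN || errno == EINTR) ? 0 : -1;
}
```

Calling BlockSignal(SIGUSR1) once and then PollSignal(SIGUSR1) periodically reproduces the same structure: the signal is only ever consumed by the polling thread, never delivered asynchronously.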
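At this point, the WatchDog event loop has been initialized and is ready for the "bury the bomb" and "defuse the bomb" operations that follow.
ANR mechanism
WatchDog exposes the Register method so that modules outside the Engine can register themselves. Once registered, they are monitored by WatchDog.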
void WatchDog::Register(int32_t instanceId, const RefPtr<TaskExecutor>& taskExecutor, bool useUIAsJSThread)
{
Watchers watchers = {
.jsWatcher = AceType::MakeRefPtr<ThreadWatcher>(instanceId, TaskExecutor::TaskType::JS),
.uiWatcher = AceType::MakeRefPtr<ThreadWatcher>(instanceId, TaskExecutor::TaskType::UI, useUIAsJSThread),
};
watchers.uiWatcher->SetTaskExecutor(taskExecutor);
if (!useUIAsJSThread) {
watchers.jsWatcher->SetTaskExecutor(taskExecutor);
} else {
watchers.jsWatcher = nullptr;
}
const auto resExecutor = watchMap_.try_emplace(instanceId, watchers);
if (!resExecutor.second) {
LOGW("Duplicate instance id: %{public}d when register to watch dog", instanceId);
}
}
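In the ArkTS environment, WatchDog only keeps the uiWatcher (the uiWatcher field of the Watchers struct), which is a ThreadWatcher object. As the code above shows, jsWatcher is set to nullptr when useUIAsJSThread is true.
When a ThreadWatcher object is constructed, it starts checking by posting a check task through AnrThread::PostTaskToTaskRunner.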
ThreadWatcher::ThreadWatcher(int32_t instanceId, TaskExecutor::TaskType type, bool useUIAsJSThread)
: instanceId_(instanceId), type_(type), useUIAsJSThread_(useUIAsJSThread)
{
InitThreadName();
AnrThread::PostTaskToTaskRunner(
[weak = Referenced::WeakClaim(this)]() {
auto sp = weak.Upgrade();
CHECK_NULL_VOID(sp);
// Calls ThreadWatcher::Check
sp->Check();
},
NORMAL_CHECK_PERIOD);
}
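The weak-reference capture in this lambda matters: the posted task must not keep the watcher alive, and it must do nothing if the watcher has already been destroyed. Here is a minimal sketch of the same idea using the standard library's std::weak_ptr in place of the Engine's Referenced::WeakClaim and Upgrade. Watcher and MakeCheckTask are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical watcher that only counts how often it was checked.
struct Watcher {
    int checks = 0;
    void Check() { ++checks; }
};

// Build a task that checks the watcher only if it is still alive.
std::function<void()> MakeCheckTask(const std::shared_ptr<Watcher>& watcher)
{
    assert(watcher != nullptr);
    std::weak_ptr<Watcher> weak = watcher; // does not extend the lifetime
    return [weak]() {
        auto sp = weak.lock(); // analogous to weak.Upgrade()
        if (!sp) {
            return;            // analogous to CHECK_NULL_VOID(sp)
        }
        sp->Check();
    };
}
```

If the owner releases the watcher before the task runs, the task simply returns without touching freed memory.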
The Check method is the core of the whole ANR mechanism. Here is the code:
void ThreadWatcher::Check()
{
int32_t period = NORMAL_CHECK_PERIOD;
if (!IsThreadStuck()) {
if (state_ == State::FREEZE) {
RawReport(RawEventType::RECOVER);
}
freezeCount_ = 0;
state_ = State::NORMAL;
canShowDialog_ = true;
showDialogCount_ = 0;
} else {
if (state_ == State::NORMAL) {
HiviewReport();
RawReport(RawEventType::WARNING);
state_ = State::WARNING;
period = WARNING_CHECK_PERIOD;
} else if (state_ == State::WARNING) {
RawReport(RawEventType::FREEZE);
state_ = State::FREEZE;
period = FREEZE_CHECK_PERIOD;
DetonatedBomb();
} else {
if (!canShowDialog_) {
showDialogCount_++;
if (showDialogCount_ >= ANR_DIALOG_BLOCK_TIME) {
canShowDialog_ = true;
showDialogCount_ = 0;
}
}
if (++freezeCount_ >= 5) {
RawReport(RawEventType::FREEZE);
freezeCount_ = 0;
}
period = FREEZE_CHECK_PERIOD;
DetonatedBomb();
}
}
// After this check finishes, schedule the next check
AnrThread::PostTaskToTaskRunner(
[weak = Referenced::WeakClaim(this)]() {
auto sp = weak.Upgrade();
CHECK_NULL_VOID(sp);
sp->Check();
},
period);
}
To understand the code above, let's summarize the three states it mentions: NORMAL, WARNING, and FREEZE.
NORMAL
NORMAL is the healthy state. As we can see, when IsThreadStuck returns false, state_ is set to NORMAL. Let's look at IsThreadStuck:
bool ThreadWatcher::IsThreadStuck()
{
...
// The key check is here
if (((loopTime_ - threadTag_) > (lastLoopTime_ - lastThreadTag_)) && (lastTaskId_ == taskId)) {
std::string abilityName;
if (AceEngine::Get().GetContainer(instanceId_) != nullptr) {
abilityName = AceEngine::Get().GetContainer(instanceId_)->GetHostClassName();
}
LOGE("thread stuck, ability: %{public}s, instanceId: %{public}d, thread: %{public}s, looptime: %{public}d, "
"checktime: %{public}d",
abilityName.c_str(), instanceId_, threadName_.c_str(), loopTime_, threadTag_);
res = true;
}
lastTaskId_ = taskId;
lastLoopTime_ = loopTime_;
lastThreadTag_ = threadTag_;
}
CheckAndResetIfNeeded();
PostCheckTask();
return res;
}
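Two key variables are involved here: loopTime_ and threadTag_. If an ANR occurs, it must be because some message in the message loop took too long to execute. So how is the execution time judged? With these two variables.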
void ThreadWatcher::PostCheckTask()
{
auto taskExecutor = taskExecutor_.Upgrade();
if (taskExecutor) {
// post task to specified thread to check it
taskExecutor->PostTask(
[weak = Referenced::WeakClaim(this)]() {
auto sp = weak.Upgrade();
CHECK_NULL_VOID(sp);
// threadTag_ is incremented only when the posted task actually runs
sp->TagIncrease();
},
type_);
std::unique_lock<std::shared_mutex> lock(mutex_);
// loopTime_ is incremented on every call to PostCheckTask
++loopTime_;
....
}
void ThreadWatcher::TagIncrease()
{
std::unique_lock<std::shared_mutex> lock(mutex_);
++threadTag_;
}
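loopTime_: incremented each time the Engine calls PostCheckTask
threadTag_: incremented each time the posted task is actually scheduled and run
Under normal conditions, loopTime_ is roughly equal to threadTag_: a task posted by PostCheckTask without delay should run promptly. In an abnormal situation, such as a long-running task or an infinite loop being scheduled, the gap between the two variables keeps growing with each PostCheckTask call, and the thread is judged to be stuck. The condition also compares the current task id with the previous one. If they are the same, that is strong evidence that this task is the one blocking the thread.
If the thread is not stuck, state_ is set to NORMAL.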
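The counter comparison is easier to follow in a single-threaded simulation. The sketch below keeps the same condition as IsThreadStuck but drops locking, logging, and the elided parts of the original. HeartbeatWatcher and CheckOnce are hypothetical names, and a std::deque stands in for the watched thread's message queue.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>

// Simplified, single-threaded model of the loopTime_/threadTag_ check.
// The watcher must outlive the queue, because queued tasks capture `this`.
class HeartbeatWatcher {
public:
    // One watchdog tick: evaluate the stuck condition, then post a new
    // heartbeat task into the watched thread's queue.
    bool CheckOnce(uint32_t currentTaskId, std::deque<std::function<void()>>& watchedQueue)
    {
        assert(loopTime_ >= threadTag_); // a heartbeat cannot run before it is posted
        // Stuck if the backlog of unexecuted heartbeats grew since the last
        // tick and the watched thread is still running the same task.
        bool stuck = ((loopTime_ - threadTag_) > (lastLoopTime_ - lastThreadTag_)) &&
                     (lastTaskId_ == currentTaskId);
        lastTaskId_ = currentTaskId;
        lastLoopTime_ = loopTime_;
        lastThreadTag_ = threadTag_;

        watchedQueue.push_back([this]() { ++threadTag_; }); // runs only if the thread is alive
        ++loopTime_;                                        // counted on every post
        return stuck;
    }

private:
    int32_t loopTime_ = 0;
    int32_t threadTag_ = 0;
    int32_t lastLoopTime_ = 0;
    int32_t lastThreadTag_ = 0;
    uint32_t lastTaskId_ = 0;
};
```

As long as the queue is drained between ticks, the backlog stays at zero and CheckOnce returns false. Once the queue stops draining while the task id stays the same, the backlog grows on every tick and CheckOnce returns true.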
WARNING
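WARNING is an intermediate state. As shown in IsThreadStuck above, after its check it calls PostCheckTask again, posting another check task to the message loop.
If IsThreadStuck returns true while the state is NORMAL, state_ is immediately set to WARNING. If IsThreadStuck still returns true the next time Check runs, the state is escalated to FREEZE.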
FREEZE
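FREEZE is the state that qualifies for an ANR: IsThreadStuck has returned true in two consecutive checks, so DetonatedBomb is called to "detonate the bomb".
Note the final else branch, which handles the case where the previous state was FREEZE and the thread is still stuck. While canShowDialog_ is false, showDialogCount_ is incremented on each check, and once it reaches ANR_DIALOG_BLOCK_TIME (5), canShowDialog_ is set back to true. (canShowDialog_ controls whether the ANR dialog may be shown. It is set to false when the dialog is shown, so after enough further checks it becomes true again and the dialog can be shown once more.) Likewise, as long as the thread stays in the FREEZE state, every check calls DetonatedBomb to "detonate the bomb".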
} else if (state_ == State::WARNING) {
RawReport(RawEventType::FREEZE);
state_ = State::FREEZE;
period = FREEZE_CHECK_PERIOD;
DetonatedBomb();
} else {
if (!canShowDialog_) {
showDialogCount_++;
if (showDialogCount_ >= ANR_DIALOG_BLOCK_TIME) {
canShowDialog_ = true;
showDialogCount_ = 0;
}
}
if (++freezeCount_ >= 5) {
RawReport(RawEventType::FREEZE);
freezeCount_ = 0;
}
period = FREEZE_CHECK_PERIOD;
DetonatedBomb();
}
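The transitions between the three states can be condensed into a small pure function. This is a sketch of the transitions only: it leaves out reporting, check periods, and the dialog throttling, and Step and Advance are hypothetical names.

```cpp
#include <cassert>

enum class State { NORMAL, WARNING, FREEZE };

// Result of one check: the next state and whether DetonatedBomb would be called.
struct Step {
    State next;
    bool detonate;
};

// `stuck` is what IsThreadStuck returned for this check.
Step Advance(State current, bool stuck)
{
    if (!stuck) {
        return { State::NORMAL, false }; // any healthy check resets to NORMAL
    }
    switch (current) {
        case State::NORMAL:
            return { State::WARNING, false }; // first stuck check: warn only
        case State::WARNING:
            return { State::FREEZE, true };   // second stuck check: freeze and detonate
        case State::FREEZE:
            return { State::FREEZE, true };   // still stuck: detonate on every check
    }
    assert(false && "unreachable: all states are handled above");
    return { State::NORMAL, false };
}
```

Feeding three consecutive stuck results starting from NORMAL yields WARNING, FREEZE, FREEZE, and a single healthy result returns to NORMAL from any state.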
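Detonating, burying, and defusing bombs
The "detonation" mentioned above is the DetonatedBomb function. It triggers the ANR handling, provided the conditions are met.
Calling DetonatedBomb does not necessarily produce an ANR dialog. It checks whether the difference between the current time and the first entry in inputTaskIds_ exceeds ANR_INPUT_FREEZE_TIME (5000, i.e. 5 s). If it does, this is unquestionably an ANR. Otherwise it is only a stall. If canShowDialog_ is true, ShowDialog is called to show the ANR dialog.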
void ThreadWatcher::DetonatedBomb()
{
std::shared_lock<std::shared_mutex> lock(mutex_);
// First check whether the inputTaskIds_ queue is empty
if (inputTaskIds_.empty()) {
return;
}
uint64_t currentTime = GetMilliseconds();
uint64_t bombId = inputTaskIds_.front();
if (currentTime - bombId > ANR_INPUT_FREEZE_TIME) {
LOGE("Detonated the Bomb, which bombId is %{public}s and currentTime is %{public}s",
std::to_string(bombId).c_str(), std::to_string(currentTime).c_str());
if (canShowDialog_) {
ShowDialog();
canShowDialog_ = false;
showDialogCount_ = 0;
} else {
LOGE("Can not show dialog when detonated the Bomb.");
}
// Once an ANR is confirmed, the whole bomb queue is cleared
std::queue<uint64_t> empty;
std::swap(empty, inputTaskIds_);
}
}
inputTaskIds_ is simply a queue:
std::queue<uint64_t> inputTaskIds_;
Callers can "bury a bomb" through BuriedBomb, so that critical flows are subject to ANR detection:
void ThreadWatcher::BuriedBomb(uint64_t bombId)
{
std::unique_lock<std::shared_mutex> lock(mutex_);
inputTaskIds_.emplace(bombId);
}
Callers can also "defuse a bomb" through DefusingBomb:
void ThreadWatcher::DefusingBomb()
{
auto taskExecutor = taskExecutor_.Upgrade();
CHECK_NULL_VOID(taskExecutor);
taskExecutor->PostTask(
[weak = Referenced::WeakClaim(this)]() {
auto sp = weak.Upgrade();
if (sp) {
sp->DefusingTopBomb();
}
},
type_);
}
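All of these are essentially insertions into and removals from this queue. Because DetonatedBomb first checks whether inputTaskIds_ is empty, a delayed message does not count as an ANR when the queue is empty.
Summary
In this article we learned about the WatchDog mechanism provided by the Engine and how its ANR detection works. Studying this source code makes us more familiar with the ArkUI Engine as a whole, which helps with later monitoring and optimization.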
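The three operations can be modeled with a plain queue of timestamps. The sketch below makes several assumptions: BombQueue and its method names are hypothetical, timestamps are passed in explicitly instead of calling GetMilliseconds, and DefuseTop is assumed to pop the oldest entry, since the body of DefusingTopBomb is not shown above. It also guards against a timestamp that lies in the future, which the original comparison does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>

// Simplified model of inputTaskIds_: a FIFO of "bomb" timestamps in milliseconds.
class BombQueue {
public:
    static constexpr uint64_t kFreezeTimeMs = 5000; // value of ANR_INPUT_FREEZE_TIME per the text

    void Bury(uint64_t bombId) { bombs_.push(bombId); }

    // Assumption: defusing removes the oldest bomb, if there is one.
    void DefuseTop()
    {
        if (!bombs_.empty()) {
            bombs_.pop();
        }
    }

    // Returns true if the oldest bomb has been waiting longer than the
    // threshold. On detonation the whole queue is cleared.
    bool Detonate(uint64_t nowMs)
    {
        if (bombs_.empty()) {
            return false; // nothing buried: a stall is not treated as an ANR
        }
        uint64_t oldest = bombs_.front();
        if (nowMs > oldest && nowMs - oldest > kFreezeTimeMs) {
            std::queue<uint64_t> empty;
            std::swap(empty, bombs_);
            assert(bombs_.empty());
            return true;
        }
        return false;
    }

    std::size_t Size() const { return bombs_.size(); }

private:
    std::queue<uint64_t> bombs_;
};
```

A bomb buried at 1000 ms does not detonate at 6000 ms, because the difference must be strictly greater than 5000, but it does detonate at 6001 ms.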