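This article is reposted from: Android Source Code Reading: lmkd
This article is based on android-13.0.0_r1.
1. lmkd process startup and initialization
lmkd is started by the init process and runs as a separate process in the system.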
// system/core/rootdir/init.rc
// ...
# Start lmkd before any other services run so that it can register them
write /proc/sys/vm/watermark_boost_factor 0
chown root system /sys/module/lowmemorykiller/parameters/adj
chmod 0664 /sys/module/lowmemorykiller/parameters/adj
chown root system /sys/module/lowmemorykiller/parameters/minfree
chmod 0664 /sys/module/lowmemorykiller/parameters/minfree
start lmkd // init launches lmkd here
// ...
- lmkd.rc
// system/memory/lmkd/lmkd.rc
service lmkd /system/bin/lmkd
class core
user lmkd
group lmkd system readproc
capabilities DAC_OVERRIDE KILL IPC_LOCK SYS_NICE SYS_RESOURCE
critical
socket lmkd seqpacket+passcred 0660 system system
task_profiles ServiceCapacityLow
on property:lmkd.reinit=1
exec_background /system/bin/lmkd --reinit
# reinitialize lmkd after device finished booting if experiments set any flags during boot
on property:sys.boot_completed=1 && property:lmkd.reinit=0
setprop lmkd.reinit 1
# properties most likely to be used in experiments
# setting persist.device_config.* property either triggers immediate lmkd re-initialization
# if the device finished booting or sets lmkd.reinit=0 to re-initialize lmkd after boot completes
on property:persist.device_config.lmkd_native.debug=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.kill_heaviest_task=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.kill_timeout_ms=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.swap_free_low_percentage=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.psi_partial_stall_ms=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.psi_complete_stall_ms=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.thrashing_limit=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.thrashing_limit_decay=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.thrashing_limit_critical=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.swap_util_max=*
setprop lmkd.reinit ${sys.boot_completed:-0}
on property:persist.device_config.lmkd_native.filecache_min_kb=*
setprop lmkd.reinit ${sys.boot_completed:-0}
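At startup the main function in lmkd.cpp runs directly. Its logic is straightforward: it updates the parameters, creates the logger, calls init() in the if condition, and then waits in the mainloop() loop. Note that init() returns 0 on success, so the `if (!init())` branch is the success path.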
// system/memory/lmkd/lmkd.cpp
int main(int argc, char **argv) {
update_props(); // refresh parameters from system properties
ctx = create_android_logger(KILLINFO_LOG_TAG); // create the logger
if (!init()) {
//...
mainloop();
}
android_log_destroy(&ctx);
}
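1.1 update_props(): parameter update
update_props mainly uses GET_LMK_PROPERTY to read each configuration parameter from system properties, for example the adj level that may be killed at each of the three pressure levels: low, medium and critical.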
static void update_props() {
/* By default disable low level vmpressure events */
level_oomadj[VMPRESS_LEVEL_LOW] =
GET_LMK_PROPERTY(int32, "low", OOM_SCORE_ADJ_MAX + 1);
level_oomadj[VMPRESS_LEVEL_MEDIUM] =
GET_LMK_PROPERTY(int32, "medium", 800);
level_oomadj[VMPRESS_LEVEL_CRITICAL] =
GET_LMK_PROPERTY(int32, "critical", 0);
// ...
}
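GET_LMK_PROPERTY is a macro. It reads persist.device_config.lmkd_native.<name> first, falls back to ro.lmk.<name>, and finally to the built-in default.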
#define GET_LMK_PROPERTY(type, name, def) \
property_get_##type("persist.device_config.lmkd_native." name, \
property_get_##type("ro.lmk." name, def))
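The nesting in the macro is easy to misread, so here is a minimal sketch of the lookup order. PropertyStore, get_int_property and get_lmk_property are hypothetical names for illustration only; a std::map stands in for the real Android property service.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the Android property store.
using PropertyStore = std::map<std::string, int>;

// Returns the value stored under `key`, or `def` if the key is absent.
// This plays the role of property_get_int32 in the macro.
int get_int_property(const PropertyStore& props, const std::string& key, int def) {
    auto it = props.find(key);
    return it == props.end() ? def : it->second;
}

// Mirrors the nesting of GET_LMK_PROPERTY: the inner ro.lmk lookup
// supplies the default for the outer persist.device_config lookup.
int get_lmk_property(const PropertyStore& props, const std::string& name, int def) {
    assert(!name.empty());
    int ro_value = get_int_property(props, "ro.lmk." + name, def);
    return get_int_property(props, "persist.device_config.lmkd_native." + name, ro_value);
}
```

With an empty store the built-in default wins; a ro.lmk value overrides the default, and a persist.device_config.lmkd_native value overrides both.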
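1.2 init(): initialization
1.2.1 Creating the epoll instance
An important step in init is creating the epoll instance. The macro MAX_EPOLL_EVENTS evaluates to 10 here (1 + 3 + 3 + 1 + 1 + 1), so lmkd expects epoll to watch at most 10 event sources.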
/* max supported number of data connections (AMS, init, tests) */
#define MAX_DATA_CONN 3
/*
* 1 ctrl listen socket, 3 ctrl data socket, 3 memory pressure levels,
* 1 lmk events + 1 fd to wait for process death + 1 fd to receive kill failure notifications
*/
#define MAX_EPOLL_EVENTS (1 + MAX_DATA_CONN + VMPRESS_LEVEL_COUNT + 1 + 1 + 1)
static int epollfd;
static int init(void) {
// ...
epollfd = epoll_create(MAX_EPOLL_EVENTS);
if (epollfd == -1) {
ALOGE("epoll_create failed (errno=%d)", errno);
return -1;
}
// ...
}
For example, AMS acts as a socket client and talks to lmkd through /dev/socket/lmkd. It reports each process's adj to lmkd, and lmkd writes the value to "/proc/[pid]/oom_score_adj".
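To make the role of the epoll instance more concrete, here is a minimal Linux-only sketch of one common way to dispatch epoll events: each fd is registered with a pointer to a handler record, and the loop calls whatever handler comes back. HandlerInfo, record_level and dispatch_once are hypothetical names, and an eventfd stands in for the real sockets and pressure fds; this is not lmkd's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Hypothetical handler record: a callback plus an integer payload.
struct HandlerInfo {
    void (*handler)(int data, int* out);
    int data;
};

// Example callback: records which payload it was called with.
void record_level(int data, int* out) { *out = data; }

// Registers one eventfd with epoll, signals it, and dispatches the
// resulting event through the HandlerInfo stored in epoll_event.data.ptr.
// Returns the payload seen by the handler, or -1 on any failure.
int dispatch_once(int payload) {
    assert(payload >= 0);
    int seen = -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) return -1;
    int efd = eventfd(0, 0);
    if (efd == -1) {
        close(epfd);
        return -1;
    }

    HandlerInfo info{record_level, payload};
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.ptr = &info;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0) {
        uint64_t one = 1;
        // Make the fd readable so epoll_wait returns immediately.
        if (write(efd, &one, sizeof(one)) == static_cast<ssize_t>(sizeof(one))) {
            epoll_event fired[1];
            int n = epoll_wait(epfd, fired, 1, 1000 /* timeout in ms */);
            if (n == 1) {
                auto* hi = static_cast<HandlerInfo*>(fired[0].data.ptr);
                hi->handler(hi->data, &seen);
            }
        }
    }
    close(efd);
    close(epfd);
    return seen;
}
```

Storing a pointer in data.ptr means the loop does not need to know what kind of fd fired; it simply calls the registered handler with its payload.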
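1.2.2 Initializing the lmkd trigger mechanism
Next, init has to decide how lmkd gets triggered. Early versions of lmk relied on a kernel driver, so init uses access() to check whether the old node still exists (the driver was dropped in kernel 4.12). If the in-kernel interface is unavailable, or userspace lmk is enabled, init calls init_monitors().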
has_inkernel_module = !access(INKERNEL_MINFREE_PATH, W_OK);
use_inkernel_interface = has_inkernel_module && !enable_userspace_lmk;
if (use_inkernel_interface) {
// most kernels no longer support this path
} else {
if (!init_monitors()) {
return -1;
}
}
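Note that four similarly named functions are involved in monitor initialization: init_monitors(), init_psi_monitors(), init_mp_psi() and init_psi_monitor(). Keep them apart.
In init_monitors(), lmkd decides between PSI and vmpressure triggering. When the "ro.lmk.use_psi" property is true (the default), it calls init_psi_monitors() to initialize the PSI monitors. Only if the property is false or that initialization fails does it fall back to init_mp_common() for the vmpressure monitors, so lmkd clearly prefers PSI.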
static bool init_monitors() {
/* Use PSI monitors whenever the kernel supports them */
use_psi_monitors = GET_LMK_PROPERTY(bool, "use_psi", true) &&
init_psi_monitors();
/* PSI monitors are disabled or failed to initialize: fall back to vmpressure */
if (!use_psi_monitors &&
(!init_mp_common(VMPRESS_LEVEL_LOW) ||
!init_mp_common(VMPRESS_LEVEL_MEDIUM) ||
!init_mp_common(VMPRESS_LEVEL_CRITICAL))) {
ALOGE("Kernel does not support memory pressure events or in-kernel low memory killer");
return false;
}
if (use_psi_monitors) {
ALOGI("Using psi monitors for memory pressure detection");
} else {
ALOGI("Using vmpressure for memory pressure detection");
}
return true;
}
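(1) PSI triggering
Next comes init_psi_monitors(), which initializes the PSI monitors. The "new strategy" is used when the use_new_strategy property is explicitly set to true; when the property is unset, it defaults to true on low-RAM devices or when use_minfree_levels is false. "New strategy" here refers to how processes are chosen for killing after a PSI event fires: based on free pages (watermarks), which is the old strategy, or based on the PSI pressure itself, which is the new one. Personally I find the old/new naming rather inelegant.
Note that the two strategies differ only in how victims are selected. The monitors being set up are PSI monitors in both cases, so the trigger is still PSI.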
static bool init_psi_monitors() {
bool use_new_strategy =
GET_LMK_PROPERTY(bool, "use_new_strategy", low_ram_device || !use_minfree_levels);
/* In default PSI mode, override the PSI stall thresholds using system properties */
if (use_new_strategy) {
/* Do not use low pressure level */
psi_thresholds[VMPRESS_LEVEL_LOW].threshold_ms = 0;
psi_thresholds[VMPRESS_LEVEL_MEDIUM].threshold_ms = psi_partial_stall_ms;
psi_thresholds[VMPRESS_LEVEL_CRITICAL].threshold_ms = psi_complete_stall_ms;
}
if (!init_mp_psi(VMPRESS_LEVEL_LOW, use_new_strategy)) {
return false;
}
if (!init_mp_psi(VMPRESS_LEVEL_MEDIUM, use_new_strategy)) {
destroy_mp_psi(VMPRESS_LEVEL_LOW);
return false;
}
if (!init_mp_psi(VMPRESS_LEVEL_CRITICAL, use_new_strategy)) {
destroy_mp_psi(VMPRESS_LEVEL_MEDIUM);
destroy_mp_psi(VMPRESS_LEVEL_LOW);
return false;
}
return true;
}
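With the strategy decided, init_mp_psi is called to initialize the PSI event for each level.
init_mp_psi takes two parameters: the pressure level and the new/old strategy flag. Note that the type of the first parameter is named "vmpressure_level". Despite the name, the trigger here is PSI and the memory pressure level is determined from PSI, so this is not the same "vmpressure" as the vmpressure mechanism mentioned earlier. This is the second place where I find the code inelegant and easy to misread. vmpressure stands for virtual memory pressure; perhaps the designers regard the stall time reported by PSI as a kind of virtual memory pressure too?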
static bool init_mp_psi(enum vmpressure_level level, bool use_new_strategy) {
int fd;
/* Do not register a handler if threshold_ms is not set */
if (!psi_thresholds[level].threshold_ms) {
return true;
}
fd = init_psi_monitor(psi_thresholds[level].stall_type,
psi_thresholds[level].threshold_ms * US_PER_MS,
PSI_WINDOW_SIZE_MS * US_PER_MS);
if (fd < 0) {
return false;
}
vmpressure_hinfo[level].handler = use_new_strategy ? mp_event_psi : mp_event_common;
vmpressure_hinfo[level].data = level;
if (register_psi_monitor(epollfd, fd, &vmpressure_hinfo[level]) < 0) {
destroy_psi_monitor(fd);
return false;
}
maxevents++;
mpevfd[level] = fd;
return true;
}
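Do not confuse init_psi_monitor here with the earlier init_psi_monitors. init_psi_monitor is defined in system/memory/lmkd/libpsi/psi.cpp; given a stall type, a threshold and a window size, it returns a file descriptor that epoll can watch.
The most important line is the assignment to vmpressure_hinfo[level].handler. Depending on whether the new strategy is in use, it decides whether mp_event_psi or mp_event_common is called when an event for this pressure level fires. With the new strategy, mp_event_psi is called.
register_psi_monitor then adds the fd to epoll so that the pressure events are monitored.
At this point init_psi_monitors(), that is, PSI monitor initialization, is complete. Under the new strategy, mp_event_psi is called whenever a pressure event fires.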
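For context, the kernel PSI interface registers a trigger by writing a line of the form "<some|full> <stall amount in us> <time window in us>" to a file under /proc/pressure/. The sketch below only builds that string, which matches the three arguments passed to init_psi_monitor above. StallType and make_psi_trigger are hypothetical names, and the 70 ms, 700 ms and 1000 ms values in the usage note are illustrative assumptions rather than values taken from this article.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stall types accepted by the kernel PSI trigger interface.
enum class StallType { Some, Full };

// Builds the text written to /proc/pressure/memory to register a trigger:
// "<some|full> <stall amount in us> <time window in us>".
std::string make_psi_trigger(StallType type, int64_t threshold_us, int64_t window_us) {
    assert(threshold_us > 0 && threshold_us <= window_us);
    std::string kind = (type == StallType::Some) ? "some" : "full";
    return kind + " " + std::to_string(threshold_us) + " " + std::to_string(window_us);
}
```

For example, a partial stall threshold of 70 ms in a 1000 ms window becomes "some 70000 1000000", and a complete stall threshold of 700 ms becomes "full 700000 1000000". After the string is written, the fd is polled for priority events (POLLPRI), which is what the epoll registration takes care of.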
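(2) vmpressure triggering
Since most Android devices now use PSI triggering, the vmpressure path is skipped for now.
Besides init_monitors(), init performs some other initialization steps, which are also skipped here.
2. The new strategy after a PSI trigger (mp_event_psi)
mp_event_psi can be roughly divided into three parts: the first computes parameters and state, the second determines the kill reason (kill_reason) from that state, and the third selects a process and performs one round of killing.
2.1 Computing parameters and state
2.1.1 Static variables
The function has a number of static variables that keep state across repeated calls.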
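static int64_t init_ws_refault; // workingset_refault baseline, re-taken after a kill
static int64_t prev_workingset_refault; // workingset_refault seen in the previous cycle
static int64_t base_file_lru; // baseline size of the file-backed page cache
static int64_t init_pgscan_kswapd; // baseline count of pages scanned by kswapd
static int64_t init_pgscan_direct; // baseline count of pages scanned by direct reclaim
static bool killing; // set to true once a process has been killed
static int thrashing_limit = thrashing_limit_pct; // thrashing threshold, initially taken from the property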
static struct zone_watermarks watermarks;
static struct timespec wmark_update_tm;
static struct wakeup_info wi;
static struct timespec thrashing_reset_tm;
static int64_t prev_thrash_growth = 0;
static bool check_filecache = false;
static int max_thrashing = 0;
2.1.2 Local variables
union meminfo mi; // parsed from /proc/meminfo
union vmstat vs; // parsed from /proc/vmstat
struct psi_data psi_data;
struct timespec curr_tm; // timestamp taken at the start of each cycle
int64_t thrashing = 0;
bool swap_is_low = false;
enum vmpressure_level level = (enum vmpressure_level)data;
enum kill_reasons kill_reason = NONE;
bool cycle_after_kill = false; // set to true in this cycle if a process was killed in the previous one
enum reclaim_state reclaim = NO_RECLAIM;
enum zone_watermark wmark = WMARK_NONE;
char kill_desc[LINE_MAX];
bool cut_thrashing_limit = false;
int min_score_adj = 0;
int swap_util = 0;
int64_t swap_low_threshold;
long since_thrashing_reset_ms;
int64_t workingset_refault_file;
bool critical_stall = false;
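2.1.3 State checks
This part contains a lot of code. The key steps are reading data through vmstat_parse and meminfo_parse and evaluating thrashing, watermarks, swap state and so on, which the next step uses to determine the kill reason.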
if (clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm) != 0) {
ALOGE("Failed to get current time");
return;
}
record_wakeup_time(&curr_tm, events ? Event : Polling, &wi);
bool kill_pending = is_kill_pending();
if (kill_pending && (kill_timeout_ms == 0 ||
get_time_diff_ms(&last_kill_tm, &curr_tm) < static_cast<long>(kill_timeout_ms))) {
/* Skip while still killing a process */
wi.skipped_wakeups++;
goto no_kill;
}
/*
* Process is dead or kill timeout is over, stop waiting. This has no effect if pidfds are
* supported and death notification already caused waiting to stop.
*/
stop_wait_for_proc_kill(!kill_pending);
// parse /proc/vmstat
if (vmstat_parse(&vs) < 0) {
ALOGE("Failed to parse vmstat!");
return;
}
/* Starting with kernel 5.9 the workingset_refault vmstat field was renamed workingset_refault_file */
workingset_refault_file = vs.field.workingset_refault ? : vs.field.workingset_refault_file;
// parse /proc/meminfo
if (meminfo_parse(&mi) < 0) {
ALOGE("Failed to parse meminfo!");
return;
}
/* Reset states after process got killed */
if (killing) {
killing = false;
cycle_after_kill = true;
/* Reset file-backed pagecache size and refault amounts after a kill */
base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file; // reset the file page cache baseline
init_ws_refault = workingset_refault_file; // reset the refault baseline
thrashing_reset_tm = curr_tm; // restart the thrashing window at the current time
prev_thrash_growth = 0; // clear the thrashing carried over from earlier windows
}
/* Check free swap levels: determines swap_is_low */
if (swap_free_low_percentage) { // taken from the property
swap_low_threshold = mi.field.total_swap * swap_free_low_percentage / 100;
swap_is_low = get_free_swap(&mi) < swap_low_threshold; // free swap below the configured percentage counts as low
} else {
swap_low_threshold = 0;
}
/* Identify reclaim state: determines reclaim */
if (vs.field.pgscan_direct != init_pgscan_direct) { // pgscan_direct changed, so direct reclaim happened: DIRECT_RECLAIM
init_pgscan_direct = vs.field.pgscan_direct;
init_pgscan_kswapd = vs.field.pgscan_kswapd;
reclaim = DIRECT_RECLAIM;
} else if (vs.field.pgscan_kswapd != init_pgscan_kswapd) { // pgscan_kswapd changed, so kswapd was reclaiming: KSWAPD_RECLAIM
init_pgscan_kswapd = vs.field.pgscan_kswapd;
reclaim = KSWAPD_RECLAIM;
} else if (workingset_refault_file == prev_workingset_refault) {
/*
* Device is not thrashing and not reclaiming, bail out early until we see these stats
* changing
*/
goto no_kill;
}
prev_workingset_refault = workingset_refault_file;
/*
* It's possible we fail to find an eligible process to kill (ex. no process is
* above oom_adj_min). When this happens, we should retry to find a new process
* for a kill whenever a new eligible process is available.
*
* This is especially important for a slow growing refault case.
*
* While retrying, we should keep monitoring new thrashing counter
* as someone could release the memory to mitigate the thrashing.
*
* Thus, when thrashing reset window comes,
* we decay the prev thrashing counter by window counts.
*
* If the counter is still greater than thrashing limit,
* we preserve the current prev_thrash counter so we will retry kill again.
*
* Otherwise, we reset the prev_thrash counter so we will stop retrying.
*/
// Time elapsed since the last thrashing reset. If the previous cycle killed a process, the reset time was set to curr_tm above, so the difference is 0
since_thrashing_reset_ms = get_time_diff_ms(&thrashing_reset_tm, &curr_tm);
if (since_thrashing_reset_ms > THRASHING_RESET_INTERVAL_MS) {
long windows_passed;
/* Calculate prev_thrash_growth if we crossed THRASHING_RESET_INTERVAL_MS */
prev_thrash_growth = (workingset_refault_file - init_ws_refault) * 100
/ (base_file_lru + 1); // new refaults as a percentage of the file-backed page cache
// number of thrashing reset windows that have elapsed
windows_passed = (since_thrashing_reset_ms / THRASHING_RESET_INTERVAL_MS);
/*
* Decay prev_thrashing unless over-the-limit thrashing was registered in the window we
* just crossed, which means there were no eligible processes to kill. We preserve the
* counter in that case to ensure a kill if a new eligible process appears.
*/
// Decay by halving once per elapsed window. The value is kept as-is only when exactly one window passed and its growth reached thrashing_limit
if (windows_passed > 1 || prev_thrash_growth < thrashing_limit) {
prev_thrash_growth >>= windows_passed;
}
/* Record file-backed pagecache size when crossing THRASHING_RESET_INTERVAL_MS */
// In practice this resets the file page cache baseline, the refault baseline, the thrashing reset time and the thrashing limit
base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file;
init_ws_refault = workingset_refault_file;
thrashing_reset_tm = curr_tm; // restart the thrashing window
thrashing_limit = thrashing_limit_pct;
} else {
/* Calculate what % of the file-backed pagecache refaulted so far */
// Reached when a kill happened in the previous cycle or the thrashing window was reset recently
thrashing = (workingset_refault_file - init_ws_refault) * 100 / (base_file_lru + 1);
}
/* Add previous cycle's decayed thrashing amount */
thrashing += prev_thrash_growth;
if (max_thrashing < thrashing) {
max_thrashing = thrashing;
}
/*
* Refresh watermarks once per min in case user updated one of the margins.
*
* TODO: b/140521024 replace this periodic update with an API for AMS to notify LMKD
* that zone watermarks were changed by the system software.
*/
if (watermarks.high_wmark == 0 || get_time_diff_ms(&wmark_update_tm, &curr_tm) > 60000) {
struct zoneinfo zi;
// parse /proc/zoneinfo
if (zoneinfo_parse(&zi) < 0) {
ALOGE("Failed to parse zoneinfo!");
return;
}
// compute the zone watermarks: this function sums the watermarks of all zones and stores them in watermarks
calc_zone_watermarks(&zi, &watermarks);
wmark_update_tm = curr_tm;
}
/* Find out which watermark is breached if any */
wmark = get_lowest_watermark(&mi, &watermarks);
/* Parse PSI data from /proc/pressure/memory and check whether the critical stall level is reached */
if (!psi_parse_mem(&psi_data)) {
critical_stall = psi_data.mem_stats[PSI_FULL].avg10 > (float)stall_limit_critical;
}
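The thrashing arithmetic above is easier to follow with numbers plugged in. Below is a minimal sketch of the percentage calculation and the decay rule, lifted out into two standalone functions. thrashing_pct and decay_growth are hypothetical names, and the page counts in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Percentage of the file-backed page cache that refaulted since the baseline.
// The +1 mirrors the source and avoids division by zero.
int64_t thrashing_pct(int64_t refault_now, int64_t refault_base, int64_t base_file_lru) {
    assert(refault_now >= refault_base && base_file_lru >= 0);
    return (refault_now - refault_base) * 100 / (base_file_lru + 1);
}

// Applies the decay rule from the source: halve once per elapsed window,
// unless exactly one window passed and the growth reached the limit.
int64_t decay_growth(int64_t growth, long windows_passed, int limit) {
    assert(windows_passed >= 1 && windows_passed < 63);
    if (windows_passed > 1 || growth < limit) {
        growth >>= windows_passed;
    }
    return growth;
}
```

With a baseline of 999 file pages and 500 new refaults, thrashing_pct(1500, 1000, 999) is 50. With a limit of 100, a growth of 50 after one window decays to 25, a growth of 120 after one window is preserved as 120 so that a kill can be retried, and a growth of 120 after three windows decays to 15.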
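2.2 Determining the kill reason and the minimum adj
Based on the state computed above, this part determines why a kill is needed and adjusts the lowest adj level that may be killed (min_score_adj). The source here is almost entirely an if / else if chain with fairly detailed comments, and the assignments to kill_reason and kill_desc are easy to follow. Vendors such as Qualcomm also modify this code heavily, so it is not annotated in detail here. If lmkd kills something unexpectedly, look up the kill reason printed in the lmkd log and find the matching branch in this part of the source.
- Code example: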
if (cycle_after_kill && wmark < WMARK_LOW) {
/*
* Prevent kills not freeing enough memory which might lead to OOM kill.
* This might happen when a process is consuming memory faster than reclaim can
* free even after a kill. Mostly happens when running memory stress tests.
*/
kill_reason = PRESSURE_AFTER_KILL;
strncpy(kill_desc, "min watermark is breached even after kill", sizeof(kill_desc));
} else if (level == VMPRESS_LEVEL_CRITICAL && events != 0) {
// ......
} else if (swap_is_low && thrashing > thrashing_limit_pct) {
// ......
} else if (/*...*/)
2.3 Killing processes
By now the kill reason and the lowest killable adj are known, and find_and_kill_process is called to do the killing.
if (kill_reason != NONE) {
struct kill_info ki = {
.kill_reason = kill_reason,
.kill_desc = kill_desc,
.thrashing = (int)thrashing,
.max_thrashing = max_thrashing,
};
// ...
int pages_freed = find_and_kill_process(min_score_adj, &ki, &mi, &wi, &curr_tm, &psi_data);
if (pages_freed > 0) {
killing = true;
// ...
}
}
find_and_kill_process selects a suitable process to kill among those whose adj is greater than or equal to min_score_adj.
static int find_and_kill_process(int min_score_adj, struct kill_info *ki, union meminfo *mi,
struct wakeup_info *wi, struct timespec *tm,
struct psi_data *pd) {
int i;
int killed_size = 0;
bool lmk_state_change_start = false;
bool choose_heaviest_task = kill_heaviest_task;
// iterate downward starting from 1000 (OOM_SCORE_ADJ_MAX)
for (i = OOM_SCORE_ADJ_MAX; i >= min_score_adj; i--) {
struct proc *procp;
if (!choose_heaviest_task && i <= PERCEPTIBLE_APP_ADJ) {
/*
* If we have to choose a perceptible process, choose the heaviest one to
* hopefully minimize the number of victims.
* (Perceptible here means adj <= 200.)
*/
choose_heaviest_task = true;
}
while (true) {
procp = choose_heaviest_task ?
proc_get_heaviest(i) : proc_adj_tail(i); // among processes with adj == i, pick the heaviest one or the one at the tail of the list
if (!procp)
break;
killed_size = kill_one_process(procp, min_score_adj, ki, mi, wi, tm, pd);
if (killed_size >= 0) {
break;
}
}
// once a kill has happened, stop and do not go on to lower adj levels
if (killed_size) {
break;
}
}
return killed_size;
}
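find_and_kill_process shows that every time lmkd triggers a kill, it walks down from adj 1000, looking for a suitable process at each level, and returns once a kill has happened. Whether candidates are picked with proc_get_heaviest or proc_adj_tail is controlled by the kill_heaviest_task system property. Note, however, that once the search reaches adj 200 or below, lmkd has no choice but to kill a user-perceptible process, and at that point it always picks the "heaviest" one.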
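The traversal can be sketched as a small simulation. This is a simplified model and not the real implementation: Proc, ProcTable and pick_victim are hypothetical names, a std::map of vectors stands in for lmkd's per-adj lists (the last vector element plays the role of the list tail), and the retry that happens when kill_one_process fails is left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical process record: pid plus resident size in pages.
struct Proc {
    int pid;
    long rss_pages;
};

// adj level -> processes at that level, in list order.
using ProcTable = std::map<int, std::vector<Proc>>;

const int kOomScoreAdjMax = 1000;
const int kPerceptibleAppAdj = 200;

// Returns the pid that would be picked first, or -1 if no process
// has an adj greater than or equal to min_score_adj.
int pick_victim(const ProcTable& table, int min_score_adj, bool kill_heaviest_task) {
    assert(min_score_adj >= -1000 && min_score_adj <= kOomScoreAdjMax + 1);
    bool choose_heaviest = kill_heaviest_task;
    for (int adj = kOomScoreAdjMax; adj >= min_score_adj; adj--) {
        // At perceptible levels the heaviest process is always chosen.
        if (!choose_heaviest && adj <= kPerceptibleAppAdj) {
            choose_heaviest = true;
        }
        auto it = table.find(adj);
        if (it == table.end() || it->second.empty()) {
            continue;
        }
        const std::vector<Proc>& procs = it->second;
        if (!choose_heaviest) {
            return procs.back().pid;  // stand-in for proc_adj_tail
        }
        const Proc* heaviest = &procs.front();
        for (const Proc& p : procs) {
            if (p.rss_pages > heaviest->rss_pages) {
                heaviest = &p;
            }
        }
        return heaviest->pid;  // stand-in for proc_get_heaviest
    }
    return -1;
}
```

The point of the model is that kill_heaviest_task only matters above adj 200: below that threshold both settings end up choosing by size.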
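The "heaviest" process is determined in proc_get_heaviest: for each process at that adj level it reads "/proc/[pid]/statm" to get the process's RSS, and it keeps the process with the largest RSS.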
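As a rough illustration of that lookup, the sketch below parses the resident field out of a statm line and picks the largest one. rss_pages_from_statm and heaviest_index are hypothetical names, and the input is passed in as strings so the example does not depend on a live /proc.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Extracts the resident set size, in pages, from the contents of a statm line.
// Fields are: size resident shared text lib data dt.
// Returns -1 if the line does not start with two numeric fields.
long rss_pages_from_statm(const std::string& statm_line) {
    std::istringstream in(statm_line);
    long total = 0;
    long resident = 0;
    if (!(in >> total >> resident)) {
        return -1;
    }
    return resident;
}

// Returns the index of the line with the largest RSS; lines must not be empty.
std::size_t heaviest_index(const std::vector<std::string>& statm_lines) {
    assert(!statm_lines.empty());
    std::size_t best = 0;
    long best_rss = rss_pages_from_statm(statm_lines[0]);
    for (std::size_t i = 1; i < statm_lines.size(); i++) {
        long rss = rss_pages_from_statm(statm_lines[i]);
        if (rss > best_rss) {
            best = i;
            best_rss = rss;
        }
    }
    return best;
}
```

The value is in pages, so it has to be multiplied by the page size to get bytes.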