KLEE中Seed模式流程解析

main.cpp

1
2
3
4
5
cl::list<std::string>
SeedOutFile("seed-out");

cl::list<std::string>
SeedOutDir("seed-out-dir");

在main.cpp中首先会对ReplayKTestDir和ReplayKTestFile做判断. 如果用户有指定, 那么会继续判断SeedOutDir和SeedOutFile是否为空.

1
2
3
4
5
6
7
8
9
10
11
12
if (!ReplayKTestDir.empty() || !ReplayKTestFile.empty()) {
assert(SeedOutFile.empty());
assert(SeedOutDir.empty());
...
// replay ktest 文件模式
} else { // 没有指定replay ktest模式
// 直接从SeedOutFile读取seed
for(){
KTest *out = kTest_fromFile(it->c_str());
}
seeds.push_back(out);
}

最终读取到的seed都保存在std::vector<KTest *> seeds中. 然后判断是否有读取到seed

1
2
3
4
5
if (!seeds.empty()) {
klee_message("KLEE: using %lu seeds\n", seeds.size());
// 使用 seed
interpreter->useSeeds(&seeds);
}

既然读取到了seed, 那就interpreter->useSeeds(&seeds);来使用seed. 这之后除了切换目录以及清理seeds的操作, 我们需要重点关注的是interpreter->runFunctionAsMain(mainFn, pArgc, pArgv, pEnvp);. 在interpreter设置好seed后, seed的使用过程会在runFunctionAsMain开始.

1
2
3
4
5
6
7
8
9
if (RunInDir != "") {... // 切换目录
}
// 运行main函数
interpreter->runFunctionAsMain(mainFn, pArgc, pArgv, pEnvp);

while (!seeds.empty()) { // 将seeds里的内容全部释放掉
kTest_free(seeds.back());
seeds.pop_back();
}

useSeeds

既然谈到interpreter->useSeeds(&seeds);那就来看看这个interpreter

1
2
Interpreter *interpreter =
theInterpreter = Interpreter::create(ctx, IOpts, handler);

这里实例化了一个interpreter. 而该方法的声明是在interpreter.h里, 实现却是在Executor.cpp

1
2
3
4
Interpreter *Interpreter::create(LLVMContext &ctx, const InterpreterOptions &opts,
InterpreterHandler *ih) {
return new Executor(ctx, opts, ih);
}

也就是说实际上就是创建了一个Executor实例. 而Executor继承自interpreter类, 像useSeedsrunFunctionAsMain都在Executor类里进行了定义. 所以interpreter->useSeeds(&seeds)实际上就是Executor::useSeeds, interpreter::runFunctionAsMain就是Executor::runFunctionAsMain. 那么接下来就只看Executor

1
2
3
4
5
// Executor.h
const std::vector<struct KTest *> *usingSeeds;
void useSeeds(const std::vector<struct KTest *> *seeds) override {
usingSeeds = seeds;
}

这里将seeds传递给了Executor的私有成员usingSeeds. 那么这也就是interpreter->useSeeds()所做的操作了

runFunctionAsMain

回顾一下runFunctionAsMain的使用: interpreter->runFunctionAsMain(mainFn, pArgc, pArgv, pEnvp);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
void Executor::runFunctionAsMain(Function *f,
int argc,
char **argv,
char **envp) {
std::vector<ref<Expr> > arguments;

// force deterministic initialization of memory objects
srand(1);
srandom(1);

// MemoryObject 表示分配在某个地址上的对象
MemoryObject *argvMO = 0;

int envc;
for (envc=0; envp[envc]; ++envc) ; // envc表示环境变量的数量

unsigned NumPtrBytes = Context::get().getPointerWidth() / 8;
// 这里f对应的函数为 main, 使用KFunction进行结构化
KFunction *kf = kmodule->functionMap[f];
assert(kf);
// 枚举main好参数
Function::arg_iterator ai = f->arg_begin(), ae = f->arg_end();
// 为argc个参数和envc个环境变量分配空间
......

ExecutionState *state = new ExecutionState(kmodule->functionMap[f]);

if (pathWriter)
// 初始化 记录混合符号执行路径的流 pathOS
state->pathOS = pathWriter->open();
if (symPathWriter)
// 初始化 记录单纯符号执行路径的流 pathOS
state->symPathOS = symPathWriter->open();


if (statsTracker)
// 记录符号执行时的状态切换,
// 参数 parentFrame 为0, 说明是初始状态
statsTracker->framePushed(*state, 0);

assert(arguments.size() == f->arg_size() && "wrong number of arguments");
for (unsigned i = 0, e = f->arg_size(); i != e; ++i)
bindArgument(kf, i, *state, arguments[i]);

// 将参数用 MemoryObject 和 ObjectState 等类进行封装, 以便klee使用
......

// 初始化了一些state相关的全局变量
initializeGlobals(*state);

// 实例化一个当前进程的进程树
processTree = new PTree(state);
// 将当前进程作为进程树的根结点
state->ptreeNode = processTree->root;
run(*state);
// run(*state)的善后处理
delete processTree;
processTree = 0;

// hack to clear memory objects
delete memory;
memory = new MemoryManager(NULL);

globalObjects.clear();
globalAddresses.clear();

if (statsTracker)
statsTracker->done();
}

我用 ……省略了在 runFunctionAsMain 中有关参数的部分, 以便更加清晰地来看这个函数. runFunctionAsMain的前半部分都是在做参数的相关操作, 比如为参数分配空间, 将参数进行封装, 将参数与当前状态绑定之类. 后半部分则是关于state的操作, 再为state进行了一些初始化操作后, 使用run(*state)来运行状态. 这也是runFunctionAsMain里我们的重要关注点.

run

联系上部分的interpreter->useSeeds, 在设置好usingSeeds, 而usingSeeds仅在Executor::run中使用. 那么我们就直接来分析run()来看seed对klee执行流程的影响

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
void Executor::run(ExecutionState &initialState) {
// 对模块中每个函数的每条指令的常数进行处理
// 将处理后的结果存储在某个数据结构中
bindModuleConstants();

// 初始化计时器
initTimers();
// 将初始状态 initialState 添加到 states 这个全局变量中
states.insert(&initialState);

if (usingSeeds) { ... // 有传入seed
}
// 创建一个searcher实例
searcher = constructUserSearcher(*this);
// 初始化searcher操作
std::vector<ExecutionState *> newStates(states.begin(), states.end());
searcher->update(0, newStates, std::vector<ExecutionState *>());

while (!states.empty() && !haltExecution) {
// 通过searcher采取的搜索策略来搜索执行下一个state
ExecutionState &state = searcher->selectState();
// ki是当前指令
KInstruction *ki = state.pc;
// 指向下一条指令
stepInstruction(state);
// 基于llvm指令集, 对指令字节码进行具体解析并模拟执行
executeInstruction(state, ki);
// 处理计时器
processTimers(&state, MaxInstructionTime);
// 检查内存使用情况
checkMemoryUsage();
// 对states进行更新
// 不同的搜索策略, 其更新states的策略不同
updateStates(&state);
}

// 善后工作
delete searcher;
searcher = 0;

doDumpStates();
}

如果没有传入seed, usingSeeds为空, 那么run()的流程从上到下就是:

  1. 对模块中每条指令中的常数进行绑定
  2. 将初始状态加入到state中
  3. 创建一个searcher实例并初始化
  4. 不断循环, 直到state为空
    • 根据searcher的搜索策略选择下一个要执行的state
    • 获取当前指令, 保存为ki, 并将pc指向下一条指令
    • executeInstruction执行当前状态的当前指令
    • 处理计时器, 检查内存使用情况, 更新state
  5. 清理工作

实际上, 也就是常规符号执行的正常流程. 也就是说这里并没有对seed进行处理. seed也没有影响到这里的常规执行

那么我们重点关注的地方就是usingSeeds部分

usingSeeds

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
if (usingSeeds) {
// 获得一个 seedMap[&initialState] 的引用
std::vector<SeedInfo> &v = seedMap[&initialState];
// 通过引用, 将 usingSeeds 里的 seed 内容存进 seedMap[&initialState] 中
for (std::vector<KTest*>::const_iterator it = usingSeeds->begin(),
ie = usingSeeds->end(); it != ie; ++it)
v.push_back(SeedInfo(*it));

// 辅助用的变量 初始化
int lastNumSeeds = usingSeeds->size()+10;
double lastTime, startTime = lastTime = util::getWallTime();
ExecutionState *lastState = 0;
while (!seedMap.empty()) {
// 中断
if (haltExecution) {
doDumpStates();
return;
}

std::map<ExecutionState*, std::vector<SeedInfo> >::iterator it =
seedMap.upper_bound(lastState);
if (it == seedMap.end())
it = seedMap.begin();
lastState = it->first;
unsigned numSeeds = it->second.size();
ExecutionState &state = *lastState;
// ki是当前指令
KInstruction *ki = state.pc;
// 指向下一条指令
stepInstruction(state);
// 基于llvm指令集, 对指令字节码进行具体解析并模拟执行
executeInstruction(state, ki);
// 处理计时器
processTimers(&state, MaxInstructionTime * numSeeds);
// 对states进行更新
updateStates(&state);

// 每执行 1000 条语句 打印一次提示信息
if ((stats::instructions % 1000) == 0) {...
}
}

klee_message("seeding done (%d states remain)", (int) states.size());

// XXX total hack, just because I like non uniform better but want
// seed results to be equally weighted.
for (std::set<ExecutionState*>::iterator
it = states.begin(), ie = states.end();
it != ie; ++it) {
(*it)->weight = 1.;
}

if (OnlySeed) {
doDumpStates();
return;
}
}

再看usingSeeds分支内的语句. 首先通过引用, 将 usingSeeds 里的 seed 内容存进 seedMap[&initialState] 中, 然后初始化了一些辅助变量后, 进入一个循环知道seedMap为空. 那么很显然在这个循环里, 一定对seedMap进行了处理, 不然就是一个死循环了.

来看循环里的代码, 跟之前usingSeeds以外的常规符号执行流程中, 最明显的区别就是没有用到searcher, 没有用到对应的搜索策略来选择下一个要执行的state. 而是使用的seedMap里各个seed对应的state.

那么在这个循环里面, 我们检查了这几个函数. 只有executeInstruction(state, ki);有可能包含对seed的操作.

因为在执行指令的时候并没有显式地将seedMap作参在执行时使用, 而seedMap是全局变量, 可以随时被调用, 而只需要比对state也能找到处理state时应该对应的seed. 因此我们就来看下整个klee工程里哪里有用到seedMap

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
branch:
lib/Core/Executor.cpp: seedMap.find(&state);
lib/Core/Executor.cpp: if (it != seedMap.end()) {
lib/Core/Executor.cpp: seedMap.erase(it);
lib/Core/Executor.cpp: seedMap[result[i]].push_back(*siit);
lib/Core/Executor.cpp: if (result[i] && !seedMap.count(result[i])) {
fork:
lib/Core/Executor.cpp: seedMap.find(¤t);
lib/Core/Executor.cpp: bool isSeeding = it != seedMap.end();
lib/Core/Executor.cpp: if (it != seedMap.end()) {
lib/Core/Executor.cpp: std::vector<SeedInfo> &trueSeeds = seedMap[trueState];
lib/Core/Executor.cpp: std::vector<SeedInfo> &falseSeeds = seedMap[falseState];
lib/Core/Executor.cpp: seedMap.erase(trueState);
lib/Core/Executor.cpp: seedMap.erase(falseState);
addConstraint:
lib/Core/Executor.cpp: seedMap.find(&state);
lib/Core/Executor.cpp: if (it != seedMap.end()) {
executeGetValue:
lib/Core/Executor.cpp: seedMap.find(&state);
lib/Core/Executor.cpp: if (it==seedMap.end() || isa<ConstantExpr>(e)) {
updateStates:
lib/Core/Executor.cpp: seedMap.find(es);
lib/Core/Executor.cpp: if (it3 != seedMap.end())
lib/Core/Executor.cpp: seedMap.erase(it3);
run:
lib/Core/Executor.cpp: std::vector<SeedInfo> &v = seedMap[&initialState];
lib/Core/Executor.cpp: while (!seedMap.empty()) {
lib/Core/Executor.cpp: seedMap.upper_bound(lastState);
lib/Core/Executor.cpp: if (it == seedMap.end())
lib/Core/Executor.cpp: it = seedMap.begin();
lib/Core/Executor.cpp: it = seedMap.begin(), ie = seedMap.end();
terminateState:
lib/Core/Executor.cpp: seedMap.find(&state);
lib/Core/Executor.cpp: if (it3 != seedMap.end())
lib/Core/Executor.cpp: seedMap.erase(it3);
terminateStateEarly:
lib/Core/Executor.cpp: (AlwaysOutputSeeds && seedMap.count(&state)))
terminateStateOnExit:
lib/Core/Executor.cpp: (AlwaysOutputSeeds && seedMap.count(&state)))
executeMakeSymbolic:
lib/Core/Executor.cpp: seedMap.find(&state);
lib/Core/Executor.cpp: if (it!=seedMap.end()) { // In seed mode we need to add this as a

总结一下, 有这些函数用到了seedMap: branch, fork, addConstraint, executeGetValue, updateStates, run, terminateState, terminateStateEarly, terminateStateOnExit, executeMakeSymbolic一共十处

那么我们就一个个来看(其中run已经分析过, 除外), seedMap到底是如何影响klee的执行流程的, 以及executeInstruction是通过哪一种方式来操作seedMap

branch

首先明确Executor::branch仅在executeGetValueexecuteInstructionIndirectBrswitch分支用到.

1
2
std::vector<ExecutionState*> branches;
branch(state, conditions, branches);

作用则是在遇到分支的时候, 判断条件是否满足, 并将对应的条件复制到各自的branches里. 而在branch

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
if (MaxForks!=~0u && stats::forks >= MaxForks) {
unsigned next = theRNG.getInt32() % N;
for (unsigned i=0; i<N; ++i) {
if (i == next) {
result.push_back(&state);
} else {
result.push_back(NULL);
}
}
} else {
stats::forks += N-1;

// XXX do proper balance or keep random?
result.push_back(&state);
for (unsigned i=1; i<N; ++i) {
ExecutionState *es = result[theRNG.getInt32() % i];
ExecutionState *ns = es->branch();
addedStates.push_back(ns);
result.push_back(ns);
es->ptreeNode->data = 0;
std::pair<PTree::Node*,PTree::Node*> res =
processTree->split(es->ptreeNode, ns, es);
ns->ptreeNode = res.first;
es->ptreeNode = res.second;
}
}

首先会判断是否超出了状态fork的最大数量MaxForks, 如果没有超过, 那么就会示例化一个ExecutionState对象es, 并使用es->branch()来分出对应的分支. 这里的branch()的原型是ExecutionState::branch(), 就不要混淆了.

1
2
3
4
5
6
7
8
9
10
11
12
ExecutionState *ExecutionState::branch() {
depth++;

ExecutionState *falseState = new ExecutionState(*this);
falseState->coveredNew = false;
falseState->coveredLines.clear();

weight *= .5;
falseState->weight -= weight;

return falseState;
}

以上是ExecutionState::branch()的代码判断. 可以看到这里以当前状态复制了一个falseState出来并进行了相关设置, 也就达成了分支所需要的效果.

在下一部分的fork()中也是使用branch()的方法对状态进行分支.

然后我们再来看Executor::branch中跟seed相关的代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
// 如有必要, 重新分配种子以满足条件 
// 必要时根据OnlyReplaySeeds来终止状态 (低效但简单)

// 返回 state 对应的迭代器
std::map< ExecutionState*, std::vector<SeedInfo> >::iterator it =
seedMap.find(&state);

if (it != seedMap.end()) {
std::vector<SeedInfo> seeds = it->second;
seedMap.erase(it);

// 假定每个种子仅满足一个条件(当条件互相排斥但结合起来又重言时, 就必然是true)
for (std::vector<SeedInfo>::iterator siit = seeds.begin(),
siie = seeds.end(); siit != siie; ++siit) {
unsigned i;
for (i=0; i<N; ++i) {
ref<ConstantExpr> res;
bool success =
solver->getValue(state, siit->assignment.evaluate(conditions[i]),
res);
assert(success && "FIXME: Unhandled solver failure");
(void) success;
if (res->isTrue())
break;
}

// If we didn't find a satisfying condition randomly pick one
// (the seed will be patched).
if (i==N)
i = theRNG.getInt32() % N;

// Extra check in case we're replaying seeds with a max-fork
if (result[i])
seedMap[result[i]].push_back(*siit);
}

if (OnlyReplaySeeds) {
for (unsigned i=0; i<N; ++i) {
if (result[i] && !seedMap.count(result[i])) {
terminateState(*result[i]);
result[i] = NULL;
}
}
}
}

这段代码中重点需要关注的是solver->getValue(current, siit->assignment.evaluate(condition), res);. 这一句的作用是用来判断我们的seed是否满足相应的condition, 并将结果保存在res里. 如果seed满足condition, 那么就直接跳出循环执行seed指定的路径. 最后的OnlyReplaySeeds也是用来提前终止那些不包含seed的状态.

fork

1
2
3
std::map< ExecutionState*, std::vector<SeedInfo> >::iterator it = 
seedMap.find(¤t);
bool isSeeding = it != seedMap.end();

首先这里取了一个迭代器it以及布尔值isSeeding判断当前是否还在seed模式下. 那么我们重点关注迭代器it的相关操作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
if (isSeeding && 
(current.forkDisabled || OnlyReplaySeeds) &&
res == Solver::Unknown) {
bool trueSeed=false, falseSeed=false;
// Is seed extension still ok here?
for (std::vector<SeedInfo>::iterator siit = it->second.begin(),
siie = it->second.end(); siit != siie; ++siit) {
ref<ConstantExpr> res;
bool success =
solver->getValue(current, siit->assignment.evaluate(condition), res);
assert(success && "FIXME: Unhandled solver failure");
(void) success;
if (res->isTrue()) {
trueSeed = true;
} else {
falseSeed = true;
}
if (trueSeed && falseSeed)
break;
}
if (!(trueSeed && falseSeed)) { // trueSeed和falseSeed至少有1个为假
assert(trueSeed || falseSeed); // trueSeed和falseSeed至少有1个为真

res = trueSeed ? Solver::True : Solver::False;
addConstraint(current, trueSeed ? condition : Expr::createIsZero(condition));
}
}

同样使用solver->getValue()来获取seed与condition的关系, 并将结果保存到res里. res->isTrue表明种子满足条件, 为真, 否则不满足条件为假. 当判定seed是trueSeed还是falseSeed后则添加相应的约束. 如果无法判定真假, 则跳出循环

addConstraint

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
void Executor::addConstraint(ExecutionState &state, ref<Expr> condition) {
if (ConstantExpr *CE = dyn_cast<ConstantExpr>(condition)) {
if (!CE->isTrue())
llvm::report_fatal_error("attempt to add invalid constraint");
return;
}

// Check to see if this constraint violates seeds.
std::map< ExecutionState*, std::vector<SeedInfo> >::iterator it =
seedMap.find(&state);
if (it != seedMap.end()) {
bool warn = false;
for (std::vector<SeedInfo>::iterator siit = it->second.begin(),
siie = it->second.end(); siit != siie; ++siit) {
bool res;
bool success =
solver->mustBeFalse(state, siit->assignment.evaluate(condition), res);
assert(success && "FIXME: Unhandled solver failure");
(void) success;
if (res) {
siit->patchSeed(state, condition, solver);
warn = true;
}
}
if (warn)
klee_warning("seeds patched for violating constraint");
}

state.addConstraint(condition);
if (ivcEnabled)
doImpliedValueConcretization(state, condition,
ConstantExpr::alloc(1, Expr::Bool));
}

addConstraint里, 使用seedMap来取得state对应的seed. 并用solver->mustBeFalse(state, siit->assignment.evaluate(condition), res);判断seed与该condition是否满足永假关系, 也即约束是否跟seed相违背. 如果相违背则patchSeed并输出警告信息

executeGetValue

executeGetValue仅在SpecialFunctionHandler::handleGetValue中用到

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
  add("klee_get_valuef", handleGetValue, true),
add("klee_get_valued", handleGetValue, true),
add("klee_get_valuel", handleGetValue, true),
add("klee_get_valuell", handleGetValue, true),
add("klee_get_value_i32", handleGetValue, true),
add("klee_get_value_i64", handleGetValue, true),
......
void SpecialFunctionHandler::handleGetValue(ExecutionState &state,
KInstruction *target,
std::vector<ref<Expr> > &arguments) {
assert(arguments.size()==1 &&
"invalid number of arguments to klee_get_value");

executor.executeGetValue(state, arguments[0], target);
}

arguments[0]则是klee_get_value的第一个参数. 也就是通过executeGetValue来获取对应的符号的值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
void Executor::executeGetValue(ExecutionState &state,
ref<Expr> e,
KInstruction *target) {
e = state.constraints.simplifyExpr(e);
std::map< ExecutionState*, std::vector<SeedInfo> >::iterator it =
seedMap.find(&state);
if (it==seedMap.end() || isa<ConstantExpr>(e)) {
ref<ConstantExpr> value;
bool success = solver->getValue(state, e, value);
assert(success && "FIXME: Unhandled solver failure");
(void) success;
bindLocal(target, state, value);
} else {
std::set< ref<Expr> > values;
for (std::vector<SeedInfo>::iterator siit = it->second.begin(),
siie = it->second.end(); siit != siie; ++siit) {
ref<ConstantExpr> value;
bool success =
solver->getValue(state, siit->assignment.evaluate(e), value);
assert(success && "FIXME: Unhandled solver failure");
(void) success;
values.insert(value);
}

std::vector< ref<Expr> > conditions;
for (std::set< ref<Expr> >::iterator vit = values.begin(),
vie = values.end(); vit != vie; ++vit)
conditions.push_back(EqExpr::create(e, *vit));

std::vector<ExecutionState*> branches;
branch(state, conditions, branches);

std::vector<ExecutionState*>::iterator bit = branches.begin();
for (std::set< ref<Expr> >::iterator vit = values.begin(),
vie = values.end(); vit != vie; ++vit) {
ExecutionState *es = *bit;
if (es)
bindLocal(target, *es, *vit);
++bit;
}
}
}

类似, 同solver->getValue(state, siit->assignment.evaluate(e), value);

updateStates

updateStates里进行的操作是将seedMap里的state给擦除掉.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
void Executor::updateStates(ExecutionState *current) {
if (searcher) {
searcher->update(current, addedStates, removedStates);
searcher->update(nullptr, continuedStates, pausedStates);
pausedStates.clear();
continuedStates.clear();
}

states.insert(addedStates.begin(), addedStates.end());
addedStates.clear();

// 逐个擦除掉
for (std::vector<ExecutionState *>::iterator it = removedStates.begin(),
ie = removedStates.end();
it != ie; ++it) {
ExecutionState *es = *it;
std::set<ExecutionState*>::iterator it2 = states.find(es);
assert(it2!=states.end());
states.erase(it2);
// 在seedMap里擦除掉es
std::map<ExecutionState*, std::vector<SeedInfo> >::iterator it3 =
seedMap.find(es);
if (it3 != seedMap.end())
seedMap.erase(it3);
processTree->remove(es->ptreeNode);
delete es;
}
removedStates.clear();
}

searcher->update会将需要擦除的removedStates列出来, 然后在循环里将removeStatesstates中擦除, 如果在seedMap里有removedStates的话, 也一并擦除掉.

terminateState相关

terminateState, terminateState, terminateStateOnExit使用seedMap都是对state进行擦除做的清理工作. 因为这个state已经被终结了, 因此会对seedMap进行擦除, 以提高效率

executeMakeSymbolic

executeMakeSymbolic只由handleMakeSymbolic一处调用, 而handleMakeSymbolic只在处理klee_make_symbolic时会起作用.

1
add("klee_make_symbolic", handleMakeSymbolic, false),

因此可以确定的是executeMakeSymbolic也只有在处理klee_make_symbolic时会触发.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
void Executor::executeMakeSymbolic(ExecutionState &state, 
const MemoryObject *mo,
const std::string &name) {
// Create a new object state for the memory object (instead of a copy).
if (!replayKTest) {
// Find a unique name for this array. First try the original name,
// or if that fails try adding a unique identifier.
unsigned id = 0;
std::string uniqueName = name;
while (!state.arrayNames.insert(uniqueName).second) {
uniqueName = name + "_" + llvm::utostr(++id);
}
const Array *array = arrayCache.CreateArray(uniqueName, mo->size);
bindObjectInState(state, mo, false, array);
state.addSymbolic(mo, array);

std::map< ExecutionState*, std::vector<SeedInfo> >::iterator it =
seedMap.find(&state);
if (it!=seedMap.end()) { // In seed mode we need to add this as a
// binding.
for (std::vector<SeedInfo>::iterator siit = it->second.begin(),
siie = it->second.end(); siit != siie; ++siit) {
SeedInfo &si = *siit;
// 如果指定NamedSeedMatching 获取seed中跟mo->name相同的对象
// 否则, 从上至下依次返回mo的对象
KTestObject *obj = si.getNextInput(mo, NamedSeedMatching);

if (!obj) { // 没有可取对象
if (ZeroSeedExtension) {
std::vector<unsigned char> &values = si.assignment.bindings[array];
values = std::vector<unsigned char>(mo->size, '\0');
} else if (!AllowSeedExtension) {
// seed跟内存对象mo的对象数量不一致
terminateStateOnError(state, "ran out of inputs during seeding",
User);
break;
}
} else {
if (obj->numBytes != mo->size && // 对象大小不匹配
((!(AllowSeedExtension || ZeroSeedExtension)
&& obj->numBytes < mo->size) || // obj大小 < mo大小 又不准扩展种子
(!AllowSeedTruncation && obj->numBytes > mo->size))) { // obj大小 > mo大小, 又不准截断种子
// 输出错误信息并退出
std::stringstream msg;
msg << "replace size mismatch: "
<< mo->name << "[" << mo->size << "]"
<< " vs " << obj->name << "[" << obj->numBytes << "]"
<< " in test\n";

terminateStateOnError(state, msg.str(), User);
break;
} else {
// 开始处理 seed, 进行绑定
std::vector<unsigned char> &values = si.assignment.bindings[array];
values.insert(values.begin(), obj->bytes,
obj->bytes + std::min(obj->numBytes, mo->size));
if (ZeroSeedExtension) {
for (unsigned i=obj->numBytes; i<mo->size; ++i)
values.push_back('\0');
}
}
}
}
}
} else { ... // replay ktest 分支
}
}

executeMakeSymbolic中, 首先会根据seedMap提取出所有的obj, 并与mo进行比较, 如果跟mo的大小不匹配, 就会出现一些错误信息. 在真正处理seed时则会将其绑定给si.assignment.bindings[array]. 如果没有seed模式. 也就只能执行前半部分

1
2
3
4
5
6
7
8
unsigned id = 0;
std::string uniqueName = name;
while (!state.arrayNames.insert(uniqueName).second) {
uniqueName = name + "_" + llvm::utostr(++id);
}
const Array *array = arrayCache.CreateArray(uniqueName, mo->size);
bindObjectInState(state, mo, false, array);
state.addSymbolic(mo, array);