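This is the second post in the series. The first post is here: "golang深入源代碼之一:AST的遍歷" (Diving into Go source code, part 1: traversing the AST).
How to build the function call relationships inside a project
In some scenarios you need to analyze the function call relationships inside a project. An IDE can do part of this, but it cannot give you a complete call chain. The Go AST traversal covered in the first post can solve the problem: analyze every possible call inside each ast.FuncDecl and record every A->B call relationship. This post does not use the AST directly, however; it relies on the full-featured toolchain packages that Go provides.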
An example
The test project has the following file layout:
-- /example/test3.go
-- /example/test4.go
-- /example/inner/itest1.go
It contains cross-file calls, cross-package calls, and method calls.
/example/test3.go is as follows:
package main

import (
	"context"
	"fmt"

	"github.com/baixiaoustc/go_code_analysis/example/inner"
)

func main() {
	fmt.Println("start")
	Test3()
	test3a()
	test3c()
	go receiveFromKafka()
	select {}
}

func Test3() {
	fmt.Println("test3")
	test3b()
}

type XYZ struct {
	Name string
}

func (xyz XYZ) print() {
	fmt.Println(xyz.Name)
	context.WithCancel(nil)
}

func test3a() {
	xyz := XYZ{"hello"}
	xyz.print()
}

func test3b() {
	test3b()
	inner.Itest1()
}

func test3c() {
	go func() {
		fmt.Println("go")
	}()
	test4a("world")
}
/example/test4.go is as follows:
package main

import (
	"context"
	"fmt"
)

func test4a(a string) {
	fmt.Println(a)
	context.WithCancel(nil)
}

func test4b(a string) {
	fmt.Println(a)
	context.WithCancel(nil)
}

func receiveFromKafka() {
	test4a("kafka")
	test4b("kafka")
}
/example/inner/itest1.go is as follows:
package inner

import "context"

func Itest1() {
	context.WithCancel(nil)
}
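As a first step, suppose we want to find the call chains that lead from the top down to test4a. How should we go about it?
Using the static analysis toolchain that Go provides
We rely on the following three Go tool packages: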
- "golang.org/x/tools/go/loader"
- "golang.org/x/tools/go/pointer"
- "golang.org/x/tools/go/ssa"
go/loader
Package loader loads a complete Go program from source code, parsing and type-checking the initial packages plus their transitive closure of dependencies. The ASTs and the derived facts are retained for later use.
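The official description of the package is quoted above. In short, it loads a whole program from source, parses and type-checks the code, resolves the dependencies between packages, and keeps the ASTs and the facts derived from them for later use.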
go/ssa
Package ssa defines a representation of the elements of Go programs (packages, types, functions, variables and constants) using a static single-assignment (SSA) form intermediate representation (IR) for the bodies of functions.
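SSA (Static Single Assignment) is an intermediate representation that sits between source code and machine code; in SSA form every variable is assigned exactly once. After the AST has been converted to SSA, the compiler applies a series of optimizations. These optimizations run at specific phases so that the processor can execute the resulting code more simply and quickly.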
go/pointer
Package pointer implements Andersen's analysis, an inclusion-based pointer analysis algorithm first described in (Andersen, 1994).
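Pointer analysis is a special class of data-flow problem and is the foundation of other static program analyses. The algorithm ultimately builds the points-to relationships between nodes; for details, see the article "Andersen's pointer analysis".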
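To make "inclusion-based" a little more concrete, here is a toy solver for the four classic constraint forms, iterated until nothing changes. It is a simplified sketch written for this post, not how go/pointer is implemented; the constraint type, the solve function, and the variables p, q, r, s, a, b are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Constraint kinds of a toy inclusion-based (Andersen-style) analysis.
const (
	addrOf = iota // dst = &src : src is added to pts(dst)
	copyOf        // dst = src  : pts(src) is included in pts(dst)
	load          // dst = *src : for each o in pts(src), pts(o) is included in pts(dst)
	store         // *dst = src : for each o in pts(dst), pts(src) is included in pts(o)
)

type constraint struct {
	kind     int
	dst, src string
}

// solve re-applies all constraints until no points-to set grows any more.
func solve(cs []constraint) map[string]map[string]bool {
	pts := map[string]map[string]bool{}
	// add puts object o into pts(v) and reports whether the set grew.
	add := func(v, o string) bool {
		if pts[v] == nil {
			pts[v] = map[string]bool{}
		}
		if pts[v][o] {
			return false
		}
		pts[v][o] = true
		return true
	}
	// include merges pts(from) into pts(to) and reports whether pts(to) grew.
	include := func(to, from string) bool {
		grew := false
		for o := range pts[from] {
			if add(to, o) {
				grew = true
			}
		}
		return grew
	}
	for changed := true; changed; {
		changed = false
		for _, c := range cs {
			switch c.kind {
			case addrOf:
				changed = add(c.dst, c.src) || changed
			case copyOf:
				changed = include(c.dst, c.src) || changed
			case load:
				for o := range pts[c.src] {
					changed = include(c.dst, o) || changed
				}
			case store:
				for o := range pts[c.dst] {
					changed = include(o, c.src) || changed
				}
			}
		}
	}
	return pts
}

// sorted returns the members of a points-to set in a stable order.
func sorted(set map[string]bool) []string {
	out := make([]string, 0, len(set))
	for o := range set {
		out = append(out, o)
	}
	sort.Strings(out)
	return out
}

func main() {
	pts := solve([]constraint{
		{addrOf, "p", "a"}, // p = &a
		{copyOf, "q", "p"}, // q = p
		{addrOf, "r", "b"}, // r = &b
		{store, "q", "r"},  // *q = r
		{load, "s", "p"},   // s = *p
	})
	for _, v := range []string{"p", "q", "r", "a", "s"} {
		fmt.Println(v, "->", sorted(pts[v]))
	}
}
```

For call-graph construction the same idea matters because the targets of interface and function-value calls are found by asking which functions a value may point to.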
No theoretical study is attempted here; we simply try out the toolchain that Go provides to do semantic analysis of the source code:
import (
	"go/build"
	"log"
	"time"

	"golang.org/x/tools/go/loader"
	"golang.org/x/tools/go/pointer"
	"golang.org/x/tools/go/ssa"
	"golang.org/x/tools/go/ssa/ssautil"
)

var Analysis *analysis

type analysis struct {
	prog   *ssa.Program
	conf   loader.Config
	pkgs   []*ssa.Package
	mains  []*ssa.Package
	result *pointer.Result
}

func doAnalysis(buildCtx *build.Context, tests bool, args []string) {
	// Step 1: load and type-check the packages named in args.
	t0 := time.Now()
	conf := loader.Config{Build: buildCtx}
	_, err := conf.FromArgs(args, tests)
	if err != nil {
		log.Printf("invalid args: %v", err)
		return
	}
	load, err := conf.Load()
	if err != nil {
		log.Printf("failed conf load: %v", err)
		return
	}
	log.Printf("loading.. %d imported (%d created) took: %v",
		len(load.Imported), len(load.Created), time.Since(t0))

	// Step 2: build the SSA form and collect the main packages.
	t0 = time.Now()
	prog := ssautil.CreateProgram(load, 0)
	prog.Build()
	pkgs := prog.AllPackages()
	var mains []*ssa.Package
	if tests {
		for _, pkg := range pkgs {
			if main := prog.CreateTestMainPackage(pkg); main != nil {
				mains = append(mains, main)
			}
		}
		if mains == nil {
			log.Fatalln("no tests")
		}
	} else {
		mains = append(mains, ssautil.MainPackages(pkgs)...)
		if len(mains) == 0 {
			log.Printf("no main packages")
		}
	}
	log.Printf("building.. %d packages (%d main) took: %v",
		len(pkgs), len(mains), time.Since(t0))

	// Step 3: run the pointer analysis and ask it to build the call graph.
	t0 = time.Now()
	ptrcfg := &pointer.Config{
		Mains:          mains,
		BuildCallGraph: true,
	}
	result, err := pointer.Analyze(ptrcfg)
	if err != nil {
		log.Fatalln("analyze failed:", err)
	}
	log.Printf("analysis took: %v", time.Since(t0))

	Analysis = &analysis{
		prog:   prog,
		conf:   conf,
		pkgs:   pkgs,
		mains:  mains,
		result: result,
	}
}
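The result above has type pointer.Result, and the callgraph.Graph it contains holds the relationships between nodes mentioned earlier. It is a directed graph of call edges rather than a strict tree, since recursive calls create cycles. Traversing it depth-first with callgraph.GraphVisitEdges yields the pairwise call relationships between functions.
Three pitfalls need attention:
- Anonymous functions started with go func(){} need special handling; their names carry a $ suffix (test3c$1 in the output further down)
- Methods are labeled as class@func
- Cross-package calls must be handled
For runnable code, see TestAnalysisCallGraphy in https://github.com/baixiaoustc/go_code_analysis/blob/master/second_post_test.go.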
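The naming pitfalls can be illustrated with plain string handling on the function names that appear in the call-graph log. This is a hedged sketch: splitName is a hypothetical helper written for this post, not code from the repository, and it assumes names shaped like "pkg/path.Func", "(pkg/path.Type).Method", and "pkg/path.Func$1".

```go
package main

import (
	"fmt"
	"strings"
)

// splitName splits a logged function name into its package path and a label.
// Methods get the Class@Func form, and names containing "$" are reported as
// anonymous functions. All of this is an assumption made for illustration.
func splitName(full string) (pkgPath, label string, anonymous bool) {
	anonymous = strings.Contains(full, "$")
	if strings.HasPrefix(full, "(") {
		// Method: "(pkg/path.Type).Method" or "(*pkg/path.Type).Method".
		end := strings.LastIndex(full, ").")
		if end < 0 {
			return "", full, anonymous
		}
		recv := strings.TrimPrefix(full[1:end], "*")
		method := full[end+2:]
		dot := strings.LastIndex(recv, ".")
		if dot < 0 {
			return "", recv + "@" + method, anonymous
		}
		return recv[:dot], recv[dot+1:] + "@" + method, anonymous
	}
	// Plain function: "pkg/path.Func".
	dot := strings.LastIndex(full, ".")
	if dot < 0 {
		return "", full, anonymous
	}
	return full[:dot], full[dot+1:], anonymous
}

func main() {
	names := []string{
		"github.com/baixiaoustc/go_code_analysis/example.test4a",
		"(github.com/baixiaoustc/go_code_analysis/example.XYZ).print",
		"github.com/baixiaoustc/go_code_analysis/example.test3c$1",
		"github.com/baixiaoustc/go_code_analysis/example/inner.Itest1",
	}
	for _, n := range names {
		pkg, label, anon := splitName(n)
		fmt.Printf("pkg=%s label=%s anonymous=%v\n", pkg, label, anon)
	}
}
```

Keeping the full package path next to the label is what lets a cross-package callee such as inner.Itest1 be told apart from a same-named function elsewhere.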
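We define the following structures to represent the pairwise call relationships between functions:
// FuncDesc describes a function.
type FuncDesc struct {
	File    string // file path
	Package string // package name
	Name    string // function name; the logged values are bare names such as test4a, or XYZ@print for methods
}

// CallerRelation describes the one-to-many relationship between a caller and the N functions it calls.
type CallerRelation struct {
	Caller  FuncDesc
	Callees []FuncDesc
}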
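As a rough idea of how individual caller/callee edges could be folded into these structures, consider the sketch below. The addEdge helper and the "Package.Name" map key are assumptions made for illustration; the repository's own keys may differ.

```go
package main

import "fmt"

type FuncDesc struct {
	File    string
	Package string
	Name    string
}

type CallerRelation struct {
	Caller  FuncDesc
	Callees []FuncDesc
}

// addEdge records caller -> callee in callMap and skips duplicate callees.
// The "Package.Name" key is a hypothetical choice for this sketch.
func addEdge(callMap map[string]CallerRelation, caller, callee FuncDesc) {
	key := caller.Package + "." + caller.Name
	rel, ok := callMap[key]
	if !ok {
		rel = CallerRelation{Caller: caller}
	}
	for _, c := range rel.Callees {
		if c == callee {
			return // this edge is already recorded
		}
	}
	rel.Callees = append(rel.Callees, callee)
	callMap[key] = rel
}

func main() {
	kafka := FuncDesc{File: "test4.go", Package: "main", Name: "receiveFromKafka"}
	t4a := FuncDesc{File: "test4.go", Package: "main", Name: "test4a"}
	t4b := FuncDesc{File: "test4.go", Package: "main", Name: "test4b"}

	callMap := make(map[string]CallerRelation)
	addEdge(callMap, kafka, t4a)
	addEdge(callMap, kafka, t4b)
	addEdge(callMap, kafka, t4a) // duplicate, ignored

	fmt.Printf("%+v\n", callMap["main.receiveFromKafka"])
}
```

In the real pipeline a function like this would be called from the edge callback of the call-graph traversal, once per visited edge.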
The final output for the example above is as follows (the log prefix 正向調用關系 means "forward call relation"):
2019/01/17 22:28:27 loading.. 1 imported (0 created) took: 1.335481764s
2019/01/17 22:28:28 building.. 24 packages (1 main) took: 250.771602ms
2019/01/17 22:28:28 analysis took: 301.707762ms
2019/01/17 22:28:28 0 limit prefixes: []
2019/01/17 22:28:28 0 ignore prefixes: []
2019/01/17 22:28:28 0 include prefixes: []
2019/01/17 22:28:28 no std packages: true
2019/01/17 22:28:29 call node: n11:github.com/baixiaoustc/go_code_analysis/example.receiveFromKafka -> n719:github.com/baixiaoustc/go_code_analysis/example.test4a
2019/01/17 22:28:29 call node: n11:github.com/baixiaoustc/go_code_analysis/example.receiveFromKafka -> n720:github.com/baixiaoustc/go_code_analysis/example.test4b
2019/01/17 22:28:30 call node: n9:github.com/baixiaoustc/go_code_analysis/example.test3a -> n81:(github.com/baixiaoustc/go_code_analysis/example.XYZ).print
2019/01/17 22:28:31 call node: n717:github.com/baixiaoustc/go_code_analysis/example.test3b -> n717:github.com/baixiaoustc/go_code_analysis/example.test3b
2019/01/17 22:28:31 call node: n717:github.com/baixiaoustc/go_code_analysis/example.test3b -> n797:github.com/baixiaoustc/go_code_analysis/example/inner.Itest1
2019/01/17 22:28:31 call node: n8:github.com/baixiaoustc/go_code_analysis/example.Test3 -> n717:github.com/baixiaoustc/go_code_analysis/example.test3b
2019/01/17 22:28:31 call node: n6:github.com/baixiaoustc/go_code_analysis/example.main -> n8:github.com/baixiaoustc/go_code_analysis/example.Test3
2019/01/17 22:28:31 call node: n6:github.com/baixiaoustc/go_code_analysis/example.main -> n9:github.com/baixiaoustc/go_code_analysis/example.test3a
2019/01/17 22:28:31 call node: n10:github.com/baixiaoustc/go_code_analysis/example.test3c -> n718:github.com/baixiaoustc/go_code_analysis/example.test3c$1
2019/01/17 22:28:31 call node: n10:github.com/baixiaoustc/go_code_analysis/example.test3c -> n719:github.com/baixiaoustc/go_code_analysis/example.test4a
2019/01/17 22:28:31 call node: n6:github.com/baixiaoustc/go_code_analysis/example.main -> n10:github.com/baixiaoustc/go_code_analysis/example.test3c
2019/01/17 22:28:31 call node: n6:github.com/baixiaoustc/go_code_analysis/example.main -> n11:github.com/baixiaoustc/go_code_analysis/example.receiveFromKafka
2019/01/17 22:28:31 6/1991 edges
2019/01/17 22:28:31 正向調用關系:example.Test3 {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:Test3} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3b}]}
2019/01/17 22:28:31 正向調用關系:example.main {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:Test3} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3c} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:receiveFromKafka}]}
2019/01/17 22:28:31 正向調用關系:example.test3c {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3c} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:test4a}]}
2019/01/17 22:28:31 正向調用關系:example.receiveFromKafka {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:receiveFromKafka} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:test4a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:test4b}]}
2019/01/17 22:28:31 正向調用關系:example.test3a {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3a} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:XYZ@print}]}
2019/01/17 22:28:31 正向調用關系:example.test3b {Caller:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3b} Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/inner/itest1.go Package:inner Name:Itest1}]}
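The material above draws heavily on https://github.com/TrueFurby/go-callvis
As a second step, we need to work backwards and find the complete paths through which the test4a function is called by other functions.
Building the reverse call relationships
For the example above, the reverse call relationships of the test4a function should be:
These can in fact be turned into a tree structure, as shown in the figure:
Our job, then, is to build a tree whose root node is the test4a function.
Building a multiway tree
A node of the multiway tree is represented by the following structure: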
type MWTNode struct {
	Key      string
	Value    FuncDesc
	N        int
	Children []*MWTNode
}
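Key has the form Package.Func and is built from FuncDesc.Package and FuncDesc.Name, N is the number of child nodes, and Children is the list of child nodes. The following code builds the multiway tree; callMap is the forward call relationship map produced in the previous stage: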
func BuildFromCallMap(head *MWTNode, callMap map[string]CallerRelation) {
	// paths records, for every node, the keys on the path from head down to that node.
	// A caller that is already on the path is skipped, so recursive calls such as
	// test3b -> test3b cannot keep the loop running forever, while a shared caller
	// such as main can still appear once in every branch.
	paths := make(map[*MWTNode]map[string]struct{})
	paths[head] = map[string]struct{}{head.Key: {}}
	nodeList := []*MWTNode{head}
	for len(nodeList) > 0 {
		tmp := nodeList[0]
		log.Printf("tmp %+v", tmp)
		// Every function that calls tmp becomes a child of tmp.
		for callerName, callRelation := range callMap {
			for _, callee := range callRelation.Callees {
				if tmp.Key != fmt.Sprintf("%s.%s", callee.Package, callee.Name) {
					continue
				}
				log.Printf("found caller:%s -> callee:%s", callerName, callee)
				key := fmt.Sprintf("%s.%s", callRelation.Caller.Package, callRelation.Caller.Name)
				if _, onPath := paths[tmp][key]; onPath {
					continue
				}
				newNode := &MWTNode{
					Key:      key,
					Value:    callRelation.Caller,
					Children: make([]*MWTNode, 0),
				}
				newPath := map[string]struct{}{key: {}}
				for k := range paths[tmp] {
					newPath[k] = struct{}{}
				}
				paths[newNode] = newPath
				tmp.N++
				tmp.Children = append(tmp.Children, newNode)
				nodeList = append(nodeList, newNode)
			}
		}
		nodeList = nodeList[1:]
		log.Printf("nodeList len:%d", len(nodeList))
	}
}
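Forming the reverse call chains
We define one more structure to describe a reverse call chain:
// CalledRelation describes one reverse call chain of a key function.
type CalledRelation struct {
	Callees []FuncDesc
	CanFix  bool // true if the chain can be traced back to a gin.Context, i.e. it can be fixed automatically
}
Then traverse the tree depth-first, expecting to reach the following conclusions: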
- test4a <- test3c <- main
- test4a <- receiveFromKafka <- main
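func depthTraversal(head *MWTNode, s string, re CalledRelation, list *[]CalledRelation) {
	s = fmt.Sprintf("%s<-%s", s, head.Key)
	// Copy before appending: re is passed by value, but its Callees slice would
	// otherwise share one backing array with the sibling branches.
	callees := make([]FuncDesc, len(re.Callees), len(re.Callees)+1)
	copy(callees, re.Callees)
	re.Callees = append(callees, head.Value)
	if head.N == 0 {
		// Leaf node: one complete reverse call chain has been found.
		// The log text below means "found reverse call chain".
		log.Printf("找到反向調用鏈:%s", s)
		log.Printf("re.Callees:%+v", re.Callees)
		*list = append(*list, re)
		return
	}
	for _, node := range head.Children {
		depthTraversal(node, s, re, list)
	}
}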
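One Go detail deserves care in a recursive traversal like this one: a slice that is passed down and appended to in several sibling branches can end up sharing a backing array between them. The short demo below shows the effect; appendShared and appendCopied are hypothetical helper names used only here.

```go
package main

import "fmt"

// appendShared appends to the incoming slice directly; sibling calls may
// then write into the same backing array.
func appendShared(path []string, name string) []string {
	return append(path, name)
}

// appendCopied copies the path first, so every branch owns its own storage.
func appendCopied(path []string, name string) []string {
	out := make([]string, len(path), len(path)+1)
	copy(out, path)
	return append(out, name)
}

func main() {
	// A path with spare capacity, as earlier appends commonly leave behind.
	base := make([]string, 1, 4)
	base[0] = "test4a"

	a := appendShared(base, "test3c")
	b := appendShared(base, "receiveFromKafka") // overwrites a[1]
	fmt.Println(a, b) // prints [test4a receiveFromKafka] [test4a receiveFromKafka]

	c := appendCopied(base, "test3c")
	d := appendCopied(base, "receiveFromKafka")
	fmt.Println(c, d) // prints [test4a test3c] [test4a receiveFromKafka]
}
```

Whether the overwrite actually happens depends on the capacity left over by earlier appends, which is why such bugs tend to show up only on deeper trees.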
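For runnable code, see TestAnalysisReverceCallGraphy in https://github.com/baixiaoustc/go_code_analysis/blob/master/second_post_test.go. The final output below confirms the expectation (the log text 找到反向調用鏈 means "found reverse call chain"):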
2019/01/20 21:53:32 tmp &{Key:main.test4a Value:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4a.go Package:main Name:test4a} N:0 Level:0 Children:[]}
2019/01/20 21:53:32 found caller:example.receiveFromKafka -> callee:{/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go main test4a}
2019/01/20 21:53:32 found caller:example.test3c -> callee:{/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go main test4a}
2019/01/20 21:53:32 nodeList len:2
2019/01/20 21:53:32 tmp &{Key:main.receiveFromKafka Value:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:receiveFromKafka} N:0 Level:0 Children:[]}
2019/01/20 21:53:32 found caller:example.main -> callee:{/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go main receiveFromKafka}
2019/01/20 21:53:32 nodeList len:2
2019/01/20 21:53:32 tmp &{Key:main.test3c Value:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3c} N:0 Level:0 Children:[]}
2019/01/20 21:53:32 found caller:example.main -> callee:{/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go main test3c}
2019/01/20 21:53:32 nodeList len:2
2019/01/20 21:53:32 tmp &{Key:main.main Value:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main} N:0 Level:0 Children:[]}
2019/01/20 21:53:32 nodeList len:1
2019/01/20 21:53:32 tmp &{Key:main.main Value:{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main} N:0 Level:0 Children:[]}
2019/01/20 21:53:32 nodeList len:0
2019/01/20 21:53:32 找到反向調用鏈:<-main.test4a<-main.receiveFromKafka<-main.main
2019/01/20 21:53:32 re.Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4a.go Package:main Name:test4a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:receiveFromKafka} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main}]
2019/01/20 21:53:32 找到反向調用鏈:<-main.test4a<-main.test3c<-main.main
2019/01/20 21:53:32 re.Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4a.go Package:main Name:test4a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3c} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main}]
2019/01/20 21:53:32 list0: {Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4a.go Package:main Name:test4a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4.go Package:main Name:receiveFromKafka} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main}] CanFix:false}
2019/01/20 21:53:32 list1: {Callees:[{File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test4a.go Package:main Name:test4a} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:test3c} {File:/Users/baixiao/Go/src/github.com/baixiaoustc/go_code_analysis/example/test3.go Package:main Name:main}] CanFix:false}