Mach-O File Structure

The source code quoted in this article comes from Apple's open source site.

What is Mach-O

Mach-O is short for Mach Object file format. It is the file format used on iOS and macOS for executables, object code, dynamic libraries, and core dumps.

The Mach-O file format

Apple's official diagram of the file structure:

[Figure: Mach-O file structure]

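We write a HelloWorld program, compile it, and open the resulting a.out file with MachOView.

This shows that a Mach-O file consists of three parts:

  • Header: basic information such as the CPU architecture, the file type, and the number of Load Commands.
  • Load Commands: describe how each Segment is to be loaded. A Mach-O file can contain several Segments, and each Segment can contain zero, one, or more Sections.
  • Data: the actual contents of the Segments, including code and data.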

Header

/*
 * The 32-bit mach header appears at the very beginning of the object file for
 * 32-bit architectures.
 */
struct mach_header {
    uint32_t    magic;      /* mach magic number identifier */
    cpu_type_t  cputype;    /* cpu specifier */
    cpu_subtype_t   cpusubtype; /* machine specifier */
    uint32_t    filetype;   /* type of file */
    uint32_t    ncmds;      /* number of load commands */
    uint32_t    sizeofcmds; /* the size of all the load commands */
    uint32_t    flags;      /* flags */
};

/*
 * The 64-bit mach header appears at the very beginning of object files for
 * 64-bit architectures.
 */
struct mach_header_64 {
    uint32_t    magic;      /* mach magic number identifier */
    cpu_type_t  cputype;    /* cpu specifier */
    cpu_subtype_t   cpusubtype; /* machine specifier */
    uint32_t    filetype;   /* type of file */
    uint32_t    ncmds;      /* number of load commands */
    uint32_t    sizeofcmds; /* the size of all the load commands */
    uint32_t    flags;      /* flags */
    uint32_t    reserved;   /* reserved */
};
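  • magic: the magic number. 0xfeedface (MH_MAGIC) marks a 32-bit file and 0xfeedfacf (MH_MAGIC_64) marks a 64-bit file. The byte-swapped values 0xcefaedfe (MH_CIGAM) and 0xcffaedfe (MH_CIGAM_64) mean the file was written in the opposite byte order from the host that is reading it.
/* Constant for the magic field of the mach_header (32-bit architectures) */
#define MH_MAGIC    0xfeedface  /* the mach magic number */
#define MH_CIGAM    0xcefaedfe  /* NXSwapInt(MH_MAGIC) */
/* Constant for the magic field of the mach_header_64 (64-bit architectures) */
#define MH_MAGIC_64 0xfeedfacf  /* the 64-bit mach magic number */
#define MH_CIGAM_64 0xcffaedfe  /* NXSwapInt(MH_MAGIC_64) */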
  • cputype: the CPU type
  • cpusubtype: the specific CPU model
  • filetype: the file type, for example executable or library
    The filetype macros are:
#define MH_OBJECT   0x1     /* relocatable object file */
#define MH_EXECUTE  0x2     /* demand paged executable file */
#define MH_FVMLIB   0x3     /* fixed VM shared library file */
#define MH_CORE     0x4     /* core file */
#define MH_PRELOAD  0x5     /* preloaded executable file */
#define MH_DYLIB    0x6     /* dynamically bound shared library */
#define MH_DYLINKER 0x7     /* dynamic link editor */
#define MH_BUNDLE   0x8     /* dynamically bound bundle file */
#define MH_DYLIB_STUB   0x9     /* shared library stub for static */
                    /*  linking only, no section contents */
#define MH_DSYM     0xa     /* companion file with only debug */
                    /*  sections */
#define MH_KEXT_BUNDLE  0xb     /* x86_64 kexts */
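  • ncmds: the number of Load Commands
  • sizeofcmds: the total size of all Load Commands, in bytes
  • flags: flag bits that describe further properties of the file
  • reserved: a reserved field that exists only in the 64-bit header; currently unused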
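
The magic field is what a reader checks first, because it decides which of the two header structs applies. The following is a minimal sketch of that check, not Apple's implementation. The constants are local copies of the values from the header file, and `struct magic_info`, `classify_magic`, and `header_size` are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the magic constants, so the sketch builds on any platform. */
#define MH_MAGIC    0xfeedfaceu
#define MH_CIGAM    0xcefaedfeu
#define MH_MAGIC_64 0xfeedfacfu
#define MH_CIGAM_64 0xcffaedfeu

/* Hypothetical result type, not part of any Apple header. */
struct magic_info {
    int is_macho;  /* 1 if the magic is one of the four values above */
    int is_64;     /* 1 if the file starts with a mach_header_64 */
    int swapped;   /* 1 if the file's byte order differs from the host's */
};

/* Classify the first four bytes of a file, interpreted in host byte order. */
static struct magic_info classify_magic(const unsigned char *buf, size_t len)
{
    struct magic_info info = {0, 0, 0};
    uint32_t magic;

    if (len < sizeof magic)
        return info;
    memcpy(&magic, buf, sizeof magic);  /* avoids unaligned reads */

    switch (magic) {
    case MH_MAGIC:    info.is_macho = 1; break;
    case MH_CIGAM:    info.is_macho = 1; info.swapped = 1; break;
    case MH_MAGIC_64: info.is_macho = 1; info.is_64 = 1; break;
    case MH_CIGAM_64: info.is_macho = 1; info.is_64 = 1; info.swapped = 1; break;
    default: break;
    }
    return info;
}

/* Size of the header in bytes: seven 4-byte fields, plus `reserved` on 64-bit.
 * The first load command starts at this file offset. */
static size_t header_size(const struct magic_info *info)
{
    return info->is_64 ? 32 : 28;
}
```

When `swapped` is set, every multi-byte field read afterwards (ncmds, sizeofcmds, and so on) has to be byte-swapped before use.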

The flags macros are:

#define MH_NOUNDEFS 0x1     /* the object file has no undefined
                       references */
#define MH_INCRLINK 0x2     /* the object file is the output of an
                       incremental link against a base file
                       and can't be link edited again */
#define MH_DYLDLINK 0x4     /* the object file is input for the
                       dynamic linker and can't be staticly
                       link edited again */
#define MH_BINDATLOAD   0x8     /* the object file's undefined
                       references are bound by the dynamic
                       linker when loaded. */
#define MH_PREBOUND 0x10        /* the file has its dynamic undefined
                       references prebound. */
#define MH_SPLIT_SEGS   0x20        /* the file has its read-only and
                       read-write segments split */
#define MH_LAZY_INIT    0x40        /* the shared library init routine is
                       to be run lazily via catching memory
                       faults to its writeable segments
                       (obsolete) */
#define MH_TWOLEVEL 0x80        /* the image is using two-level name
                       space bindings */
#define MH_FORCE_FLAT   0x100       /* the executable is forcing all images
                       to use flat name space bindings */
#define MH_NOMULTIDEFS  0x200       /* this umbrella guarantees no multiple
                       defintions of symbols in its
                       sub-images so the two-level namespace
                       hints can always be used. */
#define MH_NOFIXPREBINDING 0x400    /* do not have dyld notify the
                       prebinding agent about this
                       executable */
#define MH_PREBINDABLE  0x800           /* the binary is not prebound but can
                       have its prebinding redone. only used
                                           when MH_PREBOUND is not set. */
#define MH_ALLMODSBOUND 0x1000      /* indicates that this binary binds to
                                           all two-level namespace modules of
                       its dependent libraries. only used
                       when MH_PREBINDABLE and MH_TWOLEVEL
                       are both set. */ 
#define MH_SUBSECTIONS_VIA_SYMBOLS 0x2000/* safe to divide up the sections into
                        sub-sections via symbols for dead
                        code stripping */
#define MH_CANONICAL    0x4000      /* the binary has been canonicalized
                       via the unprebind operation */
#define MH_WEAK_DEFINES 0x8000      /* the final linked image contains
                       external weak symbols */
#define MH_BINDS_TO_WEAK 0x10000    /* the final linked image uses
                       weak symbols */

#define MH_ALLOW_STACK_EXECUTION 0x20000/* When this bit is set, all stacks 
                       in the task will be given stack
                       execution privilege.  Only used in
                       MH_EXECUTE filetypes. */
#define MH_DEAD_STRIPPABLE_DYLIB 0x400000 /* Only for use on dylibs.  When
                         linking against a dylib that
                         has this bit set, the static linker
                         will automatically not create a
                         LC_LOAD_DYLIB load command to the
                         dylib if no symbols are being
                         referenced from the dylib. */
#define MH_ROOT_SAFE 0x40000           /* When this bit is set, the binary 
                      declares it is safe for use in
                      processes with uid zero */
                                         
#define MH_SETUID_SAFE 0x80000         /* When this bit is set, the binary 
                      declares it is safe for use in
                      processes when issetugid() is true */

#define MH_NO_REEXPORTED_DYLIBS 0x100000 /* When this bit is set on a dylib, 
                      the static linker does not need to
                      examine dependent dylibs to see
                      if any are re-exported */
#define MH_PIE 0x200000         /* When this bit is set, the OS will
                       load the main executable at a
                       random address.  Only used in
                       MH_EXECUTE filetypes. */
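
Because flags is a bit mask, a single value usually combines several of these macros. The sketch below decodes a value into names. The table deliberately holds only a subset of the flags listed above, and `describe_flags` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
    uint32_t    bit;
    const char *name;
};

/* A subset of the flags listed above, in ascending bit order. */
static const struct flag_name mh_flag_names[] = {
    {0x1,      "MH_NOUNDEFS"},
    {0x2,      "MH_INCRLINK"},
    {0x4,      "MH_DYLDLINK"},
    {0x8,      "MH_BINDATLOAD"},
    {0x10,     "MH_PREBOUND"},
    {0x20,     "MH_SPLIT_SEGS"},
    {0x80,     "MH_TWOLEVEL"},
    {0x8000,   "MH_WEAK_DEFINES"},
    {0x10000,  "MH_BINDS_TO_WEAK"},
    {0x200000, "MH_PIE"},
};

/* Write the space-separated names of the set flags into `out`.
 * Returns the bits that are set but missing from the table. */
static uint32_t describe_flags(uint32_t flags, char *out, size_t outlen)
{
    uint32_t unknown = flags;
    size_t used = 0;
    size_t count = sizeof mh_flag_names / sizeof mh_flag_names[0];

    if (outlen == 0)
        return flags;
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        if (!(flags & mh_flag_names[i].bit))
            continue;
        unknown &= ~mh_flag_names[i].bit;
        if (used < outlen) {
            /* snprintf always NUL-terminates; on truncation `used` moves
             * past `outlen`, which stops any further appends. */
            int n = snprintf(out + used, outlen - used, "%s%s",
                             used ? " " : "", mh_flag_names[i].name);
            if (n > 0)
                used += (size_t)n;
        }
    }
    return unknown;
}
```

For the illustrative value 0x00200085, this yields MH_NOUNDEFS, MH_DYLDLINK, MH_TWOLEVEL, and MH_PIE, since 0x1 + 0x4 + 0x80 + 0x200000 = 0x200085.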

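For the HelloWorld program above, MachOView shows the following Header information:

[Figure: Header of the HelloWorld binary in MachOView]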

Load Commands

struct load_command {
    uint32_t cmd;       /* type of load command */
    uint32_t cmdsize;   /* total size of command in bytes */
};
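  • cmd: the type of the load command
  • cmdsize: the size of the command in bytes, used to compute the offset of the next command

Common cmd values:

cmd                        Purpose
LC_SEGMENT/LC_SEGMENT_64   Maps a segment of the file into memory
LC_SYMTAB                  Symbol table information
LC_DYSYMTAB                Dynamic symbol table information
LC_DYLD_INFO_ONLY          Compressed information for dyld (rebase, binding, and export data)
LC_LOAD_DYLINKER           Path of the dynamic linker (dyld) to launch
LC_UUID                    Unique identifier of the binary
LC_SOURCE_VERSION          Version of the source code
LC_MAIN                    Program entry point
LC_LOAD_DYLIB              A dynamic library to load
LC_FUNCTION_STARTS         Table of function start addresses
LC_DATA_IN_CODE            Table of data regions embedded in code
LC_CODE_SIGNATURE          Code signature information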
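
Since every command begins with the same two fields, a reader can step through the whole Load Commands region by adding cmdsize to the current offset. The sketch below assumes the region has already been read into memory in host byte order. `lc_visitor`, `walk_load_commands`, and `count_segments` are hypothetical names, and the struct and the two LC_ constants are local copies so that the code builds on any platform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the struct shown above. */
struct load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

#define LC_SEGMENT    0x1
#define LC_SEGMENT_64 0x19

/* Hypothetical callback type: called once per load command. */
typedef void (*lc_visitor)(uint32_t cmd, uint32_t cmdsize,
                           size_t offset, void *ctx);

/* Walk `ncmds` commands stored in cmds[0 .. sizeofcmds).
 * Returns the number of commands visited, or -1 if the data is malformed. */
static int walk_load_commands(const unsigned char *cmds, uint32_t ncmds,
                              uint32_t sizeofcmds, lc_visitor visit, void *ctx)
{
    size_t off = 0;  /* invariant: off <= sizeofcmds */

    for (uint32_t i = 0; i < ncmds; i++) {
        struct load_command lc;

        if (sizeofcmds - off < sizeof lc)
            return -1;                      /* ran out of bytes */
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - off)
            return -1;                      /* would loop forever or overrun */
        if (visit)
            visit(lc.cmd, lc.cmdsize, off, ctx);
        off += lc.cmdsize;                  /* offset of the next command */
    }
    return (int)ncmds;
}

/* Example visitor: counts segment commands; `ctx` points to an int. */
static void count_segments(uint32_t cmd, uint32_t cmdsize,
                           size_t offset, void *ctx)
{
    (void)cmdsize;
    (void)offset;
    if (cmd == LC_SEGMENT || cmd == LC_SEGMENT_64)
        ++*(int *)ctx;
}
```

The bounds checks matter in practice: a cmdsize of zero would otherwise make the loop spin on the same command, and an oversized one would read past the end of the buffer.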

segment

First, the definition of a segment:

struct segment_command { /* for 32-bit architectures */
    uint32_t    cmd;        /* LC_SEGMENT */
    uint32_t    cmdsize;    /* includes sizeof section structs */
    char        segname[16];    /* segment name */
    uint32_t    vmaddr;     /* memory address of this segment */
    uint32_t    vmsize;     /* memory size of this segment */
    uint32_t    fileoff;    /* file offset of this segment */
    uint32_t    filesize;   /* amount to map from the file */
    vm_prot_t   maxprot;    /* maximum VM protection */
    vm_prot_t   initprot;   /* initial VM protection */
    uint32_t    nsects;     /* number of sections in segment */
    uint32_t    flags;      /* flags */
};
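  • cmd: the Load Command type described above
  • cmdsize: the size of the Load Command
  • segname[16]: the segment name
    segname      Meaning
    __PAGEZERO   Segment in executables that traps NULL pointer dereferences
    __TEXT       Code and read-only data
    __DATA       Global and static variables
    __LINKEDIT   Data the dynamic linker needs, such as the symbol and string tables
  • vmaddr: the virtual address of the segment before any slide is applied; the actual runtime address is this value plus the ASLR offset
  • vmsize: the size of the segment in virtual memory
  • fileoff: the offset of the segment within the file
  • filesize: the size of the segment within the file
    Loading a segment means mapping filesize bytes, starting at file offset fileoff, to address vmaddr in virtual memory.
  • nsects: the number of sections in the segment
  • flags: flag bits that describe further properties of the segment
    The flag macros are: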
#define SG_HIGHVM   0x1 /* the file contents for this segment is for
                   the high part of the VM space, the low part
                   is zero filled (for stacks in core files) */
#define SG_FVMLIB   0x2 /* this segment is the VM that is allocated by
                   a fixed VM library, for overlap checking in
                   the link editor */
#define SG_NORELOC  0x4 /* this segment has nothing that was relocated
                   in it and nothing relocated to it, that is
                   it maybe safely replaced without relocation*/
#define SG_PROTECTED_VERSION_1  0x8 /* This segment is protected.  If the
                       segment starts at file offset 0, the
                       first page of the segment is not
                       protected.  All other pages of the
                       segment are protected. */
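
The relationship between vmaddr, fileoff, filesize, and the ASLR offset can be written as two small formulas. The sketch below is illustrative only: `struct seg_range` is a hypothetical type that keeps just the four fields needed here, widened to 64 bits, and the function names are likewise invented for this example.

```c
#include <stdint.h>

/* Hypothetical reduced view of a segment command: only the fields needed
 * for address translation, widened to 64 bits. */
struct seg_range {
    uint64_t vmaddr;    /* unslid virtual address of the segment */
    uint64_t vmsize;    /* size in virtual memory */
    uint64_t fileoff;   /* offset in the file */
    uint64_t filesize;  /* number of bytes mapped from the file */
};

/* Translate an unslid virtual address into a file offset.
 * Returns 0 on success, -1 if the address is not backed by file contents
 * (outside the segment, or in the part beyond filesize). */
static int vmaddr_to_fileoff(const struct seg_range *s, uint64_t addr,
                             uint64_t *out)
{
    uint64_t delta;

    if (addr < s->vmaddr)
        return -1;
    delta = addr - s->vmaddr;
    if (delta >= s->filesize)
        return -1;
    *out = s->fileoff + delta;
    return 0;
}

/* Address at run time: the unslid address plus the ASLR offset (slide). */
static uint64_t runtime_address(uint64_t unslid, uint64_t slide)
{
    return unslid + slide;
}
```

With made-up numbers, a segment with vmaddr 0x100000000, fileoff 0, and filesize 0x4000 places the unslid address 0x100003f60 at file offset 0x3f60. With a slide of 0x5000, the same address appears at 0x100008f60 in the running process.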

section

The definition of a section:

struct section { /* for 32-bit architectures */
    char        sectname[16];   /* name of this section */
    char        segname[16];    /* segment this section goes in */
    uint32_t    addr;       /* memory address of this section */
    uint32_t    size;       /* size in bytes of this section */
    uint32_t    offset;     /* file offset of this section */
    uint32_t    align;      /* section alignment (power of 2) */
    uint32_t    reloff;     /* file offset of relocation entries */
    uint32_t    nreloc;     /* number of relocation entries */
    uint32_t    flags;      /* flags (section type and attributes)*/
    uint32_t    reserved1;  /* reserved (for offset or index) */
    uint32_t    reserved2;  /* reserved (for count or sizeof) */
};
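  • sectname: the section name
  • segname: the name of the segment the section belongs to
    (uppercase names such as __TEXT denote segments; lowercase names such as __text denote sections)
    sectname          Meaning
    __text            Main program code
    __stubs           Stub code
    __stub_helper     Helper code for dynamic linking that calls into dyld
    __cstring         Hard-coded C strings
    __la_symbol_ptr   Lazy symbol pointers, bound on first use
    __data            Initialized, writable variables
  • addr: the address of the section in memory
  • size: the size of the section in bytes
  • offset: the offset of the section within the file
  • align: the alignment of the section, stored as a power of 2 (a value of 4 means 16-byte alignment)
  • reloff: the file offset of the relocation entries
  • nreloc: the number of relocation entries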
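
Two of these fields are easy to misread in code. sectname and segname are fixed 16-byte arrays, so a name that is exactly 16 characters long has no terminating NUL, and align holds an exponent rather than a byte count. The sketch below shows one way to handle both; `name_equals` and `section_alignment_bytes` are hypothetical helpers, not functions from Apple's headers.

```c
#include <stdint.h>
#include <string.h>

/* Compare a 16-byte sectname/segname field with a C string.
 * The field is zero-padded, but it is NOT NUL-terminated when the name
 * uses all 16 bytes, so plain strcmp is not safe here. */
static int name_equals(const char field[16], const char *name)
{
    size_t n = strlen(name);

    if (n > 16)
        return 0;
    if (memcmp(field, name, n) != 0)
        return 0;
    /* Either the name fills the field, or the field ends right after it. */
    return n == 16 || field[n] == '\0';
}

/* Convert the `align` exponent into an alignment in bytes.
 * Returns 0 for exponents that do not fit in 32 bits. */
static uint32_t section_alignment_bytes(uint32_t align)
{
    return align < 32 ? (uint32_t)1 << align : 0;
}
```

For example, `name_equals(sect.sectname, "__text")` matches only the section named exactly __text, and an align value of 4 gives 16 bytes.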