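The source code quoted in this article comes from Apple's open source site.
What is Mach-O
Mach-O is short for the Mach Object file format. It is the file format used on iOS and macOS for executables, object code, dynamic libraries, and core dumps.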
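Mach-O file format
Apple's documentation provides a diagram of the file structure:
[Figure: Mach-O file structure]
We write a HelloWorld program, compile it, and open the resulting .out file in MachOView.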
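From this we can see that Mach-O consists of three parts:
- Header: identifies the CPU architecture, the file type, the number of Load Commands, and other basic information.
- Load Commands: describe how each Segment is loaded. A Mach-O file can contain multiple Segments, and each Segment may contain zero, one, or more Sections.
- Data: the actual contents of the Segments, including code and data.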
Header
/*
* The 32-bit mach header appears at the very beginning of the object file for
* 32-bit architectures.
*/
struct mach_header {
uint32_t magic; /* mach magic number identifier */
cpu_type_t cputype; /* cpu specifier */
cpu_subtype_t cpusubtype; /* machine specifier */
uint32_t filetype; /* type of file */
uint32_t ncmds; /* number of load commands */
uint32_t sizeofcmds; /* the size of all the load commands */
uint32_t flags; /* flags */
};
/*
* The 64-bit mach header appears at the very beginning of object files for
* 64-bit architectures.
*/
struct mach_header_64 {
uint32_t magic; /* mach magic number identifier */
cpu_type_t cputype; /* cpu specifier */
cpu_subtype_t cpusubtype; /* machine specifier */
uint32_t filetype; /* type of file */
uint32_t ncmds; /* number of load commands */
uint32_t sizeofcmds; /* the size of all the load commands */
uint32_t flags; /* flags */
uint32_t reserved; /* reserved */
};
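- magic: the magic number. 0xfeedface (MH_MAGIC) marks a 32-bit file and 0xfeedfacf (MH_MAGIC_64) marks a 64-bit file. 0xcefaedfe (MH_CIGAM) is the 32-bit magic with its bytes swapped, which means the file's byte order differs from the host's.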
/* Constant for the magic field of the mach_header (32-bit architectures) */
#define MH_MAGIC 0xfeedface /* the mach magic number */
#define MH_CIGAM 0xcefaedfe /* NXSwapInt(MH_MAGIC) */
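The excerpt above only lists the 32-bit constants. As a minimal sketch, the check below also uses MH_MAGIC_64 (0xfeedfacf) and MH_CIGAM_64 (0xcffaedfe), which are defined in the same header. The constants are redefined locally so the code compiles without `<mach-o/loader.h>`, and `struct magic_info`, `read_magic`, and `classify_magic` are hypothetical names introduced for illustration. Fat (universal) files use a different magic and are not handled here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the magic constants (assumption: we avoid including
 * <mach-o/loader.h> so the sketch builds on any platform). */
#define MH_MAGIC    0xfeedfaceu /* 32-bit, same byte order as the host */
#define MH_CIGAM    0xcefaedfeu /* 32-bit, byte-swapped */
#define MH_MAGIC_64 0xfeedfacfu /* 64-bit, same byte order as the host */
#define MH_CIGAM_64 0xcffaedfeu /* 64-bit, byte-swapped */

/* Hypothetical result type: what the first four bytes tell us. */
struct magic_info {
    int is_macho;   /* 1 if the value is a thin Mach-O magic */
    int is_64bit;   /* 1 if the file starts with a mach_header_64 */
    int is_swapped; /* 1 if header fields must be byte-swapped before use */
};

/* Read the first four bytes of a file image as a host-order uint32_t. */
static uint32_t read_magic(const unsigned char *image)
{
    uint32_t magic;
    assert(image != NULL);
    memcpy(&magic, image, sizeof magic); /* avoids unaligned access */
    return magic;
}

/* Map a magic value to word size and byte order. */
static struct magic_info classify_magic(uint32_t magic)
{
    struct magic_info info = {0, 0, 0};
    switch (magic) {
    case MH_MAGIC:
        info.is_macho = 1;
        break;
    case MH_CIGAM:
        info.is_macho = 1;
        info.is_swapped = 1;
        break;
    case MH_MAGIC_64:
        info.is_macho = 1;
        info.is_64bit = 1;
        break;
    case MH_CIGAM_64:
        info.is_macho = 1;
        info.is_64bit = 1;
        info.is_swapped = 1;
        break;
    default:
        break; /* not a thin Mach-O file */
    }
    return info;
}
```

A parser would typically call `classify_magic(read_magic(buf))` first and then choose between `struct mach_header` and `struct mach_header_64` based on `is_64bit`.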
- cputype: CPU type
- cpusubtype: the specific CPU model (subtype)
- filetype: the file type, for example an executable or a library
The filetype macros are:
#define MH_OBJECT 0x1 /* relocatable object file */
#define MH_EXECUTE 0x2 /* demand paged executable file */
#define MH_FVMLIB 0x3 /* fixed VM shared library file */
#define MH_CORE 0x4 /* core file */
#define MH_PRELOAD 0x5 /* preloaded executable file */
#define MH_DYLIB 0x6 /* dynamically bound shared library */
#define MH_DYLINKER 0x7 /* dynamic link editor */
#define MH_BUNDLE 0x8 /* dynamically bound bundle file */
#define MH_DYLIB_STUB 0x9 /* shared library stub for static */
/* linking only, no section contents */
#define MH_DSYM 0xa /* companion file with only debug */
/* sections */
#define MH_KEXT_BUNDLE 0xb /* x86_64 kexts */
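- ncmds: number of Load Commands
- sizeofcmds: total size of all Load Commands, in bytes
- flags: flag bits that describe the file in more detail
- reserved: a reserved field that exists only in the 64-bit header and is currently unused
The flags macros are: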
#define MH_NOUNDEFS 0x1 /* the object file has no undefined
references */
#define MH_INCRLINK 0x2 /* the object file is the output of an
incremental link against a base file
and can't be link edited again */
#define MH_DYLDLINK 0x4 /* the object file is input for the
dynamic linker and can't be staticly
link edited again */
#define MH_BINDATLOAD 0x8 /* the object file's undefined
references are bound by the dynamic
linker when loaded. */
#define MH_PREBOUND 0x10 /* the file has its dynamic undefined
references prebound. */
#define MH_SPLIT_SEGS 0x20 /* the file has its read-only and
read-write segments split */
#define MH_LAZY_INIT 0x40 /* the shared library init routine is
to be run lazily via catching memory
faults to its writeable segments
(obsolete) */
#define MH_TWOLEVEL 0x80 /* the image is using two-level name
space bindings */
#define MH_FORCE_FLAT 0x100 /* the executable is forcing all images
to use flat name space bindings */
#define MH_NOMULTIDEFS 0x200 /* this umbrella guarantees no multiple
defintions of symbols in its
sub-images so the two-level namespace
hints can always be used. */
#define MH_NOFIXPREBINDING 0x400 /* do not have dyld notify the
prebinding agent about this
executable */
#define MH_PREBINDABLE 0x800 /* the binary is not prebound but can
have its prebinding redone. only used
when MH_PREBOUND is not set. */
#define MH_ALLMODSBOUND 0x1000 /* indicates that this binary binds to
all two-level namespace modules of
its dependent libraries. only used
when MH_PREBINDABLE and MH_TWOLEVEL
are both set. */
#define MH_SUBSECTIONS_VIA_SYMBOLS 0x2000/* safe to divide up the sections into
sub-sections via symbols for dead
code stripping */
#define MH_CANONICAL 0x4000 /* the binary has been canonicalized
via the unprebind operation */
#define MH_WEAK_DEFINES 0x8000 /* the final linked image contains
external weak symbols */
#define MH_BINDS_TO_WEAK 0x10000 /* the final linked image uses
weak symbols */
#define MH_ALLOW_STACK_EXECUTION 0x20000/* When this bit is set, all stacks
in the task will be given stack
execution privilege. Only used in
MH_EXECUTE filetypes. */
#define MH_DEAD_STRIPPABLE_DYLIB 0x400000 /* Only for use on dylibs. When
linking against a dylib that
has this bit set, the static linker
will automatically not create a
LC_LOAD_DYLIB load command to the
dylib if no symbols are being
referenced from the dylib. */
#define MH_ROOT_SAFE 0x40000 /* When this bit is set, the binary
declares it is safe for use in
processes with uid zero */
#define MH_SETUID_SAFE 0x80000 /* When this bit is set, the binary
declares it is safe for use in
processes when issetugid() is true */
#define MH_NO_REEXPORTED_DYLIBS 0x100000 /* When this bit is set on a dylib,
the static linker does not need to
examine dependent dylibs to see
if any are re-exported */
#define MH_PIE 0x200000 /* When this bit is set, the OS will
load the main executable at a
random address. Only used in
MH_EXECUTE filetypes. */
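Since flags is a bitmask, several of the macros above are usually set at once. The sketch below decodes a flags value into names. It covers only a subset of the flags, and `struct flag_name`, `k_flag_names`, and `describe_flags` are hypothetical names introduced for illustration. The sample value 0x00200085 is used purely as an example; it is the combination MH_NOUNDEFS | MH_DYLDLINK | MH_TWOLEVEL | MH_PIE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical lookup entry pairing a flag bit with its macro name. */
struct flag_name {
    uint32_t bit;
    const char *name;
};

/* A subset of the flags listed above, in ascending bit order. */
static const struct flag_name k_flag_names[] = {
    {0x1,      "MH_NOUNDEFS"},
    {0x2,      "MH_INCRLINK"},
    {0x4,      "MH_DYLDLINK"},
    {0x8,      "MH_BINDATLOAD"},
    {0x10,     "MH_PREBOUND"},
    {0x80,     "MH_TWOLEVEL"},
    {0x8000,   "MH_WEAK_DEFINES"},
    {0x10000,  "MH_BINDS_TO_WEAK"},
    {0x200000, "MH_PIE"},
};

/* Write the space-separated names of the set flags into out.
 * Returns the number of names written. Names that do not fit are dropped. */
static int describe_flags(uint32_t flags, char *out, size_t out_size)
{
    size_t used = 0;
    int count = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof k_flag_names / sizeof k_flag_names[0]; i++) {
        if ((flags & k_flag_names[i].bit) == 0)
            continue; /* this bit is not set */

        int n = snprintf(out + used, out_size - used, "%s%s",
                         count ? " " : "", k_flag_names[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0'; /* discard the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Calling `describe_flags(0x00200085, buf, sizeof buf)` with a 128-byte buffer fills it with `MH_NOUNDEFS MH_DYLDLINK MH_TWOLEVEL MH_PIE` and returns 4.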
For the HelloWorld program above, the Header information is as follows:
[Figure: Header information of the HelloWorld program]
Load Commands
struct load_command {
uint32_t cmd; /* type of load command */
uint32_t cmdsize; /* total size of command in bytes */
};
- cmd: the type of the command
- cmdsize: the size of the command in bytes, used to compute the offset to the next command
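Because every command starts with the same cmd/cmdsize pair, the Load Commands can be walked without knowing each command's full layout. The following is a minimal sketch under some assumptions: the image is a 64-bit file in host byte order that has already been read into memory, the structs are local copies of the ones shown in this article (with `cpu_type_t` and `cpu_subtype_t` written as `int32_t`), and `count_load_commands` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_MAGIC_64 0xfeedfacfu

/* Local copies of the structures shown in the article. */
struct mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

struct load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

/* Count the load commands whose cmd equals wanted_cmd.
 * Returns -1 if the image is not a host-order 64-bit Mach-O file
 * or if the command list is malformed. */
static int count_load_commands(const unsigned char *image, size_t image_size,
                               uint32_t wanted_cmd)
{
    struct mach_header_64 hdr;

    assert(image != NULL);
    if (image_size < sizeof hdr)
        return -1;
    memcpy(&hdr, image, sizeof hdr);
    if (hdr.magic != MH_MAGIC_64)
        return -1;
    if (hdr.sizeofcmds > image_size - sizeof hdr)
        return -1;

    /* The load commands start immediately after the header. */
    size_t offset = sizeof hdr;
    size_t end = sizeof hdr + hdr.sizeofcmds;
    int matches = 0;

    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        struct load_command lc;

        if (end - offset < sizeof lc)
            return -1;
        memcpy(&lc, image + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > end - offset)
            return -1;
        if (lc.cmd == wanted_cmd)
            matches++;
        offset += lc.cmdsize; /* cmdsize is the stride to the next command */
    }
    return matches;
}
```

The bounds checks matter in practice: ncmds, sizeofcmds, and cmdsize all come from the file, so a parser should not trust them before comparing them with the real buffer size.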
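Command types:
cmd | Purpose |
---|---|
LC_SEGMENT/LC_SEGMENT_64 | Maps the data of a segment into memory |
LC_SYMTAB | Symbol table information |
LC_DYSYMTAB | Dynamic symbol table information |
LC_DYLD_INFO_ONLY | Information dyld needs at load time (rebase, binding, and export data) |
LC_LOAD_DYLINKER | Path of the dynamic linker (dyld) to launch |
LC_UUID | Unique identifier |
LC_SOURCE_VERSION | Source code version |
LC_MAIN | Program entry point |
LC_LOAD_DYLIB | A dynamic library to load |
LC_FUNCTION_STARTS | Table of function start addresses |
LC_DATA_IN_CODE | Locations of data embedded in code sections |
LC_CODE_SIGNATURE | Code signature information |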
segment
First, the definition of a segment:
struct segment_command { /* for 32-bit architectures */
uint32_t cmd; /* LC_SEGMENT */
uint32_t cmdsize; /* includes sizeof section structs */
char segname[16]; /* segment name */
uint32_t vmaddr; /* memory address of this segment */
uint32_t vmsize; /* memory size of this segment */
uint32_t fileoff; /* file offset of this segment */
uint32_t filesize; /* amount to map from the file */
vm_prot_t maxprot; /* maximum VM protection */
vm_prot_t initprot; /* initial VM protection */
uint32_t nsects; /* number of sections in segment */
uint32_t flags; /* flags */
};
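- cmd: the Load Command type described above
- cmdsize: the size of the Load Command
- segname[16]: the segment name
segname | Meaning |
---|---|
__PAGEZERO | Segment in executables that traps null pointer accesses |
__TEXT | Code and read-only data |
__DATA | Global and static variables |
__LINKEDIT | Data the dynamic linker needs, such as the symbol and string tables |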
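- vmaddr: the virtual address of the segment before sliding. The actual runtime address is this value plus the ASLR slide.
- vmsize: the size of the segment in virtual memory
- fileoff: the offset of the segment within the file
- filesize: the size of the segment within the file
  Loading a segment means taking filesize bytes starting at file offset fileoff and mapping them at virtual address vmaddr.
- nsects: the number of sections in the segment
- flags: flag bits that describe the segment in more detail
The flag macros are: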
#define SG_HIGHVM 0x1 /* the file contents for this segment is for
the high part of the VM space, the low part
is zero filled (for stacks in core files) */
#define SG_FVMLIB 0x2 /* this segment is the VM that is allocated by
a fixed VM library, for overlap checking in
the link editor */
#define SG_NORELOC 0x4 /* this segment has nothing that was relocated
in it and nothing relocated to it, that is
it maybe safely replaced without relocation*/
#define SG_PROTECTED_VERSION_1 0x8 /* This segment is protected. If the
segment starts at file offset 0, the
first page of the segment is not
protected. All other pages of the
segment are protected. */
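The relationship between fileoff, filesize, vmaddr, and the ASLR slide described in the field list above can be written as a small address translation. This is only a sketch: the struct is a local copy of the 32-bit `segment_command` with `vm_prot_t` written as `int32_t`, and `file_offset_to_address` and its `slide` parameter are hypothetical names for illustration. Only the first filesize bytes of a segment come from the file. When vmsize is larger than filesize, the rest of the mapping is zero-filled and has no file offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the 32-bit segment_command shown in the article. */
struct segment_command {
    uint32_t cmd;
    uint32_t cmdsize;
    char     segname[16];
    uint32_t vmaddr;
    uint32_t vmsize;
    uint32_t fileoff;
    uint32_t filesize;
    int32_t  maxprot;
    int32_t  initprot;
    uint32_t nsects;
    uint32_t flags;
};

/* Translate a file offset into the runtime address it is mapped at.
 * slide is the ASLR offset chosen at load time (0 if none).
 * Returns 1 and stores the address in *addr_out when the offset lies in
 * the file-backed part of the segment, otherwise returns 0. */
static int file_offset_to_address(const struct segment_command *seg,
                                  uint32_t file_offset, uint32_t slide,
                                  uint32_t *addr_out)
{
    assert(seg != NULL && addr_out != NULL);

    if (file_offset < seg->fileoff)
        return 0; /* before the segment */
    if (file_offset - seg->fileoff >= seg->filesize)
        return 0; /* past the bytes taken from the file */

    /* Same distance from the segment start, in the file and in memory. */
    *addr_out = seg->vmaddr + (file_offset - seg->fileoff) + slide;
    return 1;
}
```

For example, with vmaddr 0x1000, fileoff 0x400, and filesize 0x1000, file offset 0x410 maps to address 0x1010 without a slide and to 0x5010 with a slide of 0x4000.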
section
The definition of a section:
struct section { /* for 32-bit architectures */
char sectname[16]; /* name of this section */
char segname[16]; /* segment this section goes in */
uint32_t addr; /* memory address of this section */
uint32_t size; /* size in bytes of this section */
uint32_t offset; /* file offset of this section */
uint32_t align; /* section alignment (power of 2) */
uint32_t reloff; /* file offset of relocation entries */
uint32_t nreloc; /* number of relocation entries */
uint32_t flags; /* flags (section type and attributes)*/
uint32_t reserved1; /* reserved (for offset or index) */
uint32_t reserved2; /* reserved (for count or sizeof) */
};
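- sectname: the section name
- segname: the name of the segment the section belongs to
  (By convention, an uppercase name such as __TEXT denotes a segment and a lowercase name such as __text denotes a section.)
sectname | Meaning |
---|---|
__text | Main program code |
__stubs | Stub code |
__stub_helper | Helper used during dynamic linking to call into dyld |
__cstring | Hard-coded C strings |
__la_symbol_ptr | Lazy symbol pointers, bound on first use |
__data | Initialized, mutable variables |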
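- addr: the address of the section in memory
- size: the size of the section in bytes
- offset: the offset of the section in the file
- align: the alignment of the section, stored as a power-of-2 exponent
- reloff: the file offset of the relocation entries
- nreloc: the number of relocation entries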
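Two details of the section fields are easy to get wrong, so here is a short sketch. First, align holds an exponent, so an align of 4 means 16-byte alignment. Second, sectname and segname are fixed 16-byte arrays, and a name that is exactly 16 characters long has no terminating NUL, so comparisons should be bounded. The struct is a local copy of the one shown above, and `section_alignment_bytes`, `name_equals`, and `is_section` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the 32-bit section structure shown in the article. */
struct section {
    char     sectname[16];
    char     segname[16];
    uint32_t addr;
    uint32_t size;
    uint32_t offset;
    uint32_t align;
    uint32_t reloff;
    uint32_t nreloc;
    uint32_t flags;
    uint32_t reserved1;
    uint32_t reserved2;
};

/* align is stored as a power-of-2 exponent: 0 -> 1 byte, 4 -> 16 bytes. */
static uint32_t section_alignment_bytes(uint32_t align)
{
    assert(align < 32); /* larger exponents do not fit in 32 bits */
    return (uint32_t)1 << align;
}

/* Compare a fixed 16-byte name field with a C string.
 * The field may lack a NUL terminator when all 16 bytes are used. */
static int name_equals(const char field[16], const char *name)
{
    assert(field != NULL && name != NULL);
    return strlen(name) <= 16 && strncmp(field, name, 16) == 0;
}

/* Check whether a section is, for example, __TEXT,__text. */
static int is_section(const struct section *s, const char *segname,
                      const char *sectname)
{
    assert(s != NULL);
    return name_equals(s->segname, segname) &&
           name_equals(s->sectname, sectname);
}
```

With these helpers, `is_section(&s, "__TEXT", "__text")` identifies the main code section, and `section_alignment_bytes(s.align)` gives its alignment in bytes.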