Exploring Objective-C Internals – _class_createInstanceFromZone - 六虎

This article examines the _class_createInstanceFromZone function, which the earlier article "Exploring Objective-C Internals – alloc & init" left unexplained.

static ALWAYS_INLINE id _class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                                                       int construct_flags = OBJECT_CONSTRUCT_NONE, 
                                                       bool cxxConstruct = true, 
                                                       size_t *outAllocatedSize = nil) { 
    ASSERT(cls->isRealized());
    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;
    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;
    id obj;
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        obj = (id)calloc(1, size);
    }
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }
    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }
    if (fastpath(!hasCxxCtor)) {
        return obj;
    }
    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}

_class_createInstanceFromZone is where the memory for an instance actually gets allocated. So how exactly does it do that? Let's go through it step by step.

Computing the instance size

First, look at instanceSize. As the name suggests, it returns the size of an instance.

size_t size;
size = cls->instanceSize(extraBytes);

Let's step into its implementation.

inline size_t instanceSize(size_t extraBytes) const {
    if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
        return cache.fastInstanceSize(extraBytes);
    }
    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}

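As for the branch inside the if condition, the names suggest that it returns a cached instance size that has already been computed for the class, if one exists.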

So let's look at the code below it, which actually computes the instance size.

size_t size = alignedInstanceSize() + extraBytes;

Here extraBytes is the number of extra bytes passed in by the caller, and alignedInstanceSize() is the aligned size of the instance. A final check then makes sure size is at least 16 bytes.

uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}
#ifdef __LP64__
#   define WORD_SHIFT 3UL
#   define WORD_MASK 7UL
#   define WORD_BITS 64
#else
#   define WORD_SHIFT 2UL
#   define WORD_MASK 3UL
#   define WORD_BITS 32
#endif
/* Word-alignment algorithm */
static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
// May be unaligned depending on class's ivars.
uint32_t unalignedInstanceSize() const {
    ASSERT(isRealized());
    return data()->ro()->instanceSize;
}

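From the code above we can see that the runtime first reads the unaligned instance size and then rounds it up to a multiple of 8 bytes to get the aligned instance size. (For more on object alignment, see the "Alignment calculations" section below.)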
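To make the slow path concrete, here is a minimal C sketch of the same arithmetic. It assumes a 64-bit target (a word mask of 7), and the names sketch_word_align, sketch_instance_size and instance_size_examples are made up for illustration. The real method reads the unaligned size from the class's read-only data rather than taking it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit target, so WORD_MASK is 7. */
#define SKETCH_WORD_MASK 7UL

/* Same arithmetic as word_align: round x up to a multiple of 8. */
static uint32_t sketch_word_align(uint32_t x) {
    return (uint32_t)((x + SKETCH_WORD_MASK) & ~SKETCH_WORD_MASK);
}

/* Hypothetical stand-in for the slow path of instanceSize(). */
static size_t sketch_instance_size(uint32_t unaligned_size, size_t extra_bytes) {
    size_t size = sketch_word_align(unaligned_size) + extra_bytes;
    if (size < 16) size = 16;  /* the 16-byte minimum from the source */
    return size;
}

void instance_size_examples(void) {
    assert(sketch_instance_size(8, 0) == 16);   /* isa only: 8, raised to the minimum */
    assert(sketch_instance_size(20, 0) == 24);  /* 20 rounds up to 24 */
    assert(sketch_instance_size(24, 4) == 28);  /* extra bytes are added after aligning */
}
```

Note that the extra bytes are added after alignment, so the slow-path result is not necessarily a multiple of 8 when extraBytes is non-zero.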

The 8-byte alignment applies to 64-bit devices. On 32-bit devices the alignment is 4 bytes, as the macros in the source above show.
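The only thing that changes between the two word sizes is the mask. A small sketch, with the mask passed as a parameter instead of coming from the WORD_MASK macro (align_with_mask is a hypothetical helper, not a runtime function):

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of (mask + 1); mask must be a power of two minus one. */
static uint32_t align_with_mask(uint32_t x, uint32_t mask) {
    return (x + mask) & ~mask;
}

void word_align_examples(void) {
    /* 64-bit: WORD_MASK is 7, sizes round up to multiples of 8. */
    assert(align_with_mask(9, 7) == 16);
    assert(align_with_mask(13, 7) == 16);
    /* 32-bit: WORD_MASK is 3, sizes round up to multiples of 4. */
    assert(align_with_mask(9, 3) == 12);
    assert(align_with_mask(13, 3) == 16);
}
```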

if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
    return cache.fastInstanceSize(extraBytes);
}
size_t fastInstanceSize(size_t extra) const
{
    ASSERT(hasFastInstanceSize(extra));
    if (__builtin_constant_p(extra) && extra == 0) {
        return _flags & FAST_CACHE_ALLOC_MASK16;
    } else {
        size_t size = _flags & FAST_CACHE_ALLOC_MASK;
        // remove the FAST_CACHE_ALLOC_DELTA16 that was added
        // by setFastInstanceSize
        return align16(size + extra - FAST_CACHE_ALLOC_DELTA16);
    }
}
static inline size_t align16(size_t x) {
    return (x + size_t(15)) & ~size_t(15);
}
void setFastInstanceSize(size_t newSize)
{
    // Set during realization or construction only. No locking needed.
    uint16_t newBits = _flags & ~FAST_CACHE_ALLOC_MASK;
    uint16_t sizeBits;
    // Adding FAST_CACHE_ALLOC_DELTA16 allows for FAST_CACHE_ALLOC_MASK16
    // to yield the proper 16byte aligned allocation size with a single mask
    sizeBits = word_align(newSize) + FAST_CACHE_ALLOC_DELTA16;
    sizeBits &= FAST_CACHE_ALLOC_MASK;
    if (newSize <= sizeBits) {
        newBits |= sizeBits;
    }
    _flags = newBits;
}

Judging from the source above, the size held in the cache is 16-byte aligned, which does not match the value produced by the 8-byte calculation.
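The comment about FAST_CACHE_ALLOC_DELTA16 is easier to follow with numbers. The sketch below models the store and load steps in plain C. The three constants are the values I believe objc4 uses for FAST_CACHE_ALLOC_MASK, FAST_CACHE_ALLOC_MASK16 and FAST_CACHE_ALLOC_DELTA16, but treat them as assumptions, and the sketch_ names are invented for this example. It also skips the newSize <= sizeBits guard from setFastInstanceSize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, modelled on objc4's FAST_CACHE_ALLOC_* constants. */
#define SKETCH_ALLOC_MASK    0x1ff8u
#define SKETCH_ALLOC_MASK16  0x1ff0u
#define SKETCH_ALLOC_DELTA16 0x0008u

static size_t sketch_word_align8(size_t x) { return (x + 7) & ~(size_t)7; }
static size_t sketch_align16(size_t x)     { return (x + 15) & ~(size_t)15; }

/* What setFastInstanceSize stores: the 8-byte-aligned size plus the delta. */
static uint16_t sketch_store_size(size_t new_size) {
    return (uint16_t)((sketch_word_align8(new_size) + SKETCH_ALLOC_DELTA16)
                      & SKETCH_ALLOC_MASK);
}

/* What fastInstanceSize(0) reads back: one mask yields the 16-byte-aligned size. */
static size_t sketch_load_size(uint16_t stored) {
    return stored & SKETCH_ALLOC_MASK16;
}

void fast_size_examples(void) {
    /* The 8-byte and 16-byte results differ for a 24-byte instance. */
    assert(sketch_word_align8(24) == 24);
    assert(sketch_load_size(sketch_store_size(24)) == 32);
    /* The single-mask read always agrees with align16. */
    for (size_t n = 1; n <= 4000; n++) {
        assert(sketch_load_size(sketch_store_size(n)) == sketch_align16(n));
    }
}
```

Adding 8 before storing means that clearing the low four bits on read rounds up to the next multiple of 16 instead of rounding down, which is why no extra addition is needed on the fast path.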

id obj;
if (zone) {
    obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
} else {
    obj = (id)calloc(1, size);
}

The explanation I found online is that calloc itself rounds allocations up to a multiple of 16 bytes.

Why 16-byte alignment

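1. Memory is made up of individual bytes, but the CPU does not read or write it one byte at a time. It accesses memory in blocks, and the block size is the memory access granularity. Frequently accessing data that is not aligned to those blocks hurts CPU performance considerably, so aligning data reduces the number of accesses and lowers CPU overhead (trading space for time).
2. An object's first member, isa, takes 8 bytes, and most objects have other members as well. When an object has no other members, another 8 bytes are reserved, which gives 16-byte alignment. Without that reserve, one object's isa would sit directly next to another object's isa, which according to this explanation makes access errors more likely.
3. With 16-byte alignment, CPU access is faster and access is safer.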

I have not studied this in depth yet, but it can be verified by printing a few values.

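#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <malloc/malloc.h>

int main(void) {
    @autoreleasepool {
        NSObject *obj = [[NSObject alloc] init];
        // class_getInstanceSize returns the instance size; sizeof(obj) would only be the pointer size
        size_t instanceSize = class_getInstanceSize([NSObject class]);
        NSLog(@"NSObject instance size: %zu", instanceSize);
        NSLog(@"Memory allocated for the NSObject instance: %zu", malloc_size((__bridge const void *)obj));
        void *a = calloc(1, instanceSize);
        NSLog(@"calloc(1, %zu) actually allocates: %zu", instanceSize, malloc_size(a));
        // The blocks below are intentionally leaked; this is a throwaway test
        NSLog(@"calloc(1, 1) = %zu", malloc_size(calloc(1, 1)));
        NSLog(@"calloc(1, 5) = %zu", malloc_size(calloc(1, 5)));
        NSLog(@"calloc(1, 8) = %zu", malloc_size(calloc(1, 8)));
        NSLog(@"calloc(1,13) = %zu", malloc_size(calloc(1, 13)));
        NSLog(@"calloc(1,16) = %zu", malloc_size(calloc(1, 16)));
        NSLog(@"calloc(1,17) = %zu", malloc_size(calloc(1, 17)));
    }
    return 0;
}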

So for NSObject, the computed 8-byte-aligned value and the cached 16-byte-aligned value both end up as a 16-byte allocation.

Associating the allocated memory with the instance

Judging by the names and the condition, this part handles a failed memory allocation.

if (slowpath(!obj)) {
    if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
        return _objc_callBadAllocHandler(cls);
    }
    return nil;
}
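The pattern here is "try to allocate; on failure either hand off to a handler or return nil". A rough C sketch of that control flow follows. Everything in it is hypothetical: SKETCH_CALL_BADALLOC stands in for OBJECT_CONSTRUCT_CALL_BADALLOC, and the allocator and handler are passed as function pointers only so that the failure branch can be exercised. The real _objc_callBadAllocHandler takes the class as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_CALL_BADALLOC 1  /* hypothetical stand-in for OBJECT_CONSTRUCT_CALL_BADALLOC */

typedef void *(*sketch_alloc_fn)(size_t count, size_t size);
typedef void *(*sketch_badalloc_fn)(void);

/* Allocate zeroed memory; on failure either call the handler or return NULL. */
static void *sketch_create(size_t size, int flags,
                           sketch_alloc_fn alloc, sketch_badalloc_fn on_fail) {
    void *obj = alloc(1, size);
    if (!obj) {
        if (flags & SKETCH_CALL_BADALLOC) return on_fail();
        return NULL;
    }
    return obj;
}

/* Helpers for the demo: an allocator that always fails, a handler that counts calls. */
static int sketch_handler_calls = 0;
static void *sketch_failing_alloc(size_t count, size_t size) {
    (void)count; (void)size;
    return NULL;
}
static void *sketch_handler(void) { sketch_handler_calls++; return NULL; }

void bad_alloc_examples(void) {
    sketch_handler_calls = 0;
    /* Without the flag, a failed allocation just returns NULL. */
    assert(sketch_create(16, 0, sketch_failing_alloc, sketch_handler) == NULL);
    assert(sketch_handler_calls == 0);
    /* With the flag, the handler decides what to return. */
    assert(sketch_create(16, SKETCH_CALL_BADALLOC, sketch_failing_alloc, sketch_handler) == NULL);
    assert(sketch_handler_calls == 1);
    /* A successful allocation never reaches the handler. */
    void *ok = sketch_create(16, SKETCH_CALL_BADALLOC, calloc, sketch_handler);
    assert(ok != NULL && sketch_handler_calls == 1);
    free(ok);
}
```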

Let's continue with the code that follows.

if (!zone && fast) {
    obj->initInstanceIsa(cls, hasCxxDtor);
} else {
    // Use raw pointer isa on the assumption that they might be
    // doing something weird with the zone or RR.
    obj->initIsa(cls);
}
inline void 
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());
    initIsa(cls, true, hasCxxDtor);
}
inline void 
objc_object::initIsa(Class cls)
{
    initIsa(cls, false, false);
}

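First, print obj just before the if statement.

Then print obj again after initIsa has run.

Comparing the two shows that initIsa is the step that ties the allocated memory to the instance's class.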

inline void
objc_object::initIsa(Class cls)
{
    initIsa(cls, false, false);
}
inline void 
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{ 
    ASSERT(!isTaggedPointer()); 
    isa_t newisa(0);
    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());
#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }
    // This write must be performed in a single store in some cases
    // (for example when realizing a class because other threads
    // may simultaneously try to use the class).
    // fixme use atomics here to guarantee single-store and to
    // guarantee memory order w.r.t. the class index table
    // ...but not too atomic because we don't want to hurt instantiation
    isa = newisa;
}

Inside initIsa we can find:

isa_t newisa(0);
isa = newisa;

So isa_t is the type of isa.
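As a rough illustration of what the raw-pointer branch achieves (newisa.setClass(cls, this) followed by isa = newisa), the sketch below uses a toy object whose first word stores the address of a toy class. The sketch_ types and functions are hypothetical miniatures, not the runtime's real definitions, and the non-pointer branch with its packed bit fields is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniatures of a class and an object. */
struct sketch_class  { const char *name; };
struct sketch_object { uintptr_t isa; /* first word of every object */ };

/* Raw-pointer isa: the first word simply stores the class address. */
static void sketch_init_isa(struct sketch_object *obj, const struct sketch_class *cls) {
    obj->isa = (uintptr_t)cls;
}

static const struct sketch_class *sketch_get_class(const struct sketch_object *obj) {
    return (const struct sketch_class *)obj->isa;
}

void isa_examples(void) {
    static const struct sketch_class person = { "Person" };
    struct sketch_object *obj = calloc(1, 16);
    assert(obj != NULL);
    assert(sketch_get_class(obj) == NULL);     /* zeroed memory: no class yet */
    sketch_init_isa(obj, &person);
    assert(sketch_get_class(obj) == &person);  /* the memory now points at its class */
    free(obj);
}
```

Until the isa word is written, the block returned by calloc is just zeroed memory, which is why this step is what turns it into an instance.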

A separate article will explore isa in detail later.

Alignment calculations

Object memory alignment

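Take the 8-byte alignment of 64-bit devices as an example (aligning means padding whatever falls short up to the next multiple).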

#define WORD_MASK 7UL
static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

If we implemented 8-byte alignment ourselves, we could write it like this:

func aligned(x: Int) -> Int {
    return (x + 7) / 8 * 8
}

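The & ~WORD_MASK in the source has the same effect as our / 8 * 8 (for non-negative values), but a bitwise operation is typically cheaper than a multiplication and a division.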

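Dividing an Int by 8 discards the remainder below 8, and multiplying by 8 then restores a multiple of 8. In effect that subtracts the remainder. The bitwise operation removes the remainder directly.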

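First: #define WORD_MASK 7UL (the UL suffix marks an unsigned long)
In binary:
WORD_MASK    00000000 00000111
Inverted:
~WORD_MASK   11111111 11111000
ANDing ~WORD_MASK with x sets the low 3 bits of x to 0, and those 3 bits hold exactly the remainder below 8
8 in binary:   00000000 00001000
/* Only the low 16 bits are shown above */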
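To see the mask at work on a concrete value, the sketch below prints the low 8 bits at each step for x = 13. The helper low8_to_binary and the function mask_walkthrough are invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Write the low 8 bits of v into out as a NUL-terminated string of '0' and '1'. */
static void low8_to_binary(unsigned long v, char out[9]) {
    for (int i = 0; i < 8; i++) {
        out[i] = ((v >> (7 - i)) & 1UL) ? '1' : '0';
    }
    out[8] = '\0';
}

void mask_walkthrough(void) {
    char bits[9];
    unsigned long x = 13;

    low8_to_binary(x + 7UL, bits);
    assert(strcmp(bits, "00010100") == 0);  /* 13 + 7 = 20 */

    low8_to_binary(~7UL, bits);
    assert(strcmp(bits, "11111000") == 0);  /* low byte of ~WORD_MASK */

    low8_to_binary((x + 7UL) & ~7UL, bits);
    assert(strcmp(bits, "00010000") == 0);  /* low 3 bits cleared: 16 */
}
```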

From the analysis above, removing the remainder below 8 is the same as clearing the low 3 bits, so we could also write:

func aligned(x: Int) -> Int {
    // Swift's shift operators are non-associative, so the parentheses are required
    return ((x + 7) >> 3) << 3
}
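
Compared with Apple's source: to support both 64-bit and 32-bit, our version would need two macros, one for the 7 and one for the 3, whereas the source needs only one.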
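The three formulations discussed above (divide and multiply, shift right and left, mask) can be checked against each other with a short C sketch. The function names are made up for this example, and the check only covers unsigned values.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t align8_divmul(uint32_t x) { return (x + 7) / 8 * 8; }
static uint32_t align8_shift(uint32_t x)  { return ((x + 7) >> 3) << 3; }
static uint32_t align8_mask(uint32_t x)   { return (x + 7) & ~(uint32_t)7; }

/* All three forms round up to the same multiple of 8. */
void align8_equivalence_check(void) {
    for (uint32_t x = 0; x <= 4096; x++) {
        assert(align8_divmul(x) == align8_shift(x));
        assert(align8_shift(x) == align8_mask(x));
    }
}
```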

Struct memory alignment

Data sizes

Data type	Size on 64-bit (bytes)	Size on 32-bit (bytes)
bool, char, unsigned char, BOOL, int8_t, Boolean	1	1
short, unsigned short, int16_t, unichar	2	2
int, int32_t, unsigned int, float, boolean_t	4	4
long, unsigned long, NSInteger, NSUInteger, CGFloat	8	4
long long, double, int64_t	8	8

Struct memory alignment rules

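The first data member of a struct is placed at offset 0. Every subsequent member must start at an offset that is an integer multiple of that member's size, or of its sub-member's size if the member is itself made of smaller members (for example, an int is 4 bytes, so it must start at an address that is a multiple of 4).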

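If a struct contains another struct as a member, the nested struct must start at an offset that is an integer multiple of the size of its largest internal member (if struct a has a member struct b, and b contains char, int and double members, then b must start at a multiple of the size of double, that is, a multiple of 8).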

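The total size of a struct, that is, the result of sizeof, must be an integer multiple of the size of its largest member. Any shortfall is padded.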
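For structs made only of scalar members, these rules amount to a small algorithm. Here is a minimal sketch, assuming every member's alignment equals its size (true for the types in the table above on common 64-bit platforms). The names sketch_round_up, sketch_layout and layout_examples are made up for this example, and nested structs are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to a multiple of align (align must be > 0). */
static size_t sketch_round_up(size_t offset, size_t align) {
    return (offset + align - 1) / align * align;
}

/* Lay out n scalar members whose alignment equals their size.
 * Fills offsets[] and returns the padded total size. */
static size_t sketch_layout(const size_t sizes[], size_t n, size_t offsets[]) {
    size_t offset = 0, max_align = 1;
    for (size_t i = 0; i < n; i++) {
        offset = sketch_round_up(offset, sizes[i]);  /* start at a multiple of the member size */
        offsets[i] = offset;
        offset += sizes[i];
        if (sizes[i] > max_align) max_align = sizes[i];
    }
    return sketch_round_up(offset, max_align);       /* pad the total to the largest member */
}

void layout_examples(void) {
    size_t offsets[4];
    /* double, int, char, short */
    const size_t first[] = { 8, 4, 1, 2 };
    assert(sketch_layout(first, 4, offsets) == 16);
    assert(offsets[3] == 14);
    /* double, char, int, short: same members, different order */
    const size_t second[] = { 8, 1, 4, 2 };
    assert(sketch_layout(second, 4, offsets) == 24);
    assert(offsets[2] == 12 && offsets[3] == 16);
}
```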

First, an example with structs of basic data types

struct StructA {
    double a;   // 8 [0~7]
    int b;      // 4 [8~11]
    char c;     // 1 [12]
    short d;    // 2 [14~15]
} structa;      // last byte at offset 15; size rounded up to a multiple of 8: 16
struct StructB {
    double a;   // 8 [0~7]
    char b;     // 1 [8]
    int c;      // 4 [12~15]
    short d;    // 2 [16~17]
} structb;      // last byte at offset 17; size rounded up to a multiple of 8: 24
NSLog(@"StructA size: %lu", sizeof(structa));
NSLog(@"StructB size: %lu", sizeof(structb));

This shows that the order in which members are declared affects the size of the struct.
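The offsets in the comments can be checked directly with offsetof and sizeof. The sketch below redeclares the two layouts under the hypothetical names SketchA and SketchB. It assumes a typical 64-bit ABI in which double is 8-byte aligned; on other ABIs the numbers may differ.

```c
#include <assert.h>
#include <stddef.h>

struct SketchA { double a; int b; char c; short d; };
struct SketchB { double a; char b; int c; short d; };

void struct_layout_examples(void) {
    /* Same layout as StructA: [0~7], [8~11], [12], [14~15] */
    assert(offsetof(struct SketchA, b) == 8);
    assert(offsetof(struct SketchA, c) == 12);
    assert(offsetof(struct SketchA, d) == 14);
    assert(sizeof(struct SketchA) == 16);

    /* Same layout as StructB: [0~7], [8], [12~15], [16~17], padded to 24 */
    assert(offsetof(struct SketchB, b) == 8);
    assert(offsetof(struct SketchB, c) == 12);
    assert(offsetof(struct SketchB, d) == 16);
    assert(sizeof(struct SketchB) == 24);
}
```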

Nested structs

struct StructC {
    double a;           // 8 [0~7]
    int b;              // 4 [8~11]
    char c;             // 1 [12]
    short d;            // 2 [14~15]
    float e;            // 4 [16~19]
    struct StructA f;   // StructA as defined earlier; must start at a multiple of 8: 24
                        //   f.a: 8 [24~31]
                        //   f.b: 4 [32~35]
                        //   f.c: 1 [36]
                        //   f.d: 2 [38~39]
} structc;              // last byte at offset 39; size rounded up to a multiple of 8: 40
NSLog(@"StructC size: %lu", sizeof(structc));

struct StructC {
    double a;           // 8 [0~7]
    char b;             // 1 [8]
    int c;              // 4 [12~15]
    short d;            // 2 [16~17]
    struct StructB f;   // StructB as defined earlier; must start at a multiple of 8: 24
                        //   f.a: 8 [24~31]
                        //   f.b: 1 [32]
                        //   f.c: 4 [36~39]
                        //   f.d: 2 [40~41]
} structc;              // last byte at offset 41; size rounded up to a multiple of 8: 48
NSLog(@"StructC size: %lu", sizeof(structc));

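// In this example StructA is a shorter variant without the short member.
// The offsets in its comments are its members' positions inside StructC.
struct StructA {
    double a;   // 8 [16~23]
    int b;      // 4 [24~27]
    char c;     // 1 [28]
} structa;
struct StructC {
    int b;              // 4 [0~3]
    char c;             // 1 [4]
    short d;            // 2 [6~7]
    float e;            // 4 [8~11]
    struct StructA f;   // previous member ends at 11; must start at a multiple of 8: 16
} structc;              // last byte at offset 28; size rounded up to a multiple of 8: 32
NSLog(@"StructC size: %lu", sizeof(structc));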
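The nested cases can be checked the same way. The sketch below mirrors the first and third nested examples under hypothetical names (SketchInner, SketchOuter, SketchSmallInner, SketchSmallOuter), again assuming a typical 64-bit ABI where double is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the first nested example. */
struct SketchInner { double a; int b; char c; short d; };
struct SketchOuter {
    double a; int b; char c; short d; float e;
    struct SketchInner f;
};

/* Mirrors the third nested example, whose inner struct has no short member. */
struct SketchSmallInner { double a; int b; char c; };
struct SketchSmallOuter {
    int b; char c; short d; float e;
    struct SketchSmallInner f;
};

void nested_layout_examples(void) {
    assert(offsetof(struct SketchOuter, e) == 16);
    assert(offsetof(struct SketchOuter, f) == 24);   /* next multiple of 8 after offset 19 */
    assert(sizeof(struct SketchOuter) == 40);

    assert(sizeof(struct SketchSmallInner) == 16);   /* 13 bytes used, padded to 16 */
    assert(offsetof(struct SketchSmallOuter, f) == 16);
    assert(sizeof(struct SketchSmallOuter) == 32);
}
```

A nested struct always occupies its own padded size, so the outer total follows from the inner struct's sizeof rather than from the last byte it actually uses.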
