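Modern C++ Cross-Platform Development, Memory Series: Memory Management in Multi-Platform Cross-Layer Calls

This is part 8 of the "Modern C++ Cross-Platform Development: Memory" series. It covers memory management when calls cross language layers on multiple platforms.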

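JNI Memory Management

In the JNI world, you cannot always rely on the GC.

References

  • Local references:

    • only functions that return a jobject (or a subtype such as jclass or jstring) create local references; jmethodID and jfieldID values are not references;

    • in loops, callbacks, and other frequently invoked paths, release local references explicitly, because they are otherwise freed only when the native method returns and the local reference table has limited capacity;

    • alternatively, PushLocalFrame()/PopLocalFrame() can release them in bulk;

  • Global references:

    • they extend an object's lifetime beyond the current native call and must be released manually (DeleteGlobalRef()) at an appropriate time.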
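The explicit release of local references in loops can be automated with a small scope guard. The following is a minimal sketch, not part of the original article: ScopedLocalRef, FakeEnv, and live_refs_after_loop are hypothetical names, and FakeEnv only stands in for JNIEnv so that the sketch compiles without <jni.h>. With the real headers, EnvT would be JNIEnv and RefT a jobject type.

```cpp
// Scope guard that deletes a JNI-style local reference when it goes out of scope.
// EnvT only needs a DeleteLocalRef(RefT) member, which the real JNIEnv provides in C++.
template <typename EnvT, typename RefT>
class ScopedLocalRef {
 public:
  ScopedLocalRef(EnvT* env, RefT ref) : env_(env), ref_(ref) {}
  ~ScopedLocalRef() { reset(); }

  // Non-copyable: exactly one guard owns a given reference.
  ScopedLocalRef(const ScopedLocalRef&) = delete;
  ScopedLocalRef& operator=(const ScopedLocalRef&) = delete;

  RefT get() const { return ref_; }

  // Deletes the held reference, if any, and forgets it.
  void reset() {
    if (ref_ != nullptr) {
      env_->DeleteLocalRef(ref_);
      ref_ = nullptr;
    }
  }

 private:
  EnvT* env_;
  RefT ref_;
};

// Hypothetical stand-in for JNIEnv that merely counts live references.
struct FakeEnv {
  int live_refs = 0;
  void* NewRef() { ++live_refs; return this; }
  void DeleteLocalRef(void*) { --live_refs; }
};

// Obtains one reference per iteration and returns how many are still alive
// after the loop; the guard releases each one at the end of its iteration.
inline int live_refs_after_loop(int iterations) {
  FakeEnv env;
  for (int i = 0; i < iterations; ++i) {
    ScopedLocalRef<FakeEnv, void*> ref(&env, env.NewRef());
    // ... use ref.get() here ...
  }
  return env.live_refs;
}
```

Because the guard releases in its destructor, early returns and exceptions inside the loop body do not leak references either.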

Source analysis

static bool ensureLocalCapacity(Thread* self, int capacity) {
    int numEntries = self->jniLocalRefTable.capacity();
    return ((kJniLocalRefMax - numEntries) >= capacity);
}

bool dvmPushLocalFrame(Thread* self, const Method* method) {
    //...
#ifdef USE_INDIRECT_REF
    saveBlock->xtra.localRefCookie = self->jniLocalRefTable.segmentState.all;
#else
    saveBlock->xtra.localRefCookie = self->jniLocalRefTable.nextEntry;
#endif
    //...
    return true;
}

static jint PushLocalFrame(JNIEnv* env, jint capacity) {
    //...
    if (!ensureLocalCapacity(ts.self(), capacity) ||
            !dvmPushLocalFrame(ts.self(), dvmGetCurrentJNIMethod())) {
        //...
    }
    return JNI_OK;
}

static jobject PopLocalFrame(JNIEnv* env, jobject jresult) {
    //...
    if (!dvmPopLocalFrame(ts.self())) {
        //...
    }
    return addLocalReference(ts.self(), result);
}

bool dvmPopLocalFrame(Thread* self) {
    //...
    dvmPopJniLocals(self, saveBlock);
    //...
    return true;
}

INLINE void dvmPopJniLocals(Thread* self, StackSaveArea* saveArea) {
    self->jniLocalRefTable.segmentState.all = saveArea->xtra.localRefCookie;
}
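The frame mechanism in the listing above boils down to saving the table's current top as a cookie on push and restoring it on pop. Below is a toy model of that idea, assuming a plain vector as the reference table; ToyLocalRefTable and refs_after_framed_loop are hypothetical names and this is not the Dalvik implementation.

```cpp
#include <cstddef>
#include <vector>

// Toy local reference table: entries are ints standing in for object references.
class ToyLocalRefTable {
 public:
  // Adds a reference on top of the table.
  void add(int obj) { entries_.push_back(obj); }

  // Remembers the current top of the table (compare localRefCookie).
  void push_frame() { cookies_.push_back(entries_.size()); }

  // Restores the remembered top, dropping every reference added since the
  // matching push_frame(). Does nothing if no frame was pushed.
  void pop_frame() {
    if (cookies_.empty()) return;
    entries_.resize(cookies_.back());
    cookies_.pop_back();
  }

  std::size_t size() const { return entries_.size(); }

 private:
  std::vector<int> entries_;
  std::vector<std::size_t> cookies_;
};

// One reference predates the frame; n references are created inside it.
// Returns the number of references left after the frame is popped.
inline std::size_t refs_after_framed_loop(int n) {
  ToyLocalRefTable table;
  table.add(-1);
  table.push_frame();
  for (int i = 0; i < n; ++i) table.add(i);
  table.pop_frame();
  return table.size();
}
```

Popping is a constant-time reset of the top index, which is why a frame is cheaper than deleting references one by one.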

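Arrays

  • GetStringUTFChars(), GetIntArrayElements(), GetByteArrayElements() and similar functions return C pointers, which must be released explicitly through ReleaseStringUTFChars() or the matching ReleaseXXXArrayElements();

  • GetObjectArrayElement() returns a jobject (a local reference), so there is no native buffer to free manually.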

Source analysis

static jobject GetObjectArrayElement(JNIEnv* env, jobjectArray java_array, jsize index) {
    //...
    return soa.AddLocalReference<jobject>(array->Get(index));
}

//////////

static jbyte* GetByteArrayElements(JNIEnv* env, jbyteArray array, jboolean* is_copy) {
    //...
    return GetPrimitiveArray<jbyteArray, jbyte*, ByteArray>(soa, array, is_copy);
}

static void ReleaseByteArrayElements(JNIEnv* env, jbyteArray array, jbyte* elements, jint mode) {
    ReleasePrimitiveArray<jbyteArray, jbyte, mirror::ByteArray>(env, array, elements, mode);
}

template <typename ArrayT, typename ElementT, typename ArtArrayT>
  static ElementT* GetPrimitiveArray(JNIEnv* env, ArrayT java_array, jboolean* is_copy) {
    //...
    ObjPtr<ArtArrayT> array = DecodeAndCheckArrayType<ArrayT, ElementT, ArtArrayT>(
        soa, java_array, "GetArrayElements", "get");
    //...
    if (Runtime::Current()->GetHeap()->IsMovableObject(array)) {
      //...
      const size_t component_size = sizeof(ElementT);
      size_t size = array->GetLength() * component_size;
      void* data = new uint64_t[RoundUp(size, 8) / 8];
      memcpy(data, array->GetData(), size);
      return reinterpret_cast<ElementT*>(data);
    } else {
      //...
      return reinterpret_cast<ElementT*>(array->GetData());
    }
}

static void ReleasePrimitiveArray(ScopedObjectAccess& soa, ObjPtr<mirror::Array> array, size_t component_size, void* elements, jint mode)
      REQUIRES_SHARED(Locks::mutator_lock_) {
    void* array_data = array->GetRawData(component_size, 0);
    gc::Heap* heap = Runtime::Current()->GetHeap();
    bool is_copy = array_data != elements;
    size_t bytes = array->GetLength() * component_size;
    if (is_copy) {
      //...
      if (mode != JNI_ABORT) {
        memcpy(array_data, elements, bytes);
      } else if (kWarnJniAbort && memcmp(array_data, elements, bytes) != 0) {
        //...
      }
    }
    if (mode != JNI_COMMIT) {
      if (is_copy) {
        delete[] reinterpret_cast<uint64_t*>(elements);
      } else if (heap->IsMovableObject(array)) {
        //...
      }
    }
 }

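As the code shows:

  • the JNI getters for primitive arrays call GetPrimitiveArray(), which may memcpy() the contents into a newly allocated buffer, so the matching release function must always be called;

  • the JNI getter for object arrays returns a local reference directly, so no manual release is generally needed.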
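The effect of the mode argument in the copy case can be illustrated with a toy model. This is a sketch under stated assumptions: ToyArray, toy_get_elements, toy_release_elements, and first_after_release are hypothetical names, the getter always copies, and only the constant values (JNI_COMMIT = 1, JNI_ABORT = 2) are taken from jni.h.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Same numeric values as JNI_COMMIT and JNI_ABORT in jni.h; mode 0 means
// "copy back and free".
constexpr int kModeDefault = 0;
constexpr int kModeCommit = 1;
constexpr int kModeAbort = 2;

// Stand-in for a Java byte[] living on a heap where objects may move.
struct ToyArray {
  std::vector<signed char> data;
};

// Hands out a heap copy of the contents, like the is_copy branch of the getter.
inline signed char* toy_get_elements(const ToyArray& array) {
  signed char* copy = new signed char[array.data.size()];
  std::memcpy(copy, array.data.data(), array.data.size());
  return copy;
}

// Copies back unless the mode is abort, frees unless the mode is commit.
// Returns true when the buffer was freed.
inline bool toy_release_elements(ToyArray& array, signed char* elements, int mode) {
  if (mode != kModeAbort) {
    std::memcpy(array.data.data(), elements, array.data.size());
  }
  if (mode != kModeCommit) {
    delete[] elements;
    return true;
  }
  return false;
}

// Modifies the first element through the copy, releases with the given mode,
// and returns what the array itself holds afterwards.
inline signed char first_after_release(int mode) {
  ToyArray array;
  array.data = {1, 2, 3};
  signed char* elements = toy_get_elements(array);
  elements[0] = 42;
  if (!toy_release_elements(array, elements, mode)) {
    // Commit keeps the buffer alive, so a final release is still required.
    toy_release_elements(array, elements, kModeAbort);
  }
  return array.data[0];
}
```

With the default mode and with commit the change reaches the array; with abort it is discarded. In every case the buffer has to be released once for good.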

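Going one step further, whether a copy is needed is decided by IsMovableObject():

  • it first checks whether a moving GC is in use (enabled by default in ART);

  • it then checks whether the memory space that contains the object allows objects to move.

The underlying reason for the copy: to avoid fragmentation, some GC algorithms actively move live objects from one region (from-space) to another (to-space). The address of an object on the Java heap is therefore not guaranteed to stay stable, so a copy has to be handed out instead.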

// Garbage collector constants.
static constexpr bool kMovingCollector = true;
static constexpr bool kMarkCompactSupport = false && kMovingCollector;

bool Heap::IsMovableObject(ObjPtr<mirror::Object> obj) const {
  if (kMovingCollector) {
    space::Space* space = FindContinuousSpaceFromObject(obj.Ptr(), true);
    if (space != nullptr) {
      return space->CanMoveObjects();
    }
  }
  return false;
}

space::ContinuousSpace* Heap::FindContinuousSpaceFromObject(ObjPtr<mirror::Object> obj, bool fail_ok) const {
  space::ContinuousSpace* space = FindContinuousSpaceFromAddress(obj.Ptr());
  if (space != nullptr) {
    return space;
  }
  //...
  return nullptr;
}

space::ContinuousSpace* Heap::FindContinuousSpaceFromAddress(const mirror::Object* addr) const {
  for (const auto& space : continuous_spaces_) {
    if (space->Contains(addr)) {
      return space;
    }
  }
  return nullptr;
}

//////////

class Space {
 public:
  virtual bool CanMoveObjects() const = 0;
  //...
};

class RegionSpace final : public ContinuousMemMapAllocSpace {
 public:
  bool CanMoveObjects() const override {
    return true;
  }
//...
};

class LargeObjectSpace : public DiscontinuousSpace, public AllocSpace {
 public:
  bool CanMoveObjects() const override {
    return false;
  }
};

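Efficient byte transfer

DirectByteBuffer lets Java and JNI code share memory efficiently. It is off-heap memory, so no implicit copy takes place:

  • if it is allocated in the Java layer, the GC frees it (through a Cleaner on OpenJDK);

  • if it is allocated in the JNI layer, it must be freed manually, for example with free() when it came from malloc().

Created in the Java layer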

ByteBuffer buf = ByteBuffer.allocateDirect(1024);
nativeHandleBuffer(buf);

Used in the JNI layer

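#include <jni.h>
#include <string.h>

JNIEXPORT void JNICALL
Java_Foo_Bar_nativeHandleBuffer(JNIEnv *env, jobject obj, jobject byteBuffer) {
    // Get the start address of the DirectByteBuffer
    void* address = (*env)->GetDirectBufferAddress(env, byteBuffer);
    if (address == NULL) {
        // Not a direct buffer, or some other error
        return;
    }

    // Get the capacity (-1 if it cannot be determined)
    jlong capacity = (*env)->GetDirectBufferCapacity(env, byteBuffer);
    if (capacity <= 0) {
        return;
    }

    // Read and write the memory directly
    memset(address, 0, (size_t)capacity);
}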

Created and released in the JNI layer

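#include <jni.h>
#include <stdlib.h>

JNIEXPORT jobject JNICALL
Java_Foo_Bar_createBuffer(JNIEnv *env, jclass cls, jint size) {
    if (size <= 0) return NULL;
    void* mem = malloc((size_t)size); // or mmap, etc.
    if (!mem) return NULL;

    // Wrap the memory in a DirectByteBuffer
    jobject directBuffer = (*env)->NewDirectByteBuffer(env, mem, size);
    if (directBuffer == NULL) {
        free(mem); // wrapping failed, so nobody else would free it
    }
    return directBuffer;
}

JNIEXPORT void JNICALL
Java_Foo_Bar_releaseBuffer(JNIEnv *env, jobject obj, jobject buffer) {
    // Only pass buffers created by createBuffer(); the Java side must not
    // touch the buffer after this call.
    void* addr = (*env)->GetDirectBufferAddress(env, buffer);
    free(addr); // free the malloc'ed memory; free(NULL) is a no-op
}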

Source analysis

final Cleaner cleaner;
final MemoryRef memoryRef;

DirectByteBuffer(int capacity, MemoryRef memoryRef) {
    super(-1, 0, capacity, capacity, memoryRef.buffer, memoryRef.offset);
    // Only have references to java objects, no need for a cleaner since the GC will do all the work.
    this.memoryRef = memoryRef;
    this.address = memoryRef.allocatedAddress + memoryRef.offset;
    cleaner = null;
    this.isReadOnly = false;
}

////////////

final static class MemoryRef {
    byte[] buffer;
    long allocatedAddress;
    final int offset;
    boolean isAccessible;
    boolean isFreed;

    final Object originalBufferObject;

    MemoryRef(int capacity) {
        VMRuntime runtime = VMRuntime.getRuntime();
        buffer = (byte[]) runtime.newNonMovableArray(byte.class, capacity + 7);
        allocatedAddress = runtime.addressOf(buffer);
        offset = (int) (((allocatedAddress + 7) & ~(long) 7) - allocatedAddress);
        isAccessible = true;
        isFreed = false;
        originalBufferObject = null;
    }

    //...

    void free() {
        buffer = null;
        allocatedAddress = 0;
        isAccessible = false;
        isFreed = true;
    }
}

////////////

public class Cleaner extends PhantomReference<Object> {
    // Dummy reference queue
    private static final ReferenceQueue<Object> dummyQueue = new ReferenceQueue<>();

    // Doubly-linked list of live cleaners, which prevents the cleaners
    // themselves from being GC'd before their referents
    //
    static private Cleaner first = null;

    private Cleaner
        next = null,
        prev = null;

    private static synchronized Cleaner add(Cleaner cl) {
        if (first != null) {
            cl.next = first;
            first.prev = cl;
        }
        first = cl;
        return cl;
    }
    //...
}

////////////

template <bool kInstrumented = true, typename PreFenceVisitor>
  mirror::Object* AllocNonMovableObject(Thread* self, ObjPtr<mirror::Class> klass, size_t num_bytes, const PreFenceVisitor& pre_fence_visitor) {
    mirror::Object* obj = AllocObjectWithAllocator<kInstrumented>(self, klass, num_bytes,
                                                                  GetCurrentNonMovingAllocator(), pre_fence_visitor);
    //...
    return obj;
  }

AllocatorType GetCurrentNonMovingAllocator() const {
    return current_non_moving_allocator_;
}

template <bool kInstrumented, bool kCheckLargeObject, typename PreFenceVisitor>
inline mirror::Object* Heap::AllocObjectWithAllocator(Thread* self, ObjPtr<mirror::Class> klass, size_t byte_count,
                                                      AllocatorType allocator, const PreFenceVisitor& pre_fence_visitor) {
    //...
    obj = TryToAllocate<kInstrumented, false>(self, allocator, byte_count, &bytes_allocated, &usable_size, &bytes_tl_bulk_allocated);
    //...
}

template <const bool kInstrumented, const bool kGrow>
inline mirror::Object* Heap::TryToAllocate(Thread* self, AllocatorType allocator_type,
                                           size_t alloc_size, size_t* bytes_allocated, size_t* usable_size, size_t* bytes_tl_bulk_allocated) {
    //...
    switch (allocator_type) {
        //...
        case kAllocatorTypeNonMoving: {
          ret = non_moving_space_->Alloc(self, alloc_size, bytes_allocated, usable_size, bytes_tl_bulk_allocated);
          break;
        }
        //...
    }
    //...
}

Heap::Heap(//...) {
    //...
    MemMap non_moving_space_mem_map;
    if (separate_non_moving_space) {
        //...
        if (heap_reservation.IsValid()) {
          non_moving_space_mem_map = heap_reservation.RemapAtEnd(heap_reservation.Begin(), space_name, PROT_READ | PROT_WRITE, &error_str);
        } else {
          non_moving_space_mem_map = MapAnonymousPreferredAddress(space_name, request_begin, non_moving_space_capacity, &error_str);
        }
        //...
    }
    //...
    non_moving_space_ = space::DlMallocSpace::CreateFromMemMap(std::move(non_moving_space_mem_map),
                                                               "zygote / non moving space",
                                                               kDefaultStartingSize, initial_size, size, size,
                                                               /* can_move_objects= */ false);
    //...
}

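As the code shows, Android's DirectByteBuffer uses a simplified implementation:

It does not use a Cleaner (a PhantomReference); it goes through MemoryRef instead, which:

  • holds the address of a non-movable array allocated by the runtime, so there is no implicit copy, nothing to free by hand, and access stays efficient;

  • tracks whether the memory is accessible, which every read and write checks.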
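The offset expression in the MemoryRef constructor rounds the allocated address up to the next multiple of 8, which is also why the array is allocated with capacity + 7 bytes: in the worst case 7 bytes are skipped. A small worked sketch of that arithmetic follows; align_up_8 and aligned_offset are hypothetical helper names, not part of the Android source.

```cpp
#include <cstdint>

// Rounds addr up to the next multiple of 8 (unchanged if already aligned).
constexpr std::uint64_t align_up_8(std::uint64_t addr) {
  return (addr + 7) & ~static_cast<std::uint64_t>(7);
}

// Distance from the allocated address to the first 8-byte-aligned byte,
// mirroring ((allocatedAddress + 7) & ~7) - allocatedAddress.
constexpr int aligned_offset(std::uint64_t allocated) {
  return static_cast<int>(align_up_8(allocated) - allocated);
}

// The offset is always in [0, 7], so capacity + 7 bytes always leave room
// for capacity usable bytes after the aligned start.
static_assert(aligned_offset(0x1000) == 0, "already aligned");
static_assert(aligned_offset(0x1001) == 7, "worst case skips 7 bytes");
static_assert(aligned_offset(0x1005) == 3, "rounds up to 0x1008");
```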

Byte order

Java defaults to big-endian (I/O streams, buffers, and so on); ByteBuffer lets you choose the byte order explicitly:

ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
buf.putInt(0x12345678); // written little-endian: [0x78, 0x56, 0x34, 0x12]
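On the native side, the bytes written by such a buffer can be produced or decoded without depending on the host's byte order by working with shifts instead of pointer casts. This is a minimal sketch; store_le32 and load_le32 are hypothetical helper names.

```cpp
#include <array>
#include <cstdint>

// Serializes v as 4 bytes, least significant byte first (little-endian),
// regardless of the host's byte order.
inline std::array<std::uint8_t, 4> store_le32(std::uint32_t v) {
  return {static_cast<std::uint8_t>(v & 0xFF),
          static_cast<std::uint8_t>((v >> 8) & 0xFF),
          static_cast<std::uint8_t>((v >> 16) & 0xFF),
          static_cast<std::uint8_t>((v >> 24) & 0xFF)};
}

// Reads 4 little-endian bytes starting at p back into a 32-bit value.
inline std::uint32_t load_le32(const std::uint8_t* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For 0x12345678, store_le32 yields the same byte sequence as the Java snippet above, and load_le32 restores the original value.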

Several earlier articles in this series also touch on JNI memory management.

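Memory management when mixing OC and C++

Mixing OC and C++ is pleasant to write, but when it comes to memory management they are two separate worlds:
"Render unto God what is God's, and unto Caesar what is Caesar's."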

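Autorelease pools

OC methods usually run inside an autorelease pool that was created implicitly (blocks are an exception), but C++ functions do not: if a C++ function creates OC objects internally, wrap that code in an @autoreleasepool {} block.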

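Weak references

  • The automatic nil-ing of __weak depends on the OC runtime's dealloc path:

    • when an object is deallocated, the runtime looks up the __weak references registered for it and sets them to nil;

    • this only works if the __weak variable lives in memory the runtime can track (for example an OC instance variable, or a global or stack variable);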

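  • C++ object destruction and OC dealloc are two independent lifetime systems:

    • a C++ member variable (even a __weak id) is usually not covered by the OC runtime's weak reference table, notably in translation units compiled without ARC or when the C++ object's storage is created or copied without running its constructors (under ARC, Clang does generate the weak registration for such members).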

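So when a C++ object directly holds an OC __weak member that the runtime does not track, serious memory problems follow:

  • weakSelf is not set to nil automatically and becomes a dangling pointer;

  • a later message send crashes with EXC_BAD_ACCESS.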

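Suggested approach:

  • inside the .mm file, define a small file-private NSObject subclass that holds the __weak id (OC class declarations must sit at global scope, so it cannot be placed inside an anonymous namespace);

  • the C++ constructor creates that OC object from the id, and the destructor releases it explicitly (call release under MRC; set it to nil under ARC).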

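Byte order

Apple's Core Foundation provides byte-order APIs in CFByteOrder.h:

// Get the byte order of the current machine:
CFByteOrderGetCurrent(); 

// Convert a 32-bit integer from big-endian to host order (unchanged if the host is big-endian)
uint32_t CFSwapInt32BigToHost(uint32_t arg);

// Convert a 32-bit integer from host order to big-endian (unchanged if the host is big-endian)
uint32_t CFSwapInt32HostToBig(uint32_t arg);

// Convert a 32-bit integer from little-endian to host order (unchanged if the host is little-endian)
uint32_t CFSwapInt32LittleToHost(uint32_t arg);

// Convert a 32-bit integer from host order to little-endian (unchanged if the host is little-endian)
uint32_t CFSwapInt32HostToLittle(uint32_t arg);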
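For shared C++ code that cannot include Core Foundation, the "swap only if the host order differs" behaviour can be approximated portably. The sketch below is an illustration of the idea and not the Core Foundation implementation; host_is_little_endian, byte_swap32, host_to_big32, big_to_host32, and first_byte_in_memory are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// Detects the host byte order at run time by inspecting the first byte of 1.
inline bool host_is_little_endian() {
  const std::uint16_t probe = 1;
  std::uint8_t first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Unconditionally reverses the four bytes of v.
inline std::uint32_t byte_swap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Host order to big-endian: swaps on little-endian hosts, no-op otherwise.
inline std::uint32_t host_to_big32(std::uint32_t v) {
  return host_is_little_endian() ? byte_swap32(v) : v;
}

// Big-endian to host order: the same operation in the other direction.
inline std::uint32_t big_to_host32(std::uint32_t v) {
  return host_is_little_endian() ? byte_swap32(v) : v;
}

// Returns the byte stored at the lowest address of v, to inspect the layout.
inline std::uint8_t first_byte_in_memory(std::uint32_t v) {
  std::uint8_t first = 0;
  std::memcpy(&first, &v, 1);
  return first;
}
```

After host_to_big32, the most significant byte sits at the lowest address on any host, which is the layout a big-endian wire format expects.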

An earlier article on mixing OC and C++ also touches on memory management:

Referencing C/C++ Projects on iOS: Cross-Compilation and Objective-C++