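Modern C++ Cross-Platform Development, Memory Series: Memory Management for Multi-Platform Cross-Layer Calls
This is part 8 of the "Modern C++ Cross-Platform Development: Memory" series. It covers memory management when calls cross language and platform layers.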
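JNI Memory Management
In the JNI world, you cannot always count on the GC.
References
Local references:
- Only functions that return a jobject (or a subtype such as jclass or jstring) create a local reference; jmethodID and jfieldID are not references.
- In loops, callbacks, and other frequently executed paths, release local references explicitly with DeleteLocalRef().
- Alternatively, let PushLocalFrame()/PopLocalFrame() release them in bulk.
Global references:
- Used to extend an object's lifetime; they must be released manually at an appropriate time.
Source analysis (Dalvik implementation):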
static bool ensureLocalCapacity(Thread* self, int capacity) {
    int numEntries = self->jniLocalRefTable.capacity();
    return ((kJniLocalRefMax - numEntries) >= capacity);
}

bool dvmPushLocalFrame(Thread* self, const Method* method) {
    //...
#ifdef USE_INDIRECT_REF
    saveBlock->xtra.localRefCookie = self->jniLocalRefTable.segmentState.all;
#else
    saveBlock->xtra.localRefCookie = self->jniLocalRefTable.nextEntry;
#endif
    //...
    return true;
}

static jint PushLocalFrame(JNIEnv* env, jint capacity) {
    //...
    if (!ensureLocalCapacity(ts.self(), capacity) ||
        !dvmPushLocalFrame(ts.self(), dvmGetCurrentJNIMethod())) {
        //...
    }
    return JNI_OK;
}

static jobject PopLocalFrame(JNIEnv* env, jobject jresult) {
    //...
    if (!dvmPopLocalFrame(ts.self())) {
        //...
    }
    return addLocalReference(ts.self(), result);
}

bool dvmPopLocalFrame(Thread* self) {
    //...
    dvmPopJniLocals(self, saveBlock);
    //...
    return true;
}

INLINE void dvmPopJniLocals(Thread* self, StackSaveArea* saveArea) {
    self->jniLocalRefTable.segmentState.all = saveArea->xtra.localRefCookie;
}
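The push/pop pair above amounts to saving a position in the local reference table and later rolling the table back to it. The following is a toy model of that idea in plain C++, not the VM's code: ToyLocalRefTable, pushFrame(), popFrame(), and runFramedLoop() are hypothetical names, and references are modeled as plain ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a per-thread local reference table (hypothetical, not the VM's data structure).
class ToyLocalRefTable {
public:
    // Adding a reference appends an entry, like a JNI call that returns a jobject.
    void addLocalRef(int ref) { entries_.push_back(ref); }

    // pushFrame() saves the current top of the table as a "cookie".
    void pushFrame() { cookies_.push_back(entries_.size()); }

    // popFrame() restores the saved cookie, dropping every entry added since pushFrame().
    void popFrame() {
        assert(!cookies_.empty() && "popFrame() without a matching pushFrame()");
        entries_.resize(cookies_.back());
        cookies_.pop_back();
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<int> entries_;          // live local references
    std::vector<std::size_t> cookies_;  // saved table positions, one per open frame
};

// Simulates a loop that creates `count` references inside one frame and
// returns how many entries remain in the table afterwards.
std::size_t runFramedLoop(ToyLocalRefTable& table, int count) {
    table.pushFrame();
    for (int i = 0; i < count; ++i) {
        table.addLocalRef(i);
    }
    table.popFrame();
    return table.size();
}
```

In this model, references created before the frame survive the pop, while everything created inside it disappears in a single step. That is why a frame is convenient for loops that would otherwise need one DeleteLocalRef() call per reference.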
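Arrays
GetStringUTFChars(), GetIntArrayElements(), and GetByteArrayElements() return C pointers that must be released explicitly with ReleaseStringUTFChars() or Release<Type>ArrayElements(). GetObjectArrayElement() returns a jobject (a local reference), so there is no native memory to free manually.
Source analysis (ART implementation):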
static jobject GetObjectArrayElement(JNIEnv* env, jobjectArray java_array, jsize index) {
    //...
    return soa.AddLocalReference<jobject>(array->Get(index));
}
//////////
static jbyte* GetByteArrayElements(JNIEnv* env, jbyteArray array, jboolean* is_copy) {
    //...
    return GetPrimitiveArray<jbyteArray, jbyte*, ByteArray>(soa, array, is_copy);
}

static void ReleaseByteArrayElements(JNIEnv* env, jbyteArray array, jbyte* elements, jint mode) {
    ReleasePrimitiveArray<jbyteArray, jbyte, mirror::ByteArray>(env, array, elements, mode);
}

template <typename ArrayT, typename ElementT, typename ArtArrayT>
static ElementT* GetPrimitiveArray(JNIEnv* env, ArrayT java_array, jboolean* is_copy) {
    //...
    ObjPtr<ArtArrayT> array = DecodeAndCheckArrayType<ArrayT, ElementT, ArtArrayT>(
        soa, java_array, "GetArrayElements", "get");
    //...
    if (Runtime::Current()->GetHeap()->IsMovableObject(array)) {
        //...
        const size_t component_size = sizeof(ElementT);
        size_t size = array->GetLength() * component_size;
        void* data = new uint64_t[RoundUp(size, 8) / 8];
        memcpy(data, array->GetData(), size);
        return reinterpret_cast<ElementT*>(data);
    } else {
        //...
        return reinterpret_cast<ElementT*>(array->GetData());
    }
}

static void ReleasePrimitiveArray(ScopedObjectAccess& soa, ObjPtr<mirror::Array> array, size_t component_size, void* elements, jint mode)
    REQUIRES_SHARED(Locks::mutator_lock_) {
    void* array_data = array->GetRawData(component_size, 0);
    gc::Heap* heap = Runtime::Current()->GetHeap();
    bool is_copy = array_data != elements;
    size_t bytes = array->GetLength() * component_size;
    if (is_copy) {
        //...
        if (mode != JNI_ABORT) {
            memcpy(array_data, elements, bytes);
        } else if (kWarnJniAbort && memcmp(array_data, elements, bytes) != 0) {
            //...
        }
    }
    if (mode != JNI_COMMIT) {
        if (is_copy) {
            delete[] reinterpret_cast<uint64_t*>(elements);
        } else if (heap->IsMovableObject(array)) {
            //...
        }
    }
}
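As the code shows:
- The JNI getters for primitive arrays call GetPrimitiveArray(), which may memcpy() the data into a native buffer, so the matching release function must always be called.
- The JNI getter for object arrays simply returns a local reference, so it generally needs no manual release.
Going one step further, IsMovableObject() decides whether a copy is needed:
- First, it checks whether a moving GC is in use (enabled by default in ART).
- Then, it checks whether the object lives in a space whose objects can move.
The underlying reason for the copy: to limit fragmentation, some GC algorithms relocate live objects from one region (from-space) to another (to-space). The address of an object on the Java heap is therefore not stable, so the data has to be copied.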
// Garbage collector constants.
static constexpr bool kMovingCollector = true;
static constexpr bool kMarkCompactSupport = false && kMovingCollector;

bool Heap::IsMovableObject(ObjPtr<mirror::Object> obj) const {
    if (kMovingCollector) {
        space::Space* space = FindContinuousSpaceFromObject(obj.Ptr(), true);
        if (space != nullptr) {
            return space->CanMoveObjects();
        }
    }
    return false;
}

space::ContinuousSpace* Heap::FindContinuousSpaceFromObject(ObjPtr<mirror::Object> obj, bool fail_ok) const {
    space::ContinuousSpace* space = FindContinuousSpaceFromAddress(obj.Ptr());
    if (space != nullptr) {
        return space;
    }
    //...
    return nullptr;
}

space::ContinuousSpace* Heap::FindContinuousSpaceFromAddress(const mirror::Object* addr) const {
    for (const auto& space : continuous_spaces_) {
        if (space->Contains(addr)) {
            return space;
        }
    }
    return nullptr;
}
//////////
class Space {
public:
    virtual bool CanMoveObjects() const = 0;
    //...
};

class RegionSpace final : public ContinuousMemMapAllocSpace {
public:
    bool CanMoveObjects() const override {
        return true;
    }
    //...
};

class LargeObjectSpace : public DiscontinuousSpace, public AllocSpace {
public:
    bool CanMoveObjects() const override {
        return false;
    }
};
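To make the copy-or-pin decision and the three release modes concrete, here is a small simulation in plain C++. It is a sketch under stated assumptions, not ART code: ToyByteArray, toyGetElements(), and toyReleaseElements() are hypothetical, the movable flag stands in for IsMovableObject(), and the mode constants only mirror the values of JNI_COMMIT (1) and JNI_ABORT (2) from jni.h.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mode values mirror jni.h: 0 (default), JNI_COMMIT = 1, JNI_ABORT = 2.
constexpr int kModeDefault = 0;
constexpr int kModeCommit = 1;
constexpr int kModeAbort = 2;

// Toy stand-in for a Java primitive array (hypothetical type).
struct ToyByteArray {
    std::vector<signed char> data;
    bool movable = true;  // stands in for Heap::IsMovableObject()
};

// Like Get<Type>ArrayElements: hand out a heap copy if the array may move,
// otherwise a pointer straight into the array's storage.
signed char* toyGetElements(ToyByteArray& array, bool* isCopy) {
    if (array.movable) {
        signed char* copy = new signed char[array.data.size()];
        std::copy(array.data.begin(), array.data.end(), copy);
        if (isCopy != nullptr) *isCopy = true;
        return copy;
    }
    if (isCopy != nullptr) *isCopy = false;
    return array.data.data();
}

// Like Release<Type>ArrayElements: copy back unless aborting, free unless committing.
void toyReleaseElements(ToyByteArray& array, signed char* elements, int mode) {
    assert(elements != nullptr);
    const bool isCopy = elements != array.data.data();
    if (!isCopy) {
        return;  // direct pointer: writes already landed in the array
    }
    if (mode != kModeAbort) {
        std::copy(elements, elements + array.data.size(), array.data.begin());
    }
    if (mode != kModeCommit) {
        delete[] elements;
    }
}
```

In this model, skipping the release call on a movable array leaks the copy and also loses every write, because changes only reach the array when they are copied back.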
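Efficient Byte Transfer
A DirectByteBuffer lets Java and JNI share memory efficiently. It is backed by off-heap memory, so no implicit copy takes place:
- If the buffer is allocated on the Java side, the GC releases it through a Cleaner.
- If it is allocated on the JNI side, it must be released manually with free().
Created on the Java side:
ByteBuffer buf = ByteBuffer.allocateDirect(1024);
nativeHandleBuffer(buf);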
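Used on the JNI side:
#include <jni.h>
#include <string.h>

JNIEXPORT void JNICALL
Java_Foo_Bar_nativeHandleBuffer(JNIEnv *env, jobject obj, jobject byteBuffer) {
    // Get the start address of the DirectByteBuffer
    void* address = (*env)->GetDirectBufferAddress(env, byteBuffer);
    if (address == NULL) {
        // Not a direct buffer, or some other error
        return;
    }
    // Get the capacity (-1 means it could not be determined)
    jlong capacity = (*env)->GetDirectBufferCapacity(env, byteBuffer);
    if (capacity <= 0) {
        return;
    }
    // Read and write the memory directly
    memset(address, 0, (size_t) capacity);
}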
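Created and released on the JNI side:
#include <jni.h>
#include <stdlib.h>

JNIEXPORT jobject JNICALL
Java_Foo_Bar_createBuffer(JNIEnv *env, jclass cls, jint size) {
    if (size <= 0) return NULL;
    void* mem = malloc((size_t) size); // or use mmap, etc.
    if (!mem) return NULL;
    // Wrap the memory in a DirectByteBuffer
    jobject directBuffer = (*env)->NewDirectByteBuffer(env, mem, size);
    if (directBuffer == NULL) {
        free(mem); // wrapping failed, so nothing else owns the memory
    }
    return directBuffer;
}

JNIEXPORT void JNICALL
Java_Foo_Bar_releaseBuffer(JNIEnv *env, jobject obj, jobject buffer) {
    void* addr = (*env)->GetDirectBufferAddress(env, buffer);
    // Free the memory malloc'ed earlier; the Java side must not use the buffer afterwards
    free(addr);
}
Source analysis: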
final Cleaner cleaner;
final MemoryRef memoryRef;

DirectByteBuffer(int capacity, MemoryRef memoryRef) {
    super(-1, 0, capacity, capacity, memoryRef.buffer, memoryRef.offset);
    // Only have references to java objects, no need for a cleaner since the GC will do all the work.
    this.memoryRef = memoryRef;
    this.address = memoryRef.allocatedAddress + memoryRef.offset;
    cleaner = null;
    this.isReadOnly = false;
}
////////////
final static class MemoryRef {
    byte[] buffer;
    long allocatedAddress;
    final int offset;
    boolean isAccessible;
    boolean isFreed;
    final Object originalBufferObject;

    MemoryRef(int capacity) {
        VMRuntime runtime = VMRuntime.getRuntime();
        buffer = (byte[]) runtime.newNonMovableArray(byte.class, capacity + 7);
        allocatedAddress = runtime.addressOf(buffer);
        offset = (int) (((allocatedAddress + 7) & ~(long) 7) - allocatedAddress);
        isAccessible = true;
        isFreed = false;
        originalBufferObject = null;
    }
    //...
    void free() {
        buffer = null;
        allocatedAddress = 0;
        isAccessible = false;
        isFreed = true;
    }
}
////////////
public class Cleaner extends PhantomReference<Object> {
    // Dummy reference queue
    private static final ReferenceQueue<Object> dummyQueue = new ReferenceQueue<>();

    // Doubly-linked list of live cleaners, which prevents the cleaners
    // themselves from being GC'd before their referents
    //
    static private Cleaner first = null;

    private Cleaner
        next = null,
        prev = null;

    private static synchronized Cleaner add(Cleaner cl) {
        if (first != null) {
            cl.next = first;
            first.prev = cl;
        }
        first = cl;
        return cl;
    }
    //...
}
template <bool kInstrumented = true, typename PreFenceVisitor>
mirror::Object* AllocNonMovableObject(Thread* self, ObjPtr<mirror::Class> klass, size_t num_bytes, const PreFenceVisitor& pre_fence_visitor) {
    mirror::Object* obj = AllocObjectWithAllocator<kInstrumented>(self, klass, num_bytes,
        GetCurrentNonMovingAllocator(), pre_fence_visitor);
    //...
    return obj;
}

AllocatorType GetCurrentNonMovingAllocator() const {
    return current_non_moving_allocator_;
}

template <bool kInstrumented, bool kCheckLargeObject, typename PreFenceVisitor>
inline mirror::Object* Heap::AllocObjectWithAllocator(Thread* self, ObjPtr<mirror::Class> klass, size_t byte_count,
    AllocatorType allocator, const PreFenceVisitor& pre_fence_visitor) {
    //...
    obj = TryToAllocate<kInstrumented, false>(self, allocator, byte_count, &bytes_allocated, &usable_size, &bytes_tl_bulk_allocated);
    //...
}

template <const bool kInstrumented, const bool kGrow>
inline mirror::Object* Heap::TryToAllocate(Thread* self, AllocatorType allocator_type,
    size_t alloc_size, size_t* bytes_allocated, size_t* usable_size, size_t* bytes_tl_bulk_allocated) {
    //...
    switch (allocator_type) {
        //...
        case kAllocatorTypeNonMoving: {
            ret = non_moving_space_->Alloc(self, alloc_size, bytes_allocated, usable_size, bytes_tl_bulk_allocated);
            break;
        }
        //...
    }
    //...
}
Heap::Heap(/* ... */) {
    //...
    MemMap non_moving_space_mem_map;
    if (separate_non_moving_space) {
        //...
        if (heap_reservation.IsValid()) {
            non_moving_space_mem_map = heap_reservation.RemapAtEnd(heap_reservation.Begin(), space_name, PROT_READ | PROT_WRITE, &error_str);
        } else {
            non_moving_space_mem_map = MapAnonymousPreferredAddress(space_name, request_begin, non_moving_space_capacity, &error_str);
        }
        //...
        non_moving_space_ = space::DlMallocSpace::CreateFromMemMap(std::move(non_moving_space_mem_map),
            "zygote / non moving space",
            kDefaultStartingSize, initial_size, size, size,
            /* can_move_objects= */ false);
        //...
    }
    //...
}
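As the source shows, the Android version of DirectByteBuffer actually uses a simplified implementation:
- It does not use a Cleaner (PhantomReference) but a MemoryRef, which:
  - holds the address of non-movable memory allocated by the runtime, so there is no implicit copy, nothing to free by hand, and access stays efficient;
  - tracks the accessible state, which every read and write checks.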
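The MemoryRef constructor over-allocates by 7 bytes and then computes an offset so that the usable region starts on an 8-byte boundary. The same arithmetic can be sketched in C++ as follows; alignUpOffset8() and alignedStart() are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns the number of bytes to skip so that (address + offset) is 8-byte aligned.
// Same arithmetic as MemoryRef: ((address + 7) & ~7) - address.
constexpr std::uint64_t alignUpOffset8(std::uint64_t address) {
    return ((address + 7) & ~static_cast<std::uint64_t>(7)) - address;
}

// Returns the aligned start address inside a buffer that was over-allocated by 7 bytes.
inline std::uint64_t alignedStart(std::uint64_t allocatedAddress) {
    const std::uint64_t start = allocatedAddress + alignUpOffset8(allocatedAddress);
    // The offset is at most 7, which is why capacity + 7 bytes are always enough.
    assert(start % 8 == 0 && start - allocatedAddress <= 7);
    return start;
}
```

For example, an allocation at 0x1001 gets an offset of 7 and starts at 0x1008, while an allocation that is already aligned gets an offset of 0.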
Byte Order
Java defaults to big-endian (I/O streams, buffers, and so on); a ByteBuffer can be switched to either byte order:
ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
buf.putInt(0x12345678); // written little-endian: [0x78, 0x56, 0x34, 0x12]
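On the native side, the bytes written by the Java snippet can be produced or consumed without depending on the host byte order. The following is a minimal sketch; toLittleEndian() and fromLittleEndian() are hypothetical helper names, not part of JNI or the standard library.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encodes a 32-bit value as four bytes, least significant byte first.
// Shifts and masks work on the value itself, so the result is the same on any host.
std::array<std::uint8_t, 4> toLittleEndian(std::uint32_t value) {
    return {{
        static_cast<std::uint8_t>(value & 0xFFu),
        static_cast<std::uint8_t>((value >> 8) & 0xFFu),
        static_cast<std::uint8_t>((value >> 16) & 0xFFu),
        static_cast<std::uint8_t>((value >> 24) & 0xFFu),
    }};
}

// Decodes four little-endian bytes back into a 32-bit value.
std::uint32_t fromLittleEndian(const std::array<std::uint8_t, 4>& bytes) {
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Encoding byte by byte avoids reinterpreting an integer's memory, which is the usual source of byte-order bugs when the same code runs on several platforms.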
Several earlier articles also touch on JNI memory management.
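Memory Management When Mixing Objective-C and C++
Mixing Objective-C and C++ is convenient, but when it comes to memory management they are two separate worlds:
"Render unto Caesar the things that are Caesar's, and unto God the things that are God's."
Autorelease Pools
An Objective-C method usually runs inside an implicitly created autorelease pool (blocks excepted), but a C++ function does not: if it creates Objective-C objects internally, wrap that code in an @autoreleasepool {} block.
Weak References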
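The automatic nil-ing of __weak references depends on the Objective-C runtime's dealloc path:
- During dealloc, the runtime scans all __weak references to the object and sets them to nil.
- This only works if the __weak variables live in memory the runtime can track (for example Objective-C instance variables, or global and stack variables).
C++ destruction and Objective-C dealloc are two independent lifetime systems:
- A C++ member variable (even a __weak id) is usually outside the scope of the Objective-C runtime's weak reference table.
So if a C++ object directly holds an Objective-C __weak member, serious memory problems can follow:
- weakSelf is not set to nil automatically and becomes a dangling pointer.
- A later message send crashes immediately with EXC_BAD_ACCESS.
Recommended approach:
- Inside the .mm file, define an NSObject subclass in an anonymous namespace to hold the __weak id.
- The C++ constructor creates that Objective-C object from the id, and the destructor releases it explicitly (call release under MRC, assign nil under ARC).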
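Byte Order
Apple's Core Foundation provides byte-order APIs in CFByteOrder.h:
// Get the byte order of the current machine:
CFByteOrderGetCurrent();
// Convert a 32-bit integer from big-endian to host order (unchanged on a big-endian host)
uint32_t CFSwapInt32BigToHost(uint32_t arg);
// Convert a 32-bit integer from host order to big-endian (unchanged on a big-endian host)
uint32_t CFSwapInt32HostToBig(uint32_t arg);
// Convert a 32-bit integer from little-endian to host order (unchanged on a little-endian host)
uint32_t CFSwapInt32LittleToHost(uint32_t arg);
// Convert a 32-bit integer from host order to little-endian (unchanged on a little-endian host)
uint32_t CFSwapInt32HostToLittle(uint32_t arg);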
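On platforms without Core Foundation, the same "swap only if the host differs" behavior can be approximated in portable C++. This is a sketch, not Apple's implementation: hostIsLittleEndian(), byteSwap32(), bigToHost32(), and hostToBig32() are hypothetical names chosen to parallel the CF functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Detects the host byte order at run time by inspecting the first byte of the value 1.
inline bool hostIsLittleEndian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverses the four bytes of a 32-bit value.
constexpr std::uint32_t byteSwap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Big-endian to host order: swaps on a little-endian host, returns the value unchanged otherwise.
inline std::uint32_t bigToHost32(std::uint32_t v) {
    return hostIsLittleEndian() ? byteSwap32(v) : v;
}

// Host order to big-endian: the same operation, since a byte swap is its own inverse.
inline std::uint32_t hostToBig32(std::uint32_t v) {
    return bigToHost32(v);
}
```

Whatever the host, the bytes of hostToBig32(0x12345678) sit in memory as 12 34 56 78, which is the property that matters when the value is written to a file or a socket.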
An earlier article on mixing Objective-C and C++ also touches on related memory management issues.