I work with Metal on Apple Silicon in my day job, and the experience of Unified Memory has been revealing. No explicit transfers between CPU and GPU. No SetData / GetData choreography. Just… shared memory.
This made me think: what if this became the norm everywhere? And how does it relate to data-oriented design patterns like ECS?
## My Experience with Metal
Working with Metal on Apple Silicon, I’ve come to appreciate what Unified Memory enables:
```swift
// Metal - storageModeShared means no transfer
let buffer = device.makeBuffer(
    bytes: &states,
    length: bufferLength,
    options: .storageModeShared
)
```
The development experience feels quite smooth. No thinking about transfers, no synchronization headaches. CPU writes, GPU reads — same memory.
This reduced a lot of the cognitive overhead I used to deal with. And it made me wonder: what if all platforms worked this way?
## The Current Landscape
Looking at today’s gaming platforms:
| Platform | Memory | Notes |
|---|---|---|
| Apple Silicon | 16-128GB LPDDR5X | True Unified Memory (WWDC20) |
| PS5 | 16GB GDDR6 | Shared pool, 448 GB/s bandwidth |
| Xbox Series X | 16GB GDDR6 | Split bandwidth (10GB + 6GB) |
| Switch 2 | 12GB LPDDR5X | NVIDIA T239 SoC |
| PC (Gaming) | Separated | dGPU with dedicated VRAM |
Apple Silicon is the clearest example of true unified memory — officially documented as CPU and GPU sharing the same memory pool.
For consoles, the situation is less clear-cut. They use shared GDDR6 pools, which eliminates PCIe transfer overhead, but whether this qualifies as “true” unified memory depends on how strictly you define the term. Either way, they’re closer to unified than traditional PC architecture.
## The Traditional PC Architecture
```mermaid
flowchart LR
    CPU[CPU Memory] <-->|PCIe Transfer| GPU[GPU Memory]
```
This separation creates friction:
- Explicit data transfers required
- Transfer can become a bottleneck
- Programming complexity increases
Anyone who’s worked with compute shaders knows this — you’re constantly thinking about when and what to transfer.
## Why This Matters for Architecture Design
Whether you’re using:
- Unity DOTS (Archetype storage)
- EnTT (Sparse Set)
- Entitas (Group-based)
They all share one thing: contiguous memory layout.
```mermaid
flowchart TB
    subgraph Scattered[Scattered]
        direction LR
        O1[Obj 1] ~~~ O2[Obj 2] ~~~ O3[Obj 3]
    end
    subgraph Contiguous[Contiguous]
        direction LR
        Data[D1, D2, D3, ...]
    end
```
In a Unified Memory world:
- Contiguous data can be accessed efficiently by both CPU and GPU
- No marshalling or conversion needed
- Cache-friendly for CPU, transfer-friendly (or transfer-free) for GPU
This isn’t a new insight — it’s why ECS architectures exist. But Unified Memory could make these patterns even more valuable.
## What I’d Like to See
It would be nice if Unified Memory became more widespread across platforms.
Currently:
- Apple Silicon has true unified memory
- Consoles have shared memory pools (which helps)
- PC gaming still relies on separated memory
If unified memory spread further — whether through better APUs, new interconnect technologies, or something else — the benefits of data-oriented design would apply even more broadly. That’s a future I’d find exciting.
Some principles that seem relevant regardless:
- Prefer contiguous data layouts — Whether Archetype or Sparse Set, packed data wins
- Use unmanaged types — struct, no references, blittable data
- Think about access patterns — Sequential access is still faster than random
These aren’t new ideas — they’re the same principles that make DOTS fast. They’re valuable today for cache efficiency, and could become even more valuable if unified memory spreads further.
## Summary
The gaming landscape varies:
- Apple Silicon — True unified memory, officially documented
- Consoles (PS5, Xbox, Switch) — Shared memory pools, details vary
- PC Gaming — Still separated, likely to remain so for a while
It would be nice to see unified memory become more common. My experience with Metal has shown me how much simpler development can be when you don’t have to think about CPU/GPU transfers. Whether that future arrives more broadly remains to be seen.
In the meantime, designing with contiguous memory layouts is valuable regardless — for cache efficiency today, and potentially for unified memory tomorrow.
## References
- WWDC20: Apple Silicon Architecture — Official Apple documentation on unified memory
- MLX Unified Memory — Documentation for Apple's MLX framework, built on the same unified memory model