Unified Memory and the Future of Game Architecture

  • Unified Memory
  • Apple Silicon
  • Metal
  • ECS
  • Architecture
  • Performance
  • Game Development

I work with Metal on Apple Silicon in my day job, and the experience of Unified Memory has been revealing. No explicit transfers between CPU and GPU. No SetData / GetData choreography. Just… shared memory.

This made me think: what if this became the norm everywhere? And how does it relate to data-oriented design patterns like ECS?

My Experience with Metal

Working with Metal on Apple Silicon, I’ve come to appreciate what Unified Memory enables:

// Metal - storageModeShared means no transfer
let buffer = device.makeBuffer(
    bytes: &states,
    length: bufferLength,
    options: .storageModeShared
)

The development experience feels quite smooth. No thinking about transfers, no synchronization headaches. CPU writes, GPU reads — same memory.

This reduced a lot of the cognitive overhead I used to deal with. And it made me wonder: what if all platforms worked this way?

The Current Landscape

Looking at today’s gaming platforms:

Platform        Memory              Notes
Apple Silicon   16-128GB LPDDR5X    True Unified Memory (WWDC20)
PS5             16GB GDDR6          Shared pool, 448 GB/s bandwidth
Xbox Series X   16GB GDDR6          Split bandwidth (10GB + 6GB)
Switch 2        12GB LPDDR5X        NVIDIA T239 SoC
PC (Gaming)     Separated           dGPU with dedicated VRAM

Apple Silicon is the clearest example of true unified memory — officially documented as CPU and GPU sharing the same memory pool.

For consoles, the situation is less clear-cut. They use shared GDDR6 pools, which eliminates PCIe transfer overhead, but whether this qualifies as “true” unified memory depends on how strictly you define the term. Either way, they’re closer to unified than traditional PC architecture.

The Traditional PC Architecture

flowchart LR
    CPU[CPU Memory] <-->|PCIe Transfer| GPU[GPU Memory]

This separation creates friction:

  • Explicit data transfers required
  • Transfer can become a bottleneck
  • Programming complexity increases

Anyone who’s worked with compute shaders knows this — you’re constantly thinking about when and what to transfer.

Why This Matters for Architecture Design

Whether you’re using:

  • Unity DOTS (Archetype storage)
  • EnTT (Sparse Set)
  • Entitas (Group-based)

They all share one thing: contiguous memory layout.

flowchart TB
    subgraph Scattered[Scattered]
        direction LR
        O1[Obj 1] ~~~ O2[Obj 2] ~~~ O3[Obj 3]
    end
    subgraph Contiguous[Contiguous]
        direction LR
        Data[D1, D2, D3, ...]
    end

In a Unified Memory world:

  • Contiguous data can be accessed efficiently by both CPU and GPU
  • No marshalling or conversion needed
  • Cache-friendly for CPU, transfer-friendly (or transfer-free) for GPU

This isn’t a new insight — it’s why ECS architectures exist. But Unified Memory could make these patterns even more valuable.

What I’d Like to See

It would be nice if Unified Memory became more widespread across platforms.

Currently:

  • Apple Silicon has true unified memory
  • Consoles have shared memory pools (which helps)
  • PC gaming still relies on separated memory

If unified memory spread further — whether through better APUs, new interconnect technologies, or something else — the benefits of data-oriented design would apply even more broadly. That’s a future I’d find exciting.

Some principles that seem relevant regardless:

  1. Prefer contiguous data layouts — Whether Archetype or Sparse Set, packed data wins
  2. Use unmanaged types — struct, no references, blittable data
  3. Think about access patterns — Sequential access is still faster than random

These aren’t new ideas — they’re the same principles that make DOTS fast. They’re valuable today for cache efficiency, and could become even more valuable if unified memory spreads further.

Summary

The gaming landscape varies:

  • Apple Silicon — True unified memory, officially documented
  • Consoles (PS5, Xbox, Switch) — Shared memory pools, details vary
  • PC Gaming — Still separated, likely to remain so for a while

It would be nice to see unified memory become more common. My experience with Metal has shown me how much simpler development can be when you don’t have to think about CPU/GPU transfers. Whether that future arrives more broadly remains to be seen.

In the meantime, designing with contiguous memory layouts is valuable regardless — for cache efficiency today, and potentially for unified memory tomorrow.


References