Adventures with Intel VT-d (IOMMU)
After a couple of weeks of implementing VT-d support in the Muen Kernel, it finally worked today: a device was passed through to a Linux guest while undesired accesses (initiated by the UEFI firmware’s USB driver during hand-over) were blocked and reported.
The last issue, which took a week to debug, was that the IOMMU has a particular idea about how page tables must be laid out.
IOMMUs provide memory management for memory accesses coming from peripheral devices, like USB or ethernet controllers. Without them, these devices can read all of memory, and even write anywhere. Obviously that’s a bad thing if you’re aiming for security.
They’re a standard capability of server chipsets (SPARC, PowerPC and the like) and have spread to the x86 architecture over the last couple of years.
To work, an IOMMU typically gets a map that assigns a page table to each device, much like the page table an operating system keeps for every process to assign it memory and control what it can do with each part of it. The page table is a multi-level data structure that narrows down to specific virtual memory regions (as seen by processes or devices) and maps them to physical memory.
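To make the multi-level lookup concrete, here is a minimal sketch of how an address is split into per-level table indices. It assumes the usual x86-style layout (4 KB pages, 512 eight-byte entries per table, so 9 index bits per level); the names `table_index` and `page_offset` are made up for illustration and are not taken from Muen.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumed 4 KB pages: low 12 bits are the byte offset */
#define LEVEL_BITS 9u   /* assumed 512 entries per table: 9 index bits per level */

/* Index into the table at `level` (0 = leaf table) for the given address. */
static unsigned table_index(uint64_t addr, unsigned level)
{
    assert(level < 4u);  /* this sketch covers at most 4 levels (48 bits) */
    return (unsigned)((addr >> (PAGE_SHIFT + LEVEL_BITS * level)) & 0x1ffu);
}

/* Byte offset inside the final 4 KB page. */
static unsigned page_offset(uint64_t addr)
{
    return (unsigned)(addr & 0xfffu);
}
```

A walk starts at the top-level table, uses `table_index` for that level to pick an entry, follows the entry to the next table, and repeats down to level 0; a 39-bit address space needs three such steps, a 48-bit one needs four.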
Originally, Intel designed their IOMMU’s tables to be compatible with the tables used for processes. Section 9.8 of their specification claims that the “SL-PTE” (an entry in the last level of the page table) has a number of bits that are ignored by the chipset if certain features aren’t supported (as on the Ivy Bridge chipset this code has to run on). But setting these bits results in the chipset blocking the request (and reporting a fault).
So when implementing VT-d, do the fault reporting first (no matter how bare the report is) and make sure that you keep the page tables clean, including the parts that the specification claims are “ignored” or “available” - even if that means that you’ll start building separate page tables early.
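One way to keep entries clean is to build every IOMMU entry from scratch instead of copying CPU page table entries. The sketch below assumes that read and write permission sit in bits 0 and 1 and that the page address occupies bits 51:12, which is how the second-level entry format reads in the VT-d specification; check this against your revision and narrow the address mask to your host address width. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of a second-level entry; verify against the spec. */
#define SLPTE_READ      (1ull << 0)
#define SLPTE_WRITE     (1ull << 1)
#define SLPTE_ADDR_MASK 0x000ffffffffff000ull  /* bits 51:12 */

/* Build an entry that sets only the permission and address bits. */
static uint64_t slpte_make(uint64_t phys, int readable, int writable)
{
    assert((phys & ~SLPTE_ADDR_MASK) == 0);  /* must be 4 KB aligned, in range */
    uint64_t pte = phys;
    if (readable) pte |= SLPTE_READ;
    if (writable) pte |= SLPTE_WRITE;
    return pte;
}

/* Nonzero if no bit outside address and R/W is set, including the
 * bits documented as "ignored" or "available". */
static int slpte_is_clean(uint64_t pte)
{
    return (pte & ~(SLPTE_ADDR_MASK | SLPTE_READ | SLPTE_WRITE)) == 0;
}
```

Running `slpte_is_clean` over the tables before enabling translation is a cheap check: an entry copied from a CPU page table with accessed/dirty or no-execute bits set would be flagged before the hardware gets a chance to reject it.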
Also, don’t assume that the IOMMU supports pages larger than 4 KB or 48-bit addressing - Ivy Bridge supports neither.
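Both properties can be read from the IOMMU’s capability register rather than assumed. The following sketch decodes a 64-bit capability value that has already been read from the unit’s register block. The field positions (supported address widths in bits 12:8, large page support in bits 37:34) are how the VT-d specification appears to lay them out; treat them as assumptions and confirm them for your revision. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions in the 64-bit capability register. */
#define CAP_SAGAW_SHIFT 8u   /* 5-bit field: supported table depths */
#define CAP_SAGAW_WIDTH 5u
#define CAP_SLLPS_SHIFT 34u  /* 4-bit field: supported large page sizes */
#define CAP_SLLPS_WIDTH 4u

#define SAGAW_39BIT (1u << 1)  /* 3-level tables */
#define SAGAW_48BIT (1u << 2)  /* 4-level tables */
#define SLLPS_2MB   (1u << 0)
#define SLLPS_1GB   (1u << 1)

/* Extract `width` bits starting at bit `shift`. */
static unsigned cap_field(uint64_t cap, unsigned shift, unsigned width)
{
    assert(width < 32u && shift + width <= 64u);
    return (unsigned)((cap >> shift) & ((1ull << width) - 1u));
}

/* Deepest supported depth among 3 and 4 levels, or 0 if neither is offered. */
static unsigned iommu_table_levels(uint64_t cap)
{
    unsigned sagaw = cap_field(cap, CAP_SAGAW_SHIFT, CAP_SAGAW_WIDTH);
    if (sagaw & SAGAW_48BIT) return 4;
    if (sagaw & SAGAW_39BIT) return 3;
    return 0;
}

/* Nonzero if 2 MB pages may be used in the device page tables. */
static int iommu_has_2mb_pages(uint64_t cap)
{
    return (cap_field(cap, CAP_SLLPS_SHIFT, CAP_SLLPS_WIDTH) & SLLPS_2MB) != 0;
}
```

On hardware like the Ivy Bridge chipset described above, one would expect `iommu_table_levels` to return 3 and `iommu_has_2mb_pages` to return 0, so the table builder has to fall back to three levels of 4 KB mappings.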