Previously, the SGX-signing Python script hard-coded the SSA frame size
to 1 page. However, the Linux-SGX runtime calculated the SSA frame size
based on the information from CPUID and XFRM. The SSA frame size is
the sum of the XSAVE area size, the GPRs and the MISC region, and on
feature-rich CPUs it may exceed 1 page. Thus, the SSA frame size in the
SIGSTRUCT (during SGX signing) and in the SECS (at runtime) may mismatch
on such CPUs, and EINIT then fails with SGX_INVALID_MEASUREMENT. This
commit simply hard-codes the SSA frame size to an over-approximated
value of 4 pages.
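
The size calculation described above can be sketched roughly as follows.
This is only an illustration, not the runtime's actual code: the GPR and
MISC region sizes are assumed constants, and the XSAVE area sizes other
than the 576-byte legacy minimum are made-up inputs (on real hardware the
size is reported by CPUID for the features enabled in XFRM).

```python
PAGE_SIZE = 4096        # SGX page size in bytes
GPRSGX_SIZE = 184       # assumed size of the GPR region in an SSA frame
MISC_REGION_SIZE = 16   # assumed size of the MISC region


def ssa_frame_pages(xsave_area_size: int) -> int:
    """Return how many pages one SSA frame needs for a given XSAVE area."""
    total = xsave_area_size + GPRSGX_SIZE + MISC_REGION_SIZE
    # Round up to whole pages.
    return (total + PAGE_SIZE - 1) // PAGE_SIZE


if __name__ == "__main__":
    # 576 = legacy x87/SSE area (512) + XSAVE header (64); the other two
    # sizes are hypothetical stand-ins for feature-rich CPUs.
    for xsave_size in (576, 4000, 11008):
        print(xsave_size, ssa_frame_pages(xsave_size))
```

Under these assumptions, any XSAVE area larger than 3896 bytes already
needs more than 1 page, which is the mismatch case described above.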
Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Commit "Introduce one, central manifest, zero-config children and
constant MRENCLAVE" broke GSC. This commit fixes GSC (mainly by adjusting
it to the single-manifest change in that commit). It also includes
significant internal refactoring (no user-visible changes). In addition,
scripts now explicitly use UTF-8 when reading/writing the manifest files
because they are written in TOML, which requires UTF-8.
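
A minimal sketch of what "explicitly use UTF-8" means for a Python
script. The helper names, the file name and the manifest key are
hypothetical and are not taken from the GSC sources. Without the
`encoding` argument, `open()` falls back to a locale-dependent default.

```python
import os
import tempfile


def read_manifest(path: str) -> str:
    # Always decode as UTF-8, regardless of the system locale.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def write_manifest(path: str, text: str) -> None:
    # Always encode as UTF-8, regardless of the system locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def roundtrip_demo() -> bool:
    """Write and re-read a manifest line containing a non-ASCII character."""
    text = 'loader.env.GREETING = "héllo"\n'  # hypothetical manifest entry
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.manifest")
        write_manifest(path, text)
        return read_manifest(path) == text
```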
This is the next part of the great loader rework, with a lot of breaking changes:
- Complete removal of the "trusted children" concept: child processes
can now be spawned arbitrarily and from arbitrary mountpoint types,
without any additional configuration.
- There's a new, required option in the manifest: `libos.entrypoint` - it
specifies the URI to the entry binary in the first process. There's no
need anymore to name the manifest and the first binary identically.
- On SGX, the main binary is not measured in MRENCLAVE anymore - only
PAL, LibOS and the manifest are measured. This is enough to bind
MRENCLAVE to a specific entrypoint user executable if wanted - it
just has to be mounted as a trusted file.
- All Graphene SGX enclaves now have exactly the same MRENCLAVE. This is
a hash of a "Graphene stub", which can "fork" into one of two states
at runtime: initial process or child. The initial process creates a
new "Graphene namespace" with a clean state; it can also be attested
remotely (unlike child processes). The initial process can spawn
child processes by spawning a Graphene stub and directing it to
start in the child mode. The parent then attests the child locally and,
if successful, establishes an encrypted pipe, "connects" the child to
its own namespace and treats it as trusted (including sending it the
protected files key).
- Now, there's only one, central manifest describing the initial state
of a Graphene instance which can be spawned from it (previously, each
process required a separate manifest which could have different
configuration - which wasn't actually supported and didn't make sense
design-wise). One downside of central manifests is that all processes
require the same enclave configuration (e.g. size), but that was
already the case so far because of broken checkpointing code. Also,
this is only a temporary problem, which will cease to exist after the
introduction of EDMM.
- `sgx.static_address` was renamed to `sgx.nonpie_binary` and now has to
be set manually by users (the `sgx_sign` tool doesn't know about the
binaries run inside, which can even be provided or generated at
runtime by the user's workload).
- Caveat: the memory gap for non-PIE executables was removed because it
requires adding a new option to the manifest to be cleanly
implemented. This is left for some future loader rework PR.
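
To illustrate the new manifest requirements from the list above, here is
a small sketch of a check over an already-parsed manifest, represented
as nested Python dicts. The function name, the example URI and the value
type of `sgx.nonpie_binary` are assumptions made for illustration; only
the option names come from this commit.

```python
def check_manifest(manifest: dict) -> list:
    """Return a list of problems found in a parsed manifest (empty if none)."""
    problems = []
    entrypoint = manifest.get("libos", {}).get("entrypoint")
    if not isinstance(entrypoint, str) or not entrypoint:
        problems.append("libos.entrypoint is required and must be a URI")
    return problems


# Hypothetical manifest for a non-PIE executable mounted as a file.
example = {
    "libos": {"entrypoint": "file:helloworld"},
    # Set by the user, since the signing tool cannot know the binary type;
    # the boolean value type is an assumption.
    "sgx": {"nonpie_binary": True},
}

if __name__ == "__main__":
    print(check_manifest(example))
    print(check_manifest({"sgx": {}}))
```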
GCC (and other compilers, e.g. Clang) provide a stack protector
feature to detect stack corruptions. This is achieved by storing
a 64-bit canary value on the stack frame on function entry and
verifying this value on function exit. Previously, Graphene disabled
the stack protector completely. This commit enables it in LibOS and PAL
code (only if the `-mstack-protector` feature is supported by the
compiler).
The stack protector uses a random per-thread canary stored in the
TLS/TCB of each thread. Each PAL implementation must follow the
rule that the TLS/TCB is accessed via the GS register and that the
offset of the canary in the TLS/TCB is 0x8. Since LibOS re-uses the
TLS/TCB of the PAL, there is no need for additional enabling at the
LibOS layer.
Since the `-mstack-protector` feature is architecture-specific, it is
currently enabled only for x86-64 (and the above rules on using gs:[0x8]
to access the canary apply only to x86-64).
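
The canary check itself can be modelled with a toy Python sketch. This
is not how the compiler implements it (the real check is emitted as
machine code that reads the canary from the TLS/TCB); all names here are
hypothetical and the "stack frame" is just a byte array laid out as a
local buffer followed by a copy of the canary.

```python
CANARY_SIZE = 8  # bytes, i.e. a 64-bit canary


class StackSmashed(Exception):
    """Raised when the canary copy in the frame no longer matches."""


def call_with_canary(thread_canary: bytes, buf_size: int, data: bytes) -> bytes:
    # Function entry: place a copy of the per-thread canary after the buffer.
    frame = bytearray(buf_size) + bytearray(thread_canary)
    # Buggy, unchecked copy: it may run past the local buffer.
    n = min(len(data), len(frame))
    frame[:n] = data[:n]
    # Function exit: verify that the canary copy is intact.
    if bytes(frame[buf_size:buf_size + CANARY_SIZE]) != thread_canary:
        raise StackSmashed("stack corruption detected")
    return bytes(frame[:buf_size])
```

In a real setup the canary would be random per thread; a fixed value can
be passed in when experimenting with this model.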
Co-authored-by: Isaku Yamahata <isaku.yamahata@gmail.com>
This is a major refactor of the way manifests are loaded and handled,
which will be followed by a complete rework of the loader code (which
will include e.g. centralized config).
Changes/fixes:
- A huge part of the manifest handling was refactored and untangled.
- Starting without a manifest is now disallowed. This was actually
accidentally broken for some time and no one complained. It also makes
little sense in practice and in Graphene's overall design, e.g. it
conflicts with protected argv.
- Graphene can now be started only by passing the executable, not the
manifest (the magic resolution logic was removed).
- Now manifests are sent over pipes between parent and children, instead
of children finding and loading them on their own. This is a
preparation for the upcoming centralized manifests change.
- Previously manifests were parsed 2 times on Linux and 3 times on
Linux-SGX (by untrusted PAL, trusted PAL and LibOS). This is now
fixed.
- The common `pal_main()` now requires that the backend-specific PAL
loader loads the manifest before calling it. SGX code already has to
do it (for proper initialization), so let's unify this interface for
all PALs.
- Fix for a PAL crash when the manifest size was divisible by the page
size (sic!). The NULL terminator was missing, but most of the time the
padding to page size saved Graphene from crashing.
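
The NULL-termination bug in the last item can be modelled as follows.
This is a sketch under the assumption that the manifest was copied into
a zero-filled buffer rounded up to the page size; the actual PAL code is
C and the function names here are hypothetical.

```python
PAGE_SIZE = 4096


def align_up(n: int, alignment: int = PAGE_SIZE) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def load_without_terminator(manifest: bytes) -> bytearray:
    # Old behaviour in this model: the terminator exists only by accident,
    # as part of the zero padding up to the page boundary.
    buf = bytearray(align_up(len(manifest)))
    buf[:len(manifest)] = manifest
    return buf


def load_with_terminator(manifest: bytes) -> bytearray:
    # Fixed behaviour in this model: reserve one extra byte for the
    # terminator before rounding up.
    buf = bytearray(align_up(len(manifest) + 1))
    buf[:len(manifest)] = manifest
    return buf


def is_nul_terminated(buf: bytearray, size: int) -> bool:
    """Check that a zero byte follows the first `size` bytes of the buffer."""
    return size < len(buf) and buf[size] == 0
```

With a manifest of exactly 4096 bytes, the first variant leaves no room
for padding, so nothing terminates the string inside the buffer.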
This is to choose which PALs are to be built and installed. Currently
the build happens outside of meson, so this only affects installing and
not building, but if one PAL was not built, then meson would fail because
it would not find the dependency.