Running Microsoft FXC in Docker

Microsoft DXC is the new shader compiler stack, but the FXC compiler is still the dominant HLSL compiler for a number of reasons:

  • Performance and correctness regressions of DXIL shaders compared to DXBC
  • Many cross compilers and custom toolchains still rely on DXBC
  • IHV drivers are still being adapted to consume DXIL, which is lower level than DXBC
  • DXC is a complex codebase because it is based on LLVM: it is difficult to build and has many components
  • DXIL is Direct3D 12 only, which makes it Windows 10 only

Therefore, it is still important to support shader compilation with FXC in some situations.

The performance and correctness regressions are a point of ongoing effort, but this is less of a problem today than it was 6 months ago - at least in my opinion, based on my own shaders and tests. In fact, most reported issues are fixed within a couple of days (an example). The opposite also holds: some shaders hit massive performance or compile-time cliffs when compiled with FXC rather than DXC, especially when arrays are involved.

Halcyon (SEED’s R&D engine) currently has a mixture of FXC and DXC compiled shaders when running under Direct3D 12, whereas the Vulkan path exclusively uses shaders compiled with DXC.

Given this scene (screenshot omitted), let's compare the per-pass performance:

Name                 Direct3D 12   Vulkan      Using DXBC
-------------------  ------------  ----------  ----------
Depth Clear          0.003 ms      0.003 ms    No
GBuffer Meshes       3.126 ms      3.387 ms    No
Velocity Vector      0.035 ms      0.033 ms    No
GBuffer Sky          0.046 ms      0.048 ms    No
Reproject Meta       0.091 ms      0.089 ms    No
Temporal Reproject   0.163 ms      0.158 ms    No
DiffuseSh            0.012 ms      0.011 ms    No
Shadow Pass          1.084 ms      1.086 ms    No
Shadow Pass          1.091 ms      1.114 ms    No
Shadow Pass          1.080 ms      1.100 ms    No
Depth Pyramid        0.041 ms      0.032 ms    No
GTAO Pass            0.284 ms      0.181 ms    Yes
GTAO Bilateral       0.084 ms      0.083 ms    No
GTAO Bilateral       0.085 ms      0.086 ms    No
GTAO Temporal        0.099 ms      0.173 ms    Yes
Lighting             1.081 ms      3.048 ms    No
SSR Trace            0.595 ms      0.604 ms    No
IBL Reflection       0.021 ms      0.031 ms    Yes
Reflection Filter    0.831 ms      1.239 ms    Yes
Reflection Filter    0.486 ms      0.478 ms    Yes
Reflection Merge     0.049 ms      0.065 ms    No
Temporal AA          0.269 ms      0.219 ms    No
Velocity Reduce      0.019 ms      0.029 ms    No
Velocity Reduce      0.004 ms      0.004 ms    No
Velocity Dilate      0.011 ms      0.004 ms    No
Motion Blur          0.111 ms      0.113 ms    No
Bloom Extract        0.013 ms      0.045 ms    No
Bloom Downsample     0.004 ms      0.008 ms    No
Bloom Blur           0.004 ms      0.004 ms    No
Exposure Adaption    0.004 ms      0.003 ms    No
Bloom Upsample       0.005 ms      0.005 ms    No
Bloom Upsample       0.006 ms      0.004 ms    No
Bloom Upsample       0.009 ms      0.008 ms    No
Bloom Upsample       0.022 ms      0.023 ms    No
Bloom Apply          0.039 ms      0.255 ms    No
Final Output         0.041 ms      0.092 ms    Yes
Present              0.018 ms      0.017 ms    No
-------------------  ------------  ----------  ----------
Totals               11.173 ms     13.928 ms   5 / 37

A bit hand-wavy, but if we assume that DXIL and SPIR-V are translated by backend compilers into comparable IL, then we can draw some conclusions about these performance metrics.

In cases where DXBC is used but the Direct3D 12 performance is worse than Vulkan, this typically indicates a case where DXIL is likely faster than DXBC, but correctness prevents us from using it.

In cases where DXBC is used and the Direct3D 12 performance is better than Vulkan, this typically indicates a case where DXIL is slower than DXBC, indicating a performance regression.
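The two rules of thumb above can be applied mechanically to the table. The following is only a rough sketch: the pipe-separated input format and the function name `classify_dxbc_passes` are assumptions made for this example, and the verdict strings simply paraphrase the heuristics. It looks only at passes that use DXBC under Direct3D 12 and compares their timing to the DXC-compiled Vulkan equivalent.

```shell
#!/bin/sh
# Sketch: apply the two DXBC heuristics to rows of the timing table.
# Assumed input format (hypothetical): name|d3d12_ms|vulkan_ms|uses_dxbc
classify_dxbc_passes() {
  awk -F'|' '
    $4 == "Yes" {
      d3d12 = $2 + 0    # Direct3D 12 time (shader compiled to DXBC with FXC)
      vulkan = $3 + 0   # Vulkan time (shader compiled with DXC)
      if (d3d12 > vulkan) {
        verdict = "DXIL likely faster, blocked by correctness"
      } else if (d3d12 < vulkan) {
        verdict = "DXIL likely slower, performance regression"
      } else {
        verdict = "no measurable difference"
      }
      printf "%s: %s\n", $1, verdict
    }'
}

# A few rows copied from the table; passes not using DXBC are skipped.
classify_dxbc_passes <<'EOF'
GTAO Pass|0.284|0.181|Yes
GTAO Temporal|0.099|0.173|Yes
Lighting|1.081|3.048|No
Reflection Filter|0.831|1.239|Yes
EOF
```

As with the heuristics themselves, this rests on the assumption that DXIL and SPIR-V end up as comparable IL in the driver, so the verdicts are hints for where to look rather than conclusions.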

The most interesting case is the Lighting pass which uses DXIL, and Vulkan is ~3x more expensive. In the DXC stack, HLSL to SPIR-V uses the same AST as HLSL to DXIL, indicating this performance cliff exists in the translation from AST to SPIR-V.

NOTE: A fun data point is that the Lighting pass takes ~200ms to compile to SPIR-V, and ~10s to compile to DXIL - surely we can fix the compile time and performance cliffs in this instance? ;)

The performance issue with the Reflection passes is largely related to pow(x, 2) differences; FXC emits x * x whereas DXC emits exp2(log2(x) * 2). It’s of course easy to solve this app-side, but it’s important to track and fix these issues in the compiler itself (i.e. supporting power expansion up to 16). Aside from performance, there are numerical differences which cause corruption when DXIL is used for these passes instead of DXBC.

In general, DXIL is used for nearly all passes, and with good performance and compile times.

One of the components in the DXC compiler stack is dxbc2dxil, which could possibly help with transitioning existing DXBC toolchains over to DXIL. Source

HLSL   Other shading langs  DSL          DXBC IL
+      +                    +            +
|      |                    |            |
v      v                    v            v
Clang  Clang                Other Tools  dxbc2dxil
+      +                    +            +
|      |                    |            |
v      v                    v            |
+------+--------------------+---------+  |
|          High level IR (DXIR)       |  |
+-------------------------------------+  |
                  |                      |
                  |                      |
                  v                      |
              Optimizer <-----+ Linker   |
              +      ^             +     |
              |      |             |     |
              |      |             |     |
 +------------v------+-------------v-----v-------+
 |              Low level IR (DXIL)              |
 +------------+----------------------+-----------+
              |                      |
              v                      v
      Driver Compiler             Verifier

Regarding IHV driver stability, I definitely don't envy the hard work the driver engineers have had to do in order to support DXIL. Previously, they just needed to support the higher-level DXBC specification, which gave them a lot more freedom to map these concepts to their internal IL, whereas DXIL is a lot lower level and more explicit around flow control, intrinsics, and overall behavior.

This is definitely a controversial topic, but I personally feel that an open source compiler stack, proper support for features like wave intrinsics, and an actual specification are significant advantages overall. As one example, the open source nature of DXC has allowed Google to collaborate with Microsoft and add HLSL to SPIR-V support to the same codebase, making it less problematic to develop or maintain a complex engine that runs on Vulkan and Direct3D 12, using only HLSL as a source language.

Following my previous posts regarding shader compilation on Linux and scaling out in Kubernetes, I looked into running FXC in Docker. One major problem with FXC is that it is only available as a closed-source Windows binary, which eliminates any ability to cross-compile it for Linux.

Without any source, the only alternative was to give Wine a shot, and Wine turns out to have no problem running fxc.exe correctly.

Repository

FROM ubuntu:18.04
ARG DEBIAN_FRONTEND="noninteractive"
RUN dpkg --add-architecture i386 \
  && apt-get update \
  && apt-get install -y \
    software-properties-common \
    winbind \
    cabextract \
    p7zip \
    unzip \
    wget \
    curl \
    zenity \
  && wget -O- https://dl.winehq.org/wine-builds/Release.key | apt-key add - \
  && apt-add-repository https://dl.winehq.org/wine-builds/ubuntu/ \
  && apt-get update \
  && apt-get install -y --install-recommends winehq-stable \
  && mkdir -p /home/wine/.cache/wine \
  && wget https://dl.winehq.org/wine/wine-mono/4.7.3/wine-mono-4.7.3.msi \
    -O /home/wine/.cache/wine/wine-mono-4.7.3.msi \
  && wget https://dl.winehq.org/wine/wine-gecko/2.47/wine_gecko-2.47-x86.msi \
    -O /home/wine/.cache/wine/wine_gecko-2.47-x86.msi \
  && wget https://dl.winehq.org/wine/wine-gecko/2.47/wine_gecko-2.47-x86_64.msi \
    -O /home/wine/.cache/wine/wine_gecko-2.47-x86_64.msi \
  && wget https://raw.githubusercontent.com/Winetricks/winetricks/master/src/winetricks \
    -O /usr/bin/winetricks \
  && chmod +rx /usr/bin/winetricks \
  && mkdir -p /home/wine/.cache/winetricks/win7sp1 \
  && wget https://download.microsoft.com/download/0/A/F/0AFB5316-3062-494A-AB78-7FB0D4461357/windows6.1-KB976932-X86.exe \
    -O /home/wine/.cache/winetricks/win7sp1/windows6.1-KB976932-X86.exe \
  && groupadd -g 1010 wine \
  && useradd -s /bin/bash -u 1010 -g 1010 wine \
  && chown -R wine:wine /home/wine \
  && apt-get autoremove -y \
    software-properties-common \
  && apt-get autoclean \
  && apt-get clean \
  && apt-get autoremove -y

VOLUME /home/wine
ENV WINEARCH=win64
ENV WINEDEBUG=fixme-all
RUN winecfg

WORKDIR /fxc
COPY d3dcompiler_47.dll .
COPY fxc.exe .

ENTRYPOINT ["wine", "fxc"]

The above Dockerfile has been published to Docker Hub as gwihlidal/fxc.
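To build the image locally instead, the two COPY lines mean that fxc.exe and d3dcompiler_47.dll must already sit next to the Dockerfile, since the Dockerfile does not download them. A minimal sketch of a pre-build check follows; the function name `check_fxc_build_context` and the `local/fxc` tag in the usage line are assumptions for this example.

```shell
#!/bin/sh
# Sketch: verify the Docker build context before building the image locally.
# The file names come from the COPY lines in the Dockerfile; the function
# name is hypothetical.
check_fxc_build_context() {
  dir=${1:-.}
  status=0
  for f in Dockerfile fxc.exe d3dcompiler_47.dll; do
    if [ ! -f "$dir/$f" ]; then
      # Report every missing file instead of stopping at the first one.
      echo "missing from build context: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

With that in place, a local build could look like `check_fxc_build_context . && docker build -t local/fxc .`, which only starts the (fairly slow) Wine installation once the context is complete.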

The published image can be invoked with:

$ docker run --rm gwihlidal/fxc /help

The host machine file system can also be bind mounted into the container so that fxc can be used like a regular command line application on any machine:

$ docker run --rm -v "$(pwd)":"$(pwd)" -w "$(pwd)" gwihlidal/fxc /T <target> /E <entry-point-name> <input-hlsl-file>
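To avoid retyping the mount flags, the invocation can be wrapped in a small shell function. This is a sketch rather than part of the published image: `FXC_DOCKER` and `FXC_IMAGE` are hypothetical override variables introduced here, mainly so the wrapper can be pointed at another container runtime or a locally built image.

```shell
#!/bin/sh
# Sketch: make the containerized compiler callable as a plain `fxc` command.
# FXC_DOCKER and FXC_IMAGE are hypothetical overrides, not image features.
fxc() {
  # Mount the current directory at the same path inside the container so
  # that relative input and output paths resolve identically on both sides.
  "${FXC_DOCKER:-docker}" run --rm \
    -v "$PWD:$PWD" -w "$PWD" \
    "${FXC_IMAGE:-gwihlidal/fxc}" "$@"
}
```

After sourcing this in a shell profile, `fxc /T ps_5_1 /E main simple.hlsl` behaves like the longer command. Because only the current directory is mounted, input and include files need to live at or below it.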

Example output (DXBC):

$ docker run --rm -v "$(pwd)":"$(pwd)" -w "$(pwd)" gwihlidal/fxc /T ps_5_1 /E main simple.hlsl

Microsoft (R) Direct3D Shader Compiler 10.1
Copyright (C) 2013 Microsoft. All rights reserved.

//
// Generated by Microsoft (R) HLSL Shader Compiler 10.1
//
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// no Input
//
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_TARGET                0   xyzw        0   TARGET   float   xyzw
//
ps_5_1
dcl_globalFlags refactoringAllowed
dcl_output o0.xyzw
mov o0.xyzw, l(0,1.000000,0,1.000000)
ret
// Approximately 2 instruction slots used