Building x265 with NEON Support in macOS with Brew

28 February 2023 · 4 mins

This issue has now been fixed, but the fix has not made it into a release yet. All you have to do now is run brew install x265 --HEAD.

Original Post

With Brew, the current release of x265 doesn’t support NEON, so encodes on Apple Silicon are much slower than expected. If you try to build from HEAD, you’ll run into this issue. Someone has posted a patch to fix this, but it hasn’t even made it into a PR yet, so we have to do some manual work to get it installed.

Modifying the Brew Formula

Luckily for us, Brew has a system in place for patching formulas. To start editing the formula, run brew edit x265. Here’s the patch which details what we’ll be changing:

--- a/x265.rb      2023-02-28 16:35:34
+++ b/x265.rb   2023-02-28 15:34:27
@@ -25,6 +25,8 @@
     depends_on "nasm" => :build
   end
 
+  patch :DATA
+
   def install
     # Build based off the script at ./build/linux/multilib.sh
     args = std_cmake_args + %W[
@@ -79,3 +81,34 @@
     assert_equal header.unpack("m"), [x265_path.read(10)]
   end
 end
+
+__END__
+diff --git a/source/common/aarch64/asm.S b/source/common/aarch64/asm.S
+index 399c37cf2..b81fb254e 100644
+--- a/source/common/aarch64/asm.S
++++ b/source/common/aarch64/asm.S
+@@ -28,6 +28,11 @@
+ #define PFX2(prefix, name) PFX3(prefix, name)
+ #define PFX(name)          PFX2(X265_NS, name)
+
++
++#define UPFX3(prefix, name) _ ## prefix ## _ ## name
++#define UPFX2(prefix, name) UPFX3(prefix, name)
++#define UPFX(name)          UPFX2(X265_NS, name)
++
+ #ifdef __APPLE__
+ #define PREFIX 1
+ #endif
+diff --git a/source/common/aarch64/pixel-util.S b/source/common/aarch64/pixel-util.S
+index fba9a90d5..49c6b0492 100644
+--- a/source/common/aarch64/pixel-util.S
++++ b/source/common/aarch64/pixel-util.S
+@@ -2407,7 +2407,7 @@ function PFX(costCoeffNxN_neon)
+     // x5 - scanFlagMask
+     // x6 - baseCtx
+     mov             x0, #0
+-    movrel          x1, x265_entropyStateBits
++    movrel          x1, UPFX(entropyStateBits)
+     mov             x4, #0
+     mov             x11, #0
+     movi            v31.16b, #0

First, we add patch :DATA just before the install method. This tells Brew to apply the patch embedded in the formula file itself, after the __END__ marker. We then add the __END__ line at the bottom of the file and paste the patch from the issue below it. For reference, here is my entire formula, since that wasn’t the best explanation:

class X265 < Formula
  desc "H.265/HEVC encoder"
  homepage "https://bitbucket.org/multicoreware/x265_git"
  url "https://bitbucket.org/multicoreware/x265_git/get/3.5.tar.gz"
  sha256 "5ca3403c08de4716719575ec56c686b1eb55b078c0fe50a064dcf1ac20af1618"
  license "GPL-2.0-only"
  head "https://bitbucket.org/multicoreware/x265_git.git", branch: "master"

  bottle do
    rebuild 1
    sha256 cellar: :any,                 arm64_ventura:  "fc0bf01af954762a85e8b808d5b03d28b9e36e8e71035783e39bb9dc0307abea"
    sha256 cellar: :any,                 arm64_monterey: "e60559191a9aba607e512ad33ac9f66688b12837df7e6a3cf57ceae26968235b"
    sha256 cellar: :any,                 arm64_big_sur:  "adc617eed2e065af669994fb5b538195fd46db4ac7b13c7ca2490dc8abaf6466"
    sha256 cellar: :any,                 ventura:        "42bac1c3760905fc0f6c8ee2af2b97c5ef371d6135f6822357afe91f4014a2dd"
    sha256 cellar: :any,                 monterey:       "be446f5c7cb4872205f260b8821fc7ebd5bd7c4b8837888c98c08e051dff2e3f"
    sha256 cellar: :any,                 big_sur:        "55bb46a5dc1924e59b7fa7bc800a21c0cf21355e48cb38b941d8e786427c70a0"
    sha256 cellar: :any,                 catalina:       "5e5bc106e1cf971a176dd5b37a61d28769e353f81102c011b4230cc8732eca7a"
    sha256 cellar: :any,                 mojave:         "c61ebdf9dcd4aedf5da2a7eb2b3a5154fd355c105a19a0471d43a3aa67f3cb88"
    sha256 cellar: :any_skip_relocation, x86_64_linux:   "c80f18988caea25e95ca87dd648f5ff8b0856e24d26adc8d68ca68cc6d4faabf"
  end

  depends_on "cmake" => :build

  on_intel do
    depends_on "nasm" => :build
  end

  patch :DATA

  def install
    # Build based off the script at ./build/linux/multilib.sh
    args = std_cmake_args + %W[
      -DLINKED_10BIT=ON
      -DLINKED_12BIT=ON
      -DEXTRA_LINK_FLAGS=-L.
      -DEXTRA_LIB=x265_main10.a;x265_main12.a
      -DCMAKE_INSTALL_RPATH=#{rpath}
    ]
    high_bit_depth_args = std_cmake_args + %w[
      -DHIGH_BIT_DEPTH=ON -DEXPORT_C_API=OFF
      -DENABLE_SHARED=OFF -DENABLE_CLI=OFF
    ]
    (buildpath/"8bit").mkpath

    mkdir "10bit" do
      system "cmake", buildpath/"source", "-DENABLE_HDR10_PLUS=ON", *high_bit_depth_args
      system "make"
      mv "libx265.a", buildpath/"8bit/libx265_main10.a"
    end

    mkdir "12bit" do
      system "cmake", buildpath/"source", "-DMAIN12=ON", *high_bit_depth_args
      system "make"
      mv "libx265.a", buildpath/"8bit/libx265_main12.a"
    end

    cd "8bit" do
      system "cmake", buildpath/"source", *args
      system "make"
      mv "libx265.a", "libx265_main.a"

      if OS.mac?
        system "libtool", "-static", "-o", "libx265.a", "libx265_main.a",
                          "libx265_main10.a", "libx265_main12.a"
      else
        system "ar", "cr", "libx265.a", "libx265_main.a", "libx265_main10.a",
                           "libx265_main12.a"
        system "ranlib", "libx265.a"
      end

      system "make", "install"
    end
  end

  test do
    yuv_path = testpath/"raw.yuv"
    x265_path = testpath/"x265.265"
    yuv_path.binwrite "\xCO\xFF\xEE" * 3200
    system bin/"x265", "--input-res", "80x80", "--fps", "1", yuv_path, x265_path
    header = "AAAAAUABDAH//w=="
    assert_equal header.unpack("m"), [x265_path.read(10)]
  end
end

__END__
diff --git a/source/common/aarch64/asm.S b/source/common/aarch64/asm.S
index 399c37cf2..b81fb254e 100644
--- a/source/common/aarch64/asm.S
+++ b/source/common/aarch64/asm.S
@@ -28,6 +28,11 @@
 #define PFX2(prefix, name) PFX3(prefix, name)
 #define PFX(name)          PFX2(X265_NS, name)

+
+#define UPFX3(prefix, name) _ ## prefix ## _ ## name
+#define UPFX2(prefix, name) UPFX3(prefix, name)
+#define UPFX(name)          UPFX2(X265_NS, name)
+
 #ifdef __APPLE__
 #define PREFIX 1
 #endif
diff --git a/source/common/aarch64/pixel-util.S b/source/common/aarch64/pixel-util.S
index fba9a90d5..49c6b0492 100644
--- a/source/common/aarch64/pixel-util.S
+++ b/source/common/aarch64/pixel-util.S
@@ -2407,7 +2407,7 @@ function PFX(costCoeffNxN_neon)
     // x5 - scanFlagMask
     // x6 - baseCtx
     mov             x0, #0
-    movrel          x1, x265_entropyStateBits
+    movrel          x1, UPFX(entropyStateBits)
     mov             x4, #0
     mov             x11, #0
     movi            v31.16b, #0
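
To make the patch :DATA mechanism more concrete, here is a minimal Ruby sketch of the underlying idea: everything after a line consisting solely of __END__ is treated as the patch body rather than as Ruby code. The helper name embedded_patch and the shortened formula text are assumptions made for illustration; this is not Homebrew's actual implementation.

```ruby
# Simplified illustration of the idea behind `patch :DATA`: the text after a
# line consisting solely of __END__ is the patch, not Ruby code.
# `embedded_patch` is a hypothetical helper, not a Homebrew method.
def embedded_patch(formula_source)
  _ruby_code, marker, patch = formula_source.partition(/^__END__\n/)
  marker.empty? ? nil : patch
end

# A shortened stand-in for the edited formula file (illustrative only).
formula_source = <<~FORMULA
  class X265 < Formula
    patch :DATA
  end

  __END__
  diff --git a/source/common/aarch64/asm.S b/source/common/aarch64/asm.S
FORMULA

# Prints only the part that would be handed to the patch step.
puts embedded_patch(formula_source)
```

If the formula has no __END__ line, the helper returns nil, which mirrors the fact that patch :DATA only makes sense when a patch is actually embedded in the file.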

Building the Brew Formula

Now that we’ve edited the formula, we can build it. First, unlink your current installation of x265 with brew unlink x265. Then, run HOMEBREW_NO_INSTALL_FROM_API=1 brew install x265 -vd --HEAD:

  • HOMEBREW_NO_INSTALL_FROM_API=1 ensures that Brew doesn’t fetch the formula from its API, and instead uses our edited local copy.
  • -vd makes Brew show verbose and debug output while building, which can be helpful when it breaks.
  • --HEAD makes Brew install from master instead of the (very old) release.

If everything goes to plan, x265 should build, along with a few other things such as ffmpeg, whose existing linking against x265 is no longer valid.
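
The two build steps above can also be replayed from a small script. The following is a sketch only: rebuild_x265 and its runner parameter are hypothetical names, and the example call is a dry run that prints the commands rather than executing them.

```ruby
# Hypothetical helper that replays the unlink and install steps in order.
# `runner` receives (env, argv) and must return true on success; by default
# it shells out with Kernel#system. Not part of Homebrew.
def rebuild_x265(runner: ->(env, argv) { system(env, *argv) })
  # Setting the variable for both steps is harmless; only the install needs it.
  env = { "HOMEBREW_NO_INSTALL_FROM_API" => "1" }
  steps = [
    %w[brew unlink x265],
    %w[brew install x265 -vd --HEAD],
  ]
  steps.each do |argv|
    raise "step failed: #{argv.join(' ')}" unless runner.call(env, argv)
  end
  true
end

# Dry run: print each command with its environment instead of running it.
dry_run = lambda do |env, argv|
  prefix = env.map { |key, value| "#{key}=#{value}" }.join(" ")
  puts "#{prefix} #{argv.join(' ')}"
  true
end

rebuild_x265(runner: dry_run)
```

Calling rebuild_x265 with no arguments would run the real commands, stopping at the first step that fails.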

Results

So, is doing this worth it? Yes! I observed a roughly 2x increase in H.265 encode speed after doing this. When encoding Dolby Vision 1080p ProRes, I got 7.01 fps without NEON, and 14.80 fps with it. The specific command I used was:

ffmpeg -i 1080p.MOV -c:v libx265 -c:a copy -x265-params "repeat-headers=1:hrd=cbr:colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1):max-cll=1000,400" -tag:v hvc1 neon-1080p.mp4
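
As a quick check of the numbers above, the speedup and the corresponding reduction in encode time can be worked out from the two frame rates. The method names below are illustrative, and the time-saved figure assumes both runs encode the same number of frames.

```ruby
# Ratio of the two frame rates: how many times faster the NEON build is.
def speedup(baseline_fps, improved_fps)
  improved_fps / baseline_fps
end

# Percentage of encode time saved, assuming the same frame count in both runs.
def time_saved_percent(baseline_fps, improved_fps)
  (1.0 - baseline_fps / improved_fps) * 100.0
end

puts format("%.2fx faster", speedup(7.01, 14.80))                        # prints "2.11x faster"
puts format("%.1f%% less encode time", time_saved_percent(7.01, 14.80))  # prints "52.6% less encode time"
```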