Background: Debian on the ThinkPad X13s
I have a ThinkPad X13s Gen 1 running Debian unstable. It’s not perfect, but it’s a pretty nice lightweight, passively-cooled, ARM-based laptop with good performance, good battery life, and out-of-the-box support available in Debian. The official Debian on Thinkpad X13s wiki page does an excellent job of describing how to get Debian installed and running on it.
I’ve had it for over a year, and have been pretty happy with it overall, but I have run into a few niggling problems that I’ve had to fix. This post is one of those stories.
Trouble with the Linux 6.11 -> 6.12 Kernel Upgrade
Author’s Note: You can skip straight to the next section if you’d rather just have the solution. This section just describes how I found the problem.
Noticing the Problem
The most recent problem I ran into with my ThinkPad X13s was when I upgraded from the last Linux 6.11 kernel in Debian unstable (linux-image-6.11.9-arm64) to the first Linux 6.12 kernel (linux-image-6.12.3-arm64).
Everything seemed to work after the upgrade… at first. The system booted, I was able to log in, open Firefox, and get back to work in Kitty. However, I noticed it lagging almost imperceptibly when I switched workspaces in Plasma, loaded a new page in Firefox, or even sometimes when I was just typing in Neovim. Also, after using it for a little while, I noticed another problem: I could typically get a full day of battery life out of it, but my battery was draining far more quickly than normal. It was almost dead after a mere 5 hours!
Initial Investigation
I opened htop, and that’s when I noticed much higher than normal CPU usage. All 8 CPU cores were ~8% loaded. The culprit seemed to be llvmpipe, which was a red flag. LLVMpipe is a CPU-based fallback path for graphics acceleration in Mesa. It is typically used when:
- the GPU drivers aren’t loaded,
- the GPU isn’t supported,
- the firmware isn’t available (as I’d come to find out was the case later),
- or there’s something else wrong with the GPU or software stack supporting it.
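A quick way to confirm that Mesa has fallen back to software rendering is to look at the OpenGL renderer string. The following is a minimal sketch, not something from my original debugging session: the helper name is_software_renderer is hypothetical, and the usage comment assumes that glxinfo (from the mesa-utils package) is installed.

```sh
#!/bin/sh
# Return success (0) if the given renderer description names one of Mesa's
# CPU-based software rasterizers (llvmpipe or softpipe), failure (1) otherwise.
# NOTE: is_software_renderer is a hypothetical helper name for illustration.
is_software_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (assumes glxinfo from mesa-utils is installed):
#   renderer=$(glxinfo -B | grep 'OpenGL renderer string')
#   if is_software_renderer "$renderer"; then
#       echo "Software rendering in use: $renderer"
#   fi
```

If the renderer string mentions llvmpipe, the GPU is not being used for rendering, whatever the underlying cause turns out to be.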
I knew that something was wrong - probably something to do with the GPU, given the presence of llvmpipe - but I didn’t know exactly what. I had just done software updates before I shut down the system the last time, so I decided to run tail /var/log/apt/history.log to see what packages had been installed or updated. Maybe one would stand out as suspect, if I was lucky. That is how I learned that the Linux kernel had been updated, as had a few dozen other packages. That isn’t unusual given how fast Debian unstable moves, with multiple important package updates a day being the norm (except during freezes). Since the GPU driver lives in the kernel, and none of the other package updates looked like they might be graphics-related, I decided to guess, and try rebooting into the previous kernel. Conveniently, Debian keeps the last two kernels installed by default (before sudo apt autoremove will start removing old ones), so doing that was as easy as rebooting and selecting the Linux 6.11 kernel that I had been running the day before from the GRUB boot menu.
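With a few dozen packages in a single transaction, the kernel entries are easy to miss when reading the log by eye. Here is a rough sketch of a filter that pulls only the linux-image packages out of the Install and Upgrade lines of an APT history log. The function name kernel_changes is hypothetical, and it assumes the usual "package:arch (old, new)" layout of those lines.

```sh
#!/bin/sh
# Read an APT history log on stdin and print every linux-image package found
# on an "Install:" or "Upgrade:" line, one per line, with its version details.
# NOTE: kernel_changes is a hypothetical helper name for illustration.
kernel_changes() {
    awk '
        /^(Install|Upgrade):/ {
            sub(/^[A-Za-z]+: /, "")
            n = split($0, pkgs, /\), /)
            for (i = 1; i <= n; i++) {
                pkg = pkgs[i]
                # split() consumed the closing parenthesis of all but the last entry.
                if (pkg !~ /\)$/) pkg = pkg ")"
                if (pkg ~ /^linux-image-/) print pkg
            }
        }
    '
}

# Example usage:
#   kernel_changes < /var/log/apt/history.log
```

To see which kernel images are currently installed (and therefore available in the boot menu), dpkg --list 'linux-image-*' does the job.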
It worked! Once my laptop rebooted, I no longer had any of the lag I had started noticing before, and htop confirmed that my CPU usage was very low, and the llvmpipe process was not running. Just to confirm that was it, I rebooted back into the newer Linux 6.12 kernel, and the lag was back, and llvmpipe was in use again.
Ignoring the Problem
At that point, I decided that I was done for the time being. I just wanted to use my laptop. So I set Linux 6.11 as the default kernel in my GRUB configuration, booted back into it, and kept using it. I hoped that maybe the next kernel update would resolve the problem.
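For anyone wanting to do the same, one way to pin an older kernel is to set GRUB_DEFAULT in /etc/default/grub to the full menu path of the entry, with the submenu title and entry title joined by ">", and then run sudo update-grub. The exact titles vary between systems, so the sketch below lists them from the generated configuration. The function name grub_entry_titles is hypothetical, and the example GRUB_DEFAULT value is an assumption about what the titles look like on a Debian install, not something copied from my own configuration.

```sh
#!/bin/sh
# Read a generated grub.cfg on stdin and print the title of every menuentry
# and submenu, in order of appearance.
# NOTE: grub_entry_titles is a hypothetical helper name for illustration.
grub_entry_titles() {
    sed -n -E "s/^[[:space:]]*(menuentry|submenu) '([^']*)'.*/\2/p"
}

# Example usage:
#   sudo cat /boot/grub/grub.cfg | grub_entry_titles
#
# Then, in /etc/default/grub (titles below are assumed, check your own output):
#   GRUB_DEFAULT="Advanced options for Debian GNU/Linux>Debian GNU/Linux, with Linux 6.11.9-arm64"
# followed by:
#   sudo update-grub
```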
But it didn’t. Each time I upgraded to a new Linux 6.12.x point release, and booted into it with high hopes, everything would be laggy and llvmpipe would be running.
I made a few more half-hearted attempts to find the source of the problem, but nothing came up in my feeble attempts to Google it.
Thorough Investigation
Today I decided to get to the bottom of it once and for all. I was tired of switching back to an older kernel each time there was a kernel update. Surely if something was really wrong in the kernel someone would have reported it by now, right? It turns out that I was half-right. Someone would have reported it, but no one had because it wasn’t a bug in the kernel after all.
I started by booting into the linux-image-6.12.33+deb13-arm64 kernel image in Debian, and making sure that I could still reproduce the problem. Sure enough, llvmpipe was running. I then took a look at the kernel boot logs with sudo dmesg, and I quickly noticed the following message being printed repeatedly.
msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -22
That was certainly odd, and it lined up very well with my previous assumption that llvmpipe was running because something was going wrong with the GPU. Armed with this very specific error message, I tried Googling it, hopeful that I would find a solution this time - but to no avail.
The obvious thing to do at that point was to look further up in the kernel boot log to see if there were any other error messages that stood out. I was particularly looking for when the kernel first tried to initialize the GPU. Unfortunately the dmesg output cut off too early. I then tried looking at the boot logs with journalctl, which includes more than just the kernel logs, in hopes that it would have something useful, possibly from another service run at boot. It turns out that it did - it had more of the kernel boot log, including the place where the GPU was initialized! There was still a lot of irrelevant information that made it hard to find what I was looking for, so I narrowed it down like this: sudo journalctl -b 0 | grep -C5 -e adreno -e msm_dpu.
The following error messages printed by the msm_dpu and adreno kernel drivers during GPU initialization really stood out:
msm_dpu ae01000.display-controller: bound 3d00000.gpu (ops a3xx_ops [msm])
Console: switching to colour dummy device 80x25
[drm:dpu_kms_hw_init:1108] dpu hardware revision:0x80000000
[drm] Initialized msm 1.12.0 for ae01000.display-controller on minor 0
msm_dpu ae01000.display-controller: [drm:adreno_request_fw [msm]] loaded qcom/a660_sqe.fw from new location
msm_dpu ae01000.display-controller: [drm:adreno_request_fw [msm]] loaded qcom/a660_gmu.bin from new location
msm_dpu ae01000.display-controller: firmware: failed to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn (-2)
msm_dpu ae01000.display-controller: firmware: failed to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn (-2)
msm_dpu ae01000.display-controller: firmware: failed to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn (-2)
adreno 3d00000.gpu: [drm:adreno_zap_shader_load [msm]] *ERROR* Unable to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn
msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* gpu hw init failed: -2
platform 3d6a000.gmu: [drm:a6xx_gmu_set_oob [msm]] *ERROR* Timeout waiting for GMU OOB set GPU_SET: 0x0
Console: switching to colour frame buffer device 240x75
fb0: Framebuffer is not in virtual address space.
msm_dpu ae01000.display-controller: [drm] fb0: msmdrmfb frame buffer device
That’s odd. Why was the firmware failing to load? I took a look at the firmware file it referenced with ls -l /lib/firmware/qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn. Sure enough, it was present, just like it was supposed to be. It was also the latest version installed from the firmware-qcom-soc package. Maybe it was incompatible with the new kernel?
To test that theory, I booted back into my trusty Linux 6.11.9 kernel. Now that I knew what I was looking for, I thought that I would surely see the firmware being loaded in the kernel boot log for the older kernel. I knew that the GPU worked in Linux 6.11, after all. To my surprise, I saw those exact same messages about failing to load firmware in Linux 6.11! The only difference was that I didn’t get the *ERROR* gpu hw init failed message immediately afterwards, and the GPU seemed to initialize properly from there without the firmware. That was even more odd.
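Comparing two boots like this is less tedious with a filter that reduces each log to just the firmware files the kernel failed to load. This is a sketch rather than the exact command I ran: the function name failed_firmware is hypothetical, and the usage comment assumes that the journal is persistent and that the previous boot (-b -1) is the one that used the other kernel.

```sh
#!/bin/sh
# Read a kernel log on stdin and print the unique firmware paths that the
# kernel reported it failed to load.
# NOTE: failed_firmware is a hypothetical helper name for illustration.
failed_firmware() {
    sed -n 's/.*firmware: failed to load \([^ ]*\) (.*/\1/p' | sort -u
}

# Example usage (-k limits the output to kernel messages):
#   sudo journalctl -k -b 0  | failed_firmware   # current boot
#   sudo journalctl -k -b -1 | failed_firmware   # previous boot
```

If both boots print the same list, the firmware failure on its own does not explain why only one of the kernels ends up without a working GPU.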
Knowing that, I switched back to Linux 6.12.33, and I started investigating what was causing the firmware that was definitely present on my system to fail to load on both kernel versions (and likely on older kernel versions as well, if I had bothered to go back to check them, which I didn’t). That eventually led me to this Armbian forum thread from September 27, 2023 that contained the final clue.
That forum thread doesn’t mention a problem with the GPU specifically, but the original poster does describe how he discovered that firmware wasn’t always being loaded on his ThinkPad X13s with Armbian unless it was in the initramfs image. I immediately stopped and checked, and sure enough, the firmware was not in my initramfs image. Finally, a glimmer of hope: something that could explain why the firmware seemed to exist on disk but couldn’t be found by the kernel!
$ sudo lsinitramfs /boot/initrd.img-$(uname -r) | grep firmware/qcom
usr/lib/firmware/qcom
usr/lib/firmware/qcom/a300_pfp.fw
usr/lib/firmware/qcom/a300_pm4.fw
usr/lib/firmware/qcom/a330_pfp.fw
usr/lib/firmware/qcom/a330_pm4.fw
usr/lib/firmware/qcom/a420_pfp.fw
usr/lib/firmware/qcom/a420_pm4.fw
usr/lib/firmware/qcom/a530_pfp.fw
usr/lib/firmware/qcom/a530_pm4.fw
usr/lib/firmware/qcom/a530_zap.mdt
usr/lib/firmware/qcom/a530v3_gpmu.fw2
usr/lib/firmware/qcom/a630_gmu.bin
usr/lib/firmware/qcom/a630_sqe.fw
usr/lib/firmware/qcom/a650_gmu.bin
usr/lib/firmware/qcom/a650_sqe.fw
usr/lib/firmware/qcom/a660_gmu.bin
usr/lib/firmware/qcom/a660_sqe.fw
usr/lib/firmware/qcom/apq8096
usr/lib/firmware/qcom/apq8096/a530_zap.mbn
usr/lib/firmware/qcom/leia_pfp_470.fw
usr/lib/firmware/qcom/leia_pm4_470.fw
usr/lib/firmware/qcom/yamato_pfp.fw
usr/lib/firmware/qcom/yamato_pm4.fw
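Rather than scanning a listing like this by eye, the check can be scripted. The sketch below reads an lsinitramfs listing and prints each firmware file from a given list that is absent from it. The function name missing_from_initramfs is hypothetical, and it assumes the listing shows firmware under either usr/lib/firmware or lib/firmware.

```sh
#!/bin/sh
# Read an lsinitramfs listing on stdin and print every firmware path (given
# as arguments, relative to the firmware directory) that is not in it.
# Returns non-zero if at least one file is missing.
# NOTE: missing_from_initramfs is a hypothetical helper name for illustration.
missing_from_initramfs() {
    listing=$(cat)
    status=0
    for fw in "$@"; do
        if ! printf '%s\n' "$listing" |
            grep -q -F -x -e "usr/lib/firmware/$fw" -e "lib/firmware/$fw"; then
            printf '%s\n' "$fw"
            status=1
        fi
    done
    return "$status"
}

# Example usage:
#   sudo lsinitramfs "/boot/initrd.img-$(uname -r)" |
#       missing_from_initramfs qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn
```

Run against the listing above, this would print the qcdxkmsuc8280.mbn path, since nothing under qcom/sc8280xp made it into the image.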
Solution: Including the Qualcomm Firmware in initramfs
Fortunately, the original poster in the aforementioned Armbian forum thread included an initramfs hook to put the firmware into the initramfs image. I added some additional comments to it and made some very slight tweaks. That is ultimately what solved my problem and got the GPU firmware (and other system firmware) to load with Linux 6.12.33 on my ThinkPad X13s Gen 1 so that I have hardware graphics acceleration again without LLVMpipe.
The solution is pretty simple. I have tested it on Debian unstable (shortly after the trixie freeze prior to the release of Debian 13). Your mileage may vary with other distros.
First, create a new initramfs hook.
$ sudo -e /etc/initramfs-tools/hooks/17-qcom-x13s
Second, give it the following contents.
#!/bin/sh
#
# This initramfs hook ensures that the Qualcomm firmware for the ThinkPad X13s
# is automatically loaded at boot by copying it into the initramfs. This works
# around a possible race condition between mounting the NVMe disk containing
# the full firmware and the initialization of the Qualcomm platform drivers and
# Adreno GPU driver in the kernel. They will only try to load the firmware
# once, on boot, and if they can't, they will never try again. This results in
# the firmware never being loaded (on any Linux kernel version), and the GPU
# never being initialized at all on Linux 6.12 (or later, presumably), which
# causes llvmpipe to be used for graphics, which greatly reduces battery life.
#
# This Armbian forum thread has additional details:
# https://forum.armbian.com/topic/30415-thinkpad-x13s-remoteproc-firmware-fails-to-load-if-on-a-fast-boot-device/
#
# To view the contents of the initramfs, use this command:
# sudo lsinitramfs /boot/initrd.img | grep firmware/qcom
#
# This script should be installed as /etc/initramfs-tools/hooks/17-qcom-x13s
# and marked executable. Then run the following command to update the
# initramfs, which will execute it:
# sudo update-initramfs -uv
#
# Error that this initramfs hook works around:
# $ sudo journalctl -b 0 | grep -C5 -e adreno -e msm_dpu
# ...
# msm_dpu ae01000.display-controller: [drm:adreno_request_fw [msm]] loaded qcom/a660_sqe.fw from new location
# msm_dpu ae01000.display-controller: [drm:adreno_request_fw [msm]] loaded qcom/a660_gmu.bin from new location
# msm_dpu ae01000.display-controller: firmware: failed to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn (-2)
# msm_dpu ae01000.display-controller: firmware: failed to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn (-2)
# adreno 3d00000.gpu: [drm:adreno_zap_shader_load [msm]] *ERROR* Unable to load qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn
# msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* gpu hw init failed: -2
set -e
PREREQ=""
prereqs()
{
    echo "$PREREQ"
}
case "$1" in
    # Get the prerequisites.
    prereqs)
        prereqs
        exit 0
        ;;
esac
. /usr/share/initramfs-tools/hook-functions
# Define a list of firmware files to be included.
FIRMWARE_FILES="\
qcom/sc8280xp/LENOVO/21BX/qcadsp8280.mbn \
qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn \
qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn \
qcom/sc8280xp/LENOVO/21BX/qcvss8280.mbn \
qcom/sc8280xp/LENOVO/21BX/qcslpi8280.mbn \
qcom/a660_sqe.fw \
qcom/a660_gmu.bin"
# Copy each firmware file to initramfs.
for file in $FIRMWARE_FILES; do
    dir=$(dirname "$file")
    mkdir -p "${DESTDIR}/lib/firmware/${dir}"
    cp -a "/lib/firmware/${file}" "${DESTDIR}/lib/firmware/${dir}/"
done
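The copy loop above aborts the whole hook (because of set -e) if any one file is missing, and cp -a copies a symlink as a link rather than copying its target. If either of those is a concern on your system, a more defensive variant could look like the sketch below. It is not part of the hook I actually run: the function name copy_firmware is hypothetical, and whether a warning is preferable to a hard failure is a judgment call.

```sh
#!/bin/sh
# Copy firmware files from SRC_ROOT into DEST_ROOT, preserving their relative
# paths. A missing file produces a warning instead of aborting the run.
# Usage: copy_firmware SRC_ROOT DEST_ROOT FILE...
# NOTE: copy_firmware is a hypothetical helper name for illustration.
copy_firmware() {
    src_root=$1
    dest_root=$2
    shift 2
    for file in "$@"; do
        if [ ! -e "$src_root/$file" ]; then
            echo "warning: $src_root/$file not found, skipping" >&2
            continue
        fi
        mkdir -p "$dest_root/$(dirname "$file")"
        # -L copies the target of a symlink rather than the link itself.
        cp -L "$src_root/$file" "$dest_root/$file"
    done
}

# In the hook, this would replace the loop (FIRMWARE_FILES is deliberately
# left unquoted so that it splits into one argument per file):
#   copy_firmware /lib/firmware "${DESTDIR}/lib/firmware" $FIRMWARE_FILES
```

The trade-off is that a silently skipped file is easier to overlook than a failed update-initramfs run, so it is worth checking the initramfs contents after rebuilding it.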
Third, make it executable.
$ sudo chmod 755 /etc/initramfs-tools/hooks/17-qcom-x13s
Fourth, update the initramfs for your current kernel image. This will use the new initramfs hook to include the firmware for the ThinkPad X13s.
$ sudo update-initramfs -u
Fifth, list the contents of the initramfs image to ensure that the ThinkPad X13s firmware was added to it. This step is technically optional, but it is always good to verify. I included sample output below after the command to show what it should look like once the new initramfs hook has been run.
$ sudo lsinitramfs /boot/initrd.img-$(uname -r) | grep firmware/qcom
usr/lib/firmware/qcom
usr/lib/firmware/qcom/a300_pfp.fw
usr/lib/firmware/qcom/a300_pm4.fw
usr/lib/firmware/qcom/a330_pfp.fw
usr/lib/firmware/qcom/a330_pm4.fw
usr/lib/firmware/qcom/a420_pfp.fw
usr/lib/firmware/qcom/a420_pm4.fw
usr/lib/firmware/qcom/a530_pfp.fw
usr/lib/firmware/qcom/a530_pm4.fw
usr/lib/firmware/qcom/a530_zap.mdt
usr/lib/firmware/qcom/a530v3_gpmu.fw2
usr/lib/firmware/qcom/a630_gmu.bin
usr/lib/firmware/qcom/a630_sqe.fw
usr/lib/firmware/qcom/a650_gmu.bin
usr/lib/firmware/qcom/a650_sqe.fw
usr/lib/firmware/qcom/a660_gmu.bin
usr/lib/firmware/qcom/a660_sqe.fw
usr/lib/firmware/qcom/apq8096
usr/lib/firmware/qcom/apq8096/a530_zap.mbn
usr/lib/firmware/qcom/leia_pfp_470.fw
usr/lib/firmware/qcom/leia_pm4_470.fw
usr/lib/firmware/qcom/sc8280xp
usr/lib/firmware/qcom/sc8280xp/LENOVO
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX/qcadsp8280.mbn
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX/qcdxkmsuc8280.mbn
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX/qcslpi8280.mbn
usr/lib/firmware/qcom/sc8280xp/LENOVO/21BX/qcvss8280.mbn
usr/lib/firmware/qcom/yamato_pfp.fw
usr/lib/firmware/qcom/yamato_pm4.fw
Sixth, and finally, reboot. This will boot into the default kernel image with the newly updated initramfs. The firmware should be correctly loaded, and you shouldn’t see LLVMpipe running anymore with htop or ps -ef | grep llvmpipe.
$ sudo reboot
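After the reboot, the kernel log is the most direct place to confirm the fix. The sketch below succeeds only if none of the GPU errors quoted earlier in this post appear in the log it is given. The function name gpu_init_ok is hypothetical, and the three patterns are simply the error strings from my own logs, so other failure modes would not be caught.

```sh
#!/bin/sh
# Read a kernel log on stdin; succeed (0) if it contains none of the GPU
# initialization errors quoted in this post, and fail (1) if any is present.
# NOTE: gpu_init_ok is a hypothetical helper name for illustration.
gpu_init_ok() {
    ! grep -q \
        -e "Couldn't power up the GPU" \
        -e 'gpu hw init failed' \
        -e 'Unable to load qcom/'
}

# Example usage:
#   if sudo journalctl -k -b 0 | gpu_init_ok; then
#       echo "No GPU initialization errors in this boot"
#   fi
```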
Conclusion: Don’t Forget Your Firmware
What I learned from this investigation is that a surprising number of things work on the ThinkPad X13s Gen 1 in Linux without their firmware loaded. Even though the firmware was correctly installed, the GPU firmware had never actually been loaded in all the time I’d been using the laptop, and yet the GPU somehow worked anyway until Linux 6.12. I’m glad to have solved that little problem.