
Fixing Android Kernel Drivers in the Post-Treble Era


This is a follow-up on fixing the touchscreen issue in our previous custom-built Pixel 3a kernel.

2022-04-27 Update: Add @yanivagman’s solution that compiles the drivers in the kernel

2021-05-21 Update: Add a second solution that will persist the fix

TL;DR: How to fix the broken touchscreen driver #

If you recall what we did in our last Android kernel fuzzing series: we built the kernel, slapped it onto the boot image, flashed it into the boot partition, and restarted the Pixel 3a. Now we have our brand new kernel running, only with a broken touchscreen! You will probably see the lock screen show up and find the power and volume buttons working fine, but the screen won’t respond to any touches.

The simplest fix (Method 1) is to edit the kernel config so that the touchscreen drivers are built into the kernel image (=y) instead of being built as loadable modules (=m).

Credit goes to @yanivagman

diff --git a/arch/arm64/configs/bonito_defconfig b/arch/arm64/configs/bonito_defconfig
index 964fef80dca7..4910ccbe1755 100644
--- a/arch/arm64/configs/bonito_defconfig
+++ b/arch/arm64/configs/bonito_defconfig
@@ -324,10 +324,10 @@ CONFIG_JOYSTICK_XPAD=y
 CONFIG_JOYSTICK_XPAD_FF=y
 CONFIG_JOYSTICK_XPAD_LEDS=y
 CONFIG_INPUT_TOUCHSCREEN=y
-CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_CORE_v27=m
-CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_RMI_DEV_v27=m
-CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_FW_UPDATE_v27=m
-CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_TEST_REPORTING_v27=m
+CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_CORE_v27=y
+CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_RMI_DEV_v27=y
+CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_FW_UPDATE_v27=y
+CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_TEST_REPORTING_v27=y
 CONFIG_TOUCHSCREEN_TBN=y
 CONFIG_INPUT_MISC=y
 CONFIG_INPUT_HBTP_INPUT=y
-- 
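If you would rather not edit the defconfig by hand, the same change can be scripted. This is only a minimal sketch: the two function names are made up for illustration, and it assumes the four Synaptics options are the only ones matching the CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_ prefix.

```shell
#!/bin/bash
# Hypothetical helper: flip every Synaptics DSX option in a defconfig
# from "built as a module" (=m) to "built into the kernel" (=y).
make_synaptics_builtin() {
  local defconfig="$1" tmp rc
  tmp="$(mktemp)" || return 1
  sed -E 's/^(CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_[A-Za-z0-9_]+)=m$/\1=y/' \
    "$defconfig" > "$tmp" && cat "$tmp" > "$defconfig"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Hypothetical helper: list Synaptics DSX options that are still =m.
# Prints nothing (and returns non-zero) when none are left.
list_synaptics_modules() {
  grep -E '^CONFIG_TOUCHSCREEN_SYNAPTICS_DSX_[A-Za-z0-9_]+=m$' "$1"
}
```

Run `make_synaptics_builtin arch/arm64/configs/bonito_defconfig` from the kernel source tree, then `list_synaptics_modules` on the same file to confirm nothing is left as a module before rebuilding.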

Method 2: Manually load driver kernel modules #

After building your kernel (here we use android-msm-bonito-4.9-android10 as an example), there should be four synaptics_*.ko files generated in the output directory output/bonito-kernel-4.9-android-10/msm/drivers/input/touchscreen/. These are the touchscreen drivers that are compatible with the new kernel but weren’t automatically included in the kernel/boot image.

synaptics_dsx_core.ko
synaptics_dsx_fw_update.ko
synaptics_dsx_rmi_dev.ko
synaptics_dsx_test_reporting.ko

Loading these kernel drivers should fix the issue. Below is a simple script that I wrote. It expects the four .ko files in a local vendor_ko directory, copies them to the phone’s /data/sdcard directory, and runs insmod to load the kernel modules dynamically. No reboot is needed; the touchscreen should come back to life within a second.

$ cat patch_drivers.sh
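#!/bin/bash
# Local directory holding the freshly built synaptics_*.ko files
KO_DIR=vendor_ko

adb root
adb wait-for-device   # adbd restarts after "adb root"
adb push "$KO_DIR" /data/sdcard/
# The glob expands alphabetically, so synaptics_dsx_core.ko is loaded first
# (the other three modules likely depend on it)
for file in "$KO_DIR"/*.ko; do
  FILENAME="$(basename "$file")"
  echo "Installing kernel module $FILENAME"
  adb shell insmod "/data/sdcard/$KO_DIR/$FILENAME"
done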

$ ./patch_drivers.sh
restarting adbd as root
vendor_ko/: 5 files pushed. 18.8 MB/s (2208388 bytes in 0.112s)
Installing kernel module synaptics_dsx_core.ko
Installing kernel module synaptics_dsx_fw_update.ko
Installing kernel module synaptics_dsx_rmi_dev.ko
Installing kernel module synaptics_dsx_test_reporting.ko
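To double-check that all four modules really made it into the running kernel, you can filter the module list. The sketch below is not part of the original script; the function name is hypothetical, and it assumes lsmod-style input where the module name is the first column.

```shell
#!/bin/bash
# Read lsmod-style output on stdin and print each expected Synaptics module
# that is NOT loaded. Returns 0 only when all four modules are present.
missing_synaptics_modules() {
  local loaded mod rc=0
  loaded="$(awk '{print $1}')"   # first column = module name
  for mod in synaptics_dsx_core synaptics_dsx_fw_update \
             synaptics_dsx_rmi_dev synaptics_dsx_test_reporting; do
    if ! grep -qx "$mod" <<<"$loaded"; then
      echo "$mod"
      rc=1
    fi
  done
  return "$rc"
}
```

On a connected device this would be used as `adb shell lsmod | missing_synaptics_modules`; feeding it `adb shell cat /proc/modules` should work as well, since that file also starts each line with the module name.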

Method 3: Push driver modules to /vendor #

The only caveat of Method 2 is that you’ll have to repeat it after every reboot. To make the change permanent, you can copy your .ko files into /vendor/lib/modules, replacing the stock drivers there that no longer match your kernel.

Note that you’ll need either a userdebug build or a system image with dm-verity disabled to carry out this trick. While dm-verity is enforcing, the /vendor partition is write-protected and cannot be remounted read-write, so you can’t overwrite any files inside it.

# if dm-verity is enabled
$ adb shell
> mount -o rw,remount /vendor
mount: /vendor: cannot remount /dev/mapper/vendor-verity read-write, is write-protected.

# on a userdebug build
$ adb disable-verity
using overlayfs
Successfully disabled verity
Now reboot your device for settings to take effect
$ adb reboot

$ adb push vendor_ko /data/sdcard
$ adb shell
> mount -o rw,remount /vendor
> cp /data/sdcard/vendor_ko/*.ko /vendor/lib/modules/
# insmod or reboot to load the new drivers
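Before trying the remount, it can help to check which of the two situations above you are in. The following is a rough sketch, not a tested tool: the function name is hypothetical, and it assumes that the device reports its build flavor in ro.build.type and its verity state in ro.boot.veritymode (with the value "disabled" once verity is off).

```shell
#!/bin/bash
# Decide whether /vendor can be made writable, given two property values.
#   $1: value of ro.build.type       (e.g. user, userdebug, eng)
#   $2: value of ro.boot.veritymode  (assumed property; e.g. enforcing, disabled)
vendor_write_hint() {
  local build_type="$1" verity_mode="$2"
  if [ "$verity_mode" = "disabled" ]; then
    echo "verity already disabled: remount /vendor read-write"
  elif [ "$build_type" = "userdebug" ] || [ "$build_type" = "eng" ]; then
    echo "run 'adb disable-verity', reboot, then remount /vendor"
  else
    echo "user build with verity on: /vendor stays read-only"
    return 1
  fi
}
```

A possible invocation is `vendor_write_hint "$(adb shell getprop ro.build.type)" "$(adb shell getprop ro.boot.veritymode)"`. If your adb version appends carriage returns to the output, strip them with `tr -d '\r'` first.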

The root cause #

Finding out the culprit #

When I first flashed the new kernel, the broken touchscreen was the first thing that I noticed. So I tried every input I could (power button, volume buttons, fingerprint sensor, etc.) while watching the input events over adb. The result is below.

$ adb shell getevent -l
    add device 1: /dev/input/event2
      name:     "uinput-fpc"
    add device 2: /dev/input/event0
      name:     "qpnp_pon"
    add device 3: /dev/input/event1
      name:     "gpio-keys"
    could not get driver version for /dev/input/mice, Not a typewriter
    /dev/input/event0: EV_KEY       KEY_POWER            DOWN
    /dev/input/event0: EV_SYN       SYN_REPORT           00000000
    /dev/input/event0: EV_KEY       KEY_POWER            UP
    /dev/input/event0: EV_SYN       SYN_REPORT           00000000
    /dev/input/event0: EV_KEY       KEY_VOLUMEDOWN       DOWN
    /dev/input/event0: EV_SYN       SYN_REPORT           00000000
    /dev/input/event0: EV_KEY       KEY_VOLUMEDOWN       UP

Surprisingly, there was no touch event at all (e.g. ABS_MT_POSITION_X), and no touchscreen device was even registered. To compare with the stock ROM, I flashed the factory boot.img again and the touchscreen magically went back to normal. Running getevent again revealed the culprit: the synaptics_dsx input device, which had been missing under our custom build.

$ adb shell getevent -l
...
add device 4: /dev/input/event2
  name:     "synaptics_dsx"
...
/dev/input/event2: EV_ABS       ABS_MT_TRACKING_ID   00000064
/dev/input/event2: EV_KEY       BTN_TOUCH            DOWN
/dev/input/event2: EV_KEY       BTN_TOOL_FINGER      DOWN
/dev/input/event2: EV_ABS       ABS_MT_POSITION_X    000001b3
/dev/input/event2: EV_ABS       ABS_MT_POSITION_Y    000003b8

synaptics is the touchscreen driver used on the Nexus 5 and later devices. What’s wrong with this innocent-looking thing? Luckily I still kept the dmesg from our last system boot (with the custom kernel). Searching it for the keyword synaptics turned up error messages showing that the synaptics drivers failed to load, each with the complaint “disagrees about version of symbol module_layout”.

$ cat dmesg.log | grep synaptics_dsx
[    6.595401] synaptics_dsx_core: disagrees about version of symbol module_layout
[    6.633068] synaptics_dsx_fw_update: disagrees about version of symbol module_layout
[    6.677681] synaptics_dsx_test_reporting: disagrees about version of symbol module_layout
[    6.704484] synaptics_dsx_rmi_dev: disagrees about version of symbol module_layout
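If the log is long, a short filter can list every module that was rejected this way, not just the ones you thought to grep for. This is a minimal sketch with a made-up function name; it assumes the dmesg line format shown above.

```shell
#!/bin/bash
# Read kernel log lines on stdin and print the unique names of modules
# that were rejected because of a symbol-version (CRC) mismatch.
modversion_failures() {
  sed -n 's/^.*\] *\([^ :]*\): disagrees about version of symbol .*$/\1/p' |
    LC_ALL=C sort -u
}
```

Usage would be `modversion_failures < dmesg.log`, or `adb shell dmesg | modversion_failures` on a live device.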

What went wrong: vendor.img, Project Treble, and module versioning #

In Android 8.0, Google introduced a new feature named Project Treble (https://android-developers.googleblog.com/2017/05/here-comes-treble-modular-base-for.html), which aims to solve the fragmentation issue in Android’s ecosystem. Since then, vendor-maintained drivers have been moved to a dedicated /vendor partition. This separates the AOSP code from vendor-maintained drivers and makes system updates much easier: the updater can flash partitions such as /boot and /system (maintained by Google) while leaving the /vendor partition (maintained by the vendor) intact.

If you download a Pixel 3a factory image and extract the content of vendor.img, you should be able to see synaptics_*.ko inside /lib/modules. You can’t build the vendor.img yourself (at least for the Pixel series) as it contains proprietary blobs that are only distributed through factory images. Remember that our custom kernel is bundled with boot.img, which means that even after we flash /boot, the system still loads the stock kernel drivers from /vendor, which were built against the factory kernel.

$ file vendor.img
vendor.img: Android sparse image, version: 1.0, Total of 122536 4096-byte output blocks in 13 input chunks.

# use simg2img to convert the Android sparse image into a raw ext image
$ git clone https://github.com/anestisb/android-simg2img.git
$ cd android-simg2img; make
$ ./simg2img vendor.img vendor.raw.img

$ file vendor_img_mount/vendor.raw.img
vendor_img_mount/vendor.raw.img: Linux rev 1.0 ext2 filesystem data, UUID=2b96c597-1e2f-5ee1-9851-c4a9fa9de36e, volume name "vendor" (extents) (large files) (huge files)

# use e2ls to list directory inside of the ext2 image
$ sudo apt-get install e2tools
$ e2ls vendor.raw.img:/lib/modules
lcd.ko                                  modules.alias
modules.dep                             synaptics_dsx_core.ko
synaptics_dsx_fw_update.ko              synaptics_dsx_rmi_dev.ko
synaptics_dsx_test_reporting.ko         test_kasan.ko
wlan.ko

But why do kernel modules stop working once the kernel changes? This is due to the module versioning mechanism of Linux. The AOSP documentation offers some insight:

“Typically, a kernel module must be compiled with the kernel that the module is to be used with (otherwise the kernel refuses to load the module). CONFIG_MODVERSIONS provides a workaround by detecting breakages in the application binary interface (ABI). This feature calculates a cyclic redundancy check (CRC) value for the prototype of each exported symbol in the kernel and stores the values as part of the kernel; for symbols used by a kernel module, the values are also stored in the kernel module. When the module is loaded, the values for the symbols used by the module are compared with the ones in the kernel. If the values match, the module is loaded; otherwise the load fails.”

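I encourage you to pause a minute and read the full explanation there. Apparently, when we alter the kernel options, some kernel structs and symbols change, resulting in CRCs that differ from the ones recorded in the factory .ko drivers. The module_layout symbol in our dmesg output is a good example: its CRC is derived from core structures such as struct module, so it tends to change whenever a config option alters their layout. Hence, the kernel refuses to load the drivers to prevent potential problems caused by incompatible ABIs.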
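You can see the mismatch for yourself by comparing the CRC that the kernel build recorded for module_layout with the one stored in a factory module. Treat the following as a sketch under assumptions: the helper names are hypothetical, and it relies on both Module.symvers (in the kernel build output) and the output of modprobe --dump-modversions listing the CRC in the first column and the symbol name in the second.

```shell
#!/bin/bash
# Print the CRC recorded for symbol $1 in a "CRC<whitespace>symbol ..."
# listing read from stdin.
symbol_crc() {
  awk -v sym="$1" '$2 == sym { print $1 }'
}

# Compare the kernel's CRC ($1) with the module's CRC ($2).
crc_verdict() {
  if [ "$1" = "$2" ]; then
    echo "match"
  else
    echo "mismatch: kernel=$1 module=$2"
  fi
}
```

With the real files this could look like `crc_verdict "$(symbol_crc module_layout < path/to/Module.symvers)" "$(modprobe --dump-modversions synaptics_dsx_core.ko | symbol_crc module_layout)"`, where path/to/Module.symvers is a placeholder for wherever your build wrote that file. A mismatch on the factory .ko and a match on the freshly built one is what we would expect from the dmesg errors above.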

Alternative solutions #

The manual insmod fix (Method 2) doesn’t persist after reboots. If you’re looking for a permanent solution other than Method 3, you may consider either patching the vendor.img by replacing the kernel drivers, or mounting the replacement drivers through Magisk as part of an overlay.

References #