Dynamic Cheatsheets on Pocketbook E-Reader

tl;dr: the project sources can be found here

Motivation

My goal for this project was to repurpose my E-Reader device and write an application that dynamically displays cheatsheets based on the currently focused window, acting as a special-purpose second screen for mostly static content. The idea was to replace printed cheatsheets and info cards for frequently used applications and tools like VSCode, Vim, Bash and so on.

Pocketbook E-Reader devices are cool because their firmware is Linux-based and, compared to other devices, quite open. They allow running custom applications in a standard Linux environment and additionally provide Pocketbook's own SDK for drawing directly on the display, retrieving input and interacting with their services and utilities.

Unfortunately, the publicly available SDK does not seem to be updated regularly, but it still works on recent firmware. The device I used is the Pocketbook Touch Lux 3 (PB626), which is quite old by now but still works fine.

Setup & Build environment

Bindings

My language of choice for writing software is Rust. The SDK is written in C, so bindings are needed. Fortunately, I am not treading on entirely new territory here. Ben Simms already created inkview-rs, a bindings crate that generates its bindings through bindgen and can be used in combination with cargo-zigbuild to dynamically link the SDK with the built application.

The bindings code is generated with bindgen's --dynamic-loading flag, so the actual symbols are resolved at runtime from the SDK's shared library on the device instead of at link time. No need to have a complicated build environment with a custom cross-compilation C toolchain for the SDK, linking .a files, and so on!

Because of that, developing for the remote device is nearly painless. Simply adding the armv7-unknown-linux-gnueabi target through rustup and then executing cargo zigbuild --target armv7-unknown-linux-gnueabi.2.23 cross-compiles to armv7 against a specific glibc version (2.23) that matches the one present on the Pocketbook device.

Rooting & SSH

By default it is not possible to run programs with root permissions on the device. However, because the firmware uses an old kernel version that is vulnerable to the "Dirty COW" exploit, it is possible to "jailbreak" the device and install a root shell. The mobileread forum thread of the pbjb application has a guide for that in its "low-level-internals" section. Afterwards, programs can be launched with root permissions through /mnt/secure/su.

Now I could launch an SSH server with a script ssh-dropbear.app that executes dropbear like this:

#!/bin/sh
/mnt/secure/su /sbin/dropbear -p 2468 -G ""

The installed dropbear version does not accept all key algorithms though, and I had trouble connecting at first. After some research, I came up with this SSH config that should work in most cases:

Host pocketbook
  HostName <device-ip>
  Port 2468
  User root
  HostKeyAlgorithms +ssh-rsa
  PubkeyAcceptedAlgorithms +ssh-rsa
  PubkeyAuthentication no
  StrictHostKeyChecking no

The SSH server should not be launched while the device is connected to untrusted networks though, because anyone could connect to it as root without any authentication.

App transfer

Applications with the file suffix .app placed into the /mnt/ext1/applications directory (which appears simply as applications when the device is connected through USB) are automatically shown in the application launcher. They can be binaries but also shell scripts, which is very useful for creating small tools that make development easier.

To iterate quickly I wanted to be able to transfer the built application to the device without any physical steps like plugging a USB cable in and out.

To do that I used netcat, a tool for creating ad-hoc network sockets and transmitting arbitrary data over them. The tool is already present on the device, so I only needed to create scripts for the developer machine and the device that send and receive applications.

Sender:

echo "Sending application name.."
echo <app-name> | nc <device-ip> 19991
# The e-reader needs a bit of time to re-launch 'nc'
sleep 3
echo "Sending application content.."
nc <device-ip> 19991 < /local/path/to/binary

Receiver:

echo "Listening for application name.."
LOCAL_APP_NAME=$(nc -l -p 19991 | tr -d ' ')
echo "Received application name : '$LOCAL_APP_NAME'"
LOCAL_APP_PATH="/mnt/ext1/applications/$LOCAL_APP_NAME"
echo "Listening for application content.."
nc -l -p 19991 > "$LOCAL_APP_PATH"
echo "Application has been saved to '$LOCAL_APP_PATH'"

Debugging

I also wanted to debug remotely with gdb. This is possible by launching a gdbserver session from an SSH session with the command gdbserver 0.0.0.0:10003 /mnt/ext1/applications/<app-name>.app.

On the developer machine, when using VSCode for development, the 'CodeLLDB' extension can be used for debugging Rust code. My launch.json configuration entry ended up looking like this:

{
    "type": "lldb",
    "request": "custom",
    "name": "remote debug <app-name>",
    "targetCreateCommands": [
        "target create ${workspaceFolder}/target/armv7-unknown-linux-gnueabi/debug/examples/<app-name>"
    ],
    "processCreateCommands": [
        "gdb-remote <device-ip>:10003"
    ]
}

Now I could step through code, set breakpoints and so on. Of course it's also possible to use the gdb CLI directly.

inkview-rs Improvements

The inkview-rs bindings crate needed additional features and fixes to satisfy my needs. For one, it needed to be extended to SDK version 5.19, which is the one supported on the Touch Lux 3. Previously, only bindings for v6.5 were generated. These API versions can now be switched at compile time through the cargo features sdk-5-19 and sdk-6-5.

I added safe Rust wrappers for common tasks like a fast screen update and a wrapper for the dialog API. The library also previously had issues with a vertical draw offset, which could be fixed by setting a specific application flag through the Pocketbook SDK.

Additionally, I added an adapter crate inkview-eg which implements the embedded_graphics::DrawTarget trait for the display. This enables drawing with a convenient API for graphics primitives and even things like text and bitmap images.

pb-cheatsheet Application

Overview

Establishing a solid development environment and improving the inkview-rs library took a few weeks of on-and-off work. After that I could focus on the application itself. The code is entirely open-source under 'GPLv3' and can be found here

The host side is a CLI tool that is able to communicate with the GRPC server of the client. It should have the following features:

ability to upload cheatsheets (images) to the client
register/delete associated tags
retrieve device state
ability to take screenshots and automatically upload them to the client
continuously report the currently focused window

The client is a Pocketbook-specific application that holds and manages state and draws to the screen. For communication with the host it starts a GRPC server. It should be possible to display some stats, like the reported focused window, by pressing the Menu button. The UI should also offer three distinct modes:

Manual : the user can manually cycle through the uploaded cheatsheets
Automatic WM-Class : display the cheatsheets whose tags match the tags registered for the reported focused window class
Screenshot : display the last uploaded screenshot
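
The three modes can be pictured as a small state machine. The following is only a minimal sketch under assumptions, not the actual client code: the names UiMode, Page and UiState are made up for illustration, and cheatsheets are represented by their names only.

```rust
/// The three UI modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UiMode {
    Manual,
    AutomaticWmClass,
    Screenshot,
}

/// What the client should draw for the current state.
#[derive(Debug, PartialEq, Eq)]
enum Page<'a> {
    Cheatsheet(&'a str),
    Screenshot,
    Empty,
}

/// Hypothetical UI state; all field names are illustrative.
struct UiState {
    mode: UiMode,
    /// Names of all uploaded cheatsheets.
    all: Vec<String>,
    /// Names of the cheatsheets that match the focused window class.
    matching: Vec<String>,
    /// Position within the list the current mode cycles through.
    cursor: usize,
    has_screenshot: bool,
}

impl UiState {
    /// The list of cheatsheets the current mode cycles through.
    fn candidates(&self) -> &[String] {
        match self.mode {
            UiMode::Manual => &self.all,
            UiMode::AutomaticWmClass => &self.matching,
            // Screenshot mode has nothing to cycle through.
            UiMode::Screenshot => &self.all[..0],
        }
    }

    /// Decide what to draw, wrapping the cursor if the list got shorter.
    fn current_page(&self) -> Page<'_> {
        if self.mode == UiMode::Screenshot {
            return if self.has_screenshot { Page::Screenshot } else { Page::Empty };
        }
        let list = self.candidates();
        match list.get(self.cursor % list.len().max(1)) {
            Some(name) => Page::Cheatsheet(name.as_str()),
            None => Page::Empty,
        }
    }

    /// Advance to the next cheatsheet, wrapping around at the end.
    fn next(&mut self) {
        let len = self.candidates().len();
        if len > 0 {
            self.cursor = (self.cursor + 1) % len;
        }
    }
}
```

In this sketch, switching the mode only changes which list the Prev/Next handling walks over, so the drawing code can stay the same for Manual and Automatic WM-Class.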

Now let's get into the details of how all of this is actually implemented.

Host

CLI

The following commands are implemented:

Usage: pb-cheatsheet-host --pb-grpc-addr <PB_GRPC_ADDR> <COMMAND>

Commands:
  report-focused-window   Continuously report focused window info to the client.
                               Intended to be run as a service
  get-screen-info         Get device screen info
  get-cheatsheets-info    Get cheatsheets info
  upload-cheatsheet       Upload a new cheatsheet that gets displayed when the added tags match the tags
                               that are added to the wm class of the reported window.
                               The image size is adjusted depending on the reported screen info of the client
  remove-cheatsheet       Remove a cheatsheet
  screenshot              Take a screenshot and upload it to the device for transient display
  clear-screenshot        Clear the screenshot
  add-cheatsheet-tags     Add cheatsheet tags
  remove-cheatsheet-tags  Remove cheatsheet tags
  add-wm-class-tags       Add wm class tags
  remove-wm-class-tags    Remove wm class tags
  help                    Print this message or the help of the given subcommand(s)

For example the output when fetching cheatsheet info looks like this:

./assets/host-get-cheatsheets-info.png

Fetching Info

Retrieving the focused window is implemented only for the Gnome desktop environment, because unfortunately there is no universal Wayland protocol yet that would enable a desktop environment independent implementation.

The Gnome extension focused-window-dbus must also be installed. It reports the focused window through a DBus interface.

The host application uses zbus to fetch this info from the extension. The core Rust code to do that looks like this:

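use zbus::{proxy, Connection};

// D-Bus proxy for the interface exposed by the focused-window-dbus extension.
#[proxy(
    default_service = "org.gnome.Shell",
    default_path = "/org/gnome/shell/extensions/FocusedWindow",
    interface = "org.gnome.shell.extensions.FocusedWindow"
)]
trait FocusedWindow {
    // Returns the focused window info as a JSON string.
    async fn get(&self) -> zbus::Result<String>;
}

pub(crate) async fn get_focused_window_info(
    connection: &Connection,
) -> anyhow::Result<FocusedWindowInfo> {
    let proxy = FocusedWindowProxy::new(connection).await?;
    // The extension replies with JSON, which is parsed into a generic value first.
    let val: serde_json::Value = serde_json::from_str(&proxy.get().await?)?;
    // ...
}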

Service

The pb-cheatsheet-host report-focused-window command continuously fetches info about the currently focused window and reports it to the client over GRPC whenever it changes.
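
The "report only when it changes" part boils down to remembering the last observed value. Here is a minimal sketch of that idea, assuming a FocusedWindowInfo with two illustrative fields (the real struct may look different) and a hypothetical ChangeReporter helper:

```rust
/// Subset of the info about a focused window; the fields are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FocusedWindowInfo {
    wm_class: String,
    title: String,
}

/// Remembers the last observed window so that only changes are forwarded.
#[derive(Default)]
struct ChangeReporter {
    last: Option<FocusedWindowInfo>,
}

impl ChangeReporter {
    /// Returns `Some(info)` if `info` differs from the previously observed
    /// value and should be reported, `None` if nothing changed.
    fn observe(&mut self, info: FocusedWindowInfo) -> Option<&FocusedWindowInfo> {
        if self.last.as_ref() == Some(&info) {
            return None;
        }
        self.last = Some(info);
        self.last.as_ref()
    }
}
```

A polling loop would call observe with every fetched value and only issue the RPC when it returns Some, which keeps traffic to the e-reader low while the same window stays focused.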

This command can be a service that is started by systemd. A suitable .service file:

[Unit]
Description=pb-cheatsheet-host focused window reporter
StartLimitIntervalSec=0
StartLimitBurst=0

[Service]
Environment="PB_GRPC_ADDR=<client-ip:51151>"
Environment="RUST_LOG=pb_cheatsheet_host=<log-level>"
ExecStart=%h/.cargo/bin/pb-cheatsheet-host report-focused-window
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target

Client

The client starts a GRPC server that listens for incoming procedure call requests and messages.

It does that in an asynchronous task, which sends messages over a channel to the handler. This handler also receives incoming messages from the pocketbook event main loop. Because the pocketbook SDK is not designed to be used in an async context, both "worlds", async and blocking, have to be combined carefully.
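
The channel pattern described above can be sketched with the standard library alone. This is a simplification under assumptions: a plain thread stands in for the asynchronous GRPC task, and the Msg type and function names are invented for illustration.

```rust
use std::sync::mpsc::{Receiver, Sender};
use std::thread;

/// Messages the handler can receive from either side (illustrative names).
#[derive(Debug, PartialEq, Eq)]
enum Msg {
    /// Sent by the task that serves RPC requests.
    Rpc(String),
    /// Sent by the blocking device event loop, e.g. a key code.
    Key(u32),
}

/// Handle messages until every sender has been dropped.
/// Returns a log of what was handled, in order of arrival.
fn run_handler(rx: Receiver<Msg>) -> Vec<String> {
    let mut log = Vec::new();
    // Iterating the receiver blocks, so this loop belongs on the blocking side.
    for msg in rx {
        match msg {
            Msg::Rpc(name) => log.push(format!("rpc:{name}")),
            Msg::Key(code) => log.push(format!("key:{code}")),
        }
    }
    log
}

/// Stand-in for the RPC server task: forwards two requests to the handler.
fn spawn_rpc_side(tx: Sender<Msg>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        tx.send(Msg::Rpc("FocusedWindow".to_string())).unwrap();
        tx.send(Msg::Rpc("GetScreenInfo".to_string())).unwrap();
    })
}
```

The important property is that both producers only ever hold a Sender, so all state changes happen in one place, the handler. With an async runtime the sending half would be used from within the async task instead of a thread, but the shape stays the same.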

The client also holds some state about the cheatsheets, their tags and the UI.

Persistence

The cheatsheets and the metadata containing the tags are saved as files whenever changes are made and when the app is closed. The populated save directory looks like this:

save_dir/
  wm_class_tags.json # JSON file containing the tags that are associated with windows
  cheatsheet_app_1.cs # Raw cheatsheet image data about <app1>
  cheatsheet_app_1.json # Metadata about <app1> cheatsheet containing associated tags
  # ..

The metadata is saved as JSON with serde and serde_json.

For the cheatsheet images, serde and bincode are used to store the raw image data as efficiently as possible. The images themselves are pre-converted to 8-bit grayscale and scaled to the client display resolution on the host.
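
To make the host-side preparation concrete, here is a std-only sketch of both steps. It is not the project's code: the GrayImage type and function names are hypothetical, the luma weights (Rec. 601) and nearest-neighbour sampling are assumptions, and a real implementation would more likely rely on an image processing crate.

```rust
/// An 8-bit grayscale image stored row by row (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct GrayImage {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

/// Convert one RGB pixel to 8-bit luma using integer Rec. 601 weights.
fn luma(r: u8, g: u8, b: u8) -> u8 {
    ((299 * r as u32 + 587 * g as u32 + 114 * b as u32) / 1000) as u8
}

/// Convert tightly packed RGB8 data to grayscale.
fn to_gray(width: usize, height: usize, rgb: &[u8]) -> GrayImage {
    assert_eq!(rgb.len(), width * height * 3, "unexpected RGB buffer size");
    let pixels = rgb
        .chunks_exact(3)
        .map(|p| luma(p[0], p[1], p[2]))
        .collect();
    GrayImage { width, height, pixels }
}

/// Scale to the target size with nearest-neighbour sampling.
fn scale_nearest(src: &GrayImage, width: usize, height: usize) -> GrayImage {
    let mut pixels = Vec::with_capacity(width * height);
    for y in 0..height {
        // Map the target row back to the closest source row.
        let sy = y * src.height / height;
        for x in 0..width {
            let sx = x * src.width / width;
            pixels.push(src.pixels[sy * src.width + sx]);
        }
    }
    GrayImage { width, height, pixels }
}
```

Doing this on the host means the device only ever receives one byte per pixel at exactly its screen size, so the client can hand the buffer to the drawing code without any further processing.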

Dynamic Cheatsheets using Tags

The tags system works like this:

Every window class is associated with a set of tags, and every cheatsheet carries tags of its own. Each cheatsheet that shares a tag with the focused window class is displayed. If several match, the user is able to cycle through them with the Prev/Next buttons.
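
The matching rule can be expressed as a set intersection. A minimal sketch, assuming tags are plain strings, both associations are kept in maps keyed by name, and "matches" means "shares at least one tag" (the function and type names are illustrative, not taken from the project):

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A set of tag names.
type Tags = BTreeSet<String>;

/// Return the names of all cheatsheets that share at least one tag
/// with the given window class, in alphabetical order.
fn matching_cheatsheets<'a>(
    wm_class: &str,
    wm_class_tags: &BTreeMap<String, Tags>,
    cheatsheet_tags: &'a BTreeMap<String, Tags>,
) -> Vec<&'a str> {
    // A window class without registered tags matches nothing.
    let Some(window_tags) = wm_class_tags.get(wm_class) else {
        return Vec::new();
    };
    cheatsheet_tags
        .iter()
        .filter(|(_, tags)| !tags.is_disjoint(window_tags))
        .map(|(name, _)| name.as_str())
        .collect()
}
```

For example, if the window class "code" carries the tags "editor" and "git", then a cheatsheet tagged "git" and one tagged "editor" would both be returned, while one tagged only "vim" would not.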

Screenshot feature

The host uses the XDG Desktop Screenshot Portal to open the screenshot tool and fetch the save path. It then reads the image data from that path, prepares it and sends it to the client. The client then automatically switches to the Screenshot UI mode and displays the image. For users of dark-mode UIs, the screenshot colors can be inverted with a flag.
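
On 8-bit grayscale data the color inversion is a one-liner per pixel. A tiny sketch, with an assumed function name:

```rust
/// Invert 8-bit grayscale pixels in place, so that a dark UI
/// becomes dark text on a light background.
fn invert_gray(pixels: &mut [u8]) {
    for p in pixels.iter_mut() {
        *p = 255 - *p;
    }
}
```

Dark screenshots are worth inverting because large black areas tend to be harder to read on E-Paper than dark text on a light background.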

RPC

For communication and remote procedure calls between host and client GRPC is used. The protocol is specified in a protobuf .proto file. A snippet:

syntax = "proto3";
package pb_cheatsheet;

service PbCheatsheet {
  rpc FocusedWindow(FocusedWindowInfo) returns (Empty) {}
  rpc GetScreenInfo(Empty) returns (ScreenInfo) {}
  rpc GetCheatsheetsInfo(Empty) returns (CheatsheetsInfo) {}
  rpc UploadCheatsheet(UploadCheatsheetRequest) returns (Empty) {}
  // ...
}

// ...

The tonic crate generates Rust code for client and server from it.

Cheatsheets

To create cheatsheets easily there is a typst template that is optimized for 'keyboard-key' -> 'functionality' pair lists, uses an E-Paper suitable font and scales its content for the lower resolution while reducing margins to maximize the available space.

A snippet showing how this template is used for a git cheatsheet:

#import "./cheat-template.typ": cheat

#show: cheat.with(
  title: [Git Cheatsheet],
  icon: image("icons/git.svg"),
)

#table(
  table.header[Adding changes],
  [`git add -u <path>`], [Add changes of all tracked files to the *staging area*.],
  [`git add -p <path>`], [Interactively pick which hunks to *stage*.],
)

// ..

The rendered image looks like this:

Conclusion

It was a lot of work, but I learned a lot about setting up a development environment and about the additional hurdles of developing for remote embedded Linux systems. Generating bindings that link dynamically and using 'zigbuild' stand out as very useful ways to avoid most of the pain that would come with a 'regular' cross-build setup.

I also really enjoyed using GRPC as the remote procedure call mechanism between host and client. It seems very efficient, well thought out with regard to forward and backward compatibility, and stable, and on top of that it was quite easy and enjoyable to work with. It will be one of my first choices in future projects that need RPC.

I think this turned out pretty well: I achieved basically all of the feature goals I had planned before I started.
