Saturday, October 22, 2011

Reverse engineering a USB camera

I've been wanting to upgrade my CNC microscope but unfortunately hit a snag: the USB camera doesn't seem to be supported by Linux. This is a braindump of some of the things I tried and the results.

I think I first tried to do the captures through VMWare / vusb-analyzer. I had good results previously using it for reverse engineering a temperature probe protocol. This was relatively simple, but had to be done very carefully since vusb-analyzer isn't very fast and had difficulty handling the large packet dumps. I also had to direct VMWare's log to a RAM disk as it couldn't keep up writing to my SSD. Even so, when I tried decoding some images from the dumps they had tearing problems. Here's the image captured from the AmScope program:


And here's a decoded image:


I can't remember if this image is actually from the captures or my libusb proof of concept program, but both had similar issues anyway. The white square was just for size reference when I was comparing things. The first image seems to be dark because the camera was still adjusting; if you look carefully, the image is there.

I decided to see what else was out there. When I did some quick searches a lot of people seemed to be happy with SnoopyPro, so I gave that a try. Here's a screenshot:



I found that the program wasn't very reliable and, when it was working, was only capturing control packets. The log file is also in a binary format that I didn't care to take the time to learn, so it's not ideal for quick and dirty scripting. I thought it might be related to me using a VM, but I still had issues using it on a real Windows XP system. No dice.

However, there is a derivative program called UsbSnoop that I had good results with. It creates a log file including some kernel details. Screenshot:


Sample log:

[7747 ms] >>> URB 43 going down >>>
-- URB_FUNCTION_VENDOR_DEVICE:
TransferFlags = 00000003 (USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANSFER_OK)
TransferBufferLength = 00000001
TransferBuffer = 00000000
TransferBufferMDL = fecaf148
UrbLink = 00000000
RequestTypeReservedBits = 00000000
Request = 0000000b
Value = 000001f4
Index = 00003012
[7750 ms] UsbSnoop - MyInternalIOCTLCompletion(f3898126) : fido=00000000, Irp=fed47510, Context=fecb5da0, IRQL=2
[7750 ms] <<< URB 43 coming back <<<
-- URB_FUNCTION_CONTROL_TRANSFER:
PipeHandle = fecbb1f0
TransferFlags = 0000000b (USBD_TRANSFER_DIRECTION_IN, USBD_SHORT_TRANSFER_OK)
TransferBufferLength = 00000001
TransferBuffer = 00000000
TransferBufferMDL = fecaf148
00000000: 08
UrbLink = 00000000
SetupPacket =
00000000: c0 0b f4 01 12 30 01 00

This is good as most of the Linux side programs use the VMWare mangled packets, which are technically correct but don't give you the raw application behavior. The log is also not trivial, but not terribly bad, to parse. While not the most powerful tool, it did perform basic logging very well, to the point that I was able to decode a good (no tearing) image stream from its log file.
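As an illustration of how little parsing the interesting part needs, here is a minimal Python sketch that splits the 8-byte SetupPacket from the sample log into the standard USB setup fields. The function name `decode_setup_packet` is made up for this example; the field layout (little-endian bmRequestType, bRequest, wValue, wIndex, wLength) is the standard one from the USB specification.

```python
import struct

def decode_setup_packet(raw: bytes) -> dict:
    """Split an 8-byte USB setup packet into its standard fields."""
    if len(raw) != 8:
        raise ValueError("a USB setup packet is exactly 8 bytes")
    # Multi-byte fields are little-endian.
    bm_request_type, b_request, w_value, w_index, w_length = struct.unpack("<BBHHH", raw)
    return {
        # Bit 7 of bmRequestType is the direction, bits 6..5 the request type,
        # bits 4..0 the recipient (0 means the device itself).
        "direction": "in" if bm_request_type & 0x80 else "out",
        "type": ("standard", "class", "vendor", "reserved")[(bm_request_type >> 5) & 0x3],
        "recipient": bm_request_type & 0x1F,
        "bRequest": b_request,
        "wValue": w_value,
        "wIndex": w_index,
        "wLength": w_length,
    }

# SetupPacket bytes copied from the sample log above.
setup = decode_setup_packet(bytes.fromhex("c0 0b f4 01 12 30 01 00"))
print(setup)
```

The decoded values line up with the Request, Value and Index fields that UsbSnoop prints for the same URB.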

I noticed that Wireshark had USB support (and just about every other protocol known to man...) and decided to give it a whirl. I booted up Wireshark 1.2.17. After trying some captures I noticed some read buffer overflows and poor decoding. Needless to say, I wasn't impressed with things as they were. Later I realized that this was a horribly old version, compiled 1.6.2, and got much better results. Since then I've been using it as my primary development platform. Here's a screenshot:


I was using some dumb script to convert the packets from the older Wireshark version from exported C arrays due to some issues with the other save formats. I don't think I tried pcap at that point for whatever reason. In any case, I wanted to see if I could integrate things much more smoothly.

I first looked into writing a Wireshark plugin. The most basic type is a "dissector", although I'm not really sure that's what I wanted since Wireshark could already display USB fine. I briefly looked into the export logic but it seemed more integrated into the GUI than I wanted to deal with. Additionally, I'm not sure if it could have been implemented as a plugin and if it was it would have required a lot of work. Feel free to correct me if there is a nicer export plugin interface that I missed.

My next thought was to use libpcap. While I'm sure (I really am, as you'll see...) that the C interface is fine, prototyping work is best done in a scripting language like Python. I first tried a few random Python pcap libraries but found that they tended to be Ethernet / network centric. pypcap seems to be the most popular or maybe even official interface to libpcap. Unfortunately, I couldn't get it to work without segfaulting upon opening my packets. I tried to scrub out the old pcap versions I had lying around in case it was still related to that. It has a way to regenerate the Lua bindings but I didn't want to spend too much time messing with it. In any case, I decided for the meantime to just use the C interface (through C++). The interfaces are pretty and easy to use, and I was able to write my replay generator with minimal effort. It generates lines like this:

//Generated from packet 202/203
n_rw = usb_control_msg(dev->udev, usb_rcvctrlpipe(dev->udev, 0), 0x0B, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, 0x023E, 0x305E, buff, 1, 500);
if (validate_read((char[]){0x08}, 1, buff, n_rw, "packet 202/203") < 0)
    return 1;
//Generated from packet 204/205
n_rw = usb_control_msg(dev->udev, usb_sndctrlpipe(dev->udev, 0), 0x01, USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE, 0x0003, 0x000F, NULL, 0, 500);
if (validate_write(0, n_rw, "packet 204/205") < 0)
    return 1;

Which I figured was flexible enough for most purposes depending on how someone defined validate_read / validate_write. In my old version it was generated like this:

CAMERA_CONTROL_MESSAGE(USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, 0x0B, 0x023E, 0x305E, buff, 1);
VALIDATE_READ((char[]){0x08}, 1, &buff, n_rw, "packet 84");
printk(KERN_ALERT "Generated from packet 85\n");
CAMERA_CONTROL_MESSAGE(USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE, 0x01, 0x0003, 0x000F, NULL, 0);

Which required the user to define some macros to make it useful. I figured that for most of what I did the first form was just fine, even with the implicit return. It might be a good idea to use the validate return code, but that should be an easy fix later if I care.
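The formatting step of such a generator is simple enough to sketch. The following Python is only an illustration of how one captured vendor IN control transfer could be turned into the replay lines shown earlier; the actual generator was written against the libpcap C interface, and the function name `format_control_read` and its arguments are invented for this example.

```python
def format_control_read(pkt_label, request, value, index, expected):
    """Emit C replay code for one vendor IN control transfer.

    pkt_label: text identifying the request/response packets, e.g. "202/203"
    expected:  bytes the device returned in the capture
    """
    expected_c = ", ".join("0x%02X" % b for b in expected)
    lines = [
        "//Generated from packet %s" % pkt_label,
        "n_rw = usb_control_msg(dev->udev, usb_rcvctrlpipe(dev->udev, 0), "
        "0x%02X, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, "
        "0x%04X, 0x%04X, buff, %d, 500);" % (request, value, index, len(expected)),
        'if (validate_read((char[]){%s}, %d, buff, n_rw, "packet %s") < 0)'
        % (expected_c, len(expected), pkt_label),
        "    return 1;",
    ]
    return "\n".join(lines)

# Values taken from the "packet 202/203" example above.
replay = format_control_read("202/203", 0x0B, 0x023E, 0x305E, b"\x08")
print(replay)
```

Keeping the generated code as plain function calls, rather than macros, means the output can be dropped into either a libusb test program or a kernel driver by swapping the helper definitions.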

To make the process more streamlined as I tried out different captures, my setup function looked something like this:

int replay_wireshark_setup_neo(struct uvscopetek *dev) {
    sdbg("neo replay");
    {
        #include "replay.c"
    }
    sdbg("neo replay done");
    return 0;
}

Which allowed me to easily regenerate the capture which seems to be working well enough at least for a first test. This is the program I used to decode images after cat'ing the dev node to a file. Full program here (WARNING: doesn't handle switching between VMWare and Linux well...may give you an oops).

I did two things to get familiar with linux-usb to get to this point:
  • Read the "Linux Device Drivers" (LDD) chapter on USB
  • Read through the Linux usb device (didn't pay attention to host) sources
This gave me enough knowledge to do this simple test. If you have no kernel experience you might want to read some of the earlier LDD chapters for basics on memory management. Take care that you allocate memory as needed. I forgot that URB buffers need to be allocated and wasted some time using a statically allocated buffer as a hack to try to prototype faster. The result was that data never got copied into the buffer, which was very confusing.

So far this gets us a driver that maps a simple endpoint to a device node. To move from my endpoint driver to a basic V4L2 driver I had to learn the V4L2 basics. I started with the gspca V4L2 framework since a number of cameras use it. I believe UVC is a competing V4L2 framework but I don't know much about it. Anyway, the main gspca code is only a few thousand lines and isn't terribly complex, so I recommend reading through it. One gotcha that I solved by reading through it was that the reference code I used searched in reverse for its bulk endpoint. The reverse search code skips the first bulk endpoint, since otherwise you'd want to use the forward search. This resulted in my endpoint not being detected, which was confusing until I understood the code well enough to know why it was happening. However, this didn't solve all of my problems and I did read through the V4L2 spec a bit. It's a hefty several hundred page doc, but I'm sure you'd appreciate it if you were writing a serious app. Ultimately I read through the mplayer V4L2 plugin and a sample V4L2 app to get a better idea of the userland perspective.
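To make the endpoint gotcha concrete, here is a toy Python model of the two search strategies. It is not gspca's code: the endpoint labels and both function names are invented, and it only assumes what the paragraph above describes, namely that the reverse search ignores the first bulk endpoint.

```python
BULK_IN = "bulk-in"   # hypothetical labels for endpoint types
INT_IN = "int-in"

def find_forward(endpoints, kind):
    """Return the index of the first endpoint of the given kind, or None."""
    for i, ep in enumerate(endpoints):
        if ep == kind:
            return i
    return None

def find_reverse_skip_first(endpoints, kind):
    """Return the last endpoint of the given kind, ignoring the first match."""
    matches = [i for i, ep in enumerate(endpoints) if ep == kind]
    later = matches[1:]  # the first match is left to the forward search
    return later[-1] if later else None

# A device with a single bulk endpoint, as in the situation described above.
single = [INT_IN, BULK_IN]
# A device with two bulk endpoints.
double = [BULK_IN, INT_IN, BULK_IN]

print(find_forward(single, BULK_IN), find_reverse_skip_first(single, BULK_IN))
print(find_forward(double, BULK_IN), find_reverse_skip_first(double, BULK_IN))
```

With only one bulk endpoint the reverse search comes back empty, which matches the "endpoint not detected" symptom.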

After this I was able to convert the code over to use V4L2. Here's the V4L2 driver after I cleaned it up to remove some debug prints. Full WIP source here. First image it gave:


I have no idea why it started at 8 hours. Note that the colors on the right are correct. This seems to indicate there may be a sync column there since an even number of bytes in a bayer filter should not change on image boundaries. I haven't figured out how to sync frames yet.

I tried a variety of programs to display the stream. When I did a quick Google search for "linux usb camera" or something of that sort I got an Ubuntu page with some good resources. The first program on the list, Cheese, crashed with SIGSEGV on my driver. Not helpful. I found that mplayer was good for playing around with saved captures and vlc tended to be better for live captures. mplayer in particular exposes a lot of encoding options.

One mistake I made (but fortunately was able to realize very quickly) was that I tried to decode captures from "cat /dev/video0 >video0.bin". Why is this a bad idea? Well, let's look at some data first to see what my first heads up was. Here is an image taken through AmScope:


Here is my crude Bayer filter running on a capture (cat /dev/uvscopetek0 >uvscopetek0.bin) from my bulk endpoint device (i.e. no V4L involved, just the USB bulk endpoint mapped to /dev/uvscopetek0):


Obviously I still need to do some work to improve brightness or whatever. Anyway, here's the same decode program run on "cat /dev/video0 >video0.bin" output:


Hmm, it doesn't look very good and was also going slowly. Why is that? I then remembered two key pieces of information from earlier:
  • cat reads a fixed, lowish number of bytes (on the order of 1024) at a time, as seen from the read() requested size in a kernel debug print
  • V4L2 sends out one frame per read() call
The end result is that it kept reading the beginning of each frame and throwing away the rest. This also explains why we keep seeing the top of the image above.
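The effect is easy to reproduce with a toy model. The Python below is a sketch under the assumptions stated above (one frame per read(), anything that doesn't fit in the caller's buffer is lost); `FramePerReadDevice`, `capture` and the 4096-byte frame size are invented for the illustration and are not part of any real API.

```python
class FramePerReadDevice:
    """Toy stand-in for a device node that hands out one frame per read()."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self, size):
        if not self._frames:
            return b""  # end of the recorded frames
        frame = self._frames.pop(0)
        # Only the first `size` bytes reach the caller; the rest is dropped.
        return frame[:size]

def capture(device, read_size):
    """Read until the device is exhausted, like cat does with a fixed buffer."""
    chunks = []
    while True:
        chunk = device.read(read_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

FRAME_SIZE = 4096  # hypothetical frame size, chosen only for the demo
frames = [bytes([n]) * FRAME_SIZE for n in range(3)]

small = capture(FramePerReadDevice(frames), 1024)       # cat-like small reads
full = capture(FramePerReadDevice(frames), FRAME_SIZE)  # one whole frame per read
print(len(small), len(full))
```

With 1024-byte reads the output contains only the first quarter of each frame back to back, which is why a decoder fed that file keeps drawing the top of the image. Reading with a buffer at least as large as a frame avoids the truncation.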


Here are some shots showing the result of shifting the byte offset and how it affects the color for my crude decoder:



The take-away is that the image still looks reasonable but the colors shift. Here is a true pixel position plot:

I was able to identify the correct decoding scheme by shifting around the red/blue/green positions until they matched the yellow, red, green colors on the wires.
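Why a byte shift changes the colors but not the shapes can be seen from how a Bayer decoder assigns a color to each raw byte. The sketch below assumes an RGGB 2x2 tile and an 8-byte row purely for illustration; the camera's real layout is whatever the trial-and-error above found, and `channel_at` is a made-up helper.

```python
# Assumed 2x2 Bayer tile: row 0 is R G, row 1 is G B (an assumption, not the
# camera's confirmed layout).
PATTERN = (("R", "G"), ("G", "B"))

def channel_at(byte_index, width, offset=0):
    """Color a decoder assigns to a raw byte if it treats `offset` as pixel 0."""
    pos = byte_index - offset
    if pos < 0:
        raise ValueError("byte lies before the decoder's start offset")
    row, col = divmod(pos, width)
    return PATTERN[row % 2][col % 2]

WIDTH = 8  # hypothetical row length in bytes (even, as for a Bayer sensor)

true_colors = [channel_at(i, WIDTH) for i in range(2, 6)]
shift_by_one = [channel_at(i, WIDTH, offset=1) for i in range(2, 6)]
shift_by_two = [channel_at(i, WIDTH, offset=2) for i in range(2, 6)]
print(true_colors, shift_by_one, shift_by_two)
```

An odd shift swaps which bytes are treated as red/blue and which as green, so neighbouring intensities stay in place but land in the wrong channel. An even shift within a row keeps the channel assignment and just moves the picture sideways.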


One of the next challenges was to find the frame sync. I looked very carefully at the images and noticed that they got darker around the upper and left edges. Hmm. A color capture hexdump is reasonably noisy:

0000ed80 0a 08 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000ed90 0a 0a 0a 0a 0a 0a 0a 0b 0a 0b 0a 0a 0a 0a 0a 0a |................|
0000eda0 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000edb0 0a 0b 0b 0c 0c 0d 0c 0d 0c 0d 0b 0b 0b 0b 0b 0b |................|
0000edc0 0b 0c 0c 0d 0c 0d 0c 0d 0c 0d 0c 0c 0b 0b 0b 0b |................|
0000edd0 0a 0b 0a 0a 0a 0b 0a 0a 0a 0a 0a 0b 0a 0b 0a 0a |................|

But if you take away all of the light:

0000ed80 0a 08 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000ed90 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
*
0000f000 0a 08 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000f010 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
*
0000f280 0a 08 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000f290 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
*
0000f500 0a 08 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|
0000f510 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a 0a |................|

You get something much clearer. I'm not sure why I get lots of 0x0a...still looking into that.
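One thing the dark dump does show is that the 0a 08 pair repeats at a fixed spacing. A small Python sketch of that measurement follows; `find_marker_offsets` is a made-up helper, and the synthetic buffer is only modelled on the hexdump, not real capture data.

```python
def find_marker_offsets(data, marker):
    """Return every offset at which `marker` occurs in `data`."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets

# Offsets of the "0a 08" lines in the dark hexdump above.
dump_offsets = [0xED80, 0xF000, 0xF280, 0xF500]
dump_gaps = [b - a for a, b in zip(dump_offsets, dump_offsets[1:])]
print(dump_gaps)

# Synthetic dark capture modelled on the dump: each 0x280-byte run starts
# with 0a 08 and is otherwise all 0x0a.
RUN = 0x280
data = (b"\x0a\x08" + b"\x0a" * (RUN - 2)) * 4
offsets = find_marker_offsets(data, b"\x0a\x08")
gaps = [b - a for a, b in zip(offsets, offsets[1:])]
print(offsets, gaps)
```

The spacing in the dump is 0x280 = 640 bytes every time. That would be consistent with a marker at the start of each 640-byte row, though that reading is a guess from four samples rather than something the capture proves.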

Sunday, March 21, 2010

Linux boot: playing with initrd (initial RAM disk)

This is a tutorial on how initrd works from a user's perspective. I'm experimenting with using blogspot to post tutorials. I'm going to post this in several stages and edit as I go. The example was run on a VM running a custom compiled CentOS 5 Linux 2.6.18 kernel. This contains Red Hat Linux specific stuff, but is probably similar to other desktop/server Linux distros.
When your Linux system boots, it goes through a number of stages. From the very low level, we typically start by pressing the button to first go through a power on self test (POST). Next, the system loads some BIOS firmware so we can get a baseline for the hardware supported. Then, we figure out what device to boot to, usually a hard drive. If you have GRUB or another bootloader, these small applications know what you have on your disks and help you select and boot whichever operating system suits you. Linux then begins to load. Here is a screen shot of what early Linux boot looks like to about where it starts poking around for hard drives:
The "Booting 'CentOS (2.6-18.8-UV5)'" line gives the GRUB human friendly name right before it actually boots our kernel. After a few other boot debug statements, we see that it uncompresses the kernel and boots. Finally, we see "Red Hat nash version 5.1.19.6 starting." This tutorial is about what happens between those two messages and how we get to nash.
In the old days (and on embedded systems), booting was simple. You have a hard drive and a kernel. The kernel had a driver so it never had issues knowing how to handle the hard drive it was on or needed to boot to. But, as systems have grown, the kernel doesn't typically bundle all of the drivers in it. For booting, the most critical are file system drivers and hard drive controllers. So you compile a kernel module for your hard drive controller and put it on the hard drive. So we just need to get it off of the hard drive at boot so we can load the driver...no wait.... So we have a chicken and the egg problem: we need the hard drive driver to load the hard drive. initrd does the magic to solve this.
Let's take a look at what's in /boot. On my machine:
[mcmaster@gespenst initrd]$ ls -lh /boot/
(abbreviated)
total 49M
drwxr-xr-x 2 root root 1.0K Nov 12 21:18 extlinux
drwxr-xr-x 2 root root 1.0K Sep 3 07:30 grub
-rw------- 1 root root 3.0M Jan 14 2009 initrd-2.6.18.8-UV4.img
drwx------ 2 root root 12K Aug 4 2008 lost+found
-rw-r--r-- 1 root root 79K Mar 12 2009 message
-rw-r--r-- 1 root root 923K Jan 14 2009 System.map-2.6.18.8-UV4
-rw-r--r-- 1 root root 2.1M Jan 14 2009 vmlinuz-2.6.18.8-UV4
In order to boot, we need two key items: a kernel and some data to put into it. vmlinuz is the Linux image. The "z" in its name indicates that the image is compressed; it decompresses itself at boot. Other compression algorithms are also supported. The data is in the initrd image. It's an INITial Ram Disk for us to boot to. Let's see what's in it. Set up a sandbox directory somewhere on your system to play in:
[mcmaster@gespenst ~]$ mkdir ~/buffer/initrd
[mcmaster@gespenst ~]$ cd ~/buffer/initrd/
And copy over initrd so we can mess with it (on my system they were readable only by root):
[mcmaster@gespenst initrd]$ sudo cp /boot/initrd-2.6.18.8-UV4.img .
[mcmaster@gespenst initrd]$ sudo chown mcmaster:mcmaster initrd-2.6.18.8-UV4.img
What does trusty ol' "file" say about it?
[mcmaster@gespenst initrd]$ file initrd-2.6.18.8-UV4.img
initrd-2.6.18.8-UV4.img: gzip compressed data, from Unix, last modified: Wed Jan 14 09:52:20 2009, max compression
Ah ha! So let's uncompress it. Note that gunzip handles more than just gzip format files and looks at the file name suffix, so it gets angry if the name doesn't end in something standard. Either rename it to something ending in .gz or pipe it:
[mcmaster@gespenst initrd]$ cat initrd-2.6.18.8-UV4.img |gunzip >initrd-2.6.18.8-UV4
What did we get?
[mcmaster@gespenst initrd]$ file initrd-2.6.18.8-UV4
initrd-2.6.18.8-UV4: ASCII cpio archive (SVR4 with no CRC)
A cpio archive. And I always thought the cpio command was useless. What's in the box?
[mcmaster@gespenst initrd]$ cpio --verbose -t <initrd-2.6.18.8-UV4
(abbreviated)
lrwxrwxrwx 1 root root 3 Jan 14 2009 sbin -> bin
drwx------ 3 root root 0 Jan 14 2009 lib
-rw------- 1 root root 27840 Jan 14 2009 lib/dm-mirror.ko
-rw------- 1 root root 37268 Jan 14 2009 lib/ehci-hcd.ko
...
-rw------- 1 root root 159732 Jan 14 2009 lib/scsi_mod.ko
drwx------ 3 root root 0 Jan 14 2009 dev
crw------- 1 root root 4, 67 Jan 14 2009 dev/ttyS3
crw------- 1 root root 1, 5 Jan 14 2009 dev/zero
crw------- 1 root root 4, 10 Jan 14 2009 dev/tty10
drwx------ 3 root root 0 Jan 14 2009 etc
drwx------ 2 root root 0 Jan 14 2009 etc/lvm
-rw------- 1 root root 15911 Jan 14 2009 etc/lvm/lvm.conf
-rwx------ 1 root root 2354 Jan 14 2009 init
drwx------ 2 root root 0 Jan 14 2009 sys
drwx------ 2 root root 0 Jan 14 2009 proc
drwx------ 2 root root 0 Jan 14 2009 bin
-rwx------ 1 root root 852164 Jan 14 2009 bin/kpartx
-r-x------ 1 root root 1464040 Jan 14 2009 bin/lvm
-rwx------ 1 root root 2381980 Jan 14 2009 bin/nash
lrwxrwxrwx 1 root root 10 Jan 14 2009 bin/modprobe -> /sbin/nash
-rwx------ 1 root root 1038596 Jan 14 2009 bin/dmraid
-rwx------ 1 root root 470244 Jan 14 2009 bin/insmod
drwx------ 2 root root 0 Jan 14 2009 sysroot
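The same two steps (gunzip, then list the cpio archive) can be sketched in Python to show what the format looks like underneath. This is a minimal illustration that builds a tiny in-memory stand-in archive instead of reading the real image; the helper names `newc_entry` and `list_newc` are made up, and it only handles the SVR4 "newc" format without CRC (magic 070701) that file reported.

```python
import gzip

def newc_entry(name, data=b"", mode=0o100644):
    """Build one entry of an SVR4 'newc' cpio archive (magic 070701)."""
    name_bytes = name.encode() + b"\0"
    # 13 header fields, each written as 8 hex digits: ino, mode, uid, gid,
    # nlink, mtime, filesize, devmajor, devminor, rdevmajor, rdevminor,
    # namesize, check.
    fields = [0, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0]
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * (-len(out) % 4)   # header + name is padded to 4 bytes
    out += data
    out += b"\0" * (-len(out) % 4)   # file data is padded to 4 bytes
    return out

def list_newc(archive):
    """Return (name, size) for each entry up to the TRAILER!!! marker."""
    entries = []
    pos = 0
    while True:
        header = archive[pos:pos + 110].decode("ascii")
        if header[:6] != "070701":
            raise ValueError("not a newc cpio header at offset %d" % pos)
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        file_size, name_size = fields[6], fields[11]
        name_start = pos + 110
        name = archive[name_start:name_start + name_size - 1].decode()
        if name == "TRAILER!!!":
            return entries
        entries.append((name, file_size))
        data_start = (name_start + name_size + 3) & ~3
        pos = (data_start + file_size + 3) & ~3

# Tiny stand-in for an initrd image: two files plus the trailer, gzipped.
archive = (newc_entry("init", b"#!/bin/nash\n", 0o100700)
           + newc_entry("lib/scsi_mod.ko", b"\0" * 10)
           + newc_entry("TRAILER!!!", mode=0))
image = gzip.compress(archive)

listing = list_newc(gzip.decompress(image))
print(listing)
```

To try it on a real image, replace `image` with the bytes read from a copy of the initrd file.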
Be careful extracting: there are some device nodes in there, which can have odd repercussions if created without need. Chances are you don't really want those created upon extraction. Just run as a normal user and creating them will harmlessly fail. Let's extract it:
[mcmaster@gespenst initrd]$ mkdir decompressed
[mcmaster@gespenst initrd]$ cd decompressed/
[mcmaster@gespenst decompressed]$ cpio -i <../initrd-2.6.18.8-UV4
cpio: dev/ttyS3: Operation not permitted
cpio: dev/zero: Operation not permitted
...
cpio: dev/tty10: Operation not permitted
13718 blocks
[mcmaster@gespenst initrd]$ ls -lh decompressed/
total 32K
drwx------ 2 mcmaster mcmaster 4.0K Dec 30 18:03 bin
drwx------ 3 mcmaster mcmaster 4.0K Dec 30 18:03 dev
drwx------ 3 mcmaster mcmaster 4.0K Dec 30 18:03 etc
-rwx------ 1 mcmaster mcmaster 2.3K Dec 30 18:03 init
drwx------ 3 mcmaster mcmaster 4.0K Dec 30 18:03 lib
drwx------ 2 mcmaster mcmaster 4.0K Dec 30 18:03 proc
lrwxrwxrwx 1 mcmaster mcmaster 3 Dec 30 18:03 sbin -> bin
drwx------ 2 mcmaster mcmaster 4.0K Dec 30 18:03 sys
drwx------ 2 mcmaster mcmaster 4.0K Dec 30 18:03 sysroot
As non-root we couldn't make devices, so those entries failed.
But that's okay, we just want to look at the normal files and know that those devices would have existed.
So, this is the initial filesystem loaded onto your box. What happens now is that the file "init" is run. I believe this is hard coded in the kernel. Let's see what's in there:
#!/bin/nash

mount -t proc /proc /proc
setquiet
echo Mounting proc filesystem
echo Mounting sysfs filesystem
mount -t sysfs /sys /sys
echo Creating /dev
mount -o mode=0755 -t tmpfs /dev /dev
mkdir /dev/pts
mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
mkdir /dev/shm
mkdir /dev/mapper
echo Creating initial device nodes
mknod /dev/null c 1 3
mknod /dev/zero c 1 5
mknod /dev/systty c 4 0
...
mknod /dev/ttyS3 c 4 67
echo Setting up hotplug.
hotplug
echo Creating block device nodes.
mkblkdevs
echo "Loading ehci-hcd.ko module"
insmod /lib/ehci-hcd.ko
...
echo "Loading dm-snapshot.ko module"
insmod /lib/dm-snapshot.ko
echo Waiting for driver initialization.
stabilized --hash --interval 250 /proc/scsi/scsi
mkblkdevs
echo Scanning and configuring dmraid supported devices
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
resume /dev/VolGroup00/LogVol01
echo Creating root device.
mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
echo Mounting root filesystem.
mount /sysroot
echo Setting up other filesystems.
setuproot
echo Switching to new root and running init.
switchroot
It's a script executed by the "nash" interpreter. And now you know where /dev/null and some other important devices come from. Basically, we mount some of the basic kernel information pseudo-filesystems (procfs, sysfs) and create important device nodes such as /dev/null, without which some programs will freak out. Next, we load the device drivers we packed into the initrd image. They can be seen in /lib. If you are using LVM, it will then scan for LVs now that we have the bootstrap filesystem and device drivers loaded.
Now here's my favorite part. We just mounted some bogus filesystem on /. We need to trash that so we can put our real filesystem there. This is summarized in the last part of this script:
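The numbers after "c" in the mknod lines are the major and minor device numbers that tell the kernel which driver backs the node. As a small aside, this Python sketch packs the pair from "mknod /dev/null c 1 3" and then inspects the /dev/null of whatever system it runs on; the 1, 3 pair is the Linux assignment and other Unix systems may differ.

```python
import os
import stat

# Pack the major/minor pair used by "mknod /dev/null c 1 3" into one
# device number and unpack it again.
null_dev = os.makedev(1, 3)
print(os.major(null_dev), os.minor(null_dev))

# Inspect the real node on this system. st_rdev holds the device number of
# a device node; on Linux this normally prints True 1 3.
st = os.stat("/dev/null")
print(stat.S_ISCHR(st.st_mode), os.major(st.st_rdev), os.minor(st.st_rdev))
```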
echo Creating root device.
mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
echo Mounting root filesystem.
mount /sysroot
echo Setting up other filesystems.
setuproot
echo Switching to new root and running init.
switchroot
We mount our real root at some random mount point, /sysroot. Hmm, that's not quite right, but we're getting closer. And then after a bit of prep, switchroot? What magic is that? As it turns out, there is a special black magic system call, pivot_root(2). From the pivot_root(2) man page:
int pivot_root(const char *new_root, const char *put_old);

DESCRIPTION
pivot_root() moves the root file system of the current process to the directory put_old and makes new_root the new root file system of the current process.
Fancy. Once that's done, we are pretty much set. We have our expected filesystem mounted. The last thing to do is to call /sbin/init, and we move on to our normal system startup.
Hopefully this gave you some background on how we get from GRUB to our normal boot. Any comments, suggestions, corrections, etc are most welcome!