Many recent advances in the scale, cost, and connectivity of hardware have brought about the era of the Internet of Things (IoT), in which numerous objects in our everyday lives now contain networked computing capabilities.
Most notably, this has brought with it an astonishing array of new device market sectors, form factors, and use cases, which purport to make our lives easier, simpler, and safer. However, these new network-connected devices have also opened a massive new attack surface for adversaries to exploit and created a challenging landscape for defenders, which could undermine these benefits.
Unfortunately for analysts, these new ubiquitous, low-power embedded systems increasingly use \emph{monolithic firmware images}, in which code, libraries, and data are intermixed, without a conventional operating system or the metadata needed by third-party analyses. This combines with the extreme hardware-software coupling found in firmware to create a complex software environment that is extremely difficult to model for the purposes of conventional program analyses. This has created two significant gaps in the vulnerability discovery lifecycle: modeling the execution environment, and patching vulnerabilities, even without a manufacturer's help. As a result, devices with monolithic firmware have been largely ignored by academia and industry thus far.
In this dissertation, we will showcase novel techniques to help bridge the gaps in analysis capabilities between traditional programs and the monolithic firmware of deeply-embedded systems. To overcome the environment modeling problem, we will focus on re-hosting, the act of transferring a program from one execution environment to another, typically from a hardware environment to an emulated one. Re-hosting is an important prerequisite to the fuzzing or symbolic execution used for vulnerability discovery, as it allows execution environments to be freely copied and scaled. We will propose two techniques: an automated approach based on observing and modeling the original device's hardware, and a semi-automatic approach based on abstracting away and modeling parts of the firmware itself. We will show how these techniques allow us to re-host many firmware images, and how they can be directly used for security analyses to find both synthetic and previously-undiscovered real-world vulnerabilities.
Finally, we will address the issue of patching monolithic firmware. While numerous steps are required for an analyst to produce a final patched firmware image, we focus on automating three critical steps: finding sources of attacker-controlled input, finding a safe location to insert a payload, and locating self-checks intended to thwart modification. We combine these techniques into a system able to produce modified firmware, and use it to correct serious safety-critical issues in three products from the medical, industrial automation, and engineering sectors.
Through both re-hosting and patching, this work completes the vulnerability lifecycle for embedded devices, and helps make our connected world safer.