By Kurt Hopfer, Cloud Architect at Effectual
In 2017, the Defense Advanced Research Projects Agency (DARPA) engaged research and development firm Galois to lead the BESSPIN project (Balancing Evaluation of System Security Properties with Industrial Needs) as part of its System Security Integrated through Hardware and Firmware (SSITH) program.
The objective was to develop tools and techniques to measure the effectiveness of SSITH hardware security architectures, as well as to establish a set of “baseline” Government Furnished Equipment (GFE) systems-on-chip (SoCs) without hardware security enhancements.
While Galois’s initial work on BESSPIN was carried out entirely using on-premises FPGA resources, the pain points of scaling out to a secure, widely-available bug bounty program soon emerged.
It was clear that researchers needed to be able to stress test SSITH hardware platforms without having to acquire their own dedicated hardware and infrastructure. Galois leveraged Amazon EC2 F1 instances to scale infrastructure, increase efficiencies, and accelerate FPGA development.
The company then engaged AWS Premier Consulting Partner Effectual to ensure a secure and reliable AWS environment, as well as to develop a serverless web application that allowed click-button FPGA SoC provisioning to red team researchers for the different processor variants.
The result was DARPA’s first public bug bounty program—Finding Exploits to Thwart Tampering (FETT).
Galois is a privately held U.S.-owned and -operated firm specializing in the research and development of trustworthy computing systems and associated new technologies.
The company partners with academic researchers to support the R&D efforts of key government agencies such as DARPA, the Department of Defense (DoD), NASA, the Department of Energy, the Department of Homeland Security, and the intelligence community.
Effectual is an AWS Premier Consulting Partner and cloud-first managed and professional services company working with enterprise and public sector customers to mitigate risk and enable modernization.
The company’s team of expert technologists applies proven methodologies across Amazon Web Services (AWS) and VMware Cloud on AWS to solve business challenges with modernization services, including strategy and ideation, migration, application development, and modern cloud management.
The goal of SSITH is to develop hardware security architectures and associated design tools to protect systems against classes of hardware vulnerabilities exploited through software in DoD and commercial electronic systems.
The BESSPIN project is focused on developing tools and techniques to measure the effectiveness of SSITH hardware security architectures, as well as a set of “baseline” GFE SoCs without hardware security enhancements.
The innovation and impact of secure hardware developed by other SSITH performers, like SRI/Cambridge and Lockheed Martin, are then measured against these baseline systems.
Migration Planning for Security at Scale with AWS
Galois’s initial work on BESSPIN was carried out entirely using on-premises FPGA resources. This included developing the GFE; developing measurement techniques and tools for evaluating secure hardware; and case studies of applications on secure hardware devices, such as the “smart ballot box” Galois brought to DEF CON in 2019.
Though the limited size and scope of initial development on-premises worked reasonably well, the pain points of scaling out to a secure, widely-available bug bounty program soon emerged.
Primarily, there was a clear need for a more flexible infrastructure that would not require in-house maintenance. Galois had previously deployed and tracked expensive hardware for all of the SSITH performers so that everyone would have identical infrastructure with which to work.
Figure 1 – Initial cloud architecture concept.
However, there was a 6-8 week lead time for scaling up development and acquiring additional FPGA development boards. In addition, making them available for a wider bug bounty program would require maintaining considerable internal infrastructure to host the FPGA development platform.
Advantages of Using Amazon EC2 F1 Instances
To address these issues, Galois analyzed the benefits of using Amazon EC2 F1 instances instead of an on-premises environment for the project. The company quickly concluded that building on easy-to-use F1 instances would improve efficiency, increase collaboration, and speed up deployment.
Using F1 instances allowed Galois to use Xilinx UltraScale+ FPGAs, exactly as in the SSITH GFE. This provided good equivalents for all of the GFE system devices that Galois wanted to include in the cloud deployment, as well as the necessary development tool licenses.
F1 instances also eliminated the need to synchronize access to a small set of FPGA developer boards (and the systems containing them). This enabled greater flexibility and concurrent development.
Utilizing AWS CodePipeline, Galois was able to 1) efficiently test infrastructure functionality with several different SSITH processor configurations that were a part of the bug bounty program; and 2) deploy a customized version of the GFE that could run with both baseline processors and SSITH-protected processors.
For the half of the processors implemented in the Bluespec SystemVerilog (BSV) hardware description language (HDL), Galois worked with Bluespec and the MIT and SRI/Cambridge teams to implement a design and deployment environment that would work equally well on the FPGA development boards and in F1 instances.
For the other half of the processors that are implemented in the Chisel HDL, Galois used FireSim as a deployment and execution environment. FireSim is an open-source computer system simulation platform from UC Berkeley’s Architecture Research Group designed specifically for F1 instances.
Expanding Red Team Testing with a Scalable Cloud-Native Solution
For the next phase of the project, Galois needed to expand its red team testing to a broader community of researchers, while ensuring the security and integrity of the new AWS environment.
The company contracted Effectual, a cloud-first managed and professional services company, to develop a highly scalable cloud-native solution.
With deep expertise across AWS for commercial enterprises and the public sector, the Effectual team was tasked with designing, developing, and integrating a cloud-native solution with F1 instances bootstrapped with project SSITH SoCs, and provisioning them at the click of a button from a fully serverless web app portal.
To accomplish this, Effectual needed to meet the following requirements:
- Provide AWS development environments to several autonomous teams composed of 100+ developers and researchers with unlimited access to FPGAs depending on role.
- Transform one-off, on-premises SSITH processors to an on-demand and scalable solution in the cloud for 200 concurrent FPGAs, ensuring a high availability of F1 instances.
- Refactor existing on-premises GitLab CI/CV to AWS CodePipeline, leveraging on-demand F1 instances and running appropriate regression tests for each processor variant.
- Automate provisioning of on-demand researcher F1 instances and bootstrap the host operating system (OS) and FPGA SoC while enabling communication by means of a message broker with the portal and comprehensive, real-time logging.
- Architect an environment that would remain stable and reliable as user traffic increased, and provide easy access for troubleshooting issues.
Taking the Use Case to Scale
Using AWS services and tools, Effectual was able to take the Galois use case quickly to scale and deliver a seamless web app portal for bug bounty researchers to use.
To begin, the team developed a multi-account structure using AWS Control Tower to ensure proper delineation and security guardrails were in place between different development environments.
Figure 2 – AWS Control Tower architecture.
Next, Effectual designed and implemented two separate architectures:
Supporting Cloud Infrastructure
This consisted of a multi-account, hub and spoke network implemented with AWS Control Tower. The AWS environment ensured proper guardrails existed between development, QA, and production resources.
Figure 3 – Researcher AWS Environment created dynamically by the FETT Portal application.
Specifically, a multi-region environment was deployed in the production account, leveraging AWS Transit Gateway for inter-region connectivity. This gave researchers direct, stable, and secure access to the FPGA SoCs.
Fully Serverless React Web Application
This was hosted in Amazon Simple Storage Service (Amazon S3) and Amazon CloudFront with a backend API written in Node.js on AWS Lambda interfacing with Amazon Aurora Serverless. This provided seamless scalability up and down during the FETT bug bounty program and beyond.
Figure 4 – AWS architecture of the serverless FETT web portal application.
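The portal's backend API was written in Node.js. The following is only a minimal Python sketch of how a provisioning request handler on AWS Lambda might validate a request before any F1 instance is launched. The variant names, the request and response fields, and the use of an API Gateway proxy-style event are all assumptions made for illustration, not details from the FETT portal.

```python
import json

# Hypothetical processor variant identifiers; the real portal's names are not given here.
ALLOWED_VARIANTS = {"baseline-p1", "ssith-p1", "baseline-p2", "ssith-p2"}


def _response(status_code, payload):
    """Build a proxy-style HTTP response with a JSON body."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context=None):
    """Validate a provisioning request and acknowledge it.

    Assumes the event carries the request as a JSON string under "body".
    """
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "request body is not valid JSON"})

    if not isinstance(body, dict):
        return _response(400, {"error": "request body must be a JSON object"})

    variant = body.get("variant")
    if variant not in ALLOWED_VARIANTS:
        return _response(400, {"error": "unknown processor variant"})

    # A real backend would record the request and start provisioning here;
    # this sketch only acknowledges that the request was accepted.
    return _response(202, {"variant": variant, "status": "queued"})
```

Rejecting unknown variants before any infrastructure is touched keeps malformed requests cheap, which matters when each accepted request results in a dedicated FPGA instance.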
Upon completion, a status message was communicated back to the web portal by means of a message broker. Once bootstrapping finished, the Effectual team mapped an EC2 secondary IP address one-to-one to the SoC, allowing all TCP and UDP port-specific packets to traverse the FPGA PCIe interface unfettered.
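As a rough illustration of the status message mentioned above, the sketch below serializes a provisioning status, including the IP address mapped to the SoC, into JSON that could be published to a message broker. The field names, state values, and example identifiers are hypothetical; the post does not describe the actual message format or broker.

```python
import ipaddress
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvisioningStatus:
    """Hypothetical status record for one researcher instance."""

    instance_id: str  # EC2 instance ID of the F1 host
    variant: str      # processor variant running on the FPGA SoC
    soc_ip: str       # secondary IP address mapped one-to-one to the SoC
    state: str        # for example "ready" or "failed" (assumed values)

    def to_message(self) -> str:
        """Serialize to a JSON string, rejecting malformed IP addresses."""
        ipaddress.ip_address(self.soc_ip)  # raises ValueError if invalid
        return json.dumps(asdict(self), sort_keys=True)
```

A caller would construct the record once bootstrapping finishes, for example `ProvisioningStatus("i-0123456789abcdef0", "ssith-p1", "10.0.1.25", "ready").to_message()`, and hand the resulting string to whatever broker client is in use.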
With further integration using Amazon CloudWatch agents on all F1 instances alongside rsyslog on the SoCs, Effectual was able to capture real-time logs and telemetry. This information was critical in debugging discovered exploits that could potentially deadlock or crash the SoC.
Effectual also built two pipelines. The first was a CI/CD pipeline that complemented the serverless web application and was attached to its development, QA, and production branches.
Figure 5 – FETT portal serverless CI/CD pipeline.
The second was a CI/CV pipeline to replace GitLab regression test runners. Any new pull request to the development branch triggered CodePipeline to boot all processor variants using AWS CloudFormation and run regression tests on the SoC.
Figure 6 – Custom CI/CV pipeline for FETT target application pull requests.
These regression test responses were communicated back to AWS CodePipeline by means of a custom action. The GitHub pull request was then updated to include information regarding the status and outcome of the regression tests.
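One way to picture the reporting step is a small function that folds per-variant regression results into a single status and description for the pull request. This is a sketch under assumptions: the `RegressionResult` fields, the "success"/"failure" labels, and the summary wording are invented for illustration and do not reflect the custom action Effectual built.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RegressionResult:
    """Hypothetical outcome of one regression run on one processor variant."""

    variant: str
    passed: int
    failed: int


def summarize(results: Iterable[RegressionResult]) -> Tuple[str, str]:
    """Return (state, description) suitable for a pull request status."""
    results = list(results)
    if not results:
        # Treat a missing report as a failure rather than a silent pass.
        return ("failure", "no regression results were reported")

    failing = sorted(r.variant for r in results if r.failed > 0)
    if failing:
        return ("failure", "regressions failed on: " + ", ".join(failing))

    total = sum(r.passed for r in results)
    return ("success", f"{total} tests passed on {len(results)} variants")
```

Treating an empty result set as a failure is a deliberate choice in this sketch: a pipeline that boots every variant but reports nothing has most likely broken before the tests ran.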
The result of these efforts was DARPA’s first public bug bounty program—Finding Exploits to Thwart Tampering (FETT). This enabled research teams working under SSITH to improve their hardware defenses by addressing any discovered weaknesses or bugs.
Leveraging Amazon EC2 F1 instances allowed Galois and Effectual to easily test and red team the RISC-V processors in a stable, reliable environment with the ability to scale FETT’s infrastructure, increase efficiencies, and accelerate FPGA development.
The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.
Effectual – AWS Partner Spotlight
Effectual is an AWS Premier Consulting Partner with the experience, expertise, and ability to execute modern strategies with app development, migration, and cloud management.