Lamm Consulting AB

Sibyllegatan 50 114 43 Stockholm

Hello SBOM

Introduction

The process of building software can be compared to a factory assembly line, where the product sits on top of the belt and the toolchain is the assembly line itself.

Let’s make it easy by keeping the product to a few lines of source code, so we can focus on the underlying assembly line (toolchain and platform).

Hello world is the most famous C program, but in the era of the software bill of materials (SBOM) it raises several challenges:

  • Include directives
  • The C compiler
  • C library
  • Linux Kernel

Hello SBOM World

This is the product

#include <stdio.h>
int main(void) 
{ 
 printf("Hello sbom\n"); 
}

“Hello World by itself is not an application that handles user data or performs complex tasks, so there are usually no security concerns associated with it.” (ChatGPT)

In AI we trust, so let’s assume that Hello SBOM is bug-free. The next step is to generate a binary, and for that we need a toolchain.

Toolchain

The GCC compiler is the most important part of the assembly line. It is also a very complex piece of software and therefore has vulnerabilities and security issues of its own. Most distributions, such as Ubuntu, Red Hat and Debian, provide their own compiler, but you can also build your own from the ground up with Crosstool-NG.

Building software with GCC involves several steps.

Step                Option    Comment
Preprocessing       -E        #include <stdio.h> is replaced with the content of /usr/include/stdio.h
Generate assembly   -S        Output a text file with assembly code (with -fverbose-asm, recent GCC versions add the C source lines as comments)
Optimize            -O        Optimize the execution time and size of the final binary
Statically linked   -static   The binary interfaces directly with the kernel

Preprocessor

The preprocessor is basically a text processor with the capability of including other files. It also offers simple programming support in the form of macros and conditionals.

gcc -std=c89  -E -o bom.txt sbom.c

Depending on which version of the C standard is selected, different header content is included.

gcc -std=c99  -E -o bom99.txt sbom.c
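The effect of the -std option can also be observed from inside a program. The following is a minimal sketch that uses preprocessor conditionals on the standard __STDC_VERSION__ macro. The function name c_standard_name is made up for this illustration.

```c
/* Hypothetical helper: report which C standard the compiler was told to use.
 * __STDC_VERSION__ is not defined at all in strict C89/C90 mode. */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C89/C90";
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C94 (Amendment 1)";
#endif
}
```

Called from main and printed, this should report "C89/C90" when built with gcc -std=c89 and "C99" when built with gcc -std=c99, since the preprocessor selects a different branch in each case.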

Preprocessor output

Difference between C89 and C99

Compile

Compile the source code into a binary:

gcc  sbom.c -lc -o sbom

Optimize

However, there are gazillions of parameters that control how a binary is generated. Two frequently used flags are -O<level> for optimization and -S for generating human-readable assembly code. The latter is very important for verifying that private data and keys are handled correctly.

gcc -O0 -S -o bom.s sbom.c
gcc -O2 -S -o bom.ss sbom.c
diff bom.s bom.ss > bom.s.diff.txt
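One reason to read the generated assembly is that the optimizer may remove code that has no observable effect, for example clearing a key buffer just before it goes out of scope. The sketch below shows one common workaround, writing through a volatile pointer. The name secure_wipe is invented for this example, and whether a plain memset is actually dropped depends on the compiler version and flags, so the .s file remains the thing to check.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite a buffer with zeros. Every store goes
 * through a volatile-qualified pointer, which compilers treat as an
 * observable side effect and therefore do not optimize away. */
void secure_wipe(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;

    while (len-- > 0) {
        *p++ = 0;
    }
}
```

Comparing the assembly at -O0 and -O2 for a function that fills a local key buffer and then wipes it is a practical way to confirm that the wipe survives optimization.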

Assembler output

Difference when optimization is applied

Linking

Static linking

gcc  sbom.c -lc -static -o sbom

Dynamic linking

gcc  sbom.c -lc -o sbom

Type of linking   Size of binary (bytes)
Static            900344
Dynamic           15960

Executing the dynamically linked binary under strace traces its system calls, which shows the C library being loaded at startup:

strace ./sbom

openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

Hello sbom

The C library is the interface to the Linux kernel and comes in several flavors.
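To illustrate the layer that the C library puts between the program and the kernel, the following sketch prints the same message without printf, using the thin write() wrapper around the write system call. It assumes a POSIX system such as Linux. The name raw_hello is made up for this example.

```c
#include <unistd.h>   /* write(), STDOUT_FILENO */

/* Hypothetical helper: send the greeting to standard output with the
 * write() wrapper, bypassing printf and stdio buffering.
 * Returns the number of bytes written, or -1 on error. */
long raw_hello(void)
{
    static const char msg[] = "Hello sbom\n";

    /* sizeof msg - 1 excludes the terminating NUL byte. */
    return (long)write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

Under strace this call would be expected to show up as a single write(1, "Hello sbom\n", 11) line, which is also where the printf version ends up once stdio flushes its buffer.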

Library            License
GNU libc (glibc)   LGPL 2.1+ with exceptions
musl libc          MIT
relibc             MIT

There might be security issues in the C library, such as CVE-2019-1010022:

GNU Libc current is affected by: Mitigation bypass. The impact is: Attacker may bypass stack guard protection. The component is: nptl. The attack vector is: Exploit stack buffer overflow vulnerability and use this bypass vulnerability to bypass stack guard. NOTE: Upstream comments indicate “this is being treated as a non-security bug and no real threat.”

The Linux kernel contains more than 30 million lines of code and has security issues of its own.
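Since both the C library and the kernel carry their own vulnerabilities, it helps to know which versions a binary actually runs on. The sketch below collects that information at run time. It assumes a POSIX system, uses gnu_get_libc_version() only when glibc is detected, and the name describe_platform is hypothetical. Note that it reports the running platform, which may differ from the build host.

```c
#include <stdio.h>
#include <sys/utsname.h>        /* uname() */
#ifdef __GLIBC__
#include <gnu/libc-version.h>   /* gnu_get_libc_version() */
#endif

/* Hypothetical helper: format "libc=<name> <version> kernel=<name> <release>"
 * into buf. Returns the number of characters written, or -1 on failure. */
int describe_platform(char *buf, size_t len)
{
    struct utsname u;
    const char *libc_name = "unknown";
    const char *libc_ver = "";
    int n;

#ifdef __GLIBC__
    libc_name = "glibc";
    libc_ver = gnu_get_libc_version();
#endif
    if (uname(&u) != 0) {
        return -1;
    }
    n = snprintf(buf, len, "libc=%s %s kernel=%s %s",
                 libc_name, libc_ver, u.sysname, u.release);
    /* Treat truncation as a failure so callers never see a partial record. */
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The resulting string could be logged next to the build artifacts as a starting point for matching CVEs such as the one quoted above against the platform in use.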

Cyber Resilience Act

The underlying C library and Linux kernel provide a software platform, so we had better check the Cyber Resilience Act (CRA), in particular recital 10 of the proposal.

Recital 10

In order not to hamper innovation or research, free and open-source software developed or supplied outside the course of a commercial activity should not be covered by this Regulation. This is in particular the case for software, including its source code and modified versions, that is openly shared and freely accessible, usable, modifiable and redistributable. In the context of software, a commercial activity might be characterized not only by charging a price for a product, but also by charging a price for technical support services, by providing a software platform through which the manufacturer monetises other services, or by the use of personal data for reasons other than exclusively for improving the security, compatibility or interoperability of the software.

Assumptions

In addition to what we can download and build from the internet, there are still layers that are hard to verify and must be trusted.

Firmware

T2data was engaged in a research project together with Ericsson.

We contributed by writing a simple bootloader from scratch inspired by two projects:

The project was originally designed for the Juno development board and includes the very first instruction residing in ROM.

To keep track of the header files being used, we selected Subversion, since it supports keyword substitution.

Every header file used in the project defines a preprocessor macro whose value is filled in by Subversion:

bignum.h

#define BIGNUM_H_DEF "$HeadURL: $ $Revision: $"

In the implementation file that includes the header, a global character array is initialized with the header macro. All keywords are placed in a section named .keywords, which is removed during the final step of the build process.

bignum.c

#include "bignum.h"
char secure_bios_bignum_header[]  __attribute__ ((section (".keywords")))  = BIGNUM_H_DEF;

Each compiled artifact can then be traced back to its source by running the ident command-line utility.

$ ident bignum.o

 bignum.o:
    $HeadURL: https://cm-ext.dev.oniteo.com/svn/nanodev/packages/secure_bios/branches/haspoc/hikey/secure_bios/pki/bignum.c $
    $Revision: 26005 $
    $HeadURL: https://cm-ext.dev.oniteo.com/svn/nanodev/packages/secure_bios/branches/haspoc/hikey/secure_bios/pki/bignum.h $
    $Revision: 17411 $
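The ident utility essentially searches a file for markers of the form $Keyword: text $. The following is a simplified sketch of that search, not a reimplementation of ident. The function name print_keywords is hypothetical, and the matching rules (letters-only keyword, text on a single line) are assumptions made to keep the example short.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print every "$Keyword: text $" marker found in
 * data[0..len) and return how many were found. */
size_t print_keywords(const unsigned char *data, size_t len)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        size_t start, j;

        if (data[i] != '$') { i++; continue; }
        start = i;
        j = i + 1;
        while (j < len && isalpha(data[j])) j++;           /* keyword name */
        if (j == start + 1 || j >= len || data[j] != ':') { i++; continue; }
        j++;
        while (j < len && data[j] != '$' && data[j] != '\n') j++;  /* text */
        if (j < len && data[j] == '$') {
            /* Print the marker including both '$' delimiters. */
            printf("%.*s\n", (int)(j - start + 1), (const char *)(data + start));
            count++;
            i = j + 1;
        } else {
            i++;
        }
    }
    return count;
}
```

Applied to the bytes of an object file that still contains the .keywords section, a scan like this would list the expanded $HeadURL$ and $Revision$ strings of every header that was compiled in.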

Keeping the keywords in a dedicated section makes it easy to keep or drop the information at different stages of the build pipeline. A linker script can retain the section with:

KEEP(*(.keywords))

Hardware

  • For a developer, the hardware architecture is all about the instruction set

Conclusion

Even if the product on top of the assembly line is very small and free of bugs, the assembly line itself is very complex. The overall security depends on both the product and the tools. The tools also include the platform, that is, the C library and the Linux kernel. In addition, we assume that firmware and hardware can be trusted.

Trust

In 1984 Ken Thompson published the paper Reflections on Trusting Trust, which addresses how far the output of even a very simple program can be trusted once it has been compiled.

Trust

Uboot