Samsung R&D Institute Poland (SRPOL), based in Warsaw, was established in 2000 and is one of the largest R&D centers in the region. SRPOL is a vibrant research lab involved in a diverse range of projects within Samsung, including AI, Computer Vision, Mobile, Networks, IoT, Visual Display, and more.
My name is Tomasz Kuchta and I lead the development of novel program analysis techniques that automatically find low-level software bugs. By utilizing these techniques, we strive to enhance the reliability and security of software systems used by hundreds of millions of users. My research spans novel applications of symbolic execution, software reliability, and easier testing of complex software systems [1].
For example, the low-level part of the Android Open Source Project (AOSP) – a software system used on modern Android mobile devices – has over 70 million lines of code spread across 300,000 source files. The code base size, along with a multitude of possible end-product configurations, calls for specialized automated solutions. Let me just mention a few examples of such tools proposed by our team: Code Aware Services (CAS), Kernel Memory Flattening Module (KFLAT), and Security-oriented Entrypoint Analyzer for Linux (SEAL).
CAS [2, 5, 7] is a system that extracts precise build information from a large product, such as a mobile phone, and helps automate various source code operations. In a nutshell, thanks to CAS we can precisely tell which files out of the 300,000 source files in AOSP are really used in a given final product, and even which parts of those files are used. Moreover, we can extract precise code information from the used source code for further automatic analysis by other tools.
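To see why per-product information matters, consider this small illustrative snippet (not CAS code): which parts of a file end up in the final binary depends on the build configuration chosen for a given product, and this per-file, per-line usage is the kind of information CAS recovers from an actual build. The CONFIG_DUAL_CAMERA flag below is purely hypothetical.

```c
/* Illustrative example only (not CAS code): which parts of this file are
 * "really used" depends on the product configuration chosen at build time.
 * CONFIG_DUAL_CAMERA is a hypothetical configuration flag. */
#include <stdio.h>

void init_camera(void)
{
#ifdef CONFIG_DUAL_CAMERA
    printf("initializing dual camera\n");   /* only in dual-camera products */
#else
    printf("initializing single camera\n"); /* all other product variants */
#endif
}
```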
KFLAT [3, 9] is a novel tool for performing selective memory dumps at the granularity of source code structures. The dumps can be captured on a device and restored on a developer machine for analysis, testing, or debugging with the exact same source code structures.
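The core idea can be sketched as follows. This is a minimal, hand-written illustration of "flattening" pointer-linked memory into one contiguous, relocatable buffer – not KFLAT's actual API.

```c
/* A minimal sketch of the flattening idea (not KFLAT's actual API):
 * serialize a pointer-linked list into one contiguous buffer by rewriting
 * pointers as offsets, so the image can be restored at a different address,
 * e.g., on a developer machine. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct node {
    int value;
    struct node *next;  /* stored as an offset inside the flat image */
};

/* Flatten: copy each node into buf, replacing `next` with an offset. */
static size_t flatten_list(const struct node *head, uint8_t *buf)
{
    size_t off = 0;
    for (; head; head = head->next) {
        struct node *slot = (struct node *)(buf + off);
        memcpy(slot, head, sizeof(*slot));
        off += sizeof(*slot);
        /* The next node is written right after this one; NULL stays NULL. */
        slot->next = head->next ? (struct node *)(uintptr_t)off : NULL;
    }
    return off;  /* number of bytes used in the flat image */
}

/* Unflatten: turn the stored offsets back into pointers at the new base. */
static struct node *unflatten_list(uint8_t *buf)
{
    for (struct node *n = (struct node *)buf; n->next; )
        n = n->next = (struct node *)(buf + (uintptr_t)n->next);
    return (struct node *)buf;
}
```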
SEAL [10] is a dynamic analysis tool installed on the end device. It automatically creates a mapping between the files representing pieces of hardware and the kernel functions that handle requests to that hardware. As a result, we know which code in the kernel can be activated by applications.
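For readers less familiar with the kernel side, the mapping SEAL recovers looks roughly like this in driver code. The example below is a hypothetical driver using the standard Linux misc-device API: registering a device file ties it to a table of handler functions, and those handlers are exactly the kernel code that user-space applications can trigger.

```c
/* A simplified, hypothetical driver showing the edge that SEAL-style tools
 * recover: the device file /dev/example is backed by a file_operations
 * table, so the functions registered there are reachable from user space. */
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/miscdevice.h>

static long example_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
    /* Reachable from user space via ioctl() on /dev/example. */
    return 0;
}

static const struct file_operations example_fops = {
    .owner          = THIS_MODULE,
    .unlocked_ioctl = example_ioctl,   /* device file -> handler mapping */
};

static struct miscdevice example_dev = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "example",                /* creates /dev/example */
    .fops  = &example_fops,
};

module_misc_device(example_dev);       /* registers the device at load time */
MODULE_LICENSE("GPL");
```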
We believe the tools we develop in our team could be used more widely for bug finding, software analysis and software engineering in general. As a result, we often present our work at international venues [2-6]. We have also made some of our tools open source [7-10].
There are many low-level critical software systems that billions of people rely on every day: automotive, operating systems, and telecom infrastructure being just a few examples. Programming such systems is challenging; developers must be well aware of the underlying hardware constraints and must carefully manage program memory. In the case of various communication protocols, developers also need to design and manage the program state well, e.g., not all message types can be accepted before a user is authenticated to the network.
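As a small illustration of that last point, a protocol handler typically gates each message type on the current connection state. Here is a sketch with hypothetical message and state names:

```c
/* A minimal sketch of protocol state management: reject messages that are
 * not valid in the current state, e.g., no data traffic before the user is
 * authenticated. Message and state names are hypothetical. */
#include <stdbool.h>

enum conn_state { ST_IDLE, ST_AUTHENTICATING, ST_AUTHENTICATED };
enum msg_type   { MSG_AUTH_REQUEST, MSG_AUTH_RESPONSE, MSG_DATA };

static bool msg_allowed(enum conn_state st, enum msg_type msg)
{
    switch (msg) {
    case MSG_AUTH_REQUEST:  return st == ST_IDLE;
    case MSG_AUTH_RESPONSE: return st == ST_AUTHENTICATING;
    case MSG_DATA:          return st == ST_AUTHENTICATED; /* the key check */
    }
    return false;
}
```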
Software bugs can reside at different levels, starting from generic errors related to incorrect memory management all the way up to logical issues, such as an incorrect protocol state. Importantly, although some bug classes are eliminated by certain programming languages, there will always be another class to consider. In other words, the types of bugs might change, but the bugs will not go away.
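For a concrete instance of the first class, here is a textbook use-after-free in C (the session code is hypothetical). Memory-safe languages rule this particular class out, while logical issues such as the state-machine bugs sketched above remain possible in any language.

```c
/* A classic memory management bug: use after free. */
#include <stdlib.h>
#include <string.h>

struct session { char user[32]; };

void handle_logout(struct session *s)
{
    free(s);                              /* the session is released here... */
    memset(s->user, 0, sizeof(s->user));  /* BUG: ...and used afterwards */
}
```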
Given how much we rely on software systems, a good software testing and analysis practice is not merely “nice to have,” it is paramount. Unfortunately, the growing complexity of software systems makes testing them thoroughly an increasingly challenging task. Moreover, some critical components are inherently difficult to test. For example, consider testing a mobile phone modem, i.e., a piece of hardware responsible for handling cellular network traffic. A modem testing environment should ideally involve setting up a cellular network and sending messages over the air, which is nontrivial. Another great example is a bootloader – a piece of software that starts the operating system, i.e., it is executed even before the user can interact with the device. Now, imagine testing a bootloader on an infotainment system of a car, in a smart TV, or on a small IoT device. Usually, a good automated testing run involves running a piece of software thousands of times with various inputs, but how do we achieve that with a software component such as a bootloader?
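To make that last question concrete, a typical automated run looks roughly like the loop below, with `parse_boot_image` as a hypothetical stand-in for the code under test. This is trivial on a developer machine, but a real bootloader executes once per power cycle, so such a loop cannot simply run on the device.

```c
/* What an automated testing run typically looks like: execute the code
 * under test thousands of times with varied inputs and watch for crashes.
 * parse_boot_image is a hypothetical stand-in for bootloader code. */
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical code under test; a real bootloader parser would live here. */
static int parse_boot_image(const uint8_t *data, size_t len)
{
    return (len >= 4 && data[0] == 'B') ? 0 : -1;
}

int main(void)
{
    uint8_t buf[4096];
    for (int i = 0; i < 10000; i++) {        /* thousands of executions */
        size_t len = (size_t)(rand() % (int)sizeof(buf));
        for (size_t j = 0; j < len; j++)
            buf[j] = (uint8_t)rand();        /* naive random input */
        parse_boot_image(buf, len);          /* a crash means a bug found */
    }
    return 0;
}
```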
As we can see, we need constant research into different types of automated software testing techniques. Indeed, the research fields of software reliability, verification, and testing have been very active for the past several decades. In particular, dynamic software testing has gained much traction in recent years. In our group, we strive to extend state-of-the-art solutions, customizing them for low-level domains and for scalability.
Finally, given that we are going to see a lot of code generated by AI in the near future, it is crucial to ensure that the generated code is trustworthy. We do the same for software written by human developers; however, the velocity at which programs can be produced with AI is something we have not seen before, which opens up new challenges. This means that program analysis and testing techniques are not only here to stay but are also likely to gain even more traction than before.
The work we do has many rewarding aspects. One of them is making sure devices used by hundreds of millions of users are more reliable and secure, so there is also an element of social responsibility.
The project I am most fond of is Auto Off-Target, or AoT for short. Our team at SRPOL developed this project from scratch, and it ended up being published at the premier software engineering forum, the International Conference on Automated Software Engineering (ASE), in 2022 [11]. AoT makes it possible to extract parts of a larger system's code and test them on a different device. For example, we can extract a piece of code that normally runs on a mobile device and test it on a developer machine. In particular, AoT can be applied in hard-to-test scenarios, some examples of which were mentioned earlier.
The idea of extracting a piece of code and creating a test harness for it has been used before in the security community; the novel part is that we automated this process for low-level system code.
AoT is one of those ideas that looked somewhat unrealistic at the start but – surprisingly (!) – turned out to work quite well.
One challenging aspect of the project is the simulation of program state. Since we take parts of a larger system and try to run them in isolation, we need to re-create the system state, such as the objects and values in memory. The challenge is to provide a state that is close to the original while still leaving some room for variation. If we diverge too much from the original, we might detect issues that don’t exist in reality. On the other hand, if we are too strict, we might not be able to explore some paths in the tested program. This is still an interesting open research problem.
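A hedged sketch of what such an off-target harness can look like is below. It is written by hand here, not generated by AoT: `device_handle_cmd` stands in for the extracted code, the `device_state` struct re-creates just enough of the surrounding program state, and a fuzzer drives it with thousands of inputs on a developer machine. How `g_dev` is initialized is exactly the too-strict-versus-too-loose trade-off described above.

```c
/* A hand-written sketch of an off-target fuzzing harness in the spirit of
 * AoT (not AoT-generated code). All names are hypothetical. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal re-created state; the real struct in the system is much larger. */
struct device_state { int initialized; uint32_t last_cmd; };
static struct device_state g_dev;

/* Hypothetical extracted function under test. */
static int device_handle_cmd(struct device_state *dev,
                             const uint8_t *data, size_t len)
{
    if (!dev->initialized || len < 4)
        return -1;
    memcpy(&dev->last_cmd, data, 4);
    return 0;
}

/* libFuzzer entry point: the fuzzer calls this with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    memset(&g_dev, 0, sizeof(g_dev));
    g_dev.initialized = 1;  /* too strict or too loose here skews results */
    device_handle_cmd(&g_dev, data, size);
    return 0;
}
```

Built with `clang -fsanitize=fuzzer,address`, such a harness makes crashes in the extracted code show up directly on the host.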
It has been very rewarding to see the progress of this project from its inception to fruition, growing in a great environment and being supported by my talented colleagues.
I would like to see more and more software being thoroughly tested at scale. I would like to be able to not only say that we tested a piece of software, but also provide a high level of assurance that the testing was thorough. In other words, it would be great to move the needle from testing to testing with guarantees, especially for complex systems.
If we could provide the right tools for developers to do this easily, that would be an excellent achievement.
I believe that some important aspects of the art of software testing and software reliability in general are: (1) an internal drive to improve and fix things, and (2) an ability to imagine and analyze how things can be broken. If you see these traits in yourself, that’s a good start! On top of that, I would advise you to get familiar with common techniques and tools used in the industry, in particular dynamic software testing tools and fuzzing. Open source software provides a great opportunity to learn how to test, find issues, and fix them. I hope that helps!
[1] Google Scholar profile: https://scholar.google.com/citations?user=GFlu-cIAAAAJ&hl=en&oi=ao
[2] Linux Security Summit (LSS) 2022 (CAS): https://youtu.be/M7gl7MFU_Bc?t=648
[3] Open Source Summit (OSS) 2023 (KFLAT): https://youtu.be/Ynunpuk-Vfo?si=jkzDIQqsEvLUP9k9
[4] International KLEE Workshop 2022: https://youtu.be/Xzn_kmtW3_c?si=z4xFxO56u7PLbgMT
[5] Developer Productivity Engineering (DPE) Summit 2023: https://youtu.be/FZrhHgor4NE?si=8x6ke6ulmTAl87V5
[6] International KLEE Workshop 2024: https://www.youtube.com/watch?v=rQGB_fk253g
[7] CAS repository: https://github.com/Samsung/CAS
[8] Auto Off-Target (AoT) repository: https://github.com/Samsung/auto_off_target
[9] KFLAT repository: https://github.com/Samsung/KFLAT
[10] SEAL repository: https://github.com/Samsung/SEAL
[11] Auto Off-Target paper (ASE 2022): https://dl.acm.org/doi/10.1145/3551349.3556915