
Software Protection

Photo by Catrin Johnson / Unsplash

Threat Models

A traditional threat model used for software security is the following: software runs remotely, in the form of a dynamic web page, a remote API, or a network daemon, as shown in the figure below:

Traditional Threat Model used for Software Security

To protect that software from external threat vectors (the red arrows), you can rely on many different defense techniques (the blue ring) that are implemented during the software development lifecycle: applying secure coding practices, such as implementing robust input validation, secure memory management schemes, strong authentication and authorization methods, state-of-the-art cryptographic techniques, sandboxing, thorough functional and security testing, proper logging, security monitoring and alerting, secure deployment and so on.

However, the threat model is quite different when an application is delivered to end users. In this case, the threat model, called Man-at-the-End (MatE), is represented by the following figure:

Man-at-the-End Threat Model

In this scenario, the threat vectors are, in a sense, inside the security perimeter: the adversary can work quietly at home, well protected by their own defenses. In particular, the adversary has white-box access to the application and is able to:

  • analyse the internal workings of the software application, i.e., perform static or dynamic reverse engineering;
  • modify the software application;
  • and redistribute the software application (modified or not).

Concretely, attack scenarios include:

  • carving a piece of code out of a software application and embedding it in another application;
  • extracting confidential data or manipulating the results of a program running on a computer managed by a malicious cloud operator;
  • decrypting some encrypted media by extracting the decryption key in software that implements a Digital Rights Management (DRM) system;
  • modifying software that implements a DRM system to allow it to ignore business rules enforcement;
  • etc.

To help protect against these threats, you can implement various software protection techniques with the goal of:

  • confusing the adversary at the time of reverse engineering;
  • protecting the software from unauthorized modifications;
  • marking the software so that unauthorized redistribution can be traced.

Overview of Protection Techniques

Software Obfuscation

Obfuscation is the act of creating source or machine code that is difficult for humans to understand.

Source: Wikipedia

Software obfuscation consists of breaking the abstractions and modularity of well-written software using code transformations. There are dozens of published techniques, but none of them provide bulletproof security. Software obfuscation can be done at the source code level, at the intermediate representation (bytecode) level, or at the machine code level, depending on the tools.
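To make this concrete, here is a minimal sketch of two classic source-level transformations: an opaque predicate (a condition whose outcome is fixed but not obvious) and constant encoding. The function names and the PIN value are made up for illustration, and a real obfuscator would apply far more, and far stronger, transformations automatically.

```python
def check_pin_clear(pin: int) -> bool:
    # Readable version: the secret constant is plainly visible.
    return pin == 4271


def check_pin_obfuscated(pin: int) -> bool:
    # Same behaviour, harder to read (hypothetical hand-made example).
    x = pin & 0xFF
    # Opaque predicate: x * (x + 1) is a product of two consecutive
    # integers, so it is always even and this branch is always taken.
    if (x * (x + 1)) % 2 == 0:
        # Constant encoding: 4271 == 0x10AF never appears literally,
        # because 0x10AF ^ 0x5A5A == 0x4AF5.
        return (pin ^ 0x5A5A) == 0x4AF5
    # Dead code that only exists to mislead the reader.
    return pin == 1234
```

Both functions return the same result for every input. Only the effort needed to understand the second one differs, and a motivated analyst will still get there.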

Software Tamper-Proofing

Software tamper-proofing is a way to make it difficult for someone to change the inner workings of an application. There are several ways to respond to an integrity failure: the attacked application can randomly crash, alert its backend, enter an infinite loop, sabotage its functionality, etc. Tamper-proofing techniques include code hashing, self-modifying code, packing, etc.
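Code hashing can be sketched in a few lines of Python. The example below hashes the bytecode and constants of a function and compares the result with a reference digest. Everything here is an assumption made for illustration: the function names are hypothetical, and the reference digest is computed at import time, whereas a real product would compute it at build time and hide both the digest and the check.

```python
import hashlib
import hmac


def apply_discount(price: float) -> float:
    # Hypothetical business rule we want to protect.
    return price * 0.9


def code_digest(func) -> str:
    # Hash the compiled bytecode together with the constants it uses.
    code = func.__code__
    material = code.co_code + repr(code.co_consts).encode()
    return hashlib.sha256(material).hexdigest()


# Assumption: in practice this value would be produced at build time.
EXPECTED_DIGEST = code_digest(apply_discount)


def integrity_ok(func, expected: str) -> bool:
    # True while the function still matches the reference digest.
    return hmac.compare_digest(code_digest(func), expected)
```

If someone replaces the body of apply_discount (for example by assigning a different code object to apply_discount.__code__), integrity_ok(apply_discount, EXPECTED_DIGEST) turns False and the application can react in one of the ways listed above. Of course, the adversary can also patch the check itself, which is why such checks are usually duplicated and obfuscated.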

Code Watermarking

The idea behind code watermarking is to distribute individualized executables in order to detect illegal redistribution, not to avoid it. Similar to multimedia watermarking, code watermarking techniques must be able to withstand further modifications of the binary. Code watermarking techniques include software diversity, randomized compilation, etc.

Anti-Debugging / Emulation

The idea of anti-debugging and anti-emulation defenses is to make common analysis techniques and tools such as debuggers, disassemblers, binary instrumentation, emulators, etc. more difficult for a reverse engineer to use. Most of the time it's just tricks and other random magic and heuristics!
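Here is a sketch of two such tricks in Python, under the assumption that the analyst uses a debugger built on sys.settrace (as pdb does) or single-steps through the code. The function names are hypothetical, and both heuristics are easy to bypass once spotted.

```python
import sys
import time


def debugger_attached() -> bool:
    # Debuggers based on sys.settrace install a trace function;
    # in a normal run, sys.gettrace() returns None.
    return sys.gettrace() is not None


def exceeds_time_budget(func, budget_seconds: float) -> bool:
    # Single-stepping or heavy instrumentation makes code run much
    # slower than usual, so an unusually long run is suspicious.
    start = time.perf_counter()
    func()
    return time.perf_counter() - start > budget_seconds
```

The time budget is a tuning parameter: too tight and honest users on slow machines get flagged, too loose and the check detects nothing.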

Hardware-Based Techniques

The main reason for software piracy is that digital objects are trivial to clone. In contrast, cloning physical objects requires much more skill and expensive tools.

As a result, the basic idea behind several anti-piracy strategies is as follows: the application will not work unless the user proves possession of a physical object (e.g., USB dongles, storage media that are difficult to clone using standard materials, a trusted execution environment (TEE), a secure element, etc.). Another approach is that the inner workings of the application rely on physical properties of the underlying hardware that are costly to identify and emulate.

Leveraging Network Communications

When network communications are available, you can use the time dimension and a trusted backend to protect software. Here are some ways you can do that: implement protections based on timings; use remote attestation; implement last-minute code decryption; etc.
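As an illustration, a very simplified software-only attestation exchange could look like the sketch below. The backend sends a fresh random nonce, the client answers with a hash of the nonce and its own code image, and the backend recomputes the answer from a reference copy. The names and the placeholder image bytes are assumptions, and real schemes add timing constraints or hardware support, because an adversary who keeps an unmodified copy of the image around can still compute the right answer.

```python
import hashlib
import hmac
import secrets

# Hypothetical placeholder for the bytes of the genuine client code.
REFERENCE_IMAGE = b"genuine client code image (placeholder bytes)"


def new_challenge() -> bytes:
    # Backend side: a fresh nonce prevents replaying an old answer.
    return secrets.token_bytes(16)


def client_attest(nonce: bytes, image: bytes) -> bytes:
    # Client side: hash the nonce together with the code actually loaded.
    return hashlib.sha256(nonce + image).digest()


def backend_verify(nonce: bytes, response: bytes, reference: bytes) -> bool:
    # Backend side: recompute the expected answer from the reference image.
    expected = hashlib.sha256(nonce + reference).digest()
    return hmac.compare_digest(expected, response)
```

A genuine client passes backend_verify, while a client whose image differs by even one byte fails it.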

White-Box Cryptography

Applications that use cryptography are particularly vulnerable to reverse engineering because they deal with secrets, such as symmetric and private keys.

White-box cryptography is an approach to creating symmetric cryptographic algorithms using secrets that are difficult to extract. In other words, this method turns a block cipher into a (pseudo-)public-key algorithm by defining an implementation that embeds a secret that is difficult to extract.

Many academic proposals have been made, but they have all been broken. Some of them were mathematically flawed, while others were vulnerable to side-channel attacks. Nevertheless, industry sometimes has no choice but to implement white-box cryptography.

Homomorphic Encryption

One of the most important goals in information security is to be able to perform computations on encrypted data, i.e., without relying on trusted hardware. A cryptosystem that supports performing general computations on ciphertexts is known as fully homomorphic encryption (FHE).

A first plausible construction was published by Craig Gentry in 2009. Since then, a lot of work has been done and several startups are currently active in the field, especially in the area of privacy-preserving machine learning.
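To get a feel for what computing on ciphertexts means, here is a toy sketch of the Paillier cryptosystem. Note the assumptions: Paillier is only additively homomorphic (it is not FHE and is not Gentry's construction), the primes are tiny and completely insecure, and the function names are chosen for this example.

```python
import math
import secrets


def keygen(p: int = 1789, q: int = 1861):
    # Toy primes for illustration only; real keys use primes of 1024+ bits.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid shortcut because we use g = n + 1
    return n, (lam, mu)


def encrypt(n: int, message: int) -> int:
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    # c = g^m * r^n mod n^2, with g = n + 1
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n: int, private_key, ciphertext: int) -> int:
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    # Multiplying ciphertexts adds the underlying plaintexts modulo n.
    return (c1 * c2) % (n * n)
```

With n, private_key = keygen(), the party holding only n can combine encrypt(n, 20) and encrypt(n, 22) using add_encrypted without ever seeing the values, and decrypting the result gives 42. FHE schemes extend this idea to arbitrary computations, at a much higher cost.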

Domains of Application

Today, no software protection technique is 100% effective. All you can do is buy the adversary time and discourage the less skilled. This will continue to be the case until secure encrypted computing becomes fast enough to be ubiquitous, and confidential computing gains traction on mobile platforms.

Nevertheless, software protection techniques are used in a variety of scenarios, which we will now briefly discuss.

Gaming Industry

In the past, hackers have often targeted the gaming industry. Large online games need to ensure that their software is trustworthy to prevent cheating. The gaming industry often uses techniques such as obfuscation and remote attestation to make their software difficult to modify.

Intellectual Property Protection

A typical example of software protection techniques being used to protect intellectual property is Skype's proprietary communication protocol. Clients were heavily obfuscated to prevent competing (open-source) implementations. The protocol was partly reverse engineered by highly skilled researchers, but without any real consequence (see the BlackHat Europe 2006 talk Silver Needle in the Skype).

License Management

Since the 1980s, there has been a cat-and-mouse game between license management implementations and crackers. The idea behind license management is to ensure that some business rules in the form of a software license are enforced on the client side (e.g., a time-limited trial, installation on a single PC, limited functionality, etc.). Software integrity is a key issue in this scenario. Today, security often involves online communication with a backend and may also rely on hardware such as USB dongles or trusted execution environments.

Digital Rights Management

Digital Rights Management (DRM) systems rely heavily on software protection techniques. State-of-the-art systems, such as Pay-TV conditional access systems, often rely on custom hardware-based protection schemes. However, it’s not always possible to use hardware-based solutions, mainly due to cost and logistical issues. Most software-only DRMs have been breached in one way or another: DVDs, Blu-rays, etc.

Mobile Applications Hardening

Today's mobile apps handle sensitive information such as financial and health data. Most apps are designed with security in mind, but they're still vulnerable to the MatE threat. You can think about implementing an X.509v3 certificate verification mechanism, including client-side certificate pinning: how would you hack it? One of the goals of software protection is to make sure that traditional security mechanisms work properly on the client side and can't be easily removed or modified.
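To see why pinning alone does not survive the MatE threat, consider the following sketch. It assumes that the app pins the SHA-256 fingerprint of the whole server certificate (many real apps pin the public key instead), and the function names are hypothetical.

```python
import hashlib
import hmac
import socket
import ssl


def fetch_leaf_certificate(host: str, port: int = 443) -> bytes:
    # Open a TLS connection and return the server certificate in DER form.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def fingerprint(der_certificate: bytes) -> str:
    return hashlib.sha256(der_certificate).hexdigest()


def pin_matches(der_certificate: bytes, pinned_fingerprint: str) -> bool:
    # The pinned value is shipped inside the application itself.
    return hmac.compare_digest(fingerprint(der_certificate), pinned_fingerprint)
```

An adversary with white-box access does not need to break TLS: replacing the pinned fingerprint stored in the app, or patching pin_matches so that it always returns True, is enough. This is exactly where obfuscation and tamper-proofing come into play.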

Malware and Cyber-Attacks

The malware industry is a top user of software protection techniques. Its goal is to slow down both automated analysis systems and human analysts. For their part, cyber-criminals and red teams try to hide from sophisticated intrusion detection systems.

In the next episode, I’ll cover code obfuscation. Stay tuned!

