API Hashing: Why Malware Loves It (And You Should Care)

Nikhil gupta

Imagine this: you’re a cybersecurity expert, eyes glued to your screen, thinking you’ve caught that pesky malware red-handed. You’ve seen the patterns, you’ve analyzed the system calls — but wait… it’s not that simple. The malware seems to have learned a few tricks of its own. It’s like playing a game of hide-and-seek with a ninja who keeps changing its name every time you try to call it out….

Welcome to the world of API hashing — where malware uses clever disguise to make system calls look like a completely different creature. This trickery isn’t just for the amateurs; even advanced tools like Cobalt Strike rely on API hashing to fly under the radar. But how does it manage to dodge detection, even when the OS throws kernel-level security at it? Stick around, and let’s uncover how attackers and defenders alike are caught in this high-stakes game of cat and mouse, where the rules keep changing and the stakes get higher.

Windows API Basics & Resolution

Before we dive into API hashing, we need to understand:

  • How Windows APIs are typically resolved.
  • The role of LoadLibrary() and GetProcAddress().
  • How Windows loads DLLs and finds function addresses.

How Windows Resolves API Calls

Windows provides APIs in DLLs (e.g., kernel32.dll, ntdll.dll). When a program calls an API like CreateFileA(), its actual address is resolved through one of the following:

  1. Import Address Table (IAT): When a program links against a DLL, the Windows loader fills in the function addresses at load time.
  2. Export Address Table (EAT): Each DLL lists the functions it exports here, and the loader reads it to find their addresses.
  3. LoadLibrary() and GetProcAddress(): Used for dynamic API resolution at run time.

Code Example: Resolving API Manually

Here’s a basic C++ snippet demonstrating manual API resolution using LoadLibrary() and GetProcAddress():

#include <windows.h>
#include <iostream>

int main() {
    // Load kernel32.dll (it is already mapped, so this only adds a reference)
    HMODULE hKernel32 = LoadLibraryA("kernel32.dll");
    if (!hKernel32) {
        std::cerr << "Failed to load kernel32.dll\n";
        return 1;
    }

    // Resolve the address of GetProcAddress dynamically
    FARPROC pGetProcAddress = GetProcAddress(hKernel32, "GetProcAddress");
    if (!pGetProcAddress) {
        std::cerr << "Failed to resolve GetProcAddress\n";
        FreeLibrary(hKernel32);
        return 1;
    }

    // Cast to void* so the stream prints an address; a raw function
    // pointer would be converted to bool and print as 1
    std::cout << "GetProcAddress found at: "
              << reinterpret_cast<void*>(pGetProcAddress) << std::endl;

    // Clean up
    FreeLibrary(hKernel32);

    return 0;
}

Explanation

LoadLibraryA("kernel32.dll")

  • Loads kernel32.dll into the process memory.
  • Returns a handle (base address) of the module.

GetProcAddress(hKernel32, "GetProcAddress")

  • Retrieves the actual address of GetProcAddress() from the loaded DLL.
  • This is how Windows normally resolves API calls dynamically.

Why is this important?

  • Malware often avoids calling GetProcAddress() with plaintext names because the call is easy for security tools to hook and the name strings are easy to spot in the binary.
  • Instead, attackers find function addresses manually, which leads us to API hashing.

Manually Resolving API Addresses Without GetProcAddress()

Since malware often avoids calling GetProcAddress() directly (to evade detection), we need to understand how API addresses can be manually resolved.

Key Concepts Before the Code

PEB (Process Environment Block)

  • A structure in each process containing information about loaded modules (DLLs).
  • Its address is stored in the thread's TEB and can be read from fs:[0x30] (x86) or gs:[0x60] (x64).

PEB_LDR_DATA & LDR_DATA_TABLE_ENTRY (also known as LDR_MODULE)

  • These structures help enumerate loaded modules in a process.

Export Address Table (EAT)

  • A table inside a DLL's PE image (located through the export data directory) that holds exported function names and their addresses.

How to Manually Resolve API Addresses

Instead of GetProcAddress(), we:

  1. Walk the PEB to locate loaded DLLs.
  2. Find kernel32.dll (or any needed DLL).
  3. Parse the Export Address Table of the DLL.
  4. Locate the function address by comparing names.

Code: Manually Resolving API Without GetProcAddress()

This C++ code retrieves the base address of kernel32.dll manually by walking the PEB (64-bit builds only):

#include <windows.h>
#include <winternl.h> // PEB, PEB_LDR_DATA, LDR_DATA_TABLE_ENTRY, UNICODE_STRING
#include <cwchar>
#include <iostream>

// FullDllName holds the full path (e.g. C:\Windows\System32\KERNEL32.DLL),
// so compare only the file-name part, case-insensitively.
static bool ModuleNameMatches(const UNICODE_STRING& fullName, const wchar_t* moduleName) {
    if (!fullName.Buffer) return false;
    size_t len = fullName.Length / sizeof(wchar_t); // Length is in bytes
    size_t start = 0;
    for (size_t i = 0; i < len; i++) {
        if (fullName.Buffer[i] == L'\\') start = i + 1;
    }
    size_t nameLen = wcslen(moduleName);
    return (len - start) == nameLen &&
           _wcsnicmp(fullName.Buffer + start, moduleName, nameLen) == 0;
}

HMODULE GetModuleBase(const wchar_t* moduleName) {
    PPEB peb = (PPEB)__readgsqword(0x60); // PEB pointer lives at gs:[0x60] on x64
    PPEB_LDR_DATA ldr = peb->Ldr;
    LIST_ENTRY* moduleList = &ldr->InMemoryOrderModuleList;

    // The list is circular: stop when we are back at the list head
    for (LIST_ENTRY* entry = moduleList->Flink; entry != moduleList; entry = entry->Flink) {
        PLDR_DATA_TABLE_ENTRY moduleEntry =
            CONTAINING_RECORD(entry, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        if (ModuleNameMatches(moduleEntry->FullDllName, moduleName)) {
            return (HMODULE)moduleEntry->DllBase;
        }
    }
    return nullptr;
}

int main() {
    HMODULE kernel32Base = GetModuleBase(L"kernel32.dll");
    if (kernel32Base) {
        std::wcout << L"Kernel32 Base Address: " << kernel32Base << std::endl;
    } else {
        std::cerr << "Failed to get Kernel32 base address\n";
    }
    return 0;
}

Explanation

Accessing the PEB:

  • The __readgsqword(0x60) gets the PEB address in 64-bit Windows.
  • We use this to traverse loaded DLLs.

Walking Through Loaded Modules:

  • We iterate over Ldr.InMemoryOrderModuleList to find kernel32.dll.

Extracting the DLL Base Address:

  • This is the first step in manually resolving APIs.
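
The module walk depends on an intrusive, circular doubly linked list: each entry embeds its own links, and CONTAINING_RECORD subtracts the field offset to get back to the enclosing structure. Here is a minimal portable sketch of that idea. The `Link` and `Module` types and the helper names are hypothetical stand-ins for illustration, not the real Windows loader definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for LIST_ENTRY: links embedded inside each record.
struct Link {
    Link* next;
    Link* prev;
};

// Hypothetical stand-in for LDR_DATA_TABLE_ENTRY.
struct Module {
    const char* name;
    void* base;
    Link links; // embedded list node, like InMemoryOrderLinks
};

// Same idea as CONTAINING_RECORD: step back from the embedded link
// to the start of the enclosing Module.
inline Module* module_from_link(Link* link) {
    return reinterpret_cast<Module*>(
        reinterpret_cast<char*>(link) - offsetof(Module, links));
}

// Append a module at the tail of the circular list headed by `head`.
inline void list_append(Link* head, Module* m) {
    m->links.prev = head->prev;
    m->links.next = head;
    head->prev->next = &m->links;
    head->prev = &m->links;
}

// Walk the list until we are back at the head, as the PEB loop does.
inline void* find_module_base(Link* head, const char* wanted) {
    assert(head != nullptr && wanted != nullptr);
    for (Link* it = head->next; it != head; it = it->next) {
        Module* m = module_from_link(it);
        if (std::strcmp(m->name, wanted) == 0) return m->base;
    }
    return nullptr;
}
```

The list head is a bare `Link` with no surrounding `Module`, which is why the loop compares against `head` instead of checking for a null pointer.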

Manually Resolving API Function Addresses from the Export Address Table (EAT)

Now that we can find a DLL’s base address in memory, the next step is extracting function addresses without using GetProcAddress(). This requires parsing the Export Address Table (EAT) of a DLL.

Understanding the Export Address Table (EAT)

Every DLL that exports functions (like kernel32.dll) has an Export Directory Table, which is located through the data directory in its PE (Portable Executable) optional header. This contains:

  1. AddressOfFunctions: an array of RVAs (relative virtual addresses) of the exported functions.
  2. AddressOfNames: an array of RVAs pointing to the exported function names.
  3. AddressOfNameOrdinals: an array of indices that links each name to its slot in AddressOfFunctions.

By parsing this table, we can:

  • Extract function names and their corresponding addresses.
  • Locate a function without calling GetProcAddress().

Steps to Manually Resolve an API

  1. Get the base address of kernel32.dll (we already did this in the previous step).
  2. Find the PE header and the Export Directory Table inside kernel32.dll.
  3. Extract function names and compare them to our target function (e.g., LoadLibraryA).
  4. Retrieve the corresponding function address from AddressOfFunctions.

Code: Manually Finding LoadLibraryA in Kernel32.dll

#include <windows.h>
#include <cstring>
#include <iostream>

DWORD GetFunctionRVA(HMODULE hModule, const char* functionName) {
    // Get DOS header
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hModule;
    if (dosHeader->e_magic != IMAGE_DOS_SIGNATURE) {
        return 0; // Invalid DOS header
    }

    // Get NT headers
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((BYTE*)hModule + dosHeader->e_lfanew);
    if (ntHeaders->Signature != IMAGE_NT_SIGNATURE) {
        return 0; // Invalid NT header
    }

    // Get Export Directory
    IMAGE_DATA_DIRECTORY exportDataDir = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    if (exportDataDir.VirtualAddress == 0) {
        return 0; // Module has no export table
    }
    PIMAGE_EXPORT_DIRECTORY exportDir = (PIMAGE_EXPORT_DIRECTORY)((BYTE*)hModule + exportDataDir.VirtualAddress);

    // All three arrays are stored as RVAs relative to the module base
    DWORD* addressOfFunctions = (DWORD*)((BYTE*)hModule + exportDir->AddressOfFunctions);
    DWORD* addressOfNames = (DWORD*)((BYTE*)hModule + exportDir->AddressOfNames);
    WORD* addressOfNameOrdinals = (WORD*)((BYTE*)hModule + exportDir->AddressOfNameOrdinals);

    // Iterate over exported function names
    for (DWORD i = 0; i < exportDir->NumberOfNames; i++) {
        const char* functionNamePtr = (const char*)((BYTE*)hModule + addressOfNames[i]);

        // Export names are case-sensitive, so use an exact comparison
        if (strcmp(functionNamePtr, functionName) == 0) {
            WORD functionOrdinal = addressOfNameOrdinals[i];
            // Note: for a forwarded export this RVA points at a
            // "DLL.Function" string rather than code; not handled here
            return addressOfFunctions[functionOrdinal]; // Return function RVA
        }
    }
    return 0;
}

int main() {
    HMODULE kernel32Base = GetModuleHandleA("kernel32.dll"); // Get Kernel32 base address

    if (!kernel32Base) {
        std::cerr << "Failed to get kernel32 base address\n";
        return 1;
    }

    DWORD loadLibraryRVA = GetFunctionRVA(kernel32Base, "LoadLibraryA");
    if (loadLibraryRVA) {
        FARPROC loadLibraryAddr = (FARPROC)((BYTE*)kernel32Base + loadLibraryRVA);
        // Cast to void* so the stream prints an address instead of a bool
        std::cout << "LoadLibraryA found at: "
                  << reinterpret_cast<void*>(loadLibraryAddr) << std::endl;
    } else {
        std::cerr << "Failed to resolve LoadLibraryA\n";
    }

    return 0;
}

Breakdown of the Code

Get the PE Header

  • PIMAGE_DOS_HEADER reads the DOS header.
  • PIMAGE_NT_HEADERS extracts the NT headers.

Find the Export Directory

  • Extracted from the Data Directory inside the PE headers.

Iterate Through Exported Functions

  • AddressOfNames contains RVAs of the function name strings.
  • AddressOfNameOrdinals maps each name to an index into AddressOfFunctions.
  • AddressOfFunctions holds the function RVAs, which become pointers once the module base is added.

Match and Return the Function Address

  • Compares function names with LoadLibraryA.
  • Retrieves its RVA (Relative Virtual Address).
  • Converts it to an actual function pointer.
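
The name-to-ordinal-to-address indirection is easy to get wrong, so here is a minimal portable sketch of the same lookup over a hypothetical in-memory table. `FakeExports`, `find_export_rva`, and the demo values are invented for illustration; a real export directory stores RVAs relative to the module base rather than C++ vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified model of an export directory.
struct FakeExports {
    std::vector<uint32_t>    functions;    // like AddressOfFunctions (RVAs)
    std::vector<const char*> names;        // like AddressOfNames
    std::vector<uint16_t>    nameOrdinals; // like AddressOfNameOrdinals
};

// Returns the RVA for `wanted`, or 0 if the name is not exported.
inline uint32_t find_export_rva(const FakeExports& e, const char* wanted) {
    assert(e.names.size() == e.nameOrdinals.size());
    for (std::size_t i = 0; i < e.names.size(); i++) {
        if (std::strcmp(e.names[i], wanted) == 0) {
            // names[i] pairs with nameOrdinals[i], which indexes functions[].
            uint16_t index = e.nameOrdinals[i];
            return index < e.functions.size() ? e.functions[index] : 0;
        }
    }
    return 0;
}

// Builds a tiny table where the name order differs from the function order,
// which is the case the ordinal array exists to handle.
inline FakeExports make_demo_exports() {
    FakeExports e;
    e.functions    = {0x1000, 0x2000, 0x3000};
    e.names        = {"Alpha", "Beta"};
    e.nameOrdinals = {2, 0}; // "Alpha" -> functions[2], "Beta" -> functions[0]
    return e;
}
```

In the demo table, `functions[1]` has no name at all, which mirrors functions that are exported by ordinal only. Indexing `functions` directly with the name index `i` would return the wrong address here.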

Why Is This Important?

  • We avoid GetProcAddress() entirely, so hooks or monitoring on that function see nothing.
  • This is essentially how malware resolves APIs dynamically.
  • This lays the foundation for API hashing (coming next).

API Hashing — Theory, Code, and Real-World Techniques

Now that we can manually resolve API addresses, let’s take it a step further with API Hashing, a widely used anti-analysis and anti-reversing technique in malware and security tools.

What is API Hashing?

Instead of storing function names (LoadLibraryA, VirtualAlloc, etc.), attackers:

  1. Hash the function names using a specific algorithm.
  2. Store or compare hashes instead of names.
  3. Match function addresses based on the hash.

This makes detection and reverse engineering harder, as function names are never stored in plaintext.

Where is API Hashing Used in Windows and Malware?

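  • Malware & Shellcode: Used by malware families like TrickBot and QakBot, and by tools such as Cobalt Strike, to evade detection.
  • Packers & Loaders: Custom packers and loaders use API hashing to obfuscate API resolution. (Plain compressors such as UPX rebuild imports by name and are not an example of this.)
  • Windows Itself:
       • Security products such as Windows Defender may hash API names internally, although these internals are not publicly documented.
       • ntdll.dll does not obfuscate its export names. Its Nt*/Zw* stubs are exported in plaintext, which is what makes the export-table walk from the previous section possible.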

How API Hashing Works

  1. A function name (e.g., "LoadLibraryA") is hashed using an algorithm (e.g., CRC32, FNV-1A, DJB2).
  2. Instead of storing "LoadLibraryA", malware stores a 32-bit constant such as 0x72F2B53C (an illustrative value, not necessarily the real hash).
  3. At runtime, it:
       • Enumerates exported function names (as we did in the last topic).
       • Hashes each name.
       • Compares it to the stored hash.
       • If it matches, extracts the function address.

API Hashing Code: FNV-1A Algorithm

FNV-1A is a popular choice for API hashing because it is fast and takes only a few lines to implement.

Step 1: Hashing a Function Name

#include <windows.h> // DWORD, BYTE
#include <iostream>

#define FNV_PRIME 0x1000193
#define FNV_OFFSET_BASIS 0x811c9dc5

// FNV-1A Hashing Function: XOR each byte in, then multiply by the prime
DWORD fnv1a_hash(const char* functionName) {
    DWORD hash = FNV_OFFSET_BASIS;
    while (*functionName) {
        hash ^= (BYTE)(*functionName++);
        hash *= FNV_PRIME;
    }
    return hash;
}

int main() {
    const char* functionName = "LoadLibraryA";
    DWORD hashValue = fnv1a_hash(functionName);

    std::cout << "Hash of " << functionName << " = 0x" << std::hex << hashValue << std::endl;
    return 0;
}

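Output Example (the hex digits shown are illustrative; run the program to get the actual value):

Hash of LoadLibraryA = 0x72F2B53C

Now we have a compact identifier for LoadLibraryA. A 32-bit hash is not guaranteed to be unique, but collisions among the export names of a single DLL are rare in practice.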
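
One detail the Step 1 program glosses over: when the name is hashed at run time from a string literal, that literal still ends up in the binary. A minimal sketch of a compile-time variant follows, assuming a C++14 compiler and using fixed-width types in place of DWORD so that it also builds off Windows. The names `fnv1a_32`, `matches_hash`, and `kLoadLibraryAHash` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kFnvOffsetBasis = 0x811c9dc5u;
constexpr uint32_t kFnvPrime = 0x01000193u;

// FNV-1a, 32-bit: XOR each byte in, then multiply by the prime.
constexpr uint32_t fnv1a_32(const char* s) {
    uint32_t hash = kFnvOffsetBasis;
    while (*s) {
        hash ^= static_cast<uint8_t>(*s++);
        hash *= kFnvPrime;
    }
    return hash;
}

// Published FNV-1a test vectors, checked by the compiler.
static_assert(fnv1a_32("") == 0x811c9dc5u, "empty input yields the offset basis");
static_assert(fnv1a_32("a") == 0xe40c292cu, "FNV-1a test vector for \"a\"");
static_assert(fnv1a_32("foobar") == 0xbf9cf968u, "FNV-1a test vector for \"foobar\"");

// Evaluated at compile time, so only the 32-bit constant has to be stored.
constexpr uint32_t kLoadLibraryAHash = fnv1a_32("LoadLibraryA");

// Run-time side: hash a name read from an export table and compare it.
inline bool matches_hash(const char* exportName, uint32_t wanted) {
    assert(exportName != nullptr);
    return fnv1a_32(exportName) == wanted;
}
```

Because `kLoadLibraryAHash` is a `constexpr` variable, the compiler has to evaluate the hash during compilation, and an optimizing build typically has no reason to keep the plaintext name around.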

Step 2: Using Hashes to Resolve API Addresses

We replace function names with their hashed values and compare them dynamically.

#include <windows.h>
#include <iostream>

#define FNV_PRIME 0x1000193
#define FNV_OFFSET_BASIS 0x811c9dc5

// constexpr lets the compiler hash a literal at build time, so the
// plaintext name is not needed for the lookup itself
constexpr DWORD fnv1a_hash(const char* functionName) {
    DWORD hash = FNV_OFFSET_BASIS;
    while (*functionName) {
        hash ^= (BYTE)(*functionName++);
        hash *= FNV_PRIME;
    }
    return hash;
}

FARPROC GetFunctionByHash(HMODULE hModule, DWORD functionHash) {
    // Locate the export directory through the PE headers
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hModule;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((BYTE*)hModule + dosHeader->e_lfanew);
    IMAGE_DATA_DIRECTORY exportDataDir = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    if (exportDataDir.VirtualAddress == 0) {
        return nullptr; // Module has no export table
    }
    PIMAGE_EXPORT_DIRECTORY exportDir = (PIMAGE_EXPORT_DIRECTORY)((BYTE*)hModule + exportDataDir.VirtualAddress);

    DWORD* addressOfFunctions = (DWORD*)((BYTE*)hModule + exportDir->AddressOfFunctions);
    DWORD* addressOfNames = (DWORD*)((BYTE*)hModule + exportDir->AddressOfNames);
    WORD* addressOfNameOrdinals = (WORD*)((BYTE*)hModule + exportDir->AddressOfNameOrdinals);

    // Hash every exported name and compare it with the target hash
    for (DWORD i = 0; i < exportDir->NumberOfNames; i++) {
        const char* functionName = (const char*)((BYTE*)hModule + addressOfNames[i]);
        if (fnv1a_hash(functionName) == functionHash) {
            WORD functionOrdinal = addressOfNameOrdinals[i];
            return (FARPROC)((BYTE*)hModule + addressOfFunctions[functionOrdinal]);
        }
    }
    return nullptr;
}

int main() {
    HMODULE kernel32Base = GetModuleHandleA("kernel32.dll");
    if (!kernel32Base) {
        std::cerr << "Failed to get kernel32 base address\n";
        return 1;
    }

    // Evaluated at compile time from the literal
    constexpr DWORD loadLibraryHash = fnv1a_hash("LoadLibraryA");
    FARPROC pLoadLibrary = GetFunctionByHash(kernel32Base, loadLibraryHash);

    if (pLoadLibrary) {
        // Cast to void* so the stream prints an address instead of a bool
        std::cout << "LoadLibraryA found at: "
                  << reinterpret_cast<void*>(pLoadLibrary) << std::endl;
    } else {
        std::cerr << "Failed to resolve LoadLibraryA\n";
    }

    return 0;
}

Real-World API Hashing Techniques in Use

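1. ROR13 Hashing (Used in Metasploit) and CRC32

  • Metasploit's shellcode resolves APIs with a rotate-right-by-13 (ROR13) additive hash rather than CRC32.
  • CRC32 is lightweight and fast, and is used in other malware to hash API names.
  • Both are implemented in shellcode loaders.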

2. DJB2 Hashing (Seen in some Crypters)

  • Very simple but effective.
  • Used by older malware to obfuscate function names.

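3. Security Products & NtQuerySystemInformation

  • Security products such as Windows Defender may hash function names for internal lookups, although these internals are not publicly documented.
  • NtQuerySystemInformation does not return hashed names; it returns plain structures (for example, process image names as UNICODE_STRING). It shows up in this context because malware often resolves it by hash before using it for process enumeration.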

4. Cobalt Strike’s API Hashing

  • Used in Reflective DLL Injection.
  • Evades API-based detection techniques.
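
For comparison with the FNV-1A code earlier, here are minimal portable sketches of the CRC32 and DJB2 hashes named in this list. The function names `djb2_hash` and `crc32_hash` are hypothetical, and these are the textbook forms of the algorithms; variants seen in real samples may change the seed, fold case, or mix in the module name.

```cpp
#include <cassert>
#include <cstdint>

// DJB2: hash = hash * 33 + c, starting from 5381.
inline uint32_t djb2_hash(const char* s) {
    assert(s != nullptr);
    uint32_t hash = 5381u;
    while (*s) {
        hash = hash * 33u + static_cast<uint8_t>(*s++);
    }
    return hash;
}

// CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
inline uint32_t crc32_hash(const char* s) {
    assert(s != nullptr);
    uint32_t crc = 0xFFFFFFFFu;
    while (*s) {
        crc ^= static_cast<uint8_t>(*s++);
        for (int bit = 0; bit < 8; bit++) {
            // If the low bit is set, shift and XOR in the polynomial.
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}
```

A table-driven CRC32 is faster, but the bitwise form is closer to what fits in size-constrained shellcode.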

Why is API Hashing Effective?

  • Prevents string-based detection: No plaintext function names.
  • Anti-analysis and anti-reversing: Static tools and analysts can't directly see which functions are called.
  • Reduces size: A 4-byte hash replaces a longer function name string in the binary.

NT API Hashing — Resolving Internal Windows Syscalls with Hashing

Now that we’ve covered API hashing for user-mode functions like LoadLibraryA, let’s move on to NT API Hashing. NT APIs (e.g., NtAllocateVirtualMemory, NtQueryInformationProcess) are the native API functions exported by ntdll.dll. Each one is a thin user-mode stub that issues the corresponding system call into the kernel, and many are only partially documented.

The Win32 headers and import libraries do not expose most NT APIs directly, so they are usually resolved at run time. Malware and advanced attackers often hash these function names as well to:

  • Avoid detection.
  • Call kernel services through ntdll.dll stubs instead of the more heavily monitored Win32 wrappers.

We’ll cover NT API hashing using the same FNV-1A hashing technique and show how to resolve NT functions dynamically.

Understanding NT API Resolution

NT API functions are generally reached in one of two ways:

  1. System call numbers: Each NT API has a corresponding syscall number, which can change between Windows builds.
  2. Hashes of function names: We resolve NT API names by hashing them instead of using string literals.

In user mode, NT APIs such as NtQuerySystemInformation or NtAllocateVirtualMemory are the lowest layer before the kernel:

  • Each call transitions directly into the matching kernel service.
  • Their addresses are resolved dynamically from the export table of ntdll.dll.

Let’s hash the NtAllocateVirtualMemory API.

Step 1: Hashing NT API Names (e.g., NtAllocateVirtualMemory)

We continue using FNV-1A to hash NT API names.

Example Code — Hashing NT API Function

#include <windows.h>
#include <iostream>

#define FNV_PRIME 0x1000193
#define FNV_OFFSET_BASIS 0x811c9dc5

DWORD fnv1a_hash(const char* functionName) {
    DWORD hash = FNV_OFFSET_BASIS;
    while (*functionName) {
        hash ^= (BYTE)(*functionName++);
        hash *= FNV_PRIME;
    }
    return hash;
}

int main() {
    const char* functionName = "NtAllocateVirtualMemory";
    DWORD hashValue = fnv1a_hash(functionName);

    std::cout << "Hash of " << functionName << " = 0x" << std::hex << hashValue << std::endl;
    return 0;
}

Output (the hex digits shown are illustrative; run the program to get the actual value):

Hash of NtAllocateVirtualMemory = 0x58E1F14E

We now have a compact hash for NtAllocateVirtualMemory.

Step 2: Resolving NT API Address Dynamically

We can manually resolve NT API functions by parsing the exported functions in ntdll.dll, then comparing the hash with the function names.

#include <windows.h>
#include <iostream>

#define FNV_PRIME 0x1000193
#define FNV_OFFSET_BASIS 0x811c9dc5

// constexpr allows the hash of a literal to be computed at compile time
constexpr DWORD fnv1a_hash(const char* functionName) {
    DWORD hash = FNV_OFFSET_BASIS;
    while (*functionName) {
        hash ^= (BYTE)(*functionName++);
        hash *= FNV_PRIME;
    }
    return hash;
}

FARPROC GetNtApiByHash(HMODULE hModule, DWORD functionHash) {
    // Locate the export directory through the PE headers
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hModule;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((BYTE*)hModule + dosHeader->e_lfanew);
    IMAGE_DATA_DIRECTORY exportDataDir = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    if (exportDataDir.VirtualAddress == 0) {
        return nullptr; // Module has no export table
    }
    PIMAGE_EXPORT_DIRECTORY exportDir = (PIMAGE_EXPORT_DIRECTORY)((BYTE*)hModule + exportDataDir.VirtualAddress);

    DWORD* addressOfFunctions = (DWORD*)((BYTE*)hModule + exportDir->AddressOfFunctions);
    DWORD* addressOfNames = (DWORD*)((BYTE*)hModule + exportDir->AddressOfNames);
    WORD* addressOfNameOrdinals = (WORD*)((BYTE*)hModule + exportDir->AddressOfNameOrdinals);

    // Hash every exported name and compare it with the target hash
    for (DWORD i = 0; i < exportDir->NumberOfNames; i++) {
        const char* functionName = (const char*)((BYTE*)hModule + addressOfNames[i]);
        if (fnv1a_hash(functionName) == functionHash) {
            WORD functionOrdinal = addressOfNameOrdinals[i];
            return (FARPROC)((BYTE*)hModule + addressOfFunctions[functionOrdinal]);
        }
    }
    return nullptr;
}

int main() {
    HMODULE ntdllBase = GetModuleHandleA("ntdll.dll");
    if (!ntdllBase) {
        std::cerr << "Failed to get ntdll base address\n";
        return 1;
    }

    // Evaluated at compile time from the literal
    constexpr DWORD ntAllocMemHash = fnv1a_hash("NtAllocateVirtualMemory");
    FARPROC pNtAllocMem = GetNtApiByHash(ntdllBase, ntAllocMemHash);

    if (pNtAllocMem) {
        // Cast to void* so the stream prints an address instead of a bool
        std::cout << "NtAllocateVirtualMemory found at: "
                  << reinterpret_cast<void*>(pNtAllocMem) << std::endl;
    } else {
        std::cerr << "Failed to resolve NtAllocateVirtualMemory\n";
    }

    return 0;
}

Real-World Use of NT API Hashing

1. Windows Internal Syscalls

  • ntdll.dll exports many undocumented NT APIs used by Windows internally.
  • Security tooling and malware alike typically reach these functions either by system call number or by resolving the export through a hashed name.

2. Malware & Exploits

  • Malware often uses NT API hashing to:
       • Reach kernel services (e.g., NtAllocateVirtualMemory, NtMapViewOfSection).
       • Perform system calls without importing the functions by name, which helps evade detection.
  • Frameworks like Cobalt Strike are reported to resolve functions like NtQuerySystemInformation by hash for anti-analysis purposes.

Real-World Malware Implementations of API Hashing (NT APIs)

In this section, we’ll explore how real-world malware such as Cobalt Strike, TrickBot, and others use NT API hashing and manual resolution techniques to evade detection and analysis. These techniques ensure that malware is less likely to be flagged by antivirus programs and researchers. By hiding API names and relying on dynamic runtime resolution, malware can bypass many signature-based detection systems.

Malware Implementation: Cobalt Strike’s API Hashing for NT Syscalls

Cobalt Strike Overview

Cobalt Strike is a popular penetration testing framework used by both red teams and adversaries. It provides powerful post-exploitation capabilities like beaconing and command-and-control functionalities. One of the key features of Cobalt Strike is its use of reflective DLL injection and API hashing to execute payloads stealthily.

Cobalt Strike uses NT API hashing to resolve system calls for:

  • Memory allocation (NtAllocateVirtualMemory)
  • Process injection (NtCreateThreadEx)
  • Suspending processes (NtSuspendProcess)

Cobalt Strike primarily uses NT API hashing to ensure that the function names are not visible in memory. This makes it harder for security software to detect the use of system APIs.

Cobalt Strike’s API Hashing Process

Here’s a simplified outline of how Cobalt Strike employs NT API hashing:

  1. Hashing the NT API names: Just like we’ve demonstrated earlier, the API names (e.g., NtAllocateVirtualMemory) are hashed ahead of time. The public reflective loader that Cobalt Strike's default loader derives from uses a ROR13-style hash; the examples in this article use FNV-1A for simplicity.
  2. Storing the hashes: These hashes are stored in the malware binary or in-memory.
  3. Resolving function addresses dynamically:
       • It locates the already loaded ntdll.dll or kernel32.dll (typically through the PEB) rather than loading them again.
       • It enumerates the functions in the export table, hashes their names, and compares the hashes against stored values.
       • Once a match is found, it retrieves the function address and uses it to perform system calls.

Reflective DLL Injection with NT API Hashing

Cobalt Strike frequently performs reflective DLL injection, which is a technique that injects a DLL into another process without relying on the Windows loader.

In this process, the attacker:

  1. Hashes the name of critical NT APIs (like NtCreateThreadEx or NtAllocateVirtualMemory).
  2. Dynamically loads the functions at runtime by comparing the function name hash with the hashes stored in the reflective DLL.
  3. Uses system calls to execute injected code, making it extremely difficult for analysts or automated systems to trace the function calls.

Malware Implementation: TrickBot’s NT API Hashing

TrickBot Overview

TrickBot is a well-known banking Trojan that evolved into a sophisticated modular malware family. It is capable of stealing banking credentials, personal data, and corporate network information. TrickBot employs NT API hashing techniques to avoid detection and evade analysis.

TrickBot uses API hashing to:

  • Evade static analysis tools (e.g., signature-based antivirus software).
  • Avoid naming functions directly, which could be flagged.
  • Avoid hardcoded system calls by dynamically resolving them during runtime.

TrickBot’s API Hashing Process

TrickBot uses a similar process to Cobalt Strike but with some unique adaptations:

  1. API Hashing: TrickBot hashes NT API names such as NtCreateThreadEx, NtWriteVirtualMemory, and NtQueryInformationProcess.
  2. Dynamic Resolution:
       • TrickBot dynamically resolves NT APIs at runtime from ntdll.dll.
       • During infection or payload execution, TrickBot hashes the function names and looks them up using their hashes.
  3. Running Shellcode:
       • TrickBot also uses shellcode injection techniques where it runs malicious code directly in memory without writing to disk.
       • The use of NT API hashing ensures that its shellcode is stealthy and avoids detection by static analysis tools.

How Malware Uses API Hashing to Evade Detection

Obfuscation of API Names:

  • By hashing function names, malware avoids plaintext API names in its binary and in memory. This hinders security tools that identify malware from the API name strings it contains.
  • Signature-based antivirus engines can no longer match on function name strings because they are replaced with hashes, although the hash constants themselves can become signatures.

Stealthy System Calls:

  • Malware can invoke low-level system calls (e.g., NtAllocateVirtualMemory) without directly referring to these functions by name.
  • As the functions are resolved dynamically, they do not appear in the import table, so tools that rely on imports or name strings do not see them.

Hindering Reverse Engineering:

  • Malware authors commonly use hashing to slow down reverse engineers. A sample compiled with hashed API names shows only opaque constants in a disassembler, so an analyst cannot tell at a glance which functions are called.
  • Hashing does not stop a debugger: at run time the resolved addresses are visible. An analyst has to either run the sample or recompute the hashes, so the main cost falls on static analysis.

Malware Example: Implementing NT API Hashing in Cobalt Strike

Let’s take a look at an example of Cobalt Strike’s Reflective DLL Injection with NT API hashing. Here’s an outline of the steps involved:

  1. Hashing NT API Names (using FNV-1A or another algorithm).
  2. Dynamically Resolving Function Addresses by comparing hashes with those in ntdll.dll.
  3. Injecting Code via Reflective DLL Injection.

Code Example (Outline, Not Exact Malware Code)

#include <windows.h>
#include <iostream>

#define FNV_PRIME 0x1000193
#define FNV_OFFSET_BASIS 0x811c9dc5

// constexpr allows the hash of a literal to be computed at compile time
constexpr DWORD fnv1a_hash(const char* functionName) {
    DWORD hash = FNV_OFFSET_BASIS;
    while (*functionName) {
        hash ^= (BYTE)(*functionName++);
        hash *= FNV_PRIME;
    }
    return hash;
}

FARPROC GetNtApiByHash(HMODULE hModule, DWORD functionHash) {
    // Locate the export directory through the PE headers
    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hModule;
    PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((BYTE*)hModule + dosHeader->e_lfanew);
    IMAGE_DATA_DIRECTORY exportDataDir = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    if (exportDataDir.VirtualAddress == 0) {
        return nullptr; // Module has no export table
    }
    PIMAGE_EXPORT_DIRECTORY exportDir = (PIMAGE_EXPORT_DIRECTORY)((BYTE*)hModule + exportDataDir.VirtualAddress);

    DWORD* addressOfFunctions = (DWORD*)((BYTE*)hModule + exportDir->AddressOfFunctions);
    DWORD* addressOfNames = (DWORD*)((BYTE*)hModule + exportDir->AddressOfNames);
    WORD* addressOfNameOrdinals = (WORD*)((BYTE*)hModule + exportDir->AddressOfNameOrdinals);

    // Hash every exported name and compare it with the target hash
    for (DWORD i = 0; i < exportDir->NumberOfNames; i++) {
        const char* functionName = (const char*)((BYTE*)hModule + addressOfNames[i]);
        if (fnv1a_hash(functionName) == functionHash) {
            WORD functionOrdinal = addressOfNameOrdinals[i];
            return (FARPROC)((BYTE*)hModule + addressOfFunctions[functionOrdinal]);
        }
    }
    return nullptr;
}

void InjectShellcode(HMODULE hModule) {
    // Here you would use resolved NT APIs to inject shellcode
    FARPROC NtCreateThreadEx = GetNtApiByHash(hModule, fnv1a_hash("NtCreateThreadEx"));
    // Execute shellcode by invoking NtCreateThreadEx or similar API
    (void)NtCreateThreadEx; // Outline only: nothing is executed here
}

int main() {
    HMODULE ntdllBase = GetModuleHandleA("ntdll.dll");
    if (!ntdllBase) {
        std::cerr << "Failed to get ntdll base address\n";
        return 1;
    }

    constexpr DWORD ntAllocMemHash = fnv1a_hash("NtAllocateVirtualMemory");
    FARPROC pNtAllocMem = GetNtApiByHash(ntdllBase, ntAllocMemHash);

    if (pNtAllocMem) {
        // Cast to void* so the stream prints an address instead of a bool
        std::cout << "NtAllocateVirtualMemory found at: "
                  << reinterpret_cast<void*>(pNtAllocMem) << std::endl;
        // Proceed with malware injection or execution
    } else {
        std::cerr << "Failed to resolve NtAllocateVirtualMemory\n";
    }

    return 0;
}

Breakdown of the Code

FNV_PRIME and FNV_OFFSET_BASIS: These constants are part of the FNV-1A hash algorithm.

  • FNV_PRIME is the prime number used in the hashing process.
  • FNV_OFFSET_BASIS is the initial value that starts the hash.

fnv1a_hash function:

  • This function takes the function name (e.g., "NtAllocateVirtualMemory") and computes its hash using the FNV-1A algorithm.
  • It initializes the hash with FNV_OFFSET_BASIS, iterates through each character of the string, and performs XOR and multiplication using FNV_PRIME on each character.
  • This produces a compact hash value for the function name, which can later be compared with other hashed values to identify function names without using the original string. Collisions are possible with a 32-bit hash, but rare within one DLL's exports.

PIMAGE_DOS_HEADER and PIMAGE_NT_HEADERS: These structures are part of the PE (Portable Executable) file format, which is how DLLs and EXEs are laid out in memory. We need these structures to parse the export table.

  • PIMAGE_DOS_HEADER helps us get the DOS header of the module (this is where we can find the pointer to the PE header).
  • PIMAGE_NT_HEADERS provides the NT headers of the module, which contains the necessary information to navigate to the export table.

IMAGE_DATA_DIRECTORY and IMAGE_EXPORT_DIRECTORY: These structures are used to find the export table. The export table in a DLL contains information about the functions it exports (i.e., the addresses of functions like NtAllocateVirtualMemory).

Parsing the Export Table: The function looks up the address of functions (AddressOfFunctions), names (AddressOfNames), and name ordinals (AddressOfNameOrdinals).

  • AddressOfFunctions contains the RVAs of the functions, which become addresses once the module base is added.
  • AddressOfNames contains the RVAs of the names of the functions.
  • AddressOfNameOrdinals contains, for each name, the index into AddressOfFunctions.

Function Matching by Hash: The function iterates over all the exported names in ntdll.dll, hashes them, and compares the hash with the one we generated earlier. When a match is found, the function returns the address of the function.

GetModuleHandleA("ntdll.dll"): This function retrieves the base address of the ntdll.dll library in memory. This is required because we need to find the export table in ntdll.dll.

Hashing NtAllocateVirtualMemory: We hash the name of the NT API function NtAllocateVirtualMemory using the fnv1a_hash function.

Resolving the Function: We call GetNtApiByHash to resolve the address of NtAllocateVirtualMemory by comparing the hash we just generated with the hashes of the function names in ntdll.dll. If a match is found, we print the address of the function.

Error Handling: If the function cannot be found, an error message is printed.
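
Since a 32-bit hash is not guaranteed to be unique, it can be worth checking a list of names for collisions before relying on the hashes as identifiers. This is a minimal portable sketch of such a check. `find_collisions` and `HashFn` are hypothetical names, and the hash function is a parameter so that the same check works for FNV-1A or any other algorithm.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// FNV-1a, 32-bit, over a std::string.
inline uint32_t fnv1a_32(const std::string& s) {
    uint32_t hash = 0x811c9dc5u;
    for (unsigned char c : s) {
        hash ^= c;
        hash *= 0x01000193u;
    }
    return hash;
}

using HashFn = uint32_t (*)(const std::string&);

// Returns every pair of distinct names that hash to the same value.
inline std::vector<std::pair<std::string, std::string>>
find_collisions(const std::vector<std::string>& names, HashFn hash = fnv1a_32) {
    assert(hash != nullptr);
    std::unordered_map<uint32_t, std::string> seen; // hash -> first name seen
    std::vector<std::pair<std::string, std::string>> collisions;
    for (const auto& name : names) {
        uint32_t h = hash(name);
        auto it = seen.find(h);
        if (it == seen.end()) {
            seen.emplace(h, name);
        } else if (it->second != name) {
            collisions.emplace_back(it->second, name);
        }
    }
    return collisions;
}
```

Feeding this the full export name list of a module (dumped with any PE viewer) shows whether a given hash algorithm is safe to use for that module.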

Mitigating API Hashing in Security: Detection and Defense

In this section, we will discuss how security researchers, antivirus products, and operating systems can mitigate or detect the use of API hashing techniques employed by malware. We’ll explore various strategies to counter the obfuscation tactics used by attackers, focusing on detection mechanisms and defensive techniques.

Techniques for Detecting API Hashing

Detecting API hashing techniques used by malware is crucial for combating malware that hides its behavior by hashing function names. The detection methods can range from heuristic approaches to advanced memory scanning. Here are some common techniques used to detect API hashing:

1. Static Code Analysis

Static analysis involves analyzing the program’s binary without executing it. Tools like IDA Pro, Ghidra, or Radare2 can be used to inspect malware binaries for suspicious behaviors, including API hashing.

  • Signature-based detection: Static analysis tools can look for known hash values or hashing patterns (e.g., the FNV-1A constants 0x811c9dc5 and 0x01000193). If an attacker uses a standard hash algorithm without modification, precomputed hashes of common API names will match constants in the binary.
  • Decompilation: By decompiling or disassembling the binary, analysts can look for hashing functions (such as the FNV-1A algorithm) and cross-reference them with known API names to detect the API calls being made dynamically.

Example:

  • Using Ghidra or IDA Pro, an analyst might identify the FNV-1A algorithm in the binary and spot patterns in the hashes that correspond to specific system calls, such as NtAllocateVirtualMemory or NtCreateThreadEx.
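
The cross-referencing step can be sketched as a reverse lookup table: hash every candidate API name once, then translate constants found in the disassembly back into names. This is a minimal portable sketch assuming the sample uses unmodified 32-bit FNV-1A; `build_hash_table`, `resolve_hash`, and the demo function are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// FNV-1a, 32-bit, over a std::string.
inline uint32_t fnv1a_32(const std::string& s) {
    uint32_t hash = 0x811c9dc5u;
    for (unsigned char c : s) {
        hash ^= c;
        hash *= 0x01000193u;
    }
    return hash;
}

using HashTable = std::unordered_map<uint32_t, std::string>;

// Precompute hash -> name for every candidate API name.
// On a collision the first name inserted is kept.
inline HashTable build_hash_table(const std::vector<std::string>& apiNames) {
    HashTable table;
    for (const auto& name : apiNames) {
        table.emplace(fnv1a_32(name), name);
    }
    return table;
}

// Translate a constant found in a binary back into an API name.
inline std::string resolve_hash(const HashTable& table, uint32_t hash) {
    auto it = table.find(hash);
    return it != table.end() ? it->second : std::string("<unknown>");
}

// Usage example: annotate one constant with a small candidate list.
inline void reverse_lookup_demo() {
    HashTable table = build_hash_table({"NtAllocateVirtualMemory", "NtCreateThreadEx"});
    uint32_t constantFromBinary = fnv1a_32("NtAllocateVirtualMemory");
    assert(resolve_hash(table, constantFromBinary) == "NtAllocateVirtualMemory");
}
```

In practice the candidate list would be the full export name lists of ntdll.dll and kernel32.dll. If no constant resolves, the sample probably uses a different or modified algorithm.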

2. Memory Forensics

Memory forensics is the process of analyzing a process’s memory to identify malicious behavior that may not be visible in the static binary.

  • Memory dumps can be taken while malware is running, and these can be analyzed to find function names or addresses that were dynamically resolved through API hashing.
  • Resolved-address analysis: Frameworks like Volatility or Rekall work on a memory snapshot rather than tracing live calls. If malware hashed its API names, the addresses it resolved (for example, in a pointer table) are still present in the dump and can be mapped back to exports.

Example:

  • If a process is running, a memory dump can be taken and analyzed to spot resolved NT API addresses. Even though the API names are hashed, the actual resolved addresses can be identified and matched to known function addresses.
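
Matching a resolved address back to a function comes down to a sorted lookup: find the export whose start address is the closest one at or below the pointer found in the dump. This is a minimal portable sketch with made-up addresses. `SymbolMap` and `nearest_symbol` are hypothetical names, and a real tool would build the map from the export tables of the modules in the dump.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol map: export start address -> export name.
using SymbolMap = std::map<uint64_t, std::string>;

// Returns the name of the export that starts at or just below `address`,
// or an empty string if the address precedes every known symbol.
inline std::string nearest_symbol(const SymbolMap& symbols, uint64_t address) {
    auto it = symbols.upper_bound(address); // first symbol strictly above
    if (it == symbols.begin()) return std::string();
    --it;
    return it->second;
}

// Usage example with invented addresses.
inline void nearest_symbol_demo() {
    SymbolMap symbols = {
        {0x1000, "NtAllocateVirtualMemory"},
        {0x1020, "NtCreateThreadEx"},
    };
    // A pointer equal to an export's start address resolves to that export.
    assert(nearest_symbol(symbols, 0x1020) == "NtCreateThreadEx");
    // A pointer a few bytes into a stub still resolves to the same export.
    assert(nearest_symbol(symbols, 0x1008) == "NtAllocateVirtualMemory");
}
```

This sketch does not check that the address falls inside the module at all. A real tool would first confirm the pointer lies within the module's mapped range.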

3. Behavioral Analysis

Behavioral analysis involves observing the program’s actions during runtime, such as system calls and file interactions, to identify patterns that are typical of malware.

  • API call tracking: Monitoring system calls, especially NT API calls, allows for identifying suspicious or malicious activities. Even if the API names are hashed, the pattern of calls (e.g., NtAllocateVirtualMemory followed by NtCreateThreadEx) can be flagged as suspicious.
  • Sandboxes and Virtualization: Malware can be executed in a sandbox or virtual machine to observe its behavior. While the program might use hashed API names, its behavior (e.g., injecting code into other processes) can still be detected.

Example:

  • If malware uses NtCreateThreadEx to inject code into another process, a behavioral analysis system can flag this as suspicious regardless of whether the function name was hashed or not.
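The call-pattern idea can be sketched as an ordered-subsequence check over a recorded trace. This is a simplified illustration, not how any particular product works: the trace is assumed to be a list of API names already captured by a sandbox or monitor, and contains_sequence and looks_like_injection are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if every call in `pattern` occurs in `trace` in the same order.
// Unrelated calls may be interleaved between the pattern's elements.
bool contains_sequence(const std::vector<std::string>& trace,
                       const std::vector<std::string>& pattern) {
    assert(!pattern.empty());
    std::size_t next = 0;  // index of the next pattern element to find
    for (const auto& call : trace) {
        if (next < pattern.size() && call == pattern[next]) ++next;
    }
    return next == pattern.size();
}

// Flags the allocate-then-create-thread pattern mentioned above.
bool looks_like_injection(const std::vector<std::string>& trace) {
    static const std::vector<std::string> kPattern = {
        "NtAllocateVirtualMemory", "NtCreateThreadEx"};
    return contains_sequence(trace, kPattern);
}
```

The check never looks at how the addresses were obtained, which is why hashing does not help against it. A realistic rule would also compare the target process handles, since both calls are common when a process only operates on itself.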

Defensive Techniques: Mitigating API Hashing in Malware

While detection is important, defense techniques focus on making it harder for attackers to successfully employ API hashing or dynamic resolution of functions. Several defensive mechanisms can be put in place to block or mitigate these techniques:

1. Windows Kernel Security Features

Windows offers built-in security mechanisms that limit which code can load and run, which in turn limits what a payload using hashed APIs can do:

  • Driver Integrity Checking: Enforcing driver signing prevents unsigned drivers from loading, which stops malware from hooking or patching system calls from kernel mode.
  • Windows Defender Application Control (WDAC): WDAC lets administrators define which executables, DLLs, scripts, and drivers are allowed to run. It does not inspect API hooks directly, but it can stop the loader that carries a hashed-API payload from executing in the first place.
  • Code Integrity Guard (CIG): CIG restricts a process to loading only images signed by Microsoft (and optionally the Microsoft Store), which blocks conventional DLL injection into that process. Reflective loaders, which often use hashed NT APIs, map code from memory without the OS loader, so CIG is usually paired with Arbitrary Code Guard (ACG), which blocks the creation of new executable memory.

2. Code Rewriting (Self-Healing Techniques)

Some security tools and malware analysis systems rewrite or redirect code at runtime. When a process behaves suspiciously, the tool may restore patched system-library code to its original bytes (for example, an ntdll.dll stub the malware has overwritten) or redirect selected calls through its own instrumentation, so the activity stays visible.

Example:

  • Countering anti-VM checks: Some malware refuses to run its real logic when it detects a virtual machine. An analysis sandbox can hide or rewrite the artifacts those checks look for, so the sample goes on to resolve its hashed APIs and the resulting calls can be observed.

3. Behavioral Heuristics and Machine Learning

Advanced antivirus and security products are increasingly relying on machine learning (ML) and behavioral heuristics to detect unknown threats, including those using API hashing techniques.

  • Heuristic detection: This involves identifying suspicious patterns that indicate malicious behavior, such as an application making an unusual number of system calls or attempting to allocate large amounts of memory dynamically.
  • Machine learning models can be trained on known malware behavior to predict new malware strains that use similar techniques, such as API hashing and dynamic resolution.

4. Endpoint Detection and Response (EDR)

EDR tools track activity across endpoints and analyze unusual behavior. They generally cannot see that a name was hashed; what they observe is the resulting call, which looks the same however its address was resolved.

  • Anomaly detection: EDR tools can track process execution patterns and flag unusual or suspicious API calls, even when the addresses were resolved through hashes.
  • Real-time analysis: EDR tools continuously monitor running processes, looking for abnormal API usage patterns.

Example:

  • If a process with almost no declared imports walks the export table of ntdll.dll and then starts calling memory-allocation and thread-creation functions, an EDR tool might flag the activity as suspicious.

Counteracting API Hashing in Malware

To summarize, here are the main defense mechanisms that can counteract API hashing techniques used by malware:

  1. Static and Dynamic Analysis: Using both static code analysis and memory forensics can detect hashed API calls and their resolved addresses.
  2. Monitoring System Calls: Even if API names are hashed, the system calls themselves can still be tracked and flagged for unusual behavior.
  3. Using Built-in Windows Security Features: Enforcing kernel security and API protection mechanisms can make it harder for malware to manipulate system functions.
  4. Behavioral and Heuristic Detection: These methods detect suspicious patterns of API usage, even when function names are hashed, making it harder for malware to hide its actions.
  5. EDR and Machine Learning: Advanced endpoint protection tools can spot new malware variants using dynamic resolution and hashing techniques.

Cobalt Strike — Case Study

Let’s dive deep into how Cobalt Strike implements API hashing and how it can evade detection even when kernel-level security measures are in place. Cobalt Strike, a legitimate red teaming and penetration testing tool, has often been repurposed by attackers for post-exploitation and lateral movement. It is known for its stealth and robustness, making it an excellent example of how API hashing techniques can be effectively implemented for evasion in a high-stakes attack.

We’ll break this down step by step, from the high-level overview to the low-level technical details of how Cobalt Strike leverages API hashing and continues to operate effectively, even in environments where advanced defenses are in place.

1. High-Level Overview of Cobalt Strike’s API Hashing and Evasion Techniques

Cobalt Strike is typically used in red-team scenarios for post-exploitation tasks such as:

  • Privilege escalation
  • Credential harvesting
  • Lateral movement
  • Command and control (C2)

To achieve stealth during these activities, Cobalt Strike uses various evasion techniques, including API hashing, to obfuscate its interactions with the operating system, thereby avoiding detection from traditional antivirus and endpoint protection solutions.

Why Cobalt Strike Uses API Hashing

  • Evading Signature-Based Detection: Security solutions, such as antivirus software, rely on signature-based detection, which includes recognizing specific function names (e.g., NtCreateThreadEx, VirtualAllocEx, NtAllocateVirtualMemory). By hashing the API function names, Cobalt Strike can make it much harder for these solutions to recognize the actual system calls.
  • Dynamic Behavior: Malware analysts and security tools often inspect running processes to detect suspicious patterns. Cobalt Strike uses dynamic resolution of API functions, making it much harder to pin down the exact system calls it is using.
  • Mimicking Legitimate Behavior: By dynamically resolving function names and keeping interactions with the operating system cryptic, Cobalt Strike can better blend in with normal system activities, making it difficult for intrusion detection systems (IDS) or heuristic-based security systems to flag it as malicious.

2. How Cobalt Strike Implements API Hashing

Cobalt Strike uses API hashing in several key areas, including system calls related to memory allocation, thread creation, networking, and file operations. Below, we break down the core parts of how it implements API hashing in practice.

A. Hashing the Function Names

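Cobalt Strike obscures the names of the API functions it needs by storing hashes instead of strings. Public analyses of its default stager shellcode describe a rotate-right-by-13 (ROR13) additive hash; this section uses FNV-1A as the running example, because the resolution logic is the same whichever hash is chosen. Here’s how the process works:

Hashing the NT API Name:

  • Function names like NtCreateThreadEx, NtWriteVirtualMemory, and NtAllocateVirtualMemory are first hashed (with FNV-1A in this example). The result is a fixed-size hash value (0x12345678 is used here purely as a placeholder), representing the original function name.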

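Example:

#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
   uint32_t keeps the result 32 bits wide on every platform. */
uint32_t FNV_1A(const char *str) {
    uint32_t hash = 0x811C9DC5u;         /* FNV offset basis */
    while (*str) {
        hash ^= (unsigned char)(*str++);
        hash *= 0x01000193u;             /* FNV prime; wraps modulo 2^32 */
    }
    return hash;
}

This maps NtCreateThreadEx to a fixed 32-bit value. The value is not guaranteed to be unique, since any 32-bit hash can collide, but collisions among the exports of a single DLL are rare in practice.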

Dynamically Resolving Function Address:

  • Instead of calling the function directly by name, Cobalt Strike performs a dynamic resolution step: it walks the export table of ntdll.dll (or another relevant system library), hashes each exported name, and compares the result with the stored hash. A match yields the actual function address at runtime.
  • Example: Given the hash of NtCreateThreadEx, Cobalt Strike will search the export table of ntdll.dll to find the corresponding function pointer.
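The lookup logic can be illustrated without touching a real process. The sketch below runs against a mock export list: MockExport, its made-up addresses, and resolve_by_hash are hypothetical, and FNV-1A is assumed as the hash. It shows only the compare-by-hash loop, not how a loader locates a module’s export directory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for one export-table entry; the addresses are invented for the demo.
struct MockExport {
    const char* name;
    std::uint64_t address;
};

// 32-bit FNV-1a over a NUL-terminated string.
std::uint32_t fnv1a32(const char* s) {
    assert(s != nullptr);
    std::uint32_t h = 0x811C9DC5u;
    while (*s != '\0') {
        h ^= static_cast<unsigned char>(*s++);
        h *= 0x01000193u;
    }
    return h;
}

// Hashes each exported name and returns the address of the first entry whose
// hash equals `target`, or 0 if nothing matches.
std::uint64_t resolve_by_hash(const std::vector<MockExport>& exports,
                              std::uint32_t target) {
    for (const auto& e : exports) {
        if (fnv1a32(e.name) == target) return e.address;
    }
    return 0;
}
```

For defenders, the useful observation is that the loop has to touch every export name until it finds a match. That access pattern over a module’s export table is itself something monitoring can look for.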

Direct Memory Injection:

  • Cobalt Strike can inject code into another process by calling system functions it has resolved this way. Because the addresses are obtained by hash comparison, the injection routine contains no API name strings and no import-table entries for those functions.

B. Dynamic API Resolution at Runtime

  • Cobalt Strike doesn’t hard-code API function names or addresses into its payloads or binaries. Instead, it embeds precomputed hashes of the API names it needs; at runtime it hashes the names in a module’s export table and uses the matching entry to obtain the correct function address.
  • This technique allows it to avoid signature-based detection that relies on knowing the specific function names being invoked.

For instance, Cobalt Strike may:

  1. Carry the hash of NtCreateThreadEx as a constant (0xabc12345 here is only a placeholder, not the real hash).
  2. Walk the export table of ntdll.dll, hashing each name until one matches. GetProcAddress cannot do this step directly, because it takes a name or an ordinal rather than a hash.
  3. Use the resolved address to create a thread or allocate memory without the function’s name ever appearing in the binary, thus avoiding tools that scan for specific strings or imports.
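The reason the name can vanish from the binary is that the hash can be computed at build time. A minimal C++14 sketch of that idea, assuming FNV-1A, follows; fnv1a32_ct, kHashNtCreateThreadEx, and matches_target are hypothetical names, and whether the string literal is actually dropped from the output depends on the compiler and its optimization settings.

```cpp
#include <cassert>
#include <cstdint>

// constexpr FNV-1a, so the compiler can evaluate it on string literals.
constexpr std::uint32_t fnv1a32_ct(const char* s) {
    std::uint32_t h = 0x811C9DC5u;
    while (*s != '\0') {
        h ^= static_cast<std::uint32_t>(static_cast<unsigned char>(*s));
        h *= 0x01000193u;
        ++s;
    }
    return h;
}

// Computed during compilation; only the 32-bit constant needs to be stored.
constexpr std::uint32_t kHashNtCreateThreadEx = fnv1a32_ct("NtCreateThreadEx");

// Sanity checks against known FNV-1a values.
static_assert(fnv1a32_ct("") == 0x811C9DC5u, "empty input yields the offset basis");
static_assert(fnv1a32_ct("a") == 0xE40C292Cu, "FNV-1a test vector for \"a\"");

// Runtime side: hash a candidate export name and compare with the constant.
bool matches_target(const char* export_name) {
    assert(export_name != nullptr);
    return fnv1a32_ct(export_name) == kHashNtCreateThreadEx;
}
```

This is also why string-based signatures fail here: a scan of the compiled file finds a 32-bit immediate, not the text NtCreateThreadEx.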

3. How Cobalt Strike Evades Kernel-Level Security Measures

Now, let’s discuss how Cobalt Strike can bypass kernel-level security mechanisms, even when security solutions are in place.

A. Kernel-Level Protections

Modern versions of Windows OS have several kernel-level defenses to prevent the abuse of NT APIs:

  • Code Integrity: Ensures that only signed drivers can load and, where an application control policy is configured, that only approved executables can run on the system.
  • Protected Process Light (PPL): Ensures that certain critical system processes are protected from modification or injection.
  • Driver Signature Enforcement: Prevents the loading of unsigned drivers.
  • Windows Defender: Attempts to block suspicious activity at the kernel level, such as API hooking or direct memory injection.

B. How Cobalt Strike Bypasses These Defenses

Even with these security measures in place, Cobalt Strike has been effective for red teaming and adversary simulation due to several factors:

Abusing Trusted API Calls:

  • Cobalt Strike can run its code inside trusted system processes. By injecting into a signed, commonly allowlisted process (e.g., svchost.exe), it inherits that process’s reputation and permissions.
  • This helps to evade detection from security products that focus on detecting suspicious behavior in processes that are not typically trusted.

API Hashing with Custom Algorithms:

  • While FNV-1A is common, Cobalt Strike can implement custom hashing algorithms that make it harder for security researchers to identify the exact algorithm. This further complicates reverse engineering efforts and helps evade signature-based detections.
  • Custom hashing ensures that even if a particular hashing algorithm is detected (like FNV-1A), Cobalt Strike can simply change its hashing method, making the detection methods obsolete.
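When the algorithm is unknown, an analyst can try several well-known candidates against a hash recovered from the sample. The sketch below is illustrative only: the three candidates (FNV-1A, a rotate-right-13 additive hash, and djb2) are a hand-picked assumption, explain_hash is a hypothetical name, and real samples often add twists such as seeds, case folding, or hashing the module name as well.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

std::uint32_t fnv1a32(const std::string& s) {
    std::uint32_t h = 0x811C9DC5u;
    for (unsigned char c : s) { h ^= c; h *= 0x01000193u; }
    return h;
}

// Rotate right by 13 bits, then add the next byte.
std::uint32_t ror13_add(const std::string& s) {
    std::uint32_t h = 0;
    for (unsigned char c : s) { h = (h >> 13) | (h << 19); h += c; }
    return h;
}

std::uint32_t djb2(const std::string& s) {
    std::uint32_t h = 5381u;
    for (unsigned char c : s) h = h * 33u + c;
    return h;
}

struct Candidate {
    const char* label;
    std::uint32_t (*fn)(const std::string&);
};

// Returns every (algorithm label, API name) pair that reproduces `observed`.
std::vector<std::pair<std::string, std::string>> explain_hash(
        std::uint32_t observed, const std::vector<std::string>& names) {
    static const Candidate kCandidates[] = {
        {"fnv1a32", fnv1a32}, {"ror13_add", ror13_add}, {"djb2", djb2}};
    std::vector<std::pair<std::string, std::string>> matches;
    for (const auto& cand : kCandidates) {
        assert(cand.fn != nullptr);
        for (const auto& name : names) {
            if (cand.fn(name) == observed) matches.emplace_back(cand.label, name);
        }
    }
    return matches;
}
```

If several recovered constants all resolve under the same candidate, the algorithm is effectively identified. If none resolve, the hash routine has to be lifted from the sample itself.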

In-memory Execution:

  • Cobalt Strike relies heavily on in-memory execution techniques, where it does not write its malicious payload to disk. Traditional file-based defenses (like antivirus file scanning) are therefore far less effective, because no executable file is present on disk for signature-based detection.
  • In combination with API hashing, Cobalt Strike resolves system calls dynamically, allowing it to inject code into memory without triggering static file-based defenses.

Reflection and Dynamic Linking:

  • Cobalt Strike uses reflection techniques to load its payloads, where it dynamically loads libraries and resolves function addresses without relying on predefined imports. This makes it difficult for signature-based solutions to identify and flag the activity.
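  • Dynamic resolution sidesteps defenses that inspect or hook the import table, because the payload has no import entries for the functions it calls. It does not by itself defeat inline hooks placed inside ntdll.dll or kernel callbacks, since the resolved address still leads to the same function.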

4. Case Study: Cobalt Strike in Action

Let’s take a practical example:

Step 1: Hashing and Resolving the API Call

  • The payload carries a precomputed hash of NtCreateThreadEx (FNV-1A in this walkthrough).
  • At runtime it walks the export table of ntdll.dll, hashes each exported name, and takes the address of the entry whose hash matches.

Step 2: Using the Resolved Function

  • With the function address in hand, Cobalt Strike uses it to create a new thread in a remote process, say svchost.exe.
  • This thread executes a payload, which can then be used for command and control operations, data exfiltration, or lateral movement.

Step 3: Bypassing Kernel-Level Security

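  • Because Cobalt Strike combines trusted processes, dynamic function resolution, and in-memory execution, defenses that depend on file signatures or static imports (including real-time file scanning in Windows Defender) may miss the activity. API hashing does not disable kernel-level enforcement such as Code Integrity Guard; it removes the static indicators that would otherwise draw attention, so detection has to rely on runtime behavior.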

Summary

Cobalt Strike’s effectiveness as a red teaming tool, even where kernel-level defenses are deployed, comes from combining API hashing, dynamic resolution, custom algorithms, and in-memory execution. By storing hashes of critical NT API function names and resolving them at runtime, it avoids many signature-based detections, although the resulting system calls remain visible to behavioral monitoring.

This makes Cobalt Strike a powerful tool for red teamers and attackers, but also a critical tool for defenders to understand in terms of evasion techniques and mitigation strategies.

#APIHashing #WindowsInternals #DefenseEvasion #RedTeaming

Written by Nikhil gupta

Incident Response, Threat Hunting, and Reverse Engineering professional, writing things to learn them better. https://www.linkedin.com/in/nikhilnow/
