Table of Contents
- Background Information
- Exploitation Techniques
This document describes a set of techniques that can be used to turn a heap overflow into an overlapping chunk allocation. The techniques depend on the VS allocator.
Quick overview of their characteristics:
|Minimum Required Overflow Size||3 bytes|
|Overflow Data Requirements||None, they could be random|
|Vulnerable Chunk Allocator||VS|
|Overlapping Chunk Allocator||VS|
|Heap Grooming Complexity||Medium/Hard|
|Dynamic Lookaside Dependence||No[a]|
|Kernelmode/Usermode Segment Heap Applicability||Yes[b]|
[a]: Dynamic lookaside can be helpful for exploitation purposes (e.g. see ), or at least is not a big obstacle. Nevertheless, the dynamic lookaside is disabled in some heaps (e.g. session pool). In those cases, we can’t use the techniques that depend on the dynamic lookaside.
[b]: The techniques do not depend on specific configurations/optimizations of the segment heap or the layer below. So they should apply to both kernelmode and usermode segment heap implementations.
Here we will discuss some specific aspects of the VS allocator that we are going to use later on. Even though we will try to cover the relevant aspects of the VS allocator, it is also recommended to have a look at the following resources ,,,, at least at the parts that discuss the VS allocator. It is also noted that the details provided here will be relevant to the kernel implementation/configuration of the segment heap unless it's explicitly stated otherwise.
Dynamic Memory Request Flow
On a high level, we can say that dynamic memory requests go through two levels of functions. In the first level function (e.g. ExAllocateHeapPool/ExFreeHeapPool), we have some generic housekeeping/optimizations, and the main goal is to forward the memory request to the appropriate second level function. The second level function is essentially the entry point for the specialized allocator itself and essentially where the segment heap is implemented. The segment heap is composed of a number of allocators, including the VS allocator, each optimized for the range of requests they serve. Apart from some special cases (e.g. special pool enabled for a request), the selection of the appropriate allocator by the first level function is based on the request size. It is noted that apart from differences in the segment heap configuration, the first level functions are expected to have the bulk of differences between the usermode/kernelmode segment heap implementations, while the second layer functions should be very similar.
As mentioned earlier, the first-level functions are responsible for some housekeeping operations, which may require metadata. Those metadata can be stored either within the associated heap chunk or in a global table depending on the request type. In case the metadata is embedded within the chunk, the first-level handler adjusts the requested size to ensure enough room for the additional metadata it needs before forwarding the request to the 2nd level handler (e.g. adds 16 bytes to the request size to account for the
_POOL_HEADER). The metadata is typically added at the beginning of the chunk returned by the second-level function and higher level functions should be oblivious to the metadata used by lower-level functions.
The two diagrams below illustrate the memory representation of chunks with embedded and detached 1st level function metadata:
Requests served by the VS allocator
The kernel implementation of the VS allocator is responsible for regular requests with sizes that satisfy either of the following conditions:
a. 0 <= size <= 0x1e0 && LFH[size]==disabled b. 0x1e1 <= size <= 0xfe0 c. 0x1001 <= size < 0x10000 && size % 0x1000 <= 0xfc0
A chunk that belongs to the VS allocator can be in two states, either freed or used and depending on their state, chunks have different characteristics and metadata associated with them.
Used chunks are the chunks that have been allocated but not yet freed(!). Used chunks are always headed with the
1: kd> dt _HEAP_VS_CHUNK_HEADER nt!_HEAP_VS_CHUNK_HEADER +0x000 Sizes : _HEAP_VS_CHUNK_HEADER_SIZE +0x008 EncodedSegmentPageOffset : Pos 0, 8 Bits +0x008 UnusedBytes : Pos 8, 1 Bit +0x008 SkipDuringWalk : Pos 9, 1 Bit +0x008 Spare : Pos 10, 22 Bits +0x008 AllocatedChunkBits : Uint4B
0: kd> dt _HEAP_VS_CHUNK_HEADER_SIZE nt!_HEAP_VS_CHUNK_HEADER_SIZE +0x000 MemoryCost : Pos 0, 16 Bits +0x000 UnsafeSize : Pos 16, 16 Bits +0x004 UnsafePrevSize : Pos 0, 16 Bits +0x004 Allocated : Pos 16, 8 Bits +0x000 KeyUShort : Uint2B +0x000 KeyULong : Uint4B +0x000 HeaderBits : Uint8B
A quick description of the header:
Sizes: this field is a
_HEAP_VS_CHUNK_HEADER_SIZEstructure and is encoded in memory. Assuming
vs_chunk_headeris a variable that holds the address of the
_HEAP_VS_CHUNK_HEADER, this field is encoded as follows:
vs_chunk_header->Sizes.HeaderBits ^ vs_chunk_header ^ RtlpHpHeapGlobals.HeapKey
At least the
HeapKeyis an 8-byte random value independent from the other xor values, so all the fields under
Sizeswill contain a random value regardless of their decoded contents.
Following is a description of the fields in
MemoryCost: This field indicates the number of pages occupied by the current chunk apart from the page where the chunk's vs header lies. MemoryCost for the current chunk is not updated during allocation, so its value could be inaccurate in some cases. This happens for example in case the allocated chunk was part of a bigger chunk that was split to satisfy the smaller size request. In that case, the MemoryCost of the current chunk will be equal to the memory cost of the originally bigger chunk. It's used primarily for allocations from the FreeChunkTree as a tie-breaker essentially, in case multiple chunks exist with the same size. In that case, the chunk with the smallest MemoryCost is returned.
UnsafeSize: The size of the chunk divided by 0x10. So for example, a chunk with size 0x400, will have 0x40 for
UnsafeSize. This field is part of the value used during the FreeChunkTree allocation to compare different nodes and find a suitable chunk according to the requested size. Another use of this field is during coalescing, where it's used to find the next neighboring chunk.
UnsafePrevSize: This is the size of the previous chunk divided by 0x10. One use of this field is during the allocation of a chunk from the
FreeChunkTree. In that case, the
UnsafePrevSizeis used to find the
EncodedSegmentPageOffsetof the previous chunk and identify the beginning of the vs subsegment. It has to go through the previous chunk to find that value, since the chunk from the FreeChunkTree won't have the
EncodedSegmentPageOffsetfield in its header (remember, those chunks are freed). The
UnsafePrevSizeis also used during coalescing to identify the boundaries of the previous neighboring chunk.
Allocated: Indicates whether the chunk is allocated/used or not. For example, this field is used during the coalescing procedure run when a chunk is deallocated to determine whether previous/next chunks are free as well and merge them if they are. It's also used to determine whether a chunk has a valid
EncodedSegmentPageOffsetduring allocation from the
FreeChunkTree. The value of this field is supposed to be 0 for free and 1 for used, but in practice anything different than 0 is considered as used, which in some cases we will see later on might be convenient for exploitation.
EncodedSegmentPageOffset: this field is also encoded. Assuming
vs_chunk_headerholds the address of the
_HEAP_VS_CHUNK_HEADER, this field is encoded as follows:
(vs_chunk_header->EncodedSegmentPageOffset ^ vs_chunk_header ^ RtlpHpHeapGlobals.HeapKey) & 0xff
This field contains the distance in pages from the current chunk to the beginning of the VS subsegment.
UnusedBytes: indicates whether the chunk has unused bytes. This for example can happen when the chunk is allocated from the
FreeChunkTreeand is bigger than the size requested, but the remaining chunk was too small to split (e.g. smaller than 0x20-the maximum header size in both freed and used states). If this value is one, then the unused size is encoded in the last two bytes of the chunk. The unused bytes potentially added at the end of the chunk don't seem to be used anywhere else in the current segment heap implementation.
Chunks that have been freed can be further divided into three categories:
- Chunks placed in the dynamic lookaside.
- Chunks temporarily placed into the delay free list.
- Chunks placed within the FreeChunkTree
The dynamic lookaside is arguably a 1st level function optimization (in kernel-mode) used to facilitate fast reallocation of chunks with hot sizes that fall within a specific size-range
The dynamic lookaside is grouped into what's called buckets (
_RTL_LOOKASIDE) and each bucket is responsible for servicing a specific size range. Each bucket has a pointer to the head of a singly-linked that contains the chunks that are part of the associated bucket. The singly-linked list structure is maintained in the user data part of the chunk returned by the VS allocator (2nd level function return address). In many cases, that location coincides with the location where the
_POOL_HEADER is placed (interestingly, the windbg
!pool command will choke when the pool header is corrupted, e.g. when a chunk is part of a lookaside). The structure used for maintaining the list is
_SINGLE_LIST_ENTRY. The maximum size of this list is calculated dynamically based on the demand for the associated size within a predefined period, with the maximum depth being 0x100 elements. A bucket is only enabled when that size is not zero.
During allocation, the 1st level function will check whether the request size falls within the eligible range of the dynamic lookaside which is
0x1e1-0xf70. If it does it will check whether the bucket associated with that size has chunks available. If it does, it won't go through the VS allocator (2nd level function), instead it will return the first available chunk from the bucket. The mapping of request size to bucket size coincides with the mapping of 1st level request size to the VS allocator request size calculated by the 1st level function before forwarding the request and is shown below:
|Rounded↑ Request Size||Bucket Size|
|0x1f0 - 0x3f0||size+0x10|
|0x400 - 0x7f0||ROUND_UP(size+0x10, 0x40)|
|0x800 - 0xf70||ROUND_UP(size+0x10, 0x80)|
#define ROUND_UP(x, n) (x+n-1)&~(n-1)
So for example, an allocation request with size 0x800 will be managed by bucket with size 0x880. If during the allocation the associated bucket is enabled, but has no chunks inside, then the allocator will proactively allocate
bucket_maximum_list_size/2 chunks and add them to the bucket.
When a chunk is freed, the first level function will use its metadata (ie
_POOL_HEADER.BlockSize) to find the 1st-level chunk size. It will then check whether the size falls under the range of the dynamic lookaside and then see if the associated bucket has space in its list for the new chunk. If the bucket haven't reached its maximum depth then the chunk will be added to the dynamic lookaside, otherwise it will go through the 2nd level function to properly free the chunk.
Delay Free List
The delay free list is VS allocator specific optimization. It's a singly-linked list maintained right after the
_HEAP_VS_CHUNK_HEADER of the soon to be freed chunk. The maximum depth of the list is set to 0x20 entries. When that limit is reached, all the chunks within that list get freed at once. This optimization is potentially used to decrease fragmentation. It also speeds up 97% (0x20/0x21) of pool free calls since they don't have to go through the whole freeing routine. On the downside, big part of the performance gain gets converted to overhead for the rest of the 3% pool free calls. This optimization is only applied for chunk sizes smaller than 0x1000 and chunks cannot be reallocated during the period they are part of the delayed free list.
The FreeChunkTree is a red-black tree and it's the main structure used by the VS specific function to store the free chunks during deallocation when a request is no (longer) eligible for the previous optimizations.
The node value used to iterate the red-black tree is the:
UnsafeSize<<16 | MemoryCost , so in practice the memory cost is used as a tie-breaker when multiple free chunks exist with the same size.
The header used for the chunks inside the FreeChunkTree is the
1: kd> dt _HEAP_VS_CHUNK_FREE_HEADER nt!_HEAP_VS_CHUNK_FREE_HEADER +0x000 Header : _HEAP_VS_CHUNK_HEADER +0x000 OverlapsHeader : Uint8B +0x008 Node : _RTL_BALANCED_NODE
1: kd> dt _RTL_BALANCED_NODE nt!_RTL_BALANCED_NODE +0x000 Children :  Ptr64 _RTL_BALANCED_NODE +0x000 Left : Ptr64 _RTL_BALANCED_NODE +0x008 Right : Ptr64 _RTL_BALANCED_NODE +0x010 Red : Pos 0, 1 Bit +0x010 Balance : Pos 0, 2 Bits +0x010 ParentValue : Uint8B
As we can see, chunks here are still headed with the
_HEAP_VS_CHUNK_HEADER.Sizes (OverlapsHeader) but in contrast to the used chunks, they use the space after to maintain the reb-black tree node header.
During chunk deallocation, the memory cost of the chunk is updated and the chunk is added to the
FreeChunkTree. The FreeChunkTree is only used during deallocation when a chunk cannot be part of the dynamic lookaside or when the delay free list is flushed.
During allocation and if the dynamic lookaside couldn't fulfill the request, the FreeChunkTree is used. The VS allocator iterates the red-black tree and employs a best-fit strategy to identify a suitable chunk for the allocation request. If the size of the identified chunk is bigger than the request size, then typically, the chunk is split, and the remainder of the chunk is added back to the FreeChunkTree. An exception to this is the case where the remainder of the chunk is smaller than 0x20 bytes, in which case the
_HEAP_VS_CHUNK_HEADER.UnusedBytes is set to 1 and the last two bytes of the chunk are set according to the actual unused size.
At this point, we should also mention the
PageAlignLargeAllocs flag. When this flag is enabled, the VS allocator user data will be page aligned for large allocations. In this context, a "large allocation" is an allocation that spans multiple pages. The VS allocator metadata will be placed on the previous page and the allocator will add an extra 16-byte padding after the VS header. The extra padding is most likely added to allow the whole
_HEAP_VS_CHUNK_FREE_HEADER to fit in a single page. The header has to always be in memory, so if it spans over two pages, both pages will have to stay in memory. So this padding allows the allocator to decommit potentially one more page per chunk when a chunk is freed.
An example of how a vs "large" allocation might look like in memory:
PAGE OFFSETS: 0 0xfe0 0xff0 0x1000 ↓ ↓ ↓ ↓ | other chunks | _HEAP_VS_CHUNK_HEADER | PADDING | USER_DATA |
When freed, it would look like the following illustration, where we can see that the whole
_HEAP_VS_CHUNK_FREE_HEADER can fit on a single page because of the padding:
PAGE OFFSETS: 0xfe0 0x1000 ↓ ↓ | _HEAP_VS_CHUNK_FREE_HEADER | USER_DATA | ^ no headers in this page, it can be decommited if user data>PAGE
Due to the implementation of the decommit optimization, when the PageAlignLargeAllocs is enabled, the allocator will always look for a chunk in the
FreeChunkTree that can hold the header padding (i.e. the extra 0x10 bytes). That happens regardless of whether the allocation request was for a 'Large' chunk. After finding a suitable chunk, it will check whether the chunk starts at offset 0xfe0 from its page. If it does, the padding remains regardless of whether the request is for large chunk or not. Otherwise the correct size of the chunk is passed to the
RtlpHpVsChunkSplit which may split the chunk depending on the number of unused bytes.
So the FreeChunkTree is always iterated for a chunk size potentially 0x10 bytes bigger than the request size. The side-effects of this behavior might add some complexity in some exploitation scenarios. For example, let's say we have created the following memory layout by requesting from the 1st level allocator chunks with sizes: 0x7f0, 0x3f0:
0: kd> !poolview 0xffffc18757802c70 //see  Address Size (Status) Tag Type --------------------------------------------------------- 0xffffc18757802050 0x7f0 (Allocated) NpFr Vs 0xffffc18757802860 0x3f0 (Allocated) NpFr Vs * 0xffffc18757802c70 0x370 (Free) Vs
Given the layout above, to have the chunk
0xffffc18757802c70 potentially allocated back to us, then we have to request a maximum size (remember, best-fit algorithm) of
0x360 instead of
0x370. With a
0x360 request size, the allocator will add the additional 0x10 for the padding and find the
0xffffc18757802c70 chunk. It will then subtract the 0x10 from the needed bytes since its address page offset is not 0xfe0 and pass the chunk for splitting. The splitting function will consider the remaining size to small for splitting so there will be no splitting and the UnusedBytes of the
0xffffc18757802c70 chunk will be set to true.
Another example, assuming we are in the following state:
0: kd> !poolview 0xffffc18757802c70 Address Size (Status) Tag Type --------------------------------------------------------- xffffc18757802050 x7f0 (Allocated) NpFr Vs xffffc18757802860 x3f0 (Allocated) NpFr Vs * xffffc18757802c70 x370 (Allocated) NpFr Vs
Now if we want for example to free
0xffffc18757802860 (e.g. to create a hole for the vulnerable chunk) and reallocate it, we run into the padding problem again:
0: kd> !poolview 0xffffc18757802860 Address Size (Status) Tag Type --------------------------------------------------------- 0xffffc18757802050 0x7f0 (Allocated) NpFr Vs * 0xffffc18757802860 0x3f0 (Free) Vs 0xffffc18757802c70 0x370 (Allocated) NpFr Vs
To get back the
0xffffc18757802860 chunk, we have to allocate maximum of
0x3e0 bytes instead of the original
It is noted that the padding problem only affects the maximum size we have to request to allocate a specific chunk. Since the chunk lookup follows a best-fit strategy, we can request a smaller chunk and still reclaim the target chunk. Some scenarios/considerations where we may have complications with this approach:
- The target freed chunk's dynamic lookaside (dl) is disabled, so it landed into the FreeChunkTree. If the dl of the smaller request size is different than the dl of the original size (e.g. the case of
0x3f0) and the dl of the smaller size is enabled, then it may be difficult to reallocate the freed chunk from the FreeChunkTree.
- The target freed chunk landed into the dynamic lookaside of the original chunk size. Now if the smaller chunk size falls into a different lookaside, it will be impossible to reallocate the target free chunk
- The unused bytes at the end of the chunk have to be taken into consideration when calculating the overflow size. For example in the
0xffffc18757802860chunk described above, the overflow size will have to be 0x13 since there is an extra 0x10 bytes at the end of the chunk. If the vulnerable chunk is allocated without the padding then the overflow will be just 3 bytes. This dual potential overflow size might provide some small flexibility, but to abuse it one has to be very precise and take care of points (1) and (2)
When it comes to chunk deallocation, the chunk is first coalesced with potentially free neighboring chunks. When PageAlignLargeAllocs is enabled, and if the chunk after the freed chunk was free, then the chunk after that is checked as well. The addresses of the coalescing chunk candidates are calculated as follows:
Previous Chunk: prev_chunk_addr = freed_chunk_addr + freed_chunk_sizes.UnsafePrevSize*16 Next Chunk: next_chunk_addr = freed_chunk_addr + freed_chunk_sizes.UnsafeSize*16 Next Next Chunk: next_next_chunk_addr = next_chunk_addr + next_chunk_sizes.UnsafeSize*16
_HEAP_VS_CHUNK_HEADER.Sizes.Allocated of each candidate is used to determine whether the neighboring chunks are allocated or not. If that value is anything other than zero, then the chunk is considered allocated, otherwise it's free. After the coalescing procedure, we get a potentially merged chunk. If that chunk crosses the boundaries of a page, then the chunk is splitted into two chunks as illustrated below:
| PAGE BOUNDARY | PAGE BOUNDARY | PAGE BOUNDARY | ... | MERGED CHUNK BOUNDARIES | ... Is broken down into: | PAGE BOUNDARY | PAGE BOUNDARY | PAGE BOUNDARY | ... | CHUNK 1 | CHUNK 2 | ...
The first chunk that goes through this process is the underlying subsegment itself. When a request reaches the VS allocator and there is no suitable chunk within the FreeChunkTree, the VS allocator will allocate a new subsegment. The size of the new subsegment will depend on the missed request size. The allocator will then use the beginning of the subsegment to keep its headers (ie
_HEAP_VS_SUBSEGMENT) and the remainder of it will be added to the FreeChunkTree. Given that the VS subsegment will be page aligned and that the user data will be placed at the 16-byte aligned address after the
_HEAP_VS_SUBSEGMENT header, we know that the freed subsegment chunk will start at address: xxxxxxx030 and its size will be at least 0x10000 bytes. Since it will cross the page boundaries, it will be split into two chunks, one that will be of size:
0xfe0-0x30) and another chunk with the remaining number of pages.
The table below shows the different categories of chunks served by the VS allocator along with some of their characteristics:
a. 0 <= size <= 0x1e0 && LFH[size]==disabled b. 0x1e1 <= size <= 0xfe0 c. 0x1001 <= size < 0x10000 && size % 0x1000 <= 0xfc0
|Group||Rounded↑ Request Size||Actual VS Request Size||LFH Disabled||POOL_HEADER||Dynamic Lookaside||Delay Free List|
|A||0 - 0x1e0||size+0x10||x||x||x|
|B||0x1f0 - 0x3f0||size+0x10||n/a||x||x||x|
|C||0x400 - 0x7f0||ROUND_UP(size+0x10, 0x40)||n/a||x||x||x|
|D||0x800 - 0xf70||ROUND_UP(size+0x10, 0x80)||n/a||x||x||x|
|E||0xf80 - 0xfd0||size+0x10||n/a||x||~x|
|G||all the rest||size||n/a|
#define ROUND_UP(x, n) (x+n-1)&~(n-1)
Vulnerable Chunk: this is the chunk that is going to get overflowed at some point by a vulnerable application
Bruce Banner Chunk: this is the chunk right after the vulnerable chunk but before the overflow. The UnsafeSize of this chunk is going to get overwritten after the overflow. Our goal is to have this chunk look bigger than it actually is. We will also call it BB chunk.
Mutant Chunk: this is the Bruce Banner chunk after the overflow.
Overlapping Chunk: this is the chunk we get after reallocating part of the mutant chunk. Since its size has been artificially increased, this chunk will overlap with the chunk after it.
Remainder Chunk: this is the remaining chunk after the mutant chunk is split to allocate the overlapping chunk. This chunk is optional, if it exists then it can either be contained within the overwritten chunk as shown in the diagram below or extend beyond the overwritten chunk depending on the mutant chunk size.
Overwritten Chunk: this is the chunk right after the BB chunk. Part of this chunk will get overwritten by the overlapping chunk
The target for the heap overflow will be the encoded
\_HEAP_VS_CHUNK_HEADER.Sizes.UnsafeSize. By targeting the UnsafeSize field, we are essentially changing the chunk's perceived boundaries by the VS allocator. Our goal is to increase the size of the BB chunk, turn it into mutant and have its boundaries overlap with its neighboring chunk (overwritten chunk). We can run this attack while the BB chunk is in three different states:
Allocated Chunk Attack: the BB chunk is allocated. To run the attack:
- Using the memory corruption vulnerability, overwrite the UnsafeSize field of the BB chunk.
- Free the BB chunk. This triggers the VS allocator to use the overflowed UnsafeSize which results into the transformation of the BB chunk into mutant chunk. At this point, the mutant chunk will get added to the FreeChunkTree.
Allocate the overlapping chunk off the mutant chunk.
Considerations: Upon freeing the BB chunk is step (2), the coalescing procedure will attempt to find whether the neighboring chunks are free as well to merge them. To find the chunk following the one that is getting freed, the function makes use of the UnsafeSize of the freed chunk, the field we have corrupted. So the address of the next chunk will be at a random offset from the beginning of the freed chunk. Not only that, to verify whether the chunk is free or not, the coalescing function will make use of the
\_HEAP_VS_CHUNK_HEADER.Sizes.Allocatedwhich is encoded as well. Fortunately, if the
Allocatedfield holds any value other than zero, then it is assumed that the chunk is allocated. So, even though we have no control over the next chunk, chances are (255/256) that the next chunk is going to be identified as allocated and there is not going to be any merging. It is also noted that we can avoid the merging if the mutant chunk size (step 2) is bigger than the boundaries of the underlying VS subsegment, a likely scenario in the MSB attack. Failure to prevent the merging of the two chunks will most likely lead to bug check.
RB Node Attack: the BB chunk is freed and resides in the FreeChunkTree.
- Using the memory corruption vulnerability, overwrite the UnsafeSize field of the freed BB chunk. This should immediately turn the BB chunk into mutant.
Allocate the overlapping chunk off the mutant chunk
Considerations: Here we are changing the value of a node that is already in the red-black tree. A unique benefit of this approach is that the BB chunk is turned into mutant chunk immediately, without requiring any intermediate actions, like for example the deallocation of the BB chunk which was required in the Allocated Chunk Attack. This allow us to avoid the 1/256 probability of failure imposed by the other attacks due to the coalescing free chunk check discussed in the previous attack. On the downside, we have increased complexity for successfully running the attack. For example, if the BB chunk's node initial value was x, and x was smaller than the parent's value and after the overflow x turns bigger than x, then we shouldn't be able to claim the mutant chunk. With numbers, let's say the parent node value is 0x300, the BB chunk value was 0x200 and after the overflow it became a mutant with size 0x400. The BB chunk will be the left child of the parent since its value is smaller than the parent's value. So after the overflow, the left branch of the parent will not be reached if the request size is 0x400, since it will take the right child of the parent.
Coalescing Attack: the BB chunk is freed and resides in the FreeChunkTree. To run the attack:
- Using the memory corruption vulnerability, overwrite the UnsafeSize field of the freed BB chunk.
- Free either the previous or the chunk after the BB chunk. This should cause the execution of the coalescing procedure which will merge the chunk freed here with the BB chunk. The merged chunk will be the mutant chunk.
Allocate the overlapping chunk off the mutant chunk
Notes: The requirements on the heap layout are a bit more strict which adds to the complexity of the attack. Regardless of its complexity, one strength of this approach is that we use the coalescing procedure to "fix" the corrupted rb tree node issue we had in the RB Node Attack. After step 2, the misplaced mutant chunk will be removed from the rb tree and will be properly inserted again in the FreeChunkTree based on its new merged chunk size. On the downside, by fixing the corrupted rb tree node, we are reintroducing the coalescing procedure free chunk check, which adds back the 1/256 probability of failure.
Now looking into the practicality of the attack, the VS header is at the very beginning of the heap chunk so potentially fewer bytes will be needed to be overwritten. With regards to the target field,
UnsafeSafe is essentially the first important field of the structure, so we don't have to worry about potentially corrupting the header until we reach our overflow target. The big downside is the fact that this field is encoded. As mentioned in the introduction, this field is expected to hold a random 16-bit value regardless of the value of its actual size. So by naively overwriting this field, we will be replacing the BB chunk size with some random value, resulting in a mutant that will be out of control and which sooner or later would trigger a bug check. Below are some approaches that can be used to turn the odds into our favour and reliably exploit the overflow of the encoded UnsafeSize
Before getting on, it is noted that the techniques described here only enable the creation of an overlapping chunk. We don't delve into the details of how we can further build on that primitive, but for some ideas for Paged/NonPaged pools look at: , , 
LSB Technique: Overflowing the Least Significant Byte
An overview of the technique: we aim to eliminate the unpredictability added through the UnsafeSize encoding by setting the appropriate heap layout and targeting just the least significant byte (lsb) of the BB chunk. In essence we are artificially limiting the potential sizes of the mutant chunk after the overflow and contain them within the boundaries of chunks that are under our control.
In a bit more detail, to make things work we start by having the BB chunk be bigger than a page. For example, let's say we chose the size 0xc040. Within its UnsafeSize, we should have the encoded 0xc04 (remember this field holds the size of the chunk/16). Our goal after the memory corruption vulnerability is to overwrite just the first byte of the UnsafeSize. So now the chunk will have any size between 0xc00-0xcff, translated to actual size is 0xc000-0xcff0. Given that, if we keep the end of the overflowed chunk as close as possible to its containing page (e.g. at an offset 0x40-headers=0x20 in this case) then our odds of creating a sufficiently big overlapping chunk are pretty good. For example, to overwrite the first 16 bytes of the overwritten chunk user data, we have 249/256, 97% success rate (ie any size
>=0x70). Even in the cases where we fail to get sufficiently big mutant chunk, we should be able to retry.
Let's see the various stages of this approach in more detail. We will keep using the 0xc040 as the vulnerable chunk vs size for illustration purposes. We will also be attacking the UnsafeSize of the vulnearble chunk while it's in the allocated state, so we will be running the "Allocated Chunk Attack" previously described.
First is the heap layout before the overflow:
Important points here:
- The BB chunk should be bigger than a page and the end of that chunk should be located very close to the beginning of its associated page. If the gap from the beginning of the page is too big, then we decrease the odds of having a sufficiently big mutant chunk after the overflow.
- The vulnerable chunk should be close, ideally adjacent to the BB chunk and we should be able to overflow just one byte from the BB chunk's UnsafeSize. Here we should be cognizant of potential unused bytes in the vulnerable chunk.
- The overwritten chunk should occupy the whole remainder of its containing page. This is important since the mutant chunk might get extended for example to size 0xcff0. So if we fail to allocate at least the whole remainder of the BB chunk's final page, and that chunk is handed to another process, then after overwriting that chunk we will most likely be causing an unrecoverable error.
- After the overwritten chunk, we want to have allocated a chunk under our control, the spare chunk in this case. This chunk is important for two reasons:
- If we take care of point (3), the spare chunk's VS header will start at offset 0xfe0 from the page when the PageAlignLargeAllocs is enabled. Now if the UnsafeSize is turned to 0xcff0, it will corrupt the VS header of the spare chunk. So if we don't control it, then it's very likely that the system will crash at some point after the chunk get either allocated or deallocated by another process. Still, this case is quite unlikely, 1/256 (i.e. the odds of getting the size 0xcff)
- Even if the headers of the spare chunk are not corrupted, we may still have a problem. Let's say we have triggered the overflow, and we managed to overwrite part of the overwritten chunk. At this point, the VS header of the overwritten chunk should be corrupted. At this point, if the spare chunk was freed, placed in the FreeChunkTree and got reallocated by another process, then the system will crash, as the vs allocator will attempt to find out whether the previous chunk (i.e. the overwritten chunk) is allocated to use its
EncodedSegmentPageOffsetand calculate the distance to the underlying VS subsegment. But since the whole VS header of the overwritten chunk will be corrupted, its Allocated field will most likely indicate that the chunk is allocated (essentially non-zero value odds 255/256). This will make the VS allocator to use the overwritten chunk's
EncodedSegmentPageOffsetwhich will hold a random value (remember it's encoded similarly to
UnsafeSize) which will cause the system to crash. This scenario is much more likely than the first one, so it's important to have the spare chunk allocated.
So now let's see the state of the memory after the overflow:
Okay, so we have triggered the vulnerability and managed to overwrite the lsb of the BB chunk's size with a zero. It is noted that the value of the overwritten byte is not important since we are targeting an encoded field. The overwritten byte will be turned into a random value after the field is decoded for use, and we deal with that randomness using the steps described in this section. So now, after the overflow, even though the overflowed chunk will occupy 0xc040 bytes in memory, when the VS allocator is handling that chunk it will assume it's 0xc5d0 bytes because of the altered UnsafeSize. So now what's left is to engage the VS allocator by deallocating the chunk. Now we should have the mutant chunk with size 0xc5d0 in the FreeChunkTree.
After the allocation of the overlapping chunk off the mutant chunk we get to the following state:
With regards to the overlapping chunk, in my notes i have the allocation of the whole mutant chunk by the overlapping chunk (i.e., no remainder chunk) as an important point. Unfortunately i didn't mention why, but on a closer look i think this mostly relevant in case we are running the RB Node Attack. In that scenario, we can avoid the 1/256 probability of failing the coalescing procedure's adjacent chunk check as was explained earlier.
At the end, as we can see on the diagram, we managed to overwrite part of the overwritten chunk with the controlled data of the overlapping chunk.
Setting the heap layout
In theory the approach seems good, let's see how it's possible to get the appropriate memory layout when the
LargeAllocsPageAlign flag is enabled (e.g. kernelmode implementation). We will keep using the 0xc040 BB chunk size we used in the previous examples, but again this is not a requirement. We will also assume that the vulnerable chunk size is
0x360 but the same approach can be used for most sizes <
Following is the diagram showing the different states of the attack for our example along with a description of the steps involved:
- BB chunks allocation: spray the memory with 0x200 chunks of size 0xc040 Every time the allocator fails to find a suitable chunk in the FreeChunkTree, it will allocate a new 0x20000 subsegment and split that into two chunks with sizes
0x1f020as mentioned in the missed chunk allocation description. The
0x1f020chunk will get split to serve the
0xc040request. After the allocation of the 0xc040 chunk, the remainder chunk is going to get split again to
- To get the 0xc040 vs chunk size, request 0xc020 from the 1st layer allocator. As mentioned earlier the VS allocator will add 0x10 for the VS header and 0x10 for padding in this case.
- A 0x20000 subsegment will fit two 0xc040 chunks. So for 0x200 allocations we will have about 0x100 new subsegments and thus get 0x100 chunks of size
- In a subsequent step, we will split the
0xfb0accordingly to have the vulnerable chunk adjacent to the overflowed chunk. Since this chunk will eventually hold the vulnerable chunk, we will call it vulnerable superchunk.
0xfc0is going to end up as the overwritten chunk
- The very first chunk of the
0x12020is going to end up as the spare chunk
- Even though we used 0xc020 as the 1st layer request size we could have also used 0xc010 which would have also increase our success odds a bit more. The only size we should avoid is the 0xc030, since we would end up having the overwritten chunk and the vulnerable superchunk have the same size and we wouldn't be able to differentiate between them to run the attack. It is noted that we can't use the size 0xc000 as it will end up getting served by another 2nd layer allocator (segment allocator).
- Overwritten chunk recovery. We allocate
0x200chunks of size
0xfa0(1st alloc layer). The vs allocator will add 0x10 for its header and 0x10 for the "proactive" padding just in case the chunk starts at offset 0xfe0 from its page. It is noted that
0xfa0is the maximum size that allows reallocation of the
0xfc0vs chunk and this is due to the decommit optimization we discussed earlier. In this case, the
0xfa0is also the only size that captures the overwritten chunk fully without causing it to get split (i.e. by having the remainder chunk size be smaller than 0x20)
- Vulnerable chunk carving. So as we already mentioned, the FreeChunkTree allocation mechanism follows a best-fit strategy (at least in principle-ignoring the decommit issue side-effects). Now we have a chunk of
0xfb0and we want to have its end allocated as the vulnerable chunk. If the vulnerable chunk is close to 0xfb0, our job is done, we just claim that with the appropriate 1st level alloc size. But let's take a more challenging size, like
0x360, and see how we can make it adjacent to the BB chunk. The strategy is simple. We create a number of splitting rounds. In each splitting round we will request a specific chunk size which will cause the vulnerable superchunk (i.e. the 0xfb0 chunk in the first round) to get split. So at the end of each round we will end up with a smaller vulnerable superchunk and our goal is to to create the vulnerable chunk itself eventually. So let's see how this strategy could play out for a 0x360 vulnerable chunk:
Splitting Round Superchunk Size Pre-Split 1st Layer Alloc Size VS Alloc Size Superchunk Size Post-Split 1 0xfb0 0x7f0 0x810 0x7a0 2 0x7a0 0x3f0 0x410 0x390
In the table above, we can see that by the end of round 2, the FreeChunkTree should contain a bunch of 0x390 chunks adjacent to the BB chunks. Now when the next 1st level alloc of size 0x360 is requested, one of the 0x390 chunks (0x360+0x10 pool header+0x10 vs header + 0x10 padding) will be used to serve it. It is noted that the decommit issue will affect only the vulnerable chunk in this scenario. The vulnerable chunk will have 0x10 unused bytes, which we will have to consider when calculating how many bytes the overflow should be.
Note: the formula for calculating the "Superchunk Size Post-Split" (ssps) should be similar to this:
Superchunk Size Post Split = (Superchunk Size Pre-Split) % (Vs Alloc Size)It's an important detail, in the table above the vs alloc sizes fit only once in the initial superchunks so the calculation of the post-split superchunk size is a simple subtraction, but that's not always the case. For example, with vs alloc size of 0x300 in splitting round 2, we would get a post-split super chunk of size 0x1a0 instead of 0x4a0.
- At this point, we should have about 0x100 chunks with the size of the vulnerable chunk (i.e. 0x390) in the FreeChunkTree. As a precaution, we can allocate 0x50 chunks with the same size as the vulnerable chunk to clean up the environment of potential noise (e.g. already existing chunks in the dynamic lookaside, interfering deallocations from other processes). Then we trigger the allocation of the actual vulnerable chunk. That chunk should get allocated in one of the carved 0x390 chunks of the previous step.
- Trigger the overflow and free the BB chunk. At this point, the mutant chunk will get added to the FreeChunkTree and its boundaries will most likely overlap with the overwritten chunk.
- Trigger an allocation big enough at least to overwrite the necessary parts of the overwritten chunk. For example an allocation with size 0xc080 will overwrite 0x20 bytes from the overwritten chunk user data. In case we are running the RB node attack, we can avoid the free chunk check that happens during coalescing by having the overlapping chunk cover the whole mutant chunk. This allow us to avoid that 1/256 source of failure discussed in the overview. To do that, using the scenario already provided, we start allocating chunks with size 0xcff0, decreasing the allocation size by 0x10 bytes in each iteration. After each iteration, we attempt to verify whether we managed to successfully overwrite the overwritten chunk. The attack failed if we couldn't create a sufficiently big overlapping chunk by the iteration with size near 0xc040 and we should be able to retry.
By the end of step 6, if it was successful, we have managed to create our overlapping chunk.
MSB Technique: Overflowing the Most Significant Byte
In the LSB approach, we worked around the randomness added by the size encoding by containing the possible sizes of the mutant chunk within an area we control (i.e. BB chunk+overwritten chunk). Here we take the opposite direction, and instead, we overwrite both bytes of the BB's UnsafeSize potentially extending the chunk way beyond the boundaries of the chunks we control. Considering that even some of the most significant bits of the UnsafeSize field remain unused, by overwritting both bytes we have a very high probability of creating a huge mutant chunk which will almost certainly cover the overwritten chunk. But the problem here is that we will end up with a huge chunk in the FreeChunkTree that will most likely span across multiple chunk boundaries, which would probably belong to either different processes or the (vs) allocator itself. If the mutant chunk is used to serve random memory requests, then we will corrupt the state of other processes irrecoverably. On top of that, we also have to race other processes when trying to allocate the overlapping chunk off the mutant chunk since the mutant chunk would be readily available within the FreeChunkTree. We deal with these two problems as follows:
- We want to be the first and ideally the last to use the mutant chunk to satisfy memory allocation requests. To maximize the probability of being the first to use the mutant chunk, we inject the FreeChunkTree with what we call from now on the injection fence chunks. Those chunks should have a size above all the memory request sizes that are expected to happen during the time frame of the attack and lower than the size where we cross the boundaries of adjacent chunks. (i.e.
BB chunk size + overwritten chunk size). Given that the FreeChunkTree follows a best-fit strategy, all the memory requests with size smaller than the fence chunks would be satisfied by the fence chunks themselves if no smaller chunks are available. The only way to access the mutant chunk is to create an allocation request bigger than the fence chunks. The assumption here is that the fence chunk will be smaller than the mutant chunk, which most likely would be true given the possible range of size values. Finally, it is noted that the size of the fence chunk could affect to a small degree the probability of failure. That’s because the fence chunk size becomes the minimum size we need to extend the BB chunk to be able to allocate the overlapping chunk. I haven't looked into it, but it shouldn't be difficult to eliminate this potential source of failures.
- After the mutant chunk is split during the allocation of the overlapping chunk, its remainder chunk will be available within the FreeChunkTree for future allocations. We want to minimize the probability of that chunk getting used by any other process since it may cross the boundaries of other chunks outside our control. To do that, we use the injection trap chunks. These chunks are meant to trap and fulfill memory requests to avoid stepping onto the mine, which in our case would be the allocation of the remainder chunk. Some notes about trap chunks:
- Before allocating the overlapping chunk, we should be careful to install only traps that are smaller than the fence chunks. Otherwise, we will most likely be creating new fence chunks.
- Post-overlapping chunk allocation, especially if the funce chunks were not big enough (e.g. >0xa000) we want to install some big trap chunks. The remainder chunk would probably be very big so we want to shield it with those traps.
- For increased reliability, we could also install smaller traps, as seen fit.
It is important to note that allocated chunks should surround injection chunks since upon free, the adjacent chunks will merge with the injection chunks if they are not allocated. That means the final size of the injection chunks will not be what we expected it to be.
Now let's see an example. Let's say we have a BB chunk with size 0x330, so in its encoded UnsafeSize has 0x33. Adjacent to the BB chunk we have the overwritten chunk with size 0xc030 bytes. After the overflow, we overwrite the whole UnsafeSize of the BB chunk. So now that chunk will have any size between 0x0000-0xffff, translated to actual size is 0x00000-0xffff0. Given that, we have a good chance of extending the size and having it overlap with the overwritten chunk. In this case, assuming overwriting 0x60 bytes from the overwritten chunk is enough for escalating privileges, we have
1-(0x33+0x6-1)/2^16=99.9% success probability. But assuming we are going to use a
0x1500 byte fence chunk, then the success probability drops to
1-0x150/2^16=99.5%. Even when we fail, we should be able to retry. As we can see, successfully extending the overwritten chunk is easy, the challenging part is dealing with the possibility of overextending that chunk. In practice, there is over 85% probability of the size getting bigger even by its underlying subsegment. Now to burry the potentially huge mutant chunk inside the FreeChunkTree we have to consider the following scenarios:
mutant chunk < overlapping chunkunlikely but we failed, try again. It is noted that the overlapping chunk has to be bigger than fence chunk and its size has to be at least equal to the minimum viable overflow size, eg 0x380 bytes in this scenario.
mutant chunk <= original overflowed + overwritten chunkIn this case, we are in a similar situation with the LSB technique, the mutant chunk is contained within a controlled area so there is not much we should do here.
remainder chunk <= 0x10000remainder chunk is what's left after the mutant chunk is split to serve the overlapping chunk. Roughly speaking, when that chunk is smaller than 0x10000, then we can use different sizes of trap chunks to minimize the probability of getting that chunk used again by the allocator.
remainder chunk > 0x10000This is the most likely scenario. Here the remainder chunk will probably cross the boundaries of the underlying subsegment itself. To avoid that, we inject the VS allocator with big trap chunks. We could install for example trap chunks with size
0xdf00, but we have to do it after the allocation of the overlapping chunk to avoid interfering with the fence chunks. Trap chunks for this scenario are important when the fence chunks are relatively small.
Setting the heap layout
Here we will look into an approach to setting the memory layout required for the attack. Overall, for convenience, our strategy will be very similar to the one used in the LSB technique. Still, it is noted that the requirements here are not as strict, so depending on the failure appetite, different/simpler strategies could be deployed. The only strict requirement is to have the vulnerable chunk, BB chunk and overwritten chunk sequentially allocated, not necessarily adjacent to each other but all contained within a heap area with chunks under our control. Ideally, we also want the spare chunk for the reasons described in the LSB approach. Having said that, for this example we will assume the 1st alloc request size for vulnerable chunk is 0x350, BB chunk is 0x300 and the overwritten chunk with 0xc020
Following is the diagram and a description of each step of the attack:
- Allocate the fence chunks. Allocate 0x200 chunks with size 0xc100.
- The actual size is not that important but has to be big enough to satisfy memory requests that arrive during the time window we run the attack.
- They shouldn't be too big, ie bigger than
0x330 + 0xc040 = 0xc370since they would overlap spare chunk
- The size of the fence chunks has to be smaller than the overlapping chunk, otherwise when trying to allocate the overlapping chunk we will be getting back injection chunks instead of the mutant chunk.
- To prevent coalescing with neighboring chunks after we free them, we also request the previous and next chunks of the fence chunks. In this case chunks with sizes
0xfb0(prev chunk is the beginning of the subsegment) and
0xee0(prev/next chunk is the remainder chunks from the 0xc100 allocation) Note: for simplicity, in this example we used bigger fence chunks to avoid adding the additional complexity of requiring at least some trap injection chunks.
- As noted earlier, in the MSB approach we reserve the big chunk for the overwritten chunk. So we start by spraying the memory with
0xc040bytes. Again we end up with
0x100chunks with sizes
0xfc0. In this case:
- Here the
0xfb0chunk is going to be used as the vulnerable and BB superchunk
0xfc0chunk is going to be the spare chunk.
- Here the
- Recover the spare chunks: create 0x300 requests for
0xfa0bytes from the 1st layer alloc.
- Vulnerable/BB chunk carving. The splitting strategy here is much simpler compared to the LSB. In our example we have a vulnerable chunk with vs size 0x370 (0x350+0x10 pool header+0x10 vs header). So we pick a vs alloc size for the first splitting round so that its remainder chunk is big enough to hold only one chunk of size 0x370. In the second splitting round, we use the vulnerable chunk size. The remainder of the final splitting round would be the BB chunk the size of which shouldn't be particularly important. When it comes to the state of the BB chunk before the overflow we have two options:
- Have it allocated. This would require allocating the BB chunk after the vulnerable chunk which will be problematic in case the allocation of the vulnerable chunk and the triggering of the memory corruption are bound together. It's problematic since the vulnerable chunk is allocated before the BB chunk (so the state of the BB chunk would be free, not allocated as we assumed for this scenario). Suppose the allocation/vulnerability triggering cannot be separated, since their separation would have solved our problem. In that case, we could temporarily request a group of chunks that are the same size as the vulnerable chunk during splitting round 1, then allocate the BB chunk. After that we can free the group of temporarily allocated chunks, which should leave the FreeChunkTree/Dynamic Lookaside with about 0x100 possible vulnerable chunk candidates.
Keep it free and run the appropriate attack according to its state (e.g. Coalescing Attack). For this option to be available we have to have a remainder chunk after the second splitting round of at least 0x20 bytes (minimum vs chunk size). This remainder chunk is going to be used as the BB chunk. Ideally but not very important, you want the BB chunk to fall within the limits of the LFH (low fragmentation heap) to eliminate the possibility of the vs allocator handing over our chunks to other processes. This approach has the upside that we can directly allocate the vulnerable chunk after the splitting round one. We also don't have the decommit issue when allocating the vulnerable chunk that existed when the BB chunk was in the allocated state. By the end of the splitting round 1, we should have at least 0x100 possible vulnerable chunk candidates
Here we use option (a) and we will assume that vulnerable chunk allocation and vulnerability triggering are bound together The splitting rounds info are shown below:
Splitting Round Superchunk Size Pre-Split 1st Layer Alloc Size VS Alloc Size Superchunk Size Post-Split 1 0xfb0 0x8f0 0x910 0x6a0 2 0x6a0 0x350 0x370 0x330
- BB chunks allocation: create 0x200 memory allocation requests with size 0x300. Those should recover the chunks created after splitting round 2 of the previous step. Notice that we adjusted the 1st alloc request size to 0x300 instead of 0x310 for capturing the 0x330 vs sized chunk to account for the decommit issue.
- Install the fence chunks: deallocate the chunks allocated in step (1)
- Free the group of chunks that have the same size as the vulnerable chunk temporarily allocated in step 3
- Create 0x50 allocation requests with the same size as the vulnerable chunk size to clean up the environment. After that trigger the allocation of the vulnerable chunk/heap overflow
- Now we should be in the state after the overflow shown in the diagram above. Free the group of BB chunks allocated in step (3). Now we should have the mutant chunk in the FreeChunkTree. When debugging, you can find this chunk by running the following command:
dx @$scriptContents.vs_freechunktree_stats(0x10000)//this will return all the chunks bigger than 0x10000 bytes from the currently selected heap, see 
- Allocate the overlapping chunk: request a chunk with size 0xc110 (close but bigger than the fence chunks). Then test whether we managed to split the overflowed chunk. If not repeat this step since our request might have been fulfilled by another chunk. Assume that the attack failed after about 0x200 attempts.
If we successfully executed step (10), then we should have something similar to the final state in our example diagram. We have successfully allocated the overlapping chunk over the overwritten chunk. At this point we want to escalate privileges and fix the heap state to gracefully exit.
- @OnlyTheDuck, @paulfariello: https://github.com/synacktiv/Windows-kernel-SegmentHeap-Aligned-Chunk-Confusion
- @yarden_shafir: https://i.blackhat.com/USA21/Wednesday-Handouts/us-21-Windows-Heap-Backed-Pool-The-Good-The-Bad-And-The-Encoded.pdf
- @scwuaptx: https://speakerdeck.com/scwuaptx/windows-kernel-heap-segment-heap-in-windows-kernel-part-1
- @MarkYason: https://www.blackhat.com/docs/us-16/materials/us-16-Yason-Windows-10-Segment-Heap-Internals-wp.pdf
- @_vepe: https://github.com/vp777/Windows-Non-Paged-Pool-Overflow-Exploitation
- @alexjplaskett: https://research.nccgroup.com/2021/08/17/cve-2021-31956-exploiting-the-windows-kernel-ntfs-with-wnf-part-2/
- @yarden_shafir: https://github.com/yardenshafir/PoolViewer/blob/83ba3b0ca42c684bd5d45d4f5a70bb8bc6e65de2/PoolData/PoolData.cpp
- @_vepe: https://github.com/vp777/exploit-dev/segment_heap.js