McAfee’s Edward Metcalf Shares Hybrid Rootkit-thwarting Strategy

It’s been 21 years since the first rootkit was written, but it wasn’t until 2005 that rootkits reared their ugly heads in the mainstream. Today, there are more than 2 million unique rootkits, and another 50 are created each hour, according to McAfee Labs.

Hackers like rootkits because they work silently, which makes them ideal for harvesting credit card numbers and other valuable information, as well as industrial espionage and electronic terrorism. Thwarting rootkits isn’t easy because they load before the operating system (OS) does, and antivirus platforms don’t kick into action until after the OS starts running. In response, security researchers have created a hybrid hardware-software approach that loads first and then looks into memory’s deepest corners to ferret out rootkits.

McAfee’s recent DeepSAFE technology is an example of this hybrid approach, which supplements conventional antivirus software rather than replacing it. Edward Metcalf, McAfee’s group product marketing manager, recently spoke with Intelligence in Software about how hardware-assisted security works, what the benefits are and what enterprises need to know about this emerging class of security products.

Q: Why have rootkits become so common over the past few years? And how is their rise changing security strategies?

Edward Metcalf: For the most part, it’s always been a software-based approach that the cybersecurity industry has taken to combat malware. But the motivation of cybercriminals has changed over the past few years. Early on, it used to be about fame: getting their name on the news or in the hacker community. About six years ago, we started seeing a shift in their motivation from fame to financial gain. That’s changed the landscape dramatically, as evidenced by the growth in malware and techniques.

McAfee and Intel realized that there is functionality within the hardware that allows our software to look at different parts of the system and block certain types of threats: inspecting memory and blocking kernel-mode rootkits, for example. So for the last couple of years, McAfee and Intel have been working on technology that allows McAfee and other vendors to better leverage that hardware functionality.

The first evolution of that integration between hardware and software is the DeepSAFE platform. DeepSAFE uses hardware functionality built into the Intel Core i3, i5 and i7 platforms.

Q: So DeepSAFE basically shines a light into previously dark corners of PCs and other devices to look for suspicious behavior that OS-based technologies wouldn’t see, right?

E.M.: Until now, for the most part, all security software has operated within the OS. Cybercriminals know that, they know how to get past it, and they’re developing ways to propagate malware accordingly. Stealth techniques like kernel-mode and user-mode rootkits can be really difficult to detect with OS-based security.

The current implementation of DeepSAFE utilizes the virtualization technology built into the Core i-series platform. We’re using that hardware functionality to get beyond the OS and inspect memory at a depth we’ve never been able to reach before, because we’ve never had that access. It does require PCs to be running the latest Core i-series platform.

Q: If an enterprise has PCs running those Core i processors, can they upgrade to DeepSAFE?

E.M.: Yes, though I wouldn’t position it as an upgrade. It’s added functionality that provides a deeper level of protection.

DeepSAFE and Deep Defender do not replace the current antivirus on a machine; they augment it. They give us a new perspective on some of the threats we’ve always had a hard time detecting, because those threats load well before the OS does, which prevented us from seeing them as an OS-based application. Cybercriminals knew that was a flaw.

Q: Is it possible to apply this hybrid architecture to embedded devices that run real-time OS’s (RTOS’s)?

E.M.: Absolutely. Currently, we don’t have the ability to do that, but we’ve already talked about working with RTOS vendors like Wind River. Taking the DeepSAFE strategy to embedded devices certainly could happen in the future.

People are asking about whether we can put DeepSAFE on tablets and smartphones. The answer is potentially yes if we have the hardware functionality or technology to hook into the hardware that we need in order to get that new vantage point.

Q: Hackers have a long history of innovation. Will they eventually figure out how to get around hybrid security?

E.M.: We constantly have to play a cat-and-mouse game: We develop a new technology, and they find ways to get around it.

In DeepSAFE, we’ve developed a number of mechanisms built into how we load and when we load to prevent any circumvention. Because we’re the first to load on a system, and because we use techniques to ensure that we’re the first one, it makes it harder for cybercriminals to develop ways to get around it.

Will a Mobile OS Update Break Your Apps?

It’s one of the biggest headaches in mobile app development: The operating system (OS) vendor issues an update that immediately renders some apps partly or completely inoperable, sending developers scrambling to issue their own updates to fix the problem. For instance, remember when Android 2.3.3 broke Netflix’s app in June, or when iOS 5 broke The Economist’s in October? These examples show how breakage can potentially affect revenue -- especially when it involves an app that’s connected to a fee-based service. In the case of enterprise apps, breakage can also have a bottom-line impact by reducing productivity and inundating the helpdesk.

Tsahi Levent-Levi, CTO of the technology business unit at videoconferencing vendor RADVISION, has spent the past few years trying to head off breakage. His company’s product portfolio includes an app that turns iOS tablets and smartphones into endpoints. With an Android version imminent, his job is about to get even more challenging. Levent-Levi recently spoke with Intelligence in Software about why app breakage is so widespread and so difficult to avoid.

Q: What exactly causes apps to break?

Tsahi Levent-Levi: The first thing to understand is that when you have a mobile platform, you usually have two sets of APIs available to you. The first set is published and documented. The other is one that you sometimes need to use, and it is undocumented. When an API is not documented or not part of the official set, the OS vendor might, and will, change it over time to fit its needs.

For example, we’re trying to reach the best frame rate and resolution possible. To do that well, you need to work at the chip level. So you go into the Android NDK, where you write C code rather than Java code. Then you go one step lower to access the physical APIs and undocumented parts of the system level, which is where the chipset vendors do some of the work.

Even a different or newer chip from the same vendor is not going to work in the same way. Or the ROM used by one handset with a given chip is going to be different from the ROM you get on another handset, and the APIs are going to be different as well.

Q: So to reduce the chances that their app will break, developers need to keep an eye on not only what OS vendors are doing, but also what chipset and handset vendors are doing.

T.L.: Yes, and it depends on what your application is doing. If you’d like to do complex video things, then you need to go to this deep level.

I’ll give you an example. With Android, I think it was version 2.2, handsets had no front-facing camera. Then the iPhone 4 came out with FaceTime, and the first thing Android handset manufacturers did was add a front-facing camera. The problem was that Android had no APIs that allowed you to select a camera. If you really wanted to access that front-facing camera, you had to use APIs specific to that handset. When 2.3 came out, this got fixed because they added APIs to select a camera.

Platforms progress very fast today because there’s a lot of innovation in the area of applications. Additional APIs are being added by the OS vendors, and they sometimes replace, override or add functionality that wasn’t there before. And sometimes you get APIs from the handset vendor or chipset vendor and not from the OS itself. So there is a variance in the different APIs that you can use or should be using.

If the only thing you’re doing is going to a website to get some information and displaying it on the screen, there isn’t going to be any problem. But take streaming video, for example. Each chipset has a different type of decoder and a bit different behavior than another. This is causing problems.

It’s also something caused by Google itself. When Google came out with Android, the media system that they based everything on was OpenCORE. At one point -- I don’t remember if it was 2.1, 2.2 or 2.3 -- they decided to replace it with something totally different. This meant that all of the applications that used anything related to media required a rewrite or got broken. The new interface is called Stagefright, and there are rumors that this is going to change in the future as well.

Q: With so many vendor implementations and thus so many variables, is it realistic for developers to test their app on every model of Android or iOS device? Or should they focus on testing, say, the 25 most widely sold Android smartphones because those are the majority of the installed base?

T.L.: You start with the devices that interest you most, and then you expand because you get problem reports from customers. Today, for example, I have a Samsung Galaxy S. When I go to install some applications from the Android Market, it tells me that I can’t because my phone doesn’t support them. That’s one way Google is trying to deal with it, but it doesn’t always work because of the amount of variance.

From the developer’s point of view, you should first start at the highest level of abstraction that you can to build your application. The next step down would be a third-party developer framework like Appcelerator, which allows you to build applications using HTML5, JavaScript and CSS. You take the application that you build there, and they compile it and make an Android or an iOS application. If you can put your application in such a scheme, you will run on the largest number of handsets to begin with, because these types of frameworks are built for this purpose.

If you can’t do that, then you’ll probably need to do as much as you can at the Android software development kit (SDK) level. If that isn’t enough, you go down into the native development kit (NDK) level. And if that isn’t enough, you go further down into the undocumented system drivers and such. And you build the application in a way that the lower you go, the less code you have there.

Maximizing Cloud Uptime

For enterprises, the cloud can be as much of a problem as an opportunity. If employees can’t access the cloud, or if the data centers and other cloud infrastructure suffer an outage, productivity and sales can grind to a halt. Wireless is the latest wild card: By 2016, 70 percent of cloud users will access those applications and services via wireless, Ericsson predicts. Wireless is even more unpredictable than fiber and copper, so how can enterprises ensure that wireless doesn’t jeopardize their cloud-based systems?

Bernard Golden -- author of Virtualization for Dummies and CEO of HyperStratus, a cloud computing consultancy -- recently spoke with Intelligence in Software about the top pitfalls and best practices that CIOs and IT managers need to consider when it comes to maximizing cloud uptime.

Q: What are the top causes of cloud service unreliability? What are the weak spots?

Bernard Golden: There are issues that are common when using any outside resource, and resulting questions you need to ask to identify the weak spots: Does the network go down between you and the provider? In terms of the external party’s infrastructure operations, how robust is their computing environment? You might have questions about their operational practices and support: Do they apply patches so things don’t crash?

If cloud computing is built on virtualization, and virtualization implies being abstracted from specific hardware dependence, have you designed your application so it’s robust in the face of underlying hardware failure? That’s more about whether you’ve done your proper application architecture design. Many people embrace cloud computing because of its ability to support scaling and elasticity. Have you designed your application to be robust in the face of highly variable user loads or traffic?
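One common way to build that robustness is to assume any call to the provider can fail and to retry transient errors with exponential backoff. The sketch below is illustrative only; `call_with_retries`, `flaky_service` and the retry parameters are hypothetical names for this example, not taken from any particular cloud SDK.

```python
import random
import time

def call_with_retries(operation, max_attempts=5, base_delay=0.01):
    """Retry a flaky zero-argument callable with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Back off exponentially, with jitter so that many clients
            # retrying at once don't hammer the provider in lockstep.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Simulate a service that fails twice with a transient error, then recovers.
calls = {"n": 0}
def flaky_service():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network failure")
    return "ok"

print(call_with_retries(flaky_service))  # succeeds on the third attempt
```

The same idea applies at the architecture level: treat every dependency, not just the network hop to the provider, as something that can fail and recover.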

Q: What can enterprises do to mitigate those problems? For example, what should they specify in their service-level agreements (SLAs)?

B.G.: There’s a lot of discussion about SLAs from cloud providers, but really it’s every link in the chain that needs to be evaluated. Do you have a single point of failure? Maybe you need two different connectivity providers.

Some people put a lot of faith in SLAs. We tend to caution people: At the end of the day, SLAs are great. They’re sort of like law enforcement: It doesn’t prevent crime, but it responds to it. It’s not, ‘I’ve got an SLA, so my system will never go down.’ Rather, an SLA means that the vendor pledges to have a certain level of availability.
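It helps to translate an availability pledge into a concrete downtime budget. A quick back-of-the-envelope calculation shows what a provider can still get away with while meeting its SLA:

```python
# Translate an SLA availability percentage into an annual downtime budget.
HOURS_PER_YEAR = 365 * 24  # 8,760

def allowed_downtime_hours(availability_pct):
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for sla in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(sla)
    print(f"{sla}% uptime allows about {hours:.2f} hours of downtime per year")
```

Roughly: 99 percent availability permits more than three and a half days of outage a year, while "three nines" still permits almost nine hours, which is why the SLA alone is no substitute for a risk evaluation.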

So you have to evaluate, what do I expect is the likelihood that they’re going to be able to accomplish that? You need to make a risk evaluation. For example, there was a big outage at Amazon in April 2011. Many early-stage startups use Amazon as their infrastructure, so a number of them went down until Amazon was able to fix that.

There were other companies that had evaluated the risk of something like that happening in designing their application architectures and their operational processes. They said, ‘The importance of this application is such that we need to take the extra time and care and investment to design our overall environment so that we’re robust in the event of failure.’

Whatever you get from the SLA will never make up for the amount of lost business in the case of a failure.

Q: Sometimes there’s also a false sense of security, such as when an enterprise buys connectivity from two different providers to ensure redundant access to the cloud. But it could turn out that provider No. 1 is reselling provider No. 2’s network, and a single fiber or copper cut takes out both links.

B.G.: You get two apologies instead of one. That’s a really good point. You can characterize that as incomplete homework.

Q: Business users and consumers are increasingly using wireless to access cloud services. What can enterprises do to minimize the risk that wireless’ vagaries will disrupt that access?

B.G.: That strikes me as very challenging, depending on the type of wireless. With an internal Wi-Fi network, for example, you could mitigate those kinds of risks pretty well, and they’re probably not much worse than wired Ethernet.

Out in the field, if you’re talking about somebody using a smartphone or tablet connected over 3G, I don’t know that there’s much a company can do about that. You could evaluate who has the best 3G network, but you’re always going to face the issue of overloads or dead spots.

Q: That goes back to your point about doing your homework. For example, an enterprise might choose to get wireless service from Sprint because it resells 4G WiMAX service from Clearwire. So if Clearwire’s network is unavailable in a particular market, the enterprise’s employees still can get cloud access over 3G, which is a completely separate, Sprint-owned network. The catch is that those options are pretty rare.

B.G.: It is, unfortunately. It would be great if there were more WiMAX.

Lots of times, people over-assess the risks of the cloud while under-assessing the risks of whatever the alternative might be. The fact is that most organizations don’t have redundant connectivity to their data center from two different providers entering from two different sides of the building. They’re not as careful with their own stuff as they insist someone else be.

Q: Or they’ll do it right for their headquarters, but then not be as diligent for their satellite offices.

B.G.: Absolutely. What happens a lot is that people make intuitive risk assessments. When it comes time to make that evaluation, it’s, “Well, we’ve got to support the headquarters, but we don’t have enough budget for those remote offices.” Now what they do is say, “If you’re in a remote office and it goes down, just go down to Starbucks.”

We always tell our clients that cloud providers, in terms of what they bring to the table, are probably going to be as good as best practices or much better than what’s available because those things are core competencies for them. Most IT organizations are cost centers that everybody is always asking: “How can we squeeze this? How can we put this off?”

Major cloud providers don’t have that option. They can’t say, ‘We didn’t upgrade to the latest Microsoft patch because that would require us to move to the newest service pack.’ They just can’t do that from a business perspective.


Security Issues for Multicore Processors

If hackers love one thing, it’s a big pool of potential targets, which is why Android and Windows platforms are attacked far more often than BlackBerry and Mac OS X. So, it’s no surprise that as the installed base of multicore processors has grown, they’ve become a potential target.

That vulnerability is slowly expanding to mobile devices. Within three years, nearly 75 percent of smartphones and tablets will have a multicore processor, predicts research firm In-Stat. That’s another reason why CIOs, IT managers and enterprise developers need to develop strategies for mitigating multicore-enabled attacks.

Cambridge University researcher Robert Watson has been studying multicore security issues, such as attacks that exploit system call wrappers, for several years. He recently spoke with Intelligence in Software about multicore vulnerabilities and what the IT industry is doing to close the processor back door.

Q: As the installed base of multicore processors grows in PC and mobile devices, are they becoming a more attractive target for hackers?

Robert Watson: One of the most important transitions in computer security over the last two decades has been the professionalization, on a large scale, of hacking. Online fraud and mass-market hacking face the same pressures that more conventional online businesses do: how to reach the largest audience, how to reduce costs, and how to reuse and, wherever possible, automate solutions. This means going for the low-hanging fruit and targeting the commodity platforms. Windows and Android are a case in point, but we certainly shouldn't assume that Apple's iOS and RIM's BlackBerry aren't targets. They are major market players as well.

Multicore attacks come into play in local privilege escalation. They may not be how an attacker gets their first byte of code on the phone -- that might be conventional buffer overflows in network protocols and file formats, or perhaps simply asking the user to buy malware in an online application store.

Multicore attacks instead kick in when users try to escape from sandboxing on devices, typically targeted at operating system kernel concurrency vulnerabilities. In many ways, it's quite exciting that vendors like Apple, Nokia and Google have adopted a "sandboxed by default" model on the phone. They took advantage of a change in platform to require application developers to change models. This has made the mobile device market a dramatically better place than the cesspool of desktop computing devices. However, from the attacker perspective, it's an obstacle to be overcome, and multicore attacks are a very good way to do that.

Q: What are the primary vulnerabilities in multicore designs? What are the major types of attacks?

R.W.: When teaching undergraduates about local and distributed systems programming, the term "concurrency" comes up a lot. Concurrency refers to the appearance, and in some cases the reality, of multiple things going on at once. The application developer has to deal with the possibility that two messages arrive concurrently, that a file is changed by two programs concurrently, etc.

Reasoning about possible interleavings of events turns out to be remarkably difficult to do. It's one of the things that makes CPU and OS design so difficult. When programmers reason incorrectly about concurrency, applications can behave unpredictably: They can crash, data can be corrupted and, in the security context, this can lead to incorrect implementation of sandboxing.
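The textbook illustration of such a bug is an unsynchronized read-modify-write shared between threads. This Python sketch is only an analogy for the kernel-level races Watson describes, not an actual exploit, but it shows how a missing lock can silently lose updates:

```python
import threading

ITERATIONS = 50_000
NUM_THREADS = 4
counter = 0
lock = threading.Lock()

def unsafe_increment():
    global counter
    for _ in range(ITERATIONS):
        # Read-modify-write with no lock: another thread can run between
        # the read and the write, and one of the two updates is lost.
        counter += 1

def safe_increment():
    global counter
    for _ in range(ITERATIONS):
        with lock:  # serializes the read-modify-write
            counter += 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

expected = NUM_THREADS * ITERATIONS
# The unsynchronized version may come up short; the locked one never does.
print("unsafe:", run(unsafe_increment), "of", expected)
print("safe:  ", run(safe_increment), "of", expected)
```

A sandbox or security monitor that performs the equivalent of the unsafe version, checking a value and then acting on it without holding the right lock, leaves exactly the window an attacker races to win.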

In our system call wrapper work, we showed that by exploiting concurrency bugs, attackers could bypass a variety of security techniques, from sandboxing to intrusion detection. Others have since shown that these attacks work against almost all mass-market antivirus packages, allowing viruses to go undetected. Similar techniques have been used to exploit OS bugs in systems such as Linux, in which incorrect reasoning about concurrency allows an application running with user privileges to gain system privileges.

It is precisely these sorts of attacks that we are worried about: the ability to escape from sandboxing and attack the broader mobile platform, stealing or modifying data, gaining unauthorized access to networks, etc.

Q: What can enterprise developers, CIOs and IT managers do to mitigate those threats? And is there anything that vendors such as chipset manufacturers could or should do to help make multicore processors more secure?

R.W.: Concurrency is inherent in the design of complex systems software, and multicore has brought this to the forefront in the design of end-user devices. It isn't just a security problem, though. Incorrect reasoning about concurrency leads to failures of a variety of systems. Our research and development communities need to continue to focus on how to make concurrency more accessible.

Enterprise developers need to be specifically trained in reasoning about concurrency, a topic omitted from the educations of many senior developers because they were trained before the widespread adoption of concurrent programming styles, and often taught badly even for more junior developers. Perhaps the most important thing to do here is to avoid concurrency wherever possible. It is tempting to adopt concurrent programming styles because that is the way things are going. Developers should resist!

There are places where concurrency can't be avoided, especially in the design of high-performance OS kernels, and there, concurrency and security must be considered hand-in-hand. In fact, concerns about concurrency and security, such as those raised in our system call wrapper work, have directly influenced the design of OS sandboxing used in most commercially available OSs.

For CIOs and IT managers, concurrency attacks are just another weapon in the hacker's arsenal that they need to be aware of. Concurrency isn't going away: We use multicore machines everywhere, and the whole point of networking is to facilitate concurrency.

What they can do is put pressure on their vendors to consider the implications of concurrency maturely, and on their local developers to do the same. Where companies produce software-based products, commercial scanning tools such as Coverity's Prevent are increasingly aware of concurrency, and these should be deployed as appropriate. For software developers, we shouldn't forget training: first, to avoid risky software constructs, and second, to know how to use them correctly when they must be used.

We should take some comfort in knowing that hardware and software researchers are deeply concerned with the problems of concurrency, and that this is an active area of research. But there are no quick fixes since the limitations here are as much to do with our ability to comprehend concurrency as with the nature of the technologies themselves.


Who’s in Charge of Multicore?

Like just about every other technology, multicore processors have an industry organization to help create best practices and guidelines. For developers, following The Multicore Association (MCA) can be a convenient way to keep up with what processor manufacturers, OS vendors, universities and other ecosystem members are planning a year or two out. MCA president Markus Levy recently spoke with Intelligence in Software about the organization’s current initiatives.

Q: What’s MCA’s role in the industry? For example, do you work with other trade groups and standards bodies?

Markus Levy: Multicore is a huge topic with many varieties of processors, issues and benefits. It boils down to the target market and, specifically, the target application to determine whether to use a homogeneous symmetrical multiprocessing processor or a highly integrated system-on-a-chip with many heterogeneous processing elements. The Multicore Association is and will be biting off a chunk of this to enable portability and ease of use.

With this in mind, we primarily aim to develop application program interfaces (APIs) to allow processor and operating system vendors and programmers to develop multicore-related products using open specifications. It’s important to note that The Multicore Association doesn’t currently work with other trade groups or standards bodies. We never intend to compete and/or develop redundant specifications.

Q: Developers work with whatever hardware vendors provide at any given time. In this case, that’s multicore processors. How can keeping up with MCA help developers understand what kind of hardware and software might be available to them in a year, three years or five years down the road? For example, what are some current association initiatives that they should keep an eye on?

M.L.: Good question. Notice that the MCA membership comprises a mixture of processor vendors, OS vendors and system developers. The main benefit for processor vendors is to support current and future generations of multicore processors. In other words, it makes it easier for their customers to write their application code once and only require minor changes as they move to another generation processor.

The OS vendors are also utilizing the MCA standards to enhance their offerings, and customers are requesting it. System developers are actually using our open standards to create their own proprietary implementations that are optimized for their needs.

We currently have a Tools Infrastructure Working Group (TIWG) that is defining a common data format and creating standards-based mechanisms to share data across diverse and non-interoperable development tools, specifically related to the interfaces between profilers and analysis/visualization tools. In this regard, the TIWG is also collaborating with the CE Linux Forum on a reference implementation for a de facto trace data format standard that TIWG will define.

Our most popular specification to date is the Multicore Communications API (MCAPI), and the working group is currently defining and developing refinements and enhancements for version 2.0. MCAPI is now implemented by most OS vendors and in quite a few university projects, and system companies have developed their own versions as well.
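To get a feel for the message-passing model MCAPI standardizes, here is a loose analogy in Python using a thread-safe queue. The `mcapi_msg_send`/`mcapi_msg_recv` names in the comments refer to the real C API; the code itself is only a stand-in, with threads playing the role of cores:

```python
import queue
import threading

# Each "endpoint" is modeled as a thread-safe queue. Real MCAPI endpoints
# would live on separate cores of an embedded device; this is an analogy only.
endpoint = queue.Queue()

def producer_core():
    for i in range(3):
        endpoint.put(f"msg-{i}")   # rough analogue of mcapi_msg_send
    endpoint.put(None)             # sentinel: no more messages

def consumer_core(received):
    while True:
        msg = endpoint.get()       # rough analogue of a blocking mcapi_msg_recv
        if msg is None:
            break
        received.append(msg)

received = []
t1 = threading.Thread(target=producer_core)
t2 = threading.Thread(target=consumer_core, args=(received,))
t1.start(); t2.start()
t1.join(); t2.join()
print(received)  # ['msg-0', 'msg-1', 'msg-2']
```

The point of the standard is that code written against these send/receive primitives can move between multicore chips without being rewritten for each vendor's interconnect.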

We’ve recently formed a Multicore Task Management API (MTAPI) working group that is focused on dynamic scheduling and mapping tasks to processor cores to help optimize throughput on multicore systems. MTAPI will provide an API that allows parallel embedded software to be designed in a straightforward way, abstracting the hardware details and letting the software developer focus on the parallel solution of the problem. This is already turning out to be quite popular with fairly extensive member involvement. It’s an important piece of the puzzle that many developers will be able to utilize in the next one to two years.
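As a rough analogy for that dynamic task-to-core mapping, consider how a thread-pool executor schedules independent tasks onto a fixed set of workers. MTAPI itself is a C API for embedded targets; this Python sketch only illustrates the programming model of expressing work as tasks and letting a runtime place them:

```python
from concurrent.futures import ThreadPoolExecutor

# The pool's workers stand in for processor cores; the executor plays the
# role of an MTAPI-style runtime, dynamically mapping tasks onto whichever
# worker is free, while the developer only describes the tasks themselves.
def task(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(8)))

print(results)  # squares of 0..7, in submission order
```

The abstraction is the point: the code above never says which worker runs which task, just as MTAPI aims to hide the hardware details of core assignment from the application developer.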

Q: Multicore processors have a lot of obvious advantages, particularly performance. But are there any challenges? The need for more parallelism seems like one. What are some others? And how is The Multicore Association working to address those challenges?

M.L.: There are many challenges of using multicore processors. Parallelizing code is just one aspect. Again, it also depends on the type of multicore processor and how it is being used. While the MTAPI is focused on the parallel solution, the MCAPI is critical to enable core-to-core communications, and our Multicore Resource Management API (MRAPI) specification is focused on dealing with on-chip resources that are shared by two or more cores, such as shared memory and I/O.

Q: The MCA website has a lot of resources, such as webinars and a discussion group. Would you recommend those as a good way for developers to keep up with MCA activities?

M.L.: The webinars provide good background information on the projects we have completed so far, so I recommend these as starting points. The discussion group is very inactive. Better ways to stay up on activities include:

  • Subscribe to the MCA newsletter, which comes out every four to eight weeks, depending on whether there is news to report.
  • Attend meetings as a guest; in special cases nonmembers can do this, and they can contact me for assistance.
  • Attend the annual Multicore Expo, where members go into depth on the specifications and other industry folks present on various multicore technologies.
