NetFlow is the Wrong Way to do Attack Surface Mapping

Robert Hansen March 26, 2021

This post is the fifth of a short series of posts that we have dubbed “Attack Surface Mapping the Wrong Way,” showing the wrong way that people/companies/vendors attempt to do attack surface mapping. Read the first in this series here. Next up is NetFlow and why it is the wrong way.

NetFlow alone is flawed

One of the more common ways network engineers attempt to discover all their assets is by looking at the wire. The theory is that anything that needs to communicate must leave the internal network through a choke point, such as a router, switch, or firewall, that can log the source and destination addresses of each packet. That ignores the reality that many networks have more than one way in and out, whether for redundancy or because of poor controls, but let’s give them the benefit of the doubt.

Network security people tend to believe that they have access to everything as it traverses their switches or network taps. From this network vantage point, software can get either a raw packet dump (which rarely happens) or, more than likely, NetFlow data.

NetFlow is a concept introduced by Cisco Systems that allows network devices to report a sampling of network traffic as it enters or exits the device. Typically, a NetFlow setup consists of one or more aggregation systems that collect the data and a console that allows people to monitor it.

While NetFlow can give you the protocol (TCP/UDP, for instance), the source address and port, and the destination address and port, it is otherwise very limited in its design and functionality. More modern versions of NetFlow provide a bit more information, such as the TCP flags, but that is not useful for attack surface mapping purposes. NetFlow therefore has some significant shortcomings when it comes to asset discovery.
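To make the limitation concrete, here is a minimal sketch of what a flow record carries, based only on the fields listed above. The `FlowRecord` class and its field names are hypothetical and simplified for illustration; they are not the actual NetFlow export format.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified flow record (not the real export format)."""
    protocol: str       # e.g. "TCP" or "UDP"
    src_addr: str       # source IP address
    src_port: int
    dst_addr: str       # destination IP address
    dst_port: int
    tcp_flags: int = 0  # extra detail some versions report


# Example values use documentation/private address ranges.
record = FlowRecord("TCP", "10.0.0.15", 51514, "203.0.113.10", 443, tcp_flags=0x1B)

# Everything the record can tell us: no hostname, no URL, no owner.
field_names = [f.name for f in fields(record)]
print(field_names)
```

Nothing in the record says which site lives at `dst_addr` or who owns it, which is the information asset discovery actually needs.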

First, because NetFlow is only a sampling of network traffic, it will miss certain details. It is not meant to be perfect; NetFlow is meant, by design, to sample. If you want a sample of your asset inventory, then NetFlow is fine. If you want to get everything, it is not great.
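A rough back-of-the-envelope sketch shows how much sampling can miss. It assumes each packet is sampled independently with probability 1/N, which is a simplification; the function name and the example rates are assumptions for illustration, not values from any particular device.

```python
def miss_probability(packets: int, sampling_rate: int) -> float:
    """Chance that a flow of `packets` packets is never sampled.

    Assumes independent 1-in-`sampling_rate` packet sampling, so each
    packet is skipped with probability (1 - 1/sampling_rate).
    """
    return (1 - 1 / sampling_rate) ** packets


# A rarely visited asset might only exchange a handful of packets.
for packets in (10, 100, 1000):
    p = miss_probability(packets, sampling_rate=1000)
    print(f"{packets:>5} packets at 1-in-1000 sampling: {p:.1%} chance of being missed")
```

Under these assumptions, low-volume assets are exactly the ones sampling is most likely to skip.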

Second, NetFlow samples at the IP level, not at the DNS/HTTP level, which is how network applications typically work these days. We will go more into this in the next section, but knowing that two IPs are talking to one another is only somewhat useful. What is on that IP? Is it your site, or is it a search engine, a music streaming site, a video game, or a thousand other things that have nothing to do with your company’s assets? How do you know which is which?
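The ambiguity is easy to sketch. The snippet below uses made-up DNS data (all hostnames and addresses are hypothetical, drawn from reserved documentation ranges) to show how several unrelated names can sit behind the one IP a flow record reports.

```python
from collections import defaultdict

# Hypothetical DNS data: hostname -> IP address.
# Shared hosting and CDNs commonly put many names behind one address.
dns_records = {
    "www.example.com": "203.0.113.10",
    "shop.example.com": "203.0.113.10",
    "unrelated.example.net": "203.0.113.10",
    "music.example.org": "198.51.100.7",
}


def hostnames_by_ip(records: dict[str, str]) -> dict[str, set[str]]:
    """Invert the mapping so each IP lists every hostname that resolves to it."""
    grouped: dict[str, set[str]] = defaultdict(set)
    for host, ip in records.items():
        grouped[ip].add(host)
    return grouped


# A flow record only says "someone talked to 203.0.113.10".
candidates = hostnames_by_ip(dns_records)["203.0.113.10"]
print(sorted(candidates))
```

From the IP alone there is no way to tell which of the candidate hostnames was actually visited, let alone whether it belongs to you.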

Third, because NetFlow only sees traffic while it is running, it cannot find sites that no one visits anymore. So, if your company has been around for a long time or if you acquired an old company, there could be many assets that no one goes to any longer but that still have vulnerabilities and could be leveraged against you. That makes NetFlow a particularly bad choice.

Lastly, suppose NetFlow is running in location A, but you have staff building websites in location B. In that case, there is no way, other than forwarding all traffic through location A, for a NetFlow exporter to detect the traffic from location B. If your company has several satellite offices, or in the age of COVID, where people are working from wherever, you are going to miss a ton. The only exception is when all traffic, including traffic bound for the Internet, goes through a set of centralized VPN concentrators and, after being decrypted, is sent through a router or switch that can do NetFlow.

There are also many new applications built on “serverless” compute or third-party APIs, where you have no access at all to the network. How is NetFlow supposed to work in that environment? The same is true if you use a third party to develop your assets. Unless you somehow arrange with all your vendors to tap their connections, which is unlikely, NetFlow alone is a bad option if you can choose another technology.

Said simply, NetFlow can only see IP, and it can only see the IP of the networks it is exposed to. You cannot run NetFlow on a part of a network you do not control or do not know you own. NetFlow can only find what runs over it, and therefore it is not ideal for finding assets.

Just because someone at your company went to a website does not mean it is yours. Sorting that out with NetFlow creates an enormous amount of human labor and will likely produce an enormous number of ongoing false positives, which is a substantial hidden cost. For this reason alone, we would not suggest that you use NetFlow unless you have a lot of money to throw at the problem and are okay with sharply diminishing returns and a large headcount or costly consulting services to manage it.
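As a rough illustration of where that labor comes from, the sketch below splits observed destination addresses into those inside address ranges you already know you own and everything else. The `OWNED_NETWORKS` list, the function name, and the sample addresses are all hypothetical.

```python
import ipaddress

# Hypothetical: the address ranges you already know belong to you.
OWNED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def split_destinations(dst_addrs, owned_networks):
    """Separate destinations in known-owned ranges from everything else."""
    owned, unknown = [], []
    for raw in dst_addrs:
        addr = ipaddress.ip_address(raw)
        if any(addr in net for net in owned_networks):
            owned.append(raw)
        else:
            unknown.append(raw)
    return owned, unknown


# Hypothetical destinations pulled from flow data.
observed = ["203.0.113.10", "198.51.100.7", "192.0.2.44", "198.51.100.99"]
owned, unknown = split_destinations(observed, OWNED_NETWORKS)
print(f"already known: {owned}")
print(f"needs human review: {unknown}")
```

The automated step only confirms what you already knew. Every address in the "unknown" pile, which in real traffic is nearly all of them, has to be triaged by a person to decide whether it is a forgotten asset or just someone browsing the web.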

If you have NetFlow data, should you use it? In the asset mapping world, it’s useful but certainly not necessary, and it requires a lot of work on the company’s part to set up and maintain. Additionally, it’s expensive to analyze properly. NetFlow is so poorly designed for this purpose that I would almost tell people to skip it entirely. To be fair, there are some use cases where it has marginal utility, like identifying new assets and identifying malware that is beaconing out, but it certainly should never be used in a vacuum. If you or your vendor relies on NetFlow to do attack surface mapping, you are likely missing a lot.

Want to talk about the right way to do attack surface management? We’ll show you. Get in touch with us here.