Estimating flow distributions from sampled flow statistics

N Duffield, C Lund, M Thorup - Proceedings of the 2003 conference on …, 2003 - dl.acm.org
Passive traffic measurement increasingly employs sampling at the packet level. Many high-end routers form flow statistics from a sampled substream of packets. Sampling is necessary in order to control the consumption of resources by the measurement operations. However, knowledge of the statistics of flows in the unsampled stream remains useful for understanding both the characteristics of source traffic and the consumption of resources in the network. This paper provides methods that use flow statistics formed from a sampled packet stream to infer the absolute frequencies of lengths of flows in the unsampled stream. A key part of our work is inferring the numbers and lengths of flows in the original traffic that evaded sampling altogether. We achieve this through statistical inference, and by exploiting protocol-level detail reported in flow records. The method has applications to the detection and characterization of network attacks: we show how to estimate, from sampled flow statistics, the number of compromised hosts that are sending attack traffic past the measurement point. We also investigate the impact of different implementations of packet sampling on our results.
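
To make concrete why some flows evade sampling altogether, the sketch below assumes the simplest model: each packet is sampled independently with probability p. That model, and the function names sampled_length_pmf and evasion_probability, are illustrative assumptions and are not taken from the paper, which also considers other sampling implementations. Under this assumption, the number of sampled packets from a flow of k packets is binomial, and the flow goes entirely unseen with probability (1 - p)^k.

```python
from math import comb


def sampled_length_pmf(k: int, p: float) -> list[float]:
    """Probability that j of a flow's k packets are sampled, for j = 0..k.

    Assumes (hypothetically) independent per-packet sampling with probability p.
    """
    return [comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(k + 1)]


def evasion_probability(k: int, p: float) -> float:
    """Probability that a flow of k packets has no packet sampled at all."""
    return (1 - p) ** k


if __name__ == "__main__":
    p = 0.01  # illustrative sampling probability (1 in 100), chosen arbitrarily
    for k in (1, 10, 100, 1000):
        print(f"flow length {k:4d}: P(evades sampling) = {evasion_probability(k, p):.4f}")
```

Short flows are missed with high probability under this model, which is why the sampled flow counts alone understate the number of original flows and some form of inference is needed.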