The complexity of network monitoring strongly depends on the size of the network under observation. Challenges in monitoring large-scale networks arise not only from dealing with a large volume of traffic, but also from keeping track of all traffic sources, destinations, and who-talks-to-whom communications. Analyzing this information makes it possible to uncover behaviors that would not have been visible by merely observing common metrics such as bytes and packets. The drawback is that extra pressure is put on the monitoring system as well as on the downstream data stores and timeseries stores.
This talk presents a case study based on the monitoring of a large-scale university network. Challenges faced, findings, and lessons learned will be examined. It will be shown how to make sense of the input data in order to manage and reduce its scale as early as possible in the monitoring system. The discussion will also highlight the advantages and limitations of the open-source software components of the monitoring system. In particular, the open-source network monitoring tool ntopng and the timeseries store InfluxDB will be considered. It will be shown what happens when ntopng and InfluxDB are pushed to their limits and beyond, and what can be done to ensure their smooth operation. Relevant findings, behaviors uncovered in the network traffic, and future directions will conclude the talk. The intended audience is technical and managerial individuals who are familiar with network monitoring.