Quantifying the Importance of Vantage Point Distribution in Internet Topology Mapping

The topology of the Internet has been extensively studied in recent years, driving a need for increasingly complex measurement infrastructures. These measurements have produced detailed topologies with steadily increasing temporal resolution, but concerns remain about whether active measurements can capture the true Internet topology. Two issues must be addressed: the difficulty of ensuring the accuracy of every individual measurement when millions are made daily, and the bias that might result from measuring along the tree of routes from each vantage point to the wider reaches of the Internet. However, early discussions of these concerns were based mostly on synthetic data, oversimplified models, or data with limited or biased observer distributions.

In this paper, we show how strongly extensive sampling from a broad and well-distributed set of vantage points affects the resulting topology and its bias. The majority of the paper is devoted to a first look at the importance of distribution quality. We show that diversity in the locations and types of vantage points is required for obtaining an unbiased topology. We analyze the effect that a broad distribution has on the convergence of various autonomous system (AS) topology characteristics, and show that although a diverse and broad distribution is not required for all inspected properties, it is required for some. Finally, we revisit earlier claims concerning bias in active traceroute sampling and show empirically that a diverse and broad distribution of vantage points calls their conclusions into question.