Scientists in Switzerland say they have developed an algorithm that can trace the source of computer viruses, malware and spam.
Pedro Pinto, a researcher at the Audiovisual Communications Laboratory of the Swiss Federal Institute of Technology in Lausanne (EPFL), and colleagues presented the algorithm in a paper titled “Locating the Source of Diffusion in Large-Scale Networks.” The research was published Aug. 10, 2012, in the journal Physical Review Letters.
In a telephone interview, Pinto told Computerworld that tracking the status of every node on the Internet to find the source of a virus, malware or spam attack is impossible.
“That would mean you would need about one billion sensors. And you don’t want to monitor the entire Internet,” Pinto added.
Pinto described the alternative in an EPFL news blog: “Using our method, we can find the source of all kinds of things circulating in a network just by ‘listening’ to a limited number of members of that network.”
Physics, a publication of the American Physical Society, explained the model behind Pinto’s algorithm:
[Based on the standard network picture of epidemics] Individuals are imagined as points, or ‘nodes,’ in a plane, connected by a network of lines. Each node has several lines connecting it to other nodes, and each node can be either infected or uninfected. In the team’s scenario, all nodes begin the process uninfected, and a single source node spreads the infection from neighbor to neighbor, with random time delay for each transmission. Eventually, every node becomes infected and records both its time of infection and the identity of the infecting neighbor.
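The spreading model in that description can be sketched in a few lines of Python. This is an illustration only, not the team's code: the function name simulate_spread, the adjacency-dictionary graph format and the exponentially distributed delays are all assumptions, since the description says only that each transmission has a random delay.

```python
import heapq
import itertools
import random


def simulate_spread(graph, source, rng):
    """Spread an infection from `source` over `graph`.

    `graph` maps each node to a list of its neighbors (hypothetical format).
    Returns two dicts: the time each node was infected and the neighbor
    that infected it (None for the source).
    """
    infection_time = {}
    infected_by = {}
    tie_breaker = itertools.count()  # keeps heap entries comparable
    # Each entry: (arrival time, tie breaker, node reached, sending neighbor)
    queue = [(0.0, next(tie_breaker), source, None)]

    while queue:
        time, _, node, sender = heapq.heappop(queue)
        if node in infection_time:
            continue  # already infected by an earlier arrival
        infection_time[node] = time
        infected_by[node] = sender
        for neighbor in graph[node]:
            if neighbor not in infection_time:
                # Assumption: delays are exponential with mean 1.0.
                delay = rng.expovariate(1.0)
                heapq.heappush(
                    queue, (time + delay, next(tie_breaker), neighbor, node)
                )

    return infection_time, infected_by
```

Calling simulate_spread with a small graph and a seeded random.Random instance gives a repeatable run in which every reachable node ends up with an infection time and the identity of the neighbor that infected it.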
To trace back to the source using data from a fraction of the nodes, Pinto and his colleagues adapted methods used in wireless communication networks. When three or more base stations receive a signal from one cell phone, the system can measure the difference in the signal’s arrival time at each base station to triangulate a user’s position.
Pinto’s team then tested the effectiveness of the algorithm with real, measured data from the cholera outbreak that struck KwaZulu-Natal province, South Africa, in 2000.
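The time-difference idea carries over to a network of nodes. The sketch below is a deliberately simplified stand-in for the team's estimator, not a reproduction of it. It assumes the network is connected, that every transmission takes roughly the same time (mean_delay) and that distance can be counted in hops. It then picks the candidate node whose predicted arrival-time differences best match those recorded at a few observer nodes. The names hop_distances, estimate_source and mean_delay are hypothetical.

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: number of hops from `start` to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def estimate_source(graph, observed_times, mean_delay=1.0):
    """Guess the source from infection times at a few observer nodes.

    `observed_times` maps observer node -> time it was infected. Only the
    differences between observers are used, so the unknown start time of
    the outbreak cancels out, as in cell-phone positioning.
    """
    observers = sorted(observed_times)
    reference = observers[0]
    best_node, best_error = None, float("inf")

    for candidate in graph:
        dist = hop_distances(graph, candidate)
        error = 0.0
        for observer in observers[1:]:
            observed_diff = observed_times[observer] - observed_times[reference]
            predicted_diff = mean_delay * (dist[observer] - dist[reference])
            error += (observed_diff - predicted_diff) ** 2
        if error < best_error:
            best_node, best_error = candidate, error

    return best_node


# A seven-node chain 0-1-2-3-4-5-6 with observers at both ends.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
# Node 0 was infected at t=12 and node 6 at t=14, so the source must sit
# one hop closer to node 0 than the midpoint.
print(estimate_source(chain, {0: 12.0, 6: 14.0}))  # prints 2
```

With only two observers the estimate is unambiguous on a simple chain. On a denser network, more observers are generally needed to separate candidates that sit at similar distances.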
“We tested our method with data on an epidemic in South Africa provided by EPFL professor Andrea Rinaldo’s Ecohydrology Laboratory,” said Pinto. “By modeling water networks, river networks and human transport networks, we were able to find the spot where the first cases of infection appeared by monitoring only a small fraction of the villages.”
According to EPFL, Pinto’s team tested the system again using computer simulations of the telephone conversations that could have occurred during the Sept. 11, 2001, terrorist attacks on the United States. With these simulations and Pinto’s algorithm, the team hoped to pinpoint the individuals behind the attacks.
“By reconstructing the message exchange inside the 9/11 terrorist network extracted from publicly released news, our system spit out the names of three potential suspects – one of whom was found to be the mastermind of the attacks, according to the official enquiry,” said Pinto.
Although Pinto notes in his paper that several challenges remain, he told EPFL the algorithm could be used as a preventative measure to help counteract outbreaks, or simply as a tool for advertisers who use viral marketing strategies.