Behdad Esfahbod's daily notes on GNOME, Pango, Fedora, Persian Computing, Bob Dylan, and Dan Bern!

McEs, A Hacker Life
Wednesday, October 26, 2005
 Inverse Distribution

Yesterday Divesh Srivastava gave the departmental colloquium here, on Streams, Security, and Scalability. Interesting talk generally, but there was one point I particularly liked, and I think the idea can be useful in other scenarios too:

Suppose that you are monitoring the traffic on your lab's network and would like to detect when a new internet worm is emerging. Note that you don't even know what this new worm looks like. Here is how it works: take the packets from one minute (or whatever interval) of traffic, form the n-grams in the input (start with bigrams or trigrams, but 4-grams and above are not out of the question either), and distribute them into bins based on their frequency. You get an inverse distribution which, for a given x, tells you how many n-grams have frequency x. Now if the traffic is random enough, this inverse distribution should follow a Poisson distribution, which is a deformed bell curve of some kind. Draw the graph of the function (you may want to use a logarithmic x axis).
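To make the binning step concrete, here is a minimal sketch in Python. It assumes the payloads are already available as byte strings; the function names and the sample `window` are made up for illustration and are not taken from the talk or the paper.

```python
from collections import Counter


def ngram_counts(payloads, n=3):
    """Count every byte n-gram across an iterable of packet payloads."""
    counts = Counter()
    for payload in payloads:
        for i in range(len(payload) - n + 1):
            counts[payload[i:i + n]] += 1
    return counts


def inverse_distribution(counts):
    """Map each frequency x to the number of distinct n-grams seen exactly x times."""
    return Counter(counts.values())


# Hypothetical one-minute window of captured payloads (illustrative data only).
window = [b"GET /index.html", b"GET /about.html", b"POST /login"]
inv = inverse_distribution(ngram_counts(window, n=3))
for freq in sorted(inv):
    print(freq, inv[freq])  # frequency x, number of n-grams with that frequency
```

Plotting `inv` with frequency on a logarithmic x axis gives the graph described above.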

Now as a new internet worm becomes epidemic, the frequencies of the n-grams in the worm payload get exponentially(?) higher and higher, and in the plot of the function you see another bell curve rising higher and higher and moving farther to the right. That's a new worm spreading!
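One crude way to spot that second bump without staring at a plot is to compare the current window against a quiet baseline window, bin by bin. The sketch below is only a guess at how one might do it, not the method from the paper: the power-of-two bins, the `factor` and `min_count` thresholds, and the sample numbers are all assumptions picked for illustration.

```python
from collections import Counter


def log_bins(inv):
    """Collapse an inverse distribution (frequency -> n-gram count) into power-of-two bins."""
    bins = Counter()
    for freq, count in inv.items():
        bins[freq.bit_length() - 1] += count  # floor(log2(freq)) for freq >= 1
    return bins


def new_bumps(baseline_inv, current_inv, factor=4, min_count=10):
    """Return the bins (as exponents of 2) that grew sharply relative to the baseline."""
    base, cur = log_bins(baseline_inv), log_bins(current_inv)
    return sorted(b for b, c in cur.items()
                  if c >= min_count and c > factor * base[b])


# Made-up inverse distributions: the current window has 45 n-grams that each
# appear 70 times, which the baseline window did not have.
baseline = {1: 500, 2: 120, 3: 30, 40: 2}
current = {1: 510, 2: 118, 3: 33, 40: 2, 70: 45}
print(new_bumps(baseline, current))
```

A bin that shows up here for several windows in a row, while drifting to higher exponents, would match the picture of a second curve moving to the right.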

This paper contains the details, as well as ways to trigger this automatically.
