AV vendors say most badware sites are compromised

A recent report from Symantec reinforces the idea that most web-based malware is distributed via compromised, legitimate sites:

In 2010 so far, using the same approach, the proportion of malicious domains that are legitimate [i.e., set up for reasons other than distributing malware] has increased dramatically compared to last year – it’s now about 90%.

On a related note, Avast reports that, despite popular belief, adult sites are not carrying the load of malicious content:

…the statistics are clear – for every infected adult domain we identify there are 99 others with perfectly legitimate content that are also infected.

Everyday Internet users who are hearing this for the first time should take this as a wake-up call. Protect your computer. Protect your website. And recognize that, while making smart decisions about your Internet use is always a critical part of security, deciding which type of website you visit isn't as important as it once was.

Hat tip: H-Online via UnmaskParasites (Twitter)


WEIS Recap: Review of “Might Governments Clean Up Malware?”

Richard Clayton wrote one of the more interesting papers presented at WEIS. In his paper “Might Governments Clean Up Malware?” [pdf] he suggests some possible government intervention to help consumers clean up their computers. His paper gives the following reasons:
1) ISPs do not have an incentive to act
2) The problem has public dimensions very similar to public health issues
3) The math behind this issue requires someone (the government) to seed the funding for experts to act
I agree with the contention that ISPs do not have incentives to act. Of the web hosts that I have communicated with, not a single one has found it financially rewarding to deal with the problems I highlight. This really isn’t how it is supposed to work. As Clayton points out, “in principle the market should deal with ISPs who skimp on abuse activity.” Put another way, those ISPs that actively clean up infections in their customer base should have a better image and thus more business. The market should reward those ISPs that go out of their way to make sure that their customers remain protected. But as pointed out in many of the papers presented at WEIS and other conferences like it, the margins are extremely slim.
Clayton’s paper even references another paper which claims that a single interaction with a customer by an ISP will eat up all of the profit generated by that customer for the entire year. (In a footnote he mentions that this may be exaggerated, but not greatly so.)
The one issue I have with this paper is that it doesn’t quite cover the issue I’m most concerned about. Obviously that isn’t a valid criticism of the paper so much as a wish on my part. The paper deals with helping web “surfers” rather than webmasters. Often the problem that I’m studying involves both levels: web sites are infected because the webmaster’s personal computer was infected and the attacker gathered the login details from there. So fixing one may in fact help fix the other. But there is a major difference worth noting. The paper makes a good point about the hesitation of an ISP to engage with its customers this way. When margins are thin, profit is only achievable through volume, so any action that drives customers away in any number is dangerous. Accusing customers of being infected isn’t always rewarded with gratitude. Customers can feel angry, ashamed, alienated, or all three at once. For many people it is difficult to find a new bandwidth provider. In Cambridge I have my choice between one cable company and two DSL providers (one of which just resells the other’s service at a markup), and the change from cable to DSL (or vice versa) comes with considerable costs as well. But switching web hosting providers doesn’t cost much, and there are a lot of choices. So the danger of customer alienation is very high for web hosting firms.


Australian ISPs on the right track

In early June, the Australian Internet Industry Association, an ISP industry trade group, published icode [PDF], a voluntary code of conduct for ISPs to follow to better fight bots on their networks. Like the previously-mentioned IETF draft, this document lays out a rationale for, and recommendations on how to implement, an ISP-level response to bots. Unlike the IETF draft, icode is a reflection of a coordinated effort by a large number of ISPs to buy in to a common framework for how to respond.

The icode framework has four parts:

  1. Education. ISPs that adopt icode are expected to educate their customers about keeping their computers from becoming compromised.
  2. Detection. ISPs can implement their own detection methods and/or get data from trusted third parties. Even better, they can get data from the Australian Internet Security Initiative, a government-led effort to centralize bot reporting by collecting bot reports from trusted providers and then distributing ISP-specific data daily to participating ISPs. (Wouldn’t it be great if we had something like this for infected URLs and hosting companies?)
  3. Action. ISPs are encouraged to act on the information about bots, through whatever combination of customer notification, password resets, bandwidth throttling, walled garden quarantining, SMTP blocking, or other measures they consider appropriate.
  4. Reporting. ISPs are expected to report “significant cyber security incidents” to governments.

icode also recommends, though doesn’t require, that participating ISPs share threat data with each other, facilitated by the Australian CERT.

One could quibble over some of the details, but it’s clear that the Australian ISPs that created and will be adopting icode are light years ahead of most ISPs (and web hosting providers) globally in tackling the spread of malware.


Recent spikes in badware reports

We have generally seen an increase in the number of badware URLs reported by our data providers lately, but in the past few weeks, we’ve seen unusually big spikes on three autonomous systems (simplifying slightly, an AS is a set of networks operated by a single entity):

AS16276 (OVH) graph
AS9809 (Nova Network, China) graph
AS22489 (Castle Access) graph

We have attempted to notify all three network operators via abuse@domain_name. The report to abuse@nova.net.cn bounced, so if you know another contact address there, please let us know.

Most likely, these spikes in infection numbers are the result of either targeted attacks at these networks or opportunistic attacks that happened to find their way into large numbers of identically configured (or misconfigured) web servers.


StopBadware welcomes a new board member

StopBadware is pleased to announce that Paul Mockapetris will join our
board of directors.

Mockapetris created the Domain Name System (DNS), an essential part of
today’s Internet infrastructure, in the 1980s. He is now the Chief
Scientist and Chairman of the Board at Nominum, a global provider of DNS
and DHCP solutions to communication providers. He has previously served
as chair of the Internet Engineering Task Force (IETF) and as a program
manager at the Advanced Research Projects Agency (ARPA).

The board, chaired by PayPal CISO Michael Barrett, also includes Vint
Cerf, Esther Dyson, John Palfrey, Ari Schwartz, Mike Shaver, and
executive director Maxim Weinstein.

Press release: http://www.stopbadware.org/home/pr_06112010


Thoughts on WEIS 2010

Earlier this week I sat in on the Workshop on the Economics of Information Security. One of the more lively research papers presented was on insecurities in the online pornography industry. The paper [0] has also been written about by Threatpost [1]. As noted in Naraine’s article, the team crawled just over 35,000 websites using an automated system. Interestingly, the team discovered that about 3.23% of those sites were also infected with drive-by downloads. One aspect of the research I was curious about was how popular those infected porn sites were. I spoke with Dr. Wondracek after his talk about the possibility of figuring this out. In my own thesis last semester I discovered that, of the sampled sites we receive from our data partners, fewer than 3% were listed as popular by Alexa.

To determine this, one simply downloads Alexa’s “Top 1,000,000 Websites” list [2] and formats the list of infected sites so the two can be compared. (Alexa’s list uses canonical hostnames.) Then take the intersection of the two lists (the hostnames that appear on both list A and list B) and divide its size by the number of infected hostnames to get a percentage. This statistic should answer Pr(Popularity|Infection), or the probability of popularity given an infection.
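The intersection step can be sketched roughly as follows. This is a minimal illustration, not the code used for the thesis: the function names are hypothetical, the Alexa file is assumed to contain `rank,hostname` rows, and stripping a leading `www.` is only one assumed way of getting hostnames into a comparable form.

```python
import csv
import io


def load_alexa_hostnames(csv_text):
    """Parse 'rank,hostname' rows (assumed format) into a set of hostnames."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[1].strip().lower() for row in reader if len(row) >= 2}


def normalize(hostname):
    """Lowercase a hostname and drop a leading 'www.' (an assumption)."""
    host = hostname.strip().lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def popularity_given_infection(infected_hosts, popular_hosts):
    """Fraction of infected hostnames that also appear on the popular list."""
    infected = {normalize(h) for h in infected_hosts}
    if not infected:
        return 0.0
    return len(infected & popular_hosts) / len(infected)


if __name__ == "__main__":
    # Placeholder data for illustration only; real input would be the
    # unzipped Alexa CSV and a list of infected hostnames.
    alexa_csv = "1,example.com\n2,example.org\n3,example.net\n"
    infected = ["www.example.org", "infected.example", "another.example"]
    popular = load_alexa_hostnames(alexa_csv)
    print(popularity_given_infection(infected, popular))
```

Using sets makes the intersection a single operation and removes duplicate hostnames, so a site listed many times is only counted once.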

[edit: moved links to bottom in footnote format for better readability]
[0] http://weis2010.econinfosec.org/papers/session2/weis2010_wondracek.pdf
[1] http://threatpost.com/en_us/blogs/understanding-porn-malware-connections-060810
[2] http://s3.amazonaws.com/alexa-static/top-1m.csv.zip


A Detailed Look at ThePlanet’s Infection Distribution

A reader asked a question in the comments of the previous blog post about ThePlanet regarding the distribution of infections.  The reader wanted to know if the rest of the infections not attributed to Skenzo and HostGator were evenly distributed with less than 10% of the total infections.  The type of distribution the reader was describing is called a Power Law distribution.  A power law distributed population will look something like this:

[Figure: power law distribution, courtesy of Wikipedia.org]

For this blog post I’m using data pulled in early May 2010 on AS21844 (ThePlanet), and I find that the infection counts are roughly power law distributed. I’ve gone over the methodology for obtaining this data in previous posts, but it bears mentioning that I am using data grouped by RWhois organization name. Later in this post I will look at the same data grouped only by IP address so that it can be compared with other AS blocks. The raw infection counts look like this in graphical format:

raw data plot

The shape is essentially the same, and it is obvious that there are a lot of organizations that have only single- and double-digit infections attributed to them. The area between 500 and 2500 is entirely barren, with only a single entry beyond 2500. One of the issues when looking at data like this is the blur of data points below the 500 marker. One could simply strip away the outliers (those data points above 500), but in this particular case I don’t think that is an effective way to view the data. In statistics people often “transform” the data to deal with this situation. This generally means applying the same function to every number, which keeps the data points in the same order but makes the plot easier to read. I generally favor the log/log method, which means I take the log of each number and graph it that way. Log (or logarithm) is a mathematical function that is explained well on Wikipedia; for our purposes it is best thought of as a number “reducer” that can be applied uniformly across data.

To get a sense of the scale, the natural log of 2500 is 7.8, the log of 500 is 6.2, the log of 100 is 4.6, and the log of 1 is 0. Once the data is transformed we can see there is a little variance in the actual distribution, but the fact that the line slopes downward like that is another very good indicator of a power law distribution.

log/log data plot
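As a rough sketch of the transform, the snippet below takes the natural log of a few values and estimates the slope of a log/log line. The function names are hypothetical, and plotting log(count) against log(rank) is an assumption about the axes, not necessarily how the plot above was produced.

```python
import math


def log_transform(values):
    """Natural log of each value; every value must be positive."""
    return [math.log(v) for v in values]


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Counts are sorted largest first, so rank 1 is the largest entry.
    Needs at least two counts.
    """
    ordered = sorted(counts, reverse=True)
    xs = log_transform(range(1, len(ordered) + 1))
    ys = log_transform(ordered)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    # The scale points mentioned in the text.
    print([round(v, 1) for v in log_transform([2500, 500, 100, 1])])
    # prints [7.8, 6.2, 4.6, 0.0]
```

A clearly negative slope on the log/log scale is consistent with a power law, though it does not prove one on its own.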


Update on ThePlanet and Hostgator

Last month we started an investigation into the massive numbers of infections we saw on ThePlanet’s AS21844 network. Last week we discovered the RWhois server at ThePlanet and were able to get a more fine-grained view of the infection distribution. 10% of the infections were attributed to Skenzo, while 40% were attributed to HostGator resellers.
The infections we thought were attributed to Skenzo turned out to be abandoned badware domains. We think this problem will largely work itself out, as Skenzo has no interest in monetizing domains marked as badware.
The infections at HostGator were a bit more challenging. I communicated with several members of the HostGator team over the course of the last few weeks. They voiced some valid complaints that I will talk about in this post, the most important of which concerns the way infections are counted.
One domain, nyalines.com, had something like 1000 infections attributed to it. It is pretty unusual for our data partners to list a domain this way. If there are more than a handful of infections at the same domain, they will usually just list the entire domain. When we asked Google, the data partner responsible for that particular listing, they said the automated system they have in place thought it was better to list it that way.
Here is a sample of what they were talking about:

    10 vadakarapally . org
    10 websitecoders . org
    11 e-sense . tv
    11 malayalamwallpapers . net
    12 attorney2traffic . org
    15 kingvip . com
    17 niftysensex . com
    18 fountain.fountaintips . com
    19 findluxurywatch . com
    19 freewallpapershere . com
    20 quitsmokingtips4u . com
    21 shorthandlogic . com
    23 dir10 . net
    91 freenewdownload . com
   116 moviemark.com . br
   987 nyalines . com
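A tally like the one above could be produced with something along these lines. This is a sketch under assumptions: the function names are hypothetical, the input is assumed to be a plain list of reported URLs, and “domain” here simply means the hostname portion of each URL.

```python
from collections import Counter
from urllib.parse import urlsplit


def infections_per_domain(urls):
    """Count how many reported URLs fall under each hostname."""
    counts = Counter()
    for url in urls:
        # urlsplit only finds the host when a '//' prefix is present,
        # so add one for bare entries such as 'example.com/page'.
        target = url if "//" in url else "//" + url
        host = urlsplit(target).hostname
        if host:
            counts[host] += 1
    return counts


def count_each_domain_once(urls):
    """Number of distinct hostnames, ignoring repeat listings."""
    return len(infections_per_domain(urls))


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    reported = [
        "http://example.com/a.html",
        "http://example.com/b.html",
        "example.org/index.php",
    ]
    for host, n in sorted(infections_per_domain(reported).items(), key=lambda kv: kv[1]):
        print(n, host)
    print(count_each_domain_once(reported))
```

Comparing the per-URL tally with the count of distinct hostnames shows how much a single heavily listed domain inflates the total.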

We didn’t get any further explanation from Google, so I am at a loss as to why there was a need to mark the same domain 1000 times. The senior security tech at HostGator I spoke with felt that our report unfairly characterized HostGator, and I would like to address that. We at StopBadware simply follow the data. We take what is in front of us and interpret it as best we can for public consumption. When we are shown errors in our methodology, we adapt it. Figuring out how to more accurately represent infections on the Internet is a giant part of what I do, and overcounting of a particular domain will be at the top of my list (along with RWhois resolution). However, ThePlanet is still at the top of the infection charts for US-based web hosting providers. And even if we count each domain only once, HostGator resellers accounted for 6655 of the infections within that network. I am very grateful for their team’s willingness to work with us to eradicate those infections.
It also bears mentioning that I don’t particularly think Google did anything wrong here either. They produce a list of URLs believed to contain badware and release it to their partners. We made the move to quantify this list so we could get some sense of whether things were getting better or worse, both in terms of overall infections and infections within particular networks. Those metrics allow us to prioritize hubs of infection on the Internet and spend our scarce resources attacking where it counts.
We will begin the bulk appeal process to get the URLs HostGator has cleaned unmarked as badware. With some luck the high numbers of infections on AS21844 will start coming down.


Update on Sustained Infections at ThePlanet (Skenzo and HostGator)

I’ve been working on my investigation of ThePlanet and have some new and interesting results.

Skenzo has some valid concerns. They monetize abandoned domain names and apparently inherited a bunch of abandoned badware URLs. When Google rescans a site on its badware list and finds that the contents have disappeared or changed dramatically, Google does not necessarily assume that the site is clean. That is to say, someone who simply deletes the page and doesn’t request a review might stay on the list for a prolonged time. The logic, I guess, is to prevent someone from simply deleting the page until they are cleared and then reinstating the previous content.
Skenzo did some investigating of their own with a list of URLs I provided them. They found the following:
* 635 URLs had not been visited by Google in the last 90 days
* 108 URLs had been visited by Google in the last 90 days, but no suspicious page was found
* 473 URLs had been marked as suspect in the last 90 days; those detections would have been at the previous network and not on Skenzo’s infrastructure

There are obvious issues with Skenzo’s situation. Skenzo doesn’t want the badware URLs in their monetization network anyway so I introduced Skenzo to the Google team in the hopes that Google will just send them updated lists for removal. So that may have a happy ending.

WebsiteWelcome is a whole other headache. Earlier I only ran the top 50 IP addresses from the infections in AS21844, which means I excluded the “tail” of the distribution. Usually the tail is made up of small websites with 1-5 infections on their IP address. However, what I didn’t realize at the time was that WebsiteWelcome is, quite literally, HostGator. I had assumed they were just a reseller, but WebsiteWelcome seems to be the private label name used by all HostGator resellers. So when I reran the entire list of infections in AS21844 through the RWhois server, I got this result:

WebsiteWelcome 8317
Skenzo FZE 2592
No Orgname 474
Site5 LLC 389
SiteGround.com 205

This means that of ThePlanet’s 20,000 infections, HostGator (under the WebsiteWelcome name alone) accounts for ~40%. Those infections are spread out across 2,800 IP addresses. That is a really large percentage, considering many of the top malware network lists have ThePlanet at the top. Worse, I don’t have any way of making the list more granular. HostGator and I have been in touch via email, but they refuse to go on record. I continue to send them URLs, and they are working on cleaning up these hosts so far as I can tell.
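The percentage can be checked with a few lines of arithmetic. This is only a sketch: the counts are the RWhois results listed above, the total of 20,000 is the rounded figure from the text rather than an exact count, and the function name is hypothetical.

```python
def infection_shares(counts_by_org, total):
    """Percentage of the total attributed to each organization."""
    return {org: 100.0 * n / total for org, n in counts_by_org.items()}


# Counts from the RWhois results above; total is the approximate
# figure quoted in the text.
RWHOIS_COUNTS = {
    "WebsiteWelcome": 8317,
    "Skenzo FZE": 2592,
    "No Orgname": 474,
    "Site5 LLC": 389,
    "SiteGround.com": 205,
}
APPROX_TOTAL = 20000

if __name__ == "__main__":
    for org, share in infection_shares(RWHOIS_COUNTS, APPROX_TOTAL).items():
        print(f"{org}: {share:.1f}%")
```

Because the total is rounded, the resulting shares are approximate as well.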

[Update 4/27: Edited the part about Google's policy for improved accuracy.]


StopBadware is hiring!

Wear jeans. Write code. Fight viruses. Get home in time for dinner.

Come join StopBadware in an epic battle against viruses, spyware, fake anti-virus apps, and other badware. There are no sword fights involved (sorry), but on the plus side, writing code is far less likely to result in a flesh wound. The code you write will take the data we collect about badware, remix it into useful knowledge, and put that knowledge into the hands of the people best able to decide how to use it. The pen is mightier than the sword, indeed!

Apps are implemented in Ruby on Rails, and run on Apache/Passenger servers with a MySQL back end. The front end is standard HTML, CSS, and JavaScript. Development projects are tracked using Redmine and Subversion.

StopBadware is a pretty great place to work. Jeans are standard attire, and there are even opportunities to telecommute. We share a big, open office in Harvard Square with folks from Harvard’s Berkman Center for Internet & Society (from which we spun off) and the Public Radio Exchange team. You’ll find lots of developers and other fun, geeky people around. Our partners include some companies you may have heard of—Google, PayPal, and Mozilla—which gives us street cred, as well as opportunities to interact with a bunch of cool people.

We are a small team, and until we grow a bit, you’ll be the only full-time developer (except perhaps for a short overlap with our current lead developer). We will at times bring in contractors and interns to complement your skills and provide additional development bandwidth to complete projects. We don’t want to put you out on the battlefield all alone, after all!

StopBadware is a non-profit organization, but don’t let that dissuade you. We offer a fair salary and a solid benefits package (incl. health, dental, and 401k). Perhaps more importantly, we offer the opportunity to get paid to learn, have a good time, and make the Internet safer!

If all of the following apply to you, maybe you should apply for the job:

  • I dream in open source code.
  • I would enjoy watching an unmanned aerial drone attack the home of a prolific spammer.
  • I am eligible to work in the United States.
  • I know Ruby, Rails, SQL, HTML, CSS, and JavaScript, or I can do a decent job faking it while I learn the parts I don’t know.
  • I hear “security” and think “form validation,” not “I’m getting kicked out… again”
  • I may not be Linus Torvalds, but if you need me to do a little Linux, Apache, or MySQL administration, I’ll figure it out.

To apply, please send a cover letter and resume to developer@stopbadware.org.

[Updated 5/10/10: Revised job description.]
