Jan 28 2004

This morning I found in my E-Mail 73 W32.Novarg.A@mm viruses. In order to reduce my bandwidth consumption from this attack, I am taking the following actions:

  • The "sending" E-Mail server must declare the size of the E-Mail in advance. (Practically all servers do this; the zombie spammer and virus-ridden systems, however, do not.)
  • Any E-Mail over 30,000 bytes is rejected and the connection is dropped before it is even sent across the internet.
  • Any particular IP address may only send me one E-Mail every 30 minutes. This restriction may be lifted after the current threat is reduced.
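The three rules above amount to a small policy check applied before the message body is ever transferred. As a rough illustrative sketch only (the function names, data structures, and thresholds here are mine, not the actual server configuration):

```python
import time

MAX_SIZE = 30_000          # bytes: reject anything larger
MIN_INTERVAL = 30 * 60     # seconds: one message per IP per 30 minutes

last_seen = {}             # sender IP -> time of last accepted message

def check_message(ip, declared_size, now=None):
    """Return (accepted, reason), deciding before the body is sent."""
    now = time.time() if now is None else now
    if declared_size is None:
        return False, "sender did not declare the message size in advance"
    if declared_size > MAX_SIZE:
        return False, "message larger than 30,000 bytes"
    if ip in last_seen and now - last_seen[ip] < MIN_INTERVAL:
        return False, "more than one message from this IP in 30 minutes"
    last_seen[ip] = now
    return True, "accepted"
```

Because the size is declared up front and the rate limit is keyed on the connecting IP, an oversized or over-eager sender is refused before consuming any real bandwidth.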

Jan 27 2004

I knew the virus outlined below was going to be big. This morning at 10:15 AM, I found in my E-Mail a CERT security warning about this latest virus issue. You can view the complete warning at this link:

I knew this virus was going to be big. Why? Because of the volume I have been receiving: this morning I found 57 W32.Novarg.A@mm viruses in my E-Mail.

You may have misunderstood my previous posting, though. I am NOT sending out these viruses.

A reader asked "If you are so good, why did you catch the virus?"

I did NOT catch anything! I am being sent this latest attack from OTHER INFECTED COMPUTERS! As many viruses do, it spoofs the "from" address to mask where it "really" comes from. I identified only one such instance, noted in the news post below, where the "from" address was fraudulent and the message was also sent to me.

Jan 26 2004

In the last 6 hours, I have found in my E-Mail 38 W32.Novarg.A@mm viruses. This includes "the one" that spoofed my "from" E-Mail address and sent it back to me. The IP address it was "actually" sent from is located in Canada. This does not include the 27 "bounced" messages from servers telling me the "user does not exist." It also does not include the 15 automated messages from system administrators insisting on "warning" the sender about the "virus" attachment.

I have Ranted about viruses before and this is no exception. If anything "funny" is happening with your system, ensure your scanner is completely up to date and read about all of the recent attacks on Symantec.com or your favorite security site.

I have said it before and I will say it again: I will never send you or anyone else a file attachment out of the blue. If you did receive such a thing seemingly from "me," delete it immediately.

More information on this particular threat is here:

  • http://www.symantec.com/security_response/writeup.jsp?docid=2004-012612-5422-99

Dec 29 2003

I have information about AdShield and the reason the software performs as it does (outlined in several news posts below). An explanation was forwarded to me by the reader this issue affected:

Robots.txt files are used by web sites to control which of their pages are indexed by search engine spiders.  AdShield isn’t a search engine so it doesn’t conform to this standard even in version 3.  The caching option has always been disabled by default.  Version 3 does have an exclude list which could be used to prevent it from processing any web site which objects for whatever reason.

That explains why AdShield does not conform to the robots.txt standard, and I understand the thought process behind it. However, the robots.txt standard was implemented to tell "automated" programs not to go to a particular spot on a web site, or, put another way, not to index a particular file or directory. To help me determine what the correct behavior should be, I quote the first question and answer from http://www.robotstxt.org/wc/faq.html:

What is a WWW robot?

A robot is a program that automatically traverses the Web’s hypertext structure by retrieving a document, and recursively retrieving all documents that are referenced.

Note that "recursive" here doesn’t limit the definition to any specific traversal algorithm; even if a robot applies some heuristic to the selection and order of documents to visit and spaces out requests over a long space of time, it is still a robot.

Now, by this definition, any program that "automatically traverses" a site is considered a robot. I quote one more line from the first question and answer on http://www.robotstxt.org/wc/faq.html:

Normal Web browsers are not robots, because they are operated by a human, and don’t automatically retrieve referenced documents (other than inline images).

Web browsers cannot be considered robots because someone must point them to a particular site and, by default, they do not spider a domain. However, this brings to mind several more questions:

  • If a program "does what a robot does," should it conform to the standard and ignore what "robots" should ignore, reducing the network load it may cause?
  • Should it conform to the standard even if the program was "pointed" to a page?
  • Even if a program is not "indexing" a web site, should it conform?
  • Should I be using the robots.txt method to tell robots not to perform particular actions?

One could also take the point of view that since "…the caching option has always been disabled by default…" this particular function is not "automatic" and should not conform.

Where should the line be drawn? I cannot say, but I do know that Internet Explorer, when used in an "automated" fashion (for example, offline browsing), does check the robots.txt file located on a web server.
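For the curious, conforming to the standard is not difficult; Python's standard library even ships a parser for it. A minimal sketch follows; the rules and URLs are made up for illustration and are not any real site's robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, supplied inline instead of fetched.
rules = [
    "User-agent: *",
    "Disallow: /hidden/",   # e.g., a trap link that good robots skip
    "Disallow: /cgi-bin/",
]

parser = RobotFileParser()
parser.parse(rules)

# A pre-fetcher that wanted to behave like a "good" robot would check
# each candidate URL before automatically retrieving it:
print(parser.can_fetch("AnyBot", "http://www.example.com/WinXP/servicecfg.htm"))  # True
print(parser.can_fetch("AnyBot", "http://www.example.com/hidden/trap.htm"))       # False
```

A program that ran every automatically-selected link through a check like this would sidestep the entire problem described in these updates.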

A Rant outlining this issue and more research will probably appear in due time. Meanwhile, I thank my dedicated reader for not only pointing this problem out to me, but also going to great lengths to help me troubleshoot it. I just need to sit on this problem a little longer and figure out what I am going to do. Too many questions… not enough answers.

Dec 25 2003

I have been selected SETI@Home user of the day [link removed]. I think it could have something to do with my profile’s [link removed] picture involving Santa. Coincidence? The world may never know.


I have also been told that the person mentioned in the last couple of news updates may be using an "older" version of AdShield. Unfortunately, due to the Holiday season, they have not been able to get any answer from technical support as to the issues outlined below. If it does turn out that the older version is flawed and the latest does conform to the robots.txt standard, I will update the news.

Dec 23 2003

Well, the person mentioned two updates below "is" using a plain version of IE6, that is, if you do not count AdShield. AdShield blocks pop-ups and banner ads. Since I do not have any, it is rather pointless to run it on my site. However, here is the intriguing "feature," cut and pasted from the AdShield home page:

Improves performance using optional background downloading and caching of pages/images linked to the ones you’re viewing.

Improves performance for "whom?" That completely explains the reason for the log file entries. AdShield does not conform to the robots.txt standard and, therefore, indiscriminately "sucks everything," while at the exact same time:

Suppresses the download and display of ad images and frames.

Wow. Even though I am highly "anti-banner ads," this gives sites that depend on ad revenue great cause to be annoyed at this type of program. Not only does it block their income generator, it creates "more traffic" by pre-fetching links "just in case." I am not exactly sure whether the program kills the request for the ad content or downloads it and simply does not "display" it to the viewer. That will require more research.

I know there is a Rant in this news update, somewhere…

Dec 22 2003

In an attempt to give my readers a little insight as to "what goes on behind the scenes," I have posted the following news update.

Time to start the latest Quick Rant.

This is the longest news update I have had in a while. The reason? I am banging my head up against the monitor.

I, once again, fired up the automatic banning of IP addresses last night. This is due to my desire to stop "bad" robots from sucking too much bandwidth. More information on this practice is located in my Abuse Rant. Basically, I implemented a "hidden link" which all "good" robots, including all major search engines, would ignore.
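For readers wondering how such a trap works in principle, here is a minimal sketch. The path and markup are illustrative, not the actual ones used on this site: robots.txt disallows a path, the page contains a link to that path that no human would follow, and any client that fetches it anyway has ignored the standard and gets flagged.

```
# robots.txt -- "good" robots read this and skip the trap:
User-agent: *
Disallow: /trap/

<!-- and, somewhere in the page markup, a link no human would follow: -->
<a href="/trap/do-not-follow.htm"></a>
```

Search engine spiders honor the Disallow line, humans never click an invisible link, so anything requesting /trap/ is, by elimination, an ill-behaved automated client.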

I had a reader contact me in distress saying they did "nothing wrong" and that they are using a plain version of IE6. Some proxy servers, pre-fetchers and firewalls choose to ignore the robots.txt standard.

After reviewing the log files, I am attempting to figure out, with this person, the exact "reason" this particular version of IE6 is attempting to "pre-fetch" links that are not valid and, as a result, causing my server to flag the IP for abuse. Whether or not this person is using any of the previously mentioned products is unknown at this time.

Once again, I have temporarily removed that particular function from the server. However, even though IP addresses are not automatically banned, I am notified immediately of the spidering attempt and the logs will remain until I can narrow down the cause of this problem.

A cut and paste from the web server log file and my explanation of the issue follow (the actual IP address is removed for obvious reasons).

x.x.x.x - - [22/Dec/2003:12:45:26 -0800] "GET /WinXP/servicecfg.htm HTTP/1.1" 200 9027 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

The above log file line (it may wrap to "several lines" in your browser, but it is actually only one line) denotes the page where this person entered the domain. Having no "referer" (sic) information logged (the "-" after the "200 9027") is how I came to that conclusion.

The next several lines are the "normal" traffic. This includes the "referer" (sic) header, which is valid and tells me that the browser requested the information because of accessing the above page. One such entry is shown below:

x.x.x.x - - [22/Dec/2003:12:45:26 -0800] "GET /css/20031222basic.css HTTP/1.1" 200 266 "http://www.blackviper.com/WinXP/servicecfg.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

This shows the requesting page as "servicecfg.htm," which also requires a download of "20031222basic.css." This is normal traffic. However, the following request should not be there and comes directly "after" the normal logging of traffic patterns:

x.x.x.x - - [22/Dec/2003:12:45:26 -0800] "GET / HTTP/1.1" 200 4581 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

The above line tells me that, within one second of the first request, the root "index" page ("GET /") was requested by the browser, but with no "referer" (sic) header attached like the "normal" requests have. Three seconds later, the invalid link was spidered and the IP address was automatically banned. This particular "hidden" link is also the "first link" appearing in my XHTML code. However, it gets better.

The next two lines are what frighten me the most:

x.x.x.x - - [22/Dec/2003:12:45:40 -0800] "GET /AskBV/XP25.htm HTTP/1.1" 200 211 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
x.x.x.x - - [22/Dec/2003:12:45:49 -0800] "GET /AskBV/XP25.htm HTTP/1.1" 200 211 "http://www.blackviper.com/WinXP/servicecfg.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

On the original page, the reference to XP25.htm is the "next" link in the code. However, the first request had no "referer" (sic) header information (as noted in the log file by the "-" after the "200 211"). The second line "does" have the referer (sic) information logged only 9 seconds later, just as if the person actually "clicked" the link and went to that page.
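The detection described above can be mechanized. Here is a rough Python sketch, assuming the Apache "combined" log format shown in the excerpts; note that entry pages legitimately arrive with no referer too, so this only surfaces candidates for closer inspection:

```python
import re

# Matches the Apache "combined" log format used in the excerpts above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d+) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def no_referer_requests(lines):
    """Return the paths of requests that carried no Referer header."""
    flagged = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("referer") == "-":
            flagged.append(m.group("path"))
    return flagged

# The two sample lines above, abbreviated: one without a referer, one with.
log = [
    'x.x.x.x - - [22/Dec/2003:12:45:40 -0800] "GET /AskBV/XP25.htm HTTP/1.1" 200 211 "-" "Mozilla/4.0"',
    'x.x.x.x - - [22/Dec/2003:12:45:49 -0800] "GET /AskBV/XP25.htm HTTP/1.1" 200 211 "http://www.blackviper.com/WinXP/servicecfg.htm" "Mozilla/4.0"',
]
print(no_referer_requests(log))  # ['/AskBV/XP25.htm']
```

A pre-fetch shows up as exactly this pattern: a page deep in the site requested with no referer, shortly after a normal page view from the same IP.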

The burning question I have is "what on Earth is causing IE to pre-fetch links?"

When that question is answered, I will rest better at night.

I am sure this issue has blocked other legitimate readers and I apologize. My intentions are good: I am only attempting to protect my server "from the bad folk."

Other people have written to me saying, in a manner of speaking, "If you do not want people to visit your site, take it down!" That is not the issue. I am not blocking legitimate traffic (well, except for the unknown cause outlined above). What I am attempting to do is stop the complete download of my domain for no reason other than "because it is there."

Dec 17 2003

I have had several complaints from readers about my web server automatically banning their IP address because of "abuse." This is due to my recently implemented configuration to stop "bad" robots from sucking too much bandwidth. More information on this practice is located in my Abuse Rant. I implemented a "hidden link" which all "good" robots, including all major search engines, would ignore. Some proxy servers, pre-fetchers and firewalls choose to ignore the robots.txt standard. This is an issue to take up with the creators of those programs, not my web site.

I have temporarily removed that particular function from the server. All access is currently available with the following exceptions:

  • I still block all "offline browser" access. This is due to many people synchronizing the entire domain (740+ pages) every day, which is entirely unnecessary.
  • Most "download managers" remain blocked. Please use the "normal" means of downloading my files. No user name and password is required to do so. If such information is requested, it could be due to the use of a download manager.
  • Access by "page editors" is not authorized. I hope the reasons are obvious.
  • When a "bad" robot hits, I will still immediately get a report via E-Mail and will selectively disable IP addresses instead of automatically banning them.
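On an Apache server (which the log format above suggests), blocking by User-Agent string is typically done with mod_setenvif and access-control directives. The agent strings below are illustrative examples of offline browsers and download managers, not the actual block list used here:

```
# Tag known offline browsers / download managers (illustrative names only)
SetEnvIfNoCase User-Agent "Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "WebZIP"           bad_bot
SetEnvIfNoCase User-Agent "Go!Zilla"         bad_bot

<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Directory>
```

The obvious limitation, which the IE6 mystery above illustrates, is that this only catches clients that identify themselves honestly in the User-Agent header.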

I appreciate everyone’s feedback while I fine-tune the domain to provide rapid content without alienating anyone.

Dec 07 2003

A sharp reader identified a news update that I posted a few days ago for only 30 minutes, then deleted. It outlined the steps that I have recently implemented to stop abuse of my content. I will now go into greater detail in the Abuse Rant.

For several months, I had been attacking my bandwidth problem from a totally different angle: optimizing images, cutting down on all "extra" content, and compressing content for faster download times. However, after looking through my log files, that was not enough. READ MORE…

Nov 17 2003

This is another round of warnings that I feel compelled to pass along. My previous page instructed people how to deal with the MSBlast worm. This one, however, deals with yet another mass-mailing worm whose purpose in life is to steal PayPal account information.

This discovery was prompted by one E-Mail that fits the Symantec description perfectly:

The subject line contains "YOUR PAYPAL.COM ACCOUNT EXPIRES" and comes from the address of "Do_Not_Reply@paypal.com." It arrived at my inbox at 11:41 AM today.

This information was posted November 14, 2003 by Symantec and the virus signatures were updated that day:


However, just a few messages up (more recent), I received much the same message at 12:16 PM with a slightly different subject line. This one is "IMPORTANT <several spaces and then random characters>". It also comes from the address of "Do_Not_Reply@paypal.com."

This particular message, fitting the bill as another scam to steal PayPal account information, was posted about on November 17, 2003. Yes, today:


This one tipped me off because it has the exact type of subject line as a previous virus that has been sent to me often (12 times yesterday, 3 today) over several months. That particular variant comes from the address of "admin@<what ever domain the email is sent to.com>" with the subject line of "your account <several spaces and then random characters>".

More information on that particular virus is here:


What I am trying to get across is that people could find viruses in their E-Mail box before virus signatures can be updated. I do not remember the "default" interval at which the automatic update service runs for Norton Anti-Virus, but 24 hours is probably not far from the truth.

What this means is that I could have been infected 3 times (judging by the number of separate E-Mails) before the signatures could have been updated. Of course, by the time the automatic update is performed, it could be too late.

Knowledge is power. Period. I knew these E-Mails contained viruses without even thinking about it, from past experience with known subject lines. I looked them up because my curiosity sometimes overwhelms me and discovered that "I could have received it before they fixed it."

Being careful with the "automatic" actions you perform daily while checking E-Mail, and knowing "what is good and what could be bad," is much more powerful than any virus scanner available. Knowing an E-Mail’s intent before even opening it is far more effective than "assuming" you are safe just because an Anti-Virus program is running.
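That kind of knowledge can even be written down as filter rules. A minimal Python sketch, using the subject-line patterns described in the updates above (a real filter would need far more than this, and legitimate mail can occasionally resemble these patterns):

```python
import re

# Subject-line patterns taken from the messages described above.
SUSPICIOUS_SUBJECTS = [
    re.compile(r"^YOUR PAYPAL\.COM ACCOUNT EXPIRES", re.IGNORECASE),
    re.compile(r"^IMPORTANT\s{2,}\S+"),     # "IMPORTANT" + spaces + random characters
    re.compile(r"^your account\s{2,}\S+"),  # "your account" + spaces + random characters
]

def looks_suspicious(subject):
    """True if the subject matches a known-bad pattern."""
    return any(p.match(subject) for p in SUSPICIOUS_SUBJECTS)

print(looks_suspicious("YOUR PAYPAL.COM ACCOUNT EXPIRES"))  # True
print(looks_suspicious("your account      jk4h2"))          # True
print(looks_suspicious("Re: services guide question"))      # False
```

A rule like this catches the virus the moment it arrives, days before a signature update might.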

More tips are located in my E-Mail Filtering Guide.