Confessions of a Spam-Catcher: How to Identify Spam
As part of my role as Lifehack’s manager, I am responsible for moderating the comments queue. Lifehack’s back-end has a “Pending” queue for comments that our spam-catching software thinks might be spam, a “Spam” queue for comments labeled “spam” either by the software or by me, and another queue for comments that have been approved, again either by the software or by me. As a general rule, I check the “Pending” queue several times a day, the “Approved” queue every day or so, and the “Spam” queue every week or so.
I’ve been doing this for two years, and I’ve gotten pretty proficient at figuring out what is and is not spam – a tough call to make sometimes, since spammers get more and more sophisticated in lock-step with those of us charged with blocking them. I present my “formula” here for two reasons: one, to give less experienced bloggers and webmasters an idea of how to catch spam on their own site, and two, to give commenters an idea of the kind of thing to avoid so their comments don’t get accidentally thrown in the “Spam” bin.
I should say, a big part of catching spam is a “feel” – intuiting that some comment just doesn’t feel right. I’m not sure I can capture exactly what goes into that feel. Andy Warhol once said that to recognize a great painting, first you have to look at a thousand paintings, and catching spam is a bit like that – the experience of having looked at thousands of spam messages cannot be easily encapsulated. But I’ll try as well as I can.
What is spam?
What makes a message spam is relative and subjective. In a sense, spam is like a weed – a weed is not any particular kind of plant, but a plant that isn’t wanted where it is. (See, for example, Wikipedia’s definition of Weed as “a plant that is considered by the user of the term to be a nuisance.”) For instance, corn is delicious, but if it’s growing in your soybean field, it’s a weed. A message that, say, plugs a word processor might be perfectly welcome on a post that asks for product recommendations for writers, while on a post that just happens to mention writing, the same message could be considered spam.
Some messages are clearly spam; for example, anything delivered by a spambot programmed to leave its message wherever it can find an open form to submit through. But a message can be left by a living person, custom-written for the particular content it’s posted to, and still be spam. This list starts with the most obvious signs and moves to more vague and difficult-to-interpret signs. My guess is that a lot of people run into the ones further down the list because they post without thinking very clearly, so pay attention.
A comment is spam if it:
- Contains links to websites that are unrelated to the content.
For example, a comment might say “I think your baby is really cute!” but the word “baby” links to a site selling baby clothes or even a Forex trading site or other scam.
- Is posted on more than one post.
This is obvious, right? Real people don’t post the same comment over and over on different posts, no matter how relevant. Most likely it’s a spambot responding to multiple posts on your blog that contain similar keywords.
- Contains more than one link.
While there are a few situations in which a legitimate comment could contain several links, they’re fairly rare. As a general rule, the likelihood of a comment being spam increases directly with the number of links; anything over three and it’s virtually guaranteed to be spam.
- Is not directly related to the post.
A lot of spambots (or even live spammers) crawl the web looking for posts with certain keywords and then insert a generic message loosely related to the topic in the hope that it will slip past a human reader who is likely to just skim through the comments. Unless a comment addresses something specific about your post, it’s likely to be spam.
- Is overly complimentary.
Most spammers are fairly astute observers of basic human psychology – particularly our desire to believe good things about ourselves. So they butter us up, saying things like “Great post! In fact, I love this whole site – I’m definitely going to come back again and again!”
- Has keywords or a business name in the “Name” field.
A basic search engine optimization strategy is to get your website’s address associated with specific keywords, and search engines look closely at the text associated with a link to determine the usefulness of the website linked to. Real people aren’t trying to game search engines, and frankly, we want to be recognized for our contribution, so we use our actual name, or a username. If you can’t imagine replying to a person by the name in their “Name” field, you’re dealing with a spammer. (For example, here’s one taken from our spam queue: “Having a good vocabulary not only gives a framework for thought. It also allows you to be concise and precise to make communication better.” This is relevant to the post, and thoughtful, but it was left by an entity named “dining room table”. It’s spam.)
- Links to a spammy business.
This is a tough call – sometimes I’ll see a thoughtful comment clearly written in direct response to the post it’s commenting on, under a real person’s name, and still mark it as spam because it links to a site whose legitimacy is questionable. Could be porn, WOW gold scams, Forex scams, get-rich-quick schemes, blogs with stolen content, or anything else that feels to me like someone left a comment more to get their link out than to add to the discussion.
- Quotes the post without responding to the quote.
This is a relatively sophisticated spam technique: pulling lines out of the post it’s responding to in order to make the language of the comment sound like real writing. Real people mark the quotes they’re commenting on (usually with quotation marks, but it could be by italicizing or bolding them, putting them in blockquotes, or some other means) and try to clearly separate their response from the post’s words.
- Is posted on an old post.
Old posts tend to attract a lot of spam. Real people generally recognize that if a post is a year or so old, the conversation there is pretty much over. Spambots do not realize that. It still sometimes happens that someone comments on an ancient post, but the age of the post is a big red flag.
- Is in a different language from the site.
If the point of a comment is to engage in discussion with the author of the post and his or her readers, it doesn’t make much sense to comment in a language that you’re not sure the author knows.
- Is from a Russian .ru domain.
I hate to stereotype an entire top-level domain like this. I’m sure there are Russians out there making thoughtful comments on blogs all the time. And yet I’ve never had a comment that wasn’t spam from a commenter with a .ru domain or email address.
- Tells a long, personal story.
This is experience talking – a lot of times you’ll see what appears to be a blog post in its own right in your moderation queue that starts off, at least, relevant, and is clearly written by a real person. This falls under the “Weed” heading – it might have been totally welcome except it’s out of place as a comment on your blog.
- Asks for specific support.
This is another “weed” situation: a comment on a post about, say, installing Windows 7 that asks for help with a specific problem. Unless the point of your site is to answer specific questions about computer problems, this comment is out of place. There are better and more likely places to get help than on your blog.
- Feels wrong.
Sometimes a comment just feels wrong – it is a little too smarmy, maybe, or it’s a little too formal and stiff. You click through the link and it’s a legitimate-enough site, maybe a little sketchy, but you can totally construct a case where this comment was written by a real person with something to say. The question, though, isn’t what was the intention of the writer, but what is the effect on the conversation on your site. If a comment doesn’t seem to quite fit, you’re well within your rights to “spam it”.
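For webmasters who want to automate a first pass, the more mechanical signs on this list can be roughed out in code. What follows is only a minimal Python sketch, not how Lifehack's software works: the `Comment` fields, the `spam_signals` function, the flattery phrases, and the one-year cutoff are all assumptions made up for illustration. The judgment calls (relevance, spammy businesses, "feels wrong") are deliberately left to a human.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical patterns and phrases, chosen only for this illustration.
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)
FLATTERY_PHRASES = ("great post", "love this whole site", "come back again")


@dataclass
class Comment:
    """An assumed shape for a submitted comment; real platforms differ."""
    name: str
    email: str
    url: str
    body: str
    post_published: datetime
    commented_at: datetime


def spam_signals(comment, earlier_bodies=()):
    """Return the names of the mechanical warning signs a comment trips."""
    signals = []

    # More links means more suspicion; over three is close to certain.
    link_count = len(LINK_RE.findall(comment.body))
    if link_count > 1:
        signals.append("more than one link")
    if link_count > 3:
        signals.append("more than three links")

    # The same text already seen on another post.
    if comment.body.strip() in earlier_bodies:
        signals.append("posted on more than one post")

    # Generic buttering-up.
    lowered = comment.body.lower()
    if any(phrase in lowered for phrase in FLATTERY_PHRASES):
        signals.append("overly complimentary")

    # Comment arrives a year or more after the post went up.
    if comment.commented_at - comment.post_published > timedelta(days=365):
        signals.append("old post")

    # .ru website or email address.
    host = urlparse(comment.url).hostname or ""
    if host.endswith(".ru") or comment.email.lower().endswith(".ru"):
        signals.append(".ru domain")

    return signals
```

A comment that trips one sign probably deserves a look in the "Pending" queue rather than automatic deletion; the list of signs is meant to prompt a human decision, not replace it.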
Anyone else have advice for would-be spam-catchers? Or for commenters who might be finding their comments relegated to the spam-heaps of history? Leave a thoughtful, non-spammy comment below!
Dustin M. Wax is a freelance writer and project manager at Stepcase Lifehack. He can be reached through his freelancing site at DustinWax.com, where his various projects can be viewed. When he's not writing, he teaches anthropology and gender studies in Las Vegas, NV. He is the author of Don’t Be Stupid: A Guide to Learning, Studying, and Succeeding at College.
Follow him on Twitter: @dwax.

Aside from partying, the thing you’re probably going to do most in college is read. Assuming you’re at all serious about your education, you’ll read so much that words will come out your ears. Unfortunately, much of what you read will also go pouring out your ears, or so it will seem looking back.
One of the best habits you can develop in college (or even in high school, if you have the discipline) is to keep an academic reading journal. This is more or less what it sounds like: a journal recording everything you read, with an added layer of academic analysis. The idea is, you record what you read, key ideas and quotes from the text, and your own reflections on the work, allowing you to fairly accurately recreate your initial reading at a later date, perhaps a much later date.
Why do this? There are several reasons. First, because if you’re smart, you’ll use material from one class as source material for research papers in later classes, and it’s better to have that material at hand rather than having to re-read the book. Second, because you will often come across the same material, or material by the same author, later in your education, and can go back and review your initial impressions. And third, because while much of what you’re being asked to read now might not seem very relevant, you’ll be surprised, 10, 20, or more years down the line, what you find yourself wishing you could remember of some book or article you read as a sophomore.
Creating the Academic Reading Journal
An academic reading journal doesn’t have to be anything fancy — in theory, a composition book or notepad will suffice, provided it’s durable enough to last many years. Even better, a hardbound diary or Moleskine-style journal will give you plenty of space in a durable format. If you’re technologically inclined, a personal wiki, word processor file, or even database can be used on your PC. When I was doing my dissertation research (which requires you to read literally everything in your research area) I kept a reading journal in an Access database, synced to a database program on my Palm PDA. The point is, you’ll have to figure out the medium that’s most comfortable for you, comfortable enough that you’ll use it consistently.
There is no standard for what an academic reading journal entry should look like, but I recommend capturing the following pieces of information:
- A full bibliographic citation. Use whatever style is prevalent in your field, or whatever you know best: MLA, APA, or anything else. It doesn’t matter, so long as you make sure to get all the pieces of information you’ll need to produce a bibliography in any style necessary.
- A short synopsis of the book or article. This can be copied from the back cover text or abstract, or just sketched out in your own words.
- Quotes from your reading. Copy out any quotes you would otherwise highlight or underline: anything you think captures some essential point in the text. You don’t have to do this as you read; if you prefer to read with a highlighter or underliner, copy them out when you’re done. Make sure you get the page number(s).
- A personal response to your reading. 200 or so words capturing your impression of what you’ve read. Why is it important (or not important)? What is the author trying to say? Who was influenced by it, or influenced it? Have a look at my post How to Read Like a Scholar for more advice on academic reading.
- Questions raised by the text. Challenge your reading material! Think of a set of questions the material leaves unanswered, or that undermine the conclusions reached. These questions might eventually form the basis of a research project or larger critique.
- Any other notes, thoughts, arguments, or feelings about what you’ve read.
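If you go the electronic route, the pieces listed above map naturally onto a simple record. Here is a minimal Python sketch of one way to lay it out; the `JournalEntry` and `Quote` names, the field names, and the `missing_parts` helper are assumptions invented for this example, not a standard format and not the database described earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Quote:
    """A copied-out passage together with where it came from."""
    text: str
    pages: str  # kept as text so ranges like "42-43" fit


@dataclass
class JournalEntry:
    """One reading-journal entry, mirroring the list of pieces above."""
    citation: str                                   # full bibliographic citation
    synopsis: str = ""                              # short summary of the work
    quotes: list = field(default_factory=list)      # Quote objects
    response: str = ""                              # personal response, ~200 words
    questions: list = field(default_factory=list)   # questions the text raises
    notes: str = ""                                 # anything else

    def missing_parts(self):
        """List the recommended pieces that are still empty."""
        checks = {
            "synopsis": self.synopsis,
            "quotes": self.quotes,
            "response": self.response,
            "questions": self.questions,
        }
        return [name for name, value in checks.items() if not value]


# Usage: start an entry with just the citation and a synopsis, fill in the rest later.
entry = JournalEntry(
    citation="Placeholder, A. (n.d.). An example title.",  # made-up citation
    synopsis="A one-paragraph summary in your own words.",
)
print(entry.missing_parts())
```

A helper like `missing_parts` plays the same role as a reminder template: it shows at a glance which pieces of an entry have not been written yet.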
When I started keeping a reading journal using a Moleskine a couple years ago, I printed out a template that I kept in the back pocket to remind me of what I should include in my entries.
One last thing
While non-fiction is my bread-and-butter, and thus this post might have seemed to lean more towards academic material, don’t hesitate to include fiction and poetry among the books in your reading journal. The truths in fiction are often — maybe even usually — more true than the truths in non-fiction. Shakespeare’s truths trump Einstein’s over and over — after all, we’ve revised our understanding of relativity, but Hamlet will forevermore have been poisoned and killed in the Great Hall at Elsinore.
Dustin M. Wax is the project manager at Stepcase Lifehack. He is also the creator of The Writer’s Technology Companion, a site devoted to the tools of the writing trade. When he’s not writing, he teaches anthropology and gender studies in Las Vegas, NV. He is the author of Don’t Be Stupid: A Guide to Learning, Studying, and Succeeding at College.
Follow him on Twitter: @dwax.