A quarter of online ad traffic is fraudulent (adweek.com)
78 points by elorant on Oct 20, 2013 | 37 comments



While this is probably true -- or even worse -- I think this is already baked into the rates advertisers are paying. Sophisticated advertisers have conversion rates they track carefully, and what they bid/spend on ads is worked backwards from that number. In other words, if a conversion is worth $5 to me, and I see I get a 3% conversion rate, I'm not going to spend more than $0.15 per click.
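A minimal sketch of that arithmetic, using the $5 and 3% figures from the paragraph above. The function name `break_even_cpc` is made up for illustration, and it ignores margins, lifetime value and anything else a real bidding model would include.

```python
def break_even_cpc(conversion_value: float, conversion_rate: float) -> float:
    """Maximum cost per click at which an average click still pays for itself."""
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be between 0 and 1")
    # Expected revenue from one click = value of a conversion * chance of converting.
    return conversion_value * conversion_rate


max_bid = break_even_cpc(conversion_value=5.00, conversion_rate=0.03)
print(f"break-even CPC: ${max_bid:.2f}")
```

Any fraudulent clicks in the mix show up as a lower measured conversion rate, which is how they get priced into the bid.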

Obviously this isn't a universal truth, and branding campaigns are more difficult to price, but this story is posted cyclically and I often read people's take that this stat indicates things are unsustainable and headed for a correction. I think we're past that. Online advertisers have grown more sophisticated over the last decade and CPMs & CPCs have dropped significantly along with that.

Curious what anybody else thinks..


I agree. Fake traffic or not, it's factored into the price and the stats are still just too good compared to what you can get anywhere else from an information standpoint. As a marketing person, you're always proving your worth--digital ads make this a lot easier (at least if you're good at your job).

The problem (at least if you're selling digital ads) is I don't see a scenario where margins increase. The "ad network" effect will continue and there will be more consolidation in the industry with more link-bait headlines coming out of blog mills. The area where there could be more profit would be if content creators integrated more with advertisers, which is only going to get hairy and annoying in a different way. But for something like NYT--I wouldn't be all that surprised if the line between advertiser and content continues getting more fuzzy in order to maintain decent CPMs on what is probably a relatively fixed amount of traffic/inventory.

Another opportunity comes to mind--better demographic targeting, better inference algorithms (given a list of factors, what can we tell about this person re: age/sex/income/likelihood-to-buy-our-specific product-based-on-past-purchases?)--these will also allow content creators to maintain higher CPMs, so there's probably a market there.


Yep, this is absolutely true. At Perfect Audience, our advertisers are often using as many as 3 or even 4 tools to track and measure their return on ad spend. The numbers don't always line up exactly from tool to tool, but they can see pretty quickly and clearly if a new channel's going to produce ROI or be a dog for them.


On the one hand you're right, on the other hand if somebody steals £10 from my wallet every day I can budget my life to not care about it, but I'm still having money stolen from me.

Just because it works now, if you could get rid of the fraud then either advertisers could get better value, or legitimate sellers could make more money, or both.

Also, you'd be amazed how many advertisers don't track conversions. I've experience in both digital sales and buying, and on the sales side a huge number of clients wouldn't look at anything more than the impressions/clicks figures that we (truthfully) told them we delivered.


My last business was ad-supported (premium, big-brand advertisers). One day, out of the blue, one of the companies which sold ads on our behalf sent us a nasty letter accusing us of buying junk traffic / creating false impressions and clicks - something we most definitely did not do.

Digging through the raw ad server log files, I discovered that the "suspicious" impressions and clicks were all originating from AWS IP addresses - most likely someone was using AWS to run a spider on our site, the spider followed Javascript links, and therefore clicked every ad on every page.

We ended up adding a rule in our ad server to prevent ads from being served to any IP address which belonged to cloud hosting / VPS providers - this solved the problem for us.
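A rough sketch of what such a rule could look like, using Python's standard `ipaddress` module. This is not the ad server's actual configuration: the CIDR blocks are placeholders from the reserved documentation ranges, and `should_serve_ad` is a hypothetical name. A real list would be loaded from the ranges each hosting provider publishes.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), standing in for
# the address ranges of cloud hosting / VPS providers.
HOSTING_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def should_serve_ad(client_ip: str) -> bool:
    """Return False when the client address falls inside a listed hosting block."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable address: treated as suspicious in this sketch.
        return False
    return not any(address in network for network in HOSTING_NETWORKS)


print(should_serve_ad("192.0.2.17"))   # inside a listed block
print(should_serve_ad("203.0.113.5"))  # not in any listed block
```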

I've since sold the business and therefore don't know if the problem ever arose again, but I believe blacklisting IP address blocks which are highly unlikely to belong to real human beings* could be a good start for anyone running into these sorts of issues, either on the advertiser or publisher end.

*Yes, I know some people run VPNs on AWS or similar VPS instances, which means they are real humans - that was a loss we were willing to deal with.


That solves the bad bot problem, it doesn't solve the actual ad fraud problem. There are at least millions, if not tens or hundreds of millions, of ad clicks per day coming from botnets of ordinary Windows computers on residential ISPs with full JS-executing browsers. You can't detect this activity based on IP address, user agent, script execution, etc. I've seen individual advertisers targeted and get hit by hundreds of ad clicks per day, when they usually only get a dozen or two, every click from a different Comcast/FiOS/TW/RR residential IP, with unique user agents, and varying search phrases that match the ads.

It's much, much harder to detect and block that; if we could only see activity on our own individual websites it'd be nearly impossible to tell the normal clicks from the fraudulent activity.


Agreed - targeting bot nets is a whole other ball game - one that likely can only be combated at the ad server / exchange / DSP level, as they are the only entities with large enough data sets to tease out which machines are infected.

From the single advertiser's perspective, the easiest solution I can recommend is working with networks which provide eCPA-type bidding* - as then sites which actively buy traffic from bot nets will over time be blacklisted automatically from your campaign. Back when I was on the buy-side of online advertising, we used that "trick" to great success with a major credit card issuer buying billions of impressions.

*What I mean by eCPA type bidding is when you tag your conversion page with the network's pixel, and the network uses your conversion data to optimize the campaign on their end to get rid of publishers which send clicks that never convert. I know there is a better term for this, but it's a Sunday night and I haven't worked in media buying for a few years now...
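To make the pruning idea concrete, here is a toy version of what a network might do with the conversion data. It is only an illustration: the thresholds, the `PublisherStats` type and the publisher ids are invented, and real networks presumably use something far more statistical.

```python
from dataclasses import dataclass


@dataclass
class PublisherStats:
    clicks: int
    conversions: int


def publishers_to_block(stats, min_clicks=500, min_rate=0.001):
    """Return ids of publishers with enough clicks to judge and a conversion rate below min_rate."""
    blocked = []
    for publisher_id, s in stats.items():
        # Require at least one click so the division below is always defined.
        if s.clicks >= max(min_clicks, 1) and s.conversions / s.clicks < min_rate:
            blocked.append(publisher_id)
    return sorted(blocked)


campaign = {
    "pub-a": PublisherStats(clicks=1000, conversions=25),  # converts normally
    "pub-b": PublisherStats(clicks=4000, conversions=0),   # many clicks, no sales
    "pub-c": PublisherStats(clicks=40, conversions=0),     # too little data to judge
}
print(publishers_to_block(campaign))
```

The `min_clicks` floor is there so a new publisher is not dropped before it has had a fair chance to convert.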


What do the owners of those AWS instances win from this? Did you relay second hand referrer info on the clicks to the ad network?


Only a quarter? Must have been a high-quality source.

The thing is a lot of publishers are fully aware of the fact that the traffic is mostly fraudulent, especially the ones who rely on bogus metrics such as Quantcast, Hitwise, Comscore, Alexa, etc. They buy up a ton of second tier search traffic for extremely low PPCs, we're talking sub-penny, and use that to inflate their "visitors" metric which is then used to sell how popular their site is to large brand advertisers.


I somehow doubt that any PPC search traffic, even 2nd tier, is cheap enough to arbitrage off in this manner. PPC costs have climbed over the years and any PPC -> CPM arbitrage seems unlikely to me these days. I'm thinking they have other sources. Do you know for a fact it's PPC search they're sourcing visitor count from?


I can't speak for today but in 2008-2010 this was definitely the case. It's not exactly a direct arbitrage play as a lot of the end advertisers tend to be doing full media buys. Top banner ad, background page ads and they are just paying for 100% coverage for a day or week or month. One large gaming news/reviews site was buying about $10k/day worth of mostly fraudulent traffic. The traffic wasn't bots, per se, it was "real" users but they had compromised machines with popups and toolbars. So there was a high visit count of real users but there was no conversion metrics or time-on-site to be measured. They continued buying the traffic for at least a month so I can only assume it was meeting their goals. We had a lot of publishers all doing the same thing.

As this was second tier, I wouldn't really call it "search" traffic. Just a ton of users with keyword-like queries being routed to various xml feeds/exchanges. When we'd send the traffic to actual publishers with conversion goals the rates were abysmal and we'd end up issuing lots of refunds. The only publishers that seemed to like the traffic were ones that sold their own dedicated ad units to large brand advertisers.

I'm just glad to be out of that industry :)


Yeah this kind of scenario is absolutely realistic. I've seen, firsthand, stuff that makes the above look downright honorable, and done by major, major players.

When there's money and success at stake, it's difficult to be too cynical about what tactics are probably being employed.


Motion seconded. To the best of my knowledge this is not an economically viable traffic inflation technique in today's ecosystem.

Possible, but not much value for an arbitrage savvy publisher.

Not going to name names, but a good deal of fuckery is afoot in under-article link exchange widgets. You know the ones, the boxes with link bait thumbnails that show up on a good deal of popular entertainment sites. We dealt with a lot of fake traffic coming through from the link exchange network. It appeared other publishers were sending tons of fake bot clicks to our site, in order to give themselves a higher amount of free reciprocal traffic from the link exchange.


Outbrain, Taboola and the other "content recommendation networks" are selling clicks in the $0.05-0.15 CPC range. Those prices are going higher and higher, but publishers are buying up all the traffic they can at those prices to arbitrage against their direct-sold inventory.


It's more likely CPV(cost per view) traffic, which can be bought as low as $5 CPM.


I've been fortunate enough to have first-hand access to the analytics platforms of a number of fairly high-traffic sites (in the tens-to-hundreds of millions of unique visitors per month), and not once did Hitwise, Alexa, or any of those estimates come anywhere close to actual data. There was nothing shady about the way these sites got their traffic–the vast majority was organic, because they're sites everyone here has heard of–but for some reason the third-party data was just bogus. Like, not unreasonably outside the realm of totally-made-it-up bogus.

It's worth remembering that sites like comScore, Quantcast, Alexa and Hitwise derive a solid chunk of their business by enabling Google sales reps to better convince clients to spend more money. It's not far off from an academic whose career depends on grants by parties who are heavily invested in seeing a particular outcome.


This is a big problem even for Bing/Yahoo. They've recently partnered with Media.net for their content (display network ads) and many of the publishers in that network deliver fraudulent clicks (Keywordblocks and the like).

The other problem I've noticed is that if you don't bid high enough for search keywords, they start sending you traffic from Media.net and make it look like it is search traffic. For example I bid on the search term "get money now" (don't want to reveal the actual query). For real search traffic, the clicks convert very well for me. A few months ago I noticed that my conversion rates had fallen significantly. When I investigated I noticed that most of it was from media.net/5_ways_to_get_money.cfm and none of these clicks ever produced a sale. I am in the process of pursuing a refund and have blocked the site. Real search traffic always comes from search.yahoo.com or Bing.com.

It is an even bigger problem for AOL (Advertising.com) that includes supposedly premium sites like HuffingtonPost.com. You will frequently notice:

1. Inflated clicks. Your analytics show you got 100 clicks, theirs show you got 170. When you raise this with their support staff the standard answer is they will not investigate based on someone else's analytics. I said I have two different analytics programs on my server that show the same count, which is a lot lower than theirs. They still won't budge. I am in the process of pursuing a chargeback.

2. Clicks with 100% bounce rate and that spent exactly zero seconds on your site. There is just no way these are real humans.


> 2. Clicks with 100% bounce rate and that spent exactly zero seconds on your site. There is just no way these are real humans.

If you're using common analytics programs, you're probably misled about what the time-on-site statistic means. Unless they actively ping every visitor on your site the entire time your page is open, which is not the norm, they have no way to know the time-on-site for single page visits. It's computed as the elapsed time between two page views (two loads of the analytics script), but if there is no second page view, there's no second time to subtract from. Someone who clicks through to your page and reads intently for 13 seconds before closing the page is a "0 second visit" as far as those programs are concerned.
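Here is a simplified model of that calculation, assuming the analytics program only records a timestamp each time its script loads. `time_on_site` is a made-up helper, not any particular vendor's implementation.

```python
def time_on_site(pageview_timestamps):
    """Seconds between the first and last recorded page view in one visit."""
    if len(pageview_timestamps) < 2:
        # A single page view has no later timestamp to subtract from,
        # so the visit is reported as zero seconds however long it really was.
        return 0.0
    return max(pageview_timestamps) - min(pageview_timestamps)


# Visitor reads one page for 13 seconds and closes the tab: one script load.
print(time_on_site([1000.0]))
# Visitor loads a second page 13 seconds after the first: two script loads.
print(time_on_site([1000.0, 1013.0]))
```

Under this model every bounce is a zero-second visit by construction, so "zero seconds" on its own doesn't separate bots from humans.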


I think curiouslurker meant those as two separate categories: Clicks with 100% bounce rate, and those that spend 0 seconds on your site.


There's actually a legit problem here though where ad networks are counting "clicks" while analytics programs are counting "visits".

Comparing the two always leads to different numbers because they are different things.

Google has this issue even between Adwords stats and Google Analytics stats. They are always billing people for more clicks than are reflected as visits in Google Analytics. Same reason, different things counted differently.


Can't a lot of this be solved by purchasing pay-per-day ads?

Paying per click or per thousand views just gives someone the incentive to increase clicks and views fraudulently. When you pay $10 to have your ad on the front page for the whole day the incentive to cheat disappears.

This is why I've been using Blogads.com and smaller networks instead of Google Adwords. The traffic is high quality, I don't have problems with high bounce rates and visitors that spend 0 seconds on my sites. I actually get legitimate traffic. Sometimes I over-pay for the ad space, sometimes I under pay.

Also, I knew a guy (who makes a social network and forums community platform for big companies) whose competitors would purposely pay an Indian company to search for his website in Google and click the sponsored ads using constantly changing IPs so they can cost him money and use up his budget. Something like 70% of his clicks were fraudulent.

Again, this wouldn't matter with pay-per-day/week/month advertising. Too bad it's not compatible with Google's current Pay-per-click auction / maximize profit system.


We spend an inordinate amount of time trying to combat click fraud. Clearly someone pulling in 2,500 - 5,000 a month via click fraud is 'small potatoes' to the big ad networks, but I'm sure a lot of people have made lifestyle businesses with it. Common fraud patterns are clicks from AWS or foreign hosting facilities, or waves of clicks on a single ad from a swath of subscribers.


A closely related problem was the basis of a dissertation I wrote for my honours year. At the time I felt I had discovered a reliable way to track legitimate visits to websites by participating users.

Upon subsequent review it transpired that I was wrong. Given the primitives of HTML, Javascript and HTTP, I don't believe you can produce a robust tracking scheme, whether for advertising or for any other purpose. You need to add additional steps or software over and above the basics.

That latter observation is the basis of a new design which I am currently, in order to start brawls in certain circles, in the process of patenting.

I'm happy to forward my dissertation to interested persons via email, check my profile.


As someone who worked several years for one of the front running ad fraud agencies, I can tell you what most of our customers said. "We don't care about fraud, it is priced in."


Right, it's basically "the cost of doing business" overhead that encompasses unsavory costs like bribes, extortion, dishonest employees, etc. As long as the parasites in the process don't get too greedy, it's still possible to run a business. It's just not as profitable as it would be and/or the customers pay more as a result. Unfortunately the sleaziest advertisers tend to thrive better in this environment.


As someone who's spent several years researching and working within the online advertising industry, I can attest to this being quite common on a lot of the larger Alexa-ranked sites and traffic sources.

There are many highly ranked sources that will simply burn through your ad spends as quickly as you can deposit funds to the account. People often times spend thousands throwing away money on "testing" these sources, only to usually find they don't pan out.

That's why one of the most important things in online advertising is constant optimization and watching the conversion rates from different sources and adjusting bids as necessary. Drop the sources that don't pan out.

It's similar to stock trading, mitigate your losses quickly and don't bleed your ROI or throw away excess money on "testing" what works. Prune the sources that aren't doing anything for you, and watch your ad spends become much more productive.

The only thing the traffic described in this article is good for is vanity metrics and pumping up numbers like it sounds several of these large brand sites are doing.

If you're looking to inflate your numbers, sure... waste a bunch of money on a ton of bot hits, it's still 1999 right? Just don't expect any sales or conversions from that type of traffic... ever. It just simply won't happen.


I don't think I've ever seen an article so badly written. It has the right research, but it is pointless. The whole thing is summarized in the last two paragraphs, with zero information lost:

"""

Says Woodman: “When we try to tighten things up, our measured performance goes down. There is an incentive among buyers to let the floodgates open. And publishers need more money, so they ignore.” So the bad traffic persists. “We need to fix this as an industry,” he adds. “Somebody needs to give a shit.”

The IAB seems to. Per Sullivan, the organization is working on devising a standard for publishers akin to the Good Housekeeping Seal. He’d like to see the biggest stakeholders get more aggressive about the problem, including brands and agencies. “If buyers came out and said, ‘I will only buy from certified vendors,’ that would change things,” he says.

"""

So basically, the industry measures clicks, and bots deliver clicks. Fraudulent ones, but clicks. And everyone is trying to fix it with black/white-lists so they get a promotion for getting the clicks, without looking bad to their boss when some known scam site shows up in the report.

The whole industry is based on allowing only the 'clean' bots of the week inflate your numbers.


Google is complicit because of the pay-per-click model. They make money whether or not the clicks are hired by middlemen.


Does Google really care?


Where do I get my 25% refund from AdWords? This is a significant sum to me.


Isn't AdWords an auction of sorts? If so then this 25% is already baked in.


I don't understand that conclusion. Don't people place bids based on the perceived value of the keyword? If so, wouldn't everyone need to discount that value by 25% because they are aware of the fraud in order for it to be "baked in"?


People will bid as much as they can afford while achieving a positive ROI. If you can't, someone who's earning more dollars per click will, and they'll get the slot instead. If 100 clicks generate $100 in gross profit for the company with the best customer LTV, nobody's going to bid more than $1 per click. That $1 per click already factors in the discount due to fraud. If all the clicks were real, qualified visits, they'd have earned more than $100 per 100 clicks, and would be able to bid more than $1 per click. As long as there are enough people participating in the auction, it will naturally account for the quality of the traffic, whatever the reasons behind that quality may be.
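A small worked version of that reasoning. The 25% fraud share is borrowed from the headline purely as an assumption, and `break_even_bid` is a hypothetical helper rather than anything an ad platform exposes.

```python
def break_even_bid(profit_per_real_click: float, fraud_share: float) -> float:
    """Highest per-click bid that still breaks even when a share of paid clicks is worthless."""
    if not 0.0 <= fraud_share < 1.0:
        raise ValueError("fraud_share must be in [0, 1)")
    # Only the non-fraudulent share of paid clicks can generate profit.
    return profit_per_real_click * (1.0 - fraud_share)


# Assumed figures: $100 gross profit from 100 paid clicks, 25 of them fraudulent,
# so each of the 75 real clicks is worth 100 / 75 dollars.
profit_per_real_click = 100 / 75

print(round(break_even_bid(profit_per_real_click, 0.25), 2))  # fraud priced in
print(round(break_even_bid(profit_per_real_click, 0.0), 2))   # if every click were real
```

The gap between the two numbers is the discount the auction applies, provided bidders actually measure their returns.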


Thanks. I can see the logic.

OTOH, doesn't this require perfect efficiency and for all bidders to know perfectly their ROI, etc.? Any holes, unintentionally irrational behavior, etc. and the fraud "leaks" into the price, right?

So, for instance, any "unprofitable" advertising on behalf of any participants would seem to automatically cause the fraud to be reflected in the price.

Based on personal experience, I would guess that there's a pretty long tail of advertisers who never realize an ROI while spending a considerable sum seeking it!

Even those that eventually profit likely spend a good bit of money ramping up, all the while paying for fraud in the process and helping to support a fraud-inflated CPC overall.


Google wasn't mentioned at all in the article, which is somewhat odd... are they immune to this type of fraud? Or at least more resistant?


They're not immune, but they're among the best at detecting it. They have more traffic and click data to use to identify the patterns than any other company in the world.

Most AdWords advertisers would probably be surprised to see that they're only being charged for 75% or less of the clicks on their ads already. They simply never show up in the performance reports or on the bill.

To see them, log in to AdWords, click on a campaign, click the "Dimensions" tab, choose the "View: Day" filter, click "Customize Columns", choose the "Performance" category, and add the "Invalid Clicks" column to the report.

Now you'll see the huge number of clicks Google received on your ads but never charged you for.

It's also common to see adjustment entries on the billing report labeled "click quality". Those are automatic refunds for potentially fraudulent clicks given after-the-fact when they couldn't be caught in real-time.


They've become considerably less transparent about fraud-related adjustments in recent months, both in the UI, and on invoices. Employees who seek clarification tend to get very little in the way of answers.

Not saying anything shady is happening, just that they've made big changes and aren't communicating properly about them.



