Top Ten Sources: Stealing Your Content?

I just stumbled upon a site called Top Ten Sources, and if I’m assessing it correctly, I believe that editors pick a topic and then find ten weblogs they feel are good resources on that topic to comprise their “Top Ten List”. These editors then make a page for that topic (Venture Capital for example), which aggregates the editor’s Top 10 blogs into one page of information. Each page is essentially an aggregator that pulls down the latest content from those chosen blogs, and republishes that content on the Top Ten Sources site.

Now, I think that this service definitely provides value to the reader; however, I’m wondering whether the sites compiled into these Top 10 pages were asked ahead of time if Top Ten Sources could republish their content. The Venture Capital page republishes full-text entries from its sources at the bottom, and this trend continues throughout the site on other Top 10 lists like the Rose Bowl, Pets, and many others.

From what I can gather, if a site in a Top 10 list has a full-text RSS feed, then its entire entry will be republished, but if it only offers excerpts, then the excerpt will be shown. Top Ten Sources appears to make no effort to limit the text. I’ve seen some entries in their lists that run 12-15 paragraphs, complete with quotations, code examples, and even images (yes, still served from the author’s site, also known as “hotlinking”). Some people even have Google AdSense in their feeds, and yup, their AdSense boxes show up on the associated Top 10 page. So basically, Top Ten Sources takes all the information from your RSS feed, republishes it on their site, and then uses it to build traffic. I’m not sure about their revenue model yet, but if it involves running ads against other people’s content, then that’s a major no-no.

To be fair, 9rules displays member content in categories as well; however, 1) our members gave us permission to do so, and 2) we will never display their full post, regardless of whether their RSS feed is full-entry or not; the most we display is 3-4 sentences. Plus, we strip out all images to avoid exactly the kind of hotlinking I pointed out above.
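As a rough illustration of that kind of excerpting (a few sentences, no images), here is a minimal Python sketch. It assumes each feed entry arrives as an HTML string; the `make_excerpt` name, the `BLOCK_TAGS` set, and the three-sentence default are hypothetical choices for the example and are not 9rules' actual code.

```python
import re
from html.parser import HTMLParser

# Tags treated as block-level, so text on either side stays separated.
# This set is an assumption for the sketch, not an exhaustive list.
BLOCK_TAGS = {"p", "br", "div", "li", "blockquote", "h1", "h2", "h3"}


class _TextOnly(HTMLParser):
    """Collects text nodes and drops every tag, including <img>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def make_excerpt(entry_html, max_sentences=3):
    """Return a plain-text excerpt of at most max_sentences sentences."""
    parser = _TextOnly()
    parser.feed(entry_html)
    parser.close()
    # Collapse all runs of whitespace into single spaces.
    text = " ".join("".join(parser.parts).split())
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(sentences[:max_sentences])


entry = (
    '<p>First one. Second one!</p>'
    '<img src="http://example.com/a.png">'
    '<p>Third one? Fourth one.</p>'
)
print(make_excerpt(entry))
```

Because every tag is discarded, images are never hotlinked and embedded ad markup never renders; the sentence split is deliberately naive and will misjudge abbreviations such as "e.g.".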

So the question I pose is this: Are the blogs you are republishing consulted ahead of time about this? I looked at the About page for the site but didn’t find anything that spoke of an agreement or copyright notice. I think this issue definitely needs to be cleared up, because if they’re doing this without consent, Top Ten Sources is no better than the 3-4 site-scraping bottom-feeders who republish the Business Logs RSS entries every day, or the thousands more who poach other copyrighted content from around the web.

Update: John Palfrey just responded to this entry (as well as a few others that talked about Top Ten Sources and copyright). In the article he mentioned this entry and called me a “respected member of the blogosphere”, by which I am deeply honored and flattered. While writing this entry about Top Ten Sources, I knew in the back of my head that they probably sent emails to the included blogs about the aggregation, and I was right:

“As the editor compiles the site, the editor sends out an e-mail to the person who appears to be responsible for the site, or, sometimes, posts a comment to say that the site has been chosen. The site renders a list of those sites offering the feeds as directlinks to the page. The site also subscribes to those feeds and renders them all together on a single page. It is this latter activity that I take to be the concern.” — John Palfrey

Top Ten Sources emails newly included blogs as they are being aggregated, and those blogs are given the opportunity to opt out at any time. My opinion has now been changed :)

About Mike Rundle

Comments

  1. Chris says:

    When they added Hail the Ale! in the Beer category, they sent us an email to notify us we had been added. They didn’t ask permission prior to that.

  2. Whether it’s spam is a matter of moral debate, but I don’t think they’re adding any value. The default feeds in the same categories at My Yahoo would be just as useful, and a real aggregator lets you customize what you’re reading.

    The time has long passed when “combined news from 10 different sites” brought any value to readers.

    If they included any of my sites – which they haven’t – I would insist they remove them.

  3. SC says:

    “The time has long passed when “combined news from 10 different sites” brought any value to readers.”

    How true this is.

  4. Alex says:

    Bloglines republishes the content in its entirety. And so does My Yahoo! When you post it in RSS format, let’s not forget that the third S means syndication, i.e., including the text somewhere else.

  5. Mike Rundle says:

    Alex, but Bloglines is a feed reader, an RSS aggregator. It’s not using content for commercial purposes or to draw traffic like Top Ten Sources is. Bloglines uses feeds so that they can be read by individual people, not broadcast and stored on pages to boost pageviews, and at some point, advertising dollars.

  6. Not only did we not get an email at any point, but we have not heard a response on our request to be removed from the listings. We have a CC license and would gladly share the full feeds if the CC was being honored.

  7. Now that I’ve looked into this further and read Palfrey’s response, let me amend my comments:

    1. They are using one of my sites.

    2. I won’t be asking them to remove it – they seem to be good people, and they’re only displaying excerpts even though we publish full feeds. The link to our site at the top of the category page is also a gesture of good faith that I appreciate.

    3. For what it’s worth, I never received an email about being included there. Email being as unreliable as it is, though, I’m not too concerned about that.

    I’m still not sure they’re bringing any value to readers, but they’re bringing value to my site by linking to it respectfully, and I’m glad to hear they’re not just another spammer.

  8. Ben Eastaugh says:

    Perhaps someone can explain to me why they’re using Javascript for their links? Obviously it makes them pop up in a new window, but it also means they’re not linking directly to the articles in question.

  9. Mark Barrera says:

    Are there any sites out there that can scan the web to see if these services are stealing your content?

  10. SnazzyCat says:

    Well, from what I know, Google is able to track where the original content of a website came from, but again I’m thinking they only guess by looking at the page’s PR or by checking a database like archive.org to see where the content originated.
    Any suggestions?

  11. Jennifer says:

    This blog post was referred to me to read at Active Rain. My company’s blog was a victim of Top Ten, and they never asked permission to publish our content.

    I find it interesting that Ben mentions JavaScript and the pop-up windows (not linking back directly; I am not sure I understand). However, when I contacted them to remove our content and stop pulling it, they said that they would stop, but that having an RSS feed gives them the right to republish and take our content.

    Pardon me, but the last time I checked, RSS was for people who wanted automated access to new content on your site, not for people to steal and profit off your intellectual content! To me, it is a very shady way of doing business.

    Besides, our company does not want to be associated with that site. It does not represent our site or our community well. We are very intentional about how we outwork our community blog.
