Canonicalizable


EDIT: I was getting scraped too much, so I had to pull the tool down :( Fortunately, the code is open source and you can get a free Linkscape API key here.

Inspired by an article at Jane and Robot about domain canonicalization (and the fact that I’m the lead developer on the Linkscape API), I decided I’d write a small application using the Linkscape Free API to help with a common problem: checking canonicalization of website home pages.

I spend a lot of time answering SEOmoz Q&A and I see canonicalization problems come up all the time. No matter the size of the site or the savvy of the engineering team, this is just an easy problem to miss. But it’s also easy to fix. You just have to find those pesky canonicalization errors.
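To make "finding those errors" concrete, here is a minimal Python sketch of the idea. It is not the tool's code and it does not call the Linkscape API. The function names, the list of default-document paths, and the sample data are all assumptions made for illustration. It lists the common URL forms a home page tends to answer on, then flags any form that does not end up at the one canonical URL.

```python
def home_page_variants(domain):
    """Return common URL forms that often serve the same home page.

    The path list is illustrative only, not an exhaustive set.
    """
    bare = domain.lower().strip("/")
    if bare.startswith("www."):
        bare = bare[4:]
    hosts = [bare, "www." + bare]
    paths = ["/", "/index.html", "/index.php", "/default.aspx"]
    return ["http://" + host + path for host in hosts for path in paths]


def find_canonicalization_errors(canonical, resolved):
    """Return the variants that do not resolve to the canonical URL.

    `resolved` maps each variant URL to the URL it finally resolves to
    (for example after redirects). How that mapping is gathered is left
    open here; it is an input to this sketch.
    """
    return sorted(v for v, final in resolved.items() if final != canonical)


if __name__ == "__main__":
    canonical = "http://www.example.com/"
    # Hypothetical observations: every variant redirects to the canonical
    # URL except one, which serves its own copy of the page.
    resolved = {v: canonical for v in home_page_variants("example.com")}
    resolved["http://example.com/index.php"] = "http://example.com/index.php"
    for url in find_canonicalization_errors(canonical, resolved):
        print("not canonicalized:", url)
```

Keeping the check as a pure function over already-collected data means the same logic works whether the data comes from an API or from requests you verify yourself.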


The tool doesn’t scrape any sites. All of the data is pulled from the Linkscape API, and everything here is possible using only the free API. All of the code is available in my GitHub repository. There’s plenty of documentation there, and that’s a good place for any discussion about the code. Feel free to take it and use it in whole or in part on your site in any application. You’ll just have to sign up for a free API key.

  1. #1 by Carter Cole on January 18th, 2010

    thanks! they need to extend the API and allow more access to the data… there’s no limit to the stuff you could do

  2. #2 by Nick Gerner on January 18th, 2010

    @Carter Cole everything you see here will be possible with the free API we’re releasing by tomorrow morning. You’ll be able to get links, anchor text, and slick metrics from the free API. So go check it out :)

  3. #3 by Richard L. Trethewey on January 20th, 2010

    It’s a very handy tool. Maybe you could also have it scan HTML pages for a rel="canonical" tag, which mitigates many common canonicalization problems.

  4. #4 by Nick Gerner on January 20th, 2010

    @Richard L. Trethewey the tool uses the Linkscape API, which has data on rel=canonical, so to the extent that the API uses that info (which the tool doesn’t leverage _yet_), the tool will show that.

  5. #5 by AJ on January 26th, 2010

    Great tool, really helped find an issue with a client’s site. However, we made the redirects the tool specified and we still seem to be getting the same error back. Does the tool keep a cache of entries at all that would affect re-checks?

  6. #6 by Nick Gerner on January 26th, 2010

    @AJ the tool uses the Linkscape API to get data. That data is updated about every 5 weeks, with a lag behind that in crawl data. So don’t worry. If you verified the redirects for yourself, I’m sure everything is fine.

  7. #7 by Emanuele on March 24th, 2011

    Is the tool dead? The url redirects here.
    Thanks,
    Emanuele

  8. #8 by Nick Gerner on March 24th, 2011

    Yes, sadly, I had to take it down. I was getting scraped too much and my web host had words with me.

    However, the code is open source and you can grab it here:
    http://github.com/gerner/canonicalizable
