Link analysis going Crazy. Next Linkscape update. Multiple Problems
-
Yeah, but one has lost hundreds of links and one has lost thousands, and I know they both have very active SEO campaigns. It also hasn't added my FB shares and likes. How long does the link update take to complete?
-
-
Fair enough. Today I visited a couple of high-quality sites that I know our link is on, but which didn't show in the latest run of link analysis. I went back to the site our link was on and it's still there, and there's no reason this site should have been taken off.
-
Same here. I realise the 'OSE is looking to give better quality results and therefore only crawls the top pages of the web.' argument, but two .ac.uk (.edu equivalent here in the UK) links with DAs of 70+ and PAs of 45-50 are no longer appearing.
Are these not considered to be in the top part of the web? And that's just the start, several of our higher DA links are now missing from OSE, and (consequently?) our DA has dropped along with the PA of several of our top pages.
Links have disappeared from major manufacturers' websites on pages with PA of >50 pointing to another site I run (and yet, the links are still active). I don't see how this is an 'index clean-up'.
I completely understand removing links from brand new blogs with PA/DA of 1/1, sites that are still sandboxed, or deeply nested pages with very little authority, in order to improve overall performance. As it is, using PA and DA as a means of measuring how successful our SEO efforts are is becoming less useful: has it dropped due to something we've done, or is it an 'index clean-up'? For the same reason, using it to determine which links are worthwhile getting has lost some of its value too.
-
I agree, it appears SEOmoz have really messed up
-
Totally agree. It's confusing for clients because we use PA/DA in the monthly reports, and the recent DA drops have led to questions about whether we're doing a good job or not.
I always explain that the authority values are calculated relatively and that their website is compared to the other websites in the OSE index, but numbers are just numbers, and when a well-trusted graph starts declining, I'm always in an awkward situation.
It would be great to hear more info about OSE updates, especially because we, the PRO users, support the development.
-
Thumbs the question up to get SEOmoz's attention
-
Same story here I'm afraid. I'm a HUGE SEOmoz advocate but things seem to have gone slightly haywire with the latest OSE update. Our site has lost almost 200,000 links! Yikes! I'm not a massive 'link counter' and tend to focus on quality rather than quantity, but we have active links from great resources such as the BBC and the Guardian newspaper which are no longer showing in our OSE report. I presume this is the reason our domain authority has dropped from 64 to 58.
I'll tweet Rand and SEOmoz and point them in the direction of this thread as there seem to be a few disgruntled customers here.
-
If you have concerns about changes in data with the latest update, the quickest and most effective way of getting answers is to email the Help Team direct using help [at] seomoz.org.
Moz Staff do not routinely read the Q&A, so are unlikely to be aware of issues unless they are specifically alerted to a thread by an Associate who happens to see it.
-
We need to make them aware, because there are this many unhappy clients already, and I am sure there were a lot more when everyone woke up and logged on this morning. They need to sort this out, as it may become really, really bad for them.
-
Nice one. Yeah, tweet them and make them aware. I'll do the same too.
-
I am not disputing that SEOmoz needs to be alerted to an issue.
My point was that an email to the Help Team when this thread was started 8 hours ago would have alerted SEOmoz that there was an issue much more quickly.
Sha
-
I emailed them 3 hours before I started this post. Reliable service, huh?
-
For updates on this issue, please follow @SEOmoz on Twitter.
Tweeted in the past few minutes:
Good morning everyone. We've seen the tweets and Q&A about OSE and will update here with information soon.
-
A slight correction is that the help desk staff doesn't routinely read the Q&A, but other SEOmoz staff and associates do read Q&A. I personally try to read every new question that comes in, but am myself on Pacific time and don't see things right away when they come in overnight. Emailing help at seomoz.org is the best way to report a problem like this.
An OSE engineer did just come into the office and is aware of these Q&A threads and will be responding soon. We're so sorry that there have been problems, and we're working to resolve them as soon as possible. Sha Menz, thank you so much for helping out here.
-
Hi everyone,
First, there was an update last night, and I believe we've adjusted the calendars now.
We're actively tracking down what happened. What the team really needs now is domains affected. If you could respond here with the domains involved (if you're allowed to share them) that would be great, otherwise, please send an email to help@seomoz.org with "OSE Domains" in the subject line so they can filter the incoming messages that way.
Thanks for your patience everyone!
-
my domain was affected: www.completeoffice.co.uk. thanks
-
Hi Keri, Thanks for your response. The domain involved for me is confetti.co.uk Thanks again, Brendan.
-
Hey gang - there's a thread going around the SEOmoz engineering + help teams on this topic today. We're researching what happened right now, and Kate Matsudaira, our VP Engineering, has promised to leave a reply once she's got the full story. We'll try to be as transparent as possible here and as fast as we can as well, but Linkscape investigation can take some time due to the massive complexity of the system.
Thanks for posting responses and please do keep suggesting sites we're missing, pages we might not have crawled but should have, large drops in metrics (particularly if/when they're outliers to the rest of your competitors/other sites in your sphere).
Thanks much!
Rand
p.s. Normally, I'd be much more involved in this myself, but today's my anniversary with my wife and we're on vacation in Southern Oregon. Don't worry, though, I only take one each year, so typically I'm better able to respond fast.

-
Hi everyone!
I just wanted to add a quick response to shed a bit more light on the situation.
Last year we started on a project to drastically improve our index. The first part of that was to make our crawler discover more of the web - this included crawling deeper on domains, discovering more links faster (freshness), and containing more links overall.
Background
To understand the changes, it might help if I explain how our crawler used to work and how we changed.
Our crawler used to crawl the web (for 3-4 weeks), then we would compute the link graph and create all the lists of links, and metrics you see in Open Site Explorer - this is what we called processing (and it would take 2-3 weeks). As part of processing we would select the top 10 billion urls to crawl, and then start crawling those.
The problem with this system was that the data could be 7-8 weeks old (crawling time + processing + deployment to the API and OSE). It also wasn't recursive - meaning that we would only discover new links when we did the processing of that crawl, so it could take us several months before we would see new links that were deeper in domains.
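To make the 7-8 week figure concrete, here is a rough sketch of the arithmetic. The crawl and processing durations are the upper ends of the ranges given above; the deployment time is not stated in this post, so the one week used below is purely an assumption for illustration, and the function name is hypothetical.

```python
def worst_case_age_weeks(crawl_weeks, processing_weeks, deploy_weeks):
    """Age, at release time, of a page fetched on the first day of the crawl.

    That page waits for the rest of the crawl, then for processing,
    then for deployment, before anyone sees it in OSE.
    """
    return crawl_weeks + processing_weeks + deploy_weeks


# Upper ends of the ranges quoted above: 4 weeks crawling, 3 weeks processing.
# ASSUMPTION: 1 week of deployment (not stated in the post).
oldest = worst_case_age_weeks(crawl_weeks=4, processing_weeks=3, deploy_weeks=1)
print(f"Oldest data at release: about {oldest} weeks old")
```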
The changes
We modified our crawler so we were crawling all the time - we crawl sites every day, or week, or month - based on authority. As we crawl those sites, any new links that we find are added to one of the buckets, and will typically be crawled within that same index. This is exciting because we can go deeper, discover more links, and produce a higher quality index. The other benefit is that, since we are crawling all the time, we can just take a snapshot of that crawl and run processing - without waiting for the last round of processing to finish - and this means we can update the index more often.
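As a very rough illustration of the daily/weekly/monthly buckets, a scheduler along these lines would pick a recrawl interval from a site's authority. This is only a sketch, not our actual crawler code: the threshold values, the 30-day "month", and the function names are all assumptions made up for the example.

```python
from datetime import date, timedelta

# ASSUMPTION: hypothetical cut-offs. The post only says the interval
# is "based on authority"; it does not give the real thresholds.
DAILY_MIN_AUTHORITY = 70
WEEKLY_MIN_AUTHORITY = 40


def recrawl_interval(authority):
    """Return how long to wait before crawling a site again."""
    if authority >= DAILY_MIN_AUTHORITY:
        return timedelta(days=1)
    if authority >= WEEKLY_MIN_AUTHORITY:
        return timedelta(weeks=1)
    return timedelta(days=30)  # "monthly", approximated as 30 days


def next_crawl(last_crawled, authority):
    """Date on which the site is next due to be crawled."""
    return last_crawled + recrawl_interval(authority)


print(next_crawl(date(2012, 8, 1), authority=85))  # daily bucket
print(next_crawl(date(2012, 8, 1), authority=50))  # weekly bucket
print(next_crawl(date(2012, 8, 1), authority=10))  # monthly bucket
```

Newly discovered links would simply be dropped into one of the same three buckets, which is what lets the crawl keep going deeper without waiting for a processing run.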
However, in June, we had a problem with the old crawlers, and we had to roll out our new version of the crawl and index with the OSE launch on July 27th. So even though our testing looked good when we released the new index, and correlations were higher than the old crawl, we got complaints about things that were wrong.
The issues
Binary files were in the index - There are normally only supposed to be links in the index, but because the new crawler went very deep on some domains we started discovering all sorts of binary files, which when parsed, produced lots of weird links. So domains had all these links from sites that didn't link to them. We fixed this issue, and this is the first index with the fix.
We went too deep on big domains - There are a lot of knobs to turn on the new crawlers - from the number of sites we crawl daily/weekly/monthly to how many links we keep for different domains. One of the first things we noticed with this new crawl was that we had fewer domains in our index. So we dialed down how many urls could come from a domain - and this new index also contains that change.
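The two fixes above can be pictured roughly like this: skip anything that looks like a binary file, and stop taking urls from a domain once it hits a cap. Again, this is a simplified sketch under stated assumptions rather than the real pipeline: the extension list, the cap value, the use of the hostname as the "domain", and the function names are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

# ASSUMPTION: hypothetical list of extensions treated as binary files.
BINARY_EXTENSIONS = (".pdf", ".zip", ".exe", ".jpg", ".png")


def looks_binary(url):
    """True if the url path ends in one of the assumed binary extensions."""
    return urlparse(url).path.lower().endswith(BINARY_EXTENSIONS)


def select_urls(urls, max_per_domain):
    """Keep non-binary urls, at most max_per_domain per hostname.

    urls is assumed to be ordered by priority already, so the first
    urls seen for each hostname are the ones that are kept.
    """
    kept = []
    per_domain = defaultdict(int)
    for url in urls:
        if looks_binary(url):
            continue
        domain = urlparse(url).netloc.lower()
        if per_domain[domain] >= max_per_domain:
            continue
        per_domain[domain] += 1
        kept.append(url)
    return kept


candidates = [
    "http://a.example/1",
    "http://a.example/file.pdf",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/x",
]
print(select_urls(candidates, max_per_domain=2))
```

Lowering the cap frees up room for urls from more domains, at the cost of depth on the biggest ones, which is the trade-off described above.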
What we are doing
We recognize that all of you depend on this data. And we take the index quality very seriously.
We have already made a lot of other changes, increasing the overall size and adjusting how we crawl. However, since it still takes 2-4 weeks to process an index, some of those changes won't be seen for another 2-4 weeks yet.
We are also working on an updated, higher correlating Page Authority/Domain Authority that should be out in a month or two - but also may jump around a bit.
What you can do
Definitely keep sending us feedback. It really helps us understand where we may have missed in our testing, and what we can do to fix it.
And thanks again for your patience - we really want to deliver the best possible Linkscape for you, and I assure you the team is working nights and weekends to address these concerns.
And if anyone has questions you can always email me or our help team (which tends to respond to emails much faster), as all of us care a lot and really want to hear your feedback.
Thanks again,
Kate