Welcome to the Q&A Forum

DanCrean

I have an old WordPress site that I want to move to a new server and take off WordPress (too many hacks). I am trying to spider the site to get static, non-WordPress pages.

I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/, the URL I get out of the spider is /page/index.html, and those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects by hand.

I tried using these spidering programs: WinHTTrack Website Copier and PageNest.

Does anyone know of another method of turning a WordPress site into a non-WordPress site?
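
One thing that may help with the URL problem described above: most web servers can be configured to answer a directory-style URL such as /page/ with the file /page/index.html, so the spidered files do not necessarily change the public URLs. Here is a minimal sketch of that mapping in Python. The function names `static_path_for` and `public_url_for` are made up for illustration, and the sketch assumes the new server uses index.html as its directory index.

```python
from urllib.parse import urlsplit


def static_path_for(url: str, index_name: str = "index.html") -> str:
    """Map a crawled URL to the relative file path a static mirror would use."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # Directory-style URL: store the page as that directory's index file.
        path += index_name
    return path.lstrip("/")


def public_url_for(file_path: str, index_name: str = "index.html") -> str:
    """Map a mirrored file back to the URL path a server with a
    directory index (assumed here) would answer on."""
    url_path = "/" + file_path
    if url_path.endswith("/" + index_name):
        # Drop the index file name so the original trailing-slash URL remains.
        url_path = url_path[: -len(index_name)]
    return url_path
```

If every mirrored file maps back to the URL it was crawled from, the indexed URLs should keep working without per-page redirects. Note that the URL passed in needs a scheme (http://...) for `urlsplit` to separate the host from the path.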

DanCrean

Doesn't the crawl diagnostic system tell you? If you are talking about Google Webmaster Tools, under Diagnostics => Crawl Errors they will show 404 errors. They also usually show, under the Linked From column, the page where they found the broken links. If the links are on external sites, you can't necessarily do anything about it, although you can try to contact the webmaster of the linking site and ask him to fix the link. If the broken links are on your own site, you fix them.

Many webmasters also use a program called Xenu. Do a search for it. It runs on Windows; I don't think they have a Mac version. It will crawl your site and identify broken links and what page they are on. It's a good idea to run a Xenu scan on your sites periodically.
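
To give a rough idea of what a link checker like this does, here is a small Python sketch. It is not how Xenu itself works internally, just an illustration of the idea: collect the links on each page, resolve them against the page URL, and report the ones that come back with an error status. The names `LinkCollector` and `find_broken_links` are hypothetical, and the `status_of` callable is an assumption standing in for whatever HTTP client you would actually use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(pages, status_of):
    """Return (page_url, link_url, status) for every link with status >= 400.

    pages:     dict mapping a page URL to that page's HTML source
    status_of: callable taking a URL and returning its HTTP status code
               (supplied by the caller; no network access happens here)
    """
    broken = []
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page they were found on.
            target = urljoin(page_url, href)
            status = status_of(target)
            if status >= 400:
                broken.append((page_url, target, status))
    return broken
```

The result pairs each broken link with the page it was found on, which is the same information the Linked From column gives you.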

DanCrean

Yes, your link back to the other site is in good faith and good for readers. If you don't do it too much, you shouldn't get dinged for reciprocal linking.
About 4 or 5 years ago I used to see sites do this, usually using the robots.txt file to exclude spidering of their links page. I don't know if it's the "best practice", but it seems robots.txt was used more often than noindex on the page.

It's a sleazy thing to do and yes, it can cause bad blood with your link partners. I know because on more than one occasion I informed sites about that practice being used on them, and they removed their outbound links and thanked me for pointing out how they were being played for chumps.
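
If you want to check whether a link partner is hiding its links page this way, the robots.txt part is easy to test. Here is a minimal Python sketch using the standard library's `urllib.robotparser`. The function name `is_blocked_by_robots` and the example URLs are made up for illustration, and the sketch only covers robots.txt rules, not a noindex tag on the page itself.

```python
from urllib.robotparser import RobotFileParser


def is_blocked_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows crawling of url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Example: a partner whose robots.txt excludes its links page.
partner_robots = "User-agent: *\nDisallow: /links.html\n"
links_page_hidden = is_blocked_by_robots(
    partner_robots, "http://partner.example/links.html"
)
```

You would fetch the partner's real robots.txt yourself and pass its text in; the string above is only a stand-in.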

DanCrean

We've never syndicated content or done "article marketing".

Another site contacted us and requested to use the content on several of our webpages. The other site is a fairly prestigious nonprofit in our industry. We don't mind them using our content, but we want to get the most benefit out of it.

There are two ways that occur to me:

Have them create pages with the exact same text as on our pages, but put in the header of those pages
Just have them create pages with the text from our pages with embedded links back to our other pages. Each page they create will say "Content courtesy of XXX"

Does anyone have opinions on which way is best, or another approach?
