Latest posts made by DanCrean
-
Way to spider Wordpress site
I have an old Wordpress site and I want to move it to a new server and take it off Wordpress (too many hacks). I am trying to spider the site so as to get static, non-Wordpress, pages.
I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/, the URL I get out of the spider is /page/index.html, and those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects.
I tried using these spidering programs: WinHTTrack Website Copier and PageNest.
Does anyone know of another method of turning a Wordpress site into a non Wordpress site?
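For what it's worth, here is a rough sketch of the URL-to-file mapping involved. It assumes the new server uses index.html as its directory index (the usual default on Apache and nginx), in which case a file saved at page/index.html would still be served at /page/. The function name static_path_for is made up for illustration and is not part of any spidering tool.

```python
from urllib.parse import urlsplit


def static_path_for(url: str) -> str:
    """Return a relative file path for a crawled page.

    Assumption: the static server serves index.html for directory
    requests, so /page/ and /page/index.html resolve to the same file.
    """
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # Directory-style URL: store the page as index.html in that directory.
        return path.lstrip("/") + "index.html"
    # File-style URL: keep the path as it is.
    return path.lstrip("/")


if __name__ == "__main__":
    for url in ("http://www.domain.com/page/", "http://www.domain.com/about.html"):
        print(url, "->", static_path_for(url))
```

If that assumption holds, the remaining concern would be internal links that the spider rewrote to point at index.html, rather than the file layout itself.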
-
RE: Where can I find broken link.
Doesn't the crawl diagnostic system tell you? If you are talking about Google Webmaster Tools, under Diagnostics => Crawl Errors they will show 404 errors. They also usually show, under the Linked From column, the page they found the broken links on. If the links are from external sites, you can't necessarily do anything about it, although you can try to contact the webmaster of the linking site and ask him to fix the link. If the broken links are from your own site, you fix them.
Many webmasters also use a program called Xenu. Do a search for it. It runs on Windows; I don't think they have a Mac version. It will crawl your site and identify broken links and what page they are on. It's a good idea to run a Xenu scan on your sites periodically.
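To give an idea of what a link checker does, here is a minimal sketch in Python. It is not how Xenu is implemented, just an illustration: it pulls the links out of one page's HTML and reports the ones that come back 404. The names LinkCollector, find_broken_links and get_status are made up for this example, and get_status is a function you supply that returns the HTTP status code for a URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def find_broken_links(page_url, html, get_status):
    """Return (source page, broken link) pairs for links that return 404.

    get_status is a caller-supplied function (hypothetical here) that
    maps a URL to its HTTP status code.
    """
    collector = LinkCollector(page_url)
    collector.feed(html)
    return [(page_url, link) for link in collector.links if get_status(link) == 404]
```

Keeping the source page next to each broken link gives you the same information as the Linked From column: which page you need to edit.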
-
RE: Reciprocal Links and nofollow/noindex/robots.txt
-
Yes, your link back to the other site is in good faith and good for readers. If you don't do it too much, you shouldn't get dinged for recip linking.
-
About 4 or 5 years ago I used to see sites do this, usually using the robots.txt file to exclude spidering of their links page. I don't know if it's the "best practice," but it seems robots.txt was used more often than noindex on the page.
It's a sleazy thing to do and yes, it can cause bad blood with your link partners. I know because on more than one occasion I informed sites about that practice being used on them, and they removed their outbound links and thanked me for pointing out how they were being played for chumps.
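If you want to check whether a link partner is doing this to you, one way is to test their links page against their robots.txt. Here is a small sketch using Python's standard urllib.robotparser module. The helper name is_blocked and the example URLs are made up for illustration; it only covers the robots.txt case, not a noindex tag on the page.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text disallows crawling url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical robots.txt that hides a links page from all crawlers.
    robots = "User-agent: *\nDisallow: /links.html\n"
    print(is_blocked(robots, "http://example.com/links.html"))
    print(is_blocked(robots, "http://example.com/about.html"))
```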
-
How to best take advantage of content being used on another site?
We've never syndicated content or done "article marketing".
Another site contacted us and requested to use the content on several of our webpages. The other site is a fairly prestigious nonprofit in our industry. We don't mind them using our content, but we want to get the most benefit out of it.
There are two ways that occur to me:
-
Have them create pages with the exact same text as on our pages, but put in the header of those pages
-
Just have them create pages with the text from our pages with embedded links back to our other pages. Each page they create will say "Content courtesy of XXX"
Does anyone have opinions on which way is best, or another approach?
Blog Posts
9/24/2007
PPC is more complicated than it was a few years ago because the ad platforms – Google AdWords, Yahoo Search Marketing, MSN AdCenter, and others – have become more adept at price discrimination. They are better at charging their customers - the advertisers - what the customers are willing to pay. Customers often resent price discrimination but businesses employ it to incr...
Interested in quantitative marketing.