The Moz Q&A Forum

    Way to spider Wordpress site

    Technical SEO Issues
    • DanCrean

      I have an old WordPress site that I want to move to a new server and take off WordPress (too many hacks). I am trying to spider the site to get static, non-WordPress pages.

      I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/, the URL I get out of the spider is /page/index.html, and those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects.

      I tried using these spidering programs: WinHTTrack Website Copier and PageNest.

      Does anyone know of another method of turning a WordPress site into a non-WordPress site?

      • mememax

        Hi Dan, I'm not very experienced in migrating a WordPress site to a non-WordPress one, but I understand the issue is that the spider returns index.html files for URLs like domain/page/.

        That's normal: whichever spider you use, you'll always get an index.html file. Each directory has its own index.html, which is the default file the server shows for that directory unless you set up something different with rewrite rules.

        If you request /page/, the server will serve that directory's index.html file. What you need to make sure of is that you set up a 301 redirect so that no index.html URL is shown and each one redirects to its / version (with wildcards it's a one-line rule), and that your internal links all point to the / pages and not to the index.html versions. You can just find and replace the /index.html" string in the HTML code with /" (Dreamweaver or any HTML editor will do that in bulk).
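For anyone who would rather script the bulk find-and-replace than use an editor, here is a minimal sketch in Python. It assumes the spidered copy sits in a local folder and that internal links end in /index.html followed by a quote, # or ?. The function names and the folder name are made up for illustration, so test it on a copy of the files first.

```python
import re
from pathlib import Path

# Matches "/index.html" only when it is followed by a closing quote,
# a fragment (#) or a query string (?), so "/index.html5/" is left alone.
# Bare relative links such as href="index.html" are NOT touched.
INDEX_SUFFIX = re.compile(rb'/index\.html(?=["\'#?])')


def strip_index_links(html: bytes) -> bytes:
    """Rewrite links like /page/index.html to /page/ in one HTML document."""
    return INDEX_SUFFIX.sub(b"/", html)


def rewrite_mirror(root: Path) -> int:
    """Apply strip_index_links to every .html file under root.

    Works on raw bytes so the files' original encoding is preserved.
    Returns the number of files that were changed.
    """
    changed = 0
    for path in root.rglob("*.html"):
        original = path.read_bytes()
        updated = strip_index_links(original)
        if updated != original:
            path.write_bytes(updated)
            changed += 1
    return changed
```

Calling rewrite_mirror(Path("mirror")) would process a hypothetical folder named mirror. This only fixes the links inside the pages; the 301 redirect for any index.html URLs still has to be set up on the server.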

        One comment on your idea: you may find it useful to build a PHP-driven site, using includes for the header, footer, and nav/sidebar. Thinking ahead, if you want to change a portion of the page that repeats throughout a static site, you'll have to make the change in every page and upload them all, which is a lot of work and leaves room for many human/machine errors.
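If the pages have to stay fully static, the same shared-fragment idea can be applied at build time instead of with PHP includes. The sketch below is that swapped-in alternative, not the PHP approach itself: it assumes each source file holds only the page body, and the function and folder names are hypothetical.

```python
from pathlib import Path


def build_page(header: str, body: str, footer: str) -> str:
    """Join the shared header and footer around one page body."""
    return f"{header}\n{body}\n{footer}\n"


def build_site(src: Path, out: Path, header: str, footer: str) -> list[Path]:
    """Write a full page to out for every body-only .html file under src.

    The directory layout is kept, so src/page/index.html becomes
    out/page/index.html. Assumes out is not located inside src.
    """
    written = []
    for body_file in sorted(src.rglob("*.html")):
        target = out / body_file.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        body = body_file.read_text(encoding="utf-8")
        target.write_text(build_page(header, body, footer), encoding="utf-8")
        written.append(target)
    return written
```

With this, a change to the header or footer means editing one fragment and rebuilding, although the regenerated pages still need to be uploaded afterwards.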

        Hope that helped you out!

        • evolvingSEO

          Hi Dan

          Hmm, that's a little strange. Two things:

          • Is WordPress updated? Do you get the normal URLs when viewing in your browser?
          • Have you tried Screaming Frog SEO Spider? It's free to crawl up to 500 pages 😉 Although it won't get the actual HTML on the pages, it could perhaps solve the URL issue.

          This blackhat world thread has a few options too.

          -Dan
