Speeding up site indexing & handling of deleted pages
Hi,
Is there a quick way to speed up site indexing other than manually browsing every page of our site?
Our site runs to approx. 5,400 pages and we get a reasonable volume of daily visitors. We understand that the site index is built using a refer script based on visitor page views, and after one day about 570 pages are in the index. Our concern is that only the most popular pages will be included immediately, leaving the less popular pages omitted or taking quite some time to be included. Our current sitemap registered with Google runs to about 4,800 pages, and until the AutoSitemap index reflects the true number of pages in our site we cannot switch to the AutoSitemap index, lest Google de-index pages from our site.
I was therefore wondering if there is a way of manually updating the SQL database with all of the HTML page names, or alternatively a means of crawling the site to ensure that all of the pages get registered?
My second question: does AutoSitemap remove deleted pages from the index?
Thank you,
GuyPV
Hi David,
Thanks for the reply.
I think our current XML file could probably be opened by a PHP script.
It's a shame about the deleted pages.
Regards,
Guy
Hello Guy,
You've correctly understood how AutoSitemap works, so I'm afraid that without intervention it would take time until all URLs are included. Of course, this is not so much of a problem on a new site (a forum, for example, where users are creating new pages all the time), as the AutoSitemap database would grow with the site.
If you already have a sitemap it should be possible to write a small utility to parse the file and load the URLs into the AutoSitemap database. I'll have a look into this for you. Could your current XML file be opened by a PHP script running in the same directory on your server?
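To give a rough idea of what such a utility involves, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and it assumes the existing sitemap follows the standard sitemaps.org format. The `pages` table and its `url` column are hypothetical stand-ins (the real AutoSitemap schema is not described in this thread), and SQLite is used only so the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespace used by sitemaps.org-format sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return the text of every <url><loc> entry in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def load_urls(conn, urls):
    """Insert URLs into a hypothetical `pages` table, skipping duplicates.

    Returns the number of rows actually added.
    """
    # Assumption: one row per URL, keyed on the URL itself.
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY)")
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO pages (url) VALUES (?)",
        [(u,) for u in urls],
    )
    conn.commit()
    return conn.total_changes - before
```

Because duplicates are ignored, the load could be run more than once without creating repeated entries for pages that visitors have already registered.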
I'm afraid old pages are not automatically removed, as there is no feedback mechanism to tell the database which pages no longer exist, although you can of course delete them using either an exact or wildcard URL.
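As a rough illustration of exact versus wildcard deletion, the sketch below removes rows from a hypothetical `pages` table with a `url` column. This is not AutoSitemap's actual interface: the table name, the use of `*` as the wildcard character, and SQLite as the database are all assumptions made so the example runs on its own.

```python
import sqlite3


def delete_urls(conn, pattern):
    """Delete rows whose URL matches `pattern`; returns the number removed.

    Assumption: `*` in the pattern matches any run of characters.
    A pattern without `*` is treated as an exact URL.
    """
    if "*" in pattern:
        # Escape SQL LIKE metacharacters so they match literally,
        # then translate the `*` wildcard into LIKE's `%`.
        escaped = (
            pattern.replace("\\", "\\\\")
            .replace("%", "\\%")
            .replace("_", "\\_")
        )
        like = escaped.replace("*", "%")
        cur = conn.execute(
            "DELETE FROM pages WHERE url LIKE ? ESCAPE '\\'", (like,)
        )
    else:
        cur = conn.execute("DELETE FROM pages WHERE url = ?", (pattern,))
    conn.commit()
    return cur.rowcount
```

For example, under these assumptions `delete_urls(conn, "http://www.example.com/old/*")` would remove every stored URL beneath `/old/`. Note that SQLite's `LIKE` ignores ASCII case by default, so a wildcard pattern may match more than an exact one would.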
Cheers,
David.