One good tip is that you should prepare a crawler page (or pages) and submit this to the search engines. This page should have no text or content except for links to all the important pages that you wished to be crawled. When the spider reaches this page it would crawl to all the links and would suck all the desired pages into its index. You can also break up the main crawler page into several smaller pages if the size becomes too large. The crawler shall not reject smaller pages, whereas larger pages may get bypassed if the crawler finds them too slow to be spidered.
You do not have to be concerned that the result may throw up this “site-map” page and would disappoint the visitor. This will not happen, as the “site-map” has no searchable content and will not get included in the results, rather all other pages would. We found the site wired.com had published hierarchical sets of crawler pages. The first crawler page lists all the category headlines, these links lead to a set of links with all story headlines, which in turn lead to the news stories.
You do not have to be concerned that the result may throw up this “site-map” page and would disappoint the visitor. This will not happen, as the “site-map” has no searchable content and will not get included in the results, rather all other pages would. We found the site wired.com had published hierarchical sets of crawler pages. The first crawler page lists all the category headlines, these links lead to a set of links with all story headlines, which in turn lead to the news stories.
__________________
WinHost Web Hosting
WinHost Web Hosting
Leave a Reply
You must be logged in to post a comment.