Search Crawl Log Error – Access is denied

Scenario:

Crawl History shows 0 successes and a bunch of security errors.

The following error is logged in the Error Breakdown:

“Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository.”

Possible Solution:

Check your hosts file. In my case, the entries for our host headers had been commented out for testing and never restored.
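
For illustration, the kind of entry involved looks like this (the IP and host name below are made up). While the line is commented out, the crawler no longer resolves the host header to the local web application, and authentication fails with the "Access is denied" error above:

    # %SYSTEMROOT%\System32\drivers\etc\hosts
    # 10.0.0.5    portal.contoso.com    <- commented out "for testing" = broken crawl
    10.0.0.5      portal.contoso.com    <- restored entry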

 

Working Around Low Disk Space During A Full Crawl

TIL that space is released when you pause a crawl!

Backstory:

We decided to re-index our SharePoint site due to spotty search results, and learned the hard way that we didn’t have enough space to run a full crawl.

Adding space wasn’t an option, and I had to get search working again or face the wrath of the users who rely on it. I decided I’d run the full crawl and pause it before free space got too low. That way, at least some content would be searchable.

To my surprise, a ton of drive space freed up when I paused the crawl, and items that had already been crawled were still searchable! Using that to my advantage, I spent the rest of the day pausing and resuming the full crawl until it was finished. Tedious, but worth it.

Summary:

If you don’t have enough disk space to run a full crawl straight through, monitor the drive while the crawl is running and pause it when you need to free up space. Once the crawl’s status changes from “Pausing” to “Paused”, confirm that space is available again and resume crawling. Repeat as necessary.
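
If you’d rather not eyeball Explorer all day, a simple poller can watch the drive and tell you when it’s time to pause. A minimal sketch in Python; the drive letter and threshold are assumptions, and the pausing itself still happens in the Search Administration UI as described above:

    import shutil
    import time

    DRIVE = "C:\\"         # drive holding the crawler's temp files (assumption)
    THRESHOLD_GB = 10      # warn when free space drops below this (assumption)
    POLL_SECONDS = 60

    while True:
        free_gb = shutil.disk_usage(DRIVE).free / (1024 ** 3)
        print(f"{free_gb:.1f} GB free on {DRIVE}")
        if free_gb < THRESHOLD_GB:
            print("Low disk space: pause the crawl and wait for the 'Paused' status.")
        time.sleep(POLL_SECONDS)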

Thoughts:

Releasing space seems like basic functionality. Does the crawler really not have the ability to determine free space on the drive where temp files are stored? Why doesn’t it pause itself when the “low disk space” event is triggered? Why are temp files stored on the system drive by default if there isn’t a mechanism that prevents the drive from bottoming out, corrupting the index? Am I missing something?

That being said, I’m not convinced of the health of our search service application.

Recent Crawl/Query Rates Incorrectly Show 0.00 Items Per Second

Problem:

“Recent crawl rate” and “Recent query rate” statistics incorrectly show 0.00 items per second on the Search Administration page.


Confirmed the crawl rate from the Crawl Health Reports page.


Confirmed that the number of searchable items was increasing.

Found a TON of update conflict errors (event IDs 6398 and 6482) in the Application event log.
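
If you want to pull those conflicts out of the log without scrolling through Event Viewer, you can filter on the two event IDs. A sketch using wevtutil via Python (wevtutil ships with Windows Server 2008 and later; on older servers, use the Event Viewer filter instead):

    import subprocess

    # XPath filter matching the two update conflict event IDs noted above
    query = "*[System[(EventID=6398 or EventID=6482)]]"

    # Query the Application log: newest 20 matches, rendered as plain text
    result = subprocess.run(
        ["wevtutil", "qe", "Application", f"/q:{query}", "/f:text", "/c:20", "/rd:true"],
        capture_output=True, text=True,
    )
    print(result.stdout)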


Solution:

Clear the configuration cache (a scripted version of these steps follows the list):

  1. Stop the Windows SharePoint Services Timer service.
  2. Navigate to the cache folder: %SYSTEMDRIVE%\ProgramData\Microsoft\SharePoint\Config
  3. Locate the folder that has the “Cache.ini” file. The folder name should be a GUID.
  4. Back up the Cache.ini file (copy it into the parent folder and append .bak to the filename).
  5. Delete all the XML configuration files in the GUID folder. NOTE: DO NOT DELETE the Cache.ini or the GUID folder!
  6. Edit the Cache.ini file.
  7. Replace the number in the file with a “1”. Save.
  8. Start the Windows SharePoint Services Timer service.
  9. After the XML files repopulate, confirm that the Cache.ini file in the GUID folder contains the original number.
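
These steps can be scripted for repeat offenders. A sketch of steps 1 through 8 in Python, assuming it runs elevated on the affected server and that a single GUID folder exists under the Config directory:

    import os
    import shutil
    import subprocess
    from pathlib import Path

    SERVICE = "Windows SharePoint Services Timer"   # display name from step 1
    config_root = Path(os.environ["SYSTEMDRIVE"] + "\\",
                       "ProgramData", "Microsoft", "SharePoint", "Config")

    # Step 1: stop the timer service (net accepts the display name)
    subprocess.run(["net", "stop", SERVICE], check=True)

    # Steps 2-3: find the GUID folder that contains Cache.ini
    guid_folder = next(p.parent for p in config_root.glob("*/Cache.ini"))

    # Step 4: back up Cache.ini into the parent folder with a .bak extension
    shutil.copy2(guid_folder / "Cache.ini", config_root / "Cache.ini.bak")

    # Step 5: delete the XML configuration files -- never Cache.ini or the folder
    for xml_file in guid_folder.glob("*.xml"):
        xml_file.unlink()

    # Steps 6-7: reset the counter so the timer service rebuilds the cache
    (guid_folder / "Cache.ini").write_text("1")

    # Step 8: start the timer service; the XML files repopulate on their own
    subprocess.run(["net", "start", SERVICE], check=True)

Step 9 stays manual: give the timer service a few minutes, then check Cache.ini as described above.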

Confirm that “Recent crawl rate” and “Recent query rate” show data.
