Working Around Low Disk Space During A Full Crawl

TIL that space is released when you pause a crawl!

Backstory:

We decided to re-index our SharePoint site due to spotty search results, and learned the hard way that we didn’t have enough space to run a full crawl.

Adding space wasn’t an option, and I had to get search working again or face the wrath of the users who rely on it. I decided to run the full crawl and pause it before free space got too low. That way at least some content would be searchable.

To my surprise, a ton of drive space freed up when I paused the crawl, and items that had already been crawled were still searchable! Using that to my advantage, I spent the rest of the day pausing and resuming the full crawl until it was finished. Tedious, but worth it.

Summary:

If you don’t have enough disk space to run a full crawl straight through, monitor the drive while the crawl is running and pause it when you need to free up space. Once the crawl’s status changes from “Pausing” to “Paused”, confirm that space is available again and resume crawling. Repeat as necessary.
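If you'd rather not babysit the drive by hand, the loop above can be sketched in a few lines of Python. This is only a rough sketch under some assumptions, not something I ran against our farm: `pause_crawl`, `resume_crawl`, and `get_status` are hypothetical callbacks you would have to wire up to your own crawl controls, the threshold is whatever you consider "too low", and I'm assuming a finished crawl reports "Idle". Only the free-space check (`shutil.disk_usage`) is real standard-library Python.

```python
import shutil
import time
from typing import Callable


def crawl_with_pauses(
    drive: str,
    min_free_bytes: int,
    pause_crawl: Callable[[], None],    # hypothetical: asks the crawl to pause
    resume_crawl: Callable[[], None],   # hypothetical: resumes a paused crawl
    get_status: Callable[[], str],      # hypothetical: returns the crawl status text
    free_bytes: Callable[[str], int] = lambda path: shutil.disk_usage(path).free,
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pause and resume a crawl whenever free space runs low.

    Returns the number of pause/resume cycles that were needed.
    """
    pauses = 0
    while True:
        status = get_status()
        if status == "Idle":  # assumption: "Idle" means the crawl has finished
            return pauses

        if status == "Crawling" and free_bytes(drive) < min_free_bytes:
            pause_crawl()
            # Wait for the status to move from "Pausing" to "Paused".
            while get_status() != "Paused":
                sleep(poll_seconds)
            # Confirm the space actually came back before resuming.
            if free_bytes(drive) < min_free_bytes:
                raise RuntimeError("Pausing did not free enough disk space")
            resume_crawl()
            pauses += 1

        sleep(poll_seconds)
```

The callbacks are passed in on purpose so the loop itself stays independent of how you actually control the crawl. If pausing doesn't give the space back, the sketch stops with an error instead of resuming and filling the drive.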

Thoughts:

Releasing space as the crawl goes seems like basic functionality. Does the crawler really have no way to check free space on the drive where its temp files are stored? Why doesn’t it pause itself when the “low disk space” event is triggered? Why are temp files stored on the system drive by default if nothing prevents the crawl from filling the drive and corrupting the index? Am I missing something?

That being said, I’m not convinced our search service application is healthy, so some of this may be specific to our environment.