Troubleshooting SharePoint Search using a Proxy

I recently had a problem with search. I was trying to crawl an external website and the crawl was not returning the results I expected. I needed to troubleshoot SharePoint search at a deeper level, and I did that by using a proxy.

The first place I looked was the Search Administration page of the Search Service Application. As you would expect, the crawl history there shows general information about what went on in the crawl.

While that was helpful, I wanted to see exactly what responses the crawler was receiving during the crawl.

From previous experience I knew that the SharePoint search crawler can be funneled through a web proxy. You can configure the Search Service Application crawl proxy via the GUI: click on the ‘Proxy server’ link in the Search Administration page.

Select ‘Use the proxy server specified’
Address: http://localhost
Port: 8888

and click OK.

If you would rather take the PowerShell route to configuring the SharePoint Search Service, here is how.

First, let’s open up the SharePoint 2010 Management Shell and make the magic happen.

I’ll assume you only have one Search Service Application in the SharePoint farm and that the crawl component runs on the server you are working on.

$var = Get-SPEnterpriseSearchService

Running the following command lists the members of that object. Note the property we need to set, called ‘WebProxy’.

$var | Get-Member
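The full Get-Member listing is long, so it can help to narrow it down. The lines below are a small sketch that assumes $var still holds the object returned by Get-SPEnterpriseSearchService; they only use the standard Get-Member parameters and do not change anything.

```powershell
# Assumes $var was set earlier with: $var = Get-SPEnterpriseSearchService

# Show only the WebProxy member instead of the whole member list.
$var | Get-Member -Name WebProxy

# List the properties of the WebProxy object itself, to see what can be set on it.
$var.WebProxy | Get-Member -MemberType Property
```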


Now we need to set the address of that WebProxy property to http://localhost:8888

$var.WebProxy.Address = "http://localhost:8888"

We can also confirm this in the GUI on the Search Service Application page.
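The value can also be read back in the shell. This is only a quick sketch, and it assumes $var is the same object we fetched with Get-SPEnterpriseSearchService. It shows what is set on the object in the current session, so the GUI check is still worth doing.

```powershell
# Assumes $var still holds the object from Get-SPEnterpriseSearchService.
# Prints the proxy address currently set on that in-memory object.
$var.WebProxy.Address
```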

Now let’s go and download Fiddler, and install it on the server that hosts the crawl component.

If we open Fiddler > Tools > Fiddler Options… > Connections, we can see that Fiddler has a built-in proxy listening on port 8888.

Opening a browser and pointing it to http://localhost:8888 proves that Fiddler is listening. The crawler will, in effect, funnel all of its crawl requests through Fiddler on their way to the final destination. In other words, Fiddler acts as a ‘proxy’.
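If you would rather not open a browser on the server, the same check can be sketched in PowerShell. Test-ProxyListening is a made-up helper name for this post, not something that ships with SharePoint or Fiddler. It only uses the .NET TcpClient class, and the host and port defaults are the ones used above.

```powershell
# Hypothetical helper: returns $true if something accepts TCP connections
# on the given host and port, and $false otherwise.
function Test-ProxyListening {
    param(
        [string]$HostName = "localhost",
        [int]$Port = 8888
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Connect throws if nothing is listening on that port.
        $client.Connect($HostName, $Port)
        return $client.Connected
    }
    catch {
        return $false
    }
    finally {
        # Always release the socket, whether or not the connect worked.
        $client.Close()
    }
}

# Check the default Fiddler port on this server.
Test-ProxyListening -HostName "localhost" -Port 8888
```

A result of True only tells us that something is listening on that port. It does not prove that the listener is Fiddler, so the browser test is still a good sanity check.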

So now let’s go back to Search Administration > Content Sources and kick off a crawl of the content source in question.

As we can see, our crawler is running, and Fiddler gives us a clear view of exactly what responses the crawler is getting as it runs.

