From time to time, particularly from knowledgeable SharePoint users coming up to speed with PowerPivot, I get the question: “Does PowerPivot have ‘cross-farm’ support?” As you can see from the title of this post, we don’t support it – and in this “A Peek Inside” I hope to explain why.
First, what is SharePoint ‘cross-farm’ support and why is it important? In large, complex SharePoint configurations a common requirement is to dedicate servers or farms of servers to specific services. A good example of this approach is to have a separate farm dedicated to Search. Rather than having each end-user farm host its own Search service, the idea is to get better scale through specialization. Content crawling is done remotely; the indexes are kept remotely; and the Search results are calculated remotely. End users connect to the content farms (so-called because that is where the content is stored), but the content farm reaches out to specialized servers/farms for other services. Examples of these services are: Search, Personalization, Business Data Catalog, Portal Usage reporting – and lots more are coming in SharePoint 2010 . . .
Here is an example:
So why can’t we put PowerPivot out on one of the specialized farms? At a cursory level this sounds like a good thing. You could share PowerPivot servers across the whole enterprise rather than having to replicate them within each of the content farms (where the workbooks are stored). Sounds like a great idea. Unfortunately, it isn’t technically possible. For services such as PowerPivot and Excel Services that rely on access to the content, there is no practical way for the remote service to reach back into the content farm to access data. That is OK as far as it goes, but let’s get geeky and dig a bit deeper.
Take a look at the kinds of services that can be spun out to specialized servers. They share a couple of interesting characteristics: they are self-contained, and they are independent of the content itself.
BDC –> Obviously getting access to transactional systems is totally unrelated to the content (.docx, .pptx, .xlsx, etc.) that is stored in the content farms. Typically you are using BDC servers to look up line-of-business reference data, e.g. customer master lists, product catalogs, transactional data, etc. This usually means lookups based on keys, e.g. give me the reference data for customer #42, or give me the reference data for product “abc”, or give me the POS data for transaction #291129211. As there are no references to the content, the BDC service can easily be deployed in its own server farm where it acts as a front-end to underlying corporate databases.
Personalization –> While personalization is related to a certain kind of content, i.e. the SharePoint People and Groups that would be used across various farms, it is also very self-contained. The keys are passed into the service for data lookups, but the actual personalized information is fully contained within the service itself and it has no dependency on the content.
Search –> At first glance, it seems like Search should not be a good candidate for a separate server farm. After all, Search is all about the content. Isn’t it?? As it turns out, it is about the content, but not during the actual lookup process. The Search farm contains the crawlers that index the various SharePoint content farms (and from that point of view, it is all about the content), but it also crawls other sources as well. What Search really needs at run time isn’t the content – it is the indexes about the content. Because these are kept self-contained within the Search farm, Search is a good candidate for its own server farm: at run time it uses the indexes rather than the content itself, so there is no need for Search to reach back into the content farms.
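The rule of thumb running through these examples can be written down as a small Python sketch. The `Service` class and the `needs_content_at_runtime` flag are names made up for illustration, not part of any SharePoint API; the flag values simply restate the descriptions in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Illustrative model of a SharePoint service (hypothetical, not a real API)."""
    name: str
    needs_content_at_runtime: bool  # must it read documents stored in the content farm?


def can_live_in_separate_farm(service: Service) -> bool:
    # A service is a cross-farm candidate only if it is self-contained,
    # i.e. it never has to reach back into the content farm at run time.
    return not service.needs_content_at_runtime


SERVICES = [
    Service("BDC", needs_content_at_runtime=False),              # key lookups against LOB systems
    Service("Personalization", needs_content_at_runtime=False),  # data held inside the service
    Service("Search", needs_content_at_runtime=False),           # queries hit the indexes, not the content
    Service("Excel Services", needs_content_at_runtime=True),    # relies on access to the content
    Service("PowerPivot", needs_content_at_runtime=True),        # relies on access to the content
]

for svc in SERVICES:
    verdict = "candidate" if can_live_in_separate_farm(svc) else "must stay in the content farm"
    print(f"{svc.name}: {verdict}")
```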
So when we were deciding if PowerPivot would be a candidate for cross-farm specialization, we had to look at the need for access to content. And this brings us to the “geek’est” (is that a real word??) part of the post. How do you programmatically access content in SharePoint? There are two different aspects to the problem: getting access to content if data is within the farm; and getting access if data is outside the farm. If the data is outside the farm, then there are three ways:
- Using the web services APIs that SharePoint exposes (flexible but very inefficient at large data sizes)
- Using a standard HTTP GET against the SharePoint URL for where the content is stored (this is what Search crawlers use), e.g. http://<sp_server>/site/subsite/doclib/file.xlsx
- Using WebDAV, which allows you to access content from the SharePoint farm as if it were a large, single file share, e.g. \\<sp_server>\site\subsite\doclib\file.xlsx
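To make the HTTP and WebDAV address forms concrete, here is a minimal Python sketch that builds both from the same pieces. The helper names, the server name `sp01`, and the document path are made up for illustration. `fetch_over_http` is a plain anonymous GET via the standard library; a real farm would normally require authentication, which this sketch leaves out.

```python
from urllib.parse import quote
from urllib.request import urlopen


def http_url(server: str, *segments: str) -> str:
    """Build the HTTP address of a document, e.g. http://<sp_server>/site/subsite/doclib/file.xlsx."""
    path = "/".join(quote(s) for s in segments)  # percent-encode spaces etc.
    return f"http://{server}/{path}"


def webdav_unc_path(server: str, *segments: str) -> str:
    r"""Build the WebDAV (UNC-style) address, e.g. \\<sp_server>\site\subsite\doclib\file.xlsx."""
    return "\\\\" + server + "\\" + "\\".join(segments)


def fetch_over_http(url: str) -> bytes:
    # Plain HTTP GET, assuming anonymous access is allowed (authentication is omitted here).
    with urlopen(url) as response:
        return response.read()


parts = ("site", "subsite", "doclib", "file.xlsx")
print(http_url("sp01", *parts))         # http://sp01/site/subsite/doclib/file.xlsx
print(webdav_unc_path("sp01", *parts))  # \\sp01\site\subsite\doclib\file.xlsx
```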
Unfortunately none of those techniques work if you are within the farm, because firewall rules do not allow back-end servers to loop back into the front-end servers. If the data is within the farm then there is only one way to access content: use the SharePoint object model (aka SP ‘binary OM’), e.g. SPFile.OpenBinaryStream. This is the preferred way of getting data as it is very efficient and accesses the SharePoint content databases directly. It is an order of magnitude (or more) faster than using the outside-the-farm APIs, even if loopback were allowed.
Now here’s the problem. The binary OM only allows you to access content from within the farm. You cannot reach out and retrieve data from a different farm. A remote PowerPivot server farm therefore cannot use the binary OM to access the content farm; its only option would be the ‘outside the farm’ APIs, which are very inefficient and would not perform anywhere near fast enough. Thus for those services that are “content focused”, such as Excel Services and PowerPivot, you have no option but to build their app servers as part of the content farm; cross-farm support is not feasible until considerable changes are made to the SharePoint binary OM.
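The constraint can be summed up in a few lines of Python. This is only a toy model: `available_access_paths` and the farm names are invented for illustration and do not correspond to any SharePoint API.

```python
from typing import List


def available_access_paths(service_farm: str, content_farm: str) -> List[str]:
    """Toy model: which content access paths exist for a service in `service_farm`."""
    if service_farm == content_farm:
        # Inside the farm: only the object model works, because loopback
        # from back-end servers to the front-end servers is blocked.
        return ["object model (SPFile.OpenBinaryStream)"]
    # Across farms: the object model is unavailable, leaving only the
    # slower 'outside the farm' techniques.
    return ["web services", "HTTP GET", "WebDAV"]


print(available_access_paths("content-farm", "content-farm"))
print(available_access_paths("powerpivot-farm", "content-farm"))
```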
