Tag Archives: database

Integrating custom data to SharePoint (Part 1)

This is part 1 of Integrating custom data to SharePoint.

One of the business requirements often presented during a medium-to-large SharePoint implementation project is the integration of custom Line of Business (LOB) data. There are so many options and choices, depending on requirements and scenarios, that giving the one correct answer is nearly impossible. In this post (divided into multiple parts) I’m going to outline the essential enterprise-grade approaches available for integrating custom data with SharePoint.

Since we’re talking about enterprise-grade approaches, I’m ruling out the following options – they are not really suitable for large deployments. Your mileage may vary:

  • SharePoint Designer: While creating data-driven applications with SPD is fairly trivial, it’s not really an option when using multiple SharePoint farms and multiple layers of functionality.
  • Copy-Paste: One could argue that copying content from an internal data source to SharePoint is integration, but it really isn’t. It’s still an option when you just need to visualize static data with minimal costs.
  • Static HTML files/Server-Side Includes: A rather old-school approach to embedding data, but surprisingly often considered as an option. We’ll leave this one in the 1990s.

Consider your data

What kind of data needs to be integrated? Where is the master data located? Can you integrate your data source directly? How will the data be consumed? Is data manipulation going to be two-way? Is real-time access to the data needed?

These are just some of the questions you should ask when considering the options for integrating data to SharePoint.

Typically, data is stored in one or more of the following:

SQL Server database: A solid option. This could be a single table in a database, multiple databases or even a highly customized set of data from numerous data sources. Some or all elements of Business Intelligence might be included, such as data warehouses and analysis services cubes.

Web Service: In true SOA fashion, a consumable web service might be available. This could be a “traditional” SOAP-based web service or a Windows Communication Foundation (WCF) endpoint exposing anything from RESTful services to ATOM- and JSON-based data.

XML content: Sometimes a static XML file or feed is the simplest way to expose and consume data in SharePoint. This could also include a comma-separated values (CSV) file from Excel or files with similar well-known structures.

SQL database (non-SQL Server): Third-party databases sitting on MySQL, Oracle or some similar RDBMS are always great candidates for controversy and hidden technical issues. It’s possible to integrate these, but pay close attention when choosing your approach, as it might yield performance and security issues. Not that these wouldn’t be factors with other integration methods as well.

Windows Azure/SQL Azure: I have yet to see many companies embrace the Azure offerings when it comes to integrating data from the cloud to on-premises SharePoint. It’s still an option that can – and probably should – be considered more often.

Office 365/SharePoint Online: Similarly to Azure, using SharePoint Online to store your data and then integrating it with your on-premises SharePoint is sometimes a business requirement.

Other ways of integrating data from LOB systems are certainly available; these are the ones I seem to encounter the most and therefore think are relevant to include in this post.

Outline the effort

How many days of custom development are you willing to spend on integrating data with SharePoint? Requirements from a business stakeholder might not take into consideration the infrastructural changes, support issues, monitoring needs and licensing costs a simple “integrate A with B” request might entail. And to be honest, they shouldn’t, since the request is almost always based on actual business needs and justification, not on the boundaries of SharePoint or the experience of the delivery team.

When deciding on the optimal way for integration, try to keep the following considerations in mind:

  • Is this something that needs to be done once, and never touched again? Or is this a foundation for future integrations?
  • Who will be responsible for the data? Is it someone from our technical delivery team, or someone who actually owns the original data?
  • What capabilities do we possess, as a delivery team, to provide support and troubleshooting assistance for the LOB data?

So what’s the typical implementation?

A typical request for integrating data might be like this:

A business unit has a legacy SQL Server 2000 database running in a forgotten server somewhere on the corporate network. They now need to integrate several tables and views from said database to the company intranet running with SharePoint 2010.

In addition, the business unit has also decided that they own the data and want to keep updating the database content in the future. This would force us to consume data from their legacy SQL Server 2000, in turn causing serious security and network latency issues during production use.

The best starting point here is to investigate and document the current database functionality and the interfaces provided for applications and users. Armed with this information, one should first stabilize the situation by migrating the database to a more recent and redundant SQL Server environment. You could use an existing SQL Server cluster or set up a dedicated SQL Server instance for the business unit to use. The IT department would probably be happy to get one more legacy application server off their network.
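As a rough sketch, the migration itself can be as simple as a backup/restore cycle – SQL Server 2008 R2 can restore backups taken on SQL Server 2000. The database name, logical file names and paths below are purely illustrative:

```sql
-- On the legacy SQL Server 2000 instance: take a full backup
BACKUP DATABASE [LegacyLOB]
TO DISK = 'D:\Backup\LegacyLOB.bak' WITH INIT;

-- On the new SQL Server 2008 R2 instance: restore, relocating the files
RESTORE DATABASE [LegacyLOB]
FROM DISK = 'D:\Backup\LegacyLOB.bak'
WITH MOVE 'LegacyLOB_Data' TO 'E:\Data\LegacyLOB.mdf',
     MOVE 'LegacyLOB_Log'  TO 'F:\Logs\LegacyLOB.ldf';

-- Once the applications are verified, raise the compatibility level
-- (the restored database keeps its old level, 80, by default)
ALTER DATABASE [LegacyLOB] SET COMPATIBILITY_LEVEL = 100;
```

Verify logins, jobs and any cross-database dependencies separately – those do not travel inside the database backup.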

This – in essence – is the tactic of cleaning your own backyard first and doing the fancy integration stuff afterwards. Suddenly your integration project looks a lot more like a traditional IT project: decommissioning servers, patching legacy operating systems, finding new virtual machines and budgeting for new hardware. That’s the way it goes; try to foresee this and apply for enough budget to cover the time and effort to do it properly.

Assuming we manage to migrate the old database from SQL Server 2000 to more robust hardware running SQL Server 2008 R2, we can resume our original integration planning and implementation.

For SQL Server-based databases, the following options are available in order of preference:

  1. Use ADO.NET to directly consume data sources from your UI component (a web part or similar)
  2. Use External Content Types (by way of Business Connectivity Services) to connect with the data
  3. Create a custom WCF-endpoint to consume data directly from SharePoint (a web part is a typical approach for this)
  4. Create a scheduled task or a timer job to move data from the legacy database to a SharePoint list

One might argue that using ECTs should be the primary means of integration with SharePoint. While this might be true for certain scenarios, it often adds to the overall complexity and might simply be overkill. It depends on whether you need two-way data access, and whether you plan on using the integrated data as metadata or with search. There are also performance and scalability boundaries to consider.

Using ADO.NET is fairly simple and allows full freedom to integrate the data as best fits your needs. If the underlying schema changes, it’s easy to create abstraction layers for data handling or modify the logic in the UI component.
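A minimal sketch of the ADO.NET route might look like the following – the server, database, table and class names are all hypothetical, and in a real web part the connection string would come from configuration rather than being hard-coded:

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical data-access layer a web part could call.
// Keeping the query here means a schema change only touches this class.
public static class LegacyOrderRepository
{
    private const string ConnectionString =
        "Data Source=SQLCLUSTER01;Initial Catalog=LegacyLOB;Integrated Security=SSPI";

    public static DataTable GetOrders()
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(
            "SELECT OrderId, Customer, Total FROM dbo.Orders", connection))
        using (var adapter = new SqlDataAdapter(command))
        {
            var table = new DataTable();
            adapter.Fill(table); // Fill opens and closes the connection as needed
            return table;
        }
    }
}
```

The web part can then bind the returned DataTable to an SPGridView or similar control; swapping the data source later only requires changes behind this one class.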

More to follow in part 2 – stay tuned.

Using DBCC CHECKDB against SharePoint – what’s the story here?

You are about to turn off the monitor and head home for the weekend when you get that last email you just have to check. Oops. It seems there’s a SharePoint farm that’s not feeling too well. Apparently, due to infrastructure problems, all servers were forcefully rebooted and now SQL Server is yelling this dozens of times per second:

2010-12-01 16:40:32.30 spid57 * BEGIN STACK DUMP:
2010-12-01 16:40:32.30 spid57 * 12/01/10 16:40:32 spid 57
2010-12-01 16:40:32.30 spid57 *
2010-12-01 16:40:32.30 spid57 * CPerIndexMetaQS::ErrorAbort – Index corruption

Index corruption is not a good sign, especially in SharePoint databases. People often treat these DBs as black boxes – they just sit there, and you’d better not even look at them in case something goes haywire. Doing so would surely break the support clauses, and the world as we know it would end, right now.

As it turns out, there’s a Cumulative Update 7 for SQL Server 2008 R2 addressing the dreaded CPerIndexMetaQS error. I hate patching servers, since I might be introducing even more issues while not fixing the original error(s). Besides, blindly patching everything even remotely related to the error messages you initially see just doesn’t feel right. SharePoint is often about the gut feeling, really.

Luckily, DBCC CHECKDB is readily available with SQL Server. According to MSDN, it “checks the logical and physical integrity of all the objects in the specified database”. The full reference for DBCC CHECKDB is available on MSDN.

Can DBCC CHECKDB be used against any SharePoint database? What about the numerous options like REPAIR_REBUILD and REPAIR_ALLOW_DATA_LOSS?

The KB article on support for changes to the SharePoint databases was last updated April 18th, 2010, and it lists the following among the unsupported operations:

Running DBCC_CHECKDB WITH REPAIR_ALLOW_DATA_LOSS (However, running DBCC_CHECKDB WITH REPAIR_FAST and REPAIR_REBUILD is supported, as these commands only update the indexes of the associated database.)

Also, the older whitepaper from 2009 states:

You can use the DBCC (Database Console Command) CHECKDB statement to perform an internal consistency check of the data and index pages, and to repair errors.

This is good stuff, since DBCC CHECKDB with REPAIR_REBUILD often fixes a lot of issues, allowing me to work within the boundaries of the SharePoint support realm. It’s also faster than going through the whole restore crusade.
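A sketch of that supported repair flow is below. Note that, despite the KB article’s “WITH REPAIR_REBUILD” phrasing, the repair option is actually passed as an argument to DBCC CHECKDB, and running any repair requires the database to be in single-user mode. The content database name is illustrative:

```sql
-- Step 1: check integrity first, without repairing (safe to run online)
DBCC CHECKDB ('WSS_Content') WITH NO_INFOMSGS;

-- Step 2: if errors were reported, switch to single-user mode
-- and run the supported, non-data-loss repair
ALTER DATABASE [WSS_Content] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB ('WSS_Content', REPAIR_REBUILD);
ALTER DATABASE [WSS_Content] SET MULTI_USER;
```

If CHECKDB then reports that only REPAIR_ALLOW_DATA_LOSS would fix the remaining errors, you are outside the supported options for SharePoint databases and restoring from backup is the way forward.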