Are Your SharePoint Libraries File Dumps?

When we first started to use SharePoint five years ago, our focus was to replace our external Internet facing sites with a more consistent look and feel experience for end-users.  We did not have a lot of collaboration sites, but we did support a few, particularly among the groups within IT.  We definitely did not spend a lot of time providing training on how you should use SharePoint libraries.  Looking back, that was probably because we did not have a really good idea of what worked and what should be avoided.  Therefore, since most of the collaboration site users were IT veterans, they designed their own libraries focused around a paradigm they knew well, a hierarchy of folders within folders within folders…

It took a few months, but we soon started to hear about problems such as:

  • I know I saved my file, but I just don’t remember where.  Can you help me find it?
  • We just discovered that there are two copies of the monthly status report stored in different folders in our site.  Each one has updates that are unique.  Can you help us merge the documents back into a single document?
  • I saved our documents in our project folder, but now when I attempt to access the file with its URL, the browser refuses to find it.  What is wrong?

Many of these problems were the direct result of the users having created a complex folder hierarchy that not everyone was familiar with.  We also found several cases where someone saved a document in a different folder specifically because they could not save the file back to the original library.  Perhaps they could not find the original directory again or maybe someone else had the file locked.  Okay, the situation here was that they turned off automatic checkout for the library because checking files out to edit them and then checking them back in was ‘just too much trouble’.

SharePoint 2007 with Microsoft Office 2007 did provide a ‘default’ lock to the file.  However, that default lock released after about 20 minutes or so depending on the client’s OS.  Therefore, conflicts could easily occur.  One of the advantages of moving to SharePoint 2010 and Microsoft Office 2010 is that this dynamic duo now supports multiple editors in the same document at the same time.  Yes, you can now have two or more people edit the same document at the same time.  They just cannot edit the same paragraph at the same time.  My book: Office and SharePoint 2010 User’s Guide from Apress covers this topic in detail.  (BTW, to the person who gave the book a bad review because some copies of the book went out from the publisher with the wrong cover: thanks for letting us know, and I’m sure Apress would have sent you a corrected copy had you contacted them.)

Another solution to their problem of not finding a document would be to use Search to find the document.  However, our users had such a bad experience with the web search engine prior to our switch to SharePoint that they did not even try it.  They would try to manually step through the library folders to find their documents.  I’ve seen libraries with 8 or more levels of folders.  Even users who tried to use Search were not really sure how they could fine-tune their search string and so were often left with hundreds of references in their search results.

Ultimately after stumbling through several different attempts at solutions, we converged on a solution that many others had proposed and we thought we would give it a try.  That solution was to flatten the structure of the library eliminating most if not all of the folders.  Then we would replace that structure with a set of metadata columns to sort and filter the documents visible using the sort and filter options found in the column headers of the library list view.

We experimented with a library that held a limited number of document types, but for which every person in every department had to create new instances of the documents on a fairly regular basis.  We created one metadata column to identify the department, another to identify the type of document, and a third to identify the year the document was created.  Then we got rid of the folders and defined a set of views for each document type with groupings on department and year.  Now instead of hunting through hundreds of documents, users could quickly find any document of a specific type for a specific department and year within seconds.
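
To make the idea concrete, here is a minimal Python sketch of a flattened library.  It is not SharePoint code; the Document class, the column names, and the sample file names are all made up for illustration.  It only shows how three metadata columns can stand in for a folder tree: filtering plays the role of the column header filters, and grouping plays the role of a grouped view.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    # Hypothetical stand-in for a library item and its metadata columns.
    name: str
    department: str
    doc_type: str
    year: int


def filter_documents(library, **criteria):
    """Return documents whose metadata matches every given column value."""
    return [
        doc for doc in library
        if all(getattr(doc, column) == value for column, value in criteria.items())
    ]


def group_documents(library, *columns):
    """Group document names by the given columns, like a grouped view."""
    groups = defaultdict(list)
    for doc in library:
        key = tuple(getattr(doc, column) for column in columns)
        groups[key].append(doc.name)
    return dict(groups)


# Made-up sample data: one flat list, no folders.
library = [
    Document("travel-jan.docx", "Finance", "Travel Request", 2010),
    Document("travel-feb.docx", "IT", "Travel Request", 2010),
    Document("review-smith.docx", "IT", "Performance Review", 2010),
    Document("review-jones.docx", "IT", "Performance Review", 2009),
]

# Equivalent of filtering on three column headers at once.
it_reviews_2010 = filter_documents(
    library, department="IT", doc_type="Performance Review", year=2010
)

# Equivalent of a view for one document type, grouped by department and year.
reviews_by_dept_year = group_documents(
    filter_documents(library, doc_type="Performance Review"), "department", "year"
)
```

Every document lives in exactly one place, so there is no second folder in which a stray copy can hide.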

Next we took some other libraries and did the same type of thing.  One library held our monthly newsletters.  In that library, we saved both the newsletter source documents which happen to be from Microsoft Publisher and the published version of the newsletter which is a PDF.  We wanted to display the PDFs on a web page so that employees could just go out and click on a PDF link to view the newsletter.  We also decided not to send out copies of the PDF to our newsletter subscribers.  Rather we would just send them a link to the current issue saving storage needs on the Exchange server that no longer needed to store emails with large newsletter attachments.

But then we ran into a problem.  In creating the page to display the newsletters, we decided to use a Content Query web part.  With this web part, we could just point to a library and display all of the contents from that library on the page as links.  The cool part of creating the page this way is that when we add new documents to the library, they automatically appear on the web page displayed to the users without our having to edit the page.   Obviously we did not want to display all of the files from the library.  We only wanted to include references to the PDF files, not the PUB source files.  The obvious answer was to filter the web part on a metadata column that contained either the word: Newsletter or Source.  Okay, the actual word does not matter.  What mattered is we subsequently found out that simple user defined columns could not be used to filter the data in the Content Query web part.

If we created the metadata column for the library by simply creating a new column for the library and populating the column with one of the two document types (Newsletter or Source), we could not reference that field in the filter portion of the content query web part.  After a lot of experimentation, we found that a field called Category located in the Site Collection columns could be added to our library and when we used it, the Category field appeared in the filter selection of the web part and allowed us to correctly filter the data.

For some time, we thought that was the solution.  By using this Site Collection column, we could then filter the documents in the Content Query web part.  We could even rename the column and it would still work.  One day while working on a different library, I just happened to have a field in the library defined at the Site Collection level.  I was surprised to see that field appear in the list of possible fields I could filter on in the Content Query web part.

So I tried a couple of other libraries, adding fields first to the Site Collection columns and then adding the field to the library as a new metadata column.  Each time it worked.  So for some reason that I have not been able to discover, it appeared to us that the Content Query web part filter only works on columns defined at the site collection level.
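
The behavior we observed can be summarized in a small Python sketch.  This is only a model of what we saw, not how the Content Query web part is actually implemented, and the Library class, the DocType column, and the file names are hypothetical.  The one rule it encodes is the one from our experiments: a column shows up as a filter choice only if it was added from the Site Collection columns.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    # Hypothetical model: column name -> True if the column was added from
    # the Site Collection columns, False if created directly on the library.
    columns: dict
    items: list = field(default_factory=list)


def filterable_columns(library):
    """Columns offered as filters, according to what we observed."""
    return sorted(
        name for name, is_site_column in library.columns.items() if is_site_column
    )


def query_links(library, column, value):
    """Return matching file names, or raise if the column is not offered."""
    if column not in filterable_columns(library):
        raise ValueError(f"{column!r} is not available as a filter")
    return [item["FileName"] for item in library.items if item.get(column) == value]


# Made-up newsletter library: DocType was created on the library itself,
# Category came from the Site Collection columns.
newsletters = Library(
    columns={"DocType": False, "Category": True},
    items=[
        {"FileName": "2011-05.pub", "DocType": "Source", "Category": "Source"},
        {"FileName": "2011-05.pdf", "DocType": "Newsletter", "Category": "Newsletter"},
    ],
)

# Only the PDF is returned; the Publisher source file is filtered out.
links = query_links(newsletters, "Category", "Newsletter")
```

Asking the same question of the library-level DocType column raises an error in this model, which mirrors the column simply not appearing in the web part's filter list.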

Whatever the reason, it has formed the basis of our current technique for defining metadata columns in libraries.  Now when our group works with other teams both in and outside of IT, we help them to define the metadata columns that will minimize or eliminate folders while helping them organize their documents.  If the column has the potential of being used as a filter in a Content Query web part, we first define the metadata column as a Site Collection column.  This has worked well for us since then.

Does this mean that we never use folders?  No.  However, the only reason we currently resort to folders within a library is to group files that need special permissions.  While we strongly recommend that our users not create special permissions on individual documents in a library because this has a known negative effect on performance, we do recommend that they apply special permissions to folders and then place documents that need those permissions in that folder.
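
A rough way to see the difference is to count how many separate permission settings someone has to create and maintain under each approach.  The Python sketch below does only that counting; it says nothing about SharePoint internals, and the file names and audiences are made up.

```python
def settings_per_document(restricted_docs):
    """One special permission setting for every restricted document."""
    return len(restricted_docs)


def settings_per_folder(restricted_docs):
    """One special permission setting per audience, assuming one folder each."""
    return len({audience for _, audience in restricted_docs})


# Made-up data: (file name, audience that is allowed to see it).
restricted_docs = [
    ("salary-2010.xlsx", "HR Managers"),
    ("salary-2011.xlsx", "HR Managers"),
    ("audit-q1.docx", "Auditors"),
    ("audit-q2.docx", "Auditors"),
]

per_document = settings_per_document(restricted_docs)
per_folder = settings_per_folder(restricted_docs)
```

With this sample data the per-document approach needs four settings and the per-folder approach needs two, and the gap widens as more restricted documents are added for the same audiences.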

The bottom line for us is that folders are ‘bad’ except to provide special permissions within a library when multiple libraries are not an option.  A better way to search for your documents is to use filters and sorts operating against metadata columns that categorize those files.  Furthermore, it minimizes the chance of creating duplicate copies of the same file.

See you next time.

