For the next two weeks, I’m going to go back to a SharePoint topic that is very important to me. It is one topic that many people have trouble with. That topic is the use of metadata in libraries rather than nested hierarchies of folders. Many of you know that I have been an advocate of using metadata to organize document libraries for some time. I have tried to convince users that folders are bad. In fact, folders within folders within folders down through a half dozen levels or more are not only bad, but I maintain that they are just plain evil. In this blog entry and the next, I will explore some of the reasons why you want to avoid folders and never look back.
The problem with folders goes back to the early days of the personal computer, when people first learned how to store files. Originally, users tried to store everything in one large directory that we know as the root directory of a disk drive. This practice soon ran into physical limits on the size of the root directory and the number of file entries it could contain. Rather than placing all of their files in the root directory of their hard disk (which many people actually did or at least tried to do), computer users were then taught to think of folders on their hard disk drives as folders in a filing cabinet. Each file folder in the cabinet could hold actual documents, letters, etc. Each folder could also hold other folders, which could then hold documents or even additional folders. This paradigm has become almost second nature to most computer users and is still in use today.
Unfortunately, as the number of folders and documents grew over time, it became easy to ‘lose’ things. Microsoft, Apple, and other operating system vendors attempted to create utilities that could be used to search for documents. However, these were often inefficient or, at the very least, slow. SharePoint introduces a new paradigm to store documents without the use of folders while making it easy to find specific documents you may want. This paradigm is based largely on metadata that describes the contents of the documents. While I will show here how to manually use metadata to organize and find documents in a library, the new search capabilities of SharePoint can also use the metadata to help narrow down your search results. So with that said, let’s get started.
One of the problems with folders is that the number of nested levels tends to grow over time like a Mandelbrot set (think fractal image). The result may look pretty, but it is almost impossible to navigate from one place to another. In the land of libraries, users have trouble remembering whether the path to the file they want is:
Department -> Group -> Project -> Task -> Year -> User -> Category
or
Department -> Group -> Project -> Category -> User -> Year -> Task
As a result, users may save the same document in two or more places. At that point, some users may update one version while other users update the other version. Soon no one document has all of the updates. Even if someone notices the problem and identifies each copy, the task of consolidating the versions back to a single version of the truth can be difficult if not impossible.
Another problem is the length of the URL needed to locate a file. Several Internet sources pin the size limit of a URL within SharePoint using Internet Explorer at only 2083 characters. That sounds like a lot of characters. It would be, except that other sources say the limit of a referenced URL, including lists and folders, is a mere 256 characters. Keep in mind that a URL is often more than just the path to the file, because query string information is tacked on to the end of many URL references. In addition, there are other related limits. For example, a site name might be limited to 128 characters. Document library names can grow to only 255 characters, while a folder name is limited to 123 characters. The bottom line is that it is possible to create an overall path to a file in a document library that is too long if you ‘cheat’ and use Windows Explorer mode to build, populate, and work with the files in your SharePoint library. If you do, other users who work through the SharePoint interface may have a problem when they attempt to access files buried deep under many folder levels with long folder and document names.
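To see how quickly a nested path eats up characters, here is a minimal Python sketch. It is only an illustration: the site address, folder names, and file name are made up, and the 256-character limit is simply the figure quoted above, not a value confirmed for any particular SharePoint version.

```python
from urllib.parse import quote

# Assumed limit, taken from the figure quoted above. The real limit depends
# on the SharePoint version and the browser in use.
ASSUMED_URL_LIMIT = 256


def library_url(site_url, library, folders, filename):
    """Build the URL of a file stored under a list of nested folders."""
    parts = [library, *folders, filename]
    # Spaces are percent-encoded as %20, so every space in a folder or file
    # name costs three characters in the final URL.
    return site_url.rstrip("/") + "/" + "/".join(quote(p) for p in parts)


def exceeds_limit(url, limit=ASSUMED_URL_LIMIT):
    """Return True when the URL is longer than the assumed limit."""
    return len(url) > limit


# Hypothetical site, folders, and file name, used only for illustration.
SITE = "https://intranet.example.com/sites/finance"
FILENAME = "Third Quarter Financial Statement Final Draft.docx"
FOLDERS = [
    "Research Department",
    "Grants Management Group",
    "Community Health Project",
    "Quarterly Reporting Task",
    "2010",
    "Jane Smith",
    "Financial Statements",
]

deep_url = library_url(SITE, "Shared Documents", FOLDERS, FILENAME)
flat_url = library_url(SITE, "Shared Documents", [], FILENAME)

print(len(deep_url), exceeds_limit(deep_url))
print(len(flat_url), exceeds_limit(flat_url))
```

With seven descriptive folder levels the nested URL goes past the assumed limit, while the same file stored directly in the library stays well under it.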
Both of these points make valid arguments for not using folders or at least minimizing their use but, like that TV infomercial, there is more. For this third argument, I will use a simplified version of a document library to build the case for using metadata while also touching on how the above two arguments are addressed. Our overall goal is to make finding and organizing your files so much more efficient that you will not be able to wait to get rid of your folders.
In the above figure, you see a small portion of the folders within a folder named Grants. Imagine, however, that there could be dozens if not hundreds of grants rather than only the four shown here. If we just dropped all documents related to a grant into the corresponding grant folder, at least we would have all documents grouped together by grant rather than in one large pile. However, because we could eventually have hundreds of documents within a single grant, we may decide to create a second level of folders to organize the files by file category or file type as shown in the next figure.
Perhaps we also have folders within each file type for the year that the document was created. We may even have other levels, but let’s stop the madness here for now.
The first point I want to make about this type of structure is that it is difficult to add more levels or even more folders within a level. For example, suppose we needed to add a category type of Resolution as shown in the following figure.
Sure, we could add this new category to the one grant that needs it. However, if we do that and someone managing a different grant needs a similar folder and calls it Decisions instead of Resolution, how would we know whether the files in these two folder types were the same or different? Even worse, what if someone added both folders to the same grant? Where would you look first to find files of this file type? In addition, should we add this folder to all grants, whether it is needed or not, to give each of the grant folders a consistent structure?
In addition to adding the folder itself, consider our example in which the level under the document type is year. Each of the other document type folders has one folder for each year as shown here.
To be strictly correct, would we need to add folders for years under Resolution as well? If the year folder also had subfolders beneath it, the same argument would apply to each of those levels as well.
Imagine the amount of work adding these folders might entail if our grants folder consisted of dozens or a few hundred grants. Managing such a structure would be burdensome. On the other hand, failure to follow through would leave each grant folder with a different structure, making it more difficult to find documents.
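A quick back-of-the-envelope calculation shows the scale of the problem. The Python sketch below uses made-up numbers (200 grants, 5 document types, 5 years), so treat the results as an illustration rather than a measurement of any real library.

```python
def folders_needed(grants, doc_types, years):
    """Count the folders in a Grant -> Document Type -> Year hierarchy."""
    # Each grant has its own folder, one folder per document type,
    # and one year folder inside every document type folder.
    per_grant = 1 + doc_types + doc_types * years
    return grants * per_grant


def folders_added_for_new_type(grants, years):
    """Count the folders created when one document type is added to every grant."""
    # One new document type folder plus its year folders, in every grant.
    return grants * (1 + years)


# Hypothetical library size, chosen only for illustration.
print(folders_needed(grants=200, doc_types=5, years=5))   # prints 6200
print(folders_added_for_new_type(grants=200, years=5))    # prints 1200
```

Under these assumptions, adding a single category such as Resolution consistently across the library means creating 1,200 new folders by hand.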
So what can we do instead? Within SharePoint, you have the ability to add columns to the default columns provided by SharePoint when you create a new library or list. These additional columns can have different data types such as dates, numbers, text, etc. If we define these columns so that they separate data much like folders do, we can create a structure that functionally gives us the same benefits as folders and much more as well.
To begin defining additional columns, referred to as metadata for the library, we should look first at the types of subfolders used within grants. The obvious choices are document type and year. So, we might want to begin by modifying the library structure to add both of these columns to our library. Perhaps you think that you need a character type column for the document type and a numeric type column for the year. A good rule of thumb in database column definition that applies here is:
If a field contains content that looks like numbers, but is never used in calculations, store it as a character field.
For that reason, fields like zip codes, course numbers, and many others are stored as character fields. After all, when was the last time you needed to add two zip codes together?
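The rule is easy to demonstrate outside of SharePoint. The short Python sketch below is a generic illustration, not SharePoint code: it shows what happens to a zip code with a leading zero when it is stored as a number, and that years stored as text still sort correctly.

```python
zip_code = "02134"            # stored as text, the leading zero is kept

as_number = int(zip_code)     # stored as a number, the leading zero is lost
print(as_number)              # prints 2134

round_trip = str(as_number)
print(round_trip == zip_code) # prints False

# Four-digit years stored as text still sort in the expected order.
years = ["2010", "2008", "2009"]
print(sorted(years))          # prints ['2008', '2009', '2010']
```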
You can see an example of what the library list might look like in the above figure. However, we do not have to stop there. We can create additional columns to provide more data about each document such as the business area of the grant, the application area, the fund number, or even a sub-category. Since metadata means data about data, we can add just about any information that might be useful to the user so that it is not only visible but, as I will show next time, can also be used in ad-hoc filters and sorts as well as permanent views.
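To give a feel for why metadata beats a folder path, here is a small Python sketch that treats each document as a record with metadata fields. This is not SharePoint code; the file names, the column names (grant, doc_type, year), and the find helper are all hypothetical and exist only to illustrate the idea.

```python
# Each document is a flat record; the "columns" replace the folder levels.
documents = [
    {"name": "Award Letter.docx", "grant": "Grant A", "doc_type": "Correspondence", "year": "2009"},
    {"name": "Budget.xlsx", "grant": "Grant A", "doc_type": "Financial", "year": "2010"},
    {"name": "Progress Report.docx", "grant": "Grant A", "doc_type": "Report", "year": "2010"},
    {"name": "Grant B Budget.xlsx", "grant": "Grant B", "doc_type": "Financial", "year": "2010"},
    {"name": "Board Resolution.pdf", "grant": "Grant B", "doc_type": "Resolution", "year": "2009"},
]


def find(docs, **criteria):
    """Return the documents whose metadata matches every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# The order of the criteria does not matter, unlike the order of folder levels.
by_grant_then_year = find(documents, grant="Grant A", year="2010")
by_year_then_grant = find(documents, year="2010", grant="Grant A")
print(by_grant_then_year == by_year_then_grant)  # prints True

# Slice across all grants at once, something a folder tree cannot do directly.
financials = find(documents, doc_type="Financial")
print([d["name"] for d in financials])  # prints ['Budget.xlsx', 'Grant B Budget.xlsx']

# Sort by year, then by name, without moving any files.
for d in sorted(documents, key=lambda d: (d["year"], d["name"])):
    print(d["year"], d["name"])
```

Notice that adding a new document type such as Resolution required nothing more than a new value in one field, and that each document exists exactly once no matter how you choose to look for it.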