While WinFS and its shared type schema make it possible for an application to recognize the different data types, the application still has to be coded to render them. Consequently, WinFS would not allow development of a single application that can view or edit all data types; rather, what WinFS enables is for applications to understand the structure of all data and extract the information that they can use further. When WinFS was introduced at the 2003 Professional Developers Conference, Microsoft also released a video presentation, named IWish, showing mockup interfaces that demonstrated how applications would expose interfaces that take advantage of a unified type system. The concepts shown in the video ranged from applications using the relationships of items to dynamically offer filtering options to applications grouping multiple related data types and rendering them in a unified presentation.
WinFS was billed as one of the pillars of the "Longhorn" wave of technologies, and would ship as part of the next version of Windows. It was subsequently decided that WinFS would ship after the release of Windows Vista, but those plans were shelved in June 2006, with some of its component technologies being integrated into upcoming releases of ADO.NET and Microsoft SQL Server. While it was then assumed by observers that WinFS was finished as a project, in November 2006 Steve Ballmer announced that WinFS was still in development, though it was not clear how the technology was to be delivered.
Because a file system has no knowledge about the data it stores, applications tend to use their own, often proprietary, file formats. This hampers sharing of data between multiple applications. It becomes difficult to create an application which processes information from multiple file types, because the programmers have to understand the structure and semantics of all the files. Using common file formats is a workaround to this problem but not a universal solution; there is no guarantee that all applications will use the format. Data with standardized schema, such as XML documents and relational data, fare better as they have a standardized structure and run-time requirements.
Also, a traditional file system can retrieve and search data based only on the filename, because the only knowledge it has about the data is the name of the files that store the data. A better solution is to tag files with attributes that describe them. Attributes are metadata about the files, such as the type of file (document, picture, music and so on), its creator, etc. This allows files to be searched for by their attributes, in ways not possible using a folder hierarchy, such as finding "pictures which have person X". The attributes can be recognized either by the file system natively or via some extension. Desktop search applications take this concept a step further. They extract data, including attributes, from files and index it. To extract the data, they use a filter for each file format. This allows for searching based on both the file's attributes and the data in it.
However, this still does not help in managing related data, as disparate items do not have any relationships defined. For example, it is impossible to search for "the phone numbers of all persons who live in Acapulco and each have more than 100 appearances in my photo collection and with whom I have had e-mail within last month". Such a search requires a data model in which both the semantics and the relationships of the data are defined. WinFS aims to provide such a data model and the runtime infrastructure that can be used to store the data as well as the relationships between data items according to the data model, doing so at a satisfactory level of performance.
WinFS promotes sharing of data between applications by making the data types accessible to all applications, along with their schemas. Any application that wants to use a WinFS type can use the schema to find out the structure of the data and utilize the information. An application thus has access to all data on the system, even though the developer did not have to write parsers to recognize the different data formats. It can also use the relationships and related data to create dynamic filters that present the information the application deals with in different ways. The WinFS API further abstracts the task of accessing data. All WinFS types are exposed as .NET objects, with the properties of the object directly mapping to the properties of the data type. Also, by letting different applications which deal with the same data share the same WinFS data instance rather than storing the same data in different files, the hassles of synchronizing the different stores when the data changes are removed. Thus WinFS helps reduce redundancies.
Access to all the data in the system allows complex searches to be performed across all the data items managed by WinFS. In the example used above ("the phone numbers of all persons who live in Acapulco and each have more than 100 appearances in my photo collection and with whom I have had e-mail within last month"), WinFS can traverse the subject relationship of all the photos to find the contact items. Similarly, it can filter all emails from the last month and follow the communicated-with relation to reach the contacts. The common contacts can then be determined from the two sets of results, and their phone numbers retrieved by accessing the suitable property of the contact items.
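The steps of that search can be sketched in Python. This is only an illustration under stated assumptions: the classes `Contact`, `Photo` and `Email`, their field names, and the 30-day reading of "last month" are all hypothetical and are not part of the WinFS API.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Contact:
    # Hypothetical contact item with the properties the query needs.
    name: str
    city: str
    phone: str


@dataclass
class Photo:
    subjects: list  # contacts reachable through the "subject" relationship


@dataclass
class Email:
    sent_on: date
    communicated_with: list  # contacts reachable through "communicated with"


def phones_of_matching_contacts(photos, emails, city, min_photos, today):
    # Step 1: traverse the subject relationship of every photo and count
    # how often each contact appears.
    appearances = Counter(c for p in photos for c in p.subjects)
    in_photos = {c for c, n in appearances.items()
                 if n > min_photos and c.city == city}
    # Step 2: keep only recent emails (assumed: the last 30 days) and follow
    # the communicated-with relation to reach contacts.
    cutoff = today - timedelta(days=30)
    emailed = {c for e in emails if e.sent_on >= cutoff
               for c in e.communicated_with}
    # Step 3: the contacts common to both result sets, then read the property.
    return sorted(c.phone for c in in_photos & emailed)


alice = Contact("Alice", "Acapulco", "555-0100")
bob = Contact("Bob", "Acapulco", "555-0101")
carol = Contact("Carol", "Oslo", "555-0102")
photos = [Photo([alice, carol]) for _ in range(3)] + [Photo([bob])]
emails = [Email(date(2006, 6, 20), [alice, bob]),
          Email(date(2006, 1, 1), [carol])]
result = phones_of_matching_contacts(
    photos, emails, city="Acapulco", min_photos=2, today=date(2006, 6, 30))
```

With this sample data only Alice satisfies all three conditions, so `result` holds just her phone number.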
WinFS, in addition to fully schematized data (like XML and relational data), supports semi-structured data (like images, which have an unstructured bitstream plus structured metadata) as well as unstructured data (like files). It stores the unstructured components directly as files while storing the structured metadata in the structured store. WinFS internally uses a relational database to manage the data. However, it does not require the data to belong to any particular data model, such as relational or hierarchical; the data can follow any well-defined schema. The WinFS runtime maps the schema to a relational modality, by defining the tables it will store the types in and the primary keys and foreign keys that would be required to represent the relationships. WinFS includes mappings for object and XML schemas by default; mappings for other schemas need to be specified. Object schemas are specified using XML; WinFS generates code to surface the schemas as .NET classes. ADO.NET can be used to directly specify the relational schema, though a mapping to the object schema needs to be provided to surface it as classes. All relationship traversals are performed as joins on these tables. WinFS also automatically creates indexes on these tables to facilitate fast access to the information. Indexes significantly speed up joins, so traversing relationships to retrieve related data is fast. Indexes are also used when searching and querying, so that those operations complete quickly, much like in desktop search systems.
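The general technique of mapping item types to tables, relationships to foreign keys, and traversals to indexed joins can be illustrated with a small Python sketch. SQLite is used here purely as a convenient stand-in for the relational engine, and the table and column names are invented for the example; none of this reflects the actual tables WinFS generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per item type; each instance becomes a row.
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE picture (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- A relationship is represented by foreign keys into the item tables.
    CREATE TABLE picture_subject (
        picture_id INTEGER NOT NULL REFERENCES picture(id),
        contact_id INTEGER NOT NULL REFERENCES contact(id)
    );
    -- Indexes on the foreign keys keep relationship traversal (a join) fast.
    CREATE INDEX idx_subject_contact ON picture_subject(contact_id);
    CREATE INDEX idx_subject_picture ON picture_subject(picture_id);
""")
conn.executemany("INSERT INTO contact VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO picture VALUES (?, ?)",
                 [(10, "Beach"), (11, "Party")])
conn.executemany("INSERT INTO picture_subject VALUES (?, ?)",
                 [(10, 1), (11, 1), (11, 2)])


def pictures_of(name):
    """Traverse the subject relationship from a contact to its pictures."""
    rows = conn.execute(
        """SELECT p.title
           FROM picture p
           JOIN picture_subject s ON s.picture_id = p.id
           JOIN contact c ON c.id = s.contact_id
           WHERE c.name = ?
           ORDER BY p.title""",
        (name,),
    )
    return [title for (title,) in rows]


alice_pictures = pictures_of("Alice")
bob_pictures = pictures_of("Bob")
```

Here the "pictures of Alice" traversal is nothing more than a two-step join through the relationship table.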
The development of WinFS is an extension to a feature which was initially planned in the early 1990s. Dubbed Object File System, it was supposed to be included as part of Cairo. OFS was supposed to have powerful data aggregation features. But the Cairo project was shelved, and with it OFS. However, later during the development of COM, a storage system, called Storage+, based on then-upcoming SQL Server 8.0, was planned, which was slated to offer similar aggregation features. This, too, never materialized, and a similar technology, Relational File System (RFS), was conceived to be launched with SQL Server 2000. However, SQL Server 2000 ended up being a minor upgrade to SQL Server 7.0 and RFS was not implemented.
But the concept was not scrapped. It just morphed into WinFS. WinFS was initially planned for inclusion in Windows Vista, and build 4051 of Windows Vista, then called by its codename "Longhorn", given to developers at the Microsoft Professional Developers Conference in 2003, included WinFS, but it suffered from significant performance issues. In August 2004, Microsoft announced that WinFS would not ship with Windows Vista; it would instead be available as a downloadable update after Vista's release.
On August 29, 2005, Microsoft quietly made Beta 1 of WinFS available to MSDN subscribers. It worked on Windows XP, and required the .NET Framework to run. The WinFS API was included in the System.Storage namespace. The beta was refreshed on December 1, 2005 to be compatible with version 2.0 of the .NET Framework. WinFS Beta 2 was planned for some time later in 2006, and was supposed to include integration with Windows Desktop Search, so that search results include results from both regular files and WinFS stores, as well as allow access of WinFS data using ADO.NET.
However, on June 23, 2006, the WinFS team at Microsoft announced that WinFS would no longer be delivered as a separate product, and that some components would be brought under the umbrella of other technologies: the object-relational mapping components into the ADO.NET Entity Framework; support for unstructured data, adminless mode of operation, support for file system objects via the FILESTREAM data type, and hierarchical data into SQL Server 2008, then codenamed Katmai; integration with Win32 APIs and the Windows Shell, and support for traversal of hierarchies by traversing relationships, into later releases of Microsoft SQL Server; and the synchronization components into Microsoft Sync Framework. However, having a shared-schema storage system built into a future iteration of Microsoft Windows has not been ruled out.
With that announcement, most analysts assumed that the WinFS project was being killed off. But in November 2006, Steve Ballmer said in an interview that WinFS was still being actively developed, but that integration into the Windows codebase would come only after the technology had fully incubated. It was subsequently confirmed in an interview with Bill Gates that Microsoft planned to migrate applications like Windows Media Player, Windows Photo Gallery, Microsoft Office Outlook, etc. to use WinFS as the data storage back-end.
WinFS uses a relational engine, which is derived from SQL Server 2005, to provide the data relations mechanism. WinFS stores are simply SQL Server database (.MDF) files with the FILESTREAM attribute set. These files are stored in an access-restricted folder named "System Volume Information" in the volume root, in folders under the folder "WinFS" that are named with the GUIDs of the stores.
At the bottom of the WinFS stack lies WinFS Core, which interacts with the filesystem and provides file access and addressing capabilities. The relational engine leverages the WinFS core services to present a structured store and other services, such as locking, which the WinFS runtime uses to implement its functionality. The WinFS runtime exposes services such as Synchronization and Rules, which can be used to synchronize WinFS stores or perform certain actions on the occurrence of certain events.
WinFS runs as a service which runs three processes: WinFS.exe, which hosts the relational datastore; WinFSSearch.exe, which hosts the indexing and querying engine; and WinFPM.exe (WinFS File Promotion Manager), which interfaces with the underlying file system. It allows programmatic access to its features via a set of .NET Framework APIs that enable applications to define custom data types, define relationships among data, store and retrieve information, and perform advanced searches. The applications can then aggregate the data and present the aggregated data to the user.
WinFS provides unified storage but stops short of defining the format of what is stored in the data stores. Instead, it supports writing data in application-specific formats, but applications must provide a schema that defines how the file format should be interpreted. For example, a schema could be added to allow WinFS to understand how to read, and thus be able to search and analyze, say, a PDF file. By using the schema, any application can read data from any other application, and sharing the schema also allows different applications to write in each other’s format.
Multiple WinFS stores can be created on a single machine. This allows different classes of data to be kept segregated; for example, official documents and personal documents can be kept in different stores. WinFS, by default, provides only one store, named "DefaultStore". WinFS stores are exposed as shell objects, akin to Virtual folders, which dynamically generate a list of all items present in the store and present them in a folder view. The shell object also allows searching for information in the datastore.
A data unit that has to be stored in a WinFS store is called a WinFS Item. A WinFS item, along with the core data item, also contains information on how the data item is related to other data. This relationship is stored in terms of logical links, which specify which other data items the current item is related to. Links are physically stored using a link identifier, which specifies the name and intent of the relationship, such as "type of" or "consists of". The link identifier is stored as an attribute of the data item. All the objects which have the same link id are considered to be related. An XML schema, defining the structure of the data items that will be stored in WinFS, must be supplied to the WinFS runtime beforehand. In Beta 1 of WinFS, the schema assembly had to be added to the GAC before it could be used.
Predefined WinFS schemas include schemas for documents, e-mail, appointments, tasks, media, audio and video, as well as system schemas that cover configuration, programs, and other system-related data. Custom schemas can be defined on a per-application basis, in situations where an application wants to store its data in WinFS but not share the structure of that data with other applications, or they can be made available across the system.
The most important difference between a file system and WinFS is that WinFS knows the type of each data item that it stores, and the type specifies the properties of the data item. The WinFS type system is closely associated with the .NET framework’s concept of classes and inheritance. A new type can be created by extending and nesting any predefined types.
WinFS provides four predefined base types: Items, Relationships, ScalarTypes and NestedTypes. An Item is the fundamental data object which can be stored, and a Relationship is the relation or link between two data items. Since all WinFS items must have a type, the type of the item stored defines its properties. A property of an Item may be a ScalarType, which defines the smallest unit of information a property can have, or a NestedType, which is a collection of more than one ScalarType and/or NestedType. All WinFS types are made available as .NET CLR classes.
Any object represented as a data unit, such as contact, image, video, document etc, can be stored in a WinFS store as a specialization of the Item type. By default, WinFS provides Item types for Files, Contact, Documents, Pictures, Audio, Video, Calendar, and Messages. The File Item can store any generic data, which is stored in file systems as files. But unless an advanced schema is provided for the file, by defining it to be a specialized Item, WinFS will not be able to access its data. Such a file Item can only support being related to other Items.
A developer can extend any of these types, or the base type Item, to provide a type for their custom data. The data contained in an Item is defined in terms of properties, or fields, which hold the actual data. For example, an Item Contact may have a field Name, which is a ScalarType, and a field Address, a NestedType, which is further composed of two ScalarTypes. To define this type, the base class Item is extended and the necessary fields are added to the class. A NestedType field can be defined as another class which contains the two ScalarType fields. Once the type is defined, a schema has to be defined which denotes the primitive type of each field; for example, the Name field is a String, and the Address field is a custom-defined Address class, both fields of which are Strings. Other primitive types that WinFS supports are Integer, Byte, Decimal, Float, Double, Boolean and DateTime, among others. The schema also defines which fields are mandatory and which are optional. The Contact Item defined in this way is used to store information regarding a contact, by populating its property fields and storing it. Only those fields marked as mandatory need to be filled in during the initial save. Other fields may be populated later by the user, or not populated at all. If more property fields, such as "last conversed date", need to be added, the type can simply be extended to accommodate them. Item types for other data can be defined similarly.
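The shape of such a type definition can be sketched with Python dataclasses. This is an analogy only: the real types were .NET classes generated from an XML schema, and the names `Item`, `Address`, `street`, `city` and `TrackedContact` below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Item:
    """Stand-in for the base Item type (hypothetical, carries no fields)."""


@dataclass
class Address:
    """A nested type composed of two scalar (string) fields."""
    street: str
    city: str


@dataclass
class Contact(Item):
    name: str                          # mandatory scalar field
    address: Optional[Address] = None  # optional nested field


@dataclass
class TrackedContact(Contact):
    # Extending the type adds a property without touching Contact itself.
    last_conversed: Optional[date] = None


# Only the mandatory field is needed at creation time.
c = Contact(name="John Doe")
# Optional fields can be populated later.
c.address = Address("1 Main St", "Acapulco")
t = TrackedContact(name="Jane Doe", last_conversed=date(2006, 6, 1))
```

Omitting the mandatory `name` raises a `TypeError` at construction, which loosely mirrors a schema rejecting an item that lacks a required field.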
WinFS creates tables for all defined Items. All the fields defined for an Item form the columns of its table, and all instances of the Item are stored as rows in that table. Whenever some field in the table refers to data in some other table, it is considered a relationship. The schema of the relationship specifies which tables are involved and what the kind and name of the relationship are. The WinFS runtime manages the relationship schemas. All Items are exposed as .NET CLR objects, with a uniform interface providing access to the data stored in the fields. Thus any application can retrieve an object of any Item type and use the data in the object, without needing to know the physical structure the data was stored in.
WinFS types are exposed as .NET classes, which can be instantiated as .NET objects. Data is stored in these type instances by setting their properties. Once done, they are persisted into the WinFS store. A WinFS store is accessed using an ItemContext class (see Data retrieval section for details). ItemContext allows transactional access to the WinFS store; i.e., all the operations from binding an ItemContext object to a store until it is closed either all succeed or are all rolled back. As changes are made to the data, they are not written to the disc; rather, they are written to an in-memory log. Only when the connection is closed are the changes written to the disc in a batch. This helps optimize disc I/O. The following code snippet creates a contact and stores it in a WinFS store.
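The sketch below is written in Python and uses a hypothetical `ItemContext` class, not the real System.Storage API. It assumes a plain list as the "store" and only illustrates the behaviour described above: changes accumulate in an in-memory log and are either written in one batch when the context closes or discarded if an error occurs.

```python
class ItemContext:
    """Hypothetical stand-in for a transactional store context."""

    def __init__(self, store):
        self._store = store  # a list standing in for the on-disk store
        self._log = []       # pending changes, kept in memory only

    def add(self, item):
        # Nothing reaches the store yet; the change is only logged.
        self._log.append(item)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Closing normally: write all pending changes in one batch.
            self._store.extend(self._log)
        # On success or failure the log is emptied; on failure this is
        # the rollback, because nothing was ever written.
        self._log.clear()
        return False  # do not swallow exceptions


store = []

# Create a contact and store it.
with ItemContext(store) as ic:
    ic.add({"type": "Contact", "name": "John Doe"})
    written_before_close = len(store)
written_after_close = len(store)

# A failure before closing leaves the store untouched.
try:
    with ItemContext(store) as ic:
        ic.add({"type": "Contact", "name": "Never Saved"})
        raise ValueError("simulated failure")
except ValueError:
    pass
written_after_rollback = len(store)
```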
The related items, in turn, may be related to other data items as well, resulting in a network of relationships, which is called a many-to-many relationship. Creating a relationship between two Items creates another field in the data of the Items concerned, which refers to the row in the other Item’s table where the related object is stored.
In WinFS, a Relationship is an instance of the base type Relationship, which is extended to signify a specialization of a relation. A Relationship is a mapping between two items, a Source and a Target. The source has an Outgoing Relationship, whereas the target gets an Incoming Relationship. WinFS provides three types of primitive relationships: Holding Relationship, Reference Relationship and Embedding Relationship. Any custom relationship between two data types is an instance of one of these relationship types.
Relationships between two Items can either be set programmatically by the application creating the data, or the user can use the WinFS Item Browser to manually relate the Items. A WinFS item browser can also graphically display the items and how they are related, to enable the user to know how their data are organized.
WinFS rules are also exposed as .NET CLR objects. As such any rule can be used for any purpose. A rule can even be extended by inheriting from it to form a new rule which consists of the condition and action of the parent rule plus something more.
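The idea of a rule that inherits a parent rule's condition and action and adds to both can be illustrated in Python. The class names and the dictionary-based items below are invented for the example and say nothing about how the WinFS rule classes were actually shaped.

```python
class Rule:
    """Hypothetical base rule: a condition paired with an action."""

    def condition(self, item):
        return True

    def action(self, item, log):
        pass

    def apply(self, item, log):
        if self.condition(item):
            self.action(item, log)


class NewPictureRule(Rule):
    def condition(self, item):
        return item.get("type") == "Picture"

    def action(self, item, log):
        log.append(f"indexed {item['title']}")


class LargePictureRule(NewPictureRule):
    # Parent condition plus one more test.
    def condition(self, item):
        return super().condition(item) and item.get("size_kb", 0) > 1024

    # Parent action plus one more step.
    def action(self, item, log):
        super().action(item, log)
        log.append(f"flagged {item['title']} as large")


log = []
rule = LargePictureRule()
rule.apply({"type": "Picture", "title": "Beach", "size_kb": 2048}, log)
rule.apply({"type": "Picture", "title": "Icon", "size_kb": 10}, log)
rule.apply({"type": "Document", "title": "Notes", "size_kb": 4096}, log)
```

Only the first item passes both the inherited and the added condition, so only it triggers both actions.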
The primary mode of data retrieval from a WinFS store is querying the store according to some criteria, which returns an enumerable set of items matching the criteria. The criteria for the query are specified using the OPath query language. The returned data is made available as instances of the type schemas, conforming to the .NET object model. The data in them can be accessed through the properties of the individual objects.
Relations are also exposed as properties. Each WinFS Item has two properties, named IncomingRelationships and OutgoingRelationships, which provide access to the set of relationship instances the item participates in. The other item which participates in a relationship instance can be reached through that relationship instance.
The fact that the data can be accessed using its description, rather than its location, can be used to provide end-user organizational capabilities without being limited to the hierarchical organization used in file systems. In a file system, each file or folder is contained in only one folder. But WinFS Items can participate in any number of holding relationships, and with any other items. As such, end users are not limited to file/folder organization. Rather, a contact can become a container for documents, a picture a container for contacts, and so on. For legacy compatibility, WinFS includes a pseudo-type called Folder, which is present only to participate in holding relationships and emulate file/folder organization. Since any WinFS Item can be related to more than one Folder item, from an end-user perspective an item can reside in multiple folders without duplicating the actual data. Applications can also analyze the relationship graphs to present various filters. For example, an email application can analyze the related contacts and the relationships of the contacts with restaurant bills and dynamically generate filters like "Emails sent to people I had lunch with".
Related items can also be accessed through the items. The IncomingRelationships and OutgoingRelationships properties give access to the full set of relationship instances, typed to the name of the relationship. These relationship objects expose the other item via a property. So, for example, if a picture is related to a contact, the contact can be accessed by traversing the relationship as:
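A minimal Python sketch of such a traversal, in which a picture is related to a contact. The classes `Item` and `Relationship` and the helpers `relate` and `targets` are hypothetical stand-ins for the generated .NET types; only the relationship name OutContactRelationship comes from the text.

```python
from dataclasses import dataclass, field


# eq=False keeps comparison identity-based, which avoids infinite
# recursion when items and relationships refer to each other.
@dataclass(eq=False)
class Item:
    name: str
    outgoing: list = field(default_factory=list)  # relationships we start
    incoming: list = field(default_factory=list)  # relationships we end


@dataclass(eq=False)
class Relationship:
    kind: str
    source: Item
    target: Item


def relate(kind, source, target):
    """Create a relationship and register it on both participants."""
    rel = Relationship(kind, source, target)
    source.outgoing.append(rel)
    target.incoming.append(rel)
    return rel


def targets(item, kind):
    """Follow the item's outgoing relationships of one kind."""
    return [rel.target for rel in item.outgoing if rel.kind == kind]


picture = Item("Graduation photo")
alice = Item("Alice")
relate("OutContactRelationship", picture, alice)

# From the picture to its contacts ...
contact_names = [c.name for c in targets(picture, "OutContactRelationship")]
# ... and back from the contact to the pictures that reference it.
picture_names = [rel.source.name for rel in alice.incoming]
```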
An OPath query string allows the parameters that will be queried for to be specified using Item properties, embedded Items as well as Relationships. It can specify a single search condition, such as "title = 'Something'", or a compound condition such as "title = 'Title 1' || title = 'Title 2' && author = 'Someone'". These boolean and relational operations can be specified using C#-like &&, ||, =, != operators as well as their English-like equivalents such as EQUAL and NOT EQUAL. SQL-like operators such as LIKE, GROUP BY and ORDER BY are also supported, as are wildcard conditions. So, "title LIKE 'any*'" is a valid query string. These operators can be used to execute complex searches. For example, an ItemSearcher object can be created that searches on the OutContactRelationship instance that relates pictures and contacts, in effect searching all pictures related with a contact. Running the query "Name LIKE 'A*'" on all contacts reachable through OutContactRelationship then returns the list of "contacts whose names start with A and whose pictures I have". Similarly, more relationships could be taken into account to further narrow down the results. Further, a natural language query processor, which parses a query in natural language and creates a well-formed OPath query string to search via proper relationships, can allow users to make searches such as "find the name of the wine I had with person X last month", provided financial management applications are using WinFS to store bills.
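The effect of a "Name LIKE 'A*'" condition applied to contacts reachable through a picture-to-contact relationship can be approximated in Python. This sketch does not parse OPath; it assumes that `*` matches any run of characters and that matching is case-sensitive, and it represents relationship instances as simple (picture, contact) pairs invented for the example.

```python
from fnmatch import fnmatchcase


def like(value, pattern):
    """Approximate a 'value LIKE pattern' test with * as the wildcard.

    fnmatchcase also gives special meaning to ? and [...]; that goes
    beyond what the text describes and is a side effect of this sketch.
    """
    return fnmatchcase(value, pattern)


# (picture title, contact name) pairs standing in for relationship
# instances between pictures and contacts.
picture_contacts = [
    ("Beach", "Alice"),
    ("Party", "Bob"),
    ("Hike", "Anna"),
    ("Party", "Alice"),
]


def contacts_with_pictures(relations, pattern):
    """Contacts that appear in at least one picture and match the pattern."""
    return sorted({contact for _, contact in relations
                   if like(contact, pattern)})


matches = contacts_with_pictures(picture_contacts, "A*")
```

Alice is related to two pictures but appears once in `matches`, since the result is a set of contacts rather than a list of relationship instances.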
Different relations specify different sets of data. So when a search is made which encompasses multiple relations, the different sets of data are retrieved individually and their intersection is computed. The resulting set contains only those data items which correspond to all the relations.
The WinFS API also provides some support for sharing with non-WinFS applications. WinFS exposes a shell object to access WinFS stores. This object maps WinFS items to a virtual folder hierarchy, and can be accessed by any application. WinFS data can also be manually shared using network shares, by sharing the legacy shell object. Non-WinFS file formats can be stored in WinFS stores, using the File Item, provided by WinFS. Importers can be written, to convert specific file formats to WinFS Item types.
In addition, WinFS provides services to automatically synchronize items in two or more WinFS stores, subject to some predefined condition, such as "share only photos" or "share photos which have an associated contact X". The stores may be on different computers. Synchronization is done in a peer-to-peer fashion; there is no central authority. Synchronization can be manual, automatic or scheduled. During synchronization, WinFS finds the new and modified Items and updates the stores accordingly. If two or more changes conflict, WinFS can either resort to automatic resolution based on predefined rules, or defer the synchronization for manual resolution. WinFS also updates the schemas, if required.
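One simple way to realize filtered two-way synchronization with conflict handling is sketched below in Python. It is not the WinFS synchronization algorithm: stores are plain dictionaries, change detection relies on a snapshot taken at the last synchronization (`base`), deletions are ignored, and every name is hypothetical.

```python
def synchronize(a, b, base, should_share, resolve):
    """Two-way sync of dict stores a and b against the last-sync snapshot.

    should_share(item) is the sharing condition; resolve(x, y) returns the
    winning item for a conflict, or None to defer it for manual resolution.
    Returns the keys whose conflicts were deferred. Deletions are out of
    scope for this sketch.
    """
    deferred = []
    for key in sorted(set(a) | set(b)):
        item_a, item_b, old = a.get(key), b.get(key), base.get(key)
        candidate = item_a if item_a is not None else item_b
        if not should_share(candidate):
            continue  # outside the sharing condition, leave untouched
        changed_a = item_a != old
        changed_b = item_b != old
        if changed_a and changed_b and item_a != item_b:
            winner = resolve(item_a, item_b)  # both sides changed: conflict
            if winner is None:
                deferred.append(key)
                continue
        elif changed_a:
            winner = item_a
        else:
            winner = item_b
        # Copy so the two stores and the snapshot never share one object.
        a[key], b[key], base[key] = dict(winner), dict(winner), dict(winner)
    return deferred


base = {"p1": {"type": "Photo", "title": "Beach"},
        "p3": {"type": "Photo", "title": "Hike"}}
laptop = {"p1": {"type": "Photo", "title": "Beach 2005"},   # edited here
          "p3": {"type": "Photo", "title": "Hike (laptop)"},
          "d1": {"type": "Document", "title": "Notes"}}      # not shared
desktop = {"p1": {"type": "Photo", "title": "Beach"},
           "p2": {"type": "Photo", "title": "Party"},        # new here
           "p3": {"type": "Photo", "title": "Hike (desktop)"}}

deferred = synchronize(
    laptop, desktop, base,
    should_share=lambda item: item["type"] == "Photo",  # "share only photos"
    resolve=lambda x, y: None,                          # always defer
)
```

After the call, the edit to `p1` and the new photo `p2` exist in both stores, the document `d1` stays on the laptop only, and the conflicting edits to `p3` are reported in `deferred` and left as they were.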
With WinFS Beta 1, Microsoft included an unsupported application called StoreSpy, which allowed one to browse WinFS stores by presenting a hierarchical view of WinFS Items. It automatically generated virtual folders based on access permissions, date and other metadata, and presented them in a hierarchical tree view, similar to the way traditional folders are presented. The application generated tabs for different Item types. StoreSpy allowed viewing Items, Relationships, MultiSet, Nested Elements, Extensions and other types in the store along with their full metadata. It also presented a search interface to perform manual searches and save them as virtual folders. The application also presented a graphical view of WinFS Rules. However, it did not allow editing of Items or their properties, though that capability was slated for inclusion in a future release. The WinFS project was cut back before it could materialize.