Feature

Collaborative Authoring on the Web: Introducing WebDAV

by E. James Whitehead, Jr.

The irony was intense. The frustrating, awkward nature of collaborative authoring over the Internet became increasingly evident as draft after draft of the WebDAV Distributed Authoring Protocol specification was edited, spelling out how collaboration could be much easier, much more fluid, if leveraged on the Web's standard infrastructure. The collaboration scheme the authors used when writing the specification was typical: each draft required the author to make the revisions, then e-mail changes back to the other authors. If an author's e-mail system was bogged down, the draft might not resurface for hours, occasionally losing a work day. Once the draft was received, it went into a directory filled with other similarly named revisions of the document, making it tough to pick out a given revision even a few weeks later. When it was OK for another author to modify the document, they e-mailed, "you have the torch" -- but sometimes forgot, leading to confusion, and either lost time or lost changes.

Meanwhile, with each successive revision the WebDAV Distributed Authoring Protocol provided a clearer picture of a better way. Instead of passing documents back and forth via e-mail, authors edit them in place at a URL. Instead of "passing the torch," a locking mechanism prevents overwrite conflicts and permits lock discovery. Since the document is always accessible via its URL, there is no lost time due to e-mail delays, and other collaborators can view progress as it is made. Alas, writing the WebDAV specification was exactly the kind of task that needed WebDAV technology.

WebDAV provides many benefits:

More specifically, the WebDAV Distributed Authoring Protocol defines a set of extensions to the base Hypertext Transfer Protocol for the following capabilities:

An effort closely related to WebDAV is the DAV Searching and Locating (DASL) group, which is working to develop an interoperable means of searching a repository that is compliant with the WebDAV object model and that organizes its resources into URL hierarchies. The main capability of DASL is:

Though it is an IETF working group, and hence has no official affiliation with the World Wide Web Consortium (W3C), WebDAV does work cooperatively with the W3C, which provides technical assistance and help in contacting interested people within the Web community.

DASL is currently in the process of becoming an IETF working group and is working to develop extensions to the WebDAV Distributed Authoring Protocol specification (and hence to HTTP) for searching WebDAV repositories. DASL has its own requirements document [RS98] and protocol document [RJR+98], which are still the subject of intense effort within the DASL group.

The remainder of this article provides a detailed overview of the capabilities in the base WebDAV Distributed Authoring Protocol. Note that throughout this article the term resource is often used. Resource is the proper Web terminology for any piece of information, such as a Web page, a document, a bitmap image or a computational object which is stored on a Web server and whose location is described by a Uniform Resource Locator (URL) [BFM98].

Functionality Overview

The WebDAV Distributed Authoring Protocol contains a set of features which can be used in a wide variety of settings by applications which support collaborative work on remotely authored documents. These features can be partitioned into three groups: overwrite prevention, properties (metadata) and name space management. An overview of each of these capabilities is presented in the sections below.

Overwrite Prevention

Once two or more people start collaborating on the same document, the issue of write control comes to the fore. If everyone can write to the same, unversioned document, then it is possible to lose changes made by one or more contributors as first one collaborator, then another, writes their changes without first merging in previous updates.
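The sequence of events described above can be reproduced in a few lines. The following is a minimal sketch, assuming a toy in-memory document in place of a real Web server; the names fetch, store, alice_copy and bob_copy are hypothetical and exist only for this illustration.

```python
# A toy, in-memory stand-in for an unversioned resource on a server.
document = {"text": "Draft 1."}


def fetch():
    """Return the current document text, as a collaborator would download it."""
    return document["text"]


def store(new_text):
    """Overwrite the document unconditionally, as an unprotected write would."""
    document["text"] = new_text


# Both collaborators start from the same copy.
alice_copy = fetch()
bob_copy = fetch()

# Each writes back an edited copy without first merging in the other's changes.
store(alice_copy + " Alice's section.")
store(bob_copy + " Bob's section.")

print(document["text"])
```

After the second write the document reads "Draft 1. Bob's section." and the first collaborator's contribution is gone, which is the lost update that the techniques below are meant to prevent.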

There are many techniques which can be used to alleviate this "lost update" problem. Several of the more common ones are:

These schemes vary from least protective and most flexible (POTS) to most protective and least flexible (exclusive locking).

Currently, the WebDAV approach is to provide facilities for both shared and exclusive locking. This dual lock support provides sufficiently flexible locks to accommodate a wide range of collaborations. While shared locks best support collaborators who have a lot of awareness of each other's activities, exclusive locks provide a more stringent guarantee of conflict avoidance for less aware collaborators or during periods of high contention for a document. Locks may have a scope of a single resource, including all non-live properties on the resource, or a hierarchy of resources (for example, a collection and all of its member resources). A lock discovery mechanism (a WebDAV property) allows authors to find out if any locks exist on a Web resource. Since the Web is designed so that no lock is required to read a Web page, there is no concept of a read lock. An implication of this is that the contents of a resource may change without warning if the reader does not hold a write lock on the resource.
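The interplay between shared and exclusive write locks can be sketched with a small in-memory model. This is an illustration under stated assumptions, not the protocol itself: the LockTable class, its method names and the example URL are hypothetical, and locks on a hierarchy of resources are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Lock:
    owner: str
    scope: str  # "exclusive" or "shared"


@dataclass
class LockTable:
    """Hypothetical in-memory table of write locks, keyed by resource URL."""

    locks: dict = field(default_factory=dict)

    def acquire(self, url, owner, scope):
        """Try to take a write lock; return True if it was granted."""
        if scope not in ("exclusive", "shared"):
            raise ValueError("scope must be 'exclusive' or 'shared'")
        held = self.locks.setdefault(url, [])
        # An exclusive request conflicts with any existing lock; a shared
        # request conflicts only with an existing exclusive lock.
        if held and (
            scope == "exclusive"
            or any(lock.scope == "exclusive" for lock in held)
        ):
            return False
        held.append(Lock(owner, scope))
        return True

    def release(self, url, owner):
        """Drop every lock that the given owner holds on the resource."""
        self.locks[url] = [
            lock for lock in self.locks.get(url, []) if lock.owner != owner
        ]

    def discover(self, url):
        """Lock discovery: report who holds which locks on a resource."""
        return [(lock.owner, lock.scope) for lock in self.locks.get(url, [])]


table = LockTable()
url = "http://example.com/spec/draft.html"  # hypothetical resource URL

got_shared_1 = table.acquire(url, "alice", "shared")
got_shared_2 = table.acquire(url, "bob", "shared")
got_exclusive = table.acquire(url, "carol", "exclusive")

print(got_shared_1, got_shared_2, got_exclusive)
print(table.discover(url))
```

In this model the two shared requests are both granted, while the exclusive request is refused until the shared locks are released. There is deliberately no read operation in the table, mirroring the absence of read locks.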

Locking is usually paired with an event notification capability, so that other collaborators can be informed automatically by the system when a lock has been released. Notifications are an important mechanism by which collaborators become aware of each other's activities, and they may occur at multiple levels of granularity. Events with a grain size of an entire resource, such as a lock being granted or released, provide document access awareness information, while sub-resource events, such as a word being inserted into a paragraph, can lead to authoring tools which support multiple authors working simultaneously in the same document. Although the WebDAV working group has decided against developing an interoperability standard for Web-based notifications, the recent Workshop on Internet Scale Event Notifications (WISEN), held at the University of California at Irvine in July, 1998 (for details see: http://www.ics.uci.edu/IRUS/wisen/), and the Event Notification Service BOF meeting, held at the Chicago IETF meeting in August, 1998, are strong indicators that standardization work may soon begin in this area.

Properties

Information on the Web has many pieces of associated information, such as the title, subject, creator, publisher, length and creation date. This information about information (called properties within WebDAV, but also known as metadata) can be used to search for Web resources, enforce copyrights or provide bibliographic information. Properties are particularly useful in searching for Web resources because existing index-based Web search engines often return a large number of undesired results to any query. By focusing a search on the value of a particular property (e.g., the author), properties can be used to reduce the number of undesired query results; the DASL effort is concentrating on providing solid support for queries on properties of resources.

Development of a useful set of properties is extremely important – one schema, or set of metadata, which was developed to assist Web searching is known as the Dublin Core (for more information, see: http://purl.org/metadata/dublin_core/). Since other groups have focused on developing metadata sets, the WebDAV group decided to focus on developing facilities for creating, modifying, deleting and retrieving metadata. These facilities allow for the manipulation of metadata from multiple schemas, allowing the schema itself to vary with domain of use. For example, even though the Dublin Core is appropriate for use in the general Web context, it may not be ideal for use in other settings, such as the legal community. By being metadata schema neutral, the WebDAV approach allows the most appropriate schema to be used in any context. It allows WebDAV to focus on "how," as in how properties are stored and retrieved, rather than on "what," as in what do they mean?

WebDAV properties are name-value pairs. The name is a Uniform Resource Identifier (URI), such as a URL, and the value is a well-formed sequence of Extensible Markup Language (XML) [BPS98] content. (For more information on URIs, see “An Introduction to the Resource Description Framework” in this issue of the Bulletin.) If, for instance, a property name is a URL, it can be given uniqueness without central registration by using URL property names chosen from within a domain whose name is controlled by the party defining the property. So, for example, a company that controls a given domain name, like "widgets.com," can choose a property name from within this domain, like "widgets.com/properties/color."
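As a rough illustration of why URI names avoid collisions, the sketch below keeps the properties of a single resource in a dictionary keyed by the full property name. The PropertyStore class and the second domain, paints.example, are hypothetical; the first name is written as a full URL based on the widgets.com example above.

```python
class PropertyStore:
    """Hypothetical store of name-value pairs for a single resource."""

    def __init__(self):
        self._props = {}

    def set(self, name, xml_value):
        """Create the property, or replace its value if it already exists."""
        self._props[name] = xml_value

    def get(self, name):
        """Return the stored value, or None if the property is not defined."""
        return self._props.get(name)

    def remove(self, name):
        """Delete the property if it is present."""
        self._props.pop(name, None)

    def names(self):
        """Return all property names defined on the resource, sorted."""
        return sorted(self._props)


props = PropertyStore()

# Two parties each define a "color" property. Because each name lives inside
# a domain its definer controls, the names differ and neither value is lost.
props.set("http://widgets.com/properties/color", "<color>blue</color>")
props.set("http://paints.example/properties/color", "<color>RAL 5002</color>")

for name in props.names():
    print(name, props.get(name))
```

Because the store never interprets the names or values, properties from any number of schemas can sit side by side on the same resource.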

An example WebDAV property defined by the Distributed Authoring Protocol is the DAV:getcontentlength property, which gives the length, in bytes, of the response generated by a GET on the resource. The property name is a URI, with a URI scheme of "DAV," which is reserved for use by WebDAV. A sample value of this property is:

Name: DAV:getcontentlength

Value: <DAV:getcontentlength> 3422 </DAV:getcontentlength>

In this case, the length is 3422 bytes, which is enclosed within the <DAV:getcontentlength> XML element. By convention, the enclosing XML element for a WebDAV property takes the same name as the property itself.
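A client that receives such a value has to extract the number from the enclosing element. The following sketch uses Python's standard xml.etree.ElementTree module and makes one assumption beyond the notation shown above: the value is serialized with an XML namespace declaration that binds a prefix (here D, chosen arbitrarily) to the DAV: URI, since a namespace-aware parser needs the prefix to be declared.

```python
import xml.etree.ElementTree as ET

# Assumed serialization: the namespace "DAV:" is bound to the prefix "D".
xml_value = '<D:getcontentlength xmlns:D="DAV:"> 3422 </D:getcontentlength>'

element = ET.fromstring(xml_value)

# ElementTree reports qualified names in the form "{namespace}localname".
namespace, local_name = element.tag[1:].split("}")

# The element content is text, so surrounding whitespace is stripped
# before converting it to a number of bytes.
length_in_bytes = int(element.text.strip())

print(namespace, local_name, length_in_bytes)
```

Joining the namespace and the local name gives back DAV:getcontentlength, the property name, which reflects the convention that the enclosing element is named after the property.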

Using XML to encode the value of properties provides three major benefits. First is extensibility. Since all content within XML is encoded between start and end tags, it is easy to add additional elements to a property by inserting new tagged content. Internationalization is the second major benefit. Since XML mandates support for the UTF-8 and UTF-16 encodings of the ISO 10646 character encoding standard, as well as language tagging, properties can express content in the vast majority of human languages. Finally, by using XML, WebDAV properties can support other metadata activities which are also based on XML, such as the Resource Description Framework (RDF) under development at the W3C.

Name Space Management

In the current publish/browse model of the Web, there is little need for a user to duplicate or rename Web resources. However, once the Web is used for distributed authoring, the need for these capabilities, plus the ability to get a listing of a directory, becomes extremely important. Discovering what resources currently populate a portion of a Web server's name space, and copying, moving and deleting those resources, are the key elements of managing a Web name space.

There are several justifications for adding copy and move capability. A resource may need to be copied because of a change in ownership, before major modifications, or when making a backup. It is often necessary to move (i.e., change the name of) a resource, for example because a new naming convention has been adopted or because a typing error was made when the name was originally entered.

Copy and move have ramifications with respect to properties: how should properties behave after a copy or a move? It would seem that all properties on the duplicated or moved resource should be identical to the properties on the original. However, there are really two classes of properties: live and static. Static properties have the quality that their value, once set, remains the same until a client explicitly modifies it. Live properties, in contrast, have their syntax and semantics enforced by the server and may vary at any time. One example of a live property is the content length of a resource -- every time the resource is updated, the value of the property will also be updated. WebDAV also attempts to resolve conflicts between the existing properties of a resource being moved and those that may be enforced by the server or directory in which it is to be located.
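The distinction between the two classes of properties can be made concrete with a small model of a copy. This is a sketch under simplifying assumptions: the Resource class and copy_resource function are hypothetical, the only live property modeled is the content length, and conflict resolution at the destination is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    body: bytes
    # Static properties: stored as given until a client changes them.
    static_props: dict = field(default_factory=dict)

    @property
    def live_props(self):
        # Live properties: computed by the "server" from the current state.
        return {"DAV:getcontentlength": len(self.body)}


def copy_resource(source):
    """Duplicate a resource. Static properties are carried over as they are;
    live properties are not copied, because the copy computes its own."""
    return Resource(body=source.body, static_props=dict(source.static_props))


original = Resource(
    body=b"<html>draft</html>",
    static_props={"http://widgets.com/properties/color": "blue"},
)
duplicate = copy_resource(original)

# Editing the copy changes its live property but not its static ones.
duplicate.body += b"<!-- edited -->"

print(original.live_props, duplicate.live_props)
print(duplicate.static_props == original.static_props)
```

Immediately after the copy, both kinds of properties happen to match the original. Once the copy is edited, its content length changes on its own, while the static property keeps its value until a client sets it again.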

Listing the contents of a collection, an operation similar to listing a directory of a file system, is accomplished using the property retrieval mechanism. Most existing directory listing operations (such as "ls" or "dir") provide the name of a file and an option for retrieving limited sets of properties about the file, such as its size, owner and access permissions. However, since WebDAV has an existing property retrieval mechanism, it made little sense to define another property retrieval operation just for listing a collection. Instead, the existing property retrieval mechanism was used. Since WebDAV property retrieval allows, with a single operation, a hierarchical retrieval of properties on a collection, returning for each resource its name and requested properties, this mechanism has enough expressive power to do double-duty as the "list a collection's members" operation as well.
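The idea that property retrieval can double as a directory listing can be sketched with an in-memory model. Everything here is hypothetical: the RESOURCES table stands in for server state, the paths and values are invented for the illustration, and retrieve_properties is a stand-in for the protocol's property retrieval operation rather than a rendering of its actual syntax.

```python
# Hypothetical server state: URL path -> properties of that resource.
RESOURCES = {
    "/docs/": {"http://widgets.com/properties/color": "blue"},
    "/docs/spec.html": {"DAV:getcontentlength": 3422},
    "/docs/notes.txt": {"DAV:getcontentlength": 120},
    "/other/readme.txt": {"DAV:getcontentlength": 58},
}


def retrieve_properties(collection_url, wanted):
    """Return {url: {name: value}} for a collection and everything below it.

    Only the requested property names are returned, and a resource that
    lacks a requested property simply omits it.
    """
    result = {}
    for url, props in RESOURCES.items():
        if url.startswith(collection_url):
            result[url] = {
                name: props[name] for name in wanted if name in props
            }
    return result


listing = retrieve_properties("/docs/", ["DAV:getcontentlength"])
for url, props in sorted(listing.items()):
    print(url, props)
```

The keys of the result are the names of the collection and its members, so a client that only wants an "ls"-style listing can ignore the property values, while a client that wants sizes or owners gets them in the same round trip.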

Conclusion

Taken together, the WebDAV extensions to HTTP provide the standard needed to make the Web a writable, collaborative medium. What does this mean? Although the future is notoriously hard to predict, here are some likely outcomes of adoption of WebDAV. As WebDAV technology is deployed, it will initially have its largest impact on small to medium sized workgroups, which homogeneously support DAV, allowing their work practices to coalesce around a local intranet. Over time, as critical mass grows, WebDAV will also dramatically reduce the accidental costs of collaboration between workgroups and between organizations. WebDAV additionally shows significant promise as an infrastructure for development of distributed software engineering environments and other complex information products [FWA+98].

WebDAV in the home will make Web page creation significantly easier, since Web pages will be editable in-place. Furthermore, opportunities for collaboration abound in the home: WebDAV will allow school children to collaborate easily on projects and reports, and parents who do volunteer work will find it easier to work on proposals, budgets, schedules and more. By giving more voices access to the global distribution of the Web, and by making it easier to collaborate, WebDAV technologies will have broad social impact.

Equally exciting is the unpredictable nature of information technology, such as the unpredicted advent of electronic commerce on the Web. Right now is the ground floor of WebDAV, when future applications are limited only by your imagination.

For More Information

Working groups of the Internet Engineering Task Force are completely open, and may be joined by subscribing to their e-mail discussion list. If you wish to participate in the discussions on WebDAV topics, you may join the mailing list by sending an e-mail with subject "subscribe" to w3c-dist-auth-request@w3.org. The home page for the WebDAV group, at http://www.ics.uci.edu/pub/ietf/webdav/, contains links to current working drafts, e-mail list archives and background material. The related DAV Searching and Locating (DASL) working group has its Web page at http://www.ics.uci.edu/pub/ietf/dasl/ and a mailing list which may be joined by sending a message with subject "subscribe" to www-webdav-dasl-request@w3.org.

References

[BFM98] T. Berners-Lee, R. Fielding, L. Masinter. "Uniform Resource Identifiers (URI): Generic Syntax." Internet Draft Standard Request for Comments 2396. MIT/LCS, U.C. Irvine, Xerox. August, 1998. ftp://ftp.isi.edu/in-notes/rfc2396.txt

[BPS98] T. Bray, J. Paoli, C. M. Sperberg-McQueen. "Extensible Markup Language (XML) 1.0," W3C Recommendation, REC-xml, February, 1998. http://www.w3.org/TR/REC-xml

[FGM+97] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee. "Hypertext Transfer Protocol -- HTTP/1.1." Internet Proposed Standard Request for Comments 2068. U.C. Irvine, DEC, MIT/LCS. January, 1997. http://www.ics.uci.edu/pub/ietf/http/rfc2068.txt

[FWA+98] R. Fielding, E. J. Whitehead, Jr., K. M. Anderson, G. A. Bolcer, P. Oreizy, R. N. Taylor. "Web-Based Development of Complex Information Products." Communications of the ACM, Vol. 41, No. 8, August, 1998, pages 84-92.

[GWF+98] Y. Y. Goland, E. J. Whitehead, Jr., A. Faizi, S. R. Carter, D. Jensen. "Extensions for Distributed Authoring on the World Wide Web -- WEBDAV." Internet-Draft, Work-in-progress, draft-ietf-webdav-protocol-08, April, 1998. http://www.ics.uci.edu/pub/ietf/webdav/draft-ietf-webdav-protocol-08.txt

[Lass97] O. Lassila. "HTTP-based Distributed Content Editing Scenarios." Internet-Draft, Work-in-progress, draft-ietf-webdav-scenarios-00, May, 1997. http://www.ics.uci.edu/pub/ietf/webdav/scenarios/draft-ietf-webdav-scenarios-00.txt

[Ragg97] D. Raggett, A. Le Hors, I. Jacobs. "HTML 4.0 Specification," W3C Recommendation REC-html40, April 24, 1998. http://www.w3.org/pub/WWW/TR/REC-html40.html

[RJR+98] S. Reddy, D. Jensen, S. Reddy, R. Henderson, J. Davis, A. Babich. "DAV Searching and Locating." Internet-Draft, Work-in-progress, draft-reddy-dasl-protocol-02, July, 1998. ftp://ftp.isi.edu/internet-drafts/draft-reddy-dasl-protocol-02.txt

[RS98] S. Reddy, J. Slein. "Requirements for DAV Searching and Locating." Internet-Draft, Work-in-progress, draft-reddy-dasl-requirements-02, March, 1998. ftp://ftp.isi.edu/internet-drafts/draft-reddy-dasl-requirements-02.txt

[SVWD97] J. Slein, F. Vitali, E. J. Whitehead, Jr., D. Durand. "Requirements for Distributed Authoring and Versioning on the World Wide Web." Xerox, University of Bologna, U.C. Irvine, Boston University. Internet Informational Request for Comments 2291, February, 1998. http://www.ics.uci.edu/pub/ietf/webdav/requirements/rfc2291.txt


E. James Whitehead, Jr., is affiliated with the Department of Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425. He can be reached by e-mail at ejw@ics.uci.edu; by phone at 949/824-4121; or by fax at 949/824-1715.

Bulletin American Society for Information Science