David Shrimpton, University of Kent at Canterbury
Christopher Dobbyn, Oxford Brookes University

There is now a general consensus that Digital TV and the World Wide Web are converging technologies: Digital TV users will shortly be able to access, present and interact with WWW documents by means of their TV service, combining these documents seamlessly with TV pictures and sound. However, no agreed universal models or mechanisms exist for combining the two technologies. We are currently researching models and mechanisms for integrating the existing interactive Digital TV technologies with the WWW. We are developing models for an integrated service and investigating extensions to the MHEG-5 component of the DAVIC specification for Digital TV, and to the various WWW systems, by means of which the two may be interfaced.

The project objectives are:

1. To research and implement a model for integration of the digital television standards with the current and emerging World Wide Web standards.
2. To provide a set of tools to be used by content providers that will enable digital television users to access, present and interact with WWW documents.
3. To contribute specific proposals to, and to take part in the general discussions of, the appropriate standards bodies (e.g. ISO) and industrial consortia (e.g. DAVIC).

Future web browsers and the software for interactive TV set-top boxes (STBs) have similar requirements: both are concerned with multimedia presentation; for both, interaction with, and navigation around, documents are essential; and for both, standardisation is required. Moreover, it is clearly desirable for users to be able to move seamlessly between the TV and Web environments: for example, a user might wish to download a form from her TV to a browser on a PDA, work on it offline, and then upload the form back to the TV for back-transmission.
This is not possible with many of the current interactive TV standards: for example, DAVIC MHEG-5 documents can only be handled by special MHEG engines. To render an MHEG presentation on existing browser technology, either the browser must be made MHEG-aware or the MHEG presentation must be expressed in a tag set that a browser can understand. A common model for interactive TV and the WWW is therefore a clear necessity.

The proposed ISO MHEG-8 standard describes document structure and rendering information in terms of XML tags, defined in the MHEG-8 DTD. The majority of the MHEG-8 tags can be mapped onto XHTML tags using XSLT; however, many MHEG-8 presentation tags have semantics that cannot be expressed in XHTML. For example, XHTML has no notion of object ordering, so it is not possible to specify that one object is in front of or behind another. Nor can opacity, which is expressible in MHEG, be encoded in XHTML. In addition to the document structure, MHEG-8 also describes the MHEG event model by means of tags. There is no equivalent to this in XHTML, whose tags are mainly associated with structure.

One solution to this problem is for multimedia documents to be loaded into a DOM-capable browser in the form of an XHTML document, with script tags referring to a .jar file containing MHEG and other support classes encoded in JavaScript; these classes constitute a small-footprint Document Interpreter. The document containing these script tags can be loaded from the Web or, since TV users expect instant access, wired into the STB as a start-up document. Tags later in the document, and the tags of documents subsequently loaded into the browser, make calls to the methods of these classes. As the browser software processes these tags, they make calls to the Document Interpreter, which establishes MHEG links between the constituent objects of the document; these links are represented by JavaScript objects.
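
As an illustration of links held as JavaScript objects, the following is a minimal sketch and not the project's implementation. The names `Link`, `DocumentInterpreter`, `addLink` and `bind`, together with the element id and the action, are hypothetical and are not taken from MHEG-5, MHEG-8 or DAVIC; only `addEventListener`, `EventTarget` and `Event` are standard DOM interfaces.

```javascript
// Hypothetical sketch: class and method names are illustrative only.

// A link pairs a trigger (source element id + event type) with an action.
class Link {
  constructor(sourceId, eventType, action) {
    this.sourceId = sourceId;
    this.eventType = eventType;
    this.action = action;
    this.active = true; // a deactivated link ignores its trigger
  }
}

class DocumentInterpreter {
  constructor() {
    this.links = [];
  }

  // Called by script in the document to declare a link.
  addLink(sourceId, eventType, action) {
    const link = new Link(sourceId, eventType, action);
    this.links.push(link);
    return link;
  }

  // Attach every link whose source is `id` to `target`. The target only
  // needs to offer the DOM-style addEventListener(type, listener) method.
  bind(id, target) {
    for (const link of this.links.filter((l) => l.sourceId === id)) {
      target.addEventListener(link.eventType, (event) => {
        if (link.active) link.action(event);
      });
    }
  }
}

// Usage: a bare EventTarget stands in for a document element so that the
// sketch also runs outside a browser.
const interpreter = new DocumentInterpreter();
const log = [];
const link = interpreter.addLink('okButton', 'click', () => log.push('show scene 2'));

const button = new EventTarget();
interpreter.bind('okButton', button);

button.dispatchEvent(new Event('click')); // link fires
link.active = false;
button.dispatchEvent(new Event('click')); // link is inactive, nothing happens
// log now holds a single entry, 'show scene 2'
```

In a browser, `button` would instead be obtained from the loaded document, for example with `document.getElementById('okButton')`. Keeping the `active` flag on the link object rather than removing the listener is a design choice of this sketch, made so that a link can be switched on and off without touching the DOM.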
The Interpreter maps the links onto the DOM event-handling model by creating and registering appropriate event listeners through the DOM API. Events arising from user interaction with document elements are handled by JavaScript demons associated with each tag for which user interaction is enabled; these demons make calls to the API of the Document Interpreter.

Although such a system provides an event-handling model, it does not provide any equivalent of the model for synchronisation and timing of objects that is defined in MHEG; this, however, is the domain of SMIL. SMIL provides tags that can be used to express, for example, the parallel rendering of real-time synchronised streams. Such tags can also be integrated into the XHTML/XML documents that are loaded into the browser. The equivalent of the MHEG engine functionality thus becomes spread between the DOM and SMIL functionality that will reside in future browsers and the JavaScript library that is associated with the MHEG presentation and forms the Document Interpretation engine.

A similar example to MHEG is the Japanese Broadcast Markup Language (BML). BML is an XML-based representation incorporating XHTML 1.0, ECMAScript, CSS1/2, and DOM Level 1 with extensions. New tags are defined to handle synchronisation and dynamic lists; there are extensions to CSS to encode navigation, resolution and colour information; and ECMAScript classes are defined to handle transmission stream, document switching and persistent memory functions.

We argue that the W3C standards described above provide the basis for a common model of representation and processing of WWW and interactive TV documents, in which the core functionality of architectures such as MHEG can be captured, but which general browsers can use without the need for translation or for a specialised engine independent of the browser.