Universal Control Hub & Pluggable User Interfaces

Gottfried Zimmermann

The Universal Plug and Play (UPnP) Forum has developed specifications for device classes, so-called "Device Control Protocols" (DCPs). Each DCP defines a common machine-level interface for a class of UPnP devices and their embedded services, in terms of mandatory and optional actions and state variables. A control point that knows in advance which DCP a device implements can thus easily make use of its interface.

However, a DCP says nothing about how a control point should present a UPnP device and its services to a human user. The UPnP specifications explicitly stay away from defining or adopting a framework for user interfaces. The presentation page URL, an optional part of a UPnP device description, hints at the need for a common mechanism for specifying a remote user interface for a UPnP device. However, the meaning of this presentation page is unclear, so it has to rely on proprietary solutions and is therefore hardly used in UPnP implementations.

The Universal Remote Console (URC) framework, specified as a family of ANSI standards (ANSI INCITS 389-2005 through 393-2005), can fill the user interface gap in UPnP. It defines a "Protocol to Facilitate Operation of Information and Electronic Products through Remote and Alternative Interfaces and Intelligent Agents". Despite the title, its purpose is to define a user interface layer on top of any existing interoperability framework for device discovery, control and eventing. It does so by defining a family of XML-based languages for specifying cross-device user interfaces. The most prominent component of the URC framework is the "User Interface Socket", a low-level common denominator for all user interfaces that can be used with a device. High-level, polished user interfaces can be "plugged" into the socket, thus reusing the application-specific code contained in the User Interface Socket.

This paper describes how the Universal Remote Console framework can be combined with the UPnP DCP approach, with the "Universal Control Hub" being the core component of the proposed architecture. In a nutshell, the Universal Control Hub (UCH) allows for both thin UPnP devices and thin user interface clients. It provides the UPnP architecture and its DCP specifications with a common user interface layer, thus allowing UPnP devices to project their user interfaces on remote clients that they have no knowledge of.

1. Proposed Architecture

Figure 1: Universal Control Hub Architecture

At the center of the proposed architecture (see figure 1) is the Universal Control Hub (UCH), which acts as a gateway between a user interface client ("UI client") and any UPnP device that the client wants to access and control (here called "back-end device"). The UCH uses the UPnP-defined DCPs to communicate with the back-end devices. For example, it interacts with a UPnP-enabled TV or DVR through the AV Device Control Protocol, and with a UPnP thermostat through the HVAC DCP. As mentioned previously, these UPnP DCPs do not include any mechanism for specifying user interfaces or for remoting them. (One exception is the RemoteUI DCP, which is used for a different purpose; see below.) Thus a manufacturer can deploy very simple devices that know nothing about user interfaces and rely solely on being controlled through their corresponding DCPs.

On the other end, the UI clients know nothing about the DCPs used for controlling the back-end devices. Some of them may use UPnP RemoteUI to discover the UCH and to pick the remoting protocol of their choice, thus acting as a RemoteUI client (or at least as a control point to the RemoteUI server). Others may know how to initiate a specific remoting protocol with the UCH by some other means (e.g. through setup). In any case, a UI client finds a remoting protocol ("remoting channel") on the UCH through which it can remotely access and control any of the back-end devices.

Figure 1 lists some remoting channels as examples:

Remoting channels are in most cases offered to UI clients by URI. (The phone line is an exception.) The URI scheme denotes the remoting channel, but may also contain a session identifier or other information about the state of a control interaction. For example, the URI "http://192.168.1.1/svg" may serve a portal-style SVG interface that lets the user pick a device from a list of available back-end devices. For a UI client that has already picked a channel and back-end device through the UPnP RemoteUI procedure, the URI "http://192.168.1.1/svg/dvr" may immediately provide an SVG interface for the DVR.
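The URI convention in the two examples above can be sketched as a small parser. This is only an illustration: it assumes that the first path segment names the remoting channel, that an optional second segment names the back-end device, and that a session identifier, if any, travels in a "session" query parameter. The names ChannelRequest and parse_channel_uri are hypothetical and not part of the URC or UPnP specifications.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlsplit


class ChannelRequest(NamedTuple):
    """What a UCH could read from a remoting-channel URI (hypothetical type)."""
    channel: str            # e.g. "svg" or "html"
    device: Optional[str]   # None means: serve the portal listing all devices
    session: Optional[str]  # optional session identifier


def parse_channel_uri(uri: str) -> ChannelRequest:
    """Split a channel URI into channel, device and session (assumed layout)."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URI does not name a remoting channel")
    channel = segments[0]
    device = segments[1] if len(segments) > 1 else None
    # parse_qs maps each parameter name to a list of values.
    session = parse_qs(parts.query).get("session", [None])[0]
    return ChannelRequest(channel, device, session)
```

With this layout, "http://192.168.1.1/svg" yields a request without a device (the portal case), while "http://192.168.1.1/svg/dvr" names the DVR directly.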

One should not think of a channel as delivering one static version of a user interface. Server-side adaptation mechanisms may be built into the channel protocol so that the delivered user interface is adapted to the UI client’s properties, such as screen size and user input capabilities. For example, when a UI client requests a user interface over HTTP, the HTTP headers may carry information about the device and the user’s preferences. Also, some user interface descriptions allow client-side adaptations such as scaling and reformatting.

Some UI clients can use only one remoting channel; others can choose among several. For example, a desktop or laptop computer can easily use the following remoting channels: SVG on HTTP, HTML on HTTP and Flash Remoting. A PDA can use HTML on HTTP and Flash Remoting. A TV set with a remote control can use HTML on HTTP or Flash Remoting, depending on its software capabilities. A cell phone could use any of Flash Remoting, XRT2, or VoiceXML over the phone line. And a plain old telephone could use voice-based VoiceXML user interfaces for remote control.

So far we have looked at the UCH as a "black box" that somehow bridges between user interface protocols on the UI client side and UPnP DCP-based protocols on the back-end side. But how does the UCH generate the user interface descriptions that it serves over the remoting channels? Does it use pre-defined documents that are hard-coded into the UCH? The answer is "YES" and "NO". "YES" because one part of a user interface (the "User Interface Socket") is pre-defined for any DCP-standardized UPnP device. "NO" because the manufacturers of the back-end devices have a great interest in being able to project "their user interfaces" (bearing their corporate identity) onto the UI clients.

The User Interface Socket is the part of the remotable user interface that stays the same regardless of which remoting channel is used to convey the user interface. It is the "common denominator" of all user interfaces for a specific back-end device, defined by its manufacturer. This includes all types of user interfaces with any output modality (visual, auditory, tactile, or any combination) and any input modality (keyboard, mouse, touch, stylus, handwriting, gesture, etc., or any combination). A User Interface Socket contains a flat set of low-level user interface elements (called "socket elements") that provide a synchronized communication channel to the back-end device and its current state. Socket elements add a logical layer on top of the DCP-based constructs that is closer to actual user interfaces than the UPnP DCP constructs are. It is easier for user interface developers to bind their widgets to UI Socket elements than to DCP actions and state variables of the back-end device. Socket elements are either variables, commands or user notifications. The description of the UI Socket (the "Socket Description") also specifies how socket elements depend on each other, for example that the "volume" variable can only be modified if "mute" is off.
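The socket model described above can be illustrated with a small in-memory sketch. It is not the URC Socket Description format, which is XML-based; it is a hypothetical Python stand-in that shows the three kinds of socket elements (variables, commands, notifications) and a dependency of the "volume is writable only while mute is off" kind. All class, method and element names are assumptions made for this illustration.

```python
from typing import Any, Callable, Dict, List


class UISocket:
    """Flat set of socket elements: variables, commands and notifications."""

    def __init__(self) -> None:
        self.variables: Dict[str, Any] = {}
        self.commands: Dict[str, Callable[[], None]] = {}
        self.notifications: List[str] = []
        # Dependencies: variable name -> condition that must hold to write it.
        self.write_guards: Dict[str, Callable[["UISocket"], bool]] = {}
        # Pluggable user interfaces register here to stay synchronized.
        self.listeners: List[Callable[[str, Any], None]] = []

    def set_variable(self, name: str, value: Any) -> bool:
        """Write a variable if its dependency allows it; report success."""
        guard = self.write_guards.get(name)
        if guard is not None and not guard(self):
            return False
        self.variables[name] = value
        for listener in self.listeners:
            listener(name, value)
        return True

    def invoke(self, name: str) -> None:
        """Run a command element."""
        self.commands[name]()

    def notify(self, message: str) -> None:
        """Queue a user notification for the attached user interface."""
        self.notifications.append(message)


def make_tv_socket() -> UISocket:
    """Build a toy socket for a TV (element names are illustrative only)."""
    socket = UISocket()
    socket.variables.update({"mute": False, "volume": 10})
    # "volume" may only be modified while "mute" is off.
    socket.write_guards["volume"] = lambda s: not s.variables["mute"]
    socket.commands["powerOff"] = lambda: socket.notify("TV is switching off")
    return socket
```

A widget in any Pluggable User Interface would bind to these elements by name, for example a slider calling set_variable("volume", 20), without knowing which DCP actions the UCH issues to the TV in response.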

For today’s UI client devices, the User Interface Socket alone would not provide enough information for constructing a nice-looking user interface. What is missing are concrete instructions on how to build the user interface: which widgets to use and how to arrange and structure them. Also, labels need to be provided for the UI Socket elements. Widgets, structure and layout are provided by a "Pluggable User Interface", a channel-specific user interface description that plugs into a particular User Interface Socket. In general, a manufacturer will provide for each of its products a User Interface Socket plus a set of Pluggable User Interfaces for the most common UI client types, and deploy them to a Resource Server. The Resource Server may be company-owned or provided by another organization such as a consortium. Other parties may create complementary Pluggable User Interfaces and make them available through the same or other Resource Servers. A Universal Control Hub that encounters a particular back-end device will look for Pluggable User Interfaces for that back-end device, searching on any Resource Server on the Internet.

At this point it is important that there be a defined procedure by which the UCH decides which Pluggable User Interface to use if several are available. The UCH is part of an implicit contract between the back-end devices and the UI clients. The agreement is that if the manufacturer of a back-end device provides a Pluggable User Interface for a specific remoting channel, this Pluggable UI is the default user interface to be rendered on the UI client when using that channel. Only if no user interface is available from the manufacturer of the back-end device, or if for some reason it is not usable by the UI client or its user, may user interfaces from other parties replace the default one. For example, if the user understands only Japanese, but the manufacturer of the back-end device provides only European-language user interfaces, a Japanese user interface that was created by a third party for that back-end device may fill in.
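The selection rule above can be sketched as follows. This is a minimal illustration, assuming that "usable" can be reduced to a match on the user's languages; a real UCH would likely consider further criteria. The PluggableUI record and the select_pluggable_ui function are hypothetical names, not part of the URC standards.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PluggableUI:
    """Metadata a UCH might keep per Pluggable User Interface (assumed)."""
    device: str
    channel: str
    language: str
    from_manufacturer: bool


def select_pluggable_ui(
    candidates: List[PluggableUI],
    device: str,
    channel: str,
    user_languages: List[str],
) -> Optional[PluggableUI]:
    """Prefer the manufacturer's UI; fall back to third parties if unusable."""
    usable = [
        ui for ui in candidates
        if ui.device == device
        and ui.channel == channel
        and ui.language in user_languages
    ]
    # The manufacturer's user interface is the default whenever it is usable.
    for ui in usable:
        if ui.from_manufacturer:
            return ui
    # Otherwise a third-party user interface may fill in, if one exists.
    return usable[0] if usable else None
```

In the Japanese example, the manufacturer's English and German user interfaces are filtered out as unusable for the user, so the third-party Japanese user interface is returned.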

2. Sample Scenarios

To illustrate how this all works together, let’s look at some example scenarios.

Figure 2: Sample scenario - Computer controlling TV

In the first scenario (figure 2), a user wants to use their desktop computer (as UI client) to control the TV in the living room. Both are connected to the home network. The Universal Control Hub advertises itself as a UPnP RemoteUI server device. Since the computer is UPnP-aware, it discovers the UCH and interacts with it through the RemoteUI DCP. Thus the computer finds out that the UCH provides an HTML/HTTP-based remoting channel for controlling the TV. By following the corresponding URI for the HTML/HTTP channel, the computer opens an HTML/HTTP-based control session on the UCH for the TV. For this session, the UCH uses an HTML/HTTP Pluggable User Interface that it retrieved from the TV manufacturer’s Resource Server on the Internet.

Figure 3: Sample scenario - Same TV controlled by cell phone using Flash

The next scenario (figure 3) has the same TV being controlled by a cell phone instead of the computer. The cell phone cannot render HTML, but finds a remoting channel for Flash Remoting clients. Since it has a lightweight Flash player installed, it follows the corresponding URI. From the initial list of back-end devices that the UCH projects onto the cell phone’s screen, the user picks the TV. The UCH retrieves and installs the TV manufacturer’s Pluggable User Interface for the Flash Remoting channel, if it is not already installed. Then the TV’s Flash user interface is rendered on the cell phone, as defined by the TV manufacturer.

Instead of starting a new session with the cell phone controlling the TV, the same session could have been migrated from the computer to the cell phone by some kind of session URI manipulation. For example, if the URI "http://192.168.1.1/html/tv?session=xyz" denotes an HTML/HTTP-based control session for the TV, the URI "http://192.168.1.1/flashremoting/tv?session=xyz" could denote the same session, but served through the Flash Remoting channel. When a session is migrated from one channel to another, the Pluggable User Interface is replaced but the User Interface Socket remains.
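The URI manipulation in this example can be sketched in a few lines. The sketch assumes, as in the example URIs, that the channel is the first path segment and that the session identifier is carried unchanged in the query string; the function name migrate_session_uri is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def migrate_session_uri(uri: str, new_channel: str) -> str:
    """Return the URI of the same session, served through another channel."""
    parts = urlsplit(uri)
    segments = parts.path.split("/")  # "/html/tv" -> ["", "html", "tv"]
    if len(segments) < 2 or not segments[1]:
        raise ValueError("URI does not name a remoting channel")
    segments[1] = new_channel  # swap only the channel segment
    # Device, query string (session identifier) and host stay untouched.
    return urlunsplit(parts._replace(path="/".join(segments)))
```

Calling migrate_session_uri("http://192.168.1.1/html/tv?session=xyz", "flashremoting") produces the Flash Remoting URI from the example. On the UCH side, the session identifier would be the key under which the User Interface Socket state is kept while the Pluggable User Interface is exchanged.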

Figure 4: Sample scenario - Same phone controlling thermostat, using VoiceXML

In the third scenario (figure 4), a user wants to set the temperature of the thermostat at home while driving home. Because she is driving, she cannot look at the cell phone’s screen. Instead she dials the UCH’s private phone number and hears: "Here is the Universal Control Hub. Say one of these: TV, DVR, thermostat." She says "Thermostat" and hears "Thermostat selected". She: "Set temperature to 68 degrees." The UCH responds: "Thermostat set to 68 degrees". The user hangs up.

In this scenario the UCH retrieves a Pluggable User Interface for the VoiceXML channel from the thermostat manufacturer’s Resource Server, and binds it to the User Interface Socket for the UPnP enabled thermostat. A VoiceXML interpreter (with phone line connection) is acting as UI client.

Figure 5: Sample scenario - Aggregated Flash UI for TV and DVR, showing on TV

The last scenario (figure 5) illustrates how aggregated (compound) Pluggable User Interfaces may be used to project a single user interface comprising functions of multiple back-end devices. Here a TV is used to render a Flash Remoting-based user interface for both the TV and the DVR. For example, this user interface could contain the volume slider for the TV and the channel selection list for the DVR.

The UCH finds a Flash Remoting-based aggregated Pluggable User Interface for the TV and the DVR in the home network. In the list of available remoting protocols that it announces through the UPnP RemoteUI server service, it can now offer a URI for a "TV+DVR" user interface session based on Flash Remoting. The user can pick the session using the remote control of the TV, and thus have the aggregated user interface rendered on the TV screen.

3. How the Proposed Architecture Adds Value to UPnP

By proposing a Universal Control Hub as middleware layer between the DCP based back-end devices and the remoting protocol based UI client devices, we identify the following added values:

(1) The UCH provides a solution for device independence. Through its remoting channels it offers a set of diverse user interfaces that are tailored to specific UI client devices. A "device-independent HTML version" should be provided for any back-end device by its manufacturer. This HTML version may be used as a fallback option when no tailored user interface exists for a particular UI client device. If written carefully, HTML code is suitable for almost any type of graphical user interface. Guidelines for writing "device-independent HTML" should be provided for developers of back-end devices.

(2) The UCH provides an open platform for Pluggable User Interfaces. This brings about the following features:

(a) The manufacturers of back-end devices can project their user interfaces onto UI client devices. The UCH acts as a broker of remotable user interfaces between the back-end device and a UI client device. Neither the UI client device nor the UCH needs to be made by the manufacturer of the back-end device.

(b) Easy internationalization (i18n), since Pluggable User Interfaces can easily be provided in multiple language versions. Also, third parties can be commissioned to translate Pluggable User Interfaces and post them to a Resource Server.

(c) Simplified programming model for user interface designers designing for complex UPnP devices. Some UPnP DCPs are very complex and push the limits of what the UPnP architecture can achieve. For example, the AVTransport service template (part of the AV DCP) defines an evented state variable "LastChange" that provides a summary of other state variables’ value changes in the form of an XML document. A user interface designer would therefore have to write XML parsing code in order to trigger user interface updates based on a back-end device’s state change. The User Interface Socket frees the user interface designer from having to deal with XML parsing. Instead, the socket layer provides a flat set of variables, commands and user notifications that UI designers can bind their interfaces to. APIs for the User Interface Socket layer will exist for common user interface description languages.
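To make the LastChange example concrete, the following sketch shows the kind of XML handling that the socket layer would take over from the user interface designer. The sample document is a simplified LastChange-style event; its element layout (an Event root, InstanceID children, and one child element per changed state variable with a "val" attribute) is assumed here and should be checked against the AVTransport service specification. The function name and the key format are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Simplified, assumed example of a LastChange event document.
SAMPLE_LAST_CHANGE = """\
<Event xmlns="urn:schemas-upnp-org:metadata-1-0/AVT/">
  <InstanceID val="0">
    <TransportState val="PLAYING"/>
    <CurrentTrack val="3"/>
  </InstanceID>
</Event>"""


def flatten_last_change(xml_text: str) -> Dict[str, str]:
    """Turn a LastChange-style document into a flat name -> value mapping."""
    root = ET.fromstring(xml_text)
    flat: Dict[str, str] = {}
    for instance in root:                     # one element per instance
        instance_id = instance.get("val", "0")
        for variable in instance:             # one element per changed variable
            # ElementTree tags look like "{namespace}Name"; keep only "Name".
            name = variable.tag.rsplit("}", 1)[-1]
            flat[f"{instance_id}/{name}"] = variable.get("val", "")
    return flat
```

A UCH could apply such a mapping once, inside the socket layer, and expose entries like "0/TransportState" as ordinary socket variables, so that Pluggable User Interfaces only ever see flat name and value pairs.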

(3) The User Interface Socket model provides an open platform for task-oriented user interfaces. The Socket Description declares how individual UI Socket elements depend on each other. UI Socket elements are suitable for forming the leaf nodes of a task-model tree, with their parent nodes being tasks at various aggregation levels. Task-model trees could be published for device classes or combinations of device classes by the vendors of these devices, or by any third party. Even without a task-model tree, the option of using an aggregated Pluggable User Interface that binds to multiple UI Sockets is a first step toward task-oriented user interfaces.

(4) The User Interface Socket model also provides an open platform for future usage scenarios involving intelligent user agents and natural language interaction. It is expected that these kinds of next-generation user agents will provide an answer to the simplicity challenge of consumer electronics. An intelligent agent will make the User Interface Socket the basis of its device and service assessment, with possible extensions of the Socket Description toward knowledge modeling and Semantic Web technologies. By introducing the basic model of a User Interface Socket today, we benefit in multiple ways today and are also ready for the user interface technologies of tomorrow.

Appendix: Glossary of Main Components

User Interface Socket

Pluggable User Interface

Universal Control Hub

Resource Server

This site is maintained by the University of Wisconsin Trace Center, a member of the Universal Remote Console Consortium.