MPEG-7 to Ease Search, Retrieval


With the explosion of audio and video files now within a keystroke's reach of millions of networked computer users, navigating through the dense jungle of streaming and archived media files without a good map and compass can prove to be an adventure.

The Moving Picture Experts Group (MPEG), however, is hammering out a complex specification, MPEG-7, that would enable audio and video files to contain descriptive data, or "metadata," making search and retrieval easier and far more sophisticated.

Eric Rehm, CTO of and a contributor to the MPEG-7 spec, estimates that between 5 million and 20 million Internet media streams are available to users, making finding specific clips or files difficult at best.

"The amount of audio and video data will (only) increase," conceded Sylvie Jeannin, senior member of the research staff for Philips Research USA and a contributor to the MPEG-7 visual group. "We need ways to access it more efficiently."

Factor in the surging popularity of personal video-recording services, which are on the road map of nearly every advanced digital set-top box maker, and the day is quickly approaching when consumers will have gigabytes of multimedia files on set-top hard drives without any standardized retrieval tools.

The recent completion of a committee draft form of the MPEG-7 spec is a significant milestone toward producing a stable specification that will allow companies "to really start to think about products," said Jeannin.

MPEG-7 is already starting to appear on the cable industry's radar screen. Cable Television Laboratories Inc. is following its progress, and Canal Plus U.S. Technologies is "very interested" in the development of the spec, according to a company official.

IBM Corp.'s T.J. Watson Research Center, Sony Corp. of America, Philips Consumer Electronics, AT&T Labs, Sharp Labs of America and Mitsubishi Electric Research Laboratories all have representatives hammering out various aspects of the MPEG-7 spec. is marketing a multimedia indexing service that uses MPEG-7 to content creators. Thomson Multimedia, parent of Thomson Consumer Electronics, purchased the company in August.

The acquisition is part of a broad Thomson initiative called Digital Media Solutions. It's meant to push the publishing, managing and distribution of audio and video content over broadband networks.

Thomson has already enlisted heavyweights Microsoft Corp., Alcatel Alsthom and DirecTV Inc. Expect MPEG-7 to play a role in Thomson's grand design to make "rich media" accessible to the broadband masses.

MPEG-7, unlike its cousins MPEG-1, MPEG-2 and MPEG-4, is not a video-compression format. Instead, the MPEG-7 specification provides a means to describe various aspects of multimedia based on several parameters, such as title, author or copyright elements.

But the descriptive schemes that MPEG-7 has outlined go much deeper. Imagine, as Jeannin suggested, accessing only the portions of a recorded soccer match in which goals are scored, instead of browsing through the entire game.
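
As a rough illustration of the soccer example, the sketch below assumes a recording has already been split into time segments, each carrying an event label. The `Segment` class, the `event` field and the label strings are hypothetical and are not taken from the MPEG-7 draft; the point is only that once segments are described, retrieving the goals becomes a simple filter.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One described stretch of a recording (hypothetical structure)."""
    start_s: float  # segment start, in seconds from the beginning
    end_s: float    # segment end, in seconds from the beginning
    event: str      # hypothetical event label, e.g. "goal" or "play"


def find_events(segments, event):
    """Return (start, end) pairs for every segment tagged with `event`."""
    return [(s.start_s, s.end_s) for s in segments if s.event == event]


# Made-up description of a recorded match.
match = [
    Segment(0.0, 754.0, "play"),
    Segment(754.0, 790.0, "goal"),
    Segment(790.0, 3120.0, "play"),
    Segment(3120.0, 3151.0, "goal"),
]

goals = find_events(match, "goal")
print(goals)
```

A playback device could then jump straight to each returned time range rather than scanning the whole game.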

MPEG-7 descriptive information could be attached to MPEG-2 or MPEG-4 streams, Jeannin said, much in the way audio and video are mixed in an MPEG-2 stream. The object-based representation and coding of visual data that forms the basis for MPEG-4 compression makes it well suited to adding MPEG-7 description information.

Detailed information about a multimedia file, such as a news footage clip or music track, could include MPEG-7 descriptive information embedded in the stream. This would allow a playback or recording device, such as a set-top box, to store that information, which would assist in searching for and retrieving specific files.

How a set-top or receiver would decode this information, whether through hardware or software, remains unclear. Carrying it in the Vertical Blanking Interval or using Advanced Television Enhancement Forum (ATVEF)-based code are two possibilities.

Developing the means to carry MPEG-7 over MPEG-2 and MPEG-4 streams has become a group priority, said Rehm.

"There's no doubt that carrying metadata inside of MPEG-4 will be pretty important," he said.

At its core, MPEG-7 provides for so-called descriptors, description schemes and a description definition language, which will be based on XML Schema, a method of defining the structure of documents written in eXtensible Markup Language (XML). Many media-company Web pages are now built using XML.
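
To give a sense of what an XML-based description might look like to software, the sketch below parses a small description with Python's standard library. The element names (`ClipDescription`, `Title`, `Author`, `Copyright`) and their values are invented for illustration; they are assumptions, not the element names defined by the MPEG-7 description definition language.

```python
import xml.etree.ElementTree as ET

# Hypothetical description of a news clip. These element names are
# placeholders and do NOT come from the MPEG-7 schema.
DESCRIPTION = """\
<ClipDescription>
  <Title>Evening news, top story</Title>
  <Author>Example Newsroom</Author>
  <Copyright>Example Broadcasting</Copyright>
</ClipDescription>
"""


def read_description(xml_text):
    """Map each top-level child element's tag to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = read_description(DESCRIPTION)
print(fields["Title"])
print(fields["Author"])
```

Because the description is ordinary XML, any device or application with an XML parser could read the same fields without knowing how the audio or video itself is encoded.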

MPEG-7-based content descriptions would work much like the so-called meta tag descriptions written into the Hypertext Markup Language (HTML) of Web pages.

Meta tag description and keyword information is written by Web authors as they develop the HTML code for Web pages so search-engine robot programs can determine what the pages are about.
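
The meta tag comparison can be made concrete with a short sketch of what a search-engine robot does when it reads a page. The sample page and the `MetaTagReader` class are made up for illustration; only the `name` and `content` attributes of the HTML `meta` element are standard.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect name/content pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Made-up page of the kind a Web author might write.
PAGE = """<html><head>
<title>Match highlights</title>
<meta name="description" content="Goals from the weekend soccer matches">
<meta name="keywords" content="soccer, goals, highlights">
</head><body></body></html>"""

reader = MetaTagReader()
reader.feed(PAGE)
print(reader.meta["description"])
print(reader.meta["keywords"])
```

An MPEG-7 description would play a similar role for a media stream, telling software what the content is about without requiring it to analyze the pictures or sound.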

While Jeannin stressed that MPEG-7 is not being developed or targeted for a single application, several uses of the spec have been identified, including the creation of digital libraries, such as image categories or musical dictionaries. It could also be used for multimedia directory services or "yellow pages," as well as personalized electronic news services.

The MPEG-7 group hopes to ease interoperability among applications designed to search out, retrieve and filter not only multimedia files but also components of their content, such as subject matter or specific people.

In addition to standard text-based information about content, MPEG-7 also addresses subtle technical parameters of multimedia content, such as texture, shapes, color and motion, allowing for speech and face recognition, said Rehm. An application using these parameters could let a user navigate through digital photos and seek shots of sunsets or family members, for example.
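
One simple way an application could compare images by color is sketched below. It uses a coarse color histogram and histogram intersection, a generic technique chosen here for illustration; it is not the color descriptor defined in the MPEG-7 draft. The function names, the tiny pixel lists and the photo names are all assumptions, and each image is assumed to contain at least one pixel.

```python
def color_histogram(pixels, bins=4):
    """Normalized histogram over coarsely quantized (r, g, b) pixels.

    Assumes `pixels` is a non-empty list of 0-255 integer triples.
    """
    step = 256 // bins
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        counts[index] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Tiny made-up "images": lists of RGB pixels.
sunset_query = [(250, 120, 30)] * 6 + [(90, 30, 60)] * 4
library = {
    "beach_sunset": [(245, 110, 20)] * 7 + [(80, 20, 50)] * 3,
    "forest_walk": [(30, 140, 40)] * 8 + [(90, 70, 30)] * 2,
}

query_hist = color_histogram(sunset_query)
scores = {
    name: histogram_intersection(query_hist, color_histogram(pixels))
    for name, pixels in library.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

If such color summaries were stored as metadata alongside each photo, a search for sunsets would compare the stored summaries instead of reprocessing every image.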

In the near term, program-guide information may be one of the first ways that MPEG-7-based applications come to market. Both TiVo Inc. and ReplayTV Inc. are members of an industry group called the TV-Anytime Forum, which is charged with enabling audio and video services based on high-volume digital storage.

ReplayTV vice president of product management Steve Shannon said the forum is watching to "make sure the MPEG-7 group considers local capture and playback of video."

Shannon sees a role for MPEG-7 in improving the accuracy of program guide information. By letting metadata about video content be included with the content itself, that information would be more precise. This would help when cable companies change their channel lineups, for example.

In terms of the value of searching and indexing video, Shannon said, "there are ways to do that without MPEG-7, but they are all less precise and clean than if the data defined itself."

He also said MPEG-7 would aid television-commerce application development, specifically product-placement advertising, which requires some data to be embedded in the video stream.

TiVo chief technology officer Jim Barton, though not excited about embedding data in video streams, saw some advantages in terms of handling a multitude of electronic program guide (EPG) suppliers.

"We'd like to see some XML-based program guide data," he said.