How do you optimize, automate and enrich your collection metadata?

We cannot ‘read’ pixels and samples, the raw information from which audiovisual data is built; they have no direct meaning. An extra step is needed to bridge the so-called semantic gap between form and content: describing the audiovisual data in words, so that it can be archived by content and becomes searchable. Traditionally, this step was done manually, for example by documentalists. Today, substantive information that is stored at different places in the production chain is increasingly reused in a structured manner. In recent years, automatic techniques have also gradually come into wider use as a tool.

Manual and automatic descriptions are considered ‘representations’ of the data: each represents a certain ‘view’ on it. This can be the view of a person with a certain background (broadcaster, documentalist), or of an automatic algorithm that focuses on one modality (speech, image) or on a specific aspect within a modality (sounds, words, the speaker or emotions in audio; faces, objects and actions in the picture). The descriptions are either structured in the metadata or stored separately as a specific form of description. The challenge is to organize the description process as efficiently as possible while increasing the quantity and detail of the descriptions. When automated processes are used, the quality of the descriptions and the way that quality is monitored become prominent questions. A search system indexes these metadata and descriptions to make searching in audiovisual content possible (more on this in the theme Access).
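To make the indexing step concrete, the following is a minimal sketch of an inverted index built over descriptions from different sources. The item identifiers, source names and description texts are invented for illustration, and a production search system would do considerably more (ranking, stemming, time codes).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions):
    """Map each token to the set of (item_id, source) pairs that mention it."""
    index = defaultdict(set)
    for item_id, source, text in descriptions:
        for token in tokenize(text):
            index[token].add((item_id, source))
    return index


def search(index, query):
    """Return the ids of items whose descriptions contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    per_token = [{item_id for item_id, _ in index.get(t, set())} for t in tokens]
    return set.intersection(*per_token)


# Hypothetical descriptions: one manual, two automatic, one manual.
descriptions = [
    ("item-001", "documentalist", "Interview with the mayor about the new harbour"),
    ("item-001", "speech_recognition", "we are opening the harbour next year"),
    ("item-002", "object_detection", "boat crane harbour"),
    ("item-003", "documentalist", "Concert recording in the city park"),
]

index = build_index(descriptions)
harbour_items = search(index, "harbour")
mayor_harbour_items = search(index, "mayor harbour")
```

Because the index keeps the source of each description alongside the item identifier, a manual catalog entry and an automatic annotation of the same item can be told apart, which helps when monitoring the quality of automated descriptions.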

Metadata should be part of an internal and external network. They are exchanged within and between organizations. Agreements, standards and guidelines with regard to the form and content of the metadata are therefore crucial. These agreements can be made within the organization, but also on a national and international level.

Several types of metadata can be distinguished. Firstly, there are descriptive metadata, which describe the content of the audiovisual document. Examples are catalog descriptions, keywords, user annotations, controlled word lists and thesauri.

A second group consists of the administrative metadata, which are used for managing and administering collections. Examples are rights information and acquisition and conservation data. An increasingly important category is formed by the so-called ‘preservation metadata’, which record the life cycle of a digital object and thus demonstrate its authenticity through all kinds of dynamic modifications and processes.

Technical metadata, finally, describe technical properties of the document but also include certain system information. Examples are file locations, audiovisual formats, database schemas, compression data and authentication data.

Metadata are organized in models. Their definitions are fixed in metadata dictionaries.
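As an illustration of how these groups could sit together in one model, here is a minimal sketch in Python. All class names, field names and values are hypothetical and do not follow any particular metadata standard; preservation events are placed under the administrative metadata, following the grouping in the text above.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Describes the content of the audiovisual document."""
    title: str
    keywords: List[str] = field(default_factory=list)
    user_annotations: List[str] = field(default_factory=list)


@dataclass
class PreservationEvent:
    """One step in the life cycle of a digital object."""
    date: str
    action: str


@dataclass
class AdministrativeMetadata:
    """Used for managing and administering the collection."""
    rights: str
    acquisition: str
    preservation_events: List[PreservationEvent] = field(default_factory=list)


@dataclass
class TechnicalMetadata:
    """Technical properties of the document and system information."""
    file_location: str
    container_format: str
    compression: str


@dataclass
class CollectionItem:
    identifier: str
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata
    technical: TechnicalMetadata

    def log_event(self, date, action):
        """Append a preservation event so that modifications stay traceable."""
        self.administrative.preservation_events.append(
            PreservationEvent(date, action)
        )


# Hypothetical example values.
item = CollectionItem(
    identifier="item-001",
    descriptive=DescriptiveMetadata(
        title="Interview with the mayor",
        keywords=["harbour", "politics"],
    ),
    administrative=AdministrativeMetadata(
        rights="in copyright",
        acquisition="donated by broadcaster",
    ),
    technical=TechnicalMetadata(
        file_location="/archive/item-001.mxf",
        container_format="MXF",
        compression="lossless",
    ),
)
item.log_event("2020-01-15", "ingest")
item.log_event("2021-06-01", "format migration")

record = asdict(item)  # nested plain dict, ready for exchange or indexing
```

Keeping the field definitions in one place, as the dataclasses do here, plays the role that a metadata dictionary plays for a real model: every system that exchanges the record agrees on what each field means.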