Originally published: Angeles, M. (1998, Fall). Information Organization and Information Use of Visual Resources Collections. VRA Bulletin, 25 (3), 51-58.
Abstract
It is becoming clear to researchers studying image management -- mainly, those studying image organization, image perception, and retrieval -- that image retrieval systems do not serve a broad range of users. These systems are only useful to searchers with knowledge of the systems' methods of organization. Recent research suggests that there is not enough knowledge on image use issues which directly affect the usefulness of image retrieval systems. This paper will review user studies in visual resources and library and information work in order to compare the traditional and emerging paradigms of image retrieval and will comment on their implication for systems design.
Introduction
Visual resources (VR) collections provide valuable research materials, in the form of images, to students and professional researchers of art history. Images, however, in the context of visual resources collections have historically been difficult to classify for retrieval purposes. Art historians and librarians have settled on various conventions of art historical classification in order to arrange their image collections in a manner reflecting their scholarship, thereby facilitating retrieval through these classificatory systems. Classification, however, is only useful to those whose state of knowledge is the same as that of the system. Individuals who think of art objects or images in a different way (i.e., who classify by a different set of rules) may argue that the system is arbitrary and not useful given their individual method of ordering art information. The issue of classification is only one of many that have made it difficult to evaluate image retrieval systems in visual resources collections.
The trend in evaluating image retrieval systems has been to focus on the extent to which these systems represent images for a wide array of user needs. It is becoming clear to researchers studying image management -- mainly, those studying image organization, image perception, and retrieval -- that the systems for organization and retrieval of images do not serve a broad range of users. Rather, they are only useful to searchers with knowledge of the methods of organization associated with the information retrieval system.
It has been valuable to study image collections from the system perspective. However, an argument may be made that it is more valuable to determine the failure rate of these systems. An assessment of failure in systems provides impetus for the study of users and the use of image collections, and hopefully will lead to suggestions for systems designs which attempt to deal with the portion of users who are not finding what they want in the collection. Additionally, the trend for researchers in library and information studies has been to shift the focus of research from systems to users. The impact this has had on image management research is inestimable.
This paper will review user studies in the visual resources field in order to compare the traditional paradigms of image retrieval with emerging paradigms. Recent studies suggest that there is not enough knowledge about the image use issues that directly affect the usefulness of image retrieval systems. Current research involving non-art images, especially the work of Ornager (1997), Jörgenson (1995, 1996, 1997) and Turner (1994, 1995, 1997), suggests that image management can be very much informed by focusing on users and their information seeking behaviors.
Traditional classification systems may continue to serve art historians, but it is certainly worth investing in research that will provide alternate access points to collections, and that will attempt to decrease the failure rate of retrieval in our systems. I will review some of the research that focuses on user queries and user perception of images, and I will comment on its implications for systems design. The paper will start with a definition of the field of visual resources/image management, and it will discuss some of the issues and problems unique to visual resources collections.
Visual resources: Art images vs. Non-art images
The documents which are collected in VR collections are images in various media and materials including photographic prints, 35mm slides, lantern slides, negatives, postcards, illustrations, drawings, electronic images, etc. These documents serve as pictorial surrogates for art objects which exist or existed in the real world. As such, the image documents may exist in many versions, shot from many perspectives, and held in the collection in many formats. Betz (1982) and Barnett (1988) have shown us that, as a consequence, cataloging or indexing the empirical data related to images -- title, artist, medium, etc. -- is profoundly more complicated than the process of cataloging and indexing books.
The images which are usually collected in VR collections are those which picture an object or work falling within the domain of the visual arts. The distinction is made here because images which do not fit this description may be referred to in this paper as non-art images -- those which are not typically collected in VR collections, and those which do not serve typical art historical research purposes.
Image retrieval
Image provision in VR collections has been closely married to art historical classification systems which have guided image organization for retrieval purposes. In order to judge the usefulness of image retrieval systems, let us first compare the traditional image retrieval system with the traditional information retrieval model used for text documents.
In the traditional information retrieval (IR) model a query is posed to a system in the form of a surrogation of the information need. This surrogate is compared with surrogates for document contents or messages. If a document surrogate is found to match the user's surrogate the information need is fulfilled. This type of information retrieval functions by Boolean or exact match logic. The traditional IR model is illustrated in figure 1 below.
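The exact match logic described above can be sketched in a few lines. This is only an illustrative sketch, not a description of any particular system: the catalog contents, the identifiers such as `slide-001`, and the function names `boolean_and` and `boolean_or` are all hypothetical. Each document surrogate is modeled as a set of index terms, and a query surrogate matches only when its terms are literally present.

```python
# Hypothetical catalog: each document surrogate is a set of index terms.
SURROGATES = {
    "slide-001": {"painting", "renaissance", "italy", "portrait"},
    "slide-002": {"sculpture", "baroque", "italy"},
    "slide-003": {"painting", "baroque", "netherlands", "portrait"},
}


def boolean_and(query_terms, surrogates):
    """Return ids of documents whose surrogate contains every query term."""
    wanted = {term.lower() for term in query_terms}
    return sorted(
        doc_id for doc_id, terms in surrogates.items() if wanted <= terms
    )


def boolean_or(query_terms, surrogates):
    """Return ids of documents whose surrogate contains at least one query term."""
    wanted = {term.lower() for term in query_terms}
    return sorted(
        doc_id for doc_id, terms in surrogates.items() if wanted & terms
    )
```

Under these assumptions, a query for `["painting", "portrait"]` matches two slides, while a query for a term the indexer never assigned, such as `"mysterious"`, matches nothing at all. That all-or-nothing behavior is the limitation discussed in the rest of this paper.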
This type of system has served users of text collections for some time, in systems ranging from the Hollerith punch card systems to electronic computer databases, but the use of such systems has been limited in image collections. Historically, the predominant form of retrieval system in image collections is based on classification of the images according to art historical categories such as:
art historical period
geographical location
artist name and dates
customary title of work
or perhaps,
medium
period
culture or country
school or artist
customary title of work
and physical arrangement of these images in cabinets marked by agreed upon headings. The above information may be recorded in abbreviated form on the slide label with additional information such as dimensions of the work. The researcher who intends to find an image in such a system must move through the physical space of the collection, browsing through slide drawers following the above hierarchy until the required artist and work are found. It is obvious that such antiquated systems, while they may function for experienced researchers, make it impossible to find documents based on aspects of the images which are not represented by the above categories, such as subject matter.
The traditional IR model is usable because systems like OPACs, for instance, index a good deal of bibliographic data, and this data may be searched on a computer. The improved IR model uses algorithms that rank documents for retrieval and that utilize feedback methods to retrieve more relevant documents. Searching based on the former model, the traditional IR model, has only recently been possible in art historical VR collections.
Art historical research depends on the comparison of images. Topics of study that demand comparison of images with particular features in common, such as iconographic or pre-iconographic content, are not easily pursued in collections that adhere only to the traditional methods of image organization -- using physical arrangement to reflect categories of images, and using rudimentary cataloging and retrieval based on the traditional IR model. A need has long existed to broaden the paradigms of searching VR collections.
Problems related to image retrieval
The need to represent alternative aspects of images is identified in the work of art historian Erwin Panofsky. This need has also recently been identified in theoretical papers on image classification presented by art documentalists Barnett (1988), Markey (1988), and Shatford (1986). These papers assert that subject matter indexing, while it is valuable for exploring research topics, is a difficult activity which has only recently been identified in the image management/art documentation literature. In these papers, the determination of the "ofness" (that which is objectively depicted in an image, often referred to as a pre-iconographic description) and the "aboutness" (that which an image subjectively represents, often referred to as an iconographic description) of images is explored. The identification of new aspects of image description which may be utilized by users searching a system is one step towards understanding images and representing new forms of human descriptions of images in our retrieval systems.
The field of semiotics has also contributed to our knowledge of images and their elusive qualities. We may find particularly compelling observations in the work of Roland Barthes, Umberto Eco, and E.H. Gombrich. Steiner (1981) might remark that the traditional image indexing methods are attempts to group the semiotics of art (iconology) with the semiotics of art history (iconography). Indeed, there appear to be many levels of cognitively approaching an image. Semiotics often attempts to break down images into elements of meaning and codes. A few observations of semioticians can be found in Steiner's collection of essays. Gombrich suggests that images which reflect nature are recognizable for natural and biological reasons, but abstract features are more difficult to deal with. Barthes calls for a treatment of the perceptual/cognitive aspects of art in terms of the viewer, and this adds to our difficulty in representing images systematically because individuals may hold different conceptions and descriptions of an image.
Arnheim (1980) notes that psychologists such as B.F. Skinner suggest that we should focus on individual ways of looking at images and interpreting their meanings. Immeasurable aspects of images are often referred to as noise because it would indeed be difficult to systematically account for individual descriptions of image content. In library and information studies, however, these aberrant ways of looking at images are valued and are dealt with seriously because they represent the types of queries that systems should be able to handle.
In library and information work, the trend has been to focus on various aspects of the user engaging in an information seeking episode. The current period of user centered research comes in the wake of a paper by Dervin (1976) on sense making in information seeking. This user centered approach has also been seen in other influential library and information science research -- Belkin's work in Belkin, Brooks, and Oddy (1982) regarding anomalous states of knowledge in users during information seeking episodes, Kelly's (1963) work on personal construct theory, Kuhlthau's (1988) work on affective aspects of users in information seeking, and Taylor's (1986) investigations of cognitive processes in information seeking. An overview of the theories related to this work is given in Kuhlthau (1991). These papers show that during an information seeking episode, many factors are involved in the initiation and procedure of information seeking for problem solving. Information seeking is often characterized by non-specifiability of the user need -- users cannot specify what they do not know -- and by other situational, cognitive, and affective aspects. The question often raised by these researchers is how we can deal with non-specifiability in order to better serve users in problem solving. This type of research clearly suggests that individuals must be considered in the context of their separate information seeking behaviors. User descriptions of information needs that are vague or that do not match a description in the system should not be thought of as noise.
O'Connor (1995) suggests that the ultimate problem in image retrieval is that images often escape description and are therefore resistant to classification. Given the above research, it is clear that we should focus on discrete aspects of image description and image perception in order to better understand how we may design image retrieval systems to reflect more types of situations and queries -- to serve more users. The understanding of individual methods of classifying or describing pictures may, therefore, be positive. A step towards this goal is examining image seeking behaviors in terms of the questions and the contexts in which they arise. We should look at user studies and image perception to understand the various approaches to problem solving using an image collection, and should assess the effect of image retrieval systems on the outcome of image seeking episodes.
Image seeking paradigms in visual resources collections
Image seeking paradigms have historically been determined by systems. Later proposals for information systems facilitated multiple information seeking models. The different types of image seeking paradigms will be analyzed below.
Browsing (the traditional image seeking paradigm)
The traditional image seeking paradigm in a VR collection is characterized by physical and cognitive movement through the arrangement of items in slide cabinets. Roberts (1976) described this method of organization as an arrangement of images by categories of medium, period, culture or country, and school or artist. Searching meant browsing through the categories under which one believed the image to be described. Once an image or set of images had been found in various cabinets, they could be compared by viewing on a light box. While this method of organization served art historians well because it mirrored the categorization that is typical of the study paradigm of art history students, it did not serve researchers who were increasingly searching for visual materials by different categories. Roberts indicated that the alternative approach would be to additionally organize images by sign, symbol, and meaning, and to use indexes. Roberts suggested that this alternative approach would deal with the facets of description identified by Panofsky.
Bradfield (1976) surveyed slide libraries and their users' information seeking behaviors. Her study had important implications for assessing traditional browsing systems. Bradfield discovered that the following issues contributed to success in satisfying users' needs and lessening frustration in using the VR collection: 1) browsability in the collection, 2) comfort of the facility, 3) an atmosphere conducive to searching and to consulting librarians, and 4) having considerable time available for browsing. The most significant reasons for failure in retrieval dealt with browsability and time. Bradfield believed that "browsing is vital" because browsing presents alternatives in the idea generating process, and that "serendipity is valuable". Most users in this study attributed a failure to retrieve materials to a lack of time to browse. In a recent study analyzing questions in an image collection, Enser (1993) found that time and expertise factors related most significantly to success in using a picture system, suggesting that image seeking behaviors have not changed much in the 17 years between the studies. Given that image seeking behavior is determined by the image organization of the system, however, whether or not this statement tells the whole story is questionable.
Index searching (a relatively new paradigm, in manual or digital systems)
The work of Barnett (1988), Markey (1988), Shatford (1986), and O'Connor (1985) was mentioned earlier in this paper because these authors called for alternative methods for image organization for retrieval purposes. This call has been echoed in the reference specifications by the Getty Art Information Task Force and the Visual Resources Association for a proposed standard for describing artworks and their associated surrogate images. These activities reflect the need to provide alternative forms of access to images in indexing and retrieval systems.
Image retrieval systems will continue to allow access to empirical data. The main type of added access allowed in these systems is subject access -- referring to the subject matter or "about" descriptions suggested by Panofsky. Sunderland (1982) suggests that while the traditional browsing method of image organization is adequate for image provision in a museum context, the more information that can be entered into a system such as a database, the more valuable the information system will be. Subject knowledge is still necessary when dealing with this type of system, and thus the needs of naive searchers whose queries are not identifiable in the indexed matter continue to be ignored in this image seeking paradigm.
A call for new image seeking paradigms
The above research on the traditional image seeking paradigm is based on user studies that assessed the usefulness of traditional image retrieval systems. In the first paradigm researchers observed image seeking behaviors within particular contexts, and they indicated that failure rate had much to do with an inability to spend time exhaustively searching the collection in a period or medium category. These studies suggest that systems design based only on this information seeking paradigm makes it very difficult for users to search for images with a great degree of efficiency or with a great deal of flexibility.
The latter image seeking paradigm is based on theoretical models for describing images. These models account for additional levels of description for images, but are still based on conventions of thought and discard anomalous descriptions as noise. The success of these subject indexing systems has not been studied very thoroughly.
These image retrieval systems have ambitiously made it possible for users to search for images within the narrow scope of art history using a systematic method of classification. The latter methods of image indexing have brought image retrieval closer to the traditional IR model, especially when computers are used for these systems. These methods, however, do not account for all ways of describing images. Often, it has been difficult for users to describe their image needs, because the content which individuals describe is not represented in the system. In many cases there are no words to describe the content of the images they seek.
New research dealing with images attempts to grapple with the elusiveness of images in order to additionally service the non-specifiable queries. These researchers are attempting to learn more about image perception, image users, and image use within various contexts. These contexts include information seeking in mediated art and non-art image collections, and unmediated image description based on image perception observation experiments. It is believed that addressing these types of issues in an image retrieval system will add value to the already existent image retrieval paradigms. The library and information professions have begun to look at new areas of research in order to better understand how we may serve users of image collections.
Image users and image use environments
In a recent study by Ornager (1997) the use of digital image collections in newspaper archives was observed to determine the kind of questions users ask of the archive, and to group into categories the kinds of users of the archive. Typical observation activities included word association testing as a method for observing unmediated image descriptions. The intention of Ornager's research is to define an operational subject indexing strategy for images. She approached the idea of subject content from the perspective of Panofsky and Shatford, regarding "ofness" and "aboutness" of images, as well as from the perspective of Barthes regarding textual analysis as a process of identifying signs, symbols, and feelings associated with images. Ornager believes, following her analysis of user description of images, that indexing must encompass factual description (ofness), expressional content description (aboutness), and indication of the context in which the image can be used.
The most significant conclusion Ornager makes is that user queries follow a pattern that correlates to their placement within a user typology. This sort of idea has been suggested by Taylor's (1991) research related to information use environments, specifically dealing with information seeking in a hospital setting. The theory suggests that sense making within certain contexts or by certain types of users may directly relate to the behaviors associated with seeking information. While Ornager is concerned with enhancing user interface to deal with aspects of the information querying system, I believe the more significant implication of her research deals with the observation of patterns of use and the possibilities of enhancing systems based on user typologies.
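One way to picture the three components Ornager asks of an index entry (factual description, expressional content, and context of use) is as a record with three separate facets. The sketch below is an assumption made for illustration only: the class name `ImageRecord`, its field names, and the sample terms are hypothetical and are not taken from Ornager's study.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical index entry with one facet per component named by Ornager."""

    image_id: str
    ofness: set = field(default_factory=set)      # factual description
    aboutness: set = field(default_factory=set)   # expressional content
    contexts: set = field(default_factory=set)    # contexts in which the image can be used

    def matches(self, term, facet=None):
        """Check one facet when `facet` is named, otherwise check all three."""
        term = term.lower()
        facets = {
            "ofness": self.ofness,
            "aboutness": self.aboutness,
            "contexts": self.contexts,
        }
        if facet is not None:
            return term in facets[facet]
        return any(term in values for values in facets.values())


# Sample entry; the terms are invented for the example.
record = ImageRecord(
    image_id="photo-17",
    ofness={"woman", "umbrella", "street"},
    aboutness={"loneliness"},
    contexts={"weather report"},
)
```

Keeping the facets apart lets a system answer a query such as "loneliness" differently depending on whether the searcher means what is pictured or what is expressed, which is the distinction the user typology would need to act on.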
Image perception and the influence of image use systems
In a series of image retrieval papers, Corrine Jörgenson has been raising issues that question the methods currently available for querying image retrieval systems. Jörgenson (1995) began studying human pictorial image perception in naive image users, and reduced the typical attributes which humans use to describe images into a template. This template of attributes consisted of objects, people, and other facets. Jörgenson based her attributes on the terms which participants typically used to describe images in a variety of tasks.
In a follow-up study, Jörgenson (1996) tested the template of image attribute classes she identified in 1995. In contrast to Ornager's study, which observed image description in an unmediated fashion, Jörgenson's study was concerned with investigating whether a template for image description would be useful to participants in framing their image descriptions. The findings of this study indicated that users may have difficulty when using a template in assigning descriptors to higher-level classes based upon conceptual or functional relationships. She suggests in reaction to the results that the solution to this difficulty may lie in user training or guidance. Her most recent research suggests a radically different conclusion, however.
In her most recent research results, Jörgenson (1997) asked participants to describe images within the template of 47 attributes developed in the 1995 study. Her recent data suggest that the terms participants choose after being introduced to a template are significantly different from the terms they might typically and naturally have chosen without the template. This suggests -- and this position is surely debatable -- that users are less likely to describe images as they naturally perceive them (using abstract concepts such as "mysterious", for example) when confronted with a traditional image retrieval system, because users are aware of the shortcomings of information systems in dealing with such descriptions. Jörgenson's research sets a precedent for considerations in re-designing or radically altering image access systems. It illustrates the need for providing image access in a manner that satisfies a variety of user needs. It also suggests that templates or mediated query frameworks in an image retrieval system are most useful for retrieving images by low-level descriptions (ofness) or when dealing with a conventionally agreed upon system of classification. A template may not be useful for retrieving images by abstract descriptions.
In a series of papers given by James Turner, image perception is studied from a user perspective in order to better understand how systems might reflect discrete user descriptions of images. Turner's (1994) user study gathered empirical data on spontaneous image description to suggest new directions for systems design. His study sought to confirm whether participants really distinguished between pre-iconographic (ofness) and iconographic (aboutness) content in images. Turner found that pre-iconographic indexing is necessary, but iconographic indexing of ordinary pictures was questionable within this context. He believes that a general classification of ordinary objects is necessary in an image retrieval system -- an ofness classification that describes what is typically spontaneously observable by a human subject. This finding supports Jörgenson's research. In Turner's (1995) study comparing indexer terms for representing moving image documents to user terms in searching for these documents, he found that there was a high degree of agreement between terms. He suggests in this context that pre-iconographic (ofness) level indexing in addition to iconographic (aboutness) indexing would help improve retrieval rates in this particular collection.
In yet another study of moving image documents, Turner (1997) expands the scope of his observations to analyze natural language descriptions of moving image documents in shot by shot analysis. He indexes voice-over audio descriptions (the audio analog, for the sight-impaired, of closed captioning for the hearing-impaired) of these documents. The words are transcribed into textual documents that are indexed in the same manner as print documents. The present study attempts to continue observing image description primarily at the ofness level, and it remains to be seen how he will interpret the data, and whether he will find any value in the use of ofness description within this context. It is clear, however, that Turner believes that the ability of a system to deal with various levels of description of documents for image provision is largely determined by the context in which these images are needed and used.
A suggestion for creating user centered image retrieval systems is also found in O'Connor (1996). O'Connor noticed the inconsistency that occurs between individuals -- and indeed also within one's self -- when they describe what images are about. While the representation of image content is historically handled by external agencies, O'Connor believes that one way to deal with these inconsistencies is to place the power of representation within the system into the user's hands. He believes user-generated descriptors will cluster around certain images, and retrieval of images within descriptor clusters may be performed using ranking through frequency counting and computer manipulation. This proposal serves to increase our aboutness access to images, and it represents a radical solution that has not been seriously considered in most systems designs.
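The proposal above, in which user-supplied descriptors accumulate around images and retrieval ranks by frequency counting, can be sketched roughly as follows. This is a hedged illustration under simple assumptions, not O'Connor's design: the class `DescriptorIndex`, its method names, and the ranking rule (most frequently assigned first, ties broken by image id) are hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


class DescriptorIndex:
    """Hypothetical store of user-generated descriptors, counted per image."""

    def __init__(self):
        # image_id -> Counter mapping each descriptor to the number of times
        # users have assigned it to that image
        self._counts = defaultdict(Counter)

    def describe(self, image_id, descriptor):
        """Record one user's descriptor for an image."""
        self._counts[image_id][descriptor.strip().lower()] += 1

    def search(self, descriptor):
        """Return (image_id, count) pairs, most frequently described first."""
        key = descriptor.strip().lower()
        hits = [
            (image_id, counts[key])
            for image_id, counts in self._counts.items()
            if counts[key] > 0
        ]
        # Sort by descending count, then by image id for a stable order.
        return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Because nothing here restricts the vocabulary, an abstract descriptor such as "mysterious" becomes searchable as soon as any user applies it, and images on which many users agree rise to the top of the result list.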
It seems, in light of all of the above research, that mediation in an image retrieval system may influence image retrieval because these systems do not account for images in the same manner in which users describe them. The implication of studies being pursued by Jörgenson and Turner is that systems may not serve a great many naive users. The suggestion may be implied that at the database infrastructure level, ofness-level retrieval may assist in increasing the effectiveness of image retrieval systems. We may also argue that these systems should be more flexible, in order to accommodate user types such as those suggested by Ornager. In light of Jörgenson's findings and O'Connor's suggestions, these systems should also be flexible enough to allow users to search for images in their own natural language, rather than using pre-coordinately defined templates for image searching.
Image users and new interactions with image use systems
Recent observations on systems design, and suggestions for future design that have begun to appear in the library and information professions, attempt to add more points of access to image documents. While the impetus for developing new features may not have been directly influenced by the needs of users, these features are nonetheless increasing the types of approach to collections that are possible. In a recent paper on image queries in electronic databases, Cawkell (1993) suggests a new paradigm by which users pose questions about color, shape, and texture, and by which users may even submit a query by drawing a sketch. This research is significant because it illustrates that pre-iconographic elements of images may now be used for image access. Cawkell suggests that such computerized access methods are reasonable because there is little published literature regarding the kinds of questions asked and the success of librarians or information systems in responding to those questions. Cawkell also believes that the field has not done sufficient failure analysis on searches of image collections.
The weight of Cawkell's statements can be further validated in experimental systems such as the QBIC (Query by Image Content) project at UC Davis. In a recent paper given by Holt and Weiss (1997), the QBIC system is analyzed to illustrate the usefulness of providing pre-iconographic access to color, shape, and texture features of images, which Markey (1988) identified as the key to removing the barriers to access in traditional image access systems. Holt and Weiss noted that QBIC-style searching cannot replace text systems for thematic searches, but it does offer an added means of image access, which may help reduce the failure rate of image retrieval.
These studies reflect, I believe, what researchers have argued for in image retrieval systems. They reflect the move towards adding pre-iconographic descriptions to databases in terms of image shape, color, and texture aspects. This type of access has also been used as a round-about solution for finding ofness aspects, such as when one searches for portraits, by specifying oval shapes in queries using sketches such as the one at right.
In databases that allow querying by color features, searching for portraits using the above method, and searching for images with similar color tones -- i.e., using "query by example" methods -- will return images for users looking for portraits that identify persons of color. In a sense, this is a search for an ofness attribute of an image. While this type of searching may not have been intended by systems designers, it is clear that image searchers will utilize the tools they are given, in order to increase the usefulness of their image retrieval systems, and to decrease the failure rate due to the inability to handle certain user search descriptions and unspecifiable search needs.
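To make "query by example" on color concrete, the sketch below ranks a small collection by how closely each image's color distribution resembles that of an example image. It is an illustration only and does not reproduce QBIC's actual matching method: it swaps in a coarse RGB histogram compared by histogram intersection, a common and simple similarity measure. The function names, the bin count, and the representation of an image as a list of `(r, g, b)` tuples are all assumptions made for the example.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized coarse histogram of (r, g, b) pixels with 0-255 channel values."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin number in 0..bins_per_channel - 1.
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        hist[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1.0
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def histogram_intersection(hist_a, hist_b):
    """Similarity between 0 and 1; 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def query_by_example(example_pixels, collection, top_k=3):
    """Rank images in `collection` (image_id -> pixel list) against the example."""
    example_hist = color_histogram(example_pixels)
    scored = [
        (image_id, histogram_intersection(example_hist, color_histogram(pixels)))
        for image_id, pixels in collection.items()
    ]
    # Highest similarity first; image id breaks ties for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

A query of this kind carries no words at all, which is why it can serve needs the searcher cannot specify verbally. It also knows nothing about subject matter, so it can supplement text indexing but not replace it for thematic searches.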
Conclusion
There have not been sufficient user studies that help determine what is needed in image retrieval systems to account for un-indexed aspects of images and to service the unspecifiable needs of users. In light of the image seeking behavior studies that were reviewed in this paper, it seems that the fields dealing with image perception and image management have made a substantial argument for doing further image use and image user studies based on the implications of these research findings. There is a substantial call for subject and pre-iconographic indexing in systems which are intended to serve a broad range of user queries. This type of enhancement to systems is possible. However, a great many other types of questions remain unanswered by systems. Many of these questions may be discovered through user studies.
It is clear that we do not know enough about how humans perceive images and use human language terms to seek images in image retrieval systems. We require a further look into how to automatically index ofness-level descriptions of images. We would benefit from an assessment of automatic image content indexers such as those based on QBIC technology. There is now a broader range of dimensions of image user studies and image use studies that are possible, given the influential research that has recently been published. It is certain that it will take many researchers to carry the flag, to bring us closer to understanding the elusiveness of the image with the hope that we may more easily find images in our image retrieval systems.
