A Visual Retrieval Environment for Hypermedia Information Systems

DARIO LUCARELLA and ANTONELLA ZANZI
Centro Ricerca di Automatica, ENEL

We present a graph-based object model that may be used as a uniform framework for direct manipulation of multimedia information. After an introduction motivating the need for abstraction and structuring mechanisms in hypermedia systems, we introduce the data model and the notion of perspective, a form of data abstraction that acts as a user interface to the system, providing control over the visibility of the objects and their properties. A perspective is defined to include an intension and an extension. The intension is defined in terms of a pattern, a subgraph of the schema graph, and the extension is the set of pattern-matching instances. Perspectives, as well as database schema and instances, are graph structures that can be manipulated in various ways. The resulting uniform approach is well suited to a visual interface. A visual interface for complex information systems provides high semantic power, thus exploiting the semantic expressibility of the underlying data model, while maintaining ease of interaction with the system. In this way, we reach the goal of decreasing cognitive load on the user, with the additional advantage of always maintaining the same interaction style. We present a visual retrieval environment that effectively combines filtering, browsing, and navigation to provide an integrated view of the retrieval problem. Design and implementation issues are outlined for MORE (Multimedia Object Retrieval Environment), a prototype system relying on the proposed model. The focus is on the main user interface functionalities, and actual interaction sessions are presented including schema creation, information loading, and information retrieval.

Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design - data models; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - query formulation; selection process; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems - hypertext navigation and maps; H.5.2 [Information Interfaces and Presentation]: User Interfaces - interaction styles

General Terms: Design, Human Factors, Management

Additional Key Words and Phrases: Browsing, complex objects, direct object manipulation, graph-oriented models, hypermedia applications, information filtering, visual interface

This work was supported by the Italian Electrical Energy Company under the research project 0. L.240 Multimedia Systems.
Authors' addresses: D. Lucarella, Centro Ricerca di Automatica, ENEL, Via Volta 1, I-20093 Cologno Monzese, Milano, Italy and Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, I-20135 Milano, Italy; email: lucada@imicilea.cilea.it; A. Zanzi, Centro Ricerca di Automatica, ENEL, Via Volta 1, I-20093 Cologno Monzese, Milano, Italy; email: zanzi@cra.enel.it.
Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
© 1996 ACM 0734-2047/96/0100-0003 $03.50

ACM Transactions on Information Systems, Vol. 14, No. 1, January 1996, Pages 3-29.

1. INTRODUCTION

Hypermedia has been simply defined as a system to manage a collection of information that can be accessed nonsequentially. It consists of units of information that are arbitrarily diverse in form and content.
Such units may contain texts, graphics, images, sound, video, and animation and are connected by logical links to form an information network. The variety of nodes and links that can be defined makes hypermedia a very flexible environment in which information is provided both by what is stored in each node and by the way the information nodes are linked to each other. In addition, current hypermedia systems provide sophisticated user interface tools that enable the reader to inspect the node content and to navigate through the network by selecting paths to follow on the basis of interests emerging along the way [Nielsen 1990].

There is a growing interest today in such technologies for the implementation of massive multimedia information systems, but unfortunately, several well-recognized problems continue to be open research issues [Halasz 1988]. Among these, central points to be addressed are information modeling and information retrieval.

1.1 Information Modeling

The simplicity of the basic hypermedia model does not appropriately represent the structure of the information. There is an inadequate separation between a node in the hypermedia network and the content associated with the node. Conversely, a strong separation of the structure from the content would allow many structures to be superimposed over the same set of information units or a unit to be shared among many nodes within a single structure. In addition, a node is a storage unit for a collection of data items without any structural information, and each node and link are assumed to be of the same type. As a result, modeling is more or less a bottom-up process in which we have to analyze how information can be broken down into different elements and then to recognize these individual elements by adding links among them. The problem here is that such an analysis is only useful for that particular instance, and we cannot use this same structure for other instances [Tompa 1989].
The key point is that the basic hypermedia data model is too simplistic. It is not suitable for modeling the real world or capturing its semantics as required in most applications [Furuta and Stotts 1990; Garzotto et al. 1993; Schnase et al. 1993a]. As a consequence, the user has difficulty in perceiving the conceptual model of the application, resulting in cognitive overhead [Conklin 1987]. In authoring mode, extra mental effort is needed to establish the links required to connect the created nodes. In reading mode, extra mental effort is needed for choosing the path to follow through the network, with the risk of becoming lost or disoriented.

One of the main ideas proposed by Garg [1988] is that information embedded into the hypertext network should be described by a set of predefined domain objects. In this way, the actual content of the hypertext would be represented by a set of information objects, each of which is an instance of a domain object, inheriting by default all of the properties of the domain object. The idea can be compared to the notion of database schema, as opposed to a specific instance of the database. According to this trend, many hypermedia systems have been proposed with the support of underlying databases [Campbell and Goodman 1988; Christodoulakis et al. 1986; Lange 1990; Schnase et al. 1993b; Schutt and Streitz 1990].

Recently, requirements for representing the structurally complex interrelationships that arise in hypermedia have generated a renewed interest in semantic data models [Hull and King 1987]. Semantic models attempt to provide more powerful abstraction and structuring mechanisms for specifying database schemas in order to overcome the limited modeling capabilities of traditional database systems [Beeri 1990; Lieberherr and Xiao 1993]. Schnase et al.
[1993a] presented a comparative analysis of semantic models, concluding that a structural object-oriented paradigm appears to be superior for hypermedia modeling. Of particular interest are graph-based data models since they provide a natural way of handling data that appear in applications such as hypermedia or multimedia information systems. Gyssens et al. [1990] proposed a graph-oriented object database model in which the database schema as well as the database instances can be seen as graphs with the data manipulation language expressed in terms of graph transformations. Amann and Scholl [1992] presented a graph data model with an associated algebraic language based on regular expressions over the data types and showed how such a language can be exploited for hypertext querying.

In the same direction, in this article we propose a graph-based object model which provides high semantic expressibility, and we use it as a uniform framework both for conceptual modeling and for direct manipulation of the stored objects.

1.2 Information Retrieval

In hypermedia information systems, interaction is mainly devoted to information retrieval. A canonical approach is based on formal querying [Bertino et al. 1992; Straube and Ozsu 1990]. Conversely, browsing techniques consist of exhaustively viewing part of the information base until the desired information has been found. The former approach requires a deep knowledge about the query language, the conceptual structure of the application, and the goals; the latter does not require a preliminary knowledge. On the other hand, a formal query, if correctly formulated, can be directly evaluated and may yield an immediate answer, whereas a browsing session can take a long time before converging to the goal or may not converge at all. Between these two mentioned interaction techniques, other approaches must be studied with the aim of balancing expressive power and ease of use.
Some approaches to the integration of query-based retrieval strategies in a hypertext network have been proposed recently. Logic-based languages have been proposed by Consens and Mendelzon [1989], Lucarella [1990], Afrati and Koutras [1990], and Beeri and Kornatzky [1990]; different attempts to exploit the hypertext links in the retrieval of the relevant nodes have been reported by Croft and Turtle [1989], Lucarella and Zanzi [1993], and Frei and Stieger [1992]. A common aspect to such proposals is that no concept of schema has been introduced, and thus, queries can be specified only over the hypertext network in order to get an optimal starting point for browsing.

Conversely, as remarked in the previous section, the approach we are taking in this work is based on a semantic data model, the primary objective being to provide powerful visual constructs for representing a variety of abstractions in a structured fashion. Unfortunately, as soon as the underlying data model becomes more complex, the level of complexity of the associated query language and the level of knowledge required by the user also increase. The main goal becomes the design of a language that provides both high semantic power and ease of interaction with the system.

With this objective in mind, we propose a visual query paradigm. The user performs actions symbolically and directly on the screen and is able to express operations by grabbing and manipulating visual representations of objects. The user is not required to know any complex formal language, with the advantage of maintaining the same interaction style normally used during browsing. The effect produced by the query is perceived as a form of filtering and navigation space restriction. So it is natural to pass from querying to browsing and vice versa, depending on the type of user, the type of application, and the type of current needs.
By effectively combining browsing and querying under a uniform interface, we provide an integrated view of the retrieval problem.

Much research has been carried out in the database community on graphical query languages that has influenced our approach at different levels. Basic principles and a survey of such efforts can be found in Catarci [1992] and Batini et al. [1992], respectively. Most graphical interfaces are based on intensional data models, typically the entity-relationship model [Angelaccio et al. 1990; Kuntz and Melchert 1989; Wong and Kuo 1982] or the extended entity-relationship model [Auddino et al. 1991; Czejdo et al. 1990]. ISIS [Goldman et al. 1985] and its extension ISIS-V [Davison and Zdonik 1986] provide a visual interface to the semantic data model SDM [Hull and King 1987]. SNAP [Bryce and Hull 1986] is a system based on the IFO data model [Hull and King 1987]. More recently, some projects have dealt with object-oriented data models [Epstein 1990], and DBface provides a tool for building graphical interfaces to object-oriented databases [King and Novak 1993].

The remainder of this article is organized as follows. Section 2 provides a description of the semantic model on which the MORE system is based. This section also includes an example subschema that contains multimedia information about the organization and the activities of our research division. The visual retrieval environment along with the formal definitions of perspective and the operations on perspectives are presented in Section 3. Various examples illustrating the expressive power of the language are presented with reference to the example subschema shown in Section 2. Section 4 sketches design issues for the MORE prototype system focusing on the main functionalities and presenting visual interaction screendumps taken from the actual application.
Section 5 provides a comparison with related work. Finally, brief conclusions and future research work are outlined in Section 6.

2. A GRAPH-BASED OBJECT MODEL

The basis of the approach is the characterization of the information system in terms of objects, attributes, and relationships, namely, a general object-oriented conceptual model. An object is an entity of the real world, a concept, an event, a process, or anything else that an application tries to capture and represent. Objects have their own identity that does not change throughout their lifetime and are known by their properties. The specific set of properties used to describe a given object depends on the point of view and the purpose of the modeling. We recognize properties only through attributes. Objects having the same structural properties are grouped together to form an object class. Classes can be related by a superclass-subclass relationship in which an object in a subclass inherits the structural properties from its superclasses. Object attributes can be divided into two general categories: simple and complex. The domain of a simple attribute is a system-defined basic type; the domain of a complex attribute is a class.

At the intensional level, the conceptual schema captures this semantic structure. It is defined by a collection of interrelated classes and types, and as such it can be represented by a directed labeled graph. Objects and classes are related by the instantiation relationship. At the extensional level, the information system can be viewed as a collection of interrelated objects, and as such it can also be represented by a directed labeled graph. Thus, the information system can be represented by graphs at both the intensional and the extensional level. A formal definition of such concepts is given next.

2.1 The Model

Definition.
The conceptual schema $\Sigma$ is defined as the five-tuple $\Sigma = (C, T, A, \mathcal{P}, \mathcal{H})$, where:

- $C$ is a finite set of class names; each class $c \in C$ denotes a structure (in terms of attributes) and an extension (the collection of objects that have that structure).
- $T$ is a finite set of type names (e.g., integer, text, picture) built into the system; each $t \in T$ denotes a type of primitive object, and $V(t)$ is the set of associated values.
- $A$ is a finite set of attribute names. Attributes are defined on classes. Attributes may be simple or complex. The domain of a simple attribute is a basic type $t \in T$; the domain of a complex attribute is a class $c \in C$. In addition, we distinguish between single-valued attributes $A_s$ and multivalued attributes $A_m$, with $A = A_s \cup A_m$.
- $\mathcal{P} \subseteq C \times A \times (C \cup T)$ is the property relationship. If $(c_i, a, c_j) \in \mathcal{P}$, then the class $c_i$ has the attribute $a$, having as a domain the class or type $c_j$.
- $\mathcal{H} \subseteq C \times C$ is the inheritance partial ordering relationship. If $(c_i, c_j) \in \mathcal{H}$, then the class $c_i$ is a subclass of the class $c_j$, inheriting attributes from $c_j$.

Definition. Given $\Sigma$, the conceptual schema graph is a directed labeled graph $G(\Sigma) = (N, E)$, where:

- $N = C \cup T$ is the set of nodes. For each $c \in C$, we have a rectangular-shaped node labeled $c$. For each $t \in T$, we have an oval-shaped node labeled $t$.
- $E$ is the set of edges. For each $(c_i, c_j) \in \mathcal{H}$ we have a bold edge connecting $c_i$ to $c_j$. For each $(c_i, a, c_j) \in \mathcal{P}$ we have an $a$-labeled edge from $c_i$ to $c_j$. In particular, if $a \in A_s$ we have an edge with a single arrow; if $a \in A_m$ we have an edge with a double arrow.

Definition. The multimedia information system $M$ is defined as the four-tuple $M = (\Sigma, O, \mathcal{I}, \mathcal{L})$, where:

- $\Sigma$ is the conceptual schema defined above.
- $O$ is the set of objects stored in the system.
- $\mathcal{I} \subseteq O \times C$ is the instantiation relationship. Each object $o \in O$ is an instance of a class $c \in C$.
- $\mathcal{L} \subseteq O \times A \times (O \cup V(T))$ is the link relationship. $(o_i, a, o_j) \in \mathcal{L}$ denotes that the attribute $a$ of the object $o_i$ has the value $o_j$. Assuming that $o_i$ is an instance of $c_i$ and $o_j$ is an instance of $c_j$, we have $(o_i, a, o_j) \in \mathcal{L}$ iff one of the following conditions holds (the conditions are alternative):
  (1) $(c_i, a, c_j) \in \mathcal{P}$;
  (2) $(c_i, c_k) \in \mathcal{H} \land (c_k, a, c_j) \in \mathcal{P}$;
  (3) $(c_j, c_k) \in \mathcal{H} \land (c_i, a, c_k) \in \mathcal{P}$.
  The last two conditions are the direct consequence of the semantics of the inheritance relationship.

Definition. Given the multimedia information system $M$, an instance graph is a directed labeled graph $G(M) = (N, E)$, where:

- $N = O \cup V(T)$ is the set of nodes. Nodes represent objects (rectangular nodes) or values (oval nodes) generated from the schema through the instantiation relationship.
- $E$ is the set of edges. For each $(o_i, a, o_j) \in \mathcal{L}$, there is an $a$-labeled edge from $o_i$ to $o_j$.

Fig. 1. Graph-based object model: intensional and extensional levels.

Based on this model, Figure 1 gives an example that shows how we can use a graph-based representation at both the intensional and the extensional levels. Note the effect, at the extensional level, of the inheritance relationship between the class student and the class person.

2.2 A Sample Hypermedia Application

In order to demonstrate the capabilities and the flexibility inherent in the approach discussed, a hypermedia application has been developed. The application is aimed at storing multimedia information concerning the structure of the organization and the activities of our research division. It describes the hierarchical structure of the research units, including information about management, personnel, financial budget, research projects, and project leaders. A portion of the schema graph is presented in Figure 2. This schema is used throughout the article as the knowledge base to which all visual operations will be posed.
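
The definitions of Section 2.1 translate naturally into code. The following Python sketch is an illustration only, not the MORE implementation: it stores the property relationship and the inheritance relationship as sets of tuples and tests the three alternative conditions under which a link between two instances may exist. The function names, the example classes `person`, `student`, and `course`, and the attribute names are hypothetical. The sketch also assumes that inheritance is followed transitively, in keeping with its description as a partial ordering.

```python
def superclasses(inheritance, c):
    """Return every strict ancestor of class c.

    `inheritance` is a set of (subclass, superclass) pairs; the pairs are
    followed transitively (an assumption of this sketch).
    """
    found, stack = set(), [c]
    while stack:
        current = stack.pop()
        for sub, sup in inheritance:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def link_allowed(properties, inheritance, ci, a, cj):
    """Check whether an instance of ci may link to an instance of cj via a.

    `properties` is a set of (class, attribute, domain) triples.
    """
    # (1) the property is declared directly on ci with domain cj
    direct = (ci, a, cj) in properties
    # (2) ci inherits the property from one of its superclasses
    inherited = any((ck, a, cj) in properties
                    for ck in superclasses(inheritance, ci))
    # (3) cj is a subclass of the declared domain of the property
    specialised = any((ci, a, ck) in properties
                      for ck in superclasses(inheritance, cj))
    return direct or inherited or specialised


# Hypothetical schema fragment: student is a subclass of person.
properties = {
    ("person", "name", "text"),
    ("course", "attendee", "person"),
}
inheritance = {("student", "person")}

if __name__ == "__main__":
    print(link_allowed(properties, inheritance, "person", "name", "text"))
    print(link_allowed(properties, inheritance, "student", "name", "text"))
    print(link_allowed(properties, inheritance, "course", "attendee", "student"))
    print(link_allowed(properties, inheritance, "course", "name", "text"))
```

In this fragment a student object may carry a name because the attribute is inherited from person (condition 2), and a course may list a student as an attendee because student specializes the declared domain (condition 3). The last call asks for a property that is declared nowhere, so the link is rejected.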
With reference to Figure 2, rectangular nodes in the graph represent classes, and oval nodes represent basic types. Labeled arrows starting from a class depict the properties of that class. Multivalued properties are shown with double-headed arrows. The bold lines express the inheritance is-a relationship from a subclass (at the tail of the arrow) to its superclass. In the following, we describe in further detail the meaning of the objects depicted.

Fig. 2. Conceptual schema graph.

Research Unit groups the common attributes (name, direction, mission, personnel, and expenses) shared by the units at different hierarchical levels. The Research Division represents the administrative and strategic central headquarters to which all of the research centers spread throughout the country report. The Research Center is a department, characterized by a specific research area with its own laboratories. The Laboratory is the operative research unit, with its own equipment, in which the research projects are carried out. The Research Project is characterized by title, subject, description of objectives, project leader, and a short movie presenting its current state with the main results achieved. Note that some research programs are carried out as joint projects, and consequently, a cycle is present in the graph. The Experimental Installation represents an installation characterized by its name and location, where some experiments that cannot be made in the laboratories are carried out in the field. The Person groups the common attributes (name, resume, and photo) shared by the manager and the project leader. The Manager is the head of a research unit: the central division, a research center, or a research laboratory. The Project Leader is a person who is in charge of a research project.
Finally, the Employees class gives information about the personnel in a research unit, grouped by category and by age, respectively; and the Budget class represents the financial planning of a unit, both in terms of the estimate of the expenses and of the balance.

Note that the conceptual schema of the application is directly entered and manipulated on the screen by the application designer supported by an appropriate visual tool (see Section 4).

3. VISUAL INFORMATION RETRIEVAL

In this section we deal only with the retrieval and presentation issues without considering other functionalities. In addition, a clear distinction between the information user and the information supplier is quite common in these systems, since object loading and updating often require specialized multimedia editors depending on the type of object manipulated.

We have already discussed in the introduction the main reasons for developing a visual interface based on the direct-manipulation paradigm and the expected advantages for the end users in terms of abstraction power, ease of interaction, and flexibility. Basic requirements are the visualization of the conceptual schema as well as the database instances, by enabling the user to filter the amount of information to be displayed. Selective information visualization can be used to locate relevant information and to restrict the visualization to the pertinent parts. A reasonable way to present complex information is to produce multiple views of the same information, each focusing on different aspects and thus conforming to different needs.
The cognitive overhead required to face tangled information structures can be alleviated if the system presents only the relevant pieces of the stored information while hiding the rest. In analogy with the views in databases, we introduce the notion of perspective,¹ a form of data abstraction that acts as a user interface, providing control over the visibility of the system objects. A perspective can be tailored to focus selectively on the subset of information that is significant to a particular application. Essentially, perspectives are graph structures that are built from the schema graph and are visually manipulated in various ways. Related works on graph-based object manipulation are reported by Andries et al. [1992] and Guo et al. [1991].

In the following, we provide formal definitions for perspectives and a basic set of operations that can be performed on perspectives. For each of these in turn, we give the formal definition, the visual expression, and an example referring to the previous application.

3.1 Perspective

Definition. Given a multimedia information system, a perspective $P$ is defined as $P(\pi, S)$, where:

- $\pi$ is the perspective pattern, that is, a weakly connected subgraph of the schema graph;² hence, $N(\pi) \subseteq N(\Sigma)$ denotes the subset of schema nodes (classes and types) included in the perspective, and $E(\pi) \subseteq E(\Sigma)$ denotes the set of edges (properties) associated with such nodes.
- $S$ is the set of object graphs generated by the perspective graph $\pi$ through the instantiation relationship. Given an instance $s \in S$, each node $o \in N(s)$ is an instance of the corresponding node $c \in N(\pi)$, and the edge $(o_i, a, o_j) \in E(s)$ iff the edge $(c_i, a, c_j) \in E(\pi)$.

¹ The term perspective has already been introduced by Garzotto et al. [1993], but with a different meaning.
² A directed graph is weakly connected iff the graph obtained by removing the arrowheads is connected.

So, a perspective is defined by a pattern (the intensional representation) and by the corresponding object graphs (the extensional representation). In order to define a perspective, the user has to build the pattern into the “perspective window.” The requested nodes are copied from the “schema window” by pointing and clicking. The system checks automatically that the resulting graph is connected. In this way, incorrect perspectives cannot be specified, since the patterns conform to the structure of the schema. In Figure 3 we show a perspective focusing on those parts of the information system in which the user is interested. In the example, attention is restricted to the research centers and their laboratories including, for each of these, the research projects and corresponding project leaders.

Fig. 3. A perspective over the schema.

Perspectives can be named, saved, reused, and manipulated in various ways. In particular, it is possible to define perspectives on perspectives, thereby producing different levels of abstraction. All of the operations on perspectives are closed, thus removing the major drawback of current object-oriented query languages that do not maintain the closure property [Shaw and Zdonik 1990]. Consequently, in our approach, the result of each operation has the same structural properties as the original objects; thus, it can be further processed by the same set of operators.

3.2 Object Filtering

In order to restrict attention to a subset of pattern instances in the perspective, a filter can be defined over it.
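
A filter operates on a perspective whose pattern has already passed the checks described in Section 3.1: the pattern must be a subgraph of the schema graph, and it must be weakly connected. As a minimal sketch of how such a check could be written (the article does not describe how MORE implements it), the Python code below represents a graph as a set of nodes plus a set of (source, label, target) edges. The function names and the attribute labels `laboratories`, `projects`, and `leader` are hypothetical and chosen only for illustration.

```python
def is_weakly_connected(nodes, edges):
    """True if the graph is connected once edge directions are ignored.

    Assumes every edge endpoint is a member of `nodes`.
    """
    if not nodes:
        return False
    neighbours = {n: set() for n in nodes}
    for src, _label, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == set(nodes)


def is_valid_pattern(schema_nodes, schema_edges, nodes, edges):
    """True if (nodes, edges) is a weakly connected subgraph of the schema."""
    nodes, edges = set(nodes), set(edges)
    endpoints_ok = all(s in nodes and d in nodes for s, _a, d in edges)
    return (nodes <= set(schema_nodes)
            and edges <= set(schema_edges)
            and endpoints_ok
            and is_weakly_connected(nodes, edges))


# Hypothetical fragment of the schema graph of the sample application.
schema_nodes = {"Research Center", "Laboratory", "Research Project",
                "Project Leader", "Person", "text"}
schema_edges = {
    ("Research Center", "laboratories", "Laboratory"),
    ("Laboratory", "projects", "Research Project"),
    ("Research Project", "leader", "Project Leader"),
    ("Project Leader", "is-a", "Person"),
    ("Person", "name", "text"),
}

# A connected pattern: centers, their laboratories, and the projects.
good_nodes = {"Research Center", "Laboratory", "Research Project"}
good_edges = {
    ("Research Center", "laboratories", "Laboratory"),
    ("Laboratory", "projects", "Research Project"),
}

# Two schema nodes with no edge between them: not a valid pattern.
bad_nodes = {"Research Center", "Person"}
bad_edges = set()

if __name__ == "__main__":
    print(is_valid_pattern(schema_nodes, schema_edges, good_nodes, good_edges))
    print(is_valid_pattern(schema_nodes, schema_edges, bad_nodes, bad_edges))
```

Rejecting a pattern at construction time mirrors the behavior described for the perspective window, where the user cannot specify a perspective that fails to conform to the schema.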