ABSTRACT
While searching the web, users are often confronted with a large number of results, generally displayed as an ordered list. Given the limits of this approach, we propose to explore new visualizations of search results, as well as new types of interaction with the results, to make their exploration more intuitive and efficient. Indeed, even if the relevance of Information Retrieval Systems depends on the retrieved results, the effectiveness of the results visualization represents an alternative way to improve relevance for the user. This paper therefore deals more specifically with 3D visualizations of web search results. We present an interface which uses the 3D metaphor of the city to display the results. This 3D virtual city is defined using the X-VRML language. From a more general point of view, the main idea of this paper is the 3D visualization of web documents such as search results, news or RSS feeds. In this case, the use of 3D graphics is less conventional because users do not associate such documents (which are abstract and non-geographic data) with a 3D representation. Another important point is that the proposed approach is not space-driven: our goal is not to increase the visualization space but rather to exploit cognitive metaphors.
Author Keywords
3D Visualization of Search Results, 3D Metaphors, Web Mining
ACM Classification Keywords
H.5.2 [Information Systems]: Information Interfaces and Presentation — User Interfaces; H.3.3 [Information Systems]: Information Storage and Retrieval — Information Search and Retrieval; K.6.1 [Computing Methodologies]: Computer Graphics — Three-Dimensional Graphics and Realism
INTRODUCTION
Searching the web is one of the most frequent tasks, but often one of the most frustrating too. Search engines, which are a way to represent the web to users, are the main tool for web searches. However, they are as easy to use as their results are difficult to interpret, which is the contradiction of web search. Indeed, it becomes more and more difficult to extract the relevant information for a given search since the data available on the World Wide Web is constantly increasing. Search engines return so many results that new methods are needed to visualize them. These methods must be better adapted through a more relevant organization of the results, a richer visualization interface and intuitive navigation in the result space.
∗e-mail: [email protected] — This work has been carried out while working at the R&D Division of France Telecom.
This paper deals with the visualization of web search results. This step, still neglected in some Information Retrieval Systems, is becoming more and more important. It can be considered as a way to enrich the results. It is in fact complementary to the search process and is also a way to increase the result “relevance” for the user. While result quality remains a major concern, the quality of the result restitution (organization and visualization) must be taken into account too. Without effective organization and visualization of the results, the user has to process the huge number of results manually or refine the query in order to limit the results. This last solution amounts to using a search engine to search within the results! Both alternatives require effort from the user.
Given the growing number of query results, it seems natural to want to organize and visualize them in an effective and adapted way. This explains the goal of the presented approach, which is to propose a user-friendly search interface enabling the user to quickly find the relevant information. The two main requirements for reaching this goal are a good document organization and an effective visualization. For these two aspects, our choices are a clustering method (self-organizing maps) and a 3D visualization. Choosing a 3D visualization makes it possible to exploit cognitive metaphors (such as spatial metaphors). 3D graphics also offer new interaction possibilities, and so bring a new point of view to result visualization. However, considerable new problems appear, such as navigation in such an environment.
This paper deals with 3D visualization of web search results. We present a 3D interface which displays the search results in a virtual city. We also discuss our interface with respect to both the user’s perception and the technical choices. The paper is structured as follows. The next section presents our interface, which is based on an unsupervised organization of documents and on a display of the results in a virtual city. The following sections discuss this interface and present some related work. The last section concludes and gives an outlook on future work.
3D VISUALIZATION OF ORGANIZED RESULTS
The visualization of search results raises two main issues: the organization (or clustering) of the results and their graphical representation. For the first, the goal is to find an effective method which groups similar results together and spatially organizes the clusters. For the second, the goal is to find an effective visualization of the organized results. These two points are discussed in the following subsections, but we mainly focus on the second.
In the context of web search, documents are web pages returned by the query. In this paper, the organization of the results is based only on the textual information of documents. This information is exploited through a vector representation (word vectors) of the pages, which is frequently used in the Information Retrieval field. The number of results to process must also be specified because it is crucial for the organization and visualization choices. A recent study [iProspect, 2006] shows that 88% of users will try a new search if they are not satisfied with the listings they find within the first 3 pages of results. However, it would be too restrictive to consider only the first 30 results (10 results per page). Indeed, this study was carried out on search engines with a linear visualization of results (ordered lists), and users may want to see more results in 2D or 3D visualizations. Due to the lack of studies for 2D and 3D visualizations, the maximum number of results is currently fixed at the first hundred results, which is more than the 30 results with which users are satisfied.
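As an illustration of the word-vector representation mentioned above, the following minimal sketch builds one vector per result. The TF-IDF weighting, the whitespace tokenizer and the names `build_word_vectors` and `MAX_RESULTS` are assumptions made for this example only; the actual word selection and weighting are those described in [Bonnel et al., 2006].

```python
import math
from collections import Counter

MAX_RESULTS = 100  # cap on processed results, as stated in the text


def build_word_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document.

    TF-IDF is an assumed weighting, used here only for illustration.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(documents)
    # Document frequency: number of documents in which each word occurs.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([
            counts[word] * math.log(n_docs / doc_freq[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical snippets standing in for retrieved web pages.
snippets = ["jaguar car dealer", "jaguar animal habitat", "car dealer prices"]
vocabulary, vectors = build_word_vectors(snippets[:MAX_RESULTS])
```

Each vector has one component per vocabulary word, so two pages sharing many words end up close to each other in this space, which is what the organization step relies on.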
A mixed interface for visualizing organized results is then proposed. This interface is composed of a 2D part (Java applet) and a 3D scene which represents the metaphor. A good definition of the word metaphor is: the realization of an association between graphical parameters of the presentation and information on the indexed documents.
SOM-based Organization of the Results
In this subsection, we briefly describe the organization of the results. For more details, see [Bonnel et al., 2006]. A good 3D visualization of search results needs to be able to organize these results in the output space according to their content. This clustering problem has been investigated in many previous works [Hearst and Pedersen, 1996] [Zamir and Etzioni, 1998]. However, this point is not discussed in this paper and we only give the solution used in our interface.
We use a self-organizing map (SOM) [Kohonen, 1995], an unsupervised method which clusters documents and projects them onto an output space (generally a 2D map). This method is used to carry out an on-the-fly clustering which can be considered as a post-retrieval document browsing technique. In other words, this clustering method organizes documents (or word vectors) on a map of predefined size, which guarantees a good use of space during the visualization. The resulting organization also has a notion of neighborhood: two neighboring documents on the map have similar word vectors. Privileged application areas
2002]. Self-organizing maps have already been used for textual data clustering, such as in the WEBSOM project1 [Kohonen et al., 2000].
Web search results are special textual documents due to their varying size, content, vocabulary and reliability. Their organization therefore requires a particular application of the SOM, whose adaptation is described as follows. The proposed SOM-based organization can be divided into three main steps, which are described more precisely in [Bonnel et al., 2006]. First, the pre-processing step deals with the document representation (word selection, word and document weights). Then the computation step corresponds to the execution of the modified batch SOM algorithm. This modified version takes the document weight into account, and the input parameters are fixed to make the algorithm deterministic. The labels of the clusters are computed in this step too. Finally, the post-processing step groups similar clusters (or map units) in order to obtain the various “topics” of the search. These abstraction levels are the result of a hierarchical agglomerative clustering applied to the map units.
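The computation step can be sketched as a weighted batch SOM. This is only a minimal illustration under stated assumptions, not the modified algorithm of [Bonnel et al., 2006]: the way the document weights enter the update, the Gaussian neighborhood, the linearly shrinking radius, the deterministic initialization from the first input vectors and the name `batch_som` are all choices made for this example.

```python
import math


def batch_som(vectors, doc_weights, rows, cols, epochs=10):
    """Weighted batch SOM on a rows x cols grid.

    Returns (units, assignment), where assignment[i] is the index of the
    map unit that document i falls into.
    """
    dim = len(vectors[0])
    n_units = rows * cols
    # Deterministic initialization (assumption): reuse input vectors in order.
    units = [list(vectors[i % len(vectors)]) for i in range(n_units)]
    coords = [(u // cols, u % cols) for u in range(n_units)]

    def best_unit(vec):
        # Closest unit in squared Euclidean distance; ties go to the lowest index.
        return min(
            range(n_units),
            key=lambda u: sum((a - b) ** 2 for a, b in zip(vec, units[u])),
        )

    for epoch in range(epochs):
        # Neighborhood radius shrinks linearly, with a floor of 0.5.
        radius = max(0.5, max(rows, cols) * (1 - epoch / epochs))
        winners = [best_unit(v) for v in vectors]
        new_units = []
        for u in range(n_units):
            numerator = [0.0] * dim
            denominator = 0.0
            for vec, weight, win in zip(vectors, doc_weights, winners):
                d2 = ((coords[u][0] - coords[win][0]) ** 2
                      + (coords[u][1] - coords[win][1]) ** 2)
                # Document weight scales the neighborhood contribution.
                h = weight * math.exp(-d2 / (2 * radius ** 2))
                denominator += h
                for k in range(dim):
                    numerator[k] += h * vec[k]
            if denominator > 0:
                new_units.append([x / denominator for x in numerator])
            else:
                new_units.append(units[u])
        units = new_units
    return units, [best_unit(v) for v in vectors]


# Hypothetical 2-word vectors for four documents, all with weight 1.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
units, assignment = batch_som(docs, [1.0, 1.0, 1.0, 1.0], rows=2, cols=2)
```

Because nothing in the sketch is random, two runs on the same input give the same map, which mirrors the determinism requirement stated above.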
The proposed SOM-based method relies only on word distribution and has the advantage of respecting the “semantic” proximity of the data. It also provides various levels of abstraction. In our case, the small number of results is not really a problem for computing the SOM, because the result organization is more important than the clustering itself. However, a method which provides the best organization in all cases probably does not exist. That is why several organization methods must be defined in order to select the one best adapted to each case.
Metaphor Context
The visualization interface proposed in this paper is based on the city metaphor. This choice is mainly justified by the cognitive aspect of this metaphor. It also seems well suited to a 3D environment, unlike the map metaphor, for which two dimensions are enough. A first version of this metaphor was developed and evaluated in a user study.
Based on the test results, the city metaphor has evolved. The new version will also be tested in order to find out whether the modifications meet the users’ expectations and to identify new issues. Figure 1 gives an overview of the new metaphor, whose explanation can be divided into four categories: basic elements, cluster representation, visualization and navigation. The last three categories are described in the following subsection. 3D graphics functionalities are only lightly exploited because the visualization deals with abstract data, which is why we do not aim for a realistic rendering of the 3D scene. The basic elements of the city metaphor are now introduced. Each building of the city represents a web page, and the buildings are grouped into districts which are placed on the ground according to a grid. The building height represents the page relevance, which makes it possible to quickly see the best-ranked pages according to this criterion. As our mapping choice for the relevance does not allow two successive ranks to be visually differentiated, an interval approach is adopted. The first 10 ranks (i.e. the first result page in traditional search engines) are associated with a high building height. The following 20 ranks are represented by a medium building height. The other ranks have a small building height. These interval choices are motivated by a study on user attitudes [iProspect, 2006].
Figure 1. Visualization based on the city metaphor.
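The interval approach for building heights can be written down directly. In this small sketch, the function name `building_height` and the string labels are hypothetical; only the rank intervals (1 to 10, 11 to 30, beyond 30) come from the text.

```python
def building_height(rank):
    """Map a 1-based result rank to one of three height classes."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    if rank <= 10:   # first result page of a traditional search engine
        return "high"
    if rank <= 30:   # the following 20 ranks
        return "medium"
    return "small"   # all remaining ranks


heights = [building_height(r) for r in (1, 10, 11, 30, 31)]
```

Using three classes rather than a continuous height keeps the first result page visually distinct without asking the user to compare nearly identical building heights.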
Application of the City Metaphor on Self-organized Results
Cluster representation.
The districts are placed on the ground according to a 2D grid, and each district represents a neuron (or map unit) of the self-organizing map. This grid (which is square in our example) thus maps the classification of search results obtained with the self-organizing algorithm presented in the previous section: the 2D grid on the ground is the same as the SOM output grid. This organization has two interesting properties: the uniqueness of each web page (or building) in the city, and the “semantic” neighborhood between the different pages and between the different districts. The documents of the same district are thus close to each other, and two neighboring districts correspond to two topics that are as close as possible. Colors were chosen to represent the clusters defined by the hierarchical clustering on the neurons. Each district is associated with one of these colors (which are displayed on the ground). This shows the main topics of the search. The number of clusters is therefore interface-oriented, because each cluster is associated with one color.
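The correspondence between SOM units and districts can be sketched as follows. The function `layout_districts`, the `district_size` parameter, the dictionary fields and the use of `x` and `z` as ground-plane coordinates are assumptions for illustration; the text only states that the ground grid is the SOM output grid and that each cluster has one color.

```python
def layout_districts(rows, cols, unit_cluster, palette, district_size=10.0):
    """Place one district per SOM unit and color it by its cluster.

    unit_cluster[u] is the cluster index of map unit u, as produced by a
    clustering of the map units (assumed to be given).
    """
    if max(unit_cluster) >= len(palette):
        # One color per cluster: the palette bounds the number of clusters.
        raise ValueError("more clusters than available colors")
    districts = []
    for unit in range(rows * cols):
        row, col = divmod(unit, cols)
        districts.append({
            "unit": unit,
            "x": col * district_size,  # ground-plane position (assumed axes)
            "z": row * district_size,
            "color": palette[unit_cluster[unit]],
        })
    return districts


# Hypothetical 2 x 2 map whose units were grouped into two topics.
districts = layout_districts(2, 2, [0, 0, 1, 1], ["red", "blue"])
```

The explicit check on the palette size reflects the remark that the number of clusters is interface-oriented: it cannot exceed the number of distinguishable colors.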
Visualization.
The choice of a 3D interface to visualize the search results pleased the users. Broadly, the 3D visualization is not a problem because it corresponds to our natural vision. Moreover, this 3D metaphor gives an overview of a large number of results. The building texture represents the document content, which gives a quick overview of the results when moving around in the 3D environment. Highlighting a building allows the user to see information (URL, snippet, keywords) about the associated document and information about four neighboring documents, which are obviously close to the chosen document. However, the user test reveals that users increasingly seek a compromise between the comfort and the effectiveness of the visualization.
Navigation.
A user study was carried out and shows that the main drawback is navigation in the city, which does not seem to be familiar to users. Other navigation problems are mouse sensitivity and the loss of reference points in the city. Certain movements toward strategic places of the 3D scene were therefore simplified: the 2D map of the scene was made interactive so that the user can move to any district in a single click. This modification makes navigation more comfortable, but it must be coupled with other approaches. Solutions must be found to make navigation more familiar for the user (like navigation in a 2D interface). A more constrained navigation (and thus less tiring for the user) must be proposed to prevent the user from getting lost in the 3D environment (see figure 2). The additional dimension thus makes navigation essential and, above all, more complex, and this problem is not easy to solve. Another point concerns the distance used. In real life, to go from one district to another in a city, we follow the streets in a block-like fashion, which corresponds to the Manhattan distance. This is the case in this metaphor only if the user uses the walking mode. However, one benefit of this metaphor is the flying mode, in which movements are based on the Euclidean distance (the same as in the organization algorithm). The user is therefore not constrained to follow the streets.
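The difference between the two navigation modes can be made concrete with the two distances on the district grid. The function names and the grid coordinates below are illustrative assumptions.

```python
import math


def manhattan(a, b):
    """Street-following distance between two grid positions (walking mode)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a, b):
    """Straight-line distance between two grid positions (flying mode)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Two hypothetical districts, 3 columns and 4 rows apart.
start, goal = (0, 0), (3, 4)
walking = manhattan(start, goal)  # 7 grid steps along the streets
flying = euclidean(start, goal)   # 5.0 in a straight line
```

The Euclidean distance is never larger than the Manhattan distance, so flying is at least as direct as walking between any two districts.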
Figure 2. The user. . . lost in the 3D city?
DISCUSSION AND EVALUATION
Our visualization has the advantage of not favoring a single topic for a query: users can choose their topic themselves. Another point is that data visualization depends strongly on many criteria, such as the number and type of results, the search goal or the user category. A single solution for data visualization therefore probably does not exist. That is why making the visualization adaptive is an interesting characteristic. To do so, several interfaces have to be defined in order to choose the one best adapted to the context.
Evaluation is a very important task in the visualization process. In our case, the evaluation needs to take both the organization and the visualization metaphor into account. A user study was carried out, and an extract of this study can be found in [Bonnel et al., 2006]. This study was based on the well-known propositions of [Shneiderman, 1998] for evaluating graphical interfaces, and was mainly oriented toward evaluating the visualization metaphor. However, the results are hard to interpret because users prefer average marks and avoid extreme marks. The average mark for each question is generally higher than 3 (on a scale from 1, bad, to 5, good), so it can be said that there is no drawback in using our approach for searching the web (compared to the existing ones). Another interesting remark is that users are ready to use visualizations of clustered web pages. However, a more relevant study and a comparison with other interfaces need to be carried out. For this purpose, we are currently working on the definition of evaluation criteria. Concerning the runtime evaluation, the interface seems to need almost 10 seconds to display 50 results. However, it is important to highlight the following points: there is no code optimization, the organization is more time-consuming than the 3D visualization, and the X-VRML interpreter (cf. prototype section) is written in Java, which makes it platform-independent but does not optimize runtime efficiency.
RELATED WORK
2002]. These visualizations, as well as our approach, can be located in the literature using the taxonomy of search result visualization systems proposed in [Bonnel et al., 2005]. Among these various approaches, geographic metaphors (2D maps or 3D worlds) are often used because they can take advantage of the cognitive aspect.
In this paper, the proposed visualization has two main particularities: it operates on organized (or clustered) search results and it uses a 3D environment. Concerning the first point, the meta search engine KARTOO2, the clustering engine CLUSTY3 and the GROKKER4 interface are examples which take inter-document similarities (or content-based links) into account. However, all these examples use a linear or 2D interface and therefore do not offer an overview of the results. Concerning the second point, some interfaces propose a 3D visualization of documents. Moreover, the third dimension is often used to replace maps by 3D worlds such as landscapes [Boyack et al., 2002] or cities [Sparacino et al., 2002]. These 3D approaches are a good way to give the user an overview of the results. However, these interfaces often organize documents poorly or are not user-friendly. Another work is the AVE method [Wiza et al., 2004] and its Periscope system, which are the closest to the work presented in this paper. We share the use of mixed interfaces (3D scene and 2D interface) and the use of several visualization metaphors which serve different goals. However, the approach proposed in this paper takes the problem of organizing data from a “semantic” point of view into account. Indeed, in the context of web search, it is not sufficient to only order the pages according to some low-level descriptors.
CONCLUSIONS AND OUTLOOKS
In this paper we present an effective method for organizing and visualizing search results. The organization is based on a self-organizing map which is adapted to the context of web search results. Concerning the visualization, we propose a new 3D approach based on a city metaphor which is very effective for representing organized documents. The graphical interface is dynamically generated, interactive and based on a compromise between the 3D scene and the 2D interface. With the proposed method, we provide the user with a three-level approach: low-level with document visualization, medium-level with neuron visualization (similar