1. Introduction

1.1 Research Background

Libraries are important to cultural development, and their influence is gradually increasing in today’s Internet age, because the Internet effectively widens the acquisition of library materials (Hundie, 2003), broadens the accessibility of libraries (Barker, 2001) and encourages communities to share information rather than restrict access to it (McCray et al., 2001). For example, Citeseer (Citeseer, 1997) is a well-known and popular online digital library in which a large number of academic papers related to computer science can be searched (Giles et al., 1998). One important part of Citeseer is the software robot (“crawler” or “spider”), which retrieves and stores all related papers in Adobe Portable Document Format (PDF) or PostScript (PS) format from other Web sites (Raghavan et al., 2001). Citeseer then indexes these documents. Users may search Citeseer for documents pertinent to their area of research and download one or more documents as required.

The first likely concern of an Internet librarian or library constructor is the size of the library’s collection. For example, Citeseer focuses only on research papers related to computer science and, in order to acquire as many papers as possible, employs software robots rather than collecting papers on the Internet manually. Generally speaking, an Internet librarian or library constructor prefers to build the largest possible collection within the budget limit and the chosen subjects. On the Internet, software robots that automatically acquire materials are a popular means of achieving this goal. Moreover, a software robot with a screening ability, such as keyword selection, can also help library constructors choose the works that belong to the preset subjects.

The next concern for an Internet library, as its collection grows, is copyright, which is essential to libraries; in fact, it may be the issue that most concerns librarians, whether in a traditional brick-and-mortar library or a digital library (Lopatin, 2006; McCray et al., 2001). The copyright issue arises when items in the library’s collection are still protected by copyright. According to modern copyright laws, such as 17 U.S.C. 106 and the WIPO Copyright Treaty, the creators of a copyrightable work automatically own the copyright in the work upon its completion, and no one can reproduce, modify or distribute such a work without the owners’ consent (Rao, 2003). That is to say, copyright is one of the important issues that could impede the development of digital libraries, because the dissemination of copyrighted works, one of the basic functions of a library, could result in copyright infringement (Bolin, 2006). In fact, other conditions being equal, the collection of a library that stays free from copyright infringement allegations is necessarily smaller than that of a library that disregards copyright issues.

Before discussing collecting methods and copyright issues in depth, it is helpful to examine several illustrative websites or libraries that acquire their collections via the Internet. We focus especially on what kinds of works these sites hold, how they acquire their collections and how they avoid possible copyright infringement allegations.

The first example is the Internet Archive, also known as the “WayBack Machine”, an archive consisting mainly of copies of past Web pages collected from the Internet by software robots (Internet Archive, 2009). Because the Internet Archive is a non-commercial organization whose main purpose is to preserve historical data on the Internet, and because it does not enter into time-consuming negotiations with authors, it relies upon ‘fair use’ and other related copyright law exemptions for libraries as its defenses against potential copyright infringement allegations (Hirtle, 2003).

The next example is websites that provide Web space for authors to upload their own articles and for contributors to publish others’ works with full permission, such as Scribd and Issuu (Scribd, 2009; Issuu, 2009a). A website like Scribd or Issuu is in fact an agent or mediator that only offers a platform on which right owners and users can interact: right owners can release their works on the library site as long as they grant certain copyrights and, accordingly, users can choose works that not only meet their specific purposes but also fall within the scope of authorization. When uploaded files are alleged to infringe any copyright, the webmasters remove all suspected materials as soon as they receive notice (Issuu, 2009b). In other words, a library adopting this strategy counts on licensing from authors as well as on Safe Harbor exemptions, such as 17 U.S.C. 512, because it does not precisely examine whether the contributors have real authority.

The first two examples both rely upon exemptions in copyright law. Another straightforward way to avoid potential copyright infringement allegations is to construct a website in which the librarian owns the rights to all collections, so that no one else could assert rights against the librarian. In other words, librarians may contract with the content owners or right holders and make a proper arrangement of the benefits. For instance, the ACM Digital Library collects only articles subject to its copyright terms (ACM Digital Library, 2009). Nevertheless, because the negotiation process may be costly and direct communication with the numerous authors on the Internet is almost impossible, Internet libraries following this model are all businesses, mainstream publishers or media organizations. For instance, the BBC built a trial site, the BBC Creative Archive, to release more than 500 full TV programs (BBC, 2006).

A similar approach is to focus only on works without copyright protection. For example, Project Gutenberg states that it encourages the creation and distribution of eBooks, mainly works in the public domain (Hart, 2004). That is to say, all collections on this website consist merely of public domain or out-of-copyright works and, as a result, no one could challenge a repository of this kind on copyright grounds.

In fact, present Internet libraries may adopt a combination of strategies rather than a single one. For example, the main materials of Project Gutenberg are in the public domain under US copyright law, while a few materials are subject to authors’ permission¹. Citeseer is another example: it not only employs software robots to collect articles on the Internet but also allows authors to submit their articles to the library (CiteseerX.ist, 2009).