

1. Introduction

1.3 Possible Ways to Solve the Copyright Problem

Instead of expensive human intervention, there are two other main ways to avoid potential copyright infringement allegations (Lessig, 2006a). The first approach is the law: for example, a government can grant an entirely new copyright exemption that applies only to Internet libraries, or directly broaden the reach of the fair use exemption. The second is code. In the context of the Internet, code, which is, more specifically, software or hardware, makes cyberspace what it is and constitutes a set of constraints on how one can behave (Lessig, 2006b). On this ground, designing a new software robot that can precisely identify the authorization scope of a work is a possible way to reduce the risk of copyright infringement allegations.
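As a rough sketch of how such a robot might proceed, the following Python snippet (a minimal, assumption-laden illustration, not an existing robot) scans a page for a machine-readable license declaration of the kind Creative Commons recommends, namely a link carrying rel="license":

```python
from html.parser import HTMLParser

class LicenseLinkParser(HTMLParser):
    """Collects the href targets of <a rel="license"> or <link rel="license">
    elements, the markup Creative Commons suggests for declaring a license."""
    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("a", "link") and a.get("rel") == "license" and "href" in a:
            self.licenses.append(a["href"])

# Invented sample page; a real robot would fetch this over HTTP.
page = """<html><head>
<link rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">
</head><body>Sample work</body></html>"""

parser = LicenseLinkParser()
parser.feed(page)
print(parser.licenses)
```

A real robot would of course have to cope with far messier markup; the point here is only that a fixed, machine-readable marker spares the robot any natural-language understanding.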

Although these two ways can both mitigate the copyright infringement problem, a new exemption may inevitably conflict with the present rules of copyright law; its unpredictable consequences therefore make it an improper choice.

Moreover, a new exemption requires extensive research and discussion; in other words, it is very time-consuming. By contrast, changing the architecture of the Internet is generally cheaper than granting a new exemption, because producing a new segment of code is a much simpler process.

For all the reasons above, employing software robots to automatically collect works, both copyrighted and out of copyright, and to identify the explicit authorization scope of the collected works is the best strategy for an Internet library.

That is to say, a library that follows this model, in quadrant IV, can achieve the broadest collection while facing a low risk of copyright infringement.

Nevertheless, this mixed strategy remains an ideal at present: no Internet library has yet launched a software robot that can both automatically collect works and explicitly identify their copyright authorization scope. In fact, few software robots can differentiate between a copyrighted document and a document posted by its author for general use; as a result, they simply retrieve all papers from the Internet automatically.

Some technical hurdles impede the advance of the Internet library, especially with respect to automatically identifying the authorization scope. The first is that the real meaning of such information, especially its legal meaning, is hard to determine without human intervention. More explicitly, two kinds of difficulties are involved. First, information expressed in natural language cannot be perfectly identified and comprehended by software robots; the resulting misunderstandings inevitably lead to misjudgments of the copyright authorization scope.

Secondly, vague expressions can also cause misunderstandings. For example, the common phrase "Under Copyright Law Protection", which does not specify which nation's copyright law applies, may mislead software robots and introduce ambiguity.
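The ambiguity can be made concrete with a naive keyword matcher (our own assumption for illustration, not an existing robot): it can detect that a copyright notice is present, but it has no way to recover which jurisdiction the notice refers to when the phrase omits one:

```python
import re

def detect_notice(text):
    """Naively match notices of the form 'Under [Nation] Copyright Law'.
    Returns the stated jurisdiction, 'unspecified jurisdiction' when the
    phrase names none, or None when no notice is found at all."""
    m = re.search(r"under\s+(\w+\s+)?copyright\s+law", text, re.IGNORECASE)
    if not m:
        return None
    jurisdiction = (m.group(1) or "").strip()
    return jurisdiction or "unspecified jurisdiction"

print(detect_notice("Under Copyright Law Protection"))  # ambiguous phrase
print(detect_notice("Under Japanese Copyright Law"))    # jurisdiction stated
```

The first call returns "unspecified jurisdiction": the robot knows only that some copyright law applies, which is exactly the misjudgment risk described above.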

The second difficulty is that, even when a software robot correctly understands the meaning of authorization information, the exact location of that information for a particular work is not easy to determine. For example, on SourceForge all programs are under the same GPL license, which is expressed in the "Terms of Use" section of the website, as shown in Figure 2.

Figure 2: A snapshot of SourceForge's "Terms and Conditions of Use"2

On the other hand, every document on Scribd is licensed under the same Creative Commons license, as shown in Figure 3. However, as these two figures illustrate, the locations of the authorization information differ: one is on a separate page and the other is on the same page.
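A robot therefore needs a lookup order rather than a single fixed location. The sketch below (the site table and return values are invented for illustration) checks the work's own page first, in the Scribd style, and falls back to a known site-wide terms page, in the SourceForge style:

```python
# Hypothetical per-site fallback table; a real robot would need some such
# knowledge, because no standard says where authorization information lives.
SITE_TERMS = {"sourceforge.net": "/terms-of-use", "scribd.com": None}

def license_location(host, in_page_license):
    """Return where a robot should read authorization info for a work."""
    if in_page_license:              # Scribd-style: license embedded in the page
        return "same page"
    terms = SITE_TERMS.get(host)     # SourceForge-style: one site-wide terms page
    return f"terms page {terms}" if terms else "unknown"

print(license_location("scribd.com", True))
print(license_location("sourceforge.net", False))
```

The need for a hand-maintained table like SITE_TERMS is precisely the difficulty: without one, the robot cannot know where to look.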

2 http://alexandria.wiki.sourceforge.net/Terms+of+Use

Figure 3: A snapshot of a document in Scribd3

To overcome the two difficulties above, the first suggestion is to build a much more complex software robot: one with strong artificial intelligence and high-level information retrieval technology, able both to find the relevant piece of information and to comprehend its legal meaning in natural language.

Nevertheless, the technologies in these two areas, artificial intelligence and information retrieval, are very complex and, in fact, no software robot with such an ability exists yet.

Therefore, the next suggestion is more practical: offer the authors of works a mechanism that can be easily understood by robots and that can properly express the copyright authorization scope. More explicitly, a mechanism fulfilling two minimum requirements could be used in such circumstances: first, the mechanism should be fully identifiable by software robots; second, it should be flexible enough to express the copyright authorization scope of works of any type.

3 http://www.scribd.com/doc/3497454/GPL-
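To make the two requirements concrete, the sketch below shows one hypothetical shape such a mechanism could take: a record of fixed fields that a robot can identify without any natural-language processing, yet flexible enough to describe different types of works. All field names are our own illustration, not part of any existing standard:

```python
# A hypothetical fixed-term authorization record meeting both requirements:
# machine-identifiable (structured fields, no natural language) and flexible
# (field values can vary per work type). Field names are illustrative only.
authorization = {
    "work_type": "text",
    "license": "http://creativecommons.org/licenses/by-nc/3.0/",
    "commercial_use": False,
    "derivatives": True,
}

def robot_can_collect(record):
    """A robot needs no language understanding: it only reads fixed fields."""
    return record.get("license") is not None

print(robot_can_collect(authorization))
```

Because every field has a fixed name and a constrained value, the robot's "comprehension" reduces to dictionary lookups, which satisfies the first requirement by construction.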

Furthermore, we hope to construct a library that not only acquires collections via software robots but also focuses on free and open works. A library of free and open works can effectively encourage the exchange of works on the Internet and thus stimulate the further development and preservation of culture.

We hope the mechanisms proposed in this thesis will help achieve this goal.

Based on the foregoing discussion, a fixed-term expression, rather than natural language, is the more suitable proposal. Moreover, the popularity of a fixed-term expression is very important: search engines, the most common operators of robots, support only a few popular fixed-term expressions, and this fact ultimately determines the number of users of any proposed expression method. In other words, a well-designed but unpopular fixed-term expression is nothing but an unrealistic fantasy.

In the present Internet world, there are two popular fixed-term expression schemes: the Creative Commons license (hereafter, the CC license) and the Robots.txt file with Robots Meta tags. These mechanisms are designed specifically for software robots; that is to say, any further modification of them could still be easily understood by software robots. More importantly, both approaches are supported by the robots of popular search engines such as Google (Google, 2008b), Yahoo (Yahoo, 2008b), and MSN (MSN, 2008a).
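As an illustration of how little effort the Robots.txt mechanism demands of a robot, Python's standard library can already parse it (the file content and URLs below are invented for the example); the Robots Meta tag counterpart would be a line such as <meta name="robots" content="noindex, nofollow"> in a page's header:

```python
from urllib.robotparser import RobotFileParser

# A minimal Robots.txt of the kind Google, Yahoo, and MSN robots honor;
# note that it controls crawling only and says nothing about copyright.
robots_txt = """User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())
print(rp.can_fetch("*", "http://example.com/public/doc.html"))   # True
print(rp.can_fetch("*", "http://example.com/private/doc.html"))  # False
```

This simplicity is exactly the scheme's strength and its weakness: robots understand it immediately, but its vocabulary (allow or disallow crawling) cannot express an authorization scope.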

However, with regard to expressing the copyright authorization scope, both schemes have drawbacks. Although the CC license covers several common copyright authorization choices, it still has some disadvantages and needs further modification, especially for works in certain digital forms. On the other hand, the Robots.txt file and Meta tags are purposely designed for software robots and very easy to use, but they do not focus on expressing explicit copyright authorization. As a result, both candidates need some modification. Furthermore, with respect to licensing on the Internet, two kinds of people need to express a copyright authorization scope. The first, not surprisingly, is the author of a work.

In addition to licensing individual works differently, an author on the Internet may need to license all works on one website or Web page under the same conditions. For example, on Scribd, all works are licensed under the same CC license, as shown in Figure 3. Therefore, the second kind of people who need expressions of copyright authorization are the webmasters who operate websites or the owners who manage Web pages. In general, a site and all the pages residing on it may be owned by the same person; therefore, we use the term webmaster to represent those who need expressions for licensing all works identically.

This thesis is structured as follows. We begin by reviewing some primary concepts, such as digital libraries, software robots, Internet copyright issues, and related terminology. In the following sections, concerning the two kinds of people in need of authorization expressions, we first turn our attention to webmasters who authorize all works on the same page. The Robots.txt file, the Robots Meta tags, and the CC license can all be used to license works on the same Web page. The Robots.txt file and Robots Meta tags, however, need a minor amendment to fully express the copyright authorization scope, whereas the CC licensing scheme can be used to license works both identically and individually. A revision of the original CC license is then proposed, which reduces its disadvantages in licensing each particular work. Next, we compare the foregoing revision and amendments before finally discussing some unsolved problems and suggesting issues that invite future research.