
CHAPTER 6 APPENDIX

6.1 BUILDING THE DATABASE

6.1.3 Data Collection

The author collected data from multiple sources and then used various programs to process the data before building the database.

The primary data source for this Section 337 investigation database is the USITC’s 337 Investigational History website. The website maintains one list of pending investigations299 and another of all investigations.300 Each investigation has its own tabular webpage. The author retrieved the pages of all terminated investigations and extracted the usable fields to form the core of the source Microsoft Excel datasheet. However, because the pages are inconsistent, incomplete, and in places factually wrong, considerable legal research was needed to fill in the blanks.

6.1.3.1 The Investigational Database

For the Section 337 investigations, this research collected information including, but not limited to, the following fields:

• Investigation number. The format is 337-TA-xxx, where xxx is the investigation’s serial number. Some very early investigations that were split into branches have a capital letter appended, such as 054A.

• Investigation title (e.g., patent infringement, trademark, copyright …).

• The type of unfair act alleged.

• The disposition.

299 The USITC Section 337 Pending Investigations Home Page, http://info.usitc.gov/ouii/public/337inv.nsf/Pending?OpenView.

300 The USITC Section 337 Investigational History Home Page, http://info.usitc.gov/ouii/public/337inv.nsf/All?OpenView.

o Complaint Withdrawn: A third party can usually learn very little about what happened, at least from the available official documents. One may infer that the complainant was in a difficult position.

o Settlement: A third party can usually learn very little about the settlement.

o No Violation Found

o Violation Found: The ALJ can grant one or more of the following four types of order, and the decision is subject to USITC review.

▪ Consent Order

▪ General Exclusion Order

▪ Limited Exclusion Order

▪ Cease-and-Desist Order

• Litigated patent numbers. An investigation may involve several patents. Some patents are known to have been reexamined (B1 certificate) after the investigation; in such cases the patent number is recorded without a prefix.

• Exporter: Countries or political entities alleged to have exported infringing articles. In one case, the United States was listed as an “exporter.” The goods in question, however, were genuine U.S.-made articles shipped back to the home market by Japanese importers, that is, so-called gray market goods.

• Date of Investigation

• Date of Determination

• Date of Termination

The list above covers only the most important fields. In principle, all data fields available in the USITC’s HTML investigation history files were collected and tabulated.
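The investigation number format described above (337-TA-xxx, with an optional capital letter for branched investigations) lends itself to a simple validation step. The following is a minimal sketch, not the author's actual procedure; the function name, the record type, and the assumption that the serial number has at most four digits are all illustrative.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: "337-TA-" followed by 1 to 4 digits and an optional
# capital branch letter (e.g., 054A). The digit limit is an assumption.
INV_PATTERN = re.compile(r"337-TA-(\d{1,4})([A-Z]?)")


class InvestigationNumber(NamedTuple):
    serial: int   # numeric part, e.g. 54 for "054A"
    branch: str   # branch letter, or "" when there is none


def parse_investigation_number(text: str) -> Optional[InvestigationNumber]:
    """Return the parsed number, or None if the text does not fit the format."""
    match = INV_PATTERN.fullmatch(text.strip().upper())
    if match is None:
        return None
    return InvestigationNumber(int(match.group(1)), match.group(2))
```

Entries for which the function returns None would be candidates for manual review rather than automatic correction.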

Data collection for this research began with the creation of a consolidated HTML document of all terminated and pending Section 337 investigations. The author saved over 500 USITC web documents using Netscape Navigator 4.08,301 and then used that software’s HTML editor to combine all the pages into a single document for text processing. The editor is known to reformat HTML source code slightly according to its own preferred coding style; however, it does not change the contents. Data collection ended in April 2006. The final database incorporates only data from before January 31, 2006.

301 Netscape Home Page, http://www.netscape.com/ .

The contents obtained from the USITC website lack data conformity, so the author had to manually copyedit the lion’s share of entries to standardize the format and terminology (e.g., formats and alternative spellings for dates, abbreviations, place names, business names, and many other irregularities), sometimes dozens of times per investigation.302

The main text editor for this research was UltraEdit-32 v. 10.0.303 The author also fixed typos and corrected factual errors as they were found.
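The copyediting described above was done by hand in a text editor. As an illustration only, part of the date standardization could be expressed as a small script. The list of input formats below is an assumption about what the irregular dates might have looked like; the source does not enumerate them. Parsing month names with `strptime` also assumes an English locale.

```python
from datetime import datetime
from typing import Optional

# Hypothetical date spellings; the actual variants in the USITC pages
# are not listed in the text.
CANDIDATE_FORMATS = (
    "%B %d, %Y",    # January 31, 2006
    "%b. %d, %Y",   # Jan. 31, 2006
    "%b %d, %Y",    # Jan 31, 2006
    "%m/%d/%Y",     # 1/31/2006
    "%Y-%m-%d",     # 2006-01-31
)


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = " ".join(raw.split())  # collapse runs of whitespace
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None
```

Returning None for unrecognized input, rather than guessing, keeps ambiguous entries visible for manual copyediting.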

The copyedited data was then saved in tab-delimited plain text format and imported into Microsoft Excel 2002.304 The author then created a working Section 337 investigational data table in Microsoft Excel, with additional copyediting. Next, the author used the collected patent numbers to build a proprietary U.S. patent datasheet with the same software tools and similar methods. The two main parts were then broken into multiple data tables for import into Microsoft Access 2002.
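For readers who wish to reproduce the tab-delimited intermediate file programmatically, the following sketch shows one way to write it. The column names are hypothetical stand-ins for a few of the fields listed in Section 6.1.3.1, not the author's actual column headings.

```python
import csv
import io
from typing import Dict, Iterable

# Hypothetical column names chosen for illustration.
FIELDS = ["investigation_number", "title", "disposition", "date_of_termination"]


def to_tab_delimited(records: Iterable[Dict[str, str]]) -> str:
    """Serialize records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Collapse embedded tabs and line breaks to single spaces so that
        # every record stays on one line with a fixed number of columns.
        # Missing fields are written as empty strings.
        row = {name: " ".join(str(record.get(name, "")).split()) for name in FIELDS}
        writer.writerow(row)
    return buffer.getvalue()
```

The resulting text can be saved with a .txt or .tsv extension and opened directly in a spreadsheet program.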

Many earlier patent-based investigations are documented without the patent numbers involved. The author made every effort to recover the missing patent numbers through an exhaustive Westlaw- and EDIS-based search.305 Nevertheless, many investigations, especially the earlier ones, were not available in digitized form. As a result, patent-based Investigations 065, 063, 013, 012, 011, 009, 008, 006, 003, 002, and 001 have no patent information at all. Fortunately, these investigations predate June 1, 1981, and so fall outside the scope of the statistical research. The gap could be repaired if the author gained access to more of the original texts.

Some data gaps were filled in June 2006. Even without these patent numbers, the database is fairly complete. During the legal research, the author also corrected many factual errors in the Investigational History documents.

6.1.3.2 The U.S. Patent Database

The U.S. patent database is a collection of about 800 litigated patents mainly retrieved from the USPTO’s “Patent Full-Text and Full-Page Image Databases.”306 The USPTO has

302 For example, each archivist has his own favorite style. Possibly, they also copied texts prepared by others.

The format of punctuation, nationality and date was not standardized even within the same investigation.

Without very extensive copyediting, the database program can hardly digest the information.

303 UltraEdit Home Page, http://www.ultraedit.com/ .

304 Microsoft Office Home Page, http://office.microsoft.com/en-us/default.aspx .

For a comparable cross-platform open source software solution, you may get OpenOffice.org from http://www.openoffice.org/ .

305 EDIS Home Page, https://edis.usitc.gov/hvwebex/ .

306 USPTO Patent Database Home Page, http://www.uspto.gov/patft/index.html .

made the full searchable text of post-1976 and some earlier patents available. The author only needed to copy and paste all the patent numbers into UltraEdit-32 and use it to create a long list of patent webpage URLs. The URL list was then fed to Net Transport v. 1.94281, a spyware-free, closed-source download manager,307 for data retrieval. The patent files were downloaded within several minutes.
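The URL list in this step was built in a text editor. A scripted equivalent might look like the sketch below. The URL template is a placeholder: the text does not give the USPTO query format the author used, so the address here is purely hypothetical and must be replaced with a real one before use.

```python
from typing import Iterable, List

# Placeholder template, NOT a real USPTO address.
URL_TEMPLATE = "https://patents.example.org/lookup?number={number}"


def normalize_patent_number(raw: str) -> str:
    """Drop commas and whitespace and upper-case any prefix (e.g., RE)."""
    return "".join(raw.replace(",", "").split()).upper()


def build_url_list(
    patent_numbers: Iterable[str], template: str = URL_TEMPLATE
) -> List[str]:
    """Return one URL per distinct patent number, in first-seen order."""
    seen = set()
    urls = []
    for raw in patent_numbers:
        number = normalize_patent_number(raw)
        if not number or number in seen:
            continue  # skip blanks and duplicates
        seen.add(number)
        urls.append(template.format(number=number))
    return urls
```

Writing the returned list to a text file, one URL per line, yields input in the form that download managers commonly accept.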

For many patents issued between 1971 and 1976, the author used the Delphion online patent database308 because it provides patent text rather than scanned patent images. Delphion also provides IPC classification information for some, but not all, pre-1976 patents. Classification information from different sources sometimes disagrees. This research generally used data from the USITC.

To simplify the data collection process, after a week of endless copying and pasting the author turned to Learning Tech’s online patent database309 for some post-1976 patents. Learning Tech (http://www.learningtech.com.tw/) is a Taiwan-based company specializing in patent analysis software. Its patent data was licensed from the USPTO, and it allowed free public access. Its simplified output pages are much easier to use for semi-manual data extraction than the complex pages generated by the USPTO’s servers. However, using its data proved to be a bad decision.

Learning Tech’s database front end has some serious bugs. It fails to output a reissued patent’s original patent information. It also does not provide a patent’s disclaimer date, where applicable. Its data contains other substantial errors as well (e.g., in certain dates and patent classes). This research ended up abandoning all of its data and using USPTO information only.