Pattern.compile( ) Flags
4.12 Program: Full Grep
Now that we’ve seen how the regular expressions package works, it’s time to write Grep2, a full-blown version of the line-matching program with option parsing.
Table 4-3 lists some typical command-line options that a Unix implementation of grep might include.
// If found, append to sales data file.
Matcher m = r.matcher(input);
if (m.find( )) {
PrintWriter pw = new PrintWriter(
new FileWriter(DATA_FILE, true));
String date = // like `date +'%m %d %H %M %S %Y'`
new SimpleDateFormat("MM dd HH mm ss yyyy ").
format(new Date( ));
// Paren 1 is the digits (and maybe ','s) that matched; remove comma
Matcher noComma = Pattern.compile(",").matcher(m.group(1));
pw.println(date + noComma.replaceAll(""));
pw.close( );
} else {
System.err.println("WARNING: pattern `" + pattern + "' did not match in `" + url + isbn + "'!");
}
// Whether current data found or not, draw the graph, using
// external plotting program against all historical data.
// Could use gnuplot, R, any other math/graph program.
// Better yet: use one of the Java plotting APIs.
String gnuplot_cmd = "set term png\n" +
"set output \"" + GRAPH_FILE + "\"\n" + "set xdata time\n" +
"set ylabel \"Book sales rank\"\n" + "set bmargin 3\n" +
"set logscale y\n" +
"set yrange [1:60000] reverse\n" + "set timefmt \"%m %d %H %M %S %Y\"\n" + "plot \"" + DATA_FILE +
"\" using 1:7 title \"" + title + "\" with lines\n"
;
Process proc = Runtime.getRuntime( ).exec("/usr/local/bin/gnuplot");
PrintWriter gp = new PrintWriter(proc.getOutputStream( ));
gp.print(gnuplot_cmd);
gp.close( );
} }
Example 4-8. BookRank.java (continued)
We discussed the GetOpt class in Recipe 2.6. Here we use it to control the operation of an application program. As usual, since main( ) runs in a static context but our application main line does not, we could wind up passing a lot of information into the constructor. Because we have so many options, and it would be inconvenient to keep expanding the options list as we add new functionality to the program, we use a kind of Collection called a BitSet to pass all the true/false arguments: true to print line numbers, false to print filenames, etc. (Collections are covered in Chapter 7.) A BitSet is much like a Vector (see Recipe 7.3) but is specialized to store only Boolean values and is ideal for handling command-line arguments.
The program basically just reads lines, matches the pattern in them, and, if a match is found (or not found, with -v), prints the line (and optionally some other stuff, too). Having said all that, the code is shown in Example 4-9.
Table 4-3. Grep command-line options
Option       Meaning
-c           Count only: don’t print lines, just count them
-C           Context; print some lines above and below each line that matches (not implemented in this version; left as an exercise for the reader)
-f pattern   Take pattern from file named after -f instead of from command line
-h           Suppress printing filename ahead of lines
-i           Ignore case
-l           List filenames only: don’t print lines, just the names they’re found in
-n           Print line numbers before matching lines
-s           Suppress printing certain error messages
-v           Invert: print only lines that do NOT match the pattern
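To make the BitSet idea concrete, here is a minimal sketch of carrying the true/false options from Table 4-3 in a BitSet indexed by option letter. It does not use the GetOpt class from Recipe 2.6; the class name OptionBits and its hand-rolled parse( ) method are hypothetical stand-ins, and options that take an argument (such as -f) are deliberately not handled.

```java
import java.util.BitSet;

/** Sketch: carry grep-style boolean options in a BitSet indexed by option letter. */
class OptionBits {

    /** Collect single-letter flags such as "-c" or "-iv" into a BitSet. */
    static BitSet parse(String[] args) {
        BitSet opts = new BitSet();
        for (String arg : args) {
            if (arg.length() < 2 || arg.charAt(0) != '-') {
                break;                      // first non-option argument ends the options
            }
            for (int i = 1; i < arg.length(); i++) {
                opts.set(arg.charAt(i));    // the char value serves as the bit index
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        BitSet opts = parse(new String[] { "-c", "-iv", "pattern", "file.txt" });
        System.out.println("count only?  " + opts.get('c'));
        System.out.println("ignore case? " + opts.get('i'));
        System.out.println("invert?      " + opts.get('v'));
        System.out.println("numbered?    " + opts.get('n'));
    }
}
```

A single BitSet can then be handed to a constructor in place of a growing list of boolean parameters; adding a new flag means agreeing on one more letter rather than changing a method signature.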
Example 4-9. Grep2.java
import com.darwinsys.util.*;
import java.io.*;
import java.util.*;
import java.util.regex.*;
/** A command-line grep-like program. Accepts some options and takes a pattern
 * and a list of text files.
 */
public class Grep2 {
/** The pattern we're looking for */
protected Pattern pattern;
/** The matcher for this pattern */
protected Matcher matcher;
/** The Reader for the current file */
protected BufferedReader d;
/** Are we to only count lines, instead of printing? */
protected boolean countOnly = false;
/** Are we to ignore case? */
protected boolean ignoreCase = false;
/** Are we to suppress printing of filenames? */
protected boolean dontPrintFileName = false;
/** Are we to only list names of files that match? */
protected boolean listOnly = false;
/** are we to print line numbers? */
protected boolean numbered = false;
/** Are we to be silent about errors? */
protected boolean silent = false;
/** are we to print only lines that DONT match? */
protected boolean inVert = false;
/** Construct a Grep object for each pattern, and run it
 * on all input files listed in argv.
 */
public static void main(String[] argv) throws PatternSyntaxException {
if (argv.length < 1) {
pg.process(new InputStreamReader(System.in), "(standard input)");
else
int caseMode = ignoreCase ? Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE : 0;
pattern = Pattern.compile(patt, caseMode);
matcher = pattern.matcher("");
}
/** Do the work of scanning one file
 * @param ifile Reader Reader object already open
 * @param fileName String Name of the input file
 */
public void process(Reader ifile, String fileName) {
String line;
int matches = 0;
try {
d = new BufferedReader(ifile);
while ((line = d.readLine( )) != null) {
matcher.reset(line);
if (matcher.find( )) {
if (countOnly) matches++;
else {
if (!dontPrintFileName)
System.out.print(fileName + ": ");
System.out.println(line);
}
} else if (inVert) {
System.out.println(line);
} }
if (countOnly)
System.out.println(matches + " matches in " + fileName);
d.close( );
} catch (IOException e) { System.err.println(e); } }
}
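The heart of the program, compiling the pattern once with the case flags and then reusing a single Matcher for every line, can be tried in isolation. The following is only a sketch under stated assumptions: the class MiniGrep and its grep( ) method are hypothetical, it works on an in-memory list of lines rather than files, and it treats the invert option strictly as described for -v in Table 4-3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of the core matching loop, run against an in-memory list of lines. */
class MiniGrep {

    /** Return the lines that match or, if invert is true, the lines that do not. */
    static List<String> grep(String patt, boolean ignoreCase, boolean invert,
                             List<String> lines) {
        int flags = ignoreCase
                ? Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE : 0;
        // One Matcher, reset for each line, instead of a new Matcher per line.
        Matcher matcher = Pattern.compile(patt, flags).matcher("");
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            matcher.reset(line);
            if (matcher.find() != invert) {   // found and not inverted, or the reverse
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Java Cookbook", "java.util.regex", "Perl");
        System.out.println(grep("java", false, false, lines));
        System.out.println(grep("java", true, false, lines));
        System.out.println(grep("java", true, true, lines));
    }
}
```

Note that find( ) looks for the pattern anywhere in the line, which is the behavior expected of grep; matches( ) would require the whole line to match.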
CHAPTER 5
Numbers
5.0 Introduction
Numbers are basic to just about any computation. They’re used for array indexes, temperatures, salaries, ratings, and an infinite variety of things. Yet they’re not as simple as they seem. With floating-point numbers, how accurate is accurate? With random numbers, how random is random? With strings that should contain a number, what actually constitutes a number?
Java has several built-in types that can be used to represent numbers, summarized in Table 5-1. Note that unlike languages such as C or Perl, which don’t specify the size or precision of numeric types, Java—with its goal of portability—specifies these exactly and states that they are the same on all platforms.
As you can see, Java provides a numeric type for just about any purpose. There are four sizes of signed integers for representing various sizes of whole numbers. There are two sizes of floating-point numbers to approximate real numbers. There is also a type specifically designed to represent and allow operations on Unicode characters.
Table 5-1. Numeric types
Built-in type   Object wrapper   Size of built-in (bits)   Contents
byte            Byte             8                         Signed integer
short           Short            16                        Signed integer
int             Integer          32                        Signed integer
long            Long             64                        Signed integer
float           Float            32                        IEEE-754 floating point
double          Double           64                        IEEE-754 floating point
char            Character        16                        Unsigned Unicode character
When you read a string from user input or a text file, you need to convert it to the appropriate type. The object wrapper classes in the second column have several functions, but one of the most important is to provide this basic conversion functionality, replacing the C programmer’s atoi/atof family of functions and the numeric arguments to scanf.
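As a quick illustration of that conversion role, the sketch below uses Integer.parseInt( ) and Double.parseDouble( ) from the wrapper classes. The class ParseNumbers and its helper parseIntOr( ) are hypothetical names introduced only for this example; returning a fallback value on bad input is one possible policy, not the only one.

```java
/** Sketch: converting text to numbers with the object wrapper classes. */
class ParseNumbers {

    /** Parse an int, returning a fallback value if the text is not a valid int. */
    static int parseIntOr(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return fallback;        // the text was not a well-formed int
        }
    }

    public static void main(String[] args) {
        int count = Integer.parseInt("42");             // roughly C's atoi
        double price = Double.parseDouble("19.95");     // roughly C's atof
        System.out.println("count + 1 = " + (count + 1));
        System.out.println("price = " + price);
        System.out.println("bad input gives " + parseIntOr("forty-two", -1));
    }
}
```

Unlike atoi, which quietly returns 0 for garbage, the Java methods throw NumberFormatException, so invalid input cannot be mistaken for a legitimate zero.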
Going the other way, you can convert any number (indeed, anything at all in Java) to a string just by using string concatenation. If you want a little bit of control over numeric formatting, Recipe 5.8 shows you how to use some of the object wrappers’ conversion routines. And if you want full control, it also shows the use of NumberFormat and its related classes.
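The three levels of control can be sketched side by side. This is an illustration using standard library calls, not necessarily the ones Recipe 5.8 uses; the class NumberToString and its methods are hypothetical, and Locale.US is chosen here only so that the grouping and decimal separators are predictable.

```java
import java.text.NumberFormat;
import java.util.Locale;

/** Sketch: three ways of turning a number into a string. */
class NumberToString {

    /** No control at all: plain string concatenation. */
    static String viaConcat(int n) {
        return "" + n;
    }

    /** Full control: grouping separators and exactly two fraction digits. */
    static String grouped(double d) {
        NumberFormat nf = NumberFormat.getInstance(Locale.US);  // locale is an assumption
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(d);
    }

    public static void main(String[] args) {
        System.out.println(viaConcat(42));                // 42
        System.out.println(Integer.toString(255, 16));    // a little control: radix 16 gives ff
        System.out.println(Integer.toBinaryString(5));    // 101
        System.out.println(grouped(1234567.891));         // 1,234,567.89
    }
}
```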
As the name object wrapper implies, these classes are also used to “wrap” a number in a Java object, as many parts of the standard API are defined in terms of objects.
Later on, Recipe 10.16 shows using an Integer object to save an int’s value to a file using object serialization, and retrieving the value later.
But I haven’t yet mentioned the issues of floating point. Real numbers, you may recall, are numbers with a fractional part. There is an infinity of possible real numbers. A floating-point number (what a computer uses to approximate a real number) is not the same as a real number. The number of floating-point numbers is finite, with only 2^32 different bit patterns for floats, and 2^64 for doubles. Thus, most real values have only an approximate correspondence to floating point. Printing the real number 0.3 works as expected. This code:
// RealValues.java
System.out.println("The real value 0.3 is " + 0.3);
results in this printout:
The real value 0.3 is 0.3
But the difference between a real value and its floating-point approximation can accumulate if the value is used in a computation; this is often called a rounding error.
Continuing the previous example, the real 0.3 multiplied by 3 yields:
The real 0.3 times 3 is 0.89999999999999991
Surprised? More surprising is this: you’ll get the same output on any conforming Java implementation. I ran it on machines as disparate as a Pentium with OpenBSD, a Pentium with Windows and Sun’s JDK, and on Mac OS X with JDK 1.4.1. Always the same answer.
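One practical consequence is that computed floating-point values should seldom be compared with ==. The sketch below repeats the multiplication and compares the result against 0.9 both exactly and within a tolerance. The class RoundingDemo, the helper nearlyEqual( ), and the tolerance of 1e-9 are all assumptions made for illustration; a suitable tolerance depends on the magnitudes involved in your own calculation.

```java
/** Sketch: exact comparison versus comparison within a tolerance. */
class RoundingDemo {

    /** True if a and b differ by no more than eps (an absolute tolerance). */
    static boolean nearlyEqual(double a, double b, double eps) {
        return Math.abs(a - b) <= eps;
    }

    public static void main(String[] args) {
        double product = 0.3 * 3;
        System.out.println("The real 0.3 times 3 is " + product);
        System.out.println("Exactly 0.9?  " + (product == 0.9));                  // false
        System.out.println("Close to 0.9? " + nearlyEqual(product, 0.9, 1e-9));   // true
    }
}
```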
And what about random numbers? How random are they? You have probably heard the expression “pseudo-random numbers.” All conventional random number generators, whether written in Fortran, C, or Java, generate pseudo-random numbers. That is, they’re not truly random! True randomness comes only from specially built hardware: an analog source of Brownian noise connected to an analog-to-digital converter, for example.* This is not your average PC! However, pseudo-random number generators (PRNG for short) are good enough for most purposes, so we use them. Java provides one random generator in the base library java.lang.Math, and several others; we’ll examine these in Recipe 5.13.
* For a low-cost source of randomness, check out http://www.lavarand.org. These folks use digitized video of 1970s “lava lamps” to provide “hardware-based” randomness. Fun!
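The “pseudo” in pseudo-random is easy to demonstrate: a generator started from the same seed produces the same sequence every time. The following sketch uses java.util.Random alongside Math.random( ); the class PseudoRandomDemo, its sequence( ) helper, and the seed value 42 are arbitrary choices for this example.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch: a seeded pseudo-random generator is completely repeatable. */
class PseudoRandomDemo {

    /** First n ints in the range [0, bound) from a generator with the given seed. */
    static int[] sequence(long seed, int n, int bound) {
        Random r = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = r.nextInt(bound);
        }
        return out;
    }

    public static void main(String[] args) {
        // Same seed, same "random" numbers, on every run.
        System.out.println(Arrays.toString(sequence(42L, 5, 100)));
        System.out.println(Arrays.toString(sequence(42L, 5, 100)));
        // Math.random() returns a double in [0.0, 1.0), seeded for you.
        System.out.println(Math.random());
    }
}
```

Repeatability is a feature when you need to reproduce a test run or a simulation, and a weakness when the numbers must be unpredictable.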
The class java.lang.Math contains an entire “math library” in one class, including trigonometry, conversions (including degrees to radians and back), rounding, truncating, square root, minimum, and maximum. It’s all there. Check the Javadoc for java.lang.Math.
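A brief tour of a few of these methods, as a sketch only: the class MathTour and its hypotenuse( ) helper are hypothetical, while every Math method called is part of the standard class.

```java
/** Sketch: a small sample of what java.lang.Math offers. */
class MathTour {

    /** Length of the hypotenuse of a right triangle with legs a and b. */
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        System.out.println(hypotenuse(3, 4));                 // 5.0
        System.out.println(Math.toRadians(180.0));            // approximately pi
        System.out.println(Math.sin(Math.toRadians(90.0)));   // sine of a right angle
        System.out.println(Math.round(2.5));                  // rounding: 3
        System.out.println(Math.floor(-2.5));                 // truncating downward: -3.0
        System.out.println(Math.min(3, 7) + " " + Math.max(3, 7));   // 3 7
    }
}
```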
The package java.math contains support for “big numbers” (those larger than the normal built-in long integers, for example). See Recipe 5.19.
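To see where a long runs out and java.math.BigInteger takes over, consider factorials: 20! still fits in a long, but 21! does not. This is a minimal sketch; the class BigDemo and its factorial( ) method are hypothetical names for this example.

```java
import java.math.BigInteger;

/** Sketch: BigInteger keeps going where long would overflow. */
class BigDemo {

    /** Compute n! exactly, however large it gets. */
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Largest long: " + Long.MAX_VALUE);
        System.out.println("20! = " + factorial(20));   // still fits in a long
        System.out.println("21! = " + factorial(21));   // too big for a long
    }
}
```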
Java works hard to ensure that your programs are reliable. The usual ways you’d notice this are in the common requirement to catch potential exceptions—all through the Java API—and in the need to “cast” or convert when storing a value that might or might not fit into the variable you’re trying to store it in. I’ll show examples of these.
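As a small preview of the casting requirement, the sketch below stores values into variables that may be too small for them. The class CastDemo and its narrow( ) method are hypothetical; the point is that the compiler refuses the assignment without a cast, and that the cast, once written, silently discards whatever does not fit.

```java
/** Sketch: narrowing conversions must be requested with an explicit cast. */
class CastDemo {

    /** Narrow an int to a byte, keeping only the low-order 8 bits. */
    static byte narrow(int i) {
        return (byte) i;
    }

    public static void main(String[] args) {
        int big = 300;
        // byte b = big;            // would not compile: possible loss of precision
        byte b = (byte) big;        // compiles, but 300 does not fit in 8 bits
        System.out.println(b);      // 44, that is, 300 - 256

        double d = 3.99;
        int truncated = (int) d;    // the fractional part is dropped, not rounded
        System.out.println(truncated);   // 3
    }
}
```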
Overall, Java’s handling of numeric data fits well with the ideals of portability, reliability, and ease of programming.
See Also
The Java Language Specification. The Javadoc page for java.lang.Math.