String Search/Replace

I continued to use AWK for writing simple file filters after I wrote Until. About 80% of the AWK scripts were simply searches and replaces. AWK includes a sub() function to do a single search and replace on a line and a gsub() function for global search and replace in a line. The Until string search and replace module was inspired by AWK. In addition to sub and gsub, there are functions to search for a substring within a string and search for a substring and insert a string.

The algorithms are simple brute-force implementations, but they are effective. There is definitely room for speed improvements. In fact, one beta tester has already improved the code, and those changes will probably be included in the next release.

Also expect some changes in the words themselves in the next release. For example, there is search&insert but no global version.
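To illustrate what "simple and brute force" means for the substring search, here is a minimal sketch in C. It is not Until's actual source; the name brute_search and its signature are assumptions made for illustration. It tries the pattern at every position of the string and returns the first match, or NULL where the Until word would return 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from Until's source.
   Returns a pointer to the first occurrence of pat in str,
   or NULL when there is no match. */
char *brute_search(char *str, const char *pat)
{
    assert(str != NULL && pat != NULL);

    for (char *s = str; ; s++) {
        size_t i = 0;

        /* Compare the pattern against the text starting at s. */
        while (pat[i] != '\0' && s[i] == pat[i])
            i++;

        if (pat[i] == '\0')
            return s;       /* the whole pattern matched at s */
        if (*s == '\0')
            return NULL;    /* reached the end of the string */
    }
}
```

The worst case compares every pattern character at every position, which is where a faster algorithm would gain its speed.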

( --- )
"clear-temp". The search and replace C functions use a dynamically allocated temporary work string. Calling clear_temp will free() the memory.
( 'string 'search 'replace --- #matches )
"g-sub". The global search and replace word searches string for search and replaces it with replace when a match occurs. This version performs multiple replacements in the string. The number of replacements is returned.
   inbuf 1+ " is" " is not" gsub
Searches <INBUF> for occurrences of "is" and replaces each with "is not".
( 'string 'new-string --- 'string )
"insert". Insert new-string at the beginning of string. The address of the original string is returned.
( 'string 'new_str 'sub_str --- 'string )
"replace". Replace the substring, sub_str, starting at string with the new string, new_str. The original address is returned. No check is made to see that the new string fits.
( 'string 'search --- f|'match )
"search". Search string for the substring search. The address of the match is returned when search is found or 0 for no match. For example:
	inbuf " xxx" search
returns the match address or 0 when no match occurs.
( 'string 'search 'insert --- f|'match )
"search and insert". Search string for the substring search and insert the string insert after the match point. The address of the end of the match is returned on a match, or 0 for no match. For example:
   inbuf " is" "  not" search&insert
will change an occurrence of "is" into "is not".
( 'string 'search 'replace --- tf )
"sub". The single search and replace word searches string for search and replaces it with replace when a match occurs. Unlike gsub, this version performs only a single replacement per call. A flag is returned to indicate whether a replacement was made.
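
The difference between sub and gsub can be sketched in C as one routine with a flag. This is an illustration only, not Until's source: the name replace_all, the cap parameter, and the -1 error return are assumptions. It builds the result in a dynamically allocated temporary work string, as the C functions described above do, and copies it back only if it fits.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not taken from Until's source.
   Replaces occurrences of pat in str with rep. str lives in a buffer of
   cap bytes. When global is 0 only the first match is replaced.
   Returns the number of replacements, or -1 (leaving str unchanged)
   if the result would not fit or memory cannot be allocated. */
int replace_all(char *str, size_t cap, const char *pat, const char *rep,
                int global)
{
    assert(str != NULL && pat != NULL && rep != NULL && cap > 0);

    size_t plen = strlen(pat);
    size_t rlen = strlen(rep);
    if (plen == 0)
        return 0;                   /* treat an empty pattern as no match */

    char *tmp = malloc(cap);        /* temporary work string */
    if (tmp == NULL)
        return -1;

    size_t out = 0;
    int count = 0;
    const char *s = str;

    while (*s != '\0') {
        int hit = (global || count == 0) && strncmp(s, pat, plen) == 0;
        size_t n = hit ? rlen : 1;

        if (out + n >= cap) {       /* keep room for the terminator */
            free(tmp);
            return -1;
        }
        if (hit) {
            memcpy(tmp + out, rep, rlen);
            s += plen;              /* skip the matched text */
            count++;
        } else {
            tmp[out] = *s++;        /* copy one unmatched character */
        }
        out += n;
    }

    tmp[out] = '\0';
    memcpy(str, tmp, out + 1);
    free(tmp);
    return count;
}
```

Because the scan continues after the matched text in the original string, replacing "is" with "is not" does not match the "is" it has just written.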

Search and Replace Example

The primary reason I added the search and replace words was to make file filters easier to write. A skeleton file filter application is included in FILTER.APP in the distribution, so writing a file filter is a matter of writing a word to process each line in the file and calling filter.

Assume that you must change all of the occurrences of "is" to "is not" in a file. Using filter and gsub, the following code is all you need:


: process.line		( --- )
   inbuf 1+ "  is " "  is not " gsub	\ blank is blank
   if ( the search was successful )
      inbuf 1+ dup strlen type          \ Type lines that match
   then
   inbuf 1+ fout @ fputs drop           \ Write every line to the output file
;

: doit "" " xxx.out" ['] process.line filter  ;

The example code is in file FILTEREX.APP.

filter requires three parameters: the input file name, the output file name, and the address of the word to execute. In the example, the first string names the input file, xxx.out is the output file, and process.line is the word to execute. filter automatically handles opening, reading, and closing the files.
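
For readers who think in C, the division of labor between filter and the per-line word can be sketched as follows. This is a rough analogy under stated assumptions, not the code in FILTER.APP: the names filter_file, line_fn, copy_line, and LINE_LEN are hypothetical, and the callback receives the line and output file as arguments instead of using shared variables such as inbuf and fout.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_LEN 256    /* hypothetical line buffer size */

/* Hypothetical callback type: process one line and write to out. */
typedef void (*line_fn)(char *line, FILE *out);

/* Sample callback that copies each line through unchanged. */
void copy_line(char *line, FILE *out)
{
    fputs(line, out);
}

/* Hypothetical skeleton: open both files, call fn once for each line
   read, then close the files. Returns the number of reads performed,
   or -1 on a file error. Lines longer than LINE_LEN - 1 characters
   are passed to fn in pieces. */
int filter_file(const char *in_name, const char *out_name, line_fn fn)
{
    assert(in_name != NULL && out_name != NULL && fn != NULL);

    FILE *in = fopen(in_name, "r");
    if (in == NULL)
        return -1;

    FILE *out = fopen(out_name, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char line[LINE_LEN];
    int count = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        fn(line, out);      /* the caller's word does the real work */
        count++;
    }

    fclose(in);
    if (fclose(out) != 0)
        return -1;
    return count;
}
```

A filter that edits lines would pass its own callback in place of copy_line, just as the Until example passes the address of process.line.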
