AI / Internet Technology University (AITU)
Current Topic: 1.5.3. Parsing text with patterns
There are complex cases in which a sequence of patterns leads us to the value.
For example, we may be looking for a value that is located after the table tag but before another pattern. This is a simple example with just two patterns, but the sequence could be longer, such as: after pattern1, then after pattern2, and before pattern3.
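
As a minimal sketch of what "after one pattern, before another" means, the two steps can be done by hand with indexOf() and substring(). The class name AfterBeforeSketch, the method between(), and the sample HTML are assumptions made for illustration; they are not part of the Parser utilities.

```java
public class AfterBeforeSketch {
    /**
     * Returns the part of the text located after the first occurrence of
     * afterPattern and before the next occurrence of beforePattern,
     * or an empty string if either pattern is missing.
     */
    public static String between(String text, String afterPattern, String beforePattern) {
        int start = text.indexOf(afterPattern);
        if (start < 0) {
            return "";
        }
        // step 1: keep only what follows the first pattern
        String rest = text.substring(start + afterPattern.length());
        int end = rest.indexOf(beforePattern);
        if (end < 0) {
            return "";
        }
        // step 2: keep only what precedes the second pattern
        return rest.substring(0, end);
    }

    public static void main(String[] args) {
        String html = "<div><table>Hello</table></div>";
        System.out.println(between(html, "<table>", "</table>")); // prints Hello
    }
}
```

Each extra pattern in a longer sequence would add one more indexOf() and substring() pair, which is why a reusable utility becomes attractive.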

Note that we deal with keys, such as after, before, etc., and with pattern values. Each key is followed by its pattern, so they always come in pairs.
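
One way to picture the pairs is to split a comma-separated string and group the pieces two by two. The sketch below is only an illustration: the class KeyPatternPairs and the method toPairs() are hypothetical names, and trimming the keys is an assumption that makes a string written with spaces after the commas easier to handle.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyPatternPairs {
    /**
     * Splits "key,pattern,key,pattern,..." into (key, pattern) pairs.
     * Keys are trimmed; patterns are kept exactly as written.
     */
    public static List<String[]> toPairs(String patternsInString) {
        String[] parts = patternsInString.split(",");
        List<String[]> pairs = new ArrayList<>();
        // i + 1 < parts.length skips a trailing key that has no pattern
        for (int i = 0; i + 1 < parts.length; i += 2) {
            pairs.add(new String[] { parts[i].trim(), parts[i + 1] });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints three pairs: after / <div>, after last / <table>, before / </table>
        for (String[] pair : toPairs("after,<div>, after last,<table>, before,</table>")) {
            System.out.println("key=[" + pair[0] + "] pattern=[" + pair[1] + "]");
        }
    }
}
```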

It might be convenient to locate our value with a single statement, such as:

String value = Parser.parseWithPatterns(text, "after,pattern1,after last,pattern2,before ignorecase,pattern3");

This can be done with the Parser utilities provided below. The utilities come in several versions with slightly different sets of parameters. Remember that this is called Method Overloading; it is done for convenience.
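
As a reminder of how overloading looks in practice, here is a small sketch: two methods share one name, and the shorter version simply calls the longer one with a default value. The class OverloadSketch and its after() methods are hypothetical and serve only as an illustration; they are not the Parser source.

```java
public class OverloadSketch {
    /** Short version: case-sensitive by default. */
    public static String after(String text, String pattern) {
        return after(text, pattern, false);
    }

    /** Full version: the caller decides whether to ignore case. */
    public static String after(String text, String pattern, boolean ignoreCase) {
        String searchText = ignoreCase ? text.toLowerCase() : text;
        String searchPattern = ignoreCase ? pattern.toLowerCase() : pattern;
        int index = searchText.indexOf(searchPattern);
        // cut the original text, so the result keeps its original case
        return index < 0 ? "" : text.substring(index + pattern.length());
    }

    public static void main(String[] args) {
        System.out.println(OverloadSketch.after("<B>Bold", "<b>"));       // prints an empty line
        System.out.println(OverloadSketch.after("<B>Bold", "<b>", true)); // prints Bold
    }
}
```

The compiler picks the version by the number and types of the arguments, so the caller only supplies what differs from the default.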

Here is the source.


public class Parser {
    /**
     * Searches the text according to the patterns, with case-sensitive matching.
     * @param text the text to search
     * @param patternsInString comma-separated pairs of keys and patterns,
     *        such as "after,div,after,table,before,/table"
     * @return the located value, or an empty string if a pattern is not found
     */
    public static String parseWithPatterns(String text, String patternsInString) {
        return parseWithPatterns(text, patternsInString, false); // case-sensitive by default
    }
    /**
     * Searches the text according to the patterns.
     * @param text the text to search
     * @param patternsInString pairs of keys and patterns, separated by commas,
     *        such as "after,div,after,table,before,/table":
     *        three patterns with three keys (after, after, before) in one string
     * @param ignoreCase if true, every pattern is matched ignoring case
     * @return the located value, or an empty string if a pattern is not found
     */
    public static String parseWithPatterns(String text, String patternsInString, boolean ignoreCase) {
        if(patternsInString == null || text == null) {
            return text;
        }
        // a pattern that itself contains a comma cannot be passed this way; use the array version
        String[] patterns = patternsInString.split(",");
        return parseWithPatterns(text, patterns, ignoreCase);
    }
    /**
     * This version of the parseWithPatterns() method takes an array of patterns.
     * In the array the first value is a key, such as "before" or "after" or "before last" or "after ignorecase" etc.,
     * and the next value is the pattern to look for.
     * @param text the text to search
     * @param patterns in pairs (key, pattern)
     * @return the located value, or an empty string if a pattern is not found
     */
    public static String parseWithPatterns(String text, String[] patterns) {
        return parseWithPatterns(text, patterns, false); // case-sensitive by default
    }
    /**
     * The main version: applies the (key, pattern) pairs to the text one by one.
     * Each step cuts the text, and the next step works on what is left.
     * Keys: "after", "after last", "before", "before last", "find", "backFromFound";
     * adding "ignorecase" to a key, as in "before ignorecase", ignores case for that pattern only.
     * @param originText the text to search
     * @param patterns in pairs (key, pattern)
     * @param ignoreCase if true, every pattern is matched ignoring case
     * @return the located value, or an empty string if a pattern is not found
     */
    public static String parseWithPatterns(String originText, String[] patterns, boolean ignoreCase) {
        if(patterns == null || originText == null) {
            return originText;
        }
        int foundIndex = -1;
        // i + 1 < length: a trailing key without a pattern is skipped instead of failing
        for(int i=0; i + 1 < patterns.length; i+= 2) {
            // trim the key, so that " after last" taken from "..., after last,..." is still recognized
            String key = patterns[i].trim();
            String pattern = patterns[i+1];
            // search in a copy; originText keeps the original case for the result
            String text = originText;
            if(ignoreCase || key.toLowerCase().indexOf("ignorecase") > 0) {
                // make lower case only for this particular pattern
                pattern = pattern.toLowerCase();
                text = text.toLowerCase();
            }
            // check for the pattern looking forward
            int indexOfPattern = text.indexOf(pattern);
            // check for the pattern looking back (reverse)
            int rindexOfPattern = text.lastIndexOf(pattern);
            // the key can be "after last" or "afterlast" producing same results
            if(key.equalsIgnoreCase("after last") || key.equalsIgnoreCase("afterLast")) {
                // for the last pattern, look reverse (from the end of the text)
                if(rindexOfPattern >= 0) {
                    originText = originText.substring(rindexOfPattern + pattern.length());
                } else {
                    return "";
                }
            } else if(key.startsWith("after")) {
                // looking forward and using the index found while looking forward
                if(indexOfPattern >= 0) {
                    originText = originText.substring(indexOfPattern + pattern.length());
                } else {
                    return "";
                }
            } else if(key.equalsIgnoreCase("before last") || key.equalsIgnoreCase("beforeLast")) {
                // keep everything before the last occurrence of the pattern
                if(rindexOfPattern >= 0) {
                    originText = originText.substring(0, rindexOfPattern);
                } else {
                    return "";
                }
            } else if(key.startsWith("before")) {
                // keep everything before the first occurrence of the pattern
                if(indexOfPattern >= 0) {
                    originText = originText.substring(0, indexOfPattern);
                } else {
                    return "";
                }
            } else if(key.equalsIgnoreCase("find")) {
                // remember where the pattern is, without cutting the text
                if(indexOfPattern >= 0) {
                    foundIndex = indexOfPattern;
                } else {
                    return "";
                }
            } else if(key.equalsIgnoreCase("backFromFound")) {
                // look back from the remembered position and cut after the nearest pattern
                indexOfPattern = text.lastIndexOf(pattern, foundIndex);
                if(indexOfPattern >= 0) {
                    originText = originText.substring(indexOfPattern + pattern.length());
                } else {
                    return "";
                }
            }
        }
        return originText;
    }
}
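
The keys find and backFromFound are the least obvious ones. Reading the source, they appear to work as a pair: find remembers where a pattern is located, and backFromFound then searches backward from that position with lastIndexOf(pattern, fromIndex). The sketch below shows that reading with plain String methods; the class BackFromFoundSketch, the method afterNearestBefore(), and the sample text are assumptions made for illustration.

```java
public class BackFromFoundSketch {
    /**
     * Finds the anchor, then looks back from it for the nearest pattern
     * and returns the text that follows that pattern.
     */
    public static String afterNearestBefore(String text, String anchor, String pattern) {
        // the "find" step: remember the position, do not cut the text
        int foundIndex = text.indexOf(anchor);
        if (foundIndex < 0) {
            return "";
        }
        // the "backFromFound" step: search backward, starting at foundIndex
        int index = text.lastIndexOf(pattern, foundIndex);
        if (index < 0) {
            return "";
        }
        return text.substring(index + pattern.length());
    }

    public static void main(String[] args) {
        String row = "<td>A</td><td>B price</td>";
        // the nearest "<td>" before "price" is the second one
        System.out.println(afterNearestBefore(row, "price", "<td>")); // prints B price</td>
    }
}
```

This is handy when the opening tag repeats many times and only the one closest to a known word matters.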

Note that all utility methods are static, so they can be conveniently called with just the name of the class.

Assignments:
1. Create a new class Parser in the same package and type in the source. Make sure that Eclipse shows no red error markers.
2. Add more comments to the biggest method.
3. Add the main() method with several testing lines.
4. Run As – Java Application.

Internet Technology University | JavaSchool.com | Copyrights © Since 1997 | All Rights Reserved