
Java Trim Strings

When a computer user inputs text using the terminal or the text input field of a window, the useful phrase may be preceded and/or followed by white space. White space, written as whitespace for short, is one or more of the following characters occurring consecutively:

' ' or '\u0020': space, typed by pressing the spacebar key
'\n': line feed
'\r': carriage return
'\f': form feed
'\t': horizontal tab

In a large text, certain required phrases may have leading and/or trailing whitespace. When such a phrase is extracted, it may come with that whitespace. No matter how a useful phrase is obtained, the leading or trailing whitespace has to be removed for the phrase to be properly used.

In the java.lang package, there is the String class. This class has a method called trim(). The trim() method returns a copy of the string with the leading and trailing whitespace removed; the original string is not changed. More precisely, trim() removes every leading and trailing character whose code is less than or equal to '\u0020' (the space character), which includes all the whitespace characters listed above.

In Java, the String class does not have to be imported; the java.lang package is imported automatically. In code, the class name has to be typed as “String”, with the first letter in uppercase.

Demonstrating the Effect of Whitespace

Leading and/or trailing whitespace can be a nuisance. Compile and run the following program:

    class TheClass {

        public static void main(String[] args) {  
            String str =  " \t\n\r\fuseful part \t\n\r\f";
            System.out.println(str);
        }  
    }

On the author’s computer, the “useful part” was printed with blank lines above and below it.
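Because whitespace is invisible when printed, it can help to display it in escaped form while debugging. The following is a minimal sketch of that idea; the class name ShowWhitespace and the helper visualize() are hypothetical names made up for this illustration and are not part of the Java API. The helper wraps the string in square brackets and replaces the four non-space whitespace characters with their escape sequences:

```java
class ShowWhitespace {

    // Hypothetical helper: returns the string wrapped in square brackets,
    // with tab, line feed, carriage return and form feed shown as escapes.
    // Ordinary spaces are left as they are; the brackets mark the ends.
    static String visualize(String s) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\f': sb.append("\\f"); break;
                default:   sb.append(c);
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        String str = " \t\n\r\fuseful part \t\n\r\f";
        // prints [ \t\n\r\fuseful part \t\n\r\f]
        System.out.println(visualize(str));
    }
}
```

With this output, the leading and trailing whitespace can be seen on a single line, regardless of how the terminal renders the actual characters.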

Using the trim() Method

The trim() method is simple to use. The syntax is:

public String trim()

The following program illustrates its use:

    class TheClass {
        public static void main(String[] args) {  
            String str =  " \t\n\r\fuseful part \t\n\r\f";
            String ret = str.trim();
            System.out.println(ret);
        }  
    }

The output is:

useful part

without any leading or trailing whitespace.
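Two details of trim() are easy to overlook: it returns a new string and leaves the original untouched, and it only removes characters whose code is at most U+0020. The sketch below illustrates both points; the class name TrimDetails and the helper report() are hypothetical names used only for this example:

```java
class TrimDetails {

    // Hypothetical helper for this sketch: reports the length of a string
    // before and after trim(), followed by the trimmed text in brackets.
    static String report(String s) {
        String t = s.trim();
        return s.length() + " -> " + t.length() + ": [" + t + "]";
    }

    public static void main(String[] args) {
        String str = " \t\n\r\fuseful part \t\n\r\f";
        String ret = str.trim();

        System.out.println(report(str));   // prints 21 -> 11: [useful part]
        System.out.println(ret);           // prints useful part

        // str still refers to the original, untrimmed string.
        System.out.println(str.length());  // prints 21

        // U+00A0 (non-breaking space) has a code above U+0020, so it stays.
        System.out.println("\u00A0x".trim().length());  // prints 2
    }
}
```

The returned string therefore has to be assigned to a variable (or used directly); calling str.trim() on its own line and then printing str would still show the whitespace.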

Handling Input from Console

Java has several ways of getting input from the keyboard into the program. One of the ways uses the Scanner class. The Scanner class is in the java.util package. The following program shows how to get a line from the keyboard into the program:

    import java.util.Scanner;

    public class TheClass {
        public static void main(String[] args) {
            Scanner obj = new Scanner(System.in);
            System.out.println("Type a phrase and press Enter:");

            String phrase = obj.nextLine();
            System.out.println(phrase);

            obj.close();
        }
    }

The first line of the program imports the Scanner class. After that, there is the main class definition (implementation). The main class has the main method. In the main() method, the first statement instantiates the scanner object. The next statement prints text to the console, asking the user to type in a phrase. At this point, the program waits for the user to type in a phrase.

The next statement reads the input line into the variable phrase. The following statement in the main() method re-displays this phrase as it was typed, with any leading or trailing whitespace. The last statement in the main() method closes the scanner object.

Leading or trailing spaces are usually not wanted from the keyboard input. It is simple to remove them by using the trim() method of the String object. The following program illustrates this:

    import java.util.Scanner;

    public class TheClass {
        public static void main(String[] args) {
            Scanner obj = new Scanner(System.in);
            System.out.println("Type a phrase and press Enter:");
            String phrase = obj.nextLine();

            String str = phrase.trim();

            System.out.println(str);
            obj.close();
        }
    }

When this program was run, leading and trailing spaces typed with the spacebar key were removed.
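In practice, the user may type only spaces, or nothing at all, before pressing Enter. The following sketch wraps the read-and-trim step in a small helper so that such cases can be detected. The class name TrimmedInput and the method readTrimmedLine() are hypothetical names for this illustration; hasNextLine(), nextLine(), trim() and isEmpty() are standard methods of Scanner and String:

```java
import java.util.Scanner;

class TrimmedInput {

    // Hypothetical helper: reads one line and returns it trimmed,
    // or returns null when there is no more input to read.
    static String readTrimmedLine(Scanner in) {
        if (!in.hasNextLine()) {
            return null;
        }
        return in.nextLine().trim();
    }

    public static void main(String[] args) {
        Scanner obj = new Scanner(System.in);
        System.out.println("Type a phrase and press Enter:");

        String phrase = readTrimmedLine(obj);
        if (phrase == null || phrase.isEmpty()) {
            // Nothing useful remains after trimming.
            System.out.println("No phrase was typed.");
        } else {
            System.out.println(phrase);
        }

        obj.close();
    }
}
```

A line consisting only of spaces and tabs becomes the empty string after trim(), which is why the isEmpty() check is done on the trimmed value.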

Trimming with Regular Expression

With Java regular expressions, the trim() method does not have to be used. A regular expression is a construct whose main component is a pattern. A pattern is a string that may contain metacharacters. A pattern identifies a sub-string with particular characters in a target string. Such a sub-string identified in the target string can be replaced.

The target string may be the input string read from the keyboard. The sub-string to be identified in this topic is the leading and/or trailing whitespace. Remember, this whitespace consists of one or more of the different whitespace characters mentioned above. When this whitespace is found at the beginning or end of the target string, it is replaced with nothing.

The pattern for this whitespace is [\u0020\t\n\r\f]+, where + means one or more of the characters in the square brackets. Inside square brackets, the characters are simply listed one after the other; the | character is not needed there and would be treated as a literal character to match. Java also has the predefined class \s, which means [ \t\n\x0B\f\r], that is, the five characters above plus the vertical tab; in a Java string literal, it is typed as "\\s". The regular expression to match the leading or trailing whitespace is:

"^[ \t\n\r\f]+|[ \t\n\r\f]+$"

Here, ^ anchors the first alternative to the beginning of the target string, and $ anchors the second alternative to the end.

The String class has the replaceAll() method that can be used to remove the leading and trailing whitespace from the target string. In the following program, rawStr is the string with whitespace. There are two words, “one” and “two”, in this string. There is whitespace in front of “one”, after “two”, and in between “one” and “two”. The program removes the leading and trailing whitespace and not the whitespace between “one” and “two”. refinedStr is the string variable without the leading and trailing whitespace. The first argument to replaceAll() is the regular expression. The second argument is the replacement, which in this case is an empty string (without even the space character). The program is:

    public class TheClass {
        public static void main(String[] args) {
            String rawStr = " \t\n\r\fone \t\n\r\ftwo \t\n\r\f";

            // Remove whitespace at the start (^) or at the end ($) only.
            String refinedStr = rawStr.replaceAll("^[ \t\n\r\f]+|[ \t\n\r\f]+$", "");

            System.out.println(refinedStr);
        }
    }

In the output, there is a blank line between “one” and “two”, because the whitespace between the two words is not removed.
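One thing replaceAll() can do that trim() cannot is remove whitespace from only one end of the string, by using only one of the two alternatives of the regular expression. The following is a minimal sketch of this; the class name RegexTrim and the three method names are hypothetical names for this illustration. Note that the characters inside the square brackets are listed without any separator:

```java
class RegexTrim {

    // Removes whitespace only from the beginning of the string.
    static String trimLeading(String s) {
        return s.replaceAll("^[ \t\n\r\f]+", "");
    }

    // Removes whitespace only from the end of the string.
    static String trimTrailing(String s) {
        return s.replaceAll("[ \t\n\r\f]+$", "");
    }

    // Removes whitespace from both ends, like trim() does for these characters.
    static String trimBoth(String s) {
        return s.replaceAll("^[ \t\n\r\f]+|[ \t\n\r\f]+$", "");
    }

    public static void main(String[] args) {
        String rawStr = " \t\n\r\fone \t\n\r\ftwo \t\n\r\f";

        System.out.println(trimLeading(rawStr).startsWith("one"));    // prints true
        System.out.println(trimTrailing(rawStr).endsWith("two"));     // prints true
        System.out.println(trimBoth(rawStr).equals(rawStr.trim()));   // prints true
    }
}
```

In each case, the whitespace between “one” and “two” is left in place, because it is neither at the beginning nor at the end of the target string.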

Conclusion

Trimming a string means removing the leading and trailing whitespace. Whitespace here consists of one or more of ' ', \t, \n, \r, or \f. Any combination of these characters within the text is not removed. The best way to remove leading and trailing whitespace from a string is to use the trim() method of the String class. The String class also has the replaceAll() method, which can be used for trimming. However, the replaceAll() method needs knowledge and experience in using regular expression techniques, as illustrated above. The first argument to replaceAll() is the regular expression; the second argument is the replacement text, which in this case has to be the empty string, "".

About the author

Chrysanthus Forcha

Discoverer of mathematics Integration from First Principles and related series. Master’s Degree in Technical Education, specializing in Electronics and Computer Software. BSc Electronics. I also have knowledge and experience at the Master’s level in Computing and Telecommunications. Out of 20,000 writers, I was the 37th best writer at devarticles.com. I have been working in these fields for more than 10 years.