
Thread: Java: Word Counter

  1. #1
    Join Date
    Mar 2008
    Posts
    7,145
    Rep Power
    86

    Java: Word Counter

    This program is a solution to this problem: http://uva.onlinejudge.org/external/4/494.pdf. The basic idea is: given a sentence, find out how many words it contains.

    Lots of examples that I have found online say to count the number of spaces, but that isn't correct.

    Consider this input:

    Code:
    Hi ... test
    Counting the spaces would say that there are 3 words. This is wrong; there are only 2. The ... doesn't count. Words can only consist of letters (upper and lower case), dashes and apostrophes.
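
    To make the failure concrete, here is a minimal sketch of the naive "count the spaces" approach. The class and method names (SpaceCountDemo, countBySpaces) are made up for illustration and are not part of the solution in this thread.

```java
class SpaceCountDemo {
    // Naive approach: the number of space characters plus one.
    static int countBySpaces(String line) {
        int spaces = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == ' ') {
                spaces++;
            }
        }
        return spaces + 1;
    }

    public static void main(String[] args) {
        // Two spaces in the input, so this prints 3,
        // even though the line only holds 2 words.
        System.out.println(countBySpaces("Hi ... test"));
    }
}
```

    The result is off by one because the "..." sits between two spaces but is not a word.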

    The first thing my algorithm does is replace every character that cannot be part of a word (numbers included) with a space. It then trims the leading and trailing white space and collapses each run of white space into a single space, though this last step doesn't really matter very much.
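
    The clean-up step on its own can be sketched like this. It uses the same two replaceAll calls as the full program; the wrapper class and method (NormalizeDemo, normalize) are hypothetical names added only for this example.

```java
class NormalizeDemo {
    // Replace every character that cannot be part of a word with a space,
    // trim the ends, then collapse runs of white space into one space.
    static String normalize(String line) {
        String cleaned = line.replaceAll("[^A-Za-z'-]", " ").trim();
        return cleaned.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Hi ... test 42 ")); // prints "Hi test"
        System.out.println(normalize("don't stop-gap!"));   // prints "don't stop-gap"
    }
}
```

    Note that a token such as '' survives this step, so the clean-up alone is not enough to make space counting correct.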

    Now what I do is move along the string until I find a character that is not part of a word. A space indicates the termination of a word. At this point, we count the word. Then we move along until we either find the end of the string or a letter starting the next word.

    Notice, though, that counting spaces can't work. Why? Consider this:

    Hi '' test.
    Counting the spaces will give us 3 words. This is wrong because '' isn't a valid word.

    Code:
    import java.util.*;
    import java.io.*;
    public class wordCount {
        public static void main(String[] args) throws IOException {
            Scanner fin = new Scanner(new FileReader("wordCount.txt"));
            //Scanner fin = new Scanner(System.in);
            String sLine = "";
            int i = 0;
            int words = 0; // number of words
    
            while (fin.hasNextLine()) {
                words = 0;
                i = 0;
                sLine = fin.nextLine();
                // words can only contain letters, single quotes, and dashes
                // so remove everything else
                sLine = sLine.replaceAll("[^A-Za-z\'-]", " ").trim();
                // collapse each run of white space into a single space
                sLine = sLine.replaceAll("\\s{1,}", " ");
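                while (i < sLine.length()) {
                    // keep increasing i until you find the end of the word
                    // or the end of the string, noting whether a letter was seen
                    boolean hasLetter = false;
                    while (i < sLine.length() && (Character.isLetter(sLine.charAt(i))
                            || sLine.charAt(i) == '-' || sLine.charAt(i) == '\'')) {
                        if (Character.isLetter(sLine.charAt(i))) {
                            hasLetter = true;
                        }
                        i++;
                    }
                    // count the word, but skip runs such as '' or - that
                    // contain no letter (e.g. at the start of the line)
                    if (hasLetter) {
                        words++;
                    }
                    // keep increasing i until you find the next word
                    while (i < sLine.length() && !Character.isLetter(sLine.charAt(i))) {
                        i++;
                    }
                }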
                System.out.println(words);
            }
            fin.close();
        }
    }

  3. #2
    Sinipull's Avatar
    Sinipull is offline Programming Expert
    Join Date
    Jun 2009
    Location
    Tallinn, Estonia, Estonia
    Posts
    382
    Rep Power
    13

    Re: Java: Word Counter

    You should write it as a function, so it can be easily integrated into a program. Anyway, I tried it too; I hope you don't mind if I add it here.

    Code:
        public static int count(String s){    	    	
        	for(int i = 0; i < s.length(); i++){
        		if(!(checkChar(s.substring(i, i+1)) || checkApostrophesDashes(s.substring(i, i+1)))){    			
        			s = s.replace(s.substring(i, i+1), " ");
        		}    		
        	}
        	String str[] = s.split(" ");    	
        	int count = 0;
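        	for(int i = 0; i < str.length; i++){
        		// count a token only if it contains at least one letter,
        		// so "", "-", "'" and "''" are all skipped
        		boolean hasLetter = false;
        		for(int j = 0; j < str[i].length(); j++){
        			if(checkChar(str[i].substring(j, j+1))) hasLetter = true;
        		}
        		if(hasLetter) count++;
        	}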
        	return count;
        }
        
        public static boolean checkChar(String c){ // expecting only 1 character
        	if(!c.toUpperCase().equals(c) || !c.toLowerCase().equals(c)) return true; // it's a char
        	else return false; // it's not a char
        }
        
        public static boolean checkApostrophesDashes(String c){ // expecting only 1 character
        	if(c.equals("'") || c.equals("-")) return true; // it's dash or apostrophe
        	else return false;
        }
    Last edited by Sinipull; 09-02-2009 at 09:25 AM. Reason: Fixed a bug, that counted single "-" or "'" as a word.
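
    For comparison, the same rule (a word is a run of letters, dashes and apostrophes that contains at least one letter) can also be written with java.util.regex. This is only a sketch of an alternative, not the method either poster used, and the class name RegexWordCount is made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWordCount {
    // A run of word characters that contains at least one letter.
    private static final Pattern WORD =
            Pattern.compile("[A-Za-z'-]*[A-Za-z][A-Za-z'-]*");

    static int count(String line) {
        Matcher m = WORD.matcher(line);
        int words = 0;
        while (m.find()) {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(count("Hi ... test")); // prints 2
        System.out.println(count("Hi '' test"));  // prints 2
    }
}
```

    Because the pattern insists on a letter somewhere in the run, tokens such as '' or -- never match, so no separate clean-up pass is needed.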

  4. #3
    Join Date
    Mar 2008
    Posts
    7,145
    Rep Power
    86

    Re: Java: Word Counter

    That is true, I should have written it as a function. Nope, I don't really mind. The more approaches the better. Your code is much more concise than mine is.

    The reason I didn't use methods is that, in a contest situation, you want to get it done as quickly as you possibly can. So I didn't bother with any method calls.

    Very nice code though.
