Thread: Java: Word Counter

  1. #1
    chili5 (Code Slinger)

    Java: Word Counter

    This program is a solution to this problem: http://uva.onlinejudge.org/external/4/494.pdf. The basic idea: given a sentence, find out how many words it contains.

    Many examples I have found online say to count the number of spaces, but this isn't correct.

    Consider this input:

    Code:
    Hi ... test
    Counting the spaces would say that there are 3 words. This is wrong; there are only 2, because the ... doesn't count. Words can only consist of letters (upper and lower case), dashes and apostrophes.
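To see the miscount concretely, here is a minimal sketch of the space-counting approach. The class name `SpaceCountDemo` and the method `countBySpaces` are made up for illustration; they are not part of the solution.

```java
class SpaceCountDemo {
    // Naive approach: assume every space separates two words,
    // so the word count is the number of spaces plus one.
    static int countBySpaces(String line) {
        int spaces = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == ' ') {
                spaces++;
            }
        }
        return spaces + 1;
    }

    public static void main(String[] args) {
        // Two spaces, so this prints 3 even though there are only 2 words.
        System.out.println(countBySpaces("Hi ... test"));
    }
}
```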

    The first thing my algorithm does is replace all numbers and other characters that cannot be part of a word with spaces. It then removes leading and trailing white space and collapses every run of white space into a single space, though this doesn't really matter very much.
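The clean-up step can be tried on its own. This is a small sketch using the same kind of `replaceAll` and `trim` calls as the solution; the class name `CleanupDemo` and the method `clean` are hypothetical names used only here.

```java
class CleanupDemo {
    // Replace every character that cannot be part of a word with a space,
    // drop leading and trailing white space, then collapse runs of white space.
    static String clean(String line) {
        return line.replaceAll("[^A-Za-z'-]", " ")
                   .trim()
                   .replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(clean("  Hi ... test 42!  ")); // prints: Hi test
        System.out.println(clean("Hi '' test."));         // prints: Hi '' test
    }
}
```

Note that the '' in the second line survives the clean-up, because apostrophes are allowed characters. That is why the counting step has to look for letters rather than just count the remaining spaces.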

    Next, I move along the string until I find a character that is not part of a word. After the clean-up, that is a space, which marks the end of a word. At this point I count the word. Then I move along until I reach either the end of the string or a letter starting the next word.

    Notice though that counting spaces can't work. Why? Consider this:

    Code:
    Hi '' test.
    Counting the spaces will give us 3 words. This is wrong because '' isn't a valid word, so there are only 2.

    Code:
    import java.util.*;
    import java.io.*;
    public class WordCount {
        public static void main(String[] args) throws IOException {
            Scanner fin = new Scanner(new FileReader("wordCount.txt"));
            //Scanner fin = new Scanner(System.in);
            String sLine = "";
            int i = 0;
            int words = 0; // number of words

            while (fin.hasNextLine()) {
                words = 0;
                i = 0;
                sLine = fin.nextLine();
                // words can only contain letters, single quotes, and dashes
                // so replace everything else with a space
                sLine = sLine.replaceAll("[^A-Za-z'-]", " ").trim();
                // collapse each run of white space into a single space
                sLine = sLine.replaceAll("\\s+", " ");
                // skip anything before the first letter, so that a leading
                // ' or - is not counted as a word
                while (i < sLine.length() && !Character.isLetter(sLine.charAt(i))) {
                    i++;
                }
                while (i < sLine.length()) {
                    // keep increasing i until you find the end of the word
                    // or the end of the string
                    while (i < sLine.length() && (Character.isLetter(sLine.charAt(i))
                            || sLine.charAt(i) == '-' || sLine.charAt(i) == '\'')) {
                        i++;
                    }
                    words++; // count the word
                    // keep increasing i until you find the next word
                    while (i < sLine.length() && !Character.isLetter(sLine.charAt(i))) {
                        i++;
                    }
                }
                System.out.println(words);
            }
            fin.close();
        }
    }

  2. #2
    Sinipull (Programmer)

    Re: Java: Word Counter

    You should write it as a function so it can be easily integrated into a program. Anyway, I tried it too; I hope you don't mind if I add it here.

    Code:
        public static int count(String s) {
            // replace every character that cannot be part of a word with a space
            for (int i = 0; i < s.length(); i++) {
                String c = s.substring(i, i + 1);
                if (!(checkChar(c) || checkApostrophesDashes(c))) {
                    s = s.replace(c, " "); // same length, so i stays valid
                }
            }
            int count = 0;
            for (String token : s.split(" ")) {
                // a token is a word only if it contains at least one letter,
                // so "", "-", "'" and "''" are all skipped
                if (containsLetter(token)) count++;
            }
            return count;
        }

        private static boolean containsLetter(String token) {
            for (int i = 0; i < token.length(); i++) {
                if (checkChar(token.substring(i, i + 1))) return true;
            }
            return false;
        }

        public static boolean checkChar(String c) { // expecting only 1 character
            // a letter differs from its upper case or its lower case form
            return !c.toUpperCase().equals(c) || !c.toLowerCase().equals(c);
        }

        public static boolean checkApostrophesDashes(String c) { // expecting only 1 character
            return c.equals("'") || c.equals("-"); // it's dash or apostrophe
        }
    Last edited by Sinipull; 09-02-2009 at 11:25 AM. Reason: Fixed a bug, that counted single "-" or "'" as a word.

  3. #3
    chili5 (Code Slinger)

    Re: Java: Word Counter

    That is true, I should have written it as a function. Nope, I don't really mind. The more approaches the better. Your code is much more concise than mine is.

    The reason I didn't use methods is that in a contest situation you want to get it done as quickly as you possibly can, so I didn't bother with any method calls.

    Very nice code though.
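One more approach, as a rough sketch only: a regular expression with `java.util.regex`. It assumes the definition of a word used in this thread, read as "starts with a letter and may continue with letters, dashes and apostrophes". The class name `RegexWordCount` and the method `count` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWordCount {
    // A word starts with a letter and continues with letters, ' or -.
    private static final Pattern WORD = Pattern.compile("[A-Za-z][A-Za-z'-]*");

    // Count the non-overlapping matches of WORD in the line.
    static int count(String line) {
        Matcher m = WORD.matcher(line);
        int words = 0;
        while (m.find()) {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(count("Hi ... test")); // prints 2
        System.out.println(count("Hi '' test.")); // prints 2
    }
}
```

Because every match has to begin with a letter, a stray '' or - never matches by itself, so no separate clean-up pass is needed.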
