Thread: New to Posting: Help with word comparison in C

  1. #1
    jaano31 is offline Newbie
    Join Date
    Oct 2009
    Posts
    2
    Rep Power
    0

    New to Posting: Help with word comparison in C

    So I'm trying to write a program that reads in a text file and creates a histogram of all the words present in that text (how many times each word appears). I'm basically supposed to use a two-dimensional array (where the first element contains the word and the second element contains the number of occurrences of that word). The output should look like this:

    this: * 10%
    is: ***** 50%
    word: ** 20%
    hello: *** 30%

    I'm not sure how to even begin, as I don't know how to separate words from a text file (search for white space?). I also don't have enough experience with the character functions to know which ones would help analyze the words. I would really appreciate any advice on this. Thanks so much!

  3. #2
    veda87's Avatar
    veda87 is offline Programmer
    Join Date
    Aug 2009
    Posts
    126
    Rep Power
    0

    Re: New to Posting: Help with word comparison in C

    Let me give you my way of solving this problem...

    Since the number of words in the file is not known in advance, a fixed-size (static) array won't work, so a dynamic structure is needed. I would go for a linked list. For example,

    Code:
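    struct array {
        char *text;           /* the word itself */
        int occurance;        /* how many times the word has been seen */
        struct array *next;   /* next entry in the list, NULL at the end */
    };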
    Words are separated by white space, full stops, commas, ends of lines and the end of the file. Store each word in a temporary variable (say char *tempWord) as you read it.

    As soon as a word is found, do the following steps:
    1. Find the next word.
    2. Check whether the list is empty.
    2.a. If yes, create an entry, add the word to the entry, set occurance to 1 and make next point to NULL.
    Now search for the word in the rest of the file using strstr (strstr(haystack, needle) returns a pointer to the first occurrence of the string needle inside the string haystack, or a null pointer if it is not present; it works on strings in memory, so read the text into a buffer first, and it also matches inside longer words, so check the characters on either side of each match).
    Each time the word is found, increment occurance and move the search position past the match.
    Do this until EOF is reached, then go back to step 1.
    2.b. If no, search for the word in the list.
    2.b.(i). If it is present in the list, it has already been counted, so go to step 1.
    2.b.(ii). If it is not present, follow step 2.a and add it to the list.


    Follow these steps until every word in the file has been processed...
    ----------------------

    this is an idea.. you have to use variables accordingly..
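
    A minimal sketch of the list part of this idea, assuming the struct from the post above. Note that it swaps in a simpler technique: instead of rescanning the rest of the file with strstr for every new word, it looks each word up in the list as it arrives and either bumps the count or adds an entry. The helper name add_word is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as the struct in the post; field names kept as posted. */
struct array {
    char *text;
    int occurance;
    struct array *next;
};

/* Hypothetical helper: count one word.
   If the word is already in the list its count is incremented,
   otherwise a new entry is pushed on the front of the list.
   Returns the head of the list (unchanged if allocation fails,
   in which case the word is simply not recorded). */
struct array *add_word(struct array *head, const char *word)
{
    struct array *node;

    for (node = head; node != NULL; node = node->next) {
        if (strcmp(node->text, word) == 0) {
            node->occurance++;
            return head;
        }
    }

    node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->text = malloc(strlen(word) + 1);
    if (node->text == NULL) {
        free(node);
        return head;
    }
    strcpy(node->text, word);
    node->occurance = 1;
    node->next = head;
    return node;
}
```

    Start with a NULL head and call head = add_word(head, tempWord); for every word read. A real program should also free each text and each node when it is done.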

  4. #3
    manux's Avatar
    manux is offline Programming Professional
    Join Date
    Oct 2008
    Posts
    234
    Blog Entries
    1
    Rep Power
    14

    Re: New to Posting: Help with word comparison in C

    I agree with linked lists, but in case you don't understand the concept, you could instead dynamically allocate a new "page" each time you find x new words.
    What I mean is that you start with an empty char* array of 32 entries, which you extend when you find the 33rd word (by another 32 char*s).
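
    One way the "page" idea could look, as a rough sketch. The struct and function names are made up for illustration, and unlike the description above the array starts with no storage at all and gets its first 32 slots on the first push (realloc on a NULL pointer behaves like malloc).

```c
#include <stdlib.h>

#define PAGE 32   /* grow by 32 slots at a time */

/* Hypothetical container for the word pointers. */
struct word_list {
    char **items;      /* the array of char* */
    size_t used;       /* slots currently in use */
    size_t capacity;   /* slots currently allocated */
};

/* Append one pointer, adding a page of slots when the array is full.
   Returns 0 on success, -1 if memory could not be allocated
   (the existing array is left untouched in that case). */
int word_list_push(struct word_list *list, char *word)
{
    if (list->used == list->capacity) {
        size_t new_cap = list->capacity + PAGE;
        char **grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->items = grown;
        list->capacity = new_cap;
    }
    list->items[list->used++] = word;
    return 0;
}
```

    Initialise with struct word_list list = {NULL, 0, 0}; and call free(list.items) at the end. The list stores only pointers, so the words themselves need their own storage.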

    As for finding words, you simply need to keep two pointers around:
    -1st, the pointer to the end of the previous word
    -2nd, the pointer to the end of the present word.
    You increment the 2nd one until you find a word separator (space, dot, comma).
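
    The two-pointer scan might be sketched like this. The set of separators (space, tab, newline, dot, comma) and the names is_separator and next_word are assumptions for the example; out_size is assumed to be at least 1.

```c
#include <stddef.h>
#include <string.h>

/* Assumed separator set: space, tab, newline, dot, comma. */
static int is_separator(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '.' || c == ',';
}

/* Copy the next word starting at *cursor into out and advance
   *cursor to the end of that word. Words longer than out_size - 1
   characters are truncated. Returns 1 if a word was found,
   0 when only separators (or nothing) remain. */
int next_word(const char **cursor, char *out, size_t out_size)
{
    const char *start = *cursor;   /* 1st pointer: end of the previous word */
    const char *end;               /* 2nd pointer: end of the present word */
    size_t len;

    while (*start != '\0' && is_separator(*start))
        start++;                   /* skip the gap between words */
    if (*start == '\0') {
        *cursor = start;
        return 0;
    }

    end = start;
    while (*end != '\0' && !is_separator(*end))
        end++;                     /* advance until a separator is hit */

    len = (size_t)(end - start);
    if (len >= out_size)
        len = out_size - 1;
    memcpy(out, start, len);
    out[len] = '\0';

    *cursor = end;
    return 1;
}
```

    Point a const char * at the start of the text and call next_word in a loop until it returns 0.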

    If you want to compare two strings, you can use strcmp (declared in string.h), or you could easily write your own function (which I'd recommend if you're in a learning context).
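
    For the "write your own" route, a comparison function could look roughly like this. The name my_strcmp is hypothetical; the return convention is modelled on strcmp (zero when equal, negative when the first string sorts first, positive otherwise).

```c
/* Hypothetical hand-written replacement for strcmp. */
int my_strcmp(const char *a, const char *b)
{
    /* Walk both strings while they agree and a has not ended. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    /* Compare as unsigned char so the sign of the result is consistent. */
    return (unsigned char)*a - (unsigned char)*b;
}
```

    Two words are the same when my_strcmp(first, second) == 0.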

  5. #4
    Join Date
    Jul 2006
    Posts
    16,478
    Blog Entries
    75
    Rep Power
    143

    Re: New to Posting: Help with word comparison in C

    The first thing I would suggest is making sure you have a list of useful functions: strcmp() comes to mind. I'm guessing this is homework, so you can probably assume words won't be overly long: 30 characters should be good, but be sure to test for it. Also, make sure you understand how scanf() interprets a word. If you can make scanf() do most of the work, that will help a LOT.
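
    To see how the scanf family interprets a word, here is a small sketch using fscanf with a field width. The 30-character limit is the guess from above, and read_word is a made-up helper name.

```c
#include <stdio.h>

#define MAX_WORD 30   /* assumed maximum word length */

/* Hypothetical helper: read the next whitespace-separated word.
   word must have room for MAX_WORD + 1 characters.
   "%30s" skips leading white space, then stores at most 30
   non-whitespace characters plus the terminating '\0'; the number
   in the format has to be kept in step with MAX_WORD by hand.
   Returns 1 if a word was read, 0 at end of input. */
int read_word(FILE *in, char *word)
{
    return fscanf(in, "%30s", word) == 1;
}
```

    Calling read_word(stdin, word) in a loop hands back one word per call. Punctuation stays attached ("hello," comes back with its comma), and a word longer than 30 characters comes back in pieces, which is why it is worth testing for.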
    Programming is a branch of mathematics.

  6. #5
    jaano31 is offline Newbie
    Join Date
    Oct 2009
    Posts
    2
    Rep Power
    0

    Re: New to Posting: Help with word comparison in C

    Thanks for the responses! The professor doesn't want us to delve too deep into pointers for this; he claims it can be done without them. I have been able to read in the number of words in the text, and want to store each word in a row of a two-dimensional array (I was hoping to then loop through each row and use strcmp() to compare), but I can't put the array together. I'm looking at scanf and sscanf but am confused as to how they separate words.
    Thanks for all the responses so far..
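
    A rough sketch of the two-dimensional array approach described here, using array notation and strcmp(). The limits (100 distinct words, 30 characters each), the parallel counts array, the function names and the "one star per 10%" rule read off the sample output are all assumptions, not part of the assignment as posted.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 100   /* assumed limit on distinct words */
#define MAX_LEN   31    /* assumed 30 characters plus '\0' */

/* Record one word: if a row already holds it, bump that row's count,
   otherwise copy it into the next free row. Words longer than
   MAX_LEN - 1 characters are truncated, and new words are ignored
   once the table is full. Returns the number of rows in use. */
int tally(char words[][MAX_LEN], int counts[], int n, const char *word)
{
    int i;

    for (i = 0; i < n; i++) {
        if (strcmp(words[i], word) == 0) {
            counts[i]++;
            return n;
        }
    }
    if (n < MAX_WORDS) {
        strncpy(words[n], word, MAX_LEN - 1);
        words[n][MAX_LEN - 1] = '\0';
        counts[n] = 1;
        n++;
    }
    return n;
}

/* Print one line per word: the word, one star per full 10%,
   then the percentage of the total word count. */
void print_histogram(char words[][MAX_LEN], const int counts[], int n, int total)
{
    int i, s;

    for (i = 0; i < n; i++) {
        int percent = total > 0 ? counts[i] * 100 / total : 0;
        printf("%s: ", words[i]);
        for (s = 0; s < percent / 10; s++)
            putchar('*');
        printf(" %d%%\n", percent);
    }
}
```

    Declare char words[MAX_WORDS][MAX_LEN]; and int counts[MAX_WORDS]; in main, call n = tally(words, counts, n, word); for every word read while keeping a running total, then call print_histogram once at the end.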

  7. #6
    Join Date
    Sep 2009
    Location
    USA
    Posts
    3,400
    Blog Entries
    5
    Rep Power
    37

    Re: New to Posting: Help with word comparison in C

    Take this small example:
    Code:
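    #include <stdio.h>

    int main(void)
    {
        char word[50];
        scanf("%s", word);      /* reads one whitespace-delimited word; a bare %s does not limit its length */
        printf("%s\n", word);   /* print the word that was read */
        return 0;
    }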
    Compile that program and run it. Type in two words and press enter. The program should output only the first word. scanf's %s stops putting data into the word variable when it hits white space (a space, tab or newline), which makes it convenient for getting separate words. The only problem is that scanf will include things like periods and commas, so you will have to filter those out.
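
    Filtering out the periods and commas could be done along these lines. This is only a sketch: clean_word is a hypothetical name, and it removes every character that ispunct() reports as punctuation, which also strips apostrophes (so "don't" becomes "dont").

```c
#include <ctype.h>

/* Hypothetical helper: remove punctuation from a word in place. */
void clean_word(char *word)
{
    char *src = word;   /* next character to examine */
    char *dst = word;   /* where the next kept character goes */

    while (*src != '\0') {
        unsigned char c = (unsigned char)*src++;
        if (!ispunct(c))
            *dst++ = (char)c;
    }
    *dst = '\0';
}
```

    Call clean_word(word) right after each scanf and before comparing or storing the word.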
    Root Beer == System Administrator's Beer
    Download the new operating system programming kit! (some assembly required)

  8. #7
    Join Date
    Jul 2009
    Location
    Santa Clarita, CA
    Posts
    2,111
    Blog Entries
    47
    Rep Power
    31

    Re: New to Posting: Help with word comparison in C

    I disagree with Guest. scanf() with a bare %s has the same problem gets() does: nothing tells it how big the buffer is, which will very easily lead to buffer overflows (you would have to write a field width such as %49s yourself). You're better off using something like fgets() to read a line into a buffer and checking that it wasn't too long; then you can sscanf that line and do any necessary strncmp()s, although I'd still use a custom function. So, yeah, that's not the only problem with scanf.
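
    A sketch of that fgets()-then-sscanf() flow, under assumed buffer sizes (128 characters per line, 30 per word). The function name and its return codes are invented for the example, and it only counts the words on a line to keep things short.

```c
#include <stdio.h>
#include <string.h>

#define LINE_LEN 128   /* assumed line buffer size */
#define WORD_LEN 31    /* assumed 30 characters plus '\0' */

/* Read one line with fgets and count the words on it.
   Returns the word count, -1 at end of input, or -2 if the line
   was too long for the buffer (the rest of that line is left
   unread on the stream). */
int words_in_next_line(FILE *in)
{
    char line[LINE_LEN];
    char word[WORD_LEN];
    const char *p = line;
    int used;
    int count = 0;

    if (fgets(line, sizeof line, in) == NULL)
        return -1;
    if (strchr(line, '\n') == NULL && !feof(in))
        return -2;

    /* %30s bounds each word; %n stores how many characters
       sscanf consumed so the next call can pick up after them. */
    while (sscanf(p, "%30s%n", word, &used) == 1) {
        count++;
        p += used;
    }
    return count;
}
```

    In a real program, the body of the while loop is where each word would be cleaned up and compared against the ones already stored.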
    Wow I changed my sig!
