
Large Scale Parsing and Searching

No replies to this topic

#1
sadronmeldir

Hello all,

I'm trying to build a Java application that can search large amounts of data efficiently. The raw data is tens of millions of lines of text, each containing a channel ID, a timestamp, a user ID, and a text message of up to 255 characters. The logs are split across thousands of files, each containing up to 10 MB of information.

The program should filter this data by date, channel, user, or keyword in the message text, and do so efficiently.
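
For illustration, a naive linear-scan version of this filtering could look like the sketch below. It is only a sketch under assumptions that may not match the real logs: each line is taken to be tab-separated in the order channel ID, timestamp, user ID, message, and the timestamp is taken to be ISO-8601 (e.g. `2012-03-01T12:00:00Z`). The names `LogFilterSketch`, `LogEntry`, `parse`, `byChannel`, `byUser`, `between`, `containsKeyword`, and `filter` are all made up for this example.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

class LogFilterSketch {

    /** One parsed log line. Field order and types are assumptions. */
    static final class LogEntry {
        final String channelId;
        final Instant timestamp;
        final String userId;
        final String message;

        LogEntry(String channelId, Instant timestamp, String userId, String message) {
            this.channelId = channelId;
            this.timestamp = timestamp;
            this.userId = userId;
            this.message = message;
        }
    }

    /** Parses one line, assuming tab-separated fields and an ISO-8601 timestamp. */
    static LogEntry parse(String line) {
        // Limit of 4 keeps any tabs inside the message text intact.
        String[] parts = line.split("\t", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Malformed line: " + line);
        }
        return new LogEntry(parts[0], Instant.parse(parts[1]), parts[2], parts[3]);
    }

    static Predicate<LogEntry> byChannel(String channelId) {
        return e -> e.channelId.equals(channelId);
    }

    static Predicate<LogEntry> byUser(String userId) {
        return e -> e.userId.equals(userId);
    }

    /** Matches entries with from <= timestamp < to. */
    static Predicate<LogEntry> between(Instant from, Instant to) {
        return e -> !e.timestamp.isBefore(from) && e.timestamp.isBefore(to);
    }

    /** Case-insensitive substring match on the message text. */
    static Predicate<LogEntry> containsKeyword(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        return e -> e.message.toLowerCase(Locale.ROOT).contains(needle);
    }

    /** Scans every line once and keeps the entries that satisfy the criteria. */
    static List<LogEntry> filter(List<String> lines, Predicate<LogEntry> criteria) {
        List<LogEntry> matches = new ArrayList<>();
        for (String line : lines) {
            LogEntry entry = parse(line);
            if (criteria.test(entry)) {
                matches.add(entry);
            }
        }
        return matches;
    }
}
```

Criteria combine with `Predicate.and`, for example `byChannel("chan1").and(containsKeyword("hello"))`. Every query in this form reads every line, which is the cost an index would need to avoid.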

Unfortunately, I've never worked on a project of this scope before. Does anyone have recommendations on how to index this data, or on some other way to parse and search it efficiently?



