I am attempting to write a program that analyzes a web server's log file to determine which computers have attempted to access that web server the most.
Any class in the Java standard library is available for use.
I have done some research to figure out what data structures would help, but I am stumped so far.
I know there are many, many ways to program this.
I have found information on "hit filters", and I am convinced I should use try/catch blocks.
Any help would be appreciated, Thanks.
-Mark
7 replies to this topic
#1
Posted 05 December 2011 - 01:53 PM
#2
Posted 05 December 2011 - 11:34 PM
It depends on what the log file looks like and how structured it is. A regex may be all you need (the Pattern and Matcher classes in Java).
#3
Posted 06 December 2011 - 01:02 AM
Oh! That just might be the solution. I'm researching it now and will test it soon.
I'm going to post the first few lines of the log file; the rest is identical apart from the addresses.
I'm trying to count each unique IP and output the 3 IPs with the most error entries.
[Wed Jun 30 20:02:53 2010] [error] [client 209.129.94.61] File does not exist:/site/hancocktools.com/_vti_bin
[Thu Jul 01 04:57:03 2010] [error] [client 67.218.116.163] File does not exist: C:/site/hypergrade.com/robots.txt
#4
Posted 06 December 2011 - 01:08 AM
That ain't too hard :)
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name RegexDemo is only a placeholder so the snippet compiles.
public class RegexDemo
{
    public static void main(String[] args)
    {
        String input = "[Wed Jun 30 20:02:53 2010] [error] [client 209.129.94.61] File does not exist:/site/hancocktools.com/_vti_bin\n" +
                "[Thu Jul 01 04:57:03 2010] [error] [client 67.218.116.163] File does not exist: C:/site/hypergrade.com/robots.txt";
        // Group 1 = date, group 2 = client IP, group 3 = error message.
        Pattern pattern = Pattern.compile("\\[(.*?)\\] \\[error\\] \\[client (.*?)\\] (.*)", Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(input);
        // find() moves to the next matching log line on every call.
        while(matcher.find()){
            System.out.println("Date: " + matcher.group(1));
            System.out.println("IP: " + matcher.group(2));
            System.out.println("Error msg: " + matcher.group(3));
            System.out.println("");
        }
    }
}
Output:
Date: Wed Jun 30 20:02:53 2010
IP: 209.129.94.61
Error msg: File does not exist:/site/hancocktools.com/_vti_bin

Date: Thu Jul 01 04:57:03 2010
IP: 67.218.116.163
Error msg: File does not exist: C:/site/hypergrade.com/robots.txt
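To get from the extracted IPs to "top 3", one option is a HashMap keyed by IP that you sort by count at the end. This is only a rough sketch: the class name IpCounter, the helper names countByIp and topN, and the repeated third log line in main are made up for illustration, and it assumes Java 8 or newer for merge() and the lambda.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
public class IpCounter {
    // Same regex as in the example above, compiled once.
    private static final Pattern LINE =
            Pattern.compile("\\[(.*?)\\] \\[error\\] \\[client (.*?)\\] (.*)");

    /** Counts how many matching error lines each client IP produced. */
    public static Map<String, Integer> countByIp(String log) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher matcher = LINE.matcher(log);
        while (matcher.find()) {
            // Insert 1 for a new IP, otherwise add 1 to the existing count.
            counts.merge(matcher.group(2), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns up to n entries, highest count first (order among ties is unspecified). */
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // First two lines are from the posted log; the third is a made-up repeat.
        String log =
            "[Wed Jun 30 20:02:53 2010] [error] [client 209.129.94.61] File does not exist:/site/hancocktools.com/_vti_bin\n"
          + "[Thu Jul 01 04:57:03 2010] [error] [client 67.218.116.163] File does not exist: C:/site/hypergrade.com/robots.txt\n"
          + "[Thu Jul 01 04:57:03 2010] [error] [client 209.129.94.61] File does not exist:/site/hancocktools.com/_vti_bin";
        for (Map.Entry<String, Integer> e : topN(countByIp(log), 3)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A HashMap keeps the counting step simple; sorting the entries once at the end should be fine unless the log has a huge number of distinct IPs.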
#5
Posted 06 December 2011 - 01:49 AM
Thank you so much for the Pattern + Matcher example!
The only problem is that when I try to use "matcher.find()" on the log file, it doesn't progress past the first IP.
And when I try to use a Scanner, I end up getting myself into an infinite loop.
For some reason, I can't progress through each individual log entry; I must be way too tired at this point.
I know it is basic, and I have done it before, but I'm getting slightly frustrated.
Edit:
Here is my current code, which only prints out one IP address when run.
I'm sure the answer is simple; I just have no energy left in me.
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogAnal
{
    public static void main(String[] args)
    {
        String input = "";
        try
        {
            FileReader fr = new FileReader("small.log");
            Scanner scanner = new Scanner(fr);
            input = scanner.nextLine();
            Pattern pattern = Pattern.compile("\\[(.*?)\\] \\[error\\] \\[client (.*?)\\] (.*)", Pattern.MULTILINE);
            Matcher matcher = pattern.matcher(input);
            while(scanner.hasNextLine())
            {
                input = scanner.nextLine();
                if(matcher.find())
                    System.out.println("IP: " + matcher.group(2));
            }
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
    }
}
#6
Posted 06 December 2011 - 02:11 AM
By the way, the Pattern.compile(..) thingy is quite an expensive operation in terms of processing power.
Make sure you do that only once, and only do .matcher(..) multiple times.
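Something along these lines is what I mean. It is only a sketch: the class name CompiledOnce and the helper extractIp are made up for illustration, and the regex is the one from the earlier example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
public class CompiledOnce {
    // Compiled a single time, when the class is loaded.
    private static final Pattern LINE =
            Pattern.compile("\\[(.*?)\\] \\[error\\] \\[client (.*?)\\] (.*)");

    /** Returns the client IP from one log line, or null if the line does not match. */
    public static String extractIp(String line) {
        // Creating a Matcher is cheap compared with compiling the Pattern.
        Matcher matcher = LINE.matcher(line);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "[Wed Jun 30 20:02:53 2010] [error] [client 209.129.94.61] File does not exist:/site/hancocktools.com/_vti_bin",
            "[Thu Jul 01 04:57:03 2010] [error] [client 67.218.116.163] File does not exist: C:/site/hypergrade.com/robots.txt"
        };
        for (String line : lines) {
            System.out.println("IP: " + extractIp(line));
        }
    }
}
```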
#7
Posted 06 December 2011 - 11:38 AM
Did you happen to see my edit?
I just woke up. :)
#8
Posted 06 December 2011 - 12:00 PM
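Matcher matcher = pattern.matcher(input); is only done once, outside the loop, so the matcher keeps searching the first line you read and never sees the later ones. Create the matcher inside the loop for each new line (or call matcher.reset(input) after reading it).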
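A minimal sketch of the loop with the matcher created per line, assuming the same small.log file and regex as in your code. The class name LogLoopSketch and the helper readIps are made up here, and it uses try-with-resources, so it needs Java 7 or newer. Note that it also drops the extra nextLine() call before the loop, which was skipping the first line.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a reworked version of the posted LogAnal.
public class LogLoopSketch {
    // Compile once, outside the loop.
    private static final Pattern LINE =
            Pattern.compile("\\[(.*?)\\] \\[error\\] \\[client (.*?)\\] (.*)");

    /** Reads every remaining line from the scanner and collects the client IPs. */
    public static List<String> readIps(Scanner scanner) {
        List<String> ips = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            // New matcher for each line, so each line is actually searched.
            Matcher matcher = LINE.matcher(line);
            if (matcher.find()) {
                ips.add(matcher.group(2));
            }
        }
        return ips;
    }

    public static void main(String[] args) {
        // try-with-resources closes the scanner (and the file) automatically.
        try (Scanner scanner = new Scanner(new FileReader("small.log"))) {
            for (String ip : readIps(scanner)) {
                System.out.println("IP: " + ip);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
```

From there, feeding each IP into a map of counts instead of a list gets you most of the way to the top 3.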