Java Robot - Find Color Sequences

13 replies to this topic

#1
BlaineSch

I was playing around with the Java Robot class and decided to try making another bot! I'm not posting the full source because I don't think it's that useful, but part of it is. I basically used some loops to find sequences of colors or graphics on the screen. The example below finds Facebook's favicon (for testing). It takes a little while (16 seconds for me), so if you can think of any improvements so that I don't have to test every pixel, that would be great!

import java.awt.*;
class urlGathering {
    private static String colorToHex(Color y) {
        // Zero-pad each channel to two hex digits so that e.g. 5 becomes "05";
        // unpadded output can never equal the six-character strings in the list
        // when a channel value is below 16.
        return String.format("%02x%02x%02x", y.getRed(), y.getGreen(), y.getBlue());
    }
    public static boolean isColorMatch(Robot myBot, int x1, int y1, String[][] list) {
        Color color;
        for(int y2 = y1;y2<(list.length+y1);y2++) {
            for(int x2 = x1;x2<(list[y2-y1].length+x1);x2++) {
                color = myBot.getPixelColor(x2,y2);
                if(colorToHex(color).compareToIgnoreCase(list[y2-y1][x2-x1]) != 0) {
                    return false;
                }
            }
        }
        return true;
    }
    public static int[] findColorSequence(Robot myBot, String[][] list){ 
        Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
        for(int x = 0;x<screenSize.width;x++){
            for(int y = 0;y<screenSize.height;y++){
                if(isColorMatch(myBot, x, y, list)) {
                    return new int[] {x, y};
                }
            }
        }
        return new int[] {-1,-1};//return ret;
    }
    public static void main(String[] args) throws AWTException, InterruptedException {
        //Thread.sleep(6000);
        long startTime = System.nanoTime();
        Robot myBot = new Robot();
        int[] pixels = findColorSequence(myBot, new String[][] {
                {"3b5998","3b5998","6078ab","ebeef4","ffffff","ffffff"},
                {"3b5998","3b5998","ebeef4","ffffff","ffffff","ffffff"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"ffffff","ffffff","ffffff","ffffff","ffffff","ffffff"},
                {"ebeef4","ebeef4","ffffff","ffffff","ebeef4","ebeef4"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"6d84b4","6d84b4","ffffff","ffffff","6d84b4","6d84b4"},
                {"6d84b4","6d84b4","ffffff","ffffff","6d84b4","6d84b4"}
        });
        System.out.println("Found:\n\tX:"+pixels[0]+"\n\tY:"+pixels[1]+"\nNanoseconds: "+(System.nanoTime()-startTime));
    }
}
On another note, I realized multidimensional arrays are kind of backwards. If you're actually writing them out on paper you might do something like:
[0][1][2]
[1][a][b]
[2][c][d]
If it's x,y, then 2,1 would be "b", but for an array it's backwards: the first index picks the row (y) and the second picks the column (x). So the visual aid should be drawn sideways! lol

#2
wim DC

To convert an int to hexadecimal you can easily use:
Integer.toString(number,16)
That's the static toString from the Integer class. The 16 (the radix) is important ;)
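
One thing to watch: `Integer.toString(n, 16)` does not zero-pad, so a channel value of 5 comes out as "5" rather than "05", and the concatenated string can end up shorter than six characters. A minimal sketch of a padded conversion, assuming the color is available either as a packed 0xRRGGBB int or as three 0-255 channels (the class name `HexColor` is made up for illustration):

```java
final class HexColor {
    /** Formats a packed 0xRRGGBB value as six lowercase hex digits. */
    static String toHex(int rgb) {
        // %06x pads with leading zeros; the mask drops any alpha bits.
        return String.format("%06x", rgb & 0xFFFFFF);
    }

    /** Same result built from separate 0-255 channel values. */
    static String toHex(int red, int green, int blue) {
        return String.format("%02x%02x%02x", red, green, blue);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toString(5, 16)); // prints 5 (no padding)
        System.out.println(toHex(0x3b, 0x59, 0x98)); // prints 3b5998
        System.out.println(toHex(0, 15, 255));       // prints 000fff
    }
}
```

With padded strings, every color is exactly six characters long, so two different colors can never produce the same text.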

Looks really nice tbh.
Did you autogenerate this by giving the program a picture to analyse, or did you type it manually? It shouldn't be too hard to let Java do this, I think :)

Quote

{"3b5998","3b5998","6078ab","ebeef4","ffffff","ffffff"},
                {"3b5998","3b5998","ebeef4","ffffff","ffffff","ffffff"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"ffffff","ffffff","ffffff","ffffff","ffffff","ffffff"},
                {"ebeef4","ebeef4","ffffff","ffffff","ebeef4","ebeef4"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"3b5998","3b5998","ffffff","ffffff","3b5998","3b5998"},
                {"6d84b4","6d84b4","ffffff","ffffff","6d84b4","6d84b4"},
                {"6d84b4","6d84b4","ffffff","ffffff","6d84b4","6d84b4"}




Now here's some theory I came up with:

To speed things up you could take bigger steps when analysing the screen.
Right now you go (x,y) 0,0 - 1,0 - 2,0 - 3,0 - ....
But instead you could move 3 pixels at a time, since the image you seek (the favicon) is 6 px wide. If the first pixel has the color "3b5998", and the next pixel (3 pixels further) is NOT the color in the list cell 3 positions to the right of the one you matched, you already know it's not part of the image.

Maybe that's not very well explained, but the following may make it clearer ^^

In this case, with an image 6 px wide, you remember the color at position x,y and the color at position x+3,y. I'll call them colorX and colorX3 for now.

If you can't find any position in your 2D array of colors where [x][y]==colorX && [x+3][y]==colorX3, then you can be pretty sure that you won't find the image between those 2 pixels either. If you complete a whole line of your screen without finding any match, it's also safe to increase y by imageYouSeek.height().
If the image was on the line right below the one you checked, then the next line you analyse will hit a match with the lowest row of the array of colors...

You can't take bigger steps for x than imageYouSeek.width/2, however. I think that this may greatly improve performance.


So I think it goes something like this
(assuming you keep the y-increment at 1):
*check if the color of the current pixel is in the list (remember the position)
*move 3 pixels to the right (if this isn't possible, skip the next steps, go to the next pixel (3 pixels away) and start over)
*move 3 cells to the right in the list
*if the colors match, there is a chance the image starts there
-If they don't, go back to the first step.

When you've reached the end of the line (x-wise) and no match has been found with any row of the picture, increase the y-parameter by imageYouSeek.height()....
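
The stepping idea can be sketched on plain arrays. This is a variation rather than the exact steps listed: it samples one pixel per image-sized tile, and for each sample it tries every cell of the image that has the same color as a possible alignment. It assumes both the screen and the image are already in memory as rectangular `int[row][column]` arrays of packed colors; `StrideSearch`, `find` and `matchesAt` are made-up names.

```java
final class StrideSearch {
    /** Returns {x, y} of the template's top-left corner in screen, or {-1, -1}. */
    static int[] find(int[][] screen, int[][] template) {
        int sh = screen.length, sw = screen[0].length;
        int th = template.length, tw = template[0].length;
        // One sample per template-sized tile: any fully visible copy of the
        // template covers exactly one sampled column and one sampled row.
        for (int y = th - 1; y < sh; y += th) {
            for (int x = tw - 1; x < sw; x += tw) {
                int sample = screen[y][x];
                // Every template cell with the sampled color is a candidate
                // alignment; verify each one pixel by pixel.
                for (int ty = 0; ty < th; ty++) {
                    for (int tx = 0; tx < tw; tx++) {
                        if (template[ty][tx] == sample
                                && matchesAt(screen, template, x - tx, y - ty)) {
                            return new int[] {x - tx, y - ty};
                        }
                    }
                }
            }
        }
        return new int[] {-1, -1};
    }

    /** Full comparison of the template against screen at offset (ox, oy). */
    static boolean matchesAt(int[][] screen, int[][] template, int ox, int oy) {
        int th = template.length, tw = template[0].length;
        if (ox < 0 || oy < 0 || oy + th > screen.length || ox + tw > screen[0].length) {
            return false; // candidate would hang off the edge of the screen
        }
        for (int ty = 0; ty < th; ty++) {
            for (int tx = 0; tx < tw; tx++) {
                if (screen[oy + ty][ox + tx] != template[ty][tx]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

The number of sampled screen pixels drops by a factor of width x height of the image; the price is that a very common color (a white background, say) produces many candidates to verify.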

Edited by wim DC, 18 March 2010 - 08:36 AM.


#3
Sinipull

This type of image-matching algorithm isn't very robust, as the slightest change in the picture will make it unrecognizable to the computer: any simple filter will do it (brightness, contrast, hue, noise, conversion to a lower-quality format, etc.). That's why neural networks are used in real life to do image recognition. Nothing critical, just saying in case someone's looking for a direction :)

#4
wim DC

BTW, those 16 seconds... I hope you were testing with the favicon somewhere in the top left corner, because if I test the code with
for(int x = 0;x<screenSize.width/10;x++){
            for(int y = 0;y<screenSize.height/10;y++){
so 168 x 105 pixels, with no Facebook icon in there, it takes me 360 seconds just to go through every pixel and analyse it to find out there is no icon. It doesn't consume much CPU power though.

#5
wim DC

Hello again

Because I find this pretty interesting, I took the liberty of using your code and improving it: Java now reads an image and converts it to the 2D array of color strings, and I made it faster.
Where you were scanning the screen pixel by pixel, it now takes hops equal to the image's width / height.
I also added an 'accuracy' setting, which is certainly useful for bigger images. With an accuracy of 1 it has to check every single pixel of the image; increasing it to 2 makes it check every 2nd pixel (what are the odds that something else matches on every 2nd pixel?).
It's best to set it higher as the image gets bigger.

import javax.imageio.ImageIO;
import java.awt.*;
import java.awt.event.InputEvent;
import java.awt.image.BufferedImage;

import java.io.File;
import java.io.IOException;

class urlGathering {
    private static int width, height;
    private static boolean go = true;
    private static int accuracy=10;

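    private static String colorToHex(Color y) {
        // Zero-pad each channel to two hex digits; unpadded output is ambiguous
        // (for example, channels 1,16,0 and 17,0,0 would both give "1100").
        return String.format("%02x%02x%02x", y.getRed(), y.getGreen(), y.getBlue());
    }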

    public static int[] isPixelPartOfImage(Robot myBot, int x1, int y1, String[][] list) {
        Color color = myBot.getPixelColor(x1, y1);
        String colorHex = colorToHex(color);

        for (int y2 = 0; y2 < height; y2++) {
            for (int x2 = 0; x2 < width; x2++) {
                if (colorHex.equals(list[y2][x2])) {
                    for (int y3 = 0; y3 < height && go; y3 +=accuracy) {
                        for (int x3 = 0; x3 < width && go; x3 +=accuracy) {
                            myBot.mouseMove(x1 - x2 + x3, y1 - y2 + y3);
                            String colorHex2 = colorToHex(myBot.getPixelColor(x1 - x2 + x3, y1 - y2 + y3));
                            if (!colorHex2.equals(list[y3][x3])){          
                                go = false;
                            }
                            else {
                                if (x3 >= width - accuracy && y3 >= height - accuracy)
                                    return new int[]{x1 - x2, y1 - y2};
                            }
                        }
                    }
                    go = true;

                }
            }
        }
        return new int[]{-1, -1};
    }

    public static int[] findColorSequence(Robot myBot, String[][] list) {
        Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
        for (int y = height; y < screenSize.getHeight(); y += height) {
            for (int x = 3; x < screenSize.getWidth(); x += width) {
                myBot.mouseMove(x, y);
                int[] result = isPixelPartOfImage(myBot, x, y, list);
                if (result[0] != -1 && result[1] != -1)
                    return result;
            }
            System.out.println("line: " + y);
        }
        return new int[]{-1, -1};
    }

    public static String getHex(int pixel) {
        int red = (pixel >> 16) & 0xff;
        int green = (pixel >> 8) & 0xff;
        int blue = (pixel) & 0xff;
        Color color = new Color(red, green, blue);

        return colorToHex(color);
    }

    private static String[][] marchThroughImage(BufferedImage image) {
        width = image.getWidth();
        height = image.getHeight();

        String colors[][] = new String[height][width];
        for (int i = 0; i < height; i++) {
            for (int j = 0; j < width; j++) {
                int pixel = image.getRGB(j, i);
                colors[i][j] = getHex(pixel);
            }
        }

        return colors;
    }

    public static void start() throws AWTException, InterruptedException {
        long startTime = System.nanoTime();
        Robot myBot = new Robot();
        File file = new File("find.bmp");
        String[][] colors = new String[1][];
        try {
            BufferedImage img = ImageIO.read(file);
            colors = new String[img.getHeight()][img.getWidth()];
            colors = marchThroughImage(img);
        } catch (IOException e) {
            System.out.println("fout met file");
        }

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                System.out.printf("%8s", colors[y][x] + "  ");
            }
            System.out.println("");
        }
        System.out.println("");

        int[] pixels = findColorSequence(myBot, colors);
        if (pixels[0] != -1) {
            // Only click when the image was actually found.
            myBot.mousePress(InputEvent.BUTTON1_MASK);
            myBot.mouseRelease(InputEvent.BUTTON1_MASK);
        }
        System.out.println("Found:\n\tX:" + pixels[0] + "\n\tY:" + pixels[1] + "\nNanoseconds: " + (System.nanoTime() - startTime));
    }

    // Entry point so the class can be run directly.
    public static void main(String[] args) throws AWTException, InterruptedException {
        start();
    }
}

I let the robot move the mouse during the process, so I know where it's at. So if you have a white background all over the screen and the image you seek contains white, it will take a loooong time (several minutes). So either remove the myBot.mouseMove calls (there are 2 lines) or know your IDE's key combination to interrupt the process. Don't be afraid to use a high number for accuracy: it greatly increases speed and it has never given me the wrong place, yet :)

It finds Facebook's favicon in 3.379 seconds with accuracy set to 10 (= every 10th pixel is checked); my Facebook favicon is 14x14 btw. It takes 7.173 seconds with an accuracy of 1 (= every pixel is checked), 4.257 s with an accuracy of 2, and 3.795 s with an accuracy of 3...
This is with Facebook as the 3rd tab of my browser.


Accuracy has a bigger impact when the image you seek contains a color that occurs multiple times on the screen.
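
To make the trade-off behind 'accuracy' concrete, here is a minimal sketch of a sparse comparison on in-memory arrays. It assumes `int[row][column]` arrays of packed colors and a caller that has already checked the bounds; `SparseCheck`, `matches` and `step` are made-up names, with `step` playing the role of the accuracy variable.

```java
final class SparseCheck {
    /**
     * Compares every step-th pixel of template (in both directions) with
     * screen at offset (ox, oy). A larger step is faster but can report a
     * match even though some skipped pixels differ.
     */
    static boolean matches(int[][] screen, int[][] template, int ox, int oy, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        for (int ty = 0; ty < template.length; ty += step) {
            for (int tx = 0; tx < template[ty].length; tx += step) {
                if (screen[oy + ty][ox + tx] != template[ty][tx]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With a step of 2 only about a quarter of the pixels are compared, so a single differing pixel that falls between the samples goes unnoticed. One possible compromise is to run the sparse check first and confirm a hit with a step of 1.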


The code may be a bit hard to understand with x1,x2,x3 and y1,y2,y3... It always makes a lot of sense when I write it and is confusing afterwards :p Just understand that (x1-x2, y1-y2) is the possible top left corner of the image.
The image you seek, which Java reads from a hard-coded path, must be a BMP, as JPG screws up the colors.

#6
cliveceps

Hello, to the above poster:

I have more improvements for you. First off, you should never write code in "Column major" form but use "Row major" instead. This means you should use for(x...) before for(y...). Column major form defeats the cache's spatial locality and is a big no-no for optimization. In fact most optimizing compilers will fix this problem for you. I cut the program's run time from 21 seconds to 11.5 just by changing that alone.

Now the second change I made was to use for(x2 = 0; x2 < width; x2 += 5) and for(y2 = 0; y2 < height; y2 += 5). This let me skip a lot more pixels in the search and cut my running time from 11.5 seconds to 0.05 seconds (WOW!!). Overall the program is now about 420 times faster than before (21 / 0.05), and it is still working for me (you might have to play around with some numbers depending on the size of the image).

Just some tips for you to try!

#7
wim DC


Quote

for(x2 = 0; x2 < width; x2 +=5) and for(y2 = 0; y2 < height; y2+=5)
I already use
for (int y = height; y < screenSize.getHeight(); y += height) {
            for (int x = 3; x < screenSize.getWidth(); x += width) {
Which is the max I could possibly skip in the search. Once it has found a pixel that matches some part of the image it will obviously take smaller jumps; the size of the smaller jump is the 'accuracy' variable, so I think I pretty much did that already.


Quote

First off, you should never write code in "Column major" form but use "Row major" instead. This means you should use for(x...) before for(y...). Column major form violates the use of cache spatial locality and is a big no-no for optimization. In fact most optimizing compilers will fix this problem for you. I cut the programs run time from 21 seconds to 11.5 just by changing that alone.
Wow, didn't know that.
But what is the column and what is the row in a 2D array? Is it [row][column] or [column][row]?

Also, I never actually tried it, but would it be faster if I just took a screenshot with the Robot class and compared against the pixels of that image instead of calling getPixelColor() on the screen?

#8
cliveceps

The format you should use is [row][column]. The reason is that when your cache accesses a memory location, it brings the next few memory locations along with it. For example, if you access [0][0] it will also bring in [0][1], [0][2], [0][3] etc., so you will get cache hits for those items and your program will run faster. It will not bring in [1][0] or [2][0], so you will get cache misses there and your program will run slower.
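
For what it's worth, a Java 2D array is an array of row arrays, so with a grid indexed as `list[y][x]` the order that walks neighbouring elements is y in the outer loop and x in the inner loop. A minimal sketch (the class and method names are made up for illustration):

```java
final class RowMajor {
    /** Reads the cell at screen-style coordinates (x, y) from a [row][column] grid. */
    static int at(int[][] grid, int x, int y) {
        return grid[y][x]; // first index = row (y), second index = column (x)
    }

    /** Sums the grid, visiting the cells of each row consecutively. */
    static long sum(int[][] grid) {
        long total = 0;
        for (int y = 0; y < grid.length; y++) {       // outer loop: rows
            int[] row = grid[y];                      // one row is one Java array
            for (int x = 0; x < row.length; x++) {    // inner loop: columns
                total += row[x];
            }
        }
        return total;
    }
}
```

Whether this makes a measurable difference depends on where the time goes. In the Robot-based versions above, each `getPixelColor` call is likely far more expensive than a cache miss, so the loop order over the small color list probably matters much less there than the number of screen reads.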

I am interested in optimizing this code more for a program I am writing to capture images. How would you do it differently with a screen shot?
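
One way the screenshot variant could look: grab the whole screen once, copy both images into flat `int` arrays, and do all comparisons in memory. The sketch below only takes `BufferedImage` arguments so it can be tried without a display; `ScreenshotSearch` and its methods are made-up names, and the search itself is a plain exhaustive scan.

```java
import java.awt.image.BufferedImage;

final class ScreenshotSearch {
    /** Copies an image into a flat row-major array of 0xRRGGBB values. */
    static int[] pixels(BufferedImage img) {
        int w = img.getWidth(), h = img.getHeight();
        int[] data = img.getRGB(0, 0, w, h, null, 0, w);
        for (int i = 0; i < data.length; i++) {
            data[i] &= 0xFFFFFF; // ignore alpha so only the color is compared
        }
        return data;
    }

    /** Returns {x, y} of the first exact occurrence of template, or {-1, -1}. */
    static int[] find(BufferedImage screen, BufferedImage template) {
        int sw = screen.getWidth(), sh = screen.getHeight();
        int tw = template.getWidth(), th = template.getHeight();
        int[] s = pixels(screen), t = pixels(template);
        for (int y = 0; y + th <= sh; y++) {
            for (int x = 0; x + tw <= sw; x++) {
                if (matchesAt(s, sw, t, tw, th, x, y)) {
                    return new int[] {x, y};
                }
            }
        }
        return new int[] {-1, -1};
    }

    private static boolean matchesAt(int[] s, int sw, int[] t, int tw, int th, int ox, int oy) {
        for (int ty = 0; ty < th; ty++) {
            for (int tx = 0; tx < tw; tx++) {
                if (s[(oy + ty) * sw + ox + tx] != t[ty * tw + tx]) {
                    return false; // most positions fail on the first pixel
                }
            }
        }
        return true;
    }
}
```

In a real run the screen image would come from something like `new Robot().createScreenCapture(new Rectangle(Toolkit.getDefaultToolkit().getScreenSize()))`, and the template from `ImageIO.read`. The point is that the screen is read once instead of once per pixel.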

#9
tate

So I wanted to see if I could do this myself from scratch. The result: I could find the Facebook favicon placed anywhere on my screen in under 0.2 seconds, using the 16x16 favicon with a skip of 3. Here is my code. It requires one argument, the location of the image you want to find; the skip argument is optional. It waits three seconds before it checks so you can move stuff around. I sped things up a lot by checking all four corners and a middle pixel before checking the rest.
import java.awt.Dimension;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.AWTException;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/**
 * ImageFind.java
 * @author Tate Exon
 * Date: June 4, 2010
 * Finds an image on the screen.
 */

public class ImageFind {
    
    private Robot robo;
    private BufferedImage capturedImage;
    private BufferedImage compareTo;
    private Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
    private int[] helpers = new int[5];
    private int skipper = 2; // Default to 2;
    
    public ImageFind(String imageToCompareToName, int skipper){
        try {
            this.compareTo = ImageIO.read(new File(imageToCompareToName));
            this.loadHelpers();
            this.robo = new Robot();
            this.robo.delay(3000);
            this.capturedImage = this.robo.createScreenCapture(
                    new Rectangle((int)this.screenSize.getWidth(),(int)this.screenSize.getHeight()));
            this.skipper = skipper;
            long startTime = System.nanoTime();
            int[] location = this.findImage();
            long endTime = System.nanoTime();
            if(location[0]!= (-1)){
                p("Found at x="+location[0]+",y="+location[1]);
                this.robo.mouseMove(location[0], location[1]);
            }else{
                p("Could not find a match.");
            }
            p("Time taken = "+((double)(endTime-startTime))/1000000000+" seconds");
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }catch (AWTException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
    
    public void loadHelpers(){
        this.helpers[0] = this.compareTo.getRGB(0, 0);
        this.helpers[1] = this.compareTo.getRGB(0, this.compareTo.getHeight()-1);
        this.helpers[2] = this.compareTo.getRGB(this.compareTo.getWidth()-1, 0);
        this.helpers[3] = this.compareTo.getRGB(this.compareTo.getWidth()-1, this.compareTo.getHeight()-1);
        this.helpers[4] = this.compareTo.getRGB(this.compareTo.getWidth()/2, this.compareTo.getHeight()/2);
    }
    
    public boolean checkHelpers(int x, int y){
        if(this.helpers[0] != this.capturedImage.getRGB(x, y)){
            return false;
        }else if(this.helpers[1] != this.capturedImage.getRGB(x, y+this.compareTo.getHeight()-1)){
            return false;
        }else if(this.helpers[2] != this.capturedImage.getRGB(x+this.compareTo.getWidth()-1,y)){
            return false;
        }else if(this.helpers[3] != this.capturedImage.getRGB(x+this.compareTo.getWidth()-1, y+this.compareTo.getHeight()-1)){
            return false;
        }else if(this.helpers[4] != this.capturedImage.getRGB(x+this.compareTo.getWidth()/2, y+this.compareTo.getHeight()/2)){
            return false;
        }else{
            return true;
        }
    }
    
    public boolean comparePixels(int x, int y, int cx, int cy){
        if(this.compareTo.getRGB(x, y) == this.capturedImage.getRGB(cx, cy)){
            return true;
        }else{
            return false;
        }
    }
    
    public int[] findImage(){
        int w = this.compareTo.getWidth();
        int h = this.compareTo.getHeight();
        int cw = this.capturedImage.getWidth()-w;
        int ch = this.capturedImage.getHeight()-h;
        int[] foundPoint = new int[2];
        foundPoint[0] = -1;
        foundPoint[1] = -1;
        for(int x=0; x<cw; x++){
            for(int y=0; y<ch; y++){
                if(checkHelpers(x,y)){
                    if(this.checkArea(x, y, w, h)){
                        foundPoint[0]=x;
                        foundPoint[1]=y;
                        return foundPoint;
                    }
                }
            }
        }
        return foundPoint;
    }
    
    public boolean checkArea(int x, int y, int w, int h){
        // Compare every skipper-th pixel of the image with the capture.
        // Both offsets restart at 0 for each column so the whole area is sampled.
        for(int dx=0; dx<w; dx+=this.skipper){
            for(int dy=0; dy<h; dy+=this.skipper){
                if(!comparePixels(dx,dy,x+dx,y+dy)){
                    return false;
                }
            }
        }
        return true;
    }
    
    public static void p(String p){
        System.out.println(p);
    }
    
    public static void main(String args[]){
        String fileName = "";
        int skip = -1;
        if(args.length==0){
            p("Need at least an image file location argument.");
            p("args should be <filename> <optional int for skip>");
            System.exit(0);
        }else if(args.length==1){
            fileName = args[0];
            skip = 2;//default
        }else if(args.length>=2){
            fileName = args[0];
            skip = Integer.parseInt(args[1]);
        }
        new ImageFind(fileName,skip);
    }
}

Edited by tate, 04 June 2010 - 08:32 PM.

twas brillig

#10
wim DC

Ahh, Tate already did it with a screenshot ;)

I'm guessing it's faster because the image is accessible in RAM, while the screen pixels have to come from somewhere else in the computer ^^

#11
tate

Hm, I just now caught on to what BlaineSch did with the favicon. He pulled out only the white "f" in the picture, so the pattern is smaller and there is less to match. I could speed my code up even more if I made an algorithm to look for unique patterns within the original image and match those instead of the whole image. Sounds like a fun thing to try. I also thought about looking for the same image just brighter or darker, by taking pixel measurements and looking for similar differences.
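
The brighter-or-darker idea could be sketched like this: take the per-channel difference between the first pair of pixels as the expected shift, then require every other pair to show roughly the same shift. It assumes both regions are given as equally long arrays of packed 0xRRGGBB values; `BrightnessShift`, `diff`, `sameUpToShift` and `tolerance` are made-up names.

```java
final class BrightnessShift {
    /** Per-channel difference a - b of two 0xRRGGBB values, as {red, green, blue}. */
    static int[] diff(int a, int b) {
        return new int[] {
            ((a >> 16) & 0xFF) - ((b >> 16) & 0xFF),
            ((a >> 8) & 0xFF) - ((b >> 8) & 0xFF),
            (a & 0xFF) - (b & 0xFF)
        };
    }

    /**
     * True if every screen pixel differs from its template pixel by about the
     * same per-channel offset (within tolerance), i.e. the region looks like
     * the template made uniformly brighter or darker.
     */
    static boolean sameUpToShift(int[] screen, int[] template, int tolerance) {
        if (screen.length != template.length || screen.length == 0) {
            return false;
        }
        int[] base = diff(screen[0], template[0]); // shift seen at the first pixel
        for (int i = 1; i < screen.length; i++) {
            int[] d = diff(screen[i], template[i]);
            for (int c = 0; c < 3; c++) {
                if (Math.abs(d[c] - base[c]) > tolerance) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

This only models a constant additive shift. Channels that clip at 0 or 255, or filters that scale contrast, would need a larger tolerance or a different comparison.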

#12
cliveceps

Nice job! Now I have another idea, can you make the program find the same image on any size monitor? For example if I have an image taken from a 1024x768 monitor, it will not be found on a 1280x800 monitor (I tested it, didn't work). I guess the picture is drawn on the screen using more pixels so the robot doesn't match what it's expecting to find.
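
If the image really is drawn at a different size on the other monitor (that is a guess, it has not been confirmed in this thread), one thing to try is resizing the template by a few candidate factors and searching for each version. A minimal nearest-neighbour sketch, with `TemplateScaler` and `scale` as made-up names:

```java
import java.awt.image.BufferedImage;

final class TemplateScaler {
    /** Returns a nearest-neighbour resized copy of src; factor must be > 0. */
    static BufferedImage scale(BufferedImage src, double factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("factor must be positive");
        }
        int w = Math.max(1, (int) Math.round(src.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(src.getHeight() * factor));
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            // Map each output row/column back to the closest source pixel.
            int sy = Math.min(src.getHeight() - 1, (int) (y / factor));
            for (int x = 0; x < w; x++) {
                int sx = Math.min(src.getWidth() - 1, (int) (x / factor));
                out.setRGB(x, y, src.getRGB(sx, sy));
            }
        }
        return out;
    }
}
```

For example, `scale(template, 1280.0 / 1024.0)` would be one candidate for the monitors mentioned. Browsers and operating systems usually smooth scaled images, so an exact pixel match on a resized template is unlikely; comparing with a per-channel tolerance rather than strict equality would probably be needed as well.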

I'm sure this must be possible, because you have things like facial recognition software which do something similar.