Jump to content

Nearest Words

- - - - -

  • Please log in to reply
5 replies to this topic

#1
newcheese

newcheese

    Newbie

  • Members
  • Pip
  • 4 posts
Hey everyone,

I need to make a "nearest words" function for my site. I am working with PHP and MySQL, and dealing with 100 000+ records. I have a WordID (auto increment) and Word field.

I'll explain what I have so far, which is FAR from ideal:

I have a words table, with a word id, organized in ascending/alphabetical order.

It's quite easy to find the nearest words this way, as the WordID +/- 10 can provide me with the 20 closest words to whatever query.

The problem lies with inserting new data. How can I insert in alphabetical order? Or update my table so that WordIDs stay sequential & Words remain in alphabetical order?

My temporary fix (because I can't come up with anything else, except maybe temp tables..) is to alter the table and order by word ascending, then drop the WordID column, and then re-add it. This meets my requirements but its slow, and it can't be a good idea to be dropping primary keys on the fly like that.

I tried using cursors, but it's also slow.

Any suggestions, ideas, or other ways to approach this problem? I'd love to have something as clean and efficient as the nearest word list urbandictionary has, for example.

Thanks

#2
newcheese

newcheese

    Newbie

  • Members
  • Pip
  • 4 posts
Suppose I have the following list of words:

1 tiger
2 dog
3 elephant
4 kangaroo
5 lion
6 zebra
7 mouse
8 cat

an order by Word ascending would give me:

8 cat
2 dog
3 elephant
4 kangaroo
5 lion
7 mouse
1 tiger
6 zebra

how could I then choose the 4 closest words to lion (by which I mean, elephant, kangaroo, mouse and tiger)?

I can't do WordID +/- 2 because WordID doesn't all of a sudden become sequential after an order by.

This is the problem...

#3
zeroradius

zeroradius

    Speaks fluent binary

  • Members
  • PipPipPipPipPipPipPipPip
  • 1,406 posts
all you need is to use Order by in your select query.

So lets say you want to sort the animals above and the field in the database is named name and the table name is animal

Select * From animal Order By name Desc;

Note: Desc stands for Descending


Edit: Did not see your second post. Use the LIMIT key word.

Select * From animal Order By name Desc Limit 4;
Posted Image

#4
newcheese

newcheese

    Newbie

  • Members
  • Pip
  • 4 posts
That would give me the first 4 results, not what I'm looking for.


Suppose I have 50 records.
I order them by ASC.
I need to find the 25th record, +/- the 4 around it.

I would need to offset the results by 23 and limit them by 4.

However, I wont ever know how many results to offset by this way... the table is constantly changing and so is the search query.

Limit/offset/order by does not work.

#5
John

John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts
  • Location:New York, NY
mysql> use test;

Database changed


mysql> describe animals;

+-------+--------------+------+-----+---------+----------------+

| Field | Type         | Null | Key | Default | Extra          |

+-------+--------------+------+-----+---------+----------------+

| id    | int(11)      | NO   | PRI | NULL    | auto_increment |

| name  | varchar(255) | NO   |     | NULL    |                |

+-------+--------------+------+-----+---------+----------------+

2 rows in set (0.01 sec)


mysql> (SELECT * FROM `animals` WHERE `name` = 'lion')

    -> UNION ALL (SELECT * FROM `animals` WHERE `name` > 'lion' ORDER BY `name` ASC LIMIT 2)

    -> UNION ALL (SELECT * FROM `animals` WHERE `name` < 'lion' ORDER BY `name` DESC LIMIT 2)

    -> ORDER BY `name` ASC;

+----+----------+

| id | name     |

+----+----------+

|  3 | elephant |

|  4 | kangaroo |

|  5 | lion     |

|  7 | mouse    |

|  1 | tiger    |

+----+----------+

5 rows in set (0.00 sec)


mysql> (SELECT * FROM `animals` WHERE `name` > 'lion' ORDER BY `name` ASC LIMIT 2)

    -> UNION ALL (SELECT * FROM `animals` WHERE `name` < 'lion' ORDER BY `name` DESC LIMIT 2)

    -> ORDER BY `name` ASC;

+----+----------+

| id | name     |

+----+----------+

|  3 | elephant |

|  4 | kangaroo |

|  7 | mouse    |

|  1 | tiger    |

+----+----------+

4 rows in set (0.00 sec)


#6
newcheese

newcheese

    Newbie

  • Members
  • Pip
  • 4 posts
Ha, I guess that would work. I don't know my SQL well enough.

Thanks!




1 user(s) are reading this topic

0 members, 1 guests, 0 anonymous users