Advice on data management.


#1
jfrog

    Newbie

  • Members
  • 2 posts
Hello,

I'm doing a project that requires building some sort of database. I'm an electrical engineer, and basically I've written a program that extracts "fingerprints" from MP3s, to be used for identification. Now I need to store these fingerprints in some sort of database, and I don't know how I should proceed. I'm doing this in MATLAB, but my question is not really MATLAB-specific.

Each fingerprint file is split up into 32-bit sub-fingerprints. Each entire fingerprint is about 90 kb. So my initial thought is to design some sort of header for each fingerprint, store each sub-fingerprint in a binary file as a 32-bit integer, and just write all the fingerprints one after another to the file, with the headers separating them.
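A minimal sketch of the layout described above, written in Python rather than MATLAB so it can be run anywhere. The header fields (a 4-byte marker, a numeric track ID, and a count of sub-fingerprints) and the names `MAGIC`, `write_fingerprint`, and `read_fingerprints` are assumptions made for illustration; the post does not say what the header would contain.

```python
import io
import struct

# Hypothetical header: 4-byte marker, uint32 track ID, uint32 sub-fingerprint count.
# "<" forces little-endian with no padding, so the file is portable across machines.
HEADER = struct.Struct("<4sII")
MAGIC = b"FPRT"  # assumed marker, lets the reader detect a corrupt or misaligned file


def write_fingerprint(stream, track_id, subprints):
    """Append one header followed by its sub-fingerprints as 32-bit unsigned ints."""
    stream.write(HEADER.pack(MAGIC, track_id, len(subprints)))
    stream.write(struct.pack(f"<{len(subprints)}I", *subprints))


def read_fingerprints(stream):
    """Read every record in the stream into a dict of track_id -> list of ints."""
    records = {}
    while True:
        header = stream.read(HEADER.size)
        if not header:
            break  # clean end of file
        if len(header) < HEADER.size:
            raise ValueError("truncated header")
        magic, track_id, count = HEADER.unpack(header)
        if magic != MAGIC:
            raise ValueError("bad header marker")
        payload = stream.read(4 * count)
        if len(payload) != 4 * count:
            raise ValueError("truncated fingerprint data")
        records[track_id] = list(struct.unpack(f"<{count}I", payload))
    return records


# Usage: an in-memory buffer stands in for the binary file on disk.
buf = io.BytesIO()
write_fingerprint(buf, 1, [0xDEADBEEF, 0x00000001])
write_fingerprint(buf, 2, [0xFFFFFFFF])
buf.seek(0)
loaded = read_fingerprints(buf)
```

Storing the count in the header means the reader never has to scan for a separator, which matters because any 32-bit pattern could also occur as a legitimate sub-fingerprint value.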

Of course, I will later need to search this database, so I plan to just load the file into memory and search from there.
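One way the load-then-search plan could look, sketched in Python. The post does not say how fingerprints are compared, so this assumes exact matching of 32-bit sub-fingerprint values and a simple vote count; a real audio matcher would probably need to tolerate bit errors. The names `build_index` and `best_match` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(fingerprints):
    """Map each sub-fingerprint value to the (track_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, subprints in fingerprints.items():
        for pos, value in enumerate(subprints):
            index[value].append((track_id, pos))
    return index


def best_match(index, query):
    """Return (track_id, votes) for the track sharing the most query values, or None."""
    votes = Counter()
    for value in query:
        # Count each track at most once per query value.
        for track_id in {tid for tid, _ in index.get(value, ())}:
            votes[track_id] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


# Usage with tiny made-up fingerprints (real ones would hold thousands of values).
fingerprints = {1: [10, 20, 30], 2: [30, 40, 50]}
index = build_index(fingerprints)
result = best_match(index, [40, 50, 99])
```

Building the index once after loading avoids scanning every stored fingerprint for every query, which is what would become slow as the collection grows.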

So, my question is, am I headed in the right direction? I have absolutely no experience with databases and writing larger programs (I do a lot of little coding projects, but never anything like this).

Eventually I would want to be able to store thousands of fingerprints, so that would mean a pretty large database. I don't want to start on something that is going to be impossible to manage once the database gets bigger. Any help on the direction I should head in would be appreciated.

Thanks!

#2
gaylo565

    Programming Professional

  • Members
  • 268 posts
I would use a relational database (probably an SQL product such as MySQL). If you create a table with a column for the actual fingerprint, you can also have an ID number column (your primary key or unique identifier) and any descriptive columns (such as a text description). This will make searching through the files much easier in the future. However, if you decide to go this way, you will need to separate your sub-fingerprints with something. This could be solved by making a table with just the ID number column and a column for each sub-fingerprint integer. This, together with a descriptive table for the fingerprints (also with your ID number column to keep things straight), would work well. You can download MySQL and a database design studio for creating and editing databases for free. There is quite a bit to learn with SQL, so if you do decide to go this route, be ready to spend some time looking things up. In the long run this will make your project much easier to manage than a single binary file, and it gives you ways to iterate through just the desired parts of your fingerprint list rather than the whole thing.
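A rough sketch of the two-table idea, with two deliberate substitutions that are assumptions, not part of the advice above. First, it uses Python's built-in `sqlite3` module instead of MySQL so it runs without a server; the SQL would need small adjustments for MySQL. Second, it stores one row per sub-fingerprint (with a `position` column) rather than one column per sub-fingerprint, since a single fingerprint can contain a very large number of 32-bit values. Table, column, and function names are hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create a descriptive table plus a table holding the sub-fingerprint values."""
    conn.executescript("""
        CREATE TABLE fingerprint (
            id          INTEGER PRIMARY KEY,
            description TEXT
        );
        CREATE TABLE sub_fingerprint (
            fingerprint_id INTEGER NOT NULL REFERENCES fingerprint(id),
            position       INTEGER NOT NULL,
            value          INTEGER NOT NULL,
            PRIMARY KEY (fingerprint_id, position)
        );
        -- Index on value so lookups do not scan the whole table.
        CREATE INDEX idx_sub_value ON sub_fingerprint(value);
    """)


def add_fingerprint(conn, description, subprints):
    """Insert one fingerprint and its ordered sub-fingerprints; return the new ID."""
    cur = conn.execute(
        "INSERT INTO fingerprint (description) VALUES (?)", (description,)
    )
    fid = cur.lastrowid
    conn.executemany(
        "INSERT INTO sub_fingerprint VALUES (?, ?, ?)",
        [(fid, pos, value) for pos, value in enumerate(subprints)],
    )
    conn.commit()
    return fid


def find_by_value(conn, value):
    """Return (id, description) of every fingerprint containing the given value."""
    rows = conn.execute(
        "SELECT DISTINCT f.id, f.description FROM fingerprint f "
        "JOIN sub_fingerprint s ON s.fingerprint_id = f.id "
        "WHERE s.value = ? ORDER BY f.id",
        (value,),
    )
    return rows.fetchall()


# Usage with an in-memory database and made-up data.
conn = sqlite3.connect(":memory:")
create_schema(conn)
a = add_fingerprint(conn, "song a", [1, 2, 3])
b = add_fingerprint(conn, "song b", [3, 4])
matches = find_by_value(conn, 3)
```

The shared ID column is what ties the descriptive table to the sub-fingerprint table, which is the "keep things straight" role described in the post.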

Edited by gaylo565, 24 July 2008 - 06:46 PM.


#3
jfrog

    Newbie

  • Members
  • 2 posts
Thanks for the advice. I've heard of MySQL, but don't know much about it. If I went this route, would my program require MySQL to be installed on the machine running it? Or could I use MySQL while building the program and then let the program stand alone?

Thanks again.

#4
WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
When using a database, you will need the database program installed someplace as well, though not necessarily on the same machine.