Jump to content


Check out our Community Blogs

Register and join over 40,000 other developers!


Recent Status Updates

View All Updates

Photo
- - - - -

Organize SQL Table Data by Comparison

SQL

  • Please log in to reply
3 replies to this topic

#1 DoktorD1313

DoktorD1313

    CC Newcomer

  • Just Joined
  • PipPip
  • 24 posts

Posted 16 July 2011 - 04:17 PM

Hey guys,

I'm in a tricky situation and I'm trying to decide on the best way to approach a solution.

I have a .CSV file that needs to be organized and inserted into a SQL database table. My question is: do I need to organize it before or after inserting it into the database?

Here's what makes it difficult.. It's a .CSV file containing the monthly item sales history of the entire inventory of a store. The problem is, instead of having the entire history of one item on one row, there is a seperate row for each year of an item's sales.

Here's what I mean.. This is the first few lines of the .CSV file.

0-004 ,2010,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000002,0000000,0000000,0000000,0000000,
0-007 ,2011,0000000,0000000,0000000,0000001,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,
0-034 ,2009,0000002,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,
0-034 ,2010,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000000,0000001,0000002,0000000,
0-095 ,2009,0000000,0000000,0000000,0000000,0000004,0000000,0000002,0000000,0000000,0000000,0000000,0000001,
0-095 ,2010,0000000,0000000,0000000,0000003,0000000,0000000,0000001,0000000,0000000,0000002,0000000,0000000,
0-095 ,2011,0000000,0000000,0000000,0000000,0000003,0000000,0000000,0000000,0000000,0000000,0000000,0000000,

So I need to get all the data into a single row by item number to make it easier to process later on.

Google searches seem to indicate that SQL doesn't really support line by line comparison. The only way I can think of getting around this is to first read the .CSV file line by line into an array with C# and organize the sales properly using IF THEN ELSE statements. However, this is a very large file with over 60,000 lines. I'd like to avoid that if I could.

Basically I need to figure out a way to take a line, compare to see if the item number matches the line before it, then compare the years to the current date to move the monthly sales to the correct positions on a single line.

Man, I wish I could just change how the data is first exported to the .CSV file.. But alas, I cannot.
  • 0

#2 WingedPanther73

WingedPanther73

    A spammer's worst nightmare

  • Moderator
  • 17757 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others

Posted 16 July 2011 - 04:46 PM

I would recommend NOT putting all the data in one row, but instead leave it as annual records. Think about it: when 2012 comes, your proposal would mean you have to add another column to your table. When 2013 comes, you have to do it again. if you use the current format, you never have to change your database structure, and your reports can be written to process the records without having to do major changes to them, either.

I'd leave it in the current format, and it should be a simple process to import it.
  • 0

Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

My MineCraft server site: http://banishedwings.enjin.com/


#3 DoktorD1313

DoktorD1313

    CC Newcomer

  • Just Joined
  • PipPip
  • 24 posts

Posted 18 July 2011 - 11:54 AM

I understand what you're saying, however in my particular case it may not be best.

Each time the program is ran, it will begin with a cleared database table and re-import the .CSV file. After the report has been generated, the database table will be cleared. This is because the .CSV file is exported from the Point of Sale program and only retains three years of records. (After a fourth year has been reached, the eldest year is truncated from the record and will never be used again).

Also, I'm considering putting it all into one row to make the mathematical functions easier to work with. (Things like the fact that only the past 24 months are taken into account when processing the data.. This would be more difficult to do if dealing with multiple rows for each year).

Perhaps the biggest reason to put it all in one row is the fact that there are three other data files containing other pertinent information by item number that eventually needs to be all paired together..

However, while writing this response, I think I solved my own problem with how to pair and organize all three files into one single database table..

Anyway, I'm open to ideas.. I think the biggest problem I have is only having a very basic knowledge of the functions, commands, and power of SQL. I would imagine that as I learn the language, I'll pick up and realize there are far more efficient ways of doing things using all the tools SQL provides.
  • 0

#4 WingedPanther73

WingedPanther73

    A spammer's worst nightmare

  • Moderator
  • 17757 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others

Posted 18 July 2011 - 01:20 PM

Joins are one of those power items. You can join a table with itself, using different years to specify the records that should be treated as "columns".

One things that I've found is that people frequently want new reports that slice date in different directions, once they have one report. The closer to a raw form the data is in, the easier it is to slice the data as needed.
  • 0

Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

My MineCraft server site: http://banishedwings.enjin.com/






Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download