Jump to content

Remove duplicate rows from access database

- - - - -

This topic has been archived. This means that you cannot reply to this topic.
5 replies to this topic

#1
access@123

access@123

    Newbie

  • Members
  • Pip
  • 3 posts
hi
i have a window application, wherein i have to choose the database,its table and rowfields present in the table at runtime. Now i want to remove all the duplicates from database. my database is access and whichever row field the user chooses the duplicates should be removed leaving the latest updated row. Please help me my code is as below..
string str = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + txtPath.Text; 
OleDbConnection con = new OleDbConnection(str); 

string t = "select * from " + cmbtables.SelectedItem.ToString(); 
OleDbDataAdapter adt = new OleDbDataAdapter(t, str); 
con.Open();
DataSet ds = new DataSet(); 


adt.Fill(ds);

DataTable dt = ds.Tables[0]; 
con.Close();
dt = RemoveDuplicateRows(dt, cmbRow.SelectedItem.ToString());


private DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
Hashtable hTable = new Hashtable();
ArrayList duplicateList = new ArrayList();
foreach (DataRow drow in dTable.Rows)
{
if (hTable.Contains(drow[colName]))
duplicateList.Add(drow);
else
hTable.Add(drow[colName], string.Empty);
}
foreach (DataRow dRow in duplicateList)
dTable.Rows.Remove(dRow);
return dTable;
}
when i use datatable.update it is giving me error hoping for your co-operation
Thank you

Edited by WingedPanther, 12 October 2008 - 01:48 PM.
add code tags


#2
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Without knowing the table structure of your database it will be a little tough to help you. For example, do you have a time-stamp in your table? How about a GUID? If you're going to use a database, you should really be using SQL to handle the deletions/updates.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
Guest_Jordan_*

Guest_Jordan_*
  • Guests
Use the DISTINCT function in yor SQL:

string t = "select DISTINCT <column1>, <column2>, <etc> from " + cmbtables.SelectedItem.ToString() + " ORDER BY <timestamp> DESC";

Instead of selecting * you need to specify each column that you want so that the combination only appears once (using the DISTINCT) function. I also added an ORDER BY so that it will retrieve the last one based on the timestamp (replace <timestamp> with your timestamp column name).

As a side note, it is best-practice in SQL to capitalize every keyword. Your code above would change to:

string t = "SELECT * FROM " + cmbtables.SelectedItem.ToString(); 

For even cleaner code you may want to use a line break each time a new keyword appears.

#4
access@123

access@123

    Newbie

  • Members
  • Pip
  • 3 posts
hi thanks for your response, however in my table there is no timestamp column. let me explain once more the scenario, i have a desktop application wherein i select the database it can be any database and then i have to select the table and then the row fields present in the table. so it can be any table or any row field which user selects at runtime.

#5
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
It sounds like you're writing a cleanup utility for arbitrary badly designed databases, correct?
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#6
access@123

access@123

    Newbie

  • Members
  • Pip
  • 3 posts

WingedPanther said:

It sounds like you're writing a cleanup utility for arbitrary badly designed databases, correct?

yes absolutely correct so if possible help me in writing this utility....