
Repetitive task needs to be automated, don't know where to start!

3 replies to this topic

#1
TML

    Newbie

  • Members
  • Pip
  • 2 posts
To explain the problem:

I am writing code for Stata. The code reads in data from a text file, then assigns readable variable names like "Price" to what is read in as ER34556, so that I actually know what the variables are. I have multiple text files of data, corresponding to different years in a panel survey. Variables carry over across years but have different names, e.g. ER34556 is "Price" in 2007 and ER45566 is "Price" in 2009. There is online documentation for this data where I can search for "ER34556" and get back a list of the corresponding variables across years, e.g. "[05]ER25588 [07]ER34556 [09]ER45566" as part of the returned information. I have written the code for 2009 and copied and pasted it for the previous years. What I want to do is this: take each cryptic "ER..." variable name from my Stata code (saved as a text file) and search the online documentation for it, then use the returned list of corresponding variables to update the code written for the other years. As I see it (please correct me if I'm wrong at any point), this will require several steps:


  • Extract the variable names from the 2009 code (commands are written: generate varname = ER...), so I would need everything to the right of the equals sign.
  • Take each "ER" variable and search the documentation.
  • Extract the list that is returned.
  • Search the Stata code for additional instances of the "ER" variable.
  • Assign the new variable name from the documentation on a year-by-year basis.
  • Repeat the above as needed.

This should be possible, but I have no idea where to start! (languages, methods, etc) Any help at all would be greatly appreciated! If more information is needed to answer the question, please let me know. Thanks in advance.

#2
WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others
What is the format of the online documentation? PDF is very different from HTML or text.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
TML

The search returns HTML. You can go here and search for something like er43564. What I need can be found by clicking on the little exclamation point next to the name in the returned record at the bottom of the page. Thanks again.

#4
WingedPanther

You can probably do this with most any web-based language. You could also use something like AutoIt to write a script.
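
As a rough starting point, here is a minimal Python sketch of the text-processing steps listed in the first post. It assumes the Stata commands look like "generate varname = ER..." (also accepting the "gen" abbreviation) and that the documentation text contains entries shaped like "[07]ER34556", as quoted above. It does not fetch anything from the documentation site: how the search page is queried is left out, so the documentation text is passed in as a plain string. The function names and the ER45999 variable in the demo are hypothetical, made up for illustration.

```python
import re

# Assumed command shape, taken from the first post: generate varname = ER12345
GENERATE_RE = re.compile(
    r"^\s*gen(?:erate)?\s+(\w+)\s*=\s*(ER\d+)\b",
    re.IGNORECASE | re.MULTILINE,
)

# Assumed documentation entry shape, taken from the first post: [07]ER34556
XREF_RE = re.compile(r"\[(\d{2})\](ER\d+)", re.IGNORECASE)

# Any ER name appearing anywhere in the code
ER_NAME_RE = re.compile(r"\bER\d+\b", re.IGNORECASE)


def extract_er_names(stata_code):
    """Return the ER names found to the right of '=' in generate commands."""
    return [m.group(2).upper() for m in GENERATE_RE.finditer(stata_code)]


def parse_crosswalk(doc_text):
    """Turn '[05]ER25588 [07]ER34556 ...' into {'05': 'ER25588', '07': 'ER34556', ...}."""
    return {year: name.upper() for year, name in XREF_RE.findall(doc_text)}


def translate_code(stata_code, crosswalks, from_year, to_year):
    """Swap every from_year ER name for its to_year equivalent.

    Names without a known equivalent are left untouched so they can be
    reviewed by hand.
    """
    mapping = {}
    for crosswalk in crosswalks:
        if from_year in crosswalk and to_year in crosswalk:
            mapping[crosswalk[from_year]] = crosswalk[to_year]

    def swap(match):
        return mapping.get(match.group(0).upper(), match.group(0))

    return ER_NAME_RE.sub(swap, stata_code)


if __name__ == "__main__":
    # ER45999 is a made-up name with no documentation entry, for illustration.
    code_2009 = "generate Price = ER45566\ngenerate Income = ER45999\n"
    doc_text = "[05]ER25588 [07]ER34556 [09]ER45566"

    print(extract_er_names(code_2009))
    crosswalk = parse_crosswalk(doc_text)
    print(translate_code(code_2009, [crosswalk], "09", "07"))
```

In practice you would loop over the names returned by extract_er_names, look each one up in the documentation, collect one crosswalk per name, and then call translate_code once per target year. Leaving unmapped names unchanged is a deliberate choice here, since a silent wrong substitution in survey data is harder to spot than a leftover 2009 name.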



