I am using Python to write an interpreter for a practice language of mine. I have started on a tokenizer, written as a single function, going mostly on my own understanding and imagination. However, I can't get the loop to work both correctly and efficiently: every error I fix means changing code while still trying to keep it fast. My goal is to loop through the whole source string once and return a list of string tokens, but this has proved quite difficult. I searched the web for tutorials on building a tokenizer and for examples of real language tokenizers, but had no luck finding any. Could anyone here point me to tokenizer tutorials or existing language tokenizers, or help me build my own? Any help is appreciated.
Note: The language that the Python program interprets has a few kinds of delimiters and many operators, some of which contain the characters of other operators, such as "+", "+=" and "++". It also has strings, chars and escape sequences, including an escaped quotation mark so that a string or char literal can contain the same quotation mark that delimits it. Ultimately, strings are lists of chars in the language, though they can be defined using string quotation marks.
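To make the requirements above concrete, here is a minimal sketch of the kind of single-pass loop I am aiming for. The operator and delimiter sets, the backslash as the escape character, and the name tokenize are all assumptions made up for illustration, not the real language definition. It tries the longest operator first so that "+=" and "++" are not split into "+" tokens, and it skips over the character after a backslash so an escaped quotation mark does not end the literal.

```python
# Hypothetical token sets: replace with the language's real operators and delimiters.
OPERATORS = {"+", "+=", "++", "-", "-=", "--", "=", "==", "*", "/"}
DELIMITERS = set("(){}[];,")
MAX_OP_LEN = max(len(op) for op in OPERATORS)


def tokenize(source):
    """Scan source once, left to right, and return a list of string tokens."""
    tokens = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch in "\"'":
            # String or char literal: advance to the matching unescaped quote.
            # Assumes backslash is the escape character.
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1
            if j >= n:
                raise SyntaxError("unterminated literal at index %d" % i)
            tokens.append(source[i:j + 1])
            i = j + 1
        elif ch.isalnum() or ch == "_":
            # Identifier or number: consume the whole run of word characters.
            j = i
            while j < n and (source[j].isalnum() or source[j] == "_"):
                j += 1
            tokens.append(source[i:j])
            i = j
        elif ch in DELIMITERS:
            tokens.append(ch)
            i += 1
        else:
            # Operators: try the longest candidate first so "+=" beats "+".
            for length in range(MAX_OP_LEN, 0, -1):
                if source[i:i + length] in OPERATORS:
                    tokens.append(source[i:i + length])
                    i += length
                    break
            else:
                raise SyntaxError("unexpected character %r at index %d" % (ch, i))
    return tokens
```

With these assumed token sets, tokenize('x += 1; y++ + 2') gives ['x', '+=', '1', ';', 'y', '++', '+', '2']. The index i only ever moves forward, so the source string is read once; the inner operator loop is bounded by the longest operator length rather than by the input size.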
Edited by Kreative, 26 December 2015 - 09:00 AM.