Where can I learn to tokenize well?


No replies to this topic

#1 Kreative


    CC Newcomer

  • Member
  • 20 posts

Posted 26 December 2015 - 08:53 AM


I am using Python to write an interpreter for a practice language of mine. I have started writing a tokenizer as a single function, relying on my general understanding of code and mostly imagination. However, I can't seem to produce an efficient, correctly working loop: I keep hitting errors, and fixing them means changing the code while also trying to keep it efficient. My aim is to loop through the whole source string once and return a list of string tokens, but this has proved quite difficult. I searched the web for tutorials on building a tokenizer and for examples of real language tokenizers, but had no luck finding any. Could anyone here point me to tutorials on tokenizing, to existing language tokenizers, or help me build my own? Any help is appreciated.

Note: The language that the Python program interprets has a few kinds of delimiters and many operators, some of which contain the characters of other operators, such as "+", "+=" and "++". It also has strings, chars and escape sequences, including an escaped quotation mark so that a string or char literal can contain the same quotation mark that delimits it. Ultimately, strings are lists of chars in the language, though they can be defined using string quotation marks.
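
To make these requirements concrete, here is a rough sketch of how a single-pass tokenizer along these lines might look, using Python's standard `re` module. It is only an illustration under assumptions: the operator and delimiter lists, the token group names, and the `tokenize` function are hypothetical, since the actual language is not specified here. Operators are sorted longest first so that "+=" and "++" are tried before "+", and string and char literals accept a backslash followed by any character as an escape.

```python
import re

# Hypothetical symbol sets; the real language's operators and delimiters are assumptions.
OPERATORS = ["++", "+=", "+", "--", "-=", "-", "==", "=", "*", "/"]
DELIMITERS = ["(", ")", "{", "}", ";", ","]


def _alternation(symbols):
    # Longest symbols first, so "+=" and "++" win over "+" (maximal munch).
    ordered = sorted(symbols, key=len, reverse=True)
    return "|".join(re.escape(s) for s in ordered)


TOKEN_RE = re.compile("|".join([
    r"(?P<ws>\s+)",                          # whitespace, skipped
    r'(?P<string>"(?:\\.|[^"\\])*")',        # "..." with backslash escapes
    r"(?P<char>'(?:\\.|[^'\\])')",           # 'c' or an escaped char such as '\''
    r"(?P<number>\d+(?:\.\d+)?)",            # integer or simple decimal
    r"(?P<name>[A-Za-z_]\w*)",               # identifiers and keywords
    "(?P<op>" + _alternation(OPERATORS) + ")",
    "(?P<delim>" + _alternation(DELIMITERS) + ")",
]))


def tokenize(source):
    """Walk the source string once and return a list of token strings."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                "unexpected character %r at index %d" % (source[pos], pos)
            )
        if match.lastgroup != "ws":
            tokens.append(match.group())
        pos = match.end()  # every pattern consumes at least one character
    return tokens


if __name__ == "__main__":
    print(tokenize('x += 1; y++ + "a\\"b"'))
```

Each position in the input is matched exactly once, so the scan stays linear in practice. If the token kind is needed later (for a parser, say), `match.lastgroup` could be stored alongside each token string instead of being discarded.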

Kind Regards,

Edited by Kreative, 26 December 2015 - 09:00 AM.
