
Finding nested ( )'s in a file using regular expressions?

2 replies to this topic

#1
vanattab

I am trying to write Python code that reads a string stored in Newick tree format and builds a graph from the data. Below is a graph and what the stored Newick file would look like.

Posted Image

The Newick code for the above graph:

(A:1,D:6,(((E:1,F:1):1,B:2):1,(C:4,G:2):2):1);

This representation uses the internal node where A and D meet as the root node. Which internal node you choose as the root is arbitrary. Each set of ( ) represents an internal node; the elements within the parentheses are the children of that node, separated by commas. A:1 is a leaf node connected to the root node by an edge of weight 1. The weights are optional.

Hence,
( A , B , ( C , D )) would look like:

A     D
  \__/
  /  \
B     C
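To make the format description concrete, here is a minimal sketch of a recursive parser that turns a Newick string into a list of (parent, child, weight) edges. The function name `newick_edges` and the generated names `internal1`, `internal2`, ... for unlabeled internal nodes are assumptions made for illustration, and the sketch assumes well-formed input without quoted labels or comments.

```python
import itertools

def newick_edges(newick):
    """Return (root_name, edges), where edges holds (parent, child, weight) tuples."""
    # Drop whitespace and the trailing ";" so only the tree structure remains.
    text = newick.strip().rstrip(";").replace(" ", "")
    counter = itertools.count(1)  # numbers the unlabeled internal nodes
    edges = []
    pos = 0

    def parse_node():
        nonlocal pos
        child_specs = []
        if text[pos] == "(":
            pos += 1  # skip "("
            while True:
                child_specs.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another child follows
                    continue
                pos += 1  # skip ")"
                break
        # Optional label: everything up to the next structural character.
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos] or f"internal{next(counter)}"
        # Optional weight after ":".
        weight = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            weight = float(text[start:pos])
        for child, child_weight in child_specs:
            edges.append((name, child, child_weight))
        return name, weight

    root, _ = parse_node()
    return root, edges

root, edges = newick_edges("( A , B , ( C , D ))")
print(root)
for edge in edges:
    print(edge)
```

Each pair of parentheses becomes one internal node, and each comma-separated element inside it becomes an edge from that node to a child. Weights come back as floats, or None where the string gives none.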


What I need to be able to do is find the innermost set or sets of parentheses. Is there any way to do this using regular expressions? In the above example I would ultimately want to find the indexes into the string of the ( ) containing C and D. Sometimes I would need to find two sets, as the example below illustrates. If there are any regex pros around, I would appreciate the help.
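One possible approach, sketched with Python's standard `re` module: a group that contains no other parentheses can be matched with `\([^()]*\)`, that is, an opening parenthesis, any run of characters that are not parentheses, then a closing parenthesis. The helper name `innermost_groups` is made up for this example. Note that the pattern finds every group with nothing nested inside it, which is not necessarily only the deepest level of the tree.

```python
import re

# "(" + any characters except parentheses + ")".
# A match cannot contain another group, so it is an innermost one.
INNERMOST = re.compile(r"\([^()]*\)")

def innermost_groups(newick):
    """Return (open_index, close_index, text) for each innermost ( ) group."""
    return [(m.start(), m.end() - 1, m.group()) for m in INNERMOST.finditer(newick)]

print(innermost_groups("( A , B , ( C , D ))"))
# [(10, 18, '( C , D )')]

print(innermost_groups("(A:1,D:6,(((E:1,F:1):1,B:2):1,(C:4,G:2):2):1);"))
# [(11, 19, '(E:1,F:1)'), (30, 38, '(C:4,G:2)')]
```

The second call shows the case with two innermost sets: both (E:1,F:1) and (C:4,G:2) are returned along with the indexes of their parentheses.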

#2
djl09

I am trying to solve the same problem.

I'm trying to write a regex that will split things on the comma, but not if the comma is inside ().

So the initial call would split it into:
A:1
D:6
(((E:1,F:1):1,B:2):1,(C:4,G:2):2)

and then you could call the regex again on the third part to get
((E:1,F:1):1,B:2):1
(C:4,G:2):2

and so on.

Does anyone know how to write a regex that will do this?
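Python's built-in `re` module has no recursive patterns, so a single regex cannot track arbitrarily deep nesting. A small depth counter does the same job. The following is only a sketch, with hypothetical helper names `split_top_level` and `children`, and it assumes balanced parentheses.

```python
def split_top_level(text):
    """Split text on commas that are not inside any parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])  # the piece after the last top-level comma
    return parts

def children(node):
    """Strip the outermost ( ) of a node, ignoring anything after the final ")"
    such as a weight or ";", and split what is inside on top-level commas."""
    return split_top_level(node[node.index("(") + 1:node.rindex(")")])

top = children("(A:1,D:6,(((E:1,F:1):1,B:2):1,(C:4,G:2):2):1);")
print(top)
# ['A:1', 'D:6', '(((E:1,F:1):1,B:2):1,(C:4,G:2):2):1']

print(children(top[2]))
# ['((E:1,F:1):1,B:2):1', '(C:4,G:2):2']
```

Note that each piece keeps its own weight, so the third piece of the first split ends in :1. Calling `children` again on any piece that still contains parentheses continues the descent.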

#3
Flying Dutchman

I don't know how to do it with regexes, but you could use the find() method to locate the first closing bracket and then find the closest opening bracket to its left.
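A rough sketch of that idea using `str.find` and `str.rfind`. The function names are invented for the example, and the second function extends the idea to collect every innermost pair, assuming the parentheses are balanced.

```python
def first_innermost(text):
    """Return (open_index, close_index) of the first innermost ( ) pair, or None."""
    close = text.find(")")              # first closing bracket
    if close == -1:
        return None
    open_ = text.rfind("(", 0, close)   # closest opening bracket to its left
    if open_ == -1:
        return None
    return open_, close

def all_innermost(text):
    """Return (open_index, close_index) for every innermost ( ) pair."""
    pairs = []
    prev_close = -1
    close = text.find(")")
    while close != -1:
        open_ = text.rfind("(", 0, close)
        # The pair is innermost only if no ")" sits between "(" and this ")".
        if open_ > prev_close:
            pairs.append((open_, close))
        prev_close = close
        close = text.find(")", close + 1)
    return pairs

print(first_innermost("( A , B , ( C , D ))"))
# (10, 18)

print(all_innermost("(A:1,D:6,(((E:1,F:1):1,B:2):1,(C:4,G:2):2):1);"))
# [(11, 19), (30, 38)]
```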