python - Help on Regular Expression problem


I wonder if it's possible to write a regex for the following data pattern:

'152: ashkenazi a, benlifer a, korenblit j, silberstein sd.'

string = '152: ashkenazi a, benlifer a, korenblit j, silberstein sd.' 

I am using a regular expression (with Python's re module) to extract these names:

re.findall(r'(\d+): (.+), (.+), (.+), (.+).', string, re.M | re.S)

result:

[('152', 'ashkenazi a', 'benlifer a', 'korenblit j', 'silberstein sd')] 

Now, trying a different number of names (fewer or more than 4) with this pattern doesn't work anymore, because the regex expects to find exactly 4 of them:

(.+), (.+), (.+), (.+). 

I can't find a way to generalize this pattern.
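To make the limitation concrete, here is a small check of the fixed four-name pattern against records with three and five names. The two sample records are made up for illustration; only the pattern comes from the question.

```python
import re

# The fixed four-name pattern from the question.
FOUR_NAMES = r'(\d+): (.+), (.+), (.+), (.+).'

# Hypothetical sample records with three and five names.
three = '153: smith j, doe a, roe b.'
five = '154: smith j, doe a, roe b, poe c, lee d.'

three_result = re.findall(FOUR_NAMES, three, re.M | re.S)
five_result = re.findall(FOUR_NAMES, five, re.M | re.S)

print(three_result)  # [] since only two ', ' separators are present
print(five_result)   # matches, but the first name group swallows two names
```

With three names there is no match at all; with five names the pattern still matches, but the greedy first group absorbs 'smith j, doe a' as if it were one name.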

This should do the trick if you want the stuff after the numbers:

re.findall(r'\d+: (.+)(?:, .+)*\.', string, re.M | re.S)

And if you want everything, including the number:

re.findall(r'(\d+): (.+)(?:, .+)*\.', string, re.M | re.S)
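As a quick check of what these two patterns return for the sample string from the question: the greedy (.+) consumes the whole name list, so the names come back as one comma-separated string rather than as separate items.

```python
import re

string = '152: ashkenazi a, benlifer a, korenblit j, silberstein sd.'

# Names only: a list with a single string holding all the names.
names_only = re.findall(r'\d+: (.+)(?:, .+)*\.', string, re.M | re.S)

# Number plus names: a list with a single (number, names) tuple.
with_number = re.findall(r'(\d+): (.+)(?:, .+)*\.', string, re.M | re.S)

print(names_only)
print(with_number)
```

This works for any number of names, but splitting the names apart still needs a second step.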

And if you want them separated out into a list of matches, nest the regex (note that each match keeps its trailing comma and any leading space):

re.findall(r'[^,]+,|[^,]+$', re.findall(r'\d+: (.+)(?:, .+)*\.', string, re.M | re.S)[0], re.M | re.S)
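An alternative to nesting regexes, offered as a sketch rather than as the method above: match the number and the rest of the line once, then split the names on commas with str.split. The helper name parse_record is hypothetical, and the sketch assumes one record per string, ending in a period.

```python
import re

def parse_record(line):
    """Return (number, [names]) for a line like '152: a b, c d.', else None."""
    # One record per line: digits, a colon, the names, then a final period.
    m = re.fullmatch(r'(\d+):\s*(.+)\.', line.strip())
    if m is None:
        return None
    number, names = m.groups()
    # Split on commas and trim surrounding whitespace from each name.
    return number, [name.strip() for name in names.split(',')]

string = '152: ashkenazi a, benlifer a, korenblit j, silberstein sd.'
print(parse_record(string))
```

Because the split happens outside the regex, the same function handles two names or ten without changing the pattern, and the names come back without commas or leading spaces.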
