The Python Oracle

glob exclude pattern

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: The World Wide Mind

--

Chapters
00:00 Glob Exclude Pattern
00:19 Answer 1 Score 62
00:45 Answer 2 Score 116
00:56 Answer 3 Score 13
01:23 Accepted Answer Score 306
01:54 Thank you

--

Full question
https://stackoverflow.com/questions/2063...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #glob

#avk47



ACCEPTED ANSWER

Score 306


The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two different wild-cards, and character ranges are supported [from pymotw: glob – Filename pattern matching].

So you can exclude some files with patterns.
For example to exclude manifests files (files starting with _) with glob, you can use:

files = glob.glob('files_path/[!_]*')



ANSWER 2

Score 116


You can deduct sets and cast it back as a list:

list(set(glob("*")) - set(glob("eph*")))



ANSWER 3

Score 62


You can't exclude patterns with the glob function, globs only allow for inclusion patterns. Globbing syntax is very limited (even a [!..] character class must match a character, so it is an inclusion pattern for every character that is not in the class).

You'll have to do your own filtering; a list comprehension usually works nicely here:

files = [fn for fn in glob('somepath/*.txt') 
         if not os.path.basename(fn).startswith('eph')]



ANSWER 4

Score 13


Late to the game but you could alternatively just apply a python filter to the result of a glob:

files = glob.iglob('your_path_here')
files_i_care_about = filter(lambda x: not x.startswith("eph"), files)

or replacing the lambda with an appropriate regex search, etc...

EDIT: I just realized that if you're using full paths the startswith won't work, so you'd need a regex

In [10]: a
Out[10]: ['/some/path/foo', 'some/path/bar', 'some/path/eph_thing']

In [11]: filter(lambda x: not re.search('/eph', x), a)
Out[11]: ['/some/path/foo', 'some/path/bar']