How to use glob() to find files recursively?
--
Full question
https://stackoverflow.com/questions/2186...
Accepted answer links:
[pathlib.Path.rglob]: https://docs.python.org/3/library/pathli...
[pathlib]: https://docs.python.org/3/library/pathli...
[glob.glob('**/*.c')]: https://docs.python.org/3/library/glob.h...
[os.walk]: https://docs.python.org/2/library/os.htm...
[fnmatch.filter]: https://docs.python.org/2/library/fnmatc...
Answer 2 links:
[3.5]: https://docs.python.org/3.5/library/glob...
[Python 3 Demo]: https://trinket.io/python3/e69fe22eff
Answer 4 links:
https://github.com/miracle2k/python-glob.../
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #path #filesystems #glob #fnmatch
ACCEPTED ANSWER
Score 1828
There are a couple of ways:
pathlib.Path().rglob()
Use pathlib.Path().rglob() from the pathlib module, which was introduced in Python 3.4.
from pathlib import Path
for path in Path('src').rglob('*.c'):
    print(path.name)
glob.glob()
If you don't want to use pathlib, use glob.glob():
from glob import glob
for filename in glob('src/**/*.c', recursive=True):
    print(filename)
By default, glob() does not match names beginning with a dot (.), such as hidden files and directories on Unix-based systems. If you need to match those, use the os.walk() solution below.
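To make the dot-file caveat concrete, here is a minimal sketch that builds a throwaway tree and lists its .c files both ways. The file names (main.c, .hidden.c, .cache/gen.c) and the helper names are made up for illustration, and the behaviour shown is glob's default.

```python
import fnmatch
import glob
import os
import tempfile

def c_files_with_glob(top):
    """Recursive glob; by default it skips names that begin with a dot."""
    pattern = os.path.join(top, '**', '*.c')
    return sorted(os.path.relpath(p, top) for p in glob.glob(pattern, recursive=True))

def c_files_with_walk(top):
    """os.walk() plus fnmatch.filter(); dot-prefixed names are not special here."""
    found = []
    for root, dirnames, filenames in os.walk(top):
        for name in fnmatch.filter(filenames, '*.c'):
            found.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(found)

def build_demo_tree(top):
    # Hypothetical layout: one visible file, one hidden file, one hidden directory.
    for rel in ('main.c', '.hidden.c', os.path.join('.cache', 'gen.c')):
        path = os.path.join(top, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()

with tempfile.TemporaryDirectory() as top:
    build_demo_tree(top)
    glob_hits = c_files_with_glob(top)
    walk_hits = c_files_with_walk(top)
    print('glob:', glob_hits)
    print('walk:', walk_hits)
```

In this sketch glob() should report only main.c, while the os.walk() version also picks up the hidden file and the file inside the hidden directory.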
os.walk()
For older Python versions, use os.walk() to recursively walk a directory and fnmatch.filter() to match against a simple expression:
import fnmatch
import os
matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))
This version may also be faster, depending on how many files you have, because the pathlib module adds some overhead compared with os.walk().
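Whether the speed difference matters depends on your tree and Python version, so it is worth measuring. The following is a rough sketch using timeit on a small generated tree. The helper names, the tree shape (20 directories of 10 files) and the repeat count are arbitrary assumptions, and no particular timing result is implied.

```python
import fnmatch
import os
import tempfile
import timeit
from pathlib import Path

def find_with_walk(top, pattern):
    """The os.walk() + fnmatch.filter() approach from the answer."""
    matches = []
    for root, dirnames, filenames in os.walk(top):
        for filename in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(root, filename))
    return matches

def find_with_rglob(top, pattern):
    """The pathlib approach, converted to strings for comparison."""
    return [str(path) for path in Path(top).rglob(pattern)]

with tempfile.TemporaryDirectory() as top:
    # Build a hypothetical tree: 20 directories, each with 10 .c files and one .txt file.
    for i in range(20):
        sub = Path(top, 'pkg%d' % i)
        sub.mkdir()
        for j in range(10):
            (sub / ('file%d.c' % j)).touch()
        (sub / 'notes.txt').touch()

    walk_result = sorted(find_with_walk(top, '*.c'))
    same = walk_result == sorted(find_with_rglob(top, '*.c'))
    match_count = len(walk_result)

    # Time each approach over a few repeats; absolute numbers vary by machine.
    walk_time = timeit.timeit(lambda: find_with_walk(top, '*.c'), number=5)
    rglob_time = timeit.timeit(lambda: find_with_rglob(top, '*.c'), number=5)

print('same results:', same, 'matches:', match_count)
print('os.walk: %.4fs  rglob: %.4fs' % (walk_time, rglob_time))
```

Run it against a directory that resembles your real data before drawing conclusions; a tree this small mostly measures fixed overhead.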
ANSWER 2
Score 123
Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:
import os, fnmatch
def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename
for filename in find_files('src', '*.c'):
    print('Found C source:', filename)
Also, using a generator allows you to process each file as it is found, instead of finding all the files first and then processing them.
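One practical consequence of the generator is that the caller can stop early. The sketch below repeats find_files() so that it is self-contained, then takes only the first two matches and checks whether any match exists at all. The temporary directory and file names are made up for illustration.

```python
import fnmatch
import itertools
import os
import tempfile

def find_files(directory, pattern):
    """Yield matching paths one at a time while walking the tree."""
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)

with tempfile.TemporaryDirectory() as top:
    for name in ('a.c', 'b.c', 'c.c', 'readme.txt'):
        open(os.path.join(top, name), 'w').close()

    # Take only the first two matches; the generator is never run to completion.
    first_two = list(itertools.islice(find_files(top, '*.c'), 2))

    # Ask whether at least one match exists, stopping at the first hit.
    has_c_source = next(find_files(top, '*.c'), None) is not None

print(len(first_two), has_c_source)
```

With a list-building version, both questions would require walking the whole tree first.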
ANSWER 3
Score 93
I've modified the glob module to support ** for recursive globbing, e.g.:
>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.h')
https://github.com/miracle2k/python-glob2/
Useful when you want to provide your users with the ability to use the ** syntax, and thus os.walk() alone is not good enough.
ANSWER 4
Score 79
Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:
from pathlib import Path
for file_path in Path('src').glob('**/*.c'):
    print(file_path)  # do whatever you need with these files
Update:
Starting with Python 3.5, the same syntax is also supported by glob.glob().
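As a quick check that the spellings agree, here is a small sketch that lists the same throwaway tree with Path.glob('**/*.c'), Path.rglob('*.c') and glob.glob(..., recursive=True). The src/lib layout and file names are assumptions made for the demo, and the tree deliberately contains no dot-prefixed names.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: src/main.c, src/lib/util.c, src/lib/util.h
    src = Path(top, 'src')
    (src / 'lib').mkdir(parents=True)
    (src / 'main.c').touch()
    (src / 'lib' / 'util.c').touch()
    (src / 'lib' / 'util.h').touch()

    # pathlib with an explicit ** wildcard
    via_path_glob = sorted(p.relative_to(src).as_posix() for p in src.glob('**/*.c'))

    # rglob(pattern) behaves like glob('**/' + pattern)
    via_rglob = sorted(p.relative_to(src).as_posix() for p in src.rglob('*.c'))

    # The glob module needs recursive=True for ** to span directories
    via_glob_module = sorted(
        Path(name).relative_to(src).as_posix()
        for name in glob.glob(os.path.join(str(src), '**', '*.c'), recursive=True)
    )

print(via_path_glob)
print(via_path_glob == via_rglob == via_glob_module)
```

All three should list lib/util.c and main.c here. Trees containing hidden files or directories are a separate matter.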