How to use glob() to find files recursively?
Full question
https://stackoverflow.com/questions/2186...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #path #filesystems #glob #fnmatch
ACCEPTED ANSWER
Score 1828
There are a couple of ways:
pathlib.Path().rglob()
Use pathlib.Path().rglob() from the pathlib module, which was introduced in Python 3.4.
from pathlib import Path
for path in Path('src').rglob('*.c'):
    print(path.name)
glob.glob()
If you don't want to use pathlib, use glob.glob():
from glob import glob
for filename in glob('src/**/*.c', recursive=True):
    print(filename)
glob() does not match names beginning with a dot (.) by default, so to find such files, for example hidden files on Unix-based systems, use the os.walk() solution below.
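The dot-file behaviour can be checked on a throwaway tree. This is a minimal sketch, assuming a temporary directory and made-up file names (main.c and .hidden.c); the helper names find_with_glob, find_with_walk and demo are hypothetical:

```python
import fnmatch
import os
import tempfile
from glob import glob


def find_with_glob(top):
    # '**' with recursive=True descends into subdirectories,
    # but '*' does not match names that start with a dot.
    pattern = os.path.join(top, '**', '*.c')
    return sorted(os.path.basename(p) for p in glob(pattern, recursive=True))


def find_with_walk(top):
    # fnmatch.filter() applies no special rule to leading dots.
    found = []
    for root, dirnames, filenames in os.walk(top):
        found.extend(fnmatch.filter(filenames, '*.c'))
    return sorted(found)


def demo():
    # Hypothetical layout: <tmp>/sub/main.c and <tmp>/sub/.hidden.c
    with tempfile.TemporaryDirectory() as top:
        sub = os.path.join(top, 'sub')
        os.mkdir(sub)
        for name in ('main.c', '.hidden.c'):
            open(os.path.join(sub, name), 'w').close()
        return find_with_glob(top), find_with_walk(top)


if __name__ == '__main__':
    via_glob, via_walk = demo()
    print('glob:', via_glob)
    print('walk:', via_walk)
```

Here the glob() result should contain only main.c, while the os.walk() result should contain both files.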
os.walk()
For older Python versions, use os.walk() to recursively walk a directory and fnmatch.filter() to match against a simple expression:
import fnmatch
import os
matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))
This version may also be faster when there are many files, since the pathlib module adds some overhead compared with os.walk().
ANSWER 2
Score 123
Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk has already listed the filenames:
import fnmatch
import os

def find_files(directory, pattern):
    # Walk the tree and yield each path whose basename matches the pattern.
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename

for filename in find_files('src', '*.c'):
    print('Found C source:', filename)
Also, using a generator allows you to process each file as it is found, instead of finding all the files and then processing them.
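One way to see the benefit of the generator is to stop early. This is a rough sketch, assuming a temporary directory with made-up file names; first_matches and demo are hypothetical helpers, and find_files is repeated so the snippet runs on its own:

```python
import fnmatch
import os
import tempfile
from itertools import islice


def find_files(directory, pattern):
    # Same generator shape as in the answer above.
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)


def first_matches(directory, pattern, limit):
    # islice() stops pulling from the generator after `limit` items,
    # so directories not yet reached are never walked.
    return list(islice(find_files(directory, pattern), limit))


def demo():
    # Hypothetical files, created only for this illustration.
    with tempfile.TemporaryDirectory() as top:
        for name in ('a.c', 'b.c', 'c.c', 'notes.txt'):
            open(os.path.join(top, name), 'w').close()
        return [os.path.basename(p) for p in first_matches(top, '*.c', 2)]


if __name__ == '__main__':
    print(demo())
```

Which two .c files come back depends on the order in which the operating system lists the directory.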
ANSWER 3
Score 93
I've modified the glob module to support ** for recursive globbing, e.g.:
>>> import glob2
>>> all_c_files = glob2.glob('src/**/*.c')
https://github.com/miracle2k/python-glob2/
Useful when you want to provide your users with the ability to use the ** syntax, and thus os.walk() alone is not good enough.
ANSWER 4
Score 79
Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:
from pathlib import Path
for file_path in Path('src').glob('**/*.c'):
    print(file_path)  # do whatever you need with these files
Update:
Starting with Python 3.5, the same syntax is also supported by glob.glob().
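To compare the two spellings side by side, here is a small sketch, assuming Python 3.5 or later and a temporary tree with made-up file names; compare and demo are hypothetical helpers. Dot-files are left out on purpose, since the two APIs may treat them differently:

```python
import os
import tempfile
from glob import glob
from pathlib import Path


def compare(top):
    # pathlib: '**' in Path.glob() is always recursive.
    via_pathlib = sorted(os.path.relpath(str(p), top)
                         for p in Path(top).glob('**/*.c'))
    # glob.glob(): '**' is only recursive when recursive=True is passed.
    via_glob = sorted(os.path.relpath(p, top)
                      for p in glob(os.path.join(top, '**', '*.c'),
                                    recursive=True))
    return via_pathlib, via_glob


def demo():
    # Hypothetical layout: a.c, sub/b.c, sub/deep/c.c and sub/readme.txt
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, 'sub', 'deep'))
        for rel in ('a.c',
                    os.path.join('sub', 'b.c'),
                    os.path.join('sub', 'deep', 'c.c'),
                    os.path.join('sub', 'readme.txt')):
            open(os.path.join(top, rel), 'w').close()
        return compare(top)


if __name__ == '__main__':
    via_pathlib, via_glob = demo()
    print(via_pathlib == via_glob, via_pathlib)
```

Without recursive=True, glob.glob() treats ** like a single * and would only look one directory level down.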