The Python Oracle

How to use glob() to find files recursively?



Full question
https://stackoverflow.com/questions/2186...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #path #filesystems #glob #fnmatch




ACCEPTED ANSWER

Score 1828


There are a couple of ways:

pathlib.Path().rglob()

Use pathlib.Path().rglob() from the pathlib module, which was introduced in Python 3.4.

from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)

glob.glob()

If you don't want to use pathlib, use glob.glob():

from glob import glob

for filename in glob('src/**/*.c', recursive=True):
    print(filename)

To match files whose names begin with a dot (.), such as hidden files on Unix-based systems, use the os.walk() solution below: a glob() wildcard such as * does not match a leading dot by default.
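The difference in dot-file handling can be checked with a small experiment. The following is a minimal sketch, assuming a throwaway directory with made-up file names (main.c and sub/.hidden.c); the helpers find_with_glob and find_with_walk are hypothetical names used only for this illustration.

```python
import fnmatch
import os
import tempfile
from glob import glob


def find_with_glob(top, pattern):
    """Recursive glob; '*' does not match names starting with a dot."""
    return sorted(glob(os.path.join(top, '**', pattern), recursive=True))


def find_with_walk(top, pattern):
    """os.walk + fnmatch; fnmatch gives a leading dot no special treatment."""
    matches = []
    for root, _dirnames, filenames in os.walk(top):
        for filename in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(root, filename))
    return sorted(matches)


# Hypothetical layout, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'sub'))
    for rel in ('main.c', os.path.join('sub', '.hidden.c')):
        open(os.path.join(top, rel), 'w').close()

    glob_names = [os.path.relpath(p, top) for p in find_with_glob(top, '*.c')]
    walk_names = [os.path.relpath(p, top) for p in find_with_walk(top, '*.c')]

print(glob_names)  # only the visible file
print(walk_names)  # the visible file and the dot-file
```

Here the glob-based helper reports only main.c, while the os.walk-based helper also reports the dot-file.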

os.walk()

For older Python versions, use os.walk() to recursively walk a directory and fnmatch.filter() to match against a simple expression:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))

This version may also be faster when there are many files, since the pathlib module carries some overhead compared with os.walk().




ANSWER 2

Score 123


Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:

import os, fnmatch


def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename


for filename in find_files('src', '*.c'):
    print('Found C source:', filename)

Also, using a generator allows you to process each file as it is found, instead of finding all the files first and then processing them.
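To illustrate the point about generators, here is a minimal sketch that stops after the first two matches. It repeats find_files so that it runs on its own; the temporary directory and its file names are assumptions made up for the demo.

```python
import fnmatch
import itertools
import os
import tempfile


def find_files(directory, pattern):
    """Yield matching paths one at a time while walking the tree."""
    for root, _dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                yield os.path.join(root, basename)


# Hypothetical files, created only for this example.
with tempfile.TemporaryDirectory() as top:
    for name in ('a.c', 'b.c', 'c.c', 'notes.txt'):
        open(os.path.join(top, name), 'w').close()

    # islice pulls just two items; the generator is not advanced
    # any further once they have been produced.
    first_two = list(itertools.islice(find_files(top, '*.c'), 2))

print(len(first_two))
```

Because find_files is lazy, the caller decides how much of the walk actually happens; a version that builds a full list would always visit every directory first.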




ANSWER 3

Score 93


I've modified the glob module to support ** for recursive globbing, e.g.:

>>> import glob2
>>> all_c_files = glob2.glob('src/**/*.c')

https://github.com/miracle2k/python-glob2/

This is useful when you want to let your users write ** patterns themselves, in which case os.walk() alone is not enough.
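On Python 3.5 and later, the same user-facing feature can probably be had without glob2. The sketch below swaps in the standard library's glob.iglob() with recursive=True; it is not glob2's API. expand_user_pattern is a hypothetical helper, and the directory layout is invented for the demo.

```python
import glob
import os
import tempfile


def expand_user_pattern(base_dir, pattern):
    """Expand a user-supplied pattern (may contain **) relative to base_dir."""
    # Escape the base directory so only the user's pattern is treated as a glob.
    full_pattern = os.path.join(glob.escape(base_dir), pattern)
    return sorted(
        os.path.relpath(path, base_dir)
        for path in glob.iglob(full_pattern, recursive=True)
    )


# Hypothetical layout, created only for this example.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, 'src', 'deep'))
    for parts in (('src', 'a.c'), ('src', 'deep', 'b.c'), ('src', 'deep', 'b.h')):
        open(os.path.join(base, *parts), 'w').close()

    recursive_hits = expand_user_pattern(base, 'src/**/*.c')
    shallow_hits = expand_user_pattern(base, 'src/*.c')

print(recursive_hits)
print(shallow_hits)
```

The pattern string is passed through unchanged, so the user controls whether the search is recursive simply by including ** or not.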




ANSWER 4

Score 79


Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:

from pathlib import Path

for file_path in Path('src').glob('**/*.c'):
    print(file_path)  # do whatever you need with these files

Update: Starting with Python 3.5, the same syntax is also supported by glob.glob() when it is called with recursive=True.
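As a quick check that these spellings agree, here is a minimal sketch comparing Path.glob('**/*.c'), Path.rglob('*.c') and glob.glob(..., recursive=True) on the same tree. The src layout is an assumption created in a temporary directory for the demo.

```python
import glob
import tempfile
from pathlib import Path

# Hypothetical layout, created only for this example.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'src'
    (src / 'lib').mkdir(parents=True)
    (src / 'main.c').touch()
    (src / 'lib' / 'util.c').touch()
    (src / 'README').touch()

    # Normalise every result to a POSIX-style path relative to src.
    via_path_glob = sorted(
        p.relative_to(src).as_posix() for p in src.glob('**/*.c')
    )
    via_rglob = sorted(
        p.relative_to(src).as_posix() for p in src.rglob('*.c')
    )
    via_glob_module = sorted(
        Path(name).relative_to(src).as_posix()
        for name in glob.glob(str(src / '**' / '*.c'), recursive=True)
    )

print(via_path_glob)
print(via_rglob == via_path_glob and via_glob_module == via_path_glob)
```

All three find both main.c and lib/util.c; the main practical difference is that the pathlib methods yield Path objects while glob.glob() returns strings.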