How can I search sub-folders using glob.glob()?
--
Full question
https://stackoverflow.com/questions/1479...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #filesystems #glob #fnmatch
ACCEPTED ANSWER
Score 250
In Python 3.5 and newer, use the recursive **/ functionality:
import glob
configfiles = glob.glob('C:/Users/sam/Desktop/file1/**/*.txt', recursive=True)
When recursive is set, ** followed by a path separator matches 0 or more subdirectories.
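To make the "0 or more subdirectories" rule concrete, here is a minimal sketch that builds a throwaway directory tree and compares the same pattern with and without recursive=True. The file and folder names are invented for illustration and are not from the question.

```python
import glob
import os
import tempfile

# Build a small temporary tree; all names below are hypothetical.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", "a/mid.txt", "a/b/deep.txt", "a/skip.log"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    pattern = os.path.join(root, "**", "*.txt")

    def relative(paths):
        # Report paths relative to root, with forward slashes on every OS.
        return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in paths)

    # With recursive=True, ** spans zero or more directory levels.
    recursive_hits = relative(glob.glob(pattern, recursive=True))
    # Without it, ** behaves like a single *, i.e. exactly one level.
    flat_hits = relative(glob.glob(pattern))

print(recursive_hits)
print(flat_hits)
```

Note that top.txt is matched only in the recursive case, because that is where ** is allowed to match zero directories.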
In earlier Python versions, glob.glob() cannot list files in subdirectories recursively.
In that case I'd use os.walk() combined with fnmatch.filter() instead:
import os
import fnmatch
path = 'C:/Users/sam/Desktop/file1'
configfiles = [os.path.join(dirpath, f)
               for dirpath, dirnames, files in os.walk(path)
               for f in fnmatch.filter(files, '*.txt')]
This walks your directories recursively and returns the full path of every matching .txt file (absolute here, because path is absolute). In this specific case fnmatch.filter() may be overkill; a simple .endswith() test also works:
import os
path = 'C:/Users/sam/Desktop/file1'
configfiles = [os.path.join(dirpath, f)
               for dirpath, dirnames, files in os.walk(path)
               for f in files if f.endswith('.txt')]
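If you need the os.walk() approach in more than one place, it can be wrapped in a small generator. This is only a sketch: find_files is a hypothetical helper name, and the temporary tree exists purely to exercise it.

```python
import fnmatch
import os
import tempfile

def find_files(root, pattern="*.txt"):
    """Yield paths under root whose file name matches pattern (hypothetical helper)."""
    for dirpath, dirnames, files in os.walk(root):
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(dirpath, name)

# Exercise the helper on a throwaway tree with invented names.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", "sub/b.txt", "sub/c.log"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    found = sorted(
        os.path.relpath(p, root).replace(os.sep, "/") for p in find_files(root)
    )

print(found)
```

Because it is a generator, the helper yields matches as the walk proceeds rather than building the whole list first.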
ANSWER 2
Score 28
To find files in immediate subdirectories:
configfiles = glob.glob(r'C:\Users\sam\Desktop\*\*.txt')
For a recursive version that traverses all subdirectories, use ** and pass recursive=True (available since Python 3.5):
configfiles = glob.glob(r'C:\Users\sam\Desktop\**\*.txt', recursive=True)
Both calls return lists. To get paths one at a time, use glob.iglob() instead, or use pathlib:
from pathlib import Path
path = Path(r'C:\Users\sam\Desktop')
txt_files_only_subdirs = path.glob('*/*.txt')
txt_files_all_recursively = path.rglob('*.txt') # including the current dir
Both methods return iterators (you can get paths one by one).
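The difference between the two pathlib patterns, and the lazy behaviour of glob.iglob(), can be checked on a small temporary tree. This is a sketch with invented file names, not part of the original answer.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    for rel in ("top.txt", "sub/one.txt", "sub/deep/two.txt"):
        (root / rel).touch()

    # '*/*.txt' matches files exactly one directory below root.
    only_subdirs = sorted(p.relative_to(root).as_posix() for p in root.glob("*/*.txt"))
    # rglob('*.txt') matches at every depth, including root itself.
    all_recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

    # glob.iglob() returns an iterator, so paths can be consumed one at a time.
    lazy = glob.iglob(os.path.join(tmp, "**", "*.txt"), recursive=True)
    first = next(lazy)
    remaining = list(lazy)  # drain the iterator before the tree is deleted

print(only_subdirs)
print(all_recursive)
```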
ANSWER 3
Score 17
The glob2 package supports wildcards and is reasonably fast:
import timeit
code = '''
import glob2
glob2.glob("files/*/**")
'''
timeit.timeit(code, number=1)
On my laptop it takes approximately 2 seconds to match more than 60,000 file paths.
ANSWER 4
Score 9
You can use Formic with Python 2.6:
import formic
fileset = formic.FileSet(include="**/*.txt", directory="C:/Users/sam/Desktop/")
Disclosure - I am the author of this package.