I'm trying to do a one-way directory sync. Given a list of existing files in a dir, I'd like to make the files in the dir equal a new list of files. There will be subdirs under the dir. Because operating system calls are expensive, I'd rather minimize the number needed.
It's easy to just delete each file that is on the existing list but not on the new list, but that could leave empty subdirs. I could test for empty subdirs with OS calls, but as noted I'd like to avoid that. Similarly, I'd prefer removing a whole dir in one step to first removing each file in the dir and then removing the empty dir.
I'm just operating on file names, not checking whether two files with the same name are the same or actually copying or deleting files or directories.
'''
Input:
- list of existing files
- revised list of files
Output:
- lists to be used to transform first list to second list
-- list of files to be added to existing dir
-- list of directories to be pruned
-- list of files to be deleted
'''
import os
import sys


def file_to_list(file):
    # Read one path per line, skipping '#EXT' directive lines.
    with open(file, 'r') as f:
        return [x.strip() for x in f if not x.startswith('#EXT')]


def one_minus_two(one, two):
    # Items of `one` that are not in `two`, in their original order.
    return [x for x in one if x not in set(two)]


def reduce_dirs(delete_files, new_list):
    # For each file to delete, find the shortest leading path that no
    # entry in new_list starts with; that path can be removed as a whole.
    new_delete = []
    for file in delete_files:
        parts = file.split('\\')
        sub = ''
        for i in range(len(parts)):
            sub = os.path.join(sub, parts[i])
            if sub == '':
                sub = '\\'
            count = 0
            for song in new_list:
                if song.startswith(sub):
                    count += 1
                    break
            if count == 0:
                new_delete.append(sub)
                break
    return list(set(new_delete))


def reduce_files(remove_dirs, delete_files):
    # Split the reduced list into plain files (rf) and directories (rd).
    rd = []
    rf = []
    for dir in remove_dirs:
        if dir in delete_files:
            rf.append(dir)
        else:
            rd.append(dir)
    return rf, rd


def main():
    old_file = sys.argv[1]
    new_file = sys.argv[2]
    old_list = file_to_list(old_file)
    new_list = file_to_list(new_file)
    add_files = one_minus_two(new_list, old_list)
    print('add_files', add_files)
    delete_files = one_minus_two(old_list, new_list)
    print('\nraw delete list', delete_files)  # intermediate result
    remove_items = reduce_dirs(delete_files, new_list)
    print('\nreduced delete list', remove_items)  # intermediate result
    rf, rd = reduce_files(remove_items, delete_files)
    print('\ndelete files', rf)
    print('\nprune dirs', rd)


if __name__ == '__main__':
    main()
Sample list of existing files (old_files):
\dir\who\tommy\song1
\dir\who\tommy\song2
\dir\who\tommy\song3
\dir\rolling\beggars\song4
\dir\rolling\beggars\song5
\dir\rolling\beggars\song6
\dir\who\next\song7
\dir\who\next\song8
\dir\who\next\song9
\dir\pink\dark\song10
\dir\pink\dark\song11
\dir\pink\dark\song12
\dir\bach\orch\fugue\song13
\dir\bach\orch\fugue\song14
\dir\bach\orch\fugue\song15
Sample list of new_files:
\dir\rolling\beggars\song4
\dir\rolling\beggars\song5
\dir\rolling\beggars\song6
\dir\pink\dark\song10
\dir\pink\dark\song11
\dir\yes\closer\song16
\dir\yes\closer\song17
\dir\yes\closer\song18
\dir\king\court\song2
\dir\king\court\song4
\dir\king\court\song6
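For comparison, the same three output lists can probably be derived without scanning new_list for every path prefix. The following is only a sketch, assuming Python 3 and Windows-style paths like the samples; `plan_sync` is a hypothetical name that is not part of the script above. The idea is to collect every directory that must survive (all ancestors of every new file) into a set, then for each file to delete pick its topmost ancestor that is not in that set.

```python
from pathlib import PureWindowsPath


def plan_sync(old_files, new_files):
    """Hypothetical helper: return (add_files, prune_dirs, remove_files)."""
    old_set, new_set = set(old_files), set(new_files)
    add_files = [f for f in new_files if f not in old_set]
    delete_files = [f for f in old_files if f not in new_set]

    # Every directory that must survive: all ancestors of every new file.
    keep_dirs = set()
    for f in new_files:
        keep_dirs.update(str(p) for p in PureWindowsPath(f).parents)

    prune_dirs, remove_files = set(), []
    for f in delete_files:
        # Ancestors ordered from the root down; the first one that is not
        # kept is the topmost directory that can be pruned as a whole.
        ancestors = reversed(list(PureWindowsPath(f).parents))
        doomed = [str(p) for p in ancestors if str(p) not in keep_dirs]
        if doomed:
            prune_dirs.add(doomed[0])
        else:
            remove_files.append(f)  # its directory survives, delete the file only
    return add_files, sorted(prune_dirs), remove_files


# The sample lists from the question.
old = [r'\dir\who\tommy\song1', r'\dir\who\tommy\song2', r'\dir\who\tommy\song3',
       r'\dir\rolling\beggars\song4', r'\dir\rolling\beggars\song5',
       r'\dir\rolling\beggars\song6', r'\dir\who\next\song7',
       r'\dir\who\next\song8', r'\dir\who\next\song9',
       r'\dir\pink\dark\song10', r'\dir\pink\dark\song11',
       r'\dir\pink\dark\song12', r'\dir\bach\orch\fugue\song13',
       r'\dir\bach\orch\fugue\song14', r'\dir\bach\orch\fugue\song15']
new = [r'\dir\rolling\beggars\song4', r'\dir\rolling\beggars\song5',
       r'\dir\rolling\beggars\song6', r'\dir\pink\dark\song10',
       r'\dir\pink\dark\song11', r'\dir\yes\closer\song16',
       r'\dir\yes\closer\song17', r'\dir\yes\closer\song18',
       r'\dir\king\court\song2', r'\dir\king\court\song4',
       r'\dir\king\court\song6']

add, prune, remove = plan_sync(old, new)
print('add files', add)
print('prune dirs', prune)      # ['\\dir\\bach', '\\dir\\who']
print('delete files', remove)   # ['\\dir\\pink\\dark\\song12']
```

One caveat with this sketch: if the new list is empty, nothing is kept and the root itself (`\`) ends up on the prune list, so a real script would likely want to guard against that.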
There are likely cases I'm ignoring with these simple examples.
I have the feeling I'm reinventing the wheel here.
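One case the simple samples do not exercise is a name that is a string prefix of another name. Because reduce_dirs compares paths with `startswith`, a file such as `song1` that should be deleted appears to be treated as still present whenever `song10` is on the new list, and the same would apply to directories like `who` and `whoever`. A minimal sketch of the problem and of a component-wise check, assuming Python 3 and Windows-style paths; `is_under` is a hypothetical helper, not part of the script above:

```python
from pathlib import PureWindowsPath

doomed = r'\dir\pink\dark\song1'   # on the old list only
kept = r'\dir\pink\dark\song10'    # on the new list

# Plain string prefix test: reports a match although these are different files.
print(kept.startswith(doomed))  # True


def is_under(path, ancestor):
    """True if `ancestor` is `path` itself or one of its parent directories.

    Compares whole path components, so 'song1' does not match 'song10'.
    Note that PureWindowsPath compares case-insensitively.
    """
    p, a = PureWindowsPath(path), PureWindowsPath(ancestor)
    return a == p or a in p.parents


print(is_under(kept, doomed))        # False
print(is_under(kept, r'\dir\pink'))  # True
```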
shutil.rmtree(path)
... but I guess that would slow things down, so it is still not a viable option. 2) You could replace your one_minus_two with this: set(one) - set(two)
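To illustrate the second suggestion: the set difference gives the same members, but as an unordered set. If the order of the input list matters, a middle ground is to build the set once and keep the list comprehension, since the original version rebuilds `set(two)` for every item of `one`. A small sketch, assuming Python 3 and made-up sample values:

```python
def one_minus_two(one, two):
    """Items of `one` that are not in `two`, keeping the order of `one`."""
    exclude = set(two)  # built once instead of once per item
    return [x for x in one if x not in exclude]


old = ['a', 'b', 'c', 'd']
new = ['d', 'b', 'e']

print(set(old) - set(new) == {'a', 'c'})  # True; the result is a set, so order is lost
print(one_minus_two(old, new))            # ['a', 'c']
print(one_minus_two(new, old))            # ['e']
```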