This program finds and truncates any cell in a CSV file that contains more than a configured number of characters.
The files can be pretty large, and as a beginner I would like to know whether it is written correctly and whether I could make it more efficient. One of the lines is also more than 80 characters long, which bothers me.
```python
import configparser, csv, sys

if len(sys.argv) < 3:
    usage = """Usage: %s [input file] [output file]
This program requires 2 arguments to function properly.
[input file] is the file to clean
[output file] is the name of the file that will be created as a result of this program
"""
    print(usage % sys.argv[0])
else:
    # reads the config file
    config = configparser.ConfigParser()
    config.read('csv_cleaner.ini')
    config = config['CONFIG']
    encoding = config['character_encoding']
    size = int(config['truncation_size'])
    # opens target file and creates the receiving one
    with open(sys.argv[1], 'r', newline='', encoding=encoding) as csv_file, \
         open(sys.argv[2], 'x', newline='', encoding=encoding) as output_file:
        # helps with parsing
        if config['detect_dialect']:
            dialect = csv.Sniffer().sniff(csv_file.read(2048))
            dialect.escapechar = '\\'
        # return to beginning of file
        csv_file.seek(0)
        # creates reader and writer
        reader = csv.reader(csv_file, dialect)
        dialect.delimiter = config['delimiter_in_output']
        writer = csv.writer(output_file, dialect)
        # loops through file's lines
        for row in reader:
            # slices cells and loops through line's columns
            row = [col[:(size - 2)] + (col[(size - 2):] and '..') for col in row]
            # writes in new file
            writer.writerow(row)
```
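The over-long line is the cell-slicing one. One idea I had is pulling it into a small helper, which also makes the `and` trick easier to follow; a minimal sketch (the helper name is my own):

```python
def truncate_cell(col, size):
    # Keep the first size-2 characters of the cell. col[size - 2:] is an
    # empty string (falsy) when nothing was cut off, so the `and`
    # expression appends '..' only when the cell was actually truncated.
    return col[:size - 2] + (col[size - 2:] and '..')

# A truncated cell ends with '..' and fits exactly in `size` characters;
# a short cell is returned unchanged.
print(truncate_cell("abcdefgh", 5))  # abc..
print(truncate_cell("ab", 5))        # ab
```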
This program uses a config file:
```ini
[CONFIG]
character_encoding = UTF-8
delimiter_in_output = ;
#set this option to False if the file is not recognized
detect_dialect = True
truncation_size = 255
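As an aside, I am not sure my `if config['detect_dialect']:` test behaves as I intend, since `configparser` hands back strings and any non-empty string is truthy. A minimal sketch of what I mean (the section content here is just for illustration):

```python
import configparser

# ConfigParser values are strings, so config['detect_dialect'] is the
# string 'False', which is still truthy. getboolean() parses it properly.
parser = configparser.ConfigParser()
parser.read_string("""
[CONFIG]
detect_dialect = False
""")
section = parser['CONFIG']
print(bool(section['detect_dialect']))       # True  -- non-empty string
print(section.getboolean('detect_dialect'))  # False -- parsed as a boolean
```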