We are all familiar with Excel spreadsheets. They are used everywhere to make tables and keep track of stuff in general. Excel offers nice functionalities as well. When we are dealing with a huge number of entries, manipulating these values manually becomes tedious. Consider a situation where there are 1000 rows and 25 columns in a spreadsheet. You want to check the values in the 8th column of each row. Depending on those values, you want to manipulate the values in the 17th column of each row. Or perhaps we want to get all the rows with the same name together. If it were a few rows and columns, we could have done it manually. But doing this for 1000 rows is quite boring. So how do we do it?
A CSV (Comma Separated Value) file stores tabular data in text form separated by commas. It’s a plain text file containing numbers, strings, special characters etc and they are all separated by commas. In this situation, the comma is called a delimiter. You can use other delimiters as well (like spaces, tabs etc), but commas are more frequently used. The beauty of this is that it’s very easy to manipulate. Another good thing is that Excel can easily read CSV files. They are stored using .csv extension. It looks just like a regular spreadsheet when you open it in Excel.
Consider the earlier case with 1000 rows. If you have a .xlsx file, you can use the “Save As” option and save it as a .csv file. Once you have that, you can manipulate it using Python. Python has a module named “csv”. You can just import it and it provides necessary functions to read and write csv files.
The code below reads data from the spreadsheet and prints it.
import csv file_reader = csv.reader(open('input.csv', 'rb'), delimiter=',') for row in file_reader: print row, ", "
The code below writes data onto a spreadsheet.
import csv file_writer = csv.writer(open('output.csv', 'wb'), delimiter=',') file_writer.writerow(["Name", "Profession", "Age"]) file_writer.writerow(["Adam", "Engineer", "28"])
You can put the writerow function call in a loop if you have large amount of data to be written. Note that the input argument to ‘writerow’ has to be a Python list.
Once you read the data from a spreadsheet, you can manipulate it very easily. Python provides a big list of features to manipulate data. You can access the smallest details of the input data, manipulate it and put it into an output file.