If you’re building a simple internal app and you might probably be thinking that ‘I’m going to need a database now right!’. Well, Not so fast!
As you all are familiar with importing, exporting and manipulating comma separate files (CSV) using Python, Hereby in this article I’m going to show you the step by step guide to access Google Spreadsheets on the cloud using Python.
As the very first thing, go to Google API Manager by simply googling it and go to https://console.developers.google.com/
To kick things off first create a new project.
I’ll name my project as ‘Telemedicine’ since we will be working with a spreadsheet which includes all the tweets related to Telemedicine hashtags which I extracted earlier (Click here to see how I extracted tweets using hashtags). Define a suitable project name according to your dataset on the cloud then click CREATE to initiate the project. (You don’t have to worry about the Location* below the project name)
OKAY. The first part is done, Now go to API Library and search for Google Drive.
Then add Google Drive API to our project which will allow us to access spreadsheets inside of Google Sheets for our account.
Once that’s added, we need to create some credentials to access the API so click on Add Credentials on the next screen you see after enabling the API.
Since we’ll be accessing the API using a web server, We’ll add the Web Server option on this page and give access to Application Data and tell them that you’re not running your application on either GCE or GAE by selecting the option ‘No, I’m not using them’ then click on the button below.
Next, we will create a service account named Employees and assigned it the role Project Editor which will allow it to access and edit all the data within the API. Clicking continue will generate a JSON file that I will rename and add to the project as Telemedicine_secret.json.
Then open the JSON file in a text editor (I prefer Atom) :)
Inside the file, you can locate an email address property called “client_email”, if we copy that and take it over to our spreadsheet on the cloud, we can share that particular spreadsheet with the email address we provide to give us access to it from the API.
Reading spreadsheet data with Python
Let’s move into the terminal to install gspread and oauth2client packages and wait till all components get installed.
$ pip install gspread oauth2client
Then I’m going to create a new python file called spreadsheet.py in my favorite editor ATOM and write the following code to import gspread and ServiceAccountCredentials from oauth2client .
import gspread
from oauth2client.service_account import ServiceAccountCredentials
- Then, we have to define the scope and create credentials using that scope and the content of employees_secret.json file.
- Then I’ll create a gspread client authorizing it using those credentials.
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('employees_secret.json', scope)
client = gspread.authorize(creds)
3. NOW, We can access our google sheets so we’ll call a client.open and pass it the spreadsheet name and getting access to sheet1
sheet = client.open('telemedicine_data').sheet1
4. Now we can set our employees equal to all of the records inside that sheet and print them out to the terminal.
telemedicine = sheet.get_all_records()
print(telemedicine)
Let’s go the terminal try to run our program and we’ll get a glorious list of perfectly formatted content like in the image below (See the highlighted text and you can find all the columns in our dataset).
python spreadsheet.py
Whoa! It wasn’t what you’re expecting, wasn’t it 😂? Well, trust me I got the perfect solution for that!
We can clean up the result by using pprint module, using that we can create a prettyprinter that we can use to display the result and its a much nicer way to display the output.
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pprint
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('employees_secret.json', scope)
client = gspread.authorize(creds)
sheet = client.open('employee_reviews').sheet1
pp = pprint.PrettyPrinter()
employees = sheet.get_all_records()
pp.pprint(employees)
If you want to go through the full documentation of gspread click here and in the meantime I’ll follow you up with these cool tricks.
Filtering Data
#to get all the values inside the file
sheet.get_all_values()
#to get exact row values in a second row (Since 1st row is the header)
sheet.row_values(2)
#to get all the column values in the column 'place'
sheet.col_values(16)
#to extract a particular cell value
sheet.cell(1, 1).value
OK! Reading is all done. YES! But, Can we do more? Oh Yes!
Insert / Update and Delete from Spreadsheet
And also, you can enter a random row and can give values to each cell by separating each word using double quotations.
row = ["I'm","inserting","a","new","row","into","a,","Spreadsheet","using","Python"]
index = 3
sheet.insert_row(row, index)
Furthermore, It’s also possible to alter data inside a specific cell. Look at how it is done, I’ll change the header column name ‘id’ to ‘telemedicine_id’:
sheet.update_cell(1, 1, "telemedicine_id")
Finally, In conclusion to this article, I’m going to wrap up by proving that it is not essential to create databases for simple internal apps. Spreadsheets might be your best answer!
You can access the full code here. (If the link doesn't work by any chance, leave a comment below I'll follow up)
Keen on getting to know me and my work? Click here for more!
Here are some articles you might find interesting:
Text Mining and Natural Language Processing Here's the perfect Beginner Guide for NLP: Random Updates…www.jayasekara.blog
0 Comments