Tuesday, August 11, 2020

google cloud services initialization (cloud, colab, sites, dns, ai)

Some of Google's web services have their own names but are tied together through GCP (Google Cloud Platform) and/or a specific GCP API. GCP sits at the top, but a person can enter through one of the lesser services and remain unaware of the larger picture. Below GCP come related but semi-independent services: Colab, Sites, AI. Each of these, in turn, might rely on just a GCP API, or sit under yet another service. For example, Colab is tied into GCP, but a person can get started with it through Drive without knowing its larger role. When trying to budget, it's a wide landscape to understand exactly what one is being charged for, and under which service.

Static Google Cloud site (9:51) Stuffed Box, 2019. Starting at the top and getting a simple static (no database) website rolling. One must already have purchased a DNS name.
Hosting on Cloud via a Colab project (30:32) Chris Titus Tech, 2019. This is a bit dated, so prices have come up, but it shows how it's done.
Hosting a Pre-packed Moodle stack (9:51) Arpit Argawal, 2020. Shows the value of a notebook VM in Colab

Google's iPython front-end, Colab, takes the Jupyter concept one step further, placing configuration and computing on a web platform. Customers don't have to configure an environment on their laptop; everything runs in the Google-sphere, and there are several APIs (and TensorFlow) that Google makes available.

During 2020, the documentation was a little sparse, so I made a post here, but now there are more vids and it's easier to see how we might have different notebooks running on different servers, across various Google products. This could also include something where we want to run a stack, e.g. for a Moodle. If all this seems arcane, don't forget we can host traditionally through Google Domains. What's going to be interesting is how blockchain changes the database concept in something like Moodle. Currently, blockchain is mostly for smart contracts and DApps.

Colab

Notebooks are created, run, and saved via the Drive menu, or directly at colab.research.google.com. Users don't need a Google Cloud account to use Colab. The easiest way to access Colab is to connect it to one's Drive account, where it will save files anyway. Open Drive, click the "+" sign to create a new file, and go down to "More". Connect Colab and, from then on, Colab notebooks can be created and accessed straight from Drive.

There's a lot you can do with a basic Colab account, if you have a good enough internet connection to keep up with it. The Pro version is another conversation. I often prefer to strengthen Colab projects by adding interactivity with Cloud.

GUI Creation in Google Colab (21:31) AI Science, 2020. Basics for opening a project and getting it operational.
Blender in Colab (15:28) Micro Singularity, 2020. Inadvertently explains an immense amount about Colab, Python, and Blender.

Colab and Google Cloud

Suppose one writes Python for Colab that needs to call a Google API at some point. Or suppose a person wants to run a notebook on a VM that they customized. These are the two added strengths of adding Cloud: 1) make a VM (website) to craft a browser project, 2) add Google API calls. Google Cloud requires a credit card.

Cloud and Colab can be run separately, but fusing them is good in some cases. Understanding the process lets users know when to rely on Colab alone, on Google Cloud alone, or on the two together.

Colab vs. Google Cloud (9:51) Arpit Argawal, 2020. Shows the value of a notebook VM in Colab
Hosting on Cloud via a Colab project (30:32) Chris Titus Tech, 2019. This is a bit dated, so prices have come up, but it shows how it's done.

Note the Google Cloud Platform homepage above. The menu on the left is richer than the one in the Colab screenshot higher above. We run the risk of being charged for some of these features, so Google displays estimated charges before we submit requests to use Google APIs.

API credentials

We might want to make calls to Cloud's APIs. Say a Colab notebook requires a Google API call, for example to send some text for translation into another language. The user switches to their Cloud account and enables the Google API for translation. Google gives them an estimate of what calls to that API will cost. The user accepts the estimate, and then Google provides the API credentials as a JSON key, which is then referenced from the Colab notebook. When the Colab notebook runs, it can then make the calls to the Google API. Protect such credentials; we don't want others running up charges against our credit card.
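As a sketch of how wiring up the key might look in a notebook cell (the key-file path here is a placeholder of my own, and the commented-out translate calls are only illustrative, not a guaranteed exact flow):

```python
import os

# Hypothetical path: in Colab you would upload your own service-account
# JSON key (e.g. via the file pane) and point the client libraries at it.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/content/my-project-key.json"

# With the variable set, a Google client library picks the key up
# automatically, e.g. (requires google-cloud-translate to be installed):
#
#   from google.cloud import translate_v2
#   client = translate_v2.Client()
#   result = client.translate("Hello, world", target_language="es")
#   print(result["translatedText"])
```

Setting the environment variable (rather than pasting the key's contents into a cell) also keeps the secret out of the notebook text itself.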

Cloud account VM details

When running notebooks, if you update something, it can be unclear whether the update happened on your own machine or on Google's server. This is more transparent on Google Cloud.

API dependencies

When a person first opens a Colab notebook, it's on a Google server, and the path for the installed Python is typically /usr/local/lib/python/[version]. So I start writing code cells, importing and updating API dependencies. Google will update all the dependencies on the virtual machine it creates for your project on the server. ALLEGEDLY.
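A quick way to see which Python the notebook's VM is actually running, and where its packages live (plain standard-library calls, so this should work in any Colab cell):

```python
import sys
import site

print(sys.version.split()[0])   # the interpreter version on the VM
print(sys.executable)           # path to that interpreter
print(site.getsitepackages())   # where pip installs packages
```

The last line is the folder family that matters for the version-conflict puzzle below.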

Suppose I want to use google-cloud-texttospeech. Then the way to update its dependencies (supposedly):

%pip install --upgrade google-cloud-texttospeech

Users can observe all the file updates necessary for the API, unless they add the "--quiet" flag to suppress the output. However, even after this process is undertaken, when the API itself is called, there can be dependency version conflicts between Python and iPython.

Note that in the case above the code exits with a "ContextualVersionConflict" listing a 1.16 version detected in the given folder. (BTW, this folder is on the virtual machine, not on one's home system.) Yet the initial upgrade command AND a subsequent "freeze" command show the installed version as 1.22. How can we possibly clear this when pip has recorded 1.22 as installed, but the API detects version 1.16? Why are they looking in different folders? Where are they?

solution: restart the runtime

Python imports, iPython does not (page) Stack Overflow, 2013. Notes this is a problem with sys.path.

You'd think, of course, that there's a problem with sys.path, and to obviate *that* problem, I now explicitly import sys and make sure of the path in the first two commands...

import sys
# make sure Colab's dist-packages folder is on the module search path
sys.path.append('/usr/local/lib/python3.6/dist-packages/')

... in spite of the fact that these are probably already handled by Colab. No, the real problem, undocumented anywhere I could find, is that one simply has to restart the runtime after updating the packages. Apparently, if the runtime is not restarted, the newer version stamp is not reread by the API.
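One way to double-check which version stamp Python actually sees after the restart is the standard library's importlib.metadata (Python 3.8+; on older runtimes, pkg_resources.get_distribution serves the same purpose). The package name "pip" below is just a stand-in for whatever API package you upgraded:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the version Python's import machinery sees, or None."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

print(installed_version("pip"))
print(installed_version("no-such-package"))  # None for anything not installed
```

If this reports the old version even after a `%pip install --upgrade`, the runtime has not been restarted yet.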

what persists and where is it?

basic checklist

  • Log in to Colab.
  • Select the desired project.
  • Run the update cell.
  • Restart the runtime.
  • Proceed to subsequent cells.
  • Gather any rendered output files from the folder icon to the left (/content folder). Alternatively, one can hard-code the transfer into the script so the files download to one's system:
    from google.colab import files
    files.download("data.csv")
    There might also be a way to have these sent to Google Drive instead.
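As a minimal sketch of that last step, here is a plain-Python cell that produces an output file (which would land under /content in Colab); the Colab-only transfer calls are commented out since they only run inside a notebook, and the file name data.csv is just the same example name as above:

```python
import csv
import os

# Write a small CSV into the current working directory
# (in a Colab notebook this is /content).
rows = [["name", "value"], ["alpha", 1], ["beta", 2]]
with open("data.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

print(os.path.abspath("data.csv"))

# Colab-only ways to get the file out:
#   from google.colab import files
#   files.download("data.csv")        # browser download to one's system
#   from google.colab import drive
#   drive.mount("/content/drive")     # then copy under /content/drive/MyDrive/
```

Mounting Drive, as in the last two commented lines, is the usual route for sending output to Drive rather than downloading it.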
