Think well about this third step: you can save quite a few dollars annually by just keeping a local cache of travel distances, and only querying when the distance is unknown. My ETL process for this part consists of three global steps:
Add unknown departure/destination pairs to the ‘to be queried’ table (PK of this table is start & end point address in less-structured format, ensuring uniqueness of commutes)
Query Maps API for unknown travel distances. Add retrieved distances (or known unknowns) to the local cache table of travel distances
Use the local cache of travel distance (as complete as it gets at this moment) as the primary lookup for travel distance
The Google Maps API allows free tier users to make about 1000 requests per day. If you don’t need to pull more than that many data points back (or can queue them to run over the necessary time frame), there’s no marginal cost to calls. Otherwise, it ends up being a few dollars per thousand calls, so that shouldn’t break your company’s budget.