Google Drive
Connect Google Drive folders using a GCP service account
Google Drive connections use GCP service account credentials (no OAuth) to read files from a shared Drive folder.
There are two main ways to use this connection:
- Tabular ingest: dlt_extract + source: "google_drive" for csv/parquet/jsonl → bronze layer
- File sync: file_sync + source: "google_drive" for documents/files (PDF/DOCX/PPTX/XLSX/MD/etc.) → artifacts bucket (raw + optional extracted text)
Prerequisites
- Create a service account in Google Cloud.
- Enable the Google Drive API for the project.
- Create a service account key (JSON).
- Share the target Google Drive folder with the service account client email (Viewer is enough).
This connector is service-account only. There is no OAuth flow.
Creating a Connection
- Go to Connections → Add Connection
- Select Google Drive
- Enter the credential fields from your service account key JSON
- Click Create, then Verify
Credential Fields
| Field | Description |
|---|---|
| client_email | Service account email (from the key JSON) |
| private_key | Service account private key (from the key JSON) |
| project_id | GCP project ID (from the key JSON) |
If your private key is pasted with escaped newlines (\\n), VAI normalizes it automatically.
The Key ID shown in the GCP console is not the private key. You must download a JSON key
and copy the private_key value from that file.
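As a rough illustration of the fields above, the following sketch reads a downloaded key file and returns the three values the connection form asks for. The helper name `credential_fields_from_key` is hypothetical and not part of VAI, and the newline handling only illustrates the kind of normalization described above; it is not VAI's actual implementation.

```python
import json

REQUIRED_FIELDS = ("client_email", "private_key", "project_id")


def credential_fields_from_key(key_json: str) -> dict:
    """Return the three connection fields from a service account key JSON string."""
    key = json.loads(key_json)
    missing = [name for name in REQUIRED_FIELDS if not key.get(name)]
    if missing:
        raise ValueError(f"key JSON is missing fields: {', '.join(missing)}")
    fields = {name: key[name] for name in REQUIRED_FIELDS}
    # Illustrative only: turn literal backslash-n sequences into real newlines.
    fields["private_key"] = fields["private_key"].replace("\\n", "\n")
    return fields
```

The key file also carries a `private_key_id` entry; the sketch deliberately ignores it, since the Key ID is not the private key.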
Verification
Verification checks that the credentials can access the Drive API and can list a folder.
Common failures:
- 401 Unauthorized: credentials are invalid (wrong project, email, or key)
- 403 Forbidden: folder is not shared with the service account email
- 429 Rate limited: retry after a short delay
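The failure list above could be handled in client code roughly as follows. This is a sketch under assumptions: `verify` stands for any hypothetical callable that triggers verification and returns an HTTP status code, and the exponential delay schedule is an illustrative choice, not a documented VAI behavior.

```python
import time

FAILURE_HINTS = {
    401: "credentials are invalid (wrong project, email, or key)",
    403: "folder is not shared with the service account email",
    429: "rate limited; retry after a short delay",
}


def explain_failure(status: int) -> str:
    """Translate a verification status code into the hint listed above."""
    return FAILURE_HINTS.get(status, f"unexpected status {status}")


def verify_with_retry(verify, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep) -> int:
    """Call verify() again only while it keeps returning 429."""
    status = verify()
    for attempt in range(1, attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
        status = verify()
    return status
```

Only 429 is retried here; 401 and 403 point at configuration problems that a retry will not fix.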
Usage with Actions
Tabular ingest (CSV/Parquet/JSONL)
Use dlt_extract with source: "google_drive" when you want to ingest tabular files from a Drive folder:
{
  "kind": "dlt_extract",
  "connection": { "kind": "by_slug", "slug": "my-google-drive" },
  "source": "google_drive",
  "source_config": {
    "folder_id": "1AbCDefGhIJkLmNoPqRsTuVwXyZ",
    "file_glob": "**/*.csv",
    "file_type": "csv"
  }
}
Notes
- folder_id is required and must be a folder that is shared with the service account.
- The only available resource is files.
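If you generate these actions programmatically, a small builder can enforce the folder_id requirement before the action is submitted. The function `tabular_ingest_action` is hypothetical, and deriving file_glob from file_type is an assumption made for brevity; the allowed file types are taken from the csv/parquet/jsonl list earlier on this page.

```python
def tabular_ingest_action(slug: str, folder_id: str, file_type: str = "csv") -> dict:
    """Build a dlt_extract action for tabular files in a shared Drive folder."""
    if not folder_id:
        raise ValueError("folder_id is required")
    if file_type not in ("csv", "parquet", "jsonl"):
        raise ValueError(f"unsupported file_type: {file_type}")
    return {
        "kind": "dlt_extract",
        "connection": {"kind": "by_slug", "slug": slug},
        "source": "google_drive",
        "source_config": {
            "folder_id": folder_id,
            # Assumption: match every file of the chosen type under the folder.
            "file_glob": f"**/*.{file_type}",
            "file_type": file_type,
        },
    }
```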
Multi-select (folders/files) with per-item cursors
Use multi-select when you want to explicitly sync a set of folders/files by Drive item ID (IDs survive renames/moves).
{
  "kind": "dlt_extract",
  "connection": { "kind": "by_slug", "slug": "my-google-drive" },
  "source": "google_drive",
  "source_config": {
    "folder_id": "root",
    "selected_items": ["0Bxx123FolderId", "0Bxx456FileId"],
    "item_cursors": {
      "0Bxx123FolderId": "modification_date",
      "0Bxx456FileId": null
    },
    "native_export_formats": {
      "docs": "application/pdf",
      "sheets": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
      "slides": "application/pdf"
    }
  }
}
Selection semantics
- selected_items can include both folder IDs and file IDs.
- Selecting a folder includes all descendant files (recursively).
- item_cursors controls incremental behavior per selected item:
  - "modification_date" → incremental
  - null → full resync for that selected item
- Google-native file types (Docs/Sheets/Slides) are downloaded via the Drive export API using native_export_formats.
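The per-item cursor rules can be pictured with a short sketch that maps each selected ID to a sync mode. The function `sync_modes` and the labels "incremental" and "full_resync" are hypothetical names for illustration; treating an ID with no item_cursors entry like null is also an assumption, since this page does not say what happens in that case.

```python
def sync_modes(selected_items, item_cursors):
    """Map each selected Drive item ID to "incremental" or "full_resync"."""
    unknown = set(item_cursors) - set(selected_items)
    if unknown:
        raise ValueError(f"item_cursors has IDs not in selected_items: {sorted(unknown)}")
    modes = {}
    for item_id in selected_items:
        # Assumption: an ID with no entry in item_cursors behaves like null.
        cursor = item_cursors.get(item_id)
        if cursor == "modification_date":
            modes[item_id] = "incremental"
        elif cursor is None:
            modes[item_id] = "full_resync"
        else:
            raise ValueError(f"unsupported cursor for {item_id}: {cursor!r}")
    return modes
```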
Hard limits
- 1000 expanded files max
- 3 folder levels max (from folder_id)
- 50GB total max
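To check a planned selection against these limits before running a sync, something like the sketch below may help. It rests on two assumptions that this page does not confirm: folder levels are counted as the number of folder components in a path relative to folder_id, and "50GB" is read as 50 GiB. `ExpandedFile` and `limit_violations` are hypothetical names.

```python
from dataclasses import dataclass

MAX_FILES = 1000
MAX_DEPTH = 3
MAX_TOTAL_BYTES = 50 * 1024**3  # assumption: "50GB" read as 50 GiB


@dataclass
class ExpandedFile:
    path: str  # path relative to folder_id, e.g. "a/b/report.csv"
    size_bytes: int


def limit_violations(files):
    """Return a human-readable message for each hard limit the selection exceeds."""
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files exceeds the {MAX_FILES} file limit")
    # Folder levels = number of "/" separators in the relative path.
    deepest = max((f.path.count("/") for f in files), default=0)
    if deepest > MAX_DEPTH:
        problems.append(f"{deepest} folder levels exceeds the {MAX_DEPTH} level limit")
    total = sum(f.size_bytes for f in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"{total} bytes exceeds the 50GB total limit")
    return problems
```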
File Sync (Docs + Files)
Use file_sync when you want to sync common Drive documents as raw bytes plus extracted text (when supported).
{
  "kind": "file_sync",
  "connection": { "kind": "by_slug", "slug": "my-google-drive" },
  "source": "google_drive",
  "roots": [{ "kind": "folder", "id": "1AbCDefGhIJkLmNoPqRsTuVwXyZ" }],
  "include_globs": ["**/*"],
  "exclude_globs": ["**/.trash/**"],
  "include_extensions": ["pdf", "md", "docx", "pptx", "xlsx", "csv", "txt"],
  "max_file_size_mb": 50,
  "extract_text": true,
  "dry_run": false,
  "export_google_native": {
    "docs": "text/plain",
    "sheets": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "slides": "application/pdf"
  }
}
Selection semantics
- roots: multiple folder IDs and/or file IDs
- include_globs/exclude_globs: matched against a computed path like Root/Subfolder/File.ext
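The way these filters might combine for a single file can be sketched as below. This is an approximation, not the connector's matcher: it uses Python's `fnmatch` as a stand-in for the glob engine (where `*` also crosses `/`), and the order of the checks and the helper name `is_selected` are assumptions.

```python
from fnmatch import fnmatchcase


def is_selected(path, size_bytes, include_globs, exclude_globs,
                include_extensions, max_file_size_mb):
    """Decide whether a file at a computed path like Root/Subfolder/File.ext is synced."""
    if not any(fnmatchcase(path, pattern) for pattern in include_globs):
        return False
    if any(fnmatchcase(path, pattern) for pattern in exclude_globs):
        return False
    # Extension of the final path component, lowercased; empty if there is none.
    name = path.rsplit("/", 1)[-1]
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if extension not in include_extensions:
        return False
    return size_bytes <= max_file_size_mb * 1024 * 1024
```

With the settings from the example action, `Root/Subfolder/File.pdf` at 1 KB would pass, while anything under a `.trash` folder or above 50 MB would be skipped.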