Dataproc Service Account based Secure Multi-tenancy (called "secure multi-tenancy", below) enables you to share a cluster with multiple users, with a set of users mapped to service accounts when the cluster is created. With secure multi-tenancy, users can submit interactive workloads to the cluster with isolated user identities.
When a user submits a job to the cluster, the job:
runs as a specific OS user with a specific Kerberos principal
accesses Google Cloud resources using the mapped service account credentials
Considerations and Limitations
When you create a cluster with secure multi-tenancy enabled:
You can submit jobs only through the Dataproc Jobs API.
The cluster is available only to users with mapped service accounts. For example, unmapped users cannot run jobs on the cluster.
Service accounts can be mapped only to Google users, not Google groups.
The Dataproc Component Gateway is not enabled.
Direct SSH access to the cluster and Compute Engine features, such as the ability to run startup scripts on cluster VMs, are blocked. Also, jobs cannot run with sudo privileges.
Kerberos is enabled and configured on the cluster for secure intra-cluster communication. End user authentication through Kerberos is not supported.
Dataproc Workflows are not supported.
Creating a secure multi-tenancy cluster
To create a Dataproc secure multi-tenancy cluster, use the --secure-multi-tenancy-user-mapping flag to specify a list of user-to-service-account mappings.
Example:
The following command creates a cluster, with user bob@my-company.com mapped to service account service-account-for-bob@iam.gserviceaccount.com and user alice@my-company.com mapped to service account service-account-for-alice@iam.gserviceaccount.com.
gcloud dataproc clusters create my-cluster \
    --secure-multi-tenancy-user-mapping="bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com,alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com" \
    --scopes=https://www.googleapis.com/auth/iam \
    --service-account=cluster-service-account@iam.gserviceaccount.com \
    --region=region \
    other args ...
Alternatively, you can store the list of user-to-service-account mappings in a local or Cloud Storage YAML or JSON file. Use the --identity-config-file flag to specify the file location.
Sample identity config file:
user_service_account_mapping:
  bob@my-company.com: service-account-for-bob@iam.gserviceaccount.com
  alice@my-company.com: service-account-for-alice@iam.gserviceaccount.com
Sample command to create the cluster using the --identity-config-file flag:
gcloud dataproc clusters create my-cluster \
    --identity-config-file=local or "gs://bucket" /path/to/identity-config-file \
    --scopes=https://www.googleapis.com/auth/iam \
    --service-account=cluster-service-account@iam.gserviceaccount.com \
    --region=region \
    other args ...
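If the mappings are maintained elsewhere, the identity config file shown above can be generated instead of written by hand. The following is a minimal sketch, not an official tool: the function name build_identity_config and the "user:service-account" argument format are assumptions made for illustration, and the output follows the YAML layout of the sample file.

```shell
# Hypothetical helper (not part of gcloud): print an identity config file
# in YAML form from "user:service-account" pairs passed as arguments.
build_identity_config() {
  echo "user_service_account_mapping:"
  for pair in "$@"; do
    # Split on the first colon: the user comes before it,
    # the mapped service account after it.
    printf '  %s: %s\n' "${pair%%:*}" "${pair#*:}"
  done
}

# Example: print the config for the two users from the sample above.
build_identity_config \
  "bob@my-company.com:service-account-for-bob@iam.gserviceaccount.com" \
  "alice@my-company.com:service-account-for-alice@iam.gserviceaccount.com"
```

Redirect the output to a file, and then pass that file's location to the --identity-config-file flag.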
Notes:
As shown in the above commands, cluster --scopes must include at least https://www.googleapis.com/auth/iam, which is necessary for the cluster service account to perform impersonation.
The cluster service account must have permissions to impersonate the service accounts mapped to the users (see Service account permissions).
Recommendation: Use different cluster service accounts for different clusters to allow each cluster service account to impersonate only a limited, intended group of mapped user service accounts.
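One way to keep impersonation scoped per cluster, as recommended above, is to grant the cluster service account access on each mapped service account individually rather than at the project level. The sketch below only prints the commands so that they can be reviewed before being run. It assumes that the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) satisfies the requirement described in Service account permissions; confirm the exact role or permission there. The function name print_impersonation_grants is hypothetical.

```shell
# Hypothetical helper: print (do not run) one IAM policy binding command
# per mapped service account. The first argument is the cluster service
# account; the remaining arguments are the mapped user service accounts.
# Assumption: roles/iam.serviceAccountTokenCreator is the role needed
# for impersonation; check "Service account permissions" to confirm.
print_impersonation_grants() {
  cluster_sa="$1"
  shift
  for user_sa in "$@"; do
    printf 'gcloud iam service-accounts add-iam-policy-binding %s --member=serviceAccount:%s --role=roles/iam.serviceAccountTokenCreator\n' \
      "$user_sa" "$cluster_sa"
  done
}

# Example with the accounts used in the cluster creation commands.
print_impersonation_grants \
  "cluster-service-account@iam.gserviceaccount.com" \
  "service-account-for-bob@iam.gserviceaccount.com" \
  "service-account-for-alice@iam.gserviceaccount.com"
```

Because each binding is attached to a single mapped service account, a cluster service account created for one cluster cannot impersonate accounts mapped on another cluster unless it is granted access there as well.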