How to build a MongoDB deployment on GKE (Google Kubernetes Engine)

First of all, let's make sure we have a volume on GCP to store the data.

Creating storage on Google Cloud Platform (GCP) for use with Google Compute Engine instances can be done through several types of storage options, such as Google Cloud Storage (for object storage), Persistent Disks (for block storage), and local SSDs (for high-performance storage directly attached to your virtual machine instances). The choice depends on your specific needs, such as whether you require high IOPS, persistence across instance terminations, or global accessibility.

Here’s how you can create a Persistent Disk, which is one of the most common storage types used with Compute Engine instances, for block storage:

1. Via the Google Cloud Console

  1. Open the Google Cloud Console: Navigate to your project’s console.
  2. Go to the Compute Engine section: Find the “Compute Engine” menu on the left-hand side and click on “Disks”.
  3. Create a disk: Click on the “Create disk” button.
  4. Configure the disk:
  • Name your disk.
  • Choose a region and zone that matches your Compute Engine instance.
  • Select the disk type (e.g., Standard Persistent Disk, SSD Persistent Disk, etc.).
  • Specify the disk size according to your needs.
  • Optionally, you can add labels and configure snapshot schedules.
  5. Create: Click the “Create” button to create the disk.

2. Using the gcloud CLI

First, make sure you have the gcloud CLI installed and authenticated with your GCP account. Then, you can create a persistent disk using the following command:

gcloud compute disks create [DISK_NAME] --size=[DISK_SIZE]GB --zone=[ZONE] --type=[DISK_TYPE]
  • Replace [DISK_NAME] with the name you want to assign to your disk.
  • Replace [DISK_SIZE] with the size of the disk in GB.
  • Replace [ZONE] with the zone where you want to create the disk.
  • Replace [DISK_TYPE] with the type of disk you want to create (e.g., pd-standard for Standard Persistent Disk, pd-ssd for SSD Persistent Disk).

Example Command:

gcloud compute disks create my-disk --size=100GB --zone=us-central1-a --type=pd-standard

This command creates a 100GB Standard Persistent Disk named “my-disk” in the “us-central1-a” zone.
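
If you want to confirm the disk was created before moving on, you can describe it using the example name and zone from above:

gcloud compute disks describe my-disk --zone=us-central1-a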

Remember to attach the disk to a Compute Engine instance after creation if you want to use it directly for storage. You can do this either through the Google Cloud Console or using the gcloud command-line tool by updating the instance’s settings and specifying the disk you’ve created. (When the disk is used as a Kubernetes PersistentVolume, as we do below, GKE attaches it to the node for you, so you can leave it unattached.)

We also need to make sure the environment variable for the project is set:

export CLOUDSDK_CORE_PROJECT=your-project-id
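
Alternatively, you can set the project in your gcloud configuration instead of using the environment variable:

gcloud config set project your-project-id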

Also make sure the name of the disk matches the required regex:

'(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)'

After creating the volume, we need to give GKE access to it.

Step 1: Create a PersistentVolume (PV) in Kubernetes

You’ll need to create a PV object in Kubernetes that references your GCP Persistent Disk. Here’s an example YAML file for creating a PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv
  labels:
    use: mongodb
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  gcePersistentDisk:
    pdName: mongodb-disk
    fsType: ext4

Replace mongodb-disk with the name of your GCP Persistent Disk. Adjust the storage size to match the size of your disk, and set fsType to the filesystem type you have used.
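
As a quick check, assuming you save the manifest above as mongodb-pv.yaml (the filename is just an example), you can apply it and verify that the PV shows up as Available:

kubectl apply -f mongodb-pv.yaml
kubectl get pv mongodb-pv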

Step 2: Create a PersistentVolumeClaim (PVC)

Next, you need to create a PVC that your pods can use to claim the storage provided by the PV. Here’s an example YAML for a PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: ""
  selector:
    matchLabels:
      use: mongodb
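
Assuming this manifest is saved as mongodb-pvc.yaml (again, the filename is just an example), apply it and check that the claim binds to the PV created earlier:

kubectl apply -f mongodb-pvc.yaml
kubectl get pvc mongodb-pvc

The STATUS column should show Bound once the claim has matched the mongodb-pv volume through the use: mongodb label selector.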

To use the mongodb-pvc PersistentVolumeClaim in a MongoDB deployment on Kubernetes, you will need to create a Deployment that defines how MongoDB instances should be run. The deployment will specify the MongoDB container image, the necessary environment variables (if any), and include the PVC for persistent storage.

Here’s a basic example of a MongoDB deployment that uses mongodb-pvc for its storage:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo:latest
        ports:
        - containerPort: 27017
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          value: "mongo_admin"
        - name: MONGO_INITDB_ROOT_PASSWORD
          value: "yourpassword"
        volumeMounts:
        - name: mongodb-storage
          mountPath: /data/db
      volumes:
      - name: mongodb-storage
        persistentVolumeClaim:
          claimName: mongodb-pvc

Deployment Metadata: Defines the name and labels for the deployment.

Replicas: This example uses a single replica. For production, you might want to run a MongoDB replica set for high availability.

Selector and Template Labels: These ensure that the deployment manages pods with the label app: mongodb.

Container Image: Uses the official MongoDB image from Docker Hub. You can specify a particular version by replacing latest with the version number (e.g., mongo:4.4).

Container Ports: Exposes MongoDB on its default port, 27017.

Environment Variables: Sets the MongoDB root username and password. Replace "yourpassword" with a secure password.

Volume Mounts: Mounts the persistent storage to /data/db inside the container, which is the default MongoDB data directory.

Volumes: Specifies that the pod uses a volume backed by the mongodb-pvc PersistentVolumeClaim.
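
Assuming the deployment manifest is saved as mongodb-deployment.yaml (a placeholder name), you can apply it and watch the pod come up:

kubectl apply -f mongodb-deployment.yaml
kubectl get pods -l app=mongodb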

If this is for production, I would suggest creating a Secret, storing these credentials in it, and then referencing them as environment variables when you create the deployment, as sketched below.
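
Here is a minimal sketch of that approach; the Secret name mongodb-credentials and its keys are placeholders chosen for illustration. First create the Secret, then reference it from the container’s env section instead of the hard-coded values:

kubectl create secret generic mongodb-credentials \
  --from-literal=username=mongo_admin \
  --from-literal=password=yourpassword

        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongodb-credentials   # placeholder Secret name
              key: username
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongodb-credentials
              key: password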

Making it accessible through a NodePort service

To make your MongoDB deployment accessible through a NodePort service in your Kubernetes cluster, you will need to create a Service of type NodePort. This service will expose MongoDB on a port on each node’s IP in your cluster. Here’s how you can define the service:

apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  type: NodePort
  selector:
    app: mongodb
  ports:
    - port: 27017
      targetPort: 27017
      nodePort: 30007
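
Assuming the service manifest is saved as mongodb-service.yaml (the filename is an example), apply it as usual; note that nodePort values must fall within the cluster’s NodePort range (30000-32767 by default):

kubectl apply -f mongodb-service.yaml
kubectl get service mongodb-service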

Getting Node External IPs with NodePort

To get the external IPs of the nodes in your cluster, you can use the following kubectl command:

kubectl get nodes -o wide
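
With an external node IP from that output, a connection from outside the cluster looks roughly like this. NODE_EXTERNAL_IP and the firewall rule name are placeholders, mongosh is assumed to be installed locally, and the credentials are the ones from the deployment; on GKE you will usually also need a firewall rule allowing traffic to the node port:

gcloud compute firewall-rules create allow-mongodb-nodeport --allow tcp:30007
mongosh "mongodb://mongo_admin:yourpassword@NODE_EXTERNAL_IP:30007"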

Exposing MongoDB with a LoadBalancer in GKE

To expose your MongoDB service outside your GKE cluster more seamlessly, you can use a LoadBalancer. This type of service automatically creates an external load balancer that points to your MongoDB service, providing you with a single IP address for access.

Here’s how you can define a LoadBalancer service for MongoDB:

apiVersion: v1
kind: Service
metadata:
  name: mongodb-loadbalancer
spec:
  type: LoadBalancer
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
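
Assuming this manifest is saved as mongodb-loadbalancer.yaml (another placeholder filename), apply it and wait for GKE to provision the external address, which appears in the EXTERNAL-IP column:

kubectl apply -f mongodb-loadbalancer.yaml
kubectl get service mongodb-loadbalancer --watch

Once the IP is populated, clients can connect with the same credentials, for example:

mongosh "mongodb://mongo_admin:yourpassword@EXTERNAL_IP:27017"

Keep in mind this exposes MongoDB to the internet, so restrict access (for example with loadBalancerSourceRanges or firewall rules) and use strong credentials.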

Deploying MongoDB on Kubernetes offers a robust and scalable solution for managing your database needs in a containerized environment. Whether you’re using Google Kubernetes Engine (GKE) or another Kubernetes setup, integrating MongoDB with persistent storage and exposing it securely plays a crucial role in ensuring data persistence and accessibility. By leveraging the flexibility of NodePort services for development environments and the power of LoadBalancer services for production in GKE, developers can ensure that their MongoDB instances are both accessible and secure.

Understanding how to effectively use PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs) to manage storage, alongside configuring your MongoDB deployment to suit your specific requirements, is essential. Furthermore, transitioning from a NodePort to a LoadBalancer service allows for easier access management and enhanced security, provided best practices are followed.
