Exposing the SEP cluster

Configuring external access to the SEP cluster

This guide provides step-by-step instructions for making your SEP cluster accessible to external connections, such as BI tools. There are four ways to configure the expose parameter in your YAML file, as detailed in the Kubernetes documentation.

The options used by Starburst include:

  • ClusterIP (default)
  • NodePort
  • LoadBalancer
  • Ingress

Each method is explained below, helping you choose the best fit for your environment. For TLS configurations via a load balancer, we recommend following the Ingress approach.

ClusterIP

By default, if no type is specified, ClusterIP is used. This option exposes the coordinator on a cluster-internal IP, making the service reachable only within the cluster. You can enable public internet access through an Ingress or Gateway.

We advise using this configuration during the initial setup phase. Add it to your sep-prod-setup.yaml file as shown:

yaml
expose:
  type: "clusterIp"
  clusterIp:
    name: "starburst"
    ports:
      http:
        port: 8080
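
While the service is reachable only inside the cluster, a port-forward is one way to reach the coordinator from a workstation during initial setup. The following is a minimal sketch, assuming kubectl is already configured for the cluster and the Service is named starburst as in the example above; the function name forward_coordinator is hypothetical.

```shell
# Forward a local port to the coordinator Service while it is ClusterIP-only.
# Usage: forward_coordinator [service] [local_port] [service_port]
forward_coordinator() {
  local service="${1:-starburst}"   # Service name from the expose block
  local local_port="${2:-8080}"     # port opened on localhost
  local service_port="${3:-8080}"   # http port from the expose block
  kubectl port-forward "service/${service}" "${local_port}:${service_port}"
}
```

Called with no arguments, this should make the coordinator available at http://localhost:8080 until the command is interrupted.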

NodePort

NodePort exposes the service on a static port on each node's IP. Kubernetes establishes a cluster IP address in the same manner as a ClusterIP service.

The Kubernetes documentation provides further insight:

If you set the type field to NodePort, the Kubernetes control plane allocates a port from a range specified by --service-node-port-range flag (default: 30000-32767). Each node proxies that port (the same port number on every Node) into your Service. Your Service reports the allocated port in its .spec.ports[*].nodePort field.

Using a NodePort gives you the freedom to set up your own load balancing solution, to configure environments that are not fully supported by Kubernetes, or even to expose one or more nodes' IP addresses directly.

For a node port Service, Kubernetes additionally allocates a port (TCP, UDP or SCTP to match the protocol of the Service). Every node in the cluster configures itself to listen on that assigned port and to forward traffic to one of the ready endpoints associated with that Service. You'll be able to contact the type: NodePort Service, from outside the cluster, by connecting to any node using the appropriate protocol (for example: TCP), and the appropriate port (as assigned to that Service).

yaml
expose:
  type: "nodePort"
  nodePort:
    name: "starburst"
    ports:
      http:
        port: 8080
        nodePort: 30080
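
Because every node listens on the same node port, the coordinator can be reached through any node address. The sketch below lists one candidate URL per node, assuming kubectl is configured for the cluster and that the nodes' InternalIP addresses are routable from where you connect (swap in ExternalIP if they are not); the function name node_port_urls is hypothetical.

```shell
# Print one candidate coordinator URL per node for a NodePort Service.
# Usage: node_port_urls [node_port]
node_port_urls() {
  local node_port="${1:-30080}"   # nodePort value from the expose block
  local ip
  # Emit each node's InternalIP on its own line.
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' |
  while read -r ip || [ -n "$ip" ]; do
    if [ -n "$ip" ]; then
      printf 'http://%s:%s\n' "$ip" "$node_port"
    fi
  done
}
```

Any of the printed URLs should reach the same Service, which is what lets you place your own load balancer in front of the nodes.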

LoadBalancer

The LoadBalancer type exposes services externally using an external load balancer. Kubernetes itself doesn't provide a load balancer; it requires external provision or integration with a cloud provider.

In cloud environments supporting external load balancers, setting the type to LoadBalancer automatically provisions one for your service, directing traffic to backend pods.

yaml
expose:
  type: "loadBalancer"
  loadBalancer:
    name: "starburst"
    IP: ""
    ports:
      http:
        port: 8080
    annotations: {}
    sourceRanges: []

Check Kubernetes LoadBalancer

To obtain the coordinator's external IP address, use:

bash
kubectl get services

The output will look similar to the following:

bash
NAME         TYPE           CLUSTER-IP        EXTERNAL-IP    PORT(S)          AGE
starburst    LoadBalancer   111.11.111.111    XXX            8080:30080/TCP   1m

Add the port number (8080) to the external address (marked as XXX in the example) to access the coordinator from outside.
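
The same lookup can be scripted. This is a sketch only, assuming kubectl is configured for the cluster; it reads both the ip and hostname fields because some providers report a hostname rather than an IP address. The function name coordinator_url is hypothetical.

```shell
# Build the external coordinator URL for a LoadBalancer Service.
# Usage: coordinator_url [service] [port]
coordinator_url() {
  local service="${1:-starburst}"
  local port="${2:-8080}"
  local host
  # Only one of ip/hostname is normally populated; print whichever exists.
  host="$(kubectl get service "$service" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')"
  if [ -z "$host" ]; then
    echo "external address for ${service} is not assigned yet" >&2
    return 1
  fi
  printf 'http://%s:%s\n' "$host" "$port"
}
```

The function returns a non-zero status while the external address is still pending, so it can be retried in a loop.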

If the external IP shows <pending>, check the Events section in the output of the following command for errors. Replace %SERVICE_NAME% with the name of your service; in the examples in this guide, it is starburst.

bash
kubectl describe service %SERVICE_NAME%
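
To see only the events recorded for the Service, the event list can be filtered directly. A minimal sketch, assuming kubectl is configured for the cluster and the Service lives in the current namespace; the function name service_events is hypothetical.

```shell
# List events recorded for a Service, oldest first.
# Usage: service_events [service]
service_events() {
  local service="${1:-starburst}"
  kubectl get events \
    --field-selector "involvedObject.kind=Service,involvedObject.name=${service}" \
    --sort-by=.metadata.creationTimestamp
}
```

Provisioning failures from the cloud provider, if there are any, would typically show up here as Warning events.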

Ingress

Ingress is ideal for a secure, production-level setup. It supports multiple applications in a cluster, offering load balancing, SSL termination, and virtual hosting. For instance, both the SEP coordinator and Ranger server can be exposed using the ingress: configuration.

yaml
expose:
  type: "ingress"
  ingress:
    serviceName: "starburst"
    servicePort: 8080
    tls:
      enabled: true
      secretName:
    host:
    path: "/"
    annotations: {}

Ingress with nginx & cert-manager

nginx, a robust HTTP and proxy server, is often used as a load balancer. Pairing it with cert-manager, which can obtain certificates from Let’s Encrypt, enhances security.

First, deploy an HTTPS ingress controller in your cluster. A detailed guide is available in the cert-manager documentation.

After setup, with an A record in your DNS zone, you can expose the Starburst Enterprise web UI:

yaml
expose:
  type: "ingress"
  ingress:
    serviceName: "starburst"
    servicePort: 8080
    tls:
      enabled: true
      secretName: "tls-secret-starburst"
    host: ""
    path: "/(.*)"
    annotations:
      kubernetes.io/ingress.class: "nginx"
      cert-manager.io/issuer: "letsencrypt-staging"

Some configurable parameters include:

Parameter      Description
secretName     Stores the certificate generated by cert-manager; can be any chosen value.
annotations    Defaults to nginx for single ingress controller setups; assumes an existing certificate issuer named letsencrypt-staging.

The Ranger user interface can be exposed in exactly the same way:

yaml
expose:
  type: "ingress"
  ingress:
    tls:
      enabled: true
      secretName: "tls-secret-ranger"
    host: ""
    path: "/(.*)"
    annotations:
      kubernetes.io/ingress.class: "nginx"
      cert-manager.io/issuer: "letsencrypt-staging"

Check Kubernetes ingress

To verify ingress functionality, execute:

bash
kubectl get ingress

The output will look similar to the following:

bash
NAME              CLASS    HOSTS               ADDRESS          PORTS   AGE
example-ingress   <none>   www.example.com     111.111.11.111   80      1m

Here, HOSTS is your DNS name, and ADDRESS is the ingress controller's IP address.
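
When TLS is enabled, it can also help to confirm that a certificate has actually been stored in the secret named by secretName. The sketch below is a rough check, assuming kubectl is configured for the cluster and the secret name tls-secret-starburst from the example above; the function name tls_secret_has_cert is hypothetical.

```shell
# Succeed if the TLS secret exists and contains a certificate.
# Usage: tls_secret_has_cert [secret]
tls_secret_has_cert() {
  local secret="${1:-tls-secret-starburst}"
  local cert
  # tls.crt holds the base64-encoded certificate; the dot must be escaped.
  cert="$(kubectl get secret "$secret" -o jsonpath='{.data.tls\.crt}' 2>/dev/null || true)"
  [ -n "$cert" ]
}
```

For example, `tls_secret_has_cert && echo "certificate stored"` prints a message once issuance has completed. If the letsencrypt-staging issuer points at the Let’s Encrypt staging environment, as its name suggests, clients will not trust the resulting certificate by default.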

Conclusion

After applying one of these configurations, your SEP cluster should be accessible for external connections. When you use the UI, some pages report an error stating that a query logger must be set up for them to work properly. In the next guide, you will set this up using a database of your choice.