Introduction
By default, Portworx thin provisions volumes and balances them across the cluster according to current usage and load, requiring only minimal configuration. This approach lets applications provision volumes easily, as long as you have enough backing storage to cover actual volume usage.
However, if volume usage exceeds your available backing storage and allocating additional storage is not an option, your application will run into space issues. In this blog I will show how we can avoid these problems.
Getting Started
Let’s start by checking the size of our Persistent Volumes (PVs)
# kubectl get pv -n oracle-namespace
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS
pvc-1f478ff6-9dc5-4a95-96a3-2016f542c3f6   20Gi       RWO            Delete           Bound    oracle-namespace/ora-data193-oracle19c-0   px-ora-sc
...
And the view from our database container
# kubectl exec -it oracle19c-0 -n oracle-namespace -- /bin/bash
[oracle@oracle19c-0 ~]$ df -ht ext4
Filesystem                       Size  Used  Avail  Use%  Mounted on
/dev/pxd/pxd1073375706949672013   20G  3.8G    15G   21%  /opt/oracle/oradata
...
And finally, using pxctl volume list
# pxctl volume list --label version=19.3.0.1,app=database
ID                   NAME                                       SIZE    HA  SHARED  ENCRYPTED  IO_PRIORITY  STATUS                 SNAP-ENABLED
1073375706949672013  pvc-1f478ff6-9dc5-4a95-96a3-2016f542c3f6   20 GiB  3   no      no         HIGH         up - attached on ...
Grow Filesystem
Now let’s grow the ora-data ext4 filesystem by increasing the size of our Persistent Volume Claim to 250GiB.
# kubectl edit pvc/ora-data193-oracle19c-0
persistentvolumeclaim/ora-data193-oracle19c-0 edited

# kubectl get pvc -n oracle-namespace
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
ora-data193-oracle19c-0   Bound    pvc-1f478ff6-9dc5-4a95-96a3-2016f542c3f6   250Gi      RWO            px-ora-sc
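As an aside, online expansion like this only works if the StorageClass allows it. Below is a minimal sketch of what px-ora-sc might look like; the Portworx parameters are assumptions inferred from the volume labels seen earlier, and allowVolumeExpansion: true is the field that permits resizing.

# Sketch only - parameters inferred from the volume labels (repl, priority_io, io_profile)
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: px-ora-sc
provisioner: kubernetes.io/portworx-volume
allowVolumeExpansion: true
parameters:
  repl: "3"
  priority_io: "high"
  io_profile: "db"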
By shelling into our container, we can also confirm that the ext4 filesystem has been resized successfully.
# kubectl exec -it oracle19c-0 -n oracle-namespace -- /bin/bash
[oracle@oracle19c-0 ~]$ df -ht ext4
Filesystem                       Size  Used  Avail  Use%  Mounted on
/dev/pxd/pxd1073375706949672013  246G  3.8G   232G    2%  /opt/oracle/oradata
Now, let’s create a large bigfile tablespace and see what happens.
SQL> create bigfile tablespace SOE datafile '/opt/oracle/oradata/PSTG/PSTGPDB1/soe.dbf' size 50g;
*
ERROR at line 1:
ORA-19502: write error on file "/opt/oracle/oradata/PSTG/PSTGPDB1/soe.dbf", block number 6440576 (block size=8192)
ORA-27072: File I/O error
Additional information: 4
Additional information: 6440576
Additional information: 409600
Why Did It Fail?
OK, our Oracle tablespace file creation failed, but what went wrong?
We increased the size of our Persistent Volume and saw the ext4 filesystem grow successfully.
Unfortunately, we did not check that the back-end storage was adequately sized to satisfy the request.
We can see what we have available by first identifying which nodes hold the replicas of our Portworx volume, using pxctl volume inspect.
# pxctl volume inspect pvc-ef5bb9c5-80e5-4c7b-bc48-bf56f6788145
Volume          :  148734364355060346
Name            :  pvc-ef5bb9c5-80e5-4c7b-bc48-bf56f6788145
Size            :  20 GiB
Format          :  ext4
HA              :  3
IO Priority     :  HIGH
Creation time   :  Feb 11 16:40:36 UTC 2021
Shared          :  no
Status          :  up
State           :  Attached: 4afe8cdf-cac3-40e4-a382-d34755f6f98f (10.225.115.119)
Device Path     :  /dev/pxd/pxd148734364355060346
Labels          :  namespace=oracle-namespace,priority_io=high,pvc=ora-data193-oracle19c-0,repl=3,version=19.3.0.1,app=database,io_profile=db
Reads           :  40384
Reads MS        :  623182
Bytes Read      :  1826086912
Writes          :  1271325
Writes MS       :  12774392
Bytes Written   :  15616274432
IOs in progress :  0
Bytes used      :  4.2 GiB
Replica sets on nodes:
    Set 0
      Node     : 10.225.115.117 (Pool 403bc3eb-6c9d-4b54-88d0-247c49ad8761 )
      Node     : 10.225.115.118 (Pool f563ea69-b62c-4add-aff5-65647d1b194b )
      Node     : 10.225.115.121 (Pool 1b4e6e38-9ff4-45a7-b259-e1bacc264ec3 )
Replication Status :  Up
Volume consumers   :
    - Name          : oracle19c-0 (b1dcbd3d-4d5e-44af-8bae-ba8c04228640) (Pod)
      Namespace     : oracle-namespace
      Running on    : node-1-4
      Controlled by : oracle19c (StatefulSet)
And then check the space available on our storage nodes using pxctl status.
# pxctl status
...
Cluster Summary
  Cluster ID: px-deploy-1
  Cluster UUID: 7f443fd8-6591-42a3-b87c-2d96cafe8213
  Scheduler: kubernetes
  Nodes: 7 node(s) with storage (7 online)

  IP              ID                                    SchedulerNodeName  StorageNode  Used    Capacity  Status  StorageStatus  Version          Kernel                       OS
  ...
  10.225.115.117  1ac8941f-3024-4c91-9aeb-7242fef44e55  node-1-2           Yes          10 GiB  64 GiB    Online  Up             2.5.6.0-80bd45b  3.10.0-1127.19.1.el7.x86_64  CentOS Linux 7 (Core)
  ...
  10.225.115.118  6f07c7b6-ea0f-4381-bc21-bc84031813b7  node-1-3           Yes          10 GiB  64 GiB    Online  Up             2.5.6.0-80bd45b  3.10.0-1127.19.1.el7.x86_64  CentOS Linux 7 (Core)
  ...
  10.225.115.121  65978e6e-a776-4cd2-a76e-e29d90e7fc73  node-1-5           Yes          10 GiB  64 GiB    Online  Up             2.5.6.0-80bd45b  3.10.0-1127.19.1.el7.x86_64  CentOS Linux 7 (Core)
  ...
  Warnings:
    WARNING: Persistent journald logging is not enabled on this node.
Global Storage Pool
  Total Used     :  51 GiB
  Total Capacity :  448 GiB
From the above, we can see that our backing storage is undersized: each storage node exposes only 64 GiB of capacity and the whole cluster only 448 GiB, so a 250 GiB volume replicated across three nodes cannot possibly be satisfied.
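Another way to see how much space has already been provisioned against each pool is the cluster provision-status subcommand, assuming your Portworx release includes it; the exact columns vary by version.

# pxctl cluster provision-status
# shows per-pool size, used and provisioned space for each storage node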
Disable Thin Provisioning for Some Nodes
To disable thin provisioning for some nodes in our cluster, we can use the pxctl cluster options update command with the --provisioning-commit-labels flag, providing the following fields in JSON:
- LabelSelector: the label key/value pairs the rule applies to, or the node key with a comma-separated list of the node IDs you wish to apply this rule to (see the sketch after this list)
- OverCommitPercent: the maximum percentage of backing storage that volumes are allowed to provision; setting this to 100 disables thin provisioning
- SnapReservePercent: the portion of that maximum percentage that is reserved for snapshots
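For example, a rule scoped to specific node IDs rather than a label might look like the following sketch, where the node IDs are placeholders:

[
  {
    "OverCommitPercent": 100,
    "SnapReservePercent": 30,
    "LabelSelector": { "node": "node-id-1,node-id-2" }
  }
]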
Start by labelling our Kubernetes nodes to identify which ones are included. In this example, I will create a label called app with a value of database.
# kubectl label nodes node-1-1 node-1-2 node-1-3 node-1-4 node-1-5 node-1-6 node-1-7 app=database
node/node-1-1 labeled
node/node-1-2 labeled
node/node-1-3 labeled
node/node-1-4 labeled
node/node-1-5 labeled
node/node-1-6 labeled
node/node-1-7 labeled
We can easily see labels with kubectl describe node/<node name> or kubectl get nodes --show-labels <node name>.
# kubectl get nodes --show-labels node-1-1
NAME STATUS ROLES AGE VERSION LABELS
node-1-1 Ready <none> 95d v1.17.0 app=database,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-1-1,kubernetes.io/os=linux
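Should you later want to take a node out of scope of the rule, the label can be removed by appending a dash to the key; node-1-1 here is just an example.

# kubectl label nodes node-1-1 app-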
Now, let’s create a rule to disable thin provisioning for nodes that have a label called app set to database.
# pxctl cluster options update --provisioning-commit-labels '[{"OverCommitPercent": 100, "SnapReservePercent":30,"LabelSelector": {"app": "database"}}]'
Successfully updated cluster wide options
And check the cluster options using the -j (JSON) output flag.
[root@node-1-1 ~]# pxctl cluster options list -j
{
"ReplMoveTimeoutMinutes": 1440,
"AutoDecommissionTimeoutMinutes": 20,
"InternalSnapIntervalMinutes": 30,
"ResyncReplAddEnabled": false,
"ReAddWaitMinutes": 1440,
"DomainPolicy": 1,
"OptimizedRestores": false,
"SmAbortTimeoutSeconds": 0,
"DisableProvisionRule": {
"LabelSelector": null
},
"ProvisionCommitRule": [
{
"OverCommitPercent": 100,
"SnapReservePercent": 30,
"LabelSelector": {
"app": "database"
}
}
],
...
Test Thin Provisioning Rule
Let’s test our new provisioning rule by trying to create a new 100Gi volume using the following manifest.
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test1
  namespace: oracle-namespace
  labels:
    app: database
    version: 19.3.0.1
spec:
  storageClassName: px-ora-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
If we apply the above and then describe the Persistent Volume Claim, we can see that the request has been refused by Portworx, as per our rule.
# kubectl apply -f pvc-px.yaml
persistentvolumeclaim/test1 created

# kubectl describe pvc/test1
Name:          test1
Namespace:     oracle-namespace
StorageClass:  px-ora-sc
Status:        Pending
Volume:
Labels:        app=database
               version=19.3.0.1
Annotations:   kubectl.kubernetes.io/last-applied-configuration:
                 {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"labels":{"app":"database","version":"19.3.0.1"},"name":"te...
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/portworx-volume
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    <none>
Events:
  Type     Reason              Age                From                         Message
  ----     ------              ----               ----                         -------
  Warning  ProvisioningFailed  11s (x2 over 13s)  persistentvolume-controller  Failed to provision volume with StorageClass "px-ora-sc": rpc error: code = Internal desc = Failed to create volume: could not find enough nodes to provision volume: 7 out of 7 pools could not be selected because they did not satisfy the following requirement: pools must not over-commit provisioning space (required 100 GiB) : over-commit limit: 100 %, snap-overcommit limit: 30 % for nodes/pools matching labels: app=database.
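If you want to tidy up after the test, the pending claim can simply be deleted.

# kubectl delete pvc test1 -n oracle-namespace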
Summary
In this post I have shared why we may need to understand the backing storage presented to Portworx, and demonstrated how we can use the --provisioning-commit-labels option to disable thin provisioning and avoid application issues due to a lack of suitable storage.