If you have been using Kubernetes for a long time, you already know what resource quotas are. But do you know them well enough? Do you know which mechanisms they are built on? If not, you soon will.
First of all, Kubernetes is a container management platform, so we will start by delving into the mechanisms behind containers.

CGROUPS

Cgroups (control groups) are a Linux kernel mechanism that lets you place processes into hierarchical groups whose use of system resources can be limited and accounted for. For example, the "memory" controller limits RAM usage, and the "cpuacct" controller accounts for CPU time usage.
There are two versions of cgroup: v1 and v2.
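To check which version a given machine uses, a rough sketch like the one below can help. It assumes the usual mount point /sys/fs/cgroup and relies on the fact that the cgroup v2 root contains a cgroup.controllers file, while cgroup v1 exposes one directory per controller. The function name is hypothetical, and hybrid setups are not handled.

```python
from pathlib import Path


def detect_cgroup_version(root: str = "/sys/fs/cgroup") -> str:
    """Guess the cgroup version from the layout of the cgroup mount point."""
    base = Path(root)
    # The cgroup v2 (unified) hierarchy has cgroup.controllers at its root.
    if (base / "cgroup.controllers").is_file():
        return "v2"
    # cgroup v1 mounts each controller as its own directory.
    if any((base / name).is_dir() for name in ("memory", "cpu", "cpuacct")):
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print(detect_cgroup_version())
```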
cgroup v1:
cgroup v2 has several improvements over cgroup v1, for example:

CAPABILITIES

Capabilities are a means of managing privileges that in traditional Unix-like systems were available only to processes running as root.
Each capability is a permission for a process to perform certain privileged operations, such as particular system calls. There are only about 40 of them.
Examples:
Finally, about quotas. These are mechanisms that let you limit the use of resources for a container (not for a Pod).
Limit: defines the memory limit for the container's cgroup. If the container tries to use more memory than the limit, the OOM killer kills one of its processes.
Requests: with cgroups v1, they only affect scheduling, that is, where the pod is placed. With cgroups v2, there are the memory.min and memory.low interface files, which Kubernetes can use (through the MemoryQoS feature gate) to protect the requested memory from being reclaimed, so that it is effectively reserved for the container.
Tmpfs-backed storage (for example, an emptyDir volume with medium: Memory) counts as memory consumed by the container.
#Container resources example
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
        ephemeral-storage: "2Gi"
      limits:
        memory: "128Mi"
        cpu: "500m"
        ephemeral-storage: "4Gi"
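
To make the units in the manifest above concrete, here is a minimal sketch that converts the quantity strings into plain numbers. The helper names are hypothetical, and the parser handles only integer values with the common suffixes, not the full Kubernetes quantity syntax. The 128Mi memory limit, for instance, is the byte value that ends up in the cgroup memory limit file (memory.limit_in_bytes on cgroup v1, memory.max on cgroup v2).

```python
# Binary (power-of-two) and decimal suffixes used in resource quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a string such as '128Mi' or '2Gi' to a number of bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


if __name__ == "__main__":
    print(quantity_to_bytes("128Mi"))   # 134217728
    print(cpu_to_millicores("250m"))    # 250
```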

How CPU requests work

Requests are implemented through cpu.shares. The root cgroup holds (number of CPUs * 1024) shares, and CPU time is divided among child cgroups in proportion to their cpu.shares, and so on down the hierarchy.
Shares only matter under contention: if all shares are allocated but their owners are idle, other containers can use the spare CPU time.
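
As a worked example, the sketch below converts a CPU request in millicores to a cpu.shares value (1024 shares per full CPU, with a floor of 2) and shows how CPU time would be split between fully busy containers. The function names are hypothetical, and the proportional split is a simplification that assumes every container wants as much CPU as it can get.

```python
from typing import Dict

SHARES_PER_CPU = 1024  # shares that correspond to one full CPU
MIN_SHARES = 2         # smallest cpu.shares value the kernel accepts


def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in millicores to a cgroup v1 cpu.shares value."""
    if milli_cpu <= 0:
        return MIN_SHARES
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // 1000)


def split_under_contention(shares: Dict[str, int], cpus: float) -> Dict[str, float]:
    """Divide `cpus` among busy containers in proportion to their shares."""
    total = sum(shares.values())
    return {name: cpus * value / total for name, value in shares.items()}


if __name__ == "__main__":
    print(milli_cpu_to_shares(250))  # 256
    # Two busy containers competing for a single CPU.
    print(split_under_contention({"a": 256, "b": 768}, cpus=1))
    # {'a': 0.25, 'b': 0.75}
```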

How CPU Limits Work

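Limits are implemented through cpu.cfs_period_us and cpu.cfs_quota_us, where "us" means microseconds (µs). Unlike requests, limits are based on time spans: within each period, the container may use at most its quota of CPU time, after which it is throttled until the next period starts.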
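
The arithmetic can be sketched as follows, assuming the default period of 100000 µs (100 ms). The function names are hypothetical, and the wall-clock estimate is a simplification for a single thread that is always ready to run.

```python
CFS_PERIOD_US = 100_000  # default CFS period: 100 ms
MIN_QUOTA_US = 1_000     # smallest quota the kernel accepts: 1 ms


def milli_cpu_to_quota(milli_cpu: int, period_us: int = CFS_PERIOD_US) -> int:
    """Convert a CPU limit in millicores to a cfs_quota_us value."""
    if milli_cpu <= 0:
        return -1  # cgroup v1 uses -1 to mean "no limit"
    return max(MIN_QUOTA_US, milli_cpu * period_us // 1000)


def wall_time_us(cpu_needed_us: int, quota_us: int,
                 period_us: int = CFS_PERIOD_US) -> int:
    """Estimate wall-clock time for a single busy thread under a quota."""
    if cpu_needed_us <= 0:
        return 0
    # Periods in which the whole quota is consumed and the thread is throttled.
    throttled_periods = (cpu_needed_us - 1) // quota_us
    remainder = cpu_needed_us - throttled_periods * quota_us
    return throttled_periods * period_us + remainder


if __name__ == "__main__":
    quota = milli_cpu_to_quota(500)
    print(quota)                         # 50000
    # 200 ms of CPU work with a 500m limit takes 350 ms of wall-clock time.
    print(wall_time_us(200_000, quota))  # 350000
```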

CPU Management Policy

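The CPU Manager policy is set with the --cpu-manager-policy kubelet flag or the cpuManagerPolicy field in the kubelet configuration file.
For example, to enable the static policy with flags, open the kubelet unit file:
vim /etc/systemd/system/kubelet.service
And add the following lines to the kubelet command line (the static policy requires some CPU to be reserved, hence the reservation flags):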
--cpu-manager-policy=static \
  --kube-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi \
  --system-reserved=cpu=1,memory=2Gi,ephemeral-storage=1Gi \

The role of K8S Scheduler in quotas distribution

The scheduler is responsible for placing pods on cluster nodes. It works in two stages: filtering, then scoring.
There are three scoring strategies to choose from:
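
Whatever strategy is selected, the overall flow of filtering out unsuitable nodes and then ranking the rest can be sketched roughly as below. This is an illustration only, not the scheduler's actual code: the data layout and function names are hypothetical, and the score is a simplified "least allocated" style score that prefers the node with the most free resources after placement.

```python
from typing import Dict, Optional

Resources = Dict[str, int]  # e.g. {"cpu": millicores, "memory": MiB}


def fits(node: Dict[str, Resources], pod: Resources) -> bool:
    """Filtering: the node must have enough unrequested capacity."""
    return all(
        node["allocatable"][r] - node["requested"][r] >= amount
        for r, amount in pod.items()
    )


def least_allocated_score(node: Dict[str, Resources], pod: Resources) -> float:
    """Scoring: average percentage of each resource left free after placement."""
    per_resource = []
    for r, amount in pod.items():
        allocatable = node["allocatable"][r]
        used = node["requested"][r] + amount
        per_resource.append((allocatable - used) * 100 / allocatable)
    return sum(per_resource) / len(per_resource)


def schedule(nodes: Dict[str, Dict[str, Resources]], pod: Resources) -> Optional[str]:
    """Return the best node name for the pod, or None if nothing fits."""
    feasible = {name: n for name, n in nodes.items() if fits(n, pod)}
    if not feasible:
        return None
    return max(feasible, key=lambda name: least_allocated_score(feasible[name], pod))


if __name__ == "__main__":
    nodes = {
        "node-a": {"allocatable": {"cpu": 4000, "memory": 8192},
                   "requested": {"cpu": 3900, "memory": 1024}},
        "node-b": {"allocatable": {"cpu": 4000, "memory": 8192},
                   "requested": {"cpu": 1000, "memory": 4096}},
    }
    print(schedule(nodes, {"cpu": 250, "memory": 64}))  # node-b
```

Note that only requests take part in this decision; limits play no role in where a pod is placed.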

Storage Resource Quota

For example, if an operator wants to quota storage with gold storage class separate from bronze storage class, the operator can define a quota as follows:
gold.storageclass.storage.k8s.io/requests.storage: 500Gi
bronze.storageclass.storage.k8s.io/requests.storage: 100Gi

Ephemeral storage

In release 1.8, quota support for local ephemeral storage was added as an alpha feature:

Quite obscure quotas

Typical object counts:

It is possible to configure the total number of objects that can exist in the namespace.
Reasons to use an object count quota:

PID limits

PID limits restrict the number of process IDs a pod can use. Without them, a container that creates too many processes can exhaust the PIDs available on the node.
The limit is a kubelet setting (the --pod-max-pids flag or podPidsLimit in the kubelet configuration), so nodes with different settings can behave differently.
This makes it possible to contain a fork bomb such as:
:(){ :|:& };:
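
To see how close a container is to its PID limit, the pids cgroup controller exposes pids.current and pids.max. The sketch below reads them from a cgroup directory passed in by the caller; the function names are hypothetical, and the directory is an assumption that depends on the setup (typically /sys/fs/cgroup inside a container on cgroup v2, or /sys/fs/cgroup/pids on cgroup v1).

```python
from pathlib import Path
from typing import Optional, Tuple


def read_pid_usage(cgroup_dir: str) -> Tuple[int, Optional[int]]:
    """Return (current PID count, PID limit); the limit is None if unlimited."""
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    raw_limit = (base / "pids.max").read_text().strip()
    # The kernel writes the literal string "max" when no limit is set.
    limit = None if raw_limit == "max" else int(raw_limit)
    return current, limit


def pid_headroom(cgroup_dir: str) -> Optional[int]:
    """Return how many more PIDs can be created, or None if unlimited."""
    current, limit = read_pid_usage(cgroup_dir)
    return None if limit is None else limit - current
```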

Quotas for extended resources

For extended resources, quotas can only be set on requests, not on limits. Example of an extended resource quota:
#correct:
requests.nvidia.com/gpu: "4"
#not correct:
limits.nvidia.com/gpu: "4"

Network quotas and network bandwidth

You can limit a pod's network traffic by setting bandwidth annotations in the pod metadata (spec.template.metadata.annotations in a Deployment).
If the parameters are not specified, the network bandwidth is not limited by default.
The following is an example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  # A selector and matching pod labels are required for a valid Deployment
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        # Ingress bandwidth
        kubernetes.io/ingress-bandwidth: 100M
        # Egress bandwidth
        kubernetes.io/egress-bandwidth: 1G
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx

Some other shared resources

Conclusion

Why are quotas necessary? Because they: