NRI plugin: add a failPolicy config option, e.g. skip or fail #142
Comments
/cc @kad |
@lengrongfu Just thinking aloud about alternative existing possibilities, your plugin could annotate containers it has processed, and if it restarts (due to a timeout or other reasons) it could check if some critical container is missing that annotation and then use other means to restart it. |
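To make the annotation idea above concrete, here is a minimal sketch in plain Go. It does not use the NRI API: the `Container` type, the `Critical` flag, and the annotation key `example.com/device-injector.processed` are all hypothetical names made up for illustration. It only shows the bookkeeping a plugin could do after a restart to find containers it missed.

```go
package main

import "fmt"

// Container is a hypothetical, simplified stand-in for the container
// metadata a plugin would see. It is NOT the NRI API type.
type Container struct {
	ID          string
	Critical    bool
	Annotations map[string]string
}

// processedKey is an assumed annotation key, invented for this sketch.
const processedKey = "example.com/device-injector.processed"

// markProcessed records that the plugin has adjusted the container.
func markProcessed(c *Container) {
	if c.Annotations == nil {
		c.Annotations = map[string]string{}
	}
	c.Annotations[processedKey] = "true"
}

// findMissed returns the IDs of critical containers that lack the marker,
// i.e. containers that were created while the plugin was not running.
func findMissed(containers []Container) []string {
	var missed []string
	for _, c := range containers {
		if c.Critical && c.Annotations[processedKey] != "true" {
			missed = append(missed, c.ID)
		}
	}
	return missed
}

func main() {
	a := Container{ID: "a", Critical: true}
	markProcessed(&a)
	b := Container{ID: "b", Critical: true} // created while the plugin was down
	c := Container{ID: "c"}                 // not critical, so it is ignored

	// After a restart the plugin would list existing containers and
	// restart (by whatever means it has) the ones reported here.
	fmt.Println(findMissed([]Container{a, b, c}))
}
```

Note that this only detects missed containers after the fact; it does not prevent them from starting.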
@lengrongfu Also, I think this should be filed against the core NRI repo, not containerd itself. |
What you describe would solve part of the problem, but it leaves a gap. If we only check after the restart whether a container has what we inject, there is a period during which the user's container runs without the injected content. In our scenario, if we cannot inject when the container is created, we do not want the container to start at all. |
As I understand it, the design of NRI is somewhat similar to a Kubernetes webhook that modifies the pod YAML, so this control logic should live in containerd. |
I still think this belongs to core NRI. If we implemented something like this
And if we did so, we probably wouldn't need any changes in the runtime-specific NRI integration code, only the common code in NRI itself. So based on this, it looks more of an NRI feature than a containerd/runtime one to me. |
Once we have discussed it clearly, we can track it in the NRI repo. "we'd probably implement the configuration setting as a per-plugin parameter passed to NRI core/the runtime during plugin registration," |
Okay, sorry I misunderstood a bit what you were after. So you would like to ensure that some plugin or plugins are always present/running when some set of special containers come around, and that these plugins are able to adjust the special containers. And, if for any reason, that does not happen, then you'd like to fail the creation of the container altogether. One way to achieve that would be to have a 'verification' plugin as the last one in the chain of plugins (maybe installed locally), which would identify your special containers and check if all the necessary plugins have adjusted it (for instance by checking that each plugin has annotated the container) and return an error when this check fails. This would then cause the creation of the container to fail. |
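A rough sketch of what such a verification step could check, again in plain Go rather than against the real NRI plugin interface. The `Container` type, the `example.com/needs-adjustment` label used to identify special containers, and the marker annotation keys are all assumptions for illustration. In a real plugin this check would sit in the container-creation hook of the last plugin in the chain, and the returned error is what would make creation fail.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a hypothetical, simplified view of a container being created.
// It is NOT the NRI API type.
type Container struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// specialLabel marks containers that must be adjusted (assumed name).
const specialLabel = "example.com/needs-adjustment"

// requiredMarkers are annotations each mandatory plugin is assumed to set
// once it has adjusted the container (assumed names).
var requiredMarkers = []string{
	"example.com/device-injector.processed",
	"example.com/network-setup.processed",
}

// verify returns an error if a special container is missing any marker.
// Containers without the special label are always accepted.
func verify(c Container) error {
	if c.Labels[specialLabel] != "true" {
		return nil
	}
	var missing []string
	for _, key := range requiredMarkers {
		if _, ok := c.Annotations[key]; !ok {
			missing = append(missing, key)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("container %q was not adjusted by: %s",
			c.Name, strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	ordinary := Container{Name: "web"}
	special := Container{
		Name:   "gpu-job",
		Labels: map[string]string{specialLabel: "true"},
		Annotations: map[string]string{
			"example.com/device-injector.processed": "true",
		},
	}
	fmt.Println(verify(ordinary)) // ordinary containers pass
	fmt.Println(verify(special))  // one marker is missing, so this is an error
}
```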
Your understanding is correct, but we can simplify the problem. Suppose all containers need to be adjusted by a certain NRI plugin; if the adjustment does not happen or is wrong, the container should be prevented from being created. The reasons for a missing or failed adjustment may be the following:
In any case, when the adjustment fails, I want containerd to prevent the creation of the container. |
@lengrongfu So, keeping in mind that if an NRI plugin returns an error to NRI itself, NRI will return an error to containerd and that will prevent the creation of the container, would the above suggestion work for you? |
Could we agree on an This annotation could be added to the K8S webhook. |
@klihub this is also a request from networking integrations, because they want to block pod creation until the NRI network plugin is ready. How it works today with CNI: if the specific CNI config file does not exist, the Pod is not created, same as the old NRI model in 0.1.0 https://github.com/containerd/nri/blob/main/README-v0.1.0.md . For the same reasons NRI evolved to a new model, network integrations want to move to this service-oriented model. There are several proposals and demand from the community for a CNI 2.0, but it is not realistic that this is going to happen https://github.com/containernetworking/cni/issues?q=is%3Aissue%20state%3Aopen%20grpc , hence NRI checks all these boxes.
and if the plugin is deployed as a daemonset you have a deadlock, because you need to run a container to register the plugin. Let me step back and try to dump my thoughts:
Problem: NRI plugins are commonly deployed as daemonsets, so making a plugin required means distinguishing between the different phases of node configuration / initialization. This is how a Kubernetes node startup process goes.
I can't remember exactly now what is the
We should try to avoid designing a system based on annotations. We need to find a good integration model, and for that the CRI API is the communication channel between Kubernetes and the runtimes. If we start taking shortcuts, we'll start to create complex, custom systems that behave differently depending on the integration, killing all the benefits of standardization. Some ideas:
|
What is the problem you're trying to solve
When an NRI plugin is shut down, containerd will not reject container creation. If there is important logic in the CreateContainer method, such as injecting devices into the container, some containers may be missed while the NRI plugin restarts.
Describe the solution you'd like
Therefore, can we configure a failPolicy for each NRI plugin, such as skip and fail? When the policy is skip, the restart of the NRI plugin is ignored; when the policy is fail, container creation is refused.
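To illustrate the requested behaviour, here is a minimal sketch of how a per-plugin policy could be evaluated when a container is created. This is not how NRI or containerd is implemented today: the `FailPolicy` and `Plugin` types, the `createContainer` function, and modelling an unavailable plugin as a nil `Adjust` hook are all assumptions made for the example.

```go
package main

import "fmt"

// FailPolicy is the hypothetical per-plugin setting proposed in this issue.
type FailPolicy string

const (
	PolicySkip FailPolicy = "skip" // ignore the plugin while it is unavailable
	PolicyFail FailPolicy = "fail" // refuse container creation while it is unavailable
)

// Plugin is a simplified stand-in for a registered plugin.
// A nil Adjust models a plugin that is currently down or restarting.
type Plugin struct {
	Name   string
	Policy FailPolicy
	Adjust func(containerID string) error
}

// createContainer runs every plugin in order and applies the fail policy
// to plugins that are unavailable. An error returned by a running plugin
// always fails creation.
func createContainer(id string, plugins []Plugin) error {
	for _, p := range plugins {
		if p.Adjust == nil {
			if p.Policy == PolicyFail {
				return fmt.Errorf("required plugin %q is unavailable", p.Name)
			}
			continue // PolicySkip or unset: carry on without this plugin
		}
		if err := p.Adjust(id); err != nil {
			return fmt.Errorf("plugin %q: %w", p.Name, err)
		}
	}
	return nil
}

func main() {
	ok := func(string) error { return nil }

	optional := []Plugin{{Name: "logger", Policy: PolicySkip}} // plugin is down
	required := []Plugin{
		{Name: "tuner", Policy: PolicySkip, Adjust: ok},
		{Name: "device-injector", Policy: PolicyFail}, // plugin is down
	}

	fmt.Println(createContainer("c1", optional)) // creation proceeds
	fmt.Println(createContainer("c2", required)) // creation is refused
}
```

Treating an unset policy like skip keeps the current behaviour as the default in this sketch, so only plugins that explicitly opt in can block container creation.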
Additional context
No response