adding the gateway_kernel_init_sequence_diagram.png to the docs #1039

telamonian · 2022-02-10T06:42:04Z

I've been working on and off the last couple of weeks on a sequence diagram showing all of the steps that get triggered when a frontend requests a kernel init via a gateway. The latest version is TeX based and open source friendly, so I figured I'd contribute it to the JEG docs. I've include the .tex source and a corresponding Makefile. This way, if any corrections or updates need to be made, it should be easy to do so (well, you'll still need a texlive environment).

Not sure if all of the details are correct, and I have no idea where exactly to put it in the docs, so I'm opening this PR to start a dialogue about that.

- also including the .tex source and a Makefile, in case any of the sequence diagram details need to be corrected or updated in the future

kevin-bates · 2022-02-10T14:35:44Z

Hi Max - this is great and will be useful.

There's a lot that happens when the browser issues the request to /api/sessions and I wonder if we should include the gateway package in here. I think we could show that api/sessions in jupyter_server will wind up issuing a POST to /api/kernels on the Gateway, which then, essentially creates a namespaced pod.

I wouldn't be opposed to generalizing things in terms of process proxies (later to be provisioners) rather than having this specific to Kubernetes. I suppose there's an entire sequence diagram there for the launch and discovery portions as well.

Regarding location in the docs, I happen to have the hood up on the docs as we speak and I'm doing a complete reorganization/overhaul to be role-based (users, operators, developers, contributors). That effort is just starting on this branch: https://github.com/kevin-bates/enterprise_gateway/tree/reorg-docs. This screenshot gives an idea:

I'm wondering if this would go in the Contributors section (near the System Architecture discussion) or as part of the Troubleshooting section (currently in the Operators section) although I'm thinking there might be a Troubleshooting section in each role or have a general Troubleshooting Guide where different topics are equated to one of the four roles - since different roles overlap. At any rate, it might be best to hold this merge until the new organization is in place.

Regarding the makefile, since it's all in the docs hierarchy, please add the necessary targets into docs/Makefile. We can then publish "forwarding targets" in the main Makefile at the root. I have not changed docs/Makefile very much so that shouldn't cause merge heartburn.

At any rate, this is useful and appreciated! Thank you.

kevin-bates

@telamonian - this is great - thank you!

I've made a few comments, one of which pertains to the a possible merge headache, so it might be best to drop that commit unless I'm missing the inclusion of this diagram in those changes.

We can work on placement once we get the diagram (and its build) settled. I hope by then I might have the docs reworked and we could then update to that.

kevin-bates · 2022-02-10T19:46:44Z

docs/source/system-architecture.md

-and the underlying kernel.  In addition to the 5 ports, is an IP address, a key, and a signature scheme 
-indicator used to interpret the key.  These eight pieces of information are conveyed to the kernel via a 
-json file, known as the connection file. 
+The first component is a set of five zero-MQ ports used to convey the Jupyter protocol between the Notebook


Are there any changes in this file besides trailing spaces? If not, I suggest we discard these for now since the docs are getting completely reorganized and this will present unnecessary merge hassles.

the one actual change is here:

enterprise_gateway/docs/source/system-architecture.md

Lines 36 to 38 in 8a26134

![Kernel Process Lanuch Sequence Diagram](images/gateway_kernel_init_sequence_diagram.png)

i can revert the whitespace

kevin-bates · 2022-02-10T19:49:24Z

docs/tex/gateway_kernel_init_sequence_diagram/Makefile

@@ -0,0 +1,13 @@
+fname=gateway_kernel_init_sequence_diagram


Could we drive this from docs/Makefile? I think I recently added a new target for installing requirements and perhaps we add a build_tex_images or tex target there.

Yes, but I figured that was kind of brittle since the odds of a valid texlive (with all the needed packages) existing by default in the doc build env has to be low.

I did just find tectonic. It apparently could be installed as part of the doc build conda env, is a self contained texlive, and pulls needed packages automagically. Seems a bit like overkill for one .png, but it'd probably work

Let's just be sure to leave build instructions somewhere (perhaps commented out in docs/Makefile?).

docs/tex/gateway_kernel_init_sequence_diagram/gateway_kernel_init_sequence_diagram.tex

kevin-bates · 2022-02-10T20:13:55Z

docs/tex/gateway_kernel_init_sequence_diagram/gateway_kernel_init_sequence_diagram.tex

+    \begin{sdblock}{kernel init}{}
+        \begin{call}{browser}{https POST api/sessions}{server}{api/sessions response}
+            \begin{call}{server}{https POST api/sessions}{gateway}{api/sessions response}
+                \begin{call}{gateway}{CoreV1API.create\_namespaced\_pod}{kernel}{kernel connection info\footnote{see https://github.com/jupyter/enterprise\_gateway/pull/877}}


If we wanted to make this generic, we could have something like:

Suggested change

\begin{call}{gateway}{CoreV1API.create\_namespaced\_pod}{kernel}{kernel connection info\footnote{see https://github.com/jupyter/enterprise\_gateway/pull/877}}

\begin{call}{gateway}{RemoteProcessProxy.launch\_process}{kernel}{kernel connection info\footnote{see https://github.com/jupyter/enterprise\_gateway/pull/877}}

There's also a discovery step performed by the process-proxy immediately following the response of the connection info. This is what actually determines the IP to use for the zmq ports (in addition to ensuring the kernel is "visible" via the resource manager's API).

I suspect this would need to add a "bar" for the resource manager in order to show this. Reverse engineering this, I suppose this would add something like...

\newinst[5]{rm}{:resource manager} ... \begin{call}{gateway}{RemoteProcessProxy.confirm\_remote\_startup}{rm}{status and host info}

There's also a discovery step performed by the process-proxy immediately following the response of the connection info. This is what actually determines the IP to use for the zmq ports (in addition to ensuring the kernel is "visible" via the resource manager's API).

I suspect this would need to add a "bar" for the resource manager in order to show this. Reverse engineering this, I suppose this would add something like...

\newinst[5]{rm}{:resource manager} ... \begin{call}{gateway}{RemoteProcessProxy.confirm\_remote\_startup}{rm}{status and host info}

@kevin you're going to have to explain this one a little better. Does information also end up flowing back the frontend from this, or just to the gateway? What's a resource manager in this context? For eg the k8s case, is the resource manager the k8s API itself?

Does information also end up flowing back the frontend from this, or just to the gateway?

No. This phase is only between the remote-process-proxy (EG) and the RM.

What's a resource manager in this context? For eg the k8s case, is the resource manager the k8s API itself?

Yes, for K8s it's the k8s API - and issued by the KubernetesProcessProxy. Same thing for Hadoop YARN (via the YarnClusterProcessProxy). We use a yarn python API that issues REST queries against the YARN RM. Both of these return state and host information. We then set the IP in the connection information from this host info.

…nit_sequence_diagram.tex Co-authored-by: Kevin Bates <[email protected]>

kevin-bates · 2022-02-18T16:32:58Z

Hi Max, I played around with sphinxcontrib-mermaid yesterday and came up with this, which can be expressed directly in the markdown:

The problem right now is that it leaves white space above (and below) the image that is the same "length" as the diagram instructions. I've opened this issue on the repo hoping I'm not doing something correctly.

Here are the instructions. I went ahead and tried to express the resource manager interaction and couch things in more generic terms. If you want to add these into your image, that would be fine. I would, however, like to have these be as dynamic as possible - ultimately, and baking the instructions into the docs seems like a good way to go. Perhaps we could live with the whitespace better if we placed each diagram into its own file and linked to that file until the whitespace issue can be resolved?

This would be wrapped in {mermaid} directive.

    sequenceDiagram
        participant JupyterLab
        participant JupyterServer
        participant EnterpriseGateway
        participant ProcessProxy
        participant Kernel
        participant ResourceManager
        Note left of JupyterLab: fetch kernelspecs
        JupyterLab->>JupyterServer: https GET api/kernelspecs
        JupyterServer->>EnterpriseGateway: https GET api/kernelspecs
        EnterpriseGateway-->>JupyterServer: api/kernelspecs response
        JupyterServer-->>JupyterLab: api/kernelspecs response

        Note left of JupyterLab: kernel init
        JupyterLab->>JupyterServer: https POST api/sessions
        JupyterServer->>EnterpriseGateway: https POST api/kernels
        EnterpriseGateway->>ProcessProxy: launch_process()
        ProcessProxy->>Kernel: launch kernel
        ProcessProxy->>ResourceManager: confirm startup
        Kernel-->>ProcessProxy: connection info
        ResourceManager-->>ProcessProxy: state & host info
        ProcessProxy-->>EnterpriseGateway: complete connection info
        EnterpriseGateway->>Kernel: TCP socket requests
        Kernel-->>EnterpriseGateway: TCP socket handshakes
        EnterpriseGateway-->>JupyterServer: api/kernels response
        JupyterServer-->>JupyterLab: api/sessions response

        JupyterLab->>JupyterServer: ws GET api/kernels
        JupyterServer->>EnterpriseGateway: ws GET api/kernels
        EnterpriseGateway-->>JupyterServer: websocket upgrade response
        JupyterServer-->>JupyterLab: websocket upgrade response

Another issue is that you can see the diagram instructions on refresh. I'm curious if you might have some ideas on what's going on with both the white space and refresh.

kevin-bates · 2022-07-13T22:56:44Z

Note that the aforementioned issue with mermaid sequence diagrams has a reasonable workaround which I've confirmed works.

Rather than continue with this pull request, I'm going to open a new one (co-authored with @telamonian) and go with the mermaid/markdown approach since it seems far less cumbersome from a build perspective.

kevin-bates · 2022-07-13T23:15:46Z

Closing in favor of #1132.

adding the gateway_kernel_init_sequence_diagram.png to the docs

8a26134

- also including the .tex source and a Makefile, in case any of the sequence diagram details need to be corrected or updated in the future

Zsailer mentioned this pull request Feb 10, 2022

Meeting Notes 2022 jupyter-server/team-compass#15

Closed

kevin-bates reviewed Feb 10, 2022

View reviewed changes

telamonian and others added 2 commits February 10, 2022 19:30

Update docs/tex/gateway_kernel_init_sequence_diagram/gateway_kernel_i…

9153994

…nit_sequence_diagram.tex Co-authored-by: Kevin Bates <[email protected]>

Update docs/tex/gateway_kernel_init_sequence_diagram/gateway_kernel_i…

ccb32de

…nit_sequence_diagram.tex Co-authored-by: Kevin Bates <[email protected]>

kevin-bates mentioned this pull request Jul 13, 2022

Add kernel launch sequence diagram to contributors guide #1132

Merged

kevin-bates closed this Jul 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

adding the gateway_kernel_init_sequence_diagram.png to the docs #1039

adding the gateway_kernel_init_sequence_diagram.png to the docs #1039

telamonian commented Feb 10, 2022

kevin-bates commented Feb 10, 2022

kevin-bates left a comment

kevin-bates Feb 10, 2022

telamonian Feb 11, 2022

kevin-bates Feb 10, 2022

telamonian Feb 11, 2022

kevin-bates Feb 17, 2022

kevin-bates Feb 10, 2022

telamonian Feb 11, 2022

kevin-bates Feb 17, 2022

kevin-bates commented Feb 18, 2022 •

edited

Loading

kevin-bates commented Jul 13, 2022

kevin-bates commented Jul 13, 2022


	![Kernel Process Lanuch Sequence Diagram](images/gateway_kernel_init_sequence_diagram.png)

	\begin{call}{gateway}{CoreV1API.create\_namespaced\_pod}{kernel}{kernel connection info\footnote{see https://github.com/jupyter/enterprise\_gateway/pull/877}}
	\begin{call}{gateway}{RemoteProcessProxy.launch\_process}{kernel}{kernel connection info\footnote{see https://github.com/jupyter/enterprise\_gateway/pull/877}}

adding the gateway_kernel_init_sequence_diagram.png to the docs #1039

adding the gateway_kernel_init_sequence_diagram.png to the docs #1039

Conversation

telamonian commented Feb 10, 2022

kevin-bates commented Feb 10, 2022

kevin-bates left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kevin-bates commented Feb 18, 2022 • edited Loading

kevin-bates commented Jul 13, 2022

kevin-bates commented Jul 13, 2022

kevin-bates commented Feb 18, 2022 •

edited

Loading