HC1: Operations SIG 5 Aug 2024

From Zenon Wiki

Agenda

What: Meeting to Discuss Improving Node Operations

When: 5 Aug 2024 @ 6PM EST

Where: https://element.zenon.chat/#/room/#sig-operations:hc1.chat

Agenda:

  1. Discuss follow-up items from previous meeting
  2. Document action items
  3. Establish next meeting

If you want to attend please respond (or DM) with your full matrix username and I will invite you to the group. No FUD, anger or BS allowed.

Minutes

Mon, Aug 5, 2024, 12:57:41 - deeznnutz: SIG Meeting today (Monday) at 6PM EST

Mon, Aug 5, 2024, 16:59:48 - deeznnutz: Starting zoon

Mon, Aug 5, 2024, 17:00:35 - georgezgeorgez: hello

Mon, Aug 5, 2024, 17:00:56 - deeznnutz: Message deleted

Mon, Aug 5, 2024, 17:01:11 - coinselor: 🫡

Mon, Aug 5, 2024, 17:01:26 - deeznnutz: lets get started

Mon, Aug 5, 2024, 17:01:35 - deeznnutz: Follow up from last meeting... We discussed working on updating the docker compose script to launch a `go-zenon` node with monitoring, etc. I looked into my old project and wanted to make it even easier (one click) to launch a `go-zenon` node. So I changed focus to the bash script I previously wrote. It can be used to launch a node in one command.

Mon, Aug 5, 2024, 17:01:52 - deeznnutz: The repo is here: https://github.com/go-zenon/go/tree/main

Mon, Aug 5, 2024, 17:02:01 - deeznnutz: The basic framework is set up. Different options can be called with `--flags`. The default flag is `--deploy`. In an effort to get others involved, I created some enhancements under the issues tab of GitHub: <https://github.com/go-zenon/go/issues>. Some of these are pretty easy and others will require a little more work (like Monit for monitoring and reporting).
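
The flag handling described in this message could look roughly like the sketch below. It is not taken from the repo: the function name `parse_action` and the `--help` branch are hypothetical placeholders, and only the `--deploy` default comes from the discussion.

```shell
# Sketch only: every flag except --deploy is a hypothetical placeholder.

# Print the action for the first argument; no argument means --deploy.
parse_action() {
    local flag="${1:-}"
    if [ -z "${flag}" ]; then
        flag="--deploy"
    fi
    case "${flag}" in
        --deploy) echo "deploy" ;;
        --help)   echo "help" ;;
        *)        echo "unknown"; return 1 ;;
    esac
}

# Example: dispatch on whatever arguments the script received.
parse_action "$@" || echo "Unrecognized option: ${1:-}" >&2
```

Keeping the parsing in one function makes it easy to add a new flag as a single extra `case` branch.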

Mon, Aug 5, 2024, 17:03:11 - georgezgeorgez: The group/org name is possibly a little confusing. Since go-zenon is the name of the repo for the znnd node right?

Mon, Aug 5, 2024, 17:03:26 - deeznnutz: ya,

Mon, Aug 5, 2024, 17:03:34 - deeznnutz: that was actually my first question

Mon, Aug 5, 2024, 17:03:41 - deeznnutz: My first question is, where should we host this? My initial thought was for the script to live at go.zenon.sh so we can run it with something like `curl -L https://go.zenon.sh | sudo bash`. Right now I have set up an org called `/go-zenon` and I'm using Coolify to host it, but I still need to mess around with redirects, etc. I would prefer for this to live at GH (without Coolify), but I need to change the way zenon.sh is hosted for go.zenon.sh to redirect correctly. I cannot HTML redirect, which is how it's set up now. If we want go.zenon.sh on GitHub it needs a separate org like `/go-zenon`. We could also use go.zenon.network if you don't like go.zenon.sh.

Mon, Aug 5, 2024, 17:04:12 - deeznnutz: I'm open to changing it. It's confusing for sure.

Mon, Aug 5, 2024, 17:05:03 - georgezgeorgez: I would say let's focus on the actual script first. Changing the URL is the easy part. It's not like most people will be typing it out. They'll probably just copy and paste a command.

Mon, Aug 5, 2024, 17:05:20 - deeznnutz: cool. makes sense.

Mon, Aug 5, 2024, 17:05:42 - georgezgeorgez: I fleshed out the Capabilities section here: https://zenon.wiki/index.php/HC1:_Operations_SIG

Mon, Aug 5, 2024, 17:06:09 - deeznnutz: awesome. thank you

Mon, Aug 5, 2024, 17:06:37 - georgezgeorgez: I think those 3 capabilities are good high level of what we are trying to do

Mon, Aug 5, 2024, 17:06:50 - georgezgeorgez:

  1. run znnd
  2. monitor it
  3. be able to report issues

Mon, Aug 5, 2024, 17:07:37 - georgezgeorgez: we're focused on 1. right now

Mon, Aug 5, 2024, 17:07:47 - georgezgeorgez: but i want to make sure we don't overfocus on that as well

Mon, Aug 5, 2024, 17:08:04 - georgezgeorgez: in terms of defining the work

Mon, Aug 5, 2024, 17:08:19 - georgezgeorgez: we should get some initial work defined for each of these

Mon, Aug 5, 2024, 17:08:50 - georgezgeorgez: I have an incentivization tool in PoC right now

Mon, Aug 5, 2024, 17:09:18 - georgezgeorgez: The fundamental concept is Incentives = Priority

Mon, Aug 5, 2024, 17:09:23 - georgezgeorgez: They are one and the same

Mon, Aug 5, 2024, 17:10:09 - deeznnutz: How should we best define the work. For example, monitoring & reporting. I have a pretty good idea of what is needed and have experience with Monit to define what we can monitor and report

Mon, Aug 5, 2024, 17:10:53 - deeznnutz: should we work to define that scope in the issues in GH?

Mon, Aug 5, 2024, 17:11:51 - georgezgeorgez: I think an Issue is good as long as it is something that can be marked "done"

Mon, Aug 5, 2024, 17:12:00 - alienc0der: https://medium.com/cypher-core/cosmos-how-to-set-up-your-own-network-monitoring-dashboard-fe49c63a8271

Mon, Aug 5, 2024, 17:12:17 - georgezgeorgez: Yeah, Prometheus is pretty standard tooling these days

Mon, Aug 5, 2024, 17:12:40 - georgezgeorgez: Under the Observability capability I put "third party tooling integration"

Mon, Aug 5, 2024, 17:12:51 - georgezgeorgez: and prometheus is probably something we will want

Mon, Aug 5, 2024, 17:12:52 - alienc0der: https://docs.cometbft.com/v0.37/core/metrics

Mon, Aug 5, 2024, 17:13:15 - alienc0der: CometBFT has built-in support for Prometheus that can be plugged in to a Grafana dashboard.

Mon, Aug 5, 2024, 17:13:41 - georgezgeorgez: When it comes to integration I think that there are two methods. There is direct integration, where we can make the znnd node publish Prometheus metrics

Mon, Aug 5, 2024, 17:13:44 - deeznnutz: I've set up this monitoring and reporting stack for znnd. I have some prebuilt dashboards we can use, in fact.

Mon, Aug 5, 2024, 17:14:03 - georgezgeorgez: The other is to do something like write a collector, which parses logs

Mon, Aug 5, 2024, 17:14:20 - georgezgeorgez: Exporters https://prometheus.io/docs/instrumenting/writing_exporters/
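
The second method (a collector that parses logs) could be sketched as below. This is only an illustration: the metric name `znnd_log_error_lines` is hypothetical, and the sketch assumes that problem lines in the node log contain the word "error", which has not been checked against the actual znnd log format. The output follows the Prometheus text exposition format, so it could be written to a `.prom` file for the node_exporter textfile collector.

```shell
# Sketch only: the metric name and the "error" keyword are assumptions.

# Read log lines on stdin and print one gauge in Prometheus text format.
emit_error_metric() {
    local count
    # grep -c prints 0 but exits non-zero when nothing matches.
    count="$(grep -ci 'error' || true)"
    printf '# HELP znnd_log_error_lines Log lines containing the word error.\n'
    printf '# TYPE znnd_log_error_lines gauge\n'
    printf 'znnd_log_error_lines %s\n' "${count}"
}

# Usage (the log path is a placeholder):
#   emit_error_metric < /path/to/znnd.log > znnd.prom
```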

Mon, Aug 5, 2024, 17:15:53 - georgezgeorgez: deeznnutz: I think the general approach for you is: draw from your own experience, ask the community for alternatives, create some placeholder issues for both, and then the community can help decide. For now the community can be just this SIG; later on, we'll have other signalling methods.

Mon, Aug 5, 2024, 17:16:44 - georgezgeorgez: So I'm thinking back to some of the crashes we had with znnd. Basically people just said things were crashing. And then some devs went and investigated.

Mon, Aug 5, 2024, 17:16:58 - georgezgeorgez: I'd like for the next crash (lol) to be a bit smoother

Mon, Aug 5, 2024, 17:17:23 - georgezgeorgez: Where community members can file a report somewhere with some info

Mon, Aug 5, 2024, 17:17:38 - georgezgeorgez: Which will help devs get to the issue more quickly

Mon, Aug 5, 2024, 17:17:46 - deeznnutz: sort of like a log dump

Mon, Aug 5, 2024, 17:18:07 - georgezgeorgez: yeah, so that's the essence of the Community Support capability I put on the wiki

Mon, Aug 5, 2024, 17:18:20 - georgezgeorgez: need a place for people to dump things

Mon, Aug 5, 2024, 17:18:25 - georgezgeorgez: need to help them with what to dump

Mon, Aug 5, 2024, 17:18:30 - georgezgeorgez: and how to get it

Mon, Aug 5, 2024, 17:18:44 - georgezgeorgez: and again, all of this is relevant when it comes to testnets and new features
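
A "what to dump and how to get it" helper could start as small as the sketch below. The function name `collect_report` is hypothetical and the log path is passed in by the caller, because the location of the znnd log is not stated in this meeting.

```shell
# Sketch only: the log path is supplied by the caller, not assumed.

# Print basic system facts plus the tail of a log file for a bug report.
collect_report() {
    local log_file="$1" lines="${2:-50}"
    echo "== system =="
    uname -srm
    echo "== disk =="
    df -h /
    echo "== last ${lines} log lines (${log_file}) =="
    if [ -r "${log_file}" ]; then
        tail -n "${lines}" "${log_file}"
    else
        echo "log file not readable"
    fi
}

# Usage (the path is a placeholder):
#   collect_report /path/to/znnd.log 200 > report.txt
```

The resulting text file is something a community member could attach to an issue without needing to know which commands to run.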

Mon, Aug 5, 2024, 17:19:56 - deeznnutz: Ok so the take away action is to focus on finding a system we can use to monitor / report

Mon, Aug 5, 2024, 17:20:03 - deeznnutz: monit is one and Grafana is another

Mon, Aug 5, 2024, 17:20:16 - deeznnutz: From what I know about both, Grafana is way better

Mon, Aug 5, 2024, 17:20:54 - deeznnutz: Monit really is a "static" monitoring tool and does not have nice dashboards.

Mon, Aug 5, 2024, 17:21:08 - georgezgeorgez: https://opentelemetry.io/

Mon, Aug 5, 2024, 17:21:14 - georgezgeorgez: This is something I've been meaning to look into as well

Mon, Aug 5, 2024, 17:21:40 - deeznnutz: OK - I can check that out too.

Mon, Aug 5, 2024, 17:23:26 - georgezgeorgez: How do we want to prioritize things for now? I'm still working on the incentivization PoC, but we can start picking up things now and fill in the incentives later. Most of the first things won't be too big in any case.

Mon, Aug 5, 2024, 17:24:18 - deeznnutz: i have one more question, before I answer that. how do you feel about offering a bootstrap to restore

Mon, Aug 5, 2024, 17:25:03 - georgezgeorgez: I think it is good as an internal tool. If you create the bootstrap, get a signature on it. You can trust it. Or if you've created it yourself before. You can get one from someone else, if the signature matches.

Mon, Aug 5, 2024, 17:25:05 - deeznnutz: all the other items are pretty EZ. The monitoring / reporting will take some time. the other items on the list will be a few hours. Bootstrapping could take a little more time

Mon, Aug 5, 2024, 17:25:20 - georgezgeorgez: But I would not advocate everyone just blindly using a bootstrap
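
One way to avoid blindly using a bootstrap is to compare it against a value published by someone trusted. The sketch below uses a plain SHA-256 checksum comparison as a simpler stand-in for the signature mentioned above: a checksum only shows that the file matches the published value, not who produced it, so a real signature scheme would need more than this. The function name `verify_bootstrap` and the file name in the usage line are hypothetical.

```shell
# Sketch only: a checksum comparison, not a cryptographic signature check.

# Succeed only if the file's SHA-256 digest equals the expected hex string.
verify_bootstrap() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "${file}" | awk '{print $1}')"
    [ "${actual}" = "${expected}" ]
}

# Usage (file name and digest are placeholders):
#   verify_bootstrap bootstrap.zip "<published sha256>" || echo "mismatch" >&2
```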

Mon, Aug 5, 2024, 17:25:35 - deeznnutz: should this bash script allow the user to restore from bootstrap?

Mon, Aug 5, 2024, 17:25:38 - coinselor: Alien suggested: "A step-by-step interactive tutorial for new aliens with local setup or different Cloud providers would be nice." Isn't this basically a one line script to copy paste and be done? I think when we have more optional monitoring/reporting tools we can add like a video instruction going over it.

Mon, Aug 5, 2024, 17:26:27 - georgezgeorgez: I think that's the script that 0x has started putting together right?

Mon, Aug 5, 2024, 17:26:29 - deeznnutz: yes it is a one liner.

Mon, Aug 5, 2024, 17:26:43 - georgezgeorgez: like a curl sh command

Mon, Aug 5, 2024, 17:26:59 - georgezgeorgez: oh one thing longer term, I'll add to the wiki. Maybe we want to package it for different OSes

Mon, Aug 5, 2024, 17:27:06 - georgezgeorgez: deb, rpm, brew

Mon, Aug 5, 2024, 17:27:09 - georgezgeorgez: for the future

Mon, Aug 5, 2024, 17:27:16 - georgezgeorgez: but let's slowly get all that stuff defined

Mon, Aug 5, 2024, 17:27:33 - coinselor: Maybe we can do a quick landing page like this one: https://omakub.org - it's basically the same, a one liner that installs a bunch of stuff and explains to you all that you need to know. We could add:

  • Hardware requirements
  • Instructions
  • etc

Mon, Aug 5, 2024, 17:27:55 - georgezgeorgez: Yup, hardware recommendations/requirements is definitely something we need to provide

Mon, Aug 5, 2024, 17:28:39 - deeznnutz: So in terms of priorities we basically have

  1. EZ enhancements
  2. add bootstrapping
  3. monitoring / reporting

Mon, Aug 5, 2024, 17:29:20 - georgezgeorgez: Is bootstrapping a priority?

Mon, Aug 5, 2024, 17:29:25 - deeznnutz: Maybe we all try to work on #1 in the coming week and research #3?

Mon, Aug 5, 2024, 17:29:30 - deeznnutz: not really

Mon, Aug 5, 2024, 17:29:37 - deeznnutz: But I have it mostly done

Mon, Aug 5, 2024, 17:29:37 - georgezgeorgez: For me it's more important as a recovery feature

Mon, Aug 5, 2024, 17:30:00 - georgezgeorgez: Nodes really should validate the whole chain

Mon, Aug 5, 2024, 17:30:21 - deeznnutz: ya and with these new performance improvements it takes < 1d

Mon, Aug 5, 2024, 17:30:46 - coinselor: It can be useful when upgrading or sentrifying to avoid downtime

Mon, Aug 5, 2024, 17:31:05 - georgezgeorgez: Okay so that brings me back to my question

Mon, Aug 5, 2024, 17:31:16 - georgezgeorgez: I want to pick up a small ticket to do before our next meeting

Mon, Aug 5, 2024, 17:31:26 - georgezgeorgez: Let's say others do as well

Mon, Aug 5, 2024, 17:31:48 - georgezgeorgez: 0x, how do you want to prioritize it and track who is doing what for now? And to give later bounties/etc.

Mon, Aug 5, 2024, 17:31:56 - georgezgeorgez: If you want, you can just dictate

Mon, Aug 5, 2024, 17:32:28 - deeznnutz: looking at the list.

Mon, Aug 5, 2024, 17:32:28 - georgezgeorgez: Mr Chair

Mon, Aug 5, 2024, 17:32:42 - deeznnutz: lol - support for ARM64

Mon, Aug 5, 2024, 17:33:11 - deeznnutz: I was going to add a troubleshoot flag.

Mon, Aug 5, 2024, 17:33:31 - deeznnutz: so maybe those two in the next week and we research monitoring and reporting

Mon, Aug 5, 2024, 17:33:45 - deeznnutz: georgezgeorgez: do you want to take the arm one?
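
ARM64 support in a bash installer usually starts with detecting the machine type. The sketch below maps `uname -m` output to a short label. The function name `normalize_arch` and the labels `amd64` and `arm64` are assumptions for illustration, not names taken from the script in the repo.

```shell
# Sketch only: the returned labels (amd64, arm64) are an assumed naming scheme.

# Map a machine name (default: this host's uname -m) to a short label.
normalize_arch() {
    local machine="${1:-$(uname -m)}"
    case "${machine}" in
        x86_64|amd64)  echo "amd64" ;;
        aarch64|arm64) echo "arm64" ;;
        *)             echo "unsupported: ${machine}" >&2; return 1 ;;
    esac
}

# Example: report the architecture of the current host.
normalize_arch || echo "No matching build assumed for this machine" >&2
```

Both `aarch64` (Linux) and `arm64` (macOS) are accepted because `uname -m` reports ARM64 hardware differently across systems.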

Mon, Aug 5, 2024, 17:33:59 - deeznnutz: coinselor: fix ascii art

Mon, Aug 5, 2024, 17:34:04 - georgezgeorgez: Sure. Do you want to assign placeholder bounties amount to start? Or figure that out later?

Mon, Aug 5, 2024, 17:34:16 - deeznnutz: me troubleshooting flag.

Mon, Aug 5, 2024, 17:34:19 - deeznnutz: maybe later.

Mon, Aug 5, 2024, 17:34:29 - deeznnutz: some of these could literally take 30 minutes

Mon, Aug 5, 2024, 17:34:51 - georgezgeorgez: Makes sense. Let's try and get something accomplished and then we have some credibility for incentives

Mon, Aug 5, 2024, 17:35:20 - georgezgeorgez: If the work is well defined and discrete, we can all track who has done what. And can help with AZ proposals.

Mon, Aug 5, 2024, 17:35:35 - deeznnutz: coinselor: you could also add a `--help` flag for instructions on how to use the script

Mon, Aug 5, 2024, 17:35:36 - georgezgeorgez: 5k/50k for ascii art?

Mon, Aug 5, 2024, 17:35:45 - deeznnutz: lol!! of course

Mon, Aug 5, 2024, 17:36:35 - coinselor: sure, I'll look into it. ASCII art not my forte, but I will at least run the script and check out your piece of art.

Mon, Aug 5, 2024, 17:37:03 - deeznnutz: OK and we can all research reporting and monitoring by the next meeting in 2w?

Mon, Aug 5, 2024, 17:37:29 - deeznnutz: you can check out some of the stuff I did here

Mon, Aug 5, 2024, 17:37:30 - deeznnutz: https://github.com/0x3639/znndNode

Mon, Aug 5, 2024, 17:37:44 - georgezgeorgez: Yup, I can put together some possible stacks and we can talk through them next meeting.

Mon, Aug 5, 2024, 17:38:17 - deeznnutz: Does Aug 19 at 6PM EST work? coinselor is this too late for you?

Mon, Aug 5, 2024, 17:38:25 - deeznnutz: I can move up if that works for everyone

Mon, Aug 5, 2024, 17:39:53 - georgezgeorgez: I'm optimistic that if we show a good cadence, that we have a few more devs in the community that are capable in this domain.

Mon, Aug 5, 2024, 17:40:06 - coinselor: It's ok for me rn. Maybe after we adjust daylight savings again it would be nicer one hour earlier

Mon, Aug 5, 2024, 17:40:25 - georgezgeorgez: I think it can be discouraging to think about developing the network when you don't know cryptography etc. But a lot of the work needed is just every day software development and operations.

Mon, Aug 5, 2024, 17:40:56 - deeznnutz: yes and a lot of this can be done with ChatGPT. It's really good with bash and python. Anyone can learn this

Mon, Aug 5, 2024, 17:41:01 - georgezgeorgez: So hopefully the SIG can be a vehicle for those people to contribute

Mon, Aug 5, 2024, 17:41:37 - deeznnutz: cool. that's all I have. I can post the minutes and follow up items today / tomorrow

Mon, Aug 5, 2024, 17:41:49 - deeznnutz: anything else?

Mon, Aug 5, 2024, 17:41:56 - georgezgeorgez: Sounds good. I'll assign myself the ticket in GitHub

Mon, Aug 5, 2024, 17:42:22 - georgezgeorgez: Thanks for chairing this SIG! I think we're making the small but necessary first steps!

Mon, Aug 5, 2024, 17:42:43 - deeznnutz: my pleasure!!

Mon, Aug 5, 2024, 17:42:53 - georgezgeorgez: I'm fairly flexible on next meeting time. Just post it when it's set.

Mon, Aug 5, 2024, 17:43:08 - deeznnutz: cool - I'll post Aug 19 at 6PM EST

Mon, Aug 5, 2024, 17:43:17 - deeznnutz: thx guys!

Mon, Aug 5, 2024, 17:43:21 - coinselor: 🫡

Follow Up Items

  • georgezgeorgez add arm64 support to bash script
  • coinselor fix ascii art
  • deeZNNutz develop troubleshooting flag
  • All parties research monitoring & reporting stacks discussed here

Next Meeting

19 August 2024 @ 6:00 PM EST