HC1: Operations SIG 5 Aug 2024
Agenda
What: Meeting to Discuss Improving Node Operations
When: 5 Aug 2024 @ 6PM EST
Where: https://element.zenon.chat/#/room/#sig-operations:hc1.chat
Agenda:
- Discuss follow-up items from previous meeting
- Document action items
- Establish next meeting
If you want to attend please respond (or DM) with your full matrix username and I will invite you to the group. No FUD, anger or BS allowed.
Minutes
Mon, Aug 5, 2024, 12:57:41 - deeznnutz: SIG Meeting today (Monday) at 6PM EST
Mon, Aug 5, 2024, 16:59:48 - deeznnutz: Starting soon
Mon, Aug 5, 2024, 17:00:35 - georgezgeorgez: hello
Mon, Aug 5, 2024, 17:00:56 - deeznnutz: Message deleted
Mon, Aug 5, 2024, 17:01:11 - coinselor: 🫡
Mon, Aug 5, 2024, 17:01:26 - deeznnutz: lets get started
Mon, Aug 5, 2024, 17:01:35 - deeznnutz: Follow up from last meeting... We discussed working on updating the docker compose script to launch a `go-zenon` node with monitoring, etc. I looked into my old project and wanted to make it even easier (one click) to launch a `go-zenon` node. So I changed focus to the bash script I previously wrote. It can be used to launch a node in one command.
Mon, Aug 5, 2024, 17:01:52 - deeznnutz: The repo is here: <https://github.com/go-zenon/go/tree/main>
Mon, Aug 5, 2024, 17:02:01 - deeznnutz: The basic framework is set up. Different options can be called with `--flags`. The default flag is `--deploy`. In an effort to get others involved, I created some enhancements under the issues tab of GitHub. <https://github.com/go-zenon/go/issues> Some of these are pretty easy and others will require a little more work (like Monit for monitoring and reporting).
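A minimal sketch of the flag handling described above, assuming a single action per invocation. Only the `--deploy` default comes from the discussion; `--help` was only proposed later in this meeting, and the function name `parse_action` is hypothetical rather than taken from the repo.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: resolve which action the caller asked for.
# Prints the action name on stdout; returns 1 on an unrecognized argument.
parse_action() {
  local action="deploy"   # default when no flag is given
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --deploy) action="deploy" ;;
      --help)   action="help" ;;
      --*)      echo "unknown flag: $1" >&2; return 1 ;;
      *)        echo "unexpected argument: $1" >&2; return 1 ;;
    esac
    shift
  done
  printf '%s\n' "$action"
}
```

A script built this way would call `parse_action "$@"` once and branch on the printed action, so running it with no arguments behaves like `--deploy`.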
Mon, Aug 5, 2024, 17:03:11 - georgezgeorgez: The group/org name is possibly a little confusing. Since go-zenon is the name of the repo for the znnd node right?
Mon, Aug 5, 2024, 17:03:26 - deeznnutz: ya,
Mon, Aug 5, 2024, 17:03:34 - deeznnutz: that was actually my first question
Mon, Aug 5, 2024, 17:03:41 - deeznnutz: My first question is, where should we host this? My initial thought was for the script to live at go.zenon.sh so we can run it with something like `curl -L https://go.zenon.sh | sudo bash`. Right now I have set up an org called `/go-zenon` and I'm using Coolify to host it, but I still need to mess around with redirects, etc. I would prefer for this to live on GH (without Coolify), but I need to change the way zenon.sh is hosted for go.zenon.sh to redirect correctly. I cannot use an HTML redirect, which is how it's set up now. If we want go.zenon.sh on GitHub it needs a separate org like `/go-zenon`. We could also use go.zenon.network if you don't like go.zenon.sh.
Mon, Aug 5, 2024, 17:04:12 - deeznnutz: I'm open to changing it. It's confusing for sure.
Mon, Aug 5, 2024, 17:05:03 - georgezgeorgez: I would say let's focus on the actual script first. Changing the URL is the easy part. It's not like most people will be typing it out. They'll probably just copy and paste a command.
Mon, Aug 5, 2024, 17:05:20 - deeznnutz: cool. makes sense.
Mon, Aug 5, 2024, 17:05:42 - georgezgeorgez: I fleshed out the Capabilities section here: https://zenon.wiki/index.php/HC1:_Operations_SIG
Mon, Aug 5, 2024, 17:06:09 - deeznnutz: awesome. thank you
Mon, Aug 5, 2024, 17:06:37 - georgezgeorgez: I think those 3 capabilities are good high level of what we are trying to do
Mon, Aug 5, 2024, 17:06:50 - georgezgeorgez:
- run znnd
- monitor it
- be able to report issues
Mon, Aug 5, 2024, 17:07:37 - georgezgeorgez: we're focused on 1. right now
Mon, Aug 5, 2024, 17:07:47 - georgezgeorgez: but i want to make sure we don't overfocus on that as well
Mon, Aug 5, 2024, 17:08:04 - georgezgeorgez: in terms of defining the work
Mon, Aug 5, 2024, 17:08:19 - georgezgeorgez: we should get some initial work defined for each of these
Mon, Aug 5, 2024, 17:08:50 - georgezgeorgez: I have an incentivization tool in PoC right now
Mon, Aug 5, 2024, 17:09:18 - georgezgeorgez: The fundamental concept is Incentives = Priority
Mon, Aug 5, 2024, 17:09:23 - georgezgeorgez: They are one and the same
Mon, Aug 5, 2024, 17:10:09 - deeznnutz: How should we best define the work? For example, monitoring & reporting. I have a pretty good idea of what is needed and have experience with Monit to define what we can monitor and report.
Mon, Aug 5, 2024, 17:10:53 - deeznnutz: should we work to define that scope in the issues in GH?
Mon, Aug 5, 2024, 17:11:51 - georgezgeorgez: I think an Issue is good as long as it is something that can be marked "done"
Mon, Aug 5, 2024, 17:12:00 - alienc0der: https://medium.com/cypher-core/cosmos-how-to-set-up-your-own-network-monitoring-dashboard-fe49c63a8271
Mon, Aug 5, 2024, 17:12:17 - georgezgeorgez: Yeah, Prometheus is pretty standard tooling these days
Mon, Aug 5, 2024, 17:12:40 - georgezgeorgez: Under the Observability capability I put * third party tooling integration
Mon, Aug 5, 2024, 17:12:51 - georgezgeorgez: and prometheus is probably something we will want
Mon, Aug 5, 2024, 17:12:52 - alienc0der: https://docs.cometbft.com/v0.37/core/metrics
Mon, Aug 5, 2024, 17:13:15 - alienc0der: CometBFT has built-in support for Prometheus that can be plugged in to a Grafana dashboard.
Mon, Aug 5, 2024, 17:13:41 - georgezgeorgez: When it comes to integration I think that there are two methods. There is direct integration, where we can make the znnd node publish Prometheus metrics
Mon, Aug 5, 2024, 17:13:44 - deeznnutz: I've set up this monitoring and reporting stack for znnd. I have some prebuilt dashboards we can use, in fact.
Mon, Aug 5, 2024, 17:14:03 - georgezgeorgez: The other is to do something like write a collector, which parses logs
Mon, Aug 5, 2024, 17:14:20 - georgezgeorgez: Exporters https://prometheus.io/docs/instrumenting/writing_exporters/
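The log-parsing approach could look roughly like the sketch below, which counts matching lines and prints the result in the Prometheus text exposition format. The metric name `znnd_log_error_lines`, the function name, and the assumption that problem lines contain the word "error" are all hypothetical; the real znnd log format would need checking.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: turn a log file into one Prometheus gauge.
# Usage: render_log_metrics /path/to/logfile
render_log_metrics() {
  local log_file="$1" errors
  [ -r "$log_file" ] || { echo "cannot read: $log_file" >&2; return 1; }
  # grep -c prints the match count but exits 1 when it is 0, so ignore status
  errors=$(grep -ci 'error' "$log_file" || true)
  printf '# HELP znnd_log_error_lines Lines containing "error" in the scanned log.\n'
  printf '# TYPE znnd_log_error_lines gauge\n'
  printf 'znnd_log_error_lines %s\n' "$errors"
}
```

The output could be written to a `.prom` file for node_exporter's textfile collector, which is one common way to expose script-generated metrics without changing the node itself.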
Mon, Aug 5, 2024, 17:15:53 - georgezgeorgez: deeznnutz: I think the general approach for you is: draw from your own experience, ask the community for alternatives, create some placeholder issues for both, and then the community can help decide. For now the community can be just this SIG; later on, we'll have other signalling methods.
Mon, Aug 5, 2024, 17:16:44 - georgezgeorgez: So I'm thinking back to some of the crashes we had with znnd. Basically people just said things were crashing, and then some devs went and investigated.
Mon, Aug 5, 2024, 17:16:58 - georgezgeorgez: I'd like for the next crash (lol) to be a bit smoother
Mon, Aug 5, 2024, 17:17:23 - georgezgeorgez: Where community members can file a report somewhere with some info
Mon, Aug 5, 2024, 17:17:38 - georgezgeorgez: Which will help devs get to the issue more quickly
Mon, Aug 5, 2024, 17:17:46 - deeznnutz: sort of like a log dump
Mon, Aug 5, 2024, 17:18:07 - georgezgeorgez: yeah, so that's the essence of the Community Support capability I put on the wiki
Mon, Aug 5, 2024, 17:18:20 - georgezgeorgez: need a place for people to dump things
Mon, Aug 5, 2024, 17:18:25 - georgezgeorgez: need to help them with what to dump
Mon, Aug 5, 2024, 17:18:30 - georgezgeorgez: and how to get it
Mon, Aug 5, 2024, 17:18:44 - georgezgeorgez: and again, all of this is relevant when it comes to testnets and new features
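The "what to dump and how to get it" part could be reduced to a single helper that bundles logs with basic system information. This is only a sketch: `collect_report`, the archive layout, and the choice of `uname -a` as the system summary are assumptions, and the log directory is passed in because the node's actual log location is not stated here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: bundle a log directory plus system info into one archive.
# Usage: collect_report <log_dir> <output.tar.gz>
collect_report() {
  local log_dir="$1" out_file="$2" stage status
  [ -d "$log_dir" ] || { echo "no such log directory: $log_dir" >&2; return 1; }
  stage=$(mktemp -d) || return 1
  # Work on a staged copy so the live logs are never modified
  cp -R "$log_dir" "$stage/logs" &&
    uname -a > "$stage/system-info.txt" &&
    tar -czf "$out_file" -C "$stage" .
  status=$?
  rm -rf "$stage"
  return "$status"
}
```

A reporter would then attach the resulting archive to an issue, after checking that the logs contain nothing sensitive.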
Mon, Aug 5, 2024, 17:19:56 - deeznnutz: Ok, so the takeaway action is to focus on finding a system we can use to monitor / report
Mon, Aug 5, 2024, 17:20:03 - deeznnutz: monit is one and Grafana is another
Mon, Aug 5, 2024, 17:20:16 - deeznnutz: From what I know about both, Grafana is way better
Mon, Aug 5, 2024, 17:20:54 - deeznnutz: Monit really is a "static" monitoring tool and does not have nice dashboards.
Mon, Aug 5, 2024, 17:21:08 - georgezgeorgez: https://opentelemetry.io/
Mon, Aug 5, 2024, 17:21:14 - georgezgeorgez: This is something I've been meaning to look into as well
Mon, Aug 5, 2024, 17:21:40 - deeznnutz: OK - I can check that out too.
Mon, Aug 5, 2024, 17:23:26 - georgezgeorgez: How do we want to prioritize things for now? I'm still working on the incentivization PoC, but we can start picking up things now and fill in the incentives later. Most of the first things won't be too big in any case.
Mon, Aug 5, 2024, 17:24:18 - deeznnutz: I have one more question before I answer that. How do you feel about offering a bootstrap to restore from?
Mon, Aug 5, 2024, 17:25:03 - georgezgeorgez: I think it is good as an internal tool. If you create the bootstrap and get a signature on it, you can trust it. Or if you've created it yourself before, you can get one from someone else, if the signature matches.
Mon, Aug 5, 2024, 17:25:05 - deeznnutz: all the other items are pretty EZ. The monitoring / reporting will take some time. the other items on the list will be a few hours. Bootstrapping could take a little more time
Mon, Aug 5, 2024, 17:25:20 - georgezgeorgez: But I would not advocate everyone just blindly using a bootstrap
Mon, Aug 5, 2024, 17:25:35 - deeznnutz: should this bash script allow the user to restore from bootstrap?
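One way the script could guard a bootstrap restore is to compare the archive's SHA-256 digest against a value published by someone the operator trusts. This is a simpler stand-in for the signature check discussed above, not the project's actual mechanism; `verify_bootstrap` and its arguments are hypothetical, and it assumes GNU `sha256sum` is installed.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: succeed only if the archive matches a trusted digest.
# Usage: verify_bootstrap <archive> <expected_sha256_hex>
verify_bootstrap() {
  local archive="$1" expected="$2" actual
  [ -r "$archive" ] || return 1
  # sha256sum prints "<digest>  <filename>"; keep only the digest
  actual=$(sha256sum "$archive" | awk '{print $1}')
  # Refuse when the digest is empty or differs from the trusted value
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A restore step would run only when `verify_bootstrap` succeeds, and would otherwise fall back to syncing and validating the whole chain.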
Mon, Aug 5, 2024, 17:25:38 - coinselor: Alien suggested: "A step-by-step interactive tutorial for new aliens with local setup or different Cloud providers would be nice." Isn't this basically a one line script to copy paste and be done? I think when we have more optional monitoring/reporting tools we can add like a video instruction going over it.
Mon, Aug 5, 2024, 17:26:27 - georgezgeorgez: I think that's the script that 0x has started putting together right?
Mon, Aug 5, 2024, 17:26:29 - deeznnutz: yes it is a one liner.
Mon, Aug 5, 2024, 17:26:43 - georgezgeorgez: like a curl sh command
Mon, Aug 5, 2024, 17:26:59 - georgezgeorgez: oh, one thing longer term, I'll add to the wiki. Maybe we want to package it for different OSes
Mon, Aug 5, 2024, 17:27:06 - georgezgeorgez: deb, rpm, brew
Mon, Aug 5, 2024, 17:27:09 - georgezgeorgez: for the future
Mon, Aug 5, 2024, 17:27:16 - georgezgeorgez: but let's slowly get all that stuff defined
Mon, Aug 5, 2024, 17:27:33 - coinselor: Maybe we can do a quick landing page like this one: https://omakub.org - it's basically the same, a one liner that installs a bunch of stuff and explains to you all that you need to know. We could add:
- Hardware requirements
- Instructions
- etc
Mon, Aug 5, 2024, 17:27:55 - georgezgeorgez: Yup, hardware recommendations/requirements is definitely something we need to provide
Mon, Aug 5, 2024, 17:28:39 - deeznnutz: So in terms of priorities we basically have
- EZ enhancements
- add bootstrapping
- monitoring / reporting
Mon, Aug 5, 2024, 17:29:20 - georgezgeorgez: Is bootstrapping a priority?
Mon, Aug 5, 2024, 17:29:25 - deeznnutz: Maybe we all try to work on #1 in the coming week and research #3?
Mon, Aug 5, 2024, 17:29:30 - deeznnutz: not really
Mon, Aug 5, 2024, 17:29:37 - deeznnutz: But I have it mostly done
Mon, Aug 5, 2024, 17:29:37 - georgezgeorgez: For me it's more important as a recovery feature
Mon, Aug 5, 2024, 17:30:00 - georgezgeorgez: Nodes really should validate the whole chain
Mon, Aug 5, 2024, 17:30:21 - deeznnutz: ya and with these new performance improvements it takes < 1d
Mon, Aug 5, 2024, 17:30:46 - coinselor: It can be useful when upgrading or sentrifying to avoid downtime
Mon, Aug 5, 2024, 17:31:05 - georgezgeorgez: Okay so that brings me back to my question
Mon, Aug 5, 2024, 17:31:16 - georgezgeorgez: I want to pick up a small ticket to do before our next meeting
Mon, Aug 5, 2024, 17:31:26 - georgezgeorgez: Let's say others do as well
Mon, Aug 5, 2024, 17:31:48 - georgezgeorgez: 0x, how do you want to prioritize it and track who is doing what for now? And to give bounties etc. later?
Mon, Aug 5, 2024, 17:31:56 - georgezgeorgez: If you want, you can just dictate
Mon, Aug 5, 2024, 17:32:28 - deeznnutz: looking at the list.
Mon, Aug 5, 2024, 17:32:28 - georgezgeorgez: Mr Chair
Mon, Aug 5, 2024, 17:32:42 - deeznnutz: lol - support for ARM64
Mon, Aug 5, 2024, 17:33:11 - deeznnutz: I was going to add a troubleshoot flag.
Mon, Aug 5, 2024, 17:33:31 - deeznnutz: so maybe those two in the next week and we research monitoring and reporting
Mon, Aug 5, 2024, 17:33:45 - deeznnutz: georgezgeorgez: do you want to take the arm one?
Mon, Aug 5, 2024, 17:33:59 - deeznnutz: coinselor: fix ascii art
Mon, Aug 5, 2024, 17:34:04 - georgezgeorgez: Sure. Do you want to assign placeholder bounties amount to start? Or figure that out later?
Mon, Aug 5, 2024, 17:34:16 - deeznnutz: me troubleshooting flag.
Mon, Aug 5, 2024, 17:34:19 - deeznnutz: maybe later.
Mon, Aug 5, 2024, 17:34:29 - deeznnutz: some of these could literally take 30 minutes
Mon, Aug 5, 2024, 17:34:51 - georgezgeorgez: Makes sense. Let's try and get something accomplished and then we have some credibility for incentives
Mon, Aug 5, 2024, 17:35:20 - georgezgeorgez: If the work is well defined and discrete, we can all track who has done what. And can help with AZ proposals.
Mon, Aug 5, 2024, 17:35:35 - deeznnutz: coinselor: you could also add a `--help` flag for instructions on how to use the script
Mon, Aug 5, 2024, 17:35:36 - georgezgeorgez: 5k/50k for ascii art?
Mon, Aug 5, 2024, 17:35:45 - deeznnutz: lol!! of course
Mon, Aug 5, 2024, 17:36:35 - coinselor: sure, I'll look into it. ASCII art is not my forte, but I will at least run the script and check out your piece of art.
Mon, Aug 5, 2024, 17:37:03 - deeznnutz: OK and we can all research reporting and monitoring by the next meeting in 2w?
Mon, Aug 5, 2024, 17:37:29 - deeznnutz: you can check out some of the stuff I did here
Mon, Aug 5, 2024, 17:37:30 - deeznnutz: https://github.com/0x3639/znndNode
Mon, Aug 5, 2024, 17:37:44 - georgezgeorgez: Yup, I can put together some possible stacks and we can talk through them next meeting.
Mon, Aug 5, 2024, 17:38:17 - deeznnutz: Does Aug 19 at 6PM EST work? coinselor is this too late for you?
Mon, Aug 5, 2024, 17:38:25 - deeznnutz: I can move up if that works for everyone
Mon, Aug 5, 2024, 17:39:53 - georgezgeorgez: I'm optimistic that if we show a good cadence, that we have a few more devs in the community that are capable in this domain.
Mon, Aug 5, 2024, 17:40:06 - coinselor: It's ok for me rn. Maybe after we adjust daylight savings again it would be nicer one hour earlier.
Mon, Aug 5, 2024, 17:40:25 - georgezgeorgez: I think it can be discouraging to think about developing the network when you don't know cryptography etc. But a lot of the work needed is just every day software development and operations.
Mon, Aug 5, 2024, 17:40:56 - deeznnutz: yes, and a lot of this can be done with ChatGPT. It's really good with bash and Python. Anyone can learn this.
Mon, Aug 5, 2024, 17:41:01 - georgezgeorgez: So hopefully the SIG can be a vehicle for those people to contribute
Mon, Aug 5, 2024, 17:41:37 - deeznnutz: cool. that's all I have. I can post the minutes and follow up items today / tomorrow
Mon, Aug 5, 2024, 17:41:49 - deeznnutz: anything else?
Mon, Aug 5, 2024, 17:41:56 - georgezgeorgez: Sounds good. I'll assign myself the ticket in GitHub
Mon, Aug 5, 2024, 17:42:22 - georgezgeorgez: Thanks for chairing this SIG! I think we're making the small but necessary first steps!
Mon, Aug 5, 2024, 17:42:43 - deeznnutz: my pleasure!!
Mon, Aug 5, 2024, 17:42:53 - georgezgeorgez: I'm fairly flexible on next meeting time. Just post it when it's set.
Mon, Aug 5, 2024, 17:43:08 - deeznnutz: cool - I'll post Aug 19 at 6PM EST
Mon, Aug 5, 2024, 17:43:17 - deeznnutz: thx guys!
Mon, Aug 5, 2024, 17:43:21 - coinselor: 🫡
Follow Up Items
- georgezgeorgez add arm64 support to bash script
- coinselor fix ascii art
- deeznnutz develop troubleshooting flag
- All parties research monitoring & reporting stacks discussed here
Next Meeting
19 August 2024 @ 6:00 PM EST