HC1: Operations SIG 23 Sept 2024

From Zenon Wiki

Agenda

What: Meeting to Discuss Improving Node Operations as part of the HC1: Operations SIG

When: 23 Sep 2024 @ 6PM EST

Where: https://matrix.to/#/#sig-operations:hc1.chat

Chair: 0x3639

Agenda:

  1. Discuss follow-up items from previous meeting
  2. Document action items
  3. Establish next meeting

If you want to attend please respond (or DM) with your full matrix username and I will invite you to the group. No FUD, anger or BS allowed.

Pre-meeting Notes

0x3639

George

Coinselor

Meeting Minutes

Mon, Sep 23, 2024, 17:00:21 - deeznnutz: === START OP SIG ===

Mon, Sep 23, 2024, 17:00:23 - deeznnutz: Hello

Mon, Sep 23, 2024, 17:00:43 - georgezgeorgez: hello

Mon, Sep 23, 2024, 17:01:56 - georgezgeorgez: i guess to start I want to clarify why we have these meetings. We've been doing some good work async over chat. But the meetings help set a cadence and gives us some target dates for completing things. By posting the meeting minutes to the wiki, community members can follow along with just the high-level updates as well.

Mon, Sep 23, 2024, 17:02:50 - georgezgeorgez: We have some other SIGs starting up, such as #sig-syrius.chat, and some of the members here likely have to split our time between them and other initiatives.

Mon, Sep 23, 2024, 17:03:27 - georgezgeorgez: The SIG has agreed to move to a longer cadence, maybe every month or so.

Mon, Sep 23, 2024, 17:03:51 - georgezgeorgez: Now that we've released an initial product and will be relying on community feedback on next steps.

Mon, Sep 23, 2024, 17:04:10 - georgezgeorgez: We've also changed the scope of the SIG from Operations to Operations and Performance.

Mon, Sep 23, 2024, 17:04:27 - georgezgeorgez: Although the skillsets are different, the end outcome of both is the same.

Mon, Sep 23, 2024, 17:04:33 - georgezgeorgez: Community members being able to run infrastructure.

Mon, Sep 23, 2024, 17:05:11 - georgezgeorgez: Okay sorry, let's carry on.

Mon, Sep 23, 2024, 17:05:18 - georgezgeorgez: Just wanted to get some announcements out first.

Mon, Sep 23, 2024, 17:05:27 - deeznnutz: Good intro and thanks for that!!

Mon, Sep 23, 2024, 17:05:45 - deeznnutz: For those tuning in here is a quick summary.

Mon, Sep 23, 2024, 17:05:52 - deeznnutz: We accomplished our stated goals from the last meeting:

  • Gorg finished the znnd dashboard
  • We moved the repo to /hypercore-one/deployment and we released v0.0.0-alpha
  • The Community started to test the script and the feedback has been positive so far
  • We identified one bug. The --start flag did not work. We fixed that today.
  • We started to expand the Issues / Enhancement tracker on GitHub to track the roadmap
  • Coinselor posted on the forum looking for feedback on the most important features: https://forum.hypercore.one/t/community-poll-priority-enhacements-for-deployment-script/492

Mon, Sep 23, 2024, 17:06:32 - deeznnutz: I feel like we've made good progress on this small project and it's been useful on many levels.

Mon, Sep 23, 2024, 17:06:37 - georgezgeorgez: Great work everyone.

Mon, Sep 23, 2024, 17:07:16 - georgezgeorgez: I think the fact that other developers are joining these HC1 SIG channels is showing that this kind of collaboration model is working.

Mon, Sep 23, 2024, 17:07:31 - deeznnutz: ya I agree.

Mon, Sep 23, 2024, 17:07:49 - coinselor: I also enjoy the limited scopes; helps organizing efforts a lot. Good call.

Mon, Sep 23, 2024, 17:07:57 - georgezgeorgez: I really appreciate coinselor putting together that poll because it's exactly the next step we need to take.

Mon, Sep 23, 2024, 17:08:10 - georgezgeorgez: Having the community decide what the priorities are.

Mon, Sep 23, 2024, 17:08:45 - georgezgeorgez: People have expressed to me before how our development feels kind of trapped right now. Or that AZ is hostage to developer requests.

Mon, Sep 23, 2024, 17:09:16 - georgezgeorgez: We don't always get the value of a proposal immediately. But the community does not want to scare away our developers.

Mon, Sep 23, 2024, 17:10:25 - georgezgeorgez: So for these SIGs, I want us to move away from the model of large proposals priced in multiples of AZ.

Mon, Sep 23, 2024, 17:10:42 - georgezgeorgez: We can definitely break things down into small tasks, that the community can immediately get value from.

Mon, Sep 23, 2024, 17:10:48 - georgezgeorgez: And that don't cost a full AZ.

Mon, Sep 23, 2024, 17:11:25 - deeznnutz: makes sense to me. It also keeps the community informed and engaged in what devs are working on

Mon, Sep 23, 2024, 17:11:51 - georgezgeorgez: Maybe we can spend some time elaborating on each of the options in the poll?

Mon, Sep 23, 2024, 17:12:04 - deeznnutz: devs going away for a long time is stressful to community members I think

Mon, Sep 23, 2024, 17:12:34 - georgezgeorgez: yes and it discourages verification

Mon, Sep 23, 2024, 17:12:49 - deeznnutz: Regarding poll options: backup, I tried to summarize my thought here today: https://github.com/hypercore-one/deployment/issues/14

Is this what we are thinking about or something more elaborate?

Mon, Sep 23, 2024, 17:13:46 - georgezgeorgez: Got it, so this item requires a bit of research first before implementation.

Mon, Sep 23, 2024, 17:14:28 - coinselor: <@georgezgeorgez.chat "I really appreciate coinselor putting together..."> On this note, I haven't spread the poll in other channels to see if we wanted to edit it / add more options from today.

I also think we can clean up low-hanging fruits (like system checks) by next meeting and have a more 'stable' release we can publicly offer.

Mon, Sep 23, 2024, 17:14:54 - georgezgeorgez: I think it might also be good to limit the options

Mon, Sep 23, 2024, 17:14:59 - georgezgeorgez: Maybe to like 5?

Mon, Sep 23, 2024, 17:15:28 - coinselor: we can remove the 'hands-free' one, seems like that's the purpose to begin with anyway

Mon, Sep 23, 2024, 17:15:35 - coinselor: make it easier to run znnd

Mon, Sep 23, 2024, 17:16:05 - georgezgeorgez: Between meetings, we discuss what are likely the most valuable options.

And pick the top few to put in a poll.

We can put the others in a footnote or just on the wiki if people want to see what was left out but aren't following the meeting.

Mon, Sep 23, 2024, 17:16:19 - georgezgeorgez: That being said, I'm really glad to see some of our newer pillars so engaged in these meetings.

Mon, Sep 23, 2024, 17:16:19 - deeznnutz: <@coinselor.chat "we can remove the 'hands-free' one..."> was thinking the same. that is easy to update and we can probably do that while we wait for feedback.

Mon, Sep 23, 2024, 17:17:09 - sugoi: <@coinselor.chat "On this note, I haven't spread the poll..."> the poll is also 1 option only, maybe increase it to max 2?

Mon, Sep 23, 2024, 17:17:13 - georgezgeorgez: <@sugoi.chat "the poll is also 1 option only..."> I think that's fair.

Mon, Sep 23, 2024, 17:18:08 - georgezgeorgez: Yeah I agree also. Some of the ones, if they are basic, we can just do and notify.

Mon, Sep 23, 2024, 17:18:41 - deeznnutz: Are 1 & 3 sort of linked?

Mon, Sep 23, 2024, 17:19:06 - georgezgeorgez: I think 3 goes a bit deeper than 1

Mon, Sep 23, 2024, 17:19:23 - georgezgeorgez: black box vs white box

Mon, Sep 23, 2024, 17:19:57 - georgezgeorgez: real-time monitoring of events would be the sync graphs to start I guess

Mon, Sep 23, 2024, 17:20:03 - deeznnutz: Should we call that simply, local backup (bootstrap)?

Mon, Sep 23, 2024, 17:20:18 - georgezgeorgez: and performance benchmarking gets a bit more specific, maybe testing specific momentum ranges

Mon, Sep 23, 2024, 17:20:33 - coinselor: <@georgezgeorgez.chat "While I hope most community members are able..."> I think to maximize community engagement later on, we could showcase the script capabilities with like a discord community call with video going over it. Maybe a Q&A.

Mon, Sep 23, 2024, 17:20:47 - georgezgeorgez: no longer about monitoring a running node, but specific tests for performance

Mon, Sep 23, 2024, 17:21:38 - georgezgeorgez: I do think we should have a mix of feedback channels. Maybe every month or even ad-hoc as needed, we post polls like you've done

Mon, Sep 23, 2024, 17:21:53 - georgezgeorgez: Maybe once a quarter or something we do a keynote event or something lmao

Mon, Sep 23, 2024, 17:22:15 - georgezgeorgez: and do demos

Mon, Sep 23, 2024, 17:22:48 - georgezgeorgez: We can definitely do one to kick off 2025, another zenon year

Mon, Sep 23, 2024, 17:22:56 - georgezgeorgez: do you think we can squeeze one in before then?

Mon, Sep 23, 2024, 17:23:19 - deeznnutz: sure, why not?

Mon, Sep 23, 2024, 17:23:35 - coinselor: seems plenty of time, even to add the pretty colors to the cli 😏

Mon, Sep 23, 2024, 17:23:45 - deeznnutz: I'm curious, do we think this project can continue to live as a bunch of bash scripts?

Mon, Sep 23, 2024, 17:23:53 - georgezgeorgez: okay maybe let's do one in October. practice one. it's okay if it's a little scuffed

Mon, Sep 23, 2024, 17:24:07 - georgezgeorgez: well i think we've discussed a bit before about multi node orchestration

Mon, Sep 23, 2024, 17:24:25 - deeznnutz: ya, with ansible

Mon, Sep 23, 2024, 17:24:25 - georgezgeorgez: e.g. 1 pillar + 2 sentries

Mon, Sep 23, 2024, 17:24:28 - coinselor: <@deeznnutz.chat "I'm curious, do we think this project can..."> TON has a python wrapper, I don't think it's necessary but probably easier for developers. Bash not that friendly, I guess.

Mon, Sep 23, 2024, 17:24:45 - georgezgeorgez: yeah could be with ansible which is actually how most of us deployed these matrix servers i think

Mon, Sep 23, 2024, 17:25:13 - deeznnutz: https://docs.thorchain.org/thornodes/deploying

Mon, Sep 23, 2024, 17:25:30 - deeznnutz: eventually we will need a good set of docs / readme's

Mon, Sep 23, 2024, 17:25:46 - georgezgeorgez: i use kubernetes quite a bit for myself but i don't think we need to get to that level of complexity yet

Mon, Sep 23, 2024, 17:26:04 - georgezgeorgez: if we are at a point where we are dynamically provisioning new testnets on demand

Mon, Sep 23, 2024, 17:26:13 - georgezgeorgez: kubernetes might become useful to the wider community

Mon, Sep 23, 2024, 17:26:28 - georgezgeorgez: i think the next level of maturity would be something like ansible

Mon, Sep 23, 2024, 17:26:53 - deeznnutz: agree - really just showing off their docs and nice instructions https://docs.thorchain.org/thornodes/kubernetes/setup-linode

Mon, Sep 23, 2024, 17:27:09 - georgezgeorgez: yeah thorchain has a lot of operational complexity

Mon, Sep 23, 2024, 17:27:15 - georgezgeorgez: running a node of each bridged network

Mon, Sep 23, 2024, 17:27:29 - georgezgeorgez: so having kubernetes deployment be a widely supported default makes sense

Mon, Sep 23, 2024, 17:28:11 - georgezgeorgez: if someone is running znnd, an orchestrator, an extension chain node, a testnet node, and monitoring it all

Mon, Sep 23, 2024, 17:28:17 - georgezgeorgez: yeah it might be useful to us as well

Mon, Sep 23, 2024, 17:28:19 - georgezgeorgez: but let's build up to that

Mon, Sep 23, 2024, 17:28:32 - deeznnutz: in the poll, should we combine backup and restore into one activity? They really go together

Mon, Sep 23, 2024, 17:28:43 - georgezgeorgez: agree

Mon, Sep 23, 2024, 17:28:46 - georgezgeorgez: it's a single feature

Mon, Sep 23, 2024, 17:29:06 - deeznnutz: if we do that and remove the hands-free we are down to 5

Mon, Sep 23, 2024, 17:29:17 - georgezgeorgez: nice

Mon, Sep 23, 2024, 17:29:37 - georgezgeorgez: i think if we present too much choice, it gets confusing

Mon, Sep 23, 2024, 17:29:59 - georgezgeorgez: long term, I would expect themes to emerge as well

Mon, Sep 23, 2024, 17:30:13 - georgezgeorgez: deployment, monitoring, resilience

Mon, Sep 23, 2024, 17:30:22 - deeznnutz: If we do go with backup / restore I have two very well tested scripts I've been using that we can leverage.

Mon, Sep 23, 2024, 17:30:27 - georgezgeorgez: and each of these will have a level of maturity

Mon, Sep 23, 2024, 17:30:43 - georgezgeorgez: the options could just be, which theme should we try to get to the next level of maturity

Mon, Sep 23, 2024, 17:30:46 - georgezgeorgez: which would include these things

Mon, Sep 23, 2024, 17:31:05 - georgezgeorgez: <@deeznnutz.chat "If we do go with backup / restore I have..."> great

Mon, Sep 23, 2024, 17:31:50 - deeznnutz: Sounds like next steps are to update the poll, send it out everywhere for feedback, and then allocate work based on the next features.

Mon, Sep 23, 2024, 17:31:51 - georgezgeorgez: We should really stress though that if you use someone else's backup without verifying it, you might as well not run your own node

Mon, Sep 23, 2024, 17:32:14 - georgezgeorgez: The whole point of running a node is to trustlessly verify the chain

Mon, Sep 23, 2024, 17:32:21 - georgezgeorgez: So taking your own backup and using it is great

Mon, Sep 23, 2024, 17:32:28 - georgezgeorgez: Using someone else's, not so great
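
The "take your own backup and use it" workflow could look roughly like the sketch below. It is not the pair of tested scripts mentioned above and not part of the deployment repo: the function names and the `.sha256` sidecar file are assumptions made for illustration, and the node is assumed to be stopped before its data directory is archived. The checksum only confirms that an archive you created yourself has not changed since; it does not make someone else's backup trustworthy.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_backup(data_dir: Path, archive: Path) -> str:
    """Archive data_dir into a gzip tarball and record its SHA-256.

    The checksum is written next to the archive as '<archive>.sha256'
    (a hypothetical convention chosen for this sketch).
    """
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(data_dir, arcname=data_dir.name)
    checksum = sha256_of(archive)
    archive.with_name(archive.name + ".sha256").write_text(checksum + "\n")
    return checksum


def verify_backup(archive: Path) -> bool:
    """Check the archive against the checksum file written by create_backup."""
    expected = archive.with_name(archive.name + ".sha256").read_text().strip()
    return sha256_of(archive) == expected
```

Calling `verify_backup` before a restore catches a corrupted or swapped archive early, before the data is unpacked and the node restarted.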

Mon, Sep 23, 2024, 17:32:55 - georgezgeorgez: <@deeznnutz.chat "Sounds like next steps are to..."> yup. I'm curious to see what kind of feedback we get.

Mon, Sep 23, 2024, 17:33:00 - deeznnutz: <@georgezgeorgez.chat "You might as well not run your own node..."> agree. I think people do get frustrated with the sync times and some know I have a bootstrap and ask to use it. I try my hardest to avoid that, but some guys get frustrated

Mon, Sep 23, 2024, 17:33:23 - georgezgeorgez: I know we released the initial script, do we need to try and get more comms around it?

Mon, Sep 23, 2024, 17:33:43 - deeznnutz: I've been testing it offline with many people

Mon, Sep 23, 2024, 17:33:52 - georgezgeorgez: have we posted it in all the relevant channels or maybe even go for a tweet

Mon, Sep 23, 2024, 17:33:56 - deeznnutz: maybe 4 or 5 in addition to what we see in the open

Mon, Sep 23, 2024, 17:33:57 - georgezgeorgez: especially for the feedback

Mon, Sep 23, 2024, 17:34:13 - georgezgeorgez: i want people to know how we run things here

Mon, Sep 23, 2024, 17:34:18 - deeznnutz: Yes, let's try for more

Mon, Sep 23, 2024, 17:34:31 - georgezgeorgez: SIGs define work, Community prioritizes it. (with incentives, but that's WIP)

Mon, Sep 23, 2024, 17:34:43 - coinselor: sent an image. (Media omitted)

Mon, Sep 23, 2024, 17:34:57 - georgezgeorgez: <@deeznnutz.chat "maybe 4 or 5 in addition to what..."> Do they have any reservations about joining this group/chat?

Mon, Sep 23, 2024, 17:35:04 - coinselor: deleted the previous poll and set a new one to max 2 votes

Mon, Sep 23, 2024, 17:35:12 - deeznnutz: no I see many here

Mon, Sep 23, 2024, 17:35:20 - deeznnutz: Cap, Stark, Shai

Mon, Sep 23, 2024, 17:35:48 - deeznnutz: One more I'm forgetting

Mon, Sep 23, 2024, 17:35:52 - georgezgeorgez: Okay cool. Summary level is great and needed. But the more we can directly engage users, the better.

Mon, Sep 23, 2024, 17:36:47 - georgezgeorgez: Should we have cloud-specific guides?

Mon, Sep 23, 2024, 17:37:02 - georgezgeorgez: Like, how to deploy znnd on AWS? or how to deploy znnd on Digital Ocean?

Mon, Sep 23, 2024, 17:37:03 - deeznnutz: for now I don't think we need it

Mon, Sep 23, 2024, 17:37:12 - deeznnutz: it's pretty much the same

Mon, Sep 23, 2024, 17:37:21 - georgezgeorgez: Okay, the most relevant thing would be cloud firewalls

Mon, Sep 23, 2024, 17:37:25 - deeznnutz: only difference is if they don't have apt maybe

Mon, Sep 23, 2024, 17:37:39 - deeznnutz: <@georgezgeorgez.chat "Okay, the most relevant thing wo..."> ya, maybe we handle that in the readme

Mon, Sep 23, 2024, 17:37:43 - georgezgeorgez: well that's just ubuntu/debian based

Mon, Sep 23, 2024, 17:37:50 - coinselor: <@georgezgeorgez.chat "Okay, the most relevant thing wo..."> I think just having a few paragraphs noting the differences / explaining them would suffice

Mon, Sep 23, 2024, 17:38:10 - georgezgeorgez: Sounds good

Mon, Sep 23, 2024, 17:38:44 - georgezgeorgez: Cool, so we'll get that feedback out there. If people could try and share the initial deployment script and the poll that would be great.

Mon, Sep 23, 2024, 17:39:00 - georgezgeorgez: Anything else or should we set a time in about a month?

Mon, Sep 23, 2024, 17:39:10 - deeznnutz: And once we know what features people want we can break that work up

Mon, Sep 23, 2024, 17:39:23 - coinselor: <@deeznnutz.chat "only difference is if they don't..."> script should def check for this, either we don't run it if it's not ubuntu/debian or add support for others. I think it should be trivial to add arch/fedora/etc. The ton script had a check for it.
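
A minimal sketch of the distro check described here, assuming the host exposes the standard `/etc/os-release` file. The deployment script itself is bash; Python is used only for illustration, and the function names and the set of accepted IDs are assumptions rather than anything taken from the repo.

```python
import shlex

# Distro IDs treated as apt-based in this sketch (an assumption, not the script's actual list).
APT_BASED_IDS = {"debian", "ubuntu"}


def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be quoted shell-style; shlex strips the quotes.
        parts = shlex.split(raw)
        info[key] = parts[0] if parts else ""
    return info


def is_apt_based(info: dict[str, str]) -> bool:
    """True if ID, or any entry in ID_LIKE, is a known apt-based distro."""
    ids = {info.get("ID", "").lower()}
    ids.update(info.get("ID_LIKE", "").lower().split())
    return bool(ids & APT_BASED_IDS)


def read_os_release(path: str = "/etc/os-release") -> dict[str, str]:
    """Read and parse the os-release file from disk."""
    with open(path, encoding="utf-8") as f:
        return parse_os_release(f.read())
```

Checking `ID_LIKE` as well as `ID` lets derivatives such as Linux Mint pass without being listed one by one. An installer could exit early with a clear message when `is_apt_based(read_os_release())` is false, or branch to another package manager if support for other distros is added.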

Mon, Sep 23, 2024, 17:39:35 - georgezgeorgez: Hmm, I wonder if we should mention Vilkris MR more explicitly in the poll

Mon, Sep 23, 2024, 17:40:08 - georgezgeorgez: Both are important, any tooling or process created as a result of getting those MRs through

Mon, Sep 23, 2024, 17:40:13 - georgezgeorgez: but just getting those MRs through also is good

Mon, Sep 23, 2024, 17:40:33 - deeznnutz: Side note...

Mon, Sep 23, 2024, 17:40:36 - deeznnutz: check out their status

Mon, Sep 23, 2024, 17:40:39 - deeznnutz: sent an image. (Media omitted)

Mon, Sep 23, 2024, 17:40:47 - coinselor: I think we can do that with the comms, meaning we share the poll and point to people being able to test pending MRs?

Mon, Sep 23, 2024, 17:41:09 - georgezgeorgez: sounds good. i just don't want it to get lost that we have those MRs

Mon, Sep 23, 2024, 17:41:14 - deeznnutz: I'm going to add that as an enhancement. let's not put on the poll. we can track separately

Mon, Sep 23, 2024, 17:41:39 - georgezgeorgez: <@deeznnutz.chat "sent an image."> haha well our ascii art is in good company

Mon, Sep 23, 2024, 17:42:16 - georgezgeorgez: yes, it might be cool to have a summary screen like this

Mon, Sep 23, 2024, 17:42:26 - georgezgeorgez: and then maybe a countdown that you can interrupt before it just takes off

Mon, Sep 23, 2024, 17:43:09 - deeznnutz: I think we have some good next steps. I have time this week to work on the hands-free changes. it should be pretty simple

Mon, Sep 23, 2024, 17:43:33 - georgezgeorgez: With a longer cadence, we should try and get our GitHub issues in a better spot.

Mon, Sep 23, 2024, 17:43:35 - georgezgeorgez: I'm guilty

Mon, Sep 23, 2024, 17:43:43 - deeznnutz: How does 21 Oct 24 @ 6PM EST work for the next meeting?

Mon, Sep 23, 2024, 17:43:59 - georgezgeorgez: and actually use them as part of our regular workflow

Mon, Sep 23, 2024, 17:44:28 - georgezgeorgez: mondays have been good for me so far

Mon, Sep 23, 2024, 17:44:34 - deeznnutz: cap: sugoi any other feedback you want to share?

Mon, Sep 23, 2024, 17:45:32 - georgezgeorgez: Hmm before we end, going back to incentives and community driving the work.

Mon, Sep 23, 2024, 17:45:51 - sugoi: <@deeznnutz.chat "cap: sugoi any other feedback you..."> I did not have the time yet to implement Vilkris' v2 thingy, but the script ran smooth. No issues there

Mon, Sep 23, 2024, 17:47:05 - georgezgeorgez: What I want to create here is like an engine with a little fire. We are going along right now keeping that fire alive and slowly chugging along. Each SIG is like an engine.

The community's job is to decide which SIG to throw fuel (incentives) into so that we go much much faster.

Mon, Sep 23, 2024, 17:47:15 - cap: <@deeznnutz.chat "cap: sugoi any other feedback yo..."> nope, i think you guys are on the right track

Mon, Sep 23, 2024, 17:47:47 - georgezgeorgez: Wherever they put the fuel, we keep going.

Mon, Sep 23, 2024, 17:48:10 - georgezgeorgez: If Operations work is mostly stable, we still keep a small effort going but can focus on other SIGs.

Mon, Sep 23, 2024, 17:48:21 - coinselor: <@georgezgeorgez.chat "and actually use them as part of..."> I'll try to create smaller PRs for improvements to make the script more robust, tick off the easy stuff!

Mon, Sep 23, 2024, 17:48:22 - georgezgeorgez: At least, that's how I think it can work.

Mon, Sep 23, 2024, 17:49:05 - georgezgeorgez: Thank you sir. It all adds up.

Mon, Sep 23, 2024, 17:49:43 - deeznnutz: Great!! Glad we are humming along making progress.

Mon, Sep 23, 2024, 17:50:45 - georgezgeorgez: Over the next few weeks, we'll come up with some initial demo sessions.

Mon, Sep 23, 2024, 17:50:54 - georgezgeorgez: Thank you everyone, good meeting!

Mon, Sep 23, 2024, 17:51:16 - deeznnutz: I'll follow up on my items this week. Thanks everyone!

Mon, Sep 23, 2024, 17:51:34 - coinselor: I'm not even sure how deeznnutz figured out public nodes were offline, but I look forward to the day we have some bot leveraging logs or something we created to alert aliens in real time

Mon, Sep 23, 2024, 17:52:16 - georgezgeorgez: I think grafana can integrate with alerting services
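
Before any Grafana alerting is wired up, the "is a public node offline" question can be approximated with a plain TCP reachability check. The sketch below is an illustration only: the function names are hypothetical, the host and port list must be supplied by the caller, and a successful connection shows that the port accepts connections, not that the node is synced or healthy.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def offline_nodes(
    nodes: list[tuple[str, int]], timeout: float = 3.0
) -> list[tuple[str, int]]:
    """Return the (host, port) pairs that could not be reached."""
    return [(host, port) for host, port in nodes if not is_reachable(host, port, timeout)]
```

A bot could run `offline_nodes` on a schedule and post to a chat room whenever the returned list is non-empty.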

Mon, Sep 23, 2024, 17:52:27 - georgezgeorgez: but let's continue that chat outside the meeting

Mon, Sep 23, 2024, 17:52:37 - georgezgeorgez: and properly end it for the minutes haha

Mon, Sep 23, 2024, 17:53:03 - georgezgeorgez: if people want details, they'll have to join

Mon, Sep 23, 2024, 17:53:06 - coinselor: Thanks for running it

Mon, Sep 23, 2024, 17:53:08 - deeznnutz: === END SIG ===