July 2019 - Adélie Open Governance - Adelie Linux Mailing Lists

Removing old Git repositories from GitLab

by A. Wilcox

Hello all,

Now that we've migrated to the new GitLab, I think it is time we revisit the repositories we are still keeping (in archived form):

* aports.git and aports-old.git

  Both are over 100 MB and both contain nothing helpful or useful to us any more. They're snapshots of 2017 and 2018 Alpine packages, sometimes with improvements that made it to system/ or user/. If we want to investigate how Alpine did something, we can just look in their current Git instead of an old mirror. I'd like to remove both of these.

* portageplus.git

  These are some patches to an old version of Portage, ca. early 2017, that add support for emitting APK binpkgs. It also defaults to Galapagos instead of Gentoo as a fallback repository. (How many here even remember Galapagos?) We can probably archive the apkkit patch for posterity and then delete the repository, IMO.

* patches.git

  musl compatibility patches for some packages, in /etc/portage/patches format. My suggestion is to take what we haven't packaged in packages.git and save it somewhere (or maybe just package it?), then remove the repository.

* etc-portage.git, apkkit-conf.git, systemsite.git

  Old infrastructure from when we were a Gentoo fork. Maybe archive for posterity, or delete it.

Comments welcome.

--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org


Proposal: Replacing Integricloud with Scaleway

by A. Wilcox

== Table of Contents ==

* Executive Summary
* How Did We Get Here?
* Reliability Is Not Availability
* Enter Scaleway
* Hard Numbers
* Conclusion
* References

== Executive Summary ==

This is a formal proposal to retire the dedicated server we have with Integricloud and replace it with a set of virtual servers from Scaleway.

We originally chose Integricloud's dedicated server offering primarily for reliability and security. While it has proven secure, and the hardware itself is reliable, its availability leaves something to be desired.

Scaleway offers a similar level of reliability, and has a higher level of availability based on our current account with them. They additionally offer servers that are not based on the x86 architecture, so we are still protected from the numerous issues that plague x86.

This will also reduce our hosting costs by almost 90%, and should reduce downtime by nearly 100%.

== How Did We Get Here? ==

In early January 2019, we were notified that both of our dedicated servers at Rack911 were being retired, with very little notice. For additional information, see the adelie-devel@ post with message ID <ba35ebd3-54b4-f18f-b65f-d327e9d0af80(a)adelielinux.org> (archived at [1]).

After our sponsorship was pulled in October 2018, we had done a bit of investigation into replacement hosting providers in the event that this would happen. Our requirements at the time were:

* non-x86 based (due to the plethora of x86 bugs being discovered)
* at least 8 GB RAM
* dedicated hardware preferred
* at least 3 IPv4 addresses

We evaluated Packet.net for ARM64 based systems[2] and Integricloud for PPC64 based systems[3]. We found Integricloud to be approximately 60% of the cost of Packet.net[4]. Additionally, we had a professional working relationship with their parent company, Raptor Engineering, who make the Talos and Blackbird family of computers. In fact, the Integricloud system we were offered was to be a rack-mounted Talos II.
Since we already had a Talos II in use as a build server, we felt this would be close to ideal, as any hardware oddities had already been worked out.

We chose their 4-core (16-thread) PowerPC system with 8 GB RAM and 2 x 1 TB NVMe disk storage. One 1 TB NVMe disk is dedicated to mirrormaster.adelielinux.org. The other 1 TB NVMe disk is an LVM group, shared between the various KVM-based virtual servers run on it.

== Reliability Is Not Availability ==

The Integricloud dedicated server, chloe.adelielinux.org, has had no hardware issues in over eight months of service. The hardware itself has been fast, stable, and very reliable. However, there have been multiple issues regarding availability.

Integricloud has a single-homed fibre infrastructure; per a public looking glass, it is run via Mediacom[5]. This has caused an unforeseen and consistent issue regarding availability.

    Went down          Came back          Duration
    2019-04-16 13:17   2019-04-16 22:24   9 hours, 7 minutes
    2019-04-17 00:10   2019-04-17 12:29   12 hours, 19 minutes
    2019-07-09 06:25   2019-07-09 20:01   13 hours, 37 minutes
    2019-07-10 15:14   2019-07-10 15:39   25 minutes
    2019-07-12 16:35   2019-07-12 16:43   8 minutes

This has resulted in a 97% uptime for April, and a 98% uptime for July - and we are only 13 days into July, so this number could go down further.

Additionally, many ISPs are not accepting Mediacom's IPv6 route announcements. This has caused mirrormaster to be inaccessible to many of our users, and even one of the members of our own Infra Team[6].

Finally, while yours truly was trying to show an Adélie Web page to someone while on public Wi-Fi at a well-known place in Broken Arrow, OK, I was greeted with an error page[7]:

    Sonicwall Network Security Appliance
    This site has been blocked by the network administrator.
    Block reason: Gateway GEO-IP Filter Alert
    IP address: 23.155.224.64
    Connection initiated towards country: Unknown

If a car dealership's firewall is blocking us, who knows what other firewalls are blocking us.
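As a rough check on the uptime percentages quoted for the outage table, the following Python sketch recomputes the downtime per month from the listed timestamps. The function names are illustrative, and the calculation assumes uptime is measured against the full calendar month (30 days for April, 31 for July). Durations are derived from the minute-resolution timestamps, so they can differ from the listed durations by a minute.

```python
from datetime import datetime

# Outage windows copied from the table in the message: (went down, came back).
OUTAGES = [
    ("2019-04-16 13:17", "2019-04-16 22:24"),
    ("2019-04-17 00:10", "2019-04-17 12:29"),
    ("2019-07-09 06:25", "2019-07-09 20:01"),
    ("2019-07-10 15:14", "2019-07-10 15:39"),
    ("2019-07-12 16:35", "2019-07-12 16:43"),
]
FMT = "%Y-%m-%d %H:%M"


def downtime_minutes(outages, year, month):
    """Total minutes of downtime for outages that started in the given month."""
    total = 0
    for went_down, came_back in outages:
        start = datetime.strptime(went_down, FMT)
        end = datetime.strptime(came_back, FMT)
        if (start.year, start.month) == (year, month):
            total += int((end - start).total_seconds() // 60)
    return total


def uptime_percent(down_minutes, days_in_period):
    """Uptime as a percentage of a period that is days_in_period days long."""
    period_minutes = days_in_period * 24 * 60
    return 100.0 * (period_minutes - down_minutes) / period_minutes


if __name__ == "__main__":
    # Assumption: each month is measured over its full calendar length.
    for label, month, days in (("April", 4, 30), ("July", 7, 31)):
        down = downtime_minutes(OUTAGES, 2019, month)
        print(f"{label}: {down} min down, {uptime_percent(down, days):.1f}% uptime")
```

Measuring July against only the 13 days elapsed so far (uptime_percent(down, 13)) gives a noticeably lower figure, which is the sense in which the number is provisional.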
How many people are unable to discover us, and how many corporate sponsors are we missing out on, because they can't even connect to our Web site? And why can they not connect to our Web site? It could be the IPv6 peering issue, or a firewall blocking our IPv4 space, or because Mediacom has suffered another "fibre cut".

== Enter Scaleway ==

We have had a working relationship with Scaleway for almost a year and a half. We launched our 32-bit ARM builder on the Scaleway ARM cloud in March 2018, and have had no downtime in that time:

    awilcox on erin [pts/0 Sat 13 9:33] ~: uptime
    09:33:02 up 489 days, 5:59, load average: 0.00, 0.00, 0.00

The network has never suffered any outages, either.

Since the Scaleway cloud features ARM servers, we would additionally still be able to avoid the x86 architecture and all of its failings.

We have continually been limited by our lack of IPv4 space at Integricloud. Currently, we "proxy" every server via athdheise, a virtual server on our Integricloud dedicated system that has both an IPv4 and an IPv6 address. All of our main systems are IPv6-only (wiki, bts, next, etc), and when an IPv4 system attempts to connect to any of these services, it has to be proxied via athdheise. If we use Scaleway virtual servers, every system gets its own dedicated IPv4 address, which drastically simplifies our administration.

Additionally, we would receive a lot more RAM per virtual server. Currently, athdheise - the aforementioned Web server and proxy - has 256 MB RAM, of which 34 MB is available. When documentation changes are made and the Git hook runs to cause athdheise to rebuild the documentation site (at help.adelielinux.org), sometimes the process runs out of memory. This means one of us has to log in, stop the web server, run the make process, and then restart the web server. The minimum RAM at Scaleway is 2 GB per virtual server.
This is an extreme amount of headroom, and would even allow us to play with memcached (or other caching solutions) to reduce latency across our infrastructure.

Finally, we would save a dramatic amount of money. We currently pay 225$/mo pre-tax for Integricloud.

== Hard Numbers ==

The current systems we run on Integricloud are:

    System                          RAM       Disk
    enfys (postgresql)              768 MB    30 GB
    rarity (these mailing lists)    1536 MB   30 GB
    mirrormaster                    256 MB    1 TB
    bts (Bugzilla issue tracking)   512 MB    8 GB
    athdheise (Web server/proxy)    256 MB    4 GB
    wiki                            512 MB    8 GB
    annwyn (Nextcloud)              512 MB    100 GB
    chatterbox (Quassel IRC)        512 MB    40 GB

Since Scaleway tops out at 500 GB disk, we will need to consider alternate hosting for mirrormaster. I believe we can run this on the Hetzner dedicated server that is being sponsored by Alyx at Leuhta Labs.

And this is what we could pay per virtual system on Scaleway:

    4 ARM CPUs, 2 GB RAM, 50 GB disk  - 2.99€/mo
    6 ARM CPUs, 4 GB RAM, 100 GB disk - 5.99€/mo
    8 ARM CPUs, 8 GB RAM, 200 GB disk - 11.99€/mo

By my approximation, we would be able to put every single system except annwyn on the smallest server, and annwyn on the second-smallest.

    6× 2.99€ = 17.94€ per month
    1× 4.99€ + 17.94€ = 22.93€ per month total cost, or approximately 25.81$.

This is a savings of nearly 90% after tax.

== Conclusion ==

I believe that retiring our Integricloud dedicated server and replacing it with Scaleway virtual ARM servers makes business sense. It will allow us to spend less time down, dramatically improve the architecture of our infrastructure, and reach more people. This will allow us to have an even greater reach, and allow us to grow into a larger, healthier Linux distribution that can genuinely improve the world.

I do not want to leave this proposal without a separate, smaller proposal for how this could be effected easily.
I believe that we can simply start by migrating the wiki server, since it is the least used service. We can feel out Scaleway's ARM offering for a while, and make sure that it will genuinely work for our needs. After we are satisfied, we can change the DNS for the wiki and begin work on another server. Assuming all goes well, we will eventually be able to quietly power off the Integricloud dedicated system with zero further downtime.

Thank you so much for reading this proposal. I welcome any comments or questions you may have. You may respond here or poke me on IRC. I'll post a summary email in response with any important notes from IRC.

Best,
--arw

== References ==

[1]: https://lists.adelielinux.org/hyperkitty/list/adelie-devel@lists.adelieli...

[2]: https://www.packet.com/cloud/servers/c1-large-arm/

[3]: https://www.integricloud.com/

[4]: The Packet.net ARM box runs at 360$/mo. Integricloud is 220$/mo.

[5]: https://bgp.he.net/AS46246

[6]: <aranea> awilfox: Looks like my routing issues are Mediacom's (that's Raptor's only upstream) fault. I doubt I'll have any success contacting them; this needs to come from a customer. I'll try contacting tpearson again with more details; if he doesn't respond, I may have to ask you to file an outage report or sth.
     <aranea> Short version: Mediacom doesn't follow some standard industry practices, and thus many of their peers aren't accepting the routes they announce on behalf of their customers (and guess what, Raport is their only IPv6 customer.)

[7]: https://i.imgur.com/khmebJ5.png

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org
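The cost estimate in the Hard Numbers section of the proposal can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and prices are held in euro cents to keep the arithmetic exact. It uses the per-server prices from the sum in the message (2.99€ and 4.99€); the price list in the same section gives 5.99€ for the 4 GB tier, so that price is exposed as a parameter. The dollar figures (25.81$ and 225$/mo) are taken directly from the message rather than converted with an assumed exchange rate.

```python
# Per-server monthly prices in euro cents, as used in the sum in the message.
SMALL_CENTS = 299   # 2 GB tier
MEDIUM_CENTS = 499  # 4 GB tier as used in the sum; the price list shows 5.99 EUR


def monthly_total_cents(n_small, n_medium,
                        small=SMALL_CENTS, medium=MEDIUM_CENTS):
    """Monthly cost in euro cents for a mix of small and medium servers."""
    return n_small * small + n_medium * medium


def savings_percent(new_cost, old_cost):
    """Relative saving, in percent, of new_cost compared with old_cost."""
    return 100.0 * (old_cost - new_cost) / old_cost


if __name__ == "__main__":
    # Six systems on the smallest tier, annwyn on the second-smallest.
    total = monthly_total_cents(6, 1)
    print(f"total: {total / 100:.2f} EUR/month")

    # Same mix, but with the 4 GB tier at the price shown in the price list.
    alt = monthly_total_cents(6, 1, medium=599)
    print(f"with 5.99 EUR tier: {alt / 100:.2f} EUR/month")

    # Dollar amounts as stated in the message (not converted here).
    print(f"saving: {savings_percent(25.81, 225.0):.1f}%")
```

Either price for the 4 GB tier leaves the saving in the same range, since the difference is 1.00€ per month.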


Re: Proposal: Replacing Integricloud with Scaleway

by Kiyoshi Aman

I think that, before we can make a good decision regarding migration, we should look at the resources we already have available to us. We have a dedicated server in Finland with 32 GB RAM and a fair amount of disk space available. We have two dedicated servers generously provided for our use in Pennsylvania, albeit one earmarked for buildbot service and the other not yet provisioned (by us) for service. In light of that, I want to take the server list below in reverse order.

A. Wilcox wrote:
> The current systems we run on Integricloud are:
>
> enfys (postgresql) 768 MB RAM 30 GB disk
>
> rarity (these mailing lists) 1536 MB RAM 30 GB disk
>
> mirrormaster 256 MB RAM 1 TB disk
>
> bts (Bugzilla issue tracking) 512 MB RAM 8 GB disk
>
> athdheise (Web server/proxy) 256 MB RAM 4 GB disk
>
> wiki 512 MB RAM 8 GB disk
>
> annwyn (Nextcloud) 512 MB RAM 100 GB disk
>
> chatterbox (Quassel IRC) 512 MB RAM 40 GB disk

At the moment, both chatterbox and annwyn are personal resources. Leaving aside the discussion as to whether they belong on project infrastructure, they should be migrated to personal infrastructure unless they are intended to be made more widely available to Adélie contributors (even if only to the core and/or infrastructure teams).

athdheise only exists because IntegriCloud was not able to provide IPv4 addresses at a price we were able to pay. It would be retired regardless.

My understanding is that we were planning on retiring the wiki. This would be an excellent time to do so.

I think that bts should be retired and merged with gitlab, or alternatively it can be on the same server if retiring it is contra-indicated (e.g. due to gitlab being unable to provide bug-tracking without a git repo associated).

mirrormaster would need to be migrated to the Finland server regardless, since no VPS provider provides block storage in the quantities we need at a rate we can live with.

The mailing lists should be on hosting separate from our other infrastructure, since they can and should be usable regardless of the rest of our infrastructure's disposition.

That leaves the postgresql server, which should be co-located with gitlab.

I understand that one of our goals is for our infrastructure to not be subject to architectural issues with x86_64. In principle I agree, but migrating the majority of our infrastructure from VPSes on a single dedicated server to VPSes on an unknown number of servers, especially in today's security environment, carries more risk than ensuring our infrastructure sits on hardware we know is used only by us.


Splitting debug packages into separate repositories

by A. Wilcox

Hi all,

I would like to formally suggest that we take our -dbg packages and put them in a split repository, such as system-debug and user-debug. Non-stratum 1 mirrors could choose whether or not to mirror these as well.

This would take a significant amount of disk space pressure off of stratum 2/3 mirrors.

Output for:

    du -b *-dbg-*.apk | awk 'BEGIN { s = 0; } { s += $1; } END { print s; }'

    system/ppc:   1853417479
    system/ppc64: 1877177133
    user/ppc:     1.73955e+10
    user/ppc64:   1.9961e+10

So, the overall size reduction would be about 20 GB per architecture. Consider that for ppc64, the entirety of system/ is 3.8G; user/, 28G.

Please respond with any thoughts or comments.

Best,
--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org
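For mirrors where GNU du or awk is not at hand, the same total can be computed with a short Python sketch. Like the command in the message, it assumes the .apk files sit directly in the directory being measured; the function name and the command-line usage are illustrative, not part of any Adélie tooling. Printing a Python integer also avoids the scientific notation seen in the larger sums above.

```python
import sys
from pathlib import Path


def debug_package_bytes(repo_dir):
    """Sum the apparent sizes, in bytes, of *-dbg-*.apk files in repo_dir.

    Only files directly inside repo_dir are counted, mirroring the shell
    glob in the message; subdirectories are not searched.
    """
    return sum(
        path.stat().st_size
        for path in Path(repo_dir).glob("*-dbg-*.apk")
        if path.is_file()
    )


if __name__ == "__main__":
    # Hypothetical usage: python dbgsize.py system/ppc64 user/ppc64
    for repo in sys.argv[1:]:
        print(f"{repo}: {debug_package_bytes(repo)}")
```

Run against each architecture directory, the per-repository totals can then be added up to estimate how much space a mirror saves by skipping the debug repositories.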

