@computergeek125 - K-Money's Lemmy

computergeek125@lemmy.world · 11 hours ago

If you’re trying to do VDI in the cloud, that can get expensive fast on account of the GPU processing needed

Most of the protocols I know of the run CPU-only (and I’m perfectly happy to be proven wrong and introduced to something new) tend to fray at high latency or high resolution. The usual top two I’ve seen are VNC and RDP (XRDP project on Linux), with NoMachine and plain x11 over SSH being right behind that. I think NoMachine had the best performance of those three, but it’s been a hot minute since I’ve personally used it. XRDP is the one I’ve used the most often, but getting login/lock/unlock working was fiddly at first but seems to be stable holding.

Jumping from the “basic connection, maybe barely but not always suitable for video” to “ultra high grade high speed”, we have Parsec and Sunshine+Moonlight. Parsec is currently limited to only Windows/Mac hosting (with Linux client available), and both Parsec and Sunshine require or recommend a reasonable GPU to handle the encoding stage (although I believe Sunshine may support an X264 encoder which may exert a heavy CPU tax depending on your resolution). The specific problem of sourcing a GPU in the cloud (since you mention EC2) becomes the expensive part. This class of remote access tends to fray at high resolution and frame rate less because it’s designed to transport video and games, rather than taking shortcuts to get a minimum desktop visible.

computergeek125@lemmy.world · 22 hours ago

If you have a server with out-of-band/lights-out management such as iDRAC (Dell), iLO (HPe), IPMI (generic, Supermicro, and others) or equivalent, those can measure the server’s power draw at both PSUs and total.

computergeek125@lemmy.world · 22 hours ago

Yeah that’s totally fair. I have nearly a kilowatt of real time power draw these days, Rome was not built in a day.

computergeek125@lemmy.world · 1 day ago

That’s the neat part - Ceph can use a full mesh of connections with just a pair of switches and one balance-slb 2-way bond per host. So each host only needs 2 NIC ports (could be on the same NIC, I’m using eno1 and eno2 of my R730’s 4-port LOM), and then you plug each of the two ports into one switch (two total for redundancy, in case a switch goes down for maintenance or crash). You just need to make sure the switches have a path to each other at the top.

computergeek125@lemmy.world · edit-2 1 day ago

I think you’re asking too much from ZFS. Ceph, Gluster, or some other form of cluster native filesystem (GFS, OCFS, Lustre, etc) would handle all of the replication/writes atomically in the background instead of having replication run as a post processor on top of an existing storage solution.

You specifically mention a gap window - that gap window is not a bug, it’s a feature of using a replication timer, even if it’s based on an atomic snapshot. The only way to get around that gap is to use different tech. In this case, all of those above options have the ability to replicate data whenever the VM/CT makes a file I/O - and the workload won’t get a write acknowledgement until the replication has completed successfully. As far as the workload is concerned, the write just takes a few extra milliseconds compared to pure local storage (which many workloads don’t actually care about)

I’ve personally been working on a project to convert my lab from ESXi vSAN to PVE+Ceph, and conversions like that (even a simpler one like PVE+ZFS to PVE+Ceph would require the target disk to be wiped at some point in the process.

You could try temporarily storing your data on an external hard drive via USB, or if you can get your workloads into a quiet state or maintenance window, you could use the replication you already have and rebuild the disk (but not the PVE OS itself) one node at a time, and restore/migrate the workload to the new Ceph target as it’s completed.

On paper, (I have not yet personally tested this), you could even take it a step farther: for all of your VMs that connect to the NFS share for their data, you could replace that NFS container (a single point of failure) with the cluster storage engine itself. There’s not a rule I know of that says you can’t. That way, your VM data is directly written to the engine at a lower latency than VM -> NFS -> ZFS/Ceph/etc

computergeek125@lemmy.world · 1 day ago

Yeah it’s a bit of a chonk. I don’t remember the exact itemization on the power bill and I don’t have one in front of me.

computergeek125@lemmy.world · 2 days ago

My server rack has

3x Dell R730
1x Dell R720
2x Cisco Catalyst 3750x (IP Routing license)
2x Netgear M4300-12x12f
1x Unifi USW-48-Pro
1x USW-Agg
3x Framework 11th Gen (future cluster)
1x Protectli FE4B

All together that draws… 0.1 kWh… in 0.327s.

In real time terms, measured at the UPS, I have a running stable state load of 900-1100w depending on what I have at load. I call it my computationally efficient space heater because it generates more heat than is required for my apartment in winter except for the coldest of days. It has a dedicated 120v 15A circuit

computergeek125@lemmy.world · 25 days ago

Isn’t venmo owned by PayPal for the past 10y?

computergeek125@lemmy.world · 28 days ago

I wondered if someone would post that second one.

For the first, I think Square Enix got it right - headphones light the right image, but with the bridge between the ear cups flopped back on their head.

Alternatively, you could have headphones like the first but with the drivers in the upper cat ear portion by their actual ears.

computergeek125@lemmy.world · 29 days ago

OT but am I the only one that noticed the fox’s headphones aren’t on their ears?

computergeek125@lemmy.world · 29 days ago

I have five Dell servers in the rack, and another two Dells and three x9? (Atom C2758 8-core if memory serves) Supermicros on the shelf.

I think only one or two of the Dells came with iDRAC Enterprise and all the Supermicros had full licensing. It’s absolutely beautiful (once you get done fighting the software updates to purge the Java gremlins).

My three R730s were upgraded to Enterprise as soon as I had budget and a spare line item to do so. Power on/off is great and console+ISO is peak. I love this.

computergeek125@lemmy.world · 30 days ago

If you’re looking at Intel, you might be thinking IME/vPro

IPMI (such as iDRAC on Dell) runs off-processor on a different section of the motherboard typically and is installed on AMD servers as well.

computergeek125@lemmy.world · 1 month ago

What’s the difference between horizontal and vertical integration? (I know a few business words but usually not enough to be intelligent, this is a genuine question of confusion)

computergeek125@lemmy.world · 1 month ago

It’s on APNews too - it’s real

computergeek125@lemmy.world · 2 months ago

I never said anything about EFI not supporting multi boot. I said that the had to be kept in lockstep during updates. I recognize the term “manual” might have been a bit of a misnomer there, since I included systems where the admin has to take action to enable replication. ESXi (my main hardware OS for now) doesn’t even have software RAID for single-server datastores (only vSAN). Windows and Linux both can do it, but its a non-default manual process of splicing the drives together with no apparent automatic replacement mechanism - full manual admin intervention. With a hardware RAID, you just have to plop the new disk in and it splices the drive back into the array automatically (if the drive matches)
- “EFI doesn’t understand (normal) MD RAID” - https://unix.stackexchange.com/a/742072/34724 (2023)
- (untested) “Using metadata 1.0 (end of disk) to splice EFI partitions together” - https://std.rocks/gnulinux_mdadm_uefi.html
- (untested) “splicing windows dynamic disks together” - https://learn.microsoft.com/en-us/troubleshoot/windows-server/backup-and-storage/set-up-dynamic-boot-partition-mirroring
Dell and HPe both have had RAM caching for reads and writes since at least 2011. That’s why the controllers have batteries :)
- also, I said it only had to handle the boot disk. Plus you’re ignoring the fact that all modern filesystems will do page caching in the background regardless of the presence of hardware cache. That’s not unique to ZFS, Windows and Linux both do it.
mdadm and hardware RAID offer the same level of block consistency validation to my current understanding- you’d need filesystem-level checksumming no matter what, and as both mdadm and hardware RAID are both filesystem agnostic, they will almost equally support the same filesystem-level features (Synology implements BTRFS on top of mdadm - I saw a small note somewhere that they had their implementation request block rebuild from mdadm if btrfs detected issues, but I have been unable to verify this claim so I do not consider it (yet) as part of my hardware vs md comparison)

Hardware RAID just works, and for many, that’s good enough. In more advanced systems, all its got to handle is a boot partition, and if you’re doing your job as a sysadmin there’s zero important data in there that can’t be easily rebuilt or restored.

computergeek125@lemmy.world · 2 months ago

I never said I didn’t use software RAID, I just wanted to add information about hardware RAID controllers. Maybe I’m blind, but I’ve never seen a good implementation of software RAID for the EFI partition or boot sector. During boot, most systems I’ve seen will try to always access one partition directly and a second in order, which is bypassing the concept of a RAID, so the two would need to be kept manually in sync during updates.

Because of that, there’s one notable place where I won’t - I always use hardware RAID for at minimum the boot disk because Dell firmware natively understands everything about it from a detect/boot/replace perspective. Or doesn’t see anything at all in a good way. All four of my primary servers have a boot disk on either a Startech RAID card similar to a Dell BOSS or have an array to boot off of directly on the PERC. It’s only enough space to store the core OS.

Other than that, at home all my other physical devices are hypervisors (VMware ESXi for now until I can plot a migration), dedicated appliance devices (Synology DSM uses mdadm), or don’t have a redundant disks (my firewall - backed up to git, and my NUC Proxmox box, both firewalls and the PVE are all running ZFS for features).

Three of my four ESXi servers run vSAN, which is like Ceph and replaces RAID. Like Ceph and ZFS, it requires using an HBA or passthrough disks for full performance. The last one is my standalone server. Notably, ESXi does not support any software RAID natively that isn’t vSAN, so both of the standalone server’s arrays are hardware RAID.

When it comes time to replace that Synology it’s going to be on TrueNAS

computergeek125@lemmy.world · 2 months ago

For recovering hardware RAID: most guaranteed success is going to be a compatible controller with a similar enough firmware version. You might be able to find software that can stitch images back together, but that’s a long shot and requires a ton of disk space (which you might not have if it’s your biggest server)

I’ve used dozens of LSI-based RAID controllers in Dell servers (of both PERC and LSI name brand) for both work and homelab, and they usually recover the old array to the new controller pretty well, and also generally have a much lower failure rate than the drives themselves (I find myself replacing the cache battery more often than the controller itself)

Only twice out of the handful of times I went to a RAID controller from a different generation

first time from a mobi failed R815 (PERC H700) physically moving the disks to an R820 (PERC H710, might’ve been an H710P) and they were able to foreign import easily
Second time on homelab I went from an H710 mini mono to an H730P full size in the same chassis (don’t do that, it was a bad idea), but aside from iDRAC being very pissed off, the card ran for years with the same RAID-1 array imported.

As others have pointed out, this is where backups come into play. If you have to replace the server with one from a different generation, you run the risk that the drives won’t import. At that point, you’d have to sanitize the super block of the array and re-initialize it as a new array, then restore from backup. Now, the array might be just fine and you never notice a difference (like my users that had to replace a failed R815 with an 820), but the result pattern is really to the extremes of work or fault with no in between.

Standalone RAID controllers are usually pretty resilient and fail less often than disks, but they are very much NOT infallible as you are correct to assess. The advantage to software systems like mdadm, ZFS, and Ceph is that it removed the precise hardware compatibility requirements, but by no means does it remove the software compatible requirements - you’ll still have to do your research and make sure the new version is compatible with the old format, or make sure it’s the same version.

All that’s said, I don’t trust embedded motherboard RAIDs to the same degree that I trust standalone controllers. A friend of mine about 8-10 years ago ran a RAID-0 on a laptop that got it’s super block borked when we tried to firmware update the SSDs - stopped detecting the array at all. We did manage to recover data, but it needed multiple times the raw amount of storage to do so.

we made byte images of both disks in ddrescue to a server that had enough spare disk space
found a software package that could stitch together images with broken super blocks if we knew the order the disks were in (we did), which wrote a new byte images back to the server
copied the result again and turned it into a KVM VM to network attach and copy the data off (we could have loop mounted the disk to an SMB share and been done, but it was more fun and rewarding to boot the recovered OS afterwards as kind of a TAKE THAT LENOVO…we were younger)
took in total a bit over 3TB to recover the 2x500GB disks to a usable state - and took about a week of combined machine and human time to engineer and cook, during which my friend opted to rebuild his laptop clean after we had images captured - to one disk windows, one disk Linux, not RAID-0 this time :P

computergeek125@lemmy.world · 2 months ago

Just because sponsor block exists, doesn’t mean video creators shouldn’t be better.

Just like UBO and web ads.

computergeek125@lemmy.world · 2 months ago

Sadly the so-called “smart TV” is becoming the norm. Companies add unnecessary crap to TVs that’s often as slow as your car’s factory infotainment system, and when they feel like not upgrading the software anymore for security issues in a few years, it’s a permanent security hazard until you disconnect it from the network.

I have a Vizio TV from several years ago with Yahoo branded smart functions (that should date it) that I need to factory reset because I can’t find the WiFi password erase.

computergeek125@lemmy.world · edit-2 2 months ago

What is heavens name is that captcha gate? I blocked notifications then it tried to scam me that my phone screen was broken and I had a virus

A handful of the bad forwards my pihole did manage to block

Got a better link?