StartOS on Proxmox as a KVM?

I have successfully installed StartOS from the ISO on my Proxmox VE 8 box and all is well. I have installed the Bitcoin Knots service, LND, Tor, etc. The BTC sync completed (it took about 4 days) and I’m currently doing the same with LND, although the graph sync seems to be taking forever.

Is this setup a good candidate for StartOS? I’m trying to virtualize everything for management, backups, etc., and it looks like it is working perfectly, but I’m curious whether anyone has seen any downside to running StartOS as a VM.

You can go ahead and add a peer. Sometimes LND won’t finish syncing the graph until you do so.

While not supported by Start9, Proxmox is a common way to run StartOS.

OK, thanks. Good to know. I’m a Proxmox contrib dev and support consultant, so if anyone needs any help with it, I’m happy to offer what I can.

Ah! A Proxmox Ninja. I think you’ll be popular. Thanks for hanging out with us!

Are you running StartOS 0.4.0?

StartOS 0.4.0 beta .5

I am running on Proxmox as well. So far it has been rock solid. My VM has 4 cores and 10 GB of memory and runs Bitcoin Knots BIP 110, LND, RTL, LNbits, Electrs, Mempool and Datum Gateway.

I would recommend putting the VM on a passthrough SSD though, to isolate its write traffic from other VMs. My virtual qcow2 disk for Start9 already got corrupted once, but I have had no problems since switching to the SSD as passthrough.

Do you think Proxmox is not a good way to run a StartOS server on RAID?

RAID is not a backup, and performance-wise a single SSD is enough.
And how would RAID prevent the qcow2 corruption I experienced?

I use hardware RAID for most of my servers, with a hot spare drive. This is not StartOS specific, just my general hardware platform default. It has served me well for decades. When you get a drive failure (and I think for spinning HDDs the commonly quoted Google figure is about an 8% chance of failure per year), RAID gives me time to recover and replace the drive without losing any uptime.
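To put a rough number on that, here is a minimal Python sketch of the chance that at least one drive in an array fails within a year. It assumes drives fail independently and takes the 8% annual figure above at face value; the function name is only for illustration.

```python
def p_any_drive_fails(n_drives: int, annual_failure_rate: float = 0.08) -> float:
    """Probability that at least one of n_drives fails within a year.

    Assumes independent failures, which real arrays (same batch, same
    enclosure, same workload) only approximate.
    """
    if n_drives < 0:
        raise ValueError("n_drives must not be negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # Chance that every drive survives, subtracted from 1.
    return 1.0 - (1.0 - annual_failure_rate) ** n_drives


for n in (1, 4, 8):
    print(f"{n} drive(s): {p_any_drive_fails(n):.1%} chance of at least one failure per year")
```

The more spindles in the array, the more likely you are to be swapping a drive in any given year, which is the case for having a hot spare.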

In the case of software RAID, Proxmox went in hard with ZFS many years ago. It works and gives a similar recovery window. The issue (and this applies to hardware and software RAID) is that the drives need to be pretty well matched in capacity, whether HDD or SSD. Over time drives get cheaper and bigger, and expecting to buy replacement drives of the same capacity years later might work but often doesn’t. As a result, I’ve separated my processing from my perpetual data storage.
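As a rough illustration of why matched capacity matters, here is a small Python sketch for classic parity RAID, where every member is treated as if it were the size of the smallest drive. The function name and the drive sizes are made up for the example, and specific controllers or filesystems may behave differently.

```python
def usable_capacity_tb(drive_sizes_tb: list[float], parity_drives: int) -> float:
    """Usable capacity of a classic parity RAID set.

    Every member counts as the size of the smallest drive, and
    parity_drives members' worth of space goes to parity
    (1 for RAID 5, 2 for RAID 6).
    """
    n = len(drive_sizes_tb)
    if parity_drives not in (1, 2):
        raise ValueError("parity_drives must be 1 (RAID 5) or 2 (RAID 6)")
    if n < parity_drives + 2:
        raise ValueError("not enough drives for this RAID level")
    return (n - parity_drives) * min(drive_sizes_tb)


# Hypothetical array: three original 4 TB drives plus one 8 TB replacement.
drives = [4.0, 4.0, 4.0, 8.0]
print("RAID 5 usable TB:", usable_capacity_tb(drives, parity_drives=1))
print("RAID 6 usable TB:", usable_capacity_tb(drives, parity_drives=2))
print("Capacity left unused on larger drives, TB:", sum(drives) - len(drives) * min(drives))
```

A bigger replacement drive works, but the extra space on it goes unused until every member of the array has been upgraded.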

Although this mitigates the problem of mismatched drive sizes (Unraid and similar systems, for example, allow arrays with differing drive sizes), the biggest advantage of doing this is backup recovery. If you have ever had to recover a poorly thought through 1TB KVM on a server, you’ll know what I’m talking about. Virtualization should be structured much like the old trick of having your OS on a boot drive and your data on a separate drive.

This is where Proxmox or other virtualization platforms shine. You can keep your OS drive on a small, local RAID setup for resilience and recovery, and move your data drive to a NAS or something else that is better suited to data storage. This will likely be more cost effective too. What I do is have a RAID storage system for my VMs (OS drive) and an NFS share for my data (on a NAS). The crazy thing here is that you could run FreeNAS/TrueNAS for the NAS as a VM in your Proxmox system, give it external storage in a NAS or DAS enclosure, and have it serve an NFS share or something similar to the other VMs (e.g. StartOS).

Or just have a separate hardware box for your NAS and use that for all storage on your network. It works for me. YMMV of course, and there are an almost unlimited number of ways to structure this. But if your ultimate goal is maximum uptime, you may want to put your OS drive on RAID storage in your Proxmox rig and move your data drive to something better suited to longer term data storage (e.g. a NAS). As the blockchains get bigger, this makes it easier to provision and manage storage, without expecting a hypervisor to also be your long term perpetual data storage solution.

Some thoughts here, but there are many other ways of course. Just keep in mind what happens when the s#$t hits the fan and things fail, and plan your disaster recovery process BEFORE that happens.

What do you suppose is causing the qcow2 corruption shurtenburt experienced? One of the main reasons I’d use Proxmox for my server is the lack of advanced drive configuration options in StartOS. I’m a big fan of RAID 5 for a home server setup. I might even go RAID 6 if the additional fault tolerance was warranted.

qcow2 is just a file format for storing a virtual hard drive on a hypervisor. I don’t know how shurtenburt has the VM configured, but if everything is stored in one large virtual hard drive file, then all I/O goes to that file and any of it runs the risk of corrupting it. SSDs have a limited lifespan in terms of writes, and if you add ZFS to that mix for redundancy there are a lot of factors at play. That said, at a minimum I would separate this into multiple qcow2 files, one for the OS and one for the data, and move the data to a NAS or some other storage device. Restoring from backup after a failure would otherwise likely be a major pain, and you run the risk of not being able to recover very large files if there is a problem.
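To make the "it is just a file" point concrete, here is a minimal Python sketch that reads the fixed fields at the start of a qcow2 image, following the layout in the QEMU qcow2 specification. The function names are my own, the header in the demo is synthetic, and this is no substitute for `qemu-img info` or `qemu-img check` on a real image.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Big-endian: magic, version, backing_file_offset, backing_file_size,
# cluster_bits, size (virtual disk size in bytes).
HEADER_FORMAT = ">4sIQIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_qcow2_header(raw: bytes) -> dict:
    """Decode the first fields of a qcow2 header from raw bytes."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("too short to be a qcow2 header")
    magic, version, backing_offset, _backing_size, cluster_bits, size = struct.unpack(
        HEADER_FORMAT, raw[:HEADER_SIZE]
    )
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return {
        "version": version,
        "has_backing_file": backing_offset != 0,
        "cluster_size": 1 << cluster_bits,
        "virtual_size_bytes": size,
    }


def read_qcow2_header(path: str) -> dict:
    """Read and decode the header of the image file at path."""
    with open(path, "rb") as f:
        return parse_qcow2_header(f.read(HEADER_SIZE))


# Synthetic header for a 2 TiB version 3 image with 64 KiB clusters.
demo = struct.pack(HEADER_FORMAT, QCOW2_MAGIC, 3, 0, 0, 16, 2 * 1024**4)
print(parse_qcow2_header(demo))
```

Everything the guest writes ends up inside that one file, behind the format's own cluster mapping tables, so damage to the file or its metadata can affect the whole virtual disk.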

This is my HW config for StartOS on Proxmox. I have two hard disks: one for the OS (scsi0) and one for the data (scsi1). The data disk is provisioned at 2TB, but it lives on a remote NAS server, not local to the KVM. I could have used iSCSI for this, but I chose an NFS share on a NAS so that the NAS can snapshot and back up the drive independently of Proxmox. If the VM goes down, I can recover the OS drive from a backup quickly and it will reconnect. If the data drive goes down, I can revert to a previous snapshot or recover it on the NAS from our backup server.
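A back-of-the-envelope Python sketch of why the split helps with recovery time. The 2 TB data size is from my setup; the 32 GB OS disk and the 100 MB/s sustained restore throughput are assumptions purely for illustration.

```python
def restore_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy size_gb back at a sustained throughput (decimal units)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    seconds = size_gb * 1000 / throughput_mb_s
    return seconds / 3600


ASSUMED_THROUGHPUT_MB_S = 100.0  # assumption: roughly what a 1 GbE link sustains
ASSUMED_OS_DISK_GB = 32.0        # assumption: small OS-only virtual disk
DATA_DISK_GB = 2000.0            # the 2 TB data disk

print(f"OS disk:   {restore_hours(ASSUMED_OS_DISK_GB, ASSUMED_THROUGHPUT_MB_S):.2f} h")
print(f"Data disk: {restore_hours(DATA_DISK_GB, ASSUMED_THROUGHPUT_MB_S):.2f} h")
```

With the disks separated, a failed VM only costs the short OS restore, and the data disk is handled by snapshots on the NAS rather than a full copy.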

This is why I have separated the whole thing and passed one SSD through to the VM. I do not use qcow2 anymore (for StartOS). As a result there are no issues with ZFS and no problems with misaligned blocks, and there is not much wear and tear on my SSD. It still sits at 2% wear, so 98% of its lifespan is left after a year. If it dies, I simply replace it, install Start9 from scratch and restore the backup.
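For what it is worth, a naive linear extrapolation of that wear figure in Python. It assumes the write load stays constant and that the drive's wear indicator is accurate, neither of which is guaranteed; the function name is just for illustration.

```python
def years_remaining(wear_pct_used: float, years_in_service: float) -> float:
    """Linear estimate of remaining SSD life from a wear-level percentage."""
    if years_in_service <= 0:
        raise ValueError("years_in_service must be positive")
    if not 0.0 <= wear_pct_used <= 100.0:
        raise ValueError("wear_pct_used must be between 0 and 100")
    if wear_pct_used == 0:
        return float("inf")  # no measurable wear yet, nothing to extrapolate from
    wear_per_year = wear_pct_used / years_in_service
    return (100.0 - wear_pct_used) / wear_per_year


print(f"Estimated years of write endurance left: {years_remaining(2.0, 1.0):.0f}")
```

At this rate write endurance is unlikely to be what kills the drive; age or some other failure will probably get there first.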

Of course you should back up your server, but it’s preferable that the backup never has to be used to restore your system. Ideally you transfer your server from one drive to another while it’s still working. Lightning channel state is dynamic and the backup is static. Your channels therefore cannot be safely restored, because they are state sensitive, with a penalty mechanism to deter attacks. This means all channels must be closed when restoring the service from a static backup. Right now, with on-chain fees low, this may not be such a big deal, but in the future it could be quite expensive. For the same reason, it’s a very bad idea to back up StartOS by any other means, such as backing up the image in Proxmox.

This makes a lot of sense to me. The fact that a blockchain is a distributed ledger means that any copy of it that we host is complementary to the network, not definitive. That is, if we need to restore from backup, restoring the working OS separately from the data and then repopulating the data from the blockchain is likely to be the fastest way to get a down system back online. Yes, the data drive is likely to be the largest, but that mostly means it has the largest chance of failure, since it spans more physical media. Since the data is effectively already backed up by the network, restoring it by repopulating seems reasonable to me. Of course, redundancy in the live storage (e.g. RAID, snapshots, etc.) would make recovery faster if there is a physical media failure.

Bitcoin’s blockchain has no penalty mechanism. If your copy of the blockchain is incomplete, it simply catches up. If your copy is wrong, it’s simply invalid and must be re-organized. StartOS does not back up the blockchain, but only because it would take up a lot of space in your backup while all that data can be recovered from the network. Sure, it would take a while, but again the goal is to never need to do it. Some people choose to back up their blockchain data by other means. That’s perfectly fine. Lightning channels are much more sensitive. If you broadcast an outdated channel state, you can forfeit all your funds in that channel.
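Here is a deliberately simplified Python toy model of why a static backup is dangerous for Lightning. It is not how LND or the Lightning specification actually implements channels; the class, its methods and the amounts are all hypothetical, and it assumes the peer (or their watchtower) notices the stale broadcast in time.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChannel:
    """Toy channel: only the most recent state is valid, older ones are revoked."""

    capacity_sats: int
    our_balances: list[int] = field(default_factory=list)  # index = state number

    def update(self, our_balance_sats: int) -> int:
        """Record a new state and return its number; earlier states become revoked."""
        if not 0 <= our_balance_sats <= self.capacity_sats:
            raise ValueError("balance must be between 0 and the channel capacity")
        self.our_balances.append(our_balance_sats)
        return len(self.our_balances) - 1

    def settle(self, broadcast_state: int) -> tuple[int, int]:
        """Return (our_payout, peer_payout) if we broadcast the given state."""
        latest = len(self.our_balances) - 1
        if not 0 <= broadcast_state <= latest:
            raise IndexError("unknown state")
        if broadcast_state == latest:
            ours = self.our_balances[latest]
            return ours, self.capacity_sats - ours
        # Revoked state: in this toy model the peer claims the whole channel.
        return 0, self.capacity_sats


channel = ToyChannel(capacity_sats=1_000_000)
backed_up_state = channel.update(600_000)  # the static backup is taken here
current_state = channel.update(400_000)    # a payment happens after the backup

print("Broadcast current state:  ", channel.settle(current_state))
print("Broadcast restored backup:", channel.settle(backed_up_state))
```

A node restored from an old image only knows the backed-up state, which is why the safe path after a restore is to have channels closed cooperatively by the peer rather than broadcasting what the backup contains.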

I have been running Start9 on Proxmox for a year.

Backups run every night with Proxmox Backup Server, with the VM on Proxmox and NFS on a NAS (I know this is evil/bad).

I have 2 Start9 servers, each on a 1 TB disk, with ZFS on NVMe.

Absolutely smooth, like ice cream.

I replicate each VM to another Proxmox node for standby replication, instead of running a real cluster.

This week I fired up another VM with a 1.5 TB HDD, because 1 TB is no longer enough.

I replicated the Bitcoin node over SSH with a plain copy, and used the native Start9 backup and restore process for the applications on the new VM: bitcoin blocks, CLN, LND, etc.

Zero loss of anything. Both of my Lightning nodes came over with all my sats, and everything is good.

So, to respond to the OP’s original question:

Is this setup a good candidate for StartOS?

Absolutely yes!
