Photo by Argelis Rebolledo from Pexels

Time for a Personal Cloud

Apple's CSAM scanning highlights something many knew but shrugged off: your data in the cloud belongs to you and to the provider. Unless you take strong measures to prevent it, your data is yours and theirs. Even if you take those strong measures to protect your data at rest, the providers can still see a ton of metadata streaming from your device usage. In some cases, this metadata is necessary to provide service to you. That doesn't mean you like them selling your data to others without your explicit consent. And no, burying the consent in dense legal terms and conditions doesn't count as consent for the average person.

If you're a normal technology user these days, you likely have a mobile phone and a laptop. Possibly you have a tablet and maybe even a desktop for high-powered tasks like photo editing, video production, or gaming. Start with the simple arrangement: you have a phone and a laptop. Typically, both of these devices are either on you in transit or generally near each other throughout your day. Now imagine a world where Apple, Google, Microsoft, and other big tech companies, aka cloud computing providers, do not exist. How do you enjoy the seamless synchronization between devices?

There are a few solutions, but none of them is as seamless as the Apple/Google solutions right now. Partly, this is because Apple and Google write the operating system on the phone and naturally favor their own solutions as native, or embedded, functionality. Basically, you need to become your own cloud provider. Let's start with the simple solution: you want to buy something that "just works" with minimal fuss. All the solutions involve installing apps from someone.

For most people, get The Helm and just leave it powered on next to your Wi-Fi router. For around USD 450, you have a solution that works.

For semi-professionals who need more storage and more compute power, get a Synology DiskStation. It will take some more setup, but it mostly "just works" and lets you add in some automation, all local to your house. Or just buy a Nextcloud Hub, which comes pre-packaged as a self-contained on-premises cloud.

If you're more of a do-it-yourself person, get a TrueNAS Core array and use Nextcloud. Or set up a Linux/FreeBSD desktop with a bunch of disk space and install Seafile.

This is about the state of the art when it comes to personal clouds. In reality, it's just shrinking parts of the cloud functionality into a single server in your house. Whatever you do, even with the cloud, you still need a backup solution. The cloud is not a backup; it's a synchronized copy of your data, so a deleted or corrupted file gets synchronized everywhere too. You need to run backups on whatever you choose, because admin error or a parts failure is more likely than anything else.
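To make the sync-versus-backup difference concrete, here is a minimal sketch in Python. The `snapshot` function and the folder layout are my own illustration, not part of any product mentioned here, and a real setup would use a proper backup tool with retention and off-site copies. The point is only that each run produces a separate, dated copy that later changes to the source cannot touch.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`.

    Hypothetical helper for illustration: every call makes a full,
    independent copy, so deleting a file in `source` afterwards does
    not remove it from earlier snapshots.
    """
    # Microseconds in the name keep two quick runs from colliding.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    target = backup_root / stamp
    # copytree creates `target` (and any missing parents) and fails
    # if it already exists, so an old snapshot is never overwritten.
    shutil.copytree(source, target)
    return target
```

If you call `snapshot(Path("photos"), Path("backups"))` today and accidentally delete a photo tomorrow, the dated folder under `backups` still has it. A synchronized copy would have deleted it right along with you.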

If you want to really nerd out, then look at self-hosted solutions. With nothing more than a Raspberry Pi 4 and lots of disk, you can build your own solution pretty easily. For even more nerdery, look at setting up your own homelab.

What if you want a global compute farm just for yourself? You don't want to use one of the cloud providers because the whole point of this is to have full control over your data at all times. What if you relax that constraint just a bit to general, secure control over your data at all times? Now you're into distributed computing. The idea is that you hold the private keys to your data, but now it can be distributed widely to any computer in the world for storage/processing. This assumes the encryption involved remains intact over the years. Distributed.net was one of the first platforms for something like this. A global filesystem like IPFS can do it too, although it's still in development and not really usable by non-technical people.

The leading edge is talking about the DWeb, or decentralized web. We'll get there, someday. I've tried them all and none of them work as well as I want, especially if I want to use one as my main storage/compute platform. Say, for example, I want to use machine learning object detection on my 150,000+ photos. In "the cloud", I can upload the images, spin up 150,000 processes to do the identification and tagging, write it all back to a single storage volume, and then sync the results back to my laptop. Ta-da, done. There is no easy way to do that with DWeb technologies, yet. It's still early.
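For a sense of what that fan-out looks like in miniature, here is a rough Python sketch that runs on a single machine. Everything in it is an assumption for illustration: `detect_objects` is a placeholder that only reports the file extension instead of running a real model, and `tag_photos` and its parameters are names I made up. The shape is the same as the cloud version, though: list the photos, hand them to a pool of workers, and collect the tags into one index file.

```python
from __future__ import annotations

import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def detect_objects(photo: Path) -> list[str]:
    """Hypothetical stand-in for an object-detection model.

    It returns the file extension as the only "tag" so the sketch
    runs without any machine learning library installed.
    """
    return [photo.suffix.lower().lstrip(".") or "unknown"]


def tag_photos(photo_dir: Path, index_file: Path, workers: int = 8) -> dict[str, list[str]]:
    """Tag every file under `photo_dir` in parallel and write one JSON index."""
    photos = sorted(p for p in photo_dir.rglob("*") if p.is_file())
    # Threads keep the sketch simple; a real CPU-heavy model would want
    # separate processes or, as in the cloud case, separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tags = list(pool.map(detect_objects, photos))  # map preserves input order
    index = {str(p.relative_to(photo_dir)): t for p, t in zip(photos, tags)}
    index_file.write_text(json.dumps(index, indent=2))
    return index
```

The hard part on the DWeb side is not this loop. It's finding thousands of workers you can trust with your photos, and a shared volume to write the results back to.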