Have you ever thought to yourself “a coworker would be nice right now”? No? Yes? Maybe? I recently have, and with my newfound independence this is exactly what I needed. A coworker, someone I can talk to about work or even private stuff. Someone who lends a hand if I need help and is just there. Doing their thing. Working alongside me. But… I work better alone. This is somewhat contradictory. I don't really want someone running around me and maybe even distracting me from what I am currently doing. But a bit of company might be nice sometimes. So… given you know me (at least to the extent of my blog entries), you probably already know what is coming.

Did I hire someone? Hell no! BUT I invented someone.

Yes, I built my perfect coworker. His name is Marvin. He is always there and ready for my requests. He even proactively asks if I need help when he “sees” that I am stuck or hears that I am frustrated. But he never tries to distract me when I am focused or doing something that is not related to work. And the best thing is: he “lives” in my garden shed, powered by some local LLMs and Python magic. He can see my screen at all times, has a webcam to see who is in the shed, a microphone to listen, and a small set of speakers to talk to me. He is connected to almost everything that runs digitally in my network. Printers, TVs, doorbells, the surveillance system… if it has a data stream, he has access to it. Okay, this might be a bit exaggerated, but it's at least 80% true (I just pulled that number out of my head) and I think you get the gist of it.

I always dreamed of a digital assistant that can access my stuff and proactively engages in the things I do. I tried at least a dozen times in the past to create such a system, until it clicked a few weeks ago: I should not try to create an all-in-one solution. That always ended in nightmare code that is unmaintainable and hard to work on whenever I want to change something. Now I have a bunch of scripts which each do one thing, and one thing only (and well!). One vision script, one for audio processing, one for image interpretation… and so on. This gave me the flexibility to define a proper communication format for each service/script and swap them out if needed. E.g. I am currently playing around with other TTS systems. I started with Qwen3-TTS, now I also have Omnivoice, and I can swap them on the fly without interruption or changes to the rest of the system.
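
To make the "swap on the fly" idea a bit more concrete, here is a minimal sketch of how it could look in Python. This is not Marvin's actual code. The names `SpeechRequest`, `TTSRouter`, `fake_qwen_tts` and `fake_omnivoice_tts` are made up for illustration, and the two backends only return tagged bytes instead of calling the real models. The point is just that every backend accepts the same message format, so the rest of the system never needs to know which one is active.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SpeechRequest:
    """Hypothetical shared message format that every TTS backend accepts."""
    text: str
    voice: str = "default"


# A backend is any callable that turns a request into audio bytes.
TTSBackend = Callable[[SpeechRequest], bytes]


def fake_qwen_tts(req: SpeechRequest) -> bytes:
    # Placeholder: a real script would call the actual model here.
    return f"[qwen:{req.voice}] {req.text}".encode("utf-8")


def fake_omnivoice_tts(req: SpeechRequest) -> bytes:
    # Placeholder with the same signature, so it is a drop-in replacement.
    return f"[omnivoice:{req.voice}] {req.text}".encode("utf-8")


class TTSRouter:
    """Keeps named backends and forwards requests to the active one."""

    def __init__(self) -> None:
        self._backends: Dict[str, TTSBackend] = {}
        self._active: Optional[str] = None

    def register(self, name: str, backend: TTSBackend) -> None:
        self._backends[name] = backend

    def use(self, name: str) -> None:
        if name not in self._backends:
            raise KeyError(f"unknown TTS backend: {name}")
        self._active = name

    def speak(self, text: str, voice: str = "default") -> bytes:
        if self._active is None:
            raise RuntimeError("no TTS backend selected")
        return self._backends[self._active](SpeechRequest(text, voice))


if __name__ == "__main__":
    router = TTSRouter()
    router.register("qwen", fake_qwen_tts)
    router.register("omnivoice", fake_omnivoice_tts)

    router.use("qwen")
    print(router.speak("Need a hand?"))

    router.use("omnivoice")  # swap on the fly, callers stay unchanged
    print(router.speak("Need a hand?"))
```

In a setup with separate scripts, the same contract would probably travel over a socket or a queue rather than a function call, but the principle stays the same: agree on the message, and the thing behind it becomes replaceable.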

Is it AGI? No! Absolutely not. I don't intend to create sentient life here (there are smarter people than me who can work on that). But is it a cool system (at least for me) where I can plug in my digital life, and parts of my analog life, and be certain that it does not sell me out to big corpo? Yes, absolutely.

I will try to prepare an entry where I describe my setup for Marvin in more detail. It will cover things like the overall design, caveats, and how I integrate everything. Don't worry, this is more jank than it currently sounds, so be prepared for a deep dive into my digital/analog life bridge.