ADHD Diary #003 — Easter Monday, Spring Vibes, and a Brain That Never Stops
Easter Monday brought unexpected joy with swimming, treats, and quality time. Discover how these moments are a treasure for an ADHD brain.
Easter Monday. Outside it smells like spring — that unmistakable mix of damp earth, first blossoms, and the quiet promise that the coming months will be good. The sun still hangs a little too low, but it's there. And that's enough.
The morning belonged to us. We prepared breakfast together — Easter bread, colorfully painted eggs, coffee that tastes like coffee and not like rush. That's what quality time is. Not the expensive restaurant, not the perfectly planned excursion — but standing at the table together, slicing, spreading, laughing. If anyone asks me what wealth means: exactly these moments. No agenda, no must-do, no calendar. Just family.
Then a spontaneous surprise: I took my daughter swimming, just the two of us, completely unannounced. And at the same time, a small gift to my wife: a few hours where she could simply do whatever she felt like. No one to accommodate, no plan. Just her time.
My daughter was on board immediately. She jumps into the water, dives under and comes back up laughing — and I notice something in me relaxing that I didn't even know was tense. That's the secret of these moments: they just happen. They don't need preparation, optimization, or a to-do list. They only need presence. And that is sometimes the greatest challenge for an ADHD brain — and at the same time the most beautiful thing when it actually works.
After that: fries for my little one, fried potatoes with battered fish fillet for me. Not particularly healthy — but sometimes you just have to treat yourself to what you're craving. After the swim, a delicious ice cream for both of us, then home.
In the evening: time to fire up the grill. My wife conjured up a fresh salad, I threw the pork chop on the grill. It got cooler than expected, and at some point we fetched our jackets, but nobody wanted to go inside. An evening you don't forget.
And then: my girls — my wife and my sweet daughter — to bed. Me to the computer.
The Brain Kicks Into Gear — and That's a Good Thing.
I'm currently working on a training dataset for a local language model. The goal: the model should be able to write in a very specific style about a specific topic — so precisely, so domain-specifically, that it can eventually generate high-quality articles on its own without me checking every sentence.
Sounds simple. It's not.
The first step is data collection and cleaning. We're talking about a corpus of 347 articles — collected, cleaned, manually reviewed. Each article was checked for minimum length (500 words), quality, and relevance. Short stubs, duplicates, and poor translations out — that costs more time than the actual training. In the end, 284 clean documents remain, split into 80% training, 10% validation, 10% test. That's approximately 420,000 tokens — tokenized with the base model's tokenizer.
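The mechanical part of that cleaning step can be sketched in a few lines. This is only a rough illustration, not the pipeline actually used: the function names are made up, duplicates are detected by a simple hash after whitespace and case normalization, and the manual review and translation-quality checks are not automated here.

```python
import hashlib
import random


def clean_corpus(articles, min_words=500):
    """Drop articles below the minimum length and exact duplicates."""
    seen = set()
    kept = []
    for text in articles:
        normalized = " ".join(text.split())
        if len(normalized.split()) < min_words:
            continue  # too short: treat as a stub
        digest = hashlib.sha256(normalized.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same text after whitespace/case normalization
        seen.add(digest)
        kept.append(normalized)
    return kept


def split_corpus(docs, train=0.8, val=0.1, seed=42):
    """Shuffle deterministically, then split into train/validation/test."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # the remainder becomes the test set
    )


if __name__ == "__main__":
    # 284 placeholder strings stand in for the cleaned documents.
    docs = [f"document {i}" for i in range(284)]
    train_set, val_set, test_set = split_corpus(docs)
    print(len(train_set), len(val_set), len(test_set))  # 227 28 29
```

With 284 documents, an 80/10/10 split cannot be exact, so the rounding remainder lands in the test set here. That choice is arbitrary.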
The training format is JSONL. Each line one example:
{"instruction": "Write a technical article about [topic]", "input": "", "output": "The complete article..."}
284 such pairs. Simple — but the devil is in the output format. Consistency in structure, tone, technical terminology, length. All of that needs to be reflected in the dataset.
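Writing and checking such a file might look like the sketch below. The three field names come from the format shown above. The helper names and the specific checks are assumptions for illustration.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def to_jsonl_line(topic, article):
    """Serialize one training pair as a single JSONL line."""
    record = {
        "instruction": f"Write a technical article about {topic}",
        "input": "",
        "output": article,
    }
    # json.dumps escapes embedded newlines, so the record stays on one line.
    return json.dumps(record, ensure_ascii=False)


def validate_jsonl(lines):
    """Return (line_number, problem) tuples for malformed records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((number, f"missing keys: {missing}"))
        elif not isinstance(record["output"], str) or not record["output"].strip():
            problems.append((number, "empty or non-text output"))
    return problems
```

A check like this only catches structural problems. Consistency of tone, terminology, and length still has to be judged by reading the articles.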
For the base model I used qwen2.5:14b today, the sweet spot between quality and training time on my local setup. The fine-tuning runs via LoRA (Low-Rank Adaptation) on Apple Silicon using the MPS backend. LoRA means: we don't retrain the entire model. The original weights stay frozen, and we train small low-rank adapter matrices alongside them. That saves massive amounts of memory and time, usually with only a small quality loss compared to full fine-tuning.
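To get a feeling for where the savings come from, here is a small back-of-the-envelope calculation. It assumes the standard LoRA formulation, where a frozen weight matrix W is adapted as W + (alpha / r) * B @ A. The matrix size of 5120 x 5120 is an illustrative assumption, not a figure from this run.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Weights in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


def lora_scaling(alpha, rank):
    """Factor applied to the low-rank update in W + (alpha / rank) * B @ A."""
    return alpha / rank


if __name__ == "__main__":
    d = 5120  # illustrative square projection size (assumption)
    full = d * d
    lora = lora_trainable_params(d, d, rank=32)
    print(f"full matrix:  {full:,} weights")
    print(f"LoRA factors: {lora:,} weights ({lora / full:.2%} of full)")
    print(f"scaling alpha/r: {lora_scaling(64, 32)}")
```

For a single matrix of that size, rank 32 means training 327,680 weights instead of 26,214,400, which is 1.25%. With alpha 64 and rank 32, the update is scaled by 2.0.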
Apple Silicon is surprisingly good for this purpose — unified memory, high bandwidth, no bottleneck between CPU and GPU memory. But it has its limits, of course. For 14B models with LoRA it works very well. For 32B+ it gets slow, for 70B+ it gets painful. So if anyone asks what's on my wish list: an NVIDIA DGX Spark. To date I haven't found a sponsor for one — if anyone reading this is interested in supporting the NOG Community with local AI training: I'm very happy to talk.
Today's parameters:
Base model: qwen2.5:14b
LoRA rank: 32
LoRA alpha: 64
Learning rate: 2e-4 (with cosine scheduler)
Batch size: 4 (gradient accumulation: 8 -> effective batch: 32)
Epochs: 4
Warmup steps: 50
Total training steps: ~2,840
Hardware: Apple Silicon (MPS)
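How those numbers interact can be sketched as follows. The values are taken at face value from the list above. The shape of the warmup (linear) and the decay all the way to zero are assumptions, since the list only says "cosine scheduler" and "50 warmup steps".

```python
import math

PEAK_LR = 2e-4
WARMUP_STEPS = 50
TOTAL_STEPS = 2840
MICRO_BATCH = 4
GRAD_ACCUM = 8


def effective_batch(micro_batch=MICRO_BATCH, grad_accum=GRAD_ACCUM):
    """Examples contributing to one optimizer update."""
    return micro_batch * grad_accum


def learning_rate(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to the peak, then cosine decay to zero (assumed shape)."""
    if step < warmup:
        return peak * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


if __name__ == "__main__":
    print("effective batch:", effective_batch())
    for step in (0, 49, 50, 1445, 2840):
        print(f"step {step:>4}: lr = {learning_rate(step):.2e}")
```

The rate climbs during the first 50 steps, reaches 2e-4, passes 1e-4 halfway through the decay phase, and ends at zero.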
The first epoch took about 47 minutes — after that MPS warmed up and the remaining epochs were around 38–42 minutes each. Total runtime today: just under 3 hours. I started the run after the ice cream, evaluated the results around 11:15 PM.
The loss curve already says a lot:
Epoch 1 — Train Loss: 2.41 | Val Loss: 2.38
Epoch 2 — Train Loss: 1.74 | Val Loss: 1.79
Epoch 3 — Train Loss: 1.31 | Val Loss: 1.42
Epoch 4 — Train Loss: 1.08 | Val Loss: 1.19
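Val Loss follows Train Loss downward, so there is no sign of overfitting yet. That's good. The gap between the two has grown to 0.11 in epochs 3 and 4, which is worth keeping an eye on. When Val Loss starts rising while Train Loss keeps falling, the model is memorizing the dataset instead of generalizing. That happens fast with small, specialized corpora.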
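That rule of thumb is easy to turn into a small check. The loss values are the ones from the table above. The function names are made up for this sketch.

```python
# (epoch, train_loss, val_loss) as reported above
LOSSES = [
    (1, 2.41, 2.38),
    (2, 1.74, 1.79),
    (3, 1.31, 1.42),
    (4, 1.08, 1.19),
]


def generalization_gaps(losses):
    """Validation minus training loss per epoch, rounded to two decimals."""
    return [round(val - train, 2) for _, train, val in losses]


def first_overfitting_epoch(losses):
    """First epoch where val loss rises while train loss still falls, else None."""
    for (_, prev_train, prev_val), (epoch, train, val) in zip(losses, losses[1:]):
        if train < prev_train and val > prev_val:
            return epoch
    return None


if __name__ == "__main__":
    print("gaps:", generalization_gaps(LOSSES))
    print("overfitting starts at epoch:", first_overfitting_epoch(LOSSES))
```

For this run the check returns None: validation loss falls in every epoch, even though the gap to the training loss widens.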
After training: export as GGUF (quantized to Q5_K_M, a good trade-off between size and quality) and load directly into Ollama. Then the first manual evaluation: three prompts, three articles, quality assessed by hand. The model already captures the style significantly better than the base model, but there are still outliers in technical terms and sentence structure. Next run: expand the dataset by ~40 articles, targeted at the weaker areas.
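Collecting the three sample articles for such a manual review could be scripted roughly like this, using Ollama's local HTTP API. The model name and the topics are placeholders, and the prompt wording simply mirrors the training format. None of this is taken from the actual evaluation.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "my-finetuned-model"  # hypothetical; the real model name is not given
EVAL_TOPICS = ["topic A", "topic B", "topic C"]  # placeholders


def build_request(topic, model=MODEL_NAME):
    """Build a non-streaming generate request for one evaluation prompt."""
    payload = {
        "model": model,
        "prompt": f"Write a technical article about {topic}",
        "stream": False,  # return one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(topic, model=MODEL_NAME, timeout=600):
    """Send the request to the local Ollama server and return the article text."""
    with urllib.request.urlopen(build_request(topic, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


def run_manual_eval(topics=EVAL_TOPICS):
    """Print a short preview of each article; the judging itself stays manual."""
    for topic in topics:
        article = generate(topic)
        print(f"--- {topic}: {len(article.split())} words ---")
        print(article[:300])
```

Calling run_manual_eval() requires a running Ollama instance with the model already imported. Everything stays on localhost, in line with the local-only approach.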
Why all this effort? Because this isn't a single project.
I'm training an entire family of specialized models — all local, all built for specific tasks within the NOG Community:
MatchingLLM — a model for intelligent matching in event management. Who fits with whom? Which attendees should talk to each other? Which speaker fits which track? No simple keyword matching, but semantic understanding of profiles, interests, and context.
TopicLLM & SpeakerLLM — two closely related models for the speaker and topic hype cycle. Similar to the Gartner Hype Cycle, but specialized: which topics are currently on the rise, which are at the plateau, which are already past the peak of inflated expectations? Which speakers are bringing fresh ideas, which are established, which are being overlooked? These models analyze, cluster, and evaluate continuously — and help build conference programs that are genuinely relevant.
InfraLLM — the model that lets me sleep most soundly at night. It permanently monitors the infrastructure, detects anomalies, identifies weaknesses — and self-repairs where possible. Not as a replacement for classic monitoring tools, but as an intelligent layer on top: what does this error mean in context? Is this a known situation? Is there a proven countermeasure? And if so — execute it.
SecurityLLM — tightly coupled with ShieldX. This model specializes in attack detection and response: prompt injection, anomalous traffic, suspicious patterns in logs. When ShieldX detects an attack, SecurityLLM is the brain behind it — it assesses the context, decides on the response, and triggers automated countermeasures where necessary. This includes IP blackholing via BGP: a detected attacker doesn't just land on a local blocklist, but is actively removed from routing. This isn't theory — it's running.
All these models run locally. No internet communication, no external interfaces, no phoning home. What runs locally belongs to you. You control what it does, where it communicates, what it knows about you. For me that's not purely a technical decision — it's a conviction. Once you understand how little control you actually have over external services, you build local.
And that's exactly why a DGX Spark wouldn't be a luxury — it would be infrastructure.
Which project is today's training run actually for? That I'm not revealing yet. Patience.
ShieldX — Live Yesterday, An Exciting Experiment Today
Yesterday I published the benchmark results of my own ShieldX test — a detailed red team report on how ShieldX responds to real prompt injection attacks. If you haven't read it yet: it's worth reading.
But the most interesting thing happened today.
Shamim has agreed that his team will take a really close look at ShieldX. No friendly pat on the back — a genuine, realistic review by people who know what they're doing. I'm looking forward to it and at the same time genuinely nervous: have we built something solid here? Or did a critical flaw sneak in that I simply missed in my own testing?
I don't know yet. And that's exactly what makes it interesting. When the results are in and I can share them — you'll read them here first. If you hear nothing more about ShieldX... well, then you'll know what that means too.
Shamim is part of Team Phoenix — and for those who don't know Team Phoenix yet: it's an initiative I'm very glad to be involved with and that I consider enormously important. In June 2026 the Phoenix Summit takes place — and I can already say: if you have the opportunity to be there, take it. The team is still looking for sponsors, and Flexoptix will once again be doing a Fellowship sponsorship. I'm already looking forward to it.
The last event left a deep impression on me. It took place alongside bdNOG — so I didn't simply go home, I was right in the middle of a week full of encounters. The Fellowship sponsorship there was one of the best decisions I was part of. Tashi and I had conversations I won't forget. The smiling faces of the Fellows, their motivation, their energy — it's hard to put into words. Indescribable is the only right word for it. I'm excited about what June 2026 will bring.
EO_GP — When Distance No Longer Matters
There's a project I've barely named publicly — internally it runs under EO_GP. I'm not revealing more details today. Just this much:
It will be a tool that brings together people who are actually far apart. Imagine working, thinking, and deciding with someone — and then realizing that person is on another continent, in a different timezone, in the middle of a different morning. EO_GP is meant to create exactly that feeling: the distance disappears. What remains is the collaboration. The feeling of sitting side by side. No matter where. No matter when.
Those who understand who this is intended for — they'll get it. Those who don't yet know will find out when the time is right.
The Third Project — Not for Me
This is the most mysterious of all — and at the same time perhaps the one that occupies me most.
It's not a project for me. It's a project for someone else. And if it works out — and right now it's looking very promising — then it will change something. Not symbolically. Substantially.
I won't say more today. But I'm watching the development with a mix of anticipation and genuine respect for what might emerge. Sometimes the best things you can do are the things where you're not in the foreground at the end.
ADHD Reflection
Today was one of those days when the ADHD was almost invisible. Almost.
Already at breakfast I noticed several times how my head wanted to drift toward work. A thought here, an idea there: the training setup, an open question, a note I hadn't made yet. I've taught myself a small trick: write the thought down in three words and then let it go. The brain gives in when it knows the thought won't be lost. It just wants security, not immediate execution.
At swimming it was different. There the brain simply... stopped. Not from exhaustion, but from genuine experience. My daughter in the water, her laughter, the echo in the hall — these are moments that silence even the loudest ADHD brain for a little while.
This family time is demanding — just like traveling is, just like intensive work phases are. But it also charges me up. In a way no coffee and no productivity system in the world can replace. My wife is the anchor. My daughter is the light. When I sit down at the laptop in the evening and work for hours more, it's partly because of this: because the day before was good. Because the battery is full. Because I know what it's all for.
Then in the evening: hyperfocus. Three windows, two terminals, one document. No tiredness, no hunger, no distraction. That's the paradox of ADHD: during the day you sometimes fight for every thought — and at night you could rebuild the world. And suddenly it's 1 or 2 AM — and at 6:00 AM the alarm clock rings again. Both are real. Both are me.
That's something I'm happy to settle with myself.
More tomorrow.