Experience
A timeline of where I've spent the past two decades, and what I learned building each thing. I've kept names vague for my current employer and concrete for the previous ones — standard practice for someone working in a regulated industry.
I lead a small engineering team owning four Treasury services across Risk, Funding and Controls. End-to-end ownership: design, implementation, production operation, cross-team coordination with trading desks, risk officers and downstream reporting teams.
The core of the work over the last few years has been performance and reliability engineering on a distributed batch grid that runs at end-of-day (EOD) against hundreds of thousands of positions. We reduced the P95 runtime from 5–8 hours to 10–15 minutes, reliably completing within the post-trade window and unblocking every downstream system that depended on our outputs.
The work that got us there wasn't one heroic change; it was dozens of small ones:
- Redesigning caching and data access in a critical risk scenario that was causing grid node timeouts. We moved from overweight cache keys to structurally hashed ones, introduced an intermediate-results layer, and fixed a handful of unindexed DB queries — guided by a graph-based profiler that visualized hotspots, cache hit/miss ratios, and CPU/memory per node.
- Removing high-cost patterns across hot paths: redundant multi-pass transforms, overeager precomputation, unnecessary collection copies, boxing in tight inner loops. Most of this was invisible individually and massive cumulatively.
- Building a reusable validation + health-check framework: config validation (schema, dependency resolution), data validation (DB-backed invariants), and runtime health checks. Adopted by 4 teams of 7 to 30+ engineers, covering 13 critical processes. Production investigations dropped from 1–2 per week to near zero.
- Migrating a very large CLI-argument-based configuration surface to HOCON with proper schema and inheritance — which eliminated a class of recurring misconfiguration failures and saved ~20–30 engineer-hours per month of investigation time.
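The cache-key redesign in the first bullet can be sketched roughly like this — a compact, immutable key derived from the fields that actually determine the result, instead of a heavyweight object graph, fronting an intermediate-results layer. All names here are illustrative, not the production code:

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: a structurally hashed cache key. The hash is
// precomputed once because the key is immutable.
final class ScenarioKey {
    private final String scenarioId;
    private final String positionSetVersion;
    private final long marketDataSnapshot;
    private final int hash;

    ScenarioKey(String scenarioId, String positionSetVersion, long marketDataSnapshot) {
        this.scenarioId = scenarioId;
        this.positionSetVersion = positionSetVersion;
        this.marketDataSnapshot = marketDataSnapshot;
        this.hash = Objects.hash(scenarioId, positionSetVersion, marketDataSnapshot);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ScenarioKey)) return false;
        ScenarioKey k = (ScenarioKey) o;
        return marketDataSnapshot == k.marketDataSnapshot
                && scenarioId.equals(k.scenarioId)
                && positionSetVersion.equals(k.positionSetVersion);
    }

    @Override public int hashCode() { return hash; }
}

// Intermediate-results layer: memoize per-key partial results so repeated
// scenario runs reuse them instead of recomputing.
final class IntermediateResults<V> {
    private final Map<ScenarioKey, V> cache = new ConcurrentHashMap<>();

    V computeIfAbsent(ScenarioKey key, Function<ScenarioKey, V> compute) {
        return cache.computeIfAbsent(key, compute);
    }
}
```

The point of the structural key is that two requests describing the same computation collide on the same entry, so the expensive work runs once per structure rather than once per caller.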
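The hot-path cleanups were mostly of this flavor — a contrived before/after, not the actual code:

```java
import java.util.List;

final class HotPath {
    // Before (illustrative): every element and every partial sum is boxed
    // to Integer, and the pipeline does per-element allocation.
    static int scaleAndSumBoxed(List<Integer> values) {
        return values.stream()
                .map(v -> v * 2)          // boxes each doubled value
                .reduce(0, Integer::sum); // boxes each partial sum
    }

    // After: a single pass over primitives — no boxing, no intermediate
    // collection, same result.
    static int scaleAndSumPrimitive(int[] values) {
        int total = 0;
        for (int v : values) total += v * 2;
        return total;
    }
}
```

Individually each change like this is noise; across hundreds of thousands of positions per EOD run, the allocations add up.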
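The validation framework's core idea fits in a few lines. This is a minimal sketch of the shape — named checks returning pass/fail plus detail, run as a batch before a process starts — with interface and names invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a named check with a pass/fail result.
record CheckResult(String name, boolean ok, String detail) {}

interface HealthCheck {
    String name();
    CheckResult run();

    // Factory for simple boolean probes; a probe that throws counts as a failure.
    static HealthCheck of(String name, Supplier<Boolean> probe, String failureDetail) {
        return new HealthCheck() {
            public String name() { return name; }
            public CheckResult run() {
                boolean ok;
                try { ok = probe.get(); } catch (RuntimeException e) { ok = false; }
                return new CheckResult(name, ok, ok ? "ok" : failureDetail);
            }
        };
    }
}

final class CheckRunner {
    // Run every check and collect results, rather than failing fast,
    // so one report shows everything wrong at once.
    static List<CheckResult> runAll(List<HealthCheck> checks) {
        List<CheckResult> results = new ArrayList<>();
        for (HealthCheck c : checks) results.add(c.run());
        return results;
    }
}
```

What made the real framework reusable across teams was exactly this separation: processes declare checks, the runner owns execution and reporting.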
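The spirit of the HOCON migration, with invented keys — shared defaults in one schema-checked base file, and each service inheriting and overriding only what differs:

```hocon
# base.conf — shared defaults, validated against a schema at startup
grid {
  batch-size = 5000
  timeout    = 30s
  retries    = 3
}

# risk-eod.conf — one service's file: include the base, override one key
include "base.conf"

grid {
  batch-size = 20000   # everything else is inherited from base.conf
}
```

Compared with a long CLI argument list, a misspelled or missing key fails schema validation at startup instead of surfacing hours later as a wrong result.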
Joined FXCM as a senior engineer, moved into a Project Lead role, and ended up managing a distributed team of 24 engineers out of Sofia while staying hands-on with the architecture. Regular on-site visits, full remote leadership in between.
The headline project was a trading data integration platform that fed Salesforce and analytics from 50+ Oracle databases. When I took it over, Salesforce latency was 5–6 hours — the business was running customer-facing decisions on data that was effectively stale for most of the trading day. By the time we shipped, we were down to 3–5 minutes, processing roughly 1M events and 500k trades per day.
Some things I learned there:
- SLA tiering is underrated. We sat down with business stakeholders and engineering leads and explicitly partitioned the data flows: event-driven real-time updates for trade-critical objects, timer-based batches for EOD summaries, nightly for low-priority enrichment. Before we did this, everything was "urgent" and the system was constantly thrashing.
- Building instead of buying can pay off if you have the team. We replaced an expensive third-party ETL tool with an in-house Kotlin/Spring engine. Took about a year of hardening before it was fully trusted in production, but after that it saved ~$900k per year and — more importantly — we could actually debug it.
- Distributed leadership is hard and mostly about writing. I spent a lot of time over those years writing docs, writing clear tickets, writing code reviews that taught rather than gated. The time-zone gap between NYC and Sofia is rough; async quality matters a lot.
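The SLA tiering above amounts to an explicit mapping from object type to delivery mode, instead of everything implicitly being "real-time". Schematically — object names and the default are invented for illustration, not FXCM's actual mapping:

```java
import java.util.Map;

// Each delivery tier corresponds to one of the partitioned flows:
// event-driven push, timer-based batch, or nightly run.
enum Tier { REALTIME_EVENT, TIMER_BATCH, NIGHTLY }

final class SlaRouting {
    static final Map<String, Tier> TIERS = Map.of(
            "Trade",      Tier.REALTIME_EVENT, // trade-critical: push on event
            "Position",   Tier.REALTIME_EVENT,
            "EodSummary", Tier.TIMER_BATCH,    // timer-based batch
            "Enrichment", Tier.NIGHTLY);       // low-priority enrichment

    // Unmapped types default to the cheapest tier — forcing a deliberate
    // decision before anything new gets real-time treatment.
    static Tier tierFor(String objectType) {
        return TIERS.getOrDefault(objectType, Tier.NIGHTLY);
    }
}
```

The table being explicit is the whole value: the negotiation with stakeholders happens once, in one place, instead of implicitly in every pipeline.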
My first serious job after moving to the US — and I stayed for twelve years. ActForex was a 15-person forex platform startup with the kind of flat structure you only get at small shops: Tech Lead reported to the Director who reported to the Owner. That was the whole chain.
Joined as a backend engineer and eventually ended up as Tech Lead. Built reconciliation systems, background reporting, and a handful of the lower-level trading infrastructure pieces. Wrote a lot of Java against Oracle, which is a skill set I still find occasionally useful two decades later.
Twelve years at one place teaches you specific things: what code looks like after you're the one who has to maintain it for a decade, how to negotiate with a team that trusts you, and how to tell when a system needs to be rebuilt versus when it just needs to be understood better.