AroTranslate
First neural machine-translation system for Aromanian, an endangered Romance language. Largest Aromanian–Romanian dataset; deployed to 5,000+ users; primary author of the COLING 2025 paper.
sophomore at columbia university studying cs & math
aij2115 [at] columbia [dot] edu
I’m Alex Jerpelea, a rising sophomore at Columbia University studying CS and math. My interests are all over the place. On the AI side I love deep learning, mechanistic interpretability, multi-agent systems, latent communication, and machine translation. Beyond that I’m into efficient algorithms, systems design, UNIX, decentralized networks, and building fun consumer apps. I grew up in Bucharest, Romania, where I spent most of high school doing competitive programming (see the awards below). From there I fell into NLP, then machine learning more broadly, then startups, and the list keeps growing. Mostly, I’m just having fun learning as much as I can.
I’m doing research at Columbia’s DAPLab, where I study failure modes in multi-agent systems and work on efficient latent communication and memory to strengthen agents’ “Theory of Mind”. On the side, I study the manifolds that concepts form inside Transformers, and I run my own experiments on how Transformers could be more efficient and deeper learners.
My main research right now, at Columbia’s DAPLab: benchmarking where multi-agent systems break and (still largely in the planning phase) building efficient latent communication and memory to give agents a real “Theory of Mind”.
A testbed for my hypotheses on making Transformers more efficient and deeper learners. Very much a work in progress.
A pipeline and benchmark for affective state identification (emotion recognition) in Romanian: mining “I feel” expressions from web, subtitle, and social-media text, then validating them with an LLM and human annotators. Continual-learning research toward low-resource emotion models.
A multimodal meme corpus for Romanian: the first digital corpus of Romanian-language memes, with political-sentiment detection.
Authorship profiling for Romanian, learned from Reddit.
Top 3 · NY Tech Week
A Chrome extension that optimizes your feed for dopamine using EEG signals.
Best Use of Hermes · Nous Research
fMRI-enhanced swarm simulations for prediction markets.
Best Use of Nemotron · NVIDIA
Agents coordinating agents to solve software & kernel engineering.
Best Use of Solana · MLH
Crowdsourced diffusion data for world models, on Solana.
Best Social Impact · Best Use of Featherless
AI agents in real-time disaster scenarios.
A benchmark pipeline that injects realistic Java deadlocks and tests whether AI agents can detect and fix them.
A fully open-source, end-to-end pipeline that turns one script into a finished character-grounded vertical reel (ComfyUI · FLUX · Wan · ElevenLabs).
Hive is an open-source platform where AI agents collaboratively improve shared artifacts. My contribution: a system that lets an agent self-verify it has actually completed a round of training.
Columbia University · New York, NY
Studying failure modes in multi-agent systems, and efficient latent communication and memory to strengthen agents’ Theory of Mind.
Columbia University · New York, NY
Extending affective-state-identification models with continual learning for low-resource languages, under Prof. Kathleen McKeown.
Human Language Technology Center, University of Bucharest · Romania
Built the first neural MT system for Aromanian; ran the largest Aromanian–Romanian data collection; 5,000+ users. Primary author of the COLING 2025 paper; represented Aromanian preservation at UNESCO’s “Language Technology for All” (Paris).
Veridion · Bucharest, Romania
Built language models extracting insights on 100M+ firms from public web data (employee count, revenue, business tags) for a global business database serving clients like McKinsey & Exiger.
AI Institute, Romanian National Academy · Bucharest
Built the first Romanian meme corpus and political-sentiment models; co-authored RoMemes (Best Paper, ConsILR 2024).
…thousands of Codeforces problems later
One of 12 students selected for the program at admission.
Gold & 7th nationally (2025) · Silver (2024) · Bronze (2023, 2022) · Extended National Team, IOI (2025)
Bronze medal at this international, olympiad-style competitive-programming contest in Shumen, Bulgaria.
Bronze medal at this international, olympiad-style contest.
“RoMemes: A Multimodal Meme Corpus for Romanian”
Extended National Team (international competition)
Qualified for the national phase