I'm looking for an intern who wants to work on exact runtimes, local models, OCR and document extraction, and research engineering in public. This role is for people who want to ship fast on technically serious systems; it is not generic prompt-wrapper work.
Logistics
- Paid
- Remote
- IST-friendly collaboration
What This Role Is
This internship centers on problem-shaped virtual machines (PSVMs), the term the paper at psvm.aviraj.dev uses for task-specific exact runtimes with model-guided decision surfaces.
PSVM is not a standard term, and you are not expected to know it already. Part of the assignment is to see whether you can understand and critique the idea from first principles.
The core thesis is simple:
Code keeps truth, model ranks ambiguity.
This is not a generic "build an LLM wrapper" internship.
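To make the thesis concrete, here is a minimal Python sketch on a toy puzzle (a 4x4 Latin square, a simplified stand-in for Sudoku). It is an illustration under assumptions, not code from the paper or its repository: the names `legal_moves`, `is_solved`, `solve`, and the `rank` callback are all hypothetical. Code enumerates the legal moves and verifies the result; the ranker, standing in for a model, only chooses the order in which legal moves are tried.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]       # 0 marks an empty cell
Move = Tuple[int, int, int]  # (row, col, value)
Ranker = Callable[[Grid, List[Move]], List[Move]]


def legal_moves(grid: Grid) -> List[Move]:
    """Code keeps truth: list the legal values for the first empty cell."""
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                return [(r, c, v) for v in range(1, n + 1) if v not in used]
    return []


def is_solved(grid: Grid) -> bool:
    """Verifier: every row and every column is a permutation of 1..n."""
    n = len(grid)
    want = set(range(1, n + 1))
    rows_ok = all(set(row) == want for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == want for c in range(n))
    return rows_ok and cols_ok


def solve(grid: Grid, rank: Ranker) -> Optional[Grid]:
    """Depth-first search. `rank` orders legal moves; it cannot add new ones."""
    if is_solved(grid):
        return grid
    moves = legal_moves(grid)
    legal = set(moves)
    for r, c, v in rank(grid, moves):
        if (r, c, v) not in legal:
            continue  # anything the ranker invented is ignored
        grid[r][c] = v
        result = solve(grid, rank)
        if result is not None:
            return result
        grid[r][c] = 0  # roll back and try the next ranked move
    return None


def prefer_high(grid: Grid, moves: List[Move]) -> List[Move]:
    """Hypothetical stand-in for a model: try larger values first."""
    return sorted(moves, key=lambda m: m[2], reverse=True)


puzzle = [
    [1, 0, 0, 4],
    [0, 1, 4, 0],
    [3, 0, 1, 0],
    [0, 3, 0, 1],
]
solution = solve([row[:] for row in puzzle], prefer_high)
```

A bad ranker can make this search slower, but it cannot make the answer wrong, because legality and verification never leave code.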
What The Work Will Feel Like
I care about speed, rigor, and public artifacts.
If we work together, expect a working style closer to the public work around yoyo, the PSVM paper, and Apache Iggy than to a slow internship queue.
What You Will Work On
- designing exact runtimes for narrow task families
- defining legal action surfaces, verifiers, and canonical traces
- building or improving browser-local demos
- training and evaluating small local models over structured runtime state
- improving candidate extraction, ranking, validation, and rollback behavior
- running experiments and benchmarks, and writing paper-quality notes on what is proven and what is still weak
Typical problem areas include:
- Sudoku-style guided exact search
- receipt or invoice total extraction
- Tally-style voucher extraction
- deterministic workflows where legality and validation remain in code
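As an illustration of legality and validation staying in code, the following Python sketch handles a toy receipt. Everything in it is an assumption made for the example: the regular expression, the `candidates`, `checks_out`, and `verified_total` helpers, the `score` callback standing in for a model, and the rule that a total must equal the sum of two other printed amounts (for example subtotal plus tax). It is not the paper's extraction pipeline.

```python
import re
from decimal import Decimal
from itertools import combinations
from typing import Callable, List, Optional

# Assumed format: amounts printed with exactly two decimal places.
AMOUNT = re.compile(r"\b\d+\.\d{2}\b")


def candidates(text: str) -> List[Decimal]:
    """Code proposes: only amounts that literally appear in the text."""
    return [Decimal(m) for m in AMOUNT.findall(text)]


def checks_out(total: Decimal, amounts: List[Decimal]) -> bool:
    """Verifier (toy rule): total equals the sum of two other amounts."""
    others = list(amounts)
    others.remove(total)  # drop one occurrence of the candidate itself
    return any(a + b == total for a, b in combinations(others, 2))


def verified_total(
    text: str, score: Callable[[Decimal], Decimal]
) -> Optional[Decimal]:
    """Try candidates in ranked order; return the first one that verifies."""
    amounts = candidates(text)
    for guess in sorted(set(amounts), key=score, reverse=True):
        if checks_out(guess, amounts):
            return guess
    return None  # refuse to answer rather than guess


def prefer_large(amount: Decimal) -> Decimal:
    """Hypothetical stand-in for a model score: larger amounts rank first."""
    return amount


receipt = "Coffee 3.50\nBagel 2.25\nSubtotal 5.75\nTax 0.46\nTotal 6.21"
total = verified_total(receipt, prefer_large)
```

Note that 5.75 would also pass this deliberately weak check (3.50 + 2.25). That is the kind of ambiguity the ranker is there to order, and a real verifier would need stronger rules.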
Ideal Candidate Profile
- strong problem-solving and debugging fundamentals
- comfortable in JavaScript and Python
- understands basic ML training and evaluation
- likes exactness, constraints, state machines, validation, and metrics
- can read technical writing carefully and respond with precise criticism
- writes clearly
Strong Pluses
- Rust
- OCR or document extraction work
- ONNX or browser-local inference
- search, planning, constraint solving, or verification-heavy systems
- open-source contribution experience
What You Get
- visible ownership of real artifacts
- exposure to both research and shipping
- direct work on code, demos, evals, and paper framing
- a stronger portfolio than a generic AI internship would give you
- experience working at a pace closer to serious open-source building than to a standard internship process
Assignment
Read the paper at psvm.aviraj.dev and explain what you understand from it in your own words.
The paper introduces the term PSVM, short for problem-shaped virtual machine. You are not expected to know the term in advance. I want to see whether you can understand and evaluate the idea from first principles.
This is not a summarization exercise. The goal is to see how clearly you can understand a technical thesis, separate strong claims from weak claims, and identify where you would contribute first if you were working on the project.
Suggested Timebox
60 to 90 minutes.
Prompt
Write a short note answering the following:
- What problem is the paper trying to solve?
- The paper introduces the term PSVM. What does it mean? Define it clearly in your own words.
- Why does the paper argue for "code keeps truth, model ranks ambiguity" instead of end-to-end generation?
- Explain the roles of the following pieces in the PSVM framing:
  - exact runtime
  - legal action surface
  - verifier
  - canonical trace
  - resolver
- Use at least two examples from the paper or linked project work and explain what each example proves for the thesis. Good examples include:
  - Sudoku
  - receipt or invoice total extraction
  - Tally voucher extraction
- Where is the paper strongest today?
- Where is the paper still weak, incomplete, or not yet empirically proven?
- If you were joining the project for 6 to 8 weeks, what would you build, measure, or improve first, and why?
Expectations
- Write in your own words. Do not paraphrase the paper section by section.
- Be concrete. If you agree with a claim, explain why.
- If you disagree with a claim, explain the technical reason.
- Separate:
  - what the paper claims
  - what the repository demonstrates
  - what is still unproven
- Prefer precise systems thinking over generic AI commentary.
Submission Format
- 800 to 1500 words
- Submit as Markdown or PDF
- Use short sections with clear headings
- End with a final section titled "If I joined, my first milestone would be..."
How To Apply
Email your application to [email protected].
Please include:
- a short introduction
- your resume or LinkedIn
- your GitHub or portfolio links
- your assignment response
If you have suggestions for the paper itself, you may also open a PR against the problem-shaped-vms-paper repository.
The PR is optional and is not required to apply.
What I Will Look For
- clarity of thought
- technical accuracy
- ability to distinguish architecture from evidence
- quality of criticism
- strength of proposed next steps
- writing quality
Optional Bonus
If useful, include one simple diagram or table showing how you think the PSVM loop works:
task -> exact state -> legal frontier -> model ranking -> verification -> transition