Paid Remote Research Engineer Internship: Exact Runtime ML / Problem-Shaped VMs

I'm looking for an intern who wants to work on exact runtimes, local models, OCR and document extraction, and research engineering in public. This is for people who want to ship fast on technically serious systems, not for generic prompt-wrapper work.

Logistics

Paid
Remote
IST-friendly collaboration

What This Role Is

This internship centers on problem-shaped virtual machines (PSVMs), a term introduced in the paper at psvm.aviraj.dev for task-specific exact runtimes with model-guided decision surfaces.

PSVM is not a standard term you are expected to already know. Part of the assignment is to see whether you can understand and critique the idea from first principles.

The core thesis is simple:

Code keeps truth, model ranks ambiguity.

This is not a generic "build an LLM wrapper" internship.

What The Work Will Feel Like

I care about speed, rigor, and public artifacts.

If we work together, expect a working style closer to the public work around yoyo, the PSVM paper, and Apache Iggy than to a slow internship queue.

What You Will Work On

designing exact runtimes for narrow task families
defining legal action surfaces, verifiers, and canonical traces
building or improving browser-local demos
training and evaluating small local models over structured runtime state
improving candidate extraction, ranking, validation, and rollback behavior
writing experiments, benchmarks, and paper-quality notes around what is proven and what is still weak

Typical problem areas include:

Sudoku-style guided exact search
receipt or invoice total extraction
Tally-style voucher extraction
deterministic workflows where legality and validation remain in code
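
As a rough illustration of validation staying in code, here is a minimal Python sketch for the receipt-total case. It is not taken from the paper or the repository: the function names (make_verifier, resolve_total), the sample amounts, and the rule "total = items + tax" are all assumptions made for this example. A deliberately poor ranker stands in for the model, to show that a wrong top guess is rejected rather than emitted.

```python
from decimal import Decimal
from typing import Callable, List, Optional


def make_verifier(items: List[Decimal], tax: Decimal) -> Callable[[Decimal], bool]:
    """Exact rule (an assumption for this sketch): total == sum(items) + tax."""
    expected = sum(items, Decimal("0")) + tax
    return lambda candidate: candidate == expected


def resolve_total(
    candidates: List[Decimal],
    rank: Callable[[Decimal], float],
    verify: Callable[[Decimal], bool],
) -> Optional[Decimal]:
    """Try candidates in ranked order; accept only one that the code verifies."""
    for candidate in sorted(candidates, key=rank, reverse=True):
        if verify(candidate):
            return candidate
    return None  # abstain instead of guessing


# Hypothetical amounts read from one receipt (items, subtotal, tax, total, cash, change).
candidates = [Decimal(s) for s in ["3.50", "2.25", "5.75", "0.46", "6.21", "10.00", "3.79"]]
verify = make_verifier([Decimal("3.50"), Decimal("2.25")], Decimal("0.46"))

# Stand-in for a model: a naive ranker that prefers the largest amount.
largest_first = lambda amount: float(amount)

total = resolve_total(candidates, largest_first, verify)
print(total)
```

The ranker's first choice here is the cash tendered, which fails the exact check, so the loop falls through to the next candidate. A better ranker would only reduce the number of rejected attempts; it would not change what counts as a valid answer.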

Ideal Candidate Profile

strong problem-solving and debugging fundamentals
comfortable in JavaScript and Python
understands basic ML training and evaluation
likes exactness, constraints, state machines, validation, and metrics
can read technical writing carefully and respond with precise criticism
writes clearly

Strong Pluses

Rust
OCR or document extraction work
ONNX or browser-local inference
search, planning, constraint solving, or verification-heavy systems
open-source contribution experience

What You Get

visible ownership of real artifacts
exposure to both research and shipping
direct work on code, demos, evals, and paper framing
a portfolio stronger than what a generic AI intern role usually produces
experience working at a pace closer to serious open-source building than to a standard internship process

Assignment

Read the paper at psvm.aviraj.dev and explain what you understand from it in your own words.

The paper introduces the term PSVM, short for problem-shaped virtual machine. You are not expected to know the term in advance. I want to see whether you can understand and evaluate the idea from first principles.

This is not a summarization exercise. The goal is to see how clearly you can understand a technical thesis, separate strong claims from weak claims, and identify where you would contribute first if you were working on the project.

Suggested Timebox

60 to 90 minutes.

Prompt

Write a short note answering the following:

What problem is the paper trying to solve?
The paper introduces the term PSVM. What does it mean? Define it clearly in your own words.
Why does the paper argue for "code keeps truth, model ranks ambiguity" instead of end-to-end generation?
Explain the roles of the following pieces in the PSVM framing:
- exact runtime
- legal action surface
- verifier
- canonical trace
- resolver
Use at least two examples from the paper or linked project work and explain what each example proves for the thesis. Good examples include:
- Sudoku
- receipt or invoice total extraction
- Tally voucher extraction
Where is the paper strongest today?
Where is the paper still weak, incomplete, or not yet empirically proven?
If you were joining the project for 6 to 8 weeks, what would you build, measure, or improve first, and why?

Expectations

Write in your own words. Do not paraphrase the paper section by section.
Be concrete. If you agree with a claim, explain why.
If you disagree with a claim, explain the technical reason.
Separate:
- what the paper claims
- what the repository demonstrates
- what is still unproven
Prefer precise systems thinking over generic AI commentary.

Submission Format

800 to 1500 words
Submit as Markdown or PDF
Use short sections with clear headings
End with a final section titled:

If I joined, my first milestone would be...

How To Apply

Email your application to [email protected].

Please include:

a short introduction
your resume or LinkedIn
your GitHub or portfolio links
your assignment response

If you have suggestions for the paper itself, you may also open a PR on the problem-shaped-vms-paper repository.

That PR path is optional. It is not required for applying.

What I Will Look For

clarity of thought
technical accuracy
ability to distinguish architecture from evidence
quality of criticism
strength of proposed next steps
writing quality

Optional Bonus

If useful, include one simple diagram or table showing how you think the PSVM loop works:

task -> exact state -> legal frontier -> model ranking -> verification -> transition
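
For orientation only, the loop above could be read as the following Python sketch on a toy 4x4 Latin square (each row and column holds 1 to 4 once). This is one possible interpretation, not the paper's implementation: every name here (legal_frontier, verify, solve, prefer_low) is hypothetical, and the ranker is a trivial stand-in for a trained model.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]        # exact state; 0 marks an empty cell
Move = Tuple[int, int, int]   # (row, col, value)
N = 4


def legal_frontier(grid: Grid) -> List[Move]:
    """Code enumerates the legal moves for the first empty cell."""
    for r in range(N):
        for c in range(N):
            if grid[r][c] == 0:
                used = set(grid[r]) | {grid[i][c] for i in range(N)}
                return [(r, c, v) for v in range(1, N + 1) if v not in used]
    return []


def verify(grid: Grid) -> bool:
    """Exact check: no repeated non-zero value in any row or column."""
    columns = [[grid[r][c] for r in range(N)] for c in range(N)]
    for line in grid + columns:
        filled = [v for v in line if v != 0]
        if len(filled) != len(set(filled)):
            return False
    return True


def is_complete(grid: Grid) -> bool:
    return all(v != 0 for row in grid for v in row)


def solve(grid: Grid, rank: Callable[[Grid, Move], float], trace: List[Move]) -> bool:
    """The ranker only orders legal moves; code verifies and rolls back."""
    if is_complete(grid):
        return verify(grid)
    for r, c, v in sorted(legal_frontier(grid), key=lambda m: rank(grid, m), reverse=True):
        grid[r][c] = v            # transition
        trace.append((r, c, v))
        # The frontier already filters illegal values; verify is a second check.
        if verify(grid) and solve(grid, rank, trace):
            return True
        grid[r][c] = 0            # rollback
        trace.pop()
    return False


# Stand-in ranker (an assumption): prefer smaller values.
prefer_low = lambda grid, move: -move[2]

puzzle = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
trace: List[Move] = []
print(solve(puzzle, prefer_low, trace), len(trace))
```

In this reading the ranker can affect how much backtracking happens but not which grids are accepted, since legality and verification never leave the code path. The trace keeps only the moves that survive rollback.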