ethan_smith | 6 months ago | on: About AI Evals
AI Evals are systematic frameworks for measuring LLM performance against defined benchmarks, typically involving test cases, metrics, and human judgment to quantify capabilities, identify failure modes, and track improvements across model versions.
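A minimal sketch of what such a framework boils down to, in Python (the case set, the stub model, and the exact-match metric are all hypothetical, just to illustrate the test-case/metric loop):

    # Hypothetical mini eval harness: run fixed test cases against a model
    # and compute an exact-match accuracy metric.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class EvalCase:
        prompt: str
        expected: str

    def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
        """Return the fraction of cases whose output exactly matches the expectation."""
        passed = 0
        for case in cases:
            output = model(case.prompt).strip()
            if output == case.expected:
                passed += 1
            else:
                print(f"FAIL: {case.prompt!r} -> {output!r} (expected {case.expected!r})")
        return passed / len(cases)

    # A stub "model" stands in for a real LLM call.
    cases = [
        EvalCase("What is 2+2?", "4"),
        EvalCase("Capital of France?", "Paris"),
    ]
    accuracy = run_eval(lambda p: "4" if "2+2" in p else "Paris", cases)
    print(f"accuracy = {accuracy:.0%}")

Real eval suites swap the exact-match check for task-specific metrics or human/LLM judges, and track the resulting scores across model versions to catch regressions.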