EarlyOom's submissions

1.		Replace OCR with Vision Language Models (github.com/vlm-run)
		292 points by EarlyOom 11 months ago | past | 125 comments
2.		Show HN: Visually parse an entire YouTube video frame by frame (github.com/vlm-run)
		5 points by EarlyOom 11 months ago | past
3.		Ask HN: What are folks using to train/fine-tune Vision Language Models
		1 point by EarlyOom 11 months ago | past
4.		A Node.js SDK for calling Vision Language Models (github.com/vlm-run)
		6 points by EarlyOom 11 months ago | past
5.		Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
		170 points by EarlyOom 11 months ago | past | 29 comments
6.		Show HN: Vlm Run, Extract JSON from images, videos and documents in a simple API (vlm.run)
		2 points by EarlyOom on Aug 13, 2024 | past
7.		Fine-grained Visual Transcription for YouTube videos (nos.run)
		9 points by EarlyOom on June 10, 2024 | past | 3 comments
8.		"Ok Computer, why are you slow?" (scottloftin.substack.com)
		2 points by EarlyOom on Jan 31, 2024 | past
9.		Show HN: NOS – A fast, and ergonomic PyTorch inference server (github.com/autonomi-ai)
		3 points by EarlyOom on Dec 14, 2023 | past