Hacker News | primeobsession's comments

Yes, it fell shorter than we expected. It cost almost nothing and was stupid fast, but I don't think it's a compelling option, at least for this use case.


Hi HN, we’re the SecureCoders Labs crew.

What we built: An open, continuously updated LLM Penetration-Testing Leaderboard. Run 001 pits 8 models against a deliberately vulnerable Express.js app.

Headline results

• Gemini 2.5 Pro (safety-off) found all 9 critical/high vulns.

• Qwen3-30B-a3b-mlx (open source, local on a 2019 MacBook Pro) caught 7/9 with $0 API spend.

• GPT-4o and Claude Opus produced the most polished write-ups, but each missed one bug.

Scope (v1): This first pass measures static bug-hunting skill (think SCA/OWASP Top 10).

Next up: we’ll score exploit writing and automatic PoC execution, so the models must prove they can go from finding a flaw to weaponizing it.

Check it out

• Leaderboard + cost/latency numbers https://www.securecoders.com/labs/projects/llm-penetration-t...

• Methodology & prompts (Run 001 analysis) https://securecoders.com/labs/projects/llm-penetration-testi...

Feedback, replication attempts, and ideas for Run 002 are very welcome — we’re hanging out here to discuss!


I’ve been using CommonPaper since they first put up their concept and it’s all sorts of awesome. Awesome in principle and very convenient for my boutique security consulting business.

Absolute no-brainer for us to use what they have built. It was instrumental in getting us off the ground.

Congrats team on YC and the launch.


Thank you!


Interesting thought. It would be complicated to pack this into something that you just purchase. Props integrates into Slack, Salesforce, Zendesk, etc.

Maybe we are not communicating the product well on the site.


Oh I'm sorry, I forgot that provider APIs are magical, mythical beasts that must be tamed from an intermediary server, never from a client.

It's OK to say "we don't think enough people will pay enough for a one-time purchase to cover development costs". You don't have to invent some kind of technical reason for it to be a service.

But having said that, I cannot believe that companies would pay the amount you're asking for a service either.


It's about recognizing people doing great things in your org. It really has become a part of the culture in the companies that have been using it (and paying for it) for months now.


We are launching today and are doing well on Product Hunt. Any feedback is greatly appreciated.

We started on Props 11 months ago. Here is our story so far: http://www.propsboard.com/2016/04/28/your-office-tvs-suck-pr...


RJMetrics has a similar product (ETL as a service) with 10x the number of rows for their free tier. https://rjmetrics.com/product/pipeline/


I'm not sure I follow. I'm currently only looking at the content attribute for a value. Are you saying that 'name=...' specifically should work, or that it should respect any attribute name?


name= should work as well. Just asking because I never coded for that scenario, but I found a web page that had it, and it's valid.


To add a bit of background, I needed an OpenGraph parser for a personal project so I broke it off into a separate service.

The API takes in a URL and returns any OpenGraph data found along with some inferred values based on the HTML content.

The main focus for the product was to get the commonly used OG fields so some of the more obscure fields are ignored.
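
For anyone curious what that kind of extraction looks like, here is a minimal sketch in Python using only the standard library. It is not the service's actual code: the names OGParser and extract_opengraph are hypothetical, fetching the URL is left out, and accepting name= alongside property= is an assumption based on the discussion in this thread.

```python
from html.parser import HTMLParser


class OGParser(HTMLParser):
    """Collects og:* values from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.data = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Accept the standard property= form and, as an assumption, name= too.
        key = attr_map.get("property") or attr_map.get("name")
        content = attr_map.get("content")
        if key and key.startswith("og:") and content is not None:
            # Keep the first value seen for each field.
            self.data.setdefault(key, content)


def extract_opengraph(html):
    """Return a dict of og:* fields found in an HTML string."""
    parser = OGParser()
    parser.feed(html)
    return parser.data
```

Feeding it a page's HTML as a string returns a plain dict keyed by the og: field names; meta tags without an og: prefix are ignored.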


That's nice, but those who are interested in OGP would also want Schema.org, so, to offer a commercially viable service, I think Schema.org is a must.


To tell you the truth, I had never even thought about that, but it's a good idea. Thanks!


Whoa, that's a lot of traffic. I just beefed up the server! Sorry for the 500 errors!


Your server would have handled it fine if your WordPress install had some basic performance tuning.

Read http://browserdiet.com/ or http://blog.newrelic.com/2013/02/07/web-performance-optimiza...

