I have been experimenting with five phases of movement, each phase covering a different area of the screen while the dot stays in motion. The last phase moves the dot in a Lissajous-like pattern, which is more fluid, like you're suggesting.
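That phase is just two sinusoids at different frequencies driving the dot's x and y. A minimal sketch (the frequency pair, amplitude, and phase here are illustrative, not my exact values):

```python
import math

# Lissajous-like target for the dot at time t, in normalized screen
# coordinates [0, 1]. The (3, 2) frequency pair and 0.45 amplitude are
# illustrative; they just keep the dot on-screen and always moving.
def lissajous_target(t, fx=3.0, fy=2.0, phase=math.pi / 2):
    x = 0.5 + 0.45 * math.sin(fx * t + phase)
    y = 0.5 + 0.45 * math.sin(fy * t)
    return x, y
```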
The challenge is recording and syncing the motion at a higher frequency and saving it without much drift; these landmark/gaze models are often too slow to keep up.
One more option to speed it up is to skip the eye tracking at record time: just record a cropped video of the face and the screen first at 60Hz, then run the model on each frame offline and update the dataset's metadata.
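Something along these lines for the offline pass (a sketch; `estimate_gaze` is a placeholder for whatever landmark/gaze model you run, not a real API):

```python
import cv2
import json

# Offline pass: read the 60Hz face-crop recording frame by frame,
# run the model on each frame, and write the results out as metadata.
def annotate_recording(video_path, meta_path, estimate_gaze):
    cap = cv2.VideoCapture(video_path)
    records, frame_idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        records.append({"frame": frame_idx, "gaze": estimate_gaze(frame)})
        frame_idx += 1
    cap.release()
    with open(meta_path, "w") as f:
        json.dump(records, f)
```

Since nothing here is real-time, you can also batch frames or run a heavier model than you could afford at record time.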
Ha! The timing is impeccable. This is a great demo. I've been experimenting with using gaze and eye tracking for cursor prediction as a side project. I like the idea of pressing 'space' for each dot. I just had the 9-dot movement going from one point to another. I'm using Mediapipe's face landmarks model (I wasn't aware of WebGazer). I'll continue to work on it, but it's great to see a similar thought process.
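For anyone curious, pulling the iris points out of Mediapipe boils down to roughly this (a sketch, not my exact code; `refine_landmarks=True` is what enables the iris landmarks):

```python
import cv2
import mediapipe as mp

# FaceMesh with refine_landmarks=True returns 478 landmarks; the last
# ten (indices 468-477) are the refined iris points useful for gaze.
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)

def iris_landmarks(bgr_frame):
    results = face_mesh.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None
    lm = results.multi_face_landmarks[0].landmark
    return [(p.x, p.y) for p in lm[468:478]]  # normalized coordinates
```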
Thanks! I initially just wanted to build a dashboard, with the power optimization part being a later addition. Based on the HN response, it seems that's the feature that resonated most with people. I'll be making improvements to the optimization component in the coming days and will publish what I have.
I'm actually using a 1600W PSU. 1400W is my target max draw.
This is a dual EPYC system (64 cores per CPU), btw. The max draw from the CPUs + motherboard + drives running at a peak of 3700MHz, without the GPUs, is 495W! Adding 4x 4090s (underclocked) will quickly get you to 1400W+.
Quick update: Definitely wasn't expecting this to end up on the front page. I was more focused on publishing the dashboard than the power optimizer service I'm running. I'll take all the feedback into account and will open source an improved version of it soon. Appreciate all the comments!
Sorry, I only open sourced the dashboard part, as mentioned at the bottom of the blog post. I'm still working on improving the 'Power optimizer' service, so I'll open source that soon as well.
If it were up to me, I would switch complete performance profiles through something like tuned-adm rather than trying to change just CPU frequencies. There are too many interlinked things that can have an effect on throughput efficiency.
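For example (a sketch; profile names vary by distro, so check `tuned-adm list` first):

```python
import subprocess

# Switch the whole tuned profile instead of micromanaging frequencies.
def set_profile(profile):
    subprocess.run(["tuned-adm", "profile", profile], check=True)

set_profile("powersave")               # box is idle / power is expensive
set_profile("throughput-performance")  # a big job lands
```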
Thanks! That's an excellent point. You're right that there's likely a sweet spot that would be more efficient overall than aggressive throttling.
The current implementation uniformly sets max frequency for all 128 cores, but I'm working on per-core frequency control that would allow much more granular optimization. I'll definitely measure aggregate consumption with your suggestion versus my current implementation to see the difference.
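The direction I have in mind is the Linux cpufreq sysfs interface, which exposes a per-core cap (a sketch; needs root, and the frequencies are illustrative):

```python
# Per-core frequency cap via cpufreq sysfs (values are in kHz).
def set_core_max_khz(core, khz):
    path = f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_max_freq"
    with open(path, "w") as f:
        f.write(str(khz))

# e.g. keep socket 0 at full speed and throttle socket 1 harder
for core in range(64):
    set_core_max_khz(core, 3_700_000)
for core in range(64, 128):
    set_core_max_khz(core, 1_500_000)
```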
Zooming out, 80-90% of a computer's lifecycle energy use happens during manufacturing, not pulled from the wall during operation.[1] Optimizing for lifetime energy efficiency probably pushes toward extending hardware longevity (within reason, until breakeven) and maximizing compute utilization.
Ideally these goals are balanced (in some 'efficient' way) against matching electricity prices. It's not either/or; you want to do both.
Besides better amortizing the embodied energy, improving compute utilization could also mean increasing the quality of the compute workloads, i.e. doing tasks with high external benefits.
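As a back-of-envelope (the numbers are made up purely to show the breakeven logic):

```python
# If a replacement machine embodies ~1500 kWh of manufacturing energy
# (assumed) and saves ~150W at equal throughput (assumed), replacing
# the old box only wins after embodied-energy / power-saved hours.
embodied_kwh = 1500
watts_saved = 150
breakeven_hours = embodied_kwh * 1000 / watts_saved
print(breakeven_hours / (24 * 365))  # ~1.1 years of 24/7 operation
```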
Please go learn about modern Ryzen power and performance management, namely Precision Boost Overdrive and Curve Optimizer - and how to undervolt an AM4/AM5 processor.
The stuff the chip and motherboard do, completely built-in, is light-years ahead of what you're doing. Your power-saving techniques (capping max frequency) are more than a decade out of date.
You'll get better performance and power savings to boot.
Thanks for the suggestion! I'm actually using dual EPYC server processors in this workstation, not Ryzen. I'm not sure EPYC supports PBO/Curve Optimizer functionality that's available in AM4/AM5 platforms.
That said, I'm definitely interested in learning more about processor-specific optimizations for EPYC. If there are server-focused equivalents to what you've mentioned that would work better than frequency capping, I'd love to explore them!
It lets you set specific power consumption limits in watts instead of attempting the same by restricting maximum core frequencies (which could also be useful in addition to overall power limits).
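On Linux that kind of cap is exposed through the powercap (RAPL) sysfs tree, though whether the limit is actually writable depends on the CPU and driver, so treat this as a sketch to verify on your own hardware:

```python
# Package power cap via the Linux powercap (RAPL) interface.
# The sysfs files are in microwatts; writability varies by platform.
def set_package_power_limit(package, watts):
    base = f"/sys/class/powercap/intel-rapl:{package}"
    with open(f"{base}/constraint_0_power_limit_uw", "w") as f:
        f.write(str(int(watts * 1_000_000)))

set_package_power_limit(0, 200)  # cap socket 0's package at 200W
```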
Good point. I'm often running multiple parallel jobs with varying priorities, where uniform throttling actually makes sense. Many LLM inference tasks are long-running but don't fully utilize the hardware (often waiting on I/O or running at partial capacity).
The dual Epyc CPUs (128 cores) in my setup have a relatively high idle power draw compared to consumer chips. Even when "idle" they're consuming significant power maintaining all those cores and I/O capabilities. By implementing uniform throttling when utilization is low, the automation actually reduces the baseline power consumption by a decent amount without much performance hit.
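The automation is roughly this shape (a simplified sketch; the thresholds and frequencies are illustrative, and `psutil` is just one way to sample utilization):

```python
import psutil

LOW, HIGH = 10.0, 50.0                    # utilization thresholds, percent
CAP_KHZ, FULL_KHZ = 1_500_000, 3_700_000  # cpufreq caps, in kHz

def set_all_cores_max_khz(khz):
    for core in range(psutil.cpu_count()):
        path = f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_max_freq"
        with open(path, "w") as f:
            f.write(str(khz))

while True:
    util = psutil.cpu_percent(interval=30)  # average over a 30s window
    if util < LOW:
        set_all_cores_max_khz(CAP_KHZ)      # mostly idle: lower the baseline
    elif util > HIGH:
        set_all_cores_max_khz(FULL_KHZ)     # real work: lift the cap
```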
It seems relatively accessible to take a few representative tasks and actually measure the soup-to-nuts energy consumed at the plug. It would be very interesting to see that in tandem with the power optimizations!
That's exactly what I did first! I ran a CPU torture test at full clock speed and measured the power draw at the plug, then repeated the same test at the lowest clock speed setting. For the Epyc system, the reduced clock speed drew about 225W less. Even at idle, capping the max frequency cut the power draw by 20+%.
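If anyone wants to reproduce the measurement, it boils down to integrating plug power over the task's runtime (a sketch; `read_plug_watts` is a hypothetical hook for whatever your smart plug or PDU exposes):

```python
import threading
import time

# Run the task while a background thread samples plug power; the sum
# of watts x sample interval approximates the energy in joules.
def measure_task_energy_wh(run_task, read_plug_watts, sample_s=1.0):
    joules = 0.0
    stop = threading.Event()

    def sampler():
        nonlocal joules
        while not stop.is_set():
            joules += read_plug_watts() * sample_s
            time.sleep(sample_s)

    t = threading.Thread(target=sampler)
    t.start()
    run_task()
    stop.set()
    t.join()
    return joules / 3600.0  # watt-hours
```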