Shameless plug... I run a startup (https://neuralwatt.com) that is working to help with this. We are starting with an OS-level component (as in no model changes and no developer changes required) that uses RL to run AI with a ~25% energy efficiency improvement without sacrificing UX. Feel free to DM me if you'd like to chat about problems you face with energy and AI, or if you'd like to learn more.