Hi all, I'm currently in a few ML classes and, while they do a great job covering theory, they don't cover application, at least not past some basic implementations in a Jupyter notebook.
One friction point I keep running into is how to handle logging and evaluation of models. Right now, in a Jupyter notebook, I train the model and then produce a few graphs of different metrics on the test set.
This whole workflow seems to be the standard among the folks in my program, but I can't shake the feeling that it's vibes-based and suboptimal.
I've got a few projects coming up and I want to use them as a chance to improve my approach to training models. What method works for you? Are there any articles or libraries you would recommend? What do you wish junior engineers knew about this?
Thanks!
Two resources that might be useful are AWS's SageMaker documentation and Andriy Burkov's Machine Learning Engineering book, though the book doesn't go into much detail on logging. One way to evaluate a model is to run a SageMaker processing job that saves the performance metrics to a JSON file in S3. More info on processing jobs: https://docs.aws.amazon.com/sagemaker/latest/dg/processing-j... . AWS also has various logging services you can look into. This mostly applies to orgs using AWS, but it might give a sense of how things can be done more generally.
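For concreteness, here's a rough sketch of what that can look like with the SageMaker Python SDK. The bucket paths, IAM role ARN, file names (evaluate.py, model.joblib, test.csv), label column, and metric choices are all placeholders you'd swap for your own setup, not anything from the post. First, an evaluation script that runs inside the processing container and writes metrics to the output directory SageMaker copies back to S3:

    # evaluate.py -- runs inside the processing container.
    # Anything written to /opt/ml/processing/evaluation is uploaded to the
    # S3 destination configured on the job's ProcessingOutput.
    import json
    import pathlib

    import joblib
    import pandas as pd
    from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

    MODEL_DIR = pathlib.Path("/opt/ml/processing/model")
    TEST_DIR = pathlib.Path("/opt/ml/processing/test")
    OUT_DIR = pathlib.Path("/opt/ml/processing/evaluation")

    # Placeholder artifact and dataset names; assumes a binary classifier
    # and a "label" column in the test CSV.
    model = joblib.load(MODEL_DIR / "model.joblib")
    test = pd.read_csv(TEST_DIR / "test.csv")
    X, y = test.drop(columns=["label"]), test["label"]

    preds = model.predict(X)
    scores = model.predict_proba(X)[:, 1]

    metrics = {
        "accuracy": accuracy_score(y, preds),
        "f1": f1_score(y, preds),
        "roc_auc": roc_auc_score(y, scores),
    }

    OUT_DIR.mkdir(parents=True, exist_ok=True)
    (OUT_DIR / "evaluation.json").write_text(json.dumps(metrics, indent=2))

Then the job itself can be launched from a notebook or a CI step:

    # launch_eval.py -- kicks off the processing job.
    # The role ARN and s3:// paths below are placeholders.
    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.sklearn.processing import SKLearnProcessor

    processor = SKLearnProcessor(
        framework_version="1.2-1",
        role="arn:aws:iam::123456789012:role/MySageMakerRole",  # your execution role
        instance_type="ml.m5.xlarge",
        instance_count=1,
    )

    processor.run(
        code="evaluate.py",
        inputs=[
            ProcessingInput(source="s3://my-bucket/model/",
                            destination="/opt/ml/processing/model"),
            ProcessingInput(source="s3://my-bucket/test/",
                            destination="/opt/ml/processing/test"),
        ],
        outputs=[
            ProcessingOutput(source="/opt/ml/processing/evaluation",
                             destination="s3://my-bucket/metrics/"),
        ],
    )

The nice part versus eyeballing plots in a notebook is that every run leaves a metrics file at a known S3 location, so different training runs can be compared or gated on the same numbers later.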