Hacker News | legothief's comments

Hi! I'm also here (the other co-founder of the company/project). We'd love to understand whether this is a problem only we've encountered, or whether others also need to keep track of their simulated model performance over time!


Mark here, happy to answer any questions or provide more info about fold! :)


We'll be working on the PyTorch integration soon! `Fold`'s scope is time series, but there's nothing stopping you from sending in vector embeddings of any kind, as long as they're timestamped.

Figuring out how to create those embeddings (that make sense over time) can mean quite a bit of research work, and requires flexibility, so it's probably better done outside of the time series library, with the tools of your choice.
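To illustrate what "timestamped embeddings" means in practice, here's a minimal sketch with pandas (the column names and random embeddings are made up for illustration; this is not fold's API): once each observation has an embedding vector and a timestamp, the result is just an ordinary multivariate time series.

```python
import numpy as np
import pandas as pd

# Hypothetical example: 100 daily observations, each already embedded
# into an 8-dimensional vector by whatever model you chose upstream.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 8))

index = pd.date_range("2022-01-01", periods=100, freq="D")
X = pd.DataFrame(embeddings, index=index,
                 columns=[f"dim_{i}" for i in range(8)])

# X now looks like any other multivariate time series:
# one row per timestamp, one column per embedding dimension,
# so a time series library can treat it like any other feature set.
print(X.shape)  # (100, 8)
```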


> We'll be working on the pytorch integration soon! `Fold`'s scope is time series, but there's nothing stopping you from sending in vector embeddings of any kind, timestamped.

Awesome. I'll take a look. Thanks!

> Figuring out how to create those embeddings (that make sense over time) can mean quite a bit of research work, and requires flexibility, so it's probably better done outside of the time series library ...

Obviously :-)


Mark here (the other co-founder). We're really curious whether you're using time series cross-validation, with what tools, how frequently, and what kind of issues you've bumped into!
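For context, the simplest version of what we mean by time series cross-validation, sketched with scikit-learn's TimeSeriesSplit (the data here is synthetic; fold's own API is not shown):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # 20 ordered observations
y = np.arange(20)

# Unlike plain KFold, TimeSeriesSplit never trains on the future:
# each split's training window ends before its test window begins.
tscv = TimeSeriesSplit(n_splits=4)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    assert train_idx.max() < test_idx.min()  # no lookahead
    print(f"fold {fold}: train=[0..{train_idx.max()}] "
          f"test=[{test_idx.min()}..{test_idx.max()}]")
```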


Really nice! I hadn't heard of pygraphistry; it looks like something that would have made our lives a lot easier!


That is true, but unfortunately, "out of the box" they're not well suited to just being "fed into" an NN. Even if you think of the adjacency matrix as very similar to how the weights are laid out in a feed-forward neural network, you can't ignore that:

- in real life, graphs are not fixed

- you need to deal with the many different potential representations of the graph (permutation invariance)

- the nodes usually contain more features than a single scalar value

But this is definitely not the best explanation; I think this guy does a much better job: https://youtu.be/JtDgmmQ60x8
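To make the permutation-invariance point concrete, here's a tiny numpy sketch (a toy illustration, not from any GNN library): summing neighbour features gives the same result however the nodes happen to be ordered, which is exactly why aggregation functions like sum/mean/max are used.

```python
import numpy as np

# Toy graph: 3 nodes, each with a 2-dimensional feature vector
# (node features are vectors, not single scalars).
H = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0]])
A = np.array([[0., 1., 1.],   # adjacency: node 0 linked to 1 and 2, etc.
              [1., 0., 0.],
              [1., 0., 0.]])

# One aggregation step: each node sums its neighbours' features.
agg = A @ H

# Relabel the nodes with a permutation matrix P.
P = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])
H_perm = P @ H
A_perm = P @ A @ P.T

# Same aggregated features, just reordered: the result does not
# depend on which of the many equivalent representations we fed in.
assert np.allclose(P @ agg, A_perm @ H_perm)
```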


Sure, but GNNs modeling neurons is nonsensical, since the graph is the analyte of the NN; you are not a priori doing anything with the graph. So in a sense my point is: to use NNs to model neurons, a GNN doesn't buy you anything, because the G in GNN isn't being subjected to dynamic activation.


Thank you for pointing that out, we've corrected that in the article!


In my mind, GNNs are designed to solve graph problems, in the usual case with message passing, which (and I'd emphasise the aggregation step) is what enables you to "do ML on graphs".
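A single message-passing layer, sketched in plain numpy (a toy illustration with made-up weights, not any particular library's implementation): each node gathers messages from its neighbours, aggregates them (the sum below), and updates its own state.

```python
import numpy as np

rng = np.random.default_rng(1)

# 4 nodes with 3-dimensional features, and an undirected edge list.
H = rng.normal(size=(4, 3))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

A = np.zeros((4, 4))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

W_self = rng.normal(size=(3, 3))
W_neigh = rng.normal(size=(3, 3))

# One message-passing step:
#   message:   each neighbour sends its feature vector
#   aggregate: sum the incoming messages (A @ H)
#   update:    combine with the node's own state, then a nonlinearity
H_next = np.maximum(0, H @ W_self + (A @ H) @ W_neigh)  # ReLU

print(H_next.shape)  # (4, 3)
```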


I'm also quite excited about that - there's existing research, quite a few papers using graph-based models for MLOnCode:

https://proceedings.neurips.cc/paper/2021/file/c2937f3a1b3a1...

https://arxiv.org/abs/2203.05181

https://arxiv.org/abs/2005.02161

https://arxiv.org/abs/2012.07023

https://arxiv.org/abs/2005.10636v2

https://arxiv.org/abs/2106.10918

Definitely check them out! There are also tools that were made available by some of the authors: https://github.com/google-research/python-graphs


Are these papers somehow "curated" or "recommended"?!

Unfortunately, GNNs are lagging behind LLMs in the code domain. Maybe because:

a. LLMs and transformers rulezz, OR

b. there is far more source code than there are compiled code graphs


I wouldn't rule out transformers being very amenable to parallel computation as the reason.


Thank you!! I've been looking to get my feet wet with SoTA research on MLOnCode, so this is very helpful!


We'll definitely consider it! Just needed to start with something :)

