Parallel – shell tool for executing jobs in parallel using one or more computers (gnu.org)
43 points by kristianpaul on July 14, 2023 | 10 comments


GNU parallel devs really want you to cite their work... it's just a tool, folks; it's like citing the manual for a hammer.


    #!/usr/bin/python
    print(
    """Acknowledgments: This Python[1] script runs on Linux[2] based operating systems
    and uses the ls (see list-segments[3]) utility, written in the ANSI C[4] language.

    [1] Python programming language, Van Rossum et al., 1991-
    [2] Linux kernel, Torvalds et al., 1991-
    [3] Multics operating system, Corbató et al., 1969-
    [4] C Programming Language 2nd. ed., Kernighan, Ritchie, 1988""")
    ...


I want to like this tool, but every time I need it I have to relearn its idiosyncratic syntax. Lately I just use Ruby plus a couple of gems instead.


I remember trying to work around some aspect of this tool in a shell tool for a client, and in the end I abandoned that in favour of background tasks and tracking PIDs, which is kind of astonishing.

I don’t actually remember the problem, but it possibly had something to do with how output is handled.
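
For reference, the "background tasks and tracking PIDs" approach can be sketched in a few lines. This is only an illustration written in Python rather than the shell the comment refers to, and `run_in_background` and the two dummy jobs are hypothetical names, not anything from the original script:

```python
import subprocess
import sys


def run_in_background(commands):
    """Start every command at once, then wait for each; return {pid: exit_code}."""
    # Popen returns immediately, so all children run concurrently.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # wait() blocks until that particular child exits and returns its exit status.
    return {p.pid: p.wait() for p in procs}


if __name__ == "__main__":
    # Hypothetical jobs: two child Python processes exiting with status 0 and 3.
    jobs = [[sys.executable, "-c", "import sys; sys.exit(%d)" % code] for code in (0, 3)]
    for pid, status in run_in_background(jobs).items():
        print(pid, status)
```

Note that this starts everything at once with no limit on concurrency, and it does nothing to keep the children's output from interleaving.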


Almost every time I think I want this tool, I end up with an ad-hoc Python script instead, and Python is objectively speaking a terrible language for multiprocessing tasks.


I use parallel often and find the syntax reasonable:

    cat args | parallel this-program {}
    parallel this-program {} ::: arg1 arg2 arg3 ...

The only really new thing is :::, which separates the program from the arguments if you want to give them directly.

Can you show what the syntax should be for it to be intuitive to you?


Huh, I just had the opposite experience. I wanted to parallelize the work of converting a bunch of raw camera images on my laptop to JPGs using ImageMagick, and after looking into what it would take to do it in Ruby I found that `parallel` was the better way to go.


I've used it a bunch of times for some giant batch jobs across hundreds or thousands of cores... And I've found it easier to build a job file of the exact commands and redirect that into it.


Unfortunately, "parallel" is also a tool in moreutils, and the moreutils version takes completely different arguments. I often have moreutils installed (for the "errno" and "ts" tools), so I cannot have GNU parallel installed at the same time. And even if you fix your own PC, if you share your script and the wrong parallel is installed, the failure will look like syntax errors in your script rather than a missing package.

This, and the obnoxious citation requirement, is why I recommend avoiding it. Use xargs -P for simple cases, and Python scripts for more complex ones. Parallel execution in Python is pretty simple, and often requires no locks at all thanks to the GIL.
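
A minimal sketch of what such a Python script might look like, assuming the jobs are external commands so that worker threads mostly sit waiting on child processes. `run_one` and `run_all` are hypothetical names, and the child command is a stand-in that just echoes its argument:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(arg):
    """Run one child process for `arg` and return (arg, exit_code, stdout)."""
    # Hypothetical stand-in for `this-program {}`: a child Python that echoes its argument.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", arg]
    done = subprocess.run(cmd, capture_output=True, text=True)
    return arg, done.returncode, done.stdout.strip()


def run_all(args, max_workers=4):
    """Run at most `max_workers` jobs at a time; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, args))


if __name__ == "__main__":
    for arg, code, out in run_all(["arg1", "arg2", "arg3"]):
        print(arg, code, out)
```

Each job's output is captured separately, so it cannot interleave with the others, and no explicit locks are needed here because the workers share no mutable state.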


It was so good that it inspired me to put this out there: https://github.com/korovkin/parallel



