I have a table URL_Experiment in my database (mySQL database). I have 2 million URL links in this table.
For each URL, I am checking if some particular text is present in the URL and updating the information back in my URL_Experiment table.
Now, I need to make the script run fast. The easy solution that comes to my mind is:
split the URL_Experiment into 10 tables (each with 200,000 rows) and run the script 10 times simultaneously. This way, I can ensure that the updates happen properly. But for this, I need to save 10 scripts, each accessing the correct table (I need to rename the tables from 1 to 10).
Though the above seems a pretty neat solution, I am looking for a more sophisticated solution. Something which will execute the same script simultaneously (in parallel) as 10 processes. My concern is the execution time. I do not want the resources to be wasted when they can be utilized.
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can try GNU parallel. Many examples can be found here and here.
I am a member of the Swift team and we do exactly these kinds of things for large research projects.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0