Just in case gc is an issue, you might use (memory) to log memory usage and better determine whether that's part of the problem. Is the system fork error raised inside the arc process or in an external process? I don't understand in (2) how slowing down the routine results in increased traffic. I'm missing some of the picture there.
> System fork error raised inside the arc process?
Hmm, good question. I'm pretty sure it dropped out of arc and then reported a system error.
> I don't understand in (2) how slowing down the routine results in increased traffic.
It's not that I think it will increase traffic. It's that if I have to slow the requests down to somehow release connections or memory, then it's not really a solution, since I would like to be able to handle increased traffic in the future (to much larger degrees than my 2000 connections for stocks :).
I think it's just my poor understanding of how threading/memory allocation/networking & gc work. Somehow I have it in my head that the process is not gc'ing memory or releasing connections from previous iterations before it moves on to the next, or not releasing the underlying OS thread for the get request before moving on to the next. That's probably all wrong, correct?
The process is pretty simple: download a file, load data from file to memory (assign to variable), parse the data and do the math, write the results to file, write the progress to file, wipe the variable. Rinse, Wash, & Repeat :)
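To make that loop concrete, here is a rough sketch of it with a memory reading logged after each pass. It's in Python rather than Arc, and `process_all`, `fetch`, and `analyze` are made-up names standing in for the real download and math steps, so treat it as an illustration only. In Arc the reading would come from (memory) instead of `tracemalloc`.

```python
import gc
import tracemalloc


def process_all(symbols, fetch, analyze, log=print):
    """Run the download -> parse -> record loop, logging memory per pass.

    fetch and analyze are hypothetical callables supplied by the caller;
    they are not part of any real library.
    """
    tracemalloc.start()
    results = {}
    try:
        for sym in symbols:
            data = fetch(sym)             # download and load into memory
            results[sym] = analyze(data)  # parse the data and do the math
            data = None                   # "wipe the variable"
            gc.collect()                  # collect now so readings are comparable
            current, peak = tracemalloc.get_traced_memory()
            log(f"{sym}: current={current} peak={peak}")
    finally:
        tracemalloc.stop()
    return results
```

If the `current` figure keeps climbing from one pass to the next even after the variable is wiped, something is still holding on to the data; if it stays flat, memory is probably not the culprit and connections or threads are the next thing to check.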
I am just moving over to my new Ubuntu 10.04 Linode with Nginx. Setup is complete and the data is copying over. Then I plan to run some benchmarks without changing any code, doing a before and after to see the difference.
From there I'll start looking at that old http-get library to see if it's somehow not releasing memory (yikes!).