ADR-001: pycurl over requests for upload clients¶
Status: Accepted Date: 2025-01-15
Context¶
BBDrop uploads files to 10+ image and file hosts through concurrent worker threads. The upload pipeline needs:
- Real-time bandwidth tracking (bytes sent per chunk, not just per request).
- Bandwidth limiting so uploads don't saturate the user's connection.
- Persistent TCP/TLS connections reused across uploads to the same host.
- Thread-safe concurrent uploads with per-thread connection handles.
- Direct control over SSL verification, timeouts, and proxy settings on every request.
Python's requests library is the standard choice for HTTP, but it doesn't
expose the low-level callbacks and connection controls listed above. libcurl
(via pycurl) does.
At the same time, IMX.to --- the first host BBDrop supported --- has a simple
JSON REST API where requests works well and was already in place before the
multi-host architecture existed.
Decision¶
Use pycurl for all upload clients except IMX.to, which continues to use requests.
Why pycurl¶
pycurl exposes libcurl options that requests cannot provide:
CURLOPT_XFERINFOFUNCTION--- progress callback invoked during data transfer, reporting bytes sent per chunk. This drives the real-time speed display and per-gallery progress bars.CURLOPT_MAX_SEND_SPEED_LARGE--- server-side-enforced bandwidth cap without application-level throttling.- Persistent connections --- calling
curl.reset()clears options but keeps the underlying TCP+TLS connection alive, giving connection reuse without a session abstraction. - Thread-local handles --- each upload thread stores its own
pycurl.Curlinstance inthreading.local(), providing thread safety and per-thread connection pooling without shared state. - Per-request control --- SSL verification, connect/read timeouts, proxy
configuration (HTTP, SOCKS4, SOCKS5), and custom headers are set on each
handle before
perform().
All three pycurl-based image host clients (TurboImageHostClient,
PixhostClient) and the FileHostClient (covering 7 file hosts) follow this
pattern. Each creates thread-local curl handles via _get_thread_curl() and
reuses them across uploads within a gallery.
Why requests for IMX.to¶
ImxToUploader in bbdrop.py uses requests because:
- IMX.to's API is a straightforward JSON REST endpoint ---
requests.post()with a file and JSON parsing is all that's needed. requestswas already in use before the multi-host architecture was designed.- The IMX API returns the normalized response shape natively
(
{status, data: {image_url, thumb_url, gallery_id}}), so no low-level response parsing is required.
Consequences¶
Positive:
- Fine-grained bandwidth tracking powers the GUI speed display and per-gallery progress without polling.
- Connection reuse reduces TLS handshake overhead on bulk uploads (50--500 images per gallery).
- Thread-local handles keep concurrent uploads isolated without locks on the HTTP layer.
Negative:
- pycurl requires
libcurldevelopment headers at build time, complicating installation from source on some platforms. - The pycurl API is more verbose than
requests--- each call requires manual option setting, buffer management, and error handling. - pycurl handles are harder to mock in unit tests compared to
requests.Session.
Tradeoff:
- Two HTTP libraries coexist in the codebase (
requestsfor IMX,pycurlfor everything else). This adds a small maintenance cost, but each library is used where its strengths apply. Replacingrequestsfor IMX would add complexity without benefit; replacingpycurleverywhere else would sacrifice bandwidth tracking and connection control.