"http" could download from multiple sources #18

Open
opened 2022-01-01 21:42:50 +01:00 by boyska · 1 comment
Owner

Is it a bug or a feature request?

A feature request

Describe what happens

Let's say I have an audiospec whose url is http://dl1.us.example.com/blah.mp3.
Because I know how the mirror network works, I may already know that if this URL is valid, then http://dl2.us.example.com/blah.mp3 and http://dl2.fr.mirror.example.com/blah.mp3 are (or could be) valid too.

As a sysadmin, I might want to configure larigira so that it knows how to rewrite URLs, both by changing a URL (i.e. canonicalizing it) and by expanding a single URL into multiple URLs.

Ideally, those URLs could even be used to download in parallel, making it faster. At least having them as a fallback would be great.

Implementation ideas

Of course some function that can download a file from a list of URLs is needed.
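As a minimal sketch of the fallback case only (not the parallel one), such a function could try each URL in order and stop at the first success. The names `fetch_http` and `download_first_available` are hypothetical and are not part of larigira; the `fetch` parameter is an assumption added here so that the transport can be swapped out.

```python
import shutil
import urllib.request
from typing import Callable, Iterable


def fetch_http(url: str, dest: str, timeout: float = 30.0) -> None:
    """Download a single URL to dest using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


def download_first_available(
    urls: Iterable[str],
    dest: str,
    fetch: Callable[[str, str], None] = fetch_http,  # hypothetical hook, not a larigira API
) -> str:
    """Try each URL in order and return the first one that downloads successfully."""
    errors = []
    for url in urls:
        try:
            fetch(url, dest)
            return url
        except (OSError, ValueError) as exc:
            # urllib's URLError and HTTPError are OSError subclasses;
            # ValueError covers malformed URLs.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

Trying the sources sequentially keeps the sketch short. Downloading in parallel would need extra care (separate temporary files, cancelling the losers) that is left out here.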

As for the user-side of it, it could work this way:

  • the sysadmin can set a variable (LARIGIRA_AUDIOGEN_HTTP_REWRITEURL) to the path of an executable file
  • this file is executed at every run, and relevant information is passed to it as environment variables. Its output must be a list of URLs
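The two points above could be sketched roughly as follows. Only the variable name LARIGIRA_AUDIOGEN_HTTP_REWRITEURL comes from this issue. Everything else is an assumption made for illustration: that the original URL is passed in an environment variable called `URL`, that the output is one URL per line, and the helper names `parse_url_list` and `expand_url`.

```python
import os
import subprocess
from typing import List, Optional


def parse_url_list(output: str) -> List[str]:
    """Return non-empty, stripped lines, dropping duplicates but keeping order."""
    urls: List[str] = []
    for line in output.splitlines():
        line = line.strip()
        if line and line not in urls:
            urls.append(line)
    return urls


def expand_url(url: str, rewriter: Optional[str] = None) -> List[str]:
    """Ask the configured executable for candidate URLs; fall back to [url]."""
    if rewriter is None:
        rewriter = os.environ.get("LARIGIRA_AUDIOGEN_HTTP_REWRITEURL")
    if not rewriter:
        # No rewriter configured: behave exactly as before.
        return [url]
    # "URL" is an assumed variable name; the issue does not specify one.
    env = dict(os.environ, URL=url)
    result = subprocess.run(
        [rewriter], env=env, capture_output=True, text=True, check=True, timeout=10
    )
    # An empty answer is treated as "no rewrite" rather than "no sources".
    return parse_url_list(result.stdout) or [url]
```

A rewriter script for the mirror example would then print the dl1, dl2 and fr mirror URLs, one per line, after reading `URL` from its environment.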
Author
Owner

We could let the user provide a custom HTTP downloader with a well-defined interface. This would automatically allow this feature to be implemented (inside your custom script), and it would also let the sysadmin implement more features:

  • #17 could be implemented
  • files could be retrieved from non-HTTP sources too. For example, archive.org allows downloads to happen over BitTorrent. Or you could know how to find those files on a local drive instead of using the web.
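The comment does not define the interface, so the following is only one possible shape for it, under stated assumptions: the custom downloader is an external command that receives the URL and the destination path as its last two arguments and signals success with exit code 0. The function name `run_custom_downloader` is hypothetical.

```python
import os
import subprocess
from typing import Sequence


def run_custom_downloader(
    command: Sequence[str], url: str, dest: str, timeout: float = 300.0
) -> bool:
    """Run command with url and dest appended; True if it exited 0 and dest exists."""
    try:
        # Assumed calling convention: <command...> <url> <dest>
        result = subprocess.run([*command, url, dest], timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # A missing executable or a hung script counts as a failed download.
        return False
    return result.returncode == 0 and os.path.isfile(dest)
```

With a contract like this, the script is free to fetch over HTTP, use BitTorrent, or copy from a local drive, since larigira would only look at the exit code and the resulting file.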
Reference: boyska/larigira#18