#27 HTTP API to access files on GIN server?

Open
opened 4 years ago by straw · 2 comments
Andrew Straw commented 4 years ago

Hi,

Gin and git-annex newbie here. Does the GIN server currently support direct HTTP access to files in the repository with [range requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests)? If not, would it be possible within the architecture to permit such access? Are there examples I have missed somewhere that demonstrate how to do this? An example showing how to download a single file with [curl byte range support](https://ec.haxx.se/http/http-ranges) would be very useful. I found [documentation for downloading files from the webserver](https://gin.g-node.org/G-Node/Info/wiki/WebInterface#downloading-files), but this is oriented toward web browser usage rather than exposing an HTTP API.

I am asking because in our lab we have been using a file format (.zip) and software which allow random access into big files using HTTP range requests. This allows data analysis on parts of big files by users on modest computers (e.g. inexpensive laptops) without downloading anything to local storage. With this system and fast networking to our storage server, there is no need to copy the big files onto the laptop to do analysis - we simply load the relevant parts of the files into RAM directly from an HTTP call. This is done sequentially: first we read the file index, and then, based on that, we read the relevant ranges of the file.
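The access pattern described above (read the zip index, then only the byte ranges of the wanted member) can be sketched in Python as follows. This is a minimal illustration, not the software used in the lab: `RangeReader`, `http_fetch`, `http_size` and `read_member` are hypothetical names, and the sketch assumes the server answers a `Range` header with `206 Partial Content` and reports `Content-Length` on a `HEAD` request.

```python
import io
import urllib.request
import zipfile


class RangeReader(io.RawIOBase):
    """Read-only, seekable file object backed by a fetch(start, end) callable.

    fetch(start, end) must return the bytes start..end inclusive, which is
    the same convention as the HTTP header "Range: bytes=start-end".
    """

    def __init__(self, size, fetch):
        super().__init__()
        self._size = size
        self._fetch = fetch
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise OSError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buf):
        # Request only as many bytes as the caller asked for.
        n = min(len(buf), self._size - self._pos)
        if n <= 0:
            return 0  # end of file
        data = self._fetch(self._pos, self._pos + n - 1)
        buf[:len(data)] = data
        self._pos += len(data)
        return len(data)


def http_fetch(url):
    """Return a fetch(start, end) callable that issues HTTP range requests."""
    def fetch(start, end):
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            if resp.status != 206:
                raise OSError(f"server ignored the Range header (status {resp.status})")
            return resp.read()
    return fetch


def http_size(url):
    """Total file size, assuming the server reports Content-Length on HEAD."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])


def read_member(url, name):
    """Read one member of a remote zip without downloading the whole archive."""
    reader = RangeReader(http_size(url), http_fetch(url))
    with zipfile.ZipFile(reader) as zf:
        return zf.read(name)
```

`zipfile` first seeks to the end of the archive to read the central directory and then seeks to the requested member, so only those ranges are transferred. Wrapping the reader in `io.BufferedReader` would likely reduce the number of small requests.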

Things like S3 and many web servers support HTTP range requests directly. It looks like [git-annex also supports S3 and web](https://git-annex.branchable.com/special_remotes/), so on the surface, this seems potentially possible. I could imagine authorization within the GIN model may be an issue. Also, I don't know if there is support for range requests, which would be important for not downloading all the data in a given file.
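Whether a given server honors range requests can be probed by asking for a single byte and checking for a `206 Partial Content` response with a `Content-Range` header. The sketch below uses only the Python standard library; `supports_ranges` and `parse_content_range` are hypothetical helper names, and any authentication that a GIN repository might require is not handled here.

```python
import re
import urllib.request


def parse_content_range(value):
    """Parse a header value such as 'bytes 0-0/1234' into (start, end, total).

    total is None when the server reports an unknown length ('*').
    """
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unexpected Content-Range value: {value!r}")
    start, end, total = m.groups()
    return int(start), int(end), None if total == "*" else int(total)


def supports_ranges(url):
    """Request only the first byte and report whether the server honored it."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(req) as resp:
        # A server without range support typically answers 200 and would
        # send the whole file; the body is not read here.
        if resp.status != 206:
            return False
        header = resp.headers.get("Content-Range")
        return header is not None and parse_content_range(header)[:2] == (0, 0)
```

A server that ignores the `Range` header answers `200 OK`, so the probe returns `False` without reading the body.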

If there is a better place for this question, please redirect me there.

Thanks, Andrew

Andrew Straw commented 4 years ago
Author

I just saw the page about [DAV access](https://gin.g-node.org/G-Node/Info/wiki/Dav), which looks highly relevant. This might be the answer I am looking for. It is working for me with the client built into macOS, but so far I have not been successful with curl at the command line.

Achilleas Koutsou commented 4 years ago
Owner

Hello Andrew,

Unfortunately GOGS (the project which GIN is based on) doesn't support range requests. Your use case is very interesting and it's definitely something I can imagine would be useful for a lot of our users.

It seems it's not trivial to add this, but it shouldn't be difficult either.

I'm not sure the DAV access will help with this, but let me know if you have any issues there.

I'd like to keep this issue open and update it when I have more to share. I'd like to test a few things to get a sense of how much work it would require to support this.
