#62 Presumably corrupted git-annex branches

Open
opened 1 year ago by adswa · 0 comments

Hi! First and foremost a huge thank you for Gin! It is an immeasurably useful infrastructure for science.

I've recently noticed what I presume to be a corruption of the git-annex branch after pushing to Gin, and reported it originally at https://github.com/datalad/datalad-gooey/issues/349.

The issue presents as follows: At the moment, pushing a DataLad dataset/git annex repo causes a severance of the git-annex branch, and complete divergence of my local and the remote git-annex branch on Gin. This happens with datasets I previously pushed successfully (small datasets I often use for demonstrations or ad-hoc testing).

An example is this dataset: https://gin.g-node.org/adswa/mlbooksmoretests (you might see different Gin repos in the errors below, as I tried to pin this down to parametrization or operating system, but the errors were identical across scenarios). It is originally from https://github.com/datalad-datasets/machinelearning-books, and contains PDFs that have a web special remote registered (i.e., the files came from a git annex addurl call). If I add a new Gin repository as a remote and push to it using datalad push, the push succeeds for the default branch, but fails with a non-fast-forward error for the git-annex branch, similar to the one below:

*	refs/heads/master:refs/heads/master	[new branch]
!	refs/heads/git-annex:refs/heads/git-annex	[rejected] (non-fast-forward)
Done'] [err: 'Delta compression using up to 16 threads
Total 422 (delta 198), reused 149 (delta 33), pack-reused 0
error: failed to push some refs to 'gin.g-node.org:/adswa/ml-books-only-ssh.git'
hint: Updates were rejected because a pushed branch tip is behind its remote
hint: counterpart. Check out this branch and integrate the remote changes
hint: (e.g. 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.']
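
For anyone trying to reproduce this in a script, here is a minimal sketch of how the rejected ref could be picked out of push output like the above. It assumes the tab-separated ref lines that `git push --porcelain` prints (flag, refspec, summary); the function name `rejected_refs` and the `sample` string are illustrative only and are not part of git, git-annex, or DataLad.

```python
def rejected_refs(push_output: str) -> list[tuple[str, str]]:
    """Return (refspec, summary) pairs for refs that git reported as rejected.

    Assumes porcelain-style ref lines: <flag>\t<from>:<to>\t<summary> (<reason>)
    where a leading "!" flag marks a rejected or failed ref.
    """
    rejected = []
    for line in push_output.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0] == "!":
            rejected.append((fields[1], fields[2]))
    return rejected


# The two ref lines from the report above, with tabs written out explicitly.
sample = (
    "*\trefs/heads/master:refs/heads/master\t[new branch]\n"
    "!\trefs/heads/git-annex:refs/heads/git-annex\t[rejected] (non-fast-forward)\n"
    "Done\n"
)

for refspec, summary in rejected_refs(sample):
    print(f"{refspec}: {summary}")
```

In this report only the git-annex refspec would be listed, which matches the observation that the default branch is pushed fine.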

Investigating the remote git-annex branch on Gin shows that the git-annex branch has been re-created from scratch (it seems), by a committer ID called "Gogs": https://gin.g-node.org/adswa/mlbooksmoretests/src/git-annex. The local git-annex branch shows commits indicating that the branch was rewritten or otherwise vastly changed:

(gooyey) C:\Users\adina\Desktop\ml-books2>git log git-annex
commit 4e226892a69de8989b56cef5f41c49f138aee09e (git-annex)
Author: Adina Wagner <adina.wagner@t-online.de>
Date:   Fri Oct 14 09:22:57 2022 +0200

    continuing transition ["forget git history"]

commit 38be5a7d07b019e2a7e42c8dff0734926c276f7d
Author: Adina Wagner <adina.wagner@t-online.de>
Date:   Fri Oct 14 09:17:56 2022 +0200

    update

commit 72cd967f9648209aab5c55aebf5b60f1aea41099 (origin/git-annex)
Author: Adina Wagner <adina.wagner@t-online.de>
Date:   Tue Apr 19 13:29:07 2022 +0200

    update

A manual pull fails locally:

❱ git pull gin git-annex
From https://gin.g-node.org/adswa/mlbooksmoretests
 * branch            git-annex  -> FETCH_HEAD
fatal: refusing to merge unrelated histories
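
The "unrelated histories" message suggests that the local and remote git-annex branches no longer share any commit. A rough sketch of how that could be checked from Python follows. It assumes `git` is on the PATH and that the remote branch has already been fetched (for example with `git fetch gin git-annex`, so that `FETCH_HEAD` points at it); the helper names `rev_list` and `share_history` are made up for illustration.

```python
import subprocess


def rev_list(ref: str, repo: str = ".") -> set[str]:
    """Return all commit hashes reachable from `ref`, using `git rev-list`."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", ref],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return set(out.split())


def share_history(local_commits: set[str], remote_commits: set[str]) -> bool:
    """True if the two commit sets have at least one commit in common."""
    return not local_commits.isdisjoint(remote_commits)
```

Inside the affected clone, `share_history(rev_list("git-annex"), rev_list("FETCH_HEAD"))` would be expected to return `False` if the remote branch really was re-created from scratch. `git merge-base git-annex FETCH_HEAD` should give the same answer more directly, since it exits with a non-zero status when there is no common ancestor.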

And annexed data that should be readily available from the web special remote can't be retrieved after cloning the repository.

(gooey) adina@muninn in /tmp/mlbooksmoretests on git:master
❱ git-annex whereis A.Shashua-Introduction_to_Machine_Learning.pdf          1 !
whereis A.Shashua-Introduction_to_Machine_Learning.pdf (0 copies) failed
whereis: 1 failed
(gooey) adina@muninn in /tmp/mlbooksmoretests on git:master

❱ git annex get A.Shashua-Introduction_to_Machine_Learning.pdf            130 !
get A.Shashua-Introduction_to_Machine_Learning.pdf (not available) 
  No other repository is known to contain the file.
failed
get: 1 failed
(gooey) adina@mun
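
To check a whole clone rather than a single file, the plain-text `whereis` output above could be scanned for files with zero known copies. This is only a sketch: it assumes the human-readable line format shown in this report (`whereis <file> (<n> copies) ...`), which is not a stable interface, and the name `files_without_copies` is hypothetical.

```python
import re

# Matches lines such as: whereis some/file.pdf (0 copies) failed
_WHEREIS_LINE = re.compile(r"^whereis (?P<file>.+) \((?P<n>\d+) cop(?:y|ies)\)")


def files_without_copies(whereis_output: str) -> list[str]:
    """Return the files for which git-annex reported zero known copies."""
    missing = []
    for line in whereis_output.splitlines():
        match = _WHEREIS_LINE.match(line.strip())
        if match and int(match.group("n")) == 0:
            missing.append(match.group("file"))
    return missing


# The relevant lines from the session above.
sample = """\
whereis A.Shashua-Introduction_to_Machine_Learning.pdf (0 copies) failed
whereis: 1 failed
"""

print(files_without_copies(sample))
```

The summary line `whereis: 1 failed` is deliberately not matched, so only actual file entries are reported.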

I have seen this on Linux and Windows with different versions of git-annex, both when using DataLad and when using only plain git push and git annex sync commands. I also reproduced this with several datasets I previously pushed successfully, with data available from web special remotes, from other types of special remotes, or only locally. Can you advise what might be wrong?
