Running bridge pull for the first time obtains a large chunk of the issues before the rate limit is exhausted.
% git bug bridge pull
...
new issue: 1b7fd677f16d7dfea56f0caad895c5e944b57c31
changed label: da246a71ce5a0ee9f4603b25385a9a2ab69af2921ebffe62b92740f9db66a0b6
import error: API rate limit exceeded
imported 1935 issues and 63 identities with default bridge
git bug bridge pull 336.95s user 60.75s system 15% cpu 43:33.15 total
I expected that resuming (after a cool-down period) would pick up a few more, but it doesn't:
% git bug bridge pull
import error: API rate limit exceeded
imported 0 issues and 0 identities with default bridge
git bug bridge pull 169.74s user 78.93s system 9% cpu 42:08.63 total
The similar runtime suggests that the same issues were processed again, so no additional ones were obtained before the same rate limit kicked in.
I can use --since to get a few more, but not all, it seems. Is there a way to incrementally pull a large number of issues from GitHub?
Michael Muré (MichaelMure) commented
--since would be your best bet, but really the correct solution would be to deal with the rate limiting within the bridge itself.
Michael Muré (MichaelMure) changed the title from GitHub bridge doesnt' handle mutation API rate limit to GitHub bridge doesnt' handle API rate limit
Michael Muré (MichaelMure) added label kind/bug
Michael Muré (MichaelMure) commented
To be more explicit: in a normal import, the bridge records the time of import when everything goes right, so it can resume from there on the next run. Here the rate limiting is treated as an error and the bridge doesn't record that time, which makes the next run start from the beginning.
I just hit the same issue. Is there a way to query the last successfully imported date, so I know which --since to use?
Max Rydahl Andersen (maxandersen) commented
Answered it myself: git bug webui seems to show the last-updated date by default, so I used that date for --since.
Michael Muré (MichaelMure) commented
This is an annoying issue indeed. Maybe I should string match the error to implement a workaround until it's resolved upstream ...
Tony O (bqv) commented
I hit the same while trying to use http://github.com/nixos/nixpkgs
Tony O (bqv) commented (edited)
Not quite as simple with this one. I've managed to get 3.2k issues, but while datalad has ~4k, nixpkgs has nearly 20k
6543 (6543) commented (edited)
GitHub returns information about the rate limit and when the API may be called next, so looking at this value and waiting accordingly will likely solve this.
According to the automated tests, GitHub seems to be getting more and more sensitive about this issue ...
rng-dynamics (rng-dynamics) commented
FYI: I am working on it.
Michael Muré (MichaelMure) commented
@rng-dynamics will your changes also cover rate limiting for mutations?
rng-dynamics (rng-dynamics) commented
> @rng-dynamics will your changes also cover rate limiting for mutations?
At the moment I do not cover it. I will look at it and if it is simple, I will include it. Otherwise I would like to aim for incremental improvement (that is, I could look at the rate limiting for mutations later).
I might be able to offer you a PR to fix the rate limiting issue of the bridge pull after the weekend.
Michael Muré (MichaelMure) commented
Incremental changes are definitely good. I'm just asking because the rate-limiting on mutation is making the CI fail, and that's never fun.
Tony O (bqv) commented
Hi @rng-dynamics, how is the progress on your plan for this? Do you have anything you'd like me to help test?
rng-dynamics (rng-dynamics) commented (edited)
> Hi @rng-dynamics, how is the progress on your plan for this? Do you have anything you'd like me to help test?
Hi @bqv! You can see what I am working on in my fork on the dev-gh-bridge-wip branch. Please be aware that the code still needs a thorough cleanup. That is why I cannot create a pull request yet.
I have optimised the GraphQL queries and I can now pull many more issues without hitting the rate limit. Additionally, when it does hit the limit, the bridge can wait until the GitHub API limit is reset (once an hour). https://github.com/datalad/datalad can be pulled without hitting the limit. I am trying to pull all the issues from http://github.com/nixos/nixpkgs at the time of writing this text.
Yes, additional testing would be good. (The code is work in progress but it does compile and run.) The output of the program is not user-friendly at the moment. So, if the bridge decides to sleep for up to an hour, the corresponding message will not be visible at the bottom of the output (but somewhere above). I am not sure if it is easy to hit the rate limit at all. So I think we should try to pull from multiple repositories with a large number of issues at the same time. I think https://github.com/golang/go would be a good additional test case.
No hurry to fix it though, presumably you have a compilable version
Michael Muré (MichaelMure) commented
@rng-dynamics could you open a draft PR with your changes so I don't lose track of this important work? Or even better, push your branch to this repo and open the PR from here so that the CI can test it.
rng-dynamics (rng-dynamics) commented
@bqv Yes, sorry, a file was missing. It should be there now.
@MichaelMure I will do that soon.
Michael Muré (MichaelMure) changed the title from GitHub bridge doesnt' handle API rate limit to GitHub bridge doesnt' handle mutation API rate limit
Michael Muré (MichaelMure) commented
Unless I'm mistaken, @rng-dynamics's great work brought proper handling of the rate limiting when pulling data. What's left is the counterpart when pushing changes.
rng-dynamics (rng-dynamics) commented
Yes, the rate limiting for importing from GitHub is fixed. I also tested it on https://github.com/datalad/datalad and everything seems correct to me. @MichaelMure: do you want to close this issue?
> What's left is the counterpart when pushing changes.
I will have a look at it in the not so distant future.
Michael Muré (MichaelMure) commented
Let's keep this issue open until mutation rate limiting is handled properly. I updated the title.