Infrastructure tracker #40721
Comments
To folks who tend to monitor the PR queue or otherwise help out with infrastructure: for the time being, I'd like to try using this issue to centralize some tracking of what's going on with infrastructure, and ways people can get involved. Right now too much of this work is falling on too few shoulders (cough @alexcrichton cough) and we need to work on spreading it out. If you see something amiss with any piece of infrastructure, please take a look at the status page on the top here to see if the issue is known. If it's not, open a new A-infrastructure issue and leave a comment with a link. When in doubt, leave a comment here. Similarly, if you want to help out but don't know how, leave a comment here. cc @rust-lang/compiler @rust-lang/libs @frewsxcv @TimNN @Mark-Simulacrum @erickt @edunham @japaric @est31 @durka |
@aturon Is perf.rlo updating? At the bottom it says "Updated as of: 12/30/2016, 1:24:27 PM" |
No. @Mark-Simulacrum has been working on some improvements that should get it going again. |
One longstanding spurious failure is general network errors and I've opened an issue which I believe will help mitigate at least one instance of that, and help implementing it would be greatly appreciated! |
It looks like OSX cycle time for i686-apple-darwin has regressed 20% recently, and unfortunately I'm not sure how to explain it :( |
Looks like all appveyor builds are currently failing: #40694 (comment) |
My fault :( Back to the drawing board... |
Seemingly unrelated to the previous couple messages, the past few attempted PRs failed with the same error message on
|
Similar to something lots of people using arduino saw? The resolving PR is fairly opaque about what the actual fix was though. |
Going to provide a summary of the current perf.rlo situation as I know it (cc @nikomatsakis, who I've talked to about this). The current collection infrastructure is broken, for relatively unknown reasons, and I've deemed it hard to fix and sufficiently difficult to maintain that it needed a rewrite. That work has been started here: https://github.com/Mark-Simulacrum/rustc-perf-collector. The project works for the collection side of things (though it does not upload results to github), but it has not been integrated into the HTTP server for perf.rlo. I've been meaning to devote some time to this, as I don't expect it to be all that hard, but haven't quite gotten around to it yet, and we (Niko and myself) have come up with a few potential roadblocks to getting it started. A number of the roadblocks are discussed and summarized in this internals post. To summarize the current situation:
What needs to be done to get perf.rlo working once more:
Let me know of any questions; I'd be happy to answer them. |
An outage related to the CentOS EOL is being discussed in the #rust-infra channel: "3:39 PM so typically we cache docker images on travis" |
Note: the previous comment is about a general outage for the queue. |
Regarding the previous comments, that issue has been resolved via #41045, though it (and other PRs) appear to be struggling to land because we keep hitting the three hour mark on Travis. |
every time we there has to be a better way |
https://travis-ci.org/rust-lang/rust/builds/219024973 if you look at the two osx builders that took >2hr30min (!) you'll see that the logs have been truncated. While it was still building, opening the page gave me the truncated log, then new logs (e.g. test output) were streamed to me. Refreshing truncated it back down again. It didn't cause a build failure, but it's maybe worth being aware of - I can imagine this being very annoying if the build had failed with truncated logs. |
@aidanhs yeah I've found the output to sometimes be confusing on Travis. The raw logs at least appear to not be truncated? Note that we've got a separate issue for how slow OSX is |
Odd, I definitely checked that and they were truncated too (or I wouldn't have mentioned it). I guess it was a blip that corrected itself, which is a relief. |
Heh I think I've definitely noticed that as well before, it sometimes just corrects itself ... |
@nrc Is there any desire to merge your changes to highfive into the upstream servo/highfive repo? We've done a lot of work since your fork, and AFAIK there aren't any things in yours that were overly rustaceous and non-upstreamable. |
@larsbergstrom I haven't looked at the Servo highfive for ages, but the last time I checked, the two had diverged considerably and merging would be very non-trivial. I have nothing against doing so, but it seems pretty low priority and is quite a lot of work so I can't see it actually happening. |
It looks like the CentOS 'vault' no longer has packages related to CentOS 5, so our builds will fail until we find a resolution: |
To follow up with my previous comment, it turns out we were using the wrong 'vault' URL path. The fix is in #41231. |
Triage ping. Not sure if this issue is still valid or worth it. |
Nominating for infra team discussion; I personally support closing this issue -- I don't think a tracking issue like this adds much to our work. |
Status
Monitoring
Known issues
is the s3-based ccache clone written in Rust that we use to cache LLVM builds
Help wanted
All infrastructure issues
Easy
Check the PR queue for old PRs that have yet to be reviewed, and ping the reviewer on IRC or elsewhere. (Yes, you can and should do this!).
Check the PR queue for build failures, find the failed build, and extract out the information onto a comment on the PR.
Medium
Travis: add a tool to print timestamp for log messages
Switch AppVeyor to use Docker
Set up CI testing for asm.js and wasm
Hard
Infrastructure projects
CI + releases. Currently set up via Travis + AppVeyor, with some additional infrastructure in Rust Central Station to monitor and control the builds.
Rust Central Station. Oversees CI/releases and nagbox. Set up using Docker.
homu. The bot behind @bors. Hooks into the above CI infrastructure to actually land PRs.
rfcbot. A bot for managing the FCP process of RFCs and tracking issues.
rusty-dash. A dashboard tracking a number of metrics for Rust and its community.
highfive. A bot that welcomes new contributors and randomly assigns reviewing duties.
nagbot. A bot for sending email reminders to the Rust subteams about reviewing duties.
rustbuild. The x.py build system for the Rust compiler.
play. The infrastructure behind https://play.rust-lang.org/
perf. Performance monitoring for the Rust compiler.
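For anyone picking up the "Travis: add a tool to print timestamp for log messages" item under Medium, here is a rough sketch of the idea in Python. It is not an existing tool, just one possible shape for it: the names `timestamp_lines` and `main` and the output format are made up for illustration. It prefixes every line with the seconds elapsed since the first line was read, which would make slow build steps easier to spot in a log.

```python
import sys
import time
from typing import Callable, Iterable, Iterator


def timestamp_lines(
    lines: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield each line prefixed with the seconds elapsed since iteration began.

    `clock` is injectable so the function can be tested without real delays.
    """
    start = clock()
    for line in lines:
        elapsed = clock() - start
        text = line.rstrip("\n")  # drop the trailing newline, if any
        yield f"[{elapsed:8.2f}s] {text}"


def main() -> None:
    """Hypothetical entry point: stamp everything arriving on stdin."""
    for stamped in timestamp_lines(sys.stdin):
        # Flush so the stamped line shows up in the CI log right away.
        print(stamped, flush=True)
```

If `main()` were wired up as a script, a build command's output could be piped through it. How that would hook into the actual Travis configuration is left open here.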